Stress-Testing USB Accelerators for Efficient Edge Inference
Several manufacturers sell specialized USB devices for accelerating machine learning (ML) on the edge. Although these devices are generally promoted as a versatile solution for more efficient edge inference with deep learning models, extensive practical insights into their usability and performance are hard to find. In order to make ML deployment on the edge more sustainable, our work investigates how resource-efficient these USB accelerators really are. To that end, we first introduce a novel and theoretically sound methodology that allows model performance to be compared in terms of quality and resource consumption across different execution environments. We then put it into practice by studying the usability and efficiency of Google's Coral edge tensor processing unit (TPU) and Intel's neural compute stick 2 (NCS). In total, we benchmark over 30 models across nine hardware configurations, which reveals intricate trade-offs. Our work demonstrates that USB accelerators are indeed capable of reducing energy consumption by a factor of up to ten; however, this improvement is not observed for all configurations – more than 50% of the investigated models cannot be run on accelerator hardware, and in several other cases, the power draw is only marginally improved. Our experiments show that the NCS improves efficiency in a more stable way, while the TPU shows further benefits in specific cases but performs less predictably. We hope that our paper provides valuable insights for practitioners who want to deploy ML on the edge in the most efficient and sustainable way.
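As a purely illustrative aside (not taken from the paper), benchmarking of this kind boils down to measuring per-inference latency of a compiled model on each device, with energy captured by an external power meter. The minimal sketch below, assuming the official tflite_runtime API and a hypothetical Edge-TPU-compiled model file, shows how such a latency measurement could look on the Coral USB accelerator.

```python
# Illustrative latency benchmark for an Edge-TPU-compiled TFLite model.
# Not from the paper; model path and iteration counts are placeholders.
import time
import numpy as np
from tflite_runtime.interpreter import Interpreter, load_delegate

MODEL_PATH = "model_edgetpu.tflite"  # hypothetical Edge-TPU-compiled model
N_WARMUP, N_RUNS = 10, 100

# Load the model and delegate execution to the Coral USB accelerator.
interpreter = Interpreter(
    model_path=MODEL_PATH,
    experimental_delegates=[load_delegate("libedgetpu.so.1")],
)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Random input with the model's expected shape and dtype (uint8 for quantized models).
if inp["dtype"] == np.uint8:
    data = np.random.randint(0, 255, size=inp["shape"], dtype=np.uint8)
else:
    data = np.random.rand(*inp["shape"]).astype(inp["dtype"])

# Warm-up runs are excluded from timing.
for _ in range(N_WARMUP):
    interpreter.set_tensor(inp["index"], data)
    interpreter.invoke()

latencies = []
for _ in range(N_RUNS):
    interpreter.set_tensor(inp["index"], data)
    start = time.perf_counter()
    interpreter.invoke()
    latencies.append(time.perf_counter() - start)
    _ = interpreter.get_tensor(out["index"])

print(f"median latency: {1000 * np.median(latencies):.2f} ms")
# Power draw / energy would be measured externally (e.g., with a USB power meter).
```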
- Published in: arXiv
- Type: Article
- Authors: Fischer, Raphael; van der Staay, Alexander; Buschjäger, Sebastian
- Year: 2024
Citation information
Fischer, Raphael; van der Staay, Alexander; Buschjäger, Sebastian: Stress-Testing USB Accelerators for Efficient Edge Inference, arXiv, January 2024, https://www.researchsquare.com/article/rs-3793927/v1
@Article{Fischer.etal.2024d,
  author   = {Fischer, Raphael and van der Staay, Alexander and Buschjäger, Sebastian},
  title    = {Stress-Testing USB Accelerators for Efficient Edge Inference},
  journal  = {arXiv},
  month    = {January},
  year     = {2024},
  url      = {https://www.researchsquare.com/article/rs-3793927/v1},
  abstract = {Several manufacturers sell specialized USB devices for accelerating machine learning (ML) on the edge. While being generally promoted as a versatile solution for more efficient edge inference with deep learning models, extensive practical insights on their usability and performance are hard to find. In order to make ML deployment on the edge more sustainable, our work investigates how resource...}
}