HW/SW Codesign for Approximation-Aware Binary Neural Networks
Binary Neural Networks (BNNs) are rapidly gaining attention because they drastically shrink model size, which mitigates the fundamental “memory wall” bottleneck of existing von Neumann architectures. This work investigates how principles from approximate computing can be employed to further optimize BNNs. It demonstrates that HW/SW codesign, in which BNNs are proactively trained in the presence of approximation-induced errors (design-time optimization) and/or augmented with an appropriate error-mitigation scheme (run-time optimization), is key to realizing energy-efficient yet robust BNNs. We show, for the first time, that although the underlying hardware of BNNs can be implemented with simple XNOR gates, the complexity of the required “Popcount” circuit grows super-linearly with the filter kernel size. This strongly impacts area footprint, inference time, and energy, and therefore severely limits the prospective efficiency gains of BNNs. To overcome this challenge, we replace the exact full adders that build up the Popcount circuit with Majority gates that perform the required additions approximately. A carefully crafted error-mitigation scheme, together with activation tuning, then reduces the induced errors considerably. Afterward, abstracted error probabilities are derived and used during BNN training to obtain approximation-aware BNNs that are inherently robust against the underlying hardware approximation. Unlike typical approaches, the proposed HW/SW codesign methodology allows training the approximate BNN without modifying existing software frameworks (e.g., PyTorch). This matters because existing tools rely on efficient built-in functions that can be difficult and/or inefficient to modify. An FPGA-based SoC realizing both accurate and approximation-aware BNNs is developed to validate the proposed method…
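The abstract touches on three mechanisms: the XNOR-and-popcount primitive that replaces multiply-accumulate in BNNs, Majority gates standing in for exact full adders, and abstracted error probabilities injected during training. The Python sketch below illustrates these ideas under stated assumptions; `p_err`, `max_dev`, and the helper names are hypothetical and not taken from the paper, and the error model is a generic abstraction rather than the authors' exact one.

```python
import torch

def xnor_popcount(w_bits, x_bits):
    # Binary dot product: XNOR (1 where weight and input bits agree),
    # then popcount along the flattened kernel axis. Bits are 0/1 ints.
    agree = (w_bits == x_bits).to(torch.int64)
    return agree.sum(dim=-1)

def maj3(a, b, c):
    # 3-input Majority gate: identical to the exact carry bit of a full
    # adder. Approximate popcount circuits reuse it in place of full
    # adders; the paper's concrete circuit may differ from this gate.
    return (a & b) | (b & c) | (a & c)

def noisy_popcount(w_bits, x_bits, p_err=0.05, max_dev=2):
    # Abstracted error model (p_err and max_dev are assumed values):
    # with probability p_err a popcount result deviates by up to
    # +/- max_dev, emulating the arithmetic errors of the approximate
    # adders during approximation-aware training.
    pc = xnor_popcount(w_bits, x_bits)
    hit = (torch.rand(pc.shape) < p_err).to(torch.int64)
    dev = torch.randint(-max_dev, max_dev + 1, pc.shape)
    return pc + hit * dev

# Example: 64 filters with flattened 3x3 binary kernels and matching patches.
w = torch.randint(0, 2, (64, 9))
x = torch.randint(0, 2, (64, 9))
print(noisy_popcount(w, x))
```

Because the sketch uses only standard tensor operations, the perturbation can be dropped into an ordinary training loop without touching framework internals, which mirrors the paper's point that approximation-aware training requires no modification of PyTorch itself.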
- Published in: IEEE Journal on Emerging and Selected Topics in Circuits and Systems
- Type: Article
- Authors: Dave, Abhilasha; Frustaci, Fabio; Spagnolo, Fanny; Yayla, Mikail; Chen, Jian-Jia; Amrouch, Hussam
- Year: 2023
Citation information
Dave, Abhilasha; Frustaci, Fabio; Spagnolo, Fanny; Yayla, Mikail; Chen, Jian-Jia; Amrouch, Hussam: HW/SW Codesign for Approximation-Aware Binary Neural Networks. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 13, no. 1, pp. 33–47, 2023. https://ieeexplore.ieee.org/document/10040675
@Article{Dave.etal.2023a,
author={Dave, Abhilasha and Frustaci, Fabio and Spagnolo, Fanny and Yayla, Mikail and Chen, Jian-Jia and Amrouch, Hussam},
title={HW/SW Codesign for Approximation-Aware Binary Neural Networks},
journal={IEEE Journal on Emerging and Selected Topics in Circuits and Systems},
volume={13},
number={1},
pages={33--47},
url={https://ieeexplore.ieee.org/document/10040675},
year={2023},
abstract={Binary Neural Networks (BNNs) are rapidly gaining attention because they drastically shrink model size, which mitigates the fundamental “memory wall” bottleneck of existing von Neumann architectures. This work investigates how principles from approximate computing can be employed to further optimize BNNs. It demonstrates...}}