BiTAT: Neural Network Binarization with Task-dependent Aggregated Transformation

Geon Park, Jaehong Yoon, Haiyang Zhang, Xing Zhang, Sung Ju Hwang, Yonina C. Eldar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Neural network quantization aims to transform high-precision weights and activations of a given neural network into low-precision weights/activations for reduced memory usage and computation, while preserving the performance of the original model. However, extreme quantization (1-bit weights/1-bit activations) of compactly designed backbone architectures (e.g., MobileNets) often used for edge-device deployment results in severe performance degeneration. This paper proposes a novel Quantization-Aware Training (QAT) method that can effectively alleviate performance degeneration even with extreme quantization by focusing on the inter-weight dependencies, both between the weights within each layer and across consecutive layers. To minimize the quantization impact of each weight on others, we perform an orthonormal transformation of the weights at each layer by training an input-dependent correlation matrix and importance vector, such that each weight is disentangled from the others. Then, we quantize the weights based on their importance to minimize the loss of information from the original weights/activations. We further perform progressive layer-wise quantization from the bottom layer to the top, so that quantization at each layer reflects the quantized distributions of weights and activations at previous layers. We validate the effectiveness of our method on various benchmark datasets against strong neural quantization baselines, demonstrating that it alleviates the performance degeneration on ImageNet and successfully preserves the full-precision model performance on CIFAR-100 with compact backbone networks.
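The abstract's core idea, binarizing weights in an orthonormally transformed basis with per-weight importance scaling, can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, the use of a QR-derived orthonormal matrix, and the mean-absolute-value importance scale are all illustrative assumptions standing in for the paper's trained correlation matrix and importance vector.

```python
import numpy as np

def binarize_with_transform(W, U, alpha):
    """Binarize a weight matrix in a rotated basis (illustrative sketch).

    W:     (out, in) full-precision weights
    U:     (in, in) orthonormal transform (stands in for the paper's
           trained, input-dependent decorrelating transform)
    alpha: (out,) per-output-channel importance scales (assumption:
           a simple BWN-style scale, not the paper's learned vector)
    """
    Wt = W @ U                    # rotate weights into the decorrelated basis
    Wb = np.sign(Wt)              # 1-bit weights in the transformed basis
    Wb[Wb == 0] = 1.0             # avoid zeros so every weight is +/-1
    return (alpha[:, None] * Wb) @ U.T  # rescale and rotate back

# Toy usage: a random layer with a random orthonormal U from a QR factorization.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
U, _ = np.linalg.qr(rng.normal(size=(8, 8)))
alpha = np.abs(W).mean(axis=1)    # per-row importance scale (assumed form)
W_hat = binarize_with_transform(W, U, alpha)
```

With `U` set to the identity, this reduces to plain scaled sign-binarization; the point of the transform is that rotating into a decorrelated basis before taking the sign reduces how much quantizing one weight perturbs the others.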
Original language: English
Title of host publication: Computer Vision – ECCV 2022 Workshops
Subtitle of host publication: Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part VII
Editors: Leonid Karlinsky, Tomer Michaeli, Ko Nishino
Publisher: Springer Basel AG
Pages: 50-66
Number of pages: 17
Volume: 13807
ISBN (Print): 9783031250811
DOIs
State: Published - 2023
Event: 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel
Duration: 23 Oct 2022 – 27 Oct 2022

Publication series

Name: Lecture Notes in Computer Science
ISSN (Print): 0302-9743

Conference

Conference: 17th European Conference on Computer Vision, ECCV 2022
Country/Territory: Israel
City: Tel Aviv
Period: 23/10/22 – 27/10/22

Keywords

  • Neural network binarization
  • Quantization-aware training

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

