TY - GEN
T1 - QuantAttack
T2 - 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
AU - Baras, Amit
AU - Zolfi, Alon
AU - Elovici, Yuval
AU - Shabtai, Asaf
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025/1/1
Y1 - 2025/1/1
N2 - In recent years, there has been a significant trend in deep neural networks (DNNs), particularly transformer-based models, of developing ever-larger and more capable models. While they demonstrate state-of-the-art performance, their growing scale requires increased computational resources (e.g., GPUs with greater memory capacity). To address this problem, quantization techniques (i.e., low-bit-precision representation and matrix multiplication) have been proposed. Most quantization techniques employ a static strategy in which the model parameters are quantized, either during training or inference, without considering the test-time sample. In contrast, dynamic quantization techniques, which have become increasingly popular, adapt during inference based on the input provided, while maintaining full-precision performance. However, their dynamic behavior and average-case performance assumption make them vulnerable to a novel threat vector: adversarial attacks that target the model's efficiency and availability. In this paper, we present QuantAttack, a novel attack that targets the availability of quantized vision transformers, slowing down inference and increasing memory usage and energy consumption. The source code is available online: https://github.com/barasamit/QuantAttack.
AB - In recent years, there has been a significant trend in deep neural networks (DNNs), particularly transformer-based models, of developing ever-larger and more capable models. While they demonstrate state-of-the-art performance, their growing scale requires increased computational resources (e.g., GPUs with greater memory capacity). To address this problem, quantization techniques (i.e., low-bit-precision representation and matrix multiplication) have been proposed. Most quantization techniques employ a static strategy in which the model parameters are quantized, either during training or inference, without considering the test-time sample. In contrast, dynamic quantization techniques, which have become increasingly popular, adapt during inference based on the input provided, while maintaining full-precision performance. However, their dynamic behavior and average-case performance assumption make them vulnerable to a novel threat vector: adversarial attacks that target the model's efficiency and availability. In this paper, we present QuantAttack, a novel attack that targets the availability of quantized vision transformers, slowing down inference and increasing memory usage and energy consumption. The source code is available online: https://github.com/barasamit/QuantAttack.
UR - http://www.scopus.com/inward/record.url?scp=105003623288&partnerID=8YFLogxK
U2 - 10.1109/WACV61041.2025.00655
DO - 10.1109/WACV61041.2025.00655
M3 - Conference contribution
T3 - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
SP - 6730
EP - 6740
BT - Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
Y2 - 28 February 2025 through 4 March 2025
ER -