Adversarial Training for Raw-Binary Malware Classifiers

Keane Lucas, Lujo Bauer, Samruddhi Pai, Michael K. Reiter, Weiran Lin, Mahmood Sharif

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Machine learning (ML) models have shown promise in classifying raw executable files (binaries) as malicious or benign with high accuracy. This has led to the increasing influence of ML-based classification methods in academic and real-world malware detection, a critical tool in cybersecurity. However, previous work provoked caution by creating variants of malicious binaries, referred to as adversarial examples, that are transformed in a functionality-preserving way to evade detection. In this work, we investigate the effectiveness of using adversarial training methods to create malware-classification models that are more robust to some state-of-the-art attacks. To train our most robust models, we significantly increase the efficiency and scale of creating adversarial examples to make adversarial training practical, which has not been done before in raw-binary malware detectors. We then analyze the effects of varying the length of adversarial training, as well as analyze the effects of training with various types of attacks. We find that data augmentation does not deter state-of-the-art attacks, but that using a generic gradient-guided method, used in other discrete domains, does improve robustness. We also show that in most cases, models can be made more robust to malware-domain attacks by adversarially training them with lower-effort versions of the same attack. In the best case, we reduce one state-of-the-art attack's success rate from 90% to 5%. We also find that training with some types of attacks can increase robustness to other types of attacks. Finally, we discuss insights gained from our results, and how they can be used to more effectively train robust malware detectors.

Original languageEnglish
Title of host publication32nd USENIX Security Symposium, USENIX Security 2023
Pages1163-1180
Number of pages18
ISBN (Electronic)9781713879497
StatePublished - 2023
Event32nd USENIX Security Symposium, USENIX Security 2023 - Anaheim, United States
Duration: 9 Aug 202311 Aug 2023

Publication series

Name32nd USENIX Security Symposium, USENIX Security 2023
Volume2

Conference

Conference32nd USENIX Security Symposium, USENIX Security 2023
Country/TerritoryUnited States
CityAnaheim
Period9/08/2311/08/23

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Safety, Risk, Reliability and Quality

Fingerprint

Dive into the research topics of 'Adversarial Training for Raw-Binary Malware Classifiers'. Together they form a unique fingerprint.

Cite this