Statistical similarity of binaries

Yaniv David, Nimrod Partush, Eran Yahav

Research output: Contribution to journal › Article › peer-review

Abstract

We address the problem of finding similar procedures in stripped binaries. We present a new statistical approach for measuring the similarity between two procedures. Our notion of similarity allows us to find similar code even when it has been compiled using different compilers, or has been modified. The main idea is to use similarity by composition: decompose the code into smaller comparable fragments, define semantic similarity between fragments, and use statistical reasoning to lift fragment similarity into similarity between procedures. We have implemented our approach in a tool called ESH, and applied it to find various prominent vulnerabilities across compilers and versions, including Heartbleed, Shellshock and Venom. We show that ESH produces high-accuracy results, with few to no false positives, a crucial factor in the scenario of vulnerability search in stripped binaries.
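The three steps named in the abstract (decompose, compare fragments, lift statistically) can be illustrated with a toy sketch. This is not the ESH implementation: the function names, the sliding-window decomposition, the exact-match fragment comparison (a stand-in for real semantic similarity), and the rarity weighting with add-one smoothing are all assumptions chosen for illustration only.

```python
import math
from collections import Counter


def fragments(procedure, size=2):
    """Decompose a procedure (a list of instruction strings) into overlapping windows."""
    return [tuple(procedure[i:i + size]) for i in range(len(procedure) - size + 1)]


def fragment_similarity(a, b):
    """Toy stand-in for semantic similarity: 1.0 if the fragments are identical, else 0.0."""
    return 1.0 if a == b else 0.0


def procedure_similarity(query, target, corpus, size=2):
    """Lift fragment matches to a procedure score, weighting rare fragments more heavily."""
    # Count in how many corpus procedures each fragment occurs.
    doc_freq = Counter()
    for proc in corpus:
        doc_freq.update(set(fragments(proc, size)))

    target_frags = fragments(target, size)
    score = 0.0
    for frag in fragments(query, size):
        # Best match for this query fragment anywhere in the target.
        best = max((fragment_similarity(frag, t) for t in target_frags), default=0.0)
        # Smoothed chance of seeing this fragment in an arbitrary procedure (always < 1).
        background = (doc_freq[frag] + 1) / (len(corpus) + 2)
        # A match on a rare fragment is stronger evidence than a match on a common one.
        score += best * -math.log(background)
    return score


if __name__ == "__main__":
    query = ["load", "add", "store", "ret"]
    modified = ["load", "add", "store", "nop", "ret"]   # hypothetical patched variant
    unrelated = ["push", "call", "pop", "ret"]
    corpus = [query, modified, unrelated]
    print(procedure_similarity(query, modified, corpus))
    print(procedure_similarity(query, unrelated, corpus))
```

In this sketch the modified variant scores higher than the unrelated procedure because it shares fragments with the query, and fragments that appear in few corpus procedures contribute more to the score.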

Original language: English
Pages (from-to): 266-280
Number of pages: 15
Journal: ACM SIGPLAN Notices
Volume: 51
Issue number: 6
State: Published - Jun 2016
Externally published: Yes

Keywords

  • partial equivalence
  • static binary analysis
  • statistical similarity
  • verification-aided similarity

All Science Journal Classification (ASJC) codes

  • General Computer Science
