Accelerating regular expression matching over compressed HTTP

Michela Becchi, Anat Bremler-Barr, David Hay, Omer Kochba, Yaron Koral

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

This paper focuses on regular expression matching over compressed traffic. The need for such matching arises from two independent trends. First, the volume and share of compressed HTTP traffic is constantly increasing. Second, due to their superior expressibility, current Deep Packet Inspection engines use regular expressions more and more frequently. We present an algorithmic framework to accelerate such matching, taking advantage of information gathered when the traffic was initially compressed. HTTP compression is typically performed through the GZIP protocol, which uses back-references to repeated strings. Our algorithm is based on calculating (for every byte) the minimum number of (previous) bytes that can be part of a future regular expression matching. When inspecting a back-reference, only these bytes should be taken into account, thus enabling one to skip repeated strings almost entirely without missing a match. We show that our generic framework works with either NFA-based or DFA-based implementations and gains performance boosts of more than 70%. Moreover, it can be readily adapted to most existing regular expression matching algorithms, which usually are based either on NFA, DFA or combinations of the two. Finally, we discuss other applications in which calculating the number of relevant bytes becomes handy, even when the traffic is not compressed.

Original languageEnglish
Title of host publication2015 IEEE Conference on Computer Communications, IEEE INFOCOM 2015
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages540-548
Number of pages9
ISBN (Electronic)9781479983810
DOIs
StatePublished - 21 Aug 2015
Event34th IEEE Annual Conference on Computer Communications and Networks, IEEE INFOCOM 2015 - Hong Kong, Hong Kong
Duration: 26 Apr 20151 May 2015

Publication series

NameProceedings - IEEE INFOCOM
Volume26

Conference

Conference34th IEEE Annual Conference on Computer Communications and Networks, IEEE INFOCOM 2015
Country/TerritoryHong Kong
CityHong Kong
Period26/04/151/05/15

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Accelerating regular expression matching over compressed HTTP'. Together they form a unique fingerprint.

Cite this