4th International ICST Conference on Security and Privacy in Communication Networks

Research Article

Efficient Signature Matching with Multiple Alphabet Compression Tables

  • @INPROCEEDINGS{10.1145/1460877.1460879,
        author={Shijin Kong and Randy Smith and Cristian Estan},
        title={Efficient Signature Matching with Multiple Alphabet Compression Tables},
        booktitle={4th International ICST Conference on Security and Privacy in Communication Networks},
        publisher={ACM},
        proceedings_a={SECURECOMM},
        year={2008},
        month={9},
        keywords={signature matching, deep packet inspection, regular expressions, alphabet compression},
        doi={10.1145/1460877.1460879}
    }
    
  • Shijin Kong
    Randy Smith
    Cristian Estan
    Year: 2008
    Efficient Signature Matching with Multiple Alphabet Compression Tables
    SECURECOMM
    ACM
    DOI: 10.1145/1460877.1460879
Shijin Kong1,*, Randy Smith1,*, Cristian Estan1,*
  • 1: Computer Sciences Department, University of Wisconsin-Madison
*Contact email: krobin@cs.wisc.edu, smithr@cs.wisc.edu, estan@cs.wisc.edu

Abstract

Signature matching is a performance-critical operation in intrusion prevention systems. Modern systems express signatures as regular expressions and use Deterministic Finite Automata (DFAs) to efficiently match them against the input. In principle, DFAs can be combined so that all signatures can be examined in a single pass over the input. In practice, however, combining DFAs corresponding to intrusion prevention signatures results in memory requirements that far exceed feasible sizes. We observe that, for such signatures, distinct input symbols often have identical behavior in the DFA. In these cases, an Alphabet Compression Table (ACT) can be used to map each such group of symbols to a single symbol, reducing the memory requirements. In this paper, we explore the use of multiple alphabet compression tables as a lightweight method for reducing the memory requirements of DFAs. We evaluate this method on signature sets used in Cisco IPS and Snort. Compared to uncompressed DFAs, multiple ACTs achieve memory savings between a factor of 4 and a factor of 70 at the cost of an increase in run time that is typically between 35% and 85%. Compared to another recent compression technique, D²FAs, ACTs are between 2 and 3.5 times faster in software, and in some cases use less than one tenth of the memory used by D²FAs. Overall, for all signature sets and compression methods evaluated, multiple ACTs offer the best memory versus run-time trade-offs.
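
To make the idea of an alphabet compression table concrete, the following is a minimal Python sketch of a single ACT, not the authors' implementation. It assumes a DFA stored as a dense table `table[state][byte]`; the function names (`build_act`, `run_compressed`) and the toy three-state DFA for the pattern "ab" are hypothetical and chosen only for illustration. Bytes whose transition columns are identical across all states are mapped to one compressed symbol, and matching pays one extra lookup per input byte.

```python
from typing import Dict, List, Tuple


def build_act(table: List[List[int]]) -> Tuple[List[int], List[List[int]]]:
    """Build one alphabet compression table for a dense DFA.

    table[state][symbol] is the next state. Symbols with identical
    columns (same next state in every state) share one class.
    Returns (act, compressed) where act[symbol] is the class index and
    compressed[state][class] is the next state.
    """
    num_symbols = len(table[0])
    column_to_class: Dict[Tuple[int, ...], int] = {}
    act: List[int] = []
    for symbol in range(num_symbols):
        column = tuple(row[symbol] for row in table)
        if column not in column_to_class:
            column_to_class[column] = len(column_to_class)
        act.append(column_to_class[column])

    num_classes = len(column_to_class)
    compressed = [[0] * num_classes for _ in table]
    for symbol, cls in enumerate(act):
        for state, row in enumerate(table):
            compressed[state][cls] = row[symbol]
    return act, compressed


def run_compressed(act: List[int], compressed: List[List[int]],
                   data: bytes, start: int = 0) -> int:
    """Run the compressed DFA; note the extra act[] lookup per byte."""
    state = start
    for byte in data:
        state = compressed[state][act[byte]]
    return state


# Hypothetical toy DFA over bytes that reaches state 2 once "ab" is seen.
A, B = ord("a"), ord("b")
table = [[0] * 256 for _ in range(3)]
table[0][A] = 1                 # start: 'a' moves to state 1
table[1][A] = 1                 # after 'a': another 'a' stays in state 1
table[1][B] = 2                 # after 'a': 'b' completes the match
table[2] = [2] * 256            # accepting state is a sink

act, compressed = build_act(table)
# 3 x 256 = 768 table entries shrink to 256 (ACT) + 3 x 3 = 265 entries.
```

In this sketch the 256 byte values collapse into three classes ('a', 'b', and everything else). The multiple-ACT approach described in the abstract could be approximated by applying the same construction separately to subsets of states, since fewer states in a group means more columns coincide; how states are grouped is not specified here and is left as an assumption.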