Compressed Matching in Dictionaries

Shmuel T. Klein, Dana Shapira

Research output: Contribution to journal › Article › peer-review

Abstract

The problem of compressed pattern matching, which has recently been treated in many papers dealing with free text, is extended to structured files, specifically to dictionaries, which appear in any full-text retrieval system. The prefix-omission method is combined with Huffman coding, and a new variant based on Fibonacci codes is presented. Experimental results suggest that the new methods are often preferable to earlier ones, in particular for small files, which are typical of dictionaries, since these are usually kept in small chunks.
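As a rough illustration of the two building blocks named in the abstract, the sketch below shows generic prefix omission (front coding) of a sorted word list and the standard Fibonacci code for positive integers. It is not the authors' algorithm: the abstract does not give the details of how the methods are combined or how matching is done on the compressed form. The function names `prefix_omission`, `restore`, and `fibonacci_code` are hypothetical, and encoding the shared-prefix length as `shared + 1` is an assumption made here so that a length of 0 can be represented.

```python
def prefix_omission(sorted_words):
    """Front-code a sorted word list as (shared_prefix_length, suffix) pairs."""
    encoded = []
    previous = ""
    for word in sorted_words:
        # Count the leading characters shared with the previous word.
        shared = 0
        limit = min(len(previous), len(word))
        while shared < limit and previous[shared] == word[shared]:
            shared += 1
        encoded.append((shared, word[shared:]))
        previous = word
    return encoded


def restore(encoded):
    """Invert prefix_omission: rebuild each word from its predecessor."""
    words = []
    previous = ""
    for shared, suffix in encoded:
        word = previous[:shared] + suffix
        words.append(word)
        previous = word
    return words


def fibonacci_code(n):
    """Return the Fibonacci codeword of an integer n >= 1 as a bit string.

    The codeword is the Zeckendorf representation of n, least significant
    Fibonacci number first, followed by a terminating 1 bit.
    """
    if n < 1:
        raise ValueError("Fibonacci codes are defined for n >= 1")
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()  # drop the first Fibonacci number that exceeds n
    bits = ["0"] * len(fibs)
    remainder = n
    # Greedily subtract the largest Fibonacci number that still fits.
    for i in range(len(fibs) - 1, -1, -1):
        if fibs[i] <= remainder:
            bits[i] = "1"
            remainder -= fibs[i]
    return "".join(bits) + "1"


words = ["compress", "compressed", "compression", "compute"]
pairs = prefix_omission(words)
# Assumption of this sketch: lengths are shifted by one before coding.
length_codes = [fibonacci_code(shared + 1) for shared, _ in pairs]
```

Every Fibonacci codeword ends in `11` and contains no other pair of adjacent 1 bits, so codeword boundaries can be recognized in the bit stream without decoding from the start.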

Original language: English
Pages (from-to): 61-74
Number of pages: 14
Journal: Algorithms
Volume: 4
Issue number: 1
State: Published - Mar 2011

Keywords

  • Compressed matching
  • Dictionaries
  • Fibonacci codes
  • Huffman codes
  • IR systems
  • Pattern matching

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Numerical Analysis
  • Computational Theory and Mathematics
  • Computational Mathematics
