A Simple and Efficient Approach for Adaptive Entropy Coding over Large Alphabets

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Encoding a sequence of independent symbols over a large alphabet size is a challenging problem with applications in many fields. The most widely used adaptive entropy coding techniques (namely, arithmetic and Huffman coding) are known to achieve an average codeword length which may be significantly greater than the empirical entropy of the sequence, as the alphabet size increases. In this work we introduce an efficient and easy-to-implement method for large alphabet adaptive encoding. We propose a conceptual framework in which a sequence of symbols, over a large alphabet size, is decomposed into multiple 'almost independent' sequences over a smaller alphabet. Then each of these sequences is encoded separately. This way, we allow encoding of small alphabet sequences, at the cost of the 'remaining dependence' among the sequences. We demonstrate the advantages of our suggested scheme through a series of theorems and experiments, showing it reduces both the average codeword length and the compression runtime in many large alphabet setups.
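As a rough illustration of the trade-off the abstract describes, the sketch below splits each symbol of a large-alphabet sequence into binary components and compares the empirical entropy of the original sequence with the sum of the component entropies. The difference is one way to quantify the 'remaining dependence' paid for coding the components separately. This is not the authors' method: the fixed bit-plane split, the function names, and the assumption that symbols are non-negative integers below 2**num_bits are all illustrative choices, whereas the paper searches for a decomposition that makes the components 'almost independent'.

```python
import math
from collections import Counter


def empirical_entropy(seq):
    """Empirical entropy of a non-empty sequence, in bits per symbol."""
    counts = Counter(seq)
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def bit_planes(symbols, num_bits):
    """Split each symbol into num_bits binary components.

    Assumes (hypothetically) that every symbol is an integer in
    [0, 2**num_bits). Returns one binary sequence per bit position.
    """
    return [[(s >> b) & 1 for s in symbols] for b in range(num_bits)]


def decomposition_cost(symbols, num_bits):
    """Return (joint entropy, sum of component entropies, their gap).

    The gap is non-negative; it is zero when the binary components
    are empirically independent.
    """
    joint = empirical_entropy(symbols)
    marginals = sum(empirical_entropy(p) for p in bit_planes(symbols, num_bits))
    return joint, marginals, marginals - joint


# Independent bits: separate coding costs nothing extra.
print(decomposition_cost([0, 1, 2, 3] * 4, num_bits=2))
# Fully dependent bits: separate coding pays one extra bit per symbol.
print(decomposition_cost([0, 3] * 8, num_bits=2))
```

A fixed bit-plane split can leave a large gap, as in the second call. Choosing the mapping from symbols to components so that this gap is small is the part the paper addresses.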

Original language: English
Title of host publication: Proceedings - DCC 2016
Subtitle of host publication: 2016 Data Compression Conference
Editors: Michael W. Marcellin, Ali Bilgin, Joan Serra-Sagrista, James A. Storer
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 369-378
Number of pages: 10
ISBN (Electronic): 9781509018536
State: Published - 15 Dec 2016
Event: 2016 Data Compression Conference, DCC 2016 - Snowbird, United States
Duration: 29 Mar 2016 - 1 Apr 2016

Publication series

Name: Data Compression Conference Proceedings

Conference

Conference: 2016 Data Compression Conference, DCC 2016
Country/Territory: United States
City: Snowbird
Period: 29/03/16 - 1/04/16

Keywords

  • Factorial codes
  • Independent Component Analysis
  • Large alphabet source coding

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
