On Empirical Cumulant Generating Functions of Code Lengths for Individual Sequences

Research output: Contribution to journal › Article › peer-review

Abstract

We consider the problem of lossless compression of individual sequences using finite-state (FS) machines, from the perspective of the best achievable empirical cumulant generating function (CGF) of the code length, i.e., the normalized logarithm of the empirical average of the exponentiated code length. Since the minimum probabilistic CGF is expressed in terms of the Rényi entropy of the source, one motivation of this paper is to derive an individual-sequence analogue of the Rényi entropy, in the same way that the FS compressibility is the individual-sequence counterpart of the Shannon entropy. We consider the CGF of the code length from the perspective of both fixed-to-variable length coding and variable-to-variable (V-V) length coding, where the latter turns out to yield a better result, one that coincides with the FS compressibility. We also extend our results to compression with side information available at both the encoder and the decoder. In this case, the V-V version no longer coincides with the FS compressibility, but instead yields a different complexity measure.
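As a rough illustration of the quantity named in the abstract, the sketch below computes a normalized logarithm of the empirical average of exponentiated code lengths. The abstract does not spell out the normalization or the logarithm base, so the choices here are assumptions: the function name `empirical_cgf`, the parameter `lam` (the exponent multiplying each code length), division by the block length `block_len`, and natural logarithms are all hypothetical and may differ from the paper's definition.

```python
import math


def empirical_cgf(code_lengths, lam, block_len):
    """Illustrative empirical CGF of code lengths (assumed normalization).

    code_lengths: code lengths assigned to the successive blocks of a sequence
    lam:          exponent parameter applied to each code length (assumption)
    block_len:    number of source symbols per block (assumed normalizer)
    """
    if not code_lengths:
        raise ValueError("need at least one code length")
    if block_len <= 0:
        raise ValueError("block_len must be positive")

    exponents = [lam * length for length in code_lengths]
    # Log-sum-exp trick: factor out the largest exponent to avoid overflow.
    peak = max(exponents)
    mean_exp = sum(math.exp(e - peak) for e in exponents) / len(exponents)
    log_average = peak + math.log(mean_exp)
    return log_average / block_len
```

When every block receives the same code length L, this reduces to `lam * L / block_len`. By Jensen's inequality, for any set of lengths the value is at least `lam` times the average code length per symbol, which is why this criterion penalizes occasional long codewords more heavily than the average length does.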

Original language: English
Article number: 8052537
Pages (from-to): 7729-7736
Number of pages: 8
Journal: IEEE Transactions on Information Theory
Volume: 63
Issue number: 12
State: Published - Dec 2017

Keywords

  • Individual sequences
  • Lempel-Ziv algorithm
  • Rényi entropy
  • Compressibility
  • Cumulant generating function
  • Finite-state machines

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
