Generalized Longest Repeated Substring Min-Entropy Estimator

Jiheon Woo, Chanhee Yoo, Young Sik Kim, Yuval Cassuto, Yongjune Kim

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review


The min-entropy is a widely used metric to quantify the randomness of generated random numbers, which measures the difficulty of guessing the most likely output. It is difficult to accurately estimate the min-entropy of a non-independent and identically distributed (non-IID) source. Hence, NIST Special Publication (SP) 800-90B adopts ten different min-entropy estimators and then conservatively selects the minimum value among ten min-entropy estimates. Among these estimators, the longest repeated substring (LRS) estimator estimates the collision entropy instead of the min-entropy by counting the number of repeated substrings. Since the collision entropy is an upper bound on the min-entropy, the LRS estimator inherently provides overestimated outputs. In this paper, we propose two techniques to estimate the min-entropy of a non-IID source accurately. The first technique resolves the overestimation problem by translating the collision entropy into the min-entropy. Next, we generalize the LRS estimator by adopting the general Rényi entropy instead of the collision entropy (i.e., Rényi entropy of order two). We show that adopting a higher order can reduce the variance of min-entropy estimates. By integrating these techniques, we propose a generalized LRS estimator that effectively resolves the overestimation problem and provides stable min-entropy estimates. Theoretical analysis and empirical results support that the proposed generalized LRS estimator improves the estimation accuracy significantly, which makes it an appealing alternative to the current-standard LRS estimator.

Original languageEnglish
Title of host publication2022 IEEE International Symposium on Information Theory, ISIT 2022
Number of pages6
ISBN (Electronic)9781665421591
StatePublished - 2022
Event2022 IEEE International Symposium on Information Theory, ISIT 2022 - Espoo, Finland
Duration: 26 Jun 20221 Jul 2022

Publication series

NameIEEE International Symposium on Information Theory - Proceedings


Conference2022 IEEE International Symposium on Information Theory, ISIT 2022

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Information Systems
  • Modelling and Simulation
  • Applied Mathematics


Dive into the research topics of 'Generalized Longest Repeated Substring Min-Entropy Estimator'. Together they form a unique fingerprint.

Cite this