How much data should be fed into entropy assessment tools for an accurate min-entropy estimate? #223

Open
maxwell-pung-bsi opened this issue Jun 28, 2023 · 1 comment

Comments

@maxwell-pung-bsi
I have been experimenting with using the NIST 800-90B entropy assessment tools to quantify the min-entropy provided by black-box HRNGs.

The repository's documentation (as far as I can tell) does not provide guidance on how much data should be fed into the tools to get an accurate min-entropy estimate. The vendor documentation for one of the HRNGs claims that the data files passed to the entropy assessment tools should be at least 10MB in size to get an accurate result, but I have no idea where they got this number from.

Are there guidelines for how much data should be passed into the entropy assessment tools to get an accurate min-entropy estimate?

@joshuaehill

This tool is an implementation of the estimators in NIST SP 800-90B. Its output is difficult to interpret without reading that document to get some idea of what the tool is intended to accomplish.

This document (in Section 3.1.1) specifies that the sample size ($L$) should be at least 1 million samples. It also requires that this data sample be "raw" output of the noise source (roughly equivalent in AIS-31 terms to somewhere in the range between "das random numbers" and "raw random numbers").

Most of the estimators include some sort of confidence interval calculation whose width scales roughly as $1/\sqrt{L}$, so (barring some observed defect) larger samples are likely to yield min-entropy estimates that are both numerically larger and more stable across independent tests.
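As a rough illustration of that scaling, here is a minimal sketch of the Most Common Value estimate from SP 800-90B Section 6.3.1, evaluated at a fixed observed proportion for several sample sizes. It is not this repository's code; the function names and the chosen sample sizes are my own assumptions.

```python
import math

# Normal quantile that SP 800-90B uses for its 99% confidence bounds.
Z = 2.576


def mcv_min_entropy(max_count: int, L: int) -> float:
    """Most Common Value estimate (SP 800-90B Section 6.3.1), in bits per sample.

    max_count is the number of occurrences of the most common symbol,
    L is the total number of samples.
    """
    p_hat = max_count / L
    # Upper confidence bound on the most-common-symbol probability;
    # the added term shrinks roughly as 1/sqrt(L).
    p_u = min(1.0, p_hat + Z * math.sqrt(p_hat * (1.0 - p_hat) / (L - 1)))
    return -math.log2(p_u)


def estimates_for_fixed_proportion(p_hat: float, sizes):
    """Hypothetical helper: same observed proportion, different sample sizes."""
    return {L: mcv_min_entropy(round(p_hat * L), L) for L in sizes}


if __name__ == "__main__":
    # Binary source whose most common symbol shows up exactly half the time.
    sizes = [10_000, 1_000_000, 100_000_000]
    for L, h in estimates_for_fixed_proportion(0.5, sizes).items():
        print(f"L = {L:>11,}: estimated min-entropy = {h:.5f} bits/sample")
```

With the observed proportion held fixed, the estimate only moves toward $-\log_2 \hat{p}$ as $L$ grows, because the confidence term shrinks. The other estimators have their own bounds, so this shows the general trend rather than a rule for picking a file size.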

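This tool cannot (indeed, it is not theoretically possible for a tool to) reliably estimate the min-entropy of all noise sources. For example, imagine statistically assessing the output of almost any reasonable PRNG: it would look close to ideal, even though its entropy can be no greater than that of its seed.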
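To make the PRNG thought experiment concrete, here is a small sketch that applies a Most Common Value style estimate (SP 800-90B Section 6.3.1) to the output of Python's built-in Mersenne Twister. The 16-bit seed space, the fixed seed, and the helper names are hypothetical choices for illustration only, and this is not the repository's implementation.

```python
import math
import random
from collections import Counter


def mcv_min_entropy(samples: bytes) -> float:
    """Most Common Value estimate over byte-sized samples, in bits per sample."""
    L = len(samples)
    p_hat = Counter(samples).most_common(1)[0][1] / L
    p_u = min(1.0, p_hat + 2.576 * math.sqrt(p_hat * (1.0 - p_hat) / (L - 1)))
    return -math.log2(p_u)


def prng_bytes(seed: int, n: int) -> bytes:
    """Deterministic PRNG output: entirely determined by the seed."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


# Hypothetical setup: the seed is drawn from only 2**16 possible values,
# so the whole output stream carries at most 16 bits of entropy.
SEED_BITS = 16
seed = 0x1234  # fixed here so the sketch is reproducible

data = prng_bytes(seed, 100_000)
per_sample = mcv_min_entropy(data)

if __name__ == "__main__":
    print(f"statistical estimate: {per_sample:.2f} bits per byte")
    print(f"implied total: {per_sample * len(data):.0f} bits")
    print(f"actual upper bound: {SEED_BITS} bits for the entire stream")
```

The statistical estimate only sees the output distribution, which looks nearly uniform, and has no way to notice that everything follows from a small seed.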

In SP 800-90B, any estimate for min-entropy must be based on an understanding of the system producing the numbers (i.e., black-box entropy estimation isn't, in general, possible). This design-based assessment is integrated as the $H_\text{submitter}$ value.
