It was <2020-05-20 śro 13:53>, when Stephan Mueller wrote:
> Am Mittwoch, 20. Mai 2020, 12:44:33 CEST schrieb Lukasz Stelmach:
>> It was <2020-05-20 śro 11:18>, when Stephan Mueller wrote:
>>> Am Mittwoch, 20. Mai 2020, 11:10:32 CEST schrieb Lukasz Stelmach:
>>>> It was <2020-05-20 śro 08:23>, when Stephan Mueller wrote:
>>>>> Am Dienstag, 19. Mai 2020, 23:25:51 CEST schrieb Łukasz Stelmach:
>>>>>
>>>>>> The value was estimaded with ea_iid[1] using on 10485760 bytes
>>>>>> read from the RNG via /dev/hwrng. The min-entropy value
>>>>>> calculated using the most common value estimate (NIST SP
>>>>>> 800-90P[2], section 6.3.1) was 7.964464.
>>>>> 
>>>>> I am sorry, but I think I did not make myself clear: testing
>>>>> random numbers post-processing with the statistical tools does NOT
>>>>> give any idea about the entropy rate. Thus, all that was
>>>>> calculated is the proper implementation of the post-processing
>>>>> operation and not the actual noise source.
>>>>> 
>>>>> What needs to happen is that we need access to raw, unconditioned
>>>>> data from the noise source that is analyzed with the statistical
>>>>> methods.
>>>> 
>>>> I did understand you and I assure you the data I tested were
>>>> obtained directly from RNGs. As I pointed before[1], that is how
>>>> /dev/hwrng works[2].
>>> 
>>> I understand that /dev/hwrng pulls the data straight from the
>>> hardware. But the data from the hardware usually is not obtained
>>> straight from the noise source.
>>> 
>>> Typically you have a noise source (e.g. a ring oscillator) whose data
>>> is digitized then fed into a compression function like an LFSR or a
>>> hash. Then a cryptographic operation like a CBC-MAC, hash or even a
>>> DRBG is applied to that data when the caller wants to have random
>>> numbers.

[...]

>>> In order to estimate entropy, we need the raw unconditioned data from
>>> the, say, ring oscillator and not from the (cryptographic) output
>>> operation.
>> 
>> Can you tell, why it matters in this case? If I understand correctly,
>> the quality field describes not the randomness created by the noise
>> generator but the one delivered by the driver to other software
>> components.
>
> The quality field is used by add_hwgenerator_randomness to increase
> the Linux RNG entropy estimator accordingly. This is the issue.
>
> And giving an entropy rate based on post-processed data is
> meaningless.
>
> The concern is, for example, that you use a DRBG that you seeded with,
> say, a zero buffer. You get perfect random data from it that no
> statistical test can disprove. Yet we know this data stream has zero
> entropy. Thus, we need to get to the source and assess its entropy.

Of course, this makes sense.

>>> That said, the illustrated example is typical for hardware RNGs. Yet
>>> it is never guaranteed to work that way. Thus, if you can point to
>>> architecture documentation of your specific hardware RNGs showing
>>> that the data read from the hardware is pure unconditioned noise
>>> data, then I have no objections to the patch.
>> 
>> I can tell for sure that this is the case for exynos-trng[1].
>
> So you are saying that the output for the exynos-trng is straight from
> a ring oscillator without any post-processing of any kind?
>
> If this is the case, I would like to suggest you add that statement to
> the git commit message with that reference. If so, then I would
> withdraw my objection.

Done. I will do some reaserch on iproc-rng200 and I will send v3 with
the altered commit message.


Thank you *very* much for your patience.
-- 
Łukasz Stelmach
Samsung R&D Institute Poland
Samsung Electronics