All of lore.kernel.org
 help / color / mirror / Atom feed
* [LTP] [PATCH] getrandom02: relax check for returned data
@ 2017-02-06 12:13 Jan Stancek
  2017-02-07 15:33 ` Cyril Hrubis
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Stancek @ 2017-02-06 12:13 UTC (permalink / raw)
  To: ltp

Current check function looks for frequency of returned
bytes, which sometimes fail for small buffers, where
one byte repeats.

For example, when getrandom() returns 10 bytes, and one byte
repeats we fail on this check:
	if (max > 0 && table[i] > max)

This patch is replacing it with more relaxed condition:
if at least one byte is non-zero, we consider it a success.

Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 testcases/kernel/syscalls/getrandom/getrandom02.c | 46 ++++++-----------------
 1 file changed, 12 insertions(+), 34 deletions(-)

diff --git a/testcases/kernel/syscalls/getrandom/getrandom02.c b/testcases/kernel/syscalls/getrandom/getrandom02.c
index 0ac8bd28aed0..2b30c7f49f86 100644
--- a/testcases/kernel/syscalls/getrandom/getrandom02.c
+++ b/testcases/kernel/syscalls/getrandom/getrandom02.c
@@ -38,14 +38,11 @@ static int modes[] = { 0, GRND_RANDOM, GRND_NONBLOCK,
 int TST_TOTAL = ARRAY_SIZE(modes);
 
 static unsigned char buf[256];
-static size_t size = 256;
-
-static void fill(void);
-static int check_content(int nb);
+int size = sizeof(buf);
 
 int main(int ac, char **av)
 {
-	int lc, i;
+	int lc, i, j, tmp;
 
 	tst_parse_opts(ac, av, NULL, NULL);
 
@@ -53,7 +50,7 @@ int main(int ac, char **av)
 		tst_count = 0;
 
 		for (i = 0; i < TST_TOTAL; i++) {
-			fill();
+			memset(buf, '\0', size);
 
 			do {
 				TEST(ltp_syscall(__NR_getrandom, buf, size,
@@ -61,38 +58,19 @@ int main(int ac, char **av)
 			} while ((modes[i] & GRND_NONBLOCK) && TEST_RETURN == -1
 				&& TEST_ERRNO == EAGAIN);
 
-			if (!check_content(TEST_RETURN))
+			if (TEST_RETURN == -1) {
 				tst_resm(TFAIL | TTERRNO, "getrandom failed");
-			else
+			} else {
 				tst_resm(TPASS, "getrandom returned %ld",
 						TEST_RETURN);
+				tmp = 0;
+				for (j = 0; j < TEST_RETURN; j++)
+					tmp |= buf[j];
+				if (tmp == 0)
+					tst_resm(TWARN, "all bytes from random"
+						" buffer are zero?");
+			}
 		}
 	}
 	tst_exit();
 }
-
-static void fill(void)
-{
-	memset(buf, '\0', sizeof(buf));
-}
-
-static int check_content(int nb)
-{
-	int table[256];
-	int i, index, max;
-
-	memset(table, 0, sizeof(table));
-
-	max = nb * 0.10;
-
-	for (i = 0; i < nb; i++) {
-		index = buf[i];
-		table[index]++;
-	}
-
-	for (i = 0; i < nb; i++) {
-		if (max > 0 && table[i] > max)
-			return 0;
-	}
-	return 1;
-}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [LTP] [PATCH] getrandom02: relax check for returned data
  2017-02-06 12:13 [LTP] [PATCH] getrandom02: relax check for returned data Jan Stancek
@ 2017-02-07 15:33 ` Cyril Hrubis
  2017-02-07 16:15   ` Jan Stancek
  0 siblings, 1 reply; 9+ messages in thread
From: Cyril Hrubis @ 2017-02-07 15:33 UTC (permalink / raw)
  To: ltp

Hi!
> +			} else {
>  				tst_resm(TPASS, "getrandom returned %ld",
>  						TEST_RETURN);
> +				tmp = 0;
> +				for (j = 0; j < TEST_RETURN; j++)
> +					tmp |= buf[j];
> +				if (tmp == 0)
> +					tst_resm(TWARN, "all bytes from random"
> +						" buffer are zero?");

This wouldn't for instance when the random output is filled with
non-zero constant bytes...

What about just fixing the max value to something as:

max = 3 + nb * 0.2;

The constat part should handle cases with small buffer and a few
repeating characters while for larger buffer it's neglectible.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [LTP] [PATCH] getrandom02: relax check for returned data
  2017-02-07 15:33 ` Cyril Hrubis
@ 2017-02-07 16:15   ` Jan Stancek
  2017-02-07 17:30     ` Cyril Hrubis
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Stancek @ 2017-02-07 16:15 UTC (permalink / raw)
  To: ltp





----- Original Message -----
> From: "Cyril Hrubis" <chrubis@suse.cz>
> To: "Jan Stancek" <jstancek@redhat.com>
> Cc: ltp@lists.linux.it
> Sent: Tuesday, 7 February, 2017 4:33:52 PM
> Subject: Re: [LTP] [PATCH] getrandom02: relax check for returned data
> 
> Hi!
> > +			} else {
> >  				tst_resm(TPASS, "getrandom returned %ld",
> >  						TEST_RETURN);
> > +				tmp = 0;
> > +				for (j = 0; j < TEST_RETURN; j++)
> > +					tmp |= buf[j];
> > +				if (tmp == 0)
> > +					tst_resm(TWARN, "all bytes from random"
> > +						" buffer are zero?");
> 
> This wouldn't for instance when the random output is filled with
> non-zero constant bytes...
> 
> What about just fixing the max value to something as:
> 
> max = 3 + nb * 0.2;
> 
> The constat part should handle cases with small buffer and a few
> repeating characters while for larger buffer it's neglectible.

In that case we may as well skip the check for small buffers.
What size would make reasonably large sample? 64?

> 
> --
> Cyril Hrubis
> chrubis@suse.cz
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [LTP] [PATCH] getrandom02: relax check for returned data
  2017-02-07 16:15   ` Jan Stancek
@ 2017-02-07 17:30     ` Cyril Hrubis
  2017-02-08 11:23       ` Jiri Jaburek
  2017-02-08 13:14       ` Jan Stancek
  0 siblings, 2 replies; 9+ messages in thread
From: Cyril Hrubis @ 2017-02-07 17:30 UTC (permalink / raw)
  To: ltp

Hi!
> > This wouldn't for instance when the random output is filled with
> > non-zero constant bytes...
> > 
> > What about just fixing the max value to something as:
> > 
> > max = 3 + nb * 0.2;
> > 
> > The constat part should handle cases with small buffer and a few
> > repeating characters while for larger buffer it's neglectible.
> 
> In that case we may as well skip the check for small buffers.

The probability of a failure, given that the distribution is random,
would be:

256 * {N \choose max} / (256 ^ max)

Since you choose one of the possible 256 byte values, then you have N
tries to hit the value, so you choose all possible combinations, each of
them has probability 1/(256 ^ max).

I've tried to visualize the data with gnuplot and it looks like the
minimal reasonable constant part of the equation is 4, with anything
less we got probability of a failure in terms of percents. Bumping it to
4 the maximal probality for a failure is around 10^-5.

With max = 6 + N * 0.2 the maximal probability of a failure for a buffer
of a size around 10 is about 10^-13. So I would say that something like
this is pretty much safe. This also skips buffers smaller than 7 by
default. Or we may get even safer with something as max = 10 * N * 0.1
which has maximal failure probability about 10^-19 and skips buffers up
to size of a 11.

Try for youself:

gnuplot> comb(n,k) = gamma(n+1)/(gamma(k+1) * gamma(n - k + 1))
gnuplot> plot [6:30] 256 * comb(x, int(6 + x * 0.2)) / (256 ** int(6 + x * 0.2))

Note that we got series of spikes where each one is much smaller than
the previous one (this is because the max is not continous and "hops"
every time the x * 0.2 is integer number.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [LTP] [PATCH] getrandom02: relax check for returned data
  2017-02-07 17:30     ` Cyril Hrubis
@ 2017-02-08 11:23       ` Jiri Jaburek
  2017-02-08 13:32         ` Cyril Hrubis
  2017-02-08 13:14       ` Jan Stancek
  1 sibling, 1 reply; 9+ messages in thread
From: Jiri Jaburek @ 2017-02-08 11:23 UTC (permalink / raw)
  To: ltp

On 02/07/17 18:30, Cyril Hrubis wrote:
> Hi!
>>> This wouldn't for instance when the random output is filled with
>>> non-zero constant bytes...
>>>
>>> What about just fixing the max value to something as:
>>>
>>> max = 3 + nb * 0.2;
>>>
>>> The constat part should handle cases with small buffer and a few
>>> repeating characters while for larger buffer it's neglectible.
>>
>> In that case we may as well skip the check for small buffers.
> 
> The probability of a failure, given that the distribution is random,
> would be:
> 
> 256 * {N \choose max} / (256 ^ max)
> 
<snip>

If you're interested in that level of randomness verification (although
I would suggest splitting it from the getrandom syscall, so it could
test ie. /dev/urandom output as well as both should be internally the
same), take a look at rng-tools implementation of some of FIPS-140 at

http://gkernel.cvs.sourceforge.net/viewvc/gkernel/rng-tools/

(fips.c / rngtest.c)

This would AFAIK cover most catastrophic failures of HW RNGs, although
on Linux, you're really testing just the CSPRNG,
http://lxr.free-electrons.com/source/drivers/char/random.c

Jiri


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [LTP] [PATCH] getrandom02: relax check for returned data
  2017-02-07 17:30     ` Cyril Hrubis
  2017-02-08 11:23       ` Jiri Jaburek
@ 2017-02-08 13:14       ` Jan Stancek
  2017-02-08 14:08         ` Cyril Hrubis
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Stancek @ 2017-02-08 13:14 UTC (permalink / raw)
  To: ltp

On 02/07/2017 06:30 PM, Cyril Hrubis wrote:
> Hi!
>>> This wouldn't for instance when the random output is filled with
>>> non-zero constant bytes...
>>>
>>> What about just fixing the max value to something as:
>>>
>>> max = 3 + nb * 0.2;
>>>
>>> The constat part should handle cases with small buffer and a few
>>> repeating characters while for larger buffer it's neglectible.
>>
>> In that case we may as well skip the check for small buffers.
> 
> The probability of a failure, given that the distribution is random,
> would be:
> 
> 256 * {N \choose max} / (256 ^ max)
> 
> Since you choose one of the possible 256 byte values, then you have N
> tries to hit the value, so you choose all possible combinations, each of
> them has probability 1/(256 ^ max).
> 
> I've tried to visualize the data with gnuplot and it looks like the
> minimal reasonable constant part of the equation is 4, with anything
> less we got probability of a failure in terms of percents. Bumping it to
> 4 the maximal probality for a failure is around 10^-5.
> 
> With max = 6 + N * 0.2 the maximal probability of a failure for a buffer
> of a size around 10 is about 10^-13. So I would say that something like
> this is pretty much safe. This also skips buffers smaller than 7 by
> default. Or we may get even safer with something as max = 10 * N * 0.1
> which has maximal failure probability about 10^-19 and skips buffers up
> to size of a 11.
> 
> Try for youself:
> 
> gnuplot> comb(n,k) = gamma(n+1)/(gamma(k+1) * gamma(n - k + 1))
> gnuplot> plot [6:30] 256 * comb(x, int(6 + x * 0.2)) / (256 ** int(6 + x * 0.2))

This does not look correct to me. Try it with old formula, it goes above 1:
gnuplot> comb(n,k) = gamma(n+1)/(gamma(k+1) * gamma(n - k + 1))
gnuplot> plot [6:30] 256 * comb(x, int(x * 0.1)) / (256 ** int(x * 0.1))

---

With old formula, for N<20 (and max=1), this is pretty much birthday
problem [1] with smaller pool - we don't have 365 days, we have 256 bytes.

With birthday problem you get 50% chance of having 2 people with same
birthday if number of people is 23. Since we have smaller pool, I'd expect
we reach 50% chance sooner.

With N>=20, max increases and it gets a lot more complicated. Best advice
I've found online was to use Poisson approximation [2].

Based on that (see attached python snippet) I'm getting that:
"x*0.1" is quite bad, specially for N=19
"6+x*0.2" is a lot better, with chance ~10^-16

I'll send v2, with "6+x*0.2".

Regards,
Jan

[1] https://en.wikipedia.org/wiki/Birthday_problem
[2] http://math.stackexchange.com/questions/25876/probability-of-3-people-in-a-room-of-30-having-the-same-birthday

-------------- next part --------------
A non-text attachment was scrubbed...
Name: birthday_problem.py
Type: text/x-python
Size: 650 bytes
Desc: not available
URL: <http://lists.linux.it/pipermail/ltp/attachments/20170208/b3ca6ad4/attachment.py>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [LTP] [PATCH] getrandom02: relax check for returned data
  2017-02-08 11:23       ` Jiri Jaburek
@ 2017-02-08 13:32         ` Cyril Hrubis
  0 siblings, 0 replies; 9+ messages in thread
From: Cyril Hrubis @ 2017-02-08 13:32 UTC (permalink / raw)
  To: ltp

Hi!
> > The probability of a failure, given that the distribution is random,
> > would be:
> > 
> > 256 * {N \choose max} / (256 ^ max)
> > 
> <snip>
> 
> If you're interested in that level of randomness verification (although
> I would suggest splitting it from the getrandom syscall, so it could
> test ie. /dev/urandom output as well as both should be internally the
> same), take a look at rng-tools implementation of some of FIPS-140 at
> 
> http://gkernel.cvs.sourceforge.net/viewvc/gkernel/rng-tools/
> 
> (fips.c / rngtest.c)
> 
> This would AFAIK cover most catastrophic failures of HW RNGs, although
> on Linux, you're really testing just the CSPRNG,
> http://lxr.free-electrons.com/source/drivers/char/random.c

All I strive for for the getrandom test is to have some reasonable
verification that the buffer was filled with something that looks at
least a bit like random data, so that we rule out cases where the buffer
was only partially filled or not filled in at all. The hard part is
where to draw the line between the complexity of the check and
probability of a failure...

Adding real entropy tests to LTP sounds like a good excercise though,
maybe I will have a look into that later :-).

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [LTP] [PATCH] getrandom02: relax check for returned data
  2017-02-08 13:14       ` Jan Stancek
@ 2017-02-08 14:08         ` Cyril Hrubis
  2017-02-08 14:11           ` Cyril Hrubis
  0 siblings, 1 reply; 9+ messages in thread
From: Cyril Hrubis @ 2017-02-08 14:08 UTC (permalink / raw)
  To: ltp

Hi!
> > gnuplot> comb(n,k) = gamma(n+1)/(gamma(k+1) * gamma(n - k + 1))
> > gnuplot> plot [6:30] 256 * comb(x, int(6 + x * 0.2)) / (256 ** int(6 + x * 0.2))
> 
> This does not look correct to me. Try it with old formula, it goes above 1:
> gnuplot> comb(n,k) = gamma(n+1)/(gamma(k+1) * gamma(n - k + 1))
> gnuplot> plot [6:30] 256 * comb(x, int(x * 0.1)) / (256 ** int(x * 0.1))

You are right, this would be for if we fail the test test when
table[i] >= max, which would fail with 100% probability for most of the smaller
buffers, hence we get results that does not make much sense. We have to
add 1 to the int(x * 0.1) in the formula.

Try plotting this:

plot [10:60] 256 * comb(x, int(x * 0.1) + 1) / (256 ** (int(x * 0.1) + 1))

which pretty much shows the same you are saying for N=19 the old formula
has about 70% chance of failing. Well your script with poisson
aproximation gets close to 50% but the order of magnitude seems to be
the same for both formulae.

> ---
> 
> With old formula, for N<20 (and max=1), this is pretty much birthday
> problem [1] with smaller pool - we don't have 365 days, we have 256 bytes.
> 
> With birthday problem you get 50% chance of having 2 people with same
> birthday if number of people is 23. Since we have smaller pool, I'd expect
> we reach 50% chance sooner.
> 
> With N>=20, max increases and it gets a lot more complicated. Best advice
> I've found online was to use Poisson approximation [2].
> 
> Based on that (see attached python snippet) I'm getting that:
> "x*0.1" is quite bad, specially for N=19
> "6+x*0.2" is a lot better, with chance ~10^-16

I'm getting the same with the fixed formula in gnuplot:

plot [10:60] 256 * comb(x, int(6 + x * 0.2) + 1) / (256 ** (int(6 + x * 0.2) + 1))

This shows a peak for 15 which is about ~10^-16.

> I'll send v2, with "6+x*0.2".

Agreed.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [LTP] [PATCH] getrandom02: relax check for returned data
  2017-02-08 14:08         ` Cyril Hrubis
@ 2017-02-08 14:11           ` Cyril Hrubis
  0 siblings, 0 replies; 9+ messages in thread
From: Cyril Hrubis @ 2017-02-08 14:11 UTC (permalink / raw)
  To: ltp

Hi!
> > Based on that (see attached python snippet) I'm getting that:
> > "x*0.1" is quite bad, specially for N=19
> > "6+x*0.2" is a lot better, with chance ~10^-16
> 
> I'm getting the same with the fixed formula in gnuplot:
> 
> plot [10:60] 256 * comb(x, int(6 + x * 0.2) + 1) / (256 ** (int(6 + x * 0.2) + 1))
> 
> This shows a peak for 15 which is about ~10^-16.
                         ^
			should have been 14 of course

I was typing faster than thinking again...

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-02-08 14:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-06 12:13 [LTP] [PATCH] getrandom02: relax check for returned data Jan Stancek
2017-02-07 15:33 ` Cyril Hrubis
2017-02-07 16:15   ` Jan Stancek
2017-02-07 17:30     ` Cyril Hrubis
2017-02-08 11:23       ` Jiri Jaburek
2017-02-08 13:32         ` Cyril Hrubis
2017-02-08 13:14       ` Jan Stancek
2017-02-08 14:08         ` Cyril Hrubis
2017-02-08 14:11           ` Cyril Hrubis

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.