netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Flaw in "random32: update the net random state on interrupt and activity"
@ 2020-08-08 15:26 George Spelvin
  2020-08-08 17:07 ` Andy Lutomirski
                   ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: George Spelvin @ 2020-08-08 15:26 UTC (permalink / raw)
  To: netdev
  Cc: w, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx,
	peterz, tytso, lkml.mplumb, stephen, George Spelvin

[-- Attachment #1: letter --]
[-- Type: text/plain, Size: 9210 bytes --]

I'm quite annoyed at the way that this discussion has drifted away from
the main problem, which is that f227e3ec3b5c is broken and needs to be
reverted.

The original bug report is accurate, and it's a critical issue.
My proposed fix (patch imminent) is described below.


THE PROBLEM:

The reseeding design in add_interrupt_randomness is fatally flawed.
Quite simply, drip-feeding entropy into a PRNG is pissing the entropy
straight down the drain.  It's a complete waste of time *and* it fatally
damages /dev/random.

This is an information-theoretic flaw which cannot be patched
by any amount of clever crypto.

People don't seem to be grasping this fundamental concept.  Ted,
I'm particularly disappointed in you; you *know* better.

First, why reseed at all?  We do it to reduce an attacker's ability
to predict the PRNG output, which in practice means to reduce their
knowledge of the PRNG's internal state.

So any discussion of reseeding begins by *assuming* an attacker has
captured the PRNG state.  If we weren't worried about this possibility,
we wouldn't need to reseed in the first place!

[Footnote: most crypto PRNGs also want rekeying before their cycle
length limit is reached, but these limits are more than 2^64 bits
nowadays and thus infinite in practice.]

If we add k bits of seed entropy to the (attacker-known!) state and let an
attacker see >= k bits of output, there is a trivial brute-force attack
on *any* deterministic PRNG algorithm: try all 2^k possible inputs and
choose the one that matches the observed output.

[Footnote: in real life, you don't have 2^k equally likely possibilities,
so you end up trying possibilities in decreasing order of likelihood,
giving the same average time to solve, but with a longer tail.
The implementation details are messier, but well known; RTFM of any
password-guessing software for details.]

If k is small, this is trivially easy.  Reseeding many times (large n)
doesn't help if you allow an attacker to observe PRNG output after each
small reseeding; the total effort is only n * 2^k.
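
To make the brute-force argument concrete, here is a toy userspace sketch
(the generator and all the numbers are made up for illustration; nothing
here is kernel code): an attacker who knows the pre-reseed state recovers
a k-bit drip-fed seed from a single observed output word.

```c
#include <stdint.h>

/* Stand-in deterministic PRNG step (xorshift32); the argument holds
 * for *any* deterministic generator. */
static uint32_t toy_step(uint32_t s)
{
	s ^= s << 13;
	s ^= s >> 17;
	s ^= s << 5;
	return s;
}

/* The attacker knows old_state.  The defender mixed in a k-bit seed
 * and emitted one output word.  Trying all 2^k candidates recovers
 * the seed, and with it the post-reseed state. */
static int32_t recover_seed(uint32_t old_state, unsigned int k,
			    uint32_t observed)
{
	for (uint32_t guess = 0; guess < (1u << k); guess++)
		if (toy_step(old_state ^ guess) == observed)
			return (int32_t)guess;
	return -1;	/* can't happen: the step is a bijection */
}
```

Because the xorshift step is a bijection, the recovered guess is unique;
with k = 20 (one interrupt's worth of TSC uncertainty) the loop is still
only a million iterations.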

The way to prevent this from becoming a problem is well-known: ensure
that k is large.  Save up the n seeds and reseed once, making the
brute-force effort 2^(n*k).  This is called "catastrophic reseeding"
and is described in more detail at e.g.
https://www.schneier.com/academic/paperfiles/paper-prngs.pdf
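
The difference in attacker work is easy to state as code (illustrative
arithmetic only; the helper names are mine): n separate k-bit reseeds
cost n * 2^k guesses to track, while the same bits delivered in one
catastrophic reseed cost 2^(n*k).

```c
#include <stdint.h>

/* Attacker work to track n drip-fed reseeds of k bits each:
 * brute-force each reseed as it happens. */
static uint64_t drip_work(unsigned int n, unsigned int k)
{
	return (uint64_t)n << k;	/* n * 2^k */
}

/* Attacker work when the same n*k bits land in one catastrophic
 * reseed: a single brute-force over the whole seed. */
static uint64_t catastrophic_work(unsigned int n, unsigned int k)
{
	return 1ULL << (n * k);		/* 2^(n*k), assuming n*k < 64 */
}
```

With n = 8 and k = 4, that's 128 guesses versus 2^32.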

But the amount of entropy in any one interrupt timing sample is small.
/dev/random assumes it lies somewhere between 1/64 and 2 bits.  If
our attacker is less skilled, they may have more subjective uncertainty,
but guessing a TSC delta to within 20 bits (0.26 ms at 4 GHz) doesn't
seem at all difficult.  Testing 2^20 possibilities is trivial.

*This* is why Willy's patch is useless.  No amount of deckchair
rearrangement or bikeshed painting will fix this core structural flaw.
It *must* be redesigned to do catastrophic reseeding.


Additionally, the entropy revealed by this attack is *also* the primary
source for /dev/random, destroying its security guarantee.  This elevates
the problem from "useless" to "disastrous".  This patch *must* be reverted
for 5.8.1.


In response to various messages saying "this is all theoretical; I won't
believe you until I see a PoC", all I can say is "I feel that you should
be aware that some asshole is signing your name to stupid letters."

Just like a buffer overflow, a working attack is plausible using a
combination of well-understood techniques.  It's ridiculous to go to
the effort of developing a full exploit when it's less work to fix the
problem in the first place.

I also notice a lot of amateurish handwaving about the security of
various primitives.  This is particularly frustrating, but I will refrain
from giving vent/rant to my feelings, instead referring people to the
various references linked from

https://security.stackexchange.com/questions/18197/why-shouldnt-we-roll-our-own


A SOLUTION:

Now, how to fix it?

First, what is the problem?  "Non-cryptographic PRNGs are predictable"
fits in the category of Not A Bug.  There are many applications for
a non-cryptographic PRNG in the kernel: self-tests, spreading wear
across flash memory, randomized backoff among cooperating processes,
and assigning IDs in protocols which aren't designed for DoS resistance
in the first place.

But apparently the network code is (ab)using lib/random32.c for choosing
TCP/UDP port numbers and someone has found a (not publicly disclosed)
attack which exploits the predictability to commit some mischief.

And apparently switching to the fastest secure PRNG currently
in the kernel (get_random_u32() using ChaCha + per-CPU buffers)
would cause too much performance penalty.

So the request is for a less predictable PRNG which is still extremely
fast.  It's specifically okay if the crypto is half-assed; this is
apparently some kind of nuisance (DoS?) attack rather than something
really valuable.

Gee, I seem to recall solving a very similar problem with the
dcache hash function.  I think I can do this.

An important design constraint is that we want low-latency random number
generation, not just high bandwidth.  Amortizing over bulk operations
is *not* okay.


Well, the best crypto primitive I know of for such an application is
SipHash.  Its 4x 64-bit words of state is only twice the size of the
current struct rnd_state.  Converting it from a a pseudo-random function
to a CRNG is some half-assed amateur cryptography, but at least it's a
robust and well-studied primitive.

So here's my proposal, for comment:  (I'll post an RFC patch shortly.)
- Replace the prandom_u32() generator with something that does
  2 rounds of SipHash.  (Could drop to 1 round if we get complaints.)
- Keep the per-CPU structure, to minimize cacheline bouncing.
- On 32-bit processors, use HSipHash to keep performance up.  In the
  spirit of half-assedness, this is weaker, but hopefully good enough,
  and while there are lots of such processors in routers and IoT devices,
  they aren't handling thousands of connections per second and so expose
  much less PRNG output.
- Via random_ready_callback, and periodically thereafter, do a full
  catastrophic reseed from the ChaCha generator.  There's no need for
  locking; tearing on updates doesn't do any harm.

Current plans I'm open to discussion about:
- Replace the current prandom_u32(), rather than adding yet another
  PRNG.  This keeps the patch smaller.
- Leave the 60-second reseed interval alone.  If someone can suggest a
  way to suppress reseeding when not needed, we could drop the interval
  without penalizing mobile devices.
- Leave the prandom_u32_state() code paths alone.  Those functions are
  used in self-test code and it's probably worth not changing the output.
  (The downside is misleading function names.)
- For robustness, seed each CPU's PRNG state independently.  We could
  use a global key and use SipHash itself to hash in the CPU number,
  which would be faster than ChaCha, but let's KISS.
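
For flavor only, here is a rough userspace sketch of the *shape* of the
proposal above.  The round count, state layout, and output extraction
here are placeholders of my own; the actual RFC patch may differ in
every detail, and per-CPU instances and ChaCha reseeding are omitted.

```c
#include <stdint.h>

#define ROTL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* One SipRound, the core permutation of SipHash. */
static void sipround(uint64_t v[4])
{
	v[0] += v[1]; v[1] = ROTL64(v[1], 13); v[1] ^= v[0]; v[0] = ROTL64(v[0], 32);
	v[2] += v[3]; v[3] = ROTL64(v[3], 16); v[3] ^= v[2];
	v[0] += v[3]; v[3] = ROTL64(v[3], 21); v[3] ^= v[0];
	v[2] += v[1]; v[1] = ROTL64(v[1], 17); v[1] ^= v[2]; v[2] = ROTL64(v[2], 32);
}

/* 4x 64-bit words of state, twice the size of struct rnd_state. */
struct toy_rnd_state {
	uint64_t v[4];
};

/* Placeholder generator: advance the state by 2 SipRounds per call
 * and extract 32 bits of output. */
static uint32_t toy_prandom_u32(struct toy_rnd_state *s)
{
	sipround(s->v);
	sipround(s->v);
	return (uint32_t)(s->v[0] ^ s->v[3]);
}
```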


INDIVIDUAL REPLIES:

Jason A. Donenfeld: You're quite right about the accretion of
	superstitious incantations in random.c, but I do think it's a
	fundamentally sound design, just with an inelegant implementation.
	A comparison to SHACAL1 is not appropriate.  SHACAL is invertible
	because it leaves off the final Davies-Meyer add-back; random.c
	uses SHA-1 *with* the add-back, so the compression function
	is not invertible.  The most maliciously backdoored rdrand
	implementation imaginable can't analyze the entropy pool and
	choose its output so as to leak data to the Bavarian Illuminati.
	That's laughably impractical.

Willy Tarreau: Middle square + Weyl sequence isn't even *close* to
	crypto-quality.  And you're badly misunderstanding what the
	fast_pool is doing.  Please stop trying to invent your own crypto
	primitives; see
	https://www.schneier.com/crypto-gram/archives/1998/1015.html#cipherdesign

Ted Ts'o: Actually, I'm (slowly) working through a complete audit of all
	RNG users in the kernel.  Current code offers four basic
	levels of security:
	- Trivial LCG (next_pseudo_random32, preserved for compatibility)
	- Trivially predictable (but statistically good) LFSR
	- Cryptographically strong ChaCha, batched
	- Cryptographically strong ChaCha, with anti-backtracking.
	(Because I'm working strong-to-weak, I've mostly downgraded
	call sites so far.  There's also the earlyrandom stuff, which
	is its own mess.)

	Do we want an additional "hopefully strong" level in the middle
	for TCP ports and other visible IDs only, or should we replace
	the LFSR and not have a conventional PRNG at all?

Andy Lutomirski: I also have some code for combining the 32- and 64-bit
	entropy batches, and I've generalized it to return bytes as well,
	because *most* users of get_random_bytes() in the kernel don't
	need anti-backtracking.  But my understanding from Linus is that
	for this application, amortized time is *not* okay; we want a
	fast 99th percentile.


PERSONAL NOTE:

Events this year have left me without the bandwidth to keep up with kernel
development and so I've been AFK for some months.  My problems haven't
gone away, but this issue is important enough that I'll be making time
to see it through to resolution.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 15:26 Flaw in "random32: update the net random state on interrupt and activity" George Spelvin
@ 2020-08-08 17:07 ` Andy Lutomirski
  2020-08-08 18:08   ` Willy Tarreau
                     ` (2 more replies)
  2020-08-08 17:44 ` Willy Tarreau
  2020-08-09  6:57 ` [DRAFT PATCH] random32: make prandom_u32() output unpredictable George Spelvin
  2 siblings, 3 replies; 68+ messages in thread
From: Andy Lutomirski @ 2020-08-08 17:07 UTC (permalink / raw)
  To: George Spelvin
  Cc: netdev, w, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen


> On Aug 8, 2020, at 8:29 AM, George Spelvin <lkml@sdf.org> wrote:
> 

> And apparently switching to the fastest secure PRNG currently
> in the kernel (get_random_u32() using ChaCha + per-CPU buffers)
> would cause too much performance penalty.

Can someone explain *why* the slow path latency is particularly relevant here?  What workload has the net code generating random numbers in a place where even a whole microsecond is a problem as long as the amortized cost is low?  (I’m not saying I won’t believe this matters, but it’s not obvious to me that it matters.)

>    - Cryptographically strong ChaCha, batched
>    - Cryptographically strong ChaCha, with anti-backtracking.

I think we should just anti-backtrack everything.  With the “fast key erasure” construction, already implemented in my patchset for the buffered bytes, this is extremely fast.
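
A sketch of what "fast key erasure" means, with a stand-in mixing
function (splitmix64 here, purely for illustration; the real
construction uses the stream cipher's own keystream, e.g. ChaCha):

```c
#include <stdint.h>

/* Stand-in PRF; a real implementation would use ChaCha keystream
 * blocks generated under the current key. */
static uint64_t prf(uint64_t key, uint64_t block)
{
	uint64_t z = key + 0x9e3779b97f4a7c15ULL * (block + 1);
	z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
	z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
	return z ^ (z >> 31);
}

struct fke_state {
	uint64_t key;
};

/* Fast key erasure: the first block derived from the current key
 * becomes the *next* key, and the old key is overwritten before
 * returning.  Capturing the state later cannot be wound back to
 * recover earlier outputs. */
static uint64_t fke_next(struct fke_state *s)
{
	uint64_t next_key = prf(s->key, 0);
	uint64_t out = prf(s->key, 1);

	s->key = next_key;	/* erase the old key */
	return out;
}
```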

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 15:26 Flaw in "random32: update the net random state on interrupt and activity" George Spelvin
  2020-08-08 17:07 ` Andy Lutomirski
@ 2020-08-08 17:44 ` Willy Tarreau
  2020-08-08 18:19   ` Linus Torvalds
                     ` (2 more replies)
  2020-08-09  6:57 ` [DRAFT PATCH] random32: make prandom_u32() output unpredictable George Spelvin
  2 siblings, 3 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-08 17:44 UTC (permalink / raw)
  To: George Spelvin
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen

On Sat, Aug 08, 2020 at 03:26:28PM +0000, George Spelvin wrote:
> So any discussion of reseeding begins by *assuming* an attacker has
> captured the PRNG state.  If we weren't worried about this possibility,
> we wouldn't need to reseed in the first place!

Sure, but what matters is the time it takes to capture the state.
In the previous case it took Amit only a few packets.  If capture
takes longer than the reseeding period, then the state is no longer
captured.  That was exactly the point there: break the certainty of
the observations so that the state cannot be reliably guessed anymore,
hence cancelling the concerns about the input bits being guessed.  For
sure, if it is assumed that the state is guessed nevertheless, this
doesn't work anymore, but that wasn't the assumption.

> Just like a buffer overflow, a working attack is plausible using a
> combination of well-understood techniques.  It's ridiculous to go to
> the effort of developing a full exploit when it's less work to fix the
> problem in the first place.

I totally agree with this principle.  However, systematically starting
with assumptions like "the PRNG's state is known to the attacker",
when obtaining that state is the whole purpose of the attack, is
something which warrants a bit more explanation.  Otherwise I could as
well say in other contexts "let's assume I know the private key of this
site's certificate, after all it's only 2048 bits".  And the reason why
we're looping here is that all explanations about the risk of losing
entropy are based on the assumption that the attacker has already won.
The patch was made to keep that assumption sufficiently far away;
otherwise it's useless.

> A SOLUTION:
> 
> Now, how to fix it?
> 
> First, what is the problem?  "Non-cryptographic PRNGs are predictable"
> fits in the category of Not A Bug.  There are many applications for
> a non-cryptographic PRNG in the kernel: self-tests, spreading wear
> across flash memory, randomized backoff among cooperating processes,
> and assigning IDs in protocols which aren't designed for DoS resistance
> in the first place.
> 
> But apparently the network code is (ab)using lib/random32.c for choosing
> TCP/UDP port numbers and someone has found a (not publicly disclosed)
> attack which exploits the predictability to commit some mischief.

Note, it's beyond predictability, it's reversibility at this point.

> And apparently switching to the fastest secure PRNG currently
> in the kernel (get_random_u32() using ChaCha + per-CPU buffers)
> would cause too much performance penalty.

Based on some quick userland tests, apparently yes.

> So the request is for a less predictable PRNG which is still extremely
> fast.

That was the goal around the MSWS PRNG, as it didn't leak its bits
and would require a certain amount of brute force.

> Well, the best crypto primitive I know of for such an application is
> SipHash.  Its 4x 64-bit words of state are only twice the size of the
> current struct rnd_state.  Converting it from a pseudo-random function
> to a CRNG is some half-assed amateur cryptography, but at least it's a
> robust and well-studied primitive.

If it's considered cryptographically secure, it's OK to feed it with
just a counter starting from a random value, since the only way to guess
the current counter state is to brute-force it, right?  I've done this in
the appended patch.  Note that for now I've commented out all the
reseeding code because I just wanted to test the impact.

In my perf measurements I've had some erratic behaviors, one test showing
almost no loss and another one showing a lot, which didn't make sense, so
I'll need to measure again under better conditions (i.e. not my laptop
with a variable-frequency CPU).

> Willy Tarreau: Middle square + Weil sequence isn't even *close* to
> 	crypto-quality.

Absolutely, I've even said this myself.

>  And you're badly misunderstanding what the
> 	fast_pool is doing.  Please stop trying to invent your own crypto
> 	primitives; see
> 	https://www.schneier.com/crypto-gram/archives/1998/1015.html#cipherdesign

I know this pretty well and am NOT trying to invent crypto; I'm not
qualified for that.  I'm not even trying to invent a PRNG myself because
there are some aspects I can't judge (such as measuring the sequence
quality).  This is why I proposed to reuse some published work.

There seems to be some general thinking among many crypto-savvy people
that everything in the world needs strong crypto, including PRNGs, and
that doesn't always help.  As you mentioned above, there are plenty of
valid use cases for weaker PRNGs, and my opinion (which may or may not
be shared) is that if guessing an internal state is expensive enough
for an attacker, either in terms of number of probes or in brute-force
time, it may be sufficient for certain use cases.  The main problem is
that every time someone says "oh, it seems easy to guess a 16-bit port",
it's almost impossible not to receive proposals to implement expensive
super-strong crypto that will be suitable for everything.  It would help
if security people would more often try to adapt to the context, or
explain why it still matters in a certain case.  Actually that's what
Jason tried, but as he already said, the covid outbreak arrived in the
middle of the work on this issue and stalled a lot of things.

Regards,
Willy


commit dec36680a3f4888a7c026f8d40f978eec31ce1a1
Author: Willy Tarreau <w@1wt.eu>
Date:   Sun Apr 19 17:31:46 2020 +0200

    WIP: random32: use siphash with a counter for prandom_u32

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d20ba1b104ca..2a41b21623ae 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1277,7 +1277,6 @@ void add_interrupt_randomness(int irq, int irq_flags)
 
 	fast_mix(fast_pool);
 	add_interrupt_bench(cycles);
-	this_cpu_add(net_rand_state.s1, fast_pool->pool[cycles & 3]);
 
 	if (unlikely(crng_init == 0)) {
 		if ((fast_pool->count >= 64) &&
diff --git a/include/linux/random.h b/include/linux/random.h
index 9ab7443bd91b..9e22973b207c 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -12,6 +12,7 @@
 #include <linux/list.h>
 #include <linux/once.h>
 #include <asm/percpu.h>
+#include <linux/siphash.h>
 
 #include <uapi/linux/random.h>
 
@@ -117,7 +118,8 @@ void prandom_seed(u32 seed);
 void prandom_reseed_late(void);
 
 struct rnd_state {
-	__u32 s1, s2, s3, s4;
+	siphash_key_t key;
+	uint64_t counter;
 };
 
 DECLARE_PER_CPU(struct rnd_state, net_rand_state);
@@ -161,12 +163,14 @@ static inline u32 __seed(u32 x, u32 m)
  */
 static inline void prandom_seed_state(struct rnd_state *state, u64 seed)
 {
+#if 0
 	u32 i = (seed >> 32) ^ (seed << 10) ^ seed;
 
 	state->s1 = __seed(i,   2U);
 	state->s2 = __seed(i,   8U);
 	state->s3 = __seed(i,  16U);
 	state->s4 = __seed(i, 128U);
+#endif
 }
 
 #ifdef CONFIG_ARCH_RANDOM
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 026ac01af9da..c9d839c2b179 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1743,13 +1743,6 @@ void update_process_times(int user_tick)
 	scheduler_tick();
 	if (IS_ENABLED(CONFIG_POSIX_TIMERS))
 		run_posix_cpu_timers();
-
-	/* The current CPU might make use of net randoms without receiving IRQs
-	 * to renew them often enough. Let's update the net_rand_state from a
-	 * non-constant value that's not affine to the number of calls to make
-	 * sure it's updated when there's some activity (we don't care in idle).
-	 */
-	this_cpu_add(net_rand_state.s1, rol32(jiffies, 24) + user_tick);
 }
 
 /**
diff --git a/lib/random32.c b/lib/random32.c
index 3d749abb9e80..aa83cade911a 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -59,13 +59,8 @@ DEFINE_PER_CPU(struct rnd_state, net_rand_state);
  */
 u32 prandom_u32_state(struct rnd_state *state)
 {
-#define TAUSWORTHE(s, a, b, c, d) ((s & c) << d) ^ (((s << a) ^ s) >> b)
-	state->s1 = TAUSWORTHE(state->s1,  6U, 13U, 4294967294U, 18U);
-	state->s2 = TAUSWORTHE(state->s2,  2U, 27U, 4294967288U,  2U);
-	state->s3 = TAUSWORTHE(state->s3, 13U, 21U, 4294967280U,  7U);
-	state->s4 = TAUSWORTHE(state->s4,  3U, 12U, 4294967168U, 13U);
-
-	return (state->s1 ^ state->s2 ^ state->s3 ^ state->s4);
+	state->counter++;
+	return siphash(&state->counter, sizeof(state->counter), &state->key);
 }
 EXPORT_SYMBOL(prandom_u32_state);
 
@@ -161,12 +156,14 @@ static u32 __extract_hwseed(void)
 static void prandom_seed_early(struct rnd_state *state, u32 seed,
 			       bool mix_with_hwseed)
 {
+#if 0
 #define LCG(x)	 ((x) * 69069U)	/* super-duper LCG */
 #define HWSEED() (mix_with_hwseed ? __extract_hwseed() : 0)
 	state->s1 = __seed(HWSEED() ^ LCG(seed),        2U);
 	state->s2 = __seed(HWSEED() ^ LCG(state->s1),   8U);
 	state->s3 = __seed(HWSEED() ^ LCG(state->s2),  16U);
 	state->s4 = __seed(HWSEED() ^ LCG(state->s3), 128U);
+#endif
 }
 
 /**
@@ -184,8 +181,6 @@ void prandom_seed(u32 entropy)
 	 */
 	for_each_possible_cpu(i) {
 		struct rnd_state *state = &per_cpu(net_rand_state, i);
-
-		state->s1 = __seed(state->s1 ^ entropy, 2U);
 		prandom_warmup(state);
 	}
 }
@@ -244,14 +239,8 @@ void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state)
 
 	for_each_possible_cpu(i) {
 		struct rnd_state *state = per_cpu_ptr(pcpu_state, i);
-		u32 seeds[4];
-
-		get_random_bytes(&seeds, sizeof(seeds));
-		state->s1 = __seed(seeds[0],   2U);
-		state->s2 = __seed(seeds[1],   8U);
-		state->s3 = __seed(seeds[2],  16U);
-		state->s4 = __seed(seeds[3], 128U);
 
+		get_random_bytes(state, sizeof(*state));
 		prandom_warmup(state);
 	}
 }

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 17:07 ` Andy Lutomirski
@ 2020-08-08 18:08   ` Willy Tarreau
  2020-08-08 18:13   ` Linus Torvalds
  2020-08-08 19:03   ` George Spelvin
  2 siblings, 0 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-08 18:08 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: George Spelvin, netdev, aksecurity, torvalds, edumazet, Jason,
	luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen

On Sat, Aug 08, 2020 at 10:07:51AM -0700, Andy Lutomirski wrote:
> 
> > On Aug 8, 2020, at 8:29 AM, George Spelvin <lkml@sdf.org> wrote:
> > 
> 
> > And apparently switching to the fastest secure PRNG currently
> > in the kernel (get_random_u32() using ChaCha + per-CPU buffers)
> > would cause too much performance penalty.
> 
> Can someone explain *why* the slow path latency is particularly relevant
> here?  What workload has the net code generating random numbers in a place
> where even a whole microsecond is a problem as long as the amortized cost is
> low?  (I'm not saying I won't believe this matters, but it's not obvious to
> me that it matters.)

It really depends on what is being done and how we want that code to
continue to be used.  Often it will be a matter of queue sizes or
concurrency in certain contexts under a lock.  For illustration, let's
say you want to use random numbers to choose sequence numbers for SYN
cookies (it's not what is done today): a 40G NIC can deliver 60 Mpps,
or one packet every ~16.7 ns.  If you periodically add 1 us, you end up
queuing ~60 extra packets each time it happens.  This might or might
not be acceptable.  Regarding concurrency, you could imagine some code
picking a random number in a small function running under a spinlock;
frequently adding 1 us there can be expensive.
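
The back-of-envelope arithmetic above, as code (numbers taken from the
example, not a measurement; the helper is hypothetical):

```c
/* Packets that accumulate during a processing stall: at 60 Mpps a
 * packet arrives every ~16.7 ns, so a 1 us stall backlogs ~60. */
static unsigned int backlog_pkts(double pkts_per_sec, double stall_sec)
{
	return (unsigned int)(pkts_per_sec * stall_sec + 0.5);
}
```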

To be clear, I'm not *that* worried about *some* extra latency.
However, I'm extremely worried about the risk of further increases
once some security researcher considers the PRNG not secure enough
again (especially once it starts to be used for security purposes
after having been claimed secure) and we replace it with another one
with a higher cost and a longer tail over which to amortize that cost.
And given that we're speaking about replacing a non-secure PRNG with
*something*, it doesn't seem really wise to start introducing new
constraints on the data path when there are probably a large enough
number of possibilities which do not require these constraints.

Last point, we must not rule out the possibility that some clever
researcher will later come up with new time-based attacks consisting
of observing the latencies of packet processing, explaining how they
can count the number of connections reaching a given host and reveal
certain useful information, and that we're then forced to slow down
all of them to proceed in constant time.

Just my two cents,
Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 17:07 ` Andy Lutomirski
  2020-08-08 18:08   ` Willy Tarreau
@ 2020-08-08 18:13   ` Linus Torvalds
  2020-08-08 19:03   ` George Spelvin
  2 siblings, 0 replies; 68+ messages in thread
From: Linus Torvalds @ 2020-08-08 18:13 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: George Spelvin, Netdev, Willy Tarreau, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Sat, Aug 8, 2020 at 10:07 AM Andy Lutomirski <luto@amacapital.net> wrote:
>
>
> > On Aug 8, 2020, at 8:29 AM, George Spelvin <lkml@sdf.org> wrote:
> >
>
> > And apparently switching to the fastest secure PRNG currently
> > in the kernel (get_random_u32() using ChaCha + per-CPU buffers)
> > would cause too much performance penalty.
>
> Can someone explain *why* the slow path latency is particularly relevant here?

You guys know what's relevant?  Reality.

The thing is, I see Spelvin's rant, and I've generally enjoyed our
encounters in the past, but I look back at the kinds of security
problems we've had, and they are the exact _reverse_ of what
cryptographers and Spelvin are worried about.

The fact is, nobody ever EVER had any practical issues with our
"secure hash function" even back when it was MD5, which is today
considered trivially breakable.

Thinking back on it, I don't think it was even md5. I think it was
half-md5, wasn't it?

So what have people have had _real_ security problems with in our
random generators - pseudo or not?

EVERY SINGLE problem I can remember was because some theoretical
crypto person said "I can't guarantee that" and removed real security
- or kept it from being merged.

Seriously.

We've had absolutely horrendous performance issues due to the
so-called "secure" random number thing blocking. End result: people
didn't use it.

We've had absolutely garbage fast random numbers because the crypto
people don't think performance matters, so the people who _do_ think
it matters just roll their own.

We've had "random" numbers that weren't, because random.c wanted to
only use inputs it can analyze, and as a result didn't use any
input at all, and every single embedded device ended up having the
exact same state (this ends up being intertwined with the latency
thing).

We've had years of the drivers/char/random.c not getting fixed because
Ted listens too much to the "serious crypto" guys, with the end result
that I then have to step in and say "f*ck that, we're doing this
RIGHT".

And RIGHT means caring about reality, not theory.

So instead of asking "why is the slow path relevant", the question
should be "why is some theoretical BS relevant", when history says it
has never ever mattered.

Our random number generation needs to be _practical_.

When Spelvin and Marc complain about us now taking the fastpool data
and adding it to the prandom pool and "exposing" it, they are full of
shit. That fastpool data gets whitened so much by two layers of
further mixing, that there is no way you'll ever see any sign of us
taking one word of it for other things.

I want to hear _practical_ attacks against this whole "let's mix in
the instruction pointer and the cycle counter into both the 'serious'
random number generator and the prandom one".

Because I don't think they exist. And I think it's actively dangerous
to continue on the path of thinking that stable and "serious"
algorithms that can be analyzed are better than ones that end up
depending on random hardware details.

I really think kernel random people need to stop this whole "in
theory" garbage, and embrace the "let's try to mix in small
non-predictable things in various places to actively make it hard to
analyze this".

Because at some point, "security by obscurity" is actually better than
"analyze this".

And we have decades of history to back this up. Stop with the crazy
"let's only use very expensive stuff that has been analyzed".

It's actively detrimental.

It's detrimental to performance, but it's also detrimental to real security.

And stop dismissing performance as "who cares". Because it's just
feeding into this bad loop.

               Linus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 17:44 ` Willy Tarreau
@ 2020-08-08 18:19   ` Linus Torvalds
  2020-08-08 18:53     ` Willy Tarreau
  2020-08-08 20:47     ` George Spelvin
  2020-08-08 19:18   ` Florian Westphal
  2020-08-08 20:08   ` George Spelvin
  2 siblings, 2 replies; 68+ messages in thread
From: Linus Torvalds @ 2020-08-08 18:19 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Sat, Aug 8, 2020 at 10:45 AM Willy Tarreau <w@1wt.eu> wrote:
>
>
>     WIP: random32: use siphash with a counter for prandom_u32

siphash is good.

But no:

> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -1277,7 +1277,6 @@ void add_interrupt_randomness(int irq, int irq_flags)
>
>         fast_mix(fast_pool);
>         add_interrupt_bench(cycles);
> -       this_cpu_add(net_rand_state.s1, fast_pool->pool[cycles & 3]);
>
>         if (unlikely(crng_init == 0)) {
>                 if ((fast_pool->count >= 64) &&
> --- a/include/linux/random.h
> +++ b/include/linux/random.h
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 026ac01af9da..c9d839c2b179 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -1743,13 +1743,6 @@ void update_process_times(int user_tick)
>         scheduler_tick();
>         if (IS_ENABLED(CONFIG_POSIX_TIMERS))
>                 run_posix_cpu_timers();
> -
> -       /* The current CPU might make use of net randoms without receiving IRQs
> -        * to renew them often enough. Let's update the net_rand_state from a
> -        * non-constant value that's not affine to the number of calls to make
> -        * sure it's updated when there's some activity (we don't care in idle).
> -        */
> -       this_cpu_add(net_rand_state.s1, rol32(jiffies, 24) + user_tick);
>  }

We're not going back to "don't add noise, just make a stronger
analyzable function".

I simply won't take it. See my previous email. I'm 100% fed up with
security people screwing up real security by trying to make things
overly analyzable.

If siphash is a good enough hash to make the pseudo-random state hard
to guess, then it's also a good enough hash to hide the small part of
the fast-pool we mix in.

And while security researchers may hate it because it's hard to
analyze, that's the POINT, dammit.

This "it's analyzable" makes it repeatable. And since mixing in
serious things is too slow, and people thus delay reseeding for too
long, then we reseed with random crap that doesn't hurt and that isn't
realistically analyzable.

              Linus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 18:19   ` Linus Torvalds
@ 2020-08-08 18:53     ` Willy Tarreau
  2020-08-08 20:47     ` George Spelvin
  1 sibling, 0 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-08 18:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Sat, Aug 08, 2020 at 11:19:01AM -0700, Linus Torvalds wrote:
> On Sat, Aug 8, 2020 at 10:45 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> >
> >     WIP: random32: use siphash with a counter for prandom_u32
> 
> siphash is good.
> 
> But no:
> 
> > --- a/drivers/char/random.c
> > +++ b/drivers/char/random.c
> > @@ -1277,7 +1277,6 @@ void add_interrupt_randomness(int irq, int irq_flags)
> >
> >         fast_mix(fast_pool);
> >         add_interrupt_bench(cycles);
> > -       this_cpu_add(net_rand_state.s1, fast_pool->pool[cycles & 3]);
> >
> >         if (unlikely(crng_init == 0)) {
> >                 if ((fast_pool->count >= 64) &&
> > --- a/include/linux/random.h
> > +++ b/include/linux/random.h
> > diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> > index 026ac01af9da..c9d839c2b179 100644
> > --- a/kernel/time/timer.c
> > +++ b/kernel/time/timer.c
> > @@ -1743,13 +1743,6 @@ void update_process_times(int user_tick)
> >         scheduler_tick();
> >         if (IS_ENABLED(CONFIG_POSIX_TIMERS))
> >                 run_posix_cpu_timers();
> > -
> > -       /* The current CPU might make use of net randoms without receiving IRQs
> > -        * to renew them often enough. Let's update the net_rand_state from a
> > -        * non-constant value that's not affine to the number of calls to make
> > -        * sure it's updated when there's some activity (we don't care in idle).
> > -        */
> > -       this_cpu_add(net_rand_state.s1, rol32(jiffies, 24) + user_tick);
> >  }
> 
> We're not going back to "don't add noise, just make a stronger
> analyzable function".
> 
> I simply won't take it. See my previous email. I'm 100% fed up with
> security people screwing up real security by trying to make things
> overly analyzable.
> 
> If siphash is a good enough hash to make the pseudo-random state hard
> to guess, then it's also a good enough hash to hide the small part of
> the fast-pool we mix in.

I'm totally fine with that. In fact, my secret goal there was to put
net_rand_state back to random32.c as a static and reinstall the
__latent_entropy that we had to remove :-)

I'll need to re-run more correct tests though. My measurements were
really erratic, with some of them showing half the HTTP request rate in
a test, which makes absolutely no sense and thus disqualifies my
measurements.

But if the results are correct enough I'm fine with continuing on this
one and forgetting MSWS.

> And while security researchers may hate it because it's hard to
> analyze, that's the POINT, dammit.

Actually I think there are two approaches, which explains the constant
disagreements in this area. Some people want something provably
difficult, and others want something that cannot be proven to be easy. 
Sadly by trying to please everyone we probably got something between
provably easy and not provably difficult :-/

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 17:07 ` Andy Lutomirski
  2020-08-08 18:08   ` Willy Tarreau
  2020-08-08 18:13   ` Linus Torvalds
@ 2020-08-08 19:03   ` George Spelvin
  2020-08-08 19:49     ` Andy Lutomirski
  2 siblings, 1 reply; 68+ messages in thread
From: George Spelvin @ 2020-08-08 19:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: netdev, w, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen

On Sat, Aug 08, 2020 at 10:07:51AM -0700, Andy Lutomirski wrote:
>>    - Cryptographically strong ChaCha, batched
>>    - Cryptographically strong ChaCha, with anti-backtracking.
> 
> I think we should just anti-backtrack everything.  With the "fast key 
> erasure" construction, already implemented in my patchset for the 
> buffered bytes, this is extremely fast.

The problem is that this is really *amortized* key erasure, and
requires large buffers to amortize the cost down to a reasonable
level.

E.g., if using 256-bit (32-byte) keys, 5% overhead would require generating
640 bytes at a time.

Are we okay with ~1K per core for this?  Which we might have to
throw away occasionally to incorporate fresh seed material?

You're right that the simplification in usage is a benefit.
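
To spell out the arithmetic behind the 640-byte figure: each rekey burns
one key's worth of generator output, so the overhead fraction is simply
key size divided by buffer size.  A trivial sketch (the helper name is
mine, not from any patch):

```c
#include <assert.h>

/*
 * Fast key erasure consumes key_bytes of generator output per refill,
 * so overhead = key_bytes / buf_bytes.  To cap the overhead at pct
 * percent, the buffer must hold at least key_bytes * 100 / pct bytes.
 */
static unsigned buf_bytes_for_overhead(unsigned key_bytes, unsigned pct)
{
    return key_bytes * 100 / pct;
}
```

With a 32-byte ChaCha key, 5% overhead needs a 640-byte buffer; a
320-byte buffer would correspond to roughly 10%.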

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 17:44 ` Willy Tarreau
  2020-08-08 18:19   ` Linus Torvalds
@ 2020-08-08 19:18   ` Florian Westphal
  2020-08-08 20:59     ` George Spelvin
  2020-08-08 21:18     ` Willy Tarreau
  2020-08-08 20:08   ` George Spelvin
  2 siblings, 2 replies; 68+ messages in thread
From: Florian Westphal @ 2020-08-08 19:18 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, netdev, aksecurity, torvalds, edumazet, Jason,
	luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen

Willy Tarreau <w@1wt.eu> wrote:
> diff --git a/include/linux/random.h b/include/linux/random.h
> index 9ab7443bd91b..9e22973b207c 100644
> --- a/include/linux/random.h
> +++ b/include/linux/random.h
> @@ -12,6 +12,7 @@
>  #include <linux/list.h>
>  #include <linux/once.h>
>  #include <asm/percpu.h>
> +#include <linux/siphash.h>
>  
>  #include <uapi/linux/random.h>
>  
> @@ -117,7 +118,8 @@ void prandom_seed(u32 seed);
>  void prandom_reseed_late(void);
>  
>  struct rnd_state {
> -	__u32 s1, s2, s3, s4;
> +	siphash_key_t key;
> +	uint64_t counter;
>  };

Does the siphash_key really need to be percpu?
The counter is different of course.
Alternative would be to siphash a few prandom_u32 results
if the extra u64 is too much storage.

>  DECLARE_PER_CPU(struct rnd_state, net_rand_state);
> @@ -161,12 +163,14 @@ static inline u32 __seed(u32 x, u32 m)
>   */
>  static inline void prandom_seed_state(struct rnd_state *state, u64 seed)
>  {
> +#if 0
>  	u32 i = (seed >> 32) ^ (seed << 10) ^ seed;
>  
>  	state->s1 = __seed(i,   2U);
>  	state->s2 = __seed(i,   8U);
>  	state->s3 = __seed(i,  16U);
>  	state->s4 = __seed(i, 128U);
> +#endif
>  }
[..]

Can't we keep prandom_u32 as-is...?  Most of the usage, esp. in the
packet schedulers, is fine.

I'd much rather have a prandom_u32_hashed() or whatever for
those cases where some bits might leak to the outside and then convert
those prandom_u32 users over to the siphashed version.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 19:03   ` George Spelvin
@ 2020-08-08 19:49     ` Andy Lutomirski
  2020-08-08 21:29       ` George Spelvin
  0 siblings, 1 reply; 68+ messages in thread
From: Andy Lutomirski @ 2020-08-08 19:49 UTC (permalink / raw)
  To: George Spelvin
  Cc: netdev, w, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen


> On Aug 8, 2020, at 12:03 PM, George Spelvin <lkml@sdf.org> wrote:
> 
> On Sat, Aug 08, 2020 at 10:07:51AM -0700, Andy Lutomirski wrote:
>>>   - Cryptographically strong ChaCha, batched
>>>   - Cryptographically strong ChaCha, with anti-backtracking.
>> 
>> I think we should just anti-backtrack everything.  With the "fast key 
>> erasure" construction, already implemented in my patchset for the 
>> buffered bytes, this is extremely fast.
> 
> The problem is that this is really *amortized* key erasure, and
> requires large buffers to amortize the cost down to a reasonable
> level.
> 
> E.g., if using 256-bit (32-byte) keys, 5% overhead would require generating
> 640 bytes at a time.
> 
> Are we okay with ~1K per core for this?  Which we might have to
> throw away occasionally to incorporate fresh seed material?

I don’t care about throwing this stuff away. My plan (not quite implemented yet) is to have a percpu RNG stream and never to do anything resembling mixing anything in. The stream is periodically discarded and reinitialized from the global “primary” pool instead.  The primary pool has a global lock. We do some vaguely clever trickery to arrange for all the percpu pools to reseed from the primary pool at different times.

Meanwhile the primary pool gets reseeded by the input pool on a schedule for catastrophic reseeding.

5% overhead to make a fresh ChaCha20 key each time sounds totally fine to me. The real issue is that the bigger we make this thing, the bigger the latency spike each time we run it.

Do we really need 256 bits of key erasure?  I suppose if we only replace half the key each time, we’re just asking for some cryptographer to run the numbers on a break-one-of-many attack and come up with something vaguely alarming.

I wonder if we get good performance by spreading out the work. We could, for example, have a 320 byte output buffer that get_random_bytes() uses and a 320+32 byte “next” buffer that is generated as the output buffer is used. When we finish the output buffer, the first 320 bytes of the next buffer becomes the current buffer and the extra 32 bytes becomes the new key (or nonce).  This will have lower worst case latency, but it will hit the cache lines more often, potentially hurting throughput.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 17:44 ` Willy Tarreau
  2020-08-08 18:19   ` Linus Torvalds
  2020-08-08 19:18   ` Florian Westphal
@ 2020-08-08 20:08   ` George Spelvin
  2020-08-08 20:47     ` Linus Torvalds
  2 siblings, 1 reply; 68+ messages in thread
From: George Spelvin @ 2020-08-08 20:08 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen

On Sat, Aug 08, 2020 at 07:44:51PM +0200, Willy Tarreau wrote:
> On Sat, Aug 08, 2020 at 03:26:28PM +0000, George Spelvin wrote:
>> So any discussion of reseeding begins by *assuming* an attacker has
>> captured the PRNG state.  If we weren't worried about this possibility,
>> we wouldn't need to reseed in the first place!
> 
> Sure, but what matters is the time it takes to capture the state.
> In the previous case it used to take Amit a few packets only. If it
> takes longer than the reseeding period, then it's not captured anymore.
> That was exactly the point there: break the certainty of the observations
> so that the state cannot be reliably guessed anymore, hence cancelling
> the concerns about the input bits being guessed. For sure if it is
> assumed that the state is guessed nevertheless, this doesn't work
> anymore, but that wasn't the assumption.

Just checking that you're aware: modern high-end network hardware does 
aggressive interrupt coalescing, and NAPI disables interrupts entirely.  So 
you can transfer 10K packets per interrupt in extreme cases.

*Especially* on the transmit side, where the only thing you need to do 
after transmission is recycle the buffer, so latency concerns are minimal.  
In the steady state, all buffer recycling is done by polling while 
queueing new packets. You only need an interrupt to reclaim the last few 
buffers when transmission pauses.

So assuming that once per interrupt equals "often" is completely false.

Not to mention that the generators are per-CPU, and the CPU generating the 
random numbers might not be the one getting what few interrupts there are.  
(It's quite common in networking applications to bind network interrupts 
and application logic to separate CPUs.)

This whole bit of logic just seems ridiculously fragile to me.

>> Just like a buffer overflow, a working attack is plausible using a
>> combination of well-understood techniques.  It's ridiculous to go to
>> the effort of developing a full exploit when it's less work to fix the
>> problem in the first place.

> I totally agree with this principle. However, systematically starting
> with assumptions like "the PRNG's state is known to the attacker" while
> it's the purpose of the attack doesn't exactly match something which
> does not warrant a bit more explanation.

I thought I explained it, though.  Reseeding only matters if you're
trying to disrupt an attacker's knowledge of the PRNG state.  Yes,
assuming complete knowledge is a simplified "spherical cow", but the
logic applies to more complex partial-knowledge attacks as well.

The point is that a solution which solves that limiting case automatically 
takes care of all the other attacks, too.

If we have to carefully quantify an attacker's knowledge, using careful 
and delicate analysis to show security, then it's both more work, and 
*brittle*: the analysis must be re-checked each time the code is modified.

It just seems like a really crappy slumlord-quality solution.

> Note, it's beyond predictability, it's reversibility at this point.

I'm not quite sure what the distinction you're making is, but
I don't disagree.  Conventional PRNGs are trivially extrapolatable
forward and backward.  The *real* problem is that the network code
is asking for a property the PRNG explicitly lacks.

Just remember that the property of *not* being extrapolatable is 
specifically and exactly what makes a CPRNG cryptographic.

>> So the request is for a less predictable PRNG which is still extremely
>> fast.
> 
> That was the goal around the MSWS PRNG, as it didn't leak its bits
> and would require a certain amount of brute force.

The techniques for MSWS cryptanalysis aren't as widely documented as
LFSRs and LCGs, but I'm *very* skeptical of this claim.  I'd like to use
the most robust primitive that's not prohibitively slow.

> If it's considered as cryptographically secure, it's OK to feed it with
> just a counter starting from a random value, since the only way to guess
> the current counter state is to brute-force it, right ? I've done this in
> the appended patch. Note that for now I've commented out all the reseeding
> code because I just wanted to test the impact. 
> 
> In my perf measurements I've had some erratic behaviors, one test showing
> almost no loss, and another one showing a lot, which didn't make sense, so
> I'll need to measure again under better condition (i.e. not my laptop with
> a variable frequency CPU).

Actually, I was going to use a somewhat weaker construction that
does 2 rounds per output number.  Standard SipHash does 2n+4
rounds to handle n words of input, so would do 6 rounds in this
standard construction.  The 3x performance gain seems worth it.

The standard construction *would* allow a great code simplification,
since the key could be global and shared, and the per-CPU data would
be limited to a counter.  But I don't think 6 rounds would be fast
enough.
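
As a rough illustration of the counter-mode construction under
discussion (a userspace sketch, not a finished patch: the
2-rounds-per-output choice, the struct layout, and the output word are
my assumptions):

```c
#include <stdint.h>

static inline uint64_t rotl64(uint64_t x, int r)
{
    return (x << r) | (x >> (64 - r));
}

/* One SipRound, as defined by Aumasson & Bernstein. */
#define SIPROUND(v0, v1, v2, v3) do {                                 \
    v0 += v1; v1 = rotl64(v1, 13); v1 ^= v0; v0 = rotl64(v0, 32);     \
    v2 += v3; v3 = rotl64(v3, 16); v3 ^= v2;                          \
    v0 += v3; v3 = rotl64(v3, 21); v3 ^= v0;                          \
    v2 += v1; v1 = rotl64(v1, 17); v1 ^= v2; v2 = rotl64(v2, 32);     \
} while (0)

/* Hypothetical per-CPU state: a 128-bit key plus a counter. */
struct sip_prng {
    uint64_t k0, k1;    /* seeded from get_random_bytes() at reseed */
    uint64_t counter;
};

/*
 * Counter-mode output with only 2 SipRounds per number, instead of the
 * 2n+4 = 6 of full SipHash-2-4 applied to one 64-bit word -- trading
 * security margin for speed, as described above.
 */
static uint32_t sip_prandom_u32(struct sip_prng *s)
{
    /* Standard SipHash initialization constants, XORed with the key. */
    uint64_t v0 = s->k0 ^ 0x736f6d6570736575ULL;
    uint64_t v1 = s->k1 ^ 0x646f72616e646f6dULL;
    uint64_t v2 = s->k0 ^ 0x6c7967656e657261ULL;
    uint64_t v3 = s->k1 ^ 0x7465646279746573ULL;

    v3 ^= s->counter;
    SIPROUND(v0, v1, v2, v3);
    SIPROUND(v0, v1, v2, v3);
    v0 ^= s->counter++;

    return (uint32_t)(v0 ^ v1 ^ v2 ^ v3);
}
```

With a single global key, the per-CPU state would shrink to the counter
alone, which is the simplification traded off above against the extra
rounds of the standard construction.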

> There seems to be some general thinking among many crypto-savvy people
> that everything in the world needs strong crypto, including PRNGs, and
> that doesn't always help.

No, it's not needed for everything, but it *greatly* simplifies the 
analysis.  I agree that the crypto world suffers from "everything looks 
like a nail" syndrome, but there's value in indestructible building 
blocks: it makes reasoning about the resultant structure ever so much 
easier.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 18:19   ` Linus Torvalds
  2020-08-08 18:53     ` Willy Tarreau
@ 2020-08-08 20:47     ` George Spelvin
  2020-08-08 20:52       ` Linus Torvalds
  1 sibling, 1 reply; 68+ messages in thread
From: George Spelvin @ 2020-08-08 20:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Willy Tarreau, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Sat, Aug 08, 2020 at 11:19:01AM -0700, Linus Torvalds wrote:
> If siphash is a good enough hash to make the pseudo-random state hard
> to guess, then it's also a good enough hash to hide the small part of
> the fast-pool we mix in.

No!

No, no, a thousand times no.

I *just* finished explaining: using dribs and drabs of entropy allows an 
*information-theoretic attack* which *no* crypto can prevent.

The drip-feed design of the current patch is a catastrophic antifeature
which must be anathematized.

You could replace SipHash with a random function, the platonic ideal
to which all other cryptographic functions are compared, and you'd
still have a hole.

The fact that *nothing* can fix bad seeding is the entire reason
/dev/random exists in the first place.


*Second*, I have no intention of using full SipHash.  I'm spending
all of that security margin making it as fast as possible, leaving
just enough to discourage the original attack.

As you just pointed out, half-MD4 (still used in fs/ext4/hash.c) is quite 
enough to discourage attack if used appropriately.

If you go and *vastly* increase the value of a successful attack, 
letting an attacker at long-lived high-value keys, I have to put all
that margin back.

(Not to mention, even full SipHash only offers 128-bit security in the 
first place!  Shall I go and delete AES-256 and SHA2-512, since we've 
decided the Linux kernel is capped at 128-bit security?)

It's not even remotely difficult: use the *output* of random.c.  
Making that safe is what all that code is for.

(It costs some cycles, but SipHash's strength lets you go long 
enough between reseeds that it amortizes to insignificance.)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 20:08   ` George Spelvin
@ 2020-08-08 20:47     ` Linus Torvalds
  0 siblings, 0 replies; 68+ messages in thread
From: Linus Torvalds @ 2020-08-08 20:47 UTC (permalink / raw)
  To: George Spelvin
  Cc: Willy Tarreau, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Sat, Aug 8, 2020 at 1:08 PM George Spelvin <lkml@sdf.org> wrote:
>
> So assuming that once per interrupt equals "often" is completely false.

That's not what people are assuming.

They are assuming it's "unpredictable".

Guys, look up the real and true meaning of "random" some day.

Guess what? It at no point says "secure hash".

> Not to mention that the generators are per-CPU, and the CPU generating the
> random numbers might not be the one getting what few interrupts there are.
> (It's quite common in networking applications to bind network interrupts
> and application logic to separate CPUs.)

.. which is exactly why the commit that introduced this _also_ does
things from timer interrupts.

And yes. We know. Timer interrupts are turned off when there's nothing going on.

But the sending side - if it's not responding to an interrupt -
explicitly does have something going on, so you actually end up having
timer interrupts.

And yes, both IP and the TSC are "predictable". In theory. Not in
reality, and particularly not remotely.

And again, we knew - and discussed - interrupts coalescing and not
happening under load.

And again - that's a completely specious and pointless argument.

If you can put a machine under such load that it goes into polling
mode _and_ you also control _all_ the network traffic to such an
extent that other actors don't matter, and you get all the data out
and can analyze it to the point of trying to figure out what the prng
internal buffers are, you are basically already on the same network
and are controlling the machine.

It's not an interesting attack, in other words.

I repeat: reality matters. Your theoretical arguments are irrelevant,
because they simply don't even apply.

Btw, don't get me wrong - I think we can improve on the actual hashing
on the prng too. But I'm not backing down on the whole "we need noise
too, and the noise is actually more important".

> This whole bit of logic just seems ridiculously fragile to me.

No, what is fragile is assuming that you can have a random number
generator that is an analyzable secure hash, and then expecting that
to be high-performance and unpredictable.

We can do both. I'm _hoping_ people can come up with a
high-performance hash that is better than what we have. But there was
absolutely nothing going on for months after the last report, and even
with a better high-performance hash, I will absolutely refuse to get
rid of the noise factor.

           Linus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 20:47     ` George Spelvin
@ 2020-08-08 20:52       ` Linus Torvalds
  2020-08-08 22:27         ` George Spelvin
  0 siblings, 1 reply; 68+ messages in thread
From: Linus Torvalds @ 2020-08-08 20:52 UTC (permalink / raw)
  To: George Spelvin
  Cc: Willy Tarreau, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Sat, Aug 8, 2020 at 1:47 PM George Spelvin <lkml@sdf.org> wrote:
>
> I *just* finished explaining: using dribs and drabs of entropy allows an
> *information-theoretic attack* which *no* crypto can prevent.

The key word here being "theoretical".

The other key word is "reality".

We will have to agree to disagree. I don't _care_ about the
theoretical holes. I care about the real ones.

We plugged a real one. Deal with it.

              Linus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 19:18   ` Florian Westphal
@ 2020-08-08 20:59     ` George Spelvin
  2020-08-08 21:18     ` Willy Tarreau
  1 sibling, 0 replies; 68+ messages in thread
From: George Spelvin @ 2020-08-08 20:59 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Willy Tarreau, netdev, aksecurity, torvalds, edumazet, Jason,
	luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen

On Sat, Aug 08, 2020 at 09:18:27PM +0200, Florian Westphal wrote:
> Can't we keep prandom_u32 as-is...?  Most of the usage, esp. in the
> packet schedulers, is fine.
> 
> I'd much rather have a prandom_u32_hashed() or whatever for
> those cases where some bits might leak to the outside and then convert
> those prandom_u32 users over to the siphashed version.

That's a question I've been asking.  Since this is apparently an
Important Security Bug that wants backported to -stable, I'm making
the minimally-invasive change, which is to change prandom_u32() for
all callers rather than decide which gets what.

But going forward, adding an additional security level between
the current prandom_u32() and get_random_u32() is possible.

I'm not sure it's a good idea, however.  This entire hullabaloo stems
from someone choosing the wrong PRNG.  Adding another option doesn't
seem likely to prevent a repetition in future.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 19:18   ` Florian Westphal
  2020-08-08 20:59     ` George Spelvin
@ 2020-08-08 21:18     ` Willy Tarreau
  1 sibling, 0 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-08 21:18 UTC (permalink / raw)
  To: Florian Westphal
  Cc: George Spelvin, netdev, aksecurity, torvalds, edumazet, Jason,
	luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen

Hi Florian,

On Sat, Aug 08, 2020 at 09:18:27PM +0200, Florian Westphal wrote:
> >  struct rnd_state {
> > -	__u32 s1, s2, s3, s4;
> > +	siphash_key_t key;
> > +	uint64_t counter;
> >  };
> 
> Does the siphash_key really need to be percpu?

I don't think so, I was really exploring a quick and easy change, and
given that it was convenient to put that into rnd_state to ease
adaptation to existing code, I left it here.

> The counter is different of course.
> Alternative would be to siphash a few prandom_u32 results
> if the extra u64 is too much storage.

I don't think we need to make it complicated. If we can implement a
partial siphash that's fast enough and that makes everyone happy, it
will also not take too much space so that could become an overall
improvement (and I'd say that reconciling everyone would also be a
huge improvement).

> >  DECLARE_PER_CPU(struct rnd_state, net_rand_state);
> > @@ -161,12 +163,14 @@ static inline u32 __seed(u32 x, u32 m)
> >   */
> >  static inline void prandom_seed_state(struct rnd_state *state, u64 seed)
> >  {
> > +#if 0
> >  	u32 i = (seed >> 32) ^ (seed << 10) ^ seed;
> >  
> >  	state->s1 = __seed(i,   2U);
> >  	state->s2 = __seed(i,   8U);
> >  	state->s3 = __seed(i,  16U);
> >  	state->s4 = __seed(i, 128U);
> > +#endif
> >  }
> [..]
> 
> Can't we keep prandom_u32 as-is...?  Most of the usage, esp. in the
> packet schedulers, is fine.
>
> I'd much rather have a prandom_u32_hashed() or whatever for
> those cases where some bits might leak to the outside and then convert
> those prandom_u32 users over to the siphashed version.

In fact I even thought about having a different name, such as "weak_u32"
or something like this for the parts where we need slightly more than a
purely predictable random, and keeping the existing function as-is. But
I discovered there's also another one which is sufficient for stats, it
just doesn't have the same interface. But I totally agree on the benefit
of keeping the fastest version available for packet schedulers. In fact
I've run some of my tests with iptables -m statistics --probability ...
However, if it turns out we can end up with a very fast PRNG (Tausworthe
was not *that* fast), it would be a pure benefit to switch to it. That's why
I don't want to rule out the possibility of a simple drop-in replacement.

By the way, if we end up keeping a different function for simpler cases,
we should probably find a better name for them like "predictable random"
or "trivial random" that makes users think twice. Doing that in a packet
scheduler is fine. The problem with the "pseudo random" term is that it
evolved over time from "totally predictable" to "may be strongly secure"
depending on implementations and that various people understand it
differently and have different expectations on it :-/

Cheers,
Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 19:49     ` Andy Lutomirski
@ 2020-08-08 21:29       ` George Spelvin
  0 siblings, 0 replies; 68+ messages in thread
From: George Spelvin @ 2020-08-08 21:29 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: netdev, w, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen

On Sat, Aug 08, 2020 at 12:49:30PM -0700, Andy Lutomirski wrote:

> I don't care about throwing this stuff away. My plan (not quite 
> implemented yet) is to have a percpu RNG stream and never to do anything 
> resembling mixing anything in. The stream is periodically discarded and 
> reinitialized from the global "primary" pool instead.  The primary pool 
> has a global lock. We do some vaguely clever trickery to arrange for all 
> the percpu pools to reseed from the primary pool at different times.
>
> Meanwhile the primary pool gets reseeded by the input pool on a schedule 
> for catastrophic reseeding.

Sounds good to me.

> Do we really need 256 bits of key erasure?  I suppose if we only replace 
> half the key each time, we're just asking for some cryptographer to run 
> the numbers on a break-one-of-many attack and come up with something 
> vaguely alarming.

It's possible to have different levels of overall and key-erasure 
security, but I'm not sure what the point is.  It doesn't change the 
numbers *that* much.

(But yes, if you do it, I like the idea of arranging the key 
overwrite so all of the key gets replaced after two passes.)

> I wonder if we get good performance by spreading out the work. We could, 
> for example, have a 320 byte output buffer that get_random_bytes() uses 
> and a 320+32 byte "next" buffer that is generated as the output buffer 
> is used. When we finish the output buffer, the first 320 bytes of the next 
> buffer becomes the current buffer and the extra 32 bytes becomes the new 
> key (or nonce).  This will have lower worst case latency, but it will 
> hit the cache lines more often, potentially hurting throughput.

You definitely lose something in locality of reference when you spread out 
the work, but you don't need a double-sized buffer (and the resultant 
D-cache hit). Every time you use up a block of current output, fill it 
with a block of next output.

The last 32 bytes of the buffer are the next key. When you've used up all 
of the current buffer but that, overwrite the last block of the current 
buffer with the next^2 key and start over at the beginning, outputting the 
was-next-now-current data.

In other words, with a 320-byte buffer, 320-32 = 288 bytes are available 
for output.  When we pass 64, 128, 256 and 288 bytes, there is a small 
latency spike to run one iteration of ChaCha.

The main issue is the latency between seeding and it affecting the output.  
In particular, I think people expect writes to /dev/random (RNDADDENTROPY) 
to affect subsequent reads immediately, so we'd need to invalidate & 
regenerate the buffer in that case.  We could do something with generation 
numbers so in-kernel users aren't affected.

(And remember that we don't have to fill the whole buffer.  If it's
early boot and we're expecting crng_init to increment, we could
pregenerate less.)
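
A simplified, all-at-once version of the buffer scheme described above
might look like this (userspace sketch with a real ChaCha20 block
function; the refill is done in one go rather than incrementally at the
64/128/... byte marks, and all names are mine):

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

static inline uint32_t rotl32(uint32_t x, int r)
{
    return (x << r) | (x >> (32 - r));
}

#define QR(a, b, c, d) do {                   \
    a += b; d ^= a; d = rotl32(d, 16);        \
    c += d; b ^= c; b = rotl32(b, 12);        \
    a += b; d ^= a; d = rotl32(d, 8);         \
    c += d; b ^= c; b = rotl32(b, 7);         \
} while (0)

/* One 64-byte ChaCha20 block keyed by 'key', block number 'ctr'. */
static void chacha20_block(const uint32_t key[8], uint32_t ctr, uint32_t out[16])
{
    static const uint32_t sigma[4] = {
        0x61707865, 0x3320646e, 0x79622d32, 0x6b206574
    };
    uint32_t x[16], in[16];
    int i;

    memcpy(in, sigma, 16);
    memcpy(in + 4, key, 32);
    in[12] = ctr;
    in[13] = in[14] = in[15] = 0;        /* nonce unused in this sketch */
    memcpy(x, in, sizeof(x));

    for (i = 0; i < 10; i++) {           /* 20 rounds = 10 double rounds */
        QR(x[0], x[4], x[8],  x[12]);
        QR(x[1], x[5], x[9],  x[13]);
        QR(x[2], x[6], x[10], x[14]);
        QR(x[3], x[7], x[11], x[15]);
        QR(x[0], x[5], x[10], x[15]);
        QR(x[1], x[6], x[11], x[12]);
        QR(x[2], x[7], x[8],  x[13]);
        QR(x[3], x[4], x[9],  x[14]);
    }
    for (i = 0; i < 16; i++)
        out[i] = x[i] + in[i];
}

#define BUF_WORDS 80                     /* 320 bytes = 5 ChaCha blocks */
#define KEY_WORDS 8                      /* 32-byte key */

struct fke {
    uint32_t key[KEY_WORDS];
    uint32_t buf[BUF_WORDS];
    size_t pos;                          /* next unread word */
};

static void fke_refill(struct fke *r)
{
    int blk;

    for (blk = 0; blk < BUF_WORDS / 16; blk++)
        chacha20_block(r->key, blk, &r->buf[blk * 16]);

    /* The last 32 bytes of the buffer become the next key ... */
    memcpy(r->key, &r->buf[BUF_WORDS - KEY_WORDS], 32);
    /* ... and are erased so they are never emitted as output. */
    memset(&r->buf[BUF_WORDS - KEY_WORDS], 0, 32);
    r->pos = 0;
}

static uint32_t fke_u32(struct fke *r)
{
    uint32_t v;

    if (r->pos >= BUF_WORDS - KEY_WORDS) /* only 288 bytes are output */
        fke_refill(r);
    v = r->buf[r->pos];
    r->buf[r->pos++] = 0;                /* erase output: no backtracking */
    return v;
}
```

Only 288 of the 320 bytes are ever handed out; the last 32 bytes become
the next key before any output leaves the buffer, and each consumed word
is zeroed so past outputs can't be reconstructed from captured state.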



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 20:52       ` Linus Torvalds
@ 2020-08-08 22:27         ` George Spelvin
  2020-08-09  2:07           ` Linus Torvalds
  0 siblings, 1 reply; 68+ messages in thread
From: George Spelvin @ 2020-08-08 22:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Willy Tarreau, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Sat, Aug 08, 2020 at 01:52:37PM -0700, Linus Torvalds wrote:
> On Sat, Aug 8, 2020 at 1:47 PM George Spelvin <lkml@sdf.org> wrote:
>> I *just* finished explaining: using dribs and drabs of entropy allows an
>> *information-theoretic attack* which *no* crypto can prevent.
> 
> The key word here being "theoretical".
> 
> The other key word is "reality".
> 
> We will have to agree to disagree. I don't _care_ about the
> theoretical holes. I care about the real ones.

It's not a theoretical hole, it's a very real one.  Other than the cycles 
to do the brute-force part, it's not even all that complicated.  The 
theory part is that it's impossible to patch.

*If* you do the stupid thing.  WHICH YOU COULD JUST STOP DOING.

> We plugged a real one. Deal with it.

Then explain it to me.  What is the actual *problem*?  Nobody's described 
one, so I've been guessing.  What is this *monumentally stupid* abuse of 
/dev/random allegedly fixing?

If you're not an idiot, explain.

Because right now you sound like one.  There's a simple and easy fix which 
I've described and will get back to implementing as soon as I've finished 
yelling at you.  What, FFS, is your objection to considering it?

I'm trying to implement a solution that satisfies everyone's requirements 
*including* the absence of catastrophic security holes.  If there's some 
requirement I'm not satisfying, please tell me.  Just please don't say "I 
prefer doing the stupid thing to changing my mind."  I hear enough of that 
on the news.

I can deal with it *personally* by patching it out of my private kernels, 
but I'd really rather it doesn't get deployed to a billion devices before 
someone exploits it.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-08 22:27         ` George Spelvin
@ 2020-08-09  2:07           ` Linus Torvalds
  2020-08-11 16:01             ` Eric Dumazet
  0 siblings, 1 reply; 68+ messages in thread
From: Linus Torvalds @ 2020-08-09  2:07 UTC (permalink / raw)
  To: George Spelvin
  Cc: Willy Tarreau, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Sat, Aug 8, 2020 at 3:28 PM George Spelvin <lkml@sdf.org> wrote:
>
> It's not a theoretical hole, it's a very real one.  Other than the cycles
> to do the brute-force part, it's not even all that complicated.  The
> theory part is that it's impossible to patch.

We'll just disagree.

I'm really fed up with security holes that are brought on by crypto
people not being willing to just do reasonable things.

> *If* you do the stupid thing.  WHICH YOU COULD JUST STOP DOING.

We're not feeding all the same bits to the /dev/random that we're
using to also update the pseudo-random state, so I think you're
overreacting. Think of it as "/dev/random gets a few bits, and prandom
gets a few other bits".

The fact that there is overlap between the bits is immaterial, when
other parts are disjoint. Fractional bits of entropy and all that.

> Then explain it to me.  What is the actual *problem*?  Nobody's described
> one, so I've been guessing.  What is this *monumentally stupid* abuse of
> /dev/random allegedly fixing?

The problem is that the networking people really want unguessable
random state. There was a remote DNS cache-poisoning attack because
the UDP ports ended up being guessable.

And that was actually reported to us back in early March.

Almost five months later, nothing had been done, all the discussion
bogged down in "theoretical randomness" rather than actual
real-life security.

> If you're not an idiot, explain.
>
> Because right now you sound like one.  There's a simple and easy fix which
> I've described and will get back to implementing as soon as I've finished
> yelling at you.  What, FFS, is your objection to considering it?

By now, I'm not in the least interested in theoretical arguments.

I claim that the simple "mix in (different parts of) the TSC bits and
IP bits into _both_ the pseudo random state and the /dev/random state"
is going to make it hell on earth for anybody to ever find weaknesses
in either. Or if they do, they need to find and then exploit them
really quickly, because practically speaking, a few hundred times a
second you end up with noise that you cannot attack algorithmically
perturbing the state of both.

And I realize that drives you crazy. You _want_ to be able to analyze
the state to prove something about it. And that's absolutely the
opposite of what I want. I want the internal state (both prandom and
/dev/random) to simply not be amenable to analysis - simply because we
intentionally screw it up on interrupts. You can analyze it all you
want, knowing that in a few milliseconds you'll have to start over (at
least partially).

So even if you're the NSA, and it turns out that you have a magical
quantum computer that can break the best hash function crypto people
know about by just seeing a fairly small part of the stream of random
numbers, you'd better figure that state out quickly, because the next
interrupt comes along, and we'll perturb it.

Or, say that you find something like meltdown, and can actually use
some side channel to read out the full state of our internal
/dev/random mixing buffers. ALL of it. No magic NSA quantum computer
required, just a side channel. Your "theoretical" guarantees are all
complete and utter shit in that case, if all we do is have the best
possible secure algorithm. Because the state is right there, our
super-duper-secure algorithm is all open source, and your claims of
"this is unbreakable in theory" are just complete and utter nonsense.

This is why I claim the noise that you can't analyze is so important.
You see it as a weakness, because you see it as a "now I can't prove
certain things". I see it as exactly the opposite.

So to me, your whole "theoretical safety" argument is actually a huge
honking weakness.

Btw, you'll really hate what I did to /dev/random for initializing the
pool last year. Another case of "Christ, crypto people have screwed us
for decades, now I'm going to just fix it" situation.

Go do

    git log --author=Torvalds --no-merges drivers/char/random.c

and you'll probably have a heart attack.

              Linus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-08 15:26 Flaw in "random32: update the net random state on interrupt and activity" George Spelvin
  2020-08-08 17:07 ` Andy Lutomirski
  2020-08-08 17:44 ` Willy Tarreau
@ 2020-08-09  6:57 ` George Spelvin
  2020-08-09  9:38   ` Willy Tarreau
                     ` (2 more replies)
  2 siblings, 3 replies; 68+ messages in thread
From: George Spelvin @ 2020-08-09  6:57 UTC (permalink / raw)
  To: netdev
  Cc: w, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx,
	peterz, tytso, lkml.mplumb, stephen, fw, George Spelvin

[-- Attachment #1: 0001-random32-Make-prandom_u32-output-difficult-to-predic.patch --]
[-- Type: text/plain, Size: 15932 bytes --]

Non-cryptographic PRNGs may have great statistical properties, but
are usually trivially predictable to someone who knows the algorithm,
given a small sample of their output.  An LFSR like prandom_u32() is
particularly simple, even if the sample is widely scattered bits.

It turns out the network stack uses prandom_u32() for some things like
random port numbers which it would prefer are *not* trivially predictable.
Predictability led to a practical DNS spoofing attack.  Oops.

This patch replaces the LFSR with a homebrew cryptographic PRNG based
on the SipHash round function, which is in turn seeded with 128 bits
of strong random key.  (The authors of SipHash have *not* been consulted
about this abuse of their algorithm.)  Speed is prioritized over security;
attacks are rare, while performance is always wanted.

Replacing all callers of prandom_u32() is the quick fix.
Whether to reinstate a weaker PRNG for uses which can tolerate it
is an open question.

TODO: The prandom_seed() function is currently a no-op stub.
It's not obvious how to use this additional data, so suggestions are
invited.

f227e3ec3b5c ("random32: update the net random state on interrupt and
activity") was an earlier attempt at a solution.  This patch
replaces it; revert it before applying this one.

Reported-by: Amit Klein <aksecurity@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: f227e3ec3b5c ("random32: update the net random state on interrupt and activity")
Replaces: f227e3ec3b5c ("random32: update the net random state on interrupt and activity")
Signed-off-by: George Spelvin <lkml@sdf.org>
---
 lib/random32.c | 437 ++++++++++++++++++++++++++++++-------------------
 1 file changed, 267 insertions(+), 170 deletions(-)

I haven't yet rebooted, so this is only build-tested, but it should illustrate
the idea.

diff --git a/lib/random32.c b/lib/random32.c
index 763b920a6206c..2b048e2ea99f6 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -48,8 +48,6 @@ static inline void prandom_state_selftest(void)
 }
 #endif
 
-static DEFINE_PER_CPU(struct rnd_state, net_rand_state) __latent_entropy;
-
 /**
  *	prandom_u32_state - seeded pseudo-random number generator.
  *	@state: pointer to state structure holding seeded state.
@@ -69,25 +67,6 @@ u32 prandom_u32_state(struct rnd_state *state)
 }
 EXPORT_SYMBOL(prandom_u32_state);
 
-/**
- *	prandom_u32 - pseudo random number generator
- *
- *	A 32 bit pseudo-random number is generated using a fast
- *	algorithm suitable for simulation. This algorithm is NOT
- *	considered safe for cryptographic use.
- */
-u32 prandom_u32(void)
-{
-	struct rnd_state *state = &get_cpu_var(net_rand_state);
-	u32 res;
-
-	res = prandom_u32_state(state);
-	put_cpu_var(net_rand_state);
-
-	return res;
-}
-EXPORT_SYMBOL(prandom_u32);
-
 /**
  *	prandom_bytes_state - get the requested number of pseudo-random bytes
  *
@@ -119,20 +98,6 @@ void prandom_bytes_state(struct rnd_state *state, void *buf, size_t bytes)
 }
 EXPORT_SYMBOL(prandom_bytes_state);
 
-/**
- *	prandom_bytes - get the requested number of pseudo-random bytes
- *	@buf: where to copy the pseudo-random bytes to
- *	@bytes: the requested number of bytes
- */
-void prandom_bytes(void *buf, size_t bytes)
-{
-	struct rnd_state *state = &get_cpu_var(net_rand_state);
-
-	prandom_bytes_state(state, buf, bytes);
-	put_cpu_var(net_rand_state);
-}
-EXPORT_SYMBOL(prandom_bytes);
-
 static void prandom_warmup(struct rnd_state *state)
 {
 	/* Calling RNG ten times to satisfy recurrence condition */
@@ -148,96 +113,6 @@ static void prandom_warmup(struct rnd_state *state)
 	prandom_u32_state(state);
 }
 
-static u32 __extract_hwseed(void)
-{
-	unsigned int val = 0;
-
-	(void)(arch_get_random_seed_int(&val) ||
-	       arch_get_random_int(&val));
-
-	return val;
-}
-
-static void prandom_seed_early(struct rnd_state *state, u32 seed,
-			       bool mix_with_hwseed)
-{
-#define LCG(x)	 ((x) * 69069U)	/* super-duper LCG */
-#define HWSEED() (mix_with_hwseed ? __extract_hwseed() : 0)
-	state->s1 = __seed(HWSEED() ^ LCG(seed),        2U);
-	state->s2 = __seed(HWSEED() ^ LCG(state->s1),   8U);
-	state->s3 = __seed(HWSEED() ^ LCG(state->s2),  16U);
-	state->s4 = __seed(HWSEED() ^ LCG(state->s3), 128U);
-}
-
-/**
- *	prandom_seed - add entropy to pseudo random number generator
- *	@entropy: entropy value
- *
- *	Add some additional entropy to the prandom pool.
- */
-void prandom_seed(u32 entropy)
-{
-	int i;
-	/*
-	 * No locking on the CPUs, but then somewhat random results are, well,
-	 * expected.
-	 */
-	for_each_possible_cpu(i) {
-		struct rnd_state *state = &per_cpu(net_rand_state, i);
-
-		state->s1 = __seed(state->s1 ^ entropy, 2U);
-		prandom_warmup(state);
-	}
-}
-EXPORT_SYMBOL(prandom_seed);
-
-/*
- *	Generate some initially weak seeding values to allow
- *	to start the prandom_u32() engine.
- */
-static int __init prandom_init(void)
-{
-	int i;
-
-	prandom_state_selftest();
-
-	for_each_possible_cpu(i) {
-		struct rnd_state *state = &per_cpu(net_rand_state, i);
-		u32 weak_seed = (i + jiffies) ^ random_get_entropy();
-
-		prandom_seed_early(state, weak_seed, true);
-		prandom_warmup(state);
-	}
-
-	return 0;
-}
-core_initcall(prandom_init);
-
-static void __prandom_timer(struct timer_list *unused);
-
-static DEFINE_TIMER(seed_timer, __prandom_timer);
-
-static void __prandom_timer(struct timer_list *unused)
-{
-	u32 entropy;
-	unsigned long expires;
-
-	get_random_bytes(&entropy, sizeof(entropy));
-	prandom_seed(entropy);
-
-	/* reseed every ~60 seconds, in [40 .. 80) interval with slack */
-	expires = 40 + prandom_u32_max(40);
-	seed_timer.expires = jiffies + msecs_to_jiffies(expires * MSEC_PER_SEC);
-
-	add_timer(&seed_timer);
-}
-
-static void __init __prandom_start_seed_timer(void)
-{
-	seed_timer.expires = jiffies + msecs_to_jiffies(40 * MSEC_PER_SEC);
-	add_timer(&seed_timer);
-}
-
 void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state)
 {
 	int i;
@@ -257,51 +132,6 @@ void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state)
 }
 EXPORT_SYMBOL(prandom_seed_full_state);
 
-/*
- *	Generate better values after random number generator
- *	is fully initialized.
- */
-static void __prandom_reseed(bool late)
-{
-	unsigned long flags;
-	static bool latch = false;
-	static DEFINE_SPINLOCK(lock);
-
-	/* Asking for random bytes might result in bytes getting
-	 * moved into the nonblocking pool and thus marking it
-	 * as initialized. In this case we would double back into
-	 * this function and attempt to do a late reseed.
-	 * Ignore the pointless attempt to reseed again if we're
-	 * already waiting for bytes when the nonblocking pool
-	 * got initialized.
-	 */
-
-	/* only allow initial seeding (late == false) once */
-	if (!spin_trylock_irqsave(&lock, flags))
-		return;
-
-	if (latch && !late)
-		goto out;
-
-	latch = true;
-	prandom_seed_full_state(&net_rand_state);
-out:
-	spin_unlock_irqrestore(&lock, flags);
-}
-
-void prandom_reseed_late(void)
-{
-	__prandom_reseed(true);
-}
-
-static int __init prandom_reseed(void)
-{
-	__prandom_reseed(false);
-	__prandom_start_seed_timer();
-	return 0;
-}
-late_initcall(prandom_reseed);
-
 #ifdef CONFIG_RANDOM32_SELFTEST
 static struct prandom_test1 {
 	u32 seed;
@@ -463,3 +293,270 @@ static void __init prandom_state_selftest(void)
 		pr_info("prandom: %d self tests passed\n", runs);
 }
 #endif
+
+
+
+/*
+ * The prandom_u32() implementation is now completely separate from the
+ * prandom_*_state() functions, which are retained (for now) for compatibility.
+ *
+ * Because of (ab)use in the networking code for choosing random TCP/UDP port
+ * numbers, which open DoS possibilities if guessable, we want something
+ * stronger than a standard PRNG.  But the performance requirements of
+ * the network code do not allow robust crypto for this application.
+ *
+ * So this is a homebrew Junior Spaceman implementation, based on the
+ * lowest-latency trustworthy crypto primitive available, SipHash.
+ * (The authors of SipHash have not been consulted about this abuse of
+ * their work.)
+ *
+ * Standard SipHash-2-4 uses 2n+4 rounds to hash n words of input to
+ * one word of output.  This abbreviated version uses 2 rounds per word
+ * of output.
+ */
+
+struct siprand_state {
+	unsigned long v[4];
+};
+
+static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
+
+#if BITS_PER_LONG == 64
+/*
+ * The core SipHash round function.  Each line can be executed in
+ * parallel given enough CPU resources.
+ */
+#define SIPROUND(v0,v1,v2,v3) ( \
+	v0 += v1, v1 = rol64(v1, 13),  v2 += v3, v3 = rol64(v3, 16), \
+	v1 ^= v0, v0 = rol64(v0, 32),  v3 ^= v2,                     \
+	v0 += v3, v3 = rol64(v3, 21),  v2 += v1, v1 = rol64(v1, 17), \
+	v3 ^= v0,                      v1 ^= v2, v2 = rol64(v2, 32)  )
+#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
+#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
+
+#elif BITS_PER_LONG == 23
+/*
+ * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash.
+ * This is weaker, but 32-bit machines are not used for high-traffic
+ * applications, so there is less output for an attacker to analyze.
+ */
+#define SIPROUND(v0,v1,v2,v3) ( \
+	v0 += v1, v1 = rol32(v1,  5),  v2 += v3, v3 = rol32(v3,  8), \
+	v1 ^= v0, v0 = rol32(v0, 16),  v3 ^= v2,                     \
+	v0 += v3, v3 = rol32(v3,  7),  v2 += v1, v1 = rol32(v1, 13), \
+	v3 ^= v0,                      v1 ^= v2, v2 = rol32(v2, 16)  )
+#define K0 0x6c796765
+#define K1 0x74656462
+
+#else
+#error Unsupported BITS_PER_LONG
+#endif
+
+/*
+ * This is the core CPRNG function.  As "pseudorandom", this is not used
+ * for truly valuable things, just intended to be a PITA to guess.
+ * For maximum speed, we do just two SipHash rounds per word.  This is
+ * the same rate as 4 rounds per 64 bits that SipHash normally uses,
+ * so hopefully it's reasonably secure.
+ *
+ * There are two changes from the official SipHash finalization:
+ * - We omit some constants XORed with v2 in the SipHash spec as irrelevant;
+ *   they are there only to make the output rounds distinct from the input
+ *   rounds, and this application has no input rounds.
+ * - Rather than returning v0^v1^v2^v3, return v1+v3.
+ *   If you look at the SipHash round, the last operation on v3 is
+ *   "v3 ^= v0", so "v0 ^ v3" just undoes that, a waste of time.
+ *   Likewise "v1 ^= v2".  (The rotate of v2 makes a difference, but
+ *   it still cancels out half of the bits in v2 for no benefit.)
+ *   Second, since the last combining operation was xor, continue the
+ *   pattern of alternating xor/add for a tiny bit of extra non-linearity.
+ */
+static u32 siprand_u32(struct siprand_state *s)
+{
+	unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
+
+	SIPROUND(v0, v1, v2, v3);
+	SIPROUND(v0, v1, v2, v3);
+	s->v[0] = v0;  s->v[1] = v1;  s->v[2] = v2;  s->v[3] = v3;
+	return v1 + v3;
+}
+
+
+/**
+ *	prandom_u32 - pseudo random number generator
+ *
+ *	A 32 bit pseudo-random number is generated using a fast
+ *	algorithm suitable for simulation. This algorithm is NOT
+ *	considered safe for cryptographic use.
+ */
+u32 prandom_u32(void)
+{
+	struct siprand_state *state = get_cpu_ptr(&net_rand_state);
+	u32 res = siprand_u32(state);
+
+	put_cpu_ptr(&net_rand_state);
+	return res;
+}
+EXPORT_SYMBOL(prandom_u32);
+
+/**
+ *	prandom_bytes - get the requested number of pseudo-random bytes
+ *	@buf: where to copy the pseudo-random bytes to
+ *	@bytes: the requested number of bytes
+ */
+void prandom_bytes(void *buf, size_t bytes)
+{
+	struct siprand_state *state = get_cpu_ptr(&net_rand_state);
+	u8 *ptr = buf;
+
+	while (bytes >= sizeof(u32)) {
+		put_unaligned(siprand_u32(state), (u32 *)ptr);
+		ptr += sizeof(u32);
+		bytes -= sizeof(u32);
+	}
+
+	if (bytes > 0) {
+		u32 rem = siprand_u32(state);
+		do {
+			*ptr++ = (u8)rem;
+			rem >>= BITS_PER_BYTE;
+		} while (--bytes > 0);
+	}
+	put_cpu_ptr(&net_rand_state);
+}
+EXPORT_SYMBOL(prandom_bytes);
+
+/**
+ *	prandom_seed - add entropy to pseudo random number generator
+ *	@entropy: entropy value
+ *
+ *	Add some additional seed material to the prandom pool.
+ *	The "entropy" is actually our IP address (the only caller is
+ *	the network code), not for unpredictability, but to ensure that
+ *	different machines are initialized differently.
+ */
+void prandom_seed(u32 entropy)
+{
+	/* FIXME: Where to feed this in? */
+}
+EXPORT_SYMBOL(prandom_seed);
+
+/*
+ *	Generate some initially weak seeding values to allow
+ *	the prandom_u32() engine to be started.
+ */
+static int __init prandom_init_early(void)
+{
+	int i;
+	unsigned long v0, v1, v2, v3;
+
+	if (!arch_get_random_long(&v0))
+		v0 = jiffies;
+	if (!arch_get_random_long(&v1))
+		v1 = random_get_entropy();
+	v2 = v0 ^ K0;
+	v3 = v1 ^ K1;
+
+	for_each_possible_cpu(i) {
+		struct siprand_state *state;
+
+		v3 ^= i;
+		SIPROUND(v0, v1, v2, v3);
+		SIPROUND(v0, v1, v2, v3);
+		v0 ^= i;
+
+		state = per_cpu_ptr(&net_rand_state, i);
+		state->v[0] = v0;  state->v[1] = v1;
+		state->v[2] = v2;  state->v[3] = v3;
+	}
+
+	return 0;
+}
+core_initcall(prandom_init_early);
+
+
+/* Stronger reseeding when available, and periodically thereafter. */
+static void prandom_reseed(struct timer_list *unused);
+
+static DEFINE_TIMER(seed_timer, prandom_reseed);
+
+static void prandom_reseed(struct timer_list *unused)
+{
+	unsigned long expires;
+	int i;
+
+	/*
+	 * Reinitialize each CPU's PRNG with 128 bits of key.
+	 * No locking on the CPUs, but then somewhat random results are,
+	 * well, expected.
+	 */
+	for_each_possible_cpu(i) {
+		struct siprand_state *state;
+		unsigned long v0 = get_random_long(), v2 = v0 ^ K0;
+		unsigned long v1 = get_random_long(), v3 = v1 ^ K1;
+#if BITS_PER_LONG == 32
+		int j;
+
+		/*
+		 * On 32-bit machines, hash in two extra words to
+		 * approximate 128-bit key length.  Not that the hash
+		 * has that much security, but this prevents a trivial
+		 * 64-bit brute force.
+		 */
+		for (j = 0; j < 2; j++) {
+			unsigned long m = get_random_long();
+			v3 ^= m;
+			SIPROUND(v0, v1, v2, v3);
+			SIPROUND(v0, v1, v2, v3);
+			v0 ^= m;
+		}
+#endif
+		/*
+		 * Probably impossible in practice, but there is a
+		 * theoretical risk that a race between this reseeding
+		 * and the target CPU writing its state back could
+		 * create the all-zero SipHash fixed point.
+		 *
+		 * To ensure that never happens, ensure the state
+		 * we write contains no zero words.
+		 */
+		state = per_cpu_ptr(&net_rand_state, i);
+		WRITE_ONCE(state->v[0], v0 ? v0 : -1ul);
+		WRITE_ONCE(state->v[1], v1 ? v1 : -1ul);
+		WRITE_ONCE(state->v[2], v2 ? v2 : -1ul);
+		WRITE_ONCE(state->v[3], v3 ? v3 : -1ul);
+	}
+
+	/* reseed every ~60 seconds, in [40 .. 80) interval with slack */
+	expires = round_jiffies(jiffies + 40 * HZ + prandom_u32_max(40 * HZ));
+	mod_timer(&seed_timer, expires);
+}
+
+/*
+ * The random ready callback can be called from almost any interrupt.
+ * To avoid worrying about whether it's safe to delay that interrupt
+ * long enough to seed all CPUs, just schedule an immediate timer event.
+ */
+static void prandom_timer_start(struct random_ready_callback *unused)
+{
+	mod_timer(&seed_timer, jiffies);
+}
+
+/*
+ * Start periodic full reseeding as soon as strong
+ * random numbers are available.
+ */
+static int __init prandom_init_late(void)
+{
+	static struct random_ready_callback random_ready = {
+	        .func = prandom_timer_start
+	};
+	int ret = add_random_ready_callback(&random_ready);
+
+	if (ret == -EALREADY) {
+		prandom_timer_start(&random_ready);
+		ret = 0;
+	}
+	return ret;
+}
+late_initcall(prandom_init_late);
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-09  6:57 ` [DRAFT PATCH] random32: make prandom_u32() output unpredictable George Spelvin
@ 2020-08-09  9:38   ` Willy Tarreau
  2020-08-09 17:06     ` George Spelvin
       [not found]     ` <fdbc7d7d-cba2-ef94-9bde-b3ccae0cfaac@gmail.com>
  2020-08-09 13:50   ` Randy Dunlap
       [not found]   ` <CANEQ_++a4YcwQQ2XhuguTono9=RxbSRVsMw08zLWBWJ_wxG2AQ@mail.gmail.com>
  2 siblings, 2 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-09  9:38 UTC (permalink / raw)
  To: George Spelvin
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen, fw

Hi George,

On Sun, Aug 09, 2020 at 06:57:44AM +0000, George Spelvin wrote:
> +/*
> + * This is the core CPRNG function.  As "pseudorandom", this is not used
> + * for truly valuable things, just intended to be a PITA to guess.
> + * For maximum speed, we do just two SipHash rounds per word.  This is
> + * the same rate as 4 rounds per 64 bits that SipHash normally uses,
> + * so hopefully it's reasonably secure.
> + *
> + * There are two changes from the official SipHash finalization:
> + * - We omit some constants XORed with v2 in the SipHash spec as irrelevant;
> + *   they are there only to make the output rounds distinct from the input
> + *   rounds, and this application has no input rounds.
> + * - Rather than returning v0^v1^v2^v3, return v1+v3.
> + *   If you look at the SipHash round, the last operation on v3 is
> + *   "v3 ^= v0", so "v0 ^ v3" just undoes that, a waste of time.
> + *   Likewise "v1 ^= v2".  (The rotate of v2 makes a difference, but
> + *   it still cancels out half of the bits in v2 for no benefit.)
> + *   Second, since the last combining operation was xor, continue the
> + *   pattern of alternating xor/add for a tiny bit of extra non-linearity.
> + */
> +static u32 siprand_u32(struct siprand_state *s)
> +{
> +	unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
> +
> +	SIPROUND(v0, v1, v2, v3);
> +	SIPROUND(v0, v1, v2, v3);
> +	s->v[0] = v0;  s->v[1] = v1;  s->v[2] = v2;  s->v[3] = v3;
> +	return v1 + v3;
> +}
> +
> +
> +/**
> + *	prandom_u32 - pseudo random number generator
> + *
> + *	A 32 bit pseudo-random number is generated using a fast
> + *	algorithm suitable for simulation. This algorithm is NOT
> + *	considered safe for cryptographic use.
> + */
> +u32 prandom_u32(void)
> +{
> +	struct siprand_state *state = get_cpu_ptr(&net_rand_state);
> +	u32 res = siprand_u32(state);
> +
> +	put_cpu_ptr(&net_rand_state);
> +	return res;
> +}

So I gave it a quick test under Qemu and it didn't show any obvious
performance difference compared to Tausworthe, which is a good thing,
even though there's a significant amount of measurement noise in each
case.

However it keeps the problem that the whole sequence is entirely
determined at the moment of reseeding, so if one were to be able to
access the state, e.g. using l1tf/spectre/meltdown/whatever, then
this state could be used to predict the whole ongoing sequence for
the next minute. What some view as a security feature, others will
see as a backdoor :-/  That's why I really like the noise approach.
Even just the below would significantly harm that capability because
that state alone isn't sufficient anymore to pre-compute all future
values:

--- a/lib/random32.c
+++ b/lib/random32.c
@@ -375,6 +375,7 @@ static u32 siprand_u32(struct siprand_state *s)
 {
        unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
 
+       v0 += get_cycles();
        SIPROUND(v0, v1, v2, v3);
        SIPROUND(v0, v1, v2, v3);
        s->v[0] = v0;  s->v[1] = v1;  s->v[2] = v2;  s->v[3] = v3;

Regards,
Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-09  6:57 ` [DRAFT PATCH] random32: make prandom_u32() output unpredictable George Spelvin
  2020-08-09  9:38   ` Willy Tarreau
@ 2020-08-09 13:50   ` Randy Dunlap
       [not found]   ` <CANEQ_++a4YcwQQ2XhuguTono9=RxbSRVsMw08zLWBWJ_wxG2AQ@mail.gmail.com>
  2 siblings, 0 replies; 68+ messages in thread
From: Randy Dunlap @ 2020-08-09 13:50 UTC (permalink / raw)
  To: George Spelvin, netdev
  Cc: w, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx,
	peterz, tytso, lkml.mplumb, stephen, fw

On 8/8/20 11:57 PM, George Spelvin wrote:



+#elif BITS_PER_LONG == 23


s/23/32/ ???

-- 
~Randy


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
       [not found]   ` <CANEQ_++a4YcwQQ2XhuguTono9=RxbSRVsMw08zLWBWJ_wxG2AQ@mail.gmail.com>
@ 2020-08-09 16:08     ` George Spelvin
  0 siblings, 0 replies; 68+ messages in thread
From: George Spelvin @ 2020-08-09 16:08 UTC (permalink / raw)
  To: Amit Klein
  Cc: Jason, edumazet, fw, keescook, lkml.mplumb, luto, netdev, peterz,
	stephen, tglx, torvalds, tytso, w

On Sun, Aug 09, 2020 at 11:26:31AM +0300, Amit Klein wrote:
> BITS_PER_LONG==23 ???
> Should be ==32, I guess.

Of course.  I've been testing on a 64-bit system so hadn't
exercised that branch yet, and it's a case of "seeing what
I expect to see".

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-09  9:38   ` Willy Tarreau
@ 2020-08-09 17:06     ` George Spelvin
  2020-08-09 17:33       ` Willy Tarreau
       [not found]     ` <fdbc7d7d-cba2-ef94-9bde-b3ccae0cfaac@gmail.com>
  1 sibling, 1 reply; 68+ messages in thread
From: George Spelvin @ 2020-08-09 17:06 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen, fw, George Spelvin

On Sun, Aug 09, 2020 at 11:38:05AM +0200, Willy Tarreau wrote:
> So I gave it a quick test under Qemu and it didn't show any obvious
> performance difference compared to Tausworthe, which is a good thing,
> even though there's a significant amount of measurement noise in each
> case.

Thank you very much!  I'm not quite sure how to benchmark this.
The whole idea is that it's *not* used in a tight cache-hot loop.
Hopefully someone already has a test setup so I don't have to invent
one.

> However it keeps the problem that the whole sequence is entirely
> determined at the moment of reseeding, so if one were to be able to
> access the state, e.g. using l1tf/spectre/meltdown/whatever, then
> this state could be used to predict the whole ongoing sequence for
> the next minute. What some view as a security feature, others will
> see as a backdoor :-/  That's why I really like the noise approach.
> Even just the below would significantly harm that capability because
> that state alone isn't sufficient anymore to pre-compute all future
> values:
> 
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -375,6 +375,7 @@ static u32 siprand_u32(struct siprand_state *s)
>  {
>         unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
>  
> +       v0 += get_cycles();
>         SIPROUND(v0, v1, v2, v3);
>         SIPROUND(v0, v1, v2, v3);
>         s->v[0] = v0;  s->v[1] = v1;  s->v[2] = v2;  s->v[3] = v3;

As long as:
1) The periodic catastrophic reseeding remains, and
2) You use fresh measurements, not the exact same bits
   that add_*_randomness feeds into /dev/random
then it doesn't do any real harm, so if it makes you feel better...

But I really want to stress how weak a design drip-reseeding is.

If an attacker has enough local machine access to do a meltdown-style attack,
then they can calibrate the TSC used in get_cycles very accurately, so the
remaining timing uncertainty is very low.  This makes a brute-force attack on
one or two reseedings quite easy.  I.e. if you can see every other output,
it's straightforward to figure out the ones in between.

I wonder if, on general principles, it would be better to use a more
SipHash style mixing in of the sample:
	m = get_cycles();
	v3 ^= m;
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);
	v0 ^= m;

Not sure if it's worth the extra register (and associated spill/fill).

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-09 17:06     ` George Spelvin
@ 2020-08-09 17:33       ` Willy Tarreau
  2020-08-09 18:30         ` George Spelvin
  0 siblings, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-09 17:33 UTC (permalink / raw)
  To: George Spelvin
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen, fw

On Sun, Aug 09, 2020 at 05:06:39PM +0000, George Spelvin wrote:
> On Sun, Aug 09, 2020 at 11:38:05AM +0200, Willy Tarreau wrote:
> > So I gave it a quick test under Qemu and it didn't show any obvious
> > performance difference compared to Tausworthe, which is a good thing,
> > even though there's a significant amount of measurement noise in each
> > case.
> 
> Thank you very much!  I'm not quite sure how to benchmark this.
> The whole idea is that it's *not* used in a tight cache-hot loop.
> Hopefully someone already has a test setup so I don't have to invent
> one.

Due to limited access to representative hardware, the two main tests
I've been running were on my laptop in qemu, and consisted of:

   - a connect-accept-close test to stress the accept() code and
     verify we don't observe a significant drop. The thing is that
     connect() usually is much slower and running the two on the
     same machine tends to significantly soften the differences
     compared to what a real machine would see when handling a
     DDoS for example.

   - a packet rate test through this rule (which uses prandom_u32()
     for each packet and which matches what can be done in packet
     schedulers or just by users having to deal with random drop) :

     iptables -I INPUT -m statistic --probability 0.5 -j ACCEPT

While these ones are not very relevant, especially in a VM, not
seeing significant variations tends to indicate we should not see
a catastrophic loss.

> > However it keeps the problem that the whole sequence is entirely
> > determined at the moment of reseeding, so if one were to be able to
> > access the state, e.g. using l1tf/spectre/meltdown/whatever, then
> > this state could be used to predict the whole ongoing sequence for
> > the next minute. What some view as a security feature, others will
> > see as a backdoor :-/  That's why I really like the noise approach.
> > Even just the below would significantly harm that capability because
> > that state alone isn't sufficient anymore to pre-compute all future
> > values:
> > 
> > --- a/lib/random32.c
> > +++ b/lib/random32.c
> > @@ -375,6 +375,7 @@ static u32 siprand_u32(struct siprand_state *s)
> >  {
> >         unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
> >  
> > +       v0 += get_cycles();
> >         SIPROUND(v0, v1, v2, v3);
> >         SIPROUND(v0, v1, v2, v3);
> >         s->v[0] = v0;  s->v[1] = v1;  s->v[2] = v2;  s->v[3] = v3;
> 
> As long as:
> 1) The periodic catastrophic reseeding remains, and
> 2) You use fresh measurements, not the exact same bits
>    that add_*_randomness feeds into /dev/random
> then it doesn't do any real harm, so if it makes you feel better...
> 
> But I really want to stress how weak a design drip-reseeding is.
> 
> If an attacker has enough local machine access to do a meltdown-style attack,
> then they can calibrate the TSC used in get_cycles very accurately,

Absolutely.

> so the
> remaining timing uncertainty is very low.

Not that low in fact because they don't know precisely when the call is
made. I mean, let's say we're in the worst case, with two VMs running on
two siblings of the same core, with the same TSC, on a 3 GHz machine. The
attacker can stress the victim at 100k probes per second. That's still
15 bits of uncertainty on the TSC value estimation which is added to each
call. Even on the first call this is enough to make a source port
unguessable, and prevents the attacker from staying synchronized with
its victim. And I'm only speaking about an idle remote machine, not even
one taking unobservable traffic, which further adds to the difficulty.

> This makes a brute-force attack on
> one or two reseedings quite easy.  I.e. if you can see every other output,
> it's straightforward to figure out the ones in between.

But they already become useless because you're only observing values
from the past.

> I wonder if, on general principles, it would be better to use a more
> SipHash style mixing in of the sample:
> 	m = get_cycles();
> 	v3 ^= m;
> 	SIPROUND(v0, v1, v2, v3);
> 	SIPROUND(v0, v1, v2, v3);
> 	v0 ^= m;

Probably, if it's the recommended way to mix in other values, yes.

> Not sure if it's worth the extra register (and associated spill/fill).

If this makes the hash better, maybe. I can run some tests on this as
well. I'd really need to try on a cleaner setup, I have remote machines
at the office but I don't feel safe enough to remotely reboot them and
risk to lose them :-/

I'll also test on arm/arm64 to make sure we don't introduce a significant
cost there.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-09 17:33       ` Willy Tarreau
@ 2020-08-09 18:30         ` George Spelvin
  2020-08-09 19:16           ` Willy Tarreau
  2020-08-10 11:47           ` Willy Tarreau
  0 siblings, 2 replies; 68+ messages in thread
From: George Spelvin @ 2020-08-09 18:30 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen, fw, George Spelvin

On Sun, Aug 09, 2020 at 07:33:03PM +0200, Willy Tarreau wrote:
> Not that low in fact because they don't know precisely when the call is
> made. I mean, let's say we're in the worst case, with two VMs running on
> two siblings of the same core, with the same TSC, on a 3 GHz machine. The
> attacker can stress the victim at 100k probes per second. That's still
> 15 bits of uncertainty on the TSC value estimation which is added to each
> call. Even on the first call this is enough to make a source port
> unguessable, and prevents the attacker from staying synchronized with
> its victim. And I'm only speaking about an idle remote machine, not even
> one taking unobservable traffic, which further adds to the difficulty.

I'm trying to understand your attack scenario.  I'm assuming that an
attacker can call prandom_u32() locally.  (I don't have a specific code
path, but given the number of uses in the kernel, I assume *one* of
them will leak the output directly.)  And repeat the call fast
enough that there's at most *one* other user between our calls.

If an attacker knows the initial state, does an rdtsc, prandom_u32(),
and a second rdtsc, then they can guess the TSC value used in that
prandom_u32() quite accurately (4-6 bits fuzz, perhaps).  This is
trivial to brute force.

The fun comes if someone else does a prandom_u32() call in between.

All of a sudden, the 4-6 bit brute force of one get_cycles() value
fails to find a solution.  Someone else has called prandom_u32()!

Now we have 15 bits of uncertainty about that other call, and 5 bits
of uncertainty about our call.  2^20 possibilities only takes a few
milliseconds to test, and the 32-bit output of prandom_u32() can
verify a guess with minimal probability of error.

(Note that, to maintain tracking, we have to keep hammering
prandom_u32() *during* the search, but we can just buffer the results
and process them after the expensive search is complete.)

What you can see here is the incredible power of *multiple* unobserved
seedings.  As long as an attacker can limit things to one unobserved
prandom_u32(), it's a simple brute force.

If there is more than one, the additional bits of uncertainty
quickly make things impractical.

This is why I'm so keen on less frequent, more catastrophic,
reseeding.  Yes, the delay means an attacker who has captured
the state retains full knowledge for longer.  But they get
kicked off as soon as the catastrophe happens.  Without it, they
can keep tracking the state indefinitely.

Even something simple like buffering 8 TSC samples, and adding them
at 32-bit offsets across the state every 8th call, would make a huge
difference.

Even if 7 of the timestamps are from attacker calls (4 bits
uncertainty), the time of the target call is 8x less known
(so it goes from 15 to 18 bits), and the total becomes
46 bits.  *So* much better.

> I can run some tests on this as
> well. I'd really need to try on a cleaner setup, I have remote machines
> at the office but I don't feel safe enough to remotely reboot them and
> risk to lose them :-/

Yeah, test kernels are nervous-making that way.

> I'll also test on arm/arm64 to make sure we don't introduce a significant
> cost there.

I don't expect a problem, but SipHash is optimized for 4-issue processors,
and on narrower machines fewer instructions are "free".

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-09 18:30         ` George Spelvin
@ 2020-08-09 19:16           ` Willy Tarreau
  2020-08-10 11:47           ` Willy Tarreau
  1 sibling, 0 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-09 19:16 UTC (permalink / raw)
  To: George Spelvin
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen, fw

On Sun, Aug 09, 2020 at 06:30:17PM +0000, George Spelvin wrote:
> I'm trying to understand your attack scenario.  I'm assuming that an
> attacker can call prandom_u32() locally.

Well, first, I'm mostly focusing on remote attacks, because if the
attacker has a permanent local access, he can as well just bind()
a source port instead of having to guess the next one that will be
used. Second, I'm not aware of direct access from userland, so the
calls to prandom_u32() are expected to be done only via specific
code parts. The worst case (in my opinion) is when the attacker
runs on the same physical CPU and exploits some memory leakage to
find the internal state and uses the same TSC, but can only trigger
prandom_u32() via the network, hence with much less precision.

(...)
> Even something simple like buffering 8 TSC samples, and adding them
> at 32-bit offsets across the state every 8th call, would make a huge
> difference.
> 
> Even if 7 of the timestamps are from attacker calls (4 bits
> uncertainty), the time of the target call is 8x less known
> (so it goes from 15 to 18 bits), and the total becomes
> 46 bits.  *So* much better.

Maybe. I'm just suggesting adding *some* noise to keep things from being
exploitable if the state is stolen once in a while (which would
already be a big problem, admittedly).

> > I can run some tests on this as
> > well. I'd really need to try on a cleaner setup, I have remote machines
> > at the office but I don't feel safe enough to remotely reboot them and
> > risk to lose them :-/
> 
> Yeah, test kernels are nervous-making that way.

In fact I haven't booted a 5.8 kernel on the machine I'm thinking of
yet, and I can't remote-control its power supply, so I'm worried about
a dirty hang at boot from a missing config entry. The risk is low but
it's annoying when it happens.

> > I'll also test on arm/arm64 to make sure we don't introduce a significant
> > cost there.
> 
> I don't expect a problem, but SipHash is optimized for 4-issue processors,
> and on narrower machines fewer instructions are "free".

OK.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
       [not found]     ` <fdbc7d7d-cba2-ef94-9bde-b3ccae0cfaac@gmail.com>
@ 2020-08-09 21:10       ` Marc Plumb
  2020-08-09 21:48         ` Linus Torvalds
  0 siblings, 1 reply; 68+ messages in thread
From: Marc Plumb @ 2020-08-09 21:10 UTC (permalink / raw)
  To: Willy Tarreau, George Spelvin
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, stephen, fw

(Resending since I accidentally sent it as HTML, which the netdev
mailing list doesn't like)

On 2020-08-09 2:05 p.m., Marc Plumb wrote:
>
> Willy,
>
>
> On 2020-08-07 3:19 p.m., Willy Tarreau wrote:
>>> If I can figure the state out once,
>> Yes but how do you take that as granted ? This state doesn't appear
>> without its noise counterpart, so taking as a prerequisite that you may
>> guess one separately obviously indicates that you then just have to
>> deduce the other, but the point of mixing precisely is that we do not
>> expose individual parts.
>
>
> On 2020-08-09 2:38 a.m., Willy Tarreau wrote:
>> However it keeps the problem that the whole sequence is entirely
>> determined at the moment of reseeding, so if one were to be able to
>> access the state, e.g. using l1tf/spectre/meltdown/whatever, then
>> this state could be used to predict the whole ongoing sequence for
>> the next minute. What some view as a security feature, others will
>> see as a backdoor :-/  That's why I really like the noise approach.
>> Even just the below would significantly harm that capability because
>> that state alone isn't sufficient anymore to pre-compute all future
>> values:
>
>
> So two days ago I was unreasonable for assuming an attacker could 
> recover the entire state, now you're assuming the same thing? As I 
> said before, if an attacker can recover the complete state, then 
> slowly adding noise doesn't help significantly since an attacker can 
> brute force the noise added (even if a perfect CPRNG is used).
>
> However, I think I'm starting to see your underlying assumptions. 
> You're thinking that raw noise data are the only truly unpredictable 
> thing so you think that adding it is a defense against attacks like 
> foreshadow/spectre/meltdown. You aren't entirely wrong -- if there was 
> a fast noise source then it might be a good option. Just if the noise 
> source is slow, you're just causing far more damage to a far more 
> critical crytpto function to get very little benefit.
>
> If you want to add noise to the network PRNG, then you can't put the 
> same noise into the dev/random CPRNG at all, in any way. I've tried 
> explaining the crypto reasons for this, and George has added to that, 
> so let me try a different approach: It breaks FIPS 140-2 for all of 
> Linux. While FIPS certification isn't a key driver, it is a 
> consideration for the crypto modules. FIPS references NIST.SP.800-90B 
> (which is specifically about Recommendation for the Entropy Sources 
> Used for Random Bit Generation) which has a requirement that the noise 
> source not pass any data used for crypto operations to anything 
> outside of the defined security boundary. If you want to feed a noise 
> measurement into the network PRNG, then you can't feed it into the 
> /dev/random pool also. You have to decide where the different 
> measurements are going to go and keep them completely separate.
>
> I'm not intimately familiar with the standards so I spoke with someone 
> who does FIPS 140-x certification and was told "I don't think the 
> standards even considered the idea that someone might pass data from a 
> conditioning pool to other functions. It completely violates the 
> security boundary concept so is just prohibited ... that type of 
> change would absolutely be a problem."
>
>
> Marc
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-09 21:10       ` Marc Plumb
@ 2020-08-09 21:48         ` Linus Torvalds
  0 siblings, 0 replies; 68+ messages in thread
From: Linus Torvalds @ 2020-08-09 21:48 UTC (permalink / raw)
  To: Marc Plumb
  Cc: Willy Tarreau, George Spelvin, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o,
	Stephen Hemminger, Florian Westphal

On Sun, Aug 9, 2020 at 2:10 PM Marc Plumb <lkml.mplumb@gmail.com> wrote:
>
> However, I think I'm starting to see your underlying assumptions.
> You're thinking that raw noise data are the only truly unpredictable
> thing so you think that adding it is a defense against attacks like
> foreshadow/spectre/meltdown. You aren't entirely wrong -- if there was
> a fast noise source then it might be a good option. Just if the noise
> source is slow, you're just causing far more damage to a far more
> critical crypto function to get very little benefit.

The only truly good noise source we have is any CPU native randomness.

Sadly, that is usually never really "fast". But we do use that for any
actual real /dev/random reading. We still whiten it through our
hashing, and we mix it in rather than use the raw CPU hw-provided
values (because not everybody trusts the CPU vendors either), but
/dev/random will most certainly take advantage of it as one source of
noise.

(Honesty in advertising: you can also use other interfaces that don't
bother with the remixing, and _will_ just return the raw CPU
randomness).

So if you make the (imho reasonable) assumption that you're running on
a modern enough CPU, and that the CPU hw randomness is of reasonable
quality, then you never need to worry about /dev/random. Every single
time you extract something from one of the pools, I think we mix in
that CPU randomness if it's available.

But the CPU randomness is too slow for the prandom code to use at
extraction time. It's on the order of a couple of hundred to a couple
of thousand cycles. That's peanuts for /dev/random, but quite a lot
for prandom.

In fact, at the slow end it is slow enough that you don't want to do
it at any fast granularity (ie not "every interrupt"), it's the "when
reseeding once a second" kind of slow.

arm64 has randomness these days too, but that's only of the
"really slow, useful for seeding" variety. And I'm not sure which (if
any) CPU implementations out there actually do it yet.

Anyway, I suspect /dev/random has been over-engineered (I think we
have something like three layers of mixing bits _and_ the CPU
randomness _and_ all the interrupt randomness), and prandom may have
been left alone too much.

             Linus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-09 18:30         ` George Spelvin
  2020-08-09 19:16           ` Willy Tarreau
@ 2020-08-10 11:47           ` Willy Tarreau
  2020-08-10 12:01             ` David Laight
                               ` (3 more replies)
  1 sibling, 4 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-10 11:47 UTC (permalink / raw)
  To: George Spelvin
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen, fw

Hi George,

On Sun, Aug 09, 2020 at 06:30:17PM +0000, George Spelvin wrote:
> Even something simple like buffering 8 TSC samples, and adding them
> at 32-bit offsets across the state every 8th call, would make a huge
> difference.

Doing testing on real hardware showed that retrieving the TSC on every
call had a non-negligible cost, causing a loss of 2.5% on the accept()
rate and 4% on packet rate when using iptables -m statistics. However
I reused your idea of accumulating old TSCs to increase the uncertainty
about their exact value, except that I retrieve it only on 1/8 calls
and use the previous noise in this case. With this I observe the same
performance as plain 5.8. Below are the connection rates accepted on
a single core :

        5.8           5.8+patch     5.8+patch+tsc
   192900-197900   188800-192200   194500-197500  (conn/s)

This was on a core i7-8700K. I looked at the asm code for the function
and it remains reasonably light, in the same order of complexity as the
original one, so I think we could go with that.

My proposed change is below, in case you have any improvements to suggest.

Regards,
Willy


diff --git a/lib/random32.c b/lib/random32.c
index 2b048e2ea99f..a12d63028106 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -317,6 +317,8 @@ static void __init prandom_state_selftest(void)
 
 struct siprand_state {
 	unsigned long v[4];
+	unsigned long noise;
+	unsigned long count;
 };
 
 static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
@@ -334,7 +336,7 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
 #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
 #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
 
-#elif BITS_PER_LONG == 23
+#elif BITS_PER_LONG == 32
 /*
  * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash.
  * This is weaker, but 32-bit machines are not used for high-traffic
@@ -375,6 +377,12 @@ static u32 siprand_u32(struct siprand_state *s)
 {
 	unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
 
+	if (++s->count >= 8) {
+		v3 ^= s->noise;
+		s->noise += random_get_entropy();
+		s->count = 0;
+	}
+
 	SIPROUND(v0, v1, v2, v3);
 	SIPROUND(v0, v1, v2, v3);
 	s->v[0] = v0;  s->v[1] = v1;  s->v[2] = v2;  s->v[3] = v3;

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* RE: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 11:47           ` Willy Tarreau
@ 2020-08-10 12:01             ` David Laight
  2020-08-10 14:48               ` Willy Tarreau
  2020-08-10 12:03             ` Florian Westphal
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 68+ messages in thread
From: David Laight @ 2020-08-10 12:01 UTC (permalink / raw)
  To: 'Willy Tarreau', George Spelvin
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen, fw

From: Willy Tarreau
> Sent: 10 August 2020 12:47
> On Sun, Aug 09, 2020 at 06:30:17PM +0000, George Spelvin wrote:
> > Even something simple like buffering 8 TSC samples, and adding them
> > at 32-bit offsets across the state every 8th call, would make a huge
> > difference.
> 
> Doing testing on real hardware showed that retrieving the TSC on every
> call had a non-negligible cost, causing a loss of 2.5% on the accept()
> rate and 4% on packet rate when using iptables -m statistics. However
> I reused your idea of accumulating old TSCs to increase the uncertainty
> about their exact value, except that I retrieve it only on 1/8 calls
> and use the previous noise in this case. With this I observe the same
> performance as plain 5.8. Below are the connection rates accepted on
> a single core :
> 
>         5.8           5.8+patch     5.8+patch+tsc
>    192900-197900   188800-192200   194500-197500  (conn/s)
> 
> This was on a core i7-8700K. I looked at the asm code for the function
> and it remains reasonably light, in the same order of complexity as the
> original one, so I think we could go with that.
> 
> My proposed change is below, in case you have any improvements to suggest.
> 
> Regards,
> Willy
> 
> 
> diff --git a/lib/random32.c b/lib/random32.c
> index 2b048e2ea99f..a12d63028106 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -317,6 +317,8 @@ static void __init prandom_state_selftest(void)
> 
>  struct siprand_state {
>  	unsigned long v[4];
> +	unsigned long noise;
> +	unsigned long count;
>  };
> 
>  static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
> @@ -334,7 +336,7 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
>  #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
>  #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
> 
> -#elif BITS_PER_LONG == 23
> +#elif BITS_PER_LONG == 32
>  /*
>   * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash.
>   * This is weaker, but 32-bit machines are not used for high-traffic
> @@ -375,6 +377,12 @@ static u32 siprand_u32(struct siprand_state *s)
>  {
>  	unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
> 
> +	if (++s->count >= 8) {
> +		v3 ^= s->noise;
> +		s->noise += random_get_entropy();
> +		s->count = 0;
> +	}
> +

Using:
	if (s->count-- <= 0) {
		...
		s->count = 8;
	}
probably generates better code.
Although you may want to use a 'signed int' instead of 'unsigned long'.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 11:47           ` Willy Tarreau
  2020-08-10 12:01             ` David Laight
@ 2020-08-10 12:03             ` Florian Westphal
  2020-08-10 14:53               ` Willy Tarreau
  2020-08-10 16:31             ` Linus Torvalds
  2020-08-11  3:47             ` George Spelvin
  3 siblings, 1 reply; 68+ messages in thread
From: Florian Westphal @ 2020-08-10 12:03 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, netdev, aksecurity, torvalds, edumazet, Jason,
	luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw

Willy Tarreau <w@1wt.eu> wrote:
> On Sun, Aug 09, 2020 at 06:30:17PM +0000, George Spelvin wrote:
> > Even something simple like buffering 8 TSC samples, and adding them
> > at 32-bit offsets across the state every 8th call, would make a huge
> > difference.
> 
> Doing testing on real hardware showed that retrieving the TSC on every
> call had a non-negligible cost, causing a loss of 2.5% on the accept()
> rate and 4% on packet rate when using iptables -m statistics. However
> I reused your idea of accumulating old TSCs to increase the uncertainty
> about their exact value, except that I retrieve it only on 1/8 calls
> and use the previous noise in this case. With this I observe the same
> performance as plain 5.8. Below are the connection rates accepted on
> a single core :
> 
>         5.8           5.8+patch     5.8+patch+tsc
>    192900-197900   188800-192200   194500-197500  (conn/s)
> 
> This was on a core i7-8700K. I looked at the asm code for the function
> and it remains reasonably light, in the same order of complexity as the
> original one, so I think we could go with that.
> 
> My proposed change is below, in case you have any improvements to suggest.

As this relates to networking, you could also hook perturbation into rx/tx
softirq processing.  E.g. once for each new napi poll round or only once
for each softnet invocation, depending on cost.

IIRC the proposed draft left an unused prandom_seed() stub around, you could
re-use that to place extra data to include in the hash in percpu data.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 12:01             ` David Laight
@ 2020-08-10 14:48               ` Willy Tarreau
  0 siblings, 0 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-10 14:48 UTC (permalink / raw)
  To: David Laight
  Cc: George Spelvin, netdev, aksecurity, torvalds, edumazet, Jason,
	luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw

On Mon, Aug 10, 2020 at 12:01:11PM +0000, David Laight wrote:
> >  /*
> >   * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash.
> >   * This is weaker, but 32-bit machines are not used for high-traffic
> > @@ -375,6 +377,12 @@ static u32 siprand_u32(struct siprand_state *s)
> >  {
> >  	unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
> > 
> > +	if (++s->count >= 8) {
> > +		v3 ^= s->noise;
> > +		s->noise += random_get_entropy();
> > +		s->count = 0;
> > +	}
> > +
> 
> Using:
> 	if (s->count-- <= 0) {
> 		...
> 		s->count = 8;
> 	}
> probably generates better code.
> Although you may want to use a 'signed int' instead of 'unsigned long'.

Yeah I know, it's just because I only slightly changed the previous code
there. I had an earlier version that kept the rand state fully padded
when storing intermediate values. That's among the final cleanups I'll
bring if we go down that route.

Thanks!

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 12:03             ` Florian Westphal
@ 2020-08-10 14:53               ` Willy Tarreau
  0 siblings, 0 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-10 14:53 UTC (permalink / raw)
  To: Florian Westphal
  Cc: George Spelvin, netdev, aksecurity, torvalds, edumazet, Jason,
	luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen

On Mon, Aug 10, 2020 at 02:03:02PM +0200, Florian Westphal wrote:
> As this relates to networking, you could also hook perturbation into rx/tx
> softirq processing.  E.g. once for each new napi poll round or only once
> for each softnet invocation, depending on cost.

I was wondering what/where to add some processing. I was thinking about
having a per_cpu "noise" variable that would get mixed with the randoms
when generated. This "noise" would need to be global so that we can easily
append to it. For example on the network path it would be nice to use
checksums but nowadays they're not calculated anymore.

> IIRC the proposed draft left a unused prandom_seed() stub around, you could
> re-use that to place extra data to include in the hash in percpu data.

Probably. What I saw was that prandom_seed() expected to perform a full
(hence slow) reseed. Instead I'd like to do something cheap. I like the
principle of "noise" in that it doesn't promise to bring any security,
only to perturb a little bit.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 11:47           ` Willy Tarreau
  2020-08-10 12:01             ` David Laight
  2020-08-10 12:03             ` Florian Westphal
@ 2020-08-10 16:31             ` Linus Torvalds
  2020-08-10 16:58               ` Willy Tarreau
  2020-08-11  3:47             ` George Spelvin
  3 siblings, 1 reply; 68+ messages in thread
From: Linus Torvalds @ 2020-08-10 16:31 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger, Florian Westphal

On Mon, Aug 10, 2020 at 4:47 AM Willy Tarreau <w@1wt.eu> wrote:
>
> Doing testing on real hardware showed that retrieving the TSC on every
> call had a non-negligible cost, causing a loss of 2.5% on the accept()
> rate and 4% on packet rate when using iptables -m statistics.

And by "real hardware" I assume you mean x86, with a fairly fast and
high-performance TSC behind random_get_entropy().

Reading the TSC takes on the order of 20-50 cycles, iirc.

But it can actually be *much* more expensive. On non-x86, it can be an
IO cycle to external chips.

And on older hardware VM's in x86, it can be a vm exit etc, so
thousands of cycles. I hope nobody uses those VM's any more, but it
would be a reasonable test-case for some non-x86 implementations, so..

IOW, no. You guys are - once again - ignoring reality.

            Linus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 16:31             ` Linus Torvalds
@ 2020-08-10 16:58               ` Willy Tarreau
  2020-08-10 17:45                 ` Linus Torvalds
  0 siblings, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-10 16:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger, Florian Westphal

On Mon, Aug 10, 2020 at 09:31:48AM -0700, Linus Torvalds wrote:
> On Mon, Aug 10, 2020 at 4:47 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > Doing testing on real hardware showed that retrieving the TSC on every
> > call had a non-negligible cost, causing a loss of 2.5% on the accept()
> > rate and 4% on packet rate when using iptables -m statistics.
> 
> And by "real hardware" I assume you mean x86, with a fairly fast and
> high-performance TSC for get_random_entropy().

Yep.

> Reading the TSC takes on the order of 20-50 cycles, iirc.
> 
> But it can actually be *much* more expensive. On non-x86, it can be an
> IO cycle to external chips.

I took what we were already using in add_interrupt_randomness() since
I considered that if it was acceptable there, it probably was elsewhere.

> And on older hardware VM's in x86, it can be a vm exit etc, so
> thousands of cycles. I hope nobody uses those VM's any more, but it
> would be a reasonable test-case for some non-x86 implementations, so..

Yes, I remember these ones, they were not fun at all.

> IOW, no. You guys are - once again - ignoring reality.

I'm not ignoring reality; quite the opposite, I'm trying to take all
available knowledge into account. If I'm missing some points, fine. But
since we were already calling this in the interrupt handler, I expected
it would be OK here as well.

The alternative Florian talked about is quite interesting as well,
which is to collect some cheap noise in the network rx/tx paths since
these are the areas we care about.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 16:58               ` Willy Tarreau
@ 2020-08-10 17:45                 ` Linus Torvalds
  2020-08-10 18:01                   ` Willy Tarreau
  2020-08-10 21:04                   ` Willy Tarreau
  0 siblings, 2 replies; 68+ messages in thread
From: Linus Torvalds @ 2020-08-10 17:45 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger, Florian Westphal

On Mon, Aug 10, 2020 at 9:59 AM Willy Tarreau <w@1wt.eu> wrote:
>
> I took what we were already using in add_interrupt_randomness() since
> I considered that if it was acceptable there, it probably was elsewhere.

Once you've taken an interrupt, you're doing IO anyway, and the
interrupt costs will dominate anything you do.

But the prandom_u32() interface is potentially done many times per
interrupt. For all I know it's done inside fairly critical locks etc
too.

So I don't think one usage translates to another very well.

              Linus

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 17:45                 ` Linus Torvalds
@ 2020-08-10 18:01                   ` Willy Tarreau
  2020-08-10 21:04                   ` Willy Tarreau
  1 sibling, 0 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-10 18:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger, Florian Westphal

On Mon, Aug 10, 2020 at 10:45:26AM -0700, Linus Torvalds wrote:
> On Mon, Aug 10, 2020 at 9:59 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > I took what we were already using in add_interrupt_randomness() since
> > I considered that if it was acceptable there, it probably was elsewhere.
> 
> Once you've taken an interrupt, you're doing IO anyway, and the
> interrupt costs will dominate anything you do.
> 
> But the prandom_u32() interface is potentially done many times per
> interrupt. For all I know it's done inside fairly critical locks etc
> too.
> 
> So I don't think one usage translates to another very well.

Possible, hence the better solution to just feed noise in hot paths.
Using jiffies and the skb pointer in xmit is better than nothing anyway.
I'm not seeking anything extreme; I just want to make sure we don't get
yet another report on this area next summer because some researcher
using two VMs discovers that attacking a 100% idle VM from another one
running on the same CPU core is trivial once some memory data has been
stolen.  If at least such an attack is tedious and rarely works, the
rest of the job is done by siphash and we should be fine.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 17:45                 ` Linus Torvalds
  2020-08-10 18:01                   ` Willy Tarreau
@ 2020-08-10 21:04                   ` Willy Tarreau
  2020-08-11  5:26                     ` George Spelvin
  1 sibling, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-10 21:04 UTC (permalink / raw)
  To: Linus Torvalds, George Spelvin, Florian Westphal
  Cc: Netdev, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andrew Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Theodore Ts'o, Marc Plumb, Stephen Hemminger

Linus, George, Florian,

would something in this vein be OK in your opinion?

- update_process_times still updates the noise
- we don't touch the fast_pool anymore
- we don't read any TSC on hot paths
- we update the noise in xmit from jiffies and a few pointer values instead

I've applied it on top of George's patch rebased to mainline for simplicity.
I've used a separate per_cpu noise variable to keep the net_rand_state static
with its __latent_entropy.

With this I'm even getting very slightly better performance than with
the previous patch (195.7 - 197.8k cps).

What could be improved is the way the input values are mixed (just
added hence commutative for now). I didn't want to call a siphash
round on the hot paths, but just shifting the previous noise value
before adding would work, such as the following for example:

  void prandom_u32_add_noise(a, b, c, d)
  {
  	unsigned long *noise = get_cpu_ptr(&net_rand_noise);

  #if BITS_PER_LONG == 64
	*noise = rol64(*noise, 7) + a + b + c + d;
  #else
	*noise = rol32(*noise, 7) + a + b + c + d;
  #endif
  	put_cpu_ptr(&net_rand_noise);

  }
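As a user-space illustration of why the rotate matters (all names here are mine, not kernel code): with a plain sum, the order in which successive noise values arrive is lost, while rotating the accumulator between calls preserves it.

```c
#include <stdint.h>

/* Sketch of the two mixing options above: plain addition vs.
 * rotate-then-add.  With plain addition the order of successive calls
 * is lost; the rotation distinguishes earlier from later contributions. */
static uint64_t rol64_demo(uint64_t x, unsigned r)
{
	return (x << r) | (x >> (64 - r));
}

/* plain sum: commutative across calls */
static uint64_t mix_add(uint64_t noise, uint64_t v)
{
	return noise + v;
}

/* rotate the accumulated noise before adding, as in the sketch above */
static uint64_t mix_rol(uint64_t noise, uint64_t v)
{
	return rol64_demo(noise, 7) + v;
}
```

Note that within one call the four summed arguments remain interchangeable; the rotation only orders successive calls relative to each other.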

Thanks,
Willy

---


diff --git a/drivers/char/random.c b/drivers/char/random.c
index d20ba1b104ca..2a41b21623ae 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1277,7 +1277,6 @@ void add_interrupt_randomness(int irq, int irq_flags)
 
 	fast_mix(fast_pool);
 	add_interrupt_bench(cycles);
-	this_cpu_add(net_rand_state.s1, fast_pool->pool[cycles & 3]);
 
 	if (unlikely(crng_init == 0)) {
 		if ((fast_pool->count >= 64) &&
diff --git a/include/linux/prandom.h b/include/linux/prandom.h
index aa16e6468f91..e2b4990f2126 100644
--- a/include/linux/prandom.h
+++ b/include/linux/prandom.h
@@ -20,7 +20,7 @@ struct rnd_state {
 	__u32 s1, s2, s3, s4;
 };
 
-DECLARE_PER_CPU(struct rnd_state, net_rand_state);
+DECLARE_PER_CPU(unsigned long, net_rand_noise);
 
 u32 prandom_u32_state(struct rnd_state *state);
 void prandom_bytes_state(struct rnd_state *state, void *buf, size_t nbytes);
@@ -67,6 +67,7 @@ static inline void prandom_seed_state(struct rnd_state *state, u64 seed)
 	state->s2 = __seed(i,   8U);
 	state->s3 = __seed(i,  16U);
 	state->s4 = __seed(i, 128U);
+	__this_cpu_add(net_rand_noise, i);
 }
 
 /* Pseudo random number generator from numerical recipes. */
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index ae5029f984a8..2f07c569af77 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1721,7 +1721,7 @@ void update_process_times(int user_tick)
 	 * non-constant value that's not affine to the number of calls to make
 	 * sure it's updated when there's some activity (we don't care in idle).
 	 */
-	this_cpu_add(net_rand_state.s1, rol32(jiffies, 24) + user_tick);
+	__this_cpu_add(net_rand_noise, rol32(jiffies, 24) + user_tick);
 }
 
 /**
diff --git a/lib/random32.c b/lib/random32.c
index 2b048e2ea99f..d74cf1db8cc9 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -320,6 +320,8 @@ struct siprand_state {
 };
 
 static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
+DEFINE_PER_CPU(unsigned long, net_rand_noise);
+EXPORT_SYMBOL(net_rand_noise);
 
 #if BITS_PER_LONG == 64
 /*
@@ -334,7 +336,7 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
 #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
 #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
 
-#elif BITS_PER_LONG == 23
+#elif BITS_PER_LONG == 32
 /*
  * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash.
  * This is weaker, but 32-bit machines are not used for high-traffic
@@ -374,9 +376,12 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
 static u32 siprand_u32(struct siprand_state *s)
 {
 	unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
+	unsigned long n = __this_cpu_read(net_rand_noise);
 
+	v3 ^= n;
 	SIPROUND(v0, v1, v2, v3);
 	SIPROUND(v0, v1, v2, v3);
+	v0 ^= n;
 	s->v[0] = v0;  s->v[1] = v1;  s->v[2] = v2;  s->v[3] = v3;
 	return v1 + v3;
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index 7df6c9617321..55a2471cd75b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -144,6 +144,7 @@
 #include <linux/indirect_call_wrapper.h>
 #include <net/devlink.h>
 #include <linux/pm_runtime.h>
+#include <linux/prandom.h>
 
 #include "net-sysfs.h"
 
@@ -3557,6 +3558,7 @@ static int xmit_one(struct sk_buff *skb, struct net_device *dev,
 		dev_queue_xmit_nit(skb, dev);
 
 	len = skb->len;
+	__this_cpu_add(net_rand_noise, (long)skb + (long)dev + (long)txq + len + jiffies);
 	trace_net_dev_start_xmit(skb, dev);
 	rc = netdev_start_xmit(skb, dev, txq, more);
 	trace_net_dev_xmit(skb, rc, dev, len);
@@ -4129,6 +4131,7 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
 			if (!skb)
 				goto out;
 
+			__this_cpu_add(net_rand_noise, (long)skb + (long)dev + (long)txq + jiffies);
 			HARD_TX_LOCK(dev, txq, cpu);
 
 			if (!netif_xmit_stopped(txq)) {
@@ -4194,6 +4197,7 @@ int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
 
 	skb_set_queue_mapping(skb, queue_id);
 	txq = skb_get_tx_queue(dev, skb);
+	__this_cpu_add(net_rand_noise, (long)skb + (long)dev + (long)txq + jiffies);
 
 	local_bh_disable();
 

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 11:47           ` Willy Tarreau
                               ` (2 preceding siblings ...)
  2020-08-10 16:31             ` Linus Torvalds
@ 2020-08-11  3:47             ` George Spelvin
  2020-08-11  3:58               ` Willy Tarreau
  3 siblings, 1 reply; 68+ messages in thread
From: George Spelvin @ 2020-08-11  3:47 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen, fw, George Spelvin

On Mon, Aug 10, 2020 at 01:47:00PM +0200, Willy Tarreau wrote:
> except that I retrieve it only on 1/8 calls
> and use the previous noise in this case.

Er... that's quite different.  I was saying you measure them all, and do:

 struct siprand_state {
 	...
 +	uint32_t noise[8];
+	unsigned counter;
 }
 	...
+	s->noise[--s->counter] = random_get_entropy();
+
+	if (!s->counter) {
+		for (i = 0; i < 4; i++)
+			s->v[i] += s->noise[2*i] +
+				((unsigned long)s->noise[2*i+1] << BITS_PER_LONG/2);
+		s->counter = 8;
+	}

What you're doing is just decreasing the amount of seeding by a factor
of 8.  (Roughly.  You do gain log2(8)/2 = 1.5 bits because the sum of
8 random values has a standard deviation sqrt(8) times as large as
the inputs.)

> diff --git a/lib/random32.c b/lib/random32.c
> index 2b048e2ea99f..a12d63028106 100644
> --- a/lib/random32.c
> +++ b/lib/random32.c
> @@ -317,6 +317,8 @@ static void __init prandom_state_selftest(void)
>  
>  struct siprand_state {
>  	unsigned long v[4];
> +	unsigned long noise;
> +	unsigned long count;
>  };
>  
>  static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
> @@ -334,7 +336,7 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
>  #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
>  #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
>  
> -#elif BITS_PER_LONG == 23
> +#elif BITS_PER_LONG == 32
>  /*
>   * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash.
>   * This is weaker, but 32-bit machines are not used for high-traffic
> @@ -375,6 +377,12 @@ static u32 siprand_u32(struct siprand_state *s)
>  {
>  	unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];
>  
> +	if (++s->count >= 8) {
> +		v3 ^= s->noise;
> +		s->noise += random_get_entropy();
> +		s->count = 0;
> +	}
> +

- Can you explain why you save the "noise" until next time?  Is this meant to
  make it harder for an attacker to observe the time?
- How about doing away with s->count and making it statistical:

+	if ((v3 & 7) == 0)
+		v3 ^= random_get_entropy();

That still does the seed 1/8 of the time, but in a much less regular pattern.
(Admittedly, it will totally break the branch predictor.  An unlikely()
might help.)
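A toy model of the statistical variant (my own code, not the kernel's generator: a 64-bit LCG stands in for the evolving state, and its high bits are tested because a power-of-two LCG's low bits cycle too regularly): the reseed still fires on about 1/8 of the calls, just not on a fixed schedule.

```c
#include <stdint.h>

/* Toy model of the statistical reseed test.  toy_next() is a 64-bit
 * LCG (Knuth's MMIX constants) standing in for the evolving v3 word.
 * A 3-bit test on the top bits fires with probability 1/8, but at
 * irregular intervals rather than exactly every 8th call. */
static uint64_t toy_next(uint64_t x)
{
	return x * 6364136223846793005ULL + 1442695040888963407ULL;
}

static unsigned count_reseeds(unsigned calls)
{
	uint64_t v3 = 12345;
	unsigned hits = 0;

	while (calls--) {
		v3 = toy_next(v3);
		if ((v3 >> 61) == 0)	/* top 3 bits zero: p = 1/8 */
			hits++;
	}
	return hits;
}
```

Over 80000 simulated calls this lands close to the expected 10000 reseeds, while their spacing varies call to call.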


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-11  3:47             ` George Spelvin
@ 2020-08-11  3:58               ` Willy Tarreau
  0 siblings, 0 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-11  3:58 UTC (permalink / raw)
  To: George Spelvin
  Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook,
	tglx, peterz, tytso, lkml.mplumb, stephen, fw

On Tue, Aug 11, 2020 at 03:47:24AM +0000, George Spelvin wrote:
> On Mon, Aug 10, 2020 at 01:47:00PM +0200, Willy Tarreau wrote:
> > except that I retrieve it only on 1/8 calls
> > and use the previous noise in this case.
> 
> Er... that's quite different.  I was saying you measure them all, and do:

That was my first approach and it resulted in a significant performance
loss, hence the change (and the resulting ugly construct with the counter).

> > +	if (++s->count >= 8) {
> > +		v3 ^= s->noise;
> > +		s->noise += random_get_entropy();
> > +		s->count = 0;
> > +	}
> > +
> 
> - Can you explain why you save the "noise" until next time?  Is this meant to
>   make it harder for an attacker to observe the time?

Just to make the observable output not depend on an immediately
measurable TSC value.  It's weak, but the point is that when attack
traffic is mixed with regular traffic, an attacker who can measure TSC
timing variations to tell when the entropy was sampled, but does not
see its effect on the output at the same time, has a harder analysis
job than one who observes both at once.

> - How about doing away with s->count and making it statistical:
> 
> +	if ((v3 & 7) == 0)
> +		v3 ^= random_get_entropy();

Absolutely. I just kept the counter from the previous attempt. But
Linus prefers that we completely remove TSC reads from the direct call
paths because of old VMs that were slow at this. We could instead
collect it somewhere else once in a while.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-10 21:04                   ` Willy Tarreau
@ 2020-08-11  5:26                     ` George Spelvin
  2020-08-11  5:37                       ` Willy Tarreau
  0 siblings, 1 reply; 68+ messages in thread
From: George Spelvin @ 2020-08-11  5:26 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Linus Torvalds, Florian Westphal, Netdev, Amit Klein,
	Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger, George Spelvin

On Mon, Aug 10, 2020 at 11:04:55PM +0200, Willy Tarreau wrote:
> What could be improved is the way the input values are mixed (just
> added hence commutative for now). I didn't want to call a siphash
> round on the hot paths, but just shifting the previous noise value
> before adding would work, such as the following for example:
> 
>   void prandom_u32_add_noise(a, b, c, d)
>   {
>   	unsigned long *noise = get_cpu_ptr(&net_rand_noise);
> 
>   #if BITS_PER_LONG == 64
> 	*noise = rol64(*noise, 7) + a + b + c + d;
>   #else
> 	*noise = rol32(*noise, 7) + a + b + c + d;
>   #endif
>   	put_cpu_ptr(&net_rand_noise);
> 
>   }

If you think this is enough seed material, I'm fine with it.

I don't hugely like the fact that you sum all the inputs, since
entropy tends to be concentrated in the low-order words, and summing
risks cancellation.

You can't afford even one SIPROUND as a non-cryptographic hash?  E.g.

DEFINE_PER_CPU(unsigned long[4], net_rand_noise);
EXPORT_SYMBOL(net_rand_noise);

void prandom_u32_add_noise(a, b, c, d)
{
	unsigned long *noise = get_cpu_ptr(&net_rand_noise);

	a ^= noise[0];
	b ^= noise[1];
	c ^= noise[2];
	d ^= noise[3];
	/*
	 * This is not used cryptographically; it's just
	 * a convenient 4-word hash function.
	 */
	SIPROUND(a, b, c, d);
	noise[0] = a;
	noise[1] = b;
	noise[2] = c;
	noise[3] = d;
	put_cpu_ptr(&net_rand_noise);
}

(And then you mix in net_rand_noise[0].)

Other options are HASH_MIX() from fs/namei.c, but that's
more sequential.

There's also a simple Xorshift generator.
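For reference, one common instance of such a generator is Marsaglia's xorshift32 with the usual 13/17/5 shift constants; shown here only to illustrate the option, not as a proposed patch.

```c
#include <stdint.h>

/* Marsaglia's xorshift32: three shift/xor steps give full period
 * 2^32 - 1 over the nonzero states.  Zero is a fixed point, so a
 * zero state must be avoided when seeding. */
static uint32_t xorshift32(uint32_t x)
{
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	return x;
}
```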

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-11  5:26                     ` George Spelvin
@ 2020-08-11  5:37                       ` Willy Tarreau
  0 siblings, 0 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-11  5:37 UTC (permalink / raw)
  To: George Spelvin
  Cc: Linus Torvalds, Florian Westphal, Netdev, Amit Klein,
	Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Tue, Aug 11, 2020 at 05:26:49AM +0000, George Spelvin wrote:
> On Mon, Aug 10, 2020 at 11:04:55PM +0200, Willy Tarreau wrote:
> > What could be improved is the way the input values are mixed (just
> > added hence commutative for now). I didn't want to call a siphash
> > round on the hot paths, but just shifting the previous noise value
> > before adding would work, such as the following for example:
> > 
> >   void prandom_u32_add_noise(a, b, c, d)
> >   {
> >   	unsigned long *noise = get_cpu_ptr(&net_rand_noise);
> > 
> >   #if BITS_PER_LONG == 64
> > 	*noise = rol64(*noise, 7) + a + b + c + d;
> >   #else
> > 	*noise = rol32(*noise, 7) + a + b + c + d;
> >   #endif
> >   	put_cpu_ptr(&net_rand_noise);
> > 
> >   }
> 
> If you think this is enough seed material, I'm fine with it.
> 
> I don't hugely like the fact that you sum all the inputs, since
> entropy tends to be concentrated in the low-order words, and summing
> risks cancellation.

Yes I've figured this. But I thought it was still better than
a pure xor which would cancel the high bits from pointers.

> You can't afford even one SIPROUND as a non-cryptographic hash?  E.g.

That's what I mentioned above, I'm still hesitating. I need to test.

> 
> DEFINE_PER_CPU(unsigned long[4], net_rand_noise);
> EXPORT_SYMBOL(net_rand_noise);
> 
> void prandom_u32_add_noise(a, b, c, d)
> {
> 	unsigned long *noise = get_cpu_ptr(&net_rand_noise);
> 
> 	a ^= noise[0];
> 	b ^= noise[1];
> 	c ^= noise[2];
> 	d ^= noise[3];
> 	/*
> 	 * This is not used cryptographically; it's just
> 	 * a convenient 4-word hash function.
> 	 */
> 	SIPROUND(a, b, c, d);
> 	noise[0] = a;
> 	noise[1] = b;
> 	noise[2] = c;
> 	noise[3] = d;
> 	put_cpu_ptr(&net_rand_noise);
> }
> 
> (And then you mix in net_rand_noise[0].)
> 
> Other options are HASH_MIX() from fs/namei.c, but that's
> more sequential.
> 
> There's also a simple Xorshift generator.

I think a xorshift on each value will have roughly the same cost
as a single SIPROUND. But I haven't completely ruled these options
out yet; I still need to test. If we lose a few cycles per packet,
that might be OK.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: Flaw in "random32: update the net random state on interrupt and activity"
  2020-08-09  2:07           ` Linus Torvalds
@ 2020-08-11 16:01             ` Eric Dumazet
  0 siblings, 0 replies; 68+ messages in thread
From: Eric Dumazet @ 2020-08-11 16:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: George Spelvin, Willy Tarreau, Netdev, Amit Klein,
	Jason A. Donenfeld, Andrew Lutomirski, Kees Cook,
	Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb,
	Stephen Hemminger

On Sat, Aug 8, 2020 at 7:07 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sat, Aug 8, 2020 at 3:28 PM George Spelvin <lkml@sdf.org> wrote:
> >
> > It's not a theoretical hole, it's a very real one.  Other than the cycles
> > to do the brute-force part, it's not even all that complicated.  The
> > theory part is that it's impossible to patch.
>
> We'll just disagree.
>
> I'm really fed up with security holes that are brought on by crypto
> people not being willing to just do reasonable things.
>
> > *If* you do the stupid thing.  WHICH YOU COULD JUST STOP DOING.
>
> We're not feeding all the same bits to the /dev/random that we're
> using to also update the pseudo-random state, so I think you're
> overreacting. Think of it as "/dev/random gets a few bits, and prandom
> gets a few other bits".
>
> The fact that there is overlap between the bits is immaterial, when
> other parts are disjoint. Fractional bits of entropy and all that.
>
> > Then explain it to me.  What is the actual *problem*?  Nobody's described
> > one, so I've been guessing.  What is this *monumentally stupid* abuse of
> > /dev/random allegedly fixing?
>
> The problem is that the networking people really want unguessable
> random state. There was a remote DNS spoofing/poisoning attack because
> the UDP ports ended up being guessable.
>
> And that was actually reported to us back in early March.
>

Another typical use of prandom_u32() is the one in
tcp_conn_request(), when processing a SYN packet.

My fear was that adding many more CPU cycles to prandom_u32() would
reduce our ability to cope with a SYN flood attack, but looking more
closely at tcp_conn_request(), there might be a way to remove the
prandom_u32() call when we generate a syncookie, by reflecting the
incoming skb hash (if already populated).
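For readers unfamiliar with the `?:` shorthand in the hunk below: `a ?: b` is a GNU C extension that yields `a` when it is nonzero (evaluating it only once) and `b` otherwise. A minimal user-space sketch (the helper name is mine):

```c
/* GNU C's binary "?:" as used for txhash below: keep a populated skb
 * hash, and fall back to 1 so the stored value is never 0 ("unset"). */
static unsigned int pick_txhash(unsigned int skb_hash)
{
	return skb_hash ?: 1;
}
```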

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 184ea556f50e35141a4be5940c692db41e09f464..fc698a8ea13b1b6a6bd952308d11414eadfa4eaf
100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6740,10 +6740,12 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
                isn = cookie_init_sequence(af_ops, sk, skb, &req->mss);
                if (!tmp_opt.tstamp_ok)
                        inet_rsk(req)->ecn_ok = 0;
+               tcp_rsk(req)->txhash = skb->hash ?: 1;
+       } else {
+               tcp_rsk(req)->txhash = net_tx_rndhash();
        }

        tcp_rsk(req)->snt_isn = isn;
-       tcp_rsk(req)->txhash = net_tx_rndhash();
        tcp_openreq_init_rwin(req, sk, dst);
        sk_rx_queue_set(req_to_sk(req), skb);
        if (!want_cookie) {


BTW we could add a trace event so that the answer to 'who is using
prandom_u32' could be easily answered.

diff --git a/include/trace/events/random.h b/include/trace/events/random.h
index 32c10a515e2d5438e8d620a0c2313aab5f849b2b..9570a10cb949b5792c4290ba8e82a077ac655069
100644
--- a/include/trace/events/random.h
+++ b/include/trace/events/random.h
@@ -307,6 +307,23 @@ TRACE_EVENT(urandom_read,
                  __entry->pool_left, __entry->input_left)
 );

+TRACE_EVENT(prandom_u32,
+
+       TP_PROTO(unsigned int ret),
+
+       TP_ARGS(ret),
+
+       TP_STRUCT__entry(
+               __field(   unsigned int, ret)
+       ),
+
+       TP_fast_assign(
+               __entry->ret = ret;
+       ),
+
+       TP_printk("ret=%u" , __entry->ret)
+);
+
 #endif /* _TRACE_RANDOM_H */

 /* This part must be outside protection */
diff --git a/lib/random32.c b/lib/random32.c
index 3d749abb9e80d54d8e330e07fb8b773b7bec2b83..932345323af092a93fc2690b0ebbf4f7485ae4f3
100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -39,6 +39,7 @@
 #include <linux/random.h>
 #include <linux/sched.h>
 #include <asm/unaligned.h>
+#include <trace/events/random.h>

 #ifdef CONFIG_RANDOM32_SELFTEST
 static void __init prandom_state_selftest(void);
@@ -82,6 +83,7 @@ u32 prandom_u32(void)
        u32 res;

        res = prandom_u32_state(state);
+       trace_prandom_u32(res);
        put_cpu_var(net_rand_state);

        return res;

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-27  1:09                                             ` Willy Tarreau
@ 2020-08-27  7:08                                               ` Sedat Dilek
  0 siblings, 0 replies; 68+ messages in thread
From: Sedat Dilek @ 2020-08-27  7:08 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Amit Klein, Eric Dumazet, George Spelvin, Linus Torvalds,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

On Thu, Aug 27, 2020 at 3:09 AM Willy Tarreau <w@1wt.eu> wrote:
>
> Hi Amit,
>
> On Thu, Aug 27, 2020 at 02:06:39AM +0300, Amit Klein wrote:
> > Hi
> >
> > Is there an ETA for this patch then?
>
> No particular ETA on my side, I was waiting for potential criticisms
> before going further. I suspect that if nobody complains anymore, it's
> an implicit ack and I'll have to clean up and post the series then.
>

Hi Willy,

again a feedback from me as a tester of the last patchset.

I have tested it in several build-environments:

#1: GCC v10.2 + GNU/ld (BFD)
#2: GCC v10.2 + LLVM/ld.lld (LLD)
#3: GCC v10.2 + LLD plus all available LLVM "bin"utils v11.0.0-rc2
(replacements for GNU's nm, objdump, objcopy, ar, strings, ranlib, etc.)
#4: LLVM toolchain v11.0.0-rc2 and Clang-IAS (Integrated ASsembler)
#5: LLVM toolchain v11.0.0-rc2 and Clang-LTO (LinkTime Optimization)
#6: LLVM toolchain v11.0.0-rc2 and Clang-CFI (Control Flow Integrity)

What are you waiting for :-)?

Feel free to add my...

Tested-by: Sedat Dilek <sedat.dilek@gmail.com>

If you are interested in Clang-IAS/Clang-LTO/Clang-CFI... I want to
promote the LLVM MC (micro-conference) at Linux Plumbers Conference
2020 today (27-Aug-2020) - starting 16:00 UTC.
AFAICS it is freely available via YouTube (untested by me).

- Sedat -

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
       [not found]                                           ` <CANEQ_+L+22Hkdqf38Zr0bfq16fcL1Ax2X9fToXV_niHKXCB8aA@mail.gmail.com>
@ 2020-08-27  1:09                                             ` Willy Tarreau
  2020-08-27  7:08                                               ` Sedat Dilek
  0 siblings, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-27  1:09 UTC (permalink / raw)
  To: Amit Klein
  Cc: Sedat Dilek, Eric Dumazet, George Spelvin, Linus Torvalds,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

Hi Amit,

On Thu, Aug 27, 2020 at 02:06:39AM +0300, Amit Klein wrote:
> Hi
> 
> Is there an ETA for this patch then?

No particular ETA on my side, I was waiting for potential criticisms
before going further. I suspect that if nobody complains anymore, it's
an implicit ack and I'll have to clean up and post the series then.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  8:05                                         ` Willy Tarreau
@ 2020-08-20 12:08                                           ` Sedat Dilek
       [not found]                                           ` <CANEQ_+L+22Hkdqf38Zr0bfq16fcL1Ax2X9fToXV_niHKXCB8aA@mail.gmail.com>
  1 sibling, 0 replies; 68+ messages in thread
From: Sedat Dilek @ 2020-08-20 12:08 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

[-- Attachment #1: Type: text/plain, Size: 2348 bytes --]

On Thu, Aug 20, 2020 at 10:05 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Thu, Aug 20, 2020 at 08:58:38AM +0200, Willy Tarreau wrote:
> > I've just pushed a new branch "20200820-siphash-noise" that I also
> > rebased onto latest master. It's currently running make allmodconfig
> > here, so that will take a while, but I think it's OK as random32.o is
> > already OK. I've also addressed a strange build issue caused by having
> > an array instead of 4 ints in siprand_state.
> >
> > Please just let me know if that's OK for you now.
>
> At least it worked for me now (built with no errors on x86_64):
>
> $ time make -j 8 bzImage modules
> (...)
> real    65m7.986s
> user    477m22.477s
> sys     38m0.545s
> $ find . -name '*.ko' |wc -l
> 7983
>

Runs fine here.

Thanks Willy for the "20200820-siphash-noise" [1] patchset and for
including fixes for all my reported issues.

[1] Staging driver build failures fixed by...
WIP: random32: rename the K0/K1 SipHash constants to PRND_K*

[2] modpost undefined "net_rand_noise" errors fixed by...
WIP: random32: export net_rand_noise

[3] Consolidate/move/cleanup stuff in random32.c and prandom.h
WIP: random32: keep a single macro definition for sipround

This patchset looks very good to me :-).

[ TESTING WITH LTP ]

I ran another perf session with only the
"LTP::net.features::tcp_fastopen" test.
I'm not sure whether more appropriate LTP tests exist;
for example, there is "net_stress.*" (see [2]).

[ PERF-SESSION ]

Link: https://github.com/ClangBuiltLinux/linux/issues/1086#issuecomment-675783804

/home/dileks/bin/perf list | grep prandom_u32 | column -t
random:prandom_u32  [Tracepoint  event]

cd /opt/ltp

echo 0 | tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid

/home/dileks/bin/perf record -a -g -e random:prandom_u32 ./runltp -f
net.features -s tcp_fastopen
/home/dileks/bin/perf report --no-children --stdio > ./perf-report.txt
/home/dileks/bin/perf script > ./perf-script.txt

echo 1 | tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid

[ /PERF-SESSION ]

For a "comparison" (?) I attached two perf-reports - the newer one
includes Willy's latest patchset.

- Sedat -

[1] https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom.git/log/?h=20200820-siphash-noise
[2] https://github.com/linux-test-project/ltp/tree/master/runtest

[-- Attachment #2: perf-report-2020-08-19.txt --]
[-- Type: text/plain, Size: 21043 bytes --]

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 120K of event 'random:prandom_u32'
# Event count (approx.): 120473
#
# Overhead  Command          Shared Object      Symbol         
# ........  ...............  .................  ...............
#
    59.67%  netstress        [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
               |--33.21%--tcp_v4_connect
               |          __inet_stream_connect
               |          |          
               |          |--22.14%--inet_stream_connect
               |          |          __sys_connect
               |          |          __x64_sys_connect
               |          |          __noinstr_text_start
               |          |          entry_SYSCALL_64_after_hwframe
               |          |          __libc_connect
               |          |          0x65736e7500632e73
               |          |          
               |           --11.07%--tcp_sendmsg_locked
               |                     tcp_sendmsg
               |                     __sys_sendto
               |                     __x64_sys_sendto
               |                     __noinstr_text_start
               |                     entry_SYSCALL_64_after_hwframe
               |                     __libc_sendto
               |          
               |--16.61%--tcp_v6_connect
               |          __inet_stream_connect
               |          |          
               |          |--11.07%--inet_stream_connect
               |          |          __sys_connect
               |          |          __x64_sys_connect
               |          |          __noinstr_text_start
               |          |          entry_SYSCALL_64_after_hwframe
               |          |          __libc_connect
               |          |          0x65736e7500632e73
               |          |          
               |           --5.54%--tcp_sendmsg_locked
               |                     tcp_sendmsg
               |                     __sys_sendto
               |                     __x64_sys_sendto
               |                     __noinstr_text_start
               |                     entry_SYSCALL_64_after_hwframe
               |                     __libc_sendto
               |          
               |--6.64%--tcp_conn_request
               |          tcp_rcv_state_process
               |          |          
               |          |--3.35%--tcp_v4_do_rcv
               |          |          tcp_v4_rcv
               |          |          ip_protocol_deliver_rcu
               |          |          ip_local_deliver
               |          |          ip_rcv
               |          |          __netif_receive_skb
               |          |          process_backlog
               |          |          napi_poll
               |          |          net_rx_action
               |          |          __softirqentry_text_start
               |          |          asm_call_on_stack
               |          |          do_softirq_own_stack
               |          |          |          
               |          |          |--2.05%--__irq_exit_rcu
               |          |          |          |          
               |          |          |           --2.04%--sysvec_apic_timer_interrupt
               |          |          |                     asm_sysvec_apic_timer_interrupt
               |          |          |          
               |          |           --1.30%--do_softirq
               |          |                     __local_bh_enable_ip
               |          |                     |          
               |          |                      --1.27%--ip_finish_output2
               |          |                                ip_output
               |          |                                __ip_queue_xmit
               |          |                                __tcp_transmit_skb
               |          |                                |          
               |          |                                 --1.20%--tcp_write_xmit
               |          |                                           |          
               |          |                                            --1.02%--__tcp_push_pending_frames
               |          |                                                      tcp_sendmsg_locked
               |          |                                                      tcp_sendmsg
               |          |                                                      |          
               |          |                                                       --0.63%--__sys_sendto
               |          |                                                                 __x64_sys_sendto
               |          |                                                                 __noinstr_text_start
               |          |                                                                 entry_SYSCALL_64_after_hwframe
               |          |          
               |           --3.29%--tcp_v6_do_rcv
               |                     tcp_v6_rcv
               |                     ip6_protocol_deliver_rcu
               |                     ip6_input
               |                     ipv6_rcv
               |                     __netif_receive_skb
               |                     process_backlog
               |                     napi_poll
               |                     net_rx_action
               |                     __softirqentry_text_start
               |                     asm_call_on_stack
               |                     do_softirq_own_stack
               |                     |          
               |                     |--2.02%--__irq_exit_rcu
               |                     |          sysvec_apic_timer_interrupt
               |                     |          asm_sysvec_apic_timer_interrupt
               |                     |          
               |                      --1.28%--do_softirq
               |                                __local_bh_enable_ip
               |                                |          
               |                                 --1.23%--ip6_finish_output2
               |                                           ip6_output
               |                                           ip6_xmit
               |                                           inet6_csk_xmit
               |                                           __tcp_transmit_skb
               |                                           |          
               |                                            --1.14%--tcp_write_xmit
               |                                                      |          
               |                                                       --0.98%--__tcp_push_pending_frames
               |                                                                 tcp_sendmsg_locked
               |                                                                 tcp_sendmsg
               |                                                                 |          
               |                                                                  --0.60%--__sys_sendto
               |                                                                            __x64_sys_sendto
               |                                                                            __noinstr_text_start
               |                                                                            entry_SYSCALL_64_after_hwframe
               |          
                --3.19%--tcp_v4_syn_recv_sock
                          tcp_v6_syn_recv_sock
                          |          
                           --2.70%--tcp_try_fastopen
                                     tcp_conn_request
                                     tcp_rcv_state_process
                                     tcp_v4_do_rcv
                                     tcp_v4_rcv
                                     ip_protocol_deliver_rcu
                                     ip_local_deliver
                                     ip_rcv
                                     __netif_receive_skb
                                     process_backlog
                                     napi_poll
                                     net_rx_action
                                     __softirqentry_text_start
                                     asm_call_on_stack
                                     do_softirq_own_stack
                                     |          
                                     |--1.71%--__irq_exit_rcu
                                     |          |          
                                     |           --1.70%--sysvec_apic_timer_interrupt
                                     |                     asm_sysvec_apic_timer_interrupt
                                     |          
                                      --0.99%--do_softirq
                                                __local_bh_enable_ip
                                                |          
                                                 --0.97%--ip_finish_output2
                                                           ip_output
                                                           __ip_queue_xmit
                                                           __tcp_transmit_skb
                                                           |          
                                                            --0.92%--tcp_write_xmit
                                                                      |          
                                                                       --0.78%--__tcp_push_pending_frames
                                                                                 tcp_sendmsg_locked
                                                                                 tcp_sendmsg

    36.09%  swapper          [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
               |--23.84%--tcp_conn_request
               |          tcp_rcv_state_process
               |          |          
               |          |--11.94%--tcp_v6_do_rcv
               |          |          tcp_v6_rcv
               |          |          ip6_protocol_deliver_rcu
               |          |          ip6_input
               |          |          ipv6_rcv
               |          |          __netif_receive_skb
               |          |          process_backlog
               |          |          napi_poll
               |          |          net_rx_action
               |          |          __softirqentry_text_start
               |          |          asm_call_on_stack
               |          |          do_softirq_own_stack
               |          |          __irq_exit_rcu
               |          |          sysvec_apic_timer_interrupt
               |          |          asm_sysvec_apic_timer_interrupt
               |          |          |          
               |          |          |--10.49%--cpuidle_enter_state
               |          |          |          cpuidle_enter
               |          |          |          do_idle
               |          |          |          cpu_startup_entry
               |          |          |          |          
               |          |          |          |--7.92%--secondary_startup_64
               |          |          |          |          
               |          |          |           --2.57%--start_kernel
               |          |          |                     secondary_startup_64
               |          |          |          
               |          |           --0.92%--poll_idle
               |          |                     cpuidle_enter_state
               |          |                     cpuidle_enter
               |          |                     do_idle
               |          |                     cpu_startup_entry
               |          |                     |          
               |          |                      --0.68%--secondary_startup_64
               |          |          
               |           --11.90%--tcp_v4_do_rcv
               |                     tcp_v4_rcv
               |                     ip_protocol_deliver_rcu
               |                     ip_local_deliver
               |                     ip_rcv
               |                     __netif_receive_skb
               |                     process_backlog
               |                     napi_poll
               |                     net_rx_action
               |                     __softirqentry_text_start
               |                     asm_call_on_stack
               |                     do_softirq_own_stack
               |                     __irq_exit_rcu
               |                     sysvec_apic_timer_interrupt
               |                     asm_sysvec_apic_timer_interrupt
               |                     |          
               |                     |--10.61%--cpuidle_enter_state
               |                     |          cpuidle_enter
               |                     |          do_idle
               |                     |          cpu_startup_entry
               |                     |          |          
               |                     |          |--7.92%--secondary_startup_64
               |                     |          |          
               |                     |           --2.69%--start_kernel
               |                     |                     secondary_startup_64
               |                     |          
               |                      --0.78%--poll_idle
               |                                cpuidle_enter_state
               |                                cpuidle_enter
               |                                do_idle
               |                                cpu_startup_entry
               |                                |          
               |                                 --0.60%--secondary_startup_64
               |          
                --12.23%--tcp_v4_syn_recv_sock
                          tcp_v6_syn_recv_sock
                          |          
                          |--7.45%--tcp_try_fastopen
                          |          tcp_conn_request
                          |          tcp_rcv_state_process
                          |          tcp_v4_do_rcv
                          |          tcp_v4_rcv
                          |          ip_protocol_deliver_rcu
                          |          ip_local_deliver
                          |          ip_rcv
                          |          __netif_receive_skb
                          |          process_backlog
                          |          napi_poll
                          |          net_rx_action
                          |          __softirqentry_text_start
                          |          asm_call_on_stack
                          |          do_softirq_own_stack
                          |          __irq_exit_rcu
                          |          sysvec_apic_timer_interrupt
                          |          asm_sysvec_apic_timer_interrupt
                          |          |          
                          |           --6.62%--cpuidle_enter_state
                          |                     cpuidle_enter
                          |                     do_idle
                          |                     cpu_startup_entry
                          |                     |          
                          |                     |--4.85%--secondary_startup_64
                          |                     |          
                          |                      --1.77%--start_kernel
                          |                                secondary_startup_64
                          |          
                           --4.78%--tcp_check_req
                                     tcp_v4_rcv
                                     ip_protocol_deliver_rcu
                                     ip_local_deliver
                                     ip_rcv
                                     __netif_receive_skb
                                     process_backlog
                                     napi_poll
                                     net_rx_action
                                     __softirqentry_text_start
                                     asm_call_on_stack
                                     do_softirq_own_stack
                                     __irq_exit_rcu
                                     sysvec_apic_timer_interrupt
                                     asm_sysvec_apic_timer_interrupt
                                     |          
                                      --4.53%--cpuidle_enter_state
                                                cpuidle_enter
                                                do_idle
                                                cpu_startup_entry
                                                |          
                                                |--3.48%--secondary_startup_64
                                                |          
                                                 --1.05%--start_kernel
                                                           secondary_startup_64

     0.92%  ksoftirqd/3      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
                --0.59%--tcp_conn_request
                          tcp_rcv_state_process

     0.90%  ksoftirqd/1      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
                --0.62%--tcp_conn_request
                          tcp_rcv_state_process

     0.84%  ksoftirqd/0      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
                --0.56%--tcp_conn_request
                          tcp_rcv_state_process

     0.77%  ksoftirqd/2      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
                --0.51%--tcp_conn_request
                          tcp_rcv_state_process

     0.17%  kworker/3:7-eve  [kernel.kallsyms]  [k] prandom_u32
     0.15%  kworker/1:2-eve  [kernel.kallsyms]  [k] prandom_u32
     0.14%  kworker/2:0-eve  [kernel.kallsyms]  [k] prandom_u32
     0.13%  kworker/0:5-eve  [kernel.kallsyms]  [k] prandom_u32
     0.07%  Xorg             [kernel.kallsyms]  [k] prandom_u32
     0.04%  avahi-daemon     [kernel.kallsyms]  [k] prandom_u32
     0.04%  ip               [kernel.kallsyms]  [k] prandom_u32
     0.02%  perf             [kernel.kallsyms]  [k] prandom_u32
     0.01%  rcu_sched        [kernel.kallsyms]  [k] prandom_u32
     0.00%  kworker/u16:2-i  [kernel.kallsyms]  [k] prandom_u32
     0.00%  ltp-pan          [kernel.kallsyms]  [k] prandom_u32
     0.00%  awk              [kernel.kallsyms]  [k] prandom_u32
     0.00%  mktemp           [kernel.kallsyms]  [k] prandom_u32
     0.00%  mysqld           [kernel.kallsyms]  [k] prandom_u32
     0.00%  systemd-journal  [kernel.kallsyms]  [k] prandom_u32
     0.00%  NetworkManager   [kernel.kallsyms]  [k] prandom_u32
     0.00%  QDBusConnection  [kernel.kallsyms]  [k] prandom_u32
     0.00%  gdbus            [kernel.kallsyms]  [k] prandom_u32
     0.00%  jbd2/sdc2-8      [kernel.kallsyms]  [k] prandom_u32
     0.00%  kded5            [kernel.kallsyms]  [k] prandom_u32
     0.00%  kworker/2:2-eve  [kernel.kallsyms]  [k] prandom_u32
     0.00%  kworker/3:0-eve  [kernel.kallsyms]  [k] prandom_u32
     0.00%  runltp           [kernel.kallsyms]  [k] prandom_u32
     0.00%  tcp_fastopen_ru  [kernel.kallsyms]  [k] prandom_u32
     0.00%  gnome-software   [kernel.kallsyms]  [k] prandom_u32
     0.00%  irq/35-iwlwifi   [kernel.kallsyms]  [k] prandom_u32
     0.00%  kworker/u16:1-p  [kernel.kallsyms]  [k] prandom_u32
     0.00%  org_kde_powerde  [kernel.kallsyms]  [k] prandom_u32
     0.00%  pool-org.gnome.  [kernel.kallsyms]  [k] prandom_u32
     0.00%  tst_net_iface_p  [kernel.kallsyms]  [k] prandom_u32
     0.00%  upowerd          [kernel.kallsyms]  [k] prandom_u32
     0.00%  xdg-desktop-por  [kernel.kallsyms]  [k] prandom_u32


# Samples: 0  of event 'dummy:HG'
# Event count (approx.): 0
#
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......
#


#
# (Tip: Customize output of perf script with: perf script -F event,ip,sym)
#

[-- Attachment #3: perf-report-2020-08-20.txt --]
[-- Type: text/plain, Size: 21103 bytes --]

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 120K of event 'random:prandom_u32'
# Event count (approx.): 120542
#
# Overhead  Command          Shared Object      Symbol         
# ........  ...............  .................  ...............
#
    59.66%  netstress        [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
               |--33.20%--tcp_v4_connect
               |          __inet_stream_connect
               |          |          
               |          |--22.13%--inet_stream_connect
               |          |          __sys_connect
               |          |          __x64_sys_connect
               |          |          __noinstr_text_start
               |          |          entry_SYSCALL_64_after_hwframe
               |          |          __libc_connect
               |          |          0x65736e7500632e73
               |          |          
               |           --11.07%--tcp_sendmsg_locked
               |                     tcp_sendmsg
               |                     __sys_sendto
               |                     __x64_sys_sendto
               |                     __noinstr_text_start
               |                     entry_SYSCALL_64_after_hwframe
               |                     __libc_sendto
               |          
               |--16.60%--tcp_v6_connect
               |          __inet_stream_connect
               |          |          
               |          |--11.07%--inet_stream_connect
               |          |          __sys_connect
               |          |          __x64_sys_connect
               |          |          __noinstr_text_start
               |          |          entry_SYSCALL_64_after_hwframe
               |          |          __libc_connect
               |          |          0x65736e7500632e73
               |          |          
               |           --5.53%--tcp_sendmsg_locked
               |                     tcp_sendmsg
               |                     __sys_sendto
               |                     __x64_sys_sendto
               |                     __noinstr_text_start
               |                     entry_SYSCALL_64_after_hwframe
               |                     __libc_sendto
               |          
               |--6.69%--tcp_conn_request
               |          tcp_rcv_state_process
               |          |          
               |          |--3.41%--tcp_v6_do_rcv
               |          |          tcp_v6_rcv
               |          |          ip6_protocol_deliver_rcu
               |          |          ip6_input
               |          |          ipv6_rcv
               |          |          __netif_receive_skb
               |          |          process_backlog
               |          |          napi_poll
               |          |          net_rx_action
               |          |          __softirqentry_text_start
               |          |          asm_call_on_stack
               |          |          do_softirq_own_stack
               |          |          |          
               |          |          |--2.15%--__irq_exit_rcu
               |          |          |          sysvec_apic_timer_interrupt
               |          |          |          asm_sysvec_apic_timer_interrupt
               |          |          |          
               |          |           --1.26%--do_softirq
               |          |                     __local_bh_enable_ip
               |          |                     |          
               |          |                      --1.22%--ip6_finish_output2
               |          |                                ip6_output
               |          |                                ip6_xmit
               |          |                                inet6_csk_xmit
               |          |                                __tcp_transmit_skb
               |          |                                |          
               |          |                                 --1.13%--tcp_write_xmit
               |          |                                           |          
               |          |                                            --0.99%--__tcp_push_pending_frames
               |          |                                                      tcp_sendmsg_locked
               |          |                                                      tcp_sendmsg
               |          |                                                      |          
               |          |                                                       --0.56%--__sys_sendto
               |          |                                                                 __x64_sys_sendto
               |          |                                                                 __noinstr_text_start
               |          |                                                                 entry_SYSCALL_64_after_hwframe
               |          |          
               |           --3.27%--tcp_v4_do_rcv
               |                     tcp_v4_rcv
               |                     ip_protocol_deliver_rcu
               |                     ip_local_deliver
               |                     ip_rcv
               |                     __netif_receive_skb
               |                     process_backlog
               |                     napi_poll
               |                     net_rx_action
               |                     __softirqentry_text_start
               |                     asm_call_on_stack
               |                     do_softirq_own_stack
               |                     |          
               |                     |--1.94%--__irq_exit_rcu
               |                     |          sysvec_apic_timer_interrupt
               |                     |          asm_sysvec_apic_timer_interrupt
               |                     |          
               |                      --1.33%--do_softirq
               |                                __local_bh_enable_ip
               |                                |          
               |                                 --1.29%--ip_finish_output2
               |                                           ip_output
               |                                           __ip_queue_xmit
               |                                           __tcp_transmit_skb
               |                                           |          
               |                                            --1.21%--tcp_write_xmit
               |                                                      |          
               |                                                       --1.02%--__tcp_push_pending_frames
               |                                                                 tcp_sendmsg_locked
               |                                                                 tcp_sendmsg
               |                                                                 |          
               |                                                                  --0.67%--__sys_sendto
               |                                                                            __x64_sys_sendto
               |                                                                            __noinstr_text_start
               |                                                                            entry_SYSCALL_64_after_hwframe
               |          
                --3.16%--tcp_v4_syn_recv_sock
                          tcp_v6_syn_recv_sock
                          |          
                           --2.68%--tcp_try_fastopen
                                     tcp_conn_request
                                     tcp_rcv_state_process
                                     tcp_v4_do_rcv
                                     tcp_v4_rcv
                                     ip_protocol_deliver_rcu
                                     ip_local_deliver
                                     ip_rcv
                                     __netif_receive_skb
                                     process_backlog
                                     napi_poll
                                     net_rx_action
                                     __softirqentry_text_start
                                     asm_call_on_stack
                                     do_softirq_own_stack
                                     |          
                                     |--1.62%--__irq_exit_rcu
                                     |          sysvec_apic_timer_interrupt
                                     |          asm_sysvec_apic_timer_interrupt
                                     |          
                                      --1.07%--do_softirq
                                                __local_bh_enable_ip
                                                |          
                                                 --1.03%--ip_finish_output2
                                                           ip_output
                                                           __ip_queue_xmit
                                                           __tcp_transmit_skb
                                                           |          
                                                            --0.98%--tcp_write_xmit
                                                                      |          
                                                                       --0.82%--__tcp_push_pending_frames
                                                                                 tcp_sendmsg_locked
                                                                                 tcp_sendmsg
                                                                                 |          
                                                                                  --0.53%--__sys_sendto
                                                                                            __x64_sys_sendto
                                                                                            __noinstr_text_start
                                                                                            entry_SYSCALL_64_after_hwframe

    36.30%  swapper          [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
               |--23.99%--tcp_conn_request
               |          tcp_rcv_state_process
               |          |          
               |          |--12.11%--tcp_v4_do_rcv
               |          |          tcp_v4_rcv
               |          |          ip_protocol_deliver_rcu
               |          |          ip_local_deliver
               |          |          ip_rcv
               |          |          __netif_receive_skb
               |          |          process_backlog
               |          |          napi_poll
               |          |          net_rx_action
               |          |          __softirqentry_text_start
               |          |          asm_call_on_stack
               |          |          do_softirq_own_stack
               |          |          __irq_exit_rcu
               |          |          sysvec_apic_timer_interrupt
               |          |          asm_sysvec_apic_timer_interrupt
               |          |          |          
               |          |          |--10.79%--cpuidle_enter_state
               |          |          |          cpuidle_enter
               |          |          |          do_idle
               |          |          |          cpu_startup_entry
               |          |          |          |          
               |          |          |          |--8.20%--secondary_startup_64
               |          |          |          |          
               |          |          |           --2.59%--start_kernel
               |          |          |                     secondary_startup_64
               |          |          |          
               |          |           --0.80%--poll_idle
               |          |                     cpuidle_enter_state
               |          |                     cpuidle_enter
               |          |                     do_idle
               |          |                     cpu_startup_entry
               |          |                     |          
               |          |                      --0.62%--secondary_startup_64
               |          |          
               |           --11.88%--tcp_v6_do_rcv
               |                     tcp_v6_rcv
               |                     ip6_protocol_deliver_rcu
               |                     ip6_input
               |                     ipv6_rcv
               |                     __netif_receive_skb
               |                     process_backlog
               |                     napi_poll
               |                     net_rx_action
               |                     __softirqentry_text_start
               |                     asm_call_on_stack
               |                     do_softirq_own_stack
               |                     __irq_exit_rcu
               |                     sysvec_apic_timer_interrupt
               |                     asm_sysvec_apic_timer_interrupt
               |                     |          
               |                     |--10.46%--cpuidle_enter_state
               |                     |          cpuidle_enter
               |                     |          do_idle
               |                     |          cpu_startup_entry
               |                     |          |          
               |                     |          |--7.92%--secondary_startup_64
               |                     |          |          
               |                     |           --2.55%--start_kernel
               |                     |                     secondary_startup_64
               |                     |          
               |                      --0.89%--poll_idle
               |                                cpuidle_enter_state
               |                                cpuidle_enter
               |                                do_idle
               |                                cpu_startup_entry
               |                                |          
               |                                 --0.69%--secondary_startup_64
               |          
                --12.30%--tcp_v4_syn_recv_sock
                          tcp_v6_syn_recv_sock
                          |          
                          |--7.52%--tcp_try_fastopen
                          |          tcp_conn_request
                          |          tcp_rcv_state_process
                          |          tcp_v4_do_rcv
                          |          tcp_v4_rcv
                          |          ip_protocol_deliver_rcu
                          |          ip_local_deliver
                          |          ip_rcv
                          |          __netif_receive_skb
                          |          process_backlog
                          |          napi_poll
                          |          net_rx_action
                          |          __softirqentry_text_start
                          |          asm_call_on_stack
                          |          do_softirq_own_stack
                          |          __irq_exit_rcu
                          |          sysvec_apic_timer_interrupt
                          |          asm_sysvec_apic_timer_interrupt
                          |          |          
                          |           --6.67%--cpuidle_enter_state
                          |                     cpuidle_enter
                          |                     do_idle
                          |                     cpu_startup_entry
                          |                     |          
                          |                     |--4.94%--secondary_startup_64
                          |                     |          
                          |                      --1.73%--start_kernel
                          |                                secondary_startup_64
                          |          
                           --4.79%--tcp_check_req
                                     tcp_v4_rcv
                                     ip_protocol_deliver_rcu
                                     ip_local_deliver
                                     ip_rcv
                                     __netif_receive_skb
                                     process_backlog
                                     napi_poll
                                     net_rx_action
                                     __softirqentry_text_start
                                     asm_call_on_stack
                                     do_softirq_own_stack
                                     __irq_exit_rcu
                                     sysvec_apic_timer_interrupt
                                     asm_sysvec_apic_timer_interrupt
                                     |          
                                      --4.52%--cpuidle_enter_state
                                                cpuidle_enter
                                                do_idle
                                                cpu_startup_entry
                                                |          
                                                |--3.59%--secondary_startup_64
                                                |          
                                                 --0.93%--start_kernel
                                                           secondary_startup_64

     0.89%  ksoftirqd/3      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
                --0.58%--tcp_conn_request
                          tcp_rcv_state_process

     0.87%  ksoftirqd/1      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
                --0.58%--tcp_conn_request
                          tcp_rcv_state_process

     0.76%  ksoftirqd/0      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
                --0.52%--tcp_conn_request
                          tcp_rcv_state_process

     0.71%  ksoftirqd/2      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32

     0.16%  kworker/3:3-eve  [kernel.kallsyms]  [k] prandom_u32
     0.14%  kworker/1:2-eve  [kernel.kallsyms]  [k] prandom_u32
     0.12%  Xorg             [kernel.kallsyms]  [k] prandom_u32
     0.12%  kworker/0:5-eve  [kernel.kallsyms]  [k] prandom_u32
     0.11%  kworker/2:0-eve  [kernel.kallsyms]  [k] prandom_u32
     0.03%  avahi-daemon     [kernel.kallsyms]  [k] prandom_u32
     0.03%  ip               [kernel.kallsyms]  [k] prandom_u32
     0.03%  systemd-udevd    [kernel.kallsyms]  [k] prandom_u32
     0.02%  perf             [kernel.kallsyms]  [k] prandom_u32
     0.01%  rcu_sched        [kernel.kallsyms]  [k] prandom_u32
     0.00%  ltp-pan          [kernel.kallsyms]  [k] prandom_u32
     0.00%  NetworkManager   [kernel.kallsyms]  [k] prandom_u32
     0.00%  irq/35-iwlwifi   [kernel.kallsyms]  [k] prandom_u32
     0.00%  mktemp           [kernel.kallsyms]  [k] prandom_u32
     0.00%  mysqld           [kernel.kallsyms]  [k] prandom_u32
     0.00%  tcp_fastopen_ru  [kernel.kallsyms]  [k] prandom_u32
     0.00%  DiscoverNotifie  [kernel.kallsyms]  [k] prandom_u32
     0.00%  QSGRenderThread  [kernel.kallsyms]  [k] prandom_u32
     0.00%  kworker/1:0-eve  [kernel.kallsyms]  [k] prandom_u32
     0.00%  migration/3      [kernel.kallsyms]  [k] prandom_u32
     0.00%  ns_create        [kernel.kallsyms]  [k] prandom_u32
     0.00%  ns_ifmove        [kernel.kallsyms]  [k] prandom_u32
     0.00%  runltp           [kernel.kallsyms]  [k] prandom_u32
     0.00%  QXcbEventQueue   [kernel.kallsyms]  [k] prandom_u32
     0.00%  Thread (pooled)  [kernel.kallsyms]  [k] prandom_u32
     0.00%  akonadi_followu  [kernel.kallsyms]  [k] prandom_u32
     0.00%  gnome-software   [kernel.kallsyms]  [k] prandom_u32
     0.00%  ln               [kernel.kallsyms]  [k] prandom_u32
     0.00%  mkdir            [kernel.kallsyms]  [k] prandom_u32


# Samples: 0  of event 'dummy:HG'
# Event count (approx.): 0
#
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......
#


#
# (Tip: Generate a script for your data: perf script -g <lang>)
#

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  6:58                                       ` Willy Tarreau
@ 2020-08-20  8:05                                         ` Willy Tarreau
  2020-08-20 12:08                                           ` Sedat Dilek
       [not found]                                           ` <CANEQ_+L+22Hkdqf38Zr0bfq16fcL1Ax2X9fToXV_niHKXCB8aA@mail.gmail.com>
  0 siblings, 2 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-20  8:05 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

On Thu, Aug 20, 2020 at 08:58:38AM +0200, Willy Tarreau wrote:
> I've just pushed a new branch "20200820-siphash-noise" that I also
> rebased onto latest master. It's currently running make allmodconfig
> here, so that will take a while, but I think it's OK as random32.o is
> already OK. I've also addressed a strange build issue caused by having
> an array instead of 4 ints in siprand_state.
> 
> Please just let me know if that's OK for you now.

At least it worked for me now (built with no errors on x86_64):

$ time make -j 8 bzImage modules
(...)
real    65m7.986s
user    477m22.477s
sys     38m0.545s
$ find . -name '*.ko' |wc -l
7983

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  6:08                                     ` Willy Tarreau
@ 2020-08-20  6:58                                       ` Willy Tarreau
  2020-08-20  8:05                                         ` Willy Tarreau
  0 siblings, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-20  6:58 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

On Thu, Aug 20, 2020 at 08:08:43AM +0200, Willy Tarreau wrote:
> On Thu, Aug 20, 2020 at 06:42:23AM +0200, Sedat Dilek wrote:
> > On Thu, Aug 20, 2020 at 6:33 AM Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > On Thu, Aug 20, 2020 at 05:05:49AM +0200, Sedat Dilek wrote:
> > > > We have the same defines for K0 and K1 in include/linux/prandom.h and
> > > > lib/random32.c?
> > > > More room for simplifications?
> > >
> > > Definitely, I'm not surprised at all. As I said, the purpose was to
> > > discuss the proposal, not much more. If we think it's the way
> > > to go, some major lifting is required. I just don't want to invest
> > > significant time on this if nobody cares.
> > >
> > 
> > OK.
> > 
> > Right now, I will try with the attached diff.
> 
> No, don't waste your time this way, it's not the right way to address it,
> you're still facing collisions between the defines. I'll do another one if
> you want to go further in the tests.

I've just pushed a new branch "20200820-siphash-noise" that I also
rebased onto latest master. It's currently running make allmodconfig
here, so that will take a while, but I think it's OK as random32.o is
already OK. I've also addressed a strange build issue caused by having
an array instead of 4 ints in siprand_state.

Please just let me know if that's OK for you now.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  4:42                                   ` Sedat Dilek
@ 2020-08-20  6:08                                     ` Willy Tarreau
  2020-08-20  6:58                                       ` Willy Tarreau
  0 siblings, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-20  6:08 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

On Thu, Aug 20, 2020 at 06:42:23AM +0200, Sedat Dilek wrote:
> On Thu, Aug 20, 2020 at 6:33 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > On Thu, Aug 20, 2020 at 05:05:49AM +0200, Sedat Dilek wrote:
> > > We have the same defines for K0 and K1 in include/linux/prandom.h and
> > > lib/random32.c?
> > > More room for simplifications?
> >
> > Definitely, I'm not surprised at all. As I said, the purpose was to
> > discuss the proposal, not much more. If we think it's the way
> > to go, some major lifting is required. I just don't want to invest
> > significant time on this if nobody cares.
> >
> 
> OK.
> 
> Right now, I will try with the attached diff.

No, don't waste your time this way, it's not the right way to address it,
you're still facing collisions between the defines. I'll do another one if
you want to go further in the tests.

> Unclear to me where this modpost "net_rand_noise" undefined! comes from.
> Any hints?

Sure, the symbol is not exported. I'll address it as well for you.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  4:33                                 ` Willy Tarreau
@ 2020-08-20  4:42                                   ` Sedat Dilek
  2020-08-20  6:08                                     ` Willy Tarreau
  0 siblings, 1 reply; 68+ messages in thread
From: Sedat Dilek @ 2020-08-20  4:42 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

On Thu, Aug 20, 2020 at 6:33 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Thu, Aug 20, 2020 at 05:05:49AM +0200, Sedat Dilek wrote:
> > We have the same defines for K0 and K1 in include/linux/prandom.h and
> > lib/random32.c?
> > More room for simplifications?
>
> Definitely, I'm not surprised at all. As I said, the purpose was to
> discuss the proposal, not much more. If we think it's the way
> to go, some major lifting is required. I just don't want to invest
> significant time on this if nobody cares.
>

OK.

Right now, I will try with the attached diff.

Unclear to me where this modpost "net_rand_noise" undefined! comes from.
Any hints?

- Sedat -

[ prandom-siphash-noise-wtarreau-20200816-dileks.diff ]

diff --git a/drivers/staging/rtl8188eu/include/rtw_security.h b/drivers/staging/rtl8188eu/include/rtw_security.h
index 8ba02a7cea60..5cbb6fec71cd 100644
--- a/drivers/staging/rtl8188eu/include/rtw_security.h
+++ b/drivers/staging/rtl8188eu/include/rtw_security.h
@@ -221,6 +221,9 @@ do {\
 #define ROL32(A, n)    (((A) << (n)) | (((A) >> (32 - (n)))  & ((1UL << (n)) - 1)))
 #define ROR32(A, n)    ROL32((A), 32 - (n))

+// XXX: Workaround: Undef defines from <include/linux/prandom.h>
+#undef K0
+#undef K1
 struct mic_data {
        u32  K0, K1;         /*  Key */
        u32  L, R;           /*  Current state */
diff --git a/drivers/staging/rtl8712/rtl871x_security.h b/drivers/staging/rtl8712/rtl871x_security.h
index b2dda16cbd0a..d4ffb31d9d14 100644
--- a/drivers/staging/rtl8712/rtl871x_security.h
+++ b/drivers/staging/rtl8712/rtl871x_security.h
@@ -188,6 +188,9 @@ do {\
 #define ROL32(A, n) (((A) << (n)) | (((A)>>(32-(n)))  & ((1UL << (n)) - 1)))
 #define ROR32(A, n) ROL32((A), 32 - (n))

+// XXX: Workaround: Undef defines from <include/linux/prandom.h>
+#undef K0
+#undef K1
 struct mic_data {
        u32  K0, K1;         /* Key */
        u32  L, R;           /* Current state */
diff --git a/drivers/staging/rtl8723bs/include/rtw_security.h b/drivers/staging/rtl8723bs/include/rtw_security.h
index 514c0799c34b..260ca9f29a35 100644
--- a/drivers/staging/rtl8723bs/include/rtw_security.h
+++ b/drivers/staging/rtl8723bs/include/rtw_security.h
@@ -271,6 +271,9 @@ do {\
 #define ROL32(A, n)    (((A) << (n)) | (((A)>>(32-(n)))  & ((1UL << (n)) - 1)))
 #define ROR32(A, n)    ROL32((A), 32-(n))

+// XXX: Workaround: Undef defines from <include/linux/prandom.h>
+#undef K0
+#undef K1
 struct mic_data {
        u32  K0, K1;         /*  Key */
        u32  L, R;           /*  Current state */
diff --git a/include/linux/prandom.h b/include/linux/prandom.h
index 95d73b01d8c5..efebcff3c93d 100644
--- a/include/linux/prandom.h
+++ b/include/linux/prandom.h
@@ -32,6 +32,11 @@ DECLARE_PER_CPU(unsigned long, net_rand_noise);
        v1 ^= v0, v0 = rol64(v0, 32),  v3 ^= v2,                     \
        v0 += v3, v3 = rol64(v3, 21),  v2 += v1, v1 = rol64(v1, 17), \
        v3 ^= v0,                      v1 ^= v2, v2 = rol64(v2, 32)  )
+#define SIPROUND(v0,v1,v2,v3) ( \
+       v0 += v1, v1 = rol64(v1, 13),  v2 += v3, v3 = rol64(v3, 16), \
+       v1 ^= v0, v0 = rol64(v0, 32),  v3 ^= v2,                     \
+       v0 += v3, v3 = rol64(v3, 21),  v2 += v1, v1 = rol64(v1, 17), \
+       v3 ^= v0,                      v1 ^= v2, v2 = rol64(v2, 32)  )
 #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
 #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )

@@ -46,6 +51,11 @@ DECLARE_PER_CPU(unsigned long, net_rand_noise);
        v1 ^= v0, v0 = rol32(v0, 16),  v3 ^= v2,                     \
        v0 += v3, v3 = rol32(v3,  7),  v2 += v1, v1 = rol32(v1, 13), \
        v3 ^= v0,                      v1 ^= v2, v2 = rol32(v2, 16)  )
+#define SIPROUND(v0,v1,v2,v3) ( \
+       v0 += v1, v1 = rol32(v1,  5),  v2 += v3, v3 = rol32(v3,  8), \
+       v1 ^= v0, v0 = rol32(v0, 16),  v3 ^= v2,                     \
+       v0 += v3, v3 = rol32(v3,  7),  v2 += v1, v1 = rol32(v1, 13), \
+       v3 ^= v0,                      v1 ^= v2, v2 = rol32(v2, 16)  )
 #define K0 0x6c796765
 #define K1 0x74656462

diff --git a/lib/random32.c b/lib/random32.c
index 93f0cd3a67ee..f24c7a0febf0 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -323,37 +323,6 @@ struct siprand_state {
 static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
 DEFINE_PER_CPU(unsigned long, net_rand_noise);

-#if BITS_PER_LONG == 64
-/*
- * The core SipHash round function.  Each line can be executed in
- * parallel given enough CPU resources.
- */
-#define SIPROUND(v0,v1,v2,v3) ( \
-       v0 += v1, v1 = rol64(v1, 13),  v2 += v3, v3 = rol64(v3, 16), \
-       v1 ^= v0, v0 = rol64(v0, 32),  v3 ^= v2,                     \
-       v0 += v3, v3 = rol64(v3, 21),  v2 += v1, v1 = rol64(v1, 17), \
-       v3 ^= v0,                      v1 ^= v2, v2 = rol64(v2, 32)  )
-#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
-#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
-
-#elif BITS_PER_LONG == 32
-/*
- * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash.
- * This is weaker, but 32-bit machines are not used for high-traffic
- * applications, so there is less output for an attacker to analyze.
- */
-#define SIPROUND(v0,v1,v2,v3) ( \
-       v0 += v1, v1 = rol32(v1,  5),  v2 += v3, v3 = rol32(v3,  8), \
-       v1 ^= v0, v0 = rol32(v0, 16),  v3 ^= v2,                     \
-       v0 += v3, v3 = rol32(v3,  7),  v2 += v1, v1 = rol32(v1, 13), \
-       v3 ^= v0,                      v1 ^= v2, v2 = rol32(v2, 16)  )
-#define K0 0x6c796765
-#define K1 0x74656462
-
-#else
-#error Unsupported BITS_PER_LONG
-#endif
-
 /*
  * This is the core CPRNG function.  As "pseudorandom", this is not used
  * for truly valuable things, just intended to be a PITA to guess.

- EOF -

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  3:05                               ` Sedat Dilek
@ 2020-08-20  4:33                                 ` Willy Tarreau
  2020-08-20  4:42                                   ` Sedat Dilek
  0 siblings, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-20  4:33 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

On Thu, Aug 20, 2020 at 05:05:49AM +0200, Sedat Dilek wrote:
> We have the same defines for K0 and K1 in include/linux/prandom.h and
> lib/random32.c?
> More room for simplifications?

Definitely, I'm not surprised at all. As I said, the purpose was to
discuss the proposal, not much more. If we think it's the way
to go, some major lifting is required. I just don't want to invest
significant time on this if nobody cares.

Thanks!
Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-16 16:48                             ` Sedat Dilek
@ 2020-08-20  3:05                               ` Sedat Dilek
  2020-08-20  4:33                                 ` Willy Tarreau
  0 siblings, 1 reply; 68+ messages in thread
From: Sedat Dilek @ 2020-08-20  3:05 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

On Sun, Aug 16, 2020 at 6:48 PM Sedat Dilek <sedat.dilek@gmail.com> wrote:
>
> On Sun, Aug 16, 2020 at 5:01 PM Willy Tarreau <w@1wt.eu> wrote:
> >
> > Hi,
> >
> > so as I mentioned, I could run several tests in our lab with variations
> > around the various proposals and come to quite positive conclusions.
> >
> > Synthetic observations: the connection rate and the SYN cookie rate do not
> > seem to be affected the same way by the prandom changes. One explanation
> > is that the connection rates are less stable across reboots. Another
> > possible explanation is that the larger state update is more sensitive
> > to cache misses that increase when calling userland. I noticed that the
> > compiler didn't inline siprand_u32() for me, resulting in one extra
> > function call and noticeable register clobbering that mostly vanishes
> > once siprand_u32() is inlined, getting back to the original performance.
> >
> > The noise generation was placed as discussed in the xmit calls, however
> > the extra function call and state update had a negative effect on
> > performance, and the noise function alone accounted for up to 0.23% of the
> > CPU usage. Simplifying the mix of data by keeping only one long for
> > the noise and using one siphash round on 4 input words to keep only
> > the last word allowed to use very few instructions and to inline them,
> > making the noise collection imperceptible in microbenchmarks. The noise
> > is now collected this way (I verified that all inputs are used), this
> > performs 3 xor, 2 add and 2 rol, which is more than sufficient and already
> > better than my initial attempt with a bare add :
> >
> >   static inline
> >   void prandom_u32_add_noise(unsigned long a, unsigned long b,
> >                              unsigned long c, unsigned long d)
> >   {
> >         /*
> >          * This is not used cryptographically; it's just
> >          * a convenient 4-word hash function. (3 xor, 2 add, 2 rol)
> >          */
> >         a ^= __this_cpu_read(net_rand_noise);
> >         PRND_SIPROUND(a, b, c, d);
> >         __this_cpu_write(net_rand_noise, d);
> >   }
> >
> > My tests were run on a 6-core 12-thread Core i7-8700k equipped with a 40G
> > NIC (i40e). I've mainly run two types of tests:
> >
> >   - connections per second: the machine runs a server which accepts and
> >     closes incoming connections. The load generators aim at it and the
> >     connection rate is measured once it's stabilized.
> >
> >   - SYN cookie rate: the load generators flood the machine with enough
> >     SYNs to saturate the CPU and the rate of response SYN-ACK is measured.
> >
> > Both correspond to real world use cases (DDoS protection against SYN flood
> > and connection flood).
> >
> > The base kernel was fc80c51f + Eric's patch to add a tracepoint in
> > prandom_u32(). Another test was made by adding George's changes to use
> > siphash. Then another test was made with the siprand_u32() function
> > inlined and with noise stored as a full siphash state. Then one test
> > was run with the noise reduced to a single long. And a final test was
> > run with the noise function inlined.
> >
> >           connections    SYN cookies   Notes
> >           per second     emitted/s
> >
> >   base:     556k          5.38M
> >
> >   siphash:  535k          5.33M
> >
> >   siphash inlined
> >   +noise:   548k          5.40M    add_noise=0.23%
> >
> >   siphash + single-word
> >    noise    555k          5.45M    add_noise=0.10%
> >
> >   siphash + single-word&inlined
> >    noise    559k          5.38M
> >
> > Actually the last one is better than the previous one because it also
> > swallows more packets. There were 10.9M pps in and 5.38M pps out versus
> > 10.77M in and 5.45M out for the previous one. I didn't report the incoming
> > traffic for the other ones as it was mostly irrelevant and always within
> > these bounds.
> >
> > Finally I've added Eric's patch to reuse the skb hash when known in
> > tcp_conn_request(), and was happy to see the SYN cookies reach 5.45 Mpps
> > again and the connection rate remain unaffected. A perf record during
> > the SYN flood showed almost no call to prandom_u32() anymore (just a few
> > in tcp_rtx_synack()) so this looks like a desirable optimization.
> >
> > At the moment the code is ugly, in experimental state (I've pushed all of
> > it at https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom.git/).
> >
> > My impression on this is that given that it's possible to maintain the
> > same level of performance as we currently have while making the PRNG much
> > better, there's no more reason for not doing it.
> >
> > If there's enough interest at this point, I'm OK with restarting from
> > George's patches and doing the adjustments there. There's still this
> > prandom_seed() which looks very close to prandom_reseed() and that we
> > might possibly better remerge, but I'd vote for not changing everything
> > at once, it's ugly enough already. Also I suspect we can have an infinite
> > loop in prandom_seed() if entropy is 0 and the state is zero as well.
> > We'd be unlucky but I'd just make sure entropy is not all zeroes. And
> > running tests on 32-bit would be desirable as well.
> >
> > Finally one can wonder whether it makes sense to keep Tausworthe for
> > other cases (basic statistical sampling) or drop it. We could definitely
> > drop it and simplify everything given that we now have the same level of
> > performance. But if we do it, what should we do with the test patterns ?
> > I personally don't think that testing a PRNG against a known sequence
> > brings any value by definition, and that the more random we make it the
> > less relevant this is.
> >
>
> Hi Willy,
>
> Thanks for the new patchset and offering it via public available Git.
>
> Thanks for the numbers.
>
> As said I tested here against Linux v5.8.1 - with your previous patchset.
>
> I cannot promise I will test the new one.
>
> First, I have to see how things work with Linux v5.9-rc1 - which will
> hopefully be released within a few hours.
> My primary focus is to make it work with my GNU and LLVM toolchains.
>

I had some time to play with your new patchset.

I tried on top of Linux v5.9-rc1.

Unfortunately, some drivers from the staging area needed to be
disabled due to build failures.

I changed from kernel-modules to disabled:

scripts/config -d RTL8723BS
scripts/config -d R8712U
scripts/config -d R8188EU

See below for the error and [
drivers/staging/rtl8723bs/include/rtw_security.h ] and git grep for
#define K0 | #define K1.

Then I saw some modpost errors:

mkdir -p ./arch/x86_64/boot
ln -fsn ../../x86/boot/bzImage ./arch/x86_64/boot/bzImage
ERROR: modpost: "net_rand_noise" [drivers/scsi/fcoe/libfcoe.ko] undefined!
ERROR: modpost: "net_rand_noise" [lib/test_bpf.ko] undefined!
make[4]: *** [scripts/Makefile.modpost:111: Module.symvers] Error 1
make[4]: *** Deleting file 'Module.symvers'

Where I changed from kernel-modules to disabled:

scripts/config -d TEST_BPF
scripts/config -d LIBFCOE

- Sedat -

[ CONFIG_R8188EU=m and CONFIG_88EU_AP_MODE=y ]

$ grep "Error 1" build-log_5.9.0-rc1-5-amd64-llvm11-ias.txt
make[6]: *** [scripts/Makefile.build:283: drivers/staging/rtl8188eu/core/rtw_ap.o] Error 1

[ CONFIG_RTL8723BS=m ]

In file included from drivers/staging/rtl8723bs/core/rtw_btcoex.c:7:
In file included from ./drivers/staging/rtl8723bs/include/drv_types.h:42:
./drivers/staging/rtl8723bs/include/rtw_security.h:275:7: error: expected member name or ';' after declaration specifiers
        u32  K0, K1;         /*  Key */
        ~~~  ^
./include/linux/prandom.h:35:13: note: expanded from macro 'K0'
#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
            ^
In file included from drivers/staging/rtl8723bs/core/rtw_btcoex.c:7:
In file included from ./drivers/staging/rtl8723bs/include/drv_types.h:42:
./drivers/staging/rtl8723bs/include/rtw_security.h:275:7: error: expected ')'
./include/linux/prandom.h:35:13: note: expanded from macro 'K0'
#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
            ^
./drivers/staging/rtl8723bs/include/rtw_security.h:275:7: note: to match this '('
./include/linux/prandom.h:35:12: note: expanded from macro 'K0'
#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
           ^
In file included from drivers/staging/rtl8723bs/core/rtw_btcoex.c:7:
In file included from ./drivers/staging/rtl8723bs/include/drv_types.h:42:
./drivers/staging/rtl8723bs/include/rtw_security.h:275:11: error: expected member name or ';' after declaration specifiers
        u32  K0, K1;         /*  Key */
        ~~~      ^
./include/linux/prandom.h:36:13: note: expanded from macro 'K1'
#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
            ^
In file included from drivers/staging/rtl8723bs/core/rtw_btcoex.c:7:
In file included from ./drivers/staging/rtl8723bs/include/drv_types.h:42:
./drivers/staging/rtl8723bs/include/rtw_security.h:275:11: error: expected ')'
./include/linux/prandom.h:36:13: note: expanded from macro 'K1'
#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
            ^
./drivers/staging/rtl8723bs/include/rtw_security.h:275:11: note: to match this '('
./include/linux/prandom.h:36:12: note: expanded from macro 'K1'
#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
           ^
4 errors generated.
make[6]: *** [scripts/Makefile.build:283: drivers/staging/rtl8723bs/core/rtw_btcoex.o] Error 1

[ drivers/staging/rtl8723bs/include/rtw_security.h ]

struct mic_data {
    u32  K0, K1;         /*  Key */
    u32  L, R;           /*  Current state */
    u32  M;              /*  Message accumulator (single word) */
    u32     nBytesInM;      /*  # bytes in M */
};

$ git grep -E 'K0 |K1 ' drivers/staging/ | grep -i key
drivers/staging/rtl8188eu/core/rtw_security.c:  pmicdata->K0 = secmicgetuint32(key);
drivers/staging/rtl8188eu/core/rtw_security.c:  pmicdata->K1 = secmicgetuint32(key + 4);
drivers/staging/rtl8712/rtl871x_security.c:     pmicdata->K0 = secmicgetuint32(key);
drivers/staging/rtl8712/rtl871x_security.c:     pmicdata->K1 = secmicgetuint32(key + 4);
drivers/staging/rtl8723bs/core/rtw_security.c:  pmicdata->K0 = secmicgetuint32(key);
drivers/staging/rtl8723bs/core/rtw_security.c:  pmicdata->K1 = secmicgetuint32(key + 4);

$ git grep -E '\#define K0 |\#define K1 '
arch/arm/crypto/sha1-armv7-neon.S:#define K1  0x5A827999
arch/x86/crypto/sha1_avx2_x86_64_asm.S:#define K1 0x5a827999
crypto/rmd128.c:#define K1  RMD_K1
crypto/rmd160.c:#define K1  RMD_K1
crypto/rmd256.c:#define K1  RMD_K1
crypto/rmd320.c:#define K1  RMD_K1
fs/ext4/hash.c:#define K1 0
include/linux/prandom.h:#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
include/linux/prandom.h:#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
include/linux/prandom.h:#define K0 0x6c796765
include/linux/prandom.h:#define K1 0x74656462
lib/random32.c:#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
lib/random32.c:#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
lib/random32.c:#define K0 0x6c796765
lib/random32.c:#define K1 0x74656462

We have the same defines for K0 and K1 in include/linux/prandom.h and
lib/random32.c?
More room for simplifications?

- EOT -

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-16 15:01                           ` Willy Tarreau
@ 2020-08-16 16:48                             ` Sedat Dilek
  2020-08-20  3:05                               ` Sedat Dilek
  0 siblings, 1 reply; 68+ messages in thread
From: Sedat Dilek @ 2020-08-16 16:48 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
	Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
	Peter Zijlstra, netdev

On Sun, Aug 16, 2020 at 5:01 PM Willy Tarreau <w@1wt.eu> wrote:
>
> Hi,
>
> so as I mentioned, I could run several tests in our lab with variations
> around the various proposals and come to quite positive conclusions.
>
> Synthetic observations: the connection rate and the SYN cookie rate do not
> seem to be affected the same way by the prandom changes. One explanation
> is that the connection rates are less stable across reboots. Another
> possible explanation is that the larger state update is more sensitive
> to cache misses that increase when calling userland. I noticed that the
> compiler didn't inline siprand_u32() for me, resulting in one extra
> function call and noticeable register clobbering that mostly vanishes
> once siprand_u32() is inlined, getting back to the original performance.
>
> The noise generation was placed as discussed in the xmit calls, but the
> extra function call and state update had a negative effect on
> performance, and the noise function alone accounted for up to 0.23% of
> CPU usage. Simplifying the data mix by keeping only one long for the
> noise and using one siphash round on 4 input words to keep only the
> last word made it possible to use very few instructions and to inline
> them, making the noise collection imperceptible in microbenchmarks. The
> noise is now collected this way (I verified that all inputs are used);
> it performs 3 xor, 2 add and 2 rol, which is more than sufficient and
> already better than my initial attempt with a bare add:
>
>   static inline
>   void prandom_u32_add_noise(unsigned long a, unsigned long b,
>                              unsigned long c, unsigned long d)
>   {
>         /*
>          * This is not used cryptographically; it's just
>          * a convenient 4-word hash function. (3 xor, 2 add, 2 rol)
>          */
>         a ^= __this_cpu_read(net_rand_noise);
>         PRND_SIPROUND(a, b, c, d);
>         __this_cpu_write(net_rand_noise, d);
>   }
>
> My tests were run on a 6-core 12-thread Core i7-8700k equipped with a 40G
> NIC (i40e). I've mainly run two types of tests:
>
>   - connections per second: the machine runs a server which accepts and
>     closes incoming connections. The load generators aim at it and the
>     connection rate is measured once it's stabilized.
>
>   - SYN cookie rate: the load generators flood the machine with enough
>     SYNs to saturate the CPU and the rate of response SYN-ACK is measured.
>
> Both correspond to real world use cases (DDoS protection against SYN flood
> and connection flood).
>
> The base kernel was fc80c51f + Eric's patch to add a tracepoint in
> prandom_u32(). Another test was made by adding George's changes to use
> siphash. Then another test was made with the siprand_u32() function
> inlined and with noise stored as a full siphash state. Then one test
> was run with the noise reduced to a single long. And a final test was
> run with the noise function inlined.
>
>           connections    SYN cookies   Notes
>           per second     emitted/s
>
>   base:     556k          5.38M
>
>   siphash:  535k          5.33M
>
>   siphash inlined
>   +noise:   548k          5.40M    add_noise=0.23%
>
>   siphash + single-word
>    noise    555k          5.45M    add_noise=0.10%
>
>   siphash + single-word&inlined
>    noise    559k          5.38M
>
> Actually the last one is better than the previous one because it also
> swallows more packets. There were 10.9M pps in and 5.38M pps out versus
> 10.77M in and 5.45M out for the previous one. I didn't report the incoming
> traffic for the other ones as it was mostly irrelevant and always within
> these bounds.
>
> Finally I've added Eric's patch to reuse the skb hash when known in
> tcp_conn_request(), and was happy to see the SYN cookies reach 5.45 Mpps
> again and the connection rate remain unaffected. A perf record during
> the SYN flood showed almost no call to prandom_u32() anymore (just a few
> in tcp_rtx_synack()) so this looks like a desirable optimization.
>
> At the moment the code is ugly and in an experimental state (I've pushed all
> of it at https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom.git/).
>
> My impression is that, given that it's possible to maintain the same
> level of performance as we currently have while making the PRNG much
> better, there's no longer any reason not to do it.
>
> If there's enough interest at this point, I'm OK with restarting from
> George's patches and doing the adjustments there. There's still this
> prandom_seed() which looks very close to prandom_reseed() and that we
> might do better to re-merge, but I'd vote for not changing everything
> at once, it's ugly enough already. Also I suspect we can have an infinite
> loop in prandom_seed() if the entropy is 0 and the state is zero as well.
> We'd have to be unlucky, but I'd just make sure the entropy is not all
> zeroes. And running tests on 32-bit would be desirable as well.
>
> Finally, one can wonder whether it makes sense to keep Tausworthe for
> other cases (basic statistical sampling) or to drop it. We could
> definitely drop it and simplify everything given that we now have the
> same level of performance. But if we do, what should we do with the
> test patterns? I personally don't think that testing a PRNG against a
> known sequence brings any value by definition, and the more random we
> make it the less relevant this is.
>

Hi Willy,

Thanks for the new patchset and for offering it via a publicly available Git tree.

Thanks for the numbers.

As said, I tested here against Linux v5.8.1 - with your previous patchset.

I cannot promise I will test the new one.

First, I have to see how things work with Linux v5.9-rc1 - which will
hopefully be released within a few hours.
My primary focus is to make it work with my GNU and LLVM toolchains.

Regards,
- Sedat -

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-14 16:17                         ` Sedat Dilek
@ 2020-08-16 15:01                           ` Willy Tarreau
  2020-08-16 16:48                             ` Sedat Dilek
  0 siblings, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-16 15:01 UTC (permalink / raw)
  To: Eric Dumazet, George Spelvin, Linus Torvalds
  Cc: Sedat Dilek, Amit Klein, Jason A. Donenfeld, Andy Lutomirski,
	Kees Cook, Thomas Gleixner, Peter Zijlstra, netdev

Hi,

so as I mentioned, I could run several tests in our lab with variations
around the various proposals and came to quite positive conclusions.

Synthetic observations: the connection rate and the SYN cookie rate do not
seem to be affected the same way by the prandom changes. One explanation
is that the connection rates are less stable across reboots. Another
possible explanation is that the larger state update is more sensitive
to cache misses that increase when calling userland. I noticed that the
compiler didn't inline siprand_u32() for me, resulting in one extra
function call and noticeable register clobbering that mostly vanishes
once siprand_u32() is inlined, which gets us back to the original
performance.

The noise generation was placed as discussed in the xmit calls, but the
extra function call and state update had a negative effect on
performance, and the noise function alone accounted for up to 0.23% of
CPU usage. Simplifying the data mix by keeping only one long for the
noise and using one siphash round on 4 input words to keep only the
last word made it possible to use very few instructions and to inline
them, making the noise collection imperceptible in microbenchmarks. The
noise is now collected this way (I verified that all inputs are used);
it performs 3 xor, 2 add and 2 rol, which is more than sufficient and
already better than my initial attempt with a bare add:

  static inline
  void prandom_u32_add_noise(unsigned long a, unsigned long b,
                             unsigned long c, unsigned long d)
  {
	/*
	 * This is not used cryptographically; it's just
	 * a convenient 4-word hash function. (3 xor, 2 add, 2 rol)
	 */
	a ^= __this_cpu_read(net_rand_noise);
	PRND_SIPROUND(a, b, c, d);
	__this_cpu_write(net_rand_noise, d);
  }

My tests were run on a 6-core 12-thread Core i7-8700k equipped with a 40G
NIC (i40e). I've mainly run two types of tests:

  - connections per second: the machine runs a server which accepts and
    closes incoming connections. The load generators aim at it and the
    connection rate is measured once it's stabilized.

  - SYN cookie rate: the load generators flood the machine with enough
    SYNs to saturate the CPU and the rate of response SYN-ACK is measured.

Both correspond to real world use cases (DDoS protection against SYN flood
and connection flood).

The base kernel was fc80c51f + Eric's patch to add a tracepoint in
prandom_u32(). Another test was made by adding George's changes to use
siphash. Then another test was made with the siprand_u32() function
inlined and with noise stored as a full siphash state. Then one test
was run with the noise reduced to a single long. And a final test was
run with the noise function inlined.

          connections    SYN cookies   Notes
          per second     emitted/s
  
  base:     556k          5.38M
  
  siphash:  535k          5.33M
  
  siphash inlined
  +noise:   548k          5.40M    add_noise=0.23%
  
  siphash + single-word
   noise    555k          5.45M    add_noise=0.10%
  
  siphash + single-word&inlined
   noise    559k          5.38M

Actually the last one is better than the previous one because it also
swallows more packets. There were 10.9M pps in and 5.38M pps out versus
10.77M in and 5.45M out for the previous one. I didn't report the incoming
traffic for the other ones as it was mostly irrelevant and always within
these bounds.

Finally I've added Eric's patch to reuse the skb hash when known in
tcp_conn_request(), and was happy to see the SYN cookies reach 5.45 Mpps
again and the connection rate remain unaffected. A perf record during
the SYN flood showed almost no call to prandom_u32() anymore (just a few
in tcp_rtx_synack()) so this looks like a desirable optimization.

At the moment the code is ugly and in an experimental state (I've pushed all of
it at https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom.git/).

My impression is that, given that it's possible to maintain the same
level of performance as we currently have while making the PRNG much
better, there's no longer any reason not to do it.

If there's enough interest at this point, I'm OK with restarting from
George's patches and doing the adjustments there. There's still this
prandom_seed() which looks very close to prandom_reseed() and that we
might do better to re-merge, but I'd vote for not changing everything
at once, it's ugly enough already. Also I suspect we can have an infinite
loop in prandom_seed() if the entropy is 0 and the state is zero as well.
We'd have to be unlucky, but I'd just make sure the entropy is not all
zeroes. And running tests on 32-bit would be desirable as well.

Finally, one can wonder whether it makes sense to keep Tausworthe for
other cases (basic statistical sampling) or to drop it. We could
definitely drop it and simplify everything given that we now have the
same level of performance. But if we do, what should we do with the
test patterns? I personally don't think that testing a PRNG against a
known sequence brings any value by definition, and the more random we
make it the less relevant this is.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-14 16:05                       ` Willy Tarreau
@ 2020-08-14 16:17                         ` Sedat Dilek
  2020-08-16 15:01                           ` Willy Tarreau
  0 siblings, 1 reply; 68+ messages in thread
From: Sedat Dilek @ 2020-08-14 16:17 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Fri, Aug 14, 2020 at 6:05 PM Willy Tarreau <w@1wt.eu> wrote:
>
> On Fri, Aug 14, 2020 at 05:32:32PM +0200, Sedat Dilek wrote:
> > commit 94c7eb54c4b8e81618ec79f414fe1ca5767f9720
> > "random32: add a tracepoint for prandom_u32()"
> >
> > ...I gave Willy's patches a try and used the Linux Test Project (LTP)
> > for testing.
>
> Just FWIW today I could run several relevant tests with a 40 Gbps NIC
> at high connection rates and under SYN flood to stress SYN cookies.
> I couldn't post earlier due to a net outage but will post the results
> here. In short, what I'm seeing is very good. The only thing is that
> the noise collection as-is with the 4 longs takes a bit too much CPU
> (0.2% measured), but if we keep only one word we're back to Tausworthe
> performance, while using siphash all along.
>
> The noise generation function is so small that we're wasting cycles
> calling it and renaming registers. I'll run one more test by inlining
> it and exporting the noise.
>
> So, provided some more cleanup, I really think we're about to reach a
> solution that will satisfy everyone. More on this after I extract the
> results.
>

Great news!

Will you publish your new version in [1]?

- Sedat -

[1] https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-14 15:32                     ` Sedat Dilek
@ 2020-08-14 16:05                       ` Willy Tarreau
  2020-08-14 16:17                         ` Sedat Dilek
  0 siblings, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-14 16:05 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Fri, Aug 14, 2020 at 05:32:32PM +0200, Sedat Dilek wrote:
> commit 94c7eb54c4b8e81618ec79f414fe1ca5767f9720
> "random32: add a tracepoint for prandom_u32()"
> 
> ...I gave Willy's patches a try and used the Linux Test Project (LTP)
> for testing.

Just FWIW today I could run several relevant tests with a 40 Gbps NIC
at high connection rates and under SYN flood to stress SYN cookies.
I couldn't post earlier due to a net outage but will post the results
here. In short, what I'm seeing is very good. The only thing is that
the noise collection as-is with the 4 longs takes a bit too much CPU
(0.2% measured), but if we keep only one word we're back to Tausworthe
performance, while using siphash all along.

The noise generation function is so small that we're wasting cycles
calling it and renaming registers. I'll run one more test by inlining
it and exporting the noise.

So, provided some more cleanup, I really think we're about to reach a
solution that will satisfy everyone. More on this after I extract the
results.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  8:06                   ` Willy Tarreau
  2020-08-13  8:13                     ` Sedat Dilek
@ 2020-08-14 15:32                     ` Sedat Dilek
  2020-08-14 16:05                       ` Willy Tarreau
  1 sibling, 1 reply; 68+ messages in thread
From: Sedat Dilek @ 2020-08-14 15:32 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

[-- Attachment #1: Type: text/plain, Size: 2163 bytes --]

On Thu, Aug 13, 2020 at 10:06 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Thu, Aug 13, 2020 at 09:53:11AM +0200, Sedat Dilek wrote:
> > On Wed, Aug 12, 2020 at 5:21 AM Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> > > > Can you share this "rebased to mainline" version of George's patch?
> > >
> > > You can pick it from there if that helps, but keep in mind that
> > > it's just experimental code that we use to explain our ideas and
> > > that we really don't care a single second what kernel it's applied
> > > to:
> > >
> > >    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1
> > >
> >
> > Thanks Willy.
> >
> > I disagree: the base for testing should be clear(ly communicated).
>
> It is. As you can see in the log above, this was applied on top of
> fc80c51fd4b2, there's nothing special here. In addition we're not even
> talking about testing nor calling for testers, just trying to find a
> reasonable solution. Maybe today I'll be able to re-run a few tests by
> the way.
>
> > There are two diffs from Eric to #1: add a trace event for
> > prandom_u32() and #2: a removal of prandom_u32() call in
> > tcp_conn_request().
> > In case you have not seen.
>
> I've seen, just not had the time to test yet.
>

Now with Eric's patch (see [1]) in mainline...

commit 94c7eb54c4b8e81618ec79f414fe1ca5767f9720
"random32: add a tracepoint for prandom_u32()"

...I gave Willy's patches a try and used the Linux Test Project (LTP)
for testing.

[ PERF SESSION #2 ]

Link: https://github.com/linux-test-project/ltp/blob/master/doc/mini-howto-building-ltp-from-git.txt

cd /opt/ltp

/home/dileks/bin/perf record -a -g -e random:prandom_u32 ./runltp -f
net.features -s tcp_fastopen

/home/dileks/bin/perf report --no-children --stdio > ./perf-report.txt
/home/dileks/bin/perf script > ./perf-script.txt

du -h perf*
34M     perf.data
20K     perf-report.txt
134M    perf-script.txt

Note: For a first test I used net.features::tcp_fastopen.

Attached is my perf-report.txt.

- Sedat -

[1] https://git.kernel.org/linus/94c7eb54c4b8e81618ec79f414fe1ca5767f9720

[-- Attachment #2: perf-report.txt --]
[-- Type: text/plain, Size: 20177 bytes --]

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 120K of event 'random:prandom_u32'
# Event count (approx.): 120452
#
# Overhead  Command          Shared Object      Symbol         
# ........  ...............  .................  ...............
#
    58.22%  netstress        [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
               |--33.22%--tcp_v4_connect
               |          __inet_stream_connect
               |          |          
               |          |--22.15%--inet_stream_connect
               |          |          __sys_connect
               |          |          __x64_sys_connect
               |          |          do_syscall_64
               |          |          entry_SYSCALL_64_after_hwframe
               |          |          __libc_connect
               |          |          0x65736e7500632e73
               |          |          
               |           --11.07%--tcp_sendmsg_locked
               |                     tcp_sendmsg
               |                     sock_sendmsg
               |                     __sys_sendto
               |                     __x64_sys_sendto
               |                     do_syscall_64
               |                     entry_SYSCALL_64_after_hwframe
               |                     __libc_sendto
               |          
               |--16.61%--tcp_v6_connect
               |          __inet_stream_connect
               |          |          
               |          |--11.07%--inet_stream_connect
               |          |          __sys_connect
               |          |          __x64_sys_connect
               |          |          do_syscall_64
               |          |          entry_SYSCALL_64_after_hwframe
               |          |          __libc_connect
               |          |          0x65736e7500632e73
               |          |          
               |           --5.54%--tcp_sendmsg_locked
               |                     tcp_sendmsg
               |                     sock_sendmsg
               |                     __sys_sendto
               |                     __x64_sys_sendto
               |                     do_syscall_64
               |                     entry_SYSCALL_64_after_hwframe
               |                     __libc_sendto
               |          
               |--5.24%--tcp_conn_request
               |          tcp_rcv_state_process
               |          |          
               |          |--3.26%--tcp_v4_do_rcv
               |          |          tcp_v4_rcv
               |          |          ip_protocol_deliver_rcu
               |          |          ip_local_deliver_finish
               |          |          __netif_receive_skb_one_core
               |          |          process_backlog
               |          |          net_rx_action
               |          |          __softirqentry_text_start
               |          |          asm_call_on_stack
               |          |          do_softirq_own_stack
               |          |          |          
               |          |          |--2.03%--irq_exit_rcu
               |          |          |          sysvec_apic_timer_interrupt
               |          |          |          asm_sysvec_apic_timer_interrupt
               |          |          |          
               |          |           --1.23%--do_softirq
               |          |                     __local_bh_enable_ip
               |          |                     |          
               |          |                      --1.19%--ip_finish_output2
               |          |                                __ip_queue_xmit
               |          |                                __tcp_transmit_skb
               |          |                                |          
               |          |                                 --1.13%--tcp_write_xmit
               |          |                                           __tcp_push_pending_frames
               |          |                                           |          
               |          |                                            --0.95%--tcp_sendmsg_locked
               |          |                                                      tcp_sendmsg
               |          |                                                      sock_sendmsg
               |          |                                                      |          
               |          |                                                       --0.59%--__sys_sendto
               |          |                                                                 __x64_sys_sendto
               |          |                                                                 do_syscall_64
               |          |                                                                 entry_SYSCALL_64_after_hwframe
               |          |          
               |           --1.99%--tcp_v6_do_rcv
               |                     tcp_v6_rcv
               |                     ip6_protocol_deliver_rcu
               |                     ip6_input
               |                     __netif_receive_skb_one_core
               |                     process_backlog
               |                     net_rx_action
               |                     __softirqentry_text_start
               |                     asm_call_on_stack
               |                     do_softirq_own_stack
               |                     |          
               |                     |--1.27%--irq_exit_rcu
               |                     |          sysvec_apic_timer_interrupt
               |                     |          asm_sysvec_apic_timer_interrupt
               |                     |          
               |                      --0.71%--do_softirq
               |                                __local_bh_enable_ip
               |                                |          
               |                                 --0.68%--ip6_finish_output2
               |                                           ip6_xmit
               |                                           inet6_csk_xmit
               |                                           __tcp_transmit_skb
               |                                           |          
               |                                            --0.62%--tcp_write_xmit
               |                                                      __tcp_push_pending_frames
               |                                                      |          
               |                                                       --0.53%--tcp_sendmsg_locked
               |                                                                 tcp_sendmsg
               |                                                                 sock_sendmsg
               |          
                --3.13%--tcp_v4_syn_recv_sock
                          tcp_v6_syn_recv_sock
                          |          
                          |--2.61%--tcp_try_fastopen
                          |          tcp_conn_request
                          |          tcp_rcv_state_process
                          |          tcp_v4_do_rcv
                          |          tcp_v4_rcv
                          |          ip_protocol_deliver_rcu
                          |          ip_local_deliver_finish
                          |          __netif_receive_skb_one_core
                          |          process_backlog
                          |          net_rx_action
                          |          __softirqentry_text_start
                          |          asm_call_on_stack
                          |          do_softirq_own_stack
                          |          |          
                          |          |--1.66%--irq_exit_rcu
                          |          |          sysvec_apic_timer_interrupt
                          |          |          asm_sysvec_apic_timer_interrupt
                          |          |          
                          |           --0.95%--do_softirq
                          |                     __local_bh_enable_ip
                          |                     |          
                          |                      --0.92%--ip_finish_output2
                          |                                __ip_queue_xmit
                          |                                __tcp_transmit_skb
                          |                                |          
                          |                                 --0.87%--tcp_write_xmit
                          |                                           __tcp_push_pending_frames
                          |                                           |          
                          |                                            --0.72%--tcp_sendmsg_locked
                          |                                                      tcp_sendmsg
                          |                                                      sock_sendmsg
                          |          
                           --0.52%--tcp_check_req
                                     tcp_v4_rcv
                                     ip_protocol_deliver_rcu
                                     ip_local_deliver_finish
                                     __netif_receive_skb_one_core
                                     process_backlog
                                     net_rx_action
                                     __softirqentry_text_start
                                     asm_call_on_stack
                                     do_softirq_own_stack

    37.93%  swapper          [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |          
               |--25.55%--tcp_conn_request
               |          tcp_rcv_state_process
               |          |          
               |          |--13.54%--tcp_v6_do_rcv
               |          |          tcp_v6_rcv
               |          |          ip6_protocol_deliver_rcu
               |          |          ip6_input
               |          |          __netif_receive_skb_one_core
               |          |          process_backlog
               |          |          net_rx_action
               |          |          __softirqentry_text_start
               |          |          asm_call_on_stack
               |          |          do_softirq_own_stack
               |          |          irq_exit_rcu
               |          |          sysvec_apic_timer_interrupt
               |          |          asm_sysvec_apic_timer_interrupt
               |          |          |          
               |          |          |--12.41%--cpuidle_enter_state
               |          |          |          cpuidle_enter
               |          |          |          do_idle
               |          |          |          cpu_startup_entry
               |          |          |          |          
               |          |          |          |--9.59%--start_secondary
               |          |          |          |          secondary_startup_64
               |          |          |          |          
               |          |          |           --2.82%--start_kernel
               |          |          |                     secondary_startup_64
               |          |          |          
               |          |           --0.68%--poll_idle
               |          |                     cpuidle_enter_state
               |          |                     cpuidle_enter
               |          |                     do_idle
               |          |                     cpu_startup_entry
               |          |                     |          
               |          |                      --0.52%--start_secondary
               |          |                                secondary_startup_64
               |          |          
               |           --12.02%--tcp_v4_do_rcv
               |                     tcp_v4_rcv
               |                     ip_protocol_deliver_rcu
               |                     ip_local_deliver_finish
               |                     __netif_receive_skb_one_core
               |                     process_backlog
               |                     net_rx_action
               |                     __softirqentry_text_start
               |                     asm_call_on_stack
               |                     do_softirq_own_stack
               |                     irq_exit_rcu
               |                     sysvec_apic_timer_interrupt
               |                     asm_sysvec_apic_timer_interrupt
               |                     |          
               |                     |--10.71%--cpuidle_enter_state
               |                     |          cpuidle_enter
               |                     |          do_idle
               |                     |          cpu_startup_entry
               |                     |          |          
               |                     |          |--8.21%--start_secondary
               |                     |          |          secondary_startup_64
               |                     |          |          
               |                     |           --2.50%--start_kernel
               |                     |                     secondary_startup_64
               |                     |          
               |                      --0.77%--poll_idle
               |                                cpuidle_enter_state
               |                                cpuidle_enter
               |                                do_idle
               |                                cpu_startup_entry
               |                                |          
               |                                 --0.60%--start_secondary
               |                                           secondary_startup_64
               |          
                --12.36%--tcp_v4_syn_recv_sock
                          tcp_v6_syn_recv_sock
                          |          
                          |--7.54%--tcp_try_fastopen
                          |          tcp_conn_request
                          |          tcp_rcv_state_process
                          |          tcp_v4_do_rcv
                          |          tcp_v4_rcv
                          |          ip_protocol_deliver_rcu
                          |          ip_local_deliver_finish
                          |          __netif_receive_skb_one_core
                          |          process_backlog
                          |          net_rx_action
                          |          __softirqentry_text_start
                          |          asm_call_on_stack
                          |          do_softirq_own_stack
                          |          irq_exit_rcu
                          |          sysvec_apic_timer_interrupt
                          |          asm_sysvec_apic_timer_interrupt
                          |          |          
                          |           --6.76%--cpuidle_enter_state
                          |                     cpuidle_enter
                          |                     do_idle
                          |                     cpu_startup_entry
                          |                     |          
                          |                     |--5.11%--start_secondary
                          |                     |          secondary_startup_64
                          |                     |          
                          |                      --1.65%--start_kernel
                          |                                secondary_startup_64
                          |          
                           --4.82%--tcp_check_req
                                     tcp_v4_rcv
                                     ip_protocol_deliver_rcu
                                     ip_local_deliver_finish
                                     __netif_receive_skb_one_core
                                     process_backlog
                                     net_rx_action
                                     __softirqentry_text_start
                                     asm_call_on_stack
                                     do_softirq_own_stack
                                     irq_exit_rcu
                                     sysvec_apic_timer_interrupt
                                     asm_sysvec_apic_timer_interrupt
                                     |          
                                      --4.59%--cpuidle_enter_state
                                                cpuidle_enter
                                                do_idle
                                                cpu_startup_entry
                                                |          
                                                |--3.63%--start_secondary
                                                |          secondary_startup_64
                                                |          
                                                 --0.95%--start_kernel
                                                           secondary_startup_64

     0.78%  ksoftirqd/1      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32

     0.76%  ksoftirqd/0      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32

     0.76%  ksoftirqd/3      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32

     0.71%  ksoftirqd/2      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32

     0.20%  kworker/3:3-eve  [kernel.kallsyms]  [k] prandom_u32
     0.17%  kworker/1:1-eve  [kernel.kallsyms]  [k] prandom_u32
     0.15%  kworker/2:2-eve  [kernel.kallsyms]  [k] prandom_u32
     0.12%  kworker/0:7-eve  [kernel.kallsyms]  [k] prandom_u32
     0.06%  Xorg             [kernel.kallsyms]  [k] prandom_u32
     0.04%  ip               [kernel.kallsyms]  [k] prandom_u32
     0.03%  avahi-daemon     [kernel.kallsyms]  [k] prandom_u32
     0.02%  perf             [kernel.kallsyms]  [k] prandom_u32
     0.01%  mysqld           [kernel.kallsyms]  [k] prandom_u32
     0.01%  tcp_fastopen_ru  [kernel.kallsyms]  [k] prandom_u32
     0.00%  irq/35-iwlwifi   [kernel.kallsyms]  [k] prandom_u32
     0.00%  rcu_sched        [kernel.kallsyms]  [k] prandom_u32
     0.00%  dbus-daemon      [kernel.kallsyms]  [k] prandom_u32
     0.00%  ltp-pan          [kernel.kallsyms]  [k] prandom_u32
     0.00%  kworker/u16:3-i  [kernel.kallsyms]  [k] prandom_u32
     0.00%  mktemp           [kernel.kallsyms]  [k] prandom_u32
     0.00%  QSGRenderThread  [kernel.kallsyms]  [k] prandom_u32
     0.00%  gdbus            [kernel.kallsyms]  [k] prandom_u32
     0.00%  jbd2/sdc2-8      [kernel.kallsyms]  [k] prandom_u32
     0.00%  kded5            [kernel.kallsyms]  [k] prandom_u32
     0.00%  konsole          [kernel.kallsyms]  [k] prandom_u32
     0.00%  kworker/1:0-eve  [kernel.kallsyms]  [k] prandom_u32
     0.00%  kworker/2:0-eve  [kernel.kallsyms]  [k] prandom_u32
     0.00%  runltp           [kernel.kallsyms]  [k] prandom_u32
     0.00%  systemd-journal  [kernel.kallsyms]  [k] prandom_u32
     0.00%  xdg-desktop-por  [kernel.kallsyms]  [k] prandom_u32
     0.00%  sh               [kernel.kallsyms]  [k] prandom_u32


# Samples: 0  of event 'dummy:HG'
# Event count (approx.): 0
#
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......
#


#
# (Tip: To show assembler sample contexts use perf record -b / perf script -F +brstackinsn --xed)
#

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13 14:00                         ` Eric Dumazet
@ 2020-08-13 16:02                           ` Sedat Dilek
  0 siblings, 0 replies; 68+ messages in thread
From: Sedat Dilek @ 2020-08-13 16:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: George Spelvin, Amit Klein, Jason A. Donenfeld, Andy Lutomirski,
	Kees Cook, Thomas Gleixner, Peter Zijlstra, Linus Torvalds,
	netdev, Willy Tarreau

On Thu, Aug 13, 2020 at 4:00 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Aug 13, 2020 at 1:27 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> >
> > I ran a perf session that looks like this in my KDE/Plasma desktop environment:
> >
> > [ PERF SESSION ]
> >
> >  1016  2020-08-13 09:57:24 echo 1 > /proc/sys/kernel/sched_schedstats
> >  1017  2020-08-13 09:57:24 echo prandom_u32 >>
> > /sys/kernel/debug/tracing/set_event
> >  1018  2020-08-13 09:57:24 echo traceon >
> > /sys/kernel/debug/tracing/events/random/prandom_u32/trigger
> >  1019  2020-08-13 09:57:25 echo 1 > /sys/kernel/debug/tracing/events/enable
> >
> >  1020  2020-08-13 09:57:32 sysctl -n kernel.sched_schedstats
> >  1021  2020-08-13 09:57:32 cat /sys/kernel/debug/tracing/events/enable
> >  1022  2020-08-13 09:57:32 grep prandom_u32 /sys/kernel/debug/tracing/set_event
> >  1023  2020-08-13 09:57:33 cat
> > /sys/kernel/debug/tracing/events/random/prandom_u32/trigger
> >
> > root# /home/dileks/bin/perf record -a -g -e random:prandom_u32 sleep 5
> >
>
> To be clear: this "perf record -a -g -e random:prandom_u32 sleep 5"
> is self-sufficient.
>
> You have nothing to do beforehand (as listed in your email); it is
> simply not needed.
>
> I am not sure why you added all this irrelevant stuff; it is distracting.

Initially I followed these Links:

Link: https://www.kernel.org/doc/html/v5.8/trace/events.html
Link: https://www.kernel.org/doc/html/v5.8/trace/events.html#boot-option
Link: http://www.brendangregg.com/perf.html
Link: http://www.brendangregg.com/perf.html#DynamicTracing

You are right; it's not necessary to set and check all these variables,
as perf shows:

root# /home/dileks/bin/perf list | grep prandom_u32
  random:prandom_u32                                 [Tracepoint event]

So these two steps are indeed enough:

root# /home/dileks/bin/perf record -a -g -e random:prandom_u32 sleep 5
root# /home/dileks/bin/perf report --no-children --stdio

Lessons learned.

- Sedat -

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  8:27                       ` Sedat Dilek
@ 2020-08-13 14:00                         ` Eric Dumazet
  2020-08-13 16:02                           ` Sedat Dilek
  0 siblings, 1 reply; 68+ messages in thread
From: Eric Dumazet @ 2020-08-13 14:00 UTC (permalink / raw)
  To: sedat.dilek
  Cc: George Spelvin, Amit Klein, Jason A. Donenfeld, Andy Lutomirski,
	Kees Cook, Thomas Gleixner, Peter Zijlstra, Linus Torvalds,
	netdev, Willy Tarreau

On Thu, Aug 13, 2020 at 1:27 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
>
> I ran a perf session that looks like this in my KDE/Plasma desktop environment:
>
> [ PERF SESSION ]
>
>  1016  2020-08-13 09:57:24 echo 1 > /proc/sys/kernel/sched_schedstats
>  1017  2020-08-13 09:57:24 echo prandom_u32 >>
> /sys/kernel/debug/tracing/set_event
>  1018  2020-08-13 09:57:24 echo traceon >
> /sys/kernel/debug/tracing/events/random/prandom_u32/trigger
>  1019  2020-08-13 09:57:25 echo 1 > /sys/kernel/debug/tracing/events/enable
>
>  1020  2020-08-13 09:57:32 sysctl -n kernel.sched_schedstats
>  1021  2020-08-13 09:57:32 cat /sys/kernel/debug/tracing/events/enable
>  1022  2020-08-13 09:57:32 grep prandom_u32 /sys/kernel/debug/tracing/set_event
>  1023  2020-08-13 09:57:33 cat
> /sys/kernel/debug/tracing/events/random/prandom_u32/trigger
>
> root# /home/dileks/bin/perf record -a -g -e random:prandom_u32 sleep 5
>

To be clear: this "perf record -a -g -e random:prandom_u32 sleep 5"
is self-sufficient.

You have nothing to do beforehand (as listed in your email); it is
simply not needed.

I am not sure why you added all this irrelevant stuff; it is distracting.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  8:13                     ` Sedat Dilek
@ 2020-08-13  8:27                       ` Sedat Dilek
  2020-08-13 14:00                         ` Eric Dumazet
  0 siblings, 1 reply; 68+ messages in thread
From: Sedat Dilek @ 2020-08-13  8:27 UTC (permalink / raw)
  To: George Spelvin
  Cc: Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andy Lutomirski,
	Kees Cook, Thomas Gleixner, Peter Zijlstra, Linus Torvalds,
	netdev, Willy Tarreau

I ran a perf session that looks like this in my KDE/Plasma desktop environment:

[ PERF SESSION ]

 1016  2020-08-13 09:57:24 echo 1 > /proc/sys/kernel/sched_schedstats
 1017  2020-08-13 09:57:24 echo prandom_u32 >>
/sys/kernel/debug/tracing/set_event
 1018  2020-08-13 09:57:24 echo traceon >
/sys/kernel/debug/tracing/events/random/prandom_u32/trigger
 1019  2020-08-13 09:57:25 echo 1 > /sys/kernel/debug/tracing/events/enable

 1020  2020-08-13 09:57:32 sysctl -n kernel.sched_schedstats
 1021  2020-08-13 09:57:32 cat /sys/kernel/debug/tracing/events/enable
 1022  2020-08-13 09:57:32 grep prandom_u32 /sys/kernel/debug/tracing/set_event
 1023  2020-08-13 09:57:33 cat
/sys/kernel/debug/tracing/events/random/prandom_u32/trigger

root# /home/dileks/bin/perf record -a -g -e random:prandom_u32 sleep 5

Simple test: press F5 in Firefox in each open tab (5 tabs in total)

root# /home/dileks/bin/perf report --no-children --stdio > /tmp/perf-report.txt

- Sedat -

[ perf-report.txt ]

# To display the perf.data header info, please use
--header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 273  of event 'random:prandom_u32'
# Event count (approx.): 273
#
# Overhead  Command          Shared Object      Symbol
# ........  ...............  .................  ...............
#
    27.47%  firefox-esr      [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               shmem_get_inode
               shmem_mknod
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x7a6f6d2e67726f2f

    13.19%  DNS Resolver #9  [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               |
               |--5.86%--__ip4_datagram_connect
               |          ip4_datagram_connect
               |          __sys_connect
               |          __x64_sys_connect
               |          do_syscall_64
               |          entry_SYSCALL_64_after_hwframe
               |          |
               |          |--4.40%--__libc_connect
               |          |          0x2c1
               |          |
               |           --1.47%--connect
               |                     PR_GetAddrInfoByName
               |
               |--2.93%--udp_lib_get_port
               |          inet_dgram_connect
               |          __sys_connect
               |          __x64_sys_connect
               |          do_syscall_64
               |          entry_SYSCALL_64_after_hwframe
               |          |
               |          |--2.20%--__libc_connect
               |          |          0x2c1
               |          |
               |           --0.73%--connect
               |                     PR_GetAddrInfoByName
               |
               |--2.20%--__ip_select_ident
               |          __ip_make_skb
               |          ip_make_skb
               |          udp_sendmsg
               |          __sys_sendto
               |          __x64_sys_sendto
               |          do_syscall_64
               |          entry_SYSCALL_64_after_hwframe
               |          __libc_send
               |
                --2.20%--netlink_autobind
                          netlink_bind
                          __sys_bind
                          __x64_sys_bind
                          do_syscall_64
                          entry_SYSCALL_64_after_hwframe
                          bind
                          getaddrinfo
                          PR_GetAddrInfoByName

    12.45%  FS Broker 7060   [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               shmem_get_inode
               shmem_mknod
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x7a6f6d2e67726f2f

     8.79%  FS Broker 6334   [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               shmem_get_inode
               shmem_mknod
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x7a6f6d2e67726f2f

     8.06%  FS Broker 7161   [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               shmem_get_inode
               shmem_mknod
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x7a6f6d2e67726f2f

     8.06%  FS Broker 7195   [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               shmem_get_inode
               shmem_mknod
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x7a6f6d2e67726f2f

     7.33%  FS Broker 7126   [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               shmem_get_inode
               shmem_mknod
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x7a6f6d2e67726f2f

     7.33%  Socket Thread    [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               tcp_v4_connect
               __inet_stream_connect
               inet_stream_connect
               __sys_connect
               __x64_sys_connect
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __libc_connect

     4.40%  Cache2 I/O       [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               __ext4_new_inode
               ext4_create
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x61632e2f736b656c

     1.10%  Xorg             [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               shmem_get_inode
               __shmem_file_setup
               create_shmem
               i915_gem_object_create_region
               i915_gem_create_ioctl
               drm_ioctl_kernel
               drm_ioctl
               __se_sys_ioctl
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               ioctl

     0.73%  QSGRenderThread  [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               shmem_get_inode
               __shmem_file_setup
               create_shmem
               i915_gem_object_create_region
               i915_gem_create_ioctl
               drm_ioctl_kernel
               drm_ioctl
               __se_sys_ioctl
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               ioctl

     0.37%  DOM Worker       [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               __ext4_new_inode
               ext4_create
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x6f6d2e2f736b656c

     0.37%  cupsd            [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               __ext4_new_inode
               ext4_create
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x7263736275732f73

     0.37%  mozStorage #8    [kernel.kallsyms]  [k] prandom_u32
            |
            ---prandom_u32
               prandom_u32
               __ext4_new_inode
               ext4_create
               path_openat
               do_filp_open
               do_sys_openat2
               __x64_sys_openat
               do_syscall_64
               entry_SYSCALL_64_after_hwframe
               __open64
               0x6f6d2e2f736b656c



# Samples: 0  of event 'dummy:HG'
# Event count (approx.): 0
#
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......
#


#
# (Tip: Print event counts in CSV format with: perf stat -x,)
#

- EOF -

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  8:06                   ` Willy Tarreau
@ 2020-08-13  8:13                     ` Sedat Dilek
  2020-08-13  8:27                       ` Sedat Dilek
  2020-08-14 15:32                     ` Sedat Dilek
  1 sibling, 1 reply; 68+ messages in thread
From: Sedat Dilek @ 2020-08-13  8:13 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Thu, Aug 13, 2020 at 10:06 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Thu, Aug 13, 2020 at 09:53:11AM +0200, Sedat Dilek wrote:
> > On Wed, Aug 12, 2020 at 5:21 AM Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> > > > Can you share this "rebased to mainline" version of George's patch?
> > >
> > > You can pick it from there if that helps, but keep in mind that
> > > it's just experimental code that we use to explain our ideas and
> > > that we really don't care a single second what kernel it's applied
> > > to:
> > >
> > >    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1
> > >
> >
> > Thanks Willy.
> >
> > I disagree: the base for testing should be clear(ly communicated).
>
> It is. As you can see in the log above, this was applied on top of
> fc80c51fd4b2; there's nothing special here. In addition, we're not even
> talking about testing or calling for testers, just trying to find a
> reasonable solution. Maybe today I'll be able to re-run a few tests by
> the way.
>

I agree: with it published in your Git tree, it is clear.

> > There are two diffs from Eric: #1 adds a trace event for
> > prandom_u32(), and #2 removes the prandom_u32() call in
> > tcp_conn_request().
> > In case you have not seen them.
>
> I've seen, just not had the time to test yet.
>

Can you describe and share your test environment/setup?

The Linux kernel has KUnit tests (I have never played with them) - do
you happen to know whether a suitable one is available?

Maybe the Linux Test Project has some suitable tests?

- Sedat -

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  7:53                 ` Sedat Dilek
@ 2020-08-13  8:06                   ` Willy Tarreau
  2020-08-13  8:13                     ` Sedat Dilek
  2020-08-14 15:32                     ` Sedat Dilek
  0 siblings, 2 replies; 68+ messages in thread
From: Willy Tarreau @ 2020-08-13  8:06 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Thu, Aug 13, 2020 at 09:53:11AM +0200, Sedat Dilek wrote:
> On Wed, Aug 12, 2020 at 5:21 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> > > Can you share this "rebased to mainline" version of George's patch?
> >
> > You can pick it from there if that helps, but keep in mind that
> > it's just experimental code that we use to explain our ideas and
> > that we really don't care a single second what kernel it's applied
> > to:
> >
> >    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1
> >
> 
> Thanks Willy.
> 
> I disagree: the base for testing should be clear(ly communicated).

It is. As you can see in the log above, this was applied on top of
fc80c51fd4b2; there's nothing special here. In addition, we're not even
talking about testing or calling for testers, just trying to find a
reasonable solution. Maybe today I'll be able to re-run a few tests by
the way.

> There are two diffs from Eric: #1 adds a trace event for
> prandom_u32(), and #2 removes the prandom_u32() call in
> tcp_conn_request().
> In case you have not seen them.

I've seen, just not had the time to test yet.

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-12  3:21               ` Willy Tarreau
@ 2020-08-13  7:53                 ` Sedat Dilek
  2020-08-13  8:06                   ` Willy Tarreau
  0 siblings, 1 reply; 68+ messages in thread
From: Sedat Dilek @ 2020-08-13  7:53 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Wed, Aug 12, 2020 at 5:21 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> > Can you share this "rebased to mainline" version of George's patch?
>
> You can pick it from there if that helps, but keep in mind that
> it's just experimental code that we use to explain our ideas and
> that we really don't care a single second what kernel it's applied
> to:
>
>    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1
>

Thanks Willy.

I disagree: the base for testing should be clear(ly communicated).

There are two diffs from Eric: #1 adds a trace event for
prandom_u32(), and #2 removes the prandom_u32() call in
tcp_conn_request().
In case you have not seen them.
The first was helpful for playing with linux-perf.
I have tested both together with [2].

- Sedat -

[1] https://marc.info/?l=linux-netdev&m=159716173516111&w=2
[2] https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=random/fast

Note2myself: Enable some useful random/random32 Kconfigs

RANDOM32_SELFTEST n -> y
WARN_ALL_UNSEEDED_RANDOM n -> y
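
In .config terms, the note above corresponds to (a sketch, assuming the
v5.8-era option names):

```
CONFIG_RANDOM32_SELFTEST=y
CONFIG_WARN_ALL_UNSEEDED_RANDOM=y
```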

- EOT -

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-11 10:51             ` Sedat Dilek
  2020-08-11 11:01               ` Sedat Dilek
@ 2020-08-12  3:21               ` Willy Tarreau
  2020-08-13  7:53                 ` Sedat Dilek
  1 sibling, 1 reply; 68+ messages in thread
From: Willy Tarreau @ 2020-08-12  3:21 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> Can you share this "rebased to mainline" version of George's patch?

You can pick it from there if that helps, but keep in mind that
it's just experimental code that we use to explain our ideas and
that we really don't care a single second what kernel it's applied
to:

   https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1

Willy

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-11 10:51             ` Sedat Dilek
@ 2020-08-11 11:01               ` Sedat Dilek
  2020-08-12  3:21               ` Willy Tarreau
  1 sibling, 0 replies; 68+ messages in thread
From: Sedat Dilek @ 2020-08-11 11:01 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

In the previous discussion...

"Flaw in "random32: update the net random state on interrupt and activity""

...someone referred to <luto/linux.git#random/fast>.

Has anyone tested this?
Any feedback?

- Sedat -

[0] https://marc.info/?t=159658903500002&r=1&w=2
[1] https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=random/fast

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
       [not found]           ` <20200811074538.GA9523@1wt.eu>
@ 2020-08-11 10:51             ` Sedat Dilek
  2020-08-11 11:01               ` Sedat Dilek
  2020-08-12  3:21               ` Willy Tarreau
  0 siblings, 2 replies; 68+ messages in thread
From: Sedat Dilek @ 2020-08-11 10:51 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

[ CC netdev ML ]

Hi Willy,

in [1] you say:

> I've applied it on top of George's patch rebased to mainline for simplicity.
> I've used a separate per_cpu noise variable to keep the net_rand_state static
> with its __latent_entropy.

Can you share this "rebased to mainline" version of George's patch?
Maybe put your work somewhere easy to fetch in one of your
<kernel.org> Git trees (see [2])?

Yesterday random/random32/prandom mainline patches hit Linux
v5.8.1-rc1 (see [3]).

So, as I asked in my first email: what is a suitable base?
Linux v5.9-rc1 (this Sunday) or, on the stable side, Linux v5.8.1 (in the next 1-2 days)?

Thanks.

Regards,
- Sedat -


[1] https://marc.info/?l=linux-netdev&m=159709355528675&w=2
[2] https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau
[3] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/log/?h=linux-5.8.y

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
       [not found] <CA+icZUVnsmf1kXPYFYufStQ_MxnLuxL+EWfDS2wQy1VbAEMwkA@mail.gmail.com>
@ 2020-08-09 21:10 ` Sedat Dilek
       [not found] ` <20200809235412.GD25124@SDF.ORG>
  1 sibling, 0 replies; 68+ messages in thread
From: Sedat Dilek @ 2020-08-09 21:10 UTC (permalink / raw)
  To: George Spelvin
  Cc: Amit Klein, Willy Tarreau, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

+ CC:netdev

( Just FYI: built and booted on bare metal. )

- Sedat -

On Sun, Aug 9, 2020 at 11:01 PM Sedat Dilek <sedat.dilek@gmail.com> wrote:
>
> Hi George,
>
> I have tried your patch on top of Linux v5.8 with...
>
> commit f227e3ec3b5c ("random32: update the net random state on
> interrupt and activity")
>
> ...reverted.
> This was a bit tricky - what was your base?
>
> Applied the typo fix from Randy - will a v2 follow?
>
> Why DRAFT and not RFC?
>
> Please drop the CC:stable - it's a DRAFT.
>
> Other Linux stable like linux-5.7.y include...
>
> commit c0842fbc1b18c7a044e6ff3e8fa78bfa822c7d1a
> random32: move the pseudo-random 32-bit definitions to prandom.h
>
> commit 585524081ecdcde1c719e63916c514866d898217
> random: random.h should include archrandom.h, not the other way around
>
> ...linux-5.8.y stable will follow.
>
> Wouldn't the move to prandom.h make your patch easier to apply?
>
> In a second build I applied the snippet from Willy.
>
> What do you mean by...?
>
> [ quote ]
> I wonder if, on general principles, it would be better to use a more
> SipHash style mixing in of the sample:
>     m = get_cycles();
>     v3 ^= m;
>     SIPROUND(v0, v1, v2, v3);
>     SIPROUND(v0, v1, v2, v3);
>     v0 ^= m;
>
> Not sure if it's worth the extra register (and associated spill/fill).
> [ /quote ]
>
> Have you a snippet for testing?
>
> Thanks.
>
> Regards,
> - Sedat -

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2020-08-27  7:08 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-08 15:26 Flaw in "random32: update the net random state on interrupt and activity" George Spelvin
2020-08-08 17:07 ` Andy Lutomirski
2020-08-08 18:08   ` Willy Tarreau
2020-08-08 18:13   ` Linus Torvalds
2020-08-08 19:03   ` George Spelvin
2020-08-08 19:49     ` Andy Lutomirski
2020-08-08 21:29       ` George Spelvin
2020-08-08 17:44 ` Willy Tarreau
2020-08-08 18:19   ` Linus Torvalds
2020-08-08 18:53     ` Willy Tarreau
2020-08-08 20:47     ` George Spelvin
2020-08-08 20:52       ` Linus Torvalds
2020-08-08 22:27         ` George Spelvin
2020-08-09  2:07           ` Linus Torvalds
2020-08-11 16:01             ` Eric Dumazet
2020-08-08 19:18   ` Florian Westphal
2020-08-08 20:59     ` George Spelvin
2020-08-08 21:18     ` Willy Tarreau
2020-08-08 20:08   ` George Spelvin
2020-08-08 20:47     ` Linus Torvalds
2020-08-09  6:57 ` [DRAFT PATCH] random32: make prandom_u32() output unpredictable George Spelvin
2020-08-09  9:38   ` Willy Tarreau
2020-08-09 17:06     ` George Spelvin
2020-08-09 17:33       ` Willy Tarreau
2020-08-09 18:30         ` George Spelvin
2020-08-09 19:16           ` Willy Tarreau
2020-08-10 11:47           ` Willy Tarreau
2020-08-10 12:01             ` David Laight
2020-08-10 14:48               ` Willy Tarreau
2020-08-10 12:03             ` Florian Westphal
2020-08-10 14:53               ` Willy Tarreau
2020-08-10 16:31             ` Linus Torvalds
2020-08-10 16:58               ` Willy Tarreau
2020-08-10 17:45                 ` Linus Torvalds
2020-08-10 18:01                   ` Willy Tarreau
2020-08-10 21:04                   ` Willy Tarreau
2020-08-11  5:26                     ` George Spelvin
2020-08-11  5:37                       ` Willy Tarreau
2020-08-11  3:47             ` George Spelvin
2020-08-11  3:58               ` Willy Tarreau
     [not found]     ` <fdbc7d7d-cba2-ef94-9bde-b3ccae0cfaac@gmail.com>
2020-08-09 21:10       ` Marc Plumb
2020-08-09 21:48         ` Linus Torvalds
2020-08-09 13:50   ` Randy Dunlap
     [not found]   ` <CANEQ_++a4YcwQQ2XhuguTono9=RxbSRVsMw08zLWBWJ_wxG2AQ@mail.gmail.com>
2020-08-09 16:08     ` George Spelvin
     [not found] <CA+icZUVnsmf1kXPYFYufStQ_MxnLuxL+EWfDS2wQy1VbAEMwkA@mail.gmail.com>
2020-08-09 21:10 ` Sedat Dilek
     [not found] ` <20200809235412.GD25124@SDF.ORG>
     [not found]   ` <20200810034948.GB8262@1wt.eu>
     [not found]     ` <20200811053455.GH25124@SDF.ORG>
     [not found]       ` <20200811054328.GD9456@1wt.eu>
     [not found]         ` <20200811062814.GI25124@SDF.ORG>
     [not found]           ` <20200811074538.GA9523@1wt.eu>
2020-08-11 10:51             ` Sedat Dilek
2020-08-11 11:01               ` Sedat Dilek
2020-08-12  3:21               ` Willy Tarreau
2020-08-13  7:53                 ` Sedat Dilek
2020-08-13  8:06                   ` Willy Tarreau
2020-08-13  8:13                     ` Sedat Dilek
2020-08-13  8:27                       ` Sedat Dilek
2020-08-13 14:00                         ` Eric Dumazet
2020-08-13 16:02                           ` Sedat Dilek
2020-08-14 15:32                     ` Sedat Dilek
2020-08-14 16:05                       ` Willy Tarreau
2020-08-14 16:17                         ` Sedat Dilek
2020-08-16 15:01                           ` Willy Tarreau
2020-08-16 16:48                             ` Sedat Dilek
2020-08-20  3:05                               ` Sedat Dilek
2020-08-20  4:33                                 ` Willy Tarreau
2020-08-20  4:42                                   ` Sedat Dilek
2020-08-20  6:08                                     ` Willy Tarreau
2020-08-20  6:58                                       ` Willy Tarreau
2020-08-20  8:05                                         ` Willy Tarreau
2020-08-20 12:08                                           ` Sedat Dilek
     [not found]                                           ` <CANEQ_+L+22Hkdqf38Zr0bfq16fcL1Ax2X9fToXV_niHKXCB8aA@mail.gmail.com>
2020-08-27  1:09                                             ` Willy Tarreau
2020-08-27  7:08                                               ` Sedat Dilek
