* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
@ 2016-12-16 20:49 Jason A. Donenfeld
2016-12-16 21:25 ` George Spelvin
0 siblings, 1 reply; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 20:49 UTC (permalink / raw)
To: George Spelvin
Cc: Andi Kleen, David Miller, David Laight, Daniel J . Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
kernel-hardening, Linux Crypto Mailing List, LKML,
Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds,
Theodore Ts'o, Vegard Nossum
On Fri, Dec 16, 2016 at 9:17 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> My (speaking enerally; I should walk through every hash table you've
> converted) opinion is that:
>
> - Hash tables, even network-facing ones, can all use hsiphash as long
> as an attacker can only see collisions, i.e. ((H(x) ^ H(y)) & bits) ==
> 0, and the consequences of a successful attack is only more collisions
> (timing). While the attack is only 2x the cost (two hashes rather than
> one to test a key), the knowledge of the collision is statistical,
> especially for network attackers, which raises the cost of guessing
> beyond an even more brute-force attack.
> - When the hash value directly visible (e.g. included in a network
> packet), full SipHash should be the default.
> - Syncookies *could* use hsiphash, especially as there are
> two keys in there. Not sure if we need the performance.
> - For TCP ISNs, I'd prefer to use full SipHash. I know this is
> a very hot path, and if that's a performance bottleneck,
> we can work harder on it.
>
> In particular, TCP ISNs *used* to rotate the key periodically,
> limiting the time available to an attacker to perform an
> attack before the secret goes stale and is useless. commit
> 6e5714eaf77d79ae1c8b47e3e040ff5411b717ec upgraded to md5 and dropped
> the key rotation.
While I generally agree with this analysis for the most part, I do
think we should use SipHash and not HalfSipHash for syncookies.
Although the security risk is lower than with sequence numbers, it
previously used full MD5 for this, which means performance is not
generally a bottleneck and we'll get massive speedups no matter what,
whether using SipHash or HalfSipHash. In addition, using SipHash means
that the 128-bit key gives a larger margin and can be safe longterm.
So, I think we should err on the side of caution and stick with
SipHash in all cases in which we're upgrading from MD5.
In other words, only current jhash users should be potentially
eligible for hsiphash.
> Current code uses a 64 ns tick for the ISN, so it counts 2^24 per second.
> (32 bits wraps every 4.6 minutes.) A 4-bit counter and 28-bit hash
> (or even 3+29) would work as long as the key is regenerated no more
> than once per minute. (Just using the 4.6-minute ISN wrap time is the
> obvious simple implementation.)
>
> (Of course, I defer to DaveM's judgement on all network-related issues.)
I saw that jiffies addition in there and was wondering what it was all
about. It's currently added _after_ the siphash input, not before, to
keep with how the old algorithm worked. I'm not sure if this is
correct or if there's something wrong with that, as I haven't studied
how it works. If that jiffies should be part of the siphash input and
not added to the result, please tell me. Otherwise I'll keep things
how they are to avoid breaking something that seems to be working.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 20:49 [PATCH v5 1/4] siphash: add cryptographically secure PRF Jason A. Donenfeld
@ 2016-12-16 21:25 ` George Spelvin
2016-12-16 21:31 ` [kernel-hardening] " Jason A. Donenfeld
0 siblings, 1 reply; 11+ messages in thread
From: George Spelvin @ 2016-12-16 21:25 UTC (permalink / raw)
To: Jason, linux
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto,
linux-kernel, luto, netdev, tom, torvalds, tytso, vegard.nossum
Jason A. Donenfeld wrote:
> I saw that jiffies addition in there and was wondering what it was all
> about. It's currently added _after_ the siphash input, not before, to
> keep with how the old algorithm worked. I'm not sure if this is
> correct or if there's something wrong with that, as I haven't studied
> how it works. If that jiffies should be part of the siphash input and
> not added to the result, please tell me. Otherwise I'll keep things
> how they are to avoid breaking something that seems to be working.
Oh, geez, I didn't realize you didn't understand this code.
Full details at
https://en.wikipedia.org/wiki/TCP_sequence_prediction_attack
But yes, the sequence number is supposed to be (random base) + (timestamp).
In the old days before Canter & Siegel when the internet was a nice place,
people just used a counter that started at boot time.
But then someone observed that I can start a connection to host X,
see the sequence number it gives back to me, and thereby learn the
seauence number it's using on its connections to host Y.
And I can use that to inject forged data into an X-to-Y connection,
without ever seeing a single byte of the traffic! (If I *can* observe
the traffic, of course, none of this makes the slightest difference.)
So the random base was made a keyed hash of the endpoint identifiers.
(Practically only the hosts matter, but generally the ports are thrown
in for good measure.) That way, the ISN that host X sends to me
tells me nothing about the ISN it's using to talk to host Y. Now the
only way to inject forged data into the X-to-Y connection is to
send 2^32 bytes, which is a little less practical.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 21:25 ` George Spelvin
@ 2016-12-16 21:31 ` Jason A. Donenfeld
2016-12-16 22:41 ` George Spelvin
0 siblings, 1 reply; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 21:31 UTC (permalink / raw)
To: kernel-hardening
Cc: George Spelvin, Andi Kleen, David Miller, David Laight,
Daniel J . Bernstein, Eric Biggers, Hannes Frederic Sowa,
Jean-Philippe Aumasson, Linux Crypto Mailing List, LKML,
Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds,
Theodore Ts'o, Vegard Nossum
Hi George,
On Fri, Dec 16, 2016 at 10:25 PM, George Spelvin
<linux@sciencehorizons.net> wrote:
> But yes, the sequence number is supposed to be (random base) + (timestamp).
> In the old days before Canter & Siegel when the internet was a nice place,
> people just used a counter that started at boot time.
>
> But then someone observed that I can start a connection to host X,
> see the sequence number it gives back to me, and thereby learn the
> seauence number it's using on its connections to host Y.
>
> And I can use that to inject forged data into an X-to-Y connection,
> without ever seeing a single byte of the traffic! (If I *can* observe
> the traffic, of course, none of this makes the slightest difference.)
>
> So the random base was made a keyed hash of the endpoint identifiers.
> (Practically only the hosts matter, but generally the ports are thrown
> in for good measure.) That way, the ISN that host X sends to me
> tells me nothing about the ISN it's using to talk to host Y. Now the
> only way to inject forged data into the X-to-Y connection is to
> send 2^32 bytes, which is a little less practical.
Oh, okay, that is exactly what I thought was going on. I just thought
you were implying that jiffies could be moved inside the hash, which
then confused my understanding of how things should be. In any case,
thanks for the explanation.
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 21:31 ` [kernel-hardening] " Jason A. Donenfeld
@ 2016-12-16 22:41 ` George Spelvin
0 siblings, 0 replies; 11+ messages in thread
From: George Spelvin @ 2016-12-16 22:41 UTC (permalink / raw)
To: Jason, kernel-hardening
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, linux-crypto, linux-kernel, linux, luto,
netdev, tom, torvalds, tytso, vegard.nossum
An idea I had which mght be useful:
You could perhaps save two rounds in siphash_*u64.
The final word with the length (called "b" in your implementation)
only needs to be there if the input is variable-sized.
If every use of a given key is of a fixed-size input, you don't need
a length suffix. When the input is an even number of words, that can
save you two rounds.
This requires an audit of callers (e.g. you have to use different
keys for IPv4 and IPv6 ISNs), but can save time.
(This is crypto 101; search "MD-strengthening" or see the remark on
p. 101 on Damgaard's 1989 paper "A design principle for hash functions" at
http://saluc.engr.uconn.edu/refs/algorithms/hashalg/damgard89adesign.pdf
but I'm sure that Ted, Jean-Philippe, and/or DJB will confirm if you'd
like.)
Jason A. Donenfeld wrote:
> Oh, okay, that is exactly what I thought was going on. I just thought
> you were implying that jiffies could be moved inside the hash, which
> then confused my understanding of how things should be. In any case,
> thanks for the explanation.
No, the rekeying procedure is cleverer.
The thing is, all that matters is that the ISN increments fast enough,
but not wrap too soon.
It *is* permitted to change the random base, as long as it only
increases, and slower than the timestamp does.
So what you do is every few minutes, you increment the high 4 bits of the
random base and change the key used to generate the low 28 bits.
The base used for any particular host might change from 0x10000000
to 0x2fffffff, or from 0x1fffffff to 0x20000000, but either way, it's
increasing, and not too fast.
This has the downside that an attacker can see 4 bits of the base,
so only needs to send send 2^28 = 256 MB to flood the connection,
but the upside that the key used to generate the low bits changes
faster than it can be broken.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
@ 2016-12-17 1:39 Jason A. Donenfeld
2016-12-17 2:15 ` George Spelvin
0 siblings, 1 reply; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-12-17 1:39 UTC (permalink / raw)
To: George Spelvin
Cc: Andi Kleen, David Miller, David Laight, Daniel J . Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
kernel-hardening, Linux Crypto Mailing List, LKML,
Andy Lutomirski, Netdev, Tom Herbert, Linus Torvalds,
Theodore Ts'o, Vegard Nossum
On Sat, Dec 17, 2016 at 12:44 AM, George Spelvin
<linux@sciencehorizons.net> wrote:
> Ths advice I'd give now is:
> - Implement
> unsigned long hsiphash(const void *data, size_t len, const unsigned long key[2])
> .. as SipHash on 64-bit (maybe SipHash-1-3, still being discussed) and
> HalfSipHash on 32-bit.
I already did this. Check my branch.
> - Document when it may or may not be used carefully.
Good idea. I'll write up some extensive documentation about all of
this, detailing use cases and our various conclusions.
> - #define get_random_int (unsigned)get_random_long
That's a good idea, since ultimately the other just casts in the
return value. I wonder if this could also lead to a similar aliasing
with arch_get_random_int, since I'm pretty sure all rdrand-like
instructions return native word size anyway.
> - Ted, Andy Lutorminski and I will try to figure out a construction of
> get_random_long() that we all like.
And me, I hope... No need to make this exclusive.
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-17 1:39 Jason A. Donenfeld
@ 2016-12-17 2:15 ` George Spelvin
2016-12-17 15:41 ` [kernel-hardening] " Theodore Ts'o
0 siblings, 1 reply; 11+ messages in thread
From: George Spelvin @ 2016-12-17 2:15 UTC (permalink / raw)
To: Jason, linux
Cc: ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, kernel-hardening, linux-crypto,
linux-kernel, luto, netdev, tom, torvalds, tytso, vegard.nossum
> I already did this. Check my branch.
Do you think it should return "u32" (as you currently have it) or
"unsigned long"? I thought the latter, since it doesn't cost any
more and makes more
> I wonder if this could also lead to a similar aliasing
> with arch_get_random_int, since I'm pretty sure all rdrand-like
> instructions return native word size anyway.
Well, Intel's can return 16, 32 or 64 bits, and it makes a
small difference with reseed scheduling.
>> - Ted, Andy Lutorminski and I will try to figure out a construction of
>> get_random_long() that we all like.
> And me, I hope... No need to make this exclusive.
Gaah, engage brain before fingers. That was so obvious I didn't say
it, and the result came out sounding extremely rude.
A better (but longer) way to write it would be "I'm sorry that I, Ted,
and Andy are all arguing with you and each other about how to do this
and we can't finalize this part yet".
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-17 2:15 ` George Spelvin
@ 2016-12-17 15:41 ` Theodore Ts'o
2016-12-17 16:14 ` Jeffrey Walton
2016-12-19 17:21 ` Jason A. Donenfeld
0 siblings, 2 replies; 11+ messages in thread
From: Theodore Ts'o @ 2016-12-17 15:41 UTC (permalink / raw)
To: kernel-hardening
Cc: Jason, linux, ak, davem, David.Laight, djb, ebiggers3, hannes,
jeanphilippe.aumasson, linux-crypto, linux-kernel, luto, netdev,
tom, torvalds, vegard.nossum
On Fri, Dec 16, 2016 at 09:15:03PM -0500, George Spelvin wrote:
> >> - Ted, Andy Lutorminski and I will try to figure out a construction of
> >> get_random_long() that we all like.
We don't have to find the most optimal solution right away; we can
approach this incrementally, after all.
So long as we replace get_random_{long,int}() with something which is
(a) strictly better in terms of security given today's use of MD5, and
(b) which is strictly *faster* than the current construction on 32-bit
and 64-bit systems, we can do that, and can try to make it be faster
while maintaining some minimum level of security which is sufficient
for all current users of get_random_{long,int}() and which can be
clearly artificulated for future users of get_random_{long,int}().
The main worry at this point I have is benchmarking siphash on a
32-bit system. It may be that simply batching the chacha20 output so
that we're using the urandom construction more efficiently is the
better way to go, since that *does* meet the criteron of strictly more
secure and strictly faster than the current MD5 solution. I'm open to
using siphash, but I want to see the the 32-bit numbers first.
As far as half-siphash is concerned, it occurs to me that the main
problem will be those users who need to guarantee that output can't be
guessed over a long period of time. For example, if you have a
long-running process, then the output needs to remain unguessable over
potentially months or years, or else you might be weakening the ASLR
protections. If on the other hand, the hash table or the process will
be going away in a matter of seconds or minutes, the requirements with
respect to cryptographic strength go down significantly.
Now, maybe this doesn't matter that much if we can guarantee (or make
assumptions) that the attacker doesn't have unlimited access the
output stream of get_random_{long,int}(), or if it's being used in an
anti-DOS use case where it ultimately only needs to be harder than
alternate ways of attacking the system.
Rekeying every five minutes doesn't necessarily help the with respect
to ASLR, but it might reduce the amount of the output stream that
would be available to the attacker in order to be able to attack the
get_random_{long,int}() generator, and it also reduces the value of
doing that attack to only compromising the ASLR for those processes
started within that five minute window.
Cheers,
- Ted
P.S. I'm using ASLR as an example use case, above; of course we will
need to make similar eximainations of the other uses of
get_random_{long,int}().
P.P.S. We might also want to think about potentially defining
get_random_{long,int}() to be unambiguously strong, and then creating
a get_weak_random_{long,int}() which on platforms where performance
might be a consideration, it uses a weaker algorithm perhaps with some
kind of rekeying interval.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-17 15:41 ` [kernel-hardening] " Theodore Ts'o
@ 2016-12-17 16:14 ` Jeffrey Walton
2016-12-19 17:21 ` Jason A. Donenfeld
1 sibling, 0 replies; 11+ messages in thread
From: Jeffrey Walton @ 2016-12-17 16:14 UTC (permalink / raw)
To: Theodore Ts'o, kernel-hardening, Jason A. Donenfeld,
George Spelvin, ak, davem, David Laight, D. J. Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
linux-crypto, LKML, luto, Netdev, Tom Herbert, Linus Torvalds,
Vegard Nossum
> As far as half-siphash is concerned, it occurs to me that the main
> problem will be those users who need to guarantee that output can't be
> guessed over a long period of time. For example, if you have a
> long-running process, then the output needs to remain unguessable over
> potentially months or years, or else you might be weakening the ASLR
> protections. If on the other hand, the hash table or the process will
> be going away in a matter of seconds or minutes, the requirements with
> respect to cryptographic strength go down significantly.
Perhaps SipHash-4-8 should be used instead of SipHash-2-4. I believe
SipHash-4-8 is recommended for the security conscious who want to be
more conservative in their security estimates.
SipHash-4-8 does not add much more processing. If you are clocking
SipHash-2-4 at 2.0 or 2.5 cpb, then SipHash-4-8 will run at 3.0 to
4.0. Both are well below MD5 times. (At least with the data sets I've
tested).
> Now, maybe this doesn't matter that much if we can guarantee (or make
> assumptions) that the attacker doesn't have unlimited access the
> output stream of get_random_{long,int}(), or if it's being used in an
> anti-DOS use case where it ultimately only needs to be harder than
> alternate ways of attacking the system.
>
> Rekeying every five minutes doesn't necessarily help the with respect
> to ASLR, but it might reduce the amount of the output stream that
> would be available to the attacker in order to be able to attack the
> get_random_{long,int}() generator, and it also reduces the value of
> doing that attack to only compromising the ASLR for those processes
> started within that five minute window.
Forgive my ignorance... I did not find reading on using the primitive
in a PRNG. Does anyone know what Aumasson or Bernstein have to say?
Aumasson's site does not seem to discuss the use case:
https://www.google.com/search?q=siphash+rng+site%3A131002.net. (And
their paper only mentions random-number once in a different context).
Making the leap from internal hash tables and short-lived network
packets to the rng case may leave something to be desired, especially
if the bits get used in unanticipated ways, like creating long term
private keys.
Jeff
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-17 15:41 ` [kernel-hardening] " Theodore Ts'o
2016-12-17 16:14 ` Jeffrey Walton
@ 2016-12-19 17:21 ` Jason A. Donenfeld
1 sibling, 0 replies; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-12-19 17:21 UTC (permalink / raw)
To: Theodore Ts'o, kernel-hardening, Jason, George Spelvin,
Andi Kleen, David Miller, David Laight, Daniel J . Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
Tom Herbert, Linus Torvalds, Vegard Nossum
Hi Ted,
On Sat, Dec 17, 2016 at 4:41 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Fri, Dec 16, 2016 at 09:15:03PM -0500, George Spelvin wrote:
>> >> - Ted, Andy Lutorminski and I will try to figure out a construction of
>> >> get_random_long() that we all like.
>
> We don't have to find the most optimal solution right away; we can
> approach this incrementally, after all.
Thanks to your call for moderation. This is the impression I have too.
And with all the back and forth of these threads, I fear nothing will
get done. I'm going to collect the best ideas amongst all of these,
and try to get it merged. Then after that we can incrementally improve
on it.
David Miller -- would you merge something into your 4.11 tree? Seems
like you might be the guy for this, since the changes primarily affect
net/*.
Latest patches are here:
https://git.zx2c4.com/linux-dev/log/?h=siphash
> So long as we replace get_random_{long,int}() with something which is
> (a) strictly better in terms of security given today's use of MD5, and
> (b) which is strictly *faster* than the current construction on 32-bit
> and 64-bit systems, we can do that, and can try to make it be faster
> while maintaining some minimum level of security which is sufficient
> for all current users of get_random_{long,int}() and which can be
> clearly artificulated for future users of get_random_{long,int}().
>
> The main worry at this point I have is benchmarking siphash on a
> 32-bit system. It may be that simply batching the chacha20 output so
> that we're using the urandom construction more efficiently is the
> better way to go, since that *does* meet the criteron of strictly more
> secure and strictly faster than the current MD5 solution. I'm open to
> using siphash, but I want to see the the 32-bit numbers first.
Sure, I'll run some benchmarks and report back.
> As far as half-siphash is concerned, it occurs to me that the main
> problem will be those users who need to guarantee that output can't be
> guessed over a long period of time. For example, if you have a
> long-running process, then the output needs to remain unguessable over
> potentially months or years, or else you might be weakening the ASLR
> protections. If on the other hand, the hash table or the process will
> be going away in a matter of seconds or minutes, the requirements with
> respect to cryptographic strength go down significantly.
>
> Now, maybe this doesn't matter that much if we can guarantee (or make
> assumptions) that the attacker doesn't have unlimited access the
> output stream of get_random_{long,int}(), or if it's being used in an
> anti-DOS use case where it ultimately only needs to be harder than
> alternate ways of attacking the system.
The only acceptable usage of HalfSipHash is for hash table lookups
where top security isn't actually a goal. I'm still a bit queasy about
going with it, but George S has very aggressively been pursuing a
"save every last cycle" agenda, which makes sense given how
performance sensitive certain hash tables are, and so I suspect this
could be an okay compromise between performance and security. But,
only for hash tables. Certainly not for the RNG.
>
> Rekeying every five minutes doesn't necessarily help the with respect
> to ASLR, but it might reduce the amount of the output stream that
> would be available to the attacker in order to be able to attack the
> get_random_{long,int}() generator, and it also reduces the value of
> doing that attack to only compromising the ASLR for those processes
> started within that five minute window.
The current implemention of get_random_int/long in my branch uses
128-bit key siphash, so the chances of compromising the key are pretty
much nil. The periodic rekeying is to protect against direct-ram
attacks or other kernel exploits -- a concern brought up by Andy.
With siphash, which takes a 128-bit key, if you got an RNG output once
every picosecond, I believe it would take approximately 10^19 years...
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
@ 2016-12-16 21:01 Jason A. Donenfeld
2016-12-16 21:15 ` Hannes Frederic Sowa
0 siblings, 1 reply; 11+ messages in thread
From: Jason A. Donenfeld @ 2016-12-16 21:01 UTC (permalink / raw)
To: kernel-hardening, Theodore Ts'o, George Spelvin, Jason,
Andi Kleen, David Miller, David Laight, Daniel J . Bernstein,
Eric Biggers, Hannes Frederic Sowa, Jean-Philippe Aumasson,
Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
Tom Herbert, Linus Torvalds, Vegard Nossum
Hi Ted,
On Fri, Dec 16, 2016 at 9:43 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> What should we do with get_random_int() and get_random_long()? In
> some cases it's being used in performance sensitive areas, and where
> anti-DoS protection might be enough. In others, maybe not so much.
>
> If we rekeyed the secret used by get_random_int() and
> get_random_long() frequently (say, every minute or every 5 minutes),
> would that be sufficient for current and future users of these
> interfaces?
get_random_int() and get_random_long() should quite clearly use
SipHash with its secure 128-bit key and not HalfSipHash with its
64-bit key. HalfSipHash is absolutely insufficient for this use case.
Remember, we're already an order of magnitude or more faster than
md5...
With regard to periodic rekeying... since the secret is 128-bits, I
believe this is likely sufficient for _not_ rekeying. There's also the
chaining variable, to tie together invocations of the function. If
you'd prefer, instead of the chaining variable, we could use some
siphash output to mutate the original key, but I don't think this
approach is actually better and might introduce vulnerabilities. In my
opinion chaining+128bitkey is sufficient. On the other hand, rekeying
every X minutes is 3 or 4 lines of code. If you want (just say so),
I'll add this to my next revision.
You asked about the security requirements of these functions. The
comment says they're not cryptographically secure. And right now with
MD5 they're not. So the expectations are pretty low. Moving to siphash
adds some cryptographic security, certainly. Moving to siphash plus
rekeying adds a bit more. Of course, on recent x86, RDRAND is used
instead, so the cryptographic strength then depends on the thickness
of your tinfoil hat. So probably we shouldn't change what we advertise
these functions provide, even though we're certainly improving them
performance-wise and security-wise.
> P.S. I'll note that my performance figures when testing changes to
> get_random_int() were done on a 32-bit x86; Jason, I'm guessing your
> figures were using a 64-bit x86 system?. I haven't tried 32-bit ARM
> or smaller CPU's (e.g., mips, et. al.) that might be more likely to be
> used on IoT devices, but I'm worried about those too, of course.
Yes, on x86-64. But on i386 chacha20 incurs nearly the same kind of
slowdown as siphash, so I expect the comparison to be more or less
equal. There's another thing I really didn't like about your chacha20
approach which is that it uses the /dev/urandom pool, which means
various things need to kick in in the background to refill this.
Additionally, having to refill the buffered chacha output every 32 or
so longs isn't nice. These things together make for inconsistent and
hard to understand general operating system performance, because
get_random_long is called at every process startup for ASLR. So, in
the end, I believe there's another reason for going with the siphash
approach: deterministic performance.
Jason
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [kernel-hardening] Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
2016-12-16 21:01 Jason A. Donenfeld
@ 2016-12-16 21:15 ` Hannes Frederic Sowa
0 siblings, 0 replies; 11+ messages in thread
From: Hannes Frederic Sowa @ 2016-12-16 21:15 UTC (permalink / raw)
To: Jason A. Donenfeld, kernel-hardening, Theodore Ts'o,
George Spelvin, Andi Kleen, David Miller, David Laight,
Daniel J . Bernstein, Eric Biggers, Jean-Philippe Aumasson,
Linux Crypto Mailing List, LKML, Andy Lutomirski, Netdev,
Tom Herbert, Linus Torvalds, Vegard Nossum
On Fri, Dec 16, 2016, at 22:01, Jason A. Donenfeld wrote:
> Yes, on x86-64. But on i386 chacha20 incurs nearly the same kind of
> slowdown as siphash, so I expect the comparison to be more or less
> equal. There's another thing I really didn't like about your chacha20
> approach which is that it uses the /dev/urandom pool, which means
> various things need to kick in in the background to refill this.
> Additionally, having to refill the buffered chacha output every 32 or
> so longs isn't nice. These things together make for inconsistent and
> hard to understand general operating system performance, because
> get_random_long is called at every process startup for ASLR. So, in
> the end, I believe there's another reason for going with the siphash
> approach: deterministic performance.
*Hust*, so from where do you generate your key for siphash if called
early from ASLR?
Bye,
Hannes
^ permalink raw reply [flat|nested] 11+ messages in thread
[parent not found: <CAGiyFdfmiCMyHvAg=5sGh8KjBBrF0Wb4Qf=JLzJqUAx4yFSS3Q@mail.gmail.com>]
* Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
[not found] <CAGiyFdfmiCMyHvAg=5sGh8KjBBrF0Wb4Qf=JLzJqUAx4yFSS3Q@mail.gmail.com>
@ 2016-12-16 3:46 ` George Spelvin
[not found] ` <CAGiyFdd6_LVzUUfFcaqMyub1c2WPvWUzAQDCH+Aza-_t6mvmXg@mail.gmail.com>
0 siblings, 1 reply; 11+ messages in thread
From: George Spelvin @ 2016-12-16 3:46 UTC (permalink / raw)
To: ak, davem, David.Laight, ebiggers3, hannes, Jason,
jeanphilippe.aumasson, kernel-hardening, linux-crypto,
linux-kernel, linux, luto, netdev, tom, torvalds, tytso,
vegard.nossum
Cc: djb
Jean-Philippe Aumasson wrote:
> If a halved version of SipHash can bring significant performance boost
> (with 32b words instead of 64b words) with an acceptable security level
> (64-bit enough?) then we may design such a version.
It would be fairly significant, a 2x speed benefit on a lot of 32-bit
machines.
First is the fact that a 64-bit SipHash round on a generic 32-bit machine
requires not twice as many instructions, but more than three.
Consider the core SipHash quarter-round operation:
a += b;
b = rotate_left(b, k)
b ^= a
The add and xor are equivalent between 32- and 64-bit rounds; twice the
instructions do twice the work. (There's a dependency via the carry
bit between the two halves of the add, but it ends up not being on the
critical path even in a superscalar implementation.)
The problem is the rotates. Although some particularly nice code is
possible on 32-bit ARM due to its support for shift-and-xor operations,
on a generic 32-bit CPU the rotate grows to 6 instructions with a 2-cycle
dependency chain (more in practice because barrel shifters are large and
even quad-issue CPUs can't do 4 shifts per cycle):
temp_lo = b_lo >> (32-k)
temp_hi = b_hi >> (32-k)
b_lo <<= k
b_hi <<= k
b_lo ^= temp_hi
b_hi ^= temp_lo
The resultant instruction counts and (assuming wide issue)
latencies are:
64-bit SipHash "Half" SipHash
Inst. Latency Inst. Latency
10 3 3 2 Quarter round
40 6 12 4 Full round
80 12 24 8 Two rounds
82 13 26 9 Mix in one word
82 13 52 18 Mix in 64 bits
166 26 61 18 Four round finalization + final XOR
248 39 113 36 Hash 64 bits
330 52 165 54 Hash 128 bits
412 65 217 72 Hash 192 bits
While the ideal latencies are actually better for the 64-bit algorithm,
that requires an unrealistic 6+-wide superscalar implementation that's
more than twice as wide as the 64-bit code requires (which is already
optimized for quad-issue). For a 1- or 2-wide processor, the instruction
counts dominate, and not only does the 64-bit algorithm take 60% more
time to mix in the same number of bytes, but the finalization rounds
bring the ratio to 2:1 for small inputs.
(And I haven't included the possible savings if the input size is an odd
number of 32-bit words, such as networking applications which include
the source/dest port numbers.)
Notes on particular processors:
- x86 can do a 64-bit rotate in 3 instructions and 2 cycles using
the SHLD/SHRD instructions instead:
movl %b_hi, %temp
shldl $k, %b_lo, %b_hi
shldl $k, %temp, %b_lo
... but as I mentioned the problem is registers. SipHash needs 8 32-bit
words plus at least one temporary, and 32-bit x86 has only 7 available.
(And compilers can rarely manage to keep more than 6 of them busy.)
- 64-bit SipHash is particularly efficient on 32-bit ARM due to its
support for shift-and-op instructions. The 64-bit shift and following
xor can be done in 4 instructions. So the only benefit is from the
reduced finalization.
- Double-width adds cost a little more on CPUs like MIPS and RISC-V without
condition codes.
- Certain particularly crappy uClinux processors with slow shifts
(68000, anyone?) really suffer from extra shifts.
One *weakly* requested feature: It might simplify some programming
interfaces if we could use the same key for multiple hash tables with a
1-word "tweak" (e.g. pointer to the hash table, so it could be assumed
non-zero if that helped) to make distinct functions. That would let us
more safely use a global key for multiple small hash tables without the
need to add code to generate and store key material for each place that
an unkeyed hash is replaced.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-12-19 17:21 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-16 20:49 [PATCH v5 1/4] siphash: add cryptographically secure PRF Jason A. Donenfeld
2016-12-16 21:25 ` George Spelvin
2016-12-16 21:31 ` [kernel-hardening] " Jason A. Donenfeld
2016-12-16 22:41 ` George Spelvin
-- strict thread matches above, loose matches on Subject: below --
2016-12-17 1:39 Jason A. Donenfeld
2016-12-17 2:15 ` George Spelvin
2016-12-17 15:41 ` [kernel-hardening] " Theodore Ts'o
2016-12-17 16:14 ` Jeffrey Walton
2016-12-19 17:21 ` Jason A. Donenfeld
2016-12-16 21:01 Jason A. Donenfeld
2016-12-16 21:15 ` Hannes Frederic Sowa
[not found] <CAGiyFdfmiCMyHvAg=5sGh8KjBBrF0Wb4Qf=JLzJqUAx4yFSS3Q@mail.gmail.com>
2016-12-16 3:46 ` George Spelvin
[not found] ` <CAGiyFdd6_LVzUUfFcaqMyub1c2WPvWUzAQDCH+Aza-_t6mvmXg@mail.gmail.com>
2016-12-16 12:39 ` Jason A. Donenfeld
2016-12-16 19:47 ` Tom Herbert
2016-12-16 20:44 ` [kernel-hardening] " Daniel Micay
2016-12-16 21:09 ` Jason A. Donenfeld
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).