* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
       [not found] <CA+icZUVnsmf1kXPYFYufStQ_MxnLuxL+EWfDS2wQy1VbAEMwkA@mail.gmail.com>
@ 2020-08-09 21:10 ` Sedat Dilek
       [not found] ` <20200809235412.GD25124@SDF.ORG>
  1 sibling, 0 replies; 48+ messages in thread
From: Sedat Dilek @ 2020-08-09 21:10 UTC (permalink / raw)
  To: George Spelvin
  Cc: Amit Klein, Willy Tarreau, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

+ CC:netdev

( Just FYI: Builds and boots on bare metal. )

- Sedat -

On Sun, Aug 9, 2020 at 11:01 PM Sedat Dilek <sedat.dilek@gmail.com> wrote:
>
> Hi George,
>
> I have tried your patch on top of Linux v5.8 with...
>
> commit f227e3ec3b5c ("random32: update the net random state on
> interrupt and activity")
>
> ...reverted.
> This was a bit tricky - what was your base?
>
> I applied the typo fix from Randy - will a v2 follow?
>
> Why DRAFT and not RFC?
>
> Please drop the CC:stable - it's a DRAFT.
>
> Other Linux stable trees like linux-5.7.y include...
>
> commit c0842fbc1b18c7a044e6ff3e8fa78bfa822c7d1a
> random32: move the pseudo-random 32-bit definitions to prandom.h
>
> commit 585524081ecdcde1c719e63916c514866d898217
> random: random.h should include archrandom.h, not the other way around
>
> ...the linux-5.8.y stable tree will follow.
>
> Doesn't the move to prandom.h make your patch easier to apply?
>
> In a second build I applied the snippet from Willy.
>
> What do you mean by...?
>
> [ quote ]
> I wonder if, on general principles, it would be better to use a more
> SipHash-style mixing in of the sample:
>         m = get_cycles();
>         v3 ^= m;
>         SIPROUND(v0, v1, v2, v3);
>         SIPROUND(v0, v1, v2, v3);
>         v0 ^= m;
>
> Not sure if it's worth the extra register (and associated spill/fill).
> [ /quote ]
>
> Do you have a snippet for testing?
>
> Thanks.
>
> Regards,
> - Sedat -

^ permalink raw reply	[flat|nested] 48+ messages in thread
[parent not found: <20200809235412.GD25124@SDF.ORG>]
[parent not found: <20200810034948.GB8262@1wt.eu>]
[parent not found: <20200811053455.GH25124@SDF.ORG>]
[parent not found: <20200811054328.GD9456@1wt.eu>]
[parent not found: <20200811062814.GI25124@SDF.ORG>]
[parent not found: <20200811074538.GA9523@1wt.eu>]
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
       [not found] ` <20200811074538.GA9523@1wt.eu>
@ 2020-08-11 10:51 ` Sedat Dilek
  2020-08-11 11:01   ` Sedat Dilek
  2020-08-12  3:21   ` Willy Tarreau
  0 siblings, 2 replies; 48+ messages in thread
From: Sedat Dilek @ 2020-08-11 10:51 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

[ CC netdev ML ]

Hi Willy,

in [1] you say:

> I've applied it on top of George's patch rebased to mainline for simplicity.
> I've used a separate per_cpu noise variable to keep the net_rand_state static
> with its __latent_entropy.

Can you share this "rebased to mainline" version of George's patch?

Maybe put your work "user-friendly-fetchable" into one of your
<kernel.org> Git trees (see [2])?

Yesterday the random/random32/prandom mainline patches hit Linux
v5.8.1-rc1 (see [3]).

So, as I asked in my first email: what is a suitable base?

Linux v5.9-rc1 (this Sunday) or, if stable, Linux v5.8.1 (next 1-2 days)?

Thanks.

Regards,
- Sedat -

[1] https://marc.info/?l=linux-netdev&m=159709355528675&w=2
[2] https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau
[3] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git/log/?h=linux-5.8.y

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-11 10:51 ` Sedat Dilek
@ 2020-08-11 11:01 ` Sedat Dilek
  2020-08-12  3:21   ` Willy Tarreau
  1 sibling, 0 replies; 48+ messages in thread
From: Sedat Dilek @ 2020-08-11 11:01 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

In the previous discussion [0]...

"Flaw in "random32: update the net random state on interrupt and activity""

...someone referred to <luto/linux.git#random/fast> [1].

Has anyone tested this? Any feedback?

- Sedat -

[0] https://marc.info/?t=159658903500002&r=1&w=2
[1] https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=random/fast

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-11 10:51 ` Sedat Dilek
  2020-08-11 11:01   ` Sedat Dilek
@ 2020-08-12  3:21 ` Willy Tarreau
  2020-08-13  7:53   ` Sedat Dilek
  1 sibling, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2020-08-12 3:21 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> Can you share this "rebased to mainline" version of George's patch?

You can pick it from there if that helps, but keep in mind that it's
just experimental code that we use to explain our ideas and that we
really don't care a single second what kernel it's applied to:

    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1

Willy

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-12  3:21 ` Willy Tarreau
@ 2020-08-13  7:53 ` Sedat Dilek
  2020-08-13  8:06   ` Willy Tarreau
  0 siblings, 1 reply; 48+ messages in thread
From: Sedat Dilek @ 2020-08-13 7:53 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Wed, Aug 12, 2020 at 5:21 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> > Can you share this "rebased to mainline" version of George's patch?
>
> You can pick it from there if that helps, but keep in mind that
> it's just experimental code that we use to explain our ideas and
> that we really don't care a single second what kernel it's applied
> to:
>
>    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1
>

Thanks Willy.

I disagree: the base for testing should be clear(ly communicated).

There are two diffs from Eric [1]: #1 adds a trace event for
prandom_u32() and #2 removes a prandom_u32() call in tcp_conn_request().
In case you have not seen them: the first was helpful for playing with
linux-perf. I have tested both together with [2].

- Sedat -

[1] https://marc.info/?l=linux-netdev&m=159716173516111&w=2
[2] https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=random/fast

Note2myself: Enable some useful random/random32 Kconfig options:

RANDOM32_SELFTEST        n -> y
WARN_ALL_UNSEEDED_RANDOM n -> y

- EOT -

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  7:53 ` Sedat Dilek
@ 2020-08-13  8:06 ` Willy Tarreau
  2020-08-13  8:13   ` Sedat Dilek
  2020-08-14 15:32   ` Sedat Dilek
  0 siblings, 2 replies; 48+ messages in thread
From: Willy Tarreau @ 2020-08-13 8:06 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Thu, Aug 13, 2020 at 09:53:11AM +0200, Sedat Dilek wrote:
> On Wed, Aug 12, 2020 at 5:21 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> > > Can you share this "rebased to mainline" version of George's patch?
> >
> > You can pick it from there if that helps, but keep in mind that
> > it's just experimental code that we use to explain our ideas and
> > that we really don't care a single second what kernel it's applied
> > to:
> >
> >    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1
> >
>
> Thanks Willy.
>
> I disagree: the base for testing should be clear(ly communicated).

It is. As you can see in the log above, this was applied on top of
fc80c51fd4b2; there's nothing special here. In addition, we're not even
talking about testing nor calling for testers, just trying to find a
reasonable solution. Maybe today I'll be able to re-run a few tests, by
the way.

> There are two diffs from Eric: #1 adds a trace event for
> prandom_u32() and #2 removes a prandom_u32() call in
> tcp_conn_request().
> In case you have not seen them.

I've seen them, just not had the time to test yet.

Willy

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  8:06 ` Willy Tarreau
@ 2020-08-13  8:13 ` Sedat Dilek
  2020-08-13  8:27   ` Sedat Dilek
  1 sibling, 1 reply; 48+ messages in thread
From: Sedat Dilek @ 2020-08-13 8:13 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Thu, Aug 13, 2020 at 10:06 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Thu, Aug 13, 2020 at 09:53:11AM +0200, Sedat Dilek wrote:
> > On Wed, Aug 12, 2020 at 5:21 AM Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> > > > Can you share this "rebased to mainline" version of George's patch?
> > >
> > > You can pick it from there if that helps, but keep in mind that
> > > it's just experimental code that we use to explain our ideas and
> > > that we really don't care a single second what kernel it's applied
> > > to:
> > >
> > >    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1
> > >
> >
> > Thanks Willy.
> >
> > I disagree: the base for testing should be clear(ly communicated).
>
> It is. As you can see in the log above, this was applied on top of
> fc80c51fd4b2; there's nothing special here. In addition, we're not even
> talking about testing nor calling for testers, just trying to find a
> reasonable solution. Maybe today I'll be able to re-run a few tests, by
> the way.
>

I agree - with it published in your Git tree, it is clear.

> > There are two diffs from Eric: #1 adds a trace event for
> > prandom_u32() and #2 removes a prandom_u32() call in
> > tcp_conn_request().
> > In case you have not seen them.
>
> I've seen them, just not had the time to test yet.
>

Can you describe and share your test environment/setup?

The Linux kernel has KUnit tests (I never played with them) - do you
happen to know whether a suitable one is available?

Maybe the Linux Test Project has some suitable tests?

- Sedat -

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  8:13 ` Sedat Dilek
@ 2020-08-13  8:27 ` Sedat Dilek
  2020-08-13 14:00   ` Eric Dumazet
  0 siblings, 1 reply; 48+ messages in thread
From: Sedat Dilek @ 2020-08-13 8:27 UTC (permalink / raw)
  To: George Spelvin
  Cc: Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andy Lutomirski,
	Kees Cook, Thomas Gleixner, Peter Zijlstra, Linus Torvalds,
	netdev, Willy Tarreau

I ran a perf session that looks like this in my KDE/Plasma desktop environment:

[ PERF SESSION ]

 1016  2020-08-13 09:57:24 echo 1 > /proc/sys/kernel/sched_schedstats
 1017  2020-08-13 09:57:24 echo prandom_u32 >> /sys/kernel/debug/tracing/set_event
 1018  2020-08-13 09:57:24 echo traceon > /sys/kernel/debug/tracing/events/random/prandom_u32/trigger
 1019  2020-08-13 09:57:25 echo 1 > /sys/kernel/debug/tracing/events/enable

 1020  2020-08-13 09:57:32 sysctl -n kernel.sched_schedstats
 1021  2020-08-13 09:57:32 cat /sys/kernel/debug/tracing/events/enable
 1022  2020-08-13 09:57:32 grep prandom_u32 /sys/kernel/debug/tracing/set_event
 1023  2020-08-13 09:57:33 cat /sys/kernel/debug/tracing/events/random/prandom_u32/trigger

root# /home/dileks/bin/perf record -a -g -e random:prandom_u32 sleep 5

Simple test: press F5 in Firefox in each opened tab (5 tabs in total).

root# /home/dileks/bin/perf report --no-children --stdio > /tmp/perf-report.txt

- Sedat -

[ perf-report.txt ]

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 273  of event 'random:prandom_u32'
# Event count (approx.): 273
#
# Overhead  Command          Shared Object      Symbol
# ........  ...............  .................  ...............
# 27.47% firefox-esr [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 shmem_get_inode shmem_mknod path_openat do_filp_open do_sys_openat2 __x64_sys_openat do_syscall_64 entry_SYSCALL_64_after_hwframe __open64 0x7a6f6d2e67726f2f 13.19% DNS Resolver #9 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--5.86%--__ip4_datagram_connect | ip4_datagram_connect | __sys_connect | __x64_sys_connect | do_syscall_64 | entry_SYSCALL_64_after_hwframe | | | |--4.40%--__libc_connect | | 0x2c1 | | | --1.47%--connect | PR_GetAddrInfoByName | |--2.93%--udp_lib_get_port | inet_dgram_connect | __sys_connect | __x64_sys_connect | do_syscall_64 | entry_SYSCALL_64_after_hwframe | | | |--2.20%--__libc_connect | | 0x2c1 | | | --0.73%--connect | PR_GetAddrInfoByName | |--2.20%--__ip_select_ident | __ip_make_skb | ip_make_skb | udp_sendmsg | __sys_sendto | __x64_sys_sendto | do_syscall_64 | entry_SYSCALL_64_after_hwframe | __libc_send | --2.20%--netlink_autobind netlink_bind __sys_bind __x64_sys_bind do_syscall_64 entry_SYSCALL_64_after_hwframe bind getaddrinfo PR_GetAddrInfoByName 12.45% FS Broker 7060 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 shmem_get_inode shmem_mknod path_openat do_filp_open do_sys_openat2 __x64_sys_openat do_syscall_64 entry_SYSCALL_64_after_hwframe __open64 0x7a6f6d2e67726f2f 8.79% FS Broker 6334 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 shmem_get_inode shmem_mknod path_openat do_filp_open do_sys_openat2 __x64_sys_openat do_syscall_64 entry_SYSCALL_64_after_hwframe __open64 0x7a6f6d2e67726f2f 8.06% FS Broker 7161 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 shmem_get_inode shmem_mknod path_openat do_filp_open do_sys_openat2 __x64_sys_openat do_syscall_64 entry_SYSCALL_64_after_hwframe __open64 0x7a6f6d2e67726f2f 8.06% FS Broker 7195 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 shmem_get_inode shmem_mknod path_openat do_filp_open do_sys_openat2 __x64_sys_openat 
do_syscall_64 entry_SYSCALL_64_after_hwframe __open64 0x7a6f6d2e67726f2f 7.33% FS Broker 7126 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 shmem_get_inode shmem_mknod path_openat do_filp_open do_sys_openat2 __x64_sys_openat do_syscall_64 entry_SYSCALL_64_after_hwframe __open64 0x7a6f6d2e67726f2f 7.33% Socket Thread [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 tcp_v4_connect __inet_stream_connect inet_stream_connect __sys_connect __x64_sys_connect do_syscall_64 entry_SYSCALL_64_after_hwframe __libc_connect 4.40% Cache2 I/O [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 __ext4_new_inode ext4_create path_openat do_filp_open do_sys_openat2 __x64_sys_openat do_syscall_64 entry_SYSCALL_64_after_hwframe __open64 0x61632e2f736b656c 1.10% Xorg [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 shmem_get_inode __shmem_file_setup create_shmem i915_gem_object_create_region i915_gem_create_ioctl drm_ioctl_kernel drm_ioctl __se_sys_ioctl do_syscall_64 entry_SYSCALL_64_after_hwframe ioctl 0.73% QSGRenderThread [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 shmem_get_inode __shmem_file_setup create_shmem i915_gem_object_create_region i915_gem_create_ioctl drm_ioctl_kernel drm_ioctl __se_sys_ioctl do_syscall_64 entry_SYSCALL_64_after_hwframe ioctl 0.37% DOM Worker [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 __ext4_new_inode ext4_create path_openat do_filp_open do_sys_openat2 __x64_sys_openat do_syscall_64 entry_SYSCALL_64_after_hwframe __open64 0x6f6d2e2f736b656c 0.37% cupsd [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 __ext4_new_inode ext4_create path_openat do_filp_open do_sys_openat2 __x64_sys_openat do_syscall_64 entry_SYSCALL_64_after_hwframe __open64 0x7263736275732f73 0.37% mozStorage #8 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 __ext4_new_inode ext4_create path_openat do_filp_open do_sys_openat2 __x64_sys_openat do_syscall_64 
entry_SYSCALL_64_after_hwframe __open64 0x6f6d2e2f736b656c # Samples: 0 of event 'dummy:HG' # Event count (approx.): 0 # # Overhead Command Shared Object Symbol # ........ ....... ............. ...... # # # (Tip: Print event counts in CSV format with: perf stat -x,) # - EOF - ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  8:27 ` Sedat Dilek
@ 2020-08-13 14:00 ` Eric Dumazet
  2020-08-13 16:02   ` Sedat Dilek
  0 siblings, 1 reply; 48+ messages in thread
From: Eric Dumazet @ 2020-08-13 14:00 UTC (permalink / raw)
  To: sedat.dilek
  Cc: George Spelvin, Amit Klein, Jason A. Donenfeld, Andy Lutomirski,
	Kees Cook, Thomas Gleixner, Peter Zijlstra, Linus Torvalds,
	netdev, Willy Tarreau

On Thu, Aug 13, 2020 at 1:27 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
>
> I ran a perf session that looks like this in my KDE/Plasma desktop environment:
>
> [ PERF SESSION ]
>
>  1016  2020-08-13 09:57:24 echo 1 > /proc/sys/kernel/sched_schedstats
>  1017  2020-08-13 09:57:24 echo prandom_u32 >> /sys/kernel/debug/tracing/set_event
>  1018  2020-08-13 09:57:24 echo traceon > /sys/kernel/debug/tracing/events/random/prandom_u32/trigger
>  1019  2020-08-13 09:57:25 echo 1 > /sys/kernel/debug/tracing/events/enable
>
>  1020  2020-08-13 09:57:32 sysctl -n kernel.sched_schedstats
>  1021  2020-08-13 09:57:32 cat /sys/kernel/debug/tracing/events/enable
>  1022  2020-08-13 09:57:32 grep prandom_u32 /sys/kernel/debug/tracing/set_event
>  1023  2020-08-13 09:57:33 cat /sys/kernel/debug/tracing/events/random/prandom_u32/trigger
>
> root# /home/dileks/bin/perf record -a -g -e random:prandom_u32 sleep 5

To be clear: this "perf record -a -g -e random:prandom_u32 sleep 5"
is self-sufficient.

You have nothing to do before it (as reported in your email); all of
that is simply not needed.

I am not sure why you added all this irrelevant stuff; it is distracting.

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13 14:00 ` Eric Dumazet
@ 2020-08-13 16:02 ` Sedat Dilek
  0 siblings, 0 replies; 48+ messages in thread
From: Sedat Dilek @ 2020-08-13 16:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: George Spelvin, Amit Klein, Jason A. Donenfeld, Andy Lutomirski,
	Kees Cook, Thomas Gleixner, Peter Zijlstra, Linus Torvalds,
	netdev, Willy Tarreau

On Thu, Aug 13, 2020 at 4:00 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Aug 13, 2020 at 1:27 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> >
> > I ran a perf session that looks like this in my KDE/Plasma desktop environment:
> >
> > [ PERF SESSION ]
> >
> >  1016  2020-08-13 09:57:24 echo 1 > /proc/sys/kernel/sched_schedstats
> >  1017  2020-08-13 09:57:24 echo prandom_u32 >> /sys/kernel/debug/tracing/set_event
> >  1018  2020-08-13 09:57:24 echo traceon > /sys/kernel/debug/tracing/events/random/prandom_u32/trigger
> >  1019  2020-08-13 09:57:25 echo 1 > /sys/kernel/debug/tracing/events/enable
> >
> >  1020  2020-08-13 09:57:32 sysctl -n kernel.sched_schedstats
> >  1021  2020-08-13 09:57:32 cat /sys/kernel/debug/tracing/events/enable
> >  1022  2020-08-13 09:57:32 grep prandom_u32 /sys/kernel/debug/tracing/set_event
> >  1023  2020-08-13 09:57:33 cat /sys/kernel/debug/tracing/events/random/prandom_u32/trigger
> >
> > root# /home/dileks/bin/perf record -a -g -e random:prandom_u32 sleep 5
>
> To be clear: this "perf record -a -g -e random:prandom_u32 sleep 5"
> is self-sufficient.
>
> You have nothing to do before it (as reported in your email); all of
> that is simply not needed.
>
> I am not sure why you added all this irrelevant stuff; it is distracting.

Initially I followed these links:

Link: https://www.kernel.org/doc/html/v5.8/trace/events.html
Link: https://www.kernel.org/doc/html/v5.8/trace/events.html#boot-option
Link: http://www.brendangregg.com/perf.html
Link: http://www.brendangregg.com/perf.html#DynamicTracing

You are right, it is not needed to set and check all these variables,
as perf says:

root# /home/dileks/bin/perf list | grep prandom_u32
  random:prandom_u32                                 [Tracepoint event]

So these two steps are indeed enough:

root# /home/dileks/bin/perf record -a -g -e random:prandom_u32 sleep 5
root# /home/dileks/bin/perf report --no-children --stdio

Lessons learned.

- Sedat -

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-13  8:06 ` Willy Tarreau
  2020-08-13  8:13   ` Sedat Dilek
@ 2020-08-14 15:32 ` Sedat Dilek
  2020-08-14 16:05   ` Willy Tarreau
  1 sibling, 1 reply; 48+ messages in thread
From: Sedat Dilek @ 2020-08-14 15:32 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

[-- Attachment #1: Type: text/plain, Size: 2163 bytes --]

On Thu, Aug 13, 2020 at 10:06 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Thu, Aug 13, 2020 at 09:53:11AM +0200, Sedat Dilek wrote:
> > On Wed, Aug 12, 2020 at 5:21 AM Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > On Tue, Aug 11, 2020 at 12:51:43PM +0200, Sedat Dilek wrote:
> > > > Can you share this "rebased to mainline" version of George's patch?
> > >
> > > You can pick it from there if that helps, but keep in mind that
> > > it's just experimental code that we use to explain our ideas and
> > > that we really don't care a single second what kernel it's applied
> > > to:
> > >
> > >    https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/log/?h=20200811-prandom-1
> > >
> >
> > Thanks Willy.
> >
> > I disagree: the base for testing should be clear(ly communicated).
>
> It is. As you can see in the log above, this was applied on top of
> fc80c51fd4b2; there's nothing special here. In addition, we're not even
> talking about testing nor calling for testers, just trying to find a
> reasonable solution. Maybe today I'll be able to re-run a few tests, by
> the way.
>
> > There are two diffs from Eric: #1 adds a trace event for
> > prandom_u32() and #2 removes a prandom_u32() call in
> > tcp_conn_request().
> > In case you have not seen them.
>
> I've seen them, just not had the time to test yet.
>

Now with Eric's patch (see [1]) in mainline...

commit 94c7eb54c4b8e81618ec79f414fe1ca5767f9720
"random32: add a tracepoint for prandom_u32()"

...I gave Willy's patches a try and used the Linux Test Project (LTP)
for testing.

[ PERF SESSION #2 ]

Link: https://github.com/linux-test-project/ltp/blob/master/doc/mini-howto-building-ltp-from-git.txt

cd /opt/ltp
/home/dileks/bin/perf record -a -g -e random:prandom_u32 ./runltp -f net.features -s tcp_fastopen
/home/dileks/bin/perf report --no-children --stdio > ./perf-report.txt
/home/dileks/bin/perf script > ./perf-script.txt

du -h perf*
 34M  perf.data
 20K  perf-report.txt
134M  perf-script.txt

Note: For a first test I used net.features::tcp_fastopen.

Attached is my perf-report.txt.

- Sedat -

[1] https://git.kernel.org/linus/94c7eb54c4b8e81618ec79f414fe1ca5767f9720

[-- Attachment #2: perf-report.txt --]
[-- Type: text/plain, Size: 20177 bytes --]

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 120K of event 'random:prandom_u32'
# Event count (approx.): 120452
#
# Overhead  Command          Shared Object      Symbol
# ........  ...............  .................  ...............
# 58.22% netstress [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--33.22%--tcp_v4_connect | __inet_stream_connect | | | |--22.15%--inet_stream_connect | | __sys_connect | | __x64_sys_connect | | do_syscall_64 | | entry_SYSCALL_64_after_hwframe | | __libc_connect | | 0x65736e7500632e73 | | | --11.07%--tcp_sendmsg_locked | tcp_sendmsg | sock_sendmsg | __sys_sendto | __x64_sys_sendto | do_syscall_64 | entry_SYSCALL_64_after_hwframe | __libc_sendto | |--16.61%--tcp_v6_connect | __inet_stream_connect | | | |--11.07%--inet_stream_connect | | __sys_connect | | __x64_sys_connect | | do_syscall_64 | | entry_SYSCALL_64_after_hwframe | | __libc_connect | | 0x65736e7500632e73 | | | --5.54%--tcp_sendmsg_locked | tcp_sendmsg | sock_sendmsg | __sys_sendto | __x64_sys_sendto | do_syscall_64 | entry_SYSCALL_64_after_hwframe | __libc_sendto | |--5.24%--tcp_conn_request | tcp_rcv_state_process | | | |--3.26%--tcp_v4_do_rcv | | tcp_v4_rcv | | ip_protocol_deliver_rcu | | ip_local_deliver_finish | | __netif_receive_skb_one_core | | process_backlog | | net_rx_action | | __softirqentry_text_start | | asm_call_on_stack | | do_softirq_own_stack | | | | | |--2.03%--irq_exit_rcu | | | sysvec_apic_timer_interrupt | | | asm_sysvec_apic_timer_interrupt | | | | | --1.23%--do_softirq | | __local_bh_enable_ip | | | | | --1.19%--ip_finish_output2 | | __ip_queue_xmit | | __tcp_transmit_skb | | | | | --1.13%--tcp_write_xmit | | __tcp_push_pending_frames | | | | | --0.95%--tcp_sendmsg_locked | | tcp_sendmsg | | sock_sendmsg | | | | | --0.59%--__sys_sendto | | __x64_sys_sendto | | do_syscall_64 | | entry_SYSCALL_64_after_hwframe | | | --1.99%--tcp_v6_do_rcv | tcp_v6_rcv | ip6_protocol_deliver_rcu | ip6_input | __netif_receive_skb_one_core | process_backlog | net_rx_action | __softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | | | |--1.27%--irq_exit_rcu | | sysvec_apic_timer_interrupt | | asm_sysvec_apic_timer_interrupt | | | --0.71%--do_softirq | 
__local_bh_enable_ip | | | --0.68%--ip6_finish_output2 | ip6_xmit | inet6_csk_xmit | __tcp_transmit_skb | | | --0.62%--tcp_write_xmit | __tcp_push_pending_frames | | | --0.53%--tcp_sendmsg_locked | tcp_sendmsg | sock_sendmsg | --3.13%--tcp_v4_syn_recv_sock tcp_v6_syn_recv_sock | |--2.61%--tcp_try_fastopen | tcp_conn_request | tcp_rcv_state_process | tcp_v4_do_rcv | tcp_v4_rcv | ip_protocol_deliver_rcu | ip_local_deliver_finish | __netif_receive_skb_one_core | process_backlog | net_rx_action | __softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | | | |--1.66%--irq_exit_rcu | | sysvec_apic_timer_interrupt | | asm_sysvec_apic_timer_interrupt | | | --0.95%--do_softirq | __local_bh_enable_ip | | | --0.92%--ip_finish_output2 | __ip_queue_xmit | __tcp_transmit_skb | | | --0.87%--tcp_write_xmit | __tcp_push_pending_frames | | | --0.72%--tcp_sendmsg_locked | tcp_sendmsg | sock_sendmsg | --0.52%--tcp_check_req tcp_v4_rcv ip_protocol_deliver_rcu ip_local_deliver_finish __netif_receive_skb_one_core process_backlog net_rx_action __softirqentry_text_start asm_call_on_stack do_softirq_own_stack 37.93% swapper [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--25.55%--tcp_conn_request | tcp_rcv_state_process | | | |--13.54%--tcp_v6_do_rcv | | tcp_v6_rcv | | ip6_protocol_deliver_rcu | | ip6_input | | __netif_receive_skb_one_core | | process_backlog | | net_rx_action | | __softirqentry_text_start | | asm_call_on_stack | | do_softirq_own_stack | | irq_exit_rcu | | sysvec_apic_timer_interrupt | | asm_sysvec_apic_timer_interrupt | | | | | |--12.41%--cpuidle_enter_state | | | cpuidle_enter | | | do_idle | | | cpu_startup_entry | | | | | | | |--9.59%--start_secondary | | | | secondary_startup_64 | | | | | | | --2.82%--start_kernel | | | secondary_startup_64 | | | | | --0.68%--poll_idle | | cpuidle_enter_state | | cpuidle_enter | | do_idle | | cpu_startup_entry | | | | | --0.52%--start_secondary | | secondary_startup_64 | | | --12.02%--tcp_v4_do_rcv | 
tcp_v4_rcv | ip_protocol_deliver_rcu | ip_local_deliver_finish | __netif_receive_skb_one_core | process_backlog | net_rx_action | __softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | irq_exit_rcu | sysvec_apic_timer_interrupt | asm_sysvec_apic_timer_interrupt | | | |--10.71%--cpuidle_enter_state | | cpuidle_enter | | do_idle | | cpu_startup_entry | | | | | |--8.21%--start_secondary | | | secondary_startup_64 | | | | | --2.50%--start_kernel | | secondary_startup_64 | | | --0.77%--poll_idle | cpuidle_enter_state | cpuidle_enter | do_idle | cpu_startup_entry | | | --0.60%--start_secondary | secondary_startup_64 | --12.36%--tcp_v4_syn_recv_sock tcp_v6_syn_recv_sock | |--7.54%--tcp_try_fastopen | tcp_conn_request | tcp_rcv_state_process | tcp_v4_do_rcv | tcp_v4_rcv | ip_protocol_deliver_rcu | ip_local_deliver_finish | __netif_receive_skb_one_core | process_backlog | net_rx_action | __softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | irq_exit_rcu | sysvec_apic_timer_interrupt | asm_sysvec_apic_timer_interrupt | | | --6.76%--cpuidle_enter_state | cpuidle_enter | do_idle | cpu_startup_entry | | | |--5.11%--start_secondary | | secondary_startup_64 | | | --1.65%--start_kernel | secondary_startup_64 | --4.82%--tcp_check_req tcp_v4_rcv ip_protocol_deliver_rcu ip_local_deliver_finish __netif_receive_skb_one_core process_backlog net_rx_action __softirqentry_text_start asm_call_on_stack do_softirq_own_stack irq_exit_rcu sysvec_apic_timer_interrupt asm_sysvec_apic_timer_interrupt | --4.59%--cpuidle_enter_state cpuidle_enter do_idle cpu_startup_entry | |--3.63%--start_secondary | secondary_startup_64 | --0.95%--start_kernel secondary_startup_64 0.78% ksoftirqd/1 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 0.76% ksoftirqd/0 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 0.76% ksoftirqd/3 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 0.71% ksoftirqd/2 [kernel.kallsyms] [k] prandom_u32 | 
---prandom_u32 prandom_u32 0.20% kworker/3:3-eve [kernel.kallsyms] [k] prandom_u32 0.17% kworker/1:1-eve [kernel.kallsyms] [k] prandom_u32 0.15% kworker/2:2-eve [kernel.kallsyms] [k] prandom_u32 0.12% kworker/0:7-eve [kernel.kallsyms] [k] prandom_u32 0.06% Xorg [kernel.kallsyms] [k] prandom_u32 0.04% ip [kernel.kallsyms] [k] prandom_u32 0.03% avahi-daemon [kernel.kallsyms] [k] prandom_u32 0.02% perf [kernel.kallsyms] [k] prandom_u32 0.01% mysqld [kernel.kallsyms] [k] prandom_u32 0.01% tcp_fastopen_ru [kernel.kallsyms] [k] prandom_u32 0.00% irq/35-iwlwifi [kernel.kallsyms] [k] prandom_u32 0.00% rcu_sched [kernel.kallsyms] [k] prandom_u32 0.00% dbus-daemon [kernel.kallsyms] [k] prandom_u32 0.00% ltp-pan [kernel.kallsyms] [k] prandom_u32 0.00% kworker/u16:3-i [kernel.kallsyms] [k] prandom_u32 0.00% mktemp [kernel.kallsyms] [k] prandom_u32 0.00% QSGRenderThread [kernel.kallsyms] [k] prandom_u32 0.00% gdbus [kernel.kallsyms] [k] prandom_u32 0.00% jbd2/sdc2-8 [kernel.kallsyms] [k] prandom_u32 0.00% kded5 [kernel.kallsyms] [k] prandom_u32 0.00% konsole [kernel.kallsyms] [k] prandom_u32 0.00% kworker/1:0-eve [kernel.kallsyms] [k] prandom_u32 0.00% kworker/2:0-eve [kernel.kallsyms] [k] prandom_u32 0.00% runltp [kernel.kallsyms] [k] prandom_u32 0.00% systemd-journal [kernel.kallsyms] [k] prandom_u32 0.00% xdg-desktop-por [kernel.kallsyms] [k] prandom_u32 0.00% sh [kernel.kallsyms] [k] prandom_u32 # Samples: 0 of event 'dummy:HG' # Event count (approx.): 0 # # Overhead Command Shared Object Symbol # ........ ....... ............. ...... # # # (Tip: To show assembler sample contexts use perf record -b / perf script -F +brstackinsn --xed) # ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-14 15:32 ` Sedat Dilek
@ 2020-08-14 16:05 ` Willy Tarreau
  2020-08-14 16:17   ` Sedat Dilek
  0 siblings, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2020-08-14 16:05 UTC (permalink / raw)
  To: Sedat Dilek
  Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
	Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
	Linus Torvalds, netdev

On Fri, Aug 14, 2020 at 05:32:32PM +0200, Sedat Dilek wrote:
> commit 94c7eb54c4b8e81618ec79f414fe1ca5767f9720
> "random32: add a tracepoint for prandom_u32()"
>
> ...I gave Willy's patches a try and used the Linux Test Project (LTP)
> for testing.

Just FWIW, today I could run several relevant tests with a 40 Gbps NIC
at high connection rates and under SYN flood to stress SYN cookies.
I couldn't post earlier due to a net outage but will post the results
here. In short, what I'm seeing is very good. The only thing is that
the noise collection as-is, with the 4 longs, takes a bit too much CPU
(0.2% measured), but if we keep only one word we're back to tausworthe
performance, while using siphash all along.

The noise generation function is so small that we're wasting cycles
calling it and renaming registers. I'll run one more test by inlining
it and exporting the noise.

So provided quite some cleanup now, I really think we're about to
reach a solution which will satisfy everyone. More on this after I
extract the results.

Willy

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-14 16:05             ` Willy Tarreau
@ 2020-08-14 16:17               ` Sedat Dilek
  2020-08-16 15:01                 ` Willy Tarreau
  0 siblings, 1 reply; 48+ messages in thread
From: Sedat Dilek @ 2020-08-14 16:17 UTC (permalink / raw)
To: Willy Tarreau
Cc: George Spelvin, Amit Klein, Eric Dumazet, Jason A. Donenfeld,
    Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra,
    Linus Torvalds, netdev

On Fri, Aug 14, 2020 at 6:05 PM Willy Tarreau <w@1wt.eu> wrote:
>
> On Fri, Aug 14, 2020 at 05:32:32PM +0200, Sedat Dilek wrote:
> > commit 94c7eb54c4b8e81618ec79f414fe1ca5767f9720
> > "random32: add a tracepoint for prandom_u32()"
> >
> > ...I gave Willy's patches a try and used the Linux Test Project (LTP)
> > for testing.
>
> Just FWIW today I could run several relevant tests with a 40 Gbps NIC
> at high connection rates and under SYN flood to stress SYN cookies.
> I couldn't post earlier due to a net outage but will post the results
> here. In short, what I'm seeing is very good. The only thing is that
> the noise collection as-is with the 4 longs takes a bit too much CPU
> (0.2% measured) but if keeping only one word we're back to tausworthe
> performance, while using siphash all along.
>
> The noise generation function is so small that we're wasting cycles
> calling it and renaming registers. I'll run one more test by inlining
> it and exporting the noise.
>
> So provided quite some cleanup now, I really think we're about to
> reach a solution which will satisfy everyone. More on this after I
> extract the results.
>

Great news!

Will you publish your new version in [1]?

- Sedat -

[1] https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/cleanups.git/

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-14 16:17               ` Sedat Dilek
@ 2020-08-16 15:01                 ` Willy Tarreau
  2020-08-16 16:48                   ` Sedat Dilek
  0 siblings, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2020-08-16 15:01 UTC (permalink / raw)
To: Eric Dumazet, George Spelvin, Linus Torvalds
Cc: Sedat Dilek, Amit Klein, Jason A. Donenfeld, Andy Lutomirski,
    Kees Cook, Thomas Gleixner, Peter Zijlstra, netdev

Hi,

so as I mentioned, I could run several tests in our lab with variations
around the various proposals and come to quite positive conclusions.

Synthetic observations: the connection rate and the SYN cookie rate do not
seem to be affected the same way by the prandom changes. One explanation
is that the connection rates are less stable across reboots. Another
possible explanation is that the larger state update is more sensitive
to cache misses that increase when calling userland. I noticed that the
compiler didn't inline siprand_u32() for me, resulting in one extra
function call and noticeable register clobbering that mostly vanish
once siprand_u32() is inlined, getting back to the original performance.

The noise generation was placed as discussed in the xmit calls, however
the extra function call and state update had a negative effect on
performance and the noise function alone appeared for up to 0.23% of the
CPU usage. Simplifying the mix of data by keeping only one long for
the noise and using one siphash round on 4 input words to keep only
the last word allowed to use very few instructions and to inline them,
making the noise collection imperceptible in microbenchmarks.

The noise is now collected this way (I verified that all inputs are used);
this performs 3 xor, 2 add and 2 rol, which is way sufficient and already
better than my initial attempt with a bare add:

    static inline
    void prandom_u32_add_noise(unsigned long a, unsigned long b,
                               unsigned long c, unsigned long d)
    {
            /*
             * This is not used cryptographically; it's just
             * a convenient 4-word hash function. (3 xor, 2 add, 2 rol)
             */
            a ^= __this_cpu_read(net_rand_noise);
            PRND_SIPROUND(a, b, c, d);
            __this_cpu_write(net_rand_noise, d);
    }

My tests were run on a 6-core 12-thread Core i7-8700k equipped with a 40G
NIC (i40e). I've mainly run two types of tests:

  - connections per second: the machine runs a server which accepts and
    closes incoming connections. The load generators aim at it and the
    connection rate is measured once it's stabilized.

  - SYN cookie rate: the load generators flood the machine with enough
    SYNs to saturate the CPU and the rate of response SYN-ACK is measured.

Both correspond to real world use cases (DDoS protection against SYN flood
and connection flood).

The base kernel was fc80c51f + Eric's patch to add a tracepoint in
prandom_u32(). Another test was made by adding George's changes to use
siphash. Then another test was made with the siprand_u32() function
inlined and with noise stored as a full siphash state. Then one test
was run with the noise reduced to a single long. And a final test was
run with the noise function inlined.

                                   connections   SYN cookies   Notes
                                   per second    emitted/s

    base:                              556k         5.38M

    siphash:                           535k         5.33M

    siphash inlined
    +noise:                            548k         5.40M      add_noise=0.23%

    siphash + single-word
    noise                              555k         5.45M      add_noise=0.10%

    siphash + single-word&inlined
    noise                              559k         5.38M

Actually the last one is better than the previous one because it also
swallows more packets. There were 10.9M pps in and 5.38M pps out versus
10.77M in and 5.45M out for the previous one. I didn't report the incoming
traffic for the other ones as it was mostly irrelevant and always within
these bounds.

Finally I've added Eric's patch to reuse the skb hash when known in
tcp_conn_request(), and was happy to see the SYN cookies reach 5.45 Mpps
again and the connection rate remain unaffected. A perf record during
the SYN flood showed almost no calls to prandom_u32() anymore (just a few
in tcp_rtx_synack()), so this looks like a desirable optimization.

At the moment the code is ugly, in experimental state (I've pushed all of
it at https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom.git/).

My impression on this is that given that it's possible to maintain the
same level of performance as we currently have while making the PRNG much
better, there's no more reason for not doing it.

If there's enough interest at this point, I'm OK with restarting from
George's patches and doing the adjustments there. There's still this
prandom_seed() which looks very close to prandom_reseed() and that we
might possibly better remerge, but I'd vote for not changing everything
at once; it's ugly enough already. Also I suspect we can have an infinite
loop in prandom_seed() if the entropy is 0 and the state is zero as well.
We'd be unlucky, but I'd just make sure the entropy is not all zeroes. And
running tests on 32-bit would be desirable as well.

Finally one can wonder whether it makes sense to keep Tausworthe for
other cases (basic statistical sampling) or drop it. We could definitely
drop it and simplify everything given that we now have the same level of
performance. But if we do it, what should we do with the test patterns?
I personally don't think that testing a PRNG against a known sequence
brings any value by definition, and that the more random we make it the
less relevant this is.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 48+ messages in thread
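The single-round noise mixing described above is compact enough to model in plain userspace C. The sketch below is illustrative only: `rol64()` is open-coded, a file-scope variable stands in for the per-CPU `net_rand_noise`, and `sipround()`/`add_noise()` are stand-in names (the kernel code uses the `PRND_SIPROUND` macro together with `__this_cpu_read()`/`__this_cpu_write()`); the rotation constants are those of the 64-bit SipHash round quoted elsewhere in the thread.

```c
#include <stdint.h>

/* Open-coded equivalent of the kernel's rol64() helper. */
static inline uint64_t rol64(uint64_t word, unsigned int shift)
{
	return (word << shift) | (word >> (64 - shift));
}

/* One 64-bit SipHash round over the four state words (cf. PRND_SIPROUND). */
static inline void sipround(uint64_t v[4])
{
	v[0] += v[1]; v[1] = rol64(v[1], 13); v[2] += v[3]; v[3] = rol64(v[3], 16);
	v[1] ^= v[0]; v[0] = rol64(v[0], 32); v[3] ^= v[2];
	v[0] += v[3]; v[3] = rol64(v[3], 21); v[2] += v[1]; v[1] = rol64(v[1], 17);
	v[3] ^= v[0]; v[1] ^= v[2]; v[2] = rol64(v[2], 32);
}

/* Stand-in for the per-CPU net_rand_noise variable. */
static uint64_t net_rand_noise;

/* Userspace model of prandom_u32_add_noise(): fold the previous noise
 * word into the first input, run one round, keep only the last word. */
static void add_noise(uint64_t a, uint64_t b, uint64_t c, uint64_t d)
{
	uint64_t v[4] = { a ^ net_rand_noise, b, c, d };

	sipround(v);
	net_rand_noise = v[3];
}
```

Because the previous noise word is folded back into the first input, two calls with identical arguments still produce different noise values; that cheap feedback is what makes the 3-xor/2-add/2-rol mix worthwhile despite its small cost.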
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-16 15:01 ` Willy Tarreau @ 2020-08-16 16:48 ` Sedat Dilek 2020-08-20 3:05 ` Sedat Dilek 0 siblings, 1 reply; 48+ messages in thread From: Sedat Dilek @ 2020-08-16 16:48 UTC (permalink / raw) To: Willy Tarreau Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein, Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, netdev On Sun, Aug 16, 2020 at 5:01 PM Willy Tarreau <w@1wt.eu> wrote: > > Hi, > > so as I mentioned, I could run several test on our lab with variations > around the various proposals and come to quite positive conclusions. > > Synthetic observations: the connection rate and the SYN cookie rate do not > seem to be affected the same way by the prandom changes. One explanation > is that the connection rates are less stable across reboots. Another > possible explanation is that the larger state update is more sensitive > to cache misses that increase when calling userland. I noticed that the > compiler didn't inline siprand_u32() for me, resulting in one extra > function call and noticeable register clobbering that mostly vanish > once siprand_u32() is inlined, getting back to the original performance. > > The noise generation was placed as discussed in the xmit calls, however > the extra function call and state update had a negative effect on > performance and the noise function alone appeared for up to 0.23% of the > CPU usage. Simplifying the mix of data by keeping only one long for > the noise and using one siphash round on 4 input words to keep only > the last word allowed to use very few instructions and to inline them, > making the noise collection imperceptible in microbenchmarks. 
The noise > is now collected this way (I verified that all inputs are used), this > performs 3 xor, 2 add and 2 rol, which is way sufficient and already > better than my initial attempt with a bare add : > > static inline > void prandom_u32_add_noise(unsigned long a, unsigned long b, > unsigned long c, unsigned long d) > { > /* > * This is not used cryptographically; it's just > * a convenient 4-word hash function. (3 xor, 2 add, 2 rol) > */ > a ^= __this_cpu_read(net_rand_noise); > PRND_SIPROUND(a, b, c, d); > __this_cpu_write(net_rand_noise, d); > } > > My tests were run on a 6-core 12-thread Core i7-8700k equipped with a 40G > NIC (i40e). I've mainly run two types of tests: > > - connections per second: the machine runs a server which accepts and > closes incoming connections. The load generators aim at it and the > connection rate is measured once it's stabilized. > > - SYN cookie rate: the load generators flood the machine with enough > SYNs to saturate the CPU and the rate of response SYN-ACK is measured. > > Both correspond to real world use cases (DDoS protection against SYN flood > and connection flood). > > The base kernel was fc80c51f + Eric's patch to add a tracepoint in > prandom_u32(). Another test was made by adding George's changes to use > siphash. Then another test was made with the siprand_u32() function > inlined and with noise stored as a full siphash state. Then one test > was run with the noise reduced to a single long. And a final test was > run with the noise function inlined. > > connections SYN cookies Notes > per second emitted/s > > base: 556k 5.38M > > siphash: 535k 5.33M > > siphash inlined > +noise: 548k 5.40M add_noise=0.23% > > siphash + single-word > noise 555k 5.45M add_noise=0.10% > > siphash + single-word&inlined > noise 559k 5.38M > > Actually the last one is better than the previous one because it also > swallows more packets. There were 10.9M pps in and 5.38M pps out versus > 10.77M in and 5.45M out for the previous one. 
I didn't report the incoming > traffic for the other ones as it was mostly irrelevant and always within > these bounds. > > Finally I've added Eric's patch to reuse the skb hash when known in > tcp_conn_request(), and was happy to see the SYN cookies reach 5.45 Mpps > again and the connection rate remain unaffected. A perf record during > the SYN flood showed almost no call to prandom_u32() anymore (just a few > in tcp_rtx_synack()) so this looks like a desirable optimization. > > At the moment the code is ugly, in experimental state (I've pushed all of > it at https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom.git/). > > My impression on this is that given that it's possible to maintain the > same level of performance as we currently have while making the PRNG much > better, there's no more reason for not doing it. > > If there's enough interest at this point, I'm OK with restarting from > George's patches and doing the adjustments there. There's still this > prandom_seed() which looks very close to prandom_reseed() and that we > might possibly better remerge, but I'd vote for not changing everything > at once, it's ugly enough already. Also I suspect we can have an infinite > loop in prandom_seed() if entropy is 0 and the state is zero as well. > We'd be unlucky but I'd just make sure entropy is not all zeroes. And > running tests on 32-bit would be desirable as well. > > Finally one can wonder whether it makes sense to keep Tausworthe for > other cases (basic statistical sampling) or drop it. We could definitely > drop it and simplify everything given that we now have the same level of > performance. But if we do it, what should we do with the test patterns ? > I personally don't think that testing a PRNG against a known sequence > brings any value by definition, and that the more random we make it the > less relevant this is. > Hi Willy, Thanks for the new patchset and offering it via public available Git. Thanks for the numbers. 
As said I tested here against Linux v5.8.1 - with your previous patchset. I cannot promise I will test the new one. First, I have to see how things work with Linux v5.9-rc1 - which will hopefully be released within a few hours. My primary focus is to make it work with my GNU and LLVM toolchains. Regards, - Sedat - ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-16 16:48 ` Sedat Dilek @ 2020-08-20 3:05 ` Sedat Dilek 2020-08-20 4:33 ` Willy Tarreau 0 siblings, 1 reply; 48+ messages in thread From: Sedat Dilek @ 2020-08-20 3:05 UTC (permalink / raw) To: Willy Tarreau Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein, Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, netdev On Sun, Aug 16, 2020 at 6:48 PM Sedat Dilek <sedat.dilek@gmail.com> wrote: > > On Sun, Aug 16, 2020 at 5:01 PM Willy Tarreau <w@1wt.eu> wrote: > > > > Hi, > > > > so as I mentioned, I could run several test on our lab with variations > > around the various proposals and come to quite positive conclusions. > > > > Synthetic observations: the connection rate and the SYN cookie rate do not > > seem to be affected the same way by the prandom changes. One explanation > > is that the connection rates are less stable across reboots. Another > > possible explanation is that the larger state update is more sensitive > > to cache misses that increase when calling userland. I noticed that the > > compiler didn't inline siprand_u32() for me, resulting in one extra > > function call and noticeable register clobbering that mostly vanish > > once siprand_u32() is inlined, getting back to the original performance. > > > > The noise generation was placed as discussed in the xmit calls, however > > the extra function call and state update had a negative effect on > > performance and the noise function alone appeared for up to 0.23% of the > > CPU usage. Simplifying the mix of data by keeping only one long for > > the noise and using one siphash round on 4 input words to keep only > > the last word allowed to use very few instructions and to inline them, > > making the noise collection imperceptible in microbenchmarks. 
The noise > > is now collected this way (I verified that all inputs are used), this > > performs 3 xor, 2 add and 2 rol, which is way sufficient and already > > better than my initial attempt with a bare add : > > > > static inline > > void prandom_u32_add_noise(unsigned long a, unsigned long b, > > unsigned long c, unsigned long d) > > { > > /* > > * This is not used cryptographically; it's just > > * a convenient 4-word hash function. (3 xor, 2 add, 2 rol) > > */ > > a ^= __this_cpu_read(net_rand_noise); > > PRND_SIPROUND(a, b, c, d); > > __this_cpu_write(net_rand_noise, d); > > } > > > > My tests were run on a 6-core 12-thread Core i7-8700k equipped with a 40G > > NIC (i40e). I've mainly run two types of tests: > > > > - connections per second: the machine runs a server which accepts and > > closes incoming connections. The load generators aim at it and the > > connection rate is measured once it's stabilized. > > > > - SYN cookie rate: the load generators flood the machine with enough > > SYNs to saturate the CPU and the rate of response SYN-ACK is measured. > > > > Both correspond to real world use cases (DDoS protection against SYN flood > > and connection flood). > > > > The base kernel was fc80c51f + Eric's patch to add a tracepoint in > > prandom_u32(). Another test was made by adding George's changes to use > > siphash. Then another test was made with the siprand_u32() function > > inlined and with noise stored as a full siphash state. Then one test > > was run with the noise reduced to a single long. And a final test was > > run with the noise function inlined. 
> > > > connections SYN cookies Notes > > per second emitted/s > > > > base: 556k 5.38M > > > > siphash: 535k 5.33M > > > > siphash inlined > > +noise: 548k 5.40M add_noise=0.23% > > > > siphash + single-word > > noise 555k 5.45M add_noise=0.10% > > > > siphash + single-word&inlined > > noise 559k 5.38M > > > > Actually the last one is better than the previous one because it also > > swallows more packets. There were 10.9M pps in and 5.38M pps out versus > > 10.77M in and 5.45M out for the previous one. I didn't report the incoming > > traffic for the other ones as it was mostly irrelevant and always within > > these bounds. > > > > Finally I've added Eric's patch to reuse the skb hash when known in > > tcp_conn_request(), and was happy to see the SYN cookies reach 5.45 Mpps > > again and the connection rate remain unaffected. A perf record during > > the SYN flood showed almost no call to prandom_u32() anymore (just a few > > in tcp_rtx_synack()) so this looks like a desirable optimization. > > > > At the moment the code is ugly, in experimental state (I've pushed all of > > it at https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom.git/). > > > > My impression on this is that given that it's possible to maintain the > > same level of performance as we currently have while making the PRNG much > > better, there's no more reason for not doing it. > > > > If there's enough interest at this point, I'm OK with restarting from > > George's patches and doing the adjustments there. There's still this > > prandom_seed() which looks very close to prandom_reseed() and that we > > might possibly better remerge, but I'd vote for not changing everything > > at once, it's ugly enough already. Also I suspect we can have an infinite > > loop in prandom_seed() if entropy is 0 and the state is zero as well. > > We'd be unlucky but I'd just make sure entropy is not all zeroes. And > > running tests on 32-bit would be desirable as well. 
> > Finally one can wonder whether it makes sense to keep Tausworthe for
> > other cases (basic statistical sampling) or drop it. We could definitely
> > drop it and simplify everything given that we now have the same level of
> > performance. But if we do it, what should we do with the test patterns ?
> > I personally don't think that testing a PRNG against a known sequence
> > brings any value by definition, and that the more random we make it the
> > less relevant this is.
> >
>
> Hi Willy,
>
> Thanks for the new patchset and offering it via public available Git.
>
> Thanks for the numbers.
>
> As said I tested here against Linux v5.8.1 - with your previous patchset.
>
> I cannot promise I will test the new one.
>
> First, I have to see how things work with Linux v5.9-rc1 - which will
> hopefully be released within a few hours.
> My primary focus is to make it work with my GNU and LLVM toolchains.
>

I had some time to play with your new patchset.

I tried on top of Linux v5.9-rc1. Unfortunately, some drivers from the
staging area needed to be disabled due to build failures.

I changed from kernel-modules to disabled:

scripts/config -d RTL8723BS
scripts/config -d R8712U
scripts/config -d R8188EU

See below for the error and [ drivers/staging/rtl8723bs/include/rtw_security.h ]
and git grep for #define K0 | #define K1.

Then I saw some modpost errors:

mkdir -p ./arch/x86_64/boot
ln -fsn ../../x86/boot/bzImage ./arch/x86_64/boot/bzImage
ERROR: modpost: "net_rand_noise" [drivers/scsi/fcoe/libfcoe.ko] undefined!
ERROR: modpost: "net_rand_noise" [lib/test_bpf.ko] undefined!
make[4]: *** [scripts/Makefile.modpost:111: Module.symvers] Error 1
make[4]: *** Deleting file 'Module.symvers'

Where I changed from kernel-modules to disabled:

scripts/config -d TEST_BPF
scripts/config -d LIBFCOE

- Sedat -

[ CONFIG_R8188EU=m and CONFIG_88EU_AP_MODE=y ]

$ grep "Error 1" build-log_5.9.0-rc1-5-amd64-llvm11-ias.txt
make[6]: *** [scripts/Makefile.build:283: drivers/staging/rtl8188eu/core/rtw_ap.o] Error 1

[ CONFIG_RTL8723BS=m ]

In file included from drivers/staging/rtl8723bs/core/rtw_btcoex.c:7:
In file included from ./drivers/staging/rtl8723bs/include/drv_types.h:42:
./drivers/staging/rtl8723bs/include/rtw_security.h:275:7: error: expected member name or ';' after declaration specifiers
        u32 K0, K1; /* Key */
        ~~~ ^
./include/linux/prandom.h:35:13: note: expanded from macro 'K0'
#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
            ^
In file included from drivers/staging/rtl8723bs/core/rtw_btcoex.c:7:
In file included from ./drivers/staging/rtl8723bs/include/drv_types.h:42:
./drivers/staging/rtl8723bs/include/rtw_security.h:275:7: error: expected ')'
./include/linux/prandom.h:35:13: note: expanded from macro 'K0'
#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
            ^
./drivers/staging/rtl8723bs/include/rtw_security.h:275:7: note: to match this '('
./include/linux/prandom.h:35:12: note: expanded from macro 'K0'
#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
           ^
In file included from drivers/staging/rtl8723bs/core/rtw_btcoex.c:7:
In file included from ./drivers/staging/rtl8723bs/include/drv_types.h:42:
./drivers/staging/rtl8723bs/include/rtw_security.h:275:11: error: expected member name or ';' after declaration specifiers
        u32 K0, K1; /* Key */
        ~~~     ^
./include/linux/prandom.h:36:13: note: expanded from macro 'K1'
#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
            ^
In file included from drivers/staging/rtl8723bs/core/rtw_btcoex.c:7:
In file included from ./drivers/staging/rtl8723bs/include/drv_types.h:42:
./drivers/staging/rtl8723bs/include/rtw_security.h:275:11: error: expected ')'
./include/linux/prandom.h:36:13: note: expanded from macro 'K1'
#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
            ^
./drivers/staging/rtl8723bs/include/rtw_security.h:275:11: note: to match this '('
./include/linux/prandom.h:36:12: note: expanded from macro 'K1'
#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
           ^
4 errors generated.
make[6]: *** [scripts/Makefile.build:283: drivers/staging/rtl8723bs/core/rtw_btcoex.o] Error 1

[ drivers/staging/rtl8723bs/include/rtw_security.h ]

struct mic_data {
	u32 K0, K1; /* Key */
	u32 L, R; /* Current state */
	u32 M; /* Message accumulator (single word) */
	u32 nBytesInM; /* # bytes in M */
}

$ git grep -E 'K0 |K1 ' drivers/staging/ | grep -i key
drivers/staging/rtl8188eu/core/rtw_security.c: pmicdata->K0 = secmicgetuint32(key);
drivers/staging/rtl8188eu/core/rtw_security.c: pmicdata->K1 = secmicgetuint32(key + 4);
drivers/staging/rtl8712/rtl871x_security.c: pmicdata->K0 = secmicgetuint32(key);
drivers/staging/rtl8712/rtl871x_security.c: pmicdata->K1 = secmicgetuint32(key + 4);
drivers/staging/rtl8723bs/core/rtw_security.c: pmicdata->K0 = secmicgetuint32(key);
drivers/staging/rtl8723bs/core/rtw_security.c: pmicdata->K1 = secmicgetuint32(key + 4);

$ git grep -E '\#define K0 |\#define K1 '
arch/arm/crypto/sha1-armv7-neon.S:#define K1 0x5A827999
arch/x86/crypto/sha1_avx2_x86_64_asm.S:#define K1 0x5a827999
crypto/rmd128.c:#define K1 RMD_K1
crypto/rmd160.c:#define K1 RMD_K1
crypto/rmd256.c:#define K1 RMD_K1
crypto/rmd320.c:#define K1 RMD_K1
fs/ext4/hash.c:#define K1 0
include/linux/prandom.h:#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
include/linux/prandom.h:#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
include/linux/prandom.h:#define K0 0x6c796765
include/linux/prandom.h:#define K1 0x74656462
lib/random32.c:#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
lib/random32.c:#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
lib/random32.c:#define K0 0x6c796765
lib/random32.c:#define K1 0x74656462

We have the same defines for K0 and K1 in include/linux/prandom.h and
lib/random32.c? More room for simplifications?

- EOT -

^ permalink raw reply	[flat|nested] 48+ messages in thread
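The four compiler errors above share one root cause: include/linux/prandom.h defines the bare names K0 and K1, so the preprocessor rewrites the struct member names in rtw_security.h before the compiler ever parses the declaration. A minimal userspace illustration of the clash and of a namespaced fix follows; the `PRND_K0`/`PRND_K1` names are hypothetical stand-ins chosen here for the sketch, not taken from the posted patches.

```c
#include <stdint.h>

/*
 * Namespaced variants of the prandom.h SipHash key constants. With a
 * prefix they can no longer collide with unrelated identifiers such as
 * struct members elsewhere in the tree.
 */
#define PRND_K0 (0x736f6d6570736575ULL ^ 0x6c7967656e657261ULL)
#define PRND_K1 (0x646f72616e646f6dULL ^ 0x7465646279746573ULL)

/*
 * This is the shape of the declaration that broke: with an unprefixed
 * "#define K0 (...)" in scope, the member names K0/K1 would be macro-
 * expanded into parenthesized constants, producing exactly the
 * "expected member name" / "expected ')'" errors from the build log.
 * With the macros namespaced (or #undef'd first), it compiles cleanly.
 */
struct mic_data {
	uint32_t K0, K1;	/* Key */
	uint32_t L, R;		/* Current state */
};
```

The alternative workaround tried later in the thread, adding `#undef K0` / `#undef K1` before each conflicting struct, also silences the errors, but it only papers over the namespace pollution instead of removing it.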
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  3:05                     ` Sedat Dilek
@ 2020-08-20  4:33                       ` Willy Tarreau
  2020-08-20  4:42                         ` Sedat Dilek
  0 siblings, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2020-08-20 4:33 UTC (permalink / raw)
To: Sedat Dilek
Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
    Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
    Peter Zijlstra, netdev

On Thu, Aug 20, 2020 at 05:05:49AM +0200, Sedat Dilek wrote:
> We have the same defines for K0 and K1 in include/linux/prandom.h and
> lib/random32.c?
> More room for simplifications?

Definitely, I'm not surprised at all. As I said, the purpose was to
discuss around the proposal, not much more. If we think it's the way
to go, some major lifting is required. I just don't want to invest
significant time on this if nobody cares.

Thanks!
Willy

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  4:33                       ` Willy Tarreau
@ 2020-08-20  4:42                         ` Sedat Dilek
  2020-08-20  6:08                           ` Willy Tarreau
  0 siblings, 1 reply; 48+ messages in thread
From: Sedat Dilek @ 2020-08-20 4:42 UTC (permalink / raw)
To: Willy Tarreau
Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
    Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
    Peter Zijlstra, netdev

On Thu, Aug 20, 2020 at 6:33 AM Willy Tarreau <w@1wt.eu> wrote:
>
> On Thu, Aug 20, 2020 at 05:05:49AM +0200, Sedat Dilek wrote:
> > We have the same defines for K0 and K1 in include/linux/prandom.h and
> > lib/random32.c?
> > More room for simplifications?
>
> Definitely, I'm not surprized at all. As I said, the purpose was to
> discuss around the proposal, not much more. If we think it's the way
> to go, some major lifting is required. I just don't want to invest
> significant time on this if nobody cares.
>

OK.

Right now, I will try with the attached diff.

Unclear to me where this modpost "net_rand_noise" undefined! comes from.
Any hints?
- Sedat -

[ prandom-siphash-noise-wtarreau-20200816-dileks.diff ]

diff --git a/drivers/staging/rtl8188eu/include/rtw_security.h b/drivers/staging/rtl8188eu/include/rtw_security.h
index 8ba02a7cea60..5cbb6fec71cd 100644
--- a/drivers/staging/rtl8188eu/include/rtw_security.h
+++ b/drivers/staging/rtl8188eu/include/rtw_security.h
@@ -221,6 +221,9 @@ do { \
 #define ROL32(A, n) (((A) << (n)) | (((A) >> (32 - (n))) & ((1UL << (n)) - 1)))
 #define ROR32(A, n) ROL32((A), 32 - (n))
 
+// XXX: Workaround: Undef defines from <include/linux/prandom.h>
+#undef K0
+#undef K1
 struct mic_data {
 	u32 K0, K1; /* Key */
 	u32 L, R; /* Current state */
diff --git a/drivers/staging/rtl8712/rtl871x_security.h b/drivers/staging/rtl8712/rtl871x_security.h
index b2dda16cbd0a..d4ffb31d9d14 100644
--- a/drivers/staging/rtl8712/rtl871x_security.h
+++ b/drivers/staging/rtl8712/rtl871x_security.h
@@ -188,6 +188,9 @@ do {\
 #define ROL32(A, n) (((A) << (n)) | (((A)>>(32-(n))) & ((1UL << (n)) - 1)))
 #define ROR32(A, n) ROL32((A), 32 - (n))
 
+// XXX: Workaround: Undef defines from <include/linux/prandom.h>
+#undef K0
+#undef K1
 struct mic_data {
 	u32 K0, K1; /* Key */
 	u32 L, R; /* Current state */
diff --git a/drivers/staging/rtl8723bs/include/rtw_security.h b/drivers/staging/rtl8723bs/include/rtw_security.h
index 514c0799c34b..260ca9f29a35 100644
--- a/drivers/staging/rtl8723bs/include/rtw_security.h
+++ b/drivers/staging/rtl8723bs/include/rtw_security.h
@@ -271,6 +271,9 @@ do {\
 #define ROL32(A, n) (((A) << (n)) | (((A)>>(32-(n))) & ((1UL << (n)) - 1)))
 #define ROR32(A, n) ROL32((A), 32-(n))
 
+// XXX: Workaround: Undef defines from <include/linux/prandom.h>
+#undef K0
+#undef K1
 struct mic_data {
 	u32 K0, K1; /* Key */
 	u32 L, R; /* Current state */
diff --git a/include/linux/prandom.h b/include/linux/prandom.h
index 95d73b01d8c5..efebcff3c93d 100644
--- a/include/linux/prandom.h
+++ b/include/linux/prandom.h
@@ -32,6 +32,11 @@ DECLARE_PER_CPU(unsigned long, net_rand_noise);
 	v1 ^= v0, v0 = rol64(v0, 32), v3 ^= v2, \
 	v0 += v3, v3 = rol64(v3, 21), v2 += v1, v1 = rol64(v1, 17), \
 	v3 ^= v0, v1 ^= v2, v2 = rol64(v2, 32) )
+#define SIPROUND(v0,v1,v2,v3) ( \
+	v0 += v1, v1 = rol64(v1, 13), v2 += v3, v3 = rol64(v3, 16), \
+	v1 ^= v0, v0 = rol64(v0, 32), v3 ^= v2, \
+	v0 += v3, v3 = rol64(v3, 21), v2 += v1, v1 = rol64(v1, 17), \
+	v3 ^= v0, v1 ^= v2, v2 = rol64(v2, 32) )
 #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
 #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
 
@@ -46,6 +51,11 @@ DECLARE_PER_CPU(unsigned long, net_rand_noise);
 	v1 ^= v0, v0 = rol32(v0, 16), v3 ^= v2, \
 	v0 += v3, v3 = rol32(v3, 7), v2 += v1, v1 = rol32(v1, 13), \
 	v3 ^= v0, v1 ^= v2, v2 = rol32(v2, 16) )
+#define SIPROUND(v0,v1,v2,v3) ( \
+	v0 += v1, v1 = rol32(v1, 5), v2 += v3, v3 = rol32(v3, 8), \
+	v1 ^= v0, v0 = rol32(v0, 16), v3 ^= v2, \
+	v0 += v3, v3 = rol32(v3, 7), v2 += v1, v1 = rol32(v1, 13), \
+	v3 ^= v0, v1 ^= v2, v2 = rol32(v2, 16) )
 #define K0 0x6c796765
 #define K1 0x74656462
 
diff --git a/lib/random32.c b/lib/random32.c
index 93f0cd3a67ee..f24c7a0febf0 100644
--- a/lib/random32.c
+++ b/lib/random32.c
@@ -323,37 +323,6 @@ struct siprand_state {
 static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy;
 DEFINE_PER_CPU(unsigned long, net_rand_noise);
 
-#if BITS_PER_LONG == 64
-/*
- * The core SipHash round function. Each line can be executed in
- * parallel given enough CPU resources.
- */
-#define SIPROUND(v0,v1,v2,v3) ( \
-	v0 += v1, v1 = rol64(v1, 13), v2 += v3, v3 = rol64(v3, 16), \
-	v1 ^= v0, v0 = rol64(v0, 32), v3 ^= v2, \
-	v0 += v3, v3 = rol64(v3, 21), v2 += v1, v1 = rol64(v1, 17), \
-	v3 ^= v0, v1 ^= v2, v2 = rol64(v2, 32) )
-#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 )
-#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 )
-
-#elif BITS_PER_LONG == 32
-/*
- * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash.
- * This is weaker, but 32-bit machines are not used for high-traffic
- * applications, so there is less output for an attacker to analyze.
- */
-#define SIPROUND(v0,v1,v2,v3) ( \
-	v0 += v1, v1 = rol32(v1, 5), v2 += v3, v3 = rol32(v3, 8), \
-	v1 ^= v0, v0 = rol32(v0, 16), v3 ^= v2, \
-	v0 += v3, v3 = rol32(v3, 7), v2 += v1, v1 = rol32(v1, 13), \
-	v3 ^= v0, v1 ^= v2, v2 = rol32(v2, 16) )
-#define K0 0x6c796765
-#define K1 0x74656462
-
-#else
-#error Unsupported BITS_PER_LONG
-#endif
-
 /*
  * This is the core CPRNG function. As "pseudorandom", this is not used
  * for truly valuable things, just intended to be a PITA to guess.

- EOF -

^ permalink raw reply related	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  4:42                         ` Sedat Dilek
@ 2020-08-20  6:08                           ` Willy Tarreau
  2020-08-20  6:58                             ` Willy Tarreau
  0 siblings, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2020-08-20 6:08 UTC (permalink / raw)
To: Sedat Dilek
Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
    Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
    Peter Zijlstra, netdev

On Thu, Aug 20, 2020 at 06:42:23AM +0200, Sedat Dilek wrote:
> On Thu, Aug 20, 2020 at 6:33 AM Willy Tarreau <w@1wt.eu> wrote:
> >
> > On Thu, Aug 20, 2020 at 05:05:49AM +0200, Sedat Dilek wrote:
> > > We have the same defines for K0 and K1 in include/linux/prandom.h and
> > > lib/random32.c?
> > > More room for simplifications?
> >
> > Definitely, I'm not surprized at all. As I said, the purpose was to
> > discuss around the proposal, not much more. If we think it's the way
> > to go, some major lifting is required. I just don't want to invest
> > significant time on this if nobody cares.
> >
>
> OK.
>
> Right now, I will try with the attached diff.

No, don't waste your time this way, it's not the right way to address it:
you're still facing competition between defines. I'll do another one if
you want to go further in the tests.

> Unclear to me where this modpost "net_rand_noise" undefined! comes from.
> Any hints?

Sure, the symbol is not exported. I'll address it as well for you.

Willy

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable
  2020-08-20  6:08                           ` Willy Tarreau
@ 2020-08-20  6:58                             ` Willy Tarreau
  2020-08-20  8:05                               ` Willy Tarreau
  0 siblings, 1 reply; 48+ messages in thread
From: Willy Tarreau @ 2020-08-20 6:58 UTC (permalink / raw)
To: Sedat Dilek
Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein,
    Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner,
    Peter Zijlstra, netdev

On Thu, Aug 20, 2020 at 08:08:43AM +0200, Willy Tarreau wrote:
> On Thu, Aug 20, 2020 at 06:42:23AM +0200, Sedat Dilek wrote:
> > On Thu, Aug 20, 2020 at 6:33 AM Willy Tarreau <w@1wt.eu> wrote:
> > >
> > > On Thu, Aug 20, 2020 at 05:05:49AM +0200, Sedat Dilek wrote:
> > > > We have the same defines for K0 and K1 in include/linux/prandom.h and
> > > > lib/random32.c?
> > > > More room for simplifications?
> > >
> > > Definitely, I'm not surprized at all. As I said, the purpose was to
> > > discuss around the proposal, not much more. If we think it's the way
> > > to go, some major lifting is required. I just don't want to invest
> > > significant time on this if nobody cares.
> > >
> >
> > OK.
> >
> > Right now, I will try with the attached diff.
>
> No, don't waste your time this way, it's not the right way to address it,
> you're still facing competition between defines. I'll do another one if
> you want to go further in the tests.

I've just pushed a new branch "20200820-siphash-noise" that I also
rebased onto latest master. It's currently running make allmodconfig
here, so that will take a while, but I think it's OK as random32.o is
already OK. I've also addressed a strange build issue caused by having
an array instead of 4 ints in siprand_state.

Please just let me know if that's OK for you now.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-20 6:58 ` Willy Tarreau @ 2020-08-20 8:05 ` Willy Tarreau 2020-08-20 12:08 ` Sedat Dilek [not found] ` <CANEQ_+L+22Hkdqf38Zr0bfq16fcL1Ax2X9fToXV_niHKXCB8aA@mail.gmail.com> 0 siblings, 2 replies; 48+ messages in thread From: Willy Tarreau @ 2020-08-20 8:05 UTC (permalink / raw) To: Sedat Dilek Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein, Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, netdev On Thu, Aug 20, 2020 at 08:58:38AM +0200, Willy Tarreau wrote: > I've just pushed a new branch "20200820-siphash-noise" that I also > rebased onto latest master. It's currently running make allmodconfig > here, so that will take a while, but I think it's OK as random32.o is > already OK. I've also addressed a strange build issue caused by having > an array instead of 4 ints in siprand_state. > > Please just let me know if that's OK for you now. At least it worked for me now (built with no errors on x86_64): $ time make -j 8 bzImage modules (...) real 65m7.986s user 477m22.477s sys 38m0.545s $ find . -name '*.ko' |wc -l 7983 Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-20 8:05 ` Willy Tarreau @ 2020-08-20 12:08 ` Sedat Dilek [not found] ` <CANEQ_+L+22Hkdqf38Zr0bfq16fcL1Ax2X9fToXV_niHKXCB8aA@mail.gmail.com> 1 sibling, 0 replies; 48+ messages in thread From: Sedat Dilek @ 2020-08-20 12:08 UTC (permalink / raw) To: Willy Tarreau Cc: Eric Dumazet, George Spelvin, Linus Torvalds, Amit Klein, Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, netdev [-- Attachment #1: Type: text/plain, Size: 2348 bytes --] On Thu, Aug 20, 2020 at 10:05 AM Willy Tarreau <w@1wt.eu> wrote: > > On Thu, Aug 20, 2020 at 08:58:38AM +0200, Willy Tarreau wrote: > > I've just pushed a new branch "20200820-siphash-noise" that I also > > rebased onto latest master. It's currently running make allmodconfig > > here, so that will take a while, but I think it's OK as random32.o is > > already OK. I've also addressed a strange build issue caused by having > > an array instead of 4 ints in siprand_state. > > > > Please just let me know if that's OK for you now. > > At least it worked for me now (built with no errors on x86_64): > > $ time make -j 8 bzImage modules > (...) > real 65m7.986s > user 477m22.477s > sys 38m0.545s > $ find . -name '*.ko' |wc -l > 7983 > Runs fine here. Thanks Willy for the "20200820-siphash-noise" [1] patchset and including/fixing all my reported issues. [1] Staging driver build failures fixed by... WIP: random32: rename the K0/K1 SipHash constants to PRND_K* [2] modpost undefined "net_rand_noise" errors fixed by... WIP: random32: export net_rand_noise [3] Consolidate/move/cleanup stuff in random32.c and prandom.h WIP: random32: keep a single macro definition for sipround This patchset looks very good to me :-). [ TESTING WITH LTP ] I run another perf-session by running "LTP::net.features::tcp_fastopen" test only. Unsure if there exist some more appropriate LTP tests. For example there exists "net_stress.*" (see [2]). 
[ PERF-SESSION ] Link: https://github.com/ClangBuiltLinux/linux/issues/1086#issuecomment-675783804 /home/dileks/bin/perf list | grep prandom_u32 | column -t random:prandom_u32 [Tracepoint event] cd /opt/ltp echo 0 | tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid /home/dileks/bin/perf record -a -g -e random:prandom_u32 ./runltp -f net.features -s tcp_fastopen /home/dileks/bin/perf report --no-children --stdio > ./perf-report.txt /home/dileks/bin/perf script > ./perf-script.txt echo 1 | tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid [ /PERF-SESSION ] For a "comparison" (?) I attached two perf-reports - the newer one includes Willy's latest patchset. - Sedat - [1] https://git.kernel.org/pub/scm/linux/kernel/git/wtarreau/prandom.git/log/?h=20200820-siphash-noise [2] https://github.com/linux-test-project/ltp/tree/master/runtest [-- Attachment #2: perf-report-2020-08-19.txt --] [-- Type: text/plain, Size: 21043 bytes --] # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 120K of event 'random:prandom_u32' # Event count (approx.): 120473 # # Overhead Command Shared Object Symbol # ........ ............... ................. ............... 
# 59.67% netstress [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--33.21%--tcp_v4_connect | __inet_stream_connect | | | |--22.14%--inet_stream_connect | | __sys_connect | | __x64_sys_connect | | __noinstr_text_start | | entry_SYSCALL_64_after_hwframe | | __libc_connect | | 0x65736e7500632e73 | | | --11.07%--tcp_sendmsg_locked | tcp_sendmsg | __sys_sendto | __x64_sys_sendto | __noinstr_text_start | entry_SYSCALL_64_after_hwframe | __libc_sendto | |--16.61%--tcp_v6_connect | __inet_stream_connect | | | |--11.07%--inet_stream_connect | | __sys_connect | | __x64_sys_connect | | __noinstr_text_start | | entry_SYSCALL_64_after_hwframe | | __libc_connect | | 0x65736e7500632e73 | | | --5.54%--tcp_sendmsg_locked | tcp_sendmsg | __sys_sendto | __x64_sys_sendto | __noinstr_text_start | entry_SYSCALL_64_after_hwframe | __libc_sendto | |--6.64%--tcp_conn_request | tcp_rcv_state_process | | | |--3.35%--tcp_v4_do_rcv | | tcp_v4_rcv | | ip_protocol_deliver_rcu | | ip_local_deliver | | ip_rcv | | __netif_receive_skb | | process_backlog | | napi_poll | | net_rx_action | | __softirqentry_text_start | | asm_call_on_stack | | do_softirq_own_stack | | | | | |--2.05%--__irq_exit_rcu | | | | | | | --2.04%--sysvec_apic_timer_interrupt | | | asm_sysvec_apic_timer_interrupt | | | | | --1.30%--do_softirq | | __local_bh_enable_ip | | | | | --1.27%--ip_finish_output2 | | ip_output | | __ip_queue_xmit | | __tcp_transmit_skb | | | | | --1.20%--tcp_write_xmit | | | | | --1.02%--__tcp_push_pending_frames | | tcp_sendmsg_locked | | tcp_sendmsg | | | | | --0.63%--__sys_sendto | | __x64_sys_sendto | | __noinstr_text_start | | entry_SYSCALL_64_after_hwframe | | | --3.29%--tcp_v6_do_rcv | tcp_v6_rcv | ip6_protocol_deliver_rcu | ip6_input | ipv6_rcv | __netif_receive_skb | process_backlog | napi_poll | net_rx_action | __softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | | | |--2.02%--__irq_exit_rcu | | sysvec_apic_timer_interrupt | | asm_sysvec_apic_timer_interrupt | 
| | --1.28%--do_softirq | __local_bh_enable_ip | | | --1.23%--ip6_finish_output2 | ip6_output | ip6_xmit | inet6_csk_xmit | __tcp_transmit_skb | | | --1.14%--tcp_write_xmit | | | --0.98%--__tcp_push_pending_frames | tcp_sendmsg_locked | tcp_sendmsg | | | --0.60%--__sys_sendto | __x64_sys_sendto | __noinstr_text_start | entry_SYSCALL_64_after_hwframe | --3.19%--tcp_v4_syn_recv_sock tcp_v6_syn_recv_sock | --2.70%--tcp_try_fastopen tcp_conn_request tcp_rcv_state_process tcp_v4_do_rcv tcp_v4_rcv ip_protocol_deliver_rcu ip_local_deliver ip_rcv __netif_receive_skb process_backlog napi_poll net_rx_action __softirqentry_text_start asm_call_on_stack do_softirq_own_stack | |--1.71%--__irq_exit_rcu | | | --1.70%--sysvec_apic_timer_interrupt | asm_sysvec_apic_timer_interrupt | --0.99%--do_softirq __local_bh_enable_ip | --0.97%--ip_finish_output2 ip_output __ip_queue_xmit __tcp_transmit_skb | --0.92%--tcp_write_xmit | --0.78%--__tcp_push_pending_frames tcp_sendmsg_locked tcp_sendmsg 36.09% swapper [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--23.84%--tcp_conn_request | tcp_rcv_state_process | | | |--11.94%--tcp_v6_do_rcv | | tcp_v6_rcv | | ip6_protocol_deliver_rcu | | ip6_input | | ipv6_rcv | | __netif_receive_skb | | process_backlog | | napi_poll | | net_rx_action | | __softirqentry_text_start | | asm_call_on_stack | | do_softirq_own_stack | | __irq_exit_rcu | | sysvec_apic_timer_interrupt | | asm_sysvec_apic_timer_interrupt | | | | | |--10.49%--cpuidle_enter_state | | | cpuidle_enter | | | do_idle | | | cpu_startup_entry | | | | | | | |--7.92%--secondary_startup_64 | | | | | | | --2.57%--start_kernel | | | secondary_startup_64 | | | | | --0.92%--poll_idle | | cpuidle_enter_state | | cpuidle_enter | | do_idle | | cpu_startup_entry | | | | | --0.68%--secondary_startup_64 | | | --11.90%--tcp_v4_do_rcv | tcp_v4_rcv | ip_protocol_deliver_rcu | ip_local_deliver | ip_rcv | __netif_receive_skb | process_backlog | napi_poll | net_rx_action | 
__softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | __irq_exit_rcu | sysvec_apic_timer_interrupt | asm_sysvec_apic_timer_interrupt | | | |--10.61%--cpuidle_enter_state | | cpuidle_enter | | do_idle | | cpu_startup_entry | | | | | |--7.92%--secondary_startup_64 | | | | | --2.69%--start_kernel | | secondary_startup_64 | | | --0.78%--poll_idle | cpuidle_enter_state | cpuidle_enter | do_idle | cpu_startup_entry | | | --0.60%--secondary_startup_64 | --12.23%--tcp_v4_syn_recv_sock tcp_v6_syn_recv_sock | |--7.45%--tcp_try_fastopen | tcp_conn_request | tcp_rcv_state_process | tcp_v4_do_rcv | tcp_v4_rcv | ip_protocol_deliver_rcu | ip_local_deliver | ip_rcv | __netif_receive_skb | process_backlog | napi_poll | net_rx_action | __softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | __irq_exit_rcu | sysvec_apic_timer_interrupt | asm_sysvec_apic_timer_interrupt | | | --6.62%--cpuidle_enter_state | cpuidle_enter | do_idle | cpu_startup_entry | | | |--4.85%--secondary_startup_64 | | | --1.77%--start_kernel | secondary_startup_64 | --4.78%--tcp_check_req tcp_v4_rcv ip_protocol_deliver_rcu ip_local_deliver ip_rcv __netif_receive_skb process_backlog napi_poll net_rx_action __softirqentry_text_start asm_call_on_stack do_softirq_own_stack __irq_exit_rcu sysvec_apic_timer_interrupt asm_sysvec_apic_timer_interrupt | --4.53%--cpuidle_enter_state cpuidle_enter do_idle cpu_startup_entry | |--3.48%--secondary_startup_64 | --1.05%--start_kernel secondary_startup_64 0.92% ksoftirqd/3 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | --0.59%--tcp_conn_request tcp_rcv_state_process 0.90% ksoftirqd/1 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | --0.62%--tcp_conn_request tcp_rcv_state_process 0.84% ksoftirqd/0 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | --0.56%--tcp_conn_request tcp_rcv_state_process 0.77% ksoftirqd/2 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | --0.51%--tcp_conn_request 
tcp_rcv_state_process 0.17% kworker/3:7-eve [kernel.kallsyms] [k] prandom_u32 0.15% kworker/1:2-eve [kernel.kallsyms] [k] prandom_u32 0.14% kworker/2:0-eve [kernel.kallsyms] [k] prandom_u32 0.13% kworker/0:5-eve [kernel.kallsyms] [k] prandom_u32 0.07% Xorg [kernel.kallsyms] [k] prandom_u32 0.04% avahi-daemon [kernel.kallsyms] [k] prandom_u32 0.04% ip [kernel.kallsyms] [k] prandom_u32 0.02% perf [kernel.kallsyms] [k] prandom_u32 0.01% rcu_sched [kernel.kallsyms] [k] prandom_u32 0.00% kworker/u16:2-i [kernel.kallsyms] [k] prandom_u32 0.00% ltp-pan [kernel.kallsyms] [k] prandom_u32 0.00% awk [kernel.kallsyms] [k] prandom_u32 0.00% mktemp [kernel.kallsyms] [k] prandom_u32 0.00% mysqld [kernel.kallsyms] [k] prandom_u32 0.00% systemd-journal [kernel.kallsyms] [k] prandom_u32 0.00% NetworkManager [kernel.kallsyms] [k] prandom_u32 0.00% QDBusConnection [kernel.kallsyms] [k] prandom_u32 0.00% gdbus [kernel.kallsyms] [k] prandom_u32 0.00% jbd2/sdc2-8 [kernel.kallsyms] [k] prandom_u32 0.00% kded5 [kernel.kallsyms] [k] prandom_u32 0.00% kworker/2:2-eve [kernel.kallsyms] [k] prandom_u32 0.00% kworker/3:0-eve [kernel.kallsyms] [k] prandom_u32 0.00% runltp [kernel.kallsyms] [k] prandom_u32 0.00% tcp_fastopen_ru [kernel.kallsyms] [k] prandom_u32 0.00% gnome-software [kernel.kallsyms] [k] prandom_u32 0.00% irq/35-iwlwifi [kernel.kallsyms] [k] prandom_u32 0.00% kworker/u16:1-p [kernel.kallsyms] [k] prandom_u32 0.00% org_kde_powerde [kernel.kallsyms] [k] prandom_u32 0.00% pool-org.gnome. [kernel.kallsyms] [k] prandom_u32 0.00% tst_net_iface_p [kernel.kallsyms] [k] prandom_u32 0.00% upowerd [kernel.kallsyms] [k] prandom_u32 0.00% xdg-desktop-por [kernel.kallsyms] [k] prandom_u32 # Samples: 0 of event 'dummy:HG' # Event count (approx.): 0 # # Overhead Command Shared Object Symbol # ........ ....... ............. ...... 
# # # (Tip: Customize output of perf script with: perf script -F event,ip,sym) # [-- Attachment #3: perf-report-2020-08-20.txt --] [-- Type: text/plain, Size: 21103 bytes --] # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 120K of event 'random:prandom_u32' # Event count (approx.): 120542 # # Overhead Command Shared Object Symbol # ........ ............... ................. ............... # 59.66% netstress [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--33.20%--tcp_v4_connect | __inet_stream_connect | | | |--22.13%--inet_stream_connect | | __sys_connect | | __x64_sys_connect | | __noinstr_text_start | | entry_SYSCALL_64_after_hwframe | | __libc_connect | | 0x65736e7500632e73 | | | --11.07%--tcp_sendmsg_locked | tcp_sendmsg | __sys_sendto | __x64_sys_sendto | __noinstr_text_start | entry_SYSCALL_64_after_hwframe | __libc_sendto | |--16.60%--tcp_v6_connect | __inet_stream_connect | | | |--11.07%--inet_stream_connect | | __sys_connect | | __x64_sys_connect | | __noinstr_text_start | | entry_SYSCALL_64_after_hwframe | | __libc_connect | | 0x65736e7500632e73 | | | --5.53%--tcp_sendmsg_locked | tcp_sendmsg | __sys_sendto | __x64_sys_sendto | __noinstr_text_start | entry_SYSCALL_64_after_hwframe | __libc_sendto | |--6.69%--tcp_conn_request | tcp_rcv_state_process | | | |--3.41%--tcp_v6_do_rcv | | tcp_v6_rcv | | ip6_protocol_deliver_rcu | | ip6_input | | ipv6_rcv | | __netif_receive_skb | | process_backlog | | napi_poll | | net_rx_action | | __softirqentry_text_start | | asm_call_on_stack | | do_softirq_own_stack | | | | | |--2.15%--__irq_exit_rcu | | | sysvec_apic_timer_interrupt | | | asm_sysvec_apic_timer_interrupt | | | | | --1.26%--do_softirq | | __local_bh_enable_ip | | | | | --1.22%--ip6_finish_output2 | | ip6_output | | ip6_xmit | | inet6_csk_xmit | | __tcp_transmit_skb | | | | | --1.13%--tcp_write_xmit | | | | | --0.99%--__tcp_push_pending_frames | | 
tcp_sendmsg_locked | | tcp_sendmsg | | | | | --0.56%--__sys_sendto | | __x64_sys_sendto | | __noinstr_text_start | | entry_SYSCALL_64_after_hwframe | | | --3.27%--tcp_v4_do_rcv | tcp_v4_rcv | ip_protocol_deliver_rcu | ip_local_deliver | ip_rcv | __netif_receive_skb | process_backlog | napi_poll | net_rx_action | __softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | | | |--1.94%--__irq_exit_rcu | | sysvec_apic_timer_interrupt | | asm_sysvec_apic_timer_interrupt | | | --1.33%--do_softirq | __local_bh_enable_ip | | | --1.29%--ip_finish_output2 | ip_output | __ip_queue_xmit | __tcp_transmit_skb | | | --1.21%--tcp_write_xmit | | | --1.02%--__tcp_push_pending_frames | tcp_sendmsg_locked | tcp_sendmsg | | | --0.67%--__sys_sendto | __x64_sys_sendto | __noinstr_text_start | entry_SYSCALL_64_after_hwframe | --3.16%--tcp_v4_syn_recv_sock tcp_v6_syn_recv_sock | --2.68%--tcp_try_fastopen tcp_conn_request tcp_rcv_state_process tcp_v4_do_rcv tcp_v4_rcv ip_protocol_deliver_rcu ip_local_deliver ip_rcv __netif_receive_skb process_backlog napi_poll net_rx_action __softirqentry_text_start asm_call_on_stack do_softirq_own_stack | |--1.62%--__irq_exit_rcu | sysvec_apic_timer_interrupt | asm_sysvec_apic_timer_interrupt | --1.07%--do_softirq __local_bh_enable_ip | --1.03%--ip_finish_output2 ip_output __ip_queue_xmit __tcp_transmit_skb | --0.98%--tcp_write_xmit | --0.82%--__tcp_push_pending_frames tcp_sendmsg_locked tcp_sendmsg | --0.53%--__sys_sendto __x64_sys_sendto __noinstr_text_start entry_SYSCALL_64_after_hwframe 36.30% swapper [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | |--23.99%--tcp_conn_request | tcp_rcv_state_process | | | |--12.11%--tcp_v4_do_rcv | | tcp_v4_rcv | | ip_protocol_deliver_rcu | | ip_local_deliver | | ip_rcv | | __netif_receive_skb | | process_backlog | | napi_poll | | net_rx_action | | __softirqentry_text_start | | asm_call_on_stack | | do_softirq_own_stack | | __irq_exit_rcu | | sysvec_apic_timer_interrupt | | 
asm_sysvec_apic_timer_interrupt | | | | | |--10.79%--cpuidle_enter_state | | | cpuidle_enter | | | do_idle | | | cpu_startup_entry | | | | | | | |--8.20%--secondary_startup_64 | | | | | | | --2.59%--start_kernel | | | secondary_startup_64 | | | | | --0.80%--poll_idle | | cpuidle_enter_state | | cpuidle_enter | | do_idle | | cpu_startup_entry | | | | | --0.62%--secondary_startup_64 | | | --11.88%--tcp_v6_do_rcv | tcp_v6_rcv | ip6_protocol_deliver_rcu | ip6_input | ipv6_rcv | __netif_receive_skb | process_backlog | napi_poll | net_rx_action | __softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | __irq_exit_rcu | sysvec_apic_timer_interrupt | asm_sysvec_apic_timer_interrupt | | | |--10.46%--cpuidle_enter_state | | cpuidle_enter | | do_idle | | cpu_startup_entry | | | | | |--7.92%--secondary_startup_64 | | | | | --2.55%--start_kernel | | secondary_startup_64 | | | --0.89%--poll_idle | cpuidle_enter_state | cpuidle_enter | do_idle | cpu_startup_entry | | | --0.69%--secondary_startup_64 | --12.30%--tcp_v4_syn_recv_sock tcp_v6_syn_recv_sock | |--7.52%--tcp_try_fastopen | tcp_conn_request | tcp_rcv_state_process | tcp_v4_do_rcv | tcp_v4_rcv | ip_protocol_deliver_rcu | ip_local_deliver | ip_rcv | __netif_receive_skb | process_backlog | napi_poll | net_rx_action | __softirqentry_text_start | asm_call_on_stack | do_softirq_own_stack | __irq_exit_rcu | sysvec_apic_timer_interrupt | asm_sysvec_apic_timer_interrupt | | | --6.67%--cpuidle_enter_state | cpuidle_enter | do_idle | cpu_startup_entry | | | |--4.94%--secondary_startup_64 | | | --1.73%--start_kernel | secondary_startup_64 | --4.79%--tcp_check_req tcp_v4_rcv ip_protocol_deliver_rcu ip_local_deliver ip_rcv __netif_receive_skb process_backlog napi_poll net_rx_action __softirqentry_text_start asm_call_on_stack do_softirq_own_stack __irq_exit_rcu sysvec_apic_timer_interrupt asm_sysvec_apic_timer_interrupt | --4.52%--cpuidle_enter_state cpuidle_enter do_idle cpu_startup_entry | |--3.59%--secondary_startup_64 
| --0.93%--start_kernel secondary_startup_64 0.89% ksoftirqd/3 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | --0.58%--tcp_conn_request tcp_rcv_state_process 0.87% ksoftirqd/1 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | --0.58%--tcp_conn_request tcp_rcv_state_process 0.76% ksoftirqd/0 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 | --0.52%--tcp_conn_request tcp_rcv_state_process 0.71% ksoftirqd/2 [kernel.kallsyms] [k] prandom_u32 | ---prandom_u32 prandom_u32 0.16% kworker/3:3-eve [kernel.kallsyms] [k] prandom_u32 0.14% kworker/1:2-eve [kernel.kallsyms] [k] prandom_u32 0.12% Xorg [kernel.kallsyms] [k] prandom_u32 0.12% kworker/0:5-eve [kernel.kallsyms] [k] prandom_u32 0.11% kworker/2:0-eve [kernel.kallsyms] [k] prandom_u32 0.03% avahi-daemon [kernel.kallsyms] [k] prandom_u32 0.03% ip [kernel.kallsyms] [k] prandom_u32 0.03% systemd-udevd [kernel.kallsyms] [k] prandom_u32 0.02% perf [kernel.kallsyms] [k] prandom_u32 0.01% rcu_sched [kernel.kallsyms] [k] prandom_u32 0.00% ltp-pan [kernel.kallsyms] [k] prandom_u32 0.00% NetworkManager [kernel.kallsyms] [k] prandom_u32 0.00% irq/35-iwlwifi [kernel.kallsyms] [k] prandom_u32 0.00% mktemp [kernel.kallsyms] [k] prandom_u32 0.00% mysqld [kernel.kallsyms] [k] prandom_u32 0.00% tcp_fastopen_ru [kernel.kallsyms] [k] prandom_u32 0.00% DiscoverNotifie [kernel.kallsyms] [k] prandom_u32 0.00% QSGRenderThread [kernel.kallsyms] [k] prandom_u32 0.00% kworker/1:0-eve [kernel.kallsyms] [k] prandom_u32 0.00% migration/3 [kernel.kallsyms] [k] prandom_u32 0.00% ns_create [kernel.kallsyms] [k] prandom_u32 0.00% ns_ifmove [kernel.kallsyms] [k] prandom_u32 0.00% runltp [kernel.kallsyms] [k] prandom_u32 0.00% QXcbEventQueue [kernel.kallsyms] [k] prandom_u32 0.00% Thread (pooled) [kernel.kallsyms] [k] prandom_u32 0.00% akonadi_followu [kernel.kallsyms] [k] prandom_u32 0.00% gnome-software [kernel.kallsyms] [k] prandom_u32 0.00% ln [kernel.kallsyms] [k] prandom_u32 0.00% mkdir 
[kernel.kallsyms] [k] prandom_u32 # Samples: 0 of event 'dummy:HG' # Event count (approx.): 0 # # Overhead Command Shared Object Symbol # ........ ....... ............. ...... # # # (Tip: Generate a script for your data: perf script -g <lang>) # ^ permalink raw reply [flat|nested] 48+ messages in thread
[parent not found: <CANEQ_+L+22Hkdqf38Zr0bfq16fcL1Ax2X9fToXV_niHKXCB8aA@mail.gmail.com>]
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable [not found] ` <CANEQ_+L+22Hkdqf38Zr0bfq16fcL1Ax2X9fToXV_niHKXCB8aA@mail.gmail.com> @ 2020-08-27 1:09 ` Willy Tarreau 2020-08-27 7:08 ` Sedat Dilek 0 siblings, 1 reply; 48+ messages in thread From: Willy Tarreau @ 2020-08-27 1:09 UTC (permalink / raw) To: Amit Klein Cc: Sedat Dilek, Eric Dumazet, George Spelvin, Linus Torvalds, Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, netdev Hi Amit, On Thu, Aug 27, 2020 at 02:06:39AM +0300, Amit Klein wrote: > Hi > > Is there an ETA for this patch then? No particular ETA on my side, I was waiting for potential criticisms before going further. I suspect that if nobody complains anymore, it's an implicit voucher and I'll have to clean up and post the series then. Thanks, Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-27 1:09 ` Willy Tarreau @ 2020-08-27 7:08 ` Sedat Dilek 0 siblings, 0 replies; 48+ messages in thread From: Sedat Dilek @ 2020-08-27 7:08 UTC (permalink / raw) To: Willy Tarreau Cc: Amit Klein, Eric Dumazet, George Spelvin, Linus Torvalds, Jason A. Donenfeld, Andy Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, netdev

On Thu, Aug 27, 2020 at 3:09 AM Willy Tarreau <w@1wt.eu> wrote:
>
> Hi Amit,
>
> On Thu, Aug 27, 2020 at 02:06:39AM +0300, Amit Klein wrote:
> > Hi
> >
> > Is there an ETA for this patch then?
>
> No particular ETA on my side, I was waiting for potential criticisms
> before going further. I suspect that if nobody complains anymore, it's
> an implicit voucher and I'll have to clean up and post the series then.
>

Hi Willy,

again, some feedback from me as a tester of the last patchset. I have tested it in several build-environments:

#1: GCC v10.2 + GNU/ld (BFD)
#2: GCC v10.2 + LLVM/ld.lld (LLD)
#3: GCC v10.2 + LLD plus all available LLVM "bin"utils v11.0.0-rc2 (replacements for GNU's nm, objdump, objcopy, ar, strings, ranlib, etc.)
#4: LLVM toolchain v11.0.0-rc2 and Clang-IAS (Integrated ASsembler)
#5: LLVM toolchain v11.0.0-rc2 and Clang-LTO (LinkTime Optimization)
#6: LLVM toolchain v11.0.0-rc2 and Clang-CFI (Control Flow Integrity)

What are you waiting for :-)? Feel free to add my...

Tested-by: Sedat Dilek <sedat.dilek@gmail.com>

If you are interested in Clang-IAS/Clang-LTO/Clang-CFI... I want to promote the LLVM MC (micro-conference) at Linux Plumbers Conference 2020 today (27-Aug-2020) - starting 16:00 UTC. AFAICS it is freely available via YouTube (untested by me).

- Sedat -

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Flaw in "random32: update the net random state on interrupt and activity" @ 2020-08-08 15:26 George Spelvin 2020-08-09 6:57 ` [DRAFT PATCH] random32: make prandom_u32() output unpredictable George Spelvin 0 siblings, 1 reply; 48+ messages in thread From: George Spelvin @ 2020-08-08 15:26 UTC (permalink / raw) To: netdev Cc: w, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, George Spelvin [-- Attachment #1: letter --] [-- Type: text/plain, Size: 9210 bytes --] I'm quite annoyed at the way that this discussion has drifted away from the main problem, which is that f227e3ec3b5c is broken and needs reverted. The original bug report is accurate, and it's a critical issue. My proposed fix is described (patch imminent) below. THE PROBLEM: The reseeding design in add_interrupt_randomness is fatally flawed. Quite simply, drip-feeding entropy into a PRNG is pissing the entropy straight down the drain. It's a complete waste of time *and* it fatally damages /dev/random. This is an information-theoretic flaw which cannot be patched by any amount of clever crypto. People don't seem to be grasping this fundamental concept. Ted, I'm particularly disappointed in you; you *know* better. First, why reseed at all? We do it to reduce an attacker's ability to predict the PRNG output, which in practice means to reduce their knowledge of the PRNG's internal state. So any discussion of reseeding begins by *assuming* an attacker has captured the PRNG state. If we weren't worried about this possibility, we wouldn't need to reseed in the first place! [Footnote: most crypto PRNGs also want rekeying before their cycle length limit is reached, but these limits are more than 2^64 bits nowadays and thus infinite in practice.] If we add k bits of seed entropy to the (attacker-known!) 
state and let an attacker see >= k bits of output, there is a trivial brute-force attack on *any* deterministic PRNG algorithm: try all 2^k possible inputs and choose the one that matches the observed output.

[Footnote: in real life, you don't have 2^k equally likely possibilities, so you end up trying possibilities in decreasing order of likelihood, giving the same average time to solve, but with a longer tail. The implementation details are messier, but well known; RTFM of any password-guessing software for details.]

If k is small, this is trivially easy. Reseeding many times (large n) doesn't help if you allow an attacker to observe PRNG output after each small reseeding; the effort is n * 2^k.

The way to prevent this from becoming a problem is well known: ensure that k is large. Save up the n seeds and reseed once, making the brute-force effort 2^(n*k). This is called "catastrophic reseeding" and is described in more detail at e.g. https://www.schneier.com/academic/paperfiles/paper-prngs.pdf

But the amount of entropy in any one interrupt timing sample is small. /dev/random assumes it lies somewhere between 1/64 and 2 bits. If our attacker is less skilled, they may have more subjective uncertainty, but guessing a TSC delta to within 20 bits (0.26 ms at 4 GHz) doesn't seem at all difficult. Testing 2^20 possibilities is trivial.

*This* is why Willy's patch is useless. No amount of deckchair rearrangement or bikeshed painting will fix this core structural flaw. It *must* be redesigned to do catastrophic reseeding.

Additionally, the entropy revealed by this attack is *also* the primary source for /dev/random, destroying its security guarantee. This elevates the problem from "useless" to "disastrous". This patch *must* be reverted for 5.8.1.

In response to various messages saying "this is all theoretical; I won't believe you until I see a PoC", all I can say is "I feel that you should be aware that some asshole is signing your name to stupid letters."
Just like a buffer overflow, a working attack is plausible using a combination of well-understood techniques. It's ridiculous to go to the effort of developing a full exploit when it's less work to fix the problem in the first place.

I also notice a lot of amateurish handwaving about the security of various primitives. This is particularly frustrating, but I will refrain from giving vent to my feelings, instead referring people to the various references linked from https://security.stackexchange.com/questions/18197/why-shouldnt-we-roll-our-own

A SOLUTION:

Now, how to fix it? First, what is the problem? "Non-cryptographic PRNGs are predictable" fits in the category of Not A Bug. There are many applications for a non-cryptographic PRNG in the kernel: self-tests, spreading wear across flash memory, randomized backoff among cooperating processes, and assigning IDs in protocols which aren't designed for DoS resistance in the first place.

But apparently the network code is (ab)using lib/random32.c for choosing TCP/UDP port numbers, and someone has found a (not publicly disclosed) attack which exploits the predictability to commit some mischief. And apparently switching to the fastest secure PRNG currently in the kernel (get_random_u32() using ChaCha + per-CPU buffers) would cause too much of a performance penalty.

So the request is for a less predictable PRNG which is still extremely fast. It's specifically okay if the crypto is half-assed; this is apparently some kind of nuisance (DoS?) attack rather than something really valuable. Gee, I seem to recall solving a very similar problem with the dcache hash function. I think I can do this.

An important design constraint is that we want low-latency random number generation, not just high bandwidth. Amortizing over bulk operations is *not* okay. Well, the best crypto primitive I know of for such an application is SipHash. Its 4x 64-bit words of state are only twice the size of the current struct rnd_state.
Converting it from a pseudo-random function to a CRNG is some half-assed amateur cryptography, but at least it's a robust and well-studied primitive. So here's my proposal, for comment: (I'll post an RFC patch shortly.)

- Replace the prandom_u32() generator with something that does 2 rounds of SipHash. (Could drop to 1 round if we get complaints.)
- Keep the per-CPU structure, to minimize cacheline bouncing.
- On 32-bit processors, use HSipHash to keep performance up. In the spirit of half-assedness, this is weaker, but hopefully good enough; and while there are lots of such processors in routers and IoT devices, they aren't handling thousands of connections per second and so expose much less PRNG output.
- Using random_ready_callback, and periodically thereafter, do a full catastrophic reseed from the ChaCha generator. There's no need for locking; tearing on updates doesn't do any harm.

Current plans I'm open to discussion about:

- Replace the current prandom_u32(), rather than adding yet another PRNG. This keeps the patch smaller.
- Leave the 60-second reseed interval alone. If someone can suggest a way to suppress reseeding when not needed, we could drop the interval without penalizing mobile devices.
- Leave the prandom_u32_state() code paths alone. Those functions are used in self-test code and it's probably worth not changing the output. (The downside is misleading function names.)
- For robustness, seed each CPU's PRNG state independently. We could use a global key and use SipHash itself to hash in the CPU number, which would be faster than ChaCha, but let's KISS.

INDIVIDUAL REPLIES:

Jason A. Donenfeld: You're quite right about the accretion of superstitious incantations in random.c, but I do think it's a fundamentally sound design, just with an inelegant implementation. A comparison to SHACAL1 is not appropriate.
SHACAL is invertible because it leaves off the final Davies-Meyer add-back; random.c uses SHA-1 *with* the add-back, so the compression function is not invertible. The most maliciously backdoored rdrand implementation imaginable can't analyze the entropy pool and choose its output so as to leak data to the Bavarian Illuminati. That's laughably impractical.

Willy Tarreau: Middle square + Weyl sequence isn't even *close* to crypto-quality. And you're badly misunderstanding what the fast_pool is doing. Please stop trying to invent your own crypto primitives; see https://www.schneier.com/crypto-gram/archives/1998/1015.html#cipherdesign

Ted Ts'o: Actually, I'm (slowly) working through a complete audit of all RNG users in the kernel. Current code offers four basic levels of security:

- Trivial LCG (next_pseudo_random32, preserved for compatibility)
- Trivially predictable (but statistically good) LFSR
- Cryptographically strong ChaCha, batched
- Cryptographically strong ChaCha, with anti-backtracking.

(Because I'm working strong-to-weak, I've mostly downgraded call sites so far. There's also the earlyrandom stuff, which is its own mess.) Do we want an additional "hopefully strong" level in the middle for TCP ports and other visible IDs only, or should we replace the LFSR and not have a conventional PRNG at all?

Andy Lutomirski: I also have some code for combining the 32- and 64-bit entropy batches, and I've generalized it to return bytes as well, because *most* users of get_random_bytes() in the kernel don't need anti-backtracking. But my understanding from Linus is that for this application, amortized time is *not* okay; we want a fast 99th percentile.

PERSONAL NOTE:

Events this year have left me without the bandwidth to keep up with kernel development, and so I've been AFK for some months. My problems haven't gone away, but this issue is important enough that I'll be making time to see it through to resolution. ^ permalink raw reply [flat|nested] 48+ messages in thread
* [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-08 15:26 Flaw in "random32: update the net random state on interrupt and activity" George Spelvin @ 2020-08-09 6:57 ` George Spelvin 2020-08-09 9:38 ` Willy Tarreau ` (2 more replies) 0 siblings, 3 replies; 48+ messages in thread From: George Spelvin @ 2020-08-09 6:57 UTC (permalink / raw) To: netdev Cc: w, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw, George Spelvin [-- Attachment #1: 0001-random32-Make-prandom_u32-output-difficult-to-predic.patch --] [-- Type: text/plain, Size: 15932 bytes --]

Non-cryptographic PRNGs may have great statistical properties, but are usually trivially predictable to someone who knows the algorithm, given a small sample of their output. An LFSR like prandom_u32() is particularly simple, even if the sample is widely scattered bits.

It turns out the network stack uses prandom_u32() for some things like random port numbers which it would prefer are *not* trivially predictable. Predictability led to a practical DNS spoofing attack. Oops.

This patch replaces the LFSR with a homebrew cryptographic PRNG based on the SipHash round function, which is in turn seeded with 128 bits of strong random key. (The authors of SipHash have *not* been consulted about this abuse of their algorithm.) Speed is prioritized over security; attacks are rare, while performance is always wanted.

Replacing all callers of prandom_u32() is the quick fix. Whether to reinstate a weaker PRNG for uses which can tolerate it is an open question.

TODO: The prandom_seed() function is currently a no-op stub. It's not obvious how to use this additional data, so suggestions are invited.

f227e3ec3b5c ("random32: update the net random state on interrupt and activity") was an earlier attempt at a solution. This patch replaces it; revert it before applying this one. 
Reported-by: Amit Klein <aksecurity@gmail.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Eric Dumazet <edumazet@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Fixes: f227e3ec3b5c ("random32: update the net random state on interrupt and activity") Replaces: f227e3ec3b5c ("random32: update the net random state on interrupt and activity") Signed-off-by: George Spelvin <lkml@sdf.org> --- lib/random32.c | 437 ++++++++++++++++++++++++++++++------------------- 1 file changed, 267 insertions(+), 170 deletions(-) I haven't yet rebooted, so this is only build-tested, but it should illustrate the idea. diff --git a/lib/random32.c b/lib/random32.c index 763b920a6206c..2b048e2ea99f6 100644 --- a/lib/random32.c +++ b/lib/random32.c @@ -48,8 +48,6 @@ static inline void prandom_state_selftest(void) } #endif -static DEFINE_PER_CPU(struct rnd_state, net_rand_state) __latent_entropy; - /** * prandom_u32_state - seeded pseudo-random number generator. * @state: pointer to state structure holding seeded state. @@ -69,25 +67,6 @@ u32 prandom_u32_state(struct rnd_state *state) } EXPORT_SYMBOL(prandom_u32_state); -/** - * prandom_u32 - pseudo random number generator - * - * A 32 bit pseudo-random number is generated using a fast - * algorithm suitable for simulation. This algorithm is NOT - * considered safe for cryptographic use. 
- */ -u32 prandom_u32(void) -{ - struct rnd_state *state = &get_cpu_var(net_rand_state); - u32 res; - - res = prandom_u32_state(state); - put_cpu_var(net_rand_state); - - return res; -} -EXPORT_SYMBOL(prandom_u32); - /** * prandom_bytes_state - get the requested number of pseudo-random bytes * @@ -119,20 +98,6 @@ void prandom_bytes_state(struct rnd_state *state, void *buf, size_t bytes) } EXPORT_SYMBOL(prandom_bytes_state); -/** - * prandom_bytes - get the requested number of pseudo-random bytes - * @buf: where to copy the pseudo-random bytes to - * @bytes: the requested number of bytes - */ -void prandom_bytes(void *buf, size_t bytes) -{ - struct rnd_state *state = &get_cpu_var(net_rand_state); - - prandom_bytes_state(state, buf, bytes); - put_cpu_var(net_rand_state); -} -EXPORT_SYMBOL(prandom_bytes); - static void prandom_warmup(struct rnd_state *state) { /* Calling RNG ten times to satisfy recurrence condition */ @@ -148,96 +113,6 @@ static void prandom_warmup(struct rnd_state *state) prandom_u32_state(state); } -static u32 __extract_hwseed(void) -{ - unsigned int val = 0; - - (void)(arch_get_random_seed_int(&val) || - arch_get_random_int(&val)); - - return val; -} - -static void prandom_seed_early(struct rnd_state *state, u32 seed, - bool mix_with_hwseed) -{ -#define LCG(x) ((x) * 69069U) /* super-duper LCG */ -#define HWSEED() (mix_with_hwseed ? __extract_hwseed() : 0) - state->s1 = __seed(HWSEED() ^ LCG(seed), 2U); - state->s2 = __seed(HWSEED() ^ LCG(state->s1), 8U); - state->s3 = __seed(HWSEED() ^ LCG(state->s2), 16U); - state->s4 = __seed(HWSEED() ^ LCG(state->s3), 128U); -} - -/** - * prandom_seed - add entropy to pseudo random number generator - * @entropy: entropy value - * - * Add some additional entropy to the prandom pool. - */ -void prandom_seed(u32 entropy) -{ - int i; - /* - * No locking on the CPUs, but then somewhat random results are, well, - * expected. 
- */ - for_each_possible_cpu(i) { - struct rnd_state *state = &per_cpu(net_rand_state, i); - - state->s1 = __seed(state->s1 ^ entropy, 2U); - prandom_warmup(state); - } -} -EXPORT_SYMBOL(prandom_seed); - -/* - * Generate some initially weak seeding values to allow - * to start the prandom_u32() engine. - */ -static int __init prandom_init(void) -{ - int i; - - prandom_state_selftest(); - - for_each_possible_cpu(i) { - struct rnd_state *state = &per_cpu(net_rand_state, i); - u32 weak_seed = (i + jiffies) ^ random_get_entropy(); - - prandom_seed_early(state, weak_seed, true); - prandom_warmup(state); - } - - return 0; -} -core_initcall(prandom_init); - -static void __prandom_timer(struct timer_list *unused); - -static DEFINE_TIMER(seed_timer, __prandom_timer); - -static void __prandom_timer(struct timer_list *unused) -{ - u32 entropy; - unsigned long expires; - - get_random_bytes(&entropy, sizeof(entropy)); - prandom_seed(entropy); - - /* reseed every ~60 seconds, in [40 .. 80) interval with slack */ - expires = 40 + prandom_u32_max(40); - seed_timer.expires = jiffies + msecs_to_jiffies(expires * MSEC_PER_SEC); - - add_timer(&seed_timer); -} - -static void __init __prandom_start_seed_timer(void) -{ - seed_timer.expires = jiffies + msecs_to_jiffies(40 * MSEC_PER_SEC); - add_timer(&seed_timer); -} - void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state) { int i; @@ -257,51 +132,6 @@ void prandom_seed_full_state(struct rnd_state __percpu *pcpu_state) } EXPORT_SYMBOL(prandom_seed_full_state); -/* - * Generate better values after random number generator - * is fully initialized. - */ -static void __prandom_reseed(bool late) -{ - unsigned long flags; - static bool latch = false; - static DEFINE_SPINLOCK(lock); - - /* Asking for random bytes might result in bytes getting - * moved into the nonblocking pool and thus marking it - * as initialized. In this case we would double back into - * this function and attempt to do a late reseed. 
- * Ignore the pointless attempt to reseed again if we're - * already waiting for bytes when the nonblocking pool - * got initialized. - */ - - /* only allow initial seeding (late == false) once */ - if (!spin_trylock_irqsave(&lock, flags)) - return; - - if (latch && !late) - goto out; - - latch = true; - prandom_seed_full_state(&net_rand_state); -out: - spin_unlock_irqrestore(&lock, flags); -} - -void prandom_reseed_late(void) -{ - __prandom_reseed(true); -} - -static int __init prandom_reseed(void) -{ - __prandom_reseed(false); - __prandom_start_seed_timer(); - return 0; -} -late_initcall(prandom_reseed); - #ifdef CONFIG_RANDOM32_SELFTEST static struct prandom_test1 { u32 seed; @@ -463,3 +293,270 @@ static void __init prandom_state_selftest(void) pr_info("prandom: %d self tests passed\n", runs); } #endif + + + +/* + * The prandom_u32() implementation is now completely separate from the + * prandom_state() functions, which are retained (for now) for compatibility. + * + * Because of (ab)use in the networking code for choosing random TCP/UDP port + * numbers, which open DoS possibilities if guessable, we want something + * stronger than a standard PRNG. But the performance requirements of + * the network code do not allow robust crypto for this application. + * + * So this is a homebrew Junior Spaceman implementation, based on the + * lowest-latency trustworthy crypto primitive available, SipHash. + * (The authors of SipHash have not been consulted about this abuse of + * their work.) + * + * Standard SipHash-2-4 uses 2n+4 rounds to hash n words of input to + * one word of output. This abbreviated version uses 2 rounds per word + * of output. + */ + +struct siprand_state { + unsigned long v[4]; +}; + +static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; + +#if BITS_PER_LONG == 64 +/* + * The core SipHash round function. Each line can be executed in + * parallel given enough CPU resources. 
+ */ +#define SIPROUND(v0,v1,v2,v3) ( \ + v0 += v1, v1 = rol64(v1, 13), v2 += v3, v3 = rol64(v3, 16), \ + v1 ^= v0, v0 = rol64(v0, 32), v3 ^= v2, \ + v0 += v3, v3 = rol64(v3, 21), v2 += v1, v1 = rol64(v1, 17), \ + v3 ^= v0, v1 ^= v2, v2 = rol64(v2, 32) ) +#define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 ) +#define K1 (0x646f72616e646f6d ^ 0x7465646279746573 ) + +#elif BITS_PER_LONG == 32 +/* + * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash. + * This is weaker, but 32-bit machines are not used for high-traffic + * applications, so there is less output for an attacker to analyze. + */ +#define SIPROUND(v0,v1,v2,v3) ( \ + v0 += v1, v1 = rol32(v1, 5), v2 += v3, v3 = rol32(v3, 8), \ + v1 ^= v0, v0 = rol32(v0, 16), v3 ^= v2, \ + v0 += v3, v3 = rol32(v3, 7), v2 += v1, v1 = rol32(v1, 13), \ + v3 ^= v0, v1 ^= v2, v2 = rol32(v2, 16) ) +#define K0 0x6c796765 +#define K1 0x74656462 + +#else +#error Unsupported BITS_PER_LONG +#endif + +/* + * This is the core CPRNG function. As "pseudorandom", this is not used + * for truly valuable things, just intended to be a PITA to guess. + * For maximum speed, we do just two SipHash rounds per word. This is + * the same rate as 4 rounds per 64 bits that SipHash normally uses, + * so hopefully it's reasonably secure. + * + * There are two changes from the official SipHash finalization: + * - We omit some constants XORed with v2 in the SipHash spec as irrelevant; + * they are there only to make the output rounds distinct from the input + * rounds, and this application has no input rounds. + * - Rather than returning v0^v1^v2^v3, return v1+v3. + * If you look at the SipHash round, the last operation on v3 is + * "v3 ^= v0", so "v0 ^ v3" just undoes that, a waste of time. + * Likewise "v1 ^= v2". (The rotate of v2 makes a difference, but + * it still cancels out half of the bits in v2 for no benefit.) 
+ * Second, since the last combining operation was xor, continue the + * pattern of alternating xor/add for a tiny bit of extra non-linearity. + */ +static u32 siprand_u32(struct siprand_state *s) +{ + unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; + + SIPROUND(v0, v1, v2, v3); + SIPROUND(v0, v1, v2, v3); + s->v[0] = v0; s->v[1] = v1; s->v[2] = v2; s->v[3] = v3; + return v1 + v3; +} + + +/** + * prandom_u32 - pseudo random number generator + * + * A 32 bit pseudo-random number is generated using a fast + * algorithm suitable for simulation. This algorithm is NOT + * considered safe for cryptographic use. + */ +u32 prandom_u32(void) +{ + struct siprand_state *state = get_cpu_ptr(&net_rand_state); + u32 res = siprand_u32(state); + + put_cpu_ptr(&net_rand_state); + return res; +} +EXPORT_SYMBOL(prandom_u32); + +/** + * prandom_bytes - get the requested number of pseudo-random bytes + * @buf: where to copy the pseudo-random bytes to + * @bytes: the requested number of bytes + */ +void prandom_bytes(void *buf, size_t bytes) +{ + struct siprand_state *state = get_cpu_ptr(&net_rand_state); + u8 *ptr = buf; + + while (bytes >= sizeof(u32)) { + put_unaligned(siprand_u32(state), (u32 *)ptr); + ptr += sizeof(u32); + bytes -= sizeof(u32); + } + + if (bytes > 0) { + u32 rem = siprand_u32(state); + do { + *ptr++ = (u8)rem; + rem >>= BITS_PER_BYTE; + } while (--bytes > 0); + } + put_cpu_ptr(&net_rand_state); +} +EXPORT_SYMBOL(prandom_bytes); + +/** + * prandom_seed - add entropy to pseudo random number generator + * @entropy: entropy value + * + * Add some additional seed material to the prandom pool. + * The "entropy" is actually our IP address (the only caller is + * the network code), not for unpredictability, but to ensure that + * different machines are initialized differently. + */ +void prandom_seed(u32 entropy) +{ + /* FIXME: Where to feed this in? 
*/ } +EXPORT_SYMBOL(prandom_seed); + +/* + * Generate some initially weak seeding values to allow + * the prandom_u32() engine to be started. + */ +static int __init prandom_init_early(void) +{ + int i; + unsigned long v0, v1, v2, v3; + + if (!arch_get_random_long(&v0)) + v0 = jiffies; + if (!arch_get_random_long(&v1)) + v1 = random_get_entropy(); + v2 = v0 ^ K0; + v3 = v1 ^ K1; + + for_each_possible_cpu(i) { + struct siprand_state *state; + + v3 ^= i; + SIPROUND(v0, v1, v2, v3); + SIPROUND(v0, v1, v2, v3); + v0 ^= i; + + state = per_cpu_ptr(&net_rand_state, i); + state->v[0] = v0; state->v[1] = v1; + state->v[2] = v2; state->v[3] = v3; + } + + return 0; +} +core_initcall(prandom_init_early); + + +/* Stronger reseeding when available, and periodically thereafter. */ +static void prandom_reseed(struct timer_list *unused); + +static DEFINE_TIMER(seed_timer, prandom_reseed); + +static void prandom_reseed(struct timer_list *unused) +{ + unsigned long expires; + int i; + + /* + * Reinitialize each CPU's PRNG with 128 bits of key. + * No locking on the CPUs, but then somewhat random results are, + * well, expected. + */ + for_each_possible_cpu(i) { + struct siprand_state *state; + unsigned long v0 = get_random_long(), v2 = v0 ^ K0; + unsigned long v1 = get_random_long(), v3 = v1 ^ K1; +#if BITS_PER_LONG == 32 + int j; + + /* + * On 32-bit machines, hash in two extra words to + * approximate 128-bit key length. Not that the hash + * has that much security, but this prevents a trivial + * 64-bit brute force. + */ + for (j = 0; j < 2; j++) { + unsigned long m = get_random_long(); + v3 ^= m; + SIPROUND(v0, v1, v2, v3); + SIPROUND(v0, v1, v2, v3); + v0 ^= m; + } +#endif + /* + * Probably impossible in practice, but there is a + * theoretical risk that a race between this reseeding + * and the target CPU writing its state back could + * create the all-zero SipHash fixed point. + * + * To ensure that never happens, ensure the state + * we write contains no zero words. 
+ */ + state = per_cpu_ptr(&net_rand_state, i); + WRITE_ONCE(state->v[0], v0 ? v0 : -1ul); + WRITE_ONCE(state->v[1], v1 ? v1 : -1ul); + WRITE_ONCE(state->v[2], v2 ? v2 : -1ul); + WRITE_ONCE(state->v[3], v3 ? v3 : -1ul); + } + + /* reseed every ~60 seconds, in [40 .. 80) interval with slack */ + expires = round_jiffies(jiffies + 40 * HZ + prandom_u32_max(40 * HZ)); + mod_timer(&seed_timer, expires); +} + +/* + * The random ready callback can be called from almost any interrupt. + * To avoid worrying about whether it's safe to delay that interrupt + * long enough to seed all CPUs, just schedule an immediate timer event. + */ +static void prandom_timer_start(struct random_ready_callback *unused) +{ + mod_timer(&seed_timer, jiffies); +} + +/* + * Start periodic full reseeding as soon as strong + * random numbers are available. + */ +static int __init prandom_init_late(void) +{ + static struct random_ready_callback random_ready = { + .func = prandom_timer_start + }; + int ret = add_random_ready_callback(&random_ready); + + if (ret == -EALREADY) { + prandom_timer_start(&random_ready); + ret = 0; + } + return ret; +} +late_initcall(prandom_init_late); -- 2.28.0 ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-09 6:57 ` [DRAFT PATCH] random32: make prandom_u32() output unpredictable George Spelvin @ 2020-08-09 9:38 ` Willy Tarreau 2020-08-09 17:06 ` George Spelvin [not found] ` <fdbc7d7d-cba2-ef94-9bde-b3ccae0cfaac@gmail.com> 2020-08-09 13:50 ` Randy Dunlap [not found] ` <CANEQ_++a4YcwQQ2XhuguTono9=RxbSRVsMw08zLWBWJ_wxG2AQ@mail.gmail.com> 2 siblings, 2 replies; 48+ messages in thread From: Willy Tarreau @ 2020-08-09 9:38 UTC (permalink / raw) To: George Spelvin Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw Hi George, On Sun, Aug 09, 2020 at 06:57:44AM +0000, George Spelvin wrote: > +/* > + * This is the core CPRNG function. As "pseudorandom", this is not used > + * for truly valuable things, just intended to be a PITA to guess. > + * For maximum speed, we do just two SipHash rounds per word. This is > + * the same rate as 4 rounds per 64 bits that SipHash normally uses, > + * so hopefully it's reasonably secure. > + * > + * There are two changes from the official SipHash finalization: > + * - We omit some constants XORed with v2 in the SipHash spec as irrelevant; > + * they are there only to make the output rounds distinct from the input > + * rounds, and this application has no input rounds. > + * - Rather than returning v0^v1^v2^v3, return v1+v3. > + * If you look at the SipHash round, the last operation on v3 is > + * "v3 ^= v0", so "v0 ^ v3" just undoes that, a waste of time. > + * Likewise "v1 ^= v2". (The rotate of v2 makes a difference, but > + * it still cancels out half of the bits in v2 for no benefit.) > + * Second, since the last combining operation was xor, continue the > + * pattern of alternating xor/add for a tiny bit of extra non-linearity. 
> + */ > +static u32 siprand_u32(struct siprand_state *s) > +{ > + unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; > + > + SIPROUND(v0, v1, v2, v3); > + SIPROUND(v0, v1, v2, v3); > + s->v[0] = v0; s->v[1] = v1; s->v[2] = v2; s->v[3] = v3; > + return v1 + v3; > +} > + > + > +/** > + * prandom_u32 - pseudo random number generator > + * > + * A 32 bit pseudo-random number is generated using a fast > + * algorithm suitable for simulation. This algorithm is NOT > + * considered safe for cryptographic use. > + */ > +u32 prandom_u32(void) > +{ > + struct siprand_state *state = get_cpu_ptr(&net_rand_state); > + u32 res = siprand_u32(state); > + > + put_cpu_ptr(&net_rand_state); > + return res; > +} So I gave it a quick test under Qemu and it didn't show any obvious performance difference compared to Tausworthe, which is a good thing, even though there's a significant amount of measurement noise in each case. However it keeps the problem that the whole sequence is entirely determined at the moment of reseeding, so if one were to be able to access the state, e.g. using l1tf/spectre/meltdown/whatever, then this state could be used to predict the whole ongoing sequence for the next minute. What some view as a security feature, others will see as a backdoor :-/ That's why I really like the noise approach. Even just the below would significantly harm that capability because that state alone isn't sufficient anymore to pre-compute all future values: --- a/lib/random32.c +++ b/lib/random32.c @@ -375,6 +375,7 @@ static u32 siprand_u32(struct siprand_state *s) { unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; + v0 += get_cycles(); SIPROUND(v0, v1, v2, v3); SIPROUND(v0, v1, v2, v3); s->v[0] = v0; s->v[1] = v1; s->v[2] = v2; s->v[3] = v3; Regards, Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-09 9:38 ` Willy Tarreau @ 2020-08-09 17:06 ` George Spelvin 2020-08-09 17:33 ` Willy Tarreau [not found] ` <fdbc7d7d-cba2-ef94-9bde-b3ccae0cfaac@gmail.com> 1 sibling, 1 reply; 48+ messages in thread From: George Spelvin @ 2020-08-09 17:06 UTC (permalink / raw) To: Willy Tarreau Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw, George Spelvin On Sun, Aug 09, 2020 at 11:38:05AM +0200, Willy Tarreau wrote: > So I gave it a quick test under Qemu and it didn't show any obvious > performance difference compared to Tausworthe, which is a good thing, > even though there's a significant amount of measurement noise in each > case. Thank you very much! I'm not quite sure how to benchmark this. The whole idea is that it's *not* used in a tight cache-hot loop. Hopefully someone already has a test setup so I don't have to invent one. > However it keeps the problem that the whole sequence is entirely > determined at the moment of reseeding, so if one were to be able to > access the state, e.g. using l1tf/spectre/meltdown/whatever, then > this state could be used to predict the whole ongoing sequence for > the next minute. What some view as a security feature, others will > see as a backdoor :-/ That's why I really like the noise approach. 
> Even just the below would significantly harm that capability because > that state alone isn't sufficient anymore to pre-compute all future > values: > > --- a/lib/random32.c > +++ b/lib/random32.c > @@ -375,6 +375,7 @@ static u32 siprand_u32(struct siprand_state *s) > { > unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; > > + v0 += get_cycles(); > SIPROUND(v0, v1, v2, v3); > SIPROUND(v0, v1, v2, v3); > s->v[0] = v0; s->v[1] = v1; s->v[2] = v2; s->v[3] = v3; As long as: 1) The periodic catastrophic reseeding remains, and 2) You use fresh measurements, not the exact same bits that add_*_randomness feeds into /dev/random then it doesn't do any real harm, so if it makes you feel better... But I really want to stress how weak a design drip-reseeding is. If an attacker has enough local machine access to do a meltdown-style attack, then they can calibrate the TSC used in get_cycles very accurately, so the remaining timing uncertainty is very low. This makes a brute-force attack on one or two reseedings quite easy. I.e. if you can see every other output, It's straightforward to figure out the ones in between. I wonder if, on general principles, it would be better to use a more SipHash style mixing in of the sample: m = get_cycles(); v3 ^= m; SIPROUND(v0, v1, v2, v3); SIPROUND(v0, v1, v2, v3); v0 ^= m; Not sure if it's worth the extra register (and associated spill/fill). ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-09 17:06 ` George Spelvin @ 2020-08-09 17:33 ` Willy Tarreau 2020-08-09 18:30 ` George Spelvin 0 siblings, 1 reply; 48+ messages in thread From: Willy Tarreau @ 2020-08-09 17:33 UTC (permalink / raw) To: George Spelvin Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw On Sun, Aug 09, 2020 at 05:06:39PM +0000, George Spelvin wrote: > On Sun, Aug 09, 2020 at 11:38:05AM +0200, Willy Tarreau wrote: > > So I gave it a quick test under Qemu and it didn't show any obvious > > performance difference compared to Tausworthe, which is a good thing, > > even though there's a significant amount of measurement noise in each > > case. > > Thank you very much! I'm not quite sure how to benchmark this. > The whole idea is that it's *not* used in a tight cache-hot loop. > Hopefully someone already has a test setup so I don't have to invent > one. Due to limited access to representative hardware, the two main tests I've been running were on my laptop in qemu, and consisted of:

- a connect-accept-close test to stress the accept() code and verify we don't observe a significant drop. The thing is that connect() usually is much slower, and running the two on the same machine tends to significantly soften the differences compared to what a real machine would see when handling a DDoS for example.

- a packet rate test through this rule (which uses prandom_u32() for each packet and which matches what can be done in packet schedulers or just by users having to deal with random drop): iptables -I INPUT -m statistic --probability 0.5 -j ACCEPT

While these ones are not very relevant, especially in a VM, not seeing significant variations tends to indicate we should not see a catastrophic loss. > > However it keeps the problem that the whole sequence is entirely > > determined at the moment of reseeding, so if one were to be able to > > access the state, e.g. 
using l1tf/spectre/meltdown/whatever, then > > this state could be used to predict the whole ongoing sequence for > > the next minute. What some view as a security feature, others will > > see as a backdoor :-/ That's why I really like the noise approach. > > Even just the below would significantly harm that capability because > > that state alone isn't sufficient anymore to pre-compute all future > > values: > > > > --- a/lib/random32.c > > +++ b/lib/random32.c > > @@ -375,6 +375,7 @@ static u32 siprand_u32(struct siprand_state *s) > > { > > unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; > > > > + v0 += get_cycles(); > > SIPROUND(v0, v1, v2, v3); > > SIPROUND(v0, v1, v2, v3); > > s->v[0] = v0; s->v[1] = v1; s->v[2] = v2; s->v[3] = v3; > > As long as: > 1) The periodic catastrophic reseeding remains, and > 2) You use fresh measurements, not the exact same bits > that add_*_randomness feeds into /dev/random > then it doesn't do any real harm, so if it makes you feel better... > > But I really want to stress how weak a design drip-reseeding is. > > If an attacker has enough local machine access to do a meltdown-style attack, > then they can calibrate the TSC used in get_cycles very accurately, Absolutely. > so the > remaining timing uncertainty is very low. Not that low in fact because they don't know precisely when the call is made. I mean, let's say we're in the worst case, with two VMs running on two siblings of the same core, with the same TSC, on a 3 GHz machine. The attacker can stress the victim at 100k probes per second. That's still 15 bits of uncertainty on the TSC value estimation which is added to each call. Even on the first call this is enough to make a source port unguessable, and preventing the attacker from staying synchronized with its victim. And I'm only speaking about an idle remote machine, not even one taking unobservable traffic, which further adds to the difficulty. 
> This makes a brute-force attack on > one or two reseedings quite easy. I.e. if you can see every other output, > It's straightforward to figure out the ones in between. But they already become useless because you're only observing stuff of the past. > I wonder if, on general principles, it would be better to use a more > SipHash style mixing in of the sample: > m = get_cycles(); > v3 ^= m; > SIPROUND(v0, v1, v2, v3); > SIPROUND(v0, v1, v2, v3); > v0 ^= m; Probably, if it's the recommended way to mix in other values, yes. > Not sure if it's worth the extra register (and associated spill/fill). If this makes the hash better, maybe. I can run some tests on this as well. I'd really need to try on a cleaner setup, I have remote machines at the office but I don't feel safe enough to remotely reboot them and risk to lose them :-/ I'll also test on arm/arm64 to make sure we don't introduce a significant cost there. Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-09 17:33 ` Willy Tarreau @ 2020-08-09 18:30 ` George Spelvin 2020-08-09 19:16 ` Willy Tarreau 2020-08-10 11:47 ` Willy Tarreau 0 siblings, 2 replies; 48+ messages in thread From: George Spelvin @ 2020-08-09 18:30 UTC (permalink / raw) To: Willy Tarreau Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw, George Spelvin On Sun, Aug 09, 2020 at 07:33:03PM +0200, Willy Tarreau wrote: > Not that low in fact because they don't know precisely when the call is > made. I mean, let's say we're in the worst case, with two VMs running on > two siblings of the same core, with the same TSC, on a 3 GHz machine. The > attacker can stress the victim at 100k probes per second. That's still > 15 bits of uncertainty on the TSC value estimation which is added to each > call. Even on the first call this is enough to make a source port > unguessable, and preventing the attacker from staying synchronized with > its victim. And I'm only speaking about an idle remote machine, not even > one taking unobservable traffic, which further adds to the difficulty. I'm trying to understand your attack scenario. I'm assuming that an attacker can call prandom_u32() locally. (I don't have a specific code path, but given the number of uses in the kernel, I assume *one* of them will leak the output directly.) And repeat the call fast enough that there's at most *one* other user between our calls. If an attacker knows the initial state, does an rdtsc, prandom_u32(), and a second rdtsc, then they can guess the TSC value used in than prandom_u32() quite accurately (4-6 bits fuzz, perhaps). This is trivial to brute force. The fun comes if someone else does a prandom_u32() call in between. All of a sudden, the 4-6 bit brute force of one get_cycles() value fails to find a solution. Someone else has called prandom_u32()! 
Now we have 15 bits of uncertainty about that other call, and 5 bits of uncertainty about our call. 2^20 possibilities only takes a few milliseconds to test, and the 32-bit output of prandom_u32() can verify a guess with minimal probability of error. (Note that, to maintain tracking, we have to keep hammering prandom_u32() *during* the search, but we can just buffer the results and process them after the expensive search is complete.)

What you can see here is the incredible power of *multiple* unobserved seedings. As long as an attacker can limit things to one unobserved prandom_u32(), it's a simple brute force. If there are more than one, the additional bits of uncertainty quickly make things impractical.

This is why I'm so keen on less frequent, more catastrophic, reseeding. Yes, the delay means an attacker who has captured the state retains full knowledge for longer. But they get kicked off as soon as the catastrophe happens. Without it, they can keep tracking the state indefinitely.

Even something simple like buffering 8 TSC samples, and adding them at 32-bit offsets across the state every 8th call, would make a huge difference. Even if 7 of the timestamps are from attacker calls (4 bits uncertainty), the time of the target call is 8x less known (so it goes from 15 to 18 bits), and the total becomes 46 bits. *So* much better.

> I can run some tests on this as > well. I'd really need to try on a cleaner setup, I have remote machines > at the office but I don't feel safe enough to remotely reboot them and > risk to lose them :-/ Yeah, test kernels are nervous-making that way. > I'll also test on arm/arm64 to make sure we don't introduce a significant > cost there. I don't expect a problem, but SipHash is optimized for 4-issue processors, and on narrower machines fewer instructions are "free". ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-09 18:30 ` George Spelvin @ 2020-08-09 19:16 ` Willy Tarreau 2020-08-10 11:47 ` Willy Tarreau 1 sibling, 0 replies; 48+ messages in thread From: Willy Tarreau @ 2020-08-09 19:16 UTC (permalink / raw) To: George Spelvin Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw On Sun, Aug 09, 2020 at 06:30:17PM +0000, George Spelvin wrote: > I'm trying to understand your attack scenario. I'm assuming that an > attacker can call prandom_u32() locally. Well, first, I'm mostly focusing on remote attacks, because if the attacker has a permanent local access, he can as well just bind() a source port instead of having to guess the next one that will be used. Second, I'm not aware of direct access from userland, so the calls to prandom_u32() are expected to be done only via specific code parts. The worst case (in my opinion) is when the attacker runs on the same physical CPU and exploits some memory leakage to find the internal state and uses the same TSC, but can only trigger prandom_u32() via the network, hence with much less precision. (...) > Even something simple like buffering 8 TSC samples, and adding them > at 32-bit offsets across the state every 8th call, would make a huge > difference. > > Even if 7 of the timestamps are from attacker calls (4 bits > uncertainty), the time of the target call is 8x less known > (so it goes from 15 to 18 bits), and the total becomes > 46 bits. *So* much better. Maybe. I'm just suggesting to add *some* noise to keep things not exploitable if the state is stolen once in a while (which would already be a big problem, admittedly). > > I can run some tests on this as > > well. I'd really need to try on a cleaner setup, I have remote machines > > at the office but I don't feel safe enough to remotely reboot them and > > risk to lose them :-/ > > Yeah, test kernels are nervous-making that way. 
In fact I never booted a 5.8 kernel on the machine I'm thinking about yet and can't remote-control its power supply, so I'm worried about a dirty hang at boot by missing a config entry. The risk is low but that's annoying when it happens. > > I'll also test on arm/arm64 to make sure we don't introduce a significant > > cost there. > > I don't expect a problem, but SipHash is optimized for 4-issue processors, > and on narrower machines fewer instructions are "free". OK. Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-09 18:30 ` George Spelvin 2020-08-09 19:16 ` Willy Tarreau @ 2020-08-10 11:47 ` Willy Tarreau 2020-08-10 12:01 ` David Laight ` (3 more replies) 1 sibling, 4 replies; 48+ messages in thread From: Willy Tarreau @ 2020-08-10 11:47 UTC (permalink / raw) To: George Spelvin Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw Hi George, On Sun, Aug 09, 2020 at 06:30:17PM +0000, George Spelvin wrote: > Even something simple like buffering 8 TSC samples, and adding them > at 32-bit offsets across the state every 8th call, would make a huge > difference. Doing testing on real hardware showed that retrieving the TSC on every call had a non negligible cost, causing a loss of 2.5% on the accept() rate and 4% on packet rate when using iptables -m statistics. However I reused your idea of accumulating old TSCs to increase the uncertainty about their exact value, except that I retrieve it only on 1/8 calls and use the previous noise in this case. With this I observe the same performance as plain 5.8. Below are the connection rates accepted on a single core : 5.8 5.8+patch 5.8+patch+tsc 192900-197900 188800->192200 194500-197500 (conn/s) This was on a core i7-8700K. I looked at the asm code for the function and it remains reasonably light, in the same order of complexity as the original one, so I think we could go with that. My proposed change is below, in case you have any improvements to suggest. 
Regards, Willy diff --git a/lib/random32.c b/lib/random32.c index 2b048e2ea99f..a12d63028106 100644 --- a/lib/random32.c +++ b/lib/random32.c @@ -317,6 +317,8 @@ static void __init prandom_state_selftest(void) struct siprand_state { unsigned long v[4]; + unsigned long noise; + unsigned long count; }; static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; @@ -334,7 +336,7 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 ) #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 ) -#elif BITS_PER_LONG == 23 +#elif BITS_PER_LONG == 32 /* * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash. * This is weaker, but 32-bit machines are not used for high-traffic @@ -375,6 +377,12 @@ static u32 siprand_u32(struct siprand_state *s) { unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; + if (++s->count >= 8) { + v3 ^= s->noise; + s->noise += random_get_entropy(); + s->count = 0; + } + SIPROUND(v0, v1, v2, v3); SIPROUND(v0, v1, v2, v3); s->v[0] = v0; s->v[1] = v1; s->v[2] = v2; s->v[3] = v3; ^ permalink raw reply related [flat|nested] 48+ messages in thread
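[Editor's note] The 1/8-reseed logic in the patch above can be exercised outside the kernel with a small userspace harness. This is an illustrative port, not the kernel code: `random_get_entropy()` is replaced by a deterministic stub, and only the 64-bit SIPROUND path is shown. Note the "use the previous noise" design: the entropy sampled on one reseed is only mixed in on the *next* reseed.

```c
#include <assert.h>
#include <stdint.h>

#define ROL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* One SipHash round, as used by the draft patch on 64-bit machines. */
#define SIPROUND(v0, v1, v2, v3) do { \
    v0 += v1; v1 = ROL64(v1, 13); v1 ^= v0; v0 = ROL64(v0, 32); \
    v2 += v3; v3 = ROL64(v3, 16); v3 ^= v2; \
    v0 += v3; v3 = ROL64(v3, 21); v3 ^= v0; \
    v2 += v1; v1 = ROL64(v1, 17); v1 ^= v2; v2 = ROL64(v2, 32); \
} while (0)

struct siprand_state {
    uint64_t v[4];
    uint64_t noise;
    uint64_t count;
};

/* Stand-in for random_get_entropy(): a strided counter here, the TSC
 * in the kernel. */
static uint64_t fake_entropy(void)
{
    static uint64_t t = 0x1234;
    return t += 0x2545F4914F6CDD1DULL;
}

static uint32_t siprand_u32(struct siprand_state *s)
{
    uint64_t v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];

    if (++s->count >= 8) {
        v3 ^= s->noise;            /* mix in the *previous* sample */
        s->noise += fake_entropy(); /* sample for the next reseed */
        s->count = 0;
    }

    SIPROUND(v0, v1, v2, v3);
    SIPROUND(v0, v1, v2, v3);
    s->v[0] = v0; s->v[1] = v1; s->v[2] = v2; s->v[3] = v3;
    return (uint32_t)(v1 + v3);
}
```

Running it confirms the entropy source is touched only on every 8th call, which is where the measured performance recovery comes from.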
* RE: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 11:47 ` Willy Tarreau @ 2020-08-10 12:01 ` David Laight 2020-08-10 14:48 ` Willy Tarreau 2020-08-10 12:03 ` Florian Westphal ` (2 subsequent siblings) 3 siblings, 1 reply; 48+ messages in thread From: David Laight @ 2020-08-10 12:01 UTC (permalink / raw) To: 'Willy Tarreau', George Spelvin Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw From: Willy Tarreau > Sent: 10 August 2020 12:47 > On Sun, Aug 09, 2020 at 06:30:17PM +0000, George Spelvin wrote: > > Even something simple like buffering 8 TSC samples, and adding them > > at 32-bit offsets across the state every 8th call, would make a huge > > difference. > > Doing testing on real hardware showed that retrieving the TSC on every > call had a non negligible cost, causing a loss of 2.5% on the accept() > rate and 4% on packet rate when using iptables -m statistics. However > I reused your idea of accumulating old TSCs to increase the uncertainty > about their exact value, except that I retrieve it only on 1/8 calls > and use the previous noise in this case. With this I observe the same > performance as plain 5.8. Below are the connection rates accepted on > a single core : > > 5.8 5.8+patch 5.8+patch+tsc > 192900-197900 188800->192200 194500-197500 (conn/s) > > This was on a core i7-8700K. I looked at the asm code for the function > and it remains reasonably light, in the same order of complexity as the > original one, so I think we could go with that. > > My proposed change is below, in case you have any improvements to suggest. 
> > Regards, > Willy > > > diff --git a/lib/random32.c b/lib/random32.c > index 2b048e2ea99f..a12d63028106 100644 > --- a/lib/random32.c > +++ b/lib/random32.c > @@ -317,6 +317,8 @@ static void __init prandom_state_selftest(void) > > struct siprand_state { > unsigned long v[4]; > + unsigned long noise; > + unsigned long count; > }; > > static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; > @@ -334,7 +336,7 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; > #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 ) > #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 ) > > -#elif BITS_PER_LONG == 23 > +#elif BITS_PER_LONG == 32 > /* > * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash. > * This is weaker, but 32-bit machines are not used for high-traffic > @@ -375,6 +377,12 @@ static u32 siprand_u32(struct siprand_state *s) > { > unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; > > + if (++s->count >= 8) { > + v3 ^= s->noise; > + s->noise += random_get_entropy(); > + s->count = 0; > + } > + Using: if (s->count-- <= 0) { ... s->count = 8; } probably generates better code. Although you may want to use a 'signed int' instead of 'unsigned long'. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 12:01 ` David Laight @ 2020-08-10 14:48 ` Willy Tarreau 0 siblings, 0 replies; 48+ messages in thread From: Willy Tarreau @ 2020-08-10 14:48 UTC (permalink / raw) To: David Laight Cc: George Spelvin, netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw On Mon, Aug 10, 2020 at 12:01:11PM +0000, David Laight wrote: > > /* > > * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash. > > * This is weaker, but 32-bit machines are not used for high-traffic > > @@ -375,6 +377,12 @@ static u32 siprand_u32(struct siprand_state *s) > > { > > unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; > > > > + if (++s->count >= 8) { > > + v3 ^= s->noise; > > + s->noise += random_get_entropy(); > > + s->count = 0; > > + } > > + > > Using: > if (s->count-- <= 0) { > ... > s->count = 8; > } > probably generates better code. > Although you may want to use a 'signed int' instead of 'unsigned long'. Yeah I know, it's just because I only slightly changed the previous code there. I had an earlier version that kept the rand state fully padded when storing intermediate values. That's among the final cleanups I'll bring if we go down that route. Thanks! Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 11:47 ` Willy Tarreau 2020-08-10 12:01 ` David Laight @ 2020-08-10 12:03 ` Florian Westphal 2020-08-10 14:53 ` Willy Tarreau 2020-08-10 16:31 ` Linus Torvalds 2020-08-11 3:47 ` George Spelvin 3 siblings, 1 reply; 48+ messages in thread From: Florian Westphal @ 2020-08-10 12:03 UTC (permalink / raw) To: Willy Tarreau Cc: George Spelvin, netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw Willy Tarreau <w@1wt.eu> wrote: > On Sun, Aug 09, 2020 at 06:30:17PM +0000, George Spelvin wrote: > > Even something simple like buffering 8 TSC samples, and adding them > > at 32-bit offsets across the state every 8th call, would make a huge > > difference. > > Doing testing on real hardware showed that retrieving the TSC on every > call had a non negligible cost, causing a loss of 2.5% on the accept() > rate and 4% on packet rate when using iptables -m statistics. However > I reused your idea of accumulating old TSCs to increase the uncertainty > about their exact value, except that I retrieve it only on 1/8 calls > and use the previous noise in this case. With this I observe the same > performance as plain 5.8. Below are the connection rates accepted on > a single core : > > 5.8 5.8+patch 5.8+patch+tsc > 192900-197900 188800->192200 194500-197500 (conn/s) > > This was on a core i7-8700K. I looked at the asm code for the function > and it remains reasonably light, in the same order of complexity as the > original one, so I think we could go with that. > > My proposed change is below, in case you have any improvements to suggest. As this relates to networking, you could also hook perturbation into rx/tx softirq processing. E.g. once for each new napi poll round or only once for each softnet invocation, depending on cost. 
IIRC the proposed draft left a unused prandom_seed() stub around, you could re-use that to place extra data to include in the hash in percpu data. ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 12:03 ` Florian Westphal @ 2020-08-10 14:53 ` Willy Tarreau 0 siblings, 0 replies; 48+ messages in thread From: Willy Tarreau @ 2020-08-10 14:53 UTC (permalink / raw) To: Florian Westphal Cc: George Spelvin, netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen On Mon, Aug 10, 2020 at 02:03:02PM +0200, Florian Westphal wrote: > As this relates to networking, you could also hook perturbation into rx/tx > softirq processing. E.g. once for each new napi poll round or only once > for each softnet invocation, depending on cost. I was wondering what/where to add some processing. I was thinking about having a per_cpu "noise" variable that would get mixed with the randoms when generated. This "noise" would need to be global so that we can easily append to it. For example on the network path it would be nice to use checksums but nowadays they're not calculated anymore. > IIRC the proposed draft left a unused prandom_seed() stub around, you could > re-use that to place extra data to include in the hash in percpu data. Probably. What I saw was that prandom_seed() expected to perform a full (hence slow) reseed. Instead I'd like to do something cheap. I like the principle of "noise" in that it doesn't promise to bring any security, only to perturb a little bit. Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 11:47 ` Willy Tarreau 2020-08-10 12:01 ` David Laight 2020-08-10 12:03 ` Florian Westphal @ 2020-08-10 16:31 ` Linus Torvalds 2020-08-10 16:58 ` Willy Tarreau 2020-08-11 3:47 ` George Spelvin 3 siblings, 1 reply; 48+ messages in thread From: Linus Torvalds @ 2020-08-10 16:31 UTC (permalink / raw) To: Willy Tarreau Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb, Stephen Hemminger, Florian Westphal On Mon, Aug 10, 2020 at 4:47 AM Willy Tarreau <w@1wt.eu> wrote: > > Doing testing on real hardware showed that retrieving the TSC on every > call had a non negligible cost, causing a loss of 2.5% on the accept() > rate and 4% on packet rate when using iptables -m statistics. And by "real hardware" I assume you mean x86, with a fairly fast and high-performance TSC for get_random_entropy(). Reading the TSC takes on the order of 20-50 cycles, iirc. But it can actually be *much* more expensive. On non-x86, it can be an IO cycle to external chips. And on older hardware VM's in x86, it can be a vm exit etc, so thousands of cycles. I hope nobody uses those VM's any more, but it would be a reasonable test-case for some non-x86 implementations, so.. IOW, no. You guys are - once again - ignoring reality. Linus ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 16:31 ` Linus Torvalds @ 2020-08-10 16:58 ` Willy Tarreau 2020-08-10 17:45 ` Linus Torvalds 0 siblings, 1 reply; 48+ messages in thread From: Willy Tarreau @ 2020-08-10 16:58 UTC (permalink / raw) To: Linus Torvalds Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb, Stephen Hemminger, Florian Westphal On Mon, Aug 10, 2020 at 09:31:48AM -0700, Linus Torvalds wrote: > On Mon, Aug 10, 2020 at 4:47 AM Willy Tarreau <w@1wt.eu> wrote: > > > > Doing testing on real hardware showed that retrieving the TSC on every > > call had a non negligible cost, causing a loss of 2.5% on the accept() > > rate and 4% on packet rate when using iptables -m statistics. > > And by "real hardware" I assume you mean x86, with a fairly fast and > high-performance TSC for get_random_entropy(). Yep. > Reading the TSC takes on the order of 20-50 cycles, iirc. > > But it can actually be *much* more expensive. On non-x86, it can be an > IO cycle to external chips. I took what we were already using in add_interrupt_randomness() since I considered that if it was acceptable there, it probably was elsewhere. > And on older hardware VM's in x86, it can be a vm exit etc, so > thousands of cycles. I hope nobody uses those VM's any more, but it > would be a reasonable test-case for some non-x86 implementations, so.. Yes, I remember these ones, they were not fun at all. > IOW, no. You guys are - once again - ignoring reality. I'm not ignoring reality, quite the opposite, trying to take all knowledge into account. If I'm missing some points, fine. But if we were already calling that in the interrupt handler I expected that this would be OK. The alternative Florian talked about is quite interesting as well, which is to collect some cheap noise in the network rx/tx paths since these are the areas we care about. 
Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 16:58 ` Willy Tarreau @ 2020-08-10 17:45 ` Linus Torvalds 2020-08-10 18:01 ` Willy Tarreau 2020-08-10 21:04 ` Willy Tarreau 0 siblings, 2 replies; 48+ messages in thread From: Linus Torvalds @ 2020-08-10 17:45 UTC (permalink / raw) To: Willy Tarreau Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb, Stephen Hemminger, Florian Westphal On Mon, Aug 10, 2020 at 9:59 AM Willy Tarreau <w@1wt.eu> wrote: > > I took what we were already using in add_interrupt_randomness() since > I considered that if it was acceptable there, it probably was elsewhere. Once you've taken an interrupt, you're doing IO anyway, and the interrupt costs will dominate anything you do. But the prandom_u32() interface is potentially done many times per interrupt. For all I know it's done inside fairly critical locks etc too. So I don't think one usage translates to another very well. Linus ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 17:45 ` Linus Torvalds @ 2020-08-10 18:01 ` Willy Tarreau 2020-08-10 21:04 ` Willy Tarreau 1 sibling, 0 replies; 48+ messages in thread From: Willy Tarreau @ 2020-08-10 18:01 UTC (permalink / raw) To: Linus Torvalds Cc: George Spelvin, Netdev, Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb, Stephen Hemminger, Florian Westphal On Mon, Aug 10, 2020 at 10:45:26AM -0700, Linus Torvalds wrote: > On Mon, Aug 10, 2020 at 9:59 AM Willy Tarreau <w@1wt.eu> wrote: > > > > I took what we were already using in add_interrupt_randomness() since > > I considered that if it was acceptable there, it probably was elsewhere. > > Once you've taken an interrupt, you're doing IO anyway, and the > interrupt costs will dominate anything you do. > > But the prandom_u32() interface is potentially done many times per > interrupt. For all I know it's done inside fairly critical locks etc > too. > > So I don't think one usage translates to another very well. Possible, hence the better solution to just feed noise in hot paths. Using jiffies and skb pointer in xmit is better than nothing anyway. I'm not seeking anything extremist, I just want to make sure we don't get yet-another-report on this area next summer just because some researcher using two VMs discovers that attacking a 100% idle VM from another one running on the same CPU core is trivial after having stolen some memory data. If at least such an attack is boring and rarely works, the rest of the job is provided by siphash and we should be fine. Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 17:45 ` Linus Torvalds 2020-08-10 18:01 ` Willy Tarreau @ 2020-08-10 21:04 ` Willy Tarreau 2020-08-11 5:26 ` George Spelvin 1 sibling, 1 reply; 48+ messages in thread From: Willy Tarreau @ 2020-08-10 21:04 UTC (permalink / raw) To: Linus Torvalds, George Spelvin, Florian Westphal Cc: Netdev, Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb, Stephen Hemminger Linus, George, Florian, would something in this vein be OK in your opinion ? - update_process_times still updates the noise - we don't touch the fast_pool anymore - we don't read any TSC on hot paths - we update the noise in xmit from jiffies and a few pointer values instead I've applied it on top of George's patch rebased to mainline for simplicity. I've used a separate per_cpu noise variable to keep the net_rand_state static with its __latent_entropy. With this I'm even getting very slightly better performance than with the previous patch (195.7 - 197.8k cps). What could be improved is the way the input values are mixed (just added hence commutative for now). 
I didn't want to call a siphash round on the hot paths, but just shifting the previous noise value before adding would work, such as the following for example: void prandom_u32_add_noise(a, b, c, d) { unsigned long *noise = get_cpu_ptr(&net_rand_noise); #if BITS_PER_LONG == 64 *noise = rol64(*noise, 7) + a + b + c + d; #else *noise = rol32(*noise, 7) + a + b + c + d; #endif put_cpu_ptr(&net_rand_noise); } Thanks, Willy --- diff --git a/drivers/char/random.c b/drivers/char/random.c index d20ba1b104ca..2a41b21623ae 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -1277,7 +1277,6 @@ void add_interrupt_randomness(int irq, int irq_flags) fast_mix(fast_pool); add_interrupt_bench(cycles); - this_cpu_add(net_rand_state.s1, fast_pool->pool[cycles & 3]); if (unlikely(crng_init == 0)) { if ((fast_pool->count >= 64) && diff --git a/include/linux/prandom.h b/include/linux/prandom.h index aa16e6468f91..e2b4990f2126 100644 --- a/include/linux/prandom.h +++ b/include/linux/prandom.h @@ -20,7 +20,7 @@ struct rnd_state { __u32 s1, s2, s3, s4; }; -DECLARE_PER_CPU(struct rnd_state, net_rand_state); +DECLARE_PER_CPU(unsigned long, net_rand_noise); u32 prandom_u32_state(struct rnd_state *state); void prandom_bytes_state(struct rnd_state *state, void *buf, size_t nbytes); @@ -67,6 +67,7 @@ static inline void prandom_seed_state(struct rnd_state *state, u64 seed) state->s2 = __seed(i, 8U); state->s3 = __seed(i, 16U); state->s4 = __seed(i, 128U); + __this_cpu_add(net_rand_noise, i); } /* Pseudo random number generator from numerical recipes. */ diff --git a/kernel/time/timer.c b/kernel/time/timer.c index ae5029f984a8..2f07c569af77 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1721,7 +1721,7 @@ void update_process_times(int user_tick) * non-constant value that's not affine to the number of calls to make * sure it's updated when there's some activity (we don't care in idle). 
*/ - this_cpu_add(net_rand_state.s1, rol32(jiffies, 24) + user_tick); + __this_cpu_add(net_rand_noise, rol32(jiffies, 24) + user_tick); } /** diff --git a/lib/random32.c b/lib/random32.c index 2b048e2ea99f..d74cf1db8cc9 100644 --- a/lib/random32.c +++ b/lib/random32.c @@ -320,6 +320,8 @@ struct siprand_state { }; static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; +DEFINE_PER_CPU(unsigned long, net_rand_noise); +EXPORT_SYMBOL(net_rand_noise); #if BITS_PER_LONG == 64 /* @@ -334,7 +336,7 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 ) #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 ) -#elif BITS_PER_LONG == 23 +#elif BITS_PER_LONG == 32 /* * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash. * This is weaker, but 32-bit machines are not used for high-traffic @@ -374,9 +376,12 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; static u32 siprand_u32(struct siprand_state *s) { unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; + unsigned long n = __this_cpu_read(net_rand_noise); + v3 ^= n; SIPROUND(v0, v1, v2, v3); SIPROUND(v0, v1, v2, v3); + v0 ^= n; s->v[0] = v0; s->v[1] = v1; s->v[2] = v2; s->v[3] = v3; return v1 + v3; } diff --git a/net/core/dev.c b/net/core/dev.c index 7df6c9617321..55a2471cd75b 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -144,6 +144,7 @@ #include <linux/indirect_call_wrapper.h> #include <net/devlink.h> #include <linux/pm_runtime.h> +#include <linux/prandom.h> #include "net-sysfs.h" @@ -3557,6 +3558,7 @@ static int xmit_one(struct sk_buff *skb, struct net_device *dev, dev_queue_xmit_nit(skb, dev); len = skb->len; + __this_cpu_add(net_rand_noise, (long)skb + (long)dev + (long)txq + len + jiffies); trace_net_dev_start_xmit(skb, dev); rc = netdev_start_xmit(skb, dev, txq, more); trace_net_dev_xmit(skb, rc, dev, len); @@ -4129,6 +4131,7 @@ static int 
__dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev) if (!skb) goto out; + __this_cpu_add(net_rand_noise, (long)skb + (long)dev + (long)txq + jiffies); HARD_TX_LOCK(dev, txq, cpu); if (!netif_xmit_stopped(txq)) { @@ -4194,6 +4197,7 @@ int dev_direct_xmit(struct sk_buff *skb, u16 queue_id) skb_set_queue_mapping(skb, queue_id); txq = skb_get_tx_queue(dev, skb); + __this_cpu_add(net_rand_noise, (long)skb + (long)dev + (long)txq + jiffies); local_bh_disable(); ^ permalink raw reply related [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 21:04 ` Willy Tarreau @ 2020-08-11 5:26 ` George Spelvin 2020-08-11 5:37 ` Willy Tarreau 0 siblings, 1 reply; 48+ messages in thread From: George Spelvin @ 2020-08-11 5:26 UTC (permalink / raw) To: Willy Tarreau Cc: Linus Torvalds, Florian Westphal, Netdev, Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb, Stephen Hemminger, George Spelvin On Mon, Aug 10, 2020 at 11:04:55PM +0200, Willy Tarreau wrote: > What could be improved is the way the input values are mixed (just > added hence commutative for now). I didn't want to call a siphash > round on the hot paths, but just shifting the previous noise value > before adding would work, such as the following for example: > > void prandom_u32_add_noise(a, b, c, d) > { > unsigned long *noise = get_cpu_ptr(&net_rand_noise); > > #if BITS_PER_LONG == 64 > *noise = rol64(*noise, 7) + a + b + c + d; > #else > *noise = rol32(*noise, 7) + a + b + c + d; > #endif > put_cpu_ptr(&net_rand_noise); > > } If you think this is enough seed material, I'm fine with it. I don't hugely like the fact that you sum all the inputs, since entropy tends to be concentrated in the low-order words, and summing risks cancellation. You can't afford even one SIPROUND as a non-cryptographic hash? E.g. DEFINE_PER_CPU(unsigned long[4], net_rand_noise); EXPORT_SYMBOL(net_rand_noise); void prandom_u32_add_noise(a, b, c, d) { unsigned long *noise = get_cpu_ptr(&net_rand_noise); a ^= noise[0]; b ^= noise[1]; c ^= noise[2]; d ^= noise[3]; /* * This is not used cryptographically; it's just * a convenient 4-word hash function. */ SIPROUND(a, b, c, d); noise[0] = a; noise[1] = b; noise[2] = c; noise[3] = d; put_cpu_ptr(&net_rand_noise); } (And then you mix in net_rand_noise[0].) Other options are HASH_MIX() from fs/namei.c, but that's more sequential. There's also a simple Xorshift generator. 
^ permalink raw reply [flat|nested] 48+ messages in thread
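[Editor's note] The cancellation/commutativity point above is easy to demonstrate in userspace: with the rotate-then-add mixer, the four arguments of a single call commute, so reordering (or a mutually cancelling pair of) inputs is invisible, while a SIPROUND-based mix is position-sensitive. Illustrative sketch only, with the standard 64-bit SIPROUND:

```c
#include <assert.h>
#include <stdint.h>

#define ROL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* One SipHash round, used here only as a 4-word mixing function. */
#define SIPROUND(v0, v1, v2, v3) do { \
    v0 += v1; v1 = ROL64(v1, 13); v1 ^= v0; v0 = ROL64(v0, 32); \
    v2 += v3; v3 = ROL64(v3, 16); v3 ^= v2; \
    v0 += v3; v3 = ROL64(v3, 21); v3 ^= v0; \
    v2 += v1; v1 = ROL64(v1, 17); v1 ^= v2; v2 = ROL64(v2, 32); \
} while (0)

/* Rotate-then-add mixer: the four inputs of one call commute, so
 * swapping them leaves the accumulator unchanged. */
static uint64_t rol_add(uint64_t noise, uint64_t a, uint64_t b,
                        uint64_t c, uint64_t d)
{
    return ROL64(noise, 7) + a + b + c + d;
}

/* SIPROUND-based alternative: XOR the inputs into the running 4-word
 * state and run one round -- order- and position-sensitive. */
static void sipround_mix(uint64_t noise[4], uint64_t a, uint64_t b,
                         uint64_t c, uint64_t d)
{
    a ^= noise[0]; b ^= noise[1]; c ^= noise[2]; d ^= noise[3];
    SIPROUND(a, b, c, d);
    noise[0] = a; noise[1] = b; noise[2] = c; noise[3] = d;
}
```

Since SIPROUND is a bijection on the 4-word state, distinct input tuples always yield distinct states, so a reordering an attacker might arrange cannot silently collapse to the same value.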
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-11 5:26 ` George Spelvin @ 2020-08-11 5:37 ` Willy Tarreau 0 siblings, 0 replies; 48+ messages in thread From: Willy Tarreau @ 2020-08-11 5:37 UTC (permalink / raw) To: George Spelvin Cc: Linus Torvalds, Florian Westphal, Netdev, Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Marc Plumb, Stephen Hemminger On Tue, Aug 11, 2020 at 05:26:49AM +0000, George Spelvin wrote: > On Mon, Aug 10, 2020 at 11:04:55PM +0200, Willy Tarreau wrote: > > What could be improved is the way the input values are mixed (just > > added hence commutative for now). I didn't want to call a siphash > > round on the hot paths, but just shifting the previous noise value > > before adding would work, such as the following for example: > > > > void prandom_u32_add_noise(a, b, c, d) > > { > > unsigned long *noise = get_cpu_ptr(&net_rand_noise); > > > > #if BITS_PER_LONG == 64 > > *noise = rol64(*noise, 7) + a + b + c + d; > > #else > > *noise = rol32(*noise, 7) + a + b + c + d; > > #endif > > put_cpu_ptr(&net_rand_noise); > > > > } > > If you think this is enough seed material, I'm fine with it. > > I don't hugely like the fact that you sum all the inputs, since > entropy tends to be concentrated in the low-order words, and summing > risks cancellation. Yes I've figured this. But I thought it was still better than a pure xor which would cancell the high bits from pointers. > You can't afford even one SIPROUND as a non-cryptographic hash? E.g. That's what I mentioned above, I'm still hesitating. I need to test. 
> > DEFINE_PER_CPU(unsigned long[4], net_rand_noise); > EXPORT_SYMBOL(net_rand_noise); > > void prandom_u32_add_noise(a, b, c, d) > { > unsigned long *noise = get_cpu_ptr(&net_rand_noise); > > a ^= noise[0]; > b ^= noise[1]; > c ^= noise[2]; > d ^= noise[3]; > /* > * This is not used cryptographically; it's just > * a convenient 4-word hash function. > */ > SIPROUND(a, b, c, d); > noise[0] = a; > noise[1] = b; > noise[2] = c; > noise[3] = d; > put_cpu_ptr(&net_rand_noise); > } > > (And then you mix in net_rand_noise[0].) > > Other options are HASH_MIX() from fs/namei.c, but that's > more sequential. > > There's also a simple Xorshift generator. I think a xorshift on each value will have roughly the same cost as a single SIPROUND. But I've not yet completely eliminated these options until I've tested. If we lose a few cycles per packet, that might be OK. Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-10 11:47 ` Willy Tarreau ` (2 preceding siblings ...) 2020-08-10 16:31 ` Linus Torvalds @ 2020-08-11 3:47 ` George Spelvin 2020-08-11 3:58 ` Willy Tarreau 3 siblings, 1 reply; 48+ messages in thread From: George Spelvin @ 2020-08-11 3:47 UTC (permalink / raw) To: Willy Tarreau Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw, George Spelvin On Mon, Aug 10, 2020 at 01:47:00PM +0200, Willy Tarreau wrote: > except that I retrieve it only on 1/8 calls > and use the previous noise in this case. Er... that's quite different. I was saying you measure them all, and do: struct siprand_state { ... + uint32_t noise[8]; + unsigned counter; } ... + s->noise[--s->counter] = random_get_entropy(); + + if (!s->counter) { + for (i = 0; i < 4; i++) + s->v[i] += s->noise[2*i] + + ((unsigned long)s->noise[2*i+1] << BITS_PER_LONG/2); + s->counter = 8; + } What you're doing is just decreasing the amount of seeding by a factor of 8. (Roughly. You do gain log2(8)/2 = 1.5 bits because the sum of 8 random values has a standard deviation sqrt(8) times as large as the inputs.) > diff --git a/lib/random32.c b/lib/random32.c > index 2b048e2ea99f..a12d63028106 100644 > --- a/lib/random32.c > +++ b/lib/random32.c > @@ -317,6 +317,8 @@ static void __init prandom_state_selftest(void) > > struct siprand_state { > unsigned long v[4]; > + unsigned long noise; > + unsigned long count; > }; > > static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; > @@ -334,7 +336,7 @@ static DEFINE_PER_CPU(struct siprand_state, net_rand_state) __latent_entropy; > #define K0 (0x736f6d6570736575 ^ 0x6c7967656e657261 ) > #define K1 (0x646f72616e646f6d ^ 0x7465646279746573 ) > > -#elif BITS_PER_LONG == 23 > +#elif BITS_PER_LONG == 32 > /* > * On 32-bit machines, we use HSipHash, a reduced-width version of SipHash. 
> * This is weaker, but 32-bit machines are not used for high-traffic > @@ -375,6 +377,12 @@ static u32 siprand_u32(struct siprand_state *s) > { > unsigned long v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3]; > > + if (++s->count >= 8) { > + v3 ^= s->noise; > + s->noise += random_get_entropy(); > + s->count = 0; > + } > + - Can you explain why you save the "noise" until next time? Is this meant to make it harder for an attacker to observe the time? - How about doing away with s->count and making it statistical: + if ((v3 & 7) == 0) + v3 ^= random_get_entropy(); That still does the seed 1/8 of the time, but in a much less regular pattern. (Admittedly, it will totally break the branch predictor. An unlikely() might help.) ^ permalink raw reply [flat|nested] 48+ messages in thread
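[Editor's note] The statistical variant proposed above reseeds on average every 8th call but at irregular intervals. A quick userspace check of the firing rate, with a xorshift64 generator standing in for the evolving v3 word (illustrative only; the kernel's v3 evolves via SIPROUND, not xorshift):

```c
#include <assert.h>
#include <stdint.h>

/* Marsaglia xorshift64, used here only as a stand-in source of
 * pseudorandom v3 values. */
static uint64_t xorshift64(uint64_t *s)
{
    uint64_t x = *s;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *s = x;
}

/* Count how often the statistical reseed condition (v3 & 7) == 0
 * fires over n simulated calls. */
static int count_reseeds(int n)
{
    uint64_t s = 88172645463325252ULL;
    int hits = 0;

    for (int i = 0; i < n; i++)
        if ((xorshift64(&s) & 7) == 0)
            hits++;
    return hits;
}
```

The condition fires roughly once per 8 calls, but the gaps between reseeds vary, which is the point: an attacker can no longer predict exactly which call samples the timer.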
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-11 3:47 ` George Spelvin @ 2020-08-11 3:58 ` Willy Tarreau 0 siblings, 0 replies; 48+ messages in thread From: Willy Tarreau @ 2020-08-11 3:58 UTC (permalink / raw) To: George Spelvin Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw On Tue, Aug 11, 2020 at 03:47:24AM +0000, George Spelvin wrote: > On Mon, Aug 10, 2020 at 01:47:00PM +0200, Willy Tarreau wrote: > > except that I retrieve it only on 1/8 calls > > and use the previous noise in this case. > > Er... that's quite different. I was saying you measure them all, and do: That was my first approach and it resulted in a significant performance loss, hence the change (and the resulting ugly construct with the counter). > > + if (++s->count >= 8) { > > + v3 ^= s->noise; > > + s->noise += random_get_entropy(); > > + s->count = 0; > > + } > > + > > - Can you explain why you save the "noise" until next time? Is this meant to > make it harder for an attacker to observe the time? Just to make the observable call not depend on immediately measurable TSC values. It's weak, but the point was that when mixing attack traffic with regular one, if you can measure the time variations on TSC to know when it was used and don't have its resulting effect at the same time, it's harder to analyse than when you have both at once. > - How about doing away with s->count and making it statistical: > > + if ((v3 & 7) == 0) > + v3 ^= random_get_entropy(); Absolutely. I just kept the counter from previous attempt. But Linus prefers that we completely remove TSC calls from direct calls due to old VMs that were slow at this. We could collect it anywhere else once in a while instead. Willy ^ permalink raw reply [flat|nested] 48+ messages in thread
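Willy's deferral can be demonstrated deterministically if the timer read is made injectable: the entropy sampled on a reseeding call must not influence that call's own output, only a later period's. The following userspace sketch assumes the entropy_now parameter, the seed values, and the round macro's use here (the real code reads random_get_entropy() and lives in lib/random32.c):

```c
#include <stdint.h>

#define ROTL32(x, b) ((uint32_t)(((x) << (b)) | ((x) >> (32 - (b)))))

/* HalfSipHash round (32-bit branch of the patch). */
#define SIPROUND(v0, v1, v2, v3) do { \
	v0 += v1; v1 = ROTL32(v1, 5);  v1 ^= v0; v0 = ROTL32(v0, 16); \
	v2 += v3; v3 = ROTL32(v3, 8);  v3 ^= v2; \
	v0 += v3; v3 = ROTL32(v3, 7);  v3 ^= v0; \
	v2 += v1; v1 = ROTL32(v1, 13); v1 ^= v2; v2 = ROTL32(v2, 16); \
} while (0)

struct siprand_state {
	uint32_t v[4];
	uint32_t noise;	/* entropy sampled on a previous call, not yet consumed */
	uint32_t count;
};

/* entropy_now stands in for random_get_entropy(); it is passed in so the
 * deferral property can be checked deterministically. */
static uint32_t siprand_u32(struct siprand_state *s, uint32_t entropy_now)
{
	uint32_t v0 = s->v[0], v1 = s->v[1], v2 = s->v[2], v3 = s->v[3];

	if (++s->count >= 8) {
		v3 ^= s->noise;		/* consume noise sampled 8 calls ago */
		s->noise += entropy_now;/* sample now, visible one period later */
		s->count = 0;
	}
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);
	s->v[0] = v0; s->v[1] = v1; s->v[2] = v2; s->v[3] = v3;
	return v1 + v3;
}
```

Two states fed different entropy values produce identical output on the call that samples the entropy, and only diverge one reseed period later, which is exactly the decoupling of measurable TSC values from the observable call described above.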
[parent not found: <fdbc7d7d-cba2-ef94-9bde-b3ccae0cfaac@gmail.com>]
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable [not found] ` <fdbc7d7d-cba2-ef94-9bde-b3ccae0cfaac@gmail.com> @ 2020-08-09 21:10 ` Marc Plumb 2020-08-09 21:48 ` Linus Torvalds 0 siblings, 1 reply; 48+ messages in thread From: Marc Plumb @ 2020-08-09 21:10 UTC (permalink / raw) To: Willy Tarreau, George Spelvin Cc: netdev, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, stephen, fw (Resending since I accidentally sent it as HTML, which the netdev mailing list doesn't like) On 2020-08-09 2:05 p.m., Marc Plumb wrote: > > Willy, > > > On 2020-08-07 3:19 p.m., Willy Tarreau wrote: >>> If I can figure the state out once, >> Yes but how do you take that as granted ? This state doesn't appear >> without its noise counterpart, so taking as a prerequisite that you may >> guess one separately obviously indicates that you then just have to >> deduce the other, but the point of mixing precisely is that we do not >> expose individual parts. > > > On 2020-08-09 2:38 a.m., Willy Tarreau wrote: >> However it keeps the problem that the whole sequence is entirely >> determined at the moment of reseeding, so if one were to be able to >> access the state, e.g. using l1tf/spectre/meltdown/whatever, then >> this state could be used to predict the whole ongoing sequence for >> the next minute. What some view as a security feature, others will >> see as a backdoor :-/ That's why I really like the noise approach. >> Even just the below would significantly harm that capability because >> that state alone isn't sufficient anymore to pre-compute all future >> values: > > > So two days ago I was unreasonable for assuming an attacker could > recover the entire state, and now you're assuming the same thing? As I > said before, if an attacker can recover the complete state, then > slowly adding noise doesn't help significantly, since an attacker can > brute force the noise added (even if a perfect CPRNG is used). 
> > However, I think I'm starting to see your underlying assumptions. > You're thinking that raw noise data are the only truly unpredictable > thing, so you think that adding them is a defense against attacks like > foreshadow/spectre/meltdown. You aren't entirely wrong -- if there were > a fast noise source then it might be a good option. But if the noise > source is slow, you're just causing far more damage to a far more > critical crypto function to get very little benefit. > > If you want to add noise to the network PRNG, then you can't put the > same noise into the /dev/random CPRNG at all, in any way. I've tried > explaining the crypto reasons for this, and George has added to that, > so let me try a different approach: it breaks FIPS 140-2 for all of > Linux. While FIPS certification isn't a key driver, it is a > consideration for the crypto modules. FIPS references NIST.SP.800-90B > (which is specifically the Recommendation for the Entropy Sources > Used for Random Bit Generation), which has a requirement that the noise > source not pass any data used for crypto operations to anything > outside of the defined security boundary. If you want to feed a noise > measurement into the network PRNG, then you can't feed it into the > /dev/random pool as well. You have to decide where the different > measurements are going to go and keep them completely separate. > > I'm not intimately familiar with the standards, so I spoke with someone > who does FIPS 140-x certification and was told "I don't think the > standards even considered the idea that someone might pass data from a > conditioning pool to other functions. It completely violates the > security boundary concept so is just prohibited ... that type of > change would absolutely be a problem." > > > Marc > ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-09 21:10 ` Marc Plumb @ 2020-08-09 21:48 ` Linus Torvalds 0 siblings, 0 replies; 48+ messages in thread From: Linus Torvalds @ 2020-08-09 21:48 UTC (permalink / raw) To: Marc Plumb Cc: Willy Tarreau, George Spelvin, Netdev, Amit Klein, Eric Dumazet, Jason A. Donenfeld, Andrew Lutomirski, Kees Cook, Thomas Gleixner, Peter Zijlstra, Theodore Ts'o, Stephen Hemminger, Florian Westphal On Sun, Aug 9, 2020 at 2:10 PM Marc Plumb <lkml.mplumb@gmail.com> wrote: > > However, I think I'm starting to see your underlying assumptions. > You're thinking that raw noise data are the only truly unpredictable > thing so you think that adding it is a defense against attacks like > foreshadow/spectre/meltdown. You aren't entirely wrong -- if there was > a fast noise source then it might be a good option. Just if the noise > source is slow, you're just causing far more damage to a far more > critical crypto function to get very little benefit. The only truly good noise source we have is any CPU native randomness. Sadly, that is usually never really "fast". But we do use that for any actual real /dev/random reading. We still whiten it through our hashing, and we mix it in rather than use the raw CPU hw-provided values (because not everybody trusts the CPU vendors either), but /dev/random will most certainly take advantage of it as one source of noise. (Honesty in advertising: you can also use other interfaces that don't bother with the remixing, and _will_ just return the raw CPU randomness). So if you make the (imho reasonable) assumption that you're running on a modern enough CPU, and that the CPU hw randomness is of reasonable quality, then you never need to worry about /dev/random. Every single time you extract something from one of the pools, I think we mix in that CPU randomness if it's available. But the CPU randomness is too slow for the prandom code to use at extraction time. 
It's on the order of a couple of hundred to a couple of thousand cycles. That's peanuts for /dev/random, but quite a lot for prandom. In fact, at the slow end it is slow enough that you don't want to do it at any fast granularity (ie not "every interrupt"), it's the "when reseeding once a second" kind of slow. arm64 has randomness too these days too, but that's only of the "really slow, useful for seeding" variety. And I'm not sure which (if any) CPU implementations out there actually do it yet. Anyway, I suspect /dev/random has been over-engineered (I think we have something like three layers of mixing bits _and_ the CPU randomness _and_ all the interrupt randomness), and prandom may have been left alone too much. Linus ^ permalink raw reply [flat|nested] 48+ messages in thread
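The "mix in rather than use raw" policy described above can be sketched with toy primitives: hardware words are folded into pool state through a whitening mixer and never returned directly, so even a distrusted or backdoored source can neither read back nor cancel the pool. Everything here is an illustrative assumption -- the xorshift generator stands in for a CPU instruction like RDSEED, and the 32-bit finalizer stands in for the pool's real hashing -- not the kernel's actual extraction code:

```c
#include <stdint.h>

/* Stand-in for arch hw randomness (e.g. RDSEED); xorshift32 for the sketch.
 * Returning 1 mimics the "instruction succeeded" flag. */
static int arch_get_random_u32(uint32_t *out)
{
	static uint32_t x = 0xdeadbeefu;
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*out = x;
	return 1;
}

/* Toy 32-bit mixer standing in for the pool's hash whitening. */
static uint32_t mix32(uint32_t h, uint32_t x)
{
	h ^= x;
	h *= 0x45d9f3bu;
	h ^= h >> 16;
	h *= 0x45d9f3bu;
	h ^= h >> 16;
	return h;
}

/* Extraction: hw randomness is folded into the pool through the mixer and
 * never returned raw, so a distrusted RNG cannot reveal the pool state. */
static uint32_t extract_u32(uint32_t *pool)
{
	uint32_t hw;

	if (arch_get_random_u32(&hw))
		*pool = mix32(*pool, hw);
	*pool = mix32(*pool, 0x9e3779b9u);	/* stir even without hw input */
	return mix32(*pool, 0);
}
```

The design point is the one made above: the hw word only ever enters the mixer's input, so trusting its quality is optional; it can only add unpredictability, never subtract it.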
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable 2020-08-09 6:57 ` [DRAFT PATCH] random32: make prandom_u32() output unpredictable George Spelvin 2020-08-09 9:38 ` Willy Tarreau @ 2020-08-09 13:50 ` Randy Dunlap [not found] ` <CANEQ_++a4YcwQQ2XhuguTono9=RxbSRVsMw08zLWBWJ_wxG2AQ@mail.gmail.com> 2 siblings, 0 replies; 48+ messages in thread From: Randy Dunlap @ 2020-08-09 13:50 UTC (permalink / raw) To: George Spelvin, netdev Cc: w, aksecurity, torvalds, edumazet, Jason, luto, keescook, tglx, peterz, tytso, lkml.mplumb, stephen, fw On 8/8/20 11:57 PM, George Spelvin wrote: +#elif BITS_PER_LONG == 23 s/23/32/ ??? -- ~Randy ^ permalink raw reply [flat|nested] 48+ messages in thread
[parent not found: <CANEQ_++a4YcwQQ2XhuguTono9=RxbSRVsMw08zLWBWJ_wxG2AQ@mail.gmail.com>]
* Re: [DRAFT PATCH] random32: make prandom_u32() output unpredictable [not found] ` <CANEQ_++a4YcwQQ2XhuguTono9=RxbSRVsMw08zLWBWJ_wxG2AQ@mail.gmail.com> @ 2020-08-09 16:08 ` George Spelvin 0 siblings, 0 replies; 48+ messages in thread From: George Spelvin @ 2020-08-09 16:08 UTC (permalink / raw) To: Amit Klein Cc: Jason, edumazet, fw, keescook, lkml.mplumb, luto, netdev, peterz, stephen, tglx, torvalds, tytso, w On Sun, Aug 09, 2020 at 11:26:31AM +0300, Amit Klein wrote: > BITS_PER_LONG==23 ??? > Should be ==32, I guess. Of course. I've been testing on a 64-bit system so hadn't exercised that branch yet, and it's a case of "seeing what I expect to see". ^ permalink raw reply [flat|nested] 48+ messages in thread