From: Jeremy Linton <jeremy.linton@arm.com>
To: Arnd Bergmann <arnd@arndb.de>, Kees Cook <keescook@chromium.org>
Cc: linux-arm-kernel@lists.infradead.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	"Jason A . Donenfeld" <Jason@zx2c4.com>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Mark Brown <broonie@kernel.org>,
	Guo Hui <guohui@uniontech.com>,
	Manoj.Iyer@arm.com, linux-kernel@vger.kernel.org,
	linux-hardening@vger.kernel.org,
	James Yang <james.yang@arm.com>,
	Shiyou Huang <shiyou.huang@arm.com>
Subject: Re: [PATCH 1/1] arm64: syscall: Direct PRNG kstack randomization
Date: Wed, 6 Mar 2024 15:54:57 -0600
Message-ID: <38f9541b-dd88-4d49-af3b-bc7880a4e2f4@arm.com>
In-Reply-To: <34351804-ad1d-498f-932a-c1844b78589f@app.fastmail.com>

Hi,

On 3/6/24 14:46, Arnd Bergmann wrote:
> On Wed, Mar 6, 2024, at 00:33, Kees Cook wrote:
>> On Tue, Mar 05, 2024 at 04:18:24PM -0600, Jeremy Linton wrote:
>>> The existing arm64 stack randomization uses the kernel rng to acquire
>>> 5 bits of address space randomization. This is problematic because it
>>> creates non determinism in the syscall path when the rng needs to be
>>> generated or reseeded. This shows up as large tail latencies in some
>>> benchmarks and directly affects the minimum RT latencies as seen by
>>> cyclictest.
>>>
>>> Other architectures are using timers/cycle counters for this function,
>>> which is sketchy from a randomization perspective because it should be
>>> possible to estimate this value from knowledge of the syscall return
>>> time, and from reading the current value of the timer/counters.
>
> As I commented on the previous version, I don't want to see
> a change that only addresses one architecture like this. If you
> are convinced that using a cycle counter is a mistake, then we
> should do the same thing on the other architectures as well
> that currently use a cycle counter.

I personally tend to agree, as long as we aren't creating a similar set
of problems for those architectures as we are seeing on arm. Currently
the kstack rng on/off choice is basically zero overhead for them.

>
>>> +#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
>>> +DEFINE_PER_CPU(struct rnd_state, kstackrng);
>>> +
>>> +static u16 kstack_rng(void)
>>> +{
>>> +	u32 rng = prandom_u32_state(this_cpu_ptr(&kstackrng));
>>> +
>>> +	return rng & 0x1ff;
>>> +}
>>> +
>>> +/* Should we reseed? */
>>> +static int kstack_rng_setup(unsigned int cpu)
>>> +{
>>> +	u32 rng_seed;
>>> +
>>> +	/* zero should be avoided as a seed */
>>> +	do {
>>> +		rng_seed = get_random_u32();
>>> +	} while (!rng_seed);
>>> +	prandom_seed_state(this_cpu_ptr(&kstackrng), rng_seed);
>>> +	return 0;
>>> +}
>>> +
>>> +static int kstack_init(void)
>>> +{
>>> +	int ret;
>>> +
>>> +	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "arm64/cpuinfo:kstackrandomize",
>>> +				kstack_rng_setup, NULL);
>>
>> This will run initial seeding, but don't we need to reseed this with
>> some kind of frequency?
>
> Won't that defeat the purpose of the patch that was intended
> to make the syscall latency more predictable? At least the
> simpler approaches of reseeding from the kstack_rng()
> function itself would have this problem, deferring it to
> another context comes with a separate set of problems.

And that describes why I've not come up with an inline reseeding
solution. Which of course isn't a problem on !arm if one just pushes a
few bits of a cycle counter into the rnd_state every few dozen syscalls,
or whatever. Mark R. mentioned offline the idea of just picking a few
bits off CNTVCT as a seed, but it's so slow it basically has to be used
to fuzz a bit or two of rnd_state on some fairly long interval. Long
enough that if someone has a solution for extracting rnd_state it might
not add any additional security. Or that is my take, since I'm not a big
fan of any independent counter/clock based RNG seeding (AFAIK, entropy
from clocks requires multiple _independent_ sources).

This is a bit out of my wheelhouse, so I defer to anyone with a better
feel or some actual data.

The best plan I have at the moment is just some deferred work to call
kstack_rng_setup on some call or time based interval, which AFAIK isn't
ideal for RT workloads which expect ~100% CPU isolation. Plus, that
solution assumes we have some handle on how fast an attacker can extract
kstackrng sufficiently to make predictions.

Again, thanks to everyone for looking at this,

Jeremy
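
For readers following the thread, here is a minimal userspace sketch of the two ideas discussed above: drawing a 9-bit offset from a per-CPU-style PRNG state, and fuzzing a bit or two of that state from a counter on some call-based interval. Everything named demo_* is hypothetical and made up for illustration. The generator is a plain xorshift32 standing in for the kernel's struct rnd_state, not the kernel's prandom implementation, and the counter value is simply passed in rather than read from CNTVCT. The alignment model assumes the 16-byte stack pointer alignment that arm64 requires, which is one way to reconcile the 0x1ff mask in the patch with the "5 bits" in the commit text. It says nothing about whether such perturbation adds security, which is the open question in the mail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rnd_state; NOT the kernel's generator. */
struct demo_rng {
	uint32_t s;     /* xorshift32 state, must never be zero */
	uint32_t calls; /* draws since the last seed or perturbation */
};

/* Seed the state, avoiding zero as the quoted patch does. */
static void demo_rng_seed(struct demo_rng *r, uint32_t seed)
{
	r->s = seed ? seed : 1u;
	r->calls = 0;
}

/* One xorshift32 step; a nonzero state always maps to a nonzero state. */
static uint32_t demo_rng_next(struct demo_rng *r)
{
	uint32_t x = r->s;

	assert(x != 0);
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	r->s = x;
	r->calls++;
	return x;
}

/* Same masking as kstack_rng() in the patch: keep the low 9 bits. */
static uint16_t demo_kstack_rng(struct demo_rng *r)
{
	return (uint16_t)(demo_rng_next(r) & 0x1ff);
}

/* Assumed model of 16-byte SP alignment: the low 4 bits are lost. */
static uint16_t demo_aligned_offset(uint16_t raw)
{
	return raw & 0x1f0;
}

/*
 * Every 'interval' draws, XOR the low two bits of a caller-supplied
 * counter into the state. No blocking and no call into a heavier RNG,
 * so the cost per call stays constant.
 */
static void demo_maybe_perturb(struct demo_rng *r, uint64_t counter,
			       uint32_t interval)
{
	uint32_t mixed;

	if (interval == 0 || r->calls < interval)
		return;
	mixed = r->s ^ (uint32_t)(counter & 0x3);
	r->s = mixed ? mixed : 1u; /* keep the state nonzero */
	r->calls = 0;
}

/* Count the distinct aligned offsets reachable from the 9-bit range. */
static unsigned int demo_count_aligned_offsets(void)
{
	unsigned char seen[0x200] = { 0 };
	unsigned int n = 0;
	uint16_t raw;

	for (raw = 0; raw < 0x200; raw++) {
		uint16_t off = demo_aligned_offset(raw);

		if (!seen[off]) {
			seen[off] = 1;
			n++;
		}
	}
	return n;
}
```

Under that alignment assumption, demo_count_aligned_offsets() reports how many distinct stack offsets survive, which is where a 9-bit mask would reduce to 5 usable bits. The perturbation path is deliberately cheap, but as noted in the mail, mixing in a couple of bits from a single clock source may not meaningfully slow an attacker who can already recover the state.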