linux-toolchains.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* randomize_kstack: To init or not to init?
@ 2021-12-09  9:58 Marco Elver
  2021-12-09 10:27 ` Peter Zijlstra
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Marco Elver @ 2021-12-09  9:58 UTC (permalink / raw)
  To: Kees Cook
  Cc: Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Zijlstra,
	Alexander Potapenko, Jann Horn, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
default since dcb7c0b9461c2, which is why this came on my radar. And
Clang also performs auto-init of allocas when auto-init is on
(https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
allocas.

add_random_kstack_offset() uses __builtin_alloca() to add a stack
offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
enabled, add_random_kstack_offset() will auto-init that unused portion
of the stack used to add an offset.

There are several problems with this:

	1. These offsets can be as large as 1023 bytes. Performing
	   memset() on them isn't exactly cheap, and this is done on
	   every syscall entry.

	2. Architectures adding add_random_kstack_offset() to syscall
	   entry implemented in C require them to be 'noinstr' (e.g. see
	   x86 and s390). The potential problem here is that a call to
	   memset may occur, which is not noinstr.

A defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:

 | vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
 | vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
 | vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
 | vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section

Switching to INIT_STACK_ALL_NONE resolves the warnings as expected.

To figure out what the right solution is, the first thing to figure out
is, do we actually want that offset portion of the stack to be
auto-init'd?

There are several options:

	A. Make memset (and probably all other mem-transfer functions)
	   noinstr compatible, if that is even possible. This only solves
	   problem #2.

	B. A workaround could be using a VLA with
	   __attribute__((uninitialized)), but requires some restructuring
	   to make sure the VLA remains in scope and other trickery to
	   convince the compiler to not give up that stack space.

	C. Introduce a new __builtin_alloca_uninitialized().

I think #C would be the most robust solution, but means this would
remain as-is for a while.

Preferences?

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09  9:58 randomize_kstack: To init or not to init? Marco Elver
@ 2021-12-09 10:27 ` Peter Zijlstra
  2021-12-09 12:58   ` Marco Elver
  2021-12-09 20:16 ` Segher Boessenkool
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2021-12-09 10:27 UTC (permalink / raw)
  To: Marco Elver
  Cc: Kees Cook, Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Alexander Potapenko, Jann Horn,
	Peter Collingbourne, kasan-dev, linux-kernel, llvm,
	linux-toolchains

On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
> Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> default since dcb7c0b9461c2, which is why this came on my radar. And
> Clang also performs auto-init of allocas when auto-init is on
> (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> allocas.
> 
> add_random_kstack_offset() uses __builtin_alloca() to add a stack
> offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> enabled, add_random_kstack_offset() will auto-init that unused portion
> of the stack used to add an offset.
> 
> There are several problems with this:
> 
> 	1. These offsets can be as large as 1023 bytes. Performing
> 	   memset() on them isn't exactly cheap, and this is done on
> 	   every syscall entry.
> 
> 	2. Architectures adding add_random_kstack_offset() to syscall
> 	   entry implemented in C require them to be 'noinstr' (e.g. see
> 	   x86 and s390). The potential problem here is that a call to
> 	   memset may occur, which is not noinstr.
> 
> A defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
> 
>  | vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
>  | vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
>  | vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
>  | vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
> 
> Switching to INIT_STACK_ALL_NONE resolves the warnings as expected.
> 
> To figure out what the right solution is, the first thing to figure out
> is, do we actually want that offset portion of the stack to be
> auto-init'd?
> 
> There are several options:
> 
> 	A. Make memset (and probably all other mem-transfer functions)
> 	   noinstr compatible, if that is even possible. This only solves
> 	   problem #2.

While we can shut up objtool real easy, the bigger problem is that
noinstr also excludes things like kprobes and breakpoints and other such
goodness from being placed in the text.

> 	B. A workaround could be using a VLA with
> 	   __attribute__((uninitialized)), but requires some restructuring
> 	   to make sure the VLA remains in scope and other trickery to
> 	   convince the compiler to not give up that stack space.
> 
> 	C. Introduce a new __builtin_alloca_uninitialized().
> 
> I think #C would be the most robust solution, but means this would
> remain as-is for a while.
> 
> Preferences?

I'm with you on C.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09 10:27 ` Peter Zijlstra
@ 2021-12-09 12:58   ` Marco Elver
  0 siblings, 0 replies; 15+ messages in thread
From: Marco Elver @ 2021-12-09 12:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Kees Cook, Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Alexander Potapenko, Jann Horn,
	Peter Collingbourne, kasan-dev, linux-kernel, llvm,
	linux-toolchains

On Thu, 9 Dec 2021 at 11:27, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
[...]
> > There are several options:
> >
> >       A. Make memset (and probably all other mem-transfer functions)
> >          noinstr compatible, if that is even possible. This only solves
> >          problem #2.
>
> While we can shut up objtool real easy, the bigger problem is that
> noinstr also excludes things like kprobes and breakpoints and other such
> goodness from being placed in the text.
>
> >       B. A workaround could be using a VLA with
> >          __attribute__((uninitialized)), but requires some restructuring
> >          to make sure the VLA remains in scope and other trickery to
> >          convince the compiler to not give up that stack space.
> >
> >       C. Introduce a new __builtin_alloca_uninitialized().
> >
> > I think #C would be the most robust solution, but means this would
> > remain as-is for a while.
> >
> > Preferences?
>
> I'm with you on C.

Seems simple enough, so I've sent https://reviews.llvm.org/D115440 --
either way, there needs to be support to not initialize alloca'd
memory. Let's see where we end up.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09  9:58 randomize_kstack: To init or not to init? Marco Elver
  2021-12-09 10:27 ` Peter Zijlstra
@ 2021-12-09 20:16 ` Segher Boessenkool
  2021-12-09 20:33   ` Marco Elver
  2021-12-09 20:48 ` Kees Cook
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Segher Boessenkool @ 2021-12-09 20:16 UTC (permalink / raw)
  To: Marco Elver
  Cc: Kees Cook, Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Zijlstra,
	Alexander Potapenko, Jann Horn, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
> Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> default since dcb7c0b9461c2, which is why this came on my radar. And
> Clang also performs auto-init of allocas when auto-init is on
> (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> allocas.

The space allocated by alloca is not an automatic variable, so of course
it is not affected by this compiler flag.  And it should not, this flag
is explicitly for *small fixed-size* stack variables (initialising
others can be much too expensive).

> 	C. Introduce a new __builtin_alloca_uninitialized().

That is completely backwards.  That is the normal behaviour of alloca
already.  Also you can get __builtin_alloca inserted by the compiler
(for a variable length array for example), and you typically do not want
those initialised either, for the same reasons.


Segher

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09 20:16 ` Segher Boessenkool
@ 2021-12-09 20:33   ` Marco Elver
  2021-12-20  7:00     ` Marco Elver
  0 siblings, 1 reply; 15+ messages in thread
From: Marco Elver @ 2021-12-09 20:33 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Kees Cook, Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Zijlstra,
	Alexander Potapenko, Jann Horn, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

On Thu, 9 Dec 2021 at 21:19, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
> > Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> > default since dcb7c0b9461c2, which is why this came on my radar. And
> > Clang also performs auto-init of allocas when auto-init is on
> > (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> > aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> > allocas.
>
> The space allocated by alloca is not an automatic variable, so of course
> it is not affected by this compiler flag.  And it should not, this flag
> is explicitly for *small fixed-size* stack variables (initialising
> others can be much too expensive).
>
> >       C. Introduce a new __builtin_alloca_uninitialized().
>
> That is completely backwards.  That is the normal behaviour of alloca
> already.  Also you can get __builtin_alloca inserted by the compiler
> (for a variable length array for example), and you typically do not want
> those initialised either, for the same reasons.

You're right, if we're strict about it, initializing allocas is
technically out-of-scope of that feature.

So, option D: Add a param to control this, and probably it shouldn't
do it by default. Let's see how far that gets then.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09  9:58 randomize_kstack: To init or not to init? Marco Elver
  2021-12-09 10:27 ` Peter Zijlstra
  2021-12-09 20:16 ` Segher Boessenkool
@ 2021-12-09 20:48 ` Kees Cook
  2021-12-09 20:54   ` Marco Elver
  2021-12-09 21:14 ` Kees Cook
  2021-12-09 21:16 ` Jann Horn
  4 siblings, 1 reply; 15+ messages in thread
From: Kees Cook @ 2021-12-09 20:48 UTC (permalink / raw)
  To: Marco Elver
  Cc: Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Zijlstra,
	Alexander Potapenko, Jann Horn, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
> Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> default since dcb7c0b9461c2, which is why this came on my radar. And
> Clang also performs auto-init of allocas when auto-init is on
> (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> allocas.
> 
> add_random_kstack_offset() uses __builtin_alloca() to add a stack
> offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> enabled, add_random_kstack_offset() will auto-init that unused portion
> of the stack used to add an offset.
> 
> There are several problems with this:
> 
> 	1. These offsets can be as large as 1023 bytes. Performing
> 	   memset() on them isn't exactly cheap, and this is done on
> 	   every syscall entry.
> 
> 	2. Architectures adding add_random_kstack_offset() to syscall
> 	   entry implemented in C require them to be 'noinstr' (e.g. see
> 	   x86 and s390). The potential problem here is that a call to
> 	   memset may occur, which is not noinstr.
> 
> A defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
> 
>  | vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
>  | vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
>  | vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
>  | vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
> 
> Switching to INIT_STACK_ALL_NONE resolves the warnings as expected.
> 
> To figure out what the right solution is, the first thing to figure out
> is, do we actually want that offset portion of the stack to be
> auto-init'd?
> 
> There are several options:
> 
> 	A. Make memset (and probably all other mem-transfer functions)
> 	   noinstr compatible, if that is even possible. This only solves
> 	   problem #2.

I'd agree: "A" isn't going to work well here.

> 
> 	B. A workaround could be using a VLA with
> 	   __attribute__((uninitialized)), but requires some restructuring
> 	   to make sure the VLA remains in scope and other trickery to
> 	   convince the compiler to not give up that stack space.

I was hoping the existing trickery would work for a VLA, but it seems
not. It'd be nice if it could work with a VLA, which could just gain the
attribute and we'd be done.

> 	C. Introduce a new __builtin_alloca_uninitialized().

Hrm, this means conditional logic between compilers, too. :(

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09 20:48 ` Kees Cook
@ 2021-12-09 20:54   ` Marco Elver
  2021-12-09 21:11     ` Alexander Potapenko
  0 siblings, 1 reply; 15+ messages in thread
From: Marco Elver @ 2021-12-09 20:54 UTC (permalink / raw)
  To: Kees Cook
  Cc: Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Zijlstra,
	Alexander Potapenko, Jann Horn, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

On Thu, 9 Dec 2021 at 21:48, Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
> > Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> > default since dcb7c0b9461c2, which is why this came on my radar. And
> > Clang also performs auto-init of allocas when auto-init is on
> > (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> > aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> > allocas.
> >
> > add_random_kstack_offset() uses __builtin_alloca() to add a stack
> > offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> > enabled, add_random_kstack_offset() will auto-init that unused portion
> > of the stack used to add an offset.
> >
> > There are several problems with this:
> >
> >       1. These offsets can be as large as 1023 bytes. Performing
> >          memset() on them isn't exactly cheap, and this is done on
> >          every syscall entry.
> >
> >       2. Architectures adding add_random_kstack_offset() to syscall
> >          entry implemented in C require them to be 'noinstr' (e.g. see
> >          x86 and s390). The potential problem here is that a call to
> >          memset may occur, which is not noinstr.
> >
> > A defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
> >
> >  | vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
> >  | vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
> >  | vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
> >  | vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
> >
> > Switching to INIT_STACK_ALL_NONE resolves the warnings as expected.
> >
> > To figure out what the right solution is, the first thing to figure out
> > is, do we actually want that offset portion of the stack to be
> > auto-init'd?
> >
> > There are several options:
> >
> >       A. Make memset (and probably all other mem-transfer functions)
> >          noinstr compatible, if that is even possible. This only solves
> >          problem #2.
>
> I'd agree: "A" isn't going to work well here.
>
> >
> >       B. A workaround could be using a VLA with
> >          __attribute__((uninitialized)), but requires some restructuring
> >          to make sure the VLA remains in scope and other trickery to
> >          convince the compiler to not give up that stack space.
>
> I was hoping the existing trickery would work for a VLA, but it seems
> not. It'd be nice if it could work with a VLA, which could just gain the
> attribute and we'd be done.
>
> >       C. Introduce a new __builtin_alloca_uninitialized().
>
> Hrm, this means conditional logic between compilers, too. :(

And as Segher just pointed out, I think Clang has a "bug" because
explicit alloca() calls aren't "automatic storage". I think Clang
needs a new -mllvm param.

Because I think making #B work is quite ugly and also brittle. :-/

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09 20:54   ` Marco Elver
@ 2021-12-09 21:11     ` Alexander Potapenko
  2021-12-10  0:01       ` Segher Boessenkool
  0 siblings, 1 reply; 15+ messages in thread
From: Alexander Potapenko @ 2021-12-09 21:11 UTC (permalink / raw)
  To: Marco Elver, segher
  Cc: Kees Cook, Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Zijlstra, Jann Horn,
	Peter Collingbourne, kasan-dev, linux-kernel, llvm,
	linux-toolchains

On Thu, Dec 9, 2021 at 9:54 PM Marco Elver <elver@google.com> wrote:
>
> On Thu, 9 Dec 2021 at 21:48, Kees Cook <keescook@chromium.org> wrote:
> >
> > On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
> > > Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> > > default since dcb7c0b9461c2, which is why this came on my radar. And
> > > Clang also performs auto-init of allocas when auto-init is on
> > > (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> > > aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> > > allocas.
> > >
> > > add_random_kstack_offset() uses __builtin_alloca() to add a stack
> > > offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> > > enabled, add_random_kstack_offset() will auto-init that unused portion
> > > of the stack used to add an offset.
> > >
> > > There are several problems with this:
> > >
> > >       1. These offsets can be as large as 1023 bytes. Performing
> > >          memset() on them isn't exactly cheap, and this is done on
> > >          every syscall entry.
> > >
> > >       2. Architectures adding add_random_kstack_offset() to syscall
> > >          entry implemented in C require them to be 'noinstr' (e.g. see
> > >          x86 and s390). The potential problem here is that a call to
> > >          memset may occur, which is not noinstr.
> > >
> > > A defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
> > >
> > >  | vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
> > >  | vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
> > >  | vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
> > >  | vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
> > >
> > > Switching to INIT_STACK_ALL_NONE resolves the warnings as expected.
> > >
> > > To figure out what the right solution is, the first thing to figure out
> > > is, do we actually want that offset portion of the stack to be
> > > auto-init'd?
> > >
> > > There are several options:
> > >
> > >       A. Make memset (and probably all other mem-transfer functions)
> > >          noinstr compatible, if that is even possible. This only solves
> > >          problem #2.
> >
> > I'd agree: "A" isn't going to work well here.
> >
> > >
> > >       B. A workaround could be using a VLA with
> > >          __attribute__((uninitialized)), but requires some restructuring
> > >          to make sure the VLA remains in scope and other trickery to
> > >          convince the compiler to not give up that stack space.
> >
> > I was hoping the existing trickery would work for a VLA, but it seems
> > not. It'd be nice if it could work with a VLA, which could just gain the
> > attribute and we'd be done.
> >
> > >       C. Introduce a new __builtin_alloca_uninitialized().
> >
> > Hrm, this means conditional logic between compilers, too. :(
>
> And as Segher just pointed out, I think Clang has a "bug" because
> explicit alloca() calls aren't "automatic storage". I think Clang
> needs a new -mllvm param.

I don't think the original Clang flag was built with just "automatic
storage" in mind.
After all, people do forget to initialize their variables, regardless
of whether they are automatic stack variables, or malloc() or alloca()
allocations.

If __builtin_alloca() wasn't banned in the kernel, we'd probably want
it to return initialized memory, because otherwise people would be
making the same mistakes using it.
Now that there's a single call to __builtin_alloca() that happens to
suffer from initialization, it is hard to justify that initializing
allocas is a good thing to do.
But I believe developers in other projects don't want to worry about
how they allocate their memory when turning stack initialization on -
they just want to be on the safe side.

> Because I think making #B work is quite ugly and also brittle. :-/



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09  9:58 randomize_kstack: To init or not to init? Marco Elver
                   ` (2 preceding siblings ...)
  2021-12-09 20:48 ` Kees Cook
@ 2021-12-09 21:14 ` Kees Cook
  2021-12-09 21:16 ` Jann Horn
  4 siblings, 0 replies; 15+ messages in thread
From: Kees Cook @ 2021-12-09 21:14 UTC (permalink / raw)
  To: Marco Elver
  Cc: Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Zijlstra,
	Alexander Potapenko, Jann Horn, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
> Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> default since dcb7c0b9461c2, which is why this came on my radar. And
> Clang also performs auto-init of allocas when auto-init is on
> (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> allocas.
> 
> add_random_kstack_offset() uses __builtin_alloca() to add a stack
> offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> enabled, add_random_kstack_offset() will auto-init that unused portion
> of the stack used to add an offset.
> 
> There are several problems with this:
> 
> 	1. These offsets can be as large as 1023 bytes. Performing
> 	   memset() on them isn't exactly cheap, and this is done on
> 	   every syscall entry.
> 
> 	2. Architectures adding add_random_kstack_offset() to syscall
> 	   entry implemented in C require them to be 'noinstr' (e.g. see
> 	   x86 and s390). The potential problem here is that a call to
> 	   memset may occur, which is not noinstr.
> 
> A defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
> 
>  | vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
>  | vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
>  | vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
>  | vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
> 
> Switching to INIT_STACK_ALL_NONE resolves the warnings as expected.
> 
> To figure out what the right solution is, the first thing to figure out
> is, do we actually want that offset portion of the stack to be
> auto-init'd?

I actually can't reproduce this with the latest Clang. I see no memset
call with/without INIT_STACK_ALL_ZERO. I do see:

vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section

I assume that's from:

	struct bad_iret_stack tmp

But I'll try with Clang 11 and report back...

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09  9:58 randomize_kstack: To init or not to init? Marco Elver
                   ` (3 preceding siblings ...)
  2021-12-09 21:14 ` Kees Cook
@ 2021-12-09 21:16 ` Jann Horn
  2021-12-09 21:40   ` Marco Elver
  2021-12-11 17:01   ` David Laight
  4 siblings, 2 replies; 15+ messages in thread
From: Jann Horn @ 2021-12-09 21:16 UTC (permalink / raw)
  To: Marco Elver, Peter Zijlstra, Alexander Potapenko
  Cc: Kees Cook, Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

On Thu, Dec 9, 2021 at 10:58 AM Marco Elver <elver@google.com> wrote:
> Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> default since dcb7c0b9461c2, which is why this came on my radar. And
> Clang also performs auto-init of allocas when auto-init is on
> (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> allocas.
>
> add_random_kstack_offset() uses __builtin_alloca() to add a stack
> offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> enabled, add_random_kstack_offset() will auto-init that unused portion
> of the stack used to add an offset.
>
> There are several problems with this:
>
>         1. These offsets can be as large as 1023 bytes. Performing
>            memset() on them isn't exactly cheap, and this is done on
>            every syscall entry.
>
>         2. Architectures adding add_random_kstack_offset() to syscall
>            entry implemented in C require them to be 'noinstr' (e.g. see
>            x86 and s390). The potential problem here is that a call to
>            memset may occur, which is not noinstr.

This doesn't just affect alloca(), right? According to godbolt.org
(https://godbolt.org/z/jYrWEx7o8):

void bar(char *p);
void foo() {
  char arr[512];
  bar(arr);
}

when compiled with "-ftrivial-auto-var-init=pattern -O2 -mno-sse"
gives this result:

foo:                                    # @foo
        push    rbx
        sub     rsp, 512
        mov     rbx, rsp
        mov     edx, 512
        mov     rdi, rbx
        mov     esi, 170
        call    memset@PLT
        mov     rdi, rbx
        call    bar
        add     rsp, 512
        pop     rbx
        ret

So I think to fix this properly in a way that doesn't conflict with
noinstr validation, I think you'll have to add a compiler flag that
lets you specify a noinstr-safe replacement for memset() that should
be used here?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09 21:16 ` Jann Horn
@ 2021-12-09 21:40   ` Marco Elver
  2021-12-11 17:01   ` David Laight
  1 sibling, 0 replies; 15+ messages in thread
From: Marco Elver @ 2021-12-09 21:40 UTC (permalink / raw)
  To: Jann Horn
  Cc: Peter Zijlstra, Alexander Potapenko, Kees Cook, Thomas Gleixner,
	Nathan Chancellor, Nick Desaulniers, Elena Reshetova,
	Mark Rutland, Peter Collingbourne, kasan-dev, linux-kernel, llvm,
	linux-toolchains

On Thu, 9 Dec 2021 at 22:16, Jann Horn <jannh@google.com> wrote:
>
> On Thu, Dec 9, 2021 at 10:58 AM Marco Elver <elver@google.com> wrote:
> > Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> > default since dcb7c0b9461c2, which is why this came on my radar. And
> > Clang also performs auto-init of allocas when auto-init is on
> > (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> > aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> > allocas.
> >
> > add_random_kstack_offset() uses __builtin_alloca() to add a stack
> > offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> > enabled, add_random_kstack_offset() will auto-init that unused portion
> > of the stack used to add an offset.
> >
> > There are several problems with this:
> >
> >         1. These offsets can be as large as 1023 bytes. Performing
> >            memset() on them isn't exactly cheap, and this is done on
> >            every syscall entry.
> >
> >         2. Architectures adding add_random_kstack_offset() to syscall
> >            entry implemented in C require them to be 'noinstr' (e.g. see
> >            x86 and s390). The potential problem here is that a call to
> >            memset may occur, which is not noinstr.
>
> This doesn't just affect alloca(), right? According to godbolt.org
> (https://godbolt.org/z/jYrWEx7o8):
>
> void bar(char *p);
> void foo() {
>   char arr[512];
>   bar(arr);
> }
>
> when compiled with "-ftrivial-auto-var-init=pattern -O2 -mno-sse"
> gives this result:
>
> foo:                                    # @foo
>         push    rbx
>         sub     rsp, 512
>         mov     rbx, rsp
>         mov     edx, 512
>         mov     rdi, rbx
>         mov     esi, 170
>         call    memset@PLT
>         mov     rdi, rbx
>         call    bar
>         add     rsp, 512
>         pop     rbx
>         ret
>
> So I think to fix this properly in a way that doesn't conflict with
> noinstr validation, I think you'll have to add a compiler flag that
> lets you specify a noinstr-safe replacement for memset() that should
> be used here?

Yeah, this story isn't over with __builtin_alloca().

A workaround would be to use __attribute__((uninitialized)). Of course
that implies there are no uninit bugs. ;-)
To initialize in noinstr, __memset can be used explicitly.

Maybe there's some guidance on what is and what isn't ok in noinstr
code so we can actually decide what is the right thing to do. I found
this: https://lore.kernel.org/all/878rx5b7i5.ffs@tglx/

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09 21:11     ` Alexander Potapenko
@ 2021-12-10  0:01       ` Segher Boessenkool
  0 siblings, 0 replies; 15+ messages in thread
From: Segher Boessenkool @ 2021-12-10  0:01 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Marco Elver, Kees Cook, Thomas Gleixner, Nathan Chancellor,
	Nick Desaulniers, Elena Reshetova, Mark Rutland, Peter Zijlstra,
	Jann Horn, Peter Collingbourne, kasan-dev, linux-kernel, llvm,
	linux-toolchains

On Thu, Dec 09, 2021 at 10:11:19PM +0100, Alexander Potapenko wrote:
> On Thu, Dec 9, 2021 at 9:54 PM Marco Elver <elver@google.com> wrote:
> >
> > On Thu, 9 Dec 2021 at 21:48, Kees Cook <keescook@chromium.org> wrote:
> > >
> > > On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
> > > > Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> > > > default since dcb7c0b9461c2, which is why this came on my radar. And
> > > > Clang also performs auto-init of allocas when auto-init is on
> > > > (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> > > > aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> > > > allocas.
> > > >
> > > > add_random_kstack_offset() uses __builtin_alloca() to add a stack
> > > > offset. This means, when CONFIG_INIT_STACK_ALL_{ZERO,PATTERN} is
> > > > enabled, add_random_kstack_offset() will auto-init that unused portion
> > > > of the stack used to add an offset.
> > > >
> > > > There are several problems with this:
> > > >
> > > >       1. These offsets can be as large as 1023 bytes. Performing
> > > >          memset() on them isn't exactly cheap, and this is done on
> > > >          every syscall entry.
> > > >
> > > >       2. Architectures adding add_random_kstack_offset() to syscall
> > > >          entry implemented in C require them to be 'noinstr' (e.g. see
> > > >          x86 and s390). The potential problem here is that a call to
> > > >          memset may occur, which is not noinstr.
> > > >
> > > > A defconfig kernel with Clang 11 and CONFIG_VMLINUX_VALIDATION shows:
> > > >
> > > >  | vmlinux.o: warning: objtool: do_syscall_64()+0x9d: call to memset() leaves .noinstr.text section
> > > >  | vmlinux.o: warning: objtool: do_int80_syscall_32()+0xab: call to memset() leaves .noinstr.text section
> > > >  | vmlinux.o: warning: objtool: __do_fast_syscall_32()+0xe2: call to memset() leaves .noinstr.text section
> > > >  | vmlinux.o: warning: objtool: fixup_bad_iret()+0x2f: call to memset() leaves .noinstr.text section
> > > >
> > > > Switching to INIT_STACK_ALL_NONE resolves the warnings as expected.
> > > >
> > > > To figure out what the right solution is, the first thing to figure out
> > > > is, do we actually want that offset portion of the stack to be
> > > > auto-init'd?
> > > >
> > > > There are several options:
> > > >
> > > >       A. Make memset (and probably all other mem-transfer functions)
> > > >          noinstr compatible, if that is even possible. This only solves
> > > >          problem #2.
> > >
> > > I'd agree: "A" isn't going to work well here.
> > >
> > > >
> > > >       B. A workaround could be using a VLA with
> > > >          __attribute__((uninitialized)), but requires some restructuring
> > > >          to make sure the VLA remains in scope and other trickery to
> > > >          convince the compiler to not give up that stack space.
> > >
> > > I was hoping the existing trickery would work for a VLA, but it seems
> > > not. It'd be nice if it could work with a VLA, which could just gain the
> > > attribute and we'd be done.
> > >
> > > >       C. Introduce a new __builtin_alloca_uninitialized().
> > >
> > > Hrm, this means conditional logic between compilers, too. :(
> >
> > And as Segher just pointed out, I think Clang has a "bug" because
> > explicit alloca() calls aren't "automatic storage". I think Clang
> > needs a new -mllvm param.
> 
> I don't think the original Clang flag was built with just "automatic
> storage" in mind.

My comment was about the GCC -ftrivial-auto-var-init= flag.  It only
influences automatic variables, as the name suggests :-)

What it does is it adds an initialiser for the variable early on, long
before it is generating anything like target code.  Handling alloca has
to be done somewhere different (and probably should have its own flag
anyway).

> Now that there's a single call to __builtin_alloca() that happens to
> suffer from initialization, it is hard to justify that initializing
> allocas is a good thing to do.

The same argument holds for *all* unnecessary initialisations, btw; it
just isn't obvious what it costs for a single simple variable, but it
all adds up.

> But I believe developers in other projects don't want to worry about
> how they allocate their memory when turning stack initialization on -
> they just want to be on the safe side.

Sure.  And the kernel has a tiny stack anyway, so unless that alloca is
in a hot path (or wastefully allocates way more than is actually used),
initialising stuff isn't worse than anywhere else, just more obvious to
spot in your profiles (everything will easily fit in cache after all).


Segher

^ permalink raw reply	[flat|nested] 15+ messages in thread

* RE: randomize_kstack: To init or not to init?
  2021-12-09 21:16 ` Jann Horn
  2021-12-09 21:40   ` Marco Elver
@ 2021-12-11 17:01   ` David Laight
  2021-12-11 20:20     ` Segher Boessenkool
  1 sibling, 1 reply; 15+ messages in thread
From: David Laight @ 2021-12-11 17:01 UTC (permalink / raw)
  To: 'Jann Horn', Marco Elver, Peter Zijlstra, Alexander Potapenko
  Cc: Kees Cook, Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

From: Jann Horn
> Sent: 09 December 2021 21:16
...
> This doesn't just affect alloca(), right? According to godbolt.org
> (https://godbolt.org/z/jYrWEx7o8):
> 
> void bar(char *p);
> void foo() {
>   char arr[512];
>   bar(arr);
> }
> 
> when compiled with "-ftrivial-auto-var-init=pattern -O2 -mno-sse"
> gives this result:
> 
> foo:                                    # @foo
>         push    rbx
>         sub     rsp, 512
>         mov     rbx, rsp
>         mov     edx, 512
>         mov     rdi, rbx
>         mov     esi, 170
>         call    memset@PLT
>         mov     rdi, rbx
>         call    bar
>         add     rsp, 512
>         pop     rbx
>         ret

Jeepers - I don't ever want that to happen not ever...

There is plenty of userspace code that allocates large arrays on stack
(I bet some get into MB sizes) that are correctly bound-checked but
the expense of initialising them will be horrid.

So you end up with horrid, complex, more likely to be buggy, code
that tries to allocate things that are 'just big enough' rather
than just a sanity check on a large buffer.

Typical examples are char path[MAXPATH].
You know the path will almost certainly be < 100 bytes.
MAXPATH is overkill - but can be tested for.
But you don't want path[] initialised.
So you cane to pick a shorter length - and then it all goes 'TITSUP'
when the actual path is a bit longer than you allowed for.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-11 17:01   ` David Laight
@ 2021-12-11 20:20     ` Segher Boessenkool
  0 siblings, 0 replies; 15+ messages in thread
From: Segher Boessenkool @ 2021-12-11 20:20 UTC (permalink / raw)
  To: David Laight
  Cc: 'Jann Horn',
	Marco Elver, Peter Zijlstra, Alexander Potapenko, Kees Cook,
	Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

On Sat, Dec 11, 2021 at 05:01:07PM +0000, David Laight wrote:
> From: Jann Horn
> > void bar(char *p);
> > void foo() {
> >   char arr[512];
> >   bar(arr);
> > }

> >         call    memset@PLT

> There is plenty of userspace code that allocates large arrays on stack
> (I bet some get into MB sizes) that are correctly bound-checked but
> the expense of initialising them will be horrid.

Yes, you need ulimit -s much more often now than when the default limit
was introduced (in 1995 apparently); back then it was comparable to main
memory size, now it is a fraction of a thousandth of it.  But because of
the same you do not need to increase the stack size for pretty much
anything in distros now :-)

> So you end up with horrid, complex, more likely to be buggy, code
> that tries to allocate things that are 'just big enough' rather
> than just a sanity check on a large buffer.

Yes.  The only problem is this will touch memory that is cold in cache
still (because the stack grows down and arrays are addressed in the
positive direction).  This is a pretty minimal effect of course.

> Typical examples are char path[MAXPATH].
> You know the path will almost certainly be < 100 bytes.
> MAXPATH is overkill - but can be tested for.
> But you don't want path[] initialised.
> So you cane to pick a shorter length - and then it all goes 'TITSUP'
> when the actual path is a bit longer than you allowed for.

If you do this, you probably want to warn for any non-tail functions
that have such a stack allocation, because over-allocating there is
pretty bad.  Or maybe you want to warn whenever you omit the
initialisation even :-)


Segher

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: randomize_kstack: To init or not to init?
  2021-12-09 20:33   ` Marco Elver
@ 2021-12-20  7:00     ` Marco Elver
  0 siblings, 0 replies; 15+ messages in thread
From: Marco Elver @ 2021-12-20  7:00 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Kees Cook, Thomas Gleixner, Nathan Chancellor, Nick Desaulniers,
	Elena Reshetova, Mark Rutland, Peter Zijlstra,
	Alexander Potapenko, Jann Horn, Peter Collingbourne, kasan-dev,
	linux-kernel, llvm, linux-toolchains

On Thu, 9 Dec 2021 at 21:33, Marco Elver <elver@google.com> wrote:
> On Thu, 9 Dec 2021 at 21:19, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> > On Thu, Dec 09, 2021 at 10:58:01AM +0100, Marco Elver wrote:
> > > Clang supports CONFIG_INIT_STACK_ALL_ZERO, which appears to be the
> > > default since dcb7c0b9461c2, which is why this came on my radar. And
> > > Clang also performs auto-init of allocas when auto-init is on
> > > (https://reviews.llvm.org/D60548), with no way to skip. As far as I'm
> > > aware, GCC 12's upcoming -ftrivial-auto-var-init= doesn't yet auto-init
> > > allocas.
> >
> > The space allocated by alloca is not an automatic variable, so of course
> > it is not affected by this compiler flag.  And it should not, this flag
> > is explicitly for *small fixed-size* stack variables (initialising
> > others can be much too expensive).
> >
> > >       C. Introduce a new __builtin_alloca_uninitialized().
> >
> > That is completely backwards.  That is the normal behaviour of alloca
> > already.  Also you can get __builtin_alloca inserted by the compiler
> > (for a variable length array for example), and you typically do not want
> > those initialised either, for the same reasons.
>
> You're right, if we're strict about it, initializing allocas is
> technically out-of-scope of that feature.
>
> So, option D: Add a param to control this, and probably it shouldn't
> do it by default. Let's see how far that gets then.

Just an update: after some discussion, the Clang side says that
alloca() is in scope, because the intent is that trivially initialized
"automatic stack storage" should be handled by ftrivial-auto-var-init.
And alloca() is automatic stack storage:
https://www.gnu.org/software/libc/manual/html_node/Variable-Size-Automatic.html

So currently it looks like the builtin is the only solution in this case.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-12-20  7:00 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-09  9:58 randomize_kstack: To init or not to init? Marco Elver
2021-12-09 10:27 ` Peter Zijlstra
2021-12-09 12:58   ` Marco Elver
2021-12-09 20:16 ` Segher Boessenkool
2021-12-09 20:33   ` Marco Elver
2021-12-20  7:00     ` Marco Elver
2021-12-09 20:48 ` Kees Cook
2021-12-09 20:54   ` Marco Elver
2021-12-09 21:11     ` Alexander Potapenko
2021-12-10  0:01       ` Segher Boessenkool
2021-12-09 21:14 ` Kees Cook
2021-12-09 21:16 ` Jann Horn
2021-12-09 21:40   ` Marco Elver
2021-12-11 17:01   ` David Laight
2021-12-11 20:20     ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).