linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/13] stackleak: fixes and rework
@ 2022-04-27 17:31 Mark Rutland
  2022-04-27 17:31 ` [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack() Mark Rutland
                   ` (13 more replies)
  0 siblings, 14 replies; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, alex.popov, keescook
  Cc: akpm, catalin.marinas, luto, mark.rutland, will

Hi Alexander, Kees,

This is the vs I promised. Since Alexander wanted to look at this in
more detail (and since this is subtle and needs review), I'm assuming
that Kees will pick this up some time next week after that's happened,
if all goes well. :)

This series reworks the stackleak code and the associated LKDTM test.
The first patch fixes some latent issues on arm64, and the subsequent
patches improve the code to improve clarity and permit better code
generation. Patches 8-10 address some latent issues in the LKDTM test
and add more diagnostic output.

Since v1 [1]:
* Fix and rework the LKDTM test
* Rework the poison scan

[1] https://lore.kernel.org/lkml/20220425115603.781311-1-mark.rutland@arm.com/

Benchmarking arm64 with a QEMU HVF VM on an M1 Macbook Pro shows a
reasonable improvement when stackleak is enabled. I've included figures
for when stackleak is dynamically disabled with v2:

* Calling getpid 1^22 times in a loop (avg 50 runs)
  
  5.18-rc1/on: 0.652099387s ( +-  0.13% )
  v1/on:       0.641005661s ( +-  0.13% ) ; 1.7% improvement
  v2/on:       0.611699824s ( +-  0.08% ) ; 6.2% improvement
  v2/off:      0.507505632s ( +-  0.15% )

* perf bench sched pipe (single run)

  5.18-rc1/on: 2.138s total
  v1/on:       2.118s total ; 0.93% improvement
  v2/on:       2.116s total ; 1.03% improvement
  v2/off:      1.708s total

The large jump between v1 and v2 is largely due to the changes to poison
scanning avoiding redundantly overwriting poison. For the getpid syscall
the poison which gets overwritten is a substantial fraction of the stack
usage, and for more involved syscalls this may be a trivial fraction, so
the average benefit is fairly small.

With the whole series applied, the LKDTM test reliably passes on both
arm64 and x86_64, e.g.

| # uname -a
| Linux buildroot 5.18.0-rc1-00013-g26f638ab0d7c #3 SMP PREEMPT Wed Apr 27 16:21:37 BST 2022 aarch64 GNU/Linux
| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT 
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: stackleak stack usage:
|   high offset: 336 bytes
|   current:     688 bytes
|   lowest:      1232 bytes
|   tracked:     1232 bytes
|   untracked:   672 bytes
|   poisoned:    14136 bytes
|   low offset:  8 bytes
| lkdtm: OK: the rest of the thread stack is properly erased

The first patch fixes the major issues on arm64, and is Cc'd to stable
for backporting.

The second patch is a trivial optimization for when stackleak is
dynamically disabled.

The subsequent patches rework the way stackleak manipulates the stack
boundary values. This is partially for clarity (e.g. with separate 'low'
and 'high' boundary variables), and also permits the compiler to
generate more optimal assembly by generating the high and low bounds
from the same base.

Patch 6 changes the way that `current->lowest_stack` is reset prior to
return to userspace. The existing code uses an undocumented offset
relative to the top of the stack which doesn't make much sense (as thie
sometimes falls within the task's pt_regs, or sometimes adds 600+ bytes
to erase upon the next exit to userspace). For now I've removed the
offset entirely.

Patch 7 changes the way we scan for poison. This fixes what I believe
are fencepost errors, and also avoids the need to redundantly overwrite
poison.

Patches 8-11 rework the LKDTM test, fixing cases where the test could
spuriously fail and improving the diagnostic output.

Patches 12 and 13 add stackleak_erase_on_task_stack() and
stackleak_erase_off_task_stack() that can be used when a caller knows
they're always on or off the task stack respectively, avoiding redundant
logic to check this and generate the high boundary value. The former is
used on arm64 where we always perform the erase on the task stack, and
I've included the latter for completeness (as e.g. it could be used on
x86) given it is trivial.

Thanks,
Mark.

Mark Rutland (13):
  arm64: stackleak: fix current_top_of_stack()
  stackleak: move skip_erasing() check earlier
  stackleak: remove redundant check
  stackleak: rework stack low bound handling
  stackleak: clarify variable names
  stackleak: rework stack high bound handling
  stackleak: rework poison scanning
  lkdtm/stackleak: avoid spurious failure
  lkdtm/stackleak: rework boundary management
  lkdtm/stackleak: prevent unexpected stack usage
  lkdtm/stackleak: check stack boundaries
  stackleak: add on/off stack variants
  arm64: entry: use stackleak_erase_on_task_stack()

 arch/arm64/include/asm/processor.h |  10 +--
 arch/arm64/kernel/entry.S          |   2 +-
 drivers/misc/lkdtm/stackleak.c     | 133 +++++++++++++++++++----------
 include/linux/stackleak.h          |  55 +++++++++++-
 kernel/stackleak.c                 | 105 ++++++++++++++---------
 5 files changed, 210 insertions(+), 95 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 47+ messages in thread

* [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack()
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-05-04 16:41   ` Catalin Marinas
  2022-05-08 17:24   ` Alexander Popov
  2022-04-27 17:31 ` [PATCH v2 02/13] stackleak: move skip_erasing() check earlier Mark Rutland
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

Due to some historical confusion, arm64's current_top_of_stack() isn't
what the stackleak code expects. This could in theory result in a number
of problems, and practically results in an unnecessary performance hit.
We can avoid this by aligning the arm64 implementation with the x86
implementation.

The arm64 implementation of current_top_of_stack() was added
specifically for stackleak in commit:

  0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")

This was intended to be equivalent to the x86 implementation, but the
implementation, semantics, and performance characteristics differ
wildly:

* On x86, current_top_of_stack() returns the top of the current task's
  task stack, regardless of which stack is in active use.

  The implementation accesses a percpu variable which the x86 entry code
  maintains, and returns the location immediately above the pt_regs on
  the task stack (above which x86 has some padding).

* On arm64 current_top_of_stack() returns the top of the stack in active
  use (i.e. the one which is currently being used).

  The implementation checks the SP against a number of
  potentially-accessible stacks, and will BUG() if no stack is found.

The core stackleak_erase() code determines the upper bound of stack to
erase with:

| if (on_thread_stack())
|         boundary = current_stack_pointer;
| else
|         boundary = current_top_of_stack();

On arm64 stackleak_erase() is always called on a task stack, and
on_thread_stack() should always be true. On x86, stackleak_erase() is
mostly called on a trampoline stack, and is sometimes called on a task
stack.

Currently, this results in a lot of unnecessary code being generated for
arm64 for the impossible !on_thread_stack() case. Some of this is
inlined, bloating stackleak_erase(), while portions of this are left
out-of-line and permitted to be instrumented (which would be a
functional problem if that code were reachable).

As a first step towards improving this, this patch aligns arm64's
implementation of current_top_of_stack() with x86's, always returning
the top of the current task's stack. With GCC 11.1.0 this results in the
bulk of the unnecessary code being removed, including all of the
out-of-line instrumentable code.

While I don't believe there's a functional problem in practice I've
marked this as a fix since the semantic was clearly wrong, the fix
itself is simple, and other code might rely upon this in future.

Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/processor.h | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 73e38d9a540ce..6b1a12c23fe77 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -381,12 +381,10 @@ long get_tagged_addr_ctrl(struct task_struct *task);
  * of header definitions for the use of task_stack_page.
  */
 
-#define current_top_of_stack()								\
-({											\
-	struct stack_info _info;							\
-	BUG_ON(!on_accessible_stack(current, current_stack_pointer, 1, &_info));	\
-	_info.high;									\
-})
+/*
+ * The top of the current task's task stack
+ */
+#define current_top_of_stack()	((unsigned long)current->stack + THREAD_SIZE)
 #define on_thread_stack()	(on_task_stack(current, current_stack_pointer, 1, NULL))
 
 #endif /* __ASSEMBLY__ */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 02/13] stackleak: move skip_erasing() check earlier
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
  2022-04-27 17:31 ` [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack() Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-05-08 17:44   ` Alexander Popov
  2022-04-27 17:31 ` [PATCH v2 03/13] stackleak: remove redundant check Mark Rutland
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

In stackleak_erase() we check skip_erasing() after accessing some fields
from current. As generating the address of current uses asm which
hazards with the static branch asm, this work is always performed, even
when the static branch is patched to jump to the return a the end of the
function.

This patch avoids this redundant work by moving the skip_erasing() check
earlier.

To avoid complicating initialization within stackleak_erase(), the body
of the function is split out into a __stackleak_erase() helper, with the
check left in a wrapper function. The __stackleak_erase() helper is
marked __always_inline to ensure that this is inlined into
stackleak_erase() and not instrumented.

Before this patch, on x86-64 w/ GCC 11.1.0 the start of the function is:

<stackleak_erase>:
   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
   00 00
   48 8b 48 20             mov    0x20(%rax),%rcx
   48 8b 80 98 0a 00 00    mov    0xa98(%rax),%rax
   66 90                   xchg   %ax,%ax  <------------ static branch
   48 89 c2                mov    %rax,%rdx
   48 29 ca                sub    %rcx,%rdx
   48 81 fa ff 3f 00 00    cmp    $0x3fff,%rdx

After this patch, on x86-64 w/ GCC 11.1.0 the start of the function is:

<stackleak_erase>:
   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)  <--- static branch
   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
   00 00
   48 8b 48 20             mov    0x20(%rax),%rcx
   48 8b 80 98 0a 00 00    mov    0xa98(%rax),%rax
   48 89 c2                mov    %rax,%rdx
   48 29 ca                sub    %rcx,%rdx
   48 81 fa ff 3f 00 00    cmp    $0x3fff,%rdx

Before this patch, on arm64 w/ GCC 11.1.0 the start of the function is:

<stackleak_erase>:
   d503245f        bti     c
   d5384100        mrs     x0, sp_el0
   f9401003        ldr     x3, [x0, #32]
   f9451000        ldr     x0, [x0, #2592]
   d503201f        nop  <------------------------------- static branch
   d503233f        paciasp
   cb030002        sub     x2, x0, x3
   d287ffe1        mov     x1, #0x3fff
   eb01005f        cmp     x2, x1

After this patch, on arm64 w/ GCC 11.1.0 the start of the function is:

<stackleak_erase>:
   d503245f        bti     c
   d503201f        nop  <------------------------------- static branch
   d503233f        paciasp
   d5384100        mrs     x0, sp_el0
   f9401003        ldr     x3, [x0, #32]
   d287ffe1        mov     x1, #0x3fff
   f9451000        ldr     x0, [x0, #2592]
   cb030002        sub     x2, x0, x3
   eb01005f        cmp     x2, x1

While this may not be a huge win on its own, moving the static branch
will permit further optimization of the body of the function in
subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
---
 kernel/stackleak.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index ddb5a7f48d69e..753eab797a04d 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -70,7 +70,7 @@ late_initcall(stackleak_sysctls_init);
 #define skip_erasing()	false
 #endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
 
-asmlinkage void noinstr stackleak_erase(void)
+static __always_inline void __stackleak_erase(void)
 {
 	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
 	unsigned long kstack_ptr = current->lowest_stack;
@@ -78,9 +78,6 @@ asmlinkage void noinstr stackleak_erase(void)
 	unsigned int poison_count = 0;
 	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
 
-	if (skip_erasing())
-		return;
-
 	/* Check that 'lowest_stack' value is sane */
 	if (unlikely(kstack_ptr - boundary >= THREAD_SIZE))
 		kstack_ptr = boundary;
@@ -125,6 +122,14 @@ asmlinkage void noinstr stackleak_erase(void)
 	current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64;
 }
 
+asmlinkage void noinstr stackleak_erase(void)
+{
+	if (skip_erasing())
+		return;
+
+	__stackleak_erase();
+}
+
 void __used __no_caller_saved_registers noinstr stackleak_track_stack(void)
 {
 	unsigned long sp = current_stack_pointer;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 03/13] stackleak: remove redundant check
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
  2022-04-27 17:31 ` [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack() Mark Rutland
  2022-04-27 17:31 ` [PATCH v2 02/13] stackleak: move skip_erasing() check earlier Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-05-08 18:17   ` Alexander Popov
  2022-04-27 17:31 ` [PATCH v2 04/13] stackleak: rework stack low bound handling Mark Rutland
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

In __stackleak_erase() we check that the `erase_low` value derived from
`current->lowest_stack` is above the lowest legitimate stack pointer
value, but this is already enforced by stackleak_track_stack() when
recording the lowest stack value.

Remove the redundant check.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
---
 kernel/stackleak.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index 753eab797a04d..f7a0f8cf73c37 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -78,10 +78,6 @@ static __always_inline void __stackleak_erase(void)
 	unsigned int poison_count = 0;
 	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
 
-	/* Check that 'lowest_stack' value is sane */
-	if (unlikely(kstack_ptr - boundary >= THREAD_SIZE))
-		kstack_ptr = boundary;
-
 	/* Search for the poison value in the kernel stack */
 	while (kstack_ptr > boundary && poison_count <= depth) {
 		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 04/13] stackleak: rework stack low bound handling
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (2 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 03/13] stackleak: remove redundant check Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-04-27 17:31 ` [PATCH v2 05/13] stackleak: clarify variable names Mark Rutland
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

In stackleak_task_init(), stackleak_track_stack(), and
__stackleak_erase(), we open-code skipping the STACK_END_MAGIC at the
bottom of the stack. Each case is implemented slightly differently, and
only the __stackleak_erase() case is commented.

In stackleak_task_init() and stackleak_track_stack() we unconditionally
add sizeof(unsigned long) to the lowest stack address. In
stackleak_task_init() we use end_of_stack() for this, and in
stackleak_track_stack() we use task_stack_page(). In __stackleak_erase()
we handle this by detecting if `kstack_ptr` has hit the stack end
boundary, and if so, conditionally moving it above the magic.

This patch adds a new stackleak_task_low_bound() helper which is used in
all three cases, which unconditionally adds sizeof(unsigned long) to the
lowest address on the task stack, with commentary as to why. This uses
end_of_stack() as stackleak_task_init() did prior to this patch, as this
is consistent with the code in kernel/fork.c which initializes the
STACK_END_MAGIC value.

In __stackleak_erase() we no longer need to check whether we've spilled
into the STACK_END_MAGIC value, as stackleak_track_stack() ensures that
`current->lowest_stack` stops immediately above this, and similarly the
poison scan will stop immediately above this.

For stackleak_task_init() and stackleak_track_stack() this results in no
change to code generation. For __stackleak_erase() the generated
assembly is slightly simpler and shorter.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
---
 include/linux/stackleak.h | 15 ++++++++++++++-
 kernel/stackleak.c        | 14 ++++----------
 2 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
index ccaab2043fcd5..67430faa5c518 100644
--- a/include/linux/stackleak.h
+++ b/include/linux/stackleak.h
@@ -15,9 +15,22 @@
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 #include <asm/stacktrace.h>
 
+/*
+ * The lowest address on tsk's stack which we can plausibly erase.
+ */
+static __always_inline unsigned long
+stackleak_task_low_bound(const struct task_struct *tsk)
+{
+	/*
+	 * The lowest unsigned long on the task stack contains STACK_END_MAGIC,
+	 * which we must not corrupt.
+	 */
+	return (unsigned long)end_of_stack(tsk) + sizeof(unsigned long);
+}
+
 static inline void stackleak_task_init(struct task_struct *t)
 {
-	t->lowest_stack = (unsigned long)end_of_stack(t) + sizeof(unsigned long);
+	t->lowest_stack = stackleak_task_low_bound(t);
 # ifdef CONFIG_STACKLEAK_METRICS
 	t->prev_lowest_stack = t->lowest_stack;
 # endif
diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index f7a0f8cf73c37..24b7cf01b2972 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -72,9 +72,11 @@ late_initcall(stackleak_sysctls_init);
 
 static __always_inline void __stackleak_erase(void)
 {
+	const unsigned long task_stack_low = stackleak_task_low_bound(current);
+
 	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
 	unsigned long kstack_ptr = current->lowest_stack;
-	unsigned long boundary = (unsigned long)end_of_stack(current);
+	unsigned long boundary = task_stack_low;
 	unsigned int poison_count = 0;
 	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
 
@@ -88,13 +90,6 @@ static __always_inline void __stackleak_erase(void)
 		kstack_ptr -= sizeof(unsigned long);
 	}
 
-	/*
-	 * One 'long int' at the bottom of the thread stack is reserved and
-	 * should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK=y).
-	 */
-	if (kstack_ptr == boundary)
-		kstack_ptr += sizeof(unsigned long);
-
 #ifdef CONFIG_STACKLEAK_METRICS
 	current->prev_lowest_stack = kstack_ptr;
 #endif
@@ -140,8 +135,7 @@ void __used __no_caller_saved_registers noinstr stackleak_track_stack(void)
 	/* 'lowest_stack' should be aligned on the register width boundary */
 	sp = ALIGN(sp, sizeof(unsigned long));
 	if (sp < current->lowest_stack &&
-	    sp >= (unsigned long)task_stack_page(current) +
-						sizeof(unsigned long)) {
+	    sp >= stackleak_task_low_bound(current)) {
 		current->lowest_stack = sp;
 	}
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 05/13] stackleak: clarify variable names
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (3 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 04/13] stackleak: rework stack low bound handling Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-05-08 20:49   ` Alexander Popov
  2022-04-27 17:31 ` [PATCH v2 06/13] stackleak: rework stack high bound handling Mark Rutland
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

The logic within __stackleak_erase() can be a little hard to follow, as
`boundary` switches from being the low bound to the high bound mid way
through the function, and `kstack_ptr` is used to represent the start of
the region to erase while `boundary` represents the end of the region to
erase.

Make this a little clearer by consistently using clearer variable names.
The `boundary` variable is removed, the bounds of the region to erase
are described by `erase_low` and `erase_high`, and bounds of the task
stack are described by `task_stack_low` and `task_stck_high`.

As the same time, remove the comment above the variables, since it is
unclear whether it's intended as rationale, a complaint, or a TODO, and
is more confusing than helpful.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
---
 kernel/stackleak.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index 24b7cf01b2972..d5f684dc0a2d9 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -73,40 +73,38 @@ late_initcall(stackleak_sysctls_init);
 static __always_inline void __stackleak_erase(void)
 {
 	const unsigned long task_stack_low = stackleak_task_low_bound(current);
-
-	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
-	unsigned long kstack_ptr = current->lowest_stack;
-	unsigned long boundary = task_stack_low;
+	unsigned long erase_low = current->lowest_stack;
+	unsigned long erase_high;
 	unsigned int poison_count = 0;
 	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
 
 	/* Search for the poison value in the kernel stack */
-	while (kstack_ptr > boundary && poison_count <= depth) {
-		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
+	while (erase_low > task_stack_low && poison_count <= depth) {
+		if (*(unsigned long *)erase_low == STACKLEAK_POISON)
 			poison_count++;
 		else
 			poison_count = 0;
 
-		kstack_ptr -= sizeof(unsigned long);
+		erase_low -= sizeof(unsigned long);
 	}
 
 #ifdef CONFIG_STACKLEAK_METRICS
-	current->prev_lowest_stack = kstack_ptr;
+	current->prev_lowest_stack = erase_low;
 #endif
 
 	/*
-	 * Now write the poison value to the kernel stack. Start from
-	 * 'kstack_ptr' and move up till the new 'boundary'. We assume that
-	 * the stack pointer doesn't change when we write poison.
+	 * Now write the poison value to the kernel stack between 'erase_low'
+	 * and 'erase_high'. We assume that the stack pointer doesn't change
+	 * when we write poison.
 	 */
 	if (on_thread_stack())
-		boundary = current_stack_pointer;
+		erase_high = current_stack_pointer;
 	else
-		boundary = current_top_of_stack();
+		erase_high = current_top_of_stack();
 
-	while (kstack_ptr < boundary) {
-		*(unsigned long *)kstack_ptr = STACKLEAK_POISON;
-		kstack_ptr += sizeof(unsigned long);
+	while (erase_low < erase_high) {
+		*(unsigned long *)erase_low = STACKLEAK_POISON;
+		erase_low += sizeof(unsigned long);
 	}
 
 	/* Reset the 'lowest_stack' value for the next syscall */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 06/13] stackleak: rework stack high bound handling
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (4 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 05/13] stackleak: clarify variable names Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-05-08 21:27   ` Alexander Popov
  2022-04-27 17:31 ` [PATCH v2 07/13] stackleak: rework poison scanning Mark Rutland
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

Prior to returning to userpace, we reset current->lowest_stack to a
reasonable high bound. Currently we do this by subtracting the arbitrary
value `THREAD_SIZE/64` from the top of the stack, for reasons lost to
history.

Looking at configurations today:

* On i386 where THREAD_SIZE is 8K, the bound will be 128 bytes. The
  pt_regs at the top of the stack is 68 bytes (with 0 to 16 bytes of
  padding above), and so this covers an additional portion of 44 to 60
  bytes.

* On x86_64 where THREAD_SIZE is at least 16K (up to 32K with KASAN) the
  bound will be at least 256 bytes (up to 512 with KASAN). The pt_regs
  at the top of the stack is 168 bytes, and so this cover an additional
  88 bytes of stack (up to 344 with KASAN).

* On arm64 where THREAD_SIZE is at least 16K (up to 64K with 64K pages
  and VMAP_STACK), the bound will be at least 256 bytes (up to 1024 with
  KASAN). The pt_regs at the top of the stack is 336 bytes, so this can
  fall within the pt_regs, or can cover an additional 688 bytes of
  stack.

Clearly the `THREAD_SIZE/64` value doesn't make much sense -- in the
worst case, this will cause more than 600 bytes of stack to be erased
for every syscall, even if actual stack usage were substantially
smaller.

This patches makes this slightly less nonsensical by consistently
resetting current->lowest_stack to the base of the task pt_regs. For
clarity and for consistency with the handling of the low bound, the
generation of the high bound is split into a helper with commentary
explaining why.

Since the pt_regs at the top of the stack will be clobbered upon the
next exception entry, we don't need to poison these at exception exit.
By using task_pt_regs() as the high stack boundary instead of
current_top_of_stack() we avoid some redundant poisoning, and the
compiler can share the address generation between the poisoning and
restting of `current->lowest_stack`, making the generated code more
optimal.

It's not clear to me whether the existing `THREAD_SIZE/64` offset was a
dodgy heuristic to skip the pt_regs, or whether it was attempting to
minimize the number of times stackleak_check_stack() would have to
update `current->lowest_stack` when stack usage was shallow at the cost
of unconditionally poisoning a small portion of the stack for every exit
to userspace.

For now I've simply removed the offset, and if we need/want to minimize
updates for shallow stack usage it should be easy to add a better
heuristic atop, with appropriate commentary so we know what's going on.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
---
 include/linux/stackleak.h | 14 ++++++++++++++
 kernel/stackleak.c        | 19 ++++++++++++++-----
 2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
index 67430faa5c518..467661aeb4136 100644
--- a/include/linux/stackleak.h
+++ b/include/linux/stackleak.h
@@ -28,6 +28,20 @@ stackleak_task_low_bound(const struct task_struct *tsk)
 	return (unsigned long)end_of_stack(tsk) + sizeof(unsigned long);
 }
 
+/*
+ * The address immediately after the highest address on tsk's stack which we
+ * can plausibly erase.
+ */
+static __always_inline unsigned long
+stackleak_task_high_bound(const struct task_struct *tsk)
+{
+	/*
+	 * The task's pt_regs lives at the top of the task stack and will be
+	 * overwritten by exception entry, so there's no need to erase them.
+	 */
+	return (unsigned long)task_pt_regs(tsk);
+}
+
 static inline void stackleak_task_init(struct task_struct *t)
 {
 	t->lowest_stack = stackleak_task_low_bound(t);
diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index d5f684dc0a2d9..ba346d46218f5 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -73,6 +73,7 @@ late_initcall(stackleak_sysctls_init);
 static __always_inline void __stackleak_erase(void)
 {
 	const unsigned long task_stack_low = stackleak_task_low_bound(current);
+	const unsigned long task_stack_high = stackleak_task_high_bound(current);
 	unsigned long erase_low = current->lowest_stack;
 	unsigned long erase_high;
 	unsigned int poison_count = 0;
@@ -93,14 +94,22 @@ static __always_inline void __stackleak_erase(void)
 #endif
 
 	/*
-	 * Now write the poison value to the kernel stack between 'erase_low'
-	 * and 'erase_high'. We assume that the stack pointer doesn't change
-	 * when we write poison.
+	 * Write poison to the task's stack between 'erase_low' and
+	 * 'erase_high'.
+	 *
+	 * If we're running on a different stack (e.g. an entry trampoline
+	 * stack) we can erase everything below the pt_regs at the top of the
+	 * task stack.
+	 *
+	 * If we're running on the task stack itself, we must not clobber any
+	 * stack used by this function and its caller. We assume that this
+	 * function has a fixed-size stack frame, and the current stack pointer
+	 * doesn't change while we write poison.
 	 */
 	if (on_thread_stack())
 		erase_high = current_stack_pointer;
 	else
-		erase_high = current_top_of_stack();
+		erase_high = task_stack_high;
 
 	while (erase_low < erase_high) {
 		*(unsigned long *)erase_low = STACKLEAK_POISON;
@@ -108,7 +117,7 @@ static __always_inline void __stackleak_erase(void)
 	}
 
 	/* Reset the 'lowest_stack' value for the next syscall */
-	current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64;
+	current->lowest_stack = task_stack_high;
 }
 
 asmlinkage void noinstr stackleak_erase(void)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 07/13] stackleak: rework poison scanning
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (5 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 06/13] stackleak: rework stack high bound handling Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-05-09 13:51   ` Alexander Popov
  2022-04-27 17:31 ` [PATCH v2 08/13] lkdtm/stackleak: avoid spurious failure Mark Rutland
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

Currently we over-estimate the region of stack which must be erased.

To determine the region to be erased, we scan downards for a contiguous
block of poison values (or the low bound of the stack). There are a few
minor problems with this today:

* When we find a block of poison values, we include this block within
  the region to erase.

  As this is included within the region to erase, this causes us to
  redundantly overwrite 'STACKLEAK_SEARCH_DEPTH' (128) bytes with
  poison.

* As the loop condition checks 'poison_count <= depth', it will run an
  additional iteration after finding the contiguous block of poison,
  decrementing 'erase_low' once more than necessary.

  As this is included within the region to erase, this causes us to
  redundantly overwrite an additional unsigned long with poison.

* As we always decrement 'erase_low' after checking an element on the
  stack, we always include the element below this within the region to
  erase.

  As this is included within the region to erase, this causes us to
  redundantly overwrite an additional unsigned long with poison.

  Note that this is not a functional problem. As the loop condition
  checks 'erase_low > task_stack_low', we'll never clobber the
  STACK_END_MAGIC. As we always decrement 'erase_low' after this, we'll
  never fail to erase the element immediately above the STACK_END_MAGIC.

In total, this can cause us to erase `128 + 2 * sizeof(unsigned long)`
bytes more than necessary, which is unfortunate.

This patch reworks the logic to find the address immediately above the
poisoned region, by finding the lowest non-poisoned address. This is
factored into a stackleak_find_top_of_poison() helper both for clarity
and so that this can be shared with the LKDTM test in subsequent
patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
---
 include/linux/stackleak.h | 26 ++++++++++++++++++++++++++
 kernel/stackleak.c        | 18 ++++--------------
 2 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
index 467661aeb4136..c36e7a3b45e7e 100644
--- a/include/linux/stackleak.h
+++ b/include/linux/stackleak.h
@@ -42,6 +42,32 @@ stackleak_task_high_bound(const struct task_struct *tsk)
 	return (unsigned long)task_pt_regs(tsk);
 }
 
+/*
+ * Find the address immediately above the poisoned region of the stack, where
+ * that region falls between 'low' (inclusive) and 'high' (exclusive).
+ */
+static __always_inline unsigned long
+stackleak_find_top_of_poison(const unsigned long low, const unsigned long high)
+{
+	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
+	unsigned int poison_count = 0;
+	unsigned long poison_high = high;
+	unsigned long sp = high;
+
+	while (sp > low && poison_count < depth) {
+		sp -= sizeof(unsigned long);
+
+		if (*(unsigned long *)sp == STACKLEAK_POISON) {
+			poison_count++;
+		} else {
+			poison_count = 0;
+			poison_high = sp;
+		}
+	}
+
+	return poison_high;
+}
+
 static inline void stackleak_task_init(struct task_struct *t)
 {
 	t->lowest_stack = stackleak_task_low_bound(t);
diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index ba346d46218f5..afd54b8e10b83 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -74,20 +74,10 @@ static __always_inline void __stackleak_erase(void)
 {
 	const unsigned long task_stack_low = stackleak_task_low_bound(current);
 	const unsigned long task_stack_high = stackleak_task_high_bound(current);
-	unsigned long erase_low = current->lowest_stack;
-	unsigned long erase_high;
-	unsigned int poison_count = 0;
-	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
-
-	/* Search for the poison value in the kernel stack */
-	while (erase_low > task_stack_low && poison_count <= depth) {
-		if (*(unsigned long *)erase_low == STACKLEAK_POISON)
-			poison_count++;
-		else
-			poison_count = 0;
-
-		erase_low -= sizeof(unsigned long);
-	}
+	unsigned long erase_low, erase_high;
+
+	erase_low = stackleak_find_top_of_poison(task_stack_low,
+						 current->lowest_stack);
 
 #ifdef CONFIG_STACKLEAK_METRICS
 	current->prev_lowest_stack = erase_low;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 08/13] lkdtm/stackleak: avoid spurious failure
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (6 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 07/13] stackleak: rework poison scanning Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-04-27 17:31 ` [PATCH v2 09/13] lkdtm/stackleak: rework boundary management Mark Rutland
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

The lkdtm_STACKLEAK_ERASING() test scans for a contiguous block of
poison values between the low stack bound and the stack pointer, and
fails if it does not find a sufficiently large block.

This can happen legitimately if the scan the low stack bound, which
could occur if functions called prior to lkdtm_STACKLEAK_ERASING() used
a large amount of stack. If this were to occur, it means that the erased
portion of the stack is smaller than the size used by the scan, but does
not cause a functional problem

In practice this is unlikely to happen, but as this is legitimate and
would not result in a functional problem, the test should not fail in
this case.

Remove the spurious failure case.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
---
 drivers/misc/lkdtm/stackleak.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/misc/lkdtm/stackleak.c b/drivers/misc/lkdtm/stackleak.c
index 00db21ff115e4..707d530d509b7 100644
--- a/drivers/misc/lkdtm/stackleak.c
+++ b/drivers/misc/lkdtm/stackleak.c
@@ -53,13 +53,6 @@ void lkdtm_STACKLEAK_ERASING(void)
 			found = 0;
 	}
 
-	if (found <= check_depth) {
-		pr_err("FAIL: the erased part is not found (checked %lu bytes)\n",
-						i * sizeof(unsigned long));
-		test_failed = true;
-		goto end;
-	}
-
 	pr_info("the erased part begins after %lu not poisoned bytes\n",
 				(i - found) * sizeof(unsigned long));
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 09/13] lkdtm/stackleak: rework boundary management
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (7 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 08/13] lkdtm/stackleak: avoid spurious failure Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-05-04 19:07   ` Kees Cook
  2022-04-27 17:31 ` [PATCH v2 10/13] lkdtm/stackleak: prevent unexpected stack usage Mark Rutland
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

There are a few problems with the way the LKDTM STACKLEAK_ERASING test
manipulates the stack pointer and boundary values:

* It uses the address of a local variable to determine the current stack
  pointer, rather than using current_stack_pointer directly. As the
  local variable could be placed anywhere within the stack frame, this
  can be an over-estimate of the true stack pointer value.

* Is uses an estiamte of the current stack pointer as the upper boundary
  when scanning for poison, even though prior functions could have used
  more stack (and may have updated current->lowest stack accordingly).

* A pr_info() call is made in the middle of the test. As the printk()
  code is out-of-line and will make use of the stack, this could clobber
  poison and/or adjust current->lowest_stack. It would be better to log
  the metadata after the body of the test to avoid such problems.

These have been observed to result in spurious test failures on arm64.

In addition to this there are a couple of thigns which are sub-optimal:

* To avoid the STACK_END_MAGIC value, it conditionally modifies 'left'
  if this contains more than a single element, when it could instead
  calculate the bound unconditionally using stackleak_task_low_bound().

* It open-codes the poison scanning. It would be better if this used the
  same helper code as used by erasing function so that the two cannot
  diverge.

This patch reworks the test to avoid these issues, making use of the
recently introduced helpers to ensure this is aligned with the regular
stackleak code.

As the new code tests stack boundaries before accessing the stack, there
is no need to fail early when the tracked or untracked portions of the
stack extend all the way to the low stack boundary.

As stackleak_find_top_of_poison() is now used to find the top of the
poisoned region of the stack, the subsequent poison checking starts at
this boundary and verifies that stackleak_find_top_of_poison() is
working correctly.

The pr_info() which logged the untracked portion of stack is now moved
to the end of the function, and logs the size of all the portions of the
stack relevant to the test, including the portions at the top and bottom
of the stack which are not erased or scanned, and the current / lowest
recorded stack usage.

Tested on x86_64:

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: stackleak stack usage:
|   high offset: 168 bytes
|   current:     336 bytes
|   lowest:      656 bytes
|   tracked:     656 bytes
|   untracked:   400 bytes
|   poisoned:    15152 bytes
|   low offset:  8 bytes
| lkdtm: OK: the rest of the thread stack is properly erased

Tested on arm64:

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: stackleak stack usage:
|   high offset: 336 bytes
|   current:     656 bytes
|   lowest:      1232 bytes
|   tracked:     1232 bytes
|   untracked:   672 bytes
|   poisoned:    14136 bytes
|   low offset:  8 bytes
| lkdtm: OK: the rest of the thread stack is properly erased

Tested on arm64 with deliberate breakage to the starting stack value and
poison scanning:

| # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry STACKLEAK_ERASING
| lkdtm: FAIL: non-poison value 24 bytes below poison boundary: 0x0
| lkdtm: FAIL: non-poison value 32 bytes below poison boundary: 0xffff8000083dbc00
...
| lkdtm: FAIL: non-poison value 1912 bytes below poison boundary: 0x78b4b9999e8cb15
| lkdtm: FAIL: non-poison value 1920 bytes below poison boundary: 0xffff8000083db400
| lkdtm: stackleak stack usage:
|   high offset: 336 bytes
|   current:     688 bytes
|   lowest:      1232 bytes
|   tracked:     576 bytes
|   untracked:   288 bytes
|   poisoned:    15176 bytes
|   low offset:  8 bytes
| lkdtm: FAIL: the thread stack is NOT properly erased!
| lkdtm: Unexpected! This kernel (5.18.0-rc1-00013-g1f7b1f1e29e0-dirty aarch64) was built with CONFIG_GCC_PLUGIN_STACKLEAK=y

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
---
 drivers/misc/lkdtm/stackleak.c | 86 +++++++++++++++++++---------------
 1 file changed, 47 insertions(+), 39 deletions(-)

diff --git a/drivers/misc/lkdtm/stackleak.c b/drivers/misc/lkdtm/stackleak.c
index 707d530d509b7..0aafa46ced7d6 100644
--- a/drivers/misc/lkdtm/stackleak.c
+++ b/drivers/misc/lkdtm/stackleak.c
@@ -13,59 +13,67 @@
 
 void lkdtm_STACKLEAK_ERASING(void)
 {
-	unsigned long *sp, left, found, i;
-	const unsigned long check_depth =
-			STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
+	const unsigned long task_stack_base = (unsigned long)task_stack_page(current);
+	const unsigned long task_stack_low = stackleak_task_low_bound(current);
+	const unsigned long task_stack_high = stackleak_task_high_bound(current);
+	const unsigned long current_sp = current_stack_pointer;
+	const unsigned long lowest_sp = current->lowest_stack;
+	unsigned long untracked_high;
+	unsigned long poison_high, poison_low;
 	bool test_failed = false;
 
 	/*
-	 * For the details about the alignment of the poison values, see
-	 * the comment in stackleak_track_stack().
+	 * Depending on what has run prior to this test, the lowest recorded
+	 * stack pointer could be above or below the current stack pointer.
+	 * Start from the lowest of the two.
+	 *
+	 * Poison values are naturally-aligned unsigned longs. As the current
+	 * stack pointer might not be sufficiently aligned, we must align
+	 * downwards to find the lowest known stack pointer value. This is the
+	 * high boundary for a portion of the stack which may have been used
+	 * without being tracked, and has to be scanned for poison.
 	 */
-	sp = PTR_ALIGN(&i, sizeof(unsigned long));
-
-	left = ((unsigned long)sp & (THREAD_SIZE - 1)) / sizeof(unsigned long);
-	sp--;
+	untracked_high = min(current_sp, lowest_sp);
+	untracked_high = ALIGN_DOWN(untracked_high, sizeof(unsigned long));
 
 	/*
-	 * One 'long int' at the bottom of the thread stack is reserved
-	 * and not poisoned.
+	 * Find the top of the poison in the same way as the erasing code.
 	 */
-	if (left > 1) {
-		left--;
-	} else {
-		pr_err("FAIL: not enough stack space for the test\n");
-		test_failed = true;
-		goto end;
-	}
-
-	pr_info("checking unused part of the thread stack (%lu bytes)...\n",
-					left * sizeof(unsigned long));
+	poison_high = stackleak_find_top_of_poison(task_stack_low, untracked_high);
 
 	/*
-	 * Search for 'check_depth' poison values in a row (just like
-	 * stackleak_erase() does).
+	 * Check whether the poisoned portion of the stack (if any) consists
+	 * entirely of poison. This verifies the entries that
+	 * stackleak_find_top_of_poison() should have checked.
 	 */
-	for (i = 0, found = 0; i < left && found <= check_depth; i++) {
-		if (*(sp - i) == STACKLEAK_POISON)
-			found++;
-		else
-			found = 0;
-	}
+	poison_low = poison_high;
+	while (poison_low > task_stack_low) {
+		poison_low -= sizeof(unsigned long);
 
-	pr_info("the erased part begins after %lu not poisoned bytes\n",
-				(i - found) * sizeof(unsigned long));
+		if (*(unsigned long *)poison_low == STACKLEAK_POISON)
+			continue;
 
-	/* The rest of thread stack should be erased */
-	for (; i < left; i++) {
-		if (*(sp - i) != STACKLEAK_POISON) {
-			pr_err("FAIL: bad value number %lu in the erased part: 0x%lx\n",
-								i, *(sp - i));
-			test_failed = true;
-		}
+		pr_err("FAIL: non-poison value %lu bytes below poison boundary: 0x%lx\n",
+		       poison_high - poison_low, *(unsigned long *)poison_low);
+		test_failed = true;
 	}
 
-end:
+	pr_info("stackleak stack usage:\n"
+		"  high offset: %lu bytes\n"
+		"  current:     %lu bytes\n"
+		"  lowest:      %lu bytes\n"
+		"  tracked:     %lu bytes\n"
+		"  untracked:   %lu bytes\n"
+		"  poisoned:    %lu bytes\n"
+		"  low offset:  %lu bytes\n",
+		task_stack_base + THREAD_SIZE - task_stack_high,
+		task_stack_high - current_sp,
+		task_stack_high - lowest_sp,
+		task_stack_high - untracked_high,
+		untracked_high - poison_high,
+		poison_high - task_stack_low,
+		task_stack_low - task_stack_base);
+
 	if (test_failed) {
 		pr_err("FAIL: the thread stack is NOT properly erased!\n");
 		pr_expected_config(CONFIG_GCC_PLUGIN_STACKLEAK);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 10/13] lkdtm/stackleak: prevent unexpected stack usage
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (8 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 09/13] lkdtm/stackleak: rework boundary management Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-04-27 17:31 ` [PATCH v2 11/13] lkdtm/stackleak: check stack boundaries Mark Rutland
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

The lkdtm_STACKLEAK_ERASING() test is instrumentable and runs with IRQs
unmasked, so it's possible for unrelated code to clobber the task stack
and/or manipulate current->lowest_stack while the test is running,
resulting in spurious failures.

The regular stackleak erasing code is non-instrumentable and runs with
IRQs masked, preventing similar issues.

Make the body of the test non-instrumentable, and run it with IRQs
masked, avoiding such spurious failures.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
---
 drivers/misc/lkdtm/stackleak.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/lkdtm/stackleak.c b/drivers/misc/lkdtm/stackleak.c
index 0aafa46ced7d6..46c60761a05ea 100644
--- a/drivers/misc/lkdtm/stackleak.c
+++ b/drivers/misc/lkdtm/stackleak.c
@@ -11,7 +11,20 @@
 #include "lkdtm.h"
 #include <linux/stackleak.h>
 
-void lkdtm_STACKLEAK_ERASING(void)
+/*
+ * Check that stackleak tracks the lowest stack pointer and erases the stack
+ * below this as expected.
+ *
+ * To prevent the lowest stack pointer changing during the test, IRQs are
+ * masked and instrumentation of this function is disabled. We assume that the
+ * compiler will create a fixed-size stack frame for this function.
+ *
+ * Any non-inlined function may make further use of the stack, altering the
+ * lowest stack pointer and/or clobbering poison values. To avoid spurious
+ * failures we must avoid printing until the end of the test or have already
+ * encountered a failure condition.
+ */
+static void noinstr check_stackleak_irqoff(void)
 {
 	const unsigned long task_stack_base = (unsigned long)task_stack_page(current);
 	const unsigned long task_stack_low = stackleak_task_low_bound(current);
@@ -81,3 +94,12 @@ void lkdtm_STACKLEAK_ERASING(void)
 		pr_info("OK: the rest of the thread stack is properly erased\n");
 	}
 }
+
+void lkdtm_STACKLEAK_ERASING(void)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	check_stackleak_irqoff();
+	local_irq_restore(flags);
+}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 11/13] lkdtm/stackleak: check stack boundaries
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (9 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 10/13] lkdtm/stackleak: prevent unexpected stack usage Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-04-27 17:31 ` [PATCH v2 12/13] stackleak: add on/off stack variants Mark Rutland
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

The stackleak code relies upon the current SP and lowest recorded SP
falling within expected task stack boundaries.

Check this at the start of the test.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
---
 drivers/misc/lkdtm/stackleak.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/misc/lkdtm/stackleak.c b/drivers/misc/lkdtm/stackleak.c
index 46c60761a05ea..52800583fd051 100644
--- a/drivers/misc/lkdtm/stackleak.c
+++ b/drivers/misc/lkdtm/stackleak.c
@@ -35,6 +35,25 @@ static void noinstr check_stackleak_irqoff(void)
 	unsigned long poison_high, poison_low;
 	bool test_failed = false;
 
+	/*
+	 * Check that the current and lowest recorded stack pointer values fall
+	 * within the expected task stack boundaries. These tests should never
+	 * fail unless the boundaries are incorrect or we're clobbering the
+	 * STACK_END_MAGIC, and in either casee something is seriously wrong.
+	 */
+	if (current_sp < task_stack_low || current_sp >= task_stack_high) {
+		pr_err("FAIL: current_stack_pointer (0x%lx) outside of task stack bounds [0x%lx..0x%lx]\n",
+		       current_sp, task_stack_low, task_stack_high - 1);
+		test_failed = true;
+		goto out;
+	}
+	if (lowest_sp < task_stack_low || lowest_sp >= task_stack_high) {
+		pr_err("FAIL: current->lowest_stack (0x%lx) outside of task stack bounds [0x%lx..0x%lx]\n",
+		       lowest_sp, task_stack_low, task_stack_high - 1);
+		test_failed = true;
+		goto out;
+	}
+
 	/*
 	 * Depending on what has run prior to this test, the lowest recorded
 	 * stack pointer could be above or below the current stack pointer.
@@ -87,6 +106,7 @@ static void noinstr check_stackleak_irqoff(void)
 		poison_high - task_stack_low,
 		task_stack_low - task_stack_base);
 
+out:
 	if (test_failed) {
 		pr_err("FAIL: the thread stack is NOT properly erased!\n");
 		pr_expected_config(CONFIG_GCC_PLUGIN_STACKLEAK);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 12/13] stackleak: add on/off stack variants
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (10 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 11/13] lkdtm/stackleak: check stack boundaries Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-04-27 17:31 ` [PATCH v2 13/13] arm64: entry: use stackleak_erase_on_task_stack() Mark Rutland
  2022-05-04 19:16 ` [PATCH v2 00/13] stackleak: fixes and rework Kees Cook
  13 siblings, 0 replies; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

The stackleak_erase() code dynamically handles being on a task stack or
another stack. In most cases, this is a fixed property of the caller,
which the caller is aware of, as an architecture might always return
using the task stack, or might always return using a trampoline stack.

This patch adds stackleak_erase_on_task_stack() and
stackleak_erase_off_task_stack() functions which callers can use to
avoid on_thread_stack() check and associated redundant work when the
calling stack is known. The existing stackleak_erase() is retained as a
safe default.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
---
 kernel/stackleak.c | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index afd54b8e10b83..c2c33d2202e9a 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -70,7 +70,7 @@ late_initcall(stackleak_sysctls_init);
 #define skip_erasing()	false
 #endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
 
-static __always_inline void __stackleak_erase(void)
+static __always_inline void __stackleak_erase(bool on_task_stack)
 {
 	const unsigned long task_stack_low = stackleak_task_low_bound(current);
 	const unsigned long task_stack_high = stackleak_task_high_bound(current);
@@ -96,7 +96,7 @@ static __always_inline void __stackleak_erase(void)
 	 * function has a fixed-size stack frame, and the current stack pointer
 	 * doesn't change while we write poison.
 	 */
-	if (on_thread_stack())
+	if (on_task_stack)
 		erase_high = current_stack_pointer;
 	else
 		erase_high = task_stack_high;
@@ -110,12 +110,41 @@ static __always_inline void __stackleak_erase(void)
 	current->lowest_stack = task_stack_high;
 }
 
+/*
+ * Erase and poison the portion of the task stack used since the last erase.
+ * Can be called from the task stack or an entry stack when the task stack is
+ * no longer in use.
+ */
 asmlinkage void noinstr stackleak_erase(void)
 {
 	if (skip_erasing())
 		return;
 
-	__stackleak_erase();
+	__stackleak_erase(on_thread_stack());
+}
+
+/*
+ * Erase and poison the portion of the task stack used since the last erase.
+ * Can only be called from the task stack.
+ */
+asmlinkage void noinstr stackleak_erase_on_task_stack(void)
+{
+	if (skip_erasing())
+		return;
+
+	__stackleak_erase(true);
+}
+
+/*
+ * Erase and poison the portion of the task stack used since the last erase.
+ * Can only be called from a stack other than the task stack.
+ */
+asmlinkage void noinstr stackleak_erase_off_task_stack(void)
+{
+	if (skip_erasing())
+		return;
+
+	__stackleak_erase(false);
 }
 
 void __used __no_caller_saved_registers noinstr stackleak_track_stack(void)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* [PATCH v2 13/13] arm64: entry: use stackleak_erase_on_task_stack()
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (11 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 12/13] stackleak: add on/off stack variants Mark Rutland
@ 2022-04-27 17:31 ` Mark Rutland
  2022-05-04 16:42   ` Catalin Marinas
  2022-05-04 19:16 ` [PATCH v2 00/13] stackleak: fixes and rework Kees Cook
  13 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-04-27 17:31 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: akpm, alex.popov, catalin.marinas, keescook, linux-kernel, luto,
	mark.rutland, will

On arm64 we always call stackleak_erase() on a task stack, and never
call it on another stack. We can avoid some redundant work by using
stackleak_erase_on_task_stack(), telling the stackleak code that it's
being called on a task stack.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/entry.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index ede028dee81b0..5b82b92924005 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -596,7 +596,7 @@ SYM_CODE_START_LOCAL(ret_to_user)
 	ldr	x19, [tsk, #TSK_TI_FLAGS]	// re-check for single-step
 	enable_step_tsk x19, x2
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
-	bl	stackleak_erase
+	bl	stackleak_erase_on_task_stack
 #endif
 	kernel_exit 0
 SYM_CODE_END(ret_to_user)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack()
  2022-04-27 17:31 ` [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack() Mark Rutland
@ 2022-05-04 16:41   ` Catalin Marinas
  2022-05-04 19:01     ` Kees Cook
  2022-05-08 17:24   ` Alexander Popov
  1 sibling, 1 reply; 47+ messages in thread
From: Catalin Marinas @ 2022-05-04 16:41 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, akpm, alex.popov, keescook, linux-kernel, luto, will

On Wed, Apr 27, 2022 at 06:31:16PM +0100, Mark Rutland wrote:
> Due to some historical confusion, arm64's current_top_of_stack() isn't
> what the stackleak code expects. This could in theory result in a number
> of problems, and practically results in an unnecessary performance hit.
> We can avoid this by aligning the arm64 implementation with the x86
> implementation.
> 
> The arm64 implementation of current_top_of_stack() was added
> specifically for stackleak in commit:
> 
>   0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> 
> This was intended to be equivalent to the x86 implementation, but the
> implementation, semantics, and performance characteristics differ
> wildly:
> 
> * On x86, current_top_of_stack() returns the top of the current task's
>   task stack, regardless of which stack is in active use.
> 
>   The implementation accesses a percpu variable which the x86 entry code
>   maintains, and returns the location immediately above the pt_regs on
>   the task stack (above which x86 has some padding).
> 
> * On arm64 current_top_of_stack() returns the top of the stack in active
>   use (i.e. the one which is currently being used).
> 
>   The implementation checks the SP against a number of
>   potentially-accessible stacks, and will BUG() if no stack is found.
> 
> The core stackleak_erase() code determines the upper bound of stack to
> erase with:
> 
> | if (on_thread_stack())
> |         boundary = current_stack_pointer;
> | else
> |         boundary = current_top_of_stack();
> 
> On arm64 stackleak_erase() is always called on a task stack, and
> on_thread_stack() should always be true. On x86, stackleak_erase() is
> mostly called on a trampoline stack, and is sometimes called on a task
> stack.
> 
> Currently, this results in a lot of unnecessary code being generated for
> arm64 for the impossible !on_thread_stack() case. Some of this is
> inlined, bloating stackleak_erase(), while portions of this are left
> out-of-line and permitted to be instrumented (which would be a
> functional problem if that code were reachable).
> 
> As a first step towards improving this, this patch aligns arm64's
> implementation of current_top_of_stack() with x86's, always returning
> the top of the current task's stack. With GCC 11.1.0 this results in the
> bulk of the unnecessary code being removed, including all of the
> out-of-line instrumentable code.
> 
> While I don't believe there's a functional problem in practice I've
> marked this as a fix since the semantic was clearly wrong, the fix
> itself is simple, and other code might rely upon this in future.
> 
> Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Popov <alex.popov@linux.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Will Deacon <will@kernel.org>

I thought this was queued already but I couldn't find it in -next. So:

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 13/13] arm64: entry: use stackleak_erase_on_task_stack()
  2022-04-27 17:31 ` [PATCH v2 13/13] arm64: entry: use stackleak_erase_on_task_stack() Mark Rutland
@ 2022-05-04 16:42   ` Catalin Marinas
  0 siblings, 0 replies; 47+ messages in thread
From: Catalin Marinas @ 2022-05-04 16:42 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, akpm, alex.popov, keescook, linux-kernel, luto, will

On Wed, Apr 27, 2022 at 06:31:28PM +0100, Mark Rutland wrote:
> On arm64 we always call stackleak_erase() on a task stack, and never
> call it on another stack. We can avoid some redundant work by using
> stackleak_erase_on_task_stack(), telling the stackleak code that it's
> being called on a task stack.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Popov <alex.popov@linux.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Will Deacon <will@kernel.org>

Acked-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack()
  2022-05-04 16:41   ` Catalin Marinas
@ 2022-05-04 19:01     ` Kees Cook
  2022-05-04 19:55       ` Catalin Marinas
  0 siblings, 1 reply; 47+ messages in thread
From: Kees Cook @ 2022-05-04 19:01 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Mark Rutland, linux-arm-kernel, akpm, alex.popov, linux-kernel,
	luto, will

On Wed, May 04, 2022 at 05:41:32PM +0100, Catalin Marinas wrote:
> On Wed, Apr 27, 2022 at 06:31:16PM +0100, Mark Rutland wrote:
> > [...]
> > Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Alexander Popov <alex.popov@linux.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Will Deacon <will@kernel.org>
> 
> I thought this was queued already but I couldn't find it in -next. So:
> 
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>

Should this patch go via the arm64 tree for -rc6, or should I just carry
it as part of the overall stackleak series?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 09/13] lkdtm/stackleak: rework boundary management
  2022-04-27 17:31 ` [PATCH v2 09/13] lkdtm/stackleak: rework boundary management Mark Rutland
@ 2022-05-04 19:07   ` Kees Cook
  0 siblings, 0 replies; 47+ messages in thread
From: Kees Cook @ 2022-05-04 19:07 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, akpm, alex.popov, catalin.marinas,
	linux-kernel, luto, will

On Wed, Apr 27, 2022 at 06:31:24PM +0100, Mark Rutland wrote:
> There are a few problems with the way the LKDTM STACKLEAK_ERASING test
> manipulates the stack pointer and boundary values:
> 
> * It uses the address of a local variable to determine the current stack
>   pointer, rather than using current_stack_pointer directly. As the
>   local variable could be placed anywhere within the stack frame, this
>   can be an over-estimate of the true stack pointer value.
> 
> * Is uses an estiamte of the current stack pointer as the upper boundary
>   when scanning for poison, even though prior functions could have used
>   more stack (and may have updated current->lowest stack accordingly).
> 
> * A pr_info() call is made in the middle of the test. As the printk()
>   code is out-of-line and will make use of the stack, this could clobber
>   poison and/or adjust current->lowest_stack. It would be better to log
>   the metadata after the body of the test to avoid such problems.

Yeah, I noticed this too when I was testing the v1 series. I started
cleaning it up, but your version is much better. :)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 00/13] stackleak: fixes and rework
  2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
                   ` (12 preceding siblings ...)
  2022-04-27 17:31 ` [PATCH v2 13/13] arm64: entry: use stackleak_erase_on_task_stack() Mark Rutland
@ 2022-05-04 19:16 ` Kees Cook
  13 siblings, 0 replies; 47+ messages in thread
From: Kees Cook @ 2022-05-04 19:16 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, mark.rutland, alex.popov
  Cc: Kees Cook, Andrew Morton, catalin.marinas, will, luto

On Wed, 27 Apr 2022 18:31:15 +0100, Mark Rutland wrote:
> This is the vs I promised. Since Alexander wanted to look at this in
> more detail (and since this is subtle and needs review), I'm assuming
> that Kees will pick this up some time next week after that's happened,
> if all goes well. :)
> 
> This series reworks the stackleak code and the associated LKDTM test.
> The first patch fixes some latent issues on arm64, and the subsequent
> patches improve the code to improve clarity and permit better code
> generation. Patches 8-10 address some latent issues in the LKDTM test
> and add more diagnostic output.
> 
> [...]

I fixed some small commit log typos, but otherwise this looks great. If
anything new comes up we can adjust it.

Applied to for-next/hardening, thanks!

[01/13] arm64: stackleak: fix current_top_of_stack()
        https://git.kernel.org/kees/c/4c849d27b729
[02/13] stackleak: move skip_erasing() check earlier
        https://git.kernel.org/kees/c/e98a7c56d73c
[03/13] stackleak: remove redundant check
        https://git.kernel.org/kees/c/e45d9f71deea
[04/13] stackleak: rework stack low bound handling
        https://git.kernel.org/kees/c/cbe7edb47d3c
[05/13] stackleak: clarify variable names
        https://git.kernel.org/kees/c/e9da2241ed85
[06/13] stackleak: rework stack high bound handling
        https://git.kernel.org/kees/c/cfef4372a4b7
[07/13] stackleak: rework poison scanning
        https://git.kernel.org/kees/c/ff5f6d37e5bc
[08/13] lkdtm/stackleak: avoid spurious failure
        https://git.kernel.org/kees/c/23fd893fa0d7
[09/13] lkdtm/stackleak: rework boundary management
        https://git.kernel.org/kees/c/f4cfacd92972
[10/13] lkdtm/stackleak: prevent unexpected stack usage
        https://git.kernel.org/kees/c/c393c0b98d75
[11/13] lkdtm/stackleak: check stack boundaries
        https://git.kernel.org/kees/c/b6bf5a354eca
[12/13] stackleak: add on/off stack variants
        https://git.kernel.org/kees/c/96c59349a56c
[13/13] arm64: entry: use stackleak_erase_on_task_stack()
        https://git.kernel.org/kees/c/d46ac904fd35

-- 
Kees Cook


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack()
  2022-05-04 19:01     ` Kees Cook
@ 2022-05-04 19:55       ` Catalin Marinas
  2022-05-05  8:25         ` Will Deacon
  0 siblings, 1 reply; 47+ messages in thread
From: Catalin Marinas @ 2022-05-04 19:55 UTC (permalink / raw)
  To: Kees Cook
  Cc: Mark Rutland, linux-arm-kernel, akpm, alex.popov, linux-kernel,
	luto, will

On Wed, May 04, 2022 at 12:01:11PM -0700, Kees Cook wrote:
> On Wed, May 04, 2022 at 05:41:32PM +0100, Catalin Marinas wrote:
> > On Wed, Apr 27, 2022 at 06:31:16PM +0100, Mark Rutland wrote:
> > > [...]
> > > Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> > > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Alexander Popov <alex.popov@linux.com>
> > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > Cc: Andy Lutomirski <luto@kernel.org>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Kees Cook <keescook@chromium.org>
> > > Cc: Will Deacon <will@kernel.org>
> > 
> > I thought this was queued already but I couldn't find it in -next. So:
> > 
> > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> 
> Should this patch go via the arm64 tree for -rc6, or should I just carry
> it as part of the overall stackleak series?

I'll leave this up to Will (we take turns in managing the kernel
releases) but it doesn't look urgent at all to me since it fixes a
commit in 4.19.

-- 
Catalin

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack()
  2022-05-04 19:55       ` Catalin Marinas
@ 2022-05-05  8:25         ` Will Deacon
  0 siblings, 0 replies; 47+ messages in thread
From: Will Deacon @ 2022-05-05  8:25 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Kees Cook, Mark Rutland, linux-arm-kernel, akpm, alex.popov,
	linux-kernel, luto

On Wed, May 04, 2022 at 08:55:39PM +0100, Catalin Marinas wrote:
> On Wed, May 04, 2022 at 12:01:11PM -0700, Kees Cook wrote:
> > On Wed, May 04, 2022 at 05:41:32PM +0100, Catalin Marinas wrote:
> > > On Wed, Apr 27, 2022 at 06:31:16PM +0100, Mark Rutland wrote:
> > > > [...]
> > > > Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > > > Cc: Alexander Popov <alex.popov@linux.com>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: Andy Lutomirski <luto@kernel.org>
> > > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > Cc: Kees Cook <keescook@chromium.org>
> > > > Cc: Will Deacon <will@kernel.org>
> > > 
> > > I thought this was queued already but I couldn't find it in -next. So:
> > > 
> > > Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> > 
> > Should this patch go via the arm64 tree for -rc6, or should I just carry
> > it as part of the overall stackleak series?
> 
> I'll leave this up to Will (we take turns in managing the kernel
> releases) but it doesn't look urgent at all to me since it fixes a
> commit in 4.19.

Agreed, and nobody has actually experienced a problem with the current code
afaict, so I'd prefer to leave this with the rest of the series rather than
run the risk of a late regression.

Will

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack()
  2022-04-27 17:31 ` [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack() Mark Rutland
  2022-05-04 16:41   ` Catalin Marinas
@ 2022-05-08 17:24   ` Alexander Popov
  2022-05-10 11:36     ` Mark Rutland
  1 sibling, 1 reply; 47+ messages in thread
From: Alexander Popov @ 2022-05-08 17:24 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel
  Cc: akpm, catalin.marinas, keescook, linux-kernel, luto, will

Hi Mark!

On 27.04.2022 20:31, Mark Rutland wrote:
> Due to some historical confusion, arm64's current_top_of_stack() isn't
> what the stackleak code expects. This could in theory result in a number
> of problems, and practically results in an unnecessary performance hit.
> We can avoid this by aligning the arm64 implementation with the x86
> implementation.
> 
> The arm64 implementation of current_top_of_stack() was added
> specifically for stackleak in commit:
> 
>    0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> 
> This was intended to be equivalent to the x86 implementation, but the
> implementation, semantics, and performance characteristics differ
> wildly:
> 
> * On x86, current_top_of_stack() returns the top of the current task's
>    task stack, regardless of which stack is in active use.
> 
>    The implementation accesses a percpu variable which the x86 entry code
>    maintains, and returns the location immediately above the pt_regs on
>    the task stack (above which x86 has some padding).
> 
> * On arm64 current_top_of_stack() returns the top of the stack in active
>    use (i.e. the one which is currently being used).
> 
>    The implementation checks the SP against a number of
>    potentially-accessible stacks, and will BUG() if no stack is found.

As I could understand, for arm64, calling stackleak_erase() not from the thread 
stack would bring troubles because current_top_of_stack() would return an 
unexpected address from a foreign stack. Is this correct?

But this bug doesn't happen because arm64 always calls stackleak_erase() from 
the current thread stack. Right?

> The core stackleak_erase() code determines the upper bound of stack to
> erase with:
> 
> | if (on_thread_stack())
> |         boundary = current_stack_pointer;
> | else
> |         boundary = current_top_of_stack();
> 
> On arm64 stackleak_erase() is always called on a task stack, and
> on_thread_stack() should always be true. On x86, stackleak_erase() is
> mostly called on a trampoline stack, and is sometimes called on a task
> stack.
> 
> Currently, this results in a lot of unnecessary code being generated for
> arm64 for the impossible !on_thread_stack() case. Some of this is
> inlined, bloating stackleak_erase(), while portions of this are left
> out-of-line and permitted to be instrumented (which would be a
> functional problem if that code were reachable).

Sorry, I didn't understand this part about instrumentation. Could you elaborate 
please?

> As a first step towards improving this, this patch aligns arm64's
> implementation of current_top_of_stack() with x86's, always returning
> the top of the current task's stack. With GCC 11.1.0 this results in the
> bulk of the unnecessary code being removed, including all of the
> out-of-line instrumentable code.
> 
> While I don't believe there's a functional problem in practice I've
> marked this as a fix since the semantic was clearly wrong, the fix
> itself is simple, and other code might rely upon this in future.
> 
> Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Popov <alex.popov@linux.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>   arch/arm64/include/asm/processor.h | 10 ++++------
>   1 file changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 73e38d9a540ce..6b1a12c23fe77 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -381,12 +381,10 @@ long get_tagged_addr_ctrl(struct task_struct *task);
>    * of header definitions for the use of task_stack_page.
>    */
>   
> -#define current_top_of_stack()								\
> -({											\
> -	struct stack_info _info;							\
> -	BUG_ON(!on_accessible_stack(current, current_stack_pointer, 1, &_info));	\
> -	_info.high;									\
> -})
> +/*
> + * The top of the current task's task stack
> + */
> +#define current_top_of_stack()	((unsigned long)current->stack + THREAD_SIZE)
>   #define on_thread_stack()	(on_task_stack(current, current_stack_pointer, 1, NULL))
>   
>   #endif /* __ASSEMBLY__ */


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 02/13] stackleak: move skip_erasing() check earlier
  2022-04-27 17:31 ` [PATCH v2 02/13] stackleak: move skip_erasing() check earlier Mark Rutland
@ 2022-05-08 17:44   ` Alexander Popov
  2022-05-10 11:40     ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Alexander Popov @ 2022-05-08 17:44 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel
  Cc: akpm, catalin.marinas, keescook, linux-kernel, luto, will

On 27.04.2022 20:31, Mark Rutland wrote:
> In stackleak_erase() we check skip_erasing() after accessing some fields
> from current. As generating the address of current uses asm which
> hazards with the static branch asm, this work is always performed, even
> when the static branch is patched to jump to the return a the end of the
> function.

Nice find!

> This patch avoids this redundant work by moving the skip_erasing() check
> earlier.
> 
> To avoid complicating initialization within stackleak_erase(), the body
> of the function is split out into a __stackleak_erase() helper, with the
> check left in a wrapper function. The __stackleak_erase() helper is
> marked __always_inline to ensure that this is inlined into
> stackleak_erase() and not instrumented.
> 
> Before this patch, on x86-64 w/ GCC 11.1.0 the start of the function is:
> 
> <stackleak_erase>:
>     65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
>     00 00
>     48 8b 48 20             mov    0x20(%rax),%rcx
>     48 8b 80 98 0a 00 00    mov    0xa98(%rax),%rax
>     66 90                   xchg   %ax,%ax  <------------ static branch
>     48 89 c2                mov    %rax,%rdx
>     48 29 ca                sub    %rcx,%rdx
>     48 81 fa ff 3f 00 00    cmp    $0x3fff,%rdx
> 
> After this patch, on x86-64 w/ GCC 11.1.0 the start of the function is:
> 
> <stackleak_erase>:
>     0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)  <--- static branch
>     65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
>     00 00
>     48 8b 48 20             mov    0x20(%rax),%rcx
>     48 8b 80 98 0a 00 00    mov    0xa98(%rax),%rax
>     48 89 c2                mov    %rax,%rdx
>     48 29 ca                sub    %rcx,%rdx
>     48 81 fa ff 3f 00 00    cmp    $0x3fff,%rdx
> 
> Before this patch, on arm64 w/ GCC 11.1.0 the start of the function is:
> 
> <stackleak_erase>:
>     d503245f        bti     c
>     d5384100        mrs     x0, sp_el0
>     f9401003        ldr     x3, [x0, #32]
>     f9451000        ldr     x0, [x0, #2592]
>     d503201f        nop  <------------------------------- static branch
>     d503233f        paciasp
>     cb030002        sub     x2, x0, x3
>     d287ffe1        mov     x1, #0x3fff
>     eb01005f        cmp     x2, x1
> 
> After this patch, on arm64 w/ GCC 11.1.0 the start of the function is:
> 
> <stackleak_erase>:
>     d503245f        bti     c
>     d503201f        nop  <------------------------------- static branch
>     d503233f        paciasp
>     d5384100        mrs     x0, sp_el0
>     f9401003        ldr     x3, [x0, #32]
>     d287ffe1        mov     x1, #0x3fff
>     f9451000        ldr     x0, [x0, #2592]
>     cb030002        sub     x2, x0, x3
>     eb01005f        cmp     x2, x1
> 
> While this may not be a huge win on its own, moving the static branch
> will permit further optimization of the body of the function in
> subsequent patches.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Popov <alex.popov@linux.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> ---
>   kernel/stackleak.c | 13 +++++++++----
>   1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> index ddb5a7f48d69e..753eab797a04d 100644
> --- a/kernel/stackleak.c
> +++ b/kernel/stackleak.c
> @@ -70,7 +70,7 @@ late_initcall(stackleak_sysctls_init);
>   #define skip_erasing()	false
>   #endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
>   
> -asmlinkage void noinstr stackleak_erase(void)
> +static __always_inline void __stackleak_erase(void)

Are you sure that __stackleak_erase() doesn't need asmlinkage and noinstr as well?

>   {
>   	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
>   	unsigned long kstack_ptr = current->lowest_stack;
> @@ -78,9 +78,6 @@ asmlinkage void noinstr stackleak_erase(void)
>   	unsigned int poison_count = 0;
>   	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>   
> -	if (skip_erasing())
> -		return;
> -
>   	/* Check that 'lowest_stack' value is sane */
>   	if (unlikely(kstack_ptr - boundary >= THREAD_SIZE))
>   		kstack_ptr = boundary;
> @@ -125,6 +122,14 @@ asmlinkage void noinstr stackleak_erase(void)
>   	current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64;
>   }
>   
> +asmlinkage void noinstr stackleak_erase(void)
> +{
> +	if (skip_erasing())
> +		return;
> +
> +	__stackleak_erase();
> +}
> +
>   void __used __no_caller_saved_registers noinstr stackleak_track_stack(void)
>   {
>   	unsigned long sp = current_stack_pointer;


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/13] stackleak: remove redundant check
  2022-04-27 17:31 ` [PATCH v2 03/13] stackleak: remove redundant check Mark Rutland
@ 2022-05-08 18:17   ` Alexander Popov
  2022-05-10 11:46     ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Alexander Popov @ 2022-05-08 18:17 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel
  Cc: akpm, catalin.marinas, keescook, linux-kernel, luto, will

On 27.04.2022 20:31, Mark Rutland wrote:
> In __stackleak_erase() we check that the `erase_low` value derived from
> `current->lowest_stack` is above the lowest legitimate stack pointer
> value, but this is already enforced by stackleak_track_stack() when
> recording the lowest stack value.
> 
> Remove the redundant check.
> 
> There should be no functional change as a result of this patch.

Mark, I can't agree here. I think this check is important.
The performance profit from dropping it is less than the confidence decrease :)

With this check, if the 'lowest_stack' value is corrupted, stackleak doesn't 
overwrite some wrong kernel memory, but simply clears the whole thread stack, 
which is safe behavior.

> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Popov <alex.popov@linux.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> ---
>   kernel/stackleak.c | 4 ----
>   1 file changed, 4 deletions(-)
> 
> diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> index 753eab797a04d..f7a0f8cf73c37 100644
> --- a/kernel/stackleak.c
> +++ b/kernel/stackleak.c
> @@ -78,10 +78,6 @@ static __always_inline void __stackleak_erase(void)
>   	unsigned int poison_count = 0;
>   	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>   
> -	/* Check that 'lowest_stack' value is sane */
> -	if (unlikely(kstack_ptr - boundary >= THREAD_SIZE))
> -		kstack_ptr = boundary;
> -
>   	/* Search for the poison value in the kernel stack */
>   	while (kstack_ptr > boundary && poison_count <= depth) {
>   		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/13] stackleak: clarify variable names
  2022-04-27 17:31 ` [PATCH v2 05/13] stackleak: clarify variable names Mark Rutland
@ 2022-05-08 20:49   ` Alexander Popov
  2022-05-10 13:01     ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Alexander Popov @ 2022-05-08 20:49 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel
  Cc: akpm, catalin.marinas, keescook, linux-kernel, luto, will

On 27.04.2022 20:31, Mark Rutland wrote:
> The logic within __stackleak_erase() can be a little hard to follow, as
> `boundary` switches from being the low bound to the high bound mid way
> through the function, and `kstack_ptr` is used to represent the start of
> the region to erase while `boundary` represents the end of the region to
> erase.
> 
> Make this a little clearer by consistently using clearer variable names.
> The `boundary` variable is removed, the bounds of the region to erase
> are described by `erase_low` and `erase_high`, and bounds of the task
> stack are described by `task_stack_low` and `task_stck_high`.

A typo here in `task_stck_high`.

> As the same time, remove the comment above the variables, since it is
> unclear whether it's intended as rationale, a complaint, or a TODO, and
> is more confusing than helpful.

Yes, this comment is a bit confusing :) I can elaborate.

In the original grsecurity patch, the stackleak erasing was written in asm.
When I adopted it and proposed for the upstream, Linus strongly opposed this.
So I developed stackleak erasing in C.

And I wrote this comment to remember that having 'kstack_ptr' and 'boundary' 
variables on the stack (which we are clearing) would not be good.

That was also the main reason why I reused the 'boundary' variable: I wanted the 
compiler to allocate it in the register and I avoided creating many local variables.

Mark, did your refactoring make the compiler allocate local variables on the 
stack instead of the registers?

> There should be no functional change as a result of this patch.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Popov <alex.popov@linux.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> ---
>   kernel/stackleak.c | 30 ++++++++++++++----------------
>   1 file changed, 14 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> index 24b7cf01b2972..d5f684dc0a2d9 100644
> --- a/kernel/stackleak.c
> +++ b/kernel/stackleak.c
> @@ -73,40 +73,38 @@ late_initcall(stackleak_sysctls_init);
>   static __always_inline void __stackleak_erase(void)
>   {
>   	const unsigned long task_stack_low = stackleak_task_low_bound(current);
> -
> -	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
> -	unsigned long kstack_ptr = current->lowest_stack;
> -	unsigned long boundary = task_stack_low;
> +	unsigned long erase_low = current->lowest_stack;
> +	unsigned long erase_high;
>   	unsigned int poison_count = 0;
>   	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>   
>   	/* Search for the poison value in the kernel stack */
> -	while (kstack_ptr > boundary && poison_count <= depth) {
> -		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
> +	while (erase_low > task_stack_low && poison_count <= depth) {
> +		if (*(unsigned long *)erase_low == STACKLEAK_POISON)
>   			poison_count++;
>   		else
>   			poison_count = 0;
>   
> -		kstack_ptr -= sizeof(unsigned long);
> +		erase_low -= sizeof(unsigned long);
>   	}
>   
>   #ifdef CONFIG_STACKLEAK_METRICS
> -	current->prev_lowest_stack = kstack_ptr;
> +	current->prev_lowest_stack = erase_low;
>   #endif
>   
>   	/*
> -	 * Now write the poison value to the kernel stack. Start from
> -	 * 'kstack_ptr' and move up till the new 'boundary'. We assume that
> -	 * the stack pointer doesn't change when we write poison.
> +	 * Now write the poison value to the kernel stack between 'erase_low'
> +	 * and 'erase_high'. We assume that the stack pointer doesn't change
> +	 * when we write poison.
>   	 */
>   	if (on_thread_stack())
> -		boundary = current_stack_pointer;
> +		erase_high = current_stack_pointer;
>   	else
> -		boundary = current_top_of_stack();
> +		erase_high = current_top_of_stack();
>   
> -	while (kstack_ptr < boundary) {
> -		*(unsigned long *)kstack_ptr = STACKLEAK_POISON;
> -		kstack_ptr += sizeof(unsigned long);
> +	while (erase_low < erase_high) {
> +		*(unsigned long *)erase_low = STACKLEAK_POISON;
> +		erase_low += sizeof(unsigned long);
>   	}
>   
>   	/* Reset the 'lowest_stack' value for the next syscall */


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 06/13] stackleak: rework stack high bound handling
  2022-04-27 17:31 ` [PATCH v2 06/13] stackleak: rework stack high bound handling Mark Rutland
@ 2022-05-08 21:27   ` Alexander Popov
  2022-05-10 11:22     ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Alexander Popov @ 2022-05-08 21:27 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel
  Cc: akpm, catalin.marinas, keescook, linux-kernel, luto, will

On 27.04.2022 20:31, Mark Rutland wrote:
> Prior to returning to userpace, we reset current->lowest_stack to a
> reasonable high bound. Currently we do this by subtracting the arbitrary
> value `THREAD_SIZE/64` from the top of the stack, for reasons lost to
> history.
> 
> Looking at configurations today:
> 
> * On i386 where THREAD_SIZE is 8K, the bound will be 128 bytes. The
>    pt_regs at the top of the stack is 68 bytes (with 0 to 16 bytes of
>    padding above), and so this covers an additional portion of 44 to 60
>    bytes.
> 
> * On x86_64 where THREAD_SIZE is at least 16K (up to 32K with KASAN) the
>    bound will be at least 256 bytes (up to 512 with KASAN). The pt_regs
>    at the top of the stack is 168 bytes, and so this cover an additional
>    88 bytes of stack (up to 344 with KASAN).
> 
> * On arm64 where THREAD_SIZE is at least 16K (up to 64K with 64K pages
>    and VMAP_STACK), the bound will be at least 256 bytes (up to 1024 with
>    KASAN). The pt_regs at the top of the stack is 336 bytes, so this can
>    fall within the pt_regs, or can cover an additional 688 bytes of
>    stack.
> 
> Clearly the `THREAD_SIZE/64` value doesn't make much sense -- in the
> worst case, this will cause more than 600 bytes of stack to be erased
> for every syscall, even if actual stack usage were substantially
> smaller.
> 
> This patches makes this slightly less nonsensical by consistently
> resetting current->lowest_stack to the base of the task pt_regs. For
> clarity and for consistency with the handling of the low bound, the
> generation of the high bound is split into a helper with commentary
> explaining why.
> 
> Since the pt_regs at the top of the stack will be clobbered upon the
> next exception entry, we don't need to poison these at exception exit.
> By using task_pt_regs() as the high stack boundary instead of
> current_top_of_stack() we avoid some redundant poisoning, and the
> compiler can share the address generation between the poisoning and
> restting of `current->lowest_stack`, making the generated code more
> optimal.
> 
> It's not clear to me whether the existing `THREAD_SIZE/64` offset was a
> dodgy heuristic to skip the pt_regs, or whether it was attempting to
> minimize the number of times stackleak_check_stack() would have to
> update `current->lowest_stack` when stack usage was shallow at the cost
> of unconditionally poisoning a small portion of the stack for every exit
> to userspace.

I inherited this 'THREAD_SIZE/64' logic is from the original grsecurity patch.
As I mentioned, originally this was written in asm.

For x86_64:
	mov	TASK_thread_sp0(%r11), %rdi
	sub	$256, %rdi
	mov	%rdi, TASK_lowest_stack(%r11)

For x86_32:
	mov TASK_thread_sp0(%ebp), %edi
	sub $128, %edi
	mov %edi, TASK_lowest_stack(%ebp)

256 bytes for x86_64 and 128 bytes for x86_32 are exactly THREAD_SIZE/64.

I think this value was chosen as optimal for minimizing poison scanning.
It's possible that stackleak_track_stack() is not called during the syscall 
because all the called functions have small stack frames.

> For now I've simply removed the offset, and if we need/want to minimize
> updates for shallow stack usage it should be easy to add a better
> heuristic atop, with appropriate commentary so we know what's going on.

I like your idea to erase the thread stack up to pt_regs if we call the 
stackleak erasing from the trampoline stack.

But here I don't understand where task_pt_regs() points to...

> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Popov <alex.popov@linux.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> ---
>   include/linux/stackleak.h | 14 ++++++++++++++
>   kernel/stackleak.c        | 19 ++++++++++++++-----
>   2 files changed, 28 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
> index 67430faa5c518..467661aeb4136 100644
> --- a/include/linux/stackleak.h
> +++ b/include/linux/stackleak.h
> @@ -28,6 +28,20 @@ stackleak_task_low_bound(const struct task_struct *tsk)
>   	return (unsigned long)end_of_stack(tsk) + sizeof(unsigned long);
>   }
>   
> +/*
> + * The address immediately after the highest address on tsk's stack which we
> + * can plausibly erase.
> + */
> +static __always_inline unsigned long
> +stackleak_task_high_bound(const struct task_struct *tsk)
> +{
> +	/*
> +	 * The task's pt_regs lives at the top of the task stack and will be
> +	 * overwritten by exception entry, so there's no need to erase them.
> +	 */
> +	return (unsigned long)task_pt_regs(tsk);
> +}
> +
>   static inline void stackleak_task_init(struct task_struct *t)
>   {
>   	t->lowest_stack = stackleak_task_low_bound(t);
> diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> index d5f684dc0a2d9..ba346d46218f5 100644
> --- a/kernel/stackleak.c
> +++ b/kernel/stackleak.c
> @@ -73,6 +73,7 @@ late_initcall(stackleak_sysctls_init);
>   static __always_inline void __stackleak_erase(void)
>   {
>   	const unsigned long task_stack_low = stackleak_task_low_bound(current);
> +	const unsigned long task_stack_high = stackleak_task_high_bound(current);
>   	unsigned long erase_low = current->lowest_stack;
>   	unsigned long erase_high;
>   	unsigned int poison_count = 0;
> @@ -93,14 +94,22 @@ static __always_inline void __stackleak_erase(void)
>   #endif
>   
>   	/*
> -	 * Now write the poison value to the kernel stack between 'erase_low'
> -	 * and 'erase_high'. We assume that the stack pointer doesn't change
> -	 * when we write poison.
> +	 * Write poison to the task's stack between 'erase_low' and
> +	 * 'erase_high'.
> +	 *
> +	 * If we're running on a different stack (e.g. an entry trampoline
> +	 * stack) we can erase everything below the pt_regs at the top of the
> +	 * task stack.
> +	 *
> +	 * If we're running on the task stack itself, we must not clobber any
> +	 * stack used by this function and its caller. We assume that this
> +	 * function has a fixed-size stack frame, and the current stack pointer
> +	 * doesn't change while we write poison.
>   	 */
>   	if (on_thread_stack())
>   		erase_high = current_stack_pointer;
>   	else
> -		erase_high = current_top_of_stack();
> +		erase_high = task_stack_high;
>   
>   	while (erase_low < erase_high) {
>   		*(unsigned long *)erase_low = STACKLEAK_POISON;
> @@ -108,7 +117,7 @@ static __always_inline void __stackleak_erase(void)
>   	}
>   
>   	/* Reset the 'lowest_stack' value for the next syscall */
> -	current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64;
> +	current->lowest_stack = task_stack_high;
>   }
>   
>   asmlinkage void noinstr stackleak_erase(void)


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/13] stackleak: rework poison scanning
  2022-04-27 17:31 ` [PATCH v2 07/13] stackleak: rework poison scanning Mark Rutland
@ 2022-05-09 13:51   ` Alexander Popov
  2022-05-10 13:13     ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Alexander Popov @ 2022-05-09 13:51 UTC (permalink / raw)
  To: Mark Rutland, linux-arm-kernel
  Cc: akpm, catalin.marinas, keescook, linux-kernel, luto, will

Hello Mark!

On 27.04.2022 20:31, Mark Rutland wrote:
> Currently we over-estimate the region of stack which must be erased.
> 
> To determine the region to be erased, we scan downards for a contiguous
> block of poison values (or the low bound of the stack). There are a few
> minor problems with this today:
> 
> * When we find a block of poison values, we include this block within
>    the region to erase.
> 
>    As this is included within the region to erase, this causes us to
>    redundantly overwrite 'STACKLEAK_SEARCH_DEPTH' (128) bytes with
>    poison.

Right, this can be improved.

> * As the loop condition checks 'poison_count <= depth', it will run an
>    additional iteration after finding the contiguous block of poison,
>    decrementing 'erase_low' once more than necessary.

Actually, I think the current code is correct.

I'm intentionally searching one poison value more than STACKLEAK_SEARCH_DEPTH to 
avoid the corner case. See the BUILD_BUG_ON assertion in stackleak_track_stack() 
that describes it:

/*
  * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
  * STACKLEAK_SEARCH_DEPTH makes the poison search in
  * stackleak_erase() unreliable. Let's prevent that.
  */
BUILD_BUG_ON(CONFIG_STACKLEAK_TRACK_MIN_SIZE > STACKLEAK_SEARCH_DEPTH);

>    As this is included within the region to erase, this causes us to
>    redundantly overwrite an additional unsigned long with poison.
> 
> * As we always decrement 'erase_low' after checking an element on the
>    stack, we always include the element below this within the region to
>    erase.
> 
>    As this is included within the region to erase, this causes us to
>    redundantly overwrite an additional unsigned long with poison.
> 
>    Note that this is not a functional problem. As the loop condition
>    checks 'erase_low > task_stack_low', we'll never clobber the
>    STACK_END_MAGIC. As we always decrement 'erase_low' after this, we'll
>    never fail to erase the element immediately above the STACK_END_MAGIC.

Right, I don't see any bug in the current erasing code.

When I wrote the current code, I carefully checked all the corner cases. For 
example, on the first stack erasing, the STACK_END_MAGIC was not overwritten, 
but the memory next to it was erased. Same for the beginning of the stack: I 
carefully checked that no unpoisoned bytes were left on the thread stack.

> In total, this can cause us to erase `128 + 2 * sizeof(unsigned long)`
> bytes more than necessary, which is unfortunate.
> 
> This patch reworks the logic to find the address immediately above the
> poisoned region, by finding the lowest non-poisoned address. This is
> factored into a stackleak_find_top_of_poison() helper both for clarity
> and so that this can be shared with the LKDTM test in subsequent
> patches.

You know, I wrote stackleak_erase() in very plain C. I wanted a compiler to 
generate assembly that is very close to the original asm version. I worried that 
compilers might do weird stuff with the local variables and the stack pointer.

So I checked stackleak for gcc versions 4.8, 5, 6, 7, 8, 9, and 10 on x86_64, 
i386 and arm64. This is my project that helped with this work:
https://github.com/a13xp0p0v/kernel-build-containers

Mark, in this patch series you use many local variables and helper functions.
Honestly, this worries me. For example, compilers can (and usually do) ignore 
the presence of the 'inline' specifier for the purpose of optimization.

Thanks!

> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Popov <alex.popov@linux.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> ---
>   include/linux/stackleak.h | 26 ++++++++++++++++++++++++++
>   kernel/stackleak.c        | 18 ++++--------------
>   2 files changed, 30 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
> index 467661aeb4136..c36e7a3b45e7e 100644
> --- a/include/linux/stackleak.h
> +++ b/include/linux/stackleak.h
> @@ -42,6 +42,32 @@ stackleak_task_high_bound(const struct task_struct *tsk)
>   	return (unsigned long)task_pt_regs(tsk);
>   }
>   
> +/*
> + * Find the address immediately above the poisoned region of the stack, where
> + * that region falls between 'low' (inclusive) and 'high' (exclusive).
> + */
> +static __always_inline unsigned long
> +stackleak_find_top_of_poison(const unsigned long low, const unsigned long high)
> +{
> +	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
> +	unsigned int poison_count = 0;
> +	unsigned long poison_high = high;
> +	unsigned long sp = high;
> +
> +	while (sp > low && poison_count < depth) {
> +		sp -= sizeof(unsigned long);
> +
> +		if (*(unsigned long *)sp == STACKLEAK_POISON) {
> +			poison_count++;
> +		} else {
> +			poison_count = 0;
> +			poison_high = sp;
> +		}
> +	}
> +
> +	return poison_high;
> +}
> +
>   static inline void stackleak_task_init(struct task_struct *t)
>   {
>   	t->lowest_stack = stackleak_task_low_bound(t);
> diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> index ba346d46218f5..afd54b8e10b83 100644
> --- a/kernel/stackleak.c
> +++ b/kernel/stackleak.c
> @@ -74,20 +74,10 @@ static __always_inline void __stackleak_erase(void)
>   {
>   	const unsigned long task_stack_low = stackleak_task_low_bound(current);
>   	const unsigned long task_stack_high = stackleak_task_high_bound(current);
> -	unsigned long erase_low = current->lowest_stack;
> -	unsigned long erase_high;
> -	unsigned int poison_count = 0;
> -	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
> -
> -	/* Search for the poison value in the kernel stack */
> -	while (erase_low > task_stack_low && poison_count <= depth) {
> -		if (*(unsigned long *)erase_low == STACKLEAK_POISON)
> -			poison_count++;
> -		else
> -			poison_count = 0;
> -
> -		erase_low -= sizeof(unsigned long);
> -	}
> +	unsigned long erase_low, erase_high;
> +
> +	erase_low = stackleak_find_top_of_poison(task_stack_low,
> +						 current->lowest_stack);
>   
>   #ifdef CONFIG_STACKLEAK_METRICS
>   	current->prev_lowest_stack = erase_low;


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 06/13] stackleak: rework stack high bound handling
  2022-05-08 21:27   ` Alexander Popov
@ 2022-05-10 11:22     ` Mark Rutland
  2022-05-15 16:32       ` Alexander Popov
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-05-10 11:22 UTC (permalink / raw)
  To: Alexander Popov
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On Mon, May 09, 2022 at 12:27:22AM +0300, Alexander Popov wrote:
> On 27.04.2022 20:31, Mark Rutland wrote:
> > Prior to returning to userpace, we reset current->lowest_stack to a
> > reasonable high bound. Currently we do this by subtracting the arbitrary
> > value `THREAD_SIZE/64` from the top of the stack, for reasons lost to
> > history.
> > 
> > Looking at configurations today:
> > 
> > * On i386 where THREAD_SIZE is 8K, the bound will be 128 bytes. The
> >    pt_regs at the top of the stack is 68 bytes (with 0 to 16 bytes of
> >    padding above), and so this covers an additional portion of 44 to 60
> >    bytes.
> > 
> > * On x86_64 where THREAD_SIZE is at least 16K (up to 32K with KASAN) the
> >    bound will be at least 256 bytes (up to 512 with KASAN). The pt_regs
> >    at the top of the stack is 168 bytes, and so this cover an additional
> >    88 bytes of stack (up to 344 with KASAN).
> > 
> > * On arm64 where THREAD_SIZE is at least 16K (up to 64K with 64K pages
> >    and VMAP_STACK), the bound will be at least 256 bytes (up to 1024 with
> >    KASAN). The pt_regs at the top of the stack is 336 bytes, so this can
> >    fall within the pt_regs, or can cover an additional 688 bytes of
> >    stack.
> > 
> > Clearly the `THREAD_SIZE/64` value doesn't make much sense -- in the
> > worst case, this will cause more than 600 bytes of stack to be erased
> > for every syscall, even if actual stack usage were substantially
> > smaller.
> > 
> > This patches makes this slightly less nonsensical by consistently
> > resetting current->lowest_stack to the base of the task pt_regs. For
> > clarity and for consistency with the handling of the low bound, the
> > generation of the high bound is split into a helper with commentary
> > explaining why.
> > 
> > Since the pt_regs at the top of the stack will be clobbered upon the
> > next exception entry, we don't need to poison these at exception exit.
> > By using task_pt_regs() as the high stack boundary instead of
> > current_top_of_stack() we avoid some redundant poisoning, and the
> > compiler can share the address generation between the poisoning and
> > restting of `current->lowest_stack`, making the generated code more
> > optimal.
> > 
> > It's not clear to me whether the existing `THREAD_SIZE/64` offset was a
> > dodgy heuristic to skip the pt_regs, or whether it was attempting to
> > minimize the number of times stackleak_check_stack() would have to
> > update `current->lowest_stack` when stack usage was shallow at the cost
> > of unconditionally poisoning a small portion of the stack for every exit
> > to userspace.
> 
> I inherited this 'THREAD_SIZE/64' logic is from the original grsecurity patch.
> As I mentioned, originally this was written in asm.
> 
> For x86_64:
> 	mov	TASK_thread_sp0(%r11), %rdi
> 	sub	$256, %rdi
> 	mov	%rdi, TASK_lowest_stack(%r11)
> 
> For x86_32:
> 	mov TASK_thread_sp0(%ebp), %edi
> 	sub $128, %edi
> 	mov %edi, TASK_lowest_stack(%ebp)
> 
> 256 bytes for x86_64 and 128 bytes for x86_32 are exactly THREAD_SIZE/64.

To check my understanding, did you come up with the `THREAD_SIZE/64`
calculation from reverse-engineering the assembly?

I strongly suspect that has nothing to do with THEAD_SIZE, and was trying to
fall just below the task's pt_regs (and maybe some unconditional stack usage
just below that).

> I think this value was chosen as optimal for minimizing poison scanning.
> It's possible that stackleak_track_stack() is not called during the syscall
> because all the called functions have small stack frames.

I can believe that, but given the clearing cost appears to dominate the
scanning cost, I suspect we can live with making this precisely the bottom of
the pt_regs.

> > For now I've simply removed the offset, and if we need/want to minimize
> > updates for shallow stack usage it should be easy to add a better
> > heuristic atop, with appropriate commentary so we know what's going on.
> 
> I like your idea to erase the thread stack up to pt_regs if we call the
> stackleak erasing from the trampoline stack.
> 
> But here I don't understand where task_pt_regs() points to...

As mentioned in the commit message, there's a struct pt_regs at the top of each
task stack (created at exception entry, and consumed by exception return),
which contains the user register state. On arm64 and x86_64, that's *right* at
the top of the stack, but on i386 there can be some padding above that.
task_pt_regs(tsk) points at the base of that pt_regs.

That looks something like the following (with increasing addresses going
upwards):

  ----------------   <--- task_stack_page(tsk) + THREAD_SIZE
  (optional padding)
  ----------------   <--- task_top_of_stack(tsk) // x86 only
      /\
      ||
      || pt_regs
      ||
      \/
  ----------------   <--- task_pt_regs(tsk)
      /\
      ||
      ||
      ||
      || Usable task stack
      ||
      ||
      ||
      \/ 
  ----------------
  STACK_END_MAGIC
  ----------------   <--- task_stack_page(tsk)

Thanks,
Mark.

> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Alexander Popov <alex.popov@linux.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Kees Cook <keescook@chromium.org>
> > ---
> >   include/linux/stackleak.h | 14 ++++++++++++++
> >   kernel/stackleak.c        | 19 ++++++++++++++-----
> >   2 files changed, 28 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
> > index 67430faa5c518..467661aeb4136 100644
> > --- a/include/linux/stackleak.h
> > +++ b/include/linux/stackleak.h
> > @@ -28,6 +28,20 @@ stackleak_task_low_bound(const struct task_struct *tsk)
> >   	return (unsigned long)end_of_stack(tsk) + sizeof(unsigned long);
> >   }
> > +/*
> > + * The address immediately after the highest address on tsk's stack which we
> > + * can plausibly erase.
> > + */
> > +static __always_inline unsigned long
> > +stackleak_task_high_bound(const struct task_struct *tsk)
> > +{
> > +	/*
> > +	 * The task's pt_regs lives at the top of the task stack and will be
> > +	 * overwritten by exception entry, so there's no need to erase them.
> > +	 */
> > +	return (unsigned long)task_pt_regs(tsk);
> > +}
> > +
> >   static inline void stackleak_task_init(struct task_struct *t)
> >   {
> >   	t->lowest_stack = stackleak_task_low_bound(t);
> > diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> > index d5f684dc0a2d9..ba346d46218f5 100644
> > --- a/kernel/stackleak.c
> > +++ b/kernel/stackleak.c
> > @@ -73,6 +73,7 @@ late_initcall(stackleak_sysctls_init);
> >   static __always_inline void __stackleak_erase(void)
> >   {
> >   	const unsigned long task_stack_low = stackleak_task_low_bound(current);
> > +	const unsigned long task_stack_high = stackleak_task_high_bound(current);
> >   	unsigned long erase_low = current->lowest_stack;
> >   	unsigned long erase_high;
> >   	unsigned int poison_count = 0;
> > @@ -93,14 +94,22 @@ static __always_inline void __stackleak_erase(void)
> >   #endif
> >   	/*
> > -	 * Now write the poison value to the kernel stack between 'erase_low'
> > -	 * and 'erase_high'. We assume that the stack pointer doesn't change
> > -	 * when we write poison.
> > +	 * Write poison to the task's stack between 'erase_low' and
> > +	 * 'erase_high'.
> > +	 *
> > +	 * If we're running on a different stack (e.g. an entry trampoline
> > +	 * stack) we can erase everything below the pt_regs at the top of the
> > +	 * task stack.
> > +	 *
> > +	 * If we're running on the task stack itself, we must not clobber any
> > +	 * stack used by this function and its caller. We assume that this
> > +	 * function has a fixed-size stack frame, and the current stack pointer
> > +	 * doesn't change while we write poison.
> >   	 */
> >   	if (on_thread_stack())
> >   		erase_high = current_stack_pointer;
> >   	else
> > -		erase_high = current_top_of_stack();
> > +		erase_high = task_stack_high;
> >   	while (erase_low < erase_high) {
> >   		*(unsigned long *)erase_low = STACKLEAK_POISON;
> > @@ -108,7 +117,7 @@ static __always_inline void __stackleak_erase(void)
> >   	}
> >   	/* Reset the 'lowest_stack' value for the next syscall */
> > -	current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64;
> > +	current->lowest_stack = task_stack_high;
> >   }
> >   asmlinkage void noinstr stackleak_erase(void)
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack()
  2022-05-08 17:24   ` Alexander Popov
@ 2022-05-10 11:36     ` Mark Rutland
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Rutland @ 2022-05-10 11:36 UTC (permalink / raw)
  To: Alexander Popov
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On Sun, May 08, 2022 at 08:24:38PM +0300, Alexander Popov wrote:
> Hi Mark!
> 
> On 27.04.2022 20:31, Mark Rutland wrote:
> > Due to some historical confusion, arm64's current_top_of_stack() isn't
> > what the stackleak code expects. This could in theory result in a number
> > of problems, and practically results in an unnecessary performance hit.
> > We can avoid this by aligning the arm64 implementation with the x86
> > implementation.
> > 
> > The arm64 implementation of current_top_of_stack() was added
> > specifically for stackleak in commit:
> > 
> >    0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> > 
> > This was intended to be equivalent to the x86 implementation, but the
> > implementation, semantics, and performance characteristics differ
> > wildly:
> > 
> > * On x86, current_top_of_stack() returns the top of the current task's
> >    task stack, regardless of which stack is in active use.
> > 
> >    The implementation accesses a percpu variable which the x86 entry code
> >    maintains, and returns the location immediately above the pt_regs on
> >    the task stack (above which x86 has some padding).
> > 
> > * On arm64 current_top_of_stack() returns the top of the stack in active
> >    use (i.e. the one which is currently being used).
> > 
> >    The implementation checks the SP against a number of
> >    potentially-accessible stacks, and will BUG() if no stack is found.
> 
> As I could understand, for arm64, calling stackleak_erase() not from the
> thread stack would bring troubles because current_top_of_stack() would
> return an unexpected address from a foreign stack. Is this correct?

Yes.

> But this bug doesn't happen because arm64 always calls stackleak_erase()
> from the current thread stack. Right?

Yes.

> > The core stackleak_erase() code determines the upper bound of stack to
> > erase with:
> > 
> > | if (on_thread_stack())
> > |         boundary = current_stack_pointer;
> > | else
> > |         boundary = current_top_of_stack();
> > 
> > On arm64 stackleak_erase() is always called on a task stack, and
> > on_thread_stack() should always be true. On x86, stackleak_erase() is
> > mostly called on a trampoline stack, and is sometimes called on a task
> > stack.
> > 
> > Currently, this results in a lot of unnecessary code being generated for
> > arm64 for the impossible !on_thread_stack() case. Some of this is
> > inlined, bloating stackleak_erase(), while portions of this are left
> > out-of-line and permitted to be instrumented (which would be a
> > functional problem if that code were reachable).
> 
> Sorry, I didn't understand this part about instrumentation. Could you
> elaborate please?

Portions of the code are regular .text, and are subject to instrumentation by
KASAN/UBSAN/KCOV, ftrace, etc, where the compiler will (implicitly) insert
calls to out-of-line instrumentation callbacks. Some (but not all) of those are
disabled in the Makefile. For example, ftrace instrumentation is possible.

Generally, the instrumentation callbacks expect to run with a full kernel
environment (e.g. with RCU watching, IRQ tracing state correct), but at the
time stackleak_erase() is called, this is not the case. so those could go wrong
and corrupt state.

Additionally, since those calls are added implicitly by the compiler, they can
manipulate state at arbitrary points throughout the function where we might not
expect it (e.g. changing current->lowest_stack).

The general stance is that we should use noinstr to disable instrumentation,
and anything that needs to be inlined into noinstr needs to be marked with
__always_inline (which is guaranteed to either inline or cause a compile-time
error if it is not possible to inline).

This patch reworks things to avoid the potential problems; as per the commit
message I don't beleive anything goes wrong in practice today.

Thanks,
Mark.

> > As a first step towards improving this, this patch aligns arm64's
> > implementation of current_top_of_stack() with x86's, always returning
> > the top of the current task's stack. With GCC 11.1.0 this results in the
> > bulk of the unnecessary code being removed, including all of the
> > out-of-line instrumentable code.
> > 
> > While I don't believe there's a functional problem in practice I've
> > marked this as a fix since the semantic was clearly wrong, the fix
> > itself is simple, and other code might rely upon this in future.
> > 
> > Fixes: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin")
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Alexander Popov <alex.popov@linux.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >   arch/arm64/include/asm/processor.h | 10 ++++------
> >   1 file changed, 4 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> > index 73e38d9a540ce..6b1a12c23fe77 100644
> > --- a/arch/arm64/include/asm/processor.h
> > +++ b/arch/arm64/include/asm/processor.h
> > @@ -381,12 +381,10 @@ long get_tagged_addr_ctrl(struct task_struct *task);
> >    * of header definitions for the use of task_stack_page.
> >    */
> > -#define current_top_of_stack()								\
> > -({											\
> > -	struct stack_info _info;							\
> > -	BUG_ON(!on_accessible_stack(current, current_stack_pointer, 1, &_info));	\
> > -	_info.high;									\
> > -})
> > +/*
> > + * The top of the current task's task stack
> > + */
> > +#define current_top_of_stack()	((unsigned long)current->stack + THREAD_SIZE)
> >   #define on_thread_stack()	(on_task_stack(current, current_stack_pointer, 1, NULL))
> >   #endif /* __ASSEMBLY__ */
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 02/13] stackleak: move skip_erasing() check earlier
  2022-05-08 17:44   ` Alexander Popov
@ 2022-05-10 11:40     ` Mark Rutland
  0 siblings, 0 replies; 47+ messages in thread
From: Mark Rutland @ 2022-05-10 11:40 UTC (permalink / raw)
  To: Alexander Popov
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On Sun, May 08, 2022 at 08:44:56PM +0300, Alexander Popov wrote:
> On 27.04.2022 20:31, Mark Rutland wrote:
> > In stackleak_erase() we check skip_erasing() after accessing some fields
> > from current. As generating the address of current uses asm which
> > hazards with the static branch asm, this work is always performed, even
> > when the static branch is patched to jump to the return a the end of the
> > function.
> 
> Nice find!
> 
> > This patch avoids this redundant work by moving the skip_erasing() check
> > earlier.
> > 
> > To avoid complicating initialization within stackleak_erase(), the body
> > of the function is split out into a __stackleak_erase() helper, with the
> > check left in a wrapper function. The __stackleak_erase() helper is
> > marked __always_inline to ensure that this is inlined into
> > stackleak_erase() and not instrumented.

[...]

> > diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> > index ddb5a7f48d69e..753eab797a04d 100644
> > --- a/kernel/stackleak.c
> > +++ b/kernel/stackleak.c
> > @@ -70,7 +70,7 @@ late_initcall(stackleak_sysctls_init);
> >   #define skip_erasing()	false
> >   #endif /* CONFIG_STACKLEAK_RUNTIME_DISABLE */
> > -asmlinkage void noinstr stackleak_erase(void)
> > +static __always_inline void __stackleak_erase(void)
> 
> Are you sure that __stackleak_erase() doesn't need asmlinkage and noinstr as well?

I am certain it needs neither.

It's static and never called from asm, so it doesn't need `asmlinkage`.

It's marked `__always_inline`, so it will always be inlined into its caller (or
if the compiler cannot inline it, will result in a compiler error).

That's important to get good codegen (especially with the on/off stack variants
later in the series), and when inlined into its caller the compiler will treat
it as part of its caller for code generation, so the caller's `noinstr` takes
effect.

Thanks,
Mark.

> 
> >   {
> >   	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
> >   	unsigned long kstack_ptr = current->lowest_stack;
> > @@ -78,9 +78,6 @@ asmlinkage void noinstr stackleak_erase(void)
> >   	unsigned int poison_count = 0;
> >   	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
> > -	if (skip_erasing())
> > -		return;
> > -
> >   	/* Check that 'lowest_stack' value is sane */
> >   	if (unlikely(kstack_ptr - boundary >= THREAD_SIZE))
> >   		kstack_ptr = boundary;
> > @@ -125,6 +122,14 @@ asmlinkage void noinstr stackleak_erase(void)
> >   	current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64;
> >   }
> > +asmlinkage void noinstr stackleak_erase(void)
> > +{
> > +	if (skip_erasing())
> > +		return;
> > +
> > +	__stackleak_erase();
> > +}
> > +
> >   void __used __no_caller_saved_registers noinstr stackleak_track_stack(void)
> >   {
> >   	unsigned long sp = current_stack_pointer;
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/13] stackleak: remove redundant check
  2022-05-08 18:17   ` Alexander Popov
@ 2022-05-10 11:46     ` Mark Rutland
  2022-05-11  3:00       ` Kees Cook
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-05-10 11:46 UTC (permalink / raw)
  To: Alexander Popov
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On Sun, May 08, 2022 at 09:17:01PM +0300, Alexander Popov wrote:
> On 27.04.2022 20:31, Mark Rutland wrote:
> > In __stackleak_erase() we check that the `erase_low` value derived from
> > `current->lowest_stack` is above the lowest legitimate stack pointer
> > value, but this is already enforced by stackleak_track_stack() when
> > recording the lowest stack value.
> > 
> > Remove the redundant check.
> > 
> > There should be no functional change as a result of this patch.
> 
> Mark, I can't agree here. I think this check is important.
> The performance profit from dropping it is less than the confidence decrease :)
> 
> With this check, if the 'lowest_stack' value is corrupted, stackleak doesn't
> overwrite some wrong kernel memory, but simply clears the whole thread
> stack, which is safe behavior.

If you feel strongly about it, I can restore the check, but I struggle to
believe that it's worthwhile. The `lowest_stack` value lives in the
task_struct, and if you have the power to corrupt that you have the power to do
much more interesting things.

If we do restore it, I'd like to add a big fat comment explaining the
rationale (i.e. that it only matter if someone could corrupt
`current->lowest_stack`, as otherwise that's guarnateed to be within bounds).

Thanks,
Mark.

> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Alexander Popov <alex.popov@linux.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Kees Cook <keescook@chromium.org>
> > ---
> >   kernel/stackleak.c | 4 ----
> >   1 file changed, 4 deletions(-)
> > 
> > diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> > index 753eab797a04d..f7a0f8cf73c37 100644
> > --- a/kernel/stackleak.c
> > +++ b/kernel/stackleak.c
> > @@ -78,10 +78,6 @@ static __always_inline void __stackleak_erase(void)
> >   	unsigned int poison_count = 0;
> >   	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
> > -	/* Check that 'lowest_stack' value is sane */
> > -	if (unlikely(kstack_ptr - boundary >= THREAD_SIZE))
> > -		kstack_ptr = boundary;
> > -
> >   	/* Search for the poison value in the kernel stack */
> >   	while (kstack_ptr > boundary && poison_count <= depth) {
> >   		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/13] stackleak: clarify variable names
  2022-05-08 20:49   ` Alexander Popov
@ 2022-05-10 13:01     ` Mark Rutland
  2022-05-11  3:05       ` Kees Cook
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-05-10 13:01 UTC (permalink / raw)
  To: Alexander Popov
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On Sun, May 08, 2022 at 11:49:46PM +0300, Alexander Popov wrote:
> On 27.04.2022 20:31, Mark Rutland wrote:
> > The logic within __stackleak_erase() can be a little hard to follow, as
> > `boundary` switches from being the low bound to the high bound mid way
> > through the function, and `kstack_ptr` is used to represent the start of
> > the region to erase while `boundary` represents the end of the region to
> > erase.
> > 
> > Make this a little clearer by consistently using clearer variable names.
> > The `boundary` variable is removed, the bounds of the region to erase
> > are described by `erase_low` and `erase_high`, and bounds of the task
> > stack are described by `task_stack_low` and `task_stck_high`.
> 
> A typo here in `task_stck_high`.

Ah; whoops.

> > As the same time, remove the comment above the variables, since it is
> > unclear whether it's intended as rationale, a complaint, or a TODO, and
> > is more confusing than helpful.
> 
> Yes, this comment is a bit confusing :) I can elaborate.
> 
> In the original grsecurity patch, the stackleak erasing was written in asm.
> When I adopted it and proposed for the upstream, Linus strongly opposed this.
> So I developed stackleak erasing in C.
> 
> And I wrote this comment to remember that having 'kstack_ptr' and 'boundary'
> variables on the stack (which we are clearing) would not be good.

Ok, so I think that falls into the "complaint" bucket I mentioned. I understand
that we don't have any guarantee from the compiler as to how it will use the
stack, and that's obviously a potential problem.

> That was also the main reason why I reused the 'boundary' variable: I wanted
> the compiler to allocate it in the register and I avoided creating many
> local variables.
>
> Mark, did your refactoring make the compiler allocate local variables on the
> stack instead of the registers?

Considering the whole series, testing with GCC 11.1.0:

* On arm64:
     before: stackleak_erase() uses 48 bytes of stack
     after: stackleak_erase() uses 0 bytes of stack

     Note: this is entirely due to patch 1; arm64 has enough GPRs that it
     doesn't need to use the stack.

* On x86_64:
     before: stackleak_erase() uses 0 bytes of stack
     after:  stackleak_erase() uses 0 bytes of stack

* On i386
     before: stackleak_erase() uses 8 bytes of stach
     after:  stackleak_erase() uses 16 bytes of stack

The i386 case isn't ideal, but given that those bytes will easily be used by
the entry triage code before getting to any syscall handling, I don't believe
that's an issue in practice.

Thanks,
Mark.

> > There should be no functional change as a result of this patch.
> > 
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Alexander Popov <alex.popov@linux.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Kees Cook <keescook@chromium.org>
> > ---
> >   kernel/stackleak.c | 30 ++++++++++++++----------------
> >   1 file changed, 14 insertions(+), 16 deletions(-)
> > 
> > diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> > index 24b7cf01b2972..d5f684dc0a2d9 100644
> > --- a/kernel/stackleak.c
> > +++ b/kernel/stackleak.c
> > @@ -73,40 +73,38 @@ late_initcall(stackleak_sysctls_init);
> >   static __always_inline void __stackleak_erase(void)
> >   {
> >   	const unsigned long task_stack_low = stackleak_task_low_bound(current);
> > -
> > -	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
> > -	unsigned long kstack_ptr = current->lowest_stack;
> > -	unsigned long boundary = task_stack_low;
> > +	unsigned long erase_low = current->lowest_stack;
> > +	unsigned long erase_high;
> >   	unsigned int poison_count = 0;
> >   	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
> >   	/* Search for the poison value in the kernel stack */
> > -	while (kstack_ptr > boundary && poison_count <= depth) {
> > -		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
> > +	while (erase_low > task_stack_low && poison_count <= depth) {
> > +		if (*(unsigned long *)erase_low == STACKLEAK_POISON)
> >   			poison_count++;
> >   		else
> >   			poison_count = 0;
> > -		kstack_ptr -= sizeof(unsigned long);
> > +		erase_low -= sizeof(unsigned long);
> >   	}
> >   #ifdef CONFIG_STACKLEAK_METRICS
> > -	current->prev_lowest_stack = kstack_ptr;
> > +	current->prev_lowest_stack = erase_low;
> >   #endif
> >   	/*
> > -	 * Now write the poison value to the kernel stack. Start from
> > -	 * 'kstack_ptr' and move up till the new 'boundary'. We assume that
> > -	 * the stack pointer doesn't change when we write poison.
> > +	 * Now write the poison value to the kernel stack between 'erase_low'
> > +	 * and 'erase_high'. We assume that the stack pointer doesn't change
> > +	 * when we write poison.
> >   	 */
> >   	if (on_thread_stack())
> > -		boundary = current_stack_pointer;
> > +		erase_high = current_stack_pointer;
> >   	else
> > -		boundary = current_top_of_stack();
> > +		erase_high = current_top_of_stack();
> > -	while (kstack_ptr < boundary) {
> > -		*(unsigned long *)kstack_ptr = STACKLEAK_POISON;
> > -		kstack_ptr += sizeof(unsigned long);
> > +	while (erase_low < erase_high) {
> > +		*(unsigned long *)erase_low = STACKLEAK_POISON;
> > +		erase_low += sizeof(unsigned long);
> >   	}
> >   	/* Reset the 'lowest_stack' value for the next syscall */
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/13] stackleak: rework poison scanning
  2022-05-09 13:51   ` Alexander Popov
@ 2022-05-10 13:13     ` Mark Rutland
  2022-05-15 17:33       ` Alexander Popov
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-05-10 13:13 UTC (permalink / raw)
  To: Alexander Popov
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On Mon, May 09, 2022 at 04:51:35PM +0300, Alexander Popov wrote:
> Hello Mark!
> 
> On 27.04.2022 20:31, Mark Rutland wrote:
> > Currently we over-estimate the region of stack which must be erased.
> > 
> > To determine the region to be erased, we scan downards for a contiguous
> > block of poison values (or the low bound of the stack). There are a few
> > minor problems with this today:
> > 
> > * When we find a block of poison values, we include this block within
> >    the region to erase.
> > 
> >    As this is included within the region to erase, this causes us to
> >    redundantly overwrite 'STACKLEAK_SEARCH_DEPTH' (128) bytes with
> >    poison.
> 
> Right, this can be improved.
> 
> > * As the loop condition checks 'poison_count <= depth', it will run an
> >    additional iteration after finding the contiguous block of poison,
> >    decrementing 'erase_low' once more than necessary.
> 
> Actually, I think the current code is correct.
> 
> I'm intentionally searching one poison value more than
> STACKLEAK_SEARCH_DEPTH to avoid the corner case. See the BUILD_BUG_ON
> assertion in stackleak_track_stack() that describes it:
> 
> /*
>  * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
>  * STACKLEAK_SEARCH_DEPTH makes the poison search in
>  * stackleak_erase() unreliable. Let's prevent that.
>  */
> BUILD_BUG_ON(CONFIG_STACKLEAK_TRACK_MIN_SIZE > STACKLEAK_SEARCH_DEPTH);

I had read that, but as written that doesn't imply that it's necessary to scan
one more element than STACKLEAK_SEARCH_DEPTH, nor why. I'm more than happy to
change the logic, but I think we need a very clear explanation as to why we
need to scan the specific number of bytes we scan, and we should account for
that *within* STACKLEAK_SEARCH_DEPTH for clarity.

> >    As this is included within the region to erase, this causes us to
> >    redundantly overwrite an additional unsigned long with poison.
> > 
> > * As we always decrement 'erase_low' after checking an element on the
> >    stack, we always include the element below this within the region to
> >    erase.
> > 
> >    As this is included within the region to erase, this causes us to
> >    redundantly overwrite an additional unsigned long with poison.
> > 
> >    Note that this is not a functional problem. As the loop condition
> >    checks 'erase_low > task_stack_low', we'll never clobber the
> >    STACK_END_MAGIC. As we always decrement 'erase_low' after this, we'll
> >    never fail to erase the element immediately above the STACK_END_MAGIC.
> 
> Right, I don't see any bug in the current erasing code.
> 
> When I wrote the current code, I carefully checked all the corner cases. For
> example, on the first stack erasing, the STACK_END_MAGIC was not
> overwritten, but the memory next to it was erased. Same for the beginning of
> the stack: I carefully checked that no unpoisoned bytes were left on the
> thread stack.
> 
> > In total, this can cause us to erase `128 + 2 * sizeof(unsigned long)`
> > bytes more than necessary, which is unfortunate.
> > 
> > This patch reworks the logic to find the address immediately above the
> > poisoned region, by finding the lowest non-poisoned address. This is
> > factored into a stackleak_find_top_of_poison() helper both for clarity
> > and so that this can be shared with the LKDTM test in subsequent
> > patches.
> 
> You know, I wrote stackleak_erase() in very plain C. I wanted a compiler to
> generate assembly that is very close to the original asm version. I worried
> that compilers might do weird stuff with the local variables and the stack
> pointer.
> 
> So I checked stackleak for gcc versions 4.8, 5, 6, 7, 8, 9, and 10 on
> x86_64, i386 and arm64. This is my project that helped with this work:
> https://github.com/a13xp0p0v/kernel-build-containers

I've used the kernel.org cross toolchains, as published at:

  https://mirrors.edge.kernel.org/pub/tools/crosstool/

> Mark, in this patch series you use many local variables and helper functions.
> Honestly, this worries me. For example, compilers can (and usually do)
> ignore the presence of the 'inline' specifier for the purpose of
> optimization.

I've deliberately used `__always_inline` rather than regular `inline` to
prevent code being placed out-of-line. As mentioned in oether replies it has a
stronger semantic.

Thanks,
Mark.

> 
> Thanks!
> 
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Alexander Popov <alex.popov@linux.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Andy Lutomirski <luto@kernel.org>
> > Cc: Kees Cook <keescook@chromium.org>
> > ---
> >   include/linux/stackleak.h | 26 ++++++++++++++++++++++++++
> >   kernel/stackleak.c        | 18 ++++--------------
> >   2 files changed, 30 insertions(+), 14 deletions(-)
> > 
> > diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
> > index 467661aeb4136..c36e7a3b45e7e 100644
> > --- a/include/linux/stackleak.h
> > +++ b/include/linux/stackleak.h
> > @@ -42,6 +42,32 @@ stackleak_task_high_bound(const struct task_struct *tsk)
> >   	return (unsigned long)task_pt_regs(tsk);
> >   }
> > +/*
> > + * Find the address immediately above the poisoned region of the stack, where
> > + * that region falls between 'low' (inclusive) and 'high' (exclusive).
> > + */
> > +static __always_inline unsigned long
> > +stackleak_find_top_of_poison(const unsigned long low, const unsigned long high)
> > +{
> > +	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
> > +	unsigned int poison_count = 0;
> > +	unsigned long poison_high = high;
> > +	unsigned long sp = high;
> > +
> > +	while (sp > low && poison_count < depth) {
> > +		sp -= sizeof(unsigned long);
> > +
> > +		if (*(unsigned long *)sp == STACKLEAK_POISON) {
> > +			poison_count++;
> > +		} else {
> > +			poison_count = 0;
> > +			poison_high = sp;
> > +		}
> > +	}
> > +
> > +	return poison_high;
> > +}
> > +
> >   static inline void stackleak_task_init(struct task_struct *t)
> >   {
> >   	t->lowest_stack = stackleak_task_low_bound(t);
> > diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> > index ba346d46218f5..afd54b8e10b83 100644
> > --- a/kernel/stackleak.c
> > +++ b/kernel/stackleak.c
> > @@ -74,20 +74,10 @@ static __always_inline void __stackleak_erase(void)
> >   {
> >   	const unsigned long task_stack_low = stackleak_task_low_bound(current);
> >   	const unsigned long task_stack_high = stackleak_task_high_bound(current);
> > -	unsigned long erase_low = current->lowest_stack;
> > -	unsigned long erase_high;
> > -	unsigned int poison_count = 0;
> > -	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
> > -
> > -	/* Search for the poison value in the kernel stack */
> > -	while (erase_low > task_stack_low && poison_count <= depth) {
> > -		if (*(unsigned long *)erase_low == STACKLEAK_POISON)
> > -			poison_count++;
> > -		else
> > -			poison_count = 0;
> > -
> > -		erase_low -= sizeof(unsigned long);
> > -	}
> > +	unsigned long erase_low, erase_high;
> > +
> > +	erase_low = stackleak_find_top_of_poison(task_stack_low,
> > +						 current->lowest_stack);
> >   #ifdef CONFIG_STACKLEAK_METRICS
> >   	current->prev_lowest_stack = erase_low;
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/13] stackleak: remove redundant check
  2022-05-10 11:46     ` Mark Rutland
@ 2022-05-11  3:00       ` Kees Cook
  2022-05-11  8:02         ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Kees Cook @ 2022-05-11  3:00 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Alexander Popov, linux-arm-kernel, akpm, catalin.marinas,
	linux-kernel, luto, will

On Tue, May 10, 2022 at 12:46:48PM +0100, Mark Rutland wrote:
> On Sun, May 08, 2022 at 09:17:01PM +0300, Alexander Popov wrote:
> > On 27.04.2022 20:31, Mark Rutland wrote:
> > > In __stackleak_erase() we check that the `erase_low` value derived from
> > > `current->lowest_stack` is above the lowest legitimate stack pointer
> > > value, but this is already enforced by stackleak_track_stack() when
> > > recording the lowest stack value.
> > > 
> > > Remove the redundant check.
> > > 
> > > There should be no functional change as a result of this patch.
> > 
> > Mark, I can't agree here. I think this check is important.
> > The performance profit from dropping it is less than the confidence decrease :)
> > 
> > With this check, if the 'lowest_stack' value is corrupted, stackleak doesn't
> > overwrite some wrong kernel memory, but simply clears the whole thread
> > stack, which is safe behavior.
> 
> If you feel strongly about it, I can restore the check, but I struggle to
> believe that it's worthwhile. The `lowest_stack` value lives in the
> task_struct, and if you have the power to corrupt that you have the power to do
> much more interesting things.
> 
> If we do restore it, I'd like to add a big fat comment explaining the
> rationale (i.e. that it only matter if someone could corrupt
> `current->lowest_stack`, as otherwise that's guarnateed to be within bounds).

Yeah, let's restore it and add the comment. While I do agree it's likely
that such an corruption would likely mean an attacker had significant
control over kernel memory already, it is not uncommon that an attack
only has a limited index from a given address, etc. Or some manipulation
is possible via weird gadgets, etc. It's unlikely, but not impossible,
and a bounds-check for that value is cheap compared to the rest of the
work happening. :)

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 05/13] stackleak: clarify variable names
  2022-05-10 13:01     ` Mark Rutland
@ 2022-05-11  3:05       ` Kees Cook
  0 siblings, 0 replies; 47+ messages in thread
From: Kees Cook @ 2022-05-11  3:05 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Alexander Popov, linux-arm-kernel, akpm, catalin.marinas,
	linux-kernel, luto, will

On Tue, May 10, 2022 at 02:01:49PM +0100, Mark Rutland wrote:
> On Sun, May 08, 2022 at 11:49:46PM +0300, Alexander Popov wrote:
> > On 27.04.2022 20:31, Mark Rutland wrote:
> > > The logic within __stackleak_erase() can be a little hard to follow, as
> > > `boundary` switches from being the low bound to the high bound mid way
> > > through the function, and `kstack_ptr` is used to represent the start of
> > > the region to erase while `boundary` represents the end of the region to
> > > erase.
> > > 
> > > Make this a little clearer by consistently using clearer variable names.
> > > The `boundary` variable is removed, the bounds of the region to erase
> > > are described by `erase_low` and `erase_high`, and bounds of the task
> > > stack are described by `task_stack_low` and `task_stck_high`.
> > 
> > A typo here in `task_stck_high`.
> 
> Ah; whoops.

No worries; I fixed this when I took the patch.

> > That was also the main reason why I reused the 'boundary' variable: I wanted
> > the compiler to allocate it in the register and I avoided creating many
> > local variables.
> >
> > Mark, did your refactoring make the compiler allocate local variables on the
> > stack instead of the registers?
> 
> Considering the whole series, testing with GCC 11.1.0:
> 
> * On arm64:
>      before: stackleak_erase() uses 48 bytes of stack
>      after: stackleak_erase() uses 0 bytes of stack
> 
>      Note: this is entirely due to patch 1; arm64 has enough GPRs that it
>      doesn't need to use the stack.
> 
> * On x86_64:
>      before: stackleak_erase() uses 0 bytes of stack
>      after:  stackleak_erase() uses 0 bytes of stack
> 
> * On i386
>      before: stackleak_erase() uses 8 bytes of stach
>      after:  stackleak_erase() uses 16 bytes of stack
> 
> The i386 case isn't ideal, but given that those bytes will easily be used by
> the entry triage code before getting to any syscall handling, I don't believe
> that's an issue in practice.

I am biased and totally fine with choosing a solution where 64-bit
improvement comes at a 32-bit cost.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/13] stackleak: remove redundant check
  2022-05-11  3:00       ` Kees Cook
@ 2022-05-11  8:02         ` Mark Rutland
  2022-05-11 14:44           ` Kees Cook
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-05-11  8:02 UTC (permalink / raw)
  To: Kees Cook
  Cc: Alexander Popov, linux-arm-kernel, akpm, catalin.marinas,
	linux-kernel, luto, will

On Tue, May 10, 2022 at 08:00:38PM -0700, Kees Cook wrote:
> On Tue, May 10, 2022 at 12:46:48PM +0100, Mark Rutland wrote:
> > On Sun, May 08, 2022 at 09:17:01PM +0300, Alexander Popov wrote:
> > > On 27.04.2022 20:31, Mark Rutland wrote:
> > > > In __stackleak_erase() we check that the `erase_low` value derived from
> > > > `current->lowest_stack` is above the lowest legitimate stack pointer
> > > > value, but this is already enforced by stackleak_track_stack() when
> > > > recording the lowest stack value.
> > > > 
> > > > Remove the redundant check.
> > > > 
> > > > There should be no functional change as a result of this patch.
> > > 
> > > Mark, I can't agree here. I think this check is important.
> > > The performance profit from dropping it is less than the confidence decrease :)
> > > 
> > > With this check, if the 'lowest_stack' value is corrupted, stackleak doesn't
> > > overwrite some wrong kernel memory, but simply clears the whole thread
> > > stack, which is safe behavior.
> > 
> > If you feel strongly about it, I can restore the check, but I struggle to
> > believe that it's worthwhile. The `lowest_stack` value lives in the
> > task_struct, and if you have the power to corrupt that you have the power to do
> > much more interesting things.
> > 
> > If we do restore it, I'd like to add a big fat comment explaining the
> > rationale (i.e. that it only matter if someone could corrupt
> > `current->lowest_stack`, as otherwise that's guarnateed to be within bounds).
> 
> Yeah, let's restore it and add the comment. While I do agree it's likely
> that such an corruption would likely mean an attacker had significant
> control over kernel memory already, it is not uncommon that an attack
> only has a limited index from a given address, etc. Or some manipulation
> is possible via weird gadgets, etc. It's unlikely, but not impossible,
> and a bounds-check for that value is cheap compared to the rest of the
> work happening. :)

Fair enough; I can go spin a patch restoring this. I'm somewhat unhappy with
silently fixing that up, though -- IMO it'd be better to BUG() or similar in
that case.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/13] stackleak: remove redundant check
  2022-05-11  8:02         ` Mark Rutland
@ 2022-05-11 14:44           ` Kees Cook
  2022-05-12  9:14             ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Kees Cook @ 2022-05-11 14:44 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Alexander Popov, linux-arm-kernel, akpm, catalin.marinas,
	linux-kernel, luto, will



On May 11, 2022 1:02:45 AM PDT, Mark Rutland <mark.rutland@arm.com> wrote:
>On Tue, May 10, 2022 at 08:00:38PM -0700, Kees Cook wrote:
>> On Tue, May 10, 2022 at 12:46:48PM +0100, Mark Rutland wrote:
>> > On Sun, May 08, 2022 at 09:17:01PM +0300, Alexander Popov wrote:
>> > > On 27.04.2022 20:31, Mark Rutland wrote:
>> > > > In __stackleak_erase() we check that the `erase_low` value derived from
>> > > > `current->lowest_stack` is above the lowest legitimate stack pointer
>> > > > value, but this is already enforced by stackleak_track_stack() when
>> > > > recording the lowest stack value.
>> > > > 
>> > > > Remove the redundant check.
>> > > > 
>> > > > There should be no functional change as a result of this patch.
>> > > 
>> > > Mark, I can't agree here. I think this check is important.
>> > > The performance profit from dropping it is less than the confidence decrease :)
>> > > 
>> > > With this check, if the 'lowest_stack' value is corrupted, stackleak doesn't
>> > > overwrite some wrong kernel memory, but simply clears the whole thread
>> > > stack, which is safe behavior.
>> > 
>> > If you feel strongly about it, I can restore the check, but I struggle to
>> > believe that it's worthwhile. The `lowest_stack` value lives in the
>> > task_struct, and if you have the power to corrupt that you have the power to do
>> > much more interesting things.
>> > 
>> > If we do restore it, I'd like to add a big fat comment explaining the
>> > rationale (i.e. that it only matter if someone could corrupt
>> > `current->lowest_stack`, as otherwise that's guarnateed to be within bounds).
>> 
>> Yeah, let's restore it and add the comment. While I do agree it's likely
>> that such an corruption would likely mean an attacker had significant
>> control over kernel memory already, it is not uncommon that an attack
>> only has a limited index from a given address, etc. Or some manipulation
>> is possible via weird gadgets, etc. It's unlikely, but not impossible,
>> and a bounds-check for that value is cheap compared to the rest of the
>> work happening. :)
>
>Fair enough; I can go spin a patch restoring this. I'm somewhat unhappy with
>silently fixing that up, though -- IMO it'd be better to BUG() or similar in
>that case.

I share your desires, and this was exactly what Alexander originally proposed, but Linus rejected it violently. :(
https://lore.kernel.org/lkml/CA+55aFy6jNLsywVYdGp83AMrXBo_P-pkjkphPGrO=82SPKCpLQ@mail.gmail.com/


-- 
Kees Cook

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/13] stackleak: remove redundant check
  2022-05-11 14:44           ` Kees Cook
@ 2022-05-12  9:14             ` Mark Rutland
  2022-05-15 16:17               ` Alexander Popov
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-05-12  9:14 UTC (permalink / raw)
  To: Kees Cook
  Cc: Alexander Popov, linux-arm-kernel, akpm, catalin.marinas,
	linux-kernel, luto, will

On Wed, May 11, 2022 at 07:44:41AM -0700, Kees Cook wrote:
> 
> 
> On May 11, 2022 1:02:45 AM PDT, Mark Rutland <mark.rutland@arm.com> wrote:
> >On Tue, May 10, 2022 at 08:00:38PM -0700, Kees Cook wrote:
> >> On Tue, May 10, 2022 at 12:46:48PM +0100, Mark Rutland wrote:
> >> > On Sun, May 08, 2022 at 09:17:01PM +0300, Alexander Popov wrote:
> >> > > On 27.04.2022 20:31, Mark Rutland wrote:
> >> > > > In __stackleak_erase() we check that the `erase_low` value derived from
> >> > > > `current->lowest_stack` is above the lowest legitimate stack pointer
> >> > > > value, but this is already enforced by stackleak_track_stack() when
> >> > > > recording the lowest stack value.
> >> > > > 
> >> > > > Remove the redundant check.
> >> > > > 
> >> > > > There should be no functional change as a result of this patch.
> >> > > 
> >> > > Mark, I can't agree here. I think this check is important.
> >> > > The performance profit from dropping it is less than the confidence decrease :)
> >> > > 
> >> > > With this check, if the 'lowest_stack' value is corrupted, stackleak doesn't
> >> > > overwrite some wrong kernel memory, but simply clears the whole thread
> >> > > stack, which is safe behavior.
> >> > 
> >> > If you feel strongly about it, I can restore the check, but I struggle to
> >> > believe that it's worthwhile. The `lowest_stack` value lives in the
> >> > task_struct, and if you have the power to corrupt that you have the power to do
> >> > much more interesting things.
> >> > 
> >> > If we do restore it, I'd like to add a big fat comment explaining the
> >> > rationale (i.e. that it only matter if someone could corrupt
> >> > `current->lowest_stack`, as otherwise that's guarnateed to be within bounds).
> >> 
> >> Yeah, let's restore it and add the comment. While I do agree it's likely
> >> that such an corruption would likely mean an attacker had significant
> >> control over kernel memory already, it is not uncommon that an attack
> >> only has a limited index from a given address, etc. Or some manipulation
> >> is possible via weird gadgets, etc. It's unlikely, but not impossible,
> >> and a bounds-check for that value is cheap compared to the rest of the
> >> work happening. :)
> >
> >Fair enough; I can go spin a patch restoring this. I'm somewhat unhappy with
> >silently fixing that up, though -- IMO it'd be better to BUG() or similar in
> >that case.
> 
> I share your desires, and this was exactly what Alexander originally proposed, but Linus rejected it violently. :(
> https://lore.kernel.org/lkml/CA+55aFy6jNLsywVYdGp83AMrXBo_P-pkjkphPGrO=82SPKCpLQ@mail.gmail.com/

I see. :/

Thinking about this some more, if we assume someone can corrupt *some* word of
memory, then we need to consider that instead of corrupting
task_struct::lowest_stack, they could corrupt task_struct::stack (or x86's
cpu_current_top_of_stack prior to this series).

With that in mind, if we detect that task_struct::lowest_stack is
out-of-bounds, we have no idea whether it has been corrupted or the other bound
values have been corrupted, and so we can't do the erase safely anyway.

So AFAICT we must *avoid* erasing when that goes wrong. Maybe we could WARN()
instead of BUG()?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/13] stackleak: remove redundant check
  2022-05-12  9:14             ` Mark Rutland
@ 2022-05-15 16:17               ` Alexander Popov
  2022-05-24 10:03                 ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Alexander Popov @ 2022-05-15 16:17 UTC (permalink / raw)
  To: Mark Rutland, Kees Cook
  Cc: linux-arm-kernel, akpm, catalin.marinas, linux-kernel, luto, will

On 12.05.2022 12:14, Mark Rutland wrote:
> On Wed, May 11, 2022 at 07:44:41AM -0700, Kees Cook wrote:
>>
>>
>> On May 11, 2022 1:02:45 AM PDT, Mark Rutland <mark.rutland@arm.com> wrote:
>>> On Tue, May 10, 2022 at 08:00:38PM -0700, Kees Cook wrote:
>>>> On Tue, May 10, 2022 at 12:46:48PM +0100, Mark Rutland wrote:
>>>>> On Sun, May 08, 2022 at 09:17:01PM +0300, Alexander Popov wrote:
>>>>>> On 27.04.2022 20:31, Mark Rutland wrote:
>>>>>>> In __stackleak_erase() we check that the `erase_low` value derived from
>>>>>>> `current->lowest_stack` is above the lowest legitimate stack pointer
>>>>>>> value, but this is already enforced by stackleak_track_stack() when
>>>>>>> recording the lowest stack value.
>>>>>>>
>>>>>>> Remove the redundant check.
>>>>>>>
>>>>>>> There should be no functional change as a result of this patch.
>>>>>>
>>>>>> Mark, I can't agree here. I think this check is important.
>>>>>> The performance profit from dropping it is less than the confidence decrease :)
>>>>>>
>>>>>> With this check, if the 'lowest_stack' value is corrupted, stackleak doesn't
>>>>>> overwrite some wrong kernel memory, but simply clears the whole thread
>>>>>> stack, which is safe behavior.
>>>>>
>>>>> If you feel strongly about it, I can restore the check, but I struggle to
>>>>> believe that it's worthwhile. The `lowest_stack` value lives in the
>>>>> task_struct, and if you have the power to corrupt that you have the power to do
>>>>> much more interesting things.
>>>>>
>>>>> If we do restore it, I'd like to add a big fat comment explaining the
>>>>> rationale (i.e. that it only matter if someone could corrupt
>>>>> `current->lowest_stack`, as otherwise that's guarnateed to be within bounds).
>>>>
>>>> Yeah, let's restore it and add the comment. While I do agree it's likely
>>>> that such an corruption would likely mean an attacker had significant
>>>> control over kernel memory already, it is not uncommon that an attack
>>>> only has a limited index from a given address, etc. Or some manipulation
>>>> is possible via weird gadgets, etc. It's unlikely, but not impossible,
>>>> and a bounds-check for that value is cheap compared to the rest of the
>>>> work happening. :)
>>>
>>> Fair enough; I can go spin a patch restoring this. I'm somewhat unhappy with
>>> silently fixing that up, though -- IMO it'd be better to BUG() or similar in
>>> that case.
>>
>> I share your desires, and this was exactly what Alexander originally proposed, but Linus rejected it violently. :(
>> https://lore.kernel.org/lkml/CA+55aFy6jNLsywVYdGp83AMrXBo_P-pkjkphPGrO=82SPKCpLQ@mail.gmail.com/
> 
> I see. :/
> 
> Thinking about this some more, if we assume someone can corrupt *some* word of
> memory, then we need to consider that instead of corrupting
> task_struct::lowest_stack, they could corrupt task_struct::stack (or x86's
> cpu_current_top_of_stack prior to this series).
> 
> With that in mind, if we detect that task_struct::lowest_stack is
> out-of-bounds, we have no idea whether it has been corrupted or the other bound
> values have been corrupted, and so we can't do the erase safely anyway.

:)

IMO, even if a kernel thread stack is moved somewhere for any weird reason, 
stackleak must erase it at the end of syscall and do its job.

> So AFAICT we must *avoid* erasing when that goes wrong. Maybe we could WARN()
> instead of BUG()?

Mark, I think security features must not go out of service.

The 'lowest_stack' value is for making stackleak faster. I believe if the 
'lowest_stack' value is invalid, stackleak must not skip its main job and should 
erase the whole kernel thread stack.

When I developed 'stackleak_erase()' I tried adding 'WARN_ON()', but it didn't 
work properly there, as I remember. Warning handling code is very complex. So I 
dropped that fragile part.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 06/13] stackleak: rework stack high bound handling
  2022-05-10 11:22     ` Mark Rutland
@ 2022-05-15 16:32       ` Alexander Popov
  0 siblings, 0 replies; 47+ messages in thread
From: Alexander Popov @ 2022-05-15 16:32 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On 10.05.2022 14:22, Mark Rutland wrote:
> On Mon, May 09, 2022 at 12:27:22AM +0300, Alexander Popov wrote:
>> On 27.04.2022 20:31, Mark Rutland wrote:
>>> Prior to returning to userpace, we reset current->lowest_stack to a
>>> reasonable high bound. Currently we do this by subtracting the arbitrary
>>> value `THREAD_SIZE/64` from the top of the stack, for reasons lost to
>>> history.
>>>
>>> Looking at configurations today:
>>>
>>> * On i386 where THREAD_SIZE is 8K, the bound will be 128 bytes. The
>>>     pt_regs at the top of the stack is 68 bytes (with 0 to 16 bytes of
>>>     padding above), and so this covers an additional portion of 44 to 60
>>>     bytes.
>>>
>>> * On x86_64 where THREAD_SIZE is at least 16K (up to 32K with KASAN) the
>>>     bound will be at least 256 bytes (up to 512 with KASAN). The pt_regs
>>>     at the top of the stack is 168 bytes, and so this cover an additional
>>>     88 bytes of stack (up to 344 with KASAN).
>>>
>>> * On arm64 where THREAD_SIZE is at least 16K (up to 64K with 64K pages
>>>     and VMAP_STACK), the bound will be at least 256 bytes (up to 1024 with
>>>     KASAN). The pt_regs at the top of the stack is 336 bytes, so this can
>>>     fall within the pt_regs, or can cover an additional 688 bytes of
>>>     stack.
>>>
>>> Clearly the `THREAD_SIZE/64` value doesn't make much sense -- in the
>>> worst case, this will cause more than 600 bytes of stack to be erased
>>> for every syscall, even if actual stack usage were substantially
>>> smaller.
>>>
>>> This patches makes this slightly less nonsensical by consistently
>>> resetting current->lowest_stack to the base of the task pt_regs. For
>>> clarity and for consistency with the handling of the low bound, the
>>> generation of the high bound is split into a helper with commentary
>>> explaining why.
>>>
>>> Since the pt_regs at the top of the stack will be clobbered upon the
>>> next exception entry, we don't need to poison these at exception exit.
>>> By using task_pt_regs() as the high stack boundary instead of
>>> current_top_of_stack() we avoid some redundant poisoning, and the
>>> compiler can share the address generation between the poisoning and
>>> restting of `current->lowest_stack`, making the generated code more
>>> optimal.
>>>
>>> It's not clear to me whether the existing `THREAD_SIZE/64` offset was a
>>> dodgy heuristic to skip the pt_regs, or whether it was attempting to
>>> minimize the number of times stackleak_check_stack() would have to
>>> update `current->lowest_stack` when stack usage was shallow at the cost
>>> of unconditionally poisoning a small portion of the stack for every exit
>>> to userspace.
>>
>> I inherited this 'THREAD_SIZE/64' logic is from the original grsecurity patch.
>> As I mentioned, originally this was written in asm.
>>
>> For x86_64:
>> 	mov	TASK_thread_sp0(%r11), %rdi
>> 	sub	$256, %rdi
>> 	mov	%rdi, TASK_lowest_stack(%r11)
>>
>> For x86_32:
>> 	mov TASK_thread_sp0(%ebp), %edi
>> 	sub $128, %edi
>> 	mov %edi, TASK_lowest_stack(%ebp)
>>
>> 256 bytes for x86_64 and 128 bytes for x86_32 are exactly THREAD_SIZE/64.
> 
> To check my understanding, did you come up with the `THREAD_SIZE/64`
> calculation from reverse-engineering the assembly?

Yes, exactly.

> I strongly suspect that has nothing to do with THEAD_SIZE, and was trying to
> fall just below the task's pt_regs (and maybe some unconditional stack usage
> just below that).

Aha, that's probable.

>> I think this value was chosen as optimal for minimizing poison scanning.
>> It's possible that stackleak_track_stack() is not called during the syscall
>> because all the called functions have small stack frames.
> 
> I can believe that, but given the clearing cost appears to dominate the
> scanning cost, I suspect we can live with making this precisely the bottom of
> the pt_regs.

Sounds good!

>>> For now I've simply removed the offset, and if we need/want to minimize
>>> updates for shallow stack usage it should be easy to add a better
>>> heuristic atop, with appropriate commentary so we know what's going on.
>>
>> I like your idea to erase the thread stack up to pt_regs if we call the
>> stackleak erasing from the trampoline stack.
>>
>> But here I don't understand where task_pt_regs() points to...
> 
> As mentioned in the commit message, there's a struct pt_regs at the top of each
> task stack (created at exception entry, and consumed by exception return),
> which contains the user register state. On arm64 and x86_64, that's *right* at
> the top of the stack, but on i386 there can be some padding above that.
> task_pt_regs(tsk) points at the base of that pt_regs.
> 
> That looks something like the following (with increasing addresses going
> upwards):
> 
>    ----------------   <--- task_stack_page(tsk) + THREAD_SIZE
>    (optional padding)
>    ----------------   <--- task_top_of_stack(tsk) // x86 only
>        /\
>        ||
>        || pt_regs
>        ||
>        \/
>    ----------------   <--- task_pt_regs(tsk)
>        /\
>        ||
>        ||
>        ||
>        || Usable task stack
>        ||
>        ||
>        ||
>        \/
>    ----------------
>    STACK_END_MAGIC
>    ----------------   <--- task_stack_page(tsk)
> 
> Thanks,
> Mark.
> 
>>> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Alexander Popov <alex.popov@linux.com>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Andy Lutomirski <luto@kernel.org>
>>> Cc: Kees Cook <keescook@chromium.org>
>>> ---
>>>    include/linux/stackleak.h | 14 ++++++++++++++
>>>    kernel/stackleak.c        | 19 ++++++++++++++-----
>>>    2 files changed, 28 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
>>> index 67430faa5c518..467661aeb4136 100644
>>> --- a/include/linux/stackleak.h
>>> +++ b/include/linux/stackleak.h
>>> @@ -28,6 +28,20 @@ stackleak_task_low_bound(const struct task_struct *tsk)
>>>    	return (unsigned long)end_of_stack(tsk) + sizeof(unsigned long);
>>>    }
>>> +/*
>>> + * The address immediately after the highest address on tsk's stack which we
>>> + * can plausibly erase.
>>> + */
>>> +static __always_inline unsigned long
>>> +stackleak_task_high_bound(const struct task_struct *tsk)
>>> +{
>>> +	/*
>>> +	 * The task's pt_regs lives at the top of the task stack and will be
>>> +	 * overwritten by exception entry, so there's no need to erase them.
>>> +	 */
>>> +	return (unsigned long)task_pt_regs(tsk);
>>> +}
>>> +
>>>    static inline void stackleak_task_init(struct task_struct *t)
>>>    {
>>>    	t->lowest_stack = stackleak_task_low_bound(t);
>>> diff --git a/kernel/stackleak.c b/kernel/stackleak.c
>>> index d5f684dc0a2d9..ba346d46218f5 100644
>>> --- a/kernel/stackleak.c
>>> +++ b/kernel/stackleak.c
>>> @@ -73,6 +73,7 @@ late_initcall(stackleak_sysctls_init);
>>>    static __always_inline void __stackleak_erase(void)
>>>    {
>>>    	const unsigned long task_stack_low = stackleak_task_low_bound(current);
>>> +	const unsigned long task_stack_high = stackleak_task_high_bound(current);
>>>    	unsigned long erase_low = current->lowest_stack;
>>>    	unsigned long erase_high;
>>>    	unsigned int poison_count = 0;
>>> @@ -93,14 +94,22 @@ static __always_inline void __stackleak_erase(void)
>>>    #endif
>>>    	/*
>>> -	 * Now write the poison value to the kernel stack between 'erase_low'
>>> -	 * and 'erase_high'. We assume that the stack pointer doesn't change
>>> -	 * when we write poison.
>>> +	 * Write poison to the task's stack between 'erase_low' and
>>> +	 * 'erase_high'.
>>> +	 *
>>> +	 * If we're running on a different stack (e.g. an entry trampoline
>>> +	 * stack) we can erase everything below the pt_regs at the top of the
>>> +	 * task stack.
>>> +	 *
>>> +	 * If we're running on the task stack itself, we must not clobber any
>>> +	 * stack used by this function and its caller. We assume that this
>>> +	 * function has a fixed-size stack frame, and the current stack pointer
>>> +	 * doesn't change while we write poison.
>>>    	 */
>>>    	if (on_thread_stack())
>>>    		erase_high = current_stack_pointer;
>>>    	else
>>> -		erase_high = current_top_of_stack();
>>> +		erase_high = task_stack_high;
>>>    	while (erase_low < erase_high) {
>>>    		*(unsigned long *)erase_low = STACKLEAK_POISON;
>>> @@ -108,7 +117,7 @@ static __always_inline void __stackleak_erase(void)
>>>    	}
>>>    	/* Reset the 'lowest_stack' value for the next syscall */
>>> -	current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64;
>>> +	current->lowest_stack = task_stack_high;
>>>    }
>>>    asmlinkage void noinstr stackleak_erase(void)
>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/13] stackleak: rework poison scanning
  2022-05-10 13:13     ` Mark Rutland
@ 2022-05-15 17:33       ` Alexander Popov
  2022-05-24 13:31         ` Mark Rutland
  0 siblings, 1 reply; 47+ messages in thread
From: Alexander Popov @ 2022-05-15 17:33 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On 10.05.2022 16:13, Mark Rutland wrote:
> On Mon, May 09, 2022 at 04:51:35PM +0300, Alexander Popov wrote:
>> Hello Mark!
>>
>> On 27.04.2022 20:31, Mark Rutland wrote:
>>> Currently we over-estimate the region of stack which must be erased.
>>>
>>> To determine the region to be erased, we scan downards for a contiguous
>>> block of poison values (or the low bound of the stack). There are a few
>>> minor problems with this today:
>>>
>>> * When we find a block of poison values, we include this block within
>>>     the region to erase.
>>>
>>>     As this is included within the region to erase, this causes us to
>>>     redundantly overwrite 'STACKLEAK_SEARCH_DEPTH' (128) bytes with
>>>     poison.
>>
>> Right, this can be improved.
>>
>>> * As the loop condition checks 'poison_count <= depth', it will run an
>>>     additional iteration after finding the contiguous block of poison,
>>>     decrementing 'erase_low' once more than necessary.
>>
>> Actually, I think the current code is correct.
>>
>> I'm intentionally searching one poison value more than
>> STACKLEAK_SEARCH_DEPTH to avoid the corner case. See the BUILD_BUG_ON
>> assertion in stackleak_track_stack() that describes it:
>>
>> /*
>>   * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
>>   * STACKLEAK_SEARCH_DEPTH makes the poison search in
>>   * stackleak_erase() unreliable. Let's prevent that.
>>   */
>> BUILD_BUG_ON(CONFIG_STACKLEAK_TRACK_MIN_SIZE > STACKLEAK_SEARCH_DEPTH);
> 
> I had read that, but as written that doesn't imply that it's necessary to scan
> one more element than STACKLEAK_SEARCH_DEPTH, nor why. I'm more than happy to
> change the logic, but I think we need a very clear explanation as to why we
> need to scan the specific number of bytes we scan, and we should account for
> that *within* STACKLEAK_SEARCH_DEPTH for clarity.

I'll try to explain.

The stackleak gcc plugin instruments the kernel code inserting the 
'stackleak_track_stack()' calls for the functions with a stack frame size 
greater than or equal to CONFIG_STACKLEAK_TRACK_MIN_SIZE.

The kernel functions with a smaller stack frame are not instrumented (the 
'lowest_stack' value is not updated for them).

Any kernel function may leave uninitialized data on its stack frame. The poison 
scanning must handle that correctly. The uninitialized groups of poison values 
must be smaller than the search depth, otherwise 'stackleak_erase()' is unreliable.

So with this BUILD_BUG_ON I control that
   CONFIG_STACKLEAK_TRACK_MIN_SIZE <= STACKLEAK_SEARCH_DEPTH.

To be sure and avoid mistakes in the edge cases, 'stackleak_erase()' is 
searching one poison value more than STACKLEAK_SEARCH_DEPTH.

If you don't like this one additional poison value in the search, I would 
propose to change the assertion.

What do you think?

>>>     As this is included within the region to erase, this causes us to
>>>     redundantly overwrite an additional unsigned long with poison.
>>>
>>> * As we always decrement 'erase_low' after checking an element on the
>>>     stack, we always include the element below this within the region to
>>>     erase.
>>>
>>>     As this is included within the region to erase, this causes us to
>>>     redundantly overwrite an additional unsigned long with poison.
>>>
>>>     Note that this is not a functional problem. As the loop condition
>>>     checks 'erase_low > task_stack_low', we'll never clobber the
>>>     STACK_END_MAGIC. As we always decrement 'erase_low' after this, we'll
>>>     never fail to erase the element immediately above the STACK_END_MAGIC.
>>
>> Right, I don't see any bug in the current erasing code.
>>
>> When I wrote the current code, I carefully checked all the corner cases. For
>> example, on the first stack erasing, the STACK_END_MAGIC was not
>> overwritten, but the memory next to it was erased. Same for the beginning of
>> the stack: I carefully checked that no unpoisoned bytes were left on the
>> thread stack.
>>
>>> In total, this can cause us to erase `128 + 2 * sizeof(unsigned long)`
>>> bytes more than necessary, which is unfortunate.
>>>
>>> This patch reworks the logic to find the address immediately above the
>>> poisoned region, by finding the lowest non-poisoned address. This is
>>> factored into a stackleak_find_top_of_poison() helper both for clarity
>>> and so that this can be shared with the LKDTM test in subsequent
>>> patches.
>>
>> You know, I wrote stackleak_erase() in very plain C. I wanted a compiler to
>> generate assembly that is very close to the original asm version. I worried
>> that compilers might do weird stuff with the local variables and the stack
>> pointer.
>>
>> So I checked stackleak for gcc versions 4.8, 5, 6, 7, 8, 9, and 10 on
>> x86_64, i386 and arm64. This is my project that helped with this work:
>> https://github.com/a13xp0p0v/kernel-build-containers
> 
> I've used the kernel.org cross toolchains, as published at:
> 
>    https://mirrors.edge.kernel.org/pub/tools/crosstool/
> 
>> Mark, in this patch series you use many local variables and helper functions.
>> Honestly, this worries me. For example, compilers can (and usually do)
>> ignore the presence of the 'inline' specifier for the purpose of
>> optimization.
> 
> I've deliberately used `__always_inline` rather than regular `inline` to
> prevent code being placed out-of-line. As mentioned in oether replies it has a
> stronger semantic.
> 
> Thanks,
> Mark.
> 
>>
>> Thanks!
>>
>>> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Alexander Popov <alex.popov@linux.com>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Andy Lutomirski <luto@kernel.org>
>>> Cc: Kees Cook <keescook@chromium.org>
>>> ---
>>>    include/linux/stackleak.h | 26 ++++++++++++++++++++++++++
>>>    kernel/stackleak.c        | 18 ++++--------------
>>>    2 files changed, 30 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
>>> index 467661aeb4136..c36e7a3b45e7e 100644
>>> --- a/include/linux/stackleak.h
>>> +++ b/include/linux/stackleak.h
>>> @@ -42,6 +42,32 @@ stackleak_task_high_bound(const struct task_struct *tsk)
>>>    	return (unsigned long)task_pt_regs(tsk);
>>>    }
>>> +/*
>>> + * Find the address immediately above the poisoned region of the stack, where
>>> + * that region falls between 'low' (inclusive) and 'high' (exclusive).
>>> + */
>>> +static __always_inline unsigned long
>>> +stackleak_find_top_of_poison(const unsigned long low, const unsigned long high)
>>> +{
>>> +	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>>> +	unsigned int poison_count = 0;
>>> +	unsigned long poison_high = high;
>>> +	unsigned long sp = high;
>>> +
>>> +	while (sp > low && poison_count < depth) {
>>> +		sp -= sizeof(unsigned long);
>>> +
>>> +		if (*(unsigned long *)sp == STACKLEAK_POISON) {
>>> +			poison_count++;
>>> +		} else {
>>> +			poison_count = 0;
>>> +			poison_high = sp;
>>> +		}
>>> +	}
>>> +
>>> +	return poison_high;
>>> +}
>>> +
>>>    static inline void stackleak_task_init(struct task_struct *t)
>>>    {
>>>    	t->lowest_stack = stackleak_task_low_bound(t);
>>> diff --git a/kernel/stackleak.c b/kernel/stackleak.c
>>> index ba346d46218f5..afd54b8e10b83 100644
>>> --- a/kernel/stackleak.c
>>> +++ b/kernel/stackleak.c
>>> @@ -74,20 +74,10 @@ static __always_inline void __stackleak_erase(void)
>>>    {
>>>    	const unsigned long task_stack_low = stackleak_task_low_bound(current);
>>>    	const unsigned long task_stack_high = stackleak_task_high_bound(current);
>>> -	unsigned long erase_low = current->lowest_stack;
>>> -	unsigned long erase_high;
>>> -	unsigned int poison_count = 0;
>>> -	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>>> -
>>> -	/* Search for the poison value in the kernel stack */
>>> -	while (erase_low > task_stack_low && poison_count <= depth) {
>>> -		if (*(unsigned long *)erase_low == STACKLEAK_POISON)
>>> -			poison_count++;
>>> -		else
>>> -			poison_count = 0;
>>> -
>>> -		erase_low -= sizeof(unsigned long);
>>> -	}
>>> +	unsigned long erase_low, erase_high;
>>> +
>>> +	erase_low = stackleak_find_top_of_poison(task_stack_low,
>>> +						 current->lowest_stack);
>>>    #ifdef CONFIG_STACKLEAK_METRICS
>>>    	current->prev_lowest_stack = erase_low;
>>


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/13] stackleak: remove redundant check
  2022-05-15 16:17               ` Alexander Popov
@ 2022-05-24 10:03                 ` Mark Rutland
  2022-05-26 22:09                   ` Alexander Popov
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-05-24 10:03 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Kees Cook, linux-arm-kernel, akpm, catalin.marinas, linux-kernel,
	luto, will

On Sun, May 15, 2022 at 07:17:16PM +0300, Alexander Popov wrote:
> On 12.05.2022 12:14, Mark Rutland wrote:
> > On Wed, May 11, 2022 at 07:44:41AM -0700, Kees Cook wrote:
> > > 
> > > 
> > > On May 11, 2022 1:02:45 AM PDT, Mark Rutland <mark.rutland@arm.com> wrote:
> > > > On Tue, May 10, 2022 at 08:00:38PM -0700, Kees Cook wrote:
> > > > > On Tue, May 10, 2022 at 12:46:48PM +0100, Mark Rutland wrote:
> > > > > > On Sun, May 08, 2022 at 09:17:01PM +0300, Alexander Popov wrote:
> > > > > > > On 27.04.2022 20:31, Mark Rutland wrote:
> > > > > > > > In __stackleak_erase() we check that the `erase_low` value derived from
> > > > > > > > `current->lowest_stack` is above the lowest legitimate stack pointer
> > > > > > > > value, but this is already enforced by stackleak_track_stack() when
> > > > > > > > recording the lowest stack value.
> > > > > > > > 
> > > > > > > > Remove the redundant check.
> > > > > > > > 
> > > > > > > > There should be no functional change as a result of this patch.
> > > > > > > 
> > > > > > > Mark, I can't agree here. I think this check is important.
> > > > > > > The performance profit from dropping it is less than the confidence decrease :)
> > > > > > > 
> > > > > > > With this check, if the 'lowest_stack' value is corrupted, stackleak doesn't
> > > > > > > overwrite some wrong kernel memory, but simply clears the whole thread
> > > > > > > stack, which is safe behavior.
> > > > > > 
> > > > > > If you feel strongly about it, I can restore the check, but I struggle to
> > > > > > believe that it's worthwhile. The `lowest_stack` value lives in the
> > > > > > task_struct, and if you have the power to corrupt that you have the power to do
> > > > > > much more interesting things.
> > > > > > 
> > > > > > If we do restore it, I'd like to add a big fat comment explaining the
> > > > > > rationale (i.e. that it only matter if someone could corrupt
> > > > > > `current->lowest_stack`, as otherwise that's guarnateed to be within bounds).
> > > > > 
> > > > > Yeah, let's restore it and add the comment. While I do agree it's likely
> > > > > that such an corruption would likely mean an attacker had significant
> > > > > control over kernel memory already, it is not uncommon that an attack
> > > > > only has a limited index from a given address, etc. Or some manipulation
> > > > > is possible via weird gadgets, etc. It's unlikely, but not impossible,
> > > > > and a bounds-check for that value is cheap compared to the rest of the
> > > > > work happening. :)
> > > > 
> > > > Fair enough; I can go spin a patch restoring this. I'm somewhat unhappy with
> > > > silently fixing that up, though -- IMO it'd be better to BUG() or similar in
> > > > that case.
> > > 
> > > I share your desires, and this was exactly what Alexander originally proposed, but Linus rejected it violently. :(
> > > https://lore.kernel.org/lkml/CA+55aFy6jNLsywVYdGp83AMrXBo_P-pkjkphPGrO=82SPKCpLQ@mail.gmail.com/
> > 
> > I see. :/
> > 
> > Thinking about this some more, if we assume someone can corrupt *some* word of
> > memory, then we need to consider that instead of corrupting
> > task_struct::lowest_stack, they could corrupt task_struct::stack (or x86's
> > cpu_current_top_of_stack prior to this series).
> > 
> > With that in mind, if we detect that task_struct::lowest_stack is
> > out-of-bounds, we have no idea whether it has been corrupted or the other bound
> > values have been corrupted, and so we can't do the erase safely anyway.
> 
> :)
> 
> IMO, even if a kernel thread stack is moved somewhere for any weird reason,
> stackleak must erase it at the end of syscall and do its job.

I'm not talking about the stack being *moved*. I'm talking about the pointers
to it being *corrupted* (wince we use those to determine the bounds).

The problem is that we don't have a single source of truth here that we can
rely upon. 

We're stuck between a dichotomy:

* If we assume an attacker *can* corrupt a word of memory, they can corrupt any
  of the in-memory values we use to find the stack in the first place. If we
  detect a mismatch we cannot know which is bad, and if the attacker can
  corrupt the one(s) we choose to blindly trust, then they can weaponize the
  erasing code to corrupt memory.

  That's *worse* than the info leak problem stackleak was originally trying to
  solve.

  See below for one way we could avoid that.

* If we assume the attacker *cannot* corrupt a word of memory, then we know the
  values must always be within bounds, and there's no need for the check.

> > So AFAICT we must *avoid* erasing when that goes wrong. Maybe we could WARN()
> > instead of BUG()?
> 
> Mark, I think security features must not go out of service.
>
> The 'lowest_stack' value is for making stackleak faster. I believe if the
> 'lowest_stack' value is invalid, stackleak must not skip its main job and
> should erase the whole kernel thread stack.

My point is that the conditions which permit `lowest_stack` to become invalid
(e.g. an attacker having an arbitrary or constrained write gadget) also permit
all the other stack boundariy values to become invalid.

If we detect `lowest_stack` is out of bounds, we have no idea which of
`lowest_stack` or the bounds are corrupt -- so we *cannot* safely erase: if the
bounds are corrupt we'll corrupt arbitrary memory.

We *could* do better by always deriving the bounds from an SP value (current
for on-stack, passed in by asm for off-stack). If we did that, we could more
reasonably treat the bounds values as more reliable than the `lowest_stack`
value, and with that I'd be happy with the bounds check (though I still think
we want to make this WARN()).

> When I developed 'stackleak_erase()' I tried adding 'WARN_ON()', but it
> didn't work properly there, as I remember. Warning handling code is very
> complex. So I dropped that fragile part.

Do you recall any specific problem, or just that there were problems?

I ask because the entry code, and handling of BUG() and WARN() has changed
quite a bit over the last couple of years. We've fixed some latent issues
there, though IIUC this late in the exception return flow there are still some
potential issues with the RCU/lockdep/etc context that would need to be
saved/restored.

We need to solve some of that in general anyuway, since there are other BUG()
and WARN() instances hidden in noinstr entry code. I'm happy to dig into that
(time permitting).

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/13] stackleak: rework poison scanning
  2022-05-15 17:33       ` Alexander Popov
@ 2022-05-24 13:31         ` Mark Rutland
  2022-05-26 23:25           ` Alexander Popov
  0 siblings, 1 reply; 47+ messages in thread
From: Mark Rutland @ 2022-05-24 13:31 UTC (permalink / raw)
  To: Alexander Popov
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On Sun, May 15, 2022 at 08:33:01PM +0300, Alexander Popov wrote:
> On 10.05.2022 16:13, Mark Rutland wrote:
> > On Mon, May 09, 2022 at 04:51:35PM +0300, Alexander Popov wrote:
> > > Hello Mark!
> > > 
> > > On 27.04.2022 20:31, Mark Rutland wrote:
> > > > Currently we over-estimate the region of stack which must be erased.
> > > > 
> > > > To determine the region to be erased, we scan downards for a contiguous
> > > > block of poison values (or the low bound of the stack). There are a few
> > > > minor problems with this today:
> > > > 
> > > > * When we find a block of poison values, we include this block within
> > > >     the region to erase.
> > > > 
> > > >     As this is included within the region to erase, this causes us to
> > > >     redundantly overwrite 'STACKLEAK_SEARCH_DEPTH' (128) bytes with
> > > >     poison.
> > > 
> > > Right, this can be improved.
> > > 
> > > > * As the loop condition checks 'poison_count <= depth', it will run an
> > > >     additional iteration after finding the contiguous block of poison,
> > > >     decrementing 'erase_low' once more than necessary.
> > > 
> > > Actually, I think the current code is correct.
> > > 
> > > I'm intentionally searching one poison value more than
> > > STACKLEAK_SEARCH_DEPTH to avoid the corner case. See the BUILD_BUG_ON
> > > assertion in stackleak_track_stack() that describes it:
> > > 
> > > /*
> > >   * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
> > >   * STACKLEAK_SEARCH_DEPTH makes the poison search in
> > >   * stackleak_erase() unreliable. Let's prevent that.
> > >   */
> > > BUILD_BUG_ON(CONFIG_STACKLEAK_TRACK_MIN_SIZE > STACKLEAK_SEARCH_DEPTH);
> > 
> > I had read that, but as written that doesn't imply that it's necessary to scan
> > one more element than STACKLEAK_SEARCH_DEPTH, nor why. I'm more than happy to
> > change the logic, but I think we need a very clear explanation as to why we
> > need to scan the specific number of bytes we scan, and we should account for
> > that *within* STACKLEAK_SEARCH_DEPTH for clarity.
> 
> I'll try to explain.
> 
> The stackleak gcc plugin instruments the kernel code inserting the
> 'stackleak_track_stack()' calls for the functions with a stack frame size
> greater than or equal to CONFIG_STACKLEAK_TRACK_MIN_SIZE.
> 
> The kernel functions with a smaller stack frame are not instrumented (the
> 'lowest_stack' value is not updated for them).

I understood these points.

It's also worth noting that `noinstr` code will also not be instrumented
regardless of frame size -- we might want some build-time assertion for those.

> Any kernel function may leave uninitialized data on its stack frame. The
> poison scanning must handle that correctly. The uninitialized groups of
> poison values must be smaller than the search depth, otherwise
> 'stackleak_erase()' is unreliable.

I had understood this, but I had understood that for a caller->callee pair,
*something* would be pushed onto the stack. On arm64 that'd be a frame record
in the caller's stack frame, and on x86 that would be the return address
between the stack frames of the caller and callee. Since any unrecorded frame
is less than CONFIG_STACKLEAK_TRACK_MIN_SIZE, the non-poison bytes would fall
within CONFIG_STACKLEAK_TRACK_MIN_SIZE bytes.

There is a potential edge case on arm64, since the frame record is permitted to
be placed anywhere within the stack frame, and in theory it could be placed
high on the caller and low on the callee. If we wanted to handle that, we'd
need to scan 2 times the tracking size. In practice compilers consistently
place the frame record at one end (usually the low end, as part of constructing
the frame).

> So with this BUILD_BUG_ON I control that
>   CONFIG_STACKLEAK_TRACK_MIN_SIZE <= STACKLEAK_SEARCH_DEPTH.
> 
> To be sure and avoid mistakes in the edge cases, 'stackleak_erase()' is
> searching one poison value more than STACKLEAK_SEARCH_DEPTH.

Just to check my understanding, did you have a specific edge-case in mind, or
was that "just in case"?

It would be really nice if we had an example.

> If you don't like this one additional poison value in the search, I would
> propose to change the assertion.

If we clearly document *why*, then changing the assertion is fine by me.
However, as above I don't think that this is necessary.

As an aside, why is it possible to configure CONFIG_STACKLEAK_TRACK_MIN_SIZE to
be bigger than STACKLEAK_SEARCH_DEPTH in the first place?

In security/Kconfig.hardening we have:

| config STACKLEAK_TRACK_MIN_SIZE
|         int "Minimum stack frame size of functions tracked by STACKLEAK"
|         default 100 
|         range 0 4096
|         depends on GCC_PLUGIN_STACKLEAK
|         help
|           The STACKLEAK gcc plugin instruments the kernel code for tracking
|           the lowest border of the kernel stack (and for some other purposes).
|           It inserts the stackleak_track_stack() call for the functions with
|           a stack frame size greater than or equal to this parameter.
|           If unsure, leave the default value 100.

... where the vast majority of that range is going to lead to a BUILD_BUG().

Thanks,
Mark.

> What do you think?
> 
> > > >     As this is included within the region to erase, this causes us to
> > > >     redundantly overwrite an additional unsigned long with poison.
> > > > 
> > > > * As we always decrement 'erase_low' after checking an element on the
> > > >     stack, we always include the element below this within the region to
> > > >     erase.
> > > > 
> > > >     As this is included within the region to erase, this causes us to
> > > >     redundantly overwrite an additional unsigned long with poison.
> > > > 
> > > >     Note that this is not a functional problem. As the loop condition
> > > >     checks 'erase_low > task_stack_low', we'll never clobber the
> > > >     STACK_END_MAGIC. As we always decrement 'erase_low' after this, we'll
> > > >     never fail to erase the element immediately above the STACK_END_MAGIC.
> > > 
> > > Right, I don't see any bug in the current erasing code.
> > > 
> > > When I wrote the current code, I carefully checked all the corner cases. For
> > > example, on the first stack erasing, the STACK_END_MAGIC was not
> > > overwritten, but the memory next to it was erased. Same for the beginning of
> > > the stack: I carefully checked that no unpoisoned bytes were left on the
> > > thread stack.
> > > 
> > > > In total, this can cause us to erase `128 + 2 * sizeof(unsigned long)`
> > > > bytes more than necessary, which is unfortunate.
> > > > 
> > > > This patch reworks the logic to find the address immediately above the
> > > > poisoned region, by finding the lowest non-poisoned address. This is
> > > > factored into a stackleak_find_top_of_poison() helper both for clarity
> > > > and so that this can be shared with the LKDTM test in subsequent
> > > > patches.
> > > 
> > > You know, I wrote stackleak_erase() in very plain C. I wanted a compiler to
> > > generate assembly that is very close to the original asm version. I worried
> > > that compilers might do weird stuff with the local variables and the stack
> > > pointer.
> > > 
> > > So I checked stackleak for gcc versions 4.8, 5, 6, 7, 8, 9, and 10 on
> > > x86_64, i386 and arm64. This is my project that helped with this work:
> > > https://github.com/a13xp0p0v/kernel-build-containers
> > 
> > I've used the kernel.org cross toolchains, as published at:
> > 
> >    https://mirrors.edge.kernel.org/pub/tools/crosstool/
> > 
> > > Mark, in this patch series you use many local variables and helper functions.
> > > Honestly, this worries me. For example, compilers can (and usually do)
> > > ignore the presence of the 'inline' specifier for the purpose of
> > > optimization.
> > 
> > I've deliberately used `__always_inline` rather than regular `inline` to
> > prevent code being placed out-of-line. As mentioned in oether replies it has a
> > stronger semantic.
> > 
> > Thanks,
> > Mark.
> > 
> > > 
> > > Thanks!
> > > 
> > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > > > Cc: Alexander Popov <alex.popov@linux.com>
> > > > Cc: Andrew Morton <akpm@linux-foundation.org>
> > > > Cc: Andy Lutomirski <luto@kernel.org>
> > > > Cc: Kees Cook <keescook@chromium.org>
> > > > ---
> > > >    include/linux/stackleak.h | 26 ++++++++++++++++++++++++++
> > > >    kernel/stackleak.c        | 18 ++++--------------
> > > >    2 files changed, 30 insertions(+), 14 deletions(-)
> > > > 
> > > > diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
> > > > index 467661aeb4136..c36e7a3b45e7e 100644
> > > > --- a/include/linux/stackleak.h
> > > > +++ b/include/linux/stackleak.h
> > > > @@ -42,6 +42,32 @@ stackleak_task_high_bound(const struct task_struct *tsk)
> > > >    	return (unsigned long)task_pt_regs(tsk);
> > > >    }
> > > > +/*
> > > > + * Find the address immediately above the poisoned region of the stack, where
> > > > + * that region falls between 'low' (inclusive) and 'high' (exclusive).
> > > > + */
> > > > +static __always_inline unsigned long
> > > > +stackleak_find_top_of_poison(const unsigned long low, const unsigned long high)
> > > > +{
> > > > +	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
> > > > +	unsigned int poison_count = 0;
> > > > +	unsigned long poison_high = high;
> > > > +	unsigned long sp = high;
> > > > +
> > > > +	while (sp > low && poison_count < depth) {
> > > > +		sp -= sizeof(unsigned long);
> > > > +
> > > > +		if (*(unsigned long *)sp == STACKLEAK_POISON) {
> > > > +			poison_count++;
> > > > +		} else {
> > > > +			poison_count = 0;
> > > > +			poison_high = sp;
> > > > +		}
> > > > +	}
> > > > +
> > > > +	return poison_high;
> > > > +}
> > > > +
> > > >    static inline void stackleak_task_init(struct task_struct *t)
> > > >    {
> > > >    	t->lowest_stack = stackleak_task_low_bound(t);
> > > > diff --git a/kernel/stackleak.c b/kernel/stackleak.c
> > > > index ba346d46218f5..afd54b8e10b83 100644
> > > > --- a/kernel/stackleak.c
> > > > +++ b/kernel/stackleak.c
> > > > @@ -74,20 +74,10 @@ static __always_inline void __stackleak_erase(void)
> > > >    {
> > > >    	const unsigned long task_stack_low = stackleak_task_low_bound(current);
> > > >    	const unsigned long task_stack_high = stackleak_task_high_bound(current);
> > > > -	unsigned long erase_low = current->lowest_stack;
> > > > -	unsigned long erase_high;
> > > > -	unsigned int poison_count = 0;
> > > > -	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
> > > > -
> > > > -	/* Search for the poison value in the kernel stack */
> > > > -	while (erase_low > task_stack_low && poison_count <= depth) {
> > > > -		if (*(unsigned long *)erase_low == STACKLEAK_POISON)
> > > > -			poison_count++;
> > > > -		else
> > > > -			poison_count = 0;
> > > > -
> > > > -		erase_low -= sizeof(unsigned long);
> > > > -	}
> > > > +	unsigned long erase_low, erase_high;
> > > > +
> > > > +	erase_low = stackleak_find_top_of_poison(task_stack_low,
> > > > +						 current->lowest_stack);
> > > >    #ifdef CONFIG_STACKLEAK_METRICS
> > > >    	current->prev_lowest_stack = erase_low;
> > > 
> 

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 03/13] stackleak: remove redundant check
  2022-05-24 10:03                 ` Mark Rutland
@ 2022-05-26 22:09                   ` Alexander Popov
  0 siblings, 0 replies; 47+ messages in thread
From: Alexander Popov @ 2022-05-26 22:09 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Kees Cook, linux-arm-kernel, akpm, catalin.marinas, linux-kernel,
	luto, will

On 24.05.2022 13:03, Mark Rutland wrote:
> On Sun, May 15, 2022 at 07:17:16PM +0300, Alexander Popov wrote:
>> On 12.05.2022 12:14, Mark Rutland wrote:
>>> On Wed, May 11, 2022 at 07:44:41AM -0700, Kees Cook wrote:
>>>>
>>>>
>>>> On May 11, 2022 1:02:45 AM PDT, Mark Rutland <mark.rutland@arm.com> wrote:
>>>>> On Tue, May 10, 2022 at 08:00:38PM -0700, Kees Cook wrote:
>>>>>> On Tue, May 10, 2022 at 12:46:48PM +0100, Mark Rutland wrote:
>>>>>>> On Sun, May 08, 2022 at 09:17:01PM +0300, Alexander Popov wrote:
>>>>>>>> On 27.04.2022 20:31, Mark Rutland wrote:
>>>>>>>>> In __stackleak_erase() we check that the `erase_low` value derived from
>>>>>>>>> `current->lowest_stack` is above the lowest legitimate stack pointer
>>>>>>>>> value, but this is already enforced by stackleak_track_stack() when
>>>>>>>>> recording the lowest stack value.
>>>>>>>>>
>>>>>>>>> Remove the redundant check.
>>>>>>>>>
>>>>>>>>> There should be no functional change as a result of this patch.
>>>>>>>>
>>>>>>>> Mark, I can't agree here. I think this check is important.
>>>>>>>> The performance profit from dropping it is less than the confidence decrease :)
>>>>>>>>
>>>>>>>> With this check, if the 'lowest_stack' value is corrupted, stackleak doesn't
>>>>>>>> overwrite some wrong kernel memory, but simply clears the whole thread
>>>>>>>> stack, which is safe behavior.
>>>>>>>
>>>>>>> If you feel strongly about it, I can restore the check, but I struggle to
>>>>>>> believe that it's worthwhile. The `lowest_stack` value lives in the
>>>>>>> task_struct, and if you have the power to corrupt that you have the power to do
>>>>>>> much more interesting things.
>>>>>>>
>>>>>>> If we do restore it, I'd like to add a big fat comment explaining the
>>>>>>> rationale (i.e. that it only matter if someone could corrupt
>>>>>>> `current->lowest_stack`, as otherwise that's guarnateed to be within bounds).
>>>>>>
>>>>>> Yeah, let's restore it and add the comment. While I do agree it's likely
>>>>>> that such an corruption would likely mean an attacker had significant
>>>>>> control over kernel memory already, it is not uncommon that an attack
>>>>>> only has a limited index from a given address, etc. Or some manipulation
>>>>>> is possible via weird gadgets, etc. It's unlikely, but not impossible,
>>>>>> and a bounds-check for that value is cheap compared to the rest of the
>>>>>> work happening. :)
>>>>>
>>>>> Fair enough; I can go spin a patch restoring this. I'm somewhat unhappy with
>>>>> silently fixing that up, though -- IMO it'd be better to BUG() or similar in
>>>>> that case.
>>>>
>>>> I share your desires, and this was exactly what Alexander originally proposed, but Linus rejected it violently. :(
>>>> https://lore.kernel.org/lkml/CA+55aFy6jNLsywVYdGp83AMrXBo_P-pkjkphPGrO=82SPKCpLQ@mail.gmail.com/
>>>
>>> I see. :/
>>>
>>> Thinking about this some more, if we assume someone can corrupt *some* word of
>>> memory, then we need to consider that instead of corrupting
>>> task_struct::lowest_stack, they could corrupt task_struct::stack (or x86's
>>> cpu_current_top_of_stack prior to this series).
>>>
>>> With that in mind, if we detect that task_struct::lowest_stack is
>>> out-of-bounds, we have no idea whether it has been corrupted or the other bound
>>> values have been corrupted, and so we can't do the erase safely anyway.
>>
>> :)
>>
>> IMO, even if a kernel thread stack is moved somewhere for any weird reason,
>> stackleak must erase it at the end of syscall and do its job.
> 
> I'm not talking about the stack being *moved*. I'm talking about the pointers
> to it being *corrupted* (wince we use those to determine the bounds).
> 
> The problem is that we don't have a single source of truth here that we can
> rely upon.
> 
> We're stuck between a dichotomy:
> 
> * If we assume an attacker *can* corrupt a word of memory, they can corrupt any
>    of the in-memory values we use to find the stack in the first place. If we
>    detect a mismatch we cannot know which is bad, and if the attacker can
>    corrupt the one(s) we choose to blindly trust, then they can weaponize the
>    erasing code to corrupt memory.
> 
>    That's *worse* than the info leak problem stackleak was originally trying to
>    solve.
> 
>    See below for one way we could avoid that.
> 
> * If we assume the attacker *cannot* corrupt a word of memory, then we know the
>    values must always be within bounds, and there's no need for the check.

Mark, thanks for your reply.

Yes, I agree with this.

>>> So AFAICT we must *avoid* erasing when that goes wrong. Maybe we could WARN()
>>> instead of BUG()?
>>
>> Mark, I think security features must not go out of service.
>>
>> The 'lowest_stack' value is for making stackleak faster. I believe if the
>> 'lowest_stack' value is invalid, stackleak must not skip its main job and
>> should erase the whole kernel thread stack.
> 
> My point is that the conditions which permit `lowest_stack` to become invalid
> (e.g. an attacker having an arbitrary or constrained write gadget) also permit
> all the other stack boundariy values to become invalid.
> 
> If we detect `lowest_stack` is out of bounds, we have no idea which of
> `lowest_stack` or the bounds are corrupt -- so we *cannot* safely erase: if the
> bounds are corrupt we'll corrupt arbitrary memory.
> 
> We *could* do better by always deriving the bounds from an SP value (current
> for on-stack, passed in by asm for off-stack). If we did that, we could more
> reasonably treat the bounds values as more reliable than the `lowest_stack`
> value, and with that I'd be happy with the bounds check (though I still think
> we want to make this WARN()).

I would prefer erasing anyway over WARN_ON() + skipping it.

Ok, let's simply hope that the `lowest_stack` is correct. Let's erase the stack 
without additional checks.

>> When I developed 'stackleak_erase()' I tried adding 'WARN_ON()', but it
>> didn't work properly there, as I remember. Warning handling code is very
>> complex. So I dropped that fragile part.
> 
> Do you recall any specific problem, or just that there were problems?

As I remember, the kernel was hanging without printing full warning.

> I ask because the entry code, and handling of BUG() and WARN() has changed
> quite a bit over the last couple of years. We've fixed some latent issues
> there, though IIUC this late in the exception return flow there are still some
> potential issues with the RCU/lockdep/etc context that would need to be
> saved/restored.
> 
> We need to solve some of that in general anyuway, since there are other BUG()
> and WARN() instances hidden in noinstr entry code. I'm happy to dig into that
> (time permitting).
> 
> Thanks,
> Mark.


^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/13] stackleak: rework poison scanning
  2022-05-24 13:31         ` Mark Rutland
@ 2022-05-26 23:25           ` Alexander Popov
  2022-05-31 18:13             ` Kees Cook
  0 siblings, 1 reply; 47+ messages in thread
From: Alexander Popov @ 2022-05-26 23:25 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, akpm, catalin.marinas, keescook, linux-kernel,
	luto, will

On 24.05.2022 16:31, Mark Rutland wrote:
> On Sun, May 15, 2022 at 08:33:01PM +0300, Alexander Popov wrote:
>> On 10.05.2022 16:13, Mark Rutland wrote:
>>> On Mon, May 09, 2022 at 04:51:35PM +0300, Alexander Popov wrote:
>>>> Hello Mark!
>>>>
>>>> On 27.04.2022 20:31, Mark Rutland wrote:
>>>>> Currently we over-estimate the region of stack which must be erased.
>>>>>
>>>>> To determine the region to be erased, we scan downards for a contiguous
>>>>> block of poison values (or the low bound of the stack). There are a few
>>>>> minor problems with this today:
>>>>>
>>>>> * When we find a block of poison values, we include this block within
>>>>>      the region to erase.
>>>>>
>>>>>      As this is included within the region to erase, this causes us to
>>>>>      redundantly overwrite 'STACKLEAK_SEARCH_DEPTH' (128) bytes with
>>>>>      poison.
>>>>
>>>> Right, this can be improved.
>>>>
>>>>> * As the loop condition checks 'poison_count <= depth', it will run an
>>>>>      additional iteration after finding the contiguous block of poison,
>>>>>      decrementing 'erase_low' once more than necessary.
>>>>
>>>> Actually, I think the current code is correct.
>>>>
>>>> I'm intentionally searching one poison value more than
>>>> STACKLEAK_SEARCH_DEPTH to avoid the corner case. See the BUILD_BUG_ON
>>>> assertion in stackleak_track_stack() that describes it:
>>>>
>>>> /*
>>>>    * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
>>>>    * STACKLEAK_SEARCH_DEPTH makes the poison search in
>>>>    * stackleak_erase() unreliable. Let's prevent that.
>>>>    */
>>>> BUILD_BUG_ON(CONFIG_STACKLEAK_TRACK_MIN_SIZE > STACKLEAK_SEARCH_DEPTH);
>>>
>>> I had read that, but as written that doesn't imply that it's necessary to scan
>>> one more element than STACKLEAK_SEARCH_DEPTH, nor why. I'm more than happy to
>>> change the logic, but I think we need a very clear explanation as to why we
>>> need to scan the specific number of bytes we scan, and we should account for
>>> that *within* STACKLEAK_SEARCH_DEPTH for clarity.
>>
>> I'll try to explain.
>>
>> The stackleak gcc plugin instruments the kernel code inserting the
>> 'stackleak_track_stack()' calls for the functions with a stack frame size
>> greater than or equal to CONFIG_STACKLEAK_TRACK_MIN_SIZE.
>>
>> The kernel functions with a smaller stack frame are not instrumented (the
>> 'lowest_stack' value is not updated for them).
> 
> I understood these points.
> 
> It's also worth noting that `noinstr` code will also not be instrumented
> regardless of frame size -- we might want some build-time assertion for those.

I developed a trick that shows noinstr functions that stackleak would like to instrument:

diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
index 42f0252ee2a4..6db748d44957 100644
--- a/scripts/gcc-plugins/stackleak_plugin.c
+++ b/scripts/gcc-plugins/stackleak_plugin.c
@@ -397,6 +397,9 @@ static unsigned int stackleak_cleanup_execute(void)
  	const char *fn = DECL_NAME_POINTER(current_function_decl);
  	bool removed = false;

+	if (verbose)
+		fprintf(stderr, "stackleak: I see noinstr function %s()\n", fn);
+
  	/*
  	 * Leave stack tracking in functions that call alloca().
  	 * Additional case:
@@ -464,12 +467,12 @@ static bool stackleak_gate(void)
  		if (STRING_EQUAL(section, ".meminit.text"))
  			return false;
  		if (STRING_EQUAL(section, ".noinstr.text"))
-			return false;
+			return true;
  		if (STRING_EQUAL(section, ".entry.text"))
  			return false;
  	}

-	return track_frame_size >= 0;
+	return false;
  }

  /* Build the function declaration for stackleak_track_stack() */
@@ -589,8 +592,6 @@ __visible int plugin_init(struct plugin_name_args *plugin_info,
  				build_for_x86 = true;
  		} else if (!strcmp(argv[i].key, "disable")) {
  			disable = true;
-		} else if (!strcmp(argv[i].key, "verbose")) {
-			verbose = true;
  		} else {
  			error(G_("unknown option '-fplugin-arg-%s-%s'"),
  					plugin_name, argv[i].key);
@@ -598,6 +599,8 @@ __visible int plugin_init(struct plugin_name_args *plugin_info,
  		}
  	}

+	verbose = true;
+
  	if (disable) {
  		if (verbose)
  			fprintf(stderr, "stackleak: disabled for this translation unit\n");


Building defconfig for x86_64 gives this:

stackleak: I see noinstr function __do_fast_syscall_32()
stackleak: instrument __do_fast_syscall_32(): calls_alloca
--
stackleak: I see noinstr function do_syscall_64()
stackleak: instrument do_syscall_64(): calls_alloca
--
stackleak: I see noinstr function do_int80_syscall_32()
stackleak: instrument do_int80_syscall_32(): calls_alloca
--
stackleak: I see noinstr function do_machine_check()
stackleak: instrument do_machine_check()
--
stackleak: I see noinstr function exc_general_protection()
stackleak: instrument exc_general_protection()
--
stackleak: I see noinstr function fixup_bad_iret()
stackleak: instrument fixup_bad_iret()


The cases with calls_alloca are caused by CONFIG_RANDOMIZE_KSTACK_OFFSET=y.
Kees knows about that peculiarity.

Other cases are noinstr functions with large stack frame:
do_machine_check(), exc_general_protection(), and fixup_bad_iret().

I think adding a build-time assertion is not possible, since it would break
building the kernel.

>> Any kernel function may leave uninitialized data on its stack frame. The
>> poison scanning must handle that correctly. The uninitialized groups of
>> poison values must be smaller than the search depth, otherwise
>> 'stackleak_erase()' is unreliable.
> 
> I had understood this, but I had understood that for a caller->callee pair,
> *something* would be pushed onto the stack. On arm64 that'd be a frame record
> in the caller's stack frame, and on x86 that would be the return address
> between the stack frames of the caller and callee. Since any unrecorded frame
> is less than CONFIG_STACKLEAK_TRACK_MIN_SIZE, the non-poison bytes would fall
> within CONFIG_STACKLEAK_TRACK_MIN_SIZE bytes.

Yes, exactly.

> There is a potential edge case on arm64, since the frame record is permitted to
> be placed anywhere within the stack frame, and in theory it could be placed
> high on the caller and low on the callee. If we wanted to handle that, we'd
> need to scan 2 times the tracking size. In practice compilers consistently
> place the frame record at one end (usually the low end, as part of constructing
> the frame).

Good to hear that.

>> So with this BUILD_BUG_ON I control that
>>    CONFIG_STACKLEAK_TRACK_MIN_SIZE <= STACKLEAK_SEARCH_DEPTH.
>>
>> To be sure and avoid mistakes in the edge cases, 'stackleak_erase()' is
>> searching one poison value more than STACKLEAK_SEARCH_DEPTH.
> 
> Just to check my understanding, did you have a specific edge-case in mind, or
> was that "just in case"?

I didn't have a specific edge-case, but I wanted to avoid possible weird problems.

> It would be really nice if we had an example.
> 
>> If you don't like this one additional poison value in the search, I would
>> propose to change the assertion.
> 
> If we clearly document *why*, then changing the assertion is fine by me.
> However, as above I don't think that this is necessary.
> 
> As an aside, why is it possible to configure CONFIG_STACKLEAK_TRACK_MIN_SIZE to
> be bigger than STACKLEAK_SEARCH_DEPTH in the first place?
> 
> In security/Kconfig.hardening we have:
> 
> | config STACKLEAK_TRACK_MIN_SIZE
> |         int "Minimum stack frame size of functions tracked by STACKLEAK"
> |         default 100
> |         range 0 4096
> |         depends on GCC_PLUGIN_STACKLEAK
> |         help
> |           The STACKLEAK gcc plugin instruments the kernel code for tracking
> |           the lowest border of the kernel stack (and for some other purposes).
> |           It inserts the stackleak_track_stack() call for the functions with
> |           a stack frame size greater than or equal to this parameter.
> |           If unsure, leave the default value 100.
> 
> ... where the vast majority of that range is going to lead to a BUILD_BUG().

Honestly, I don't like the idea of having the STACKLEAK_TRACK_MIN_SIZE option in the Kconfig.

I was forced by the maintainers to introduce it when I was working on the stackleak patchset.

How about dropping CONFIG_STACKLEAK_TRACK_MIN_SIZE from Kconfig?

That would also allow to drop this build-time assertion.

>> What do you think?
>>
>>>>>      As this is included within the region to erase, this causes us to
>>>>>      redundantly overwrite an additional unsigned long with poison.
>>>>>
>>>>> * As we always decrement 'erase_low' after checking an element on the
>>>>>      stack, we always include the element below this within the region to
>>>>>      erase.
>>>>>
>>>>>      As this is included within the region to erase, this causes us to
>>>>>      redundantly overwrite an additional unsigned long with poison.
>>>>>
>>>>>      Note that this is not a functional problem. As the loop condition
>>>>>      checks 'erase_low > task_stack_low', we'll never clobber the
>>>>>      STACK_END_MAGIC. As we always decrement 'erase_low' after this, we'll
>>>>>      never fail to erase the element immediately above the STACK_END_MAGIC.
>>>>
>>>> Right, I don't see any bug in the current erasing code.
>>>>
>>>> When I wrote the current code, I carefully checked all the corner cases. For
>>>> example, on the first stack erasing, the STACK_END_MAGIC was not
>>>> overwritten, but the memory next to it was erased. Same for the beginning of
>>>> the stack: I carefully checked that no unpoisoned bytes were left on the
>>>> thread stack.
>>>>
>>>>> In total, this can cause us to erase `128 + 2 * sizeof(unsigned long)`
>>>>> bytes more than necessary, which is unfortunate.
>>>>>
>>>>> This patch reworks the logic to find the address immediately above the
>>>>> poisoned region, by finding the lowest non-poisoned address. This is
>>>>> factored into a stackleak_find_top_of_poison() helper both for clarity
>>>>> and so that this can be shared with the LKDTM test in subsequent
>>>>> patches.
>>>>
>>>> You know, I wrote stackleak_erase() in very plain C. I wanted a compiler to
>>>> generate assembly that is very close to the original asm version. I worried
>>>> that compilers might do weird stuff with the local variables and the stack
>>>> pointer.
>>>>
>>>> So I checked stackleak for gcc versions 4.8, 5, 6, 7, 8, 9, and 10 on
>>>> x86_64, i386 and arm64. This is my project that helped with this work:
>>>> https://github.com/a13xp0p0v/kernel-build-containers
>>>
>>> I've used the kernel.org cross toolchains, as published at:
>>>
>>>     https://mirrors.edge.kernel.org/pub/tools/crosstool/
>>>
>>>> Mark, in this patch series you use many local variables and helper functions.
>>>> Honestly, this worries me. For example, compilers can (and usually do)
>>>> ignore the presence of the 'inline' specifier for the purpose of
>>>> optimization.
>>>
>>> I've deliberately used `__always_inline` rather than regular `inline` to
>>> prevent code being placed out-of-line. As mentioned in oether replies it has a
>>> stronger semantic.
>>>
>>> Thanks,
>>> Mark.
>>>
>>>>
>>>> Thanks!
>>>>
>>>>> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
>>>>> Cc: Alexander Popov <alex.popov@linux.com>
>>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>>> Cc: Andy Lutomirski <luto@kernel.org>
>>>>> Cc: Kees Cook <keescook@chromium.org>
>>>>> ---
>>>>>     include/linux/stackleak.h | 26 ++++++++++++++++++++++++++
>>>>>     kernel/stackleak.c        | 18 ++++--------------
>>>>>     2 files changed, 30 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
>>>>> index 467661aeb4136..c36e7a3b45e7e 100644
>>>>> --- a/include/linux/stackleak.h
>>>>> +++ b/include/linux/stackleak.h
>>>>> @@ -42,6 +42,32 @@ stackleak_task_high_bound(const struct task_struct *tsk)
>>>>>     	return (unsigned long)task_pt_regs(tsk);
>>>>>     }
>>>>> +/*
>>>>> + * Find the address immediately above the poisoned region of the stack, where
>>>>> + * that region falls between 'low' (inclusive) and 'high' (exclusive).
>>>>> + */
>>>>> +static __always_inline unsigned long
>>>>> +stackleak_find_top_of_poison(const unsigned long low, const unsigned long high)
>>>>> +{
>>>>> +	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>>>>> +	unsigned int poison_count = 0;
>>>>> +	unsigned long poison_high = high;
>>>>> +	unsigned long sp = high;
>>>>> +
>>>>> +	while (sp > low && poison_count < depth) {
>>>>> +		sp -= sizeof(unsigned long);
>>>>> +
>>>>> +		if (*(unsigned long *)sp == STACKLEAK_POISON) {
>>>>> +			poison_count++;
>>>>> +		} else {
>>>>> +			poison_count = 0;
>>>>> +			poison_high = sp;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +	return poison_high;
>>>>> +}
>>>>> +
>>>>>     static inline void stackleak_task_init(struct task_struct *t)
>>>>>     {
>>>>>     	t->lowest_stack = stackleak_task_low_bound(t);
>>>>> diff --git a/kernel/stackleak.c b/kernel/stackleak.c
>>>>> index ba346d46218f5..afd54b8e10b83 100644
>>>>> --- a/kernel/stackleak.c
>>>>> +++ b/kernel/stackleak.c
>>>>> @@ -74,20 +74,10 @@ static __always_inline void __stackleak_erase(void)
>>>>>     {
>>>>>     	const unsigned long task_stack_low = stackleak_task_low_bound(current);
>>>>>     	const unsigned long task_stack_high = stackleak_task_high_bound(current);
>>>>> -	unsigned long erase_low = current->lowest_stack;
>>>>> -	unsigned long erase_high;
>>>>> -	unsigned int poison_count = 0;
>>>>> -	const unsigned int depth = STACKLEAK_SEARCH_DEPTH / sizeof(unsigned long);
>>>>> -
>>>>> -	/* Search for the poison value in the kernel stack */
>>>>> -	while (erase_low > task_stack_low && poison_count <= depth) {
>>>>> -		if (*(unsigned long *)erase_low == STACKLEAK_POISON)
>>>>> -			poison_count++;
>>>>> -		else
>>>>> -			poison_count = 0;
>>>>> -
>>>>> -		erase_low -= sizeof(unsigned long);
>>>>> -	}
>>>>> +	unsigned long erase_low, erase_high;
>>>>> +
>>>>> +	erase_low = stackleak_find_top_of_poison(task_stack_low,
>>>>> +						 current->lowest_stack);
>>>>>     #ifdef CONFIG_STACKLEAK_METRICS
>>>>>     	current->prev_lowest_stack = erase_low;
>>>>
>>


^ permalink raw reply related	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/13] stackleak: rework poison scanning
  2022-05-26 23:25           ` Alexander Popov
@ 2022-05-31 18:13             ` Kees Cook
  2022-06-03 16:55               ` Alexander Popov
  0 siblings, 1 reply; 47+ messages in thread
From: Kees Cook @ 2022-05-31 18:13 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Mark Rutland, linux-arm-kernel, akpm, catalin.marinas,
	linux-kernel, luto, will

On Fri, May 27, 2022 at 02:25:12AM +0300, Alexander Popov wrote:
> On 24.05.2022 16:31, Mark Rutland wrote:
> > [...]
> > It's also worth noting that `noinstr` code will also not be instrumented
> > regardless of frame size -- we might want some build-time assertion for those.
> 
> I developed a trick that shows noinstr functions that stackleak would like to instrument:
> 
> diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
> index 42f0252ee2a4..6db748d44957 100644
> --- a/scripts/gcc-plugins/stackleak_plugin.c
> +++ b/scripts/gcc-plugins/stackleak_plugin.c
> @@ -397,6 +397,9 @@ static unsigned int stackleak_cleanup_execute(void)
>  	const char *fn = DECL_NAME_POINTER(current_function_decl);
>  	bool removed = false;
> 
> +	if (verbose)
> +		fprintf(stderr, "stackleak: I see noinstr function %s()\n", fn);
> +
>  	/*
>  	 * Leave stack tracking in functions that call alloca().
>  	 * Additional case:
> @@ -464,12 +467,12 @@ static bool stackleak_gate(void)
>  		if (STRING_EQUAL(section, ".meminit.text"))
>  			return false;
>  		if (STRING_EQUAL(section, ".noinstr.text"))
> -			return false;
> +			return true;
>  		if (STRING_EQUAL(section, ".entry.text"))
>  			return false;
>  	}
> 
> -	return track_frame_size >= 0;
> +	return false;
>  }
> 
>  /* Build the function declaration for stackleak_track_stack() */
> @@ -589,8 +592,6 @@ __visible int plugin_init(struct plugin_name_args *plugin_info,
>  				build_for_x86 = true;
>  		} else if (!strcmp(argv[i].key, "disable")) {
>  			disable = true;
> -		} else if (!strcmp(argv[i].key, "verbose")) {
> -			verbose = true;
>  		} else {
>  			error(G_("unknown option '-fplugin-arg-%s-%s'"),
>  					plugin_name, argv[i].key);
> @@ -598,6 +599,8 @@ __visible int plugin_init(struct plugin_name_args *plugin_info,
>  		}
>  	}
> 
> +	verbose = true;
> +
>  	if (disable) {
>  		if (verbose)
>  			fprintf(stderr, "stackleak: disabled for this translation unit\n");
> 
> 
> Building defconfig for x86_64 gives this:
> 
> stackleak: I see noinstr function __do_fast_syscall_32()
> stackleak: instrument __do_fast_syscall_32(): calls_alloca
> --
> stackleak: I see noinstr function do_syscall_64()
> stackleak: instrument do_syscall_64(): calls_alloca
> --
> stackleak: I see noinstr function do_int80_syscall_32()
> stackleak: instrument do_int80_syscall_32(): calls_alloca

As you say, these are from RANDOMIZE_KSTACK_OFFSET, and are around
bounds-checked, and should already be getting wiped since these will
call into deeper (non-noinst) functions.

> stackleak: I see noinstr function do_machine_check()
> stackleak: instrument do_machine_check()
> --
> stackleak: I see noinstr function exc_general_protection()
> stackleak: instrument exc_general_protection()
> --
> stackleak: I see noinstr function fixup_bad_iret()
> stackleak: instrument fixup_bad_iret()
> 
> 
> The cases with calls_alloca are caused by CONFIG_RANDOMIZE_KSTACK_OFFSET=y.
> Kees knows about that peculiarity.
> 
> Other cases are noinstr functions with large stack frame:
> do_machine_check(), exc_general_protection(), and fixup_bad_iret().
> 
> I think adding a build-time assertion is not possible, since it would break
> building the kernel.

Do these functions share the syscall behavior of always calling into
non-noinst functions that _do_ have stack depth instrumentation?

> [...]
> > In security/Kconfig.hardening we have:
> > 
> > | config STACKLEAK_TRACK_MIN_SIZE
> > |         int "Minimum stack frame size of functions tracked by STACKLEAK"
> > |         default 100
> > |         range 0 4096
> > |         depends on GCC_PLUGIN_STACKLEAK
> > |         help
> > |           The STACKLEAK gcc plugin instruments the kernel code for tracking
> > |           the lowest border of the kernel stack (and for some other purposes).
> > |           It inserts the stackleak_track_stack() call for the functions with
> > |           a stack frame size greater than or equal to this parameter.
> > |           If unsure, leave the default value 100.
> > 
> > ... where the vast majority of that range is going to lead to a BUILD_BUG().
> 
> Honestly, I don't like the idea of having the STACKLEAK_TRACK_MIN_SIZE option in the Kconfig.
> 
> I was forced by the maintainers to introduce it when I was working on the stackleak patchset.
> 
> How about dropping CONFIG_STACKLEAK_TRACK_MIN_SIZE from Kconfig?
> 
> That would also allow to drop this build-time assertion.

Should this be arch-specific? (i.e. just make it a per-arch Kconfig
default, instead of user-selectable into weird values?)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 47+ messages in thread

* Re: [PATCH v2 07/13] stackleak: rework poison scanning
  2022-05-31 18:13             ` Kees Cook
@ 2022-06-03 16:55               ` Alexander Popov
  0 siblings, 0 replies; 47+ messages in thread
From: Alexander Popov @ 2022-06-03 16:55 UTC (permalink / raw)
  To: Kees Cook
  Cc: Mark Rutland, linux-arm-kernel, akpm, catalin.marinas,
	linux-kernel, luto, will

On 31.05.2022 21:13, Kees Cook wrote:
> On Fri, May 27, 2022 at 02:25:12AM +0300, Alexander Popov wrote:
>> On 24.05.2022 16:31, Mark Rutland wrote:
>>> [...]
>>> It's also worth noting that `noinstr` code will also not be instrumented
>>> regardless of frame size -- we might want some build-time assertion for those.
>>
>> I developed a trick that shows noinstr functions that stackleak would like to instrument:
>>
>> diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
>> index 42f0252ee2a4..6db748d44957 100644
>> --- a/scripts/gcc-plugins/stackleak_plugin.c
>> +++ b/scripts/gcc-plugins/stackleak_plugin.c
>> @@ -397,6 +397,9 @@ static unsigned int stackleak_cleanup_execute(void)
>>   	const char *fn = DECL_NAME_POINTER(current_function_decl);
>>   	bool removed = false;
>>
>> +	if (verbose)
>> +		fprintf(stderr, "stackleak: I see noinstr function %s()\n", fn);
>> +
>>   	/*
>>   	 * Leave stack tracking in functions that call alloca().
>>   	 * Additional case:
>> @@ -464,12 +467,12 @@ static bool stackleak_gate(void)
>>   		if (STRING_EQUAL(section, ".meminit.text"))
>>   			return false;
>>   		if (STRING_EQUAL(section, ".noinstr.text"))
>> -			return false;
>> +			return true;
>>   		if (STRING_EQUAL(section, ".entry.text"))
>>   			return false;
>>   	}
>>
>> -	return track_frame_size >= 0;
>> +	return false;
>>   }
>>
>>   /* Build the function declaration for stackleak_track_stack() */
>> @@ -589,8 +592,6 @@ __visible int plugin_init(struct plugin_name_args *plugin_info,
>>   				build_for_x86 = true;
>>   		} else if (!strcmp(argv[i].key, "disable")) {
>>   			disable = true;
>> -		} else if (!strcmp(argv[i].key, "verbose")) {
>> -			verbose = true;
>>   		} else {
>>   			error(G_("unknown option '-fplugin-arg-%s-%s'"),
>>   					plugin_name, argv[i].key);
>> @@ -598,6 +599,8 @@ __visible int plugin_init(struct plugin_name_args *plugin_info,
>>   		}
>>   	}
>>
>> +	verbose = true;
>> +
>>   	if (disable) {
>>   		if (verbose)
>>   			fprintf(stderr, "stackleak: disabled for this translation unit\n");
>>
>>
>> Building defconfig for x86_64 gives this:
>>
>> stackleak: I see noinstr function __do_fast_syscall_32()
>> stackleak: instrument __do_fast_syscall_32(): calls_alloca
>> --
>> stackleak: I see noinstr function do_syscall_64()
>> stackleak: instrument do_syscall_64(): calls_alloca
>> --
>> stackleak: I see noinstr function do_int80_syscall_32()
>> stackleak: instrument do_int80_syscall_32(): calls_alloca
> 
> As you say, these are from RANDOMIZE_KSTACK_OFFSET, and are around
> bounds-checked, and should already be getting wiped since these will
> call into deeper (non-noinst) functions.

Kees, it crossed my mind that for correct stack erasing the kernel with 
RANDOMIZE_KSTACK_OFFSET needs at least one stackleak_track_stack() call during 
the syscall handling.

Otherwise current->lowest_stack would point to the stack address where no stack 
frame was placed because of alloca with random size.

Am I right?

How about calling stackleak_track_stack() explicitly after the kernel stack 
randomization?


>> stackleak: I see noinstr function do_machine_check()
>> stackleak: instrument do_machine_check()
>> --
>> stackleak: I see noinstr function exc_general_protection()
>> stackleak: instrument exc_general_protection()
>> --
>> stackleak: I see noinstr function fixup_bad_iret()
>> stackleak: instrument fixup_bad_iret()
>>
>>
>> The cases with calls_alloca are caused by CONFIG_RANDOMIZE_KSTACK_OFFSET=y.
>> Kees knows about that peculiarity.
>>
>> Other cases are noinstr functions with large stack frame:
>> do_machine_check(), exc_general_protection(), and fixup_bad_iret().
>>
>> I think adding a build-time assertion is not possible, since it would break
>> building the kernel.
> 
> Do these functions share the syscall behavior of always calling into
> non-noinst functions that _do_ have stack depth instrumentation?

This is a right question.

I can't say for sure, but it looks like do_machine_check(), 
exc_general_protection() and fixup_bad_iret() do some low-level exception/trap 
handling and don't affect syscall handling. Do you agree?

>> [...]
>>> In security/Kconfig.hardening we have:
>>>
>>> | config STACKLEAK_TRACK_MIN_SIZE
>>> |         int "Minimum stack frame size of functions tracked by STACKLEAK"
>>> |         default 100
>>> |         range 0 4096
>>> |         depends on GCC_PLUGIN_STACKLEAK
>>> |         help
>>> |           The STACKLEAK gcc plugin instruments the kernel code for tracking
>>> |           the lowest border of the kernel stack (and for some other purposes).
>>> |           It inserts the stackleak_track_stack() call for the functions with
>>> |           a stack frame size greater than or equal to this parameter.
>>> |           If unsure, leave the default value 100.
>>>
>>> ... where the vast majority of that range is going to lead to a BUILD_BUG().
>>
>> Honestly, I don't like the idea of having the STACKLEAK_TRACK_MIN_SIZE option in the Kconfig.
>>
>> I was forced by the maintainers to introduce it when I was working on the stackleak patchset.
>>
>> How about dropping CONFIG_STACKLEAK_TRACK_MIN_SIZE from Kconfig?
>>
>> That would also allow to drop this build-time assertion.
> 
> Should this be arch-specific? (i.e. just make it a per-arch Kconfig
> default, instead of user-selectable into weird values?)

I don't think CONFIG_STACKLEAK_TRACK_MIN_SIZE is arch-specific, since 
STACKLEAK_SEARCH_DEPTH is the same for all architectures that support stackleak.

Best regards,
Alexander



^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2022-06-03 16:56 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-27 17:31 [PATCH v2 00/13] stackleak: fixes and rework Mark Rutland
2022-04-27 17:31 ` [PATCH v2 01/13] arm64: stackleak: fix current_top_of_stack() Mark Rutland
2022-05-04 16:41   ` Catalin Marinas
2022-05-04 19:01     ` Kees Cook
2022-05-04 19:55       ` Catalin Marinas
2022-05-05  8:25         ` Will Deacon
2022-05-08 17:24   ` Alexander Popov
2022-05-10 11:36     ` Mark Rutland
2022-04-27 17:31 ` [PATCH v2 02/13] stackleak: move skip_erasing() check earlier Mark Rutland
2022-05-08 17:44   ` Alexander Popov
2022-05-10 11:40     ` Mark Rutland
2022-04-27 17:31 ` [PATCH v2 03/13] stackleak: remove redundant check Mark Rutland
2022-05-08 18:17   ` Alexander Popov
2022-05-10 11:46     ` Mark Rutland
2022-05-11  3:00       ` Kees Cook
2022-05-11  8:02         ` Mark Rutland
2022-05-11 14:44           ` Kees Cook
2022-05-12  9:14             ` Mark Rutland
2022-05-15 16:17               ` Alexander Popov
2022-05-24 10:03                 ` Mark Rutland
2022-05-26 22:09                   ` Alexander Popov
2022-04-27 17:31 ` [PATCH v2 04/13] stackleak: rework stack low bound handling Mark Rutland
2022-04-27 17:31 ` [PATCH v2 05/13] stackleak: clarify variable names Mark Rutland
2022-05-08 20:49   ` Alexander Popov
2022-05-10 13:01     ` Mark Rutland
2022-05-11  3:05       ` Kees Cook
2022-04-27 17:31 ` [PATCH v2 06/13] stackleak: rework stack high bound handling Mark Rutland
2022-05-08 21:27   ` Alexander Popov
2022-05-10 11:22     ` Mark Rutland
2022-05-15 16:32       ` Alexander Popov
2022-04-27 17:31 ` [PATCH v2 07/13] stackleak: rework poison scanning Mark Rutland
2022-05-09 13:51   ` Alexander Popov
2022-05-10 13:13     ` Mark Rutland
2022-05-15 17:33       ` Alexander Popov
2022-05-24 13:31         ` Mark Rutland
2022-05-26 23:25           ` Alexander Popov
2022-05-31 18:13             ` Kees Cook
2022-06-03 16:55               ` Alexander Popov
2022-04-27 17:31 ` [PATCH v2 08/13] lkdtm/stackleak: avoid spurious failure Mark Rutland
2022-04-27 17:31 ` [PATCH v2 09/13] lkdtm/stackleak: rework boundary management Mark Rutland
2022-05-04 19:07   ` Kees Cook
2022-04-27 17:31 ` [PATCH v2 10/13] lkdtm/stackleak: prevent unexpected stack usage Mark Rutland
2022-04-27 17:31 ` [PATCH v2 11/13] lkdtm/stackleak: check stack boundaries Mark Rutland
2022-04-27 17:31 ` [PATCH v2 12/13] stackleak: add on/off stack variants Mark Rutland
2022-04-27 17:31 ` [PATCH v2 13/13] arm64: entry: use stackleak_erase_on_task_stack() Mark Rutland
2022-05-04 16:42   ` Catalin Marinas
2022-05-04 19:16 ` [PATCH v2 00/13] stackleak: fixes and rework Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).