* [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size
@ 2021-03-16  6:52 Chang S. Bae
  2021-03-16  6:52 ` [PATCH v7 1/6] uapi: Define the aux vector AT_MINSIGSTKSZ Chang S. Bae
                   ` (6 more replies)
  0 siblings, 7 replies; 29+ messages in thread
From: Chang S. Bae @ 2021-03-16  6:52 UTC (permalink / raw)
  To: bp, tglx, mingo, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, jannh, mpe,
	carlos, tony.luck, ravi.v.shankar, libc-alpha, linux-arch,
	linux-api, linux-kernel, chang.seok.bae

During signal entry, the kernel pushes data onto the normal userspace
stack. On x86, the data pushed onto the user stack includes XSAVE state,
which has grown over time as new features and larger registers have been
added to the architecture.

MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
constant indicates to userspace how much data the kernel expects to push on
the user stack, [2][3].

However, this constant is much too small and does not reflect recent
additions to the architecture. For instance, when AVX-512 states are in
use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.

The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
cause user stack overflow when delivering a signal.

This series proposes two changes:
1. Provide a variable minimum stack size to userspace, similar to the
   approach in [5].
2. Avoid using an alternate stack that is too small.

Changes from v6 [11]:
* Updated and fixed the documentation. (Borislav Petkov)
* Revised the AT_MINSIGSTKSZ comment. (Borislav Petkov)

Changes from v5 [10]:
* Fixed the overflow detection. (Andy Lutomirski)
* Reverted the AT_MINSIGSTKSZ removal on arm64. (Dave Martin)
* Added documentation for the x86 AT_MINSIGSTKSZ.
* Updated the existing sigaltstack test to use the new aux vector.

Changes from v4 [9]:
* Moved the aux vector define to the generic header. (Carlos O'Donell)

Changes from v3 [8]:
* Updated the changelog. (Borislav Petkov)
* Revised the test messages again. (Borislav Petkov)

Changes from v2 [7]:
* Simplified the sigaltstack overflow prevention. (Jann Horn)
* Renamed fpstate size helper with cleanup. (Borislav Petkov)
* Cleaned up the signframe struct size defines. (Borislav Petkov)
* Revised the selftest messages. (Borislav Petkov)
* Revised a changelog. (Borislav Petkov)

Changes from v1 [6]:
* Took stack alignment into account for sigframe size. (Dave Martin)

[1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/bits/sigstack.h;h=b9dca794da093dc4d41d39db9851d444e1b54d9b;hb=HEAD
[2]: https://www.gnu.org/software/libc/manual/html_node/Signal-Stack.html
[3]: https://man7.org/linux/man-pages/man2/sigaltstack.2.html
[4]: https://bugzilla.kernel.org/show_bug.cgi?id=153531
[5]: https://blog.linuxplumbersconf.org/2017/ocw/system/presentations/4671/original/plumbers-dm-2017.pdf
[6]: https://lore.kernel.org/lkml/20200929205746.6763-1-chang.seok.bae@intel.com/
[7]: https://lore.kernel.org/lkml/20201119190237.626-1-chang.seok.bae@intel.com/
[8]: https://lore.kernel.org/lkml/20201223015312.4882-1-chang.seok.bae@intel.com/
[9]: https://lore.kernel.org/lkml/20210115211038.2072-1-chang.seok.bae@intel.com/
[10]: https://lore.kernel.org/lkml/20210203172242.29644-1-chang.seok.bae@intel.com/
[11]: https://lore.kernel.org/lkml/20210227165911.32757-1-chang.seok.bae@intel.com/

Chang S. Bae (6):
  uapi: Define the aux vector AT_MINSIGSTKSZ
  x86/signal: Introduce helpers to get the maximum signal frame size
  x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
  selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
  x86/signal: Detect and prevent an alternate signal stack overflow
  selftest/x86/signal: Include test cases for validating sigaltstack

 Documentation/x86/elf_auxvec.rst          |  53 +++++++++
 Documentation/x86/index.rst               |   1 +
 arch/x86/include/asm/elf.h                |   4 +
 arch/x86/include/asm/fpu/signal.h         |   2 +
 arch/x86/include/asm/sigframe.h           |   2 +
 arch/x86/include/uapi/asm/auxvec.h        |   4 +-
 arch/x86/kernel/cpu/common.c              |   3 +
 arch/x86/kernel/fpu/signal.c              |  19 ++++
 arch/x86/kernel/signal.c                  |  72 +++++++++++-
 include/uapi/linux/auxvec.h               |   3 +
 tools/testing/selftests/sigaltstack/sas.c |  20 +++-
 tools/testing/selftests/x86/Makefile      |   2 +-
 tools/testing/selftests/x86/sigaltstack.c | 128 ++++++++++++++++++++++
 13 files changed, 300 insertions(+), 13 deletions(-)
 create mode 100644 Documentation/x86/elf_auxvec.rst
 create mode 100644 tools/testing/selftests/x86/sigaltstack.c


base-commit: 1e28eed17697bcf343c6743f0028cc3b5dd88bf0
-- 
2.17.1


* [PATCH v7 1/6] uapi: Define the aux vector AT_MINSIGSTKSZ
  2021-03-16  6:52 [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Chang S. Bae
@ 2021-03-16  6:52 ` Chang S. Bae
  2021-03-16  6:52 ` [PATCH v7 2/6] x86/signal: Introduce helpers to get the maximum signal frame size Chang S. Bae
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2021-03-16  6:52 UTC (permalink / raw)
  To: bp, tglx, mingo, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, jannh, mpe,
	carlos, tony.luck, ravi.v.shankar, libc-alpha, linux-arch,
	linux-api, linux-kernel, chang.seok.bae, linux-arm-kernel

Define AT_MINSIGSTKSZ in the generic Linux UAPI header. It is already used
as a generic ABI in glibc's generic elf.h, and this definition will prevent
future namespace conflicts. In particular, x86 also uses this generic
definition.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: Carlos O'Donell <carlos@redhat.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: libc-alpha@sourceware.org
Cc: linux-arch@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
---
Change from v6:
* Revised the comment. (Borislav Petkov)

Change from v5:
* Reverted the arm64 change. (Dave Martin and Will Deacon)
* Massaged the changelog.

Change from v4:
* Added as a new patch (Carlos O'Donell)
---
 include/uapi/linux/auxvec.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/uapi/linux/auxvec.h b/include/uapi/linux/auxvec.h
index abe5f2b6581b..c7e502bf5a6f 100644
--- a/include/uapi/linux/auxvec.h
+++ b/include/uapi/linux/auxvec.h
@@ -33,5 +33,8 @@
 
 #define AT_EXECFN  31	/* filename of program */
 
+#ifndef AT_MINSIGSTKSZ
+#define AT_MINSIGSTKSZ	51	/* minimal stack size for signal delivery */
+#endif
 
 #endif /* _UAPI_LINUX_AUXVEC_H */
-- 
2.17.1


* [PATCH v7 2/6] x86/signal: Introduce helpers to get the maximum signal frame size
  2021-03-16  6:52 [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Chang S. Bae
  2021-03-16  6:52 ` [PATCH v7 1/6] uapi: Define the aux vector AT_MINSIGSTKSZ Chang S. Bae
@ 2021-03-16  6:52 ` Chang S. Bae
  2021-03-16  6:52 ` [PATCH v7 3/6] x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ Chang S. Bae
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2021-03-16  6:52 UTC (permalink / raw)
  To: bp, tglx, mingo, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, jannh, mpe,
	carlos, tony.luck, ravi.v.shankar, libc-alpha, linux-arch,
	linux-api, linux-kernel, chang.seok.bae

Signal frames do not have a fixed format and vary in size depending on
several factors: the supported XSAVE features and whether the app is 32-bit
or 64-bit. Add the code to support a runtime method for userspace to
dynamically discover how large a signal stack needs to be.

Introduce a new variable, max_frame_size, and helper functions for the
calculation to be used in a new user interface. Set max_frame_size to a
system-wide worst-case value, instead of storing multiple app-specific
values.
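
The arithmetic can be modeled in userspace as below. This is a sketch for
illustration only: model_max_frame_size() and round_up_pow2() are
hypothetical names, and the two arguments stand in for values the kernel
derives itself (the size of the largest sigframe struct and the result of
fpu__get_fpstate_size()).

```c
#define FRAME_ALIGNMENT		16UL
#define MAX_FRAME_PADDING	(FRAME_ALIGNMENT - 1)
#define MAX_XSAVE_PADDING	63UL

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned long round_up_pow2(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Userspace model of init_sigframe_size(): worst-case sigframe struct
 * plus its alignment padding, plus the FP state plus the worst-case
 * padding needed to 64-byte align the XSAVE buffer, rounded up to the
 * frame alignment.
 */
static unsigned long model_max_frame_size(unsigned long sigframe_size,
					  unsigned long fpstate_size)
{
	unsigned long size = sigframe_size + MAX_FRAME_PADDING;

	size += fpstate_size + MAX_XSAVE_PADDING;

	return round_up_pow2(size, FRAME_ALIGNMENT);
}
```

With made-up inputs of 736 bytes for the sigframe struct and 2696 bytes for
the FP state, the model yields 736 + 15 + 2696 + 63 = 3510, rounded up to
3520.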

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Acked-by: H.J. Lu <hjl.tools@gmail.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v2:
* Renamed the fpstate size helper with cleanup (Borislav Petkov)
* Moved the sigframe struct size defines to where used (Borislav Petkov)
* Removed unneeded sentence in the changelog (Borislav Petkov)

Change from v1:
* Took stack alignment into account for sigframe size (Dave Martin)
---
 arch/x86/include/asm/fpu/signal.h |  2 ++
 arch/x86/include/asm/sigframe.h   |  2 ++
 arch/x86/kernel/cpu/common.c      |  3 ++
 arch/x86/kernel/fpu/signal.c      | 19 +++++++++++
 arch/x86/kernel/signal.c          | 57 +++++++++++++++++++++++++++++--
 5 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h
index 7fb516b6893a..8b6631dffefd 100644
--- a/arch/x86/include/asm/fpu/signal.h
+++ b/arch/x86/include/asm/fpu/signal.h
@@ -29,6 +29,8 @@ unsigned long
 fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
 		     unsigned long *buf_fx, unsigned long *size);
 
+unsigned long fpu__get_fpstate_size(void);
+
 extern void fpu__init_prepare_fx_sw_frame(void);
 
 #endif /* _ASM_X86_FPU_SIGNAL_H */
diff --git a/arch/x86/include/asm/sigframe.h b/arch/x86/include/asm/sigframe.h
index 84eab2724875..5b1ed650b124 100644
--- a/arch/x86/include/asm/sigframe.h
+++ b/arch/x86/include/asm/sigframe.h
@@ -85,4 +85,6 @@ struct rt_sigframe_x32 {
 
 #endif /* CONFIG_X86_64 */
 
+void __init init_sigframe_size(void);
+
 #endif /* _ASM_X86_SIGFRAME_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index ab640abe26b6..c49ef3ad34dc 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -58,6 +58,7 @@
 #include <asm/intel-family.h>
 #include <asm/cpu_device_id.h>
 #include <asm/uv/uv.h>
+#include <asm/sigframe.h>
 
 #include "cpu.h"
 
@@ -1334,6 +1335,8 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 
 	fpu__init_system(c);
 
+	init_sigframe_size();
+
 #ifdef CONFIG_X86_32
 	/*
 	 * Regardless of whether PCID is enumerated, the SDM says
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index a4ec65317a7f..dbb304e48f16 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -507,6 +507,25 @@ fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
 
 	return sp;
 }
+
+unsigned long fpu__get_fpstate_size(void)
+{
+	unsigned long ret = xstate_sigframe_size();
+
+	/*
+	 * This space is needed on (most) 32-bit kernels, or when a 32-bit
+	 * app is running on a 64-bit kernel. To keep things simple, just
+	 * assume the worst case and always include space for 'freg_state',
+	 * even for 64-bit apps on 64-bit kernels. This wastes a bit of
+	 * space, but keeps the code simple.
+	 */
+	if ((IS_ENABLED(CONFIG_IA32_EMULATION) ||
+	     IS_ENABLED(CONFIG_X86_32)) && use_fxsr())
+		ret += sizeof(struct fregs_state);
+
+	return ret;
+}
+
 /*
  * Prepare the SW reserved portion of the fxsave memory layout, indicating
  * the presence of the extended state information in the memory layout
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index ea794a083c44..800243afd1ef 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -212,6 +212,11 @@ do {									\
  * Set up a signal frame.
  */
 
+/* x86 ABI requires 16-byte alignment */
+#define FRAME_ALIGNMENT	16UL
+
+#define MAX_FRAME_PADDING	(FRAME_ALIGNMENT - 1)
+
 /*
  * Determine which stack to use..
  */
@@ -222,9 +227,9 @@ static unsigned long align_sigframe(unsigned long sp)
 	 * Align the stack pointer according to the i386 ABI,
 	 * i.e. so that on function entry ((sp + 4) & 15) == 0.
 	 */
-	sp = ((sp + 4) & -16ul) - 4;
+	sp = ((sp + 4) & -FRAME_ALIGNMENT) - 4;
 #else /* !CONFIG_X86_32 */
-	sp = round_down(sp, 16) - 8;
+	sp = round_down(sp, FRAME_ALIGNMENT) - 8;
 #endif
 	return sp;
 }
@@ -663,6 +668,54 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	return 0;
 }
 
+/*
+ * There are four different struct types for signal frame: sigframe_ia32,
+ * rt_sigframe_ia32, rt_sigframe_x32, and rt_sigframe. Use the worst case
+ * -- the largest size. It means the size for 64-bit apps is a bit more
+ * than needed, but this keeps the code simple.
+ */
+#if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
+# define MAX_FRAME_SIGINFO_UCTXT_SIZE	sizeof(struct sigframe_ia32)
+#else
+# define MAX_FRAME_SIGINFO_UCTXT_SIZE	sizeof(struct rt_sigframe)
+#endif
+
+/*
+ * The FP state frame contains an XSAVE buffer which must be 64-byte aligned.
+ * If a signal frame starts at an unaligned address, extra space is required.
+ * This is the max alignment padding, conservatively.
+ */
+#define MAX_XSAVE_PADDING	63UL
+
+/*
+ * The frame data is composed of the following areas and laid out as:
+ *
+ * -------------------------
+ * | alignment padding     |
+ * -------------------------
+ * | (f)xsave frame        |
+ * -------------------------
+ * | fsave header          |
+ * -------------------------
+ * | alignment padding     |
+ * -------------------------
+ * | siginfo + ucontext    |
+ * -------------------------
+ */
+
+/* max_frame_size tells userspace the worst case signal stack size. */
+static unsigned long __ro_after_init max_frame_size;
+
+void __init init_sigframe_size(void)
+{
+	max_frame_size = MAX_FRAME_SIGINFO_UCTXT_SIZE + MAX_FRAME_PADDING;
+
+	max_frame_size += fpu__get_fpstate_size() + MAX_XSAVE_PADDING;
+
+	/* Userspace expects an aligned size. */
+	max_frame_size = round_up(max_frame_size, FRAME_ALIGNMENT);
+}
+
 static inline int is_ia32_compat_frame(struct ksignal *ksig)
 {
 	return IS_ENABLED(CONFIG_IA32_EMULATION) &&
-- 
2.17.1


* [PATCH v7 3/6] x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
  2021-03-16  6:52 [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Chang S. Bae
  2021-03-16  6:52 ` [PATCH v7 1/6] uapi: Define the aux vector AT_MINSIGSTKSZ Chang S. Bae
  2021-03-16  6:52 ` [PATCH v7 2/6] x86/signal: Introduce helpers to get the maximum signal frame size Chang S. Bae
@ 2021-03-16  6:52 ` Chang S. Bae
  2021-03-16  6:52 ` [PATCH v7 4/6] selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available Chang S. Bae
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2021-03-16  6:52 UTC (permalink / raw)
  To: bp, tglx, mingo, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, jannh, mpe,
	carlos, tony.luck, ravi.v.shankar, libc-alpha, linux-arch,
	linux-api, linux-kernel, chang.seok.bae, Fenghua Yu, linux-doc

Historically, signal.h defines MINSIGSTKSZ (2KB) and SIGSTKSZ (8KB), for
use by all architectures with sigaltstack(2). Over time, the hardware state
size grew, but these constants did not evolve. Today, literal use of these
constants on several architectures may result in signal stack overflow, and
thus user data corruption.

A few years ago, the ARM team addressed this issue by establishing
getauxval(AT_MINSIGSTKSZ). This enables the kernel to supply at runtime a
value that is an appropriate replacement on current and future
hardware.

Add getauxval(AT_MINSIGSTKSZ) support to x86, analogous to the support
added for ARM in commit 94b07c1f8c39 ("arm64: signal: Report signal frame
size to userspace via auxv").

Also, add documentation describing the x86-specific auxiliary vectors.

Reported-by: Florian Weimer <fweimer@redhat.com>
Fixes: c2bc11f10a39 ("x86, AVX-512: Enable AVX-512 States Context Switch")
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: x86@kernel.org
Cc: libc-alpha@sourceware.org
Cc: linux-arch@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=153531
---
Changes from v6:
* Revised the documentation and fixed the build issue. (Borislav Petkov)
* Fixed the vertical alignment of '\'. (Borislav Petkov)

Changes from v5:
* Added documentation.
---
 Documentation/x86/elf_auxvec.rst   | 53 ++++++++++++++++++++++++++++++
 Documentation/x86/index.rst        |  1 +
 arch/x86/include/asm/elf.h         |  4 +++
 arch/x86/include/uapi/asm/auxvec.h |  4 +--
 arch/x86/kernel/signal.c           |  5 +++
 5 files changed, 65 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/x86/elf_auxvec.rst

diff --git a/Documentation/x86/elf_auxvec.rst b/Documentation/x86/elf_auxvec.rst
new file mode 100644
index 000000000000..6c75b26f5efb
--- /dev/null
+++ b/Documentation/x86/elf_auxvec.rst
@@ -0,0 +1,53 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+x86-specific ELF Auxiliary Vectors
+==================================
+
+This document describes the semantics of the x86 auxiliary vectors.
+
+Introduction
+============
+
+ELF Auxiliary vectors enable the kernel to efficiently provide
+configuration specific parameters to userspace. In this example, a program
+allocates an alternate stack based on the kernel-provided size::
+
+   #include <sys/auxv.h>
+   #include <elf.h>
+   #include <signal.h>
+   #include <stdlib.h>
+   #include <assert.h>
+   #include <err.h>
+
+   #ifndef AT_MINSIGSTKSZ
+   #define AT_MINSIGSTKSZ	51
+   #endif
+
+   ....
+   stack_t ss;
+
+   ss.ss_size = getauxval(AT_MINSIGSTKSZ) + SIGSTKSZ;
+   ss.ss_sp = malloc(ss.ss_size);
+   assert(ss.ss_sp);
+
+   ss.ss_flags = 0;
+
+   if (sigaltstack(&ss, NULL))
+        err(1, "sigaltstack");
+
+
+The exposed auxiliary vectors
+=============================
+
+AT_SYSINFO is used for locating the vsyscall entry point.  It is not
+exported in 64-bit mode.
+
+AT_SYSINFO_EHDR is the start address of the page containing the vDSO.
+
+AT_MINSIGSTKSZ denotes the minimum stack size required by the kernel to
+deliver a signal to user-space.  AT_MINSIGSTKSZ accounts for the space
+consumed by the kernel to accommodate the user context for the current
+hardware configuration.  It does not account for subsequent user-space stack
+consumption, which must be added by the user.  (For example, user-space above
+adds SIGSTKSZ to AT_MINSIGSTKSZ.)
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index 4693e192b447..d58614d5cde6 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -35,3 +35,4 @@ x86-specific Documentation
    sva
    sgx
    features
+   elf_auxvec
diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 9224d40cdefe..18d9b1117871 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -312,6 +312,7 @@ do {									\
 		NEW_AUX_ENT(AT_SYSINFO,	VDSO_ENTRY);			\
 		NEW_AUX_ENT(AT_SYSINFO_EHDR, VDSO_CURRENT_BASE);	\
 	}								\
+	NEW_AUX_ENT(AT_MINSIGSTKSZ, get_sigframe_size());		\
 } while (0)
 
 /*
@@ -328,6 +329,7 @@ extern unsigned long task_size_32bit(void);
 extern unsigned long task_size_64bit(int full_addr_space);
 extern unsigned long get_mmap_base(int is_legacy);
 extern bool mmap_address_hint_valid(unsigned long addr, unsigned long len);
+extern unsigned long get_sigframe_size(void);
 
 #ifdef CONFIG_X86_32
 
@@ -349,6 +351,7 @@ do {									\
 	if (vdso64_enabled)						\
 		NEW_AUX_ENT(AT_SYSINFO_EHDR,				\
 			    (unsigned long __force)current->mm->context.vdso); \
+	NEW_AUX_ENT(AT_MINSIGSTKSZ, get_sigframe_size());		\
 } while (0)
 
 /* As a historical oddity, the x32 and x86_64 vDSOs are controlled together. */
@@ -357,6 +360,7 @@ do {									\
 	if (vdso64_enabled)						\
 		NEW_AUX_ENT(AT_SYSINFO_EHDR,				\
 			    (unsigned long __force)current->mm->context.vdso); \
+	NEW_AUX_ENT(AT_MINSIGSTKSZ, get_sigframe_size());		\
 } while (0)
 
 #define AT_SYSINFO		32
diff --git a/arch/x86/include/uapi/asm/auxvec.h b/arch/x86/include/uapi/asm/auxvec.h
index 580e3c567046..6beb55bbefa4 100644
--- a/arch/x86/include/uapi/asm/auxvec.h
+++ b/arch/x86/include/uapi/asm/auxvec.h
@@ -12,9 +12,9 @@
 
 /* entries in ARCH_DLINFO: */
 #if defined(CONFIG_IA32_EMULATION) || !defined(CONFIG_X86_64)
-# define AT_VECTOR_SIZE_ARCH 2
+# define AT_VECTOR_SIZE_ARCH 3
 #else /* else it's non-compat x86-64 */
-# define AT_VECTOR_SIZE_ARCH 1
+# define AT_VECTOR_SIZE_ARCH 2
 #endif
 
 #endif /* _ASM_X86_AUXVEC_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 800243afd1ef..0d24f64d0145 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -716,6 +716,11 @@ void __init init_sigframe_size(void)
 	max_frame_size = round_up(max_frame_size, FRAME_ALIGNMENT);
 }
 
+unsigned long get_sigframe_size(void)
+{
+	return max_frame_size;
+}
+
 static inline int is_ia32_compat_frame(struct ksignal *ksig)
 {
 	return IS_ENABLED(CONFIG_IA32_EMULATION) &&
-- 
2.17.1


* [PATCH v7 4/6] selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
  2021-03-16  6:52 [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Chang S. Bae
                   ` (2 preceding siblings ...)
  2021-03-16  6:52 ` [PATCH v7 3/6] x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ Chang S. Bae
@ 2021-03-16  6:52 ` Chang S. Bae
  2021-03-16  6:52 ` [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow Chang S. Bae
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2021-03-16  6:52 UTC (permalink / raw)
  To: bp, tglx, mingo, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, jannh, mpe,
	carlos, tony.luck, ravi.v.shankar, libc-alpha, linux-arch,
	linux-api, linux-kernel, chang.seok.bae, linux-kselftest

The SIGSTKSZ constant may not be large enough on some architectures as the
hardware state size grows.

Use getauxval(AT_MINSIGSTKSZ) to increase the stack size.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v5:
* Added as a new patch.
---
 tools/testing/selftests/sigaltstack/sas.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/sigaltstack/sas.c b/tools/testing/selftests/sigaltstack/sas.c
index 8934a3766d20..c53b070755b6 100644
--- a/tools/testing/selftests/sigaltstack/sas.c
+++ b/tools/testing/selftests/sigaltstack/sas.c
@@ -17,6 +17,7 @@
 #include <string.h>
 #include <assert.h>
 #include <errno.h>
+#include <sys/auxv.h>
 
 #include "../kselftest.h"
 
@@ -24,6 +25,11 @@
 #define SS_AUTODISARM  (1U << 31)
 #endif
 
+#ifndef AT_MINSIGSTKSZ
+#define AT_MINSIGSTKSZ	51
+#endif
+
+static unsigned long stack_size;
 static void *sstack, *ustack;
 static ucontext_t uc, sc;
 static const char *msg = "[OK]\tStack preserved";
@@ -47,7 +53,7 @@ void my_usr1(int sig, siginfo_t *si, void *u)
 #endif
 
 	if (sp < (unsigned long)sstack ||
-			sp >= (unsigned long)sstack + SIGSTKSZ) {
+			sp >= (unsigned long)sstack + stack_size) {
 		ksft_exit_fail_msg("SP is not on sigaltstack\n");
 	}
 	/* put some data on stack. other sighandler will try to overwrite it */
@@ -108,6 +114,10 @@ int main(void)
 	stack_t stk;
 	int err;
 
+	/* Make sure more than the required minimum. */
+	stack_size = getauxval(AT_MINSIGSTKSZ) + SIGSTKSZ;
+	ksft_print_msg("[NOTE]\tthe stack size is %lu\n", stack_size);
+
 	ksft_print_header();
 	ksft_set_plan(3);
 
@@ -117,7 +127,7 @@ int main(void)
 	sigaction(SIGUSR1, &act, NULL);
 	act.sa_sigaction = my_usr2;
 	sigaction(SIGUSR2, &act, NULL);
-	sstack = mmap(NULL, SIGSTKSZ, PROT_READ | PROT_WRITE,
+	sstack = mmap(NULL, stack_size, PROT_READ | PROT_WRITE,
 		      MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
 	if (sstack == MAP_FAILED) {
 		ksft_exit_fail_msg("mmap() - %s\n", strerror(errno));
@@ -139,7 +149,7 @@ int main(void)
 	}
 
 	stk.ss_sp = sstack;
-	stk.ss_size = SIGSTKSZ;
+	stk.ss_size = stack_size;
 	stk.ss_flags = SS_ONSTACK | SS_AUTODISARM;
 	err = sigaltstack(&stk, NULL);
 	if (err) {
@@ -161,7 +171,7 @@ int main(void)
 		}
 	}
 
-	ustack = mmap(NULL, SIGSTKSZ, PROT_READ | PROT_WRITE,
+	ustack = mmap(NULL, stack_size, PROT_READ | PROT_WRITE,
 		      MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
 	if (ustack == MAP_FAILED) {
 		ksft_exit_fail_msg("mmap() - %s\n", strerror(errno));
@@ -170,7 +180,7 @@ int main(void)
 	getcontext(&uc);
 	uc.uc_link = NULL;
 	uc.uc_stack.ss_sp = ustack;
-	uc.uc_stack.ss_size = SIGSTKSZ;
+	uc.uc_stack.ss_size = stack_size;
 	makecontext(&uc, switch_fn, 0);
 	raise(SIGUSR1);
 
-- 
2.17.1


* [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-16  6:52 [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Chang S. Bae
                   ` (3 preceding siblings ...)
  2021-03-16  6:52 ` [PATCH v7 4/6] selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available Chang S. Bae
@ 2021-03-16  6:52 ` Chang S. Bae
  2021-03-16 11:52   ` Borislav Petkov
  2021-03-25 18:13   ` Andy Lutomirski
  2021-03-16  6:52 ` [PATCH v7 6/6] selftest/x86/signal: Include test cases for validating sigaltstack Chang S. Bae
  2021-03-17 10:06 ` [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Ingo Molnar
  6 siblings, 2 replies; 29+ messages in thread
From: Chang S. Bae @ 2021-03-16  6:52 UTC (permalink / raw)
  To: bp, tglx, mingo, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, jannh, mpe,
	carlos, tony.luck, ravi.v.shankar, libc-alpha, linux-arch,
	linux-api, linux-kernel, chang.seok.bae

The kernel pushes context onto the userspace stack to prepare for the
user's signal handler. When the user has supplied an alternate signal
stack, via sigaltstack(2), it is easy for the kernel to verify that the
stack size is sufficient for the current hardware context.

Check if writing the hardware context to the alternate stack will exceed
its size. If yes, then instead of corrupting user-data and proceeding with
the original signal handler, an immediate SIGSEGV signal is delivered.

Instead of calling on_sig_stack(), check directly whether the new stack
pointer is within the bounds.

While the kernel allows new source code to discover and use a sufficient
alternate signal stack size, this check is still necessary to protect
binaries with insufficient alternate signal stack size from data
corruption.
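
The bounds test can be modeled in userspace as follows. This is a sketch
that mirrors the condition added to get_sigframe(), not the kernel code
itself: altstack_overflows() is a hypothetical name, and its arguments
stand in for the post-allocation stack pointer, current->sas_ss_sp and
current->sas_ss_size.

```c
#include <stdbool.h>

/*
 * Userspace model of the overflow test: 'sp' is the stack pointer after
 * the signal frame has been carved out, 'ss_sp' is the lowest address of
 * the alternate stack and 'ss_size' is its size in bytes.
 *
 * The frame fits only if sp lies in (ss_sp, ss_sp + ss_size]. Checking
 * sp <= ss_sp first keeps the subtraction from wrapping around.
 */
static bool altstack_overflows(unsigned long sp, unsigned long ss_sp,
			       unsigned long ss_size)
{
	return sp <= ss_sp || sp - ss_sp > ss_size;
}
```

For example, with a stack at 0x1000 of size 0x2000, a resulting sp of 0x1001
or 0x3000 is accepted, while 0x1000 or anything below it is rejected.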

Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Reviewed-by: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v5:
* Fixed the overflow check. (Andy Lutomirski)
* Updated the changelog.

Changes from v3:
* Updated the changelog (Borislav Petkov)

Changes from v2:
* Simplified the implementation (Jann Horn)
---
 arch/x86/kernel/signal.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 0d24f64d0145..9a62604fbf63 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -242,7 +242,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	unsigned long math_size = 0;
 	unsigned long sp = regs->sp;
 	unsigned long buf_fx = 0;
-	int onsigstack = on_sig_stack(sp);
+	bool onsigstack = on_sig_stack(sp);
 	int ret;
 
 	/* redzone */
@@ -251,8 +251,11 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 
 	/* This is the X/Open sanctioned signal stack switching.  */
 	if (ka->sa.sa_flags & SA_ONSTACK) {
-		if (sas_ss_flags(sp) == 0)
+		if (sas_ss_flags(sp) == 0) {
 			sp = current->sas_ss_sp + current->sas_ss_size;
+			/* On the alternate signal stack */
+			onsigstack = true;
+		}
 	} else if (IS_ENABLED(CONFIG_X86_32) &&
 		   !onsigstack &&
 		   regs->ss != __USER_DS &&
@@ -272,7 +275,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	 * If we are on the alternate signal stack and would overflow it, don't.
 	 * Return an always-bogus address instead so we will die with SIGSEGV.
 	 */
-	if (onsigstack && !likely(on_sig_stack(sp)))
+	if (onsigstack && unlikely(sp <= current->sas_ss_sp ||
+				   sp - current->sas_ss_sp > current->sas_ss_size))
 		return (void __user *)-1L;
 
 	/* save i387 and extended state */
-- 
2.17.1


* [PATCH v7 6/6] selftest/x86/signal: Include test cases for validating sigaltstack
  2021-03-16  6:52 [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Chang S. Bae
                   ` (4 preceding siblings ...)
  2021-03-16  6:52 ` [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow Chang S. Bae
@ 2021-03-16  6:52 ` Chang S. Bae
  2021-03-17 10:06 ` [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Ingo Molnar
  6 siblings, 0 replies; 29+ messages in thread
From: Chang S. Bae @ 2021-03-16  6:52 UTC (permalink / raw)
  To: bp, tglx, mingo, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, jannh, mpe,
	carlos, tony.luck, ravi.v.shankar, libc-alpha, linux-arch,
	linux-api, linux-kernel, chang.seok.bae, linux-kselftest

The test exercises the kernel's signal delivery with sufficient and
insufficient alternate stack sizes.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v3:
* Revised test messages again (Borislav Petkov)

Changes from v2:
* Revised test messages (Borislav Petkov)
---
 tools/testing/selftests/x86/Makefile      |   2 +-
 tools/testing/selftests/x86/sigaltstack.c | 128 ++++++++++++++++++++++
 2 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/sigaltstack.c

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 333980375bc7..65bba2ae86ee 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -13,7 +13,7 @@ CAN_BUILD_WITH_NOPIE := $(shell ./check_cc.sh $(CC) trivial_program.c -no-pie)
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt test_mremap_vdso \
 			check_initial_reg_state sigreturn iopl ioperm \
 			test_vsyscall mov_ss_trap \
-			syscall_arg_fault fsgsbase_restore
+			syscall_arg_fault fsgsbase_restore sigaltstack
 TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
diff --git a/tools/testing/selftests/x86/sigaltstack.c b/tools/testing/selftests/x86/sigaltstack.c
new file mode 100644
index 000000000000..f689af75e979
--- /dev/null
+++ b/tools/testing/selftests/x86/sigaltstack.c
@@ -0,0 +1,128 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define _GNU_SOURCE
+#include <signal.h>
+#include <stdio.h>
+#include <stdbool.h>
+#include <string.h>
+#include <err.h>
+#include <errno.h>
+#include <limits.h>
+#include <sys/mman.h>
+#include <sys/auxv.h>
+#include <sys/prctl.h>
+#include <sys/resource.h>
+#include <setjmp.h>
+
+/* sigaltstack()-enforced minimum stack */
+#define ENFORCED_MINSIGSTKSZ	2048
+
+#ifndef AT_MINSIGSTKSZ
+#  define AT_MINSIGSTKSZ	51
+#endif
+
+static int nerrs;
+
+static bool sigalrm_expected;
+
+static unsigned long at_minstack_size;
+
+static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
+		       int flags)
+{
+	struct sigaction sa;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_sigaction = handler;
+	sa.sa_flags = SA_SIGINFO | flags;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+}
+
+static void clearhandler(int sig)
+{
+	struct sigaction sa;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_handler = SIG_DFL;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+}
+
+static int setup_altstack(void *start, unsigned long size)
+{
+	stack_t ss;
+
+	memset(&ss, 0, sizeof(ss));
+	ss.ss_size = size;
+	ss.ss_sp = start;
+
+	return sigaltstack(&ss, NULL);
+}
+
+static jmp_buf jmpbuf;
+
+static void sigsegv(int sig, siginfo_t *info, void *ctx_void)
+{
+	if (sigalrm_expected) {
+		printf("[FAIL]\tWrong signal delivered: SIGSEGV (expected SIGALRM).\n");
+		nerrs++;
+	} else {
+		printf("[OK]\tSIGSEGV signal delivered.\n");
+	}
+
+	siglongjmp(jmpbuf, 1);
+}
+
+static void sigalrm(int sig, siginfo_t *info, void *ctx_void)
+{
+	if (!sigalrm_expected) {
+		printf("[FAIL]\tWrong signal delivered: SIGALRM (expected SIGSEGV).\n");
+		nerrs++;
+	} else {
+		printf("[OK]\tSIGALRM signal delivered.\n");
+	}
+}
+
+static void test_sigaltstack(void *altstack, unsigned long size)
+{
+	if (setup_altstack(altstack, size))
+		err(1, "sigaltstack()");
+
+	sigalrm_expected = (size > at_minstack_size) ? true : false;
+
+	sethandler(SIGSEGV, sigsegv, 0);
+	sethandler(SIGALRM, sigalrm, SA_ONSTACK);
+
+	if (!sigsetjmp(jmpbuf, 1)) {
+		printf("[RUN]\tTest an alternate signal stack of %ssufficient size.\n",
+		       sigalrm_expected ? "" : "in");
+		printf("\tRaise SIGALRM. %s is expected to be delivered.\n",
+		       sigalrm_expected ? "It" : "SIGSEGV");
+		raise(SIGALRM);
+	}
+
+	clearhandler(SIGALRM);
+	clearhandler(SIGSEGV);
+}
+
+int main(void)
+{
+	void *altstack;
+
+	at_minstack_size = getauxval(AT_MINSIGSTKSZ);
+
+	altstack = mmap(NULL, at_minstack_size + SIGSTKSZ, PROT_READ | PROT_WRITE,
+			MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
+	if (altstack == MAP_FAILED)
+		err(1, "mmap()");
+
+	if ((ENFORCED_MINSIGSTKSZ + 1) < at_minstack_size)
+		test_sigaltstack(altstack, ENFORCED_MINSIGSTKSZ + 1);
+
+	test_sigaltstack(altstack, at_minstack_size + SIGSTKSZ);
+
+	return nerrs == 0 ? 0 : 1;
+}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-16  6:52 ` [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow Chang S. Bae
@ 2021-03-16 11:52   ` Borislav Petkov
  2021-03-16 18:26     ` Bae, Chang Seok
  2021-03-25 18:13   ` Andy Lutomirski
  1 sibling, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-03-16 11:52 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: tglx, mingo, luto, x86, len.brown, dave.hansen, hjl.tools,
	Dave.Martin, jannh, mpe, carlos, tony.luck, ravi.v.shankar,
	libc-alpha, linux-arch, linux-api, linux-kernel

On Mon, Mar 15, 2021 at 11:52:14PM -0700, Chang S. Bae wrote:
> @@ -272,7 +275,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>  	 * If we are on the alternate signal stack and would overflow it, don't.
>  	 * Return an always-bogus address instead so we will die with SIGSEGV.
>  	 */
> -	if (onsigstack && !likely(on_sig_stack(sp)))
> +	if (onsigstack && unlikely(sp <= current->sas_ss_sp ||
> +				   sp - current->sas_ss_sp > current->sas_ss_size))
>  		return (void __user *)-1L;

So clearly I'm missing something because trying to trigger the test case
in the bugzilla:

https://bugzilla.kernel.org/show_bug.cgi?id=153531

on current tip/master doesn't work. Runs with MY_MINSIGSTKSZ under 2048
fail with:

tst-minsigstksz-2: sigaltstack: Cannot allocate memory

and above 2048 don't overwrite bytes below the stack.

So something else is missing. How did you test this patch?

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-16 11:52   ` Borislav Petkov
@ 2021-03-16 18:26     ` Bae, Chang Seok
  2021-03-25 16:20       ` Borislav Petkov
  0 siblings, 1 reply; 29+ messages in thread
From: Bae, Chang Seok @ 2021-03-16 18:26 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, mingo, luto, x86, Brown, Len, Hansen, Dave,
	hjl.tools, Dave.Martin, jannh, mpe, carlos, Luck, Tony, Shankar,
	Ravi V, libc-alpha, linux-arch, linux-api, linux-kernel

On Mar 16, 2021, at 04:52, Borislav Petkov <bp@suse.de> wrote:
> On Mon, Mar 15, 2021 at 11:52:14PM -0700, Chang S. Bae wrote:
>> @@ -272,7 +275,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>> 	 * If we are on the alternate signal stack and would overflow it, don't.
>> 	 * Return an always-bogus address instead so we will die with SIGSEGV.
>> 	 */
>> -	if (onsigstack && !likely(on_sig_stack(sp)))
>> +	if (onsigstack && unlikely(sp <= current->sas_ss_sp ||
>> +				   sp - current->sas_ss_sp > current->sas_ss_size))
>> 		return (void __user *)-1L;
> 
> So clearly I'm missing something because trying to trigger the test case
> in the bugzilla:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=153531
> 
> on current tip/master doesn't work. Runs with MY_MINSIGSTKSZ under 2048
> fail with:
> 
> tst-minsigstksz-2: sigaltstack: Cannot allocate memory
> 
> and above 2048 don't overwrite bytes below the stack.
> 
> So something else is missing. How did you test this patch?

I suspect the AVX-512 states are not enabled there.

When I ran it on a machine without AVX-512 like this, it didn't show the
overwrite message:

    $ cat /proc/cpuinfo | grep -m1 "model name"
    model name      : Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz

    $ sudo dmesg | grep "Enabled xstate"
    [    0.000000] x86/fpu: Enabled xstate features 0x1f, context size is 960
    bytes, using 'compacted' format.

    $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=2047
    $ ./a.out
    a.out: sigaltstack: Cannot allocate memory

    $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=2048
    $ ./a.out

When I did it again with AVX-512, it did show the message:

    $ cat /proc/cpuinfo  | grep -m1 "model name"
    model name      : Intel(R) Core(TM) i9-7940X CPU @ 3.10GHz

    $ sudo dmesg | grep "Enabled xstate"
    [    0.000000] x86/fpu: Enabled xstate features 0xff, context size is 2560
    bytes, using 'compacted' format.

    $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=2048
    $ ./a.out
    a.out: changed byte 1412 bytes below configured stack

    $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=3490
    $ ./a.out
    a.out: changed byte 21 bytes below configured stack

    $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=3491
    $ ./a.out


Also, on the second machine, without this patch:

    $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=3191
    $ ./a.out
    a.out: changed byte 319 bytes below configured stack

But with this patch, it gave a segfault with a too-small size:

    $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=3191
    $ ./a.out
    Segmentation fault (core dumped)
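
For reference, a minimal userspace sketch of how a program could size its
alternate stack from the new aux vector instead of the compile-time constant.
This is an illustration only, not code from the series: min_altstack_size()
and install_altstack() are hypothetical helper names, and the fallback value
51 for AT_MINSIGSTKSZ simply mirrors the selftest in patch 6/6.

```c
#ifndef _GNU_SOURCE
#  define _GNU_SOURCE
#endif
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/auxv.h>

#ifndef AT_MINSIGSTKSZ
#  define AT_MINSIGSTKSZ	51	/* same fallback as the selftest */
#endif

/*
 * Hypothetical helper: the larger of the kernel-reported minimum and the
 * legacy MINSIGSTKSZ. getauxval() returns 0 when the entry is absent.
 */
static size_t min_altstack_size(void)
{
	unsigned long reported = getauxval(AT_MINSIGSTKSZ);
	size_t legacy = (size_t)MINSIGSTKSZ;

	return reported > legacy ? (size_t)reported : legacy;
}

/*
 * Hypothetical helper: allocate and register an alternate stack with
 * 'extra' bytes on top of the minimum for the handler's own use.
 * Returns 0 on success, -1 on failure.
 */
static int install_altstack(size_t extra)
{
	stack_t ss;

	memset(&ss, 0, sizeof(ss));
	ss.ss_size = min_altstack_size() + extra;
	ss.ss_sp = malloc(ss.ss_size);
	if (!ss.ss_sp)
		return -1;

	if (sigaltstack(&ss, NULL)) {
		free(ss.ss_sp);
		return -1;
	}
	return 0;
}
```

The 'extra' argument matters because the reported minimum only covers what
the kernel itself pushes; the handler's own locals need room on top of that.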

Thanks,
Chang

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size
  2021-03-16  6:52 [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Chang S. Bae
                   ` (5 preceding siblings ...)
  2021-03-16  6:52 ` [PATCH v7 6/6] selftest/x86/signal: Include test cases for validating sigaltstack Chang S. Bae
@ 2021-03-17 10:06 ` Ingo Molnar
  2021-03-17 10:44   ` Ingo Molnar
  6 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2021-03-17 10:06 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: bp, tglx, luto, x86, len.brown, dave.hansen, hjl.tools,
	Dave.Martin, jannh, mpe, carlos, tony.luck, ravi.v.shankar,
	libc-alpha, linux-arch, linux-api, linux-kernel


* Chang S. Bae <chang.seok.bae@intel.com> wrote:

> During signal entry, the kernel pushes data onto the normal userspace
> stack. On x86, the data pushed onto the user stack includes XSAVE state,
> which has grown over time as new features and larger registers have been
> added to the architecture.
> 
> MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> constant indicates to userspace how much data the kernel expects to push on
> the user stack, [2][3].
> 
> However, this constant is much too small and does not reflect recent
> additions to the architecture. For instance, when AVX-512 states are in
> use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> 
> The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> cause user stack overflow when delivering a signal.

>   uapi: Define the aux vector AT_MINSIGSTKSZ
>   x86/signal: Introduce helpers to get the maximum signal frame size
>   x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
>   selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
>   x86/signal: Detect and prevent an alternate signal stack overflow
>   selftest/x86/signal: Include test cases for validating sigaltstack

So this looks really complicated, is this justified?

Why not just internally round up sigaltstack size if it's too small? 
This would be more robust, as it would fix applications that use 
MINSIGSTKSZ but don't use the new AT_MINSIGSTKSZ facility.

I.e. does AT_MINSIGSTKSZ have any other uses than avoiding the 
segfault if MINSIGSTKSZ is used to create a small signal stack?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size
  2021-03-17 10:06 ` [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Ingo Molnar
@ 2021-03-17 10:44   ` Ingo Molnar
  2021-03-19 18:12     ` Len Brown
  0 siblings, 1 reply; 29+ messages in thread
From: Ingo Molnar @ 2021-03-17 10:44 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: bp, tglx, luto, x86, len.brown, dave.hansen, hjl.tools,
	Dave.Martin, jannh, mpe, carlos, tony.luck, ravi.v.shankar,
	libc-alpha, linux-arch, linux-api, linux-kernel


* Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Chang S. Bae <chang.seok.bae@intel.com> wrote:
> 
> > During signal entry, the kernel pushes data onto the normal userspace
> > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > which has grown over time as new features and larger registers have been
> > added to the architecture.
> > 
> > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > constant indicates to userspace how much data the kernel expects to push on
> > the user stack, [2][3].
> > 
> > However, this constant is much too small and does not reflect recent
> > additions to the architecture. For instance, when AVX-512 states are in
> > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > 
> > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > cause user stack overflow when delivering a signal.
> 
> >   uapi: Define the aux vector AT_MINSIGSTKSZ
> >   x86/signal: Introduce helpers to get the maximum signal frame size
> >   x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
> >   selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
> >   x86/signal: Detect and prevent an alternate signal stack overflow
> >   selftest/x86/signal: Include test cases for validating sigaltstack
> 
> So this looks really complicated, is this justified?
> 
> Why not just internally round up sigaltstack size if it's too small? 
> This would be more robust, as it would fix applications that use 
> MINSIGSTKSZ but don't use the new AT_MINSIGSTKSZ facility.
> 
> I.e. does AT_MINSIGSTKSZ have any other uses than avoiding the 
> segfault if MINSIGSTKSZ is used to create a small signal stack?

I.e. if the kernel sees a too small ->ss_size in sigaltstack() it 
would ignore ->ss_sp and mmap() a new sigaltstack instead and use that 
for the signal handler stack.

This would automatically make MINSIGSTKSZ - and other too small sizes 
work today, and in the future.

But the question is, is there user-space usage of sigaltstacks that 
relies on controlling or reading the contents of the stack?

longjmp using programs perhaps?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size
  2021-03-17 10:44   ` Ingo Molnar
@ 2021-03-19 18:12     ` Len Brown
  2021-03-20 17:32       ` Ingo Molnar
  0 siblings, 1 reply; 29+ messages in thread
From: Len Brown @ 2021-03-19 18:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Chang S. Bae, Borislav Petkov, Thomas Gleixner, Andy Lutomirski,
	X86 ML, Brown, Len, Dave Hansen, hjl.tools, Dave Martin, jannh,
	mpe, carlos, bothersome-borer for tony.luck@intel.com,
	Ravi V. Shankar, libc-alpha, linux-arch, linux-api,
	Linux Kernel Mailing List

On Wed, Mar 17, 2021 at 6:45 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Ingo Molnar <mingo@kernel.org> wrote:
>
> >
> > * Chang S. Bae <chang.seok.bae@intel.com> wrote:
> >
> > > During signal entry, the kernel pushes data onto the normal userspace
> > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > which has grown over time as new features and larger registers have been
> > > added to the architecture.
> > >
> > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > constant indicates to userspace how much data the kernel expects to push on
> > > the user stack, [2][3].
> > >
> > > However, this constant is much too small and does not reflect recent
> > > additions to the architecture. For instance, when AVX-512 states are in
> > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > >
> > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > cause user stack overflow when delivering a signal.
> >
> > >   uapi: Define the aux vector AT_MINSIGSTKSZ
> > >   x86/signal: Introduce helpers to get the maximum signal frame size
> > >   x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
> > >   selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
> > >   x86/signal: Detect and prevent an alternate signal stack overflow
> > >   selftest/x86/signal: Include test cases for validating sigaltstack
> >
> > So this looks really complicated, is this justified?
> >
> > Why not just internally round up sigaltstack size if it's too small?
> > This would be more robust, as it would fix applications that use
> > MINSIGSTKSZ but don't use the new AT_MINSIGSTKSZ facility.
> >
> > I.e. does AT_MINSIGSTKSZ have any other uses than avoiding the
> > segfault if MINSIGSTKSZ is used to create a small signal stack?
>
> I.e. if the kernel sees a too small ->ss_size in sigaltstack() it
> would ignore ->ss_sp and mmap() a new sigaltstack instead and use that
> for the signal handler stack.
>
> This would automatically make MINSIGSTKSZ - and other too small sizes
> work today, and in the future.
>
> But the question is, is there user-space usage of sigaltstacks that
> relies on controlling or reading the contents of the stack?
>
> longjmp using programs perhaps?

For the legacy binary that requests a too-small sigaltstack, there are
several choices:

We could detect the too-small stack at sigaltstack(2) invocation and
return an error.
This results in two deal-killing problems:
First, some applications don't check the return value, so the check
would be fruitless.
Second, those that check and error-out may be programs that never
actually take the signal, and so we'd be causing a dusty binary to
exit, when it didn't exit on another system, or another kernel.
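
For concreteness, checking the return value looks roughly like the sketch
below. try_altstack() is a hypothetical helper, and the sketch assumes the
Linux behavior documented in sigaltstack(2), where a size below MINSIGSTKSZ
fails with ENOMEM (the "Cannot allocate memory" seen earlier in this thread).

```c
#ifndef _GNU_SOURCE
#  define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: register 'buf' as the alternate signal stack.
 * Returns 0 on success, otherwise the errno set by sigaltstack().
 */
static int try_altstack(void *buf, size_t size)
{
	stack_t ss;

	memset(&ss, 0, sizeof(ss));
	ss.ss_sp = buf;
	ss.ss_size = size;

	if (sigaltstack(&ss, NULL))
		return errno;
	return 0;
}
```

A program that ignores this return value carries on without an alternate
stack at all, which is the first of the two problems described above.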

Or we could detect the too small stack at signal registration time.
This has the same two deal-killers as above.

Then there is the approach in this patch-set, which detects an imminent
stack overflow at run time. It has neither of the two problems above,
and the benefit that we now prevent data corruption that could have been
happening on some systems already today.  The down side is that the
dusty binary that does request the too-small stack can now die at run
time.

So your idea of recognizing the problem and conjuring up a sufficient
stack is compelling, since it would likely "just work", no matter how
dumb the program. But where would the sufficient stack come from -- is
this a new kernel buffer, or is there a way to abscond some user memory?
I would expect a signal handler to look at the data on its stack and
nobody else will look at that stack.  But this is already an
unreasonable program for allocating a special signal stack in the first
place :-/ So yes, one could imagine the signal handler could longjmp
instead of gracefully completing, and if this specially allocated signal
stack isn't where the user planned, that could be trouble.

Another idea we discussed was to detect the potential overflow at
run-time, and instead of killing the process, just push the signal onto
the regular user stack. This might actually work, but it is sort of
devious; and it would not work in the case where the user overflowed
their regular stack already, which may be the most (only?) compelling
reason that they allocated and declared a special sigaltstack in the
first place...

-- 
Len Brown, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size
  2021-03-19 18:12     ` Len Brown
@ 2021-03-20 17:32       ` Ingo Molnar
  0 siblings, 0 replies; 29+ messages in thread
From: Ingo Molnar @ 2021-03-20 17:32 UTC (permalink / raw)
  To: Len Brown
  Cc: Chang S. Bae, Borislav Petkov, Thomas Gleixner, Andy Lutomirski,
	X86 ML, Brown, Len, Dave Hansen, hjl.tools, Dave Martin, jannh,
	mpe, carlos, bothersome-borer for tony.luck@intel.com,
	Ravi V. Shankar, libc-alpha, linux-arch, linux-api,
	Linux Kernel Mailing List


* Len Brown <lenb@kernel.org> wrote:

> On Wed, Mar 17, 2021 at 6:45 AM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Ingo Molnar <mingo@kernel.org> wrote:
> >
> > >
> > > * Chang S. Bae <chang.seok.bae@intel.com> wrote:
> > >
> > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > which has grown over time as new features and larger registers have been
> > > > added to the architecture.
> > > >
> > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > constant indicates to userspace how much data the kernel expects to push on
> > > > the user stack, [2][3].
> > > >
> > > > However, this constant is much too small and does not reflect recent
> > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > >
> > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > cause user stack overflow when delivering a signal.
> > >
> > > >   uapi: Define the aux vector AT_MINSIGSTKSZ
> > > >   x86/signal: Introduce helpers to get the maximum signal frame size
> > > >   x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
> > > >   selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available
> > > >   x86/signal: Detect and prevent an alternate signal stack overflow
> > > >   selftest/x86/signal: Include test cases for validating sigaltstack
> > >
> > > So this looks really complicated, is this justified?
> > >
> > > Why not just internally round up sigaltstack size if it's too small?
> > > This would be more robust, as it would fix applications that use
> > > MINSIGSTKSZ but don't use the new AT_MINSIGSTKSZ facility.
> > >
> > > I.e. does AT_MINSIGSTKSZ have any other uses than avoiding the
> > > segfault if MINSIGSTKSZ is used to create a small signal stack?
> >
> > I.e. if the kernel sees a too small ->ss_size in sigaltstack() it
> > would ignore ->ss_sp and mmap() a new sigaltstack instead and use that
> > for the signal handler stack.
> >
> > This would automatically make MINSIGSTKSZ - and other too small sizes
> > work today, and in the future.
> >
> > But the question is, is there user-space usage of sigaltstacks that
> > relies on controlling or reading the contents of the stack?
> >
> > longjmp using programs perhaps?
> 
> For the legacy binary that requests a too-small sigaltstack, there are
> several choices:
> 
> We could detect the too-small stack at sigaltstack(2) invocation and
> return an error.
> This results in two deal-killing problems:
> First, some applications don't check the return value, so the check
> would be fruitless.
> Second, those that check and error-out may be programs that never
> actually take the signal, and so we'd be causing a dusty binary to
> exit, when it didn't exit on another system, or another kernel.
> 
> Or we could detect the too small stack at signal registration time.
> This has the same two deal-killers as above.
> 
> Then there is the approach in this patch-set, which detects an
> imminent stack overflow at run time.
> It has neither of the two problems above, and the benefit that we now
> prevent data corruption
> that could have been happening on some systems already today.  The
> down side is that the dusty binary
> that does request the too-small stack can now die at run time.
> 
> So your idea of recognizing the problem and conjuring up a 
> sufficient stack is compelling, since it would likely "just work", 
> no matter how dumb the program. But where would the sufficient 
> stack come from -- is this a new kernel buffer, or is there a way to 
> abscond some user memory?  I would expect a signal handler to look 
> at the data on its stack and nobody else will look at that stack.  
> But this is already an unreasonable program for allocating a special 
> signal stack in the first place :-/ So yes, one could imagine the 
> signal handler could longjmp instead of gracefully completing, and 
> if this specially allocated signal stack isn't where the user 
> planned, that could be trouble.

We could mmap() (implicitly) new anonymous memory - but I can see why 
this is probably more trouble than worth...

> Another idea we discussed was to detect the potential overflow at 
> run-time, and instead of killing the process, just push the signal 
> onto the regular user stack. this might actually work, but it is 
> sort of devious; and it would not work in the case where the user 
> overflowed their regular stack already, which may be the most 
> (only?) compelling reason that they allocated and declared a special 
> sigaltstack in the first place...

Yeah, this doesn't sound deterministic enough.

Ok, thanks for the detailed answers - I withdraw my objections, let's 
proceed with the approach you are proposing?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-16 18:26     ` Bae, Chang Seok
@ 2021-03-25 16:20       ` Borislav Petkov
  2021-03-25 17:21         ` Bae, Chang Seok
  0 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-03-25 16:20 UTC (permalink / raw)
  To: Bae, Chang Seok
  Cc: Thomas Gleixner, mingo, luto, x86, Brown, Len, Hansen, Dave,
	hjl.tools, Dave.Martin, jannh, mpe, carlos, Luck, Tony, Shankar,
	Ravi V, libc-alpha, linux-arch, linux-api, linux-kernel

On Tue, Mar 16, 2021 at 06:26:46PM +0000, Bae, Chang Seok wrote:
> I suspect the AVX-512 states are not enabled there.

Ok, I found a machine which has AVX-512:

[    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x020: 'AVX-512 opmask'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x040: 'AVX-512 Hi256'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x080: 'AVX-512 ZMM_Hi256'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers'
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: xstate_offset[5]:  832, xstate_sizes[5]:   64
[    0.000000] x86/fpu: xstate_offset[6]:  896, xstate_sizes[6]:  512
[    0.000000] x86/fpu: xstate_offset[7]: 1408, xstate_sizes[7]: 1024
[    0.000000] x86/fpu: xstate_offset[9]: 2432, xstate_sizes[9]:    8
[    0.000000] x86/fpu: Enabled xstate features 0x2e7, context size is 2440 bytes, using 'compacted' format.

and applied your patch and added a debug printk, see end of mail.

Then, I ran the test case:

$ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=3453 -o tst-minsigstksz-2
$ ./tst-minsigstksz-2
tst-minsigstksz-2: changed byte 50 bytes below configured stack

Whoops.

And the debug print said:

[ 5395.252884] signal: get_sigframe: sp: 0x7f54ec39e7b8, sas_ss_sp: 0x7f54ec39e6ce, sas_ss_size 0xd7d

which tells me that, AFAICT, your check whether we have enough alt stack
doesn't seem to work in this case.

Thx.

diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index a06cb107c0e8..a7396f7c3832 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -237,7 +237,7 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	unsigned long math_size = 0;
 	unsigned long sp = regs->sp;
 	unsigned long buf_fx = 0;
-	int onsigstack = on_sig_stack(sp);
+	bool onsigstack = on_sig_stack(sp);
 	int ret;
 
 	/* redzone */
@@ -246,8 +246,11 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 
 	/* This is the X/Open sanctioned signal stack switching.  */
 	if (ka->sa.sa_flags & SA_ONSTACK) {
-		if (sas_ss_flags(sp) == 0)
+		if (sas_ss_flags(sp) == 0) {
 			sp = current->sas_ss_sp + current->sas_ss_size;
+			/* On the alternate signal stack */
+			onsigstack = true;
+		}
 	} else if (IS_ENABLED(CONFIG_X86_32) &&
 		   !onsigstack &&
 		   regs->ss != __USER_DS &&
@@ -263,11 +266,16 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 
 	sp = align_sigframe(sp - frame_size);
 
+	if (onsigstack)
+		pr_info("%s: sp: 0x%lx, sas_ss_sp: 0x%lx, sas_ss_size 0x%lx\n",
+			__func__, sp, current->sas_ss_sp, current->sas_ss_size);
+
 	/*
 	 * If we are on the alternate signal stack and would overflow it, don't.
 	 * Return an always-bogus address instead so we will die with SIGSEGV.
 	 */
-	if (onsigstack && !likely(on_sig_stack(sp)))
+	if (onsigstack && unlikely(sp <= current->sas_ss_sp ||
+				   sp - current->sas_ss_sp > current->sas_ss_size))
 		return (void __user *)-1L;
 
 	/* save i387 and extended state */

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-25 16:20       ` Borislav Petkov
@ 2021-03-25 17:21         ` Bae, Chang Seok
  2021-03-25 20:14           ` Florian Weimer
  0 siblings, 1 reply; 29+ messages in thread
From: Bae, Chang Seok @ 2021-03-25 17:21 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Thomas Gleixner, mingo, luto, x86, Brown, Len, Hansen, Dave,
	hjl.tools, Dave.Martin, jannh, mpe, carlos, Luck, Tony, Shankar,
	Ravi V, libc-alpha, linux-arch, linux-api, linux-kernel

On Mar 25, 2021, at 09:20, Borislav Petkov <bp@suse.de> wrote:
> 
> $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=3453 -o tst-minsigstksz-2
> $ ./tst-minsigstksz-2
> tst-minsigstksz-2: changed byte 50 bytes below configured stack
> 
> Whoops.
> 
> And the debug print said:
> 
> [ 5395.252884] signal: get_sigframe: sp: 0x7f54ec39e7b8, sas_ss_sp: 0x7f54ec39e6ce, sas_ss_size 0xd7d
> 
> which tells me that, AFAICT, your check whether we have enough alt stack
> doesn't seem to work in this case.

Yes, in this case.

tst-minsigstksz-2.c has this code:

static void
handler (int signo)
{
  /* Clear a bit of on-stack memory.  */
  volatile char buffer[256];
  for (size_t i = 0; i < sizeof (buffer); ++i)
    buffer[i] = 0;
  handler_run = 1;
}
…

  if (handler_run != 1)
    errx (1, "handler did not run");

  for (void *p = stack_buffer; p < stack_bottom; ++p)
    if (*(unsigned char *) p != 0xCC)
      errx (1, "changed byte %zd bytes below configured stack\n",
            stack_bottom - p);
…

I think the message comes from the handler's own stack usage, not from the
kernel.

The patch's check is to detect and prevent the kernel-induced overflow --
whether the alt stack is large enough for signal delivery itself.  The stack
may still be too small for the signal handler's own use, which the kernel
has no way of knowing.
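
To make the arithmetic concrete, here is a userspace restatement of the
bounds test, meant to be fed the values from the debug print quoted above.
It is only a sketch: frame_overflows_altstack() and headroom_below_frame()
are hypothetical names, and sp is assumed to be the already-aligned frame
address, as printed after align_sigframe().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace restatement of the patch's test (not kernel code): the frame
 * overflows if its lowest address is at or below the stack base, or lies
 * beyond the top of the registered stack.
 */
static bool frame_overflows_altstack(uint64_t sp, uint64_t ss_sp,
				     uint64_t ss_size)
{
	return sp <= ss_sp || sp - ss_sp > ss_size;
}

/* Bytes left between the stack base and the lowest address of the frame. */
static uint64_t headroom_below_frame(uint64_t sp, uint64_t ss_sp)
{
	return sp > ss_sp ? sp - ss_sp : 0;
}
```

With sp = 0x7f54ec39e7b8, sas_ss_sp = 0x7f54ec39e6ce and sas_ss_size = 0xd7d,
the test does not trip and 0xea (234) bytes remain below the frame. That is
less than the handler's 256-byte buffer, so the handler itself writes below
the configured stack.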

Thanks,
Chang






^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-16  6:52 ` [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow Chang S. Bae
  2021-03-16 11:52   ` Borislav Petkov
@ 2021-03-25 18:13   ` Andy Lutomirski
  2021-03-25 18:54     ` Borislav Petkov
  2021-03-26  4:58     ` Andy Lutomirski
  1 sibling, 2 replies; 29+ messages in thread
From: Andy Lutomirski @ 2021-03-25 18:13 UTC (permalink / raw)
  To: Chang S. Bae, Andrew Cooper, Boris Ostrovsky, Juergen Gross,
	Stefano Stabellini
  Cc: Borislav Petkov, Thomas Gleixner, Ingo Molnar, Andrew Lutomirski,
	X86 ML, Len Brown, Dave Hansen, H. J. Lu, Dave Martin, Jann Horn,
	Michael Ellerman, Carlos O'Donell, Tony Luck,
	Ravi V. Shankar, libc-alpha, linux-arch, Linux API, LKML


[-- Attachment #1: Type: text/plain, Size: 5557 bytes --]

On Mon, Mar 15, 2021 at 11:57 PM Chang S. Bae <chang.seok.bae@intel.com> wrote:
>
> The kernel pushes context on to the userspace stack to prepare for the
> user's signal handler. When the user has supplied an alternate signal
> stack, via sigaltstack(2), it is easy for the kernel to verify that the
> stack size is sufficient for the current hardware context.
>
> Check if writing the hardware context to the alternate stack will exceed
> its size. If yes, then instead of corrupting user-data and proceeding with
> the original signal handler, an immediate SIGSEGV signal is delivered.
>
> Instead of calling on_sig_stack(), directly check whether the new stack
> pointer is within bounds.
>
> While the kernel allows new source code to discover and use a sufficient
> alternate signal stack size, this check is still necessary to protect
> binaries with insufficient alternate signal stack size from data
> corruption.

This patch results in excessively complicated control and data flow.

> -       int onsigstack = on_sig_stack(sp);
> +       bool onsigstack = on_sig_stack(sp);

Here onsigstack means "we were already using the altstack".

>         int ret;
>
>         /* redzone */
> @@ -251,8 +251,11 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>
>         /* This is the X/Open sanctioned signal stack switching.  */
>         if (ka->sa.sa_flags & SA_ONSTACK) {
> -               if (sas_ss_flags(sp) == 0)
> +               if (sas_ss_flags(sp) == 0) {
>                         sp = current->sas_ss_sp + current->sas_ss_size;
> +                       /* On the alternate signal stack */
> +                       onsigstack = true;
> +               }

But now onsigstack is also true if we are using the legacy path to
*enter* the altstack.  So now it's (was on altstack) || (entering
altstack via legacy path).

>         } else if (IS_ENABLED(CONFIG_X86_32) &&
>                    !onsigstack &&
>                    regs->ss != __USER_DS &&
> @@ -272,7 +275,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>          * If we are on the alternate signal stack and would overflow it, don't.
>          * Return an always-bogus address instead so we will die with SIGSEGV.
>          */
> -       if (onsigstack && !likely(on_sig_stack(sp)))
> +       if (onsigstack && unlikely(sp <= current->sas_ss_sp ||
> +                                  sp - current->sas_ss_sp > current->sas_ss_size))

And now we fail if ((was on altstack) || (entering altstack via legacy
path)) && (new sp is out of bounds).


The condition we actually want is that, if we are entering the
altstack and we don't fit, we should fail.  This is tricky because of
the autodisarm stuff and the possibility of nonlinear stack segments,
so it's not even clear to me exactly what we should be doing.  I
propose:




>                 return (void __user *)-1L;

Can we please log something (if (show_unhandled_signals ||
printk_ratelimit())) that says that we overflowed the altstack?

How about:

diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index ea794a083c44..53781324a2d3 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -237,7 +237,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
     unsigned long math_size = 0;
     unsigned long sp = regs->sp;
     unsigned long buf_fx = 0;
-    int onsigstack = on_sig_stack(sp);
+    bool already_onsigstack = on_sig_stack(sp);
+    bool entering_altstack = false;
     int ret;

     /* redzone */
@@ -246,15 +247,25 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,

     /* This is the X/Open sanctioned signal stack switching.  */
     if (ka->sa.sa_flags & SA_ONSTACK) {
-        if (sas_ss_flags(sp) == 0)
+        /*
+         * This checks already_onsigstack via sas_ss_flags().
+         * Sensible programs use SS_AUTODISARM, which disables
+         * that check, and programs that don't use
+         * SS_AUTODISARM get compatible but potentially
+         * bizarre behavior.
+         */
+        if (sas_ss_flags(sp) == 0) {
             sp = current->sas_ss_sp + current->sas_ss_size;
+            entering_altstack = true;
+        }
     } else if (IS_ENABLED(CONFIG_X86_32) &&
-           !onsigstack &&
+           !already_onsigstack &&
            regs->ss != __USER_DS &&
            !(ka->sa.sa_flags & SA_RESTORER) &&
            ka->sa.sa_restorer) {
         /* This is the legacy signal stack switching. */
         sp = (unsigned long) ka->sa.sa_restorer;
+        entering_altstack = true;
     }

     sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
@@ -267,8 +278,16 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
      * If we are on the alternate signal stack and would overflow it, don't.
      * Return an always-bogus address instead so we will die with SIGSEGV.
      */
-    if (onsigstack && !likely(on_sig_stack(sp)))
+    if (unlikely(entering_altstack &&
+             (sp <= current->sas_ss_sp ||
+              sp - current->sas_ss_sp > current->sas_ss_size))) {
+        if (show_unhandled_signals && printk_ratelimit()) {
+            pr_info("%s[%d] overflowed sigaltstack",
+                tsk->comm, task_pid_nr(tsk));
+        }
+
         return (void __user *)-1L;
+    }

     /* save i387 and extended state */
     ret = copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size);

Apologies for whitespace damage.  I attached it, too.

[-- Attachment #2: stack.patch --]
[-- Type: text/x-patch, Size: 2212 bytes --]

diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index ea794a083c44..53781324a2d3 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -237,7 +237,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	unsigned long math_size = 0;
 	unsigned long sp = regs->sp;
 	unsigned long buf_fx = 0;
-	int onsigstack = on_sig_stack(sp);
+	bool already_onsigstack = on_sig_stack(sp);
+	bool entering_altstack = false;
 	int ret;
 
 	/* redzone */
@@ -246,15 +247,25 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 
 	/* This is the X/Open sanctioned signal stack switching.  */
 	if (ka->sa.sa_flags & SA_ONSTACK) {
-		if (sas_ss_flags(sp) == 0)
+		/*
+		 * This checks already_onsigstack via sas_ss_flags().
+		 * Sensible programs use SS_AUTODISARM, which disables
+		 * that check, and programs that don't use
+		 * SS_AUTODISARM get compatible but potentially
+		 * bizarre behavior.
+		 */
+		if (sas_ss_flags(sp) == 0) {
 			sp = current->sas_ss_sp + current->sas_ss_size;
+			entering_altstack = true;
+		}
 	} else if (IS_ENABLED(CONFIG_X86_32) &&
-		   !onsigstack &&
+		   !already_onsigstack &&
 		   regs->ss != __USER_DS &&
 		   !(ka->sa.sa_flags & SA_RESTORER) &&
 		   ka->sa.sa_restorer) {
 		/* This is the legacy signal stack switching. */
 		sp = (unsigned long) ka->sa.sa_restorer;
+		entering_altstack = true;
 	}
 
 	sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
@@ -267,8 +278,16 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	 * If we are on the alternate signal stack and would overflow it, don't.
 	 * Return an always-bogus address instead so we will die with SIGSEGV.
 	 */
-	if (onsigstack && !likely(on_sig_stack(sp)))
+	if (unlikely(entering_altstack &&
+		     (sp <= current->sas_ss_sp ||
+		      sp - current->sas_ss_sp > current->sas_ss_size))) {
+		if (show_unhandled_signals && printk_ratelimit()) {
+			pr_info("%s[%d] overflowed sigaltstack",
+				tsk->comm, task_pid_nr(tsk));
+		}
+
 		return (void __user *)-1L;
+	}
 
 	/* save i387 and extended state */
 	ret = copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size);


* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-25 18:13   ` Andy Lutomirski
@ 2021-03-25 18:54     ` Borislav Petkov
  2021-03-25 21:11       ` Bae, Chang Seok
  2021-03-26  4:56       ` Andy Lutomirski
  2021-03-26  4:58     ` Andy Lutomirski
  1 sibling, 2 replies; 29+ messages in thread
From: Borislav Petkov @ 2021-03-25 18:54 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chang S. Bae, Andrew Cooper, Boris Ostrovsky, Juergen Gross,
	Stefano Stabellini, Thomas Gleixner, Ingo Molnar, X86 ML,
	Len Brown, Dave Hansen, H. J. Lu, Dave Martin, Jann Horn,
	Michael Ellerman, Carlos O'Donell, Tony Luck,
	Ravi V. Shankar, libc-alpha, linux-arch, Linux API, LKML

On Thu, Mar 25, 2021 at 11:13:12AM -0700, Andy Lutomirski wrote:
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index ea794a083c44..53781324a2d3 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -237,7 +237,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>  	unsigned long math_size = 0;
>  	unsigned long sp = regs->sp;
>  	unsigned long buf_fx = 0;
> -	int onsigstack = on_sig_stack(sp);
> +	bool already_onsigstack = on_sig_stack(sp);
> +	bool entering_altstack = false;
>  	int ret;
>  
>  	/* redzone */
> @@ -246,15 +247,25 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>  
>  	/* This is the X/Open sanctioned signal stack switching.  */
>  	if (ka->sa.sa_flags & SA_ONSTACK) {
> -		if (sas_ss_flags(sp) == 0)
> +		/*
> +		 * This checks already_onsigstack via sas_ss_flags().
> +		 * Sensible programs use SS_AUTODISARM, which disables
> +		 * that check, and programs that don't use
> +		 * SS_AUTODISARM get compatible but potentially
> +		 * bizarre behavior.
> +		 */
> +		if (sas_ss_flags(sp) == 0) {
>  			sp = current->sas_ss_sp + current->sas_ss_size;
> +			entering_altstack = true;
> +		}
>  	} else if (IS_ENABLED(CONFIG_X86_32) &&
> -		   !onsigstack &&
> +		   !already_onsigstack &&
>  		   regs->ss != __USER_DS &&
>  		   !(ka->sa.sa_flags & SA_RESTORER) &&
>  		   ka->sa.sa_restorer) {
>  		/* This is the legacy signal stack switching. */
>  		sp = (unsigned long) ka->sa.sa_restorer;
> +		entering_altstack = true;
>  	}

What a mess this whole signal handling is. I need a course in signal
handling to understand what's going on here...

>  
>  	sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
> @@ -267,8 +278,16 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>  	 * If we are on the alternate signal stack and would overflow it, don't.
>  	 * Return an always-bogus address instead so we will die with SIGSEGV.
>  	 */
> -	if (onsigstack && !likely(on_sig_stack(sp)))
> +	if (unlikely(entering_altstack &&
> +		     (sp <= current->sas_ss_sp ||
> +		      sp - current->sas_ss_sp > current->sas_ss_size))) {

You could've simply done

	if (unlikely(entering_altstack && !on_sig_stack(sp)))

here.


> +		if (show_unhandled_signals && printk_ratelimit()) {
> +			pr_info("%s[%d] overflowed sigaltstack",
> +				tsk->comm, task_pid_nr(tsk));
> +		}

Why do you even wanna issue that? It looks like callers will propagate
an error value up and people don't look at dmesg all the time.

Btw, s/tsk/current/g

IOW, this builds:

---
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index a06cb107c0e8..c00e932b5f18 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -234,10 +234,11 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	     void __user **fpstate)
 {
 	/* Default to using normal stack */
+	bool already_onsigstack = on_sig_stack(regs->sp);
+	bool entering_altstack = false;
 	unsigned long math_size = 0;
 	unsigned long sp = regs->sp;
 	unsigned long buf_fx = 0;
-	int onsigstack = on_sig_stack(sp);
 	int ret;
 
 	/* redzone */
@@ -246,15 +247,24 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 
 	/* This is the X/Open sanctioned signal stack switching.  */
 	if (ka->sa.sa_flags & SA_ONSTACK) {
-		if (sas_ss_flags(sp) == 0)
+		/*
+		 * This checks already_onsigstack via sas_ss_flags(). Sensible
+		 * programs use SS_AUTODISARM, which disables that check, and
+		 * programs that don't use SS_AUTODISARM get compatible but
+		 * potentially bizarre behavior.
+		 */
+		if (sas_ss_flags(sp) == 0) {
 			sp = current->sas_ss_sp + current->sas_ss_size;
+			entering_altstack = true;
+		}
 	} else if (IS_ENABLED(CONFIG_X86_32) &&
-		   !onsigstack &&
+		   !already_onsigstack &&
 		   regs->ss != __USER_DS &&
 		   !(ka->sa.sa_flags & SA_RESTORER) &&
 		   ka->sa.sa_restorer) {
 		/* This is the legacy signal stack switching. */
 		sp = (unsigned long) ka->sa.sa_restorer;
+		entering_altstack = true;
 	}
 
 	sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
@@ -267,8 +277,14 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 	 * If we are on the alternate signal stack and would overflow it, don't.
 	 * Return an always-bogus address instead so we will die with SIGSEGV.
 	 */
-	if (onsigstack && !likely(on_sig_stack(sp)))
+	if (unlikely(entering_altstack && !on_sig_stack(sp))) {
+
+		if (show_unhandled_signals && printk_ratelimit())
+			pr_info("%s[%d] overflowed sigaltstack",
+				current->comm, task_pid_nr(current));
+
 		return (void __user *)-1L;
+	}
 
 	/* save i387 and extended state */
 	ret = copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size);


-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-25 17:21         ` Bae, Chang Seok
@ 2021-03-25 20:14           ` Florian Weimer
  0 siblings, 0 replies; 29+ messages in thread
From: Florian Weimer @ 2021-03-25 20:14 UTC (permalink / raw)
  To: Bae, Chang Seok via Libc-alpha
  Cc: Borislav Petkov, Bae, Chang Seok, linux-arch, Brown, Len, Luck,
	Tony, jannh, x86, linux-kernel, Dave.Martin, Hansen, Dave, luto,
	linux-api, Thomas Gleixner, mingo, Shankar, Ravi V

* Bae, Chang Seok via Libc-alpha:

> On Mar 25, 2021, at 09:20, Borislav Petkov <bp@suse.de> wrote:
>> 
>> $ gcc tst-minsigstksz-2.c -DMY_MINSIGSTKSZ=3453 -o tst-minsigstksz-2
>> $ ./tst-minsigstksz-2
>> tst-minsigstksz-2: changed byte 50 bytes below configured stack
>> 
>> Whoops.
>> 
>> And the debug print said:
>> 
>> [ 5395.252884] signal: get_sigframe: sp: 0x7f54ec39e7b8, sas_ss_sp: 0x7f54ec39e6ce, sas_ss_size 0xd7d
>> 
>> which tells me that, AFAICT, your check whether we have enough alt stack
>> doesn't seem to work in this case.
>
> Yes, in this case.
>
> tst-minsigstksz-2.c has this code:
>
> static void
> handler (int signo)
> {
>   /* Clear a bit of on-stack memory.  */
>   volatile char buffer[256];
>   for (size_t i = 0; i < sizeof (buffer); ++i)
>     buffer[i] = 0;
>   handler_run = 1;
> }
> …
>
>   if (handler_run != 1)
>     errx (1, "handler did not run");
>
>   for (void *p = stack_buffer; p < stack_bottom; ++p)
>     if (*(unsigned char *) p != 0xCC)
>       errx (1, "changed byte %zd bytes below configured stack\n",
>             stack_bottom - p);
> …
>
> I think the message comes from the handler’s overwriting, not from the kernel.
>
> The patch's check is to detect and prevent the kernel-induced overflow --
> whether the alt stack is large enough for signal delivery itself.  The stack
> may still be too small for the signal handler's own use, as the kernel does
> not know about it.

Ahh, right.  When I wrote the test, I didn't know which turn the
kernel would eventually take, so the test is quite arbitrary.
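
For what it's worth, plugging the numbers from the debug print above into
the bounds check points the same way.  A small userspace sketch, not kernel
code: frame_fits() and bytes_left() are made-up names, and I am assuming the
printed sp is the frame address after allocation.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model of the bounds check in the patch: the frame fits
 * when sp lies in (base, base + size].  frame_fits() is a
 * hypothetical helper, not a kernel function.
 */
static bool frame_fits(uint64_t sp, uint64_t base, uint64_t size)
{
	return sp > base && sp - base <= size;
}

/* Bytes between sp and the lowest address of the alternate stack. */
static uint64_t bytes_left(uint64_t sp, uint64_t base)
{
	return sp - base;
}
```

With sp = 0x7f54ec39e7b8, base = 0x7f54ec39e6ce and size = 0xd7d (3453),
frame_fits() is true and bytes_left() is 0xea, i.e. 234 bytes.  That is less
than the 256-byte buffer the test handler clears, which would be consistent
with the handler, not the kernel, writing below the configured stack.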

The glibc dynamic loader uses XSAVE/XSAVEC as well, so you can
probably double the practical stack requirement if lazy binding is in
use and can be triggered from the signal handler.  Estimating stack
sizes is hard.


* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-25 18:54     ` Borislav Petkov
@ 2021-03-25 21:11       ` Bae, Chang Seok
  2021-03-25 21:27         ` Borislav Petkov
  2021-03-26  4:56       ` Andy Lutomirski
  1 sibling, 1 reply; 29+ messages in thread
From: Bae, Chang Seok @ 2021-03-25 21:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Cooper, Andrew, Boris Ostrovsky, Gross, Jurgen,
	Stefano Stabellini, Thomas Gleixner, Ingo Molnar, X86 ML, Brown,
	Len, Hansen, Dave, H. J. Lu, Dave Martin, Jann Horn,
	Michael Ellerman, Carlos O'Donell, Luck, Tony, Shankar,
	Ravi V, libc-alpha, linux-arch, Linux API, LKML

On Mar 25, 2021, at 11:54, Borislav Petkov <bp@suse.de> wrote:
> On Thu, Mar 25, 2021 at 11:13:12AM -0700, Andy Lutomirski wrote:
>> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
>> index ea794a083c44..53781324a2d3 100644
>> --- a/arch/x86/kernel/signal.c
>> +++ b/arch/x86/kernel/signal.c
>> @@ -237,7 +237,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>> 	unsigned long math_size = 0;
>> 	unsigned long sp = regs->sp;
>> 	unsigned long buf_fx = 0;
>> -	int onsigstack = on_sig_stack(sp);
>> +	bool already_onsigstack = on_sig_stack(sp);
>> +	bool entering_altstack = false;
>> 	int ret;
>> 
>> 	/* redzone */
>> @@ -246,15 +247,25 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>> 
>> 	/* This is the X/Open sanctioned signal stack switching.  */
>> 	if (ka->sa.sa_flags & SA_ONSTACK) {
>> -		if (sas_ss_flags(sp) == 0)
>> +		/*
>> +		 * This checks already_onsigstack via sas_ss_flags().
>> +		 * Sensible programs use SS_AUTODISARM, which disables
>> +		 * that check, and programs that don't use
>> +		 * SS_AUTODISARM get compatible but potentially
>> +		 * bizarre behavior.
>> +		 */
>> +		if (sas_ss_flags(sp) == 0) {
>> 			sp = current->sas_ss_sp + current->sas_ss_size;
>> +			entering_altstack = true;
>> +		}
>> 	} else if (IS_ENABLED(CONFIG_X86_32) &&
>> -		   !onsigstack &&
>> +		   !already_onsigstack &&
>> 		   regs->ss != __USER_DS &&
>> 		   !(ka->sa.sa_flags & SA_RESTORER) &&
>> 		   ka->sa.sa_restorer) {
>> 		/* This is the legacy signal stack switching. */
>> 		sp = (unsigned long) ka->sa.sa_restorer;
>> +		entering_altstack = true;
>> 	}
> 
> What a mess this whole signal handling is. I need a course in signal
> handling to understand what's going on here...
> 
>> 
>> 	sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
>> @@ -267,8 +278,16 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
>> 	 * If we are on the alternate signal stack and would overflow it, don't.
>> 	 * Return an always-bogus address instead so we will die with SIGSEGV.
>> 	 */
>> -	if (onsigstack && !likely(on_sig_stack(sp)))
>> +	if (unlikely(entering_altstack &&
>> +		     (sp <= current->sas_ss_sp ||
>> +		      sp - current->sas_ss_sp > current->sas_ss_size))) {
> 
> You could've simply done
> 
> 	if (unlikely(entering_altstack && !on_sig_stack(sp)))
> 
> here.

But if the alternate stack was registered with the SS_AUTODISARM flag, both
on_sig_stack() and sas_ss_flags() return 0 [1], so this check would always
segfault. v5 had exactly this issue [2].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/sched/signal.h#n576
[2] https://lore.kernel.org/lkml/CALCETrXuFrHUU-L=HMofTgEDZk9muPnVtK=EjsTHqQ01XhbRYg@mail.gmail.com/

Thanks,
Chang



* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-25 21:11       ` Bae, Chang Seok
@ 2021-03-25 21:27         ` Borislav Petkov
  0 siblings, 0 replies; 29+ messages in thread
From: Borislav Petkov @ 2021-03-25 21:27 UTC (permalink / raw)
  To: Bae, Chang Seok
  Cc: Andy Lutomirski, Cooper, Andrew, Boris Ostrovsky, Gross, Jurgen,
	Stefano Stabellini, Thomas Gleixner, Ingo Molnar, X86 ML, Brown,
	Len, Hansen, Dave, H. J. Lu, Dave Martin, Jann Horn,
	Michael Ellerman, Carlos O'Donell, Luck, Tony, Shankar,
	Ravi V, libc-alpha, linux-arch, Linux API, LKML

On Thu, Mar 25, 2021 at 09:11:56PM +0000, Bae, Chang Seok wrote:
> But if the alternate stack was registered with the SS_AUTODISARM flag, both
> on_sig_stack() and sas_ss_flags() return 0 [1], so this check would always
> segfault. v5 had exactly this issue [2].

Ah, there's that SS_AUTODISARM check above it which I missed, sorry.

I guess we can do a __on_sig_stack() helper or so which does the stack
check only without the SS_AUTODISARM. Just for readability's sake in
what is already a pretty messy function.
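
Roughly what I have in mind, as a userspace model rather than the actual
kernel code.  struct altstack, sp_in_altstack() and model_on_sig_stack() are
stand-in names for illustration, and the stack is assumed to grow down:

```c
#include <stdbool.h>

#define MODEL_SS_AUTODISARM (1U << 31)	/* same bit as SS_AUTODISARM */

/* Stand-in for the task's sas_ss_* fields; not the kernel struct. */
struct altstack {
	unsigned long sp;	/* lowest address of the alternate stack */
	unsigned long size;
	unsigned int flags;
};

/* Pure bounds check: is sp in (base, base + size]? */
static bool sp_in_altstack(const struct altstack *as, unsigned long sp)
{
	return sp > as->sp && sp - as->sp <= as->size;
}

/* Current semantics: never "on the stack" when SS_AUTODISARM is set. */
static bool model_on_sig_stack(const struct altstack *as, unsigned long sp)
{
	if (as->flags & MODEL_SS_AUTODISARM)
		return false;
	return sp_in_altstack(as, sp);
}
```

The overflow check would then use the bounds-only helper, so that an
SS_AUTODISARM stack is still checked instead of always failing.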

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-25 18:54     ` Borislav Petkov
  2021-03-25 21:11       ` Bae, Chang Seok
@ 2021-03-26  4:56       ` Andy Lutomirski
  2021-03-26 10:30         ` Borislav Petkov
  1 sibling, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2021-03-26  4:56 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Chang S. Bae, Andrew Cooper, Boris Ostrovsky,
	Juergen Gross, Stefano Stabellini, Thomas Gleixner, Ingo Molnar,
	X86 ML, Len Brown, Dave Hansen, H. J. Lu, Dave Martin, Jann Horn,
	Michael Ellerman, Carlos O'Donell, Tony Luck,
	Ravi V. Shankar, libc-alpha, linux-arch, Linux API, LKML

On Thu, Mar 25, 2021 at 11:54 AM Borislav Petkov <bp@suse.de> wrote:
>
> On Thu, Mar 25, 2021 at 11:13:12AM -0700, Andy Lutomirski wrote:
> > diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> > index ea794a083c44..53781324a2d3 100644
> > --- a/arch/x86/kernel/signal.c
> > +++ b/arch/x86/kernel/signal.c
> > @@ -237,7 +237,8 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
> >       unsigned long math_size = 0;
> >       unsigned long sp = regs->sp;
> >       unsigned long buf_fx = 0;
> > -     int onsigstack = on_sig_stack(sp);
> > +     bool already_onsigstack = on_sig_stack(sp);
> > +     bool entering_altstack = false;
> >       int ret;
> >
> >       /* redzone */
> > @@ -246,15 +247,25 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
> >
> >       /* This is the X/Open sanctioned signal stack switching.  */
> >       if (ka->sa.sa_flags & SA_ONSTACK) {
> > -             if (sas_ss_flags(sp) == 0)
> > +             /*
> > +              * This checks already_onsigstack via sas_ss_flags().
> > +              * Sensible programs use SS_AUTODISARM, which disables
> > +              * that check, and programs that don't use
> > +              * SS_AUTODISARM get compatible but potentially
> > +              * bizarre behavior.
> > +              */
> > +             if (sas_ss_flags(sp) == 0) {
> >                       sp = current->sas_ss_sp + current->sas_ss_size;
> > +                     entering_altstack = true;
> > +             }
> >       } else if (IS_ENABLED(CONFIG_X86_32) &&
> > -                !onsigstack &&
> > +                !already_onsigstack &&
> >                  regs->ss != __USER_DS &&
> >                  !(ka->sa.sa_flags & SA_RESTORER) &&
> >                  ka->sa.sa_restorer) {
> >               /* This is the legacy signal stack switching. */
> >               sp = (unsigned long) ka->sa.sa_restorer;
> > +             entering_altstack = true;
> >       }
>
> What a mess this whole signal handling is. I need a course in signal
> handling to understand what's going on here...
>
> >
> >       sp = fpu__alloc_mathframe(sp, IS_ENABLED(CONFIG_X86_32),
> > @@ -267,8 +278,16 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
> >        * If we are on the alternate signal stack and would overflow it, don't.
> >        * Return an always-bogus address instead so we will die with SIGSEGV.
> >        */
> > -     if (onsigstack && !likely(on_sig_stack(sp)))
> > +     if (unlikely(entering_altstack &&
> > +                  (sp <= current->sas_ss_sp ||
> > +                   sp - current->sas_ss_sp > current->sas_ss_size))) {
>
> You could've simply done
>
>         if (unlikely(entering_altstack && !on_sig_stack(sp)))
>
> here.

Nope.  on_sig_stack() is a horrible kludge and won't work here.  We
could have something like __on_sig_stack() or sp_is_on_sig_stack() or
something, though.

>
>
> > +             if (show_unhandled_signals && printk_ratelimit()) {
> > +                     pr_info("%s[%d] overflowed sigaltstack",
> > +                             tsk->comm, task_pid_nr(tsk));
> > +             }
>
> Why do you even wanna issue that? It looks like callers will propagate
> an error value up and people don't look at dmesg all the time.

I figure that the people whose programs spontaneously crash should get
a hint why if they look at dmesg.  Maybe the message should say
"overflowed sigaltstack -- try noavx512"?

We really ought to have a SIGSIGFAIL signal that's sent, double-fault
style, when we fail to send a signal.


* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-25 18:13   ` Andy Lutomirski
  2021-03-25 18:54     ` Borislav Petkov
@ 2021-03-26  4:58     ` Andy Lutomirski
  1 sibling, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2021-03-26  4:58 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew Cooper, Boris Ostrovsky, Juergen Gross,
	Stefano Stabellini, Borislav Petkov, Thomas Gleixner,
	Ingo Molnar, X86 ML, Len Brown, Dave Hansen, H. J. Lu,
	Dave Martin, Jann Horn, Michael Ellerman, Carlos O'Donell,
	Tony Luck, Ravi V. Shankar, libc-alpha, linux-arch, Linux API,
	LKML, Bae, Chang Seok

I forgot to mention why I cc'd all you fine Xen folk:

On Thu, Mar 25, 2021 at 11:13 AM Andy Lutomirski <luto@kernel.org> wrote:

>
> >         } else if (IS_ENABLED(CONFIG_X86_32) &&
> >                    !onsigstack &&
> >                    regs->ss != __USER_DS &&

This bit here seems really dubious on Xen PV.  Honestly it seems
dubious everywhere, but especially on Xen PV.

--Andy


* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-26  4:56       ` Andy Lutomirski
@ 2021-03-26 10:30         ` Borislav Petkov
  2021-04-12 22:30           ` Bae, Chang Seok
  0 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-03-26 10:30 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Chang S. Bae, Andrew Cooper, Boris Ostrovsky, Juergen Gross,
	Stefano Stabellini, Thomas Gleixner, Ingo Molnar, X86 ML,
	Len Brown, Dave Hansen, H. J. Lu, Dave Martin, Jann Horn,
	Michael Ellerman, Carlos O'Donell, Tony Luck,
	Ravi V. Shankar, libc-alpha, linux-arch, Linux API, LKML

On Thu, Mar 25, 2021 at 09:56:53PM -0700, Andy Lutomirski wrote:
> Nope.  on_sig_stack() is a horrible kludge and won't work here.  We
> could have something like __on_sig_stack() or sp_is_on_sig_stack() or
> something, though.

Yeah, see my other reply. Ack to either of those carved out helpers.

> I figure that the people whose programs spontaneously crash should get
> a hint why if they look at dmesg.  Maybe the message should say
> "overflowed sigaltstack -- try noavx512"?

I guess, as long as it is ratelimited. I mean, we can remove it later if
it starts gettin' annoying.

> We really ought to have a SIGSIGFAIL signal that's sent, double-fault
> style, when we fail to send a signal.

Yeap, we should be able to tell userspace that we couldn't send a
signal, hohumm.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-03-26 10:30         ` Borislav Petkov
@ 2021-04-12 22:30           ` Bae, Chang Seok
  2021-04-14 10:12             ` Borislav Petkov
  0 siblings, 1 reply; 29+ messages in thread
From: Bae, Chang Seok @ 2021-04-12 22:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Cooper, Andrew, Boris Ostrovsky, Gross, Jurgen,
	Stefano Stabellini, Thomas Gleixner, Ingo Molnar, X86 ML, Brown,
	Len, Hansen, Dave, H. J. Lu, Dave Martin, Jann Horn,
	Michael Ellerman, Carlos O'Donell, Luck, Tony, Shankar,
	Ravi V, libc-alpha, linux-arch, Linux API, LKML

On Mar 26, 2021, at 03:30, Borislav Petkov <bp@alien8.de> wrote:
> On Thu, Mar 25, 2021 at 09:56:53PM -0700, Andy Lutomirski wrote:
>> We really ought to have a SIGSIGFAIL signal that's sent, double-fault
>> style, when we fail to send a signal.
> 
> Yeap, we should be able to tell userspace that we couldn't send a
> signal, hohumm.

Hi Boris,

Let me clarify some details as I prepare to include this in a revision.

So, IIUC, a number needs to be assigned for this new SIGFAIL. At a glance, not
sure which one to pick there in signal.h -- 1-31 fully occupied and the rest
for 33 different real-time signals.

Also, perhaps, force_sig(SIGFAIL) here, instead of return -1 -- to die with
SIGSEGV.

Thanks,
Chang



* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-04-12 22:30           ` Bae, Chang Seok
@ 2021-04-14 10:12             ` Borislav Petkov
  2021-04-14 11:30               ` Florian Weimer
  0 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-04-14 10:12 UTC (permalink / raw)
  To: Bae, Chang Seok, Florian Weimer
  Cc: Andy Lutomirski, Cooper, Andrew, Boris Ostrovsky, Gross, Jurgen,
	Stefano Stabellini, Thomas Gleixner, Ingo Molnar, X86 ML, Brown,
	Len, Hansen, Dave, H. J. Lu, Dave Martin, Jann Horn,
	Michael Ellerman, Carlos O'Donell, Luck, Tony, Shankar,
	Ravi V, libc-alpha, linux-arch, Linux API, LKML

On Mon, Apr 12, 2021 at 10:30:23PM +0000, Bae, Chang Seok wrote:
> On Mar 26, 2021, at 03:30, Borislav Petkov <bp@alien8.de> wrote:
> > On Thu, Mar 25, 2021 at 09:56:53PM -0700, Andy Lutomirski wrote:
> >> We really ought to have a SIGSIGFAIL signal that's sent, double-fault
> >> style, when we fail to send a signal.
> > 
> > Yeap, we should be able to tell userspace that we couldn't send a
> > signal, hohumm.
> 
> Hi Boris,
> 
> Let me clarify some details as I prepare to include this in a revision.
> 
> So, IIUC, a number needs to be assigned for this new SIGFAIL. At a glance, not
> sure which one to pick there in signal.h -- 1-31 fully occupied and the rest
> for 33 different real-time signals.
> 
> Also, perhaps, force_sig(SIGFAIL) here, instead of return -1 -- to die with
> SIGSEGV.

I think this needs to be decided together with userspace people so that
they can act accordingly and whether it even makes sense to them.

Florian, any suggestions?

Subthread starts here:

https://lkml.kernel.org/r/CALCETrXQZuvJQrHDMst6PPgtJxaS_sPk2JhwMiMDNPunq45YFg@mail.gmail.com

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-04-14 10:12             ` Borislav Petkov
@ 2021-04-14 11:30               ` Florian Weimer
  2021-04-14 12:06                 ` Borislav Petkov
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Weimer @ 2021-04-14 11:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Bae, Chang Seok, Andy Lutomirski, Cooper, Andrew,
	Boris Ostrovsky, Gross, Jurgen, Stefano Stabellini,
	Thomas Gleixner, Ingo Molnar, X86 ML, Brown, Len, Hansen, Dave,
	H. J. Lu, Dave Martin, Jann Horn, Michael Ellerman,
	Carlos O'Donell, Luck, Tony, Shankar, Ravi V, libc-alpha,
	linux-arch, Linux API, LKML

* Borislav Petkov:

> On Mon, Apr 12, 2021 at 10:30:23PM +0000, Bae, Chang Seok wrote:
>> On Mar 26, 2021, at 03:30, Borislav Petkov <bp@alien8.de> wrote:
>> > On Thu, Mar 25, 2021 at 09:56:53PM -0700, Andy Lutomirski wrote:
>> >> We really ought to have a SIGSIGFAIL signal that's sent, double-fault
>> >> style, when we fail to send a signal.
>> > 
>> > Yeap, we should be able to tell userspace that we couldn't send a
>> > signal, hohumm.
>> 
>> Hi Boris,
>> 
>> Let me clarify some details as I prepare to include this in a revision.
>> 
>> So, IIUC, a number needs to be assigned for this new SIGFAIL. At a glance, not
>> sure which one to pick there in signal.h -- 1-31 fully occupied and the rest
>> for 33 different real-time signals.
>> 
>> Also, perhaps, force_sig(SIGFAIL) here, instead of return -1 -- to die with
>> SIGSEGV.
>
> I think this needs to be decided together with userspace people so that
> they can act accordingly and whether it even makes sense to them.
>
> Florian, any suggestions?

Is this discussion about better behavior (at least diagnostics) for
existing applications, without any code changes?  Or an alternative
programming model?

Does noavx512 actually reduce the XSAVE size to AVX2 levels?  Or would
you need noxsave?

One possibility is that the sigaltstack size check prevents applications
from running that work just fine today because all they do is install a
stack overflow handler, and stack overflow does not actually happen.  So
if sigaltstack fails and the application checks the result of the system
call, it probably won't run at all.  Shifting the diagnostic to the
point where the signal would have to be delivered is perhaps the only
thing that can be done.
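
To make the pattern concrete, this is approximately what such applications
do at startup.  A minimal sketch under my own assumptions:
install_on_altstack() and the probe in handler() are illustrative, not taken
from any particular program.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static volatile sig_atomic_t ran_on_altstack;
static uintptr_t alt_lo, alt_hi;

/* Record whether the handler's own frame lies inside the alternate stack. */
static void handler(int signo)
{
	char probe;
	uintptr_t here = (uintptr_t)&probe;

	(void)signo;
	ran_on_altstack = here >= alt_lo && here < alt_hi;
}

/*
 * Register buf as the alternate stack and install handler() for signo
 * with SA_ONSTACK.  Returns 0 on success, -1 if either call fails.
 * Hypothetical helper for illustration.
 */
static int install_on_altstack(int signo, void *buf, size_t size)
{
	stack_t ss = { .ss_sp = buf, .ss_size = size, .ss_flags = 0 };
	struct sigaction sa;

	if (sigaltstack(&ss, NULL) != 0)
		return -1;	/* e.g. ENOMEM when size < MINSIGSTKSZ */

	alt_lo = (uintptr_t)buf;
	alt_hi = alt_lo + size;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = handler;
	sa.sa_flags = SA_ONSTACK;
	sigemptyset(&sa.sa_mask);
	return sigaction(signo, &sa, NULL);
}
```

An application that treats a -1 here as fatal would stop working if
sigaltstack itself started rejecting sizes that it accepts today, even
though the handler might never run.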

As for SIGFAIL in particular, I don't think there are any leftover
signal numbers.  It would need a prctl to assign the signal number, and
I'm not sure if there is a useful programming model because signals do
not really compose well even today.  SIGFAIL adds another point where
libraries need to collaborate, and we do not have a mechanism for that.
(This is about what Rich Felker termed “library-safe code”, proper
maintenance of process-wide resources such as the current directory.)

Thanks,
Florian



* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-04-14 11:30               ` Florian Weimer
@ 2021-04-14 12:06                 ` Borislav Petkov
  2021-05-03  5:30                   ` Florian Weimer
  0 siblings, 1 reply; 29+ messages in thread
From: Borislav Petkov @ 2021-04-14 12:06 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Bae, Chang Seok, Andy Lutomirski, Cooper, Andrew,
	Boris Ostrovsky, Gross, Jurgen, Stefano Stabellini,
	Thomas Gleixner, Ingo Molnar, X86 ML, Brown, Len, Hansen, Dave,
	H. J. Lu, Dave Martin, Jann Horn, Michael Ellerman,
	Carlos O'Donell, Luck, Tony, Shankar, Ravi V, libc-alpha,
	linux-arch, Linux API, LKML

On Wed, Apr 14, 2021 at 01:30:43PM +0200, Florian Weimer wrote:
> Is this discussion about better behavior (at least diagnostics) for
> existing applications, without any code changes?  Or an alternative
> programming model?

Former.

> Does noavx512 actually reduce the XSAVE size to AVX2 levels?

Yeah.

> Or would you need noxsave?

I don't think so.

> One possibility is that the sigaltstack size check prevents applications
> from running which work just fine today because all they do is install a
> stack overflow handler, and stack overflow does not actually happen.

So sigaltstack(2) says in the NOTES:

       Functions called from a signal handler executing on an alternate signal stack
       will also use the alternate signal stack. (This also applies to any handlers
       invoked for other signals while the process is executing on the alternate signal
       stack.) Unlike the standard stack, the system does not automatically extend the
       alternate signal stack. Exceeding the allocated size of the alternate signal
       stack will lead to unpredictable results.

> So if sigaltstack fails and the application checks the result of the
> system call, it probably won't run at all. Shifting the diagnostic to
> the point where the signal would have to be delivered is perhaps the
> only thing that can be done.

So using the example from the same manpage:

       The most common usage of an alternate signal stack is to handle the SIGSEGV
       signal that is generated if the space available for the normal process stack is
       exhausted: in this case, a signal handler for SIGSEGV cannot be invoked on the
       process stack; if we wish to handle it, we must use an alternate signal stack.

and considering these "unpredictable results" would it make sense or
even be at all possible to return SIGFAIL from that SIGSEGV signal
handler which should run on the sigaltstack but that sigaltstack
overflows?

I think we wanna be able to tell the process through that previously
registered SIGSEGV handler which is supposed to run on the sigaltstack,
that that stack got overflowed.

Or is this use case obsolete and this is not what people do at all?

> As for SIGFAIL in particular, I don't think there are any leftover
> signal numbers.  It would need a prctl to assign the signal number, and
> I'm not sure if there is a useful programming model because signals do
> not really compose well even today.  SIGFAIL adds another point where
> libraries need to collaborate, and we do not have a mechanism for that.
> (This is about what Rich Felker termed “library-safe code”, proper
> maintenance of process-wide resources such as the current directory.)

Oh fun.

I guess if Linux goes and does something, people would adopt it and
it'll become standard. :-P

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-04-14 12:06                 ` Borislav Petkov
@ 2021-05-03  5:30                   ` Florian Weimer
  2021-05-03 11:17                     ` Borislav Petkov
  0 siblings, 1 reply; 29+ messages in thread
From: Florian Weimer @ 2021-05-03  5:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Bae, Chang Seok, Andy Lutomirski, Cooper, Andrew,
	Boris Ostrovsky, Gross, Jurgen, Stefano Stabellini,
	Thomas Gleixner, Ingo Molnar, X86 ML, Brown, Len, Hansen, Dave,
	H. J. Lu, Dave Martin, Jann Horn, Michael Ellerman,
	Carlos O'Donell, Luck, Tony, Shankar, Ravi V, libc-alpha,
	linux-arch, Linux API, LKML

* Borislav Petkov:

>> One possibility is that the sigaltstack size check prevents applications
>> from running which work just fine today because all they do is install a
>> stack overflow handler, and stack overflow does not actually happen.
>
> So sigaltstack(2) says in the NOTES:
>
>        Functions called from a signal handler executing on an alternate signal stack
>        will also use the alternate signal stack. (This also applies to any handlers
>        invoked for other signals while the process is executing on the alternate signal
>        stack.) Unlike the standard stack, the system does not automatically extend the
>        alternate signal stack. Exceeding the allocated size of the alternate signal
>        stack will lead to unpredictable results.
>
>> So if sigaltstack fails and the application checks the result of the
>> system call, it probably won't run at all. Shifting the diagnostic to
>> the point where the signal would have to be delivered is perhaps the
>> only thing that can be done.
>
> So using the example from the same manpage:
>
>        The most common usage of an alternate signal stack is to handle the SIGSEGV
>        signal that is generated if the space available for the normal process stack is
>        exhausted: in this case, a signal handler for SIGSEGV cannot be invoked on the
>        process stack; if we wish to handle it, we must use an alternate signal stack.
>
> and considering these "unpredictable results" would it make sense or
> even be at all possible to return SIGFAIL from that SIGSEGV signal
> handler which should run on the sigaltstack but that sigaltstack
> overflows?
>
> I think we wanna be able to tell the process through that previously
> registered SIGSEGV handler which is supposed to run on the sigaltstack,
> that that stack got overflowed.

Just to be clear, I'm worried about the case where an application
installs a stack overflow handler, but stack overflow does not regularly
happen at run time.  GNU m4 is an example.  Today, for most m4 scripts,
it's totally fine to have an alternative signal stack which is too
small.  If the kernel returned an error for the sigaltstack call, m4
wouldn't start anymore, independently of the script.  Which is worse
than memory corruption with some scripts, I think.
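
To make the concern concrete, here is a rough sketch of a startup path
that tolerates a rejected alternate stack instead of refusing to run.
This is not what m4 does; try_arm_overflow_handler() and
overflow_handler() are hypothetical names. It relies only on the
existing behavior that sigaltstack fails with ENOMEM when ss_size is
below the kernel's minimum.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler: report and exit, using async-signal-safe calls only. */
static void overflow_handler(int signo)
{
	static const char msg[] = "stack overflow\n";
	ssize_t r = write(STDERR_FILENO, msg, sizeof(msg) - 1);

	(void)r;
	(void)signo;
	_exit(EXIT_FAILURE);
}

/*
 * Hypothetical helper. Returns 1 if the SIGSEGV handler is armed on the
 * given alternate stack, 0 if the kernel rejected the stack as too small
 * (SIGSEGV is left untouched, so the default action still applies), and
 * -1 on any other error.
 */
static int try_arm_overflow_handler(void *stack, size_t size)
{
	stack_t ss = { .ss_sp = stack, .ss_size = size, .ss_flags = 0 };
	struct sigaction sa;

	if (sigaltstack(&ss, NULL) != 0)
		return errno == ENOMEM ? 0 : -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = overflow_handler;
	sa.sa_flags = SA_ONSTACK;
	sigemptyset(&sa.sa_mask);

	return sigaction(SIGSEGV, &sa, NULL) == 0 ? 1 : -1;
}
```

With this shape, a return value of 0 lets the program keep running without
overflow recovery rather than failing at startup.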

> Or is this use case obsolete and this is not what people do at all?

It's widely used in currently-maintained software.  It's the only way to
recover from stack overflows without boundary checks on every function
call.
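
As a sketch of that pattern: a handler registered with SA_ONSTACK runs
on the stack given to sigaltstack rather than on the (possibly
exhausted) process stack. The names install_onstack_handler(),
on_signal() and ran_on_alt are hypothetical, and the handler here only
records where it ran. A real SIGSEGV overflow handler could not simply
return, since the faulting instruction would be retried; it would
typically exit or jump elsewhere.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static stack_t alt;				/* the registered alternate stack */
static volatile sig_atomic_t ran_on_alt;	/* set by the handler */

/* Record whether a local variable of the handler lies inside 'alt'. */
static void on_signal(int signo)
{
	char probe;
	uintptr_t p = (uintptr_t)&probe;
	uintptr_t base = (uintptr_t)alt.ss_sp;

	(void)signo;
	ran_on_alt = (p >= base && p < base + alt.ss_size);
}

/*
 * Hypothetical helper: register an alternate stack of 'size' bytes and a
 * handler for 'signo' that runs on it. Returns 0 on success, -1 on error.
 */
static int install_onstack_handler(int signo, size_t size)
{
	struct sigaction sa;

	alt.ss_sp = malloc(size);
	if (alt.ss_sp == NULL)
		return -1;
	alt.ss_size = size;
	alt.ss_flags = 0;
	if (sigaltstack(&alt, NULL) != 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_signal;
	sa.sa_flags = SA_ONSTACK;	/* deliver on the alternate stack */
	sigemptyset(&sa.sa_mask);

	return sigaction(signo, &sa, NULL);
}
```

Installing it for SIGUSR1 and calling raise(SIGUSR1) is a safe way to
confirm that delivery happens on the alternate stack without provoking a
real overflow.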

Does the alternative signal stack actually have to contain the siginfo_t
data?  I don't think it has to be contiguous.  Maybe the kernel could
allocate and map something behind the process's back if the sigaltstack
region is too small?

And for the stack overflow handler, the kernel could treat SIGSEGV with
a sigaltstack region that is too small like the SIG_DFL handler.  This
would make m4 work again.

Thanks,
Florian



* Re: [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow
  2021-05-03  5:30                   ` Florian Weimer
@ 2021-05-03 11:17                     ` Borislav Petkov
  0 siblings, 0 replies; 29+ messages in thread
From: Borislav Petkov @ 2021-05-03 11:17 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Bae, Chang Seok, Andy Lutomirski, Cooper, Andrew,
	Boris Ostrovsky, Gross, Jurgen, Stefano Stabellini,
	Thomas Gleixner, Ingo Molnar, X86 ML, Brown, Len, Hansen, Dave,
	H. J. Lu, Dave Martin, Jann Horn, Michael Ellerman,
	Carlos O'Donell, Luck, Tony, Shankar, Ravi V, libc-alpha,
	linux-arch, Linux API, LKML

On Mon, May 03, 2021 at 07:30:21AM +0200, Florian Weimer wrote:
> Just to be clear, I'm worried about the case where an application
> installs a stack overflow handler, but stack overflow does not regularly
> happen at run time.  GNU m4 is an example.  Today, for most m4 scripts,
> it's totally fine to have an alternative signal stack which is too
> small.  If the kernel returned an error for the sigaltstack call, m4
> wouldn't start anymore, independently of the script.  Which is worse
> than memory corruption with some scripts, I think.

Oh lovely.

> 
> > Or is this use case obsolete and this is not what people do at all?
> 
> It's widely used in currently-maintained software.  It's the only way to
> recover from stack overflows without boundary checks on every function
> call.
> 
> Does the alternative signal stack actually have to contain the siginfo_t
> data?  I don't think it has to be contiguous.  Maybe the kernel could
> allocate and map something behind the process's back if the sigaltstack
> region is too small?

So there's an attempt floating around to address this:

https://lkml.kernel.org/r/20210422044856.27250-1-chang.seok.bae@intel.com

esp. patch 3.

I'd appreciate you having a look and sanity-checking whether this makes
sense and could be useful this way...

> And for the stack overflow handler, the kernel could treat SIGSEGV with
> a sigaltstack region that is too small like the SIG_DFL handler.  This
> would make m4 work again.

/me searches a bit about SIG_DFL...

Do you mean that the default action in this case should be what SIGSEGV
does by default - to dump core?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Thread overview: 29+ messages
2021-03-16  6:52 [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 1/6] uapi: Define the aux vector AT_MINSIGSTKSZ Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 2/6] x86/signal: Introduce helpers to get the maximum signal frame size Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 3/6] x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 4/6] selftest/sigaltstack: Use the AT_MINSIGSTKSZ aux vector if available Chang S. Bae
2021-03-16  6:52 ` [PATCH v7 5/6] x86/signal: Detect and prevent an alternate signal stack overflow Chang S. Bae
2021-03-16 11:52   ` Borislav Petkov
2021-03-16 18:26     ` Bae, Chang Seok
2021-03-25 16:20       ` Borislav Petkov
2021-03-25 17:21         ` Bae, Chang Seok
2021-03-25 20:14           ` Florian Weimer
2021-03-25 18:13   ` Andy Lutomirski
2021-03-25 18:54     ` Borislav Petkov
2021-03-25 21:11       ` Bae, Chang Seok
2021-03-25 21:27         ` Borislav Petkov
2021-03-26  4:56       ` Andy Lutomirski
2021-03-26 10:30         ` Borislav Petkov
2021-04-12 22:30           ` Bae, Chang Seok
2021-04-14 10:12             ` Borislav Petkov
2021-04-14 11:30               ` Florian Weimer
2021-04-14 12:06                 ` Borislav Petkov
2021-05-03  5:30                   ` Florian Weimer
2021-05-03 11:17                     ` Borislav Petkov
2021-03-26  4:58     ` Andy Lutomirski
2021-03-16  6:52 ` [PATCH v7 6/6] selftest/x86/signal: Include test cases for validating sigaltstack Chang S. Bae
2021-03-17 10:06 ` [PATCH v7 0/6] x86: Improve Minimum Alternate Stack Size Ingo Molnar
2021-03-17 10:44   ` Ingo Molnar
2021-03-19 18:12     ` Len Brown
2021-03-20 17:32       ` Ingo Molnar
