Linux-arch Archive on lore.kernel.org
 help / color / Atom feed
* [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
@ 2020-09-29 20:57 Chang S. Bae
  2020-09-29 20:57 ` [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size Chang S. Bae
                   ` (4 more replies)
  0 siblings, 5 replies; 31+ messages in thread
From: Chang S. Bae @ 2020-09-29 20:57 UTC (permalink / raw)
  To: tglx, mingo, bp, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, mpe, tony.luck,
	ravi.v.shankar, libc-alpha, linux-arch, linux-api, linux-kernel,
	chang.seok.bae

During signal entry, the kernel pushes data onto the normal userspace
stack. On x86, the data pushed onto the user stack includes XSAVE state,
which has grown over time as new features and larger registers have been
added to the architecture.

MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
constant indicates to userspace how much data the kernel expects to push on
the user stack, [2][3].

However, this constant is much too small and does not reflect recent
additions to the architecture. For instance, when AVX-512 states are in
use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.

The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
cause user stack overflow when delivering a signal.

In this series, we suggest a couple of things:
1. Provide a variable minimum stack size to userspace, as a similar
   approach to [5]
2. Avoid using a too-small alternate stack

[1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/bits/sigstack.h;h=b9dca794da093dc4d41d39db9851d444e1b54d9b;hb=HEAD
[2]: https://www.gnu.org/software/libc/manual/html_node/Signal-Stack.html
[3]: https://man7.org/linux/man-pages/man2/sigaltstack.2.html
[4]: https://bugzilla.kernel.org/show_bug.cgi?id=153531
[5]: https://blog.linuxplumbersconf.org/2017/ocw/system/presentations/4671/original/plumbers-dm-2017.pdf

Chang S. Bae (4):
  x86/signal: Introduce helpers to get the maximum signal frame size
  x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
  x86/signal: Prevent an alternate stack overflow before a signal
    delivery
  selftest/x86/signal: Include test cases for validating sigaltstack

 arch/x86/ia32/ia32_signal.c               |  11 +-
 arch/x86/include/asm/elf.h                |   4 +
 arch/x86/include/asm/fpu/signal.h         |   2 +
 arch/x86/include/asm/sigframe.h           |  25 +++++
 arch/x86/include/uapi/asm/auxvec.h        |   6 +-
 arch/x86/kernel/cpu/common.c              |   3 +
 arch/x86/kernel/fpu/signal.c              |  20 ++++
 arch/x86/kernel/signal.c                  |  66 +++++++++++-
 tools/testing/selftests/x86/Makefile      |   2 +-
 tools/testing/selftests/x86/sigaltstack.c | 126 ++++++++++++++++++++++
 10 files changed, 258 insertions(+), 7 deletions(-)
 create mode 100644 tools/testing/selftests/x86/sigaltstack.c

--
2.17.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size
  2020-09-29 20:57 [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Chang S. Bae
@ 2020-09-29 20:57 ` Chang S. Bae
  2020-10-05 13:42   ` Dave Martin
  2020-09-29 20:57 ` [RFC PATCH 2/4] x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ Chang S. Bae
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 31+ messages in thread
From: Chang S. Bae @ 2020-09-29 20:57 UTC (permalink / raw)
  To: tglx, mingo, bp, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, mpe, tony.luck,
	ravi.v.shankar, libc-alpha, linux-arch, linux-api, linux-kernel,
	chang.seok.bae

Signal frames do not have a fixed format and can vary in size when a number
of things change: support XSAVE features, 32 vs. 64-bit apps. Add the code
to support a runtime method for userspace to dynamically discover how large
a signal stack needs to be.

Introduce a new variable, max_frame_size, and helper functions for the
calculation to be used in a new user interface. Set max_frame_size to a
system-wide worst-case value, instead of storing multiple app-specific
values.

Locate the body of the helper function -- fpu__get_fpstate_sigframe_size()
in fpu/signal.c for its relevance.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/fpu/signal.h |  2 ++
 arch/x86/include/asm/sigframe.h   | 23 ++++++++++++++++
 arch/x86/kernel/cpu/common.c      |  3 +++
 arch/x86/kernel/fpu/signal.c      | 20 ++++++++++++++
 arch/x86/kernel/signal.c          | 45 +++++++++++++++++++++++++++++++
 5 files changed, 93 insertions(+)

diff --git a/arch/x86/include/asm/fpu/signal.h b/arch/x86/include/asm/fpu/signal.h
index 7fb516b6893a..5bfbf8f2e5a3 100644
--- a/arch/x86/include/asm/fpu/signal.h
+++ b/arch/x86/include/asm/fpu/signal.h
@@ -29,6 +29,8 @@ unsigned long
 fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
 		     unsigned long *buf_fx, unsigned long *size);
 
+unsigned long fpu__get_fpstate_sigframe_size(void);
+
 extern void fpu__init_prepare_fx_sw_frame(void);
 
 #endif /* _ASM_X86_FPU_SIGNAL_H */
diff --git a/arch/x86/include/asm/sigframe.h b/arch/x86/include/asm/sigframe.h
index 84eab2724875..ac77f3f90bc9 100644
--- a/arch/x86/include/asm/sigframe.h
+++ b/arch/x86/include/asm/sigframe.h
@@ -52,6 +52,15 @@ struct rt_sigframe_ia32 {
 	char retcode[8];
 	/* fp state follows here */
 };
+
+#define SIZEOF_sigframe_ia32	sizeof(struct sigframe_ia32)
+#define SIZEOF_rt_sigframe_ia32	sizeof(struct rt_sigframe_ia32)
+
+#else
+
+#define SIZEOF_sigframe_ia32	0
+#define SIZEOF_rt_sigframe_ia32	0
+
 #endif /* defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION) */
 
 #ifdef CONFIG_X86_64
@@ -81,8 +90,22 @@ struct rt_sigframe_x32 {
 	/* fp state follows here */
 };
 
+#define SIZEOF_rt_sigframe_x32	sizeof(struct rt_sigframe_x32)
+
 #endif /* CONFIG_X86_X32_ABI */
 
+#define SIZEOF_rt_sigframe	sizeof(struct rt_sigframe)
+
+#else
+
+#define SIZEOF_rt_sigframe	0
+
 #endif /* CONFIG_X86_64 */
 
+#ifndef SIZEOF_rt_sigframe_x32
+#define SIZEOF_rt_sigframe_x32	0
+#endif
+
+void __init init_sigframe_size(void);
+
 #endif /* _ASM_X86_SIGFRAME_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index c5d6f17d9b9d..d0363e15ec2e 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -57,6 +57,7 @@
 #include <asm/intel-family.h>
 #include <asm/cpu_device_id.h>
 #include <asm/uv/uv.h>
+#include <asm/sigframe.h>
 
 #include "cpu.h"
 
@@ -1276,6 +1277,8 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 
 	fpu__init_system(c);
 
+	init_sigframe_size();
+
 #ifdef CONFIG_X86_32
 	/*
 	 * Regardless of whether PCID is enumerated, the SDM says
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index a4ec65317a7f..9f009525f551 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -507,6 +507,26 @@ fpu__alloc_mathframe(unsigned long sp, int ia32_frame,
 
 	return sp;
 }
+
+unsigned long fpu__get_fpstate_sigframe_size(void)
+{
+	unsigned long xstate_size = xstate_sigframe_size();
+	unsigned long fsave_header_size = 0;
+
+	/*
+	 * This space is needed on (most) 32-bit kernels, or when a 32-bit
+	 * app is running on a 64-bit kernel. To keep things simple, just
+	 * assume the worst case and always include space for 'freg_state',
+	 * even for 64-bit apps on 64-bit kernels. This wastes a bit of
+	 * space, but keeps the code simple.
+	 */
+	if ((IS_ENABLED(CONFIG_IA32_EMULATION) ||
+	     IS_ENABLED(CONFIG_X86_32)) && use_fxsr())
+		fsave_header_size = sizeof(struct fregs_state);
+
+	return fsave_header_size + xstate_size;
+}
+
 /*
  * Prepare the SW reserved portion of the fxsave memory layout, indicating
  * the presence of the extended state information in the memory layout
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index be0d7d4152ec..239a0b23a4b0 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -663,6 +663,51 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	return 0;
 }
 
+/*
+ * The FP state frame contains an XSAVE buffer which must be 64-byte aligned.
+ * If a signal frame starts at an unaligned address, extra space is required.
+ * This is the max alignment padding, conservatively.
+ */
+#define MAX_XSAVE_PADDING	63UL
+
+/*
+ * The frame data is composed of the following areas and laid out as:
+ *
+ * -------------------------
+ * | alignment padding     |
+ * -------------------------
+ * | (f)xsave frame        |
+ * -------------------------
+ * | fsave header          |
+ * -------------------------
+ * | siginfo + ucontext    |
+ * -------------------------
+ */
+
+/* max_frame_size tells userspace the worst case signal stack size. */
+static unsigned long __ro_after_init max_frame_size;
+
+void __init init_sigframe_size(void)
+{
+	/*
+	 * Use the largest of possible structure formats. This might
+	 * slightly oversize the frame for 64-bit apps.
+	 */
+
+	if (IS_ENABLED(CONFIG_X86_32) ||
+	    IS_ENABLED(CONFIG_IA32_EMULATION))
+		max_frame_size = max((unsigned long)SIZEOF_sigframe_ia32,
+				     (unsigned long)SIZEOF_rt_sigframe_ia32);
+
+	if (IS_ENABLED(CONFIG_X86_X32_ABI))
+		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe_x32);
+
+	if (IS_ENABLED(CONFIG_X86_64))
+		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe);
+
+	max_frame_size += fpu__get_fpstate_sigframe_size() + MAX_XSAVE_PADDING;
+}
+
 static inline int is_ia32_compat_frame(struct ksignal *ksig)
 {
 	return IS_ENABLED(CONFIG_IA32_EMULATION) &&
-- 
2.17.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC PATCH 2/4] x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
  2020-09-29 20:57 [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Chang S. Bae
  2020-09-29 20:57 ` [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size Chang S. Bae
@ 2020-09-29 20:57 ` Chang S. Bae
  2020-09-29 20:57 ` [RFC PATCH 3/4] x86/signal: Prevent an alternate stack overflow before a signal delivery Chang S. Bae
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 31+ messages in thread
From: Chang S. Bae @ 2020-09-29 20:57 UTC (permalink / raw)
  To: tglx, mingo, bp, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, mpe, tony.luck,
	ravi.v.shankar, libc-alpha, linux-arch, linux-api, linux-kernel,
	chang.seok.bae, Fenghua Yu

Historically, signal.h defines MINSIGSTKSZ (2KB) and SIGSTKSZ (8KB), for
use by all architectures with sigaltstack(2). Over time, the hardware state
size grew, but these constants did not evolve. Today, literal use of these
constants on several architectures may result in signal stack overflow, and
thus user data corruption.

A few years ago, the ARM team addressed this issue by establishing
getauxval(AT_MINSIGSTKSZ), such that the kernel can supply at runtime value
that is an appropriate replacement on the current and future hardware.

Add getauxval(AT_MINSIGSTKSZ) support to x86, analogous to the support
added for ARM in commit 94b07c1f8c39 ("arm64: signal: Report signal frame
size to userspace via auxv").

Reported-by: Florian Weimer <fweimer@redhat.com>
Fixes: c2bc11f10a39 ("x86, AVX-512: Enable AVX-512 States Context Switch")
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: x86@kernel.org
Cc: libc-alpha@sourceware.org
Cc: linux-arch@vger.kernel.org
Cc: linux-api@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=153531
---
 arch/x86/include/asm/elf.h         | 4 ++++
 arch/x86/include/uapi/asm/auxvec.h | 6 ++++--
 arch/x86/kernel/signal.c           | 5 +++++
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index b9a5d488f1a5..044b024abea1 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -311,6 +311,7 @@ do {									\
 		NEW_AUX_ENT(AT_SYSINFO,	VDSO_ENTRY);			\
 		NEW_AUX_ENT(AT_SYSINFO_EHDR, VDSO_CURRENT_BASE);	\
 	}								\
+	NEW_AUX_ENT(AT_MINSIGSTKSZ, get_sigframe_size());			\
 } while (0)
 
 /*
@@ -327,6 +328,7 @@ extern unsigned long task_size_32bit(void);
 extern unsigned long task_size_64bit(int full_addr_space);
 extern unsigned long get_mmap_base(int is_legacy);
 extern bool mmap_address_hint_valid(unsigned long addr, unsigned long len);
+extern unsigned long get_sigframe_size(void);
 
 #ifdef CONFIG_X86_32
 
@@ -348,6 +350,7 @@ do {									\
 	if (vdso64_enabled)						\
 		NEW_AUX_ENT(AT_SYSINFO_EHDR,				\
 			    (unsigned long __force)current->mm->context.vdso); \
+	NEW_AUX_ENT(AT_MINSIGSTKSZ, get_sigframe_size());			\
 } while (0)
 
 /* As a historical oddity, the x32 and x86_64 vDSOs are controlled together. */
@@ -356,6 +359,7 @@ do {									\
 	if (vdso64_enabled)						\
 		NEW_AUX_ENT(AT_SYSINFO_EHDR,				\
 			    (unsigned long __force)current->mm->context.vdso); \
+	NEW_AUX_ENT(AT_MINSIGSTKSZ, get_sigframe_size());			\
 } while (0)
 
 #define AT_SYSINFO		32
diff --git a/arch/x86/include/uapi/asm/auxvec.h b/arch/x86/include/uapi/asm/auxvec.h
index 580e3c567046..edd7808060e6 100644
--- a/arch/x86/include/uapi/asm/auxvec.h
+++ b/arch/x86/include/uapi/asm/auxvec.h
@@ -10,11 +10,13 @@
 #endif
 #define AT_SYSINFO_EHDR		33
 
+#define AT_MINSIGSTKSZ		51
+
 /* entries in ARCH_DLINFO: */
 #if defined(CONFIG_IA32_EMULATION) || !defined(CONFIG_X86_64)
-# define AT_VECTOR_SIZE_ARCH 2
+# define AT_VECTOR_SIZE_ARCH 3
 #else /* else it's non-compat x86-64 */
-# define AT_VECTOR_SIZE_ARCH 1
+# define AT_VECTOR_SIZE_ARCH 2
 #endif
 
 #endif /* _ASM_X86_AUXVEC_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 239a0b23a4b0..2a313d34ec49 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -708,6 +708,11 @@ void __init init_sigframe_size(void)
 	max_frame_size += fpu__get_fpstate_sigframe_size() + MAX_XSAVE_PADDING;
 }
 
+unsigned long get_sigframe_size(void)
+{
+	return max_frame_size;
+}
+
 static inline int is_ia32_compat_frame(struct ksignal *ksig)
 {
 	return IS_ENABLED(CONFIG_IA32_EMULATION) &&
-- 
2.17.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC PATCH 3/4] x86/signal: Prevent an alternate stack overflow before a signal delivery
  2020-09-29 20:57 [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Chang S. Bae
  2020-09-29 20:57 ` [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size Chang S. Bae
  2020-09-29 20:57 ` [RFC PATCH 2/4] x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ Chang S. Bae
@ 2020-09-29 20:57 ` Chang S. Bae
  2020-09-29 20:57 ` [RFC PATCH 4/4] selftest/x86/signal: Include test cases for validating sigaltstack Chang S. Bae
  2020-10-05 13:45 ` [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Dave Martin
  4 siblings, 0 replies; 31+ messages in thread
From: Chang S. Bae @ 2020-09-29 20:57 UTC (permalink / raw)
  To: tglx, mingo, bp, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, mpe, tony.luck,
	ravi.v.shankar, libc-alpha, linux-arch, linux-api, linux-kernel,
	chang.seok.bae

The kernel pushes data on the userspace stack when entering a signal. If
using a sigaltstack(), the kernel precisely knows the user stack size.

When the kernel knows that the user stack is too small, avoid the overflow
and do an immediate SIGSEGV instead.

This overflow is known to occur on systems with large XSAVE state. The
effort to increase the size typically used for altstacks reduces the
frequency of these overflows, but this approach is still useful for legacy
binaries.

Here the kernel expects a bit conservative stack size (for 64-bit apps).
Legacy binaries used a too-small sigaltstack would be already overflowed
before this change, if they run on modern hardware.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/ia32/ia32_signal.c     | 11 ++++++++---
 arch/x86/include/asm/sigframe.h |  2 ++
 arch/x86/kernel/signal.c        | 16 +++++++++++++++-
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 81cf22398cd1..85abd9eb79d5 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -210,13 +210,18 @@ static void __user *get_sigframe(struct ksignal *ksig, struct pt_regs *regs,
 	sp = regs->sp;
 
 	/* This is the X/Open sanctioned signal stack switching.  */
-	if (ksig->ka.sa.sa_flags & SA_ONSTACK)
+	if (ksig->ka.sa.sa_flags & SA_ONSTACK) {
+		/* If the altstack might overflow, die with SIGSEGV: */
+		if (!altstack_size_ok(current))
+			return (void __user *)-1L;
+
 		sp = sigsp(sp, ksig);
 	/* This is the legacy signal stack switching. */
-	else if (regs->ss != __USER32_DS &&
+	} else if (regs->ss != __USER32_DS &&
 		!(ksig->ka.sa.sa_flags & SA_RESTORER) &&
-		 ksig->ka.sa.sa_restorer)
+		 ksig->ka.sa.sa_restorer) {
 		sp = (unsigned long) ksig->ka.sa.sa_restorer;
+	}
 
 	sp = fpu__alloc_mathframe(sp, 1, &fx_aligned, &math_size);
 	*fpstate = (struct _fpstate_32 __user *) sp;
diff --git a/arch/x86/include/asm/sigframe.h b/arch/x86/include/asm/sigframe.h
index ac77f3f90bc9..c9f2f9ace76f 100644
--- a/arch/x86/include/asm/sigframe.h
+++ b/arch/x86/include/asm/sigframe.h
@@ -106,6 +106,8 @@ struct rt_sigframe_x32 {
 #define SIZEOF_rt_sigframe_x32	0
 #endif
 
+bool altstack_size_ok(struct task_struct *tsk);
+
 void __init init_sigframe_size(void);
 
 #endif /* _ASM_X86_SIGFRAME_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 2a313d34ec49..c042236ef52e 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -246,8 +246,13 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs *regs, size_t frame_size,
 
 	/* This is the X/Open sanctioned signal stack switching.  */
 	if (ka->sa.sa_flags & SA_ONSTACK) {
-		if (sas_ss_flags(sp) == 0)
+		if (sas_ss_flags(sp) == 0) {
+			/* If the altstack might overflow, die with SIGSEGV: */
+			if (!altstack_size_ok(current))
+				return (void __user *)-1L;
+
 			sp = current->sas_ss_sp + current->sas_ss_size;
+		}
 	} else if (IS_ENABLED(CONFIG_X86_32) &&
 		   !onsigstack &&
 		   regs->ss != __USER_DS &&
@@ -713,6 +718,15 @@ unsigned long get_sigframe_size(void)
 	return max_frame_size;
 }
 
+bool altstack_size_ok(struct task_struct *tsk)
+{
+	/*
+	 * Can this task's sigaltstack accommodate the largest
+	 * signal frame the kernel might need?
+	 */
+	return (tsk->sas_ss_size >= max_frame_size);
+}
+
 static inline int is_ia32_compat_frame(struct ksignal *ksig)
 {
 	return IS_ENABLED(CONFIG_IA32_EMULATION) &&
-- 
2.17.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [RFC PATCH 4/4] selftest/x86/signal: Include test cases for validating sigaltstack
  2020-09-29 20:57 [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Chang S. Bae
                   ` (2 preceding siblings ...)
  2020-09-29 20:57 ` [RFC PATCH 3/4] x86/signal: Prevent an alternate stack overflow before a signal delivery Chang S. Bae
@ 2020-09-29 20:57 ` Chang S. Bae
  2020-10-05 13:45 ` [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Dave Martin
  4 siblings, 0 replies; 31+ messages in thread
From: Chang S. Bae @ 2020-09-29 20:57 UTC (permalink / raw)
  To: tglx, mingo, bp, luto, x86
  Cc: len.brown, dave.hansen, hjl.tools, Dave.Martin, mpe, tony.luck,
	ravi.v.shankar, libc-alpha, linux-arch, linux-api, linux-kernel,
	chang.seok.bae, linux-kselftest

The test measures the kernel's signal delivery with different (enough vs.
insufficient) stack sizes.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Cc: linux-kselftest@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 tools/testing/selftests/x86/Makefile      |   2 +-
 tools/testing/selftests/x86/sigaltstack.c | 126 ++++++++++++++++++++++
 2 files changed, 127 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/sigaltstack.c

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 6703c7906b71..e0c52e5ab49e 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -13,7 +13,7 @@ CAN_BUILD_WITH_NOPIE := $(shell ./check_cc.sh $(CC) trivial_program.c -no-pie)
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt test_mremap_vdso \
 			check_initial_reg_state sigreturn iopl ioperm \
 			test_vdso test_vsyscall mov_ss_trap \
-			syscall_arg_fault fsgsbase_restore
+			syscall_arg_fault fsgsbase_restore sigaltstack
 TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
diff --git a/tools/testing/selftests/x86/sigaltstack.c b/tools/testing/selftests/x86/sigaltstack.c
new file mode 100644
index 000000000000..353679df6901
--- /dev/null
+++ b/tools/testing/selftests/x86/sigaltstack.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#define _GNU_SOURCE
+#include <signal.h>
+#include <stdio.h>
+#include <stdbool.h>
+#include <string.h>
+#include <err.h>
+#include <errno.h>
+#include <limits.h>
+#include <sys/mman.h>
+#include <sys/auxv.h>
+#include <sys/prctl.h>
+#include <sys/resource.h>
+#include <setjmp.h>
+
+/* sigaltstack()-enforced minimum stack */
+#define ENFORCED_MINSIGSTKSZ	2048
+
+#ifndef AT_MINSIGSTKSZ
+#  define AT_MINSIGSTKSZ	51
+#endif
+
+static int nerrs;
+
+static bool sigalrm_expected;
+
+static unsigned long at_minstack_size;
+
+static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
+		       int flags)
+{
+	struct sigaction sa;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_sigaction = handler;
+	sa.sa_flags = SA_SIGINFO | flags;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+}
+
+static void clearhandler(int sig)
+{
+	struct sigaction sa;
+
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_handler = SIG_DFL;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+}
+
+static int setup_altstack(void *start, unsigned long size)
+{
+	stack_t ss;
+
+	memset(&ss, 0, sizeof(ss));
+	ss.ss_size = size;
+	ss.ss_sp = start;
+
+	return sigaltstack(&ss, NULL);
+}
+
+static jmp_buf jmpbuf;
+
+static void sigsegv(int sig, siginfo_t *info, void *ctx_void)
+{
+	if (sigalrm_expected) {
+		printf("[FAIL]\tSIGSEGV signal delivered.\n");
+		nerrs++;
+	} else {
+		printf("[OK]\tSIGSEGV signal expectedly delivered.\n");
+	}
+
+	siglongjmp(jmpbuf, 1);
+}
+
+static void sigalrm(int sig, siginfo_t *info, void *ctx_void)
+{
+	if (!sigalrm_expected) {
+		printf("[FAIL]\tSIGALRM sigal delivered.\n");
+		nerrs++;
+	} else {
+		printf("[OK]\tSIGALRM signal expectedly delivered.\n");
+	}
+}
+
+static void test_sigaltstack(void *altstack, unsigned long size)
+{
+	if (setup_altstack(altstack, size))
+		err(1, "sigaltstack()");
+
+	sigalrm_expected = (size > at_minstack_size) ? true : false;
+
+	sethandler(SIGSEGV, sigsegv, 0);
+	sethandler(SIGALRM, sigalrm, SA_ONSTACK);
+
+	if (sigsetjmp(jmpbuf, 1) == 0) {
+		printf("[RUN]\tTest an %s sigaltstack\n",
+		       sigalrm_expected ? "enough" : "insufficient");
+		raise(SIGALRM);
+	}
+
+	clearhandler(SIGALRM);
+	clearhandler(SIGSEGV);
+}
+
+int main(void)
+{
+	void *altstack;
+
+	at_minstack_size = getauxval(AT_MINSIGSTKSZ);
+
+	altstack = mmap(NULL, at_minstack_size + SIGSTKSZ, PROT_READ | PROT_WRITE,
+			MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
+	if (altstack == MAP_FAILED)
+		err(1, "mmap()");
+
+	if ((ENFORCED_MINSIGSTKSZ + 1) < at_minstack_size)
+		test_sigaltstack(altstack, ENFORCED_MINSIGSTKSZ + 1);
+
+	test_sigaltstack(altstack, at_minstack_size + SIGSTKSZ);
+
+	return nerrs == 0 ? 0 : 1;
+}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size
  2020-09-29 20:57 ` [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size Chang S. Bae
@ 2020-10-05 13:42   ` Dave Martin
  2020-10-06 17:45     ` Bae, Chang Seok
  0 siblings, 1 reply; 31+ messages in thread
From: Dave Martin @ 2020-10-05 13:42 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: tglx, mingo, bp, luto, x86, len.brown, dave.hansen, hjl.tools,
	mpe, tony.luck, ravi.v.shankar, libc-alpha, linux-arch,
	linux-api, linux-kernel

On Tue, Sep 29, 2020 at 01:57:43PM -0700, Chang S. Bae wrote:
> Signal frames do not have a fixed format and can vary in size when a number
> of things change: support XSAVE features, 32 vs. 64-bit apps. Add the code
> to support a runtime method for userspace to dynamically discover how large
> a signal stack needs to be.
> 
> Introduce a new variable, max_frame_size, and helper functions for the
> calculation to be used in a new user interface. Set max_frame_size to a
> system-wide worst-case value, instead of storing multiple app-specific
> values.
> 
> Locate the body of the helper function -- fpu__get_fpstate_sigframe_size()
> in fpu/signal.c for its relevance.
> 
> Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
> Reviewed-by: Len Brown <len.brown@intel.com>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  arch/x86/include/asm/fpu/signal.h |  2 ++
>  arch/x86/include/asm/sigframe.h   | 23 ++++++++++++++++
>  arch/x86/kernel/cpu/common.c      |  3 +++
>  arch/x86/kernel/fpu/signal.c      | 20 ++++++++++++++
>  arch/x86/kernel/signal.c          | 45 +++++++++++++++++++++++++++++++
>  5 files changed, 93 insertions(+)

[...]

> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index be0d7d4152ec..239a0b23a4b0 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -663,6 +663,51 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  	return 0;
>  }
>  
> +/*
> + * The FP state frame contains an XSAVE buffer which must be 64-byte aligned.
> + * If a signal frame starts at an unaligned address, extra space is required.
> + * This is the max alignment padding, conservatively.
> + */
> +#define MAX_XSAVE_PADDING	63UL
> +
> +/*
> + * The frame data is composed of the following areas and laid out as:
> + *
> + * -------------------------
> + * | alignment padding     |
> + * -------------------------
> + * | (f)xsave frame        |
> + * -------------------------
> + * | fsave header          |
> + * -------------------------
> + * | siginfo + ucontext    |
> + * -------------------------
> + */
> +
> +/* max_frame_size tells userspace the worst case signal stack size. */
> +static unsigned long __ro_after_init max_frame_size;
> +
> +void __init init_sigframe_size(void)
> +{
> +	/*
> +	 * Use the largest of possible structure formats. This might
> +	 * slightly oversize the frame for 64-bit apps.
> +	 */
> +
> +	if (IS_ENABLED(CONFIG_X86_32) ||
> +	    IS_ENABLED(CONFIG_IA32_EMULATION))
> +		max_frame_size = max((unsigned long)SIZEOF_sigframe_ia32,
> +				     (unsigned long)SIZEOF_rt_sigframe_ia32);
> +
> +	if (IS_ENABLED(CONFIG_X86_X32_ABI))
> +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe_x32);
> +
> +	if (IS_ENABLED(CONFIG_X86_64))
> +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe);
> +
> +	max_frame_size += fpu__get_fpstate_sigframe_size() + MAX_XSAVE_PADDING;

For arm64, we round the worst-case padding up by one.

I can't remember the full rationale for this, but it at least seemed a
bit weird to report a size that is not a multiple of the alignment.

I'm can't think of a clear argument as to why it really matters, though.

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-09-29 20:57 [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Chang S. Bae
                   ` (3 preceding siblings ...)
  2020-09-29 20:57 ` [RFC PATCH 4/4] selftest/x86/signal: Include test cases for validating sigaltstack Chang S. Bae
@ 2020-10-05 13:45 ` Dave Martin
  2020-10-05 21:17   ` H.J. Lu
  4 siblings, 1 reply; 31+ messages in thread
From: Dave Martin @ 2020-10-05 13:45 UTC (permalink / raw)
  To: Chang S. Bae
  Cc: tglx, mingo, bp, luto, x86, len.brown, dave.hansen, hjl.tools,
	mpe, tony.luck, ravi.v.shankar, libc-alpha, linux-arch,
	linux-api, linux-kernel

On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> During signal entry, the kernel pushes data onto the normal userspace
> stack. On x86, the data pushed onto the user stack includes XSAVE state,
> which has grown over time as new features and larger registers have been
> added to the architecture.
> 
> MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> constant indicates to userspace how much data the kernel expects to push on
> the user stack, [2][3].
> 
> However, this constant is much too small and does not reflect recent
> additions to the architecture. For instance, when AVX-512 states are in
> use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> 
> The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> cause user stack overflow when delivering a signal.
> 
> In this series, we suggest a couple of things:
> 1. Provide a variable minimum stack size to userspace, as a similar
>    approach to [5]
> 2. Avoid using a too-small alternate stack

I can't comment on the x86 specifics, but the approach followed in this
series does seem consistent with the way arm64 populates
AT_MINSIGSTKSZ.

I need to dig up my glibc hacks for providing a sysconf interface to
this...

Cheers
---Dave

> 
> [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/bits/sigstack.h;h=b9dca794da093dc4d41d39db9851d444e1b54d9b;hb=HEAD
> [2]: https://www.gnu.org/software/libc/manual/html_node/Signal-Stack.html
> [3]: https://man7.org/linux/man-pages/man2/sigaltstack.2.html
> [4]: https://bugzilla.kernel.org/show_bug.cgi?id=153531
> [5]: https://blog.linuxplumbersconf.org/2017/ocw/system/presentations/4671/original/plumbers-dm-2017.pdf
> 
> Chang S. Bae (4):
>   x86/signal: Introduce helpers to get the maximum signal frame size
>   x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ
>   x86/signal: Prevent an alternate stack overflow before a signal
>     delivery
>   selftest/x86/signal: Include test cases for validating sigaltstack
> 
>  arch/x86/ia32/ia32_signal.c               |  11 +-
>  arch/x86/include/asm/elf.h                |   4 +
>  arch/x86/include/asm/fpu/signal.h         |   2 +
>  arch/x86/include/asm/sigframe.h           |  25 +++++
>  arch/x86/include/uapi/asm/auxvec.h        |   6 +-
>  arch/x86/kernel/cpu/common.c              |   3 +
>  arch/x86/kernel/fpu/signal.c              |  20 ++++
>  arch/x86/kernel/signal.c                  |  66 +++++++++++-
>  tools/testing/selftests/x86/Makefile      |   2 +-
>  tools/testing/selftests/x86/sigaltstack.c | 126 ++++++++++++++++++++++
>  10 files changed, 258 insertions(+), 7 deletions(-)
>  create mode 100644 tools/testing/selftests/x86/sigaltstack.c
> 
> --
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-05 13:45 ` [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Dave Martin
@ 2020-10-05 21:17   ` H.J. Lu
  2020-10-06  9:25     ` Dave Martin
  0 siblings, 1 reply; 31+ messages in thread
From: H.J. Lu @ 2020-10-05 21:17 UTC (permalink / raw)
  To: Dave Martin
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > During signal entry, the kernel pushes data onto the normal userspace
> > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > which has grown over time as new features and larger registers have been
> > added to the architecture.
> >
> > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > constant indicates to userspace how much data the kernel expects to push on
> > the user stack, [2][3].
> >
> > However, this constant is much too small and does not reflect recent
> > additions to the architecture. For instance, when AVX-512 states are in
> > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> >
> > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > cause user stack overflow when delivering a signal.
> >
> > In this series, we suggest a couple of things:
> > 1. Provide a variable minimum stack size to userspace, as a similar
> >    approach to [5]
> > 2. Avoid using a too-small alternate stack
>
> I can't comment on the x86 specifics, but the approach followed in this
> series does seem consistent with the way arm64 populates
> AT_MINSIGSTKSZ.
>
> I need to dig up my glibc hacks for providing a sysconf interface to
> this...

Here is my proposal for glibc:

https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html

1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
is in use.


-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-05 21:17   ` H.J. Lu
@ 2020-10-06  9:25     ` Dave Martin
  2020-10-06 12:12       ` H.J. Lu
  0 siblings, 1 reply; 31+ messages in thread
From: Dave Martin @ 2020-10-06  9:25 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > During signal entry, the kernel pushes data onto the normal userspace
> > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > which has grown over time as new features and larger registers have been
> > > added to the architecture.
> > >
> > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > constant indicates to userspace how much data the kernel expects to push on
> > > the user stack, [2][3].
> > >
> > > However, this constant is much too small and does not reflect recent
> > > additions to the architecture. For instance, when AVX-512 states are in
> > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > >
> > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > cause user stack overflow when delivering a signal.
> > >
> > > In this series, we suggest a couple of things:
> > > 1. Provide a variable minimum stack size to userspace, as a similar
> > >    approach to [5]
> > > 2. Avoid using a too-small alternate stack
> >
> > I can't comment on the x86 specifics, but the approach followed in this
> > series does seem consistent with the way arm64 populates
> > AT_MINSIGSTKSZ.
> >
> > I need to dig up my glibc hacks for providing a sysconf interface to
> > this...
> 
> Here is my proposal for glibc:
> 
> https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html

Thanks for the link.

Are there patches yet?  I already had some hacks in the works, but I can
drop them if there's something already out there.


> 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.

Can we do this?  IIUC, this is an ABI break and carries the risk of
buffer overruns.

The reason for not simply increasing the kernel's MINSIGSTKSZ #define
(apart from the fact that it is rarely used, due to glibc's shadowing
definitions) was that userspace binaries will have baked in the old
value of the constant and may be making assumptions about it.

For example, the type (char [MINSIGSTKSZ]) changes if this #define
changes.  This could be a problem if an newly built library tries to
memcpy() or dump such an object defined by and old binary.
Bounds-checking and the stack sizes passed to things like sigaltstack()
and makecontext() could similarly go wrong.


> 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.

How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
discovery method is changing.  The meaning of the value is exactly the
same as before.

If we are going to rename it though, it could make sense to go for
something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".

The trouble with including "STKSZ" is that is sounds like a
recommendation for your stack size.  While the signal frame size is
relevant to picking a stack size, it's not the only thing to
consider.


Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
of a "recommended stack size" be abandoned?  glibc can at least make a
slightly more informed guess about suitable stack sizes than the kernel
(and glibc already has to guess anyway, in order to determine the
default thread stack size).


> 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> is in use.

Great if we can do it.  I was concerned that this might be
controversial.

Would this just be a recommendation, or can we enforce it somehow?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06  9:25     ` Dave Martin
@ 2020-10-06 12:12       ` H.J. Lu
  2020-10-06 15:18         ` H.J. Lu
  2020-10-06 15:25         ` Dave Martin
  0 siblings, 2 replies; 31+ messages in thread
From: H.J. Lu @ 2020-10-06 12:12 UTC (permalink / raw)
  To: Dave Martin
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > >
> > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > which has grown over time as new features and larger registers have been
> > > > added to the architecture.
> > > >
> > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > constant indicates to userspace how much data the kernel expects to push on
> > > > the user stack, [2][3].
> > > >
> > > > However, this constant is much too small and does not reflect recent
> > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > >
> > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > cause user stack overflow when delivering a signal.
> > > >
> > > > In this series, we suggest a couple of things:
> > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > >    approach to [5]
> > > > 2. Avoid using a too-small alternate stack
> > >
> > > I can't comment on the x86 specifics, but the approach followed in this
> > > series does seem consistent with the way arm64 populates
> > > AT_MINSIGSTKSZ.
> > >
> > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > this...
> >
> > Here is my proposal for glibc:
> >
> > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
>
> Thanks for the link.
>
> Are there patches yet?  I already had some hacks in the works, but I can
> drop them if there's something already out there.

I am working on it.

>
> > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
>
> Can we do this?  IIUC, this is an ABI break and carries the risk of
> buffer overruns.
>
> The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> (apart from the fact that it is rarely used, due to glibc's shadowing
> definitions) was that userspace binaries will have baked in the old
> value of the constant and may be making assumptions about it.
>
> For example, the type (char [MINSIGSTKSZ]) changes if this #define
> changes.  This could be a problem if an newly built library tries to
> memcpy() or dump such an object defined by and old binary.
> Bounds-checking and the stack sizes passed to things like sigaltstack()
> and makecontext() could similarly go wrong.

With my original proposal:

https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html

char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
constants:

https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html

>
> > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
>
> How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> discovery method is changing.  The meaning of the value is exactly the
> same as before.
>
> If we are going to rename it though, it could make sense to go for
> something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
>
> The trouble with including "STKSZ" is that is sounds like a
> recommendation for your stack size.  While the signal frame size is
> relevant to picking a stack size, it's not the only thing to
> consider.

The problem is that AT_MINSIGSTKSZ is the signal frame size used by
kernel.   The minimum stack size for a signal handler is more likely
AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
frame size used by kernel + 6KB for user application.

>
> Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> of a "recommended stack size" be abandoned?  glibc can at least make a
> slightly more informed guess about suitable stack sizes than the kernel
> (and glibc already has to guess anyway, in order to determine the
> default thread stack size).

Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
available.

>
> > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > is in use.
>
> Great if we can do it.  I was concerned that this might be
> controversial.
>
> Would this just be a recommendation, or can we enforce it somehow?

It is just an idea.  We need to move away from constant SIGSTKSZ and
MINSIGSTKSZ.

-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 12:12       ` H.J. Lu
@ 2020-10-06 15:18         ` H.J. Lu
  2020-10-06 15:43           ` Dave Martin
  2020-10-06 15:25         ` Dave Martin
  1 sibling, 1 reply; 31+ messages in thread
From: H.J. Lu @ 2020-10-06 15:18 UTC (permalink / raw)
  To: Dave Martin
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 6, 2020 at 5:12 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > >
> > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > > which has grown over time as new features and larger registers have been
> > > > > added to the architecture.
> > > > >
> > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > > constant indicates to userspace how much data the kernel expects to push on
> > > > > the user stack, [2][3].
> > > > >
> > > > > However, this constant is much too small and does not reflect recent
> > > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > > >
> > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > > cause user stack overflow when delivering a signal.
> > > > >
> > > > > In this series, we suggest a couple of things:
> > > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > > >    approach to [5]
> > > > > 2. Avoid using a too-small alternate stack
> > > >
> > > > I can't comment on the x86 specifics, but the approach followed in this
> > > > series does seem consistent with the way arm64 populates
> > > > AT_MINSIGSTKSZ.
> > > >
> > > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > > this...
> > >
> > > Here is my proposal for glibc:
> > >
> > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
> >
> > Thanks for the link.
> >
> > Are there patches yet?  I already had some hacks in the works, but I can
> > drop them if there's something already out there.
>
> I am working on it.
>
> >
> > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
> >
> > Can we do this?  IIUC, this is an ABI break and carries the risk of
> > buffer overruns.
> >
> > The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> > (apart from the fact that it is rarely used, due to glibc's shadowing
> > definitions) was that userspace binaries will have baked in the old
> > value of the constant and may be making assumptions about it.
> >
> > For example, the type (char [MINSIGSTKSZ]) changes if this #define
> > changes.  This could be a problem if an newly built library tries to
> > memcpy() or dump such an object defined by and old binary.
> > Bounds-checking and the stack sizes passed to things like sigaltstack()
> > and makecontext() could similarly go wrong.
>
> With my original proposal:
>
> https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html
>
> char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
> constants:
>
> https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html
>
> >
> > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
> >
> > How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> > discovery method is changing.  The meaning of the value is exactly the
> > same as before.
> >
> > If we are going to rename it though, it could make sense to go for
> > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
> >
> > The trouble with including "STKSZ" is that is sounds like a
> > recommendation for your stack size.  While the signal frame size is
> > relevant to picking a stack size, it's not the only thing to
> > consider.
>
> The problem is that AT_MINSIGSTKSZ is the signal frame size used by
> kernel.   The minimum stack size for a signal handler is more likely
> AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
> frame size used by kernel + 6KB for user application.
>
> >
> > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> > of a "recommended stack size" be abandoned?  glibc can at least make a
> > slightly more informed guess about suitable stack sizes than the kernel
> > (and glibc already has to guess anyway, in order to determine the
> > default thread stack size).
>
> Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
> available.
>
> >
> > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > > is in use.
> >
> > Great if we can do it.  I was concerned that this might be
> > controversial.
> >
> > Would this just be a recommendation, or can we enforce it somehow?
>
> It is just an idea.  We need to move away from constant SIGSTKSZ and
> MINSIGSTKSZ.
>

Here is the glibc patch:

https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ

AT_MINSIGSTKSZ should return the signal frame size used by kernel + 6KB
for user application.

-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 12:12       ` H.J. Lu
  2020-10-06 15:18         ` H.J. Lu
@ 2020-10-06 15:25         ` Dave Martin
  2020-10-06 15:33           ` Dave Hansen
  2020-10-06 15:34           ` H.J. Lu
  1 sibling, 2 replies; 31+ messages in thread
From: Dave Martin @ 2020-10-06 15:25 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote:
> On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > >
> > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > > which has grown over time as new features and larger registers have been
> > > > > added to the architecture.
> > > > >
> > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > > constant indicates to userspace how much data the kernel expects to push on
> > > > > the user stack, [2][3].
> > > > >
> > > > > However, this constant is much too small and does not reflect recent
> > > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > > >
> > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > > cause user stack overflow when delivering a signal.
> > > > >
> > > > > In this series, we suggest a couple of things:
> > > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > > >    approach to [5]
> > > > > 2. Avoid using a too-small alternate stack
> > > >
> > > > I can't comment on the x86 specifics, but the approach followed in this
> > > > series does seem consistent with the way arm64 populates
> > > > AT_MINSIGSTKSZ.
> > > >
> > > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > > this...
> > >
> > > Here is my proposal for glibc:
> > >
> > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
> >
> > Thanks for the link.
> >
> > Are there patches yet?  I already had some hacks in the works, but I can
> > drop them if there's something already out there.
> 
> I am working on it.

OK.  I may post something for discussion, but I'm happy for it to be
superseded by someone (i.e., other than me) who actually knows what
they're doing...

> >
> > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
> >
> > Can we do this?  IIUC, this is an ABI break and carries the risk of
> > buffer overruns.
> >
> > The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> > (apart from the fact that it is rarely used, due to glibc's shadowing
> > definitions) was that userspace binaries will have baked in the old
> > value of the constant and may be making assumptions about it.
> >
> > For example, the type (char [MINSIGSTKSZ]) changes if this #define
> > changes.  This could be a problem if an newly built library tries to
> > memcpy() or dump such an object defined by and old binary.
> > Bounds-checking and the stack sizes passed to things like sigaltstack()
> > and makecontext() could similarly go wrong.
> 
> With my original proposal:
> 
> https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html
> 
> char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
> constants:
> 
> https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html

Ah, I see.  But both still API and ABI breaks; moreover, declaraing an
array with size based on (MIN)SIGSTKSZ is not just reasonable, but the
obvious thing to do with this constant in many simple cases.  Such usage
is widespread, see:

 * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1


Your two approaches seem to trade off two different sources of buffer
overruns: undersized stacks versus ABI breaks across library boundaries.

Since undersized stack is by far the more familiar problem and we at
least have guard regions to help detect overruns, I'd vote to keep
MINSIGSTKSZ and SIGSTKSZ as-is, at least for now.

Or are people reporting real stack overruns on x86 today?


For arm64, we made large vectors on SVE opt-in, so that oversized signal
frames are not seen by default.  Would somethine similar be feasible on
x86?


> > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
> >
> > How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> > discovery method is changing.  The meaning of the value is exactly the
> > same as before.
> >
> > If we are going to rename it though, it could make sense to go for
> > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
> >
> > The trouble with including "STKSZ" is that is sounds like a
> > recommendation for your stack size.  While the signal frame size is
> > relevant to picking a stack size, it's not the only thing to
> > consider.
> 
> The problem is that AT_MINSIGSTKSZ is the signal frame size used by
> kernel.   The minimum stack size for a signal handler is more likely
> AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
> frame size used by kernel + 6KB for user application.

Ack; to be correct, you also need to take into account which signals may
be unmasked while running on this stack, and the stack requirements of
all their handlers.  Unfortunately, that's hard :(

What's your view on my naming suggesions?


> > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> > of a "recommended stack size" be abandoned?  glibc can at least make a
> > slightly more informed guess about suitable stack sizes than the kernel
> > (and glibc already has to guess anyway, in order to determine the
> > default thread stack size).
> 
> Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
> available.

In my code, I generate _SC_SIGSTKSZ as the equivalent of

	max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ)

which is >= the legacy value, and broadly reperesentative of the
relationship between MINSIGSTKSZ and SIGSTKSZ on most arches.


What do you think?


> > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > > is in use.
> >
> > Great if we can do it.  I was concerned that this might be
> > controversial.
> >
> > Would this just be a recommendation, or can we enforce it somehow?
> 
> It is just an idea.  We need to move away from constant SIGSTKSZ and
> MINSIGSTKSZ.

Totally agree with that.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 15:25         ` Dave Martin
@ 2020-10-06 15:33           ` Dave Hansen
  2020-10-06 17:00             ` Dave Martin
  2020-10-06 15:34           ` H.J. Lu
  1 sibling, 1 reply; 31+ messages in thread
From: Dave Hansen @ 2020-10-06 15:33 UTC (permalink / raw)
  To: Dave Martin, H.J. Lu
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Michael Ellerman, Tony Luck, Ravi V. Shankar, GNU C Library,
	linux-arch, Linux API, LKML

On 10/6/20 8:25 AM, Dave Martin wrote:
> Or are people reporting real stack overruns on x86 today?

We have real overruns.  We have ~2800 bytes of XSAVE (regisiter) state
mostly from AVX-512, and a 2048 byte MINSIGSTKSZ.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 15:25         ` Dave Martin
  2020-10-06 15:33           ` Dave Hansen
@ 2020-10-06 15:34           ` H.J. Lu
  2020-10-06 16:55             ` Dave Martin
  1 sibling, 1 reply; 31+ messages in thread
From: H.J. Lu @ 2020-10-06 15:34 UTC (permalink / raw)
  To: Dave Martin
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote:
> > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > >
> > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > >
> > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > > > which has grown over time as new features and larger registers have been
> > > > > > added to the architecture.
> > > > > >
> > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > > > constant indicates to userspace how much data the kernel expects to push on
> > > > > > the user stack, [2][3].
> > > > > >
> > > > > > However, this constant is much too small and does not reflect recent
> > > > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > > > >
> > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > > > cause user stack overflow when delivering a signal.
> > > > > >
> > > > > > In this series, we suggest a couple of things:
> > > > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > > > >    approach to [5]
> > > > > > 2. Avoid using a too-small alternate stack
> > > > >
> > > > > I can't comment on the x86 specifics, but the approach followed in this
> > > > > series does seem consistent with the way arm64 populates
> > > > > AT_MINSIGSTKSZ.
> > > > >
> > > > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > > > this...
> > > >
> > > > Here is my proposal for glibc:
> > > >
> > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
> > >
> > > Thanks for the link.
> > >
> > > Are there patches yet?  I already had some hacks in the works, but I can
> > > drop them if there's something already out there.
> >
> > I am working on it.
>
> OK.  I may post something for discussion, but I'm happy for it to be
> superseded by someone (i.e., other than me) who actually knows what
> they're doing...

Please see my previous email for my glibc patch:

https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ

> > >
> > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
> > >
> > > Can we do this?  IIUC, this is an ABI break and carries the risk of
> > > buffer overruns.
> > >
> > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> > > (apart from the fact that it is rarely used, due to glibc's shadowing
> > > definitions) was that userspace binaries will have baked in the old
> > > value of the constant and may be making assumptions about it.
> > >
> > > For example, the type (char [MINSIGSTKSZ]) changes if this #define
> > > changes.  This could be a problem if an newly built library tries to
> > > memcpy() or dump such an object defined by and old binary.
> > > Bounds-checking and the stack sizes passed to things like sigaltstack()
> > > and makecontext() could similarly go wrong.
> >
> > With my original proposal:
> >
> > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html
> >
> > char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
> > constants:
> >
> > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html
>
> Ah, I see.  But both still API and ABI breaks; moreover, declaraing an
> array with size based on (MIN)SIGSTKSZ is not just reasonable, but the
> obvious thing to do with this constant in many simple cases.  Such usage
> is widespread, see:
>
>  * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1
>
>
> Your two approaches seem to trade off two different sources of buffer
> overruns: undersized stacks versus ABI breaks across library boundaries.

We can't get everything we want.

> Since undersized stack is by far the more familiar problem and we at
> least have guard regions to help detect overruns, I'd vote to keep
> MINSIGSTKSZ and SIGSTKSZ as-is, at least for now.

Agree.

> Or are people reporting real stack overruns on x86 today?

I hope so.

>
> For arm64, we made large vectors on SVE opt-in, so that oversized signal
> frames are not seen by default.  Would somethine similar be feasible on
> x86?
>
>
> > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
> > >
> > > How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> > > discovery method is changing.  The meaning of the value is exactly the
> > > same as before.
> > >
> > > If we are going to rename it though, it could make sense to go for
> > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
> > >
> > > The trouble with including "STKSZ" is that is sounds like a
> > > recommendation for your stack size.  While the signal frame size is
> > > relevant to picking a stack size, it's not the only thing to
> > > consider.
> >
> > The problem is that AT_MINSIGSTKSZ is the signal frame size used by
> > kernel.   The minimum stack size for a signal handler is more likely
> > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
> > frame size used by kernel + 6KB for user application.
>
> Ack; to be correct, you also need to take into account which signals may
> be unmasked while running on this stack, and the stack requirements of
> all their handlers.  Unfortunately, that's hard :(
>
> What's your view on my naming suggesions?

I used _SC_MINSIGSTKSZ:

https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9

>
> > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> > > of a "recommended stack size" be abandoned?  glibc can at least make a
> > > slightly more informed guess about suitable stack sizes than the kernel
> > > (and glibc already has to guess anyway, in order to determine the
> > > default thread stack size).
> >
> > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
> > available.
>
> In my code, I generate _SC_SIGSTKSZ as the equivalent of
>
>         max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ)
>
> which is >= the legacy value, and broadly reperesentative of the
> relationship between MINSIGSTKSZ and SIGSTKSZ on most arches.
>
>
> What do you think?

sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases.

>
> > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > > > is in use.
> > >
> > > Great if we can do it.  I was concerned that this might be
> > > controversial.
> > >
> > > Would this just be a recommendation, or can we enforce it somehow?
> >
> > It is just an idea.  We need to move away from constant SIGSTKSZ and
> > MINSIGSTKSZ.
>
> Totally agree with that.
>

With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile
if the source assumes constant SIGSTKSZ or MINSIGSTKSZ.

-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 15:18         ` H.J. Lu
@ 2020-10-06 15:43           ` Dave Martin
  2020-10-06 16:52             ` H.J. Lu
  0 siblings, 1 reply; 31+ messages in thread
From: Dave Martin @ 2020-10-06 15:43 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 06, 2020 at 08:18:03AM -0700, H.J. Lu wrote:
> On Tue, Oct 6, 2020 at 5:12 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > >
> > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > >
> > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > > > which has grown over time as new features and larger registers have been
> > > > > > added to the architecture.
> > > > > >
> > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > > > constant indicates to userspace how much data the kernel expects to push on
> > > > > > the user stack, [2][3].
> > > > > >
> > > > > > However, this constant is much too small and does not reflect recent
> > > > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > > > >
> > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > > > cause user stack overflow when delivering a signal.
> > > > > >
> > > > > > In this series, we suggest a couple of things:
> > > > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > > > >    approach to [5]
> > > > > > 2. Avoid using a too-small alternate stack
> > > > >
> > > > > I can't comment on the x86 specifics, but the approach followed in this
> > > > > series does seem consistent with the way arm64 populates
> > > > > AT_MINSIGSTKSZ.
> > > > >
> > > > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > > > this...
> > > >
> > > > Here is my proposal for glibc:
> > > >
> > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
> > >
> > > Thanks for the link.
> > >
> > > Are there patches yet?  I already had some hacks in the works, but I can
> > > drop them if there's something already out there.
> >
> > I am working on it.
> >
> > >
> > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
> > >
> > > Can we do this?  IIUC, this is an ABI break and carries the risk of
> > > buffer overruns.
> > >
> > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> > > (apart from the fact that it is rarely used, due to glibc's shadowing
> > > definitions) was that userspace binaries will have baked in the old
> > > value of the constant and may be making assumptions about it.
> > >
> > > For example, the type (char [MINSIGSTKSZ]) changes if this #define
> > > changes.  This could be a problem if an newly built library tries to
> > > memcpy() or dump such an object defined by and old binary.
> > > Bounds-checking and the stack sizes passed to things like sigaltstack()
> > > and makecontext() could similarly go wrong.
> >
> > With my original proposal:
> >
> > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html
> >
> > char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
> > constants:
> >
> > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html
> >
> > >
> > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
> > >
> > > How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> > > discovery method is changing.  The meaning of the value is exactly the
> > > same as before.
> > >
> > > If we are going to rename it though, it could make sense to go for
> > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
> > >
> > > The trouble with including "STKSZ" is that is sounds like a
> > > recommendation for your stack size.  While the signal frame size is
> > > relevant to picking a stack size, it's not the only thing to
> > > consider.
> >
> > The problem is that AT_MINSIGSTKSZ is the signal frame size used by
> > kernel.   The minimum stack size for a signal handler is more likely
> > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
> > frame size used by kernel + 6KB for user application.
> >
> > >
> > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> > > of a "recommended stack size" be abandoned?  glibc can at least make a
> > > slightly more informed guess about suitable stack sizes than the kernel
> > > (and glibc already has to guess anyway, in order to determine the
> > > default thread stack size).
> >
> > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
> > available.
> >
> > >
> > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > > > is in use.
> > >
> > > Great if we can do it.  I was concerned that this might be
> > > controversial.
> > >
> > > Would this just be a recommendation, or can we enforce it somehow?
> >
> > It is just an idea.  We need to move away from constant SIGSTKSZ and
> > MINSIGSTKSZ.
> >
> 
> Here is the glibc patch:
> 
> https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ
> 
> AT_MINSIGSTKSZ should return the signal frame size used by kernel + 6KB
> for user application.

I'm not sure about the 6K here.

We a few fundamental parameters:

 * the actual maximum size of the kernel-allocated signal frame (which
   we'll report via AT_MINSIGSTKSZ);

 * the size of additional userspace stack frame required to execute the
   minimal (i.e., empty) signal handler.  (On AArch64, this is 0.  In
   environments where the C lirbrary calls signal handlers through some
   sort of wrapper, this would need to include the wrapper's stack
   needs also);

 * additional userspace stack needs for the actual signal handler code.
   This is completely unknown.


_SC_MINSIGSTKSZ (however named) should certainly include the first two,
but I'm not sure about the third.  It will at least be architecture-
dependent.


This is one reason why I still favor having more than one constant here:
the fundamental system properties should be discoverable for software
that knows how to calculate its own stack needs accurately.

Since calculating stack needs is hard and most software doesn't bother
to do it, we could also give a "recommended" stack size which
incorporates a guess of typical handler stack needs (similarly to the
legacy SIGSTKSZ constant), but I think that should be a separate
parameter.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 15:43           ` Dave Martin
@ 2020-10-06 16:52             ` H.J. Lu
  0 siblings, 0 replies; 31+ messages in thread
From: H.J. Lu @ 2020-10-06 16:52 UTC (permalink / raw)
  To: Dave Martin
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 6, 2020 at 8:43 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Tue, Oct 06, 2020 at 08:18:03AM -0700, H.J. Lu wrote:
> > On Tue, Oct 6, 2020 at 5:12 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > >
> > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > >
> > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > > >
> > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > > > > which has grown over time as new features and larger registers have been
> > > > > > > added to the architecture.
> > > > > > >
> > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > > > > constant indicates to userspace how much data the kernel expects to push on
> > > > > > > the user stack, [2][3].
> > > > > > >
> > > > > > > However, this constant is much too small and does not reflect recent
> > > > > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > > > > >
> > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > > > > cause user stack overflow when delivering a signal.
> > > > > > >
> > > > > > > In this series, we suggest a couple of things:
> > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > > > > >    approach to [5]
> > > > > > > 2. Avoid using a too-small alternate stack
> > > > > >
> > > > > > I can't comment on the x86 specifics, but the approach followed in this
> > > > > > series does seem consistent with the way arm64 populates
> > > > > > AT_MINSIGSTKSZ.
> > > > > >
> > > > > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > > > > this...
> > > > >
> > > > > Here is my proposal for glibc:
> > > > >
> > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
> > > >
> > > > Thanks for the link.
> > > >
> > > > Are there patches yet?  I already had some hacks in the works, but I can
> > > > drop them if there's something already out there.
> > >
> > > I am working on it.
> > >
> > > >
> > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
> > > >
> > > > Can we do this?  IIUC, this is an ABI break and carries the risk of
> > > > buffer overruns.
> > > >
> > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> > > > (apart from the fact that it is rarely used, due to glibc's shadowing
> > > > definitions) was that userspace binaries will have baked in the old
> > > > value of the constant and may be making assumptions about it.
> > > >
> > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define
> > > > changes.  This could be a problem if an newly built library tries to
> > > > memcpy() or dump such an object defined by and old binary.
> > > > Bounds-checking and the stack sizes passed to things like sigaltstack()
> > > > and makecontext() could similarly go wrong.
> > >
> > > With my original proposal:
> > >
> > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html
> > >
> > > char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
> > > constants:
> > >
> > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html
> > >
> > > >
> > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
> > > >
> > > > How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> > > > discovery method is changing.  The meaning of the value is exactly the
> > > > same as before.
> > > >
> > > > If we are going to rename it though, it could make sense to go for
> > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
> > > >
> > > > The trouble with including "STKSZ" is that is sounds like a
> > > > recommendation for your stack size.  While the signal frame size is
> > > > relevant to picking a stack size, it's not the only thing to
> > > > consider.
> > >
> > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by
> > > kernel.   The minimum stack size for a signal handler is more likely
> > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
> > > frame size used by kernel + 6KB for user application.
> > >
> > > >
> > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> > > > of a "recommended stack size" be abandoned?  glibc can at least make a
> > > > slightly more informed guess about suitable stack sizes than the kernel
> > > > (and glibc already has to guess anyway, in order to determine the
> > > > default thread stack size).
> > >
> > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
> > > available.
> > >
> > > >
> > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > > > > is in use.
> > > >
> > > > Great if we can do it.  I was concerned that this might be
> > > > controversial.
> > > >
> > > > Would this just be a recommendation, or can we enforce it somehow?
> > >
> > > It is just an idea.  We need to move away from constant SIGSTKSZ and
> > > MINSIGSTKSZ.
> > >
> >
> > Here is the glibc patch:
> >
> > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ
> >
> > AT_MINSIGSTKSZ should return the signal frame size used by kernel + 6KB
> > for user application.
>
> I'm not sure about the 6K here.

6KB is something I made up.

> We a few fundamental parameters:
>
>  * the actual maximum size of the kernel-allocated signal frame (which
>    we'll report via AT_MINSIGSTKSZ);

Agree.

>  * the size of additional userspace stack frame required to execute the
>    minimal (i.e., empty) signal handler.  (On AArch64, this is 0.  In

It is also 0 for x86.

>    environments where the C lirbrary calls signal handlers through some
>    sort of wrapper, this would need to include the wrapper's stack
>    needs also);

>  * additional userspace stack needs for the actual signal handler code.
>    This is completely unknown.

That is 6KB I made up.

>
> _SC_MINSIGSTKSZ (however named) should certainly include the first two,
> but I'm not sure about the third.  It will at least be architecture-
> dependent.
>
>
> This is one reason why I still favor having more than one constant here:
> the fundamental system properties should be discoverable for software
> that knows how to calculate its own stack needs accurately.
>
> Since calculating stack needs is hard and most software doesn't bother
> to do it, we could also give a "recommended" stack size which
> incorporates a guess of typical handler stack needs (similarly to the
> legacy SIGSTKSZ constant), but I think that should be a separate
> parameter.

Sounds reasonable.  We can have _SC_MINSIGSTKSZ and
_SC_SIGSTKSZ which is _SC_MINSIGSTKSZ + 6KB (or some
other value).

-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 15:34           ` H.J. Lu
@ 2020-10-06 16:55             ` Dave Martin
  2020-10-06 17:44               ` H.J. Lu
  0 siblings, 1 reply; 31+ messages in thread
From: Dave Martin @ 2020-10-06 16:55 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote:
> On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote:
> > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > >
> > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > > >
> > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > > > > which has grown over time as new features and larger registers have been
> > > > > > > added to the architecture.
> > > > > > >
> > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > > > > constant indicates to userspace how much data the kernel expects to push on
> > > > > > > the user stack, [2][3].
> > > > > > >
> > > > > > > However, this constant is much too small and does not reflect recent
> > > > > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > > > > >
> > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > > > > cause user stack overflow when delivering a signal.
> > > > > > >
> > > > > > > In this series, we suggest a couple of things:
> > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > > > > >    approach to [5]
> > > > > > > 2. Avoid using a too-small alternate stack
> > > > > >
> > > > > > I can't comment on the x86 specifics, but the approach followed in this
> > > > > > series does seem consistent with the way arm64 populates
> > > > > > AT_MINSIGSTKSZ.
> > > > > >
> > > > > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > > > > this...
> > > > >
> > > > > Here is my proposal for glibc:
> > > > >
> > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
> > > >
> > > > Thanks for the link.
> > > >
> > > > Are there patches yet?  I already had some hacks in the works, but I can
> > > > drop them if there's something already out there.
> > >
> > > I am working on it.
> >
> > OK.  I may post something for discussion, but I'm happy for it to be
> > superseded by someone (i.e., other than me) who actually knows what
> > they're doing...
> 
> Please see my previous email for my glibc patch:
> 
> https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ
> 
> > > >
> > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
> > > >
> > > > Can we do this?  IIUC, this is an ABI break and carries the risk of
> > > > buffer overruns.
> > > >
> > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> > > > (apart from the fact that it is rarely used, due to glibc's shadowing
> > > > definitions) was that userspace binaries will have baked in the old
> > > > value of the constant and may be making assumptions about it.
> > > >
> > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define
> > > > changes.  This could be a problem if an newly built library tries to
> > > > memcpy() or dump such an object defined by and old binary.
> > > > Bounds-checking and the stack sizes passed to things like sigaltstack()
> > > > and makecontext() could similarly go wrong.
> > >
> > > With my original proposal:
> > >
> > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html
> > >
> > > char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
> > > constants:
> > >
> > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html
> >
> > Ah, I see.  But both still API and ABI breaks; moreover, declaraing an
> > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the
> > obvious thing to do with this constant in many simple cases.  Such usage
> > is widespread, see:
> >
> >  * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1
> >
> >
> > Your two approaches seem to trade off two different sources of buffer
> > overruns: undersized stacks versus ABI breaks across library boundaries.
> 
> We can't get everything we want.
> 
> > Since undersized stack is by far the more familiar problem and we at
> > least have guard regions to help detect overruns, I'd vote to keep
> > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now.
> 
> Agree.
> 
> > Or are people reporting real stack overruns on x86 today?
> 
> I hope so.
> 
> >
> > For arm64, we made large vectors on SVE opt-in, so that oversized signal
> > frames are not seen by default.  Would somethine similar be feasible on
> > x86?
> >
> >
> > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
> > > >
> > > > How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> > > > discovery method is changing.  The meaning of the value is exactly the
> > > > same as before.
> > > >
> > > > If we are going to rename it though, it could make sense to go for
> > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
> > > >
> > > > The trouble with including "STKSZ" is that is sounds like a
> > > > recommendation for your stack size.  While the signal frame size is
> > > > relevant to picking a stack size, it's not the only thing to
> > > > consider.
> > >
> > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by
> > > kernel.   The minimum stack size for a signal handler is more likely
> > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
> > > frame size used by kernel + 6KB for user application.
> >
> > Ack; to be correct, you also need to take into account which signals may
> > be unmasked while running on this stack, and the stack requirements of
> > all their handlers.  Unfortunately, that's hard :(
> >
> > What's your view on my naming suggesions?
> 
> I used _SC_MINSIGSTKSZ:
> 
> https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9

Apologies, I missed that.

Otherwise, the changes look much as I would expect, except for the
"6K for user program" thing.  This is strictly not included in the
legacy MINSIGSTKSZ.

> 
> >
> > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> > > > of a "recommended stack size" be abandoned?  glibc can at least make a
> > > > slightly more informed guess about suitable stack sizes than the kernel
> > > > (and glibc already has to guess anyway, in order to determine the
> > > > default thread stack size).
> > >
> > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
> > > available.
> >
> > In my code, I generate _SC_SIGSTKSZ as the equivalent of
> >
> >         max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ)
> >
> > which is >= the legacy value, and broadly reperesentative of the
> > relationship between MINSIGSTKSZ and SIGSTKSZ on most arches.
> >
> >
> > What do you think?
> 
> sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases.

Why, though?

MINSIGSTKSZ is not specified to be usable as-is for any case whatsoever.


Software that calculates its own needs to know the actual system values,
not estimates based on guesses about how much stack a typical program
might need if it were recompiled for x86.

This doesn't mean we can't have a generic suggested value that's suitable
for common scenarios (like SIGSTKSZ), but if we do then I think it
should be a separate constant.


> > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > > > > is in use.
> > > >
> > > > Great if we can do it.  I was concerned that this might be
> > > > controversial.
> > > >
> > > > Would this just be a recommendation, or can we enforce it somehow?
> > >
> > > It is just an idea.  We need to move away from constant SIGSTKSZ and
> > > MINSIGSTKSZ.
> >
> > Totally agree with that.
> >
> 
> With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile
> if the source assumes constant SIGSTKSZ or MINSIGSTKSZ.

Ah yes, I see.  That's a sensible precaution.

Is it accepted in general that defining different feature test macros
can lead to ABI incompatibilities?

I have thought that building a shared library with _GNU_SOURCE (say)
doesn't mean that a program that loads that library must also be built
with _GNU_SOURCE.  For one thing, that's hard to police.


However, there are already combinations that could break, e.g., mixing
-D_FILE_OFFSET_BITS=64 with -D_FILE_OFFSET_BITS=32 would be broken if
this define changes off_t.


So, maybe having _SC_MINSIGSTKSZ_SOURCE break things in this way is an
acceptable compromise.  Interfaces that depend on the value of
MINSIGSTKSZ or SIGSTKSZ are possible, but probably rare in practice --
I don't know of a specific example.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 15:33           ` Dave Hansen
@ 2020-10-06 17:00             ` Dave Martin
  2020-10-06 18:21               ` Florian Weimer
  2020-10-06 18:30               ` Dave Hansen
  0 siblings, 2 replies; 31+ messages in thread
From: Dave Martin @ 2020-10-06 17:00 UTC (permalink / raw)
  To: Dave Hansen
  Cc: H.J. Lu, Chang S. Bae, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Andy Lutomirski, the arch/x86 maintainers,
	Len Brown, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote:
> On 10/6/20 8:25 AM, Dave Martin wrote:
> > Or are people reporting real stack overruns on x86 today?
> 
> We have real overruns.  We have ~2800 bytes of XSAVE (regisiter) state
> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ.

Right.  Out of interest, do you believe that's a direct consequence of
the larger kernel-generated signal frame, or does the expansion of
userspace stack frames play a role too?

In practice software just assumes SIGSTKSZ and then ignores the problem
until / unless an actual stack overflow is seen.

There's probably a lot of software out there whose stack is
theoretically too small even without AVX-512 etc. in the mix, especially
when considering the possibility of nested signals...

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 16:55             ` Dave Martin
@ 2020-10-06 17:44               ` H.J. Lu
  2020-10-07 10:47                 ` Dave Martin
  0 siblings, 1 reply; 31+ messages in thread
From: H.J. Lu @ 2020-10-06 17:44 UTC (permalink / raw)
  To: Dave Martin
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 6, 2020 at 9:55 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote:
> > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > >
> > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote:
> > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > >
> > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > > > >
> > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > > > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > > > > > which has grown over time as new features and larger registers have been
> > > > > > > > added to the architecture.
> > > > > > > >
> > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > > > > > constant indicates to userspace how much data the kernel expects to push on
> > > > > > > > the user stack, [2][3].
> > > > > > > >
> > > > > > > > However, this constant is much too small and does not reflect recent
> > > > > > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > > > > > >
> > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > > > > > cause user stack overflow when delivering a signal.
> > > > > > > >
> > > > > > > > In this series, we suggest a couple of things:
> > > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > > > > > >    approach to [5]
> > > > > > > > 2. Avoid using a too-small alternate stack
> > > > > > >
> > > > > > > I can't comment on the x86 specifics, but the approach followed in this
> > > > > > > series does seem consistent with the way arm64 populates
> > > > > > > AT_MINSIGSTKSZ.
> > > > > > >
> > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > > > > > this...
> > > > > >
> > > > > > Here is my proposal for glibc:
> > > > > >
> > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
> > > > >
> > > > > Thanks for the link.
> > > > >
> > > > > Are there patches yet?  I already had some hacks in the works, but I can
> > > > > drop them if there's something already out there.
> > > >
> > > > I am working on it.
> > >
> > > OK.  I may post something for discussion, but I'm happy for it to be
> > > superseded by someone (i.e., other than me) who actually knows what
> > > they're doing...
> >
> > Please see my previous email for my glibc patch:
> >
> > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ
> >
> > > > >
> > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
> > > > >
> > > > > Can we do this?  IIUC, this is an ABI break and carries the risk of
> > > > > buffer overruns.
> > > > >
> > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> > > > > (apart from the fact that it is rarely used, due to glibc's shadowing
> > > > > definitions) was that userspace binaries will have baked in the old
> > > > > value of the constant and may be making assumptions about it.
> > > > >
> > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define
> > > > > changes.  This could be a problem if an newly built library tries to
> > > > > memcpy() or dump such an object defined by and old binary.
> > > > > Bounds-checking and the stack sizes passed to things like sigaltstack()
> > > > > and makecontext() could similarly go wrong.
> > > >
> > > > With my original proposal:
> > > >
> > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html
> > > >
> > > > char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
> > > > constants:
> > > >
> > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html
> > >
> > > Ah, I see.  But both still API and ABI breaks; moreover, declaraing an
> > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the
> > > obvious thing to do with this constant in many simple cases.  Such usage
> > > is widespread, see:
> > >
> > >  * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1
> > >
> > >
> > > Your two approaches seem to trade off two different sources of buffer
> > > overruns: undersized stacks versus ABI breaks across library boundaries.
> >
> > We can't get everything we want.
> >
> > > Since undersized stack is by far the more familiar problem and we at
> > > least have guard regions to help detect overruns, I'd vote to keep
> > > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now.
> >
> > Agree.
> >
> > > Or are people reporting real stack overruns on x86 today?
> >
> > I hope so.
> >
> > >
> > > For arm64, we made large vectors on SVE opt-in, so that oversized signal
> > > frames are not seen by default.  Would somethine similar be feasible on
> > > x86?
> > >
> > >
> > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
> > > > >
> > > > > How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> > > > > discovery method is changing.  The meaning of the value is exactly the
> > > > > same as before.
> > > > >
> > > > > If we are going to rename it though, it could make sense to go for
> > > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
> > > > >
> > > > > The trouble with including "STKSZ" is that is sounds like a
> > > > > recommendation for your stack size.  While the signal frame size is
> > > > > relevant to picking a stack size, it's not the only thing to
> > > > > consider.
> > > >
> > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by
> > > > kernel.   The minimum stack size for a signal handler is more likely
> > > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
> > > > frame size used by kernel + 6KB for user application.
> > >
> > > Ack; to be correct, you also need to take into account which signals may
> > > be unmasked while running on this stack, and the stack requirements of
> > > all their handlers.  Unfortunately, that's hard :(
> > >
> > > What's your view on my naming suggesions?
> >
> > I used _SC_MINSIGSTKSZ:
> >
> > https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9
>
> Apologies, I missed that.
>
> Otherwise, the changes look much as I would expect, except for the
> "6K for user program" thing.  This is strictly not included in the
> legacy MINSIGSTKSZ.
>
> >
> > >
> > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> > > > > of a "recommended stack size" be abandoned?  glibc can at least make a
> > > > > slightly more informed guess about suitable stack sizes than the kernel
> > > > > (and glibc already has to guess anyway, in order to determine the
> > > > > default thread stack size).
> > > >
> > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
> > > > available.
> > >
> > > In my code, I generate _SC_SIGSTKSZ as the equivalent of
> > >
> > >         max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ)
> > >
> > > which is >= the legacy value, and broadly reperesentative of the
> > > relationship between MINSIGSTKSZ and SIGSTKSZ on most arches.
> > >
> > >
> > > What do you think?
> >
> > sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases.
>
> Why, though?
>
> MINSIGSTKSZ is not specified to be usable as-is for any case whatsoever.
>
>
> Software that calculates its own needs to know the actual system values,
> not estimates based on guesses about how much stack a typical program
> might need if it were recompiled for x86.
>
> This doesn't mean we can't have a generic suggested value that's suitable
> for common scenarios (like SIGSTKSZ), but if we do then I think it
> should be a separate constant.

I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ.
_SC_MINSIGSTKSZ is  the minimum signal stack size from AT_MINSIGSTKSZ,
which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value
of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application.

>
> > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > > > > > is in use.
> > > > >
> > > > > Great if we can do it.  I was concerned that this might be
> > > > > controversial.
> > > > >
> > > > > Would this just be a recommendation, or can we enforce it somehow?
> > > >
> > > > It is just an idea.  We need to move away from constant SIGSTKSZ and
> > > > MINSIGSTKSZ.
> > >
> > > Totally agree with that.
> > >
> >
> > With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile
> > if the source assumes constant SIGSTKSZ or MINSIGSTKSZ.
>
> Ah yes, I see.  That's a sensible precaution.
>
> Is it accepted in general that defining different feature test macros
> can lead to ABI incompatibilities?
>
> I have thought that building a shared library with _GNU_SOURCE (say)
> doesn't mean that a program that loads that library must also be built
> with _GNU_SOURCE.  For one thing, that's hard to police.
>
>
> However, there are already combinations that could break, e.g., mixing
> -D_FILE_OFFSET_BITS=64 with -D_FILE_OFFSET_BITS=32 would be broken if
> this define changes off_t.
>
>
> So, maybe having _SC_MINSIGSTKSZ_SOURCE break things in this way is an
> acceptable compromise.  Interfaces that depend on the value of
> MINSIGSTKSZ or SIGSTKSZ are possible, but probably rare in practice --
> I don't know of a specific example.
>

I changed it to _SC_SIGSTKSZ_SOURCE:

https://gitlab.com/x86-glibc/glibc/-/commit/41d5e6b31025721590f12d5aa543eb0bc53ce263

#ifdef __USE_SC_SIGSTKSZ
# include <unistd.h>
/* Minimum stack size for a signal handler: sysconf (SC_SIGSTKSZ).  */
# undef MINSIGSTKSZ
# define MINSIGSTKSZ sysconf (_SC_SIGSTKSZ)
/* System default stack size for a signal handler: MINSIGSTKSZ.  */
# undef SIGSTKSZ
# define SIGSTKSZ MINSIGSTKSZ
#endif

Compilation will fail if the source assumes constant MINSIGSTKSZ or
SIGSTKSZ.

-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size
  2020-10-05 13:42   ` Dave Martin
@ 2020-10-06 17:45     ` Bae, Chang Seok
  2020-10-07 10:05       ` Dave Martin
  0 siblings, 1 reply; 31+ messages in thread
From: Bae, Chang Seok @ 2020-10-06 17:45 UTC (permalink / raw)
  To: Dave.Martin
  Cc: mingo, linux-kernel, hjl.tools, bp, tglx, linux-api, libc-alpha,
	x86, luto, Shankar, Ravi V, Hansen, Dave, Luck, Tony, Brown, Len,
	linux-arch, mpe

On Mon, 2020-10-05 at 14:42 +0100, Dave Martin wrote:
> On Tue, Sep 29, 2020 at 01:57:43PM -0700, Chang S. Bae wrote:
> > 
> > +/*
> > + * The FP state frame contains an XSAVE buffer which must be 64-byte aligned.
> > + * If a signal frame starts at an unaligned address, extra space is required.
> > + * This is the max alignment padding, conservatively.
> > + */
> > +#define MAX_XSAVE_PADDING	63UL
> > +
> > +/*
> > + * The frame data is composed of the following areas and laid out as:
> > + *
> > + * -------------------------
> > + * | alignment padding     |
> > + * -------------------------
> > + * | (f)xsave frame        |
> > + * -------------------------
> > + * | fsave header          |
> > + * -------------------------
> > + * | siginfo + ucontext    |
> > + * -------------------------
> > + */
> > +
> > +/* max_frame_size tells userspace the worst case signal stack size. */
> > +static unsigned long __ro_after_init max_frame_size;
> > +
> > +void __init init_sigframe_size(void)
> > +{
> > +	/*
> > +	 * Use the largest of possible structure formats. This might
> > +	 * slightly oversize the frame for 64-bit apps.
> > +	 */
> > +
> > +	if (IS_ENABLED(CONFIG_X86_32) ||
> > +	    IS_ENABLED(CONFIG_IA32_EMULATION))
> > +		max_frame_size = max((unsigned long)SIZEOF_sigframe_ia32,
> > +				     (unsigned long)SIZEOF_rt_sigframe_ia32);
> > +
> > +	if (IS_ENABLED(CONFIG_X86_X32_ABI))
> > +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe_x32);
> > +
> > +	if (IS_ENABLED(CONFIG_X86_64))
> > +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe);
> > +
> > +	max_frame_size += fpu__get_fpstate_sigframe_size() + MAX_XSAVE_PADDING;
> 
> For arm64, we round the worst-case padding up by one.
> 

Yeah, I saw that. The ARM code adds the max padding, too:

	signal_minsigstksz = sigframe_size(&user) +
		round_up(sizeof(struct frame_record), 16) +
		16; /* max alignment padding */


https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kernel/signal.c#n973

> I can't remember the full rationale for this, but it at least seemed a
> bit weird to report a size that is not a multiple of the alignment.
> 

Because the last state size of XSAVE may not be 64B aligned, the (reported)
sum of xstate size here does not guarantee 64B alignment.

> I'm can't think of a clear argument as to why it really matters, though.

We care about the start of XSAVE buffer for the XSAVE instructions, to be
64B-aligned.

Thanks,
Chang

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 17:00             ` Dave Martin
@ 2020-10-06 18:21               ` Florian Weimer
  2020-10-07 10:19                 ` Dave Martin
  2020-10-06 18:30               ` Dave Hansen
  1 sibling, 1 reply; 31+ messages in thread
From: Florian Weimer @ 2020-10-06 18:21 UTC (permalink / raw)
  To: Dave Martin via Libc-alpha
  Cc: Dave Hansen, Dave Martin, linux-arch, Tony Luck, Ravi V. Shankar,
	Len Brown, Chang S. Bae, the arch/x86 maintainers, LKML,
	Andy Lutomirski, Linux API, Thomas Gleixner, Borislav Petkov,
	Ingo Molnar

* Dave Martin via Libc-alpha:

> On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote:
>> On 10/6/20 8:25 AM, Dave Martin wrote:
>> > Or are people reporting real stack overruns on x86 today?
>> 
>> We have real overruns.  We have ~2800 bytes of XSAVE (regisiter) state
>> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ.
>
> Right.  Out of interest, do you believe that's a direct consequence of
> the larger kernel-generated signal frame, or does the expansion of
> userspace stack frames play a role too?

I must say that I do not quite understand this question.

32 64-*byte* registers simply need 2048 bytes of storage space worst
case, there is really no way around that.

> In practice software just assumes SIGSTKSZ and then ignores the problem
> until / unless an actual stack overflow is seen.
>
> There's probably a lot of software out there whose stack is
> theoretically too small even without AVX-512 etc. in the mix, especially
> when considering the possibility of nested signals...

That is certainly true.  We have seen problems with ntpd, which
requested a 16 KiB stack, at a time when there were various deductions
from the stack size, and since the glibc dynamic loader also uses XSAVE,
ntpd exceeded the remaining stack space.  But in this case, we just
fudged the stack size computation in pthread_create and made it less
likely that the dynamic loader was activated, which largely worked
around this particular problem.  For MINSIGSTKSZ, we just don't have
this option because it's simply too small in the first place.

I don't immediately recall a bug due to SIGSTKSZ being too small.  The
test cases I wrote for this were all artificial, to raise awareness of
this issue (applications treating these as recommended values, rather
than minimum value to avoid immediately sigaltstack/phtread_create
failures, same issue with PTHREAD_STACK_MIN).

Thanks,
Florian
-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 17:00             ` Dave Martin
  2020-10-06 18:21               ` Florian Weimer
@ 2020-10-06 18:30               ` Dave Hansen
  2020-10-07 10:20                 ` Dave Martin
  1 sibling, 1 reply; 31+ messages in thread
From: Dave Hansen @ 2020-10-06 18:30 UTC (permalink / raw)
  To: Dave Martin
  Cc: H.J. Lu, Chang S. Bae, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Andy Lutomirski, the arch/x86 maintainers,
	Len Brown, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On 10/6/20 10:00 AM, Dave Martin wrote:
> On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote:
>> On 10/6/20 8:25 AM, Dave Martin wrote:
>>> Or are people reporting real stack overruns on x86 today?
>> We have real overruns.  We have ~2800 bytes of XSAVE (regisiter) state
>> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ.
> Right.  Out of interest, do you believe that's a direct consequence of
> the larger kernel-generated signal frame, or does the expansion of
> userspace stack frames play a role too?

The kernel-generated signal frame is entirely responsible for the ~2800
bytes that I'm talking about.

I'm sure there are some systems where userspace plays a role, but those
are much less of a worry at the moment, since the kernel-induced
overflows mean an instant crash that userspace has no recourse for.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size
  2020-10-06 17:45     ` Bae, Chang Seok
@ 2020-10-07 10:05       ` Dave Martin
  2020-10-08 22:43         ` Bae, Chang Seok
  0 siblings, 1 reply; 31+ messages in thread
From: Dave Martin @ 2020-10-07 10:05 UTC (permalink / raw)
  To: Bae, Chang Seok
  Cc: mingo, linux-kernel, hjl.tools, bp, tglx, linux-api, libc-alpha,
	x86, luto, Shankar, Ravi V, Hansen, Dave, Luck, Tony, Brown, Len,
	linux-arch, mpe

On Tue, Oct 06, 2020 at 05:45:24PM +0000, Bae, Chang Seok wrote:
> On Mon, 2020-10-05 at 14:42 +0100, Dave Martin wrote:
> > On Tue, Sep 29, 2020 at 01:57:43PM -0700, Chang S. Bae wrote:
> > > 
> > > +/*
> > > + * The FP state frame contains an XSAVE buffer which must be 64-byte aligned.
> > > + * If a signal frame starts at an unaligned address, extra space is required.
> > > + * This is the max alignment padding, conservatively.
> > > + */
> > > +#define MAX_XSAVE_PADDING	63UL
> > > +
> > > +/*
> > > + * The frame data is composed of the following areas and laid out as:
> > > + *
> > > + * -------------------------
> > > + * | alignment padding     |
> > > + * -------------------------
> > > + * | (f)xsave frame        |
> > > + * -------------------------
> > > + * | fsave header          |
> > > + * -------------------------
> > > + * | siginfo + ucontext    |
> > > + * -------------------------
> > > + */
> > > +
> > > +/* max_frame_size tells userspace the worst case signal stack size. */
> > > +static unsigned long __ro_after_init max_frame_size;
> > > +
> > > +void __init init_sigframe_size(void)
> > > +{
> > > +	/*
> > > +	 * Use the largest of possible structure formats. This might
> > > +	 * slightly oversize the frame for 64-bit apps.
> > > +	 */
> > > +
> > > +	if (IS_ENABLED(CONFIG_X86_32) ||
> > > +	    IS_ENABLED(CONFIG_IA32_EMULATION))
> > > +		max_frame_size = max((unsigned long)SIZEOF_sigframe_ia32,
> > > +				     (unsigned long)SIZEOF_rt_sigframe_ia32);
> > > +
> > > +	if (IS_ENABLED(CONFIG_X86_X32_ABI))
> > > +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe_x32);
> > > +
> > > +	if (IS_ENABLED(CONFIG_X86_64))
> > > +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe);
> > > +
> > > +	max_frame_size += fpu__get_fpstate_sigframe_size() + MAX_XSAVE_PADDING;
> > 
> > For arm64, we round the worst-case padding up by one.
> > 
> 
> Yeah, I saw that. The ARM code adds the max padding, too:
> 
> 	signal_minsigstksz = sigframe_size(&user) +
> 		round_up(sizeof(struct frame_record), 16) +
> 		16; /* max alignment padding */
> 
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kernel/signal.c#n973
> 
> > I can't remember the full rationale for this, but it at least seemed a
> > bit weird to report a size that is not a multiple of the alignment.
> > 
> 
> Because the last state size of XSAVE may not be 64B aligned, the (reported)
> sum of xstate size here does not guarantee 64B alignment.
> 
> > I'm can't think of a clear argument as to why it really matters, though.
> 
> We care about the start of XSAVE buffer for the XSAVE instructions, to be
> 64B-aligned.

Ah, I see.  That makes sense.

For arm64, there is no additional alignment padding inside the frame,
only the padding inserted after the frame to ensure that the base
address is 16-byte aligned.

However, I wonder whether people will tend to assume that AT_MINSIGSTKSZ
is a sensible (if minimal) amount of stack to allocate.  Allocating an
odd number of bytes, or any amount that isn't a multiple of the
architecture's preferred (or mandated) stack alignment probably doesn't
make sense.

AArch64 has a mandatory stack alignment of 16 bytes; I'm not sure about
x86.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 18:21               ` Florian Weimer
@ 2020-10-07 10:19                 ` Dave Martin
  0 siblings, 0 replies; 31+ messages in thread
From: Dave Martin @ 2020-10-07 10:19 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Dave Martin via Libc-alpha, Dave Hansen, linux-arch, Tony Luck,
	Ravi V. Shankar, Len Brown, Chang S. Bae,
	the arch/x86 maintainers, LKML, Andy Lutomirski, Linux API,
	Thomas Gleixner, Borislav Petkov, Ingo Molnar

On Tue, Oct 06, 2020 at 08:21:00PM +0200, Florian Weimer wrote:
> * Dave Martin via Libc-alpha:
> 
> > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote:
> >> On 10/6/20 8:25 AM, Dave Martin wrote:
> >> > Or are people reporting real stack overruns on x86 today?
> >> 
> >> We have real overruns.  We have ~2800 bytes of XSAVE (regisiter) state
> >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ.
> >
> > Right.  Out of interest, do you believe that's a direct consequence of
> > the larger kernel-generated signal frame, or does the expansion of
> > userspace stack frames play a role too?
> 
> I must say that I do not quite understand this question.
> 
> 32 64-*byte* registers simply need 2048 bytes of storage space worst
> case, there is really no way around that.

If the architecture grows more or bigger registers, and if those
registers are used in general-purpose code, then all stack frames will
tend to grow, not just the signal frame.

So a stack overflow might be caused by the larger signal frame by
itself; or it might be caused by the growth of the stack of 20 function
frames created by someone's signal handler.

In the latter case, this is just a "normal" stack overflow, and nothing
really to do with signals or SIGSTKSZ.  Rebuilding with different
compiler flags could also grow the stack usage and cause just the same
problem.

I also strongly suspect that people often don't think about signal
nesting when allocating signal stacks.  So, there might be a pre-
existing potential overflow that just becomes more likely when the
signal frame grows.  That's not really SIGSTKSZ's fault.


Of course, AVX-512 might never be used in general-purpose code.  On
AArch64, SVE can be used in general-purpose code, but it's too early to
say what its prevalence will be in signal handlers.  Probably low.


> > In practice software just assumes SIGSTKSZ and then ignores the problem
> > until / unless an actual stack overflow is seen.
> >
> > There's probably a lot of software out there whose stack is
> > theoretically too small even without AVX-512 etc. in the mix, especially
> > when considering the possibility of nested signals...
> 
> That is certainly true.  We have seen problems with ntpd, which
> requested a 16 KiB stack, at a time when there were various deductions
> from the stack size, and since the glibc dynamic loader also uses XSAVE,
> ntpd exceeded the remaining stack space.  But in this case, we just
> fudged the stack size computation in pthread_create and made it less
> likely that the dynamic loader was activated, which largely worked
> around this particular problem.  For MINSIGSTKSZ, we just don't have
> this option because it's simply too small in the first place.
> 
> I don't immediately recall a bug due to SIGSTKSZ being too small.  The
> test cases I wrote for this were all artificial, to raise awareness of
> this issue (applications treating these as recommended values, rather
> than minimum value to avoid immediately sigaltstack/phtread_create
> failures, same issue with PTHREAD_STACK_MIN).

Ack, I think if SIGSTKSZ was too small significantly often, there would
be more awareness of the issue.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 18:30               ` Dave Hansen
@ 2020-10-07 10:20                 ` Dave Martin
  0 siblings, 0 replies; 31+ messages in thread
From: Dave Martin @ 2020-10-07 10:20 UTC (permalink / raw)
  To: Dave Hansen
  Cc: H.J. Lu, Chang S. Bae, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Andy Lutomirski, the arch/x86 maintainers,
	Len Brown, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 06, 2020 at 11:30:42AM -0700, Dave Hansen wrote:
> On 10/6/20 10:00 AM, Dave Martin wrote:
> > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote:
> >> On 10/6/20 8:25 AM, Dave Martin wrote:
> >>> Or are people reporting real stack overruns on x86 today?
> >> We have real overruns.  We have ~2800 bytes of XSAVE (regisiter) state
> >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ.
> > Right.  Out of interest, do you believe that's a direct consequence of
> > the larger kernel-generated signal frame, or does the expansion of
> > userspace stack frames play a role too?
> 
> The kernel-generated signal frame is entirely responsible for the ~2800
> bytes that I'm talking about.
> 
> I'm sure there are some systems where userspace plays a role, but those
> are much less of a worry at the moment, since the kernel-induced
> overflows mean an instant crash that userspace has no recourse for.

Ack, that sounds pretty convincing.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-06 17:44               ` H.J. Lu
@ 2020-10-07 10:47                 ` Dave Martin
  2020-10-07 13:30                   ` H.J. Lu
  0 siblings, 1 reply; 31+ messages in thread
From: Dave Martin @ 2020-10-07 10:47 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote:
> On Tue, Oct 6, 2020 at 9:55 AM Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote:
> > > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > >
> > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote:
> > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > > >
> > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > > > > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > > > > > > which has grown over time as new features and larger registers have been
> > > > > > > > > added to the architecture.
> > > > > > > > >
> > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > > > > > > constant indicates to userspace how much data the kernel expects to push on
> > > > > > > > > the user stack, [2][3].
> > > > > > > > >
> > > > > > > > > However, this constant is much too small and does not reflect recent
> > > > > > > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > > > > > > >
> > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > > > > > > cause user stack overflow when delivering a signal.
> > > > > > > > >
> > > > > > > > > In this series, we suggest a couple of things:
> > > > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > > > > > > >    approach to [5]
> > > > > > > > > 2. Avoid using a too-small alternate stack
> > > > > > > >
> > > > > > > > I can't comment on the x86 specifics, but the approach followed in this
> > > > > > > > series does seem consistent with the way arm64 populates
> > > > > > > > AT_MINSIGSTKSZ.
> > > > > > > >
> > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > > > > > > this...
> > > > > > >
> > > > > > > Here is my proposal for glibc:
> > > > > > >
> > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
> > > > > >
> > > > > > Thanks for the link.
> > > > > >
> > > > > > Are there patches yet?  I already had some hacks in the works, but I can
> > > > > > drop them if there's something already out there.
> > > > >
> > > > > I am working on it.
> > > >
> > > > OK.  I may post something for discussion, but I'm happy for it to be
> > > > superseded by someone (i.e., other than me) who actually knows what
> > > > they're doing...
> > >
> > > Please see my previous email for my glibc patch:
> > >
> > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ
> > >
> > > > > >
> > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
> > > > > >
> > > > > > Can we do this?  IIUC, this is an ABI break and carries the risk of
> > > > > > buffer overruns.
> > > > > >
> > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> > > > > > (apart from the fact that it is rarely used, due to glibc's shadowing
> > > > > > definitions) was that userspace binaries will have baked in the old
> > > > > > value of the constant and may be making assumptions about it.
> > > > > >
> > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define
> > > > > > changes.  This could be a problem if an newly built library tries to
> > > > > > memcpy() or dump such an object defined by and old binary.
> > > > > > Bounds-checking and the stack sizes passed to things like sigaltstack()
> > > > > > and makecontext() could similarly go wrong.
> > > > >
> > > > > With my original proposal:
> > > > >
> > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html
> > > > >
> > > > > char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
> > > > > constants:
> > > > >
> > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html
> > > >
> > > > Ah, I see.  But both still API and ABI breaks; moreover, declaraing an
> > > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the
> > > > obvious thing to do with this constant in many simple cases.  Such usage
> > > > is widespread, see:
> > > >
> > > >  * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1
> > > >
> > > >
> > > > Your two approaches seem to trade off two different sources of buffer
> > > > overruns: undersized stacks versus ABI breaks across library boundaries.
> > >
> > > We can't get everything we want.
> > >
> > > > Since undersized stack is by far the more familiar problem and we at
> > > > least have guard regions to help detect overruns, I'd vote to keep
> > > > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now.
> > >
> > > Agree.
> > >
> > > > Or are people reporting real stack overruns on x86 today?
> > >
> > > I hope so.
> > >
> > > >
> > > > For arm64, we made large vectors on SVE opt-in, so that oversized signal
> > > > frames are not seen by default.  Would somethine similar be feasible on
> > > > x86?
> > > >
> > > >
> > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
> > > > > >
> > > > > > How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> > > > > > discovery method is changing.  The meaning of the value is exactly the
> > > > > > same as before.
> > > > > >
> > > > > > If we are going to rename it though, it could make sense to go for
> > > > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
> > > > > >
> > > > > > The trouble with including "STKSZ" is that is sounds like a
> > > > > > recommendation for your stack size.  While the signal frame size is
> > > > > > relevant to picking a stack size, it's not the only thing to
> > > > > > consider.
> > > > >
> > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by
> > > > > kernel.   The minimum stack size for a signal handler is more likely
> > > > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
> > > > > frame size used by kernel + 6KB for user application.
> > > >
> > > > Ack; to be correct, you also need to take into account which signals may
> > > > be unmasked while running on this stack, and the stack requirements of
> > > > all their handlers.  Unfortunately, that's hard :(
> > > >
> > > > What's your view on my naming suggesions?
> > >
> > > I used _SC_MINSIGSTKSZ:
> > >
> > > https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9
> >
> > Apologies, I missed that.
> >
> > Otherwise, the changes look much as I would expect, except for the
> > "6K for user program" thing.  This is strictly not included in the
> > legacy MINSIGSTKSZ.
> >
> > >
> > > >
> > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> > > > > > of a "recommended stack size" be abandoned?  glibc can at least make a
> > > > > > slightly more informed guess about suitable stack sizes than the kernel
> > > > > > (and glibc already has to guess anyway, in order to determine the
> > > > > > default thread stack size).
> > > > >
> > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
> > > > > available.
> > > >
> > > > In my code, I generate _SC_SIGSTKSZ as the equivalent of
> > > >
> > > >         max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ)
> > > >
> > > > which is >= the legacy value, and broadly reperesentative of the
> > > > relationship between MINSIGSTKSZ and SIGSTKSZ on most arches.
> > > >
> > > >
> > > > What do you think?
> > >
> > > sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases.
> >
> > Why, though?
> >
> > MINSIGSTKSZ is not specified to be usable as-is for any case whatsoever.
> >
> >
> > Software that calculates its own needs to know the actual system values,
> > not estimates based on guesses about how much stack a typical program
> > might need if it were recompiled for x86.
> >
> > This doesn't mean we can't have a generic suggested value that's suitable
> > for common scenarios (like SIGSTKSZ), but if we do then I think it
> > should be a separate constant.
> 
> I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ.
> _SC_MINSIGSTKSZ is  the minimum signal stack size from AT_MINSIGSTKSZ,
> which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value
> of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application.

Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead?

If the arch has more or bigger registers to save in the signal frame,
the chances are that they're going to get saved in some userspace stack
frames too.  So I suspect that the user signal handler stack usage may
scale up to some extent rather than being a constant.


To help people migrate without unpleasant surprises, I also figured it
would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >=
legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ.
This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a
drop-in replacement for MINSIGSTKSZ, etc.

(To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64.
My idea was that sysconf () should hide this surprise, but people who
really want to know the kernel value would tolerate some
nonportability and read AT_MINSIGSTKSZ directly.)


So then:

	kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ);
	minsigstksz = LEGACY_MINSIGSTKSZ;
	if (kernel_minsigstksz > minsigstksz)
		minsistksz = kernel_minsigstksz;

	sigstksz = LEGACY_SIGSTKSZ;
	if (minsigstksz * 4 > sigstksz)
		sigstksz = minsigstksz * 4;


(Or something like that, unless the architecture provides its own
definitions.  ia64's MINSIGSTKSZ is enormous, so it probably doesn't
want this.)


Also: should all these values be rounded up to a multiple of the
architecture's preferred stack alignment?

Should the preferred stack alignment also be exposed through sysconf?
Portable code otherwise has no way to know this, though if the
preferred alignment is <= the minimum malloc()/alloca() alignment then
this is generally not an issue.)


> >
> > > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > > > > > > is in use.
> > > > > >
> > > > > > Great if we can do it.  I was concerned that this might be
> > > > > > controversial.
> > > > > >
> > > > > > Would this just be a recommendation, or can we enforce it somehow?
> > > > >
> > > > > It is just an idea.  We need to move away from constant SIGSTKSZ and
> > > > > MINSIGSTKSZ.
> > > >
> > > > Totally agree with that.
> > > >
> > >
> > > With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile
> > > if the source assumes constant SIGSTKSZ or MINSIGSTKSZ.
> >
> > Ah yes, I see.  That's a sensible precaution.
> >
> > Is it accepted in general that defining different feature test macros
> > can lead to ABI incompatibilities?
> >
> > I have thought that building a shared library with _GNU_SOURCE (say)
> > doesn't mean that a program that loads that library must also be built
> > with _GNU_SOURCE.  For one thing, that's hard to police.
> >
> >
> > However, there are already combinations that could break, e.g., mixing
> > -D_FILE_OFFSET_BITS=64 with -D_FILE_OFFSET_BITS=32 would be broken if
> > this define changes off_t.
> >
> >
> > So, maybe having _SC_MINSIGSTKSZ_SOURCE break things in this way is an
> > acceptable compromise.  Interfaces that depend on the value of
> > MINSIGSTKSZ or SIGSTKSZ are possible, but probably rare in practice --
> > I don't know of a specific example.
> >
> 
> I changed it to _SC_SIGSTKSZ_SOURCE:
> 
> https://gitlab.com/x86-glibc/glibc/-/commit/41d5e6b31025721590f12d5aa543eb0bc53ce263
> 
> #ifdef __USE_SC_SIGSTKSZ
> # include <unistd.h>
> /* Minimum stack size for a signal handler: sysconf (SC_SIGSTKSZ).  */
> # undef MINSIGSTKSZ
> # define MINSIGSTKSZ sysconf (_SC_SIGSTKSZ)
> /* System default stack size for a signal handler: MINSIGSTKSZ.  */
> # undef SIGSTKSZ
> # define SIGSTKSZ MINSIGSTKSZ
> #endif
> 
> Compilation will fail if the source assumes constant MINSIGSTKSZ or
> SIGSTKSZ.

I don't understand all the glibc-fu, bit it looks reasonable overall
(notwithstanding my comments above).

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-07 10:47                 ` Dave Martin
@ 2020-10-07 13:30                   ` H.J. Lu
  2020-10-07 15:45                     ` Dave Martin
  0 siblings, 1 reply; 31+ messages in thread
From: H.J. Lu @ 2020-10-07 13:30 UTC (permalink / raw)
  To: Dave Martin
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Wed, Oct 7, 2020 at 3:47 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote:
> > On Tue, Oct 6, 2020 at 9:55 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > >
> > > On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote:
> > > > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > >
> > > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote:
> > > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > > > >
> > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote:
> > > > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote:
> > > > > > > > > > During signal entry, the kernel pushes data onto the normal userspace
> > > > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state,
> > > > > > > > > > which has grown over time as new features and larger registers have been
> > > > > > > > > > added to the architecture.
> > > > > > > > > >
> > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and
> > > > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is
> > > > > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ
> > > > > > > > > > constant indicates to userspace how much data the kernel expects to push on
> > > > > > > > > > the user stack, [2][3].
> > > > > > > > > >
> > > > > > > > > > However, this constant is much too small and does not reflect recent
> > > > > > > > > > additions to the architecture. For instance, when AVX-512 states are in
> > > > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB.
> > > > > > > > > >
> > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can
> > > > > > > > > > cause user stack overflow when delivering a signal.
> > > > > > > > > >
> > > > > > > > > > In this series, we suggest a couple of things:
> > > > > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar
> > > > > > > > > >    approach to [5]
> > > > > > > > > > 2. Avoid using a too-small alternate stack
> > > > > > > > >
> > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this
> > > > > > > > > series does seem consistent with the way arm64 populates
> > > > > > > > > AT_MINSIGSTKSZ.
> > > > > > > > >
> > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to
> > > > > > > > > this...
> > > > > > > >
> > > > > > > > Here is my proposal for glibc:
> > > > > > > >
> > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html
> > > > > > >
> > > > > > > Thanks for the link.
> > > > > > >
> > > > > > > Are there patches yet?  I already had some hacks in the works, but I can
> > > > > > > drop them if there's something already out there.
> > > > > >
> > > > > > I am working on it.
> > > > >
> > > > > OK.  I may post something for discussion, but I'm happy for it to be
> > > > > superseded by someone (i.e., other than me) who actually knows what
> > > > > they're doing...
> > > >
> > > > Please see my previous email for my glibc patch:
> > > >
> > > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ
> > > >
> > > > > > >
> > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB.
> > > > > > >
> > > > > > > Can we do this?  IIUC, this is an ABI break and carries the risk of
> > > > > > > buffer overruns.
> > > > > > >
> > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define
> > > > > > > (apart from the fact that it is rarely used, due to glibc's shadowing
> > > > > > > definitions) was that userspace binaries will have baked in the old
> > > > > > > value of the constant and may be making assumptions about it.
> > > > > > >
> > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define
> > > > > > > changes.  This could be a problem if an newly built library tries to
> > > > > > > memcpy() or dump such an object defined by and old binary.
> > > > > > > Bounds-checking and the stack sizes passed to things like sigaltstack()
> > > > > > > and makecontext() could similarly go wrong.
> > > > > >
> > > > > > With my original proposal:
> > > > > >
> > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html
> > > > > >
> > > > > > char [MINSIGSTKSZ] won't compile.  The feedback is to increase the
> > > > > > constants:
> > > > > >
> > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html
> > > > >
> > > > > Ah, I see.  But both still API and ABI breaks; moreover, declaraing an
> > > > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the
> > > > > obvious thing to do with this constant in many simple cases.  Such usage
> > > > > is widespread, see:
> > > > >
> > > > >  * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1
> > > > >
> > > > >
> > > > > Your two approaches seem to trade off two different sources of buffer
> > > > > overruns: undersized stacks versus ABI breaks across library boundaries.
> > > >
> > > > We can't get everything we want.
> > > >
> > > > > Since undersized stack is by far the more familiar problem and we at
> > > > > least have guard regions to help detect overruns, I'd vote to keep
> > > > > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now.
> > > >
> > > > Agree.
> > > >
> > > > > Or are people reporting real stack overruns on x86 today?
> > > >
> > > > I hope so.
> > > >
> > > > >
> > > > > For arm64, we made large vectors on SVE opt-in, so that oversized signal
> > > > > frames are not seen by default.  Would somethine similar be feasible on
> > > > > x86?
> > > > >
> > > > >
> > > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel.
> > > > > > >
> > > > > > > How about "_SC_MINSIGSTKSZ"?  This was my initial choice since only the
> > > > > > > discovery method is changing.  The meaning of the value is exactly the
> > > > > > > same as before.
> > > > > > >
> > > > > > > If we are going to rename it though, it could make sense to go for
> > > > > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE".
> > > > > > >
> > > > > > > The trouble with including "STKSZ" is that is sounds like a
> > > > > > > recommendation for your stack size.  While the signal frame size is
> > > > > > > relevant to picking a stack size, it's not the only thing to
> > > > > > > consider.
> > > > > >
> > > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by
> > > > > > kernel.   The minimum stack size for a signal handler is more likely
> > > > > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal
> > > > > > frame size used by kernel + 6KB for user application.
> > > > >
> > > > > Ack; to be correct, you also need to take into account which signals may
> > > > > be unmasked while running on this stack, and the stack requirements of
> > > > > all their handlers.  Unfortunately, that's hard :(
> > > > >
> > > > > What's your view on my naming suggesions?
> > > >
> > > > I used _SC_MINSIGSTKSZ:
> > > >
> > > > https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9
> > >
> > > Apologies, I missed that.
> > >
> > > Otherwise, the changes look much as I would expect, except for the
> > > "6K for user program" thing.  This is strictly not included in the
> > > legacy MINSIGSTKSZ.
> > >
> > > >
> > > > >
> > > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept
> > > > > > > of a "recommended stack size" be abandoned?  glibc can at least make a
> > > > > > > slightly more informed guess about suitable stack sizes than the kernel
> > > > > > > (and glibc already has to guess anyway, in order to determine the
> > > > > > > default thread stack size).
> > > > > >
> > > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't
> > > > > > available.
> > > > >
> > > > > In my code, I generate _SC_SIGSTKSZ as the equivalent of
> > > > >
> > > > >         max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ)
> > > > >
> > > > > which is >= the legacy value, and broadly reperesentative of the
> > > > > relationship between MINSIGSTKSZ and SIGSTKSZ on most arches.
> > > > >
> > > > >
> > > > > What do you think?
> > > >
> > > > sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases.
> > >
> > > Why, though?
> > >
> > > MINSIGSTKSZ is not specified to be usable as-is for any case whatsoever.
> > >
> > >
> > > Software that calculates its own needs to know the actual system values,
> > > not estimates based on guesses about how much stack a typical program
> > > might need if it were recompiled for x86.
> > >
> > > This doesn't mean we can't have a generic suggested value that's suitable
> > > for common scenarios (like SIGSTKSZ), but if we do then I think it
> > > should be a separate constant.
> >
> > I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ.
> > _SC_MINSIGSTKSZ is  the minimum signal stack size from AT_MINSIGSTKSZ,
> > which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value
> > of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application.
>
> Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead?

Done.

> If the arch has more or bigger registers to save in the signal frame,
> the chances are that they're going to get saved in some userspace stack
> frames too.  So I suspect that the user signal handler stack usage may
> scale up to some extent rather than being a constant.
>
>
> To help people migrate without unpleasant surprises, I also figured it
> would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >=
> legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ.
> This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a
> drop-in replacement for MINSIGSTKSZ, etc.
>
> (To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64.
> My idea was that sysconf () should hide this surprise, but people who
> really want to know the kernel value would tolerate some
> nonportability and read AT_MINSIGSTKSZ directly.)
>
>
> So then:
>
>         kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ);
>         minsigstksz = LEGACY_MINSIGSTKSZ;
>         if (kernel_minsigstksz > minsigstksz)
>                 minsistksz = kernel_minsigstksz;
>
>         sigstksz = LEGACY_SIGSTKSZ;
>         if (minsigstksz * 4 > sigstksz)
>                 sigstksz = minsigstksz * 4;

I updated users/hjl/AT_MINSIGSTKSZ branch with

+@item _SC_MINSIGSTKSZ
+@standards{GNU, unistd.h}
+Inquire about the signal stack size used by the kernel.
+
+@item _SC_SIGSTKSZ
+@standards{GNU, unistd.h}
+Inquire about the default signal stack size for a signal handler.

    case _SC_MINSIGSTKSZ:
      assert (GLRO(dl_minsigstacksize) != 0);
      return GLRO(dl_minsigstacksize);

    case _SC_SIGSTKSZ:
      {
        /* Return MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4.  */
        long int minsigstacksize = GLRO(dl_minsigstacksize);
        _Static_assert (__builtin_constant_p (MINSIGSTKSZ),
                        "MINSIGSTKSZ is constant");
        if (minsigstacksize < MINSIGSTKSZ)
          minsigstacksize = MINSIGSTKSZ;
        return minsigstacksize * 4;
      }

>
> (Or something like that, unless the architecture provides its own
> definitions.  ia64's MINSIGSTKSZ is enormous, so it probably doesn't
> want this.)
>
>
> Also: should all these values be rounded up to a multiple of the
> architecture's preferred stack alignment?

Kernel should provide a properly aligned AT_MINSIGSTKSZ.

> Should the preferred stack alignment also be exposed through sysconf?
> Portable code otherwise has no way to know this, though if the
> preferred alignment is <= the minimum malloc()/alloca() alignment then
> this is generally not an issue.)

Good question.  But it is orthogonal to the signal stack size issue.

>
> > >
> > > > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE
> > > > > > > > is in use.
> > > > > > >
> > > > > > > Great if we can do it.  I was concerned that this might be
> > > > > > > controversial.
> > > > > > >
> > > > > > > Would this just be a recommendation, or can we enforce it somehow?
> > > > > >
> > > > > > It is just an idea.  We need to move away from constant SIGSTKSZ and
> > > > > > MINSIGSTKSZ.
> > > > >
> > > > > Totally agree with that.
> > > > >
> > > >
> > > > With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile
> > > > if the source assumes constant SIGSTKSZ or MINSIGSTKSZ.
> > >
> > > Ah yes, I see.  That's a sensible precaution.
> > >
> > > Is it accepted in general that defining different feature test macros
> > > can lead to ABI incompatibilities?
> > >
> > > I have thought that building a shared library with _GNU_SOURCE (say)
> > > doesn't mean that a program that loads that library must also be built
> > > with _GNU_SOURCE.  For one thing, that's hard to police.
> > >
> > >
> > > However, there are already combinations that could break, e.g., mixing
> > > -D_FILE_OFFSET_BITS=64 with -D_FILE_OFFSET_BITS=32 would be broken if
> > > this define changes off_t.
> > >
> > >
> > > So, maybe having _SC_MINSIGSTKSZ_SOURCE break things in this way is an
> > > acceptable compromise.  Interfaces that depend on the value of
> > > MINSIGSTKSZ or SIGSTKSZ are possible, but probably rare in practice --
> > > I don't know of a specific example.
> > >
> >
> > I changed it to _SC_SIGSTKSZ_SOURCE:
> >
> > https://gitlab.com/x86-glibc/glibc/-/commit/41d5e6b31025721590f12d5aa543eb0bc53ce263
> >
> > #ifdef __USE_SC_SIGSTKSZ
> > # include <unistd.h>
> > /* Minimum stack size for a signal handler: sysconf (SC_SIGSTKSZ).  */
> > # undef MINSIGSTKSZ
> > # define MINSIGSTKSZ sysconf (_SC_SIGSTKSZ)
> > /* System default stack size for a signal handler: MINSIGSTKSZ.  */
> > # undef SIGSTKSZ
> > # define SIGSTKSZ MINSIGSTKSZ
> > #endif
> >
> > Compilation will fail if the source assumes constant MINSIGSTKSZ or
> > SIGSTKSZ.
>
> I don't understand all the glibc-fu, bit it looks reasonable overall
> (notwithstanding my comments above).
>
> Cheers
> ---Dave



-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-07 13:30                   ` H.J. Lu
@ 2020-10-07 15:45                     ` Dave Martin
  2020-10-07 17:43                       ` H.J. Lu
  0 siblings, 1 reply; 31+ messages in thread
From: Dave Martin @ 2020-10-07 15:45 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Wed, Oct 07, 2020 at 06:30:03AM -0700, H.J. Lu wrote:
> On Wed, Oct 7, 2020 at 3:47 AM Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote:

[...]

> > > I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ.
> > > _SC_MINSIGSTKSZ is  the minimum signal stack size from AT_MINSIGSTKSZ,
> > > which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value
> > > of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application.
> >
> > Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead?
> 
> Done.

OK.  I was prepared to have to argue my case a bit more, but if you
think this is OK then I will stop arguing ;)


> > If the arch has more or bigger registers to save in the signal frame,
> > the chances are that they're going to get saved in some userspace stack
> > frames too.  So I suspect that the user signal handler stack usage may
> > scale up to some extent rather than being a constant.
> >
> >
> > To help people migrate without unpleasant surprises, I also figured it
> > would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >=
> > legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ.
> > This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a
> > drop-in replacement for MINSIGSTKSZ, etc.
> >
> > (To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64.
> > My idea was that sysconf () should hide this surprise, but people who
> > really want to know the kernel value would tolerate some
> > nonportability and read AT_MINSIGSTKSZ directly.)
> >
> >
> > So then:
> >
> >         kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ);
> >         minsigstksz = LEGACY_MINSIGSTKSZ;
> >         if (kernel_minsigstksz > minsigstksz)
> >                 minsistksz = kernel_minsigstksz;
> >
> >         sigstksz = LEGACY_SIGSTKSZ;
> >         if (minsigstksz * 4 > sigstksz)
> >                 sigstksz = minsigstksz * 4;
> 
> I updated users/hjl/AT_MINSIGSTKSZ branch with
> 
> +@item _SC_MINSIGSTKSZ
> +@standards{GNU, unistd.h}

Can we specify these more agressively now?

> +Inquire about the signal stack size used by the kernel.

I think we've already concluded that this should included all mandatory
overheads, including those imposed by the compiler and glibc?

e.g.:

--8<--

The returned value is the minimum number of bytes of free stack space
required in order to gurantee successful, non-nested handling of a
single signal whose handler is an empty function.

-->8--

> +
> +@item _SC_SIGSTKSZ
> +@standards{GNU, unistd.h}
> +Inquire about the default signal stack size for a signal handler.

Similarly:

--8<--

The returned value is the suggested minimum number of bytes of stack
space required for a signal stack.

This is not guaranteed to be enough for any specific purpose other than
the invocation of a single, non-nested, empty handler, but nonetheless
should be enough for basic scenarios involving simple signal handlers
and very low levels of signal nesting (say, 2 or 3 levels at the very
most).

This value is provided for developer convenience and to ease migration
from the legacy SIGSTKSZ constant.  Programs requiring stronger
guarantees should avoid using it if at all possible.

-->8--


If these descriptions are too wordy, we might want to move some of it
out to signal.texi, though.

> 
>     case _SC_MINSIGSTKSZ:
>       assert (GLRO(dl_minsigstacksize) != 0);
>       return GLRO(dl_minsigstacksize);
> 
>     case _SC_SIGSTKSZ:
>       {
>         /* Return MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4.  */
>         long int minsigstacksize = GLRO(dl_minsigstacksize);
>         _Static_assert (__builtin_constant_p (MINSIGSTKSZ),
>                         "MINSIGSTKSZ is constant");
>         if (minsigstacksize < MINSIGSTKSZ)
>           minsigstacksize = MINSIGSTKSZ;
>         return minsigstacksize * 4;
>       }
> 
> >
> > (Or something like that, unless the architecture provides its own
> > definitions.  ia64's MINSIGSTKSZ is enormous, so it probably doesn't
> > want this.)
> >
> >
> > Also: should all these values be rounded up to a multiple of the
> > architecture's preferred stack alignment?
> 
> Kernel should provide a properly aligned AT_MINSIGSTKSZ.

OK.  Can you comment on Chang S. Bae's series?  I wasn't sure that the
proposal produces an aligned value for AT_MINSIGSTKSZ on x86, but maybe
I just worked it out wrong.


> > Should the preferred stack alignment also be exposed through sysconf?
> > Portable code otherwise has no way to know this, though if the
> > preferred alignment is <= the minimum malloc()/alloca() alignment then
> > this is generally not an issue.)
> 
> Good question.  But it is orthogonal to the signal stack size issue.

Ack.

There are various other brokennesses around this area, such as the fact
that the register data may now run off the end of the mcontext_t object
that is supposed to contain it.  Ideally we should fork ucontext_t or
mcontext_t into two types, one for the ucontext API and one for the
signal API (which are anyway highly incompatible with each other).

If this stuff is addressed, I guess we could bundle it under a more
general feature test macro.  But it's probably best to nail down this
series in something like its current form first.


I'll follow up on libc-alpha with a wishlist, but that will be a
separate conversation...

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size
  2020-10-07 15:45                     ` Dave Martin
@ 2020-10-07 17:43                       ` H.J. Lu
  0 siblings, 0 replies; 31+ messages in thread
From: H.J. Lu @ 2020-10-07 17:43 UTC (permalink / raw)
  To: Dave Martin
  Cc: Chang S. Bae, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Andy Lutomirski, the arch/x86 maintainers, Len Brown,
	Dave Hansen, Michael Ellerman, Tony Luck, Ravi V. Shankar,
	GNU C Library, linux-arch, Linux API, LKML

On Wed, Oct 7, 2020 at 8:46 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Wed, Oct 07, 2020 at 06:30:03AM -0700, H.J. Lu wrote:
> > On Wed, Oct 7, 2020 at 3:47 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > >
> > > On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote:
>
> [...]
>
> > > > I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ.
> > > > _SC_MINSIGSTKSZ is  the minimum signal stack size from AT_MINSIGSTKSZ,
> > > > which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value
> > > > of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application.
> > >
> > > Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead?
> >
> > Done.
>
> OK.  I was prepared to have to argue my case a bit more, but if you
> think this is OK then I will stop arguing ;)
>
>
> > > If the arch has more or bigger registers to save in the signal frame,
> > > the chances are that they're going to get saved in some userspace stack
> > > frames too.  So I suspect that the user signal handler stack usage may
> > > scale up to some extent rather than being a constant.
> > >
> > >
> > > To help people migrate without unpleasant surprises, I also figured it
> > > would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >=
> > > legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ.
> > > This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a
> > > drop-in replacement for MINSIGSTKSZ, etc.
> > >
> > > (To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64.
> > > My idea was that sysconf () should hide this surprise, but people who
> > > really want to know the kernel value would tolerate some
> > > nonportability and read AT_MINSIGSTKSZ directly.)
> > >
> > >
> > > So then:
> > >
> > >         kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ);
> > >         minsigstksz = LEGACY_MINSIGSTKSZ;
> > >         if (kernel_minsigstksz > minsigstksz)
> > >                 minsistksz = kernel_minsigstksz;
> > >
> > >         sigstksz = LEGACY_SIGSTKSZ;
> > >         if (minsigstksz * 4 > sigstksz)
> > >                 sigstksz = minsigstksz * 4;
> >
> > I updated users/hjl/AT_MINSIGSTKSZ branch with
> >
> > +@item _SC_MINSIGSTKSZ
> > +@standards{GNU, unistd.h}
>
> Can we specify these more agressively now?
>
> > +Inquire about the signal stack size used by the kernel.
>
> I think we've already concluded that this should included all mandatory
> overheads, including those imposed by the compiler and glibc?
>
> e.g.:
>
> --8<--
>
> The returned value is the minimum number of bytes of free stack space
> required in order to gurantee successful, non-nested handling of a
> single signal whose handler is an empty function.
>
> -->8--
>
> > +
> > +@item _SC_SIGSTKSZ
> > +@standards{GNU, unistd.h}
> > +Inquire about the default signal stack size for a signal handler.
>
> Similarly:
>
> --8<--
>
> The returned value is the suggested minimum number of bytes of stack
> space required for a signal stack.
>
> This is not guaranteed to be enough for any specific purpose other than
> the invocation of a single, non-nested, empty handler, but nonetheless
> should be enough for basic scenarios involving simple signal handlers
> and very low levels of signal nesting (say, 2 or 3 levels at the very
> most).
>
> This value is provided for developer convenience and to ease migration
> from the legacy SIGSTKSZ constant.  Programs requiring stronger
> guarantees should avoid using it if at all possible.
>
> -->8--

Done.

>
> If these descriptions are too wordy, we might want to move some of it
> out to signal.texi, though.
>
> >
> >     case _SC_MINSIGSTKSZ:
> >       assert (GLRO(dl_minsigstacksize) != 0);
> >       return GLRO(dl_minsigstacksize);
> >
> >     case _SC_SIGSTKSZ:
> >       {
> >         /* Return MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4.  */
> >         long int minsigstacksize = GLRO(dl_minsigstacksize);
> >         _Static_assert (__builtin_constant_p (MINSIGSTKSZ),
> >                         "MINSIGSTKSZ is constant");
> >         if (minsigstacksize < MINSIGSTKSZ)
> >           minsigstacksize = MINSIGSTKSZ;
> >         return minsigstacksize * 4;
> >       }
> >
> > >
> > > (Or something like that, unless the architecture provides its own
> > > definitions.  ia64's MINSIGSTKSZ is enormous, so it probably doesn't
> > > want this.)
> > >
> > >
> > > Also: should all these values be rounded up to a multiple of the
> > > architecture's preferred stack alignment?
> >
> > Kernel should provide a properly aligned AT_MINSIGSTKSZ.
>
> OK.  Can you comment on Chang S. Bae's series?  I wasn't sure that the
> proposal produces an aligned value for AT_MINSIGSTKSZ on x86, but maybe
> I just worked it out wrong.

In Linux kernel, the signal frame data is composed of the following areas
and laid out as:
     ------------------------------
     | alignment padding          |
     ------------------------------
     | xsave buffer               |  <<<<<<< This must be 64 byte aligned
     ------------------------------
     | fsave header (32-bit only) |
     ------------------------------
     | siginfo + ucontext         |
     ------------------------------

In order to get the proper alignment for xsave buffer, the signal stake
size should be rounded up to 64 byte aligned.  In glibc, I have

 /* Size of xsave buffer.  */
  unsigned int sigframe_size = ebx;
if (64 bit)
  /* NB: sizeof(struct rt_sigframe) in Linux kernel.  */
  sigframe_size += 440;
else
  /* NB: sizeof(struct sigframe_ia32) + sizeof(struct fregs_state)) in
     Linux kernel.  */
  sigframe_size += 736 + 112;
endif

  /* Round up to 64-byte aligned.  */
  sigframe_size = ALIGN_UP (sigframe_size, 64);

  GLRO(dl_minsigstacksize) = sigframe_size;

Kernel should do something similar.

>
> > > Should the preferred stack alignment also be exposed through sysconf?
> > > Portable code otherwise has no way to know this, though if the
> > > preferred alignment is <= the minimum malloc()/alloca() alignment then
> > > this is generally not an issue.)
> >
> > Good question.  But it is orthogonal to the signal stack size issue.
>
> Ack.
>
> There are various other brokennesses around this area, such as the fact
> that the register data may now run off the end of the mcontext_t object
> that is supposed to contain it.  Ideally we should fork ucontext_t or
> mcontext_t into two types, one for the ucontext API and one for the
> signal API (which are anyway highly incompatible with each other).
>
> If this stuff is addressed, I guess we could bundle it under a more
> general feature test macro.  But it's probably best to nail down this
> series in something like its current form first.

Agreed.

>
> I'll follow up on libc-alpha with a wishlist, but that will be a
> separate conversation...
>
> [...]
>
> Cheers
> ---Dave



-- 
H.J.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size
  2020-10-07 10:05       ` Dave Martin
@ 2020-10-08 22:43         ` Bae, Chang Seok
  2020-10-12 13:26           ` Dave Martin
  0 siblings, 1 reply; 31+ messages in thread
From: Bae, Chang Seok @ 2020-10-08 22:43 UTC (permalink / raw)
  To: Dave.Martin
  Cc: mingo, linux-kernel, hjl.tools, bp, tglx, linux-api, libc-alpha,
	x86, luto, Shankar, Ravi V, Hansen, Dave, Luck, Tony, Brown, Len,
	linux-arch, mpe

On Wed, 2020-10-07 at 11:05 +0100, Dave Martin wrote:
> On Tue, Oct 06, 2020 at 05:45:24PM +0000, Bae, Chang Seok wrote:
> > On Mon, 2020-10-05 at 14:42 +0100, Dave Martin wrote:
> > > On Tue, Sep 29, 2020 at 01:57:43PM -0700, Chang S. Bae wrote:
> > > > 
> > > > +/*
> > > > + * The FP state frame contains an XSAVE buffer which must be 64-byte aligned.
> > > > + * If a signal frame starts at an unaligned address, extra space is required.
> > > > + * This is the max alignment padding, conservatively.
> > > > + */
> > > > +#define MAX_XSAVE_PADDING	63UL
> > > > +
> > > > +/*
> > > > + * The frame data is composed of the following areas and laid out as:
> > > > + *
> > > > + * -------------------------
> > > > + * | alignment padding     |
> > > > + * -------------------------
> > > > + * | (f)xsave frame        |
> > > > + * -------------------------
> > > > + * | fsave header          |
> > > > + * -------------------------
> > > > + * | siginfo + ucontext    |
> > > > + * -------------------------
> > > > + */
> > > > +
> > > > +/* max_frame_size tells userspace the worst case signal stack size. */
> > > > +static unsigned long __ro_after_init max_frame_size;
> > > > +
> > > > +void __init init_sigframe_size(void)
> > > > +{
> > > > +	/*
> > > > +	 * Use the largest of possible structure formats. This might
> > > > +	 * slightly oversize the frame for 64-bit apps.
> > > > +	 */
> > > > +
> > > > +	if (IS_ENABLED(CONFIG_X86_32) ||
> > > > +	    IS_ENABLED(CONFIG_IA32_EMULATION))
> > > > +		max_frame_size = max((unsigned long)SIZEOF_sigframe_ia32,
> > > > +				     (unsigned long)SIZEOF_rt_sigframe_ia32);
> > > > +
> > > > +	if (IS_ENABLED(CONFIG_X86_X32_ABI))
> > > > +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe_x32);
> > > > +
> > > > +	if (IS_ENABLED(CONFIG_X86_64))
> > > > +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe);
> > > > +
> > > > +	max_frame_size += fpu__get_fpstate_sigframe_size() + MAX_XSAVE_PADDING;
> > > 
> > > For arm64, we round the worst-case padding up by one.
> > > 
> > 
> > Yeah, I saw that. The ARM code adds the max padding, too:
> > 
> > 	signal_minsigstksz = sigframe_size(&user) +
> > 		round_up(sizeof(struct frame_record), 16) +
> > 		16; /* max alignment padding */
> > 
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kernel/signal.c#n973
> > 
> > > I can't remember the full rationale for this, but it at least seemed a
> > > bit weird to report a size that is not a multiple of the alignment.
> > > 
> > 
> > Because the last state size of XSAVE may not be 64B aligned, the (reported)
> > sum of xstate size here does not guarantee 64B alignment.
> > 
> > > I'm can't think of a clear argument as to why it really matters, though.
> > 
> > We care about the start of XSAVE buffer for the XSAVE instructions, to be
> > 64B-aligned.
> 
> Ah, I see.  That makes sense.
> 
> For arm64, there is no additional alignment padding inside the frame,
> only the padding inserted after the frame to ensure that the base
> address is 16-byte aligned.
> 
> However, I wonder whether people will tend to assume that AT_MINSIGSTKSZ
> is a sensible (if minimal) amount of stack to allocate.  Allocating an
> odd number of bytes, or any amount that isn't a multiple of the
> architecture's preferred (or mandated) stack alignment probably doesn't
> make sense.
> 
> AArch64 has a mandatory stack alignment of 16 bytes; I'm not sure about
> x86.

The x86 ABI looks to require 16-byte alignment (for both 32-/64-bit modes).
FWIW, the 32-bit ABI got changed from 4-byte alignement.

Thank you for brining up the point. Ack. The kernel is expected to return a
16-byte aligned size. I made this change after a discussion with H.J.:

diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index c042236ef52e..52815d7c08fb 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -212,6 +212,11 @@ do {							
		\
  * Set up a signal frame.
  */
 
+/* x86 ABI requires 16-byte alignment */
+#define FRAME_ALIGNMENT	16UL
+
+#define MAX_FRAME_PADDING	FRAME_ALIGNMENT - 1
+
 /*
  * Determine which stack to use..
  */
@@ -222,9 +227,9 @@ static unsigned long align_sigframe(unsigned long sp)
 	 * Align the stack pointer according to the i386 ABI,
 	 * i.e. so that on function entry ((sp + 4) & 15) == 0.
 	 */
-	sp = ((sp + 4) & -16ul) - 4;
+	sp = ((sp + 4) & -FRAME_ALIGNMENT) - 4;
 #else /* !CONFIG_X86_32 */
-	sp = round_down(sp, 16) - 8;
+	sp = round_down(sp, FRAME_ALIGNMENT) - 8;
 #endif
 	return sp;
 }
@@ -404,7 +409,7 @@ static int __setup_rt_frame(int sig, struct ksignal
*ksig,
 	unsafe_put_sigcontext(&frame->uc.uc_mcontext, fp, regs, set,
Efault);
 	unsafe_put_sigmask(set, frame, Efault);
 	user_access_end();
-	
+
 	if (copy_siginfo_to_user(&frame->info, &ksig->info))
 		return -EFAULT;
 
@@ -685,6 +690,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
  * -------------------------
  * | fsave header          |
  * -------------------------
+ * | alignment padding     |
+ * -------------------------
  * | siginfo + ucontext    |
  * -------------------------
  */
@@ -710,7 +717,12 @@ void __init init_sigframe_size(void)
 	if (IS_ENABLED(CONFIG_X86_64))
 		max_frame_size = max(max_frame_size, (unsigned
long)SIZEOF_rt_sigframe);
 
+	max_frame_size += MAX_FRAME_PADDING;
+
 	max_frame_size += fpu__get_fpstate_sigframe_size() +
MAX_XSAVE_PADDING;
+
+	/* Userspace expects an aligned size. */
+	max_frame_size = round_up(max_frame_size, FRAME_ALIGNMENT);
 }
 
 unsigned long get_sigframe_size(void)


Thanks,
Chang




^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size
  2020-10-08 22:43         ` Bae, Chang Seok
@ 2020-10-12 13:26           ` Dave Martin
  0 siblings, 0 replies; 31+ messages in thread
From: Dave Martin @ 2020-10-12 13:26 UTC (permalink / raw)
  To: Bae, Chang Seok
  Cc: mingo, linux-kernel, hjl.tools, bp, tglx, linux-api, libc-alpha,
	x86, luto, Shankar, Ravi V, Hansen, Dave, Luck, Tony, Brown, Len,
	linux-arch, mpe

On Thu, Oct 08, 2020 at 10:43:50PM +0000, Bae, Chang Seok wrote:
> On Wed, 2020-10-07 at 11:05 +0100, Dave Martin wrote:
> > On Tue, Oct 06, 2020 at 05:45:24PM +0000, Bae, Chang Seok wrote:
> > > On Mon, 2020-10-05 at 14:42 +0100, Dave Martin wrote:
> > > > On Tue, Sep 29, 2020 at 01:57:43PM -0700, Chang S. Bae wrote:
> > > > > 
> > > > > +/*
> > > > > + * The FP state frame contains an XSAVE buffer which must be 64-byte aligned.
> > > > > + * If a signal frame starts at an unaligned address, extra space is required.
> > > > > + * This is the max alignment padding, conservatively.
> > > > > + */
> > > > > +#define MAX_XSAVE_PADDING	63UL
> > > > > +
> > > > > +/*
> > > > > + * The frame data is composed of the following areas and laid out as:
> > > > > + *
> > > > > + * -------------------------
> > > > > + * | alignment padding     |
> > > > > + * -------------------------
> > > > > + * | (f)xsave frame        |
> > > > > + * -------------------------
> > > > > + * | fsave header          |
> > > > > + * -------------------------
> > > > > + * | siginfo + ucontext    |
> > > > > + * -------------------------
> > > > > + */
> > > > > +
> > > > > +/* max_frame_size tells userspace the worst case signal stack size. */
> > > > > +static unsigned long __ro_after_init max_frame_size;
> > > > > +
> > > > > +void __init init_sigframe_size(void)
> > > > > +{
> > > > > +	/*
> > > > > +	 * Use the largest of possible structure formats. This might
> > > > > +	 * slightly oversize the frame for 64-bit apps.
> > > > > +	 */
> > > > > +
> > > > > +	if (IS_ENABLED(CONFIG_X86_32) ||
> > > > > +	    IS_ENABLED(CONFIG_IA32_EMULATION))
> > > > > +		max_frame_size = max((unsigned long)SIZEOF_sigframe_ia32,
> > > > > +				     (unsigned long)SIZEOF_rt_sigframe_ia32);
> > > > > +
> > > > > +	if (IS_ENABLED(CONFIG_X86_X32_ABI))
> > > > > +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe_x32);
> > > > > +
> > > > > +	if (IS_ENABLED(CONFIG_X86_64))
> > > > > +		max_frame_size = max(max_frame_size, (unsigned long)SIZEOF_rt_sigframe);
> > > > > +
> > > > > +	max_frame_size += fpu__get_fpstate_sigframe_size() + MAX_XSAVE_PADDING;
> > > > 
> > > > For arm64, we round the worst-case padding up by one.
> > > > 
> > > 
> > > Yeah, I saw that. The ARM code adds the max padding, too:
> > > 
> > > 	signal_minsigstksz = sigframe_size(&user) +
> > > 		round_up(sizeof(struct frame_record), 16) +
> > > 		16; /* max alignment padding */
> > > 
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/kernel/signal.c#n973
> > > 
> > > > I can't remember the full rationale for this, but it at least seemed a
> > > > bit weird to report a size that is not a multiple of the alignment.
> > > > 
> > > 
> > > Because the last state size of XSAVE may not be 64B aligned, the (reported)
> > > sum of xstate size here does not guarantee 64B alignment.
> > > 
> > > > I'm can't think of a clear argument as to why it really matters, though.
> > > 
> > > We care about the start of XSAVE buffer for the XSAVE instructions, to be
> > > 64B-aligned.
> > 
> > Ah, I see.  That makes sense.
> > 
> > For arm64, there is no additional alignment padding inside the frame,
> > only the padding inserted after the frame to ensure that the base
> > address is 16-byte aligned.
> > 
> > However, I wonder whether people will tend to assume that AT_MINSIGSTKSZ
> > is a sensible (if minimal) amount of stack to allocate.  Allocating an
> > odd number of bytes, or any amount that isn't a multiple of the
> > architecture's preferred (or mandated) stack alignment probably doesn't
> > make sense.
> > 
> > AArch64 has a mandatory stack alignment of 16 bytes; I'm not sure about
> > x86.
> 
> The x86 ABI looks to require 16-byte alignment (for both 32-/64-bit modes).
> FWIW, the 32-bit ABI got changed from 4-byte alignement.
> 
> Thank you for brining up the point. Ack. The kernel is expected to return a
> 16-byte aligned size. I made this change after a discussion with H.J.:
> 
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index c042236ef52e..52815d7c08fb 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -212,6 +212,11 @@ do {							
> 		\
>   * Set up a signal frame.
>   */
>  
> +/* x86 ABI requires 16-byte alignment */
> +#define FRAME_ALIGNMENT	16UL
> +
> +#define MAX_FRAME_PADDING	FRAME_ALIGNMENT - 1
> +

You might want () here, to avoid future surpsises.

>  /*
>   * Determine which stack to use..
>   */
> @@ -222,9 +227,9 @@ static unsigned long align_sigframe(unsigned long sp)
>  	 * Align the stack pointer according to the i386 ABI,
>  	 * i.e. so that on function entry ((sp + 4) & 15) == 0.
>  	 */
> -	sp = ((sp + 4) & -16ul) - 4;
> +	sp = ((sp + 4) & -FRAME_ALIGNMENT) - 4;
>  #else /* !CONFIG_X86_32 */
> -	sp = round_down(sp, 16) - 8;
> +	sp = round_down(sp, FRAME_ALIGNMENT) - 8;
>  #endif
>  	return sp;
>  }
> @@ -404,7 +409,7 @@ static int __setup_rt_frame(int sig, struct ksignal
> *ksig,
>  	unsafe_put_sigcontext(&frame->uc.uc_mcontext, fp, regs, set,
> Efault);
>  	unsafe_put_sigmask(set, frame, Efault);
>  	user_access_end();
> -	
> +
>  	if (copy_siginfo_to_user(&frame->info, &ksig->info))
>  		return -EFAULT;
>  
> @@ -685,6 +690,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
>   * -------------------------
>   * | fsave header          |
>   * -------------------------
> + * | alignment padding     |
> + * -------------------------
>   * | siginfo + ucontext    |
>   * -------------------------
>   */
> @@ -710,7 +717,12 @@ void __init init_sigframe_size(void)
>  	if (IS_ENABLED(CONFIG_X86_64))
>  		max_frame_size = max(max_frame_size, (unsigned
> long)SIZEOF_rt_sigframe);
>  
> +	max_frame_size += MAX_FRAME_PADDING;
> +
>  	max_frame_size += fpu__get_fpstate_sigframe_size() +
> MAX_XSAVE_PADDING;
> +
> +	/* Userspace expects an aligned size. */
> +	max_frame_size = round_up(max_frame_size, FRAME_ALIGNMENT);
>  }

[...]

Seems reasonable, I guess.

(I won't comment on the x86 ABI specifics.)

Cheers
---Dave

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, back to index

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-29 20:57 [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Chang S. Bae
2020-09-29 20:57 ` [RFC PATCH 1/4] x86/signal: Introduce helpers to get the maximum signal frame size Chang S. Bae
2020-10-05 13:42   ` Dave Martin
2020-10-06 17:45     ` Bae, Chang Seok
2020-10-07 10:05       ` Dave Martin
2020-10-08 22:43         ` Bae, Chang Seok
2020-10-12 13:26           ` Dave Martin
2020-09-29 20:57 ` [RFC PATCH 2/4] x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ Chang S. Bae
2020-09-29 20:57 ` [RFC PATCH 3/4] x86/signal: Prevent an alternate stack overflow before a signal delivery Chang S. Bae
2020-09-29 20:57 ` [RFC PATCH 4/4] selftest/x86/signal: Include test cases for validating sigaltstack Chang S. Bae
2020-10-05 13:45 ` [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Dave Martin
2020-10-05 21:17   ` H.J. Lu
2020-10-06  9:25     ` Dave Martin
2020-10-06 12:12       ` H.J. Lu
2020-10-06 15:18         ` H.J. Lu
2020-10-06 15:43           ` Dave Martin
2020-10-06 16:52             ` H.J. Lu
2020-10-06 15:25         ` Dave Martin
2020-10-06 15:33           ` Dave Hansen
2020-10-06 17:00             ` Dave Martin
2020-10-06 18:21               ` Florian Weimer
2020-10-07 10:19                 ` Dave Martin
2020-10-06 18:30               ` Dave Hansen
2020-10-07 10:20                 ` Dave Martin
2020-10-06 15:34           ` H.J. Lu
2020-10-06 16:55             ` Dave Martin
2020-10-06 17:44               ` H.J. Lu
2020-10-07 10:47                 ` Dave Martin
2020-10-07 13:30                   ` H.J. Lu
2020-10-07 15:45                     ` Dave Martin
2020-10-07 17:43                       ` H.J. Lu

Linux-arch Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-arch/0 linux-arch/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-arch linux-arch/ https://lore.kernel.org/linux-arch \
		linux-arch@vger.kernel.org
	public-inbox-index linux-arch

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-arch


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git