All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/5] arm64: Signal context expansion
@ 2016-09-09 14:15 Dave Martin
  2016-09-09 14:15 ` [RFC PATCH 1/5] arm64: signal: Refactor sigcontext parsing in rt_sigreturn Dave Martin
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

In preparation for supporting the ARM Scalable Vector Extension in the
Linux kernel, it is necessary to be able to save more data in the signal
frame than can be accommodated in the existing arm64 signal frame
structure.


To minimise ABI impact the current structures are retained as-is, and a
new extra_context record is defined, which (if present) declares some
additional space beyond the standard arm64 signal frame.

This new record can be added in sigframe.__reserved[] if there is a
need to allocate extra space beyond the standard signal frame.  The
extra block of memory referenced by extra_context can then be parsed in
the same way as sigcontext.__reserved[].  Old code should just ignore
the whole thing as an unrecognised record.  To maintain backward
compatibility, signal context records defined today are always placed
directly in __reserved[], never in the block referenced by
extra_context.

This series also does some refactoring in order to make it easier to add
records to the signal frame in the future, using the new expansion
mechanism.  The extra_context mechanism is hidden behind a simple
allocator that grows and signal frame only as needed.


Note: this series makes rt_sigreturn somewhat stricter about parsing the
signal frame than previously: violators will be SEGV'd.  This should
only be an ABI break for software that was doing something wrong in the
first place, but I'm open to suggestions if people have concerns about
it.


Dave Martin (5):
  arm64: signal: Refactor sigcontext parsing in rt_sigreturn
  arm64: signal: factor frame layout and population into separate passes
  arm64: signal: factor out signal frame record allocation
  arm64: signal: Allocate extra sigcontext space as needed
  arm64: signal: Parse extra_context during sigreturn

 arch/arm64/include/uapi/asm/sigcontext.h |  27 +++
 arch/arm64/kernel/signal.c               | 355 ++++++++++++++++++++++++++++---
 2 files changed, 350 insertions(+), 32 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 1/5] arm64: signal: Refactor sigcontext parsing in rt_sigreturn
  2016-09-09 14:15 [RFC PATCH 0/5] arm64: Signal context expansion Dave Martin
@ 2016-09-09 14:15 ` Dave Martin
  2016-09-09 14:15 ` [RFC PATCH 2/5] arm64: signal: factor frame layout and population into separate passes Dave Martin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

Currently, rt_sigreturn does very limited checking on the
sigcontext coming from userspace.

Future additions of extra dynamic sigcontext data will increase the
potential for surprises.  Also, it is not clear whether the
sigcontext extension records are supposed to occur in a particular
order.

This patch factors out the sigcontext parsing into a separate
function, and adds extra checks to validate the well-formedness of
the sigcontext structure.

To help with this, an abstraction for the signal frame layout is
also added, using offsets to track the location of different
records in the frame.  Although trivial, this provides a base to
extend upon in order to track more complex layouts.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kernel/signal.c | 121 +++++++++++++++++++++++++++++++++++++--------
 1 file changed, 101 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index a8eafdb..8e0e0a3 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -19,9 +19,11 @@
 
 #include <linux/compat.h>
 #include <linux/errno.h>
+#include <linux/kernel.h>
 #include <linux/signal.h>
 #include <linux/personality.h>
 #include <linux/freezer.h>
+#include <linux/stddef.h>
 #include <linux/uaccess.h>
 #include <linux/tracehook.h>
 #include <linux/ratelimit.h>
@@ -45,6 +47,10 @@ struct rt_sigframe {
 	u64 lr;
 };
 
+struct rt_sigframe_user_layout {
+	struct rt_sigframe __user *sigframe;
+};
+
 static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
 {
 	struct fpsimd_state *fpsimd = &current->thread.fpsimd_state;
@@ -92,12 +98,86 @@ static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
 	return err ? -EFAULT : 0;
 }
 
+struct user_ctxs {
+	struct fpsimd_context __user *fpsimd;
+};
+
+static int parse_user_sigframe(struct user_ctxs *user,
+			       struct rt_sigframe __user *sf)
+{
+	struct sigcontext __user *sc = &sf->uc.uc_mcontext;
+	struct _aarch64_ctx __user *head =
+		(struct _aarch64_ctx __user *)&sc->__reserved;
+	size_t offset = 0;
+
+	user->fpsimd = NULL;
+
+	while (1) {
+		int err;
+		u32 magic, size;
+
+		head = (struct _aarch64_ctx __user *)&sc->__reserved[offset];
+		if (!IS_ALIGNED((unsigned long)head, 16))
+			goto invalid;
+
+		err = 0;
+		__get_user_error(magic, &head->magic, err);
+		__get_user_error(size, &head->size, err);
+		if (err)
+			return err;
+
+		switch (magic) {
+		case 0:
+			if (size)
+				goto invalid;
+
+			goto done;
+
+		case FPSIMD_MAGIC:
+			if (user->fpsimd)
+				goto invalid;
+
+			if (offset > sizeof(sc->__reserved) -
+					sizeof(*user->fpsimd) ||
+			    size < sizeof(*user->fpsimd))
+				goto invalid;
+
+			user->fpsimd = (struct fpsimd_context __user *)head;
+			break;
+
+		case ESR_MAGIC:
+			/* ignore */
+			break;
+
+		default:
+			goto invalid;
+		}
+
+		if (size < sizeof(*head))
+			goto invalid;
+
+		if (size > sizeof(sc->__reserved) - (sizeof(*head) + offset))
+			goto invalid;
+
+		offset += size;
+	}
+
+done:
+	if (!user->fpsimd)
+		goto invalid;
+
+	return 0;
+
+invalid:
+	return -EINVAL;
+}
+
 static int restore_sigframe(struct pt_regs *regs,
 			    struct rt_sigframe __user *sf)
 {
 	sigset_t set;
 	int i, err;
-	void *aux = sf->uc.uc_mcontext.__reserved;
+	struct user_ctxs user;
 
 	err = __copy_from_user(&set, &sf->uc.uc_sigmask, sizeof(set));
 	if (err == 0)
@@ -116,12 +196,11 @@ static int restore_sigframe(struct pt_regs *regs,
 	regs->syscallno = ~0UL;
 
 	err |= !valid_user_regs(&regs->user_regs, current);
+	if (err == 0)
+		err = parse_user_sigframe(&user, sf);
 
-	if (err == 0) {
-		struct fpsimd_context *fpsimd_ctx =
-			container_of(aux, struct fpsimd_context, head);
-		err |= restore_fpsimd_context(fpsimd_ctx);
-	}
+	if (err == 0)
+		err = restore_fpsimd_context(user.fpsimd);
 
 	return err;
 }
@@ -162,10 +241,11 @@ badframe:
 	return 0;
 }
 
-static int setup_sigframe(struct rt_sigframe __user *sf,
+static int setup_sigframe(struct rt_sigframe_user_layout *user,
 			  struct pt_regs *regs, sigset_t *set)
 {
 	int i, err = 0;
+	struct rt_sigframe __user *sf = user->sigframe;
 	void *aux = sf->uc.uc_mcontext.__reserved;
 	struct _aarch64_ctx *end;
 
@@ -209,33 +289,32 @@ static int setup_sigframe(struct rt_sigframe __user *sf,
 	return err;
 }
 
-static struct rt_sigframe __user *get_sigframe(struct ksignal *ksig,
-					       struct pt_regs *regs)
+static int get_sigframe(struct rt_sigframe_user_layout *user,
+			 struct ksignal *ksig, struct pt_regs *regs)
 {
 	unsigned long sp, sp_top;
-	struct rt_sigframe __user *frame;
 
 	sp = sp_top = sigsp(regs->sp, ksig);
 
 	sp = (sp - sizeof(struct rt_sigframe)) & ~15;
-	frame = (struct rt_sigframe __user *)sp;
+	user->sigframe = (struct rt_sigframe __user *)sp;
 
 	/*
 	 * Check that we can actually write to the signal frame.
 	 */
-	if (!access_ok(VERIFY_WRITE, frame, sp_top - sp))
-		frame = NULL;
+	if (!access_ok(VERIFY_WRITE, user->sigframe, sp_top - sp))
+		return -EFAULT;
 
-	return frame;
+	return 0;
 }
 
 static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
-			 void __user *frame, int usig)
+			 struct rt_sigframe_user_layout *user, int usig)
 {
 	__sigrestore_t sigtramp;
 
 	regs->regs[0] = usig;
-	regs->sp = (unsigned long)frame;
+	regs->sp = (unsigned long)user->sigframe;
 	regs->regs[29] = regs->sp + offsetof(struct rt_sigframe, fp);
 	regs->pc = (unsigned long)ka->sa.sa_handler;
 
@@ -250,20 +329,22 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
 static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
 			  struct pt_regs *regs)
 {
+	struct rt_sigframe_user_layout user;
 	struct rt_sigframe __user *frame;
 	int err = 0;
 
-	frame = get_sigframe(ksig, regs);
-	if (!frame)
+	if (get_sigframe(&user, ksig, regs))
 		return 1;
 
+	frame = user.sigframe;
+
 	__put_user_error(0, &frame->uc.uc_flags, err);
 	__put_user_error(NULL, &frame->uc.uc_link, err);
 
 	err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
-	err |= setup_sigframe(frame, regs, set);
+	err |= setup_sigframe(&user, regs, set);
 	if (err == 0) {
-		setup_return(regs, &ksig->ka, frame, usig);
+		setup_return(regs, &ksig->ka, &user, usig);
 		if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
 			err |= copy_siginfo_to_user(&frame->info, &ksig->info);
 			regs->regs[1] = (unsigned long)&frame->info;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 2/5] arm64: signal: factor frame layout and population into separate passes
  2016-09-09 14:15 [RFC PATCH 0/5] arm64: Signal context expansion Dave Martin
  2016-09-09 14:15 ` [RFC PATCH 1/5] arm64: signal: Refactor sigcontext parsing in rt_sigreturn Dave Martin
@ 2016-09-09 14:15 ` Dave Martin
  2016-09-09 14:15 ` [RFC PATCH 3/5] arm64: signal: factor out signal frame record allocation Dave Martin
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

In preparation for expanding the signal frame, this patch refactors
the signal frame setup code in setup_sigframe() into two separate
passes.

The first pass, setup_sigframe_layout(), determines the size of the
signal frame and its internal layout, including the presence and
location of optional records.  The resulting knowledge is used to
allocate and locate the user stack space required for the signal
frame and to determine which optional records to include.

The second pass, setup_sigframe(), is called once the stack frame
is allocated in order to populate it with the necessary context
information.

This change has no effect on the signal ABI, but will make it
easier to expand the signal frame in future patches.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kernel/signal.c | 111 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 89 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 8e0e0a3..3bce940 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -25,6 +25,7 @@
 #include <linux/freezer.h>
 #include <linux/stddef.h>
 #include <linux/uaccess.h>
+#include <linux/string.h>
 #include <linux/tracehook.h>
 #include <linux/ratelimit.h>
 
@@ -49,8 +50,39 @@ struct rt_sigframe {
 
 struct rt_sigframe_user_layout {
 	struct rt_sigframe __user *sigframe;
+
+	unsigned long size;	/* size of allocated sigframe data */
+	unsigned long limit;	/* largest allowed size */
+
+	unsigned long fpsimd_offset;
+	unsigned long esr_offset;
+	unsigned long end_offset;
 };
 
+static void init_user_layout(struct rt_sigframe_user_layout *user)
+{
+	memset(user, 0, sizeof *user);
+	user->size = offsetof(struct rt_sigframe, uc.uc_mcontext.__reserved);
+
+	user->limit = user->size +
+		sizeof(user->sigframe->uc.uc_mcontext.__reserved) -
+		round_up(sizeof(struct _aarch64_ctx), 16);
+		/* ^ reserve space for terminator */
+}
+
+static size_t sigframe_size(struct rt_sigframe_user_layout const *user)
+{
+	return round_up(max(user->size, sizeof(struct rt_sigframe)), 16);
+}
+
+static void __user *apply_user_offset(
+	struct rt_sigframe_user_layout const *user, unsigned long offset)
+{
+	char __user *base = (char __user *)user->sigframe;
+
+	return base + offset;
+}
+
 static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
 {
 	struct fpsimd_state *fpsimd = &current->thread.fpsimd_state;
@@ -106,26 +138,35 @@ static int parse_user_sigframe(struct user_ctxs *user,
 			       struct rt_sigframe __user *sf)
 {
 	struct sigcontext __user *sc = &sf->uc.uc_mcontext;
-	struct _aarch64_ctx __user *head =
-		(struct _aarch64_ctx __user *)&sc->__reserved;
+	struct _aarch64_ctx __user *head;
+	char __user *base = (char __user *)&sc->__reserved;
 	size_t offset = 0;
+	size_t limit = sizeof(sc->__reserved);
 
 	user->fpsimd = NULL;
 
+	if (!IS_ALIGNED((unsigned long)base, 16))
+		goto invalid;
+
 	while (1) {
-		int err;
+		int err = 0;
 		u32 magic, size;
 
-		head = (struct _aarch64_ctx __user *)&sc->__reserved[offset];
-		if (!IS_ALIGNED((unsigned long)head, 16))
+		if (limit - offset < sizeof(*head))
 			goto invalid;
 
-		err = 0;
+		if (!IS_ALIGNED(offset, 16))
+			goto invalid;
+
+		head = (struct _aarch64_ctx __user *)(base + offset);
 		__get_user_error(magic, &head->magic, err);
 		__get_user_error(size, &head->size, err);
 		if (err)
 			return err;
 
+		if (limit - offset < size)
+			goto invalid;
+
 		switch (magic) {
 		case 0:
 			if (size)
@@ -137,9 +178,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
 			if (user->fpsimd)
 				goto invalid;
 
-			if (offset > sizeof(sc->__reserved) -
-					sizeof(*user->fpsimd) ||
-			    size < sizeof(*user->fpsimd))
+			if (size < sizeof(*user->fpsimd))
 				goto invalid;
 
 			user->fpsimd = (struct fpsimd_context __user *)head;
@@ -156,7 +195,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
 		if (size < sizeof(*head))
 			goto invalid;
 
-		if (size > sizeof(sc->__reserved) - (sizeof(*head) + offset))
+		if (limit - offset < size)
 			goto invalid;
 
 		offset += size;
@@ -241,13 +280,30 @@ badframe:
 	return 0;
 }
 
+/* Determine the layout of optional records in the signal frame */
+static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
+{
+	user->fpsimd_offset = user->size;
+	user->size += round_up(sizeof(struct fpsimd_context), 16);
+
+	/* fault information, if valid */
+	if (current->thread.fault_code) {
+		user->esr_offset = user->size;
+		user->size += round_up(sizeof(struct esr_context), 16);
+	}
+
+	/* set the "end" magic */
+	user->end_offset = user->size;
+
+	return 0;
+}
+
+
 static int setup_sigframe(struct rt_sigframe_user_layout *user,
 			  struct pt_regs *regs, sigset_t *set)
 {
 	int i, err = 0;
 	struct rt_sigframe __user *sf = user->sigframe;
-	void *aux = sf->uc.uc_mcontext.__reserved;
-	struct _aarch64_ctx *end;
 
 	/* set up the stack frame for unwinding */
 	__put_user_error(regs->regs[29], &sf->fp, err);
@@ -265,26 +321,29 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
 	err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(*set));
 
 	if (err == 0) {
-		struct fpsimd_context *fpsimd_ctx =
-			container_of(aux, struct fpsimd_context, head);
+		struct fpsimd_context __user *fpsimd_ctx =
+			apply_user_offset(user, user->fpsimd_offset);
 		err |= preserve_fpsimd_context(fpsimd_ctx);
-		aux += sizeof(*fpsimd_ctx);
 	}
 
 	/* fault information, if valid */
-	if (current->thread.fault_code) {
-		struct esr_context *esr_ctx =
-			container_of(aux, struct esr_context, head);
+	if (err == 0 && user->esr_offset) {
+		struct esr_context __user *esr_ctx =
+			apply_user_offset(user, user->esr_offset);
+
 		__put_user_error(ESR_MAGIC, &esr_ctx->head.magic, err);
 		__put_user_error(sizeof(*esr_ctx), &esr_ctx->head.size, err);
 		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
-		aux += sizeof(*esr_ctx);
 	}
 
 	/* set the "end" magic */
-	end = aux;
-	__put_user_error(0, &end->magic, err);
-	__put_user_error(0, &end->size, err);
+	if (err == 0) {
+		struct _aarch64_ctx __user *end =
+			apply_user_offset(user, user->end_offset);
+
+		__put_user_error(0, &end->magic, err);
+		__put_user_error(0, &end->size, err);
+	}
 
 	return err;
 }
@@ -293,10 +352,18 @@ static int get_sigframe(struct rt_sigframe_user_layout *user,
 			 struct ksignal *ksig, struct pt_regs *regs)
 {
 	unsigned long sp, sp_top;
+	int err;
+
+	init_user_layout(user);
 
 	sp = sp_top = sigsp(regs->sp, ksig);
 
 	sp = (sp - sizeof(struct rt_sigframe)) & ~15;
+	err = setup_sigframe_layout(user);
+	if (err)
+		return err;
+
+	sp -= sigframe_size(user);
 	user->sigframe = (struct rt_sigframe __user *)sp;
 
 	/*
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 3/5] arm64: signal: factor out signal frame record allocation
  2016-09-09 14:15 [RFC PATCH 0/5] arm64: Signal context expansion Dave Martin
  2016-09-09 14:15 ` [RFC PATCH 1/5] arm64: signal: Refactor sigcontext parsing in rt_sigreturn Dave Martin
  2016-09-09 14:15 ` [RFC PATCH 2/5] arm64: signal: factor frame layout and population into separate passes Dave Martin
@ 2016-09-09 14:15 ` Dave Martin
  2016-09-09 14:15 ` [RFC PATCH 4/5] arm64: signal: Allocate extra sigcontext space as needed Dave Martin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

Factor out the allocator for signal frame optional records into a
separate function, to ensure consistency and facilitate later
expansion of the signal frame.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kernel/signal.c | 43 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 36 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 3bce940..963d2ba 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -75,6 +75,22 @@ static size_t sigframe_size(struct rt_sigframe_user_layout const *user)
 	return round_up(max(user->size, sizeof(struct rt_sigframe)), 16);
 }
 
+/*
+ * Allocate space for an optional record of <size> bytes in the user
+ * signal frame.  The offset from the signal frame base address to the
+ * allocated block is assigned to *offset.
+ */
+static int sigframe_alloc(struct rt_sigframe_user_layout *user,
+			  unsigned long *offset, size_t size)
+{
+	size_t padded_size = round_up(size, 16);
+
+	*offset = user->size;
+	user->size += padded_size;
+
+	return 0;
+}
+
 static void __user *apply_user_offset(
 	struct rt_sigframe_user_layout const *user, unsigned long offset)
 {
@@ -283,19 +299,32 @@ badframe:
 /* Determine the layout of optional records in the signal frame */
 static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
 {
-	user->fpsimd_offset = user->size;
-	user->size += round_up(sizeof(struct fpsimd_context), 16);
+	int err;
+
+	err = sigframe_alloc(user, &user->fpsimd_offset,
+			     sizeof(struct fpsimd_context));
+	if (err)
+		return err;
 
 	/* fault information, if valid */
 	if (current->thread.fault_code) {
-		user->esr_offset = user->size;
-		user->size += round_up(sizeof(struct esr_context), 16);
+		err = sigframe_alloc(user, &user->esr_offset,
+				     sizeof(struct esr_context));
+		if (err)
+			return err;
 	}
 
-	/* set the "end" magic */
-	user->end_offset = user->size;
+	/*
+	 * Allocate space for the terminator record.
+	 * HACK: here we undo the reservation of space for the end record.
+	 * This bodge should be replaced with a cleaner approach later on.
+	 */
+	user->limit = offsetof(struct rt_sigframe, uc.uc_mcontext.__reserved) +
+		sizeof(user->sigframe->uc.uc_mcontext.__reserved);
 
-	return 0;
+	err = sigframe_alloc(user, &user->end_offset,
+			     sizeof(struct _aarch64_ctx));
+	return err;
 }
 
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 4/5] arm64: signal: Allocate extra sigcontext space as needed
  2016-09-09 14:15 [RFC PATCH 0/5] arm64: Signal context expansion Dave Martin
                   ` (2 preceding siblings ...)
  2016-09-09 14:15 ` [RFC PATCH 3/5] arm64: signal: factor out signal frame record allocation Dave Martin
@ 2016-09-09 14:15 ` Dave Martin
  2016-09-09 14:15 ` [RFC PATCH 5/5] arm64: signal: Parse extra_context during sigreturn Dave Martin
  2016-09-09 14:39 ` [RFC PATCH 0/5] arm64: Signal context expansion Florian Weimer
  5 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

This patch modifies the context block allocator to create an
extra_context expansion block as necessary, and adds the necessary
code to populate, parse and decode this block.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/uapi/asm/sigcontext.h |  27 ++++++++
 arch/arm64/kernel/signal.c               | 112 +++++++++++++++++++++++++------
 2 files changed, 120 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
index ee469be..1af8437 100644
--- a/arch/arm64/include/uapi/asm/sigcontext.h
+++ b/arch/arm64/include/uapi/asm/sigcontext.h
@@ -61,4 +61,31 @@ struct esr_context {
 	__u64 esr;
 };
 
+/*
+ * Pointer to extra space for additional structures that don't fit in
+ * sigcontext.__reserved[].  Note:
+ *
+ * 1) fpsimd_context, esr_context and extra_context must be placed in
+ * sigcontext.__reserved[] if present.  They cannot be placed in the
+ * extra space.  Any other record can be placed either in the extra
+ * space or in sigcontext.__reserved[].
+ *
+ * 2) There must not be more than one extra_context.
+ *
+ * 3) If extra_context is present, it must be followed immediately in
+ * sigcontext.__reserved[] by the terminating null _aarch64_ctx (i.e.,
+ * extra_context must be the last record in sigcontext.__reserved[]
+ * except for the terminator).
+ *
+ * 4) The extra space must itself be terminated with a null
+ * _aarch64_ctx.
+ */
+#define EXTRA_MAGIC	0x45585401
+
+struct extra_context {
+	struct _aarch64_ctx head;
+	void *data;	/* 16-byte aligned pointer to the extra space */
+	__u32 size;	/* size in bytes of the extra space */
+};
+
 #endif /* _UAPI__ASM_SIGCONTEXT_H */
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 963d2ba..d7cdb51 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -25,6 +25,7 @@
 #include <linux/freezer.h>
 #include <linux/stddef.h>
 #include <linux/uaccess.h>
+#include <linux/sizes.h>
 #include <linux/string.h>
 #include <linux/tracehook.h>
 #include <linux/ratelimit.h>
@@ -56,18 +57,22 @@ struct rt_sigframe_user_layout {
 
 	unsigned long fpsimd_offset;
 	unsigned long esr_offset;
+	unsigned long extra_offset;
 	unsigned long end_offset;
 };
 
 static void init_user_layout(struct rt_sigframe_user_layout *user)
 {
+	const size_t __reserved_size =
+		sizeof(user->sigframe->uc.uc_mcontext.__reserved);
+	const size_t terminator_size =
+		round_up(sizeof(struct _aarch64_ctx), 16);
+
 	memset(user, 0, sizeof *user);
 	user->size = offsetof(struct rt_sigframe, uc.uc_mcontext.__reserved);
-
-	user->limit = user->size +
-		sizeof(user->sigframe->uc.uc_mcontext.__reserved) -
-		round_up(sizeof(struct _aarch64_ctx), 16);
-		/* ^ reserve space for terminator */
+	user->limit = user->size + (__reserved_size - terminator_size -
+				    sizeof(struct extra_context));
+	/* Reserve space for extension and terminator ^ */
 }
 
 static size_t sigframe_size(struct rt_sigframe_user_layout const *user)
@@ -75,6 +80,49 @@ static size_t sigframe_size(struct rt_sigframe_user_layout const *user)
 	return round_up(max(user->size, sizeof(struct rt_sigframe)), 16);
 }
 
+/* Sanity limit on the maximum size of signal frame we'll try to generate. */
+/* This is NOT ABI. */
+#define SIGFRAME_MAXSZ SZ_64K
+
+static int __sigframe_alloc(struct rt_sigframe_user_layout *user,
+			    unsigned long *offset, size_t size, bool extend)
+{
+	size_t padded_size = round_up(size, 16);
+
+	if (padded_size > user->limit - user->size &&
+	    !user->extra_offset &&
+	    extend) {
+		int ret;
+
+		ret = __sigframe_alloc(user, &user->extra_offset,
+				       sizeof(struct extra_context), false);
+		if (ret)
+			return ret;
+
+		/*
+		 * Further allocations must go after the fixed-size
+		 * part of the signal frame:
+		 */
+		user->size = round_up(sizeof(struct rt_sigframe), 16);
+
+		/*
+		 * Allow expansion up to SIGFRAME_MAXSZ, ensuring space for
+		 * the terminator:
+		 */
+		user->limit = SIGFRAME_MAXSZ -
+			round_up(sizeof(struct _aarch64_ctx), 16);
+	}
+
+	/* Still not enough space?  Bad luck! */
+	if (padded_size > user->limit - user->size)
+		return -ENOMEM;
+
+	*offset = user->size;
+	user->size += padded_size;
+
+	return 0;
+}
+
 /*
  * Allocate space for an optional record of <size> bytes in the user
  * signal frame.  The offset from the signal frame base address to the
@@ -83,11 +131,26 @@ static size_t sigframe_size(struct rt_sigframe_user_layout const *user)
 static int sigframe_alloc(struct rt_sigframe_user_layout *user,
 			  unsigned long *offset, size_t size)
 {
-	size_t padded_size = round_up(size, 16);
+	return __sigframe_alloc(user, offset, size, true);
+}
 
-	*offset = user->size;
-	user->size += padded_size;
+/* Allocate the null terminator record and prevent further allocations */
+static int sigframe_alloc_end(struct rt_sigframe_user_layout *user)
+{
+	int ret;
+	const size_t terminator_size =
+		round_up(sizeof(struct _aarch64_ctx), 16);
+
+	/* Un-reserve the space reserved for the terminator: */
+	user->limit += terminator_size;
+
+	ret = sigframe_alloc(user, &user->end_offset,
+			     sizeof(struct _aarch64_ctx));
+	if (ret)
+		return ret;
 
+	/* Prevent further allocation: */
+	user->limit = user->size;
 	return 0;
 }
 
@@ -314,17 +377,7 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user)
 			return err;
 	}
 
-	/*
-	 * Allocate space for the terminator record.
-	 * HACK: here we undo the reservation of space for the end record.
-	 * This bodge should be replaced with a cleaner approach later on.
-	 */
-	user->limit = offsetof(struct rt_sigframe, uc.uc_mcontext.__reserved) +
-		sizeof(user->sigframe->uc.uc_mcontext.__reserved);
-
-	err = sigframe_alloc(user, &user->end_offset,
-			     sizeof(struct _aarch64_ctx));
-	return err;
+	return sigframe_alloc_end(user);
 }
 
 
@@ -365,6 +418,27 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
 		__put_user_error(current->thread.fault_code, &esr_ctx->esr, err);
 	}
 
+	if (err == 0 && user->extra_offset) {
+		struct extra_context __user *extra =
+			apply_user_offset(user, user->extra_offset);
+		struct _aarch64_ctx __user *end =
+			(struct _aarch64_ctx __user *)((char __user *)extra +
+				round_up(sizeof(*extra), 16));
+		void __user *extra_data = apply_user_offset(user,
+			round_up(sizeof(struct rt_sigframe), 16));
+		u32 extra_size = round_up(user->size, 16) -
+			round_up(sizeof(struct rt_sigframe), 16);
+
+		__put_user_error(EXTRA_MAGIC, &extra->head.magic, err);
+		__put_user_error(sizeof(*extra), &extra->head.size, err);
+		__put_user_error(extra_data, &extra->data, err);
+		__put_user_error(extra_size, &extra->size, err);
+
+		/* Add the terminator */
+		__put_user_error(0, &end->magic, err);
+		__put_user_error(0, &end->size, err);
+	}
+
 	/* set the "end" magic */
 	if (err == 0) {
 		struct _aarch64_ctx __user *end =
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 5/5] arm64: signal: Parse extra_context during sigreturn
  2016-09-09 14:15 [RFC PATCH 0/5] arm64: Signal context expansion Dave Martin
                   ` (3 preceding siblings ...)
  2016-09-09 14:15 ` [RFC PATCH 4/5] arm64: signal: Allocate extra sigcontext space as needed Dave Martin
@ 2016-09-09 14:15 ` Dave Martin
  2016-09-09 14:39 ` [RFC PATCH 0/5] arm64: Signal context expansion Florian Weimer
  5 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-09 14:15 UTC (permalink / raw)
  To: linux-arm-kernel

If extra_context is present, parse it.

To avoid abuse by userspace, this patch attempts to ensure that:
 * that no more than one extra_context is accepted;
 * that the extra_context is a sensible size;
 * that the extra context data is properly aligned.

This patch relies on the user accessors in order to ensure that the
user-supplied extra context data pointer is an honest userspace
address.

Other than that, the kernel doesn't care specially whether the
pointer supplied is sensible (e.g., not garbage, doesn't overlap
sigcontext.__reserved[], etc.) since this cannot harm the kernel.

More checks may be added later in order to aid debugging of
botched sigreturns from userspace.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kernel/signal.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index d7cdb51..c5f18ef 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -221,6 +221,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
 	char __user *base = (char __user *)&sc->__reserved;
 	size_t offset = 0;
 	size_t limit = sizeof(sc->__reserved);
+	bool have_extra_context = false;
 
 	user->fpsimd = NULL;
 
@@ -230,6 +231,9 @@ static int parse_user_sigframe(struct user_ctxs *user,
 	while (1) {
 		int err = 0;
 		u32 magic, size;
+		struct extra_context const __user *extra;
+		void __user *extra_data;
+		u32 extra_size;
 
 		if (limit - offset < sizeof(*head))
 			goto invalid;
@@ -267,6 +271,42 @@ static int parse_user_sigframe(struct user_ctxs *user,
 			/* ignore */
 			break;
 
+		case EXTRA_MAGIC:
+			if (have_extra_context)
+				goto invalid;
+
+			if (size < sizeof(*extra))
+				goto invalid;
+
+			extra = (struct extra_context const __user *)head;
+			__get_user_error(extra_data, &extra->data, err);
+			__get_user_error(extra_size, &extra->size, err);
+			if (err)
+				return err;
+
+			/* Prevent looping/repeated parsing of extra_conext */
+			have_extra_context = true;
+
+			/*
+			 * Rely on the __user accessors to reject bogus
+			 * pointers.
+			 */
+			base = extra_data;
+			if (!IS_ALIGNED((unsigned long)base, 16))
+				goto invalid;
+
+			/* Reject "unreasonably large" frames: */
+			limit = extra_size;
+			if (limit > SIGFRAME_MAXSZ - sizeof(sc->__reserved))
+				goto invalid;
+
+			/*
+			 * Ignore trailing terminator in __reserved[]
+			 * and start parsing extra_data:
+			 */
+			offset = 0;
+			continue;
+
 		default:
 			goto invalid;
 		}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-09 14:15 [RFC PATCH 0/5] arm64: Signal context expansion Dave Martin
                   ` (4 preceding siblings ...)
  2016-09-09 14:15 ` [RFC PATCH 5/5] arm64: signal: Parse extra_context during sigreturn Dave Martin
@ 2016-09-09 14:39 ` Florian Weimer
  2016-09-09 15:21   ` Dave Martin
  5 siblings, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2016-09-09 14:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/09/2016 04:15 PM, Dave Martin wrote:
> This new record can be added in sigframe.__reserved[] if there is a
> need to allocate extra space beyond the standard signal frame.  The
> extra block of memory referenced by extra_context can then be parsed in
> the same way as sigcontext.__reserved[].  Old code should just ignore
> the whole thing as an unrecognised record.  To maintain backward
> compatibility, signal context records defined today are always placed
> directly in __reserved[], never in the block referenced by
> extra_context.

Do you add this extra information only if the stack is sufficiently large?

x86_64 adds the new information even for small stacks set up with 
sigaltstack, leading to memory corruption on bleeding-edge hardware:

   <https://bugzilla.kernel.org/show_bug.cgi?id=153531>

Thanks,
Florian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-09 14:39 ` [RFC PATCH 0/5] arm64: Signal context expansion Florian Weimer
@ 2016-09-09 15:21   ` Dave Martin
  2016-09-09 17:01     ` Florian Weimer
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Martin @ 2016-09-09 15:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 09, 2016 at 04:39:14PM +0200, Florian Weimer wrote:
> On 09/09/2016 04:15 PM, Dave Martin wrote:
> >This new record can be added in sigframe.__reserved[] if there is a
> >need to allocate extra space beyond the standard signal frame.  The
> >extra block of memory referenced by extra_context can then be parsed in
> >the same way as sigcontext.__reserved[].  Old code should just ignore
> >the whole thing as an unrecognised record.  To maintain backward
> >compatibility, signal context records defined today are always placed
> >directly in __reserved[], never in the block referenced by
> >extra_context.
> 
> Do you add this extra information only if the stack is sufficiently large?
> 
> x86_64 adds the new information even for small stacks set up with
> sigaltstack, leading to memory corruption on bleeding-edge hardware:
> 
>   <https://bugzilla.kernel.org/show_bug.cgi?id=153531>

Hmmm, not yet.  We already check that the whole frame is writable user
memory, but this isn't sufficient to avoid user corruption in the case
of alternate signal stacks.  I'll fix this -- thanks for flagging it.

If the stack isn't large enough, we'll still have to SEGV the task
though.


We can (and should) bump up the SIG{,MIN}STKSZ constants when adding
the SVE support proper to the kernel, but this doesn't solve the
problem of existing binaries.  For backwards compatibility the kernel
would still have to permit the historical (too-small) values of
these constants for calls to sigaltstack(2) though, which is not ideal.

I wonder whether we should make the signal stack size runtime
discoverable through sysconf() instead...

Cheers
---Dave

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-09 15:21   ` Dave Martin
@ 2016-09-09 17:01     ` Florian Weimer
  2016-09-12 11:17       ` Dave Martin
  0 siblings, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2016-09-09 17:01 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/09/2016 05:21 PM, Dave Martin wrote:

>> Do you add this extra information only if the stack is sufficiently large?
>>
>> x86_64 adds the new information even for small stacks set up with
>> sigaltstack, leading to memory corruption on bleeding-edge hardware:
>>
>>   <https://bugzilla.kernel.org/show_bug.cgi?id=153531>
>
> Hmmm, not yet.  We already check that the whole frame is writable user
> memory, but this isn't sufficient to avoid user corruption in the case
> of alternate signal stacks.  I'll fix this -- thanks for flagging it.
>
> If the stack isn't large enough, we'll still have to SEGV the task
> though.

You could skip copying the data and not install a pointer to it in the 
existing signal context.

> We can (and should) bump up the SIG{,MIN}STKSZ constants when adding
> the SVE support proper to the kernel,

That's a userspace ABI change (libraries use these constants to size 
struct members), and not a good idea.  You might get away with at this 
stage, but you can't do this every time you add some new process state 
you want to add to signal handlers.

> I wonder whether we should make the signal stack size runtime
> discoverable through sysconf() instead...

That might be a good idea, yes.

Florian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-09 17:01     ` Florian Weimer
@ 2016-09-12 11:17       ` Dave Martin
  2016-09-12 12:49         ` Florian Weimer
  2016-09-12 15:01         ` Szabolcs Nagy
  0 siblings, 2 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-12 11:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 09, 2016 at 07:01:12PM +0200, Florian Weimer wrote:
> On 09/09/2016 05:21 PM, Dave Martin wrote:
> 
> >>Do you add this extra information only if the stack is sufficiently large?
> >>
> >>x86_64 adds the new information even for small stacks set up with
> >>sigaltstack, leading to memory corruption on bleeding-edge hardware:
> >>
> >>  <https://bugzilla.kernel.org/show_bug.cgi?id=153531>
> >
> >Hmmm, not yet.  We already check that the whole frame is writable user
> >memory, but this isn't sufficient to avoid user corruption in the case
> >of alternate signal stacks.  I'll fix this -- thanks for flagging it.
> >
> >If the stack isn't large enough, we'll still have to SEGV the task
> >though.
> 
> You could skip copying the data and not install a pointer to it in the
> existing signal context.

We could, but then we'd corrupt the task state in sigreturn, since
we wouldn't have been able to save/restore part of the state.

> >We can (and should) bump up the SIG{,MIN}STKSZ constants when adding
> >the SVE support proper to the kernel,
> 
> That's a userspace ABI change (libraries use these constants to size struct
> members), and not a good idea.  You might get away with at this stage, but
> you can't do this every time you add some new process state you want to add
> to signal handlers.

For internal interfaces within a single component that's tolerable,
since a single value would be used for each of these constants
throughout the build of that component.

A quick search on sources.debian.org suggests that the total number of
packages that expose {,MIN}SIGSTKSZ dependent definitions in their
public interfaces is small (I couldn't find any after paging through
dozens of pages of results -- so the total is maybe in the range 0-10)
-- hopefully few enough to eyeball.


> >I wonder whether we should make the signal stack size runtime
> >discoverable through sysconf() instead...

I will likely suggest this for the future, but of course it doesn't help
for current binaries.


Note that MINSIGSTKSZ stared life wrong for arm64, and has since gone
through a few ABI breaking changes.  I don't condone this, but we have
form in this area :/

sigaltstack() already fails with ENOMEM for software that passes
ss_size = MINSIGSTKSZ, and is built against glibc<2.22 [1], [2], running
on linux>=4.3 [3], which is an ABI break in case where sigaltstack() is
otherwise guaranteed to succeed.


The bottom line here is that the sigaltstack() API is broken with regard
to extensibility, so we cannot extend the amount of signal state without
breaking something.


The least-wrong thing I can think of to do is:

* deprecate but continue to support the existing sigaltstack API/ABI
with today's {,MIN}SIGSTKSZ definitions

* guarantee (as much as possible) that software using this ABI continues
to work (by saving/restoring only data that _must_ be saved/restored at
each signal, which may be small enough to fit)

* providing a clean failure mode (fatal signal) when this proves
impossible at signal delivery/return time;

* define a new interface for runtime-querying the required signal stack
size;

* define a new syscall or new stack_t.ss_flags flags (say, SS_STRICT)
that permits the kernel to enforce a runtime-determined minimum greater
than MINSIGSTKSZ when calling sigaltstack().


Another option would be:

* define a new interface for runtime-querying the required signal stack
size, and

* support the current API/ABI, but make a call to sigaltstack() SEGV or
SIGILL the caller if it specifies ss_stack >= MINSIGSTKSZ but smaller
than the actual runtime minimum.

(this would cause old software to break immediately in an obvious way on
new systems, forcing people to fix their software -- which they might or
might not actually bother to do).


Thoughts?

Cheers
---Dave


[1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=b763f6ae859ecea70a5dacb8ad45c71d5f667e2e

[2] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/bits/sigstack.h;h=e143034ce24dc383016b4882b89cfebc7a6a62d7;hb=b8079dd0d360648e4e8de48656c5c38972621072

[3] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c9692657c0321fec7bcb3ca8c6db56c08c640ace

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-12 11:17       ` Dave Martin
@ 2016-09-12 12:49         ` Florian Weimer
  2016-09-12 15:21           ` Dave Martin
  2016-09-12 15:01         ` Szabolcs Nagy
  1 sibling, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2016-09-12 12:49 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/12/2016 01:17 PM, Dave Martin wrote:

>>> If the stack isn't large enough, we'll still have to SEGV the task
>>> though.
>>
>> You could skip copying the data and not install a pointer to it in the
>> existing signal context.
>
> We could, but then we'd corrupt the task state in sigreturn, since
> we wouldn't have been able to save/restore part of the state.

Ah, I wasn't aware that the kernel doesn't have a copy of the state. 
Could the kernel reserve space for it when sigaltstack is called?

> The least-wrong thing I can think of to do is:
>
> * deprecate but continue to support the existing sigaltstack API/ABI
> with today's {,MIN}SIGSTKSZ definitions

There is also PTHREAD_STACK_MIN.  The glibc default for that (which is 
overriden by some architectures) is 16 KiB.  It is also quite small; see 
the sequence of events mentioned here:

   <https://sourceware.org/bugzilla/show_bug.cgi?id=20249>

> * guarantee (as much as possible) that software using this ABI continues
> to work (by saving/restoring only data that _must_ be saved/restored at
> each signal, which may be small enough to fit)
>
> * providing a clean failure mode (fatal signal) when this proves
> impossible at signal delivery/return time;
>
> * define a new interface for runtime-querying the required signal stack
> size;
>
> * define a new syscall or new stack_t.ss_flags flags (say, SS_STRICT)
> that permits the kernel to enforce a runtime-determined minimum greater
> than MINSIGSTKSZ when calling sigaltstack().
>
>
> Another option would be:
>
> * define a new interface for runtime-querying the required signal stack
> size, and
>
> * support the current API/ABI, but make a call to sigaltstack() SEGV or
> SIGILL the caller if it specifies ss_stack >= MINSIGSTKSZ but smaller
> than the actual runtime minimum.
>
> (this would cause old software to break immediately in an obvious way on
> new systems, forcing people to fix their software -- which they might or
> might not actually bother to do).

The second option looks a bit problematic from a support perspective.

Do you think it would be possible to block access to hardware features 
that cause signal stack bloat on a per-process basis?  Then we could 
bump the kernel requirement enforced by sigaltstack directly for the 
default, and define a compatibility personality that minimizes the 
signal stack size for old applications.  (A way to query the required 
alternative signal stack size would still come in handy, though.)

Florian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-12 11:17       ` Dave Martin
  2016-09-12 12:49         ` Florian Weimer
@ 2016-09-12 15:01         ` Szabolcs Nagy
  2016-09-12 15:30           ` Dave Martin
  1 sibling, 1 reply; 22+ messages in thread
From: Szabolcs Nagy @ 2016-09-12 15:01 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/09/16 12:17, Dave Martin wrote:
> On Fri, Sep 09, 2016 at 07:01:12PM +0200, Florian Weimer wrote:
>> On 09/09/2016 05:21 PM, Dave Martin wrote:
>>
>>>> Do you add this extra information only if the stack is sufficiently large?
>>>>
>>>> x86_64 adds the new information even for small stacks set up with
>>>> sigaltstack, leading to memory corruption on bleeding-edge hardware:
>>>>
>>>>  <https://bugzilla.kernel.org/show_bug.cgi?id=153531>
>>>
>>> Hmmm, not yet.  We already check that the whole frame is writable user
>>> memory, but this isn't sufficient to avoid user corruption in the case
>>> of alternate signal stacks.  I'll fix this -- thanks for flagging it.
>>>
>>> If the stack isn't large enough, we'll still have to SEGV the task
>>> though.
>>
>> You could skip copying the data and not install a pointer to it in the
>> existing signal context.
> 
> We could, but then we'd corrupt the task state in sigreturn, since
> we wouldn't have been able to save/restore part of the state.
> 
>>> We can (and should) bump up the SIG{,MIN}STKSZ constants when adding
>>> the SVE support proper to the kernel,
>>
>> That's a userspace ABI change (libraries use these constants to size struct
>> members), and not a good idea.  You might get away with at this stage, but
>> you can't do this every time you add some new process state you want to add
>> to signal handlers.
> 
> For internal interfaces within a single component that's tolerable,
> since a single value would be used for each of these constants
> throughout the build of that component.
> 
> A quick search on sources.debian.org suggests that the total number of
> packages that expose {,MIN}SIGSTKSZ dependent definitions in their
> public interfaces is small (I couldn't find any after paging through
> dozens of pages of results -- so the total is maybe in the range 0-10)
> -- hopefully few enough to eyeball.
> 
> 
>>> I wonder whether we should make the signal stack size runtime
>>> discoverable through sysconf() instead...
> 
> I will likely suggest this for the future, but of course it doesn't help
> for current binaries.
> 
> 
> Note that MINSIGSTKSZ stared life wrong for arm64, and has since gone
> through a few ABI breaking changes.  I don't condone this, but we have
> form in this area :/
> 
> sigaltstack() already fails with ENOMEM for software that passes
> ss_size = MINSIGSTKSZ, and is built against glibc<2.22 [1], [2], running
> on linux>=4.3 [3], which is an ABI break in case where sigaltstack() is
> otherwise guaranteed to succeed.
> 

yes, this was abi breaking change.

if glibc does not care about existing binaries
that use sigaltstack with MINSIGSTKSZ then it can
increase the size, but i think the kernel should
not change the abi (there are other libcs and libc
independent runtime systems on linux for aarch64
with their own sigaltstack setup, not all of them
may care about SVE).

i assume the kernel can avoid saving SVE regs when
they are not used by the process.

> 
> The bottom line here is that the sigaltstack() API is broken with regard
> to extensibility, so we cannot extend the amount of signal state without
> breaking something.
> 

extending signal state can break things independently
of sigaltstack.

binaries with strict guarantees about worst case stack
usage can change behaviour.

fortunately glibc PTHREAD_STACK_MIN is huge on aarch64
so applications using it are unlikely to break because
of the increased signal state.
(this also means it's impossible to have threads with
tiny stacks on glibc, so large amount of threads means
large amount of commit charge.)

> 
> The least-wrong thing I can think of to do is:
> 
> * deprecate but continue to support the existing sigaltstack API/ABI
> with today's {,MIN}SIGSTKSZ definitions
> 
> * guarantee (as much as possible) that software using this ABI continues
> to work (by saving/restoring only data that _must_ be saved/restored at
> each signal, which may be small enough to fit)
> 
> * providing a clean failure mode (fatal signal) when this proves
> impossible at signal delivery/return time;
> 
> * define a new interface for runtime-querying the required signal stack
> size;
> 
> * define a new syscall or new stack_t.ss_flags flags (say, SS_STRICT)
> that permits the kernel to enforce a runtime-determined minimum greater
> than MINSIGSTKSZ when calling sigaltstack().
> 
> 
> Another option would be:
> 
> * define a new interface for runtime-querying the required signal stack
> size, and
> 
> * support the current API/ABI, but make a call to sigaltstack() SEGV or
> SIGILL the caller if it specifies ss_stack >= MINSIGSTKSZ but smaller
> than the actual runtime minimum.
> 
> (this would cause old software to break immediately in an obvious way on
> new systems, forcing people to fix their software -- which they might or
> might not actually bother to do).
> 
> 
> Thoughts?
> 
> Cheers
> ---Dave
> 
> 
> [1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=b763f6ae859ecea70a5dacb8ad45c71d5f667e2e
> 
> [2] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/bits/sigstack.h;h=e143034ce24dc383016b4882b89cfebc7a6a62d7;hb=b8079dd0d360648e4e8de48656c5c38972621072
> 
> [3] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c9692657c0321fec7bcb3ca8c6db56c08c640ace
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-12 12:49         ` Florian Weimer
@ 2016-09-12 15:21           ` Dave Martin
  0 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-12 15:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Sep 12, 2016 at 02:49:20PM +0200, Florian Weimer wrote:
> On 09/12/2016 01:17 PM, Dave Martin wrote:
> 
> >>>If the stack isn't large enough, we'll still have to SEGV the task
> >>>though.
> >>
> >>You could skip copying the data and not install a pointer to it in the
> >>existing signal context.
> >
> >We could, but then we'd corrupt the task state in sigreturn, since
> >we wouldn't have been able to save/restore part of the state.
> 
> Ah, I wasn't aware that the kernel doesn't have a copy of the state. Could
> the kernel reserve space for it when sigaltstack is called?

Signals can be nested without limit.

If you record some state in the kernel, it's also hard to track when it
can be freed if userspace leaves the signal handler by other means than
sigreturn() (say, longjmp() or setcontext()).

> >The least-wrong thing I can think of to do is:
> >
> >* deprecate but continue to support the existing sigaltstack API/ABI
> >with today's {,MIN}SIGSTKSZ definitions
> 
> There is also PTHREAD_STACK_MIN.  The glibc default for that (which is
> overriden by some architectures) is 16 KiB.  It is also quite small; see the
> sequence of events mentioned here:
> 
>   <https://sourceware.org/bugzilla/show_bug.cgi?id=20249>

It's a related issue, though this doesn't look to me like a bug.  Why
should something like vfprintf() fit in PTHREAD_STACK_MIN bytes of stack?


> >* guarantee (as much as possible) that software using this ABI continues
> >to work (by saving/restoring only data that _must_ be saved/restored at
> >each signal, which may be small enough to fit)
> >
> >* providing a clean failure mode (fatal signal) when this proves
> >impossible at signal delivery/return time;
> >
> >* define a new interface for runtime-querying the required signal stack
> >size;
> >
> >* define a new syscall or new stack_t.ss_flags flags (say, SS_STRICT)
> >that permits the kernel to enforce a runtime-determined minimum greater
> >than MINSIGSTKSZ when calling sigaltstack().
> >
> >
> >Another option would be:
> >
> >* define a new interface for runtime-querying the required signal stack
> >size, and
> >
> >* support the current API/ABI, but make a call to sigaltstack() SEGV or
> >SIGILL the caller if it specifies ss_stack >= MINSIGSTKSZ but smaller
> >than the actual runtime minimum.
> >
> >(this would cause old software to break immediately in an obvious way on
> >new systems, forcing people to fix their software -- which they might or
> >might not actually bother to do).
> 
> The second option looks a bit problematic from a support perspective.

Well, indeed :(

> Do you think it would be possible to block access to hardware features that
> cause signal stack bloat on a per-process basis?  Then we could bump the
> kernel requirement enforced by sigaltstack directly for the default, and
> define a compatibility personality that minimizes the signal stack size for
> old applications.  (A way to query the required alternative signal stack
> size would still come in handy, though.)

This can be done, in general, but it blocks current software from using
libraries that make internal use of a new arch feature, even if those
libraries' exported ABIs don't depend on the feature.

This would make the new feature unusable until/unless you rebuild the
world.


There's no perfect solution; rather I want to arrive at something
pragmatic that doesn't break existing software unnecessarily.

I designed the extra_context mechnism gets added only if something is
allocated to the signal frame at runtime, so software running on
hardware that doesn't have the new extension, and processes that never
make use of the new extension (even in libraries) would be unaffected.


Programs that are unaware of the extension, but use sigaltstack() _and_
also use libraries that use the extension may get SEGV'd due to failed
signal delivery.


If this is still not good enough, I may have to rethink...

We _could_ save some state in the kernel, but that will add a lot of
complexity in the kernel, and probably still won't cover all
cases.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-12 15:01         ` Szabolcs Nagy
@ 2016-09-12 15:30           ` Dave Martin
  2016-09-12 16:44             ` Szabolcs Nagy
  2016-09-13  9:28             ` Florian Weimer
  0 siblings, 2 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-12 15:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Sep 12, 2016 at 04:01:17PM +0100, Szabolcs Nagy wrote:
> On 12/09/16 12:17, Dave Martin wrote:

[...]

> >> On 09/09/2016 05:21 PM, Dave Martin wrote:

[...]

> >>> I wonder whether we should make the signal stack size runtime
> >>> discoverable through sysconf() instead...
> > 
> > I will likely suggest this for the future, but of course it doesn't help
> > for current binaries.
> > 
> > 
> > Note that MINSIGSTKSZ stared life wrong for arm64, and has since gone
> > through a few ABI breaking changes.  I don't condone this, but we have
> > form in this area :/
> > 
> > sigaltstack() already fails with ENOMEM for software that passes
> > ss_size = MINSIGSTKSZ, and is built against glibc<2.22 [1], [2], running
> > on linux>=4.3 [3], which is an ABI break in case where sigaltstack() is
> > otherwise guaranteed to succeed.
> > 
> 
> yes, this was abi breaking change.
> 
> if glibc does not care about existing binaries
> that use sigaltstack with MINSIGSTKSZ then it can
> increase the size, but i think the kernel should
> not change the abi (there are other libcs and libc
> independent runtime systems on linux for aarch64
> with their own sigaltstack setup, not all of them
> may care about SVE).
> 
> i assume the kernel can avoid saving SVE regs when
> they are not used by the process.

I can (and do), in my patches (not posted yet).

The real issue here is that a recently updated shared library might be
optimised to use SVE, where the program using it is an older, SVE-
unaware binary.

(think of an optimised math library using some new fancy SVE-based
number crunching internally).

> 
> > 
> > The bottom line here is that the sigaltstack() API is broken with regard
> > to extensibility, so we cannot extend the amount of signal state without
> > breaking something.
> > 
> 
> extending signal state can break things independently
> of sigaltstack.
> 
> binaries with strict guarantees about worst case stack
> usage can change behaviour.

Indeed, but this is not a new issue.  Software must run with enough
stack in order to be portable, but there is no portable way to determine
how much stack is needed.

> fortunately glibc PTHREAD_STACK_MIN is huge on aarch64
> so applications using it are unlikely to break because
> of the increased signal state.
> (this also means it's impossible to have threads with
> tiny stacks on glibc, so large amount of threads means
> large amount of commit charge.)

Again, not a new problem.

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-12 15:30           ` Dave Martin
@ 2016-09-12 16:44             ` Szabolcs Nagy
  2016-09-12 17:24               ` Dave Martin
  2016-09-13  9:28             ` Florian Weimer
  1 sibling, 1 reply; 22+ messages in thread
From: Szabolcs Nagy @ 2016-09-12 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

On 12/09/16 16:30, Dave Martin wrote:
> On Mon, Sep 12, 2016 at 04:01:17PM +0100, Szabolcs Nagy wrote:
>> On 12/09/16 12:17, Dave Martin wrote:
> 
> [...]
> 
>>>> On 09/09/2016 05:21 PM, Dave Martin wrote:
> 
> [...]
> 
>>>>> I wonder whether we should make the signal stack size runtime
>>>>> discoverable through sysconf() instead...
>>>
>>> I will likely suggest this for the future, but of course it doesn't help
>>> for current binaries.
>>>
>>>
>>> Note that MINSIGSTKSZ stared life wrong for arm64, and has since gone
>>> through a few ABI breaking changes.  I don't condone this, but we have
>>> form in this area :/
>>>
>>> sigaltstack() already fails with ENOMEM for software that passes
>>> ss_size = MINSIGSTKSZ, and is built against glibc<2.22 [1], [2], running
>>> on linux>=4.3 [3], which is an ABI break in case where sigaltstack() is
>>> otherwise guaranteed to succeed.
>>>
>>
>> yes, this was abi breaking change.
>>
>> if glibc does not care about existing binaries
>> that use sigaltstack with MINSIGSTKSZ then it can
>> increase the size, but i think the kernel should
>> not change the abi (there are other libcs and libc
>> independent runtime systems on linux for aarch64
>> with their own sigaltstack setup, not all of them
>> may care about SVE).
>>
>> i assume the kernel can avoid saving SVE regs when
>> they are not used by the process.
> 
> I can (and do), in my patches (not posted yet).
> 
> The real issue here is that a recently updated shared library might be
> optimised to use SVE, where the program using it is an older, SVE-
> unaware binary.
> 
> (think of an optimised math library using some new fancy SVE-based
> number crunching internally).
> 

this is why sve would be a new abi in an ideal world,
instead of subtly changing behaviour behind existing
binaries.

>>
>>>
>>> The bottom line here is that the sigaltstack() API is broken with regard
>>> to extensibility, so we cannot extend the amount of signal state without
>>> breaking something.
>>>
>>
>> extending signal state can break things independently
>> of sigaltstack.
>>
>> binaries with strict guarantees about worst case stack
>> usage can change behaviour.
> 
> Indeed, but this is not a new issue.  Software must run with enough
> stack in order to be portable, but there is no portable way to determine
> how much stack is needed.
> 

why not?

musl libc has stack usage guarantee that can be checked
by the compiler that built the libc on all supported
targets. (with some manual checks in a few cases.)

however as soon as targets change the signal frame size
randomly, these worst case guarantees get broken.
(musl's MINSIGSTKSZ was always 6k on aarch64, even
before the kernel changed it.)

>> fortunately glibc PTHREAD_STACK_MIN is huge on aarch64
>> so applications using it are unlikely to break because
>> of the increased signal state.
>> (this also means it's impossible to have threads with
>> tiny stacks on glibc, so large amount of threads means
>> large amount of commit charge.)
> 
> Again, not a new problem.
> 

on musl PTHREAD_STACK_MIN is small, so the sve signal
state can break existing binaries that use tiny thread
stack (there is at least one linux distro that supports
musl+aarch64, although they don't distribute binary
packages yet).

so even if the glibc abi changes, i think the kernel
should not do the change, at least not immediately.

(and if the kernel decides to increase MINSIGSTKSZ
i think it should wait for distros picking up new
enough glibc with increased MINSIGSTKSZ to avoid
spuriously failing sigaltstack calls.)

> [...]
> 
> Cheers
> ---Dave
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-12 16:44             ` Szabolcs Nagy
@ 2016-09-12 17:24               ` Dave Martin
  0 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-12 17:24 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Sep 12, 2016 at 05:44:48PM +0100, Szabolcs Nagy wrote:
> On 12/09/16 16:30, Dave Martin wrote:
> > On Mon, Sep 12, 2016 at 04:01:17PM +0100, Szabolcs Nagy wrote:
> >> On 12/09/16 12:17, Dave Martin wrote:
> > 
> > [...]
> > 
> >>>> On 09/09/2016 05:21 PM, Dave Martin wrote:
> > 
> > [...]
> > 
> >>>>> I wonder whether we should make the signal stack size runtime
> >>>>> discoverable through sysconf() instead...
> >>>
> >>> I will likely suggest this for the future, but of course it doesn't help
> >>> for current binaries.
> >>>
> >>>
> >>> Note that MINSIGSTKSZ stared life wrong for arm64, and has since gone
> >>> through a few ABI breaking changes.  I don't condone this, but we have
> >>> form in this area :/
> >>>
> >>> sigaltstack() already fails with ENOMEM for software that passes
> >>> ss_size = MINSIGSTKSZ, and is built against glibc<2.22 [1], [2], running
> >>> on linux>=4.3 [3], which is an ABI break in case where sigaltstack() is
> >>> otherwise guaranteed to succeed.
> >>>
> >>
> >> yes, this was abi breaking change.
> >>
> >> if glibc does not care about existing binaries
> >> that use sigaltstack with MINSIGSTKSZ then it can
> >> increase the size, but i think the kernel should
> >> not change the abi (there are other libcs and libc
> >> independent runtime systems on linux for aarch64
> >> with their own sigaltstack setup, not all of them
> >> may care about SVE).
> >>
> >> i assume the kernel can avoid saving SVE regs when
> >> they are not used by the process.
> > 
> > I can (and do), in my patches (not posted yet).
> > 
> > The real issue here is that a recently updated shared library might be
> > optimised to use SVE, where the program using it is an older, SVE-
> > unaware binary.
> > 
> > (think of an optimised math library using some new fancy SVE-based
> > number crunching internally).
> > 
> 
> this is why sve would be a new abi in an ideal world,
> instead of subtly changing behaviour behind existing
> binaries.

Well, indeed.

In an ideal world, we also like to be able to make use of new features
though.

> >>
> >>>
> >>> The bottom line here is that the sigaltstack() API is broken with regard
> >>> to extensibility, so we cannot extend the amount of signal state without
> >>> breaking something.
> >>>
> >>
> >> extending signal state can break things independently
> >> of sigaltstack.
> >>
> >> binaries with strict guarantees about worst case stack
> >> usage can change behaviour.
> > 
> > Indeed, but this is not a new issue.  Software must run with enough
> > stack in order to be portable, but there is no portable way to determine
> > how much stack is needed.
> > 
> 
> why not?
> 
> musl libc has stack usage guarantee that can be checked
> by the compiler that built the libc on all supported
> targets. (with some manual checks in a few cases.)

"Requires some specific, nonstandard libc" = "not portable".

I don't say that this isn't a good feature of musl libc, just that this
feature is not generally available.

> however as soon as targets change the signal frame size
> randomly, these worst case guarantees get broken.
> (musl's MINSIGSTKSZ was always 6k on aarch64, even
> before the kernel changed it.)

Unfortunately, that is true.  Any component (including libc) that makes
assumptions about the kernel signal stack overhead may break.

(Whether musl should have invented its own magic number for MINSIGSTKSZ
instead of adding its own known overhead to the kernel's MINSIGSTKSZ is
debatable, but in this case it is lucky that it did.)

> >> fortunately glibc PTHREAD_STACK_MIN is huge on aarch64
> >> so applications using it are unlikely to break because
> >> of the increased signal state.
> >> (this also means it's impossible to have threads with
> >> tiny stacks on glibc, so large amount of threads means
> >> large amount of commit charge.)
> > 
> > Again, not a new problem.
> > 
> 
> on musl PTHREAD_STACK_MIN is small, so the sve signal
> state can break existing binaries that use tiny thread
> stack (there is at least one linux distro that supports
> musl+aarch64, although they don't distribute binary
> packages yet).
> 
> so even if the glibc abi changes, i think the kernel
> should not do the change, at least not immediately.
> 
> (and if the kernel decides to increase MINSIGSTKSZ
> i think it should wait for distros picking up new
> enough glibc with increased MINSIGSTKSZ to avoid
> spuriously failing sigaltstack calls.)

So, increasing MINSIGSTKSZ doesn't look a great idea, partly because it
breaks stuff, and partly because there is no correct value to change it
to -- we may have to make the same break all over again, at some point
in the future, or set it to something ridiculously large.


I think it's clear that this series needs to honor (and refuse to
overrun) the alternate signal stack size down to the "legacy"
MINSIGSTKSZ (5K).

But how to ensure that software that was built "too small" a MINSIGSTKSZ
for all arch extensions known to the running kernel is a separate
problem that needs careful thought and libc involvement.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-12 15:30           ` Dave Martin
  2016-09-12 16:44             ` Szabolcs Nagy
@ 2016-09-13  9:28             ` Florian Weimer
  2016-09-13 15:52               ` Dave Martin
  1 sibling, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2016-09-13  9:28 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/12/2016 05:30 PM, Dave Martin wrote:
> On Mon, Sep 12, 2016 at 04:01:17PM +0100, Szabolcs Nagy wrote:
>> i assume the kernel can avoid saving SVE regs when
>> they are not used by the process.
>
> I can (and do), in my patches (not posted yet).
>
> The real issue here is that a recently updated shared library might be
> optimised to use SVE, where the program using it is an older, SVE-
> unaware binary.
>
> (think of an optimised math library using some new fancy SVE-based
> number crunching internally).

In my little world, binaries are expected to do run-time CPU feature 
discovery and fall back to the baseline ABI if the system does not 
support ABI extensions.  This means that a personality flag can disable 
the extension, the optimized math libraries run with the slower baseline 
code instead, and the rest of the application sees the expected kernel 
interface.

 From this perspective, it is quite important that it is possible to get 
legacy applications running on newer hardware, preferably with out 
wrapping them in a dedicated VM.

I understand that there are other environments were this is less of a 
concern, but I also believe that the age of an architecture factors less 
into this than one might want.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-13  9:28             ` Florian Weimer
@ 2016-09-13 15:52               ` Dave Martin
  2016-09-13 16:02                 ` Florian Weimer
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Martin @ 2016-09-13 15:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Sep 13, 2016 at 11:28:21AM +0200, Florian Weimer wrote:
> On 09/12/2016 05:30 PM, Dave Martin wrote:
> >On Mon, Sep 12, 2016 at 04:01:17PM +0100, Szabolcs Nagy wrote:
> >>i assume the kernel can avoid saving SVE regs when
> >>they are not used by the process.
> >
> >I can (and do), in my patches (not posted yet).
> >
> >The real issue here is that a recently updated shared library might be
> >optimised to use SVE, where the program using it is an older, SVE-
> >unaware binary.
> >
> >(think of an optimised math library using some new fancy SVE-based
> >number crunching internally).
> 
> In my little world, binaries are expected to do run-time CPU feature
> discovery and fall back to the baseline ABI if the system does not support
> ABI extensions.  This means that a personality flag can disable the
> extension, the optimized math libraries run with the slower baseline code
> instead, and the rest of the application sees the expected kernel interface.
> 
> From this perspective, it is quite important that it is possible to get
> legacy applications running on newer hardware, preferably with out wrapping
> them in a dedicated VM.
> 
> I understand that there are other environments were this is less of a
> concern, but I also believe that the age of an architecture factors less
> into this than one might want.

Agreed.  I'll need to think some more about how this should work in
general.

This series by itself doesn't change the ABI unless something allocates
enough signal frame data to exceed the historical frame size -- which
for now cannot happen.

What I may do is split off the extra_context part of the series (without
which the signal frame never grows).

SVE will still fit if the vector length is below a certain limit.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-13 15:52               ` Dave Martin
@ 2016-09-13 16:02                 ` Florian Weimer
  2016-09-15 16:45                   ` Dave Martin
  0 siblings, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2016-09-13 16:02 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/13/2016 05:52 PM, Dave Martin wrote:

> Agreed.  I'll need to think some more about how this should work in
> general.

Thanks.

Depending on some SVE implementations details (which I know nothing 
about, I only saw some public overview slides), we may also need 
additional storage space to preserve SVE registers in the dynamic 
linker.  Due to lazy binding, this code cn be called from a signal 
handler, so this needs to be factored into stack size requirements as well.

Problematic are register width extensions used for argument passing and 
callee-saved registers whose width has been extended.  Both are 
particularly challenging to deal with if existing vector instructions 
clear the extension part (which may be desirable for other reasons).

The size of the jmp_buf type is a concern as well.

Florian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-13 16:02                 ` Florian Weimer
@ 2016-09-15 16:45                   ` Dave Martin
  2016-09-16 12:10                     ` Florian Weimer
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Martin @ 2016-09-15 16:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Sep 13, 2016 at 06:02:45PM +0200, Florian Weimer wrote:
> On 09/13/2016 05:52 PM, Dave Martin wrote:
> 
> >Agreed.  I'll need to think some more about how this should work in
> >general.
> 
> Thanks.
> 
> Depending on some SVE implementations details (which I know nothing about, I
> only saw some public overview slides), we may also need additional storage
> space to preserve SVE registers in the dynamic linker.  Due to lazy binding,
> this code cn be called from a signal handler, so this needs to be factored
> into stack size requirements as well.

Yes and no.  The kernel SIGSTKSZ constants don't care about ld.so --
that's userspace overhead, not kernel overhead.

But it is a potential ABI issue for userspace; libc will need to massage
the constants appropriately for its own implementation, and we do need
to make sure that the kernel doesn't use up more stack on signal delivery
than userspace expects.

> Problematic are register width extensions used for argument passing and
> callee-saved registers whose width has been extended.  Both are particularly
> challenging to deal with if existing vector instructions clear the extension
> part (which may be desirable for other reasons).
> 
> The size of the jmp_buf type is a concern as well.

The default PCS for SVE will not be introducing any extra save/restore
requirements for SVE -- i.e., everything is caller-save at public
interfaces, except for the FPSIMD register bits that are already callee-
save under the existing PCS.

Some work will be needed as and when support is introduced for using an
SVE-specific PCS at public interfaces.

Interprocedural optimisation and LTO within individual shared libs need
not depend on this, but I'm not sure how it will work in detail.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-15 16:45                   ` Dave Martin
@ 2016-09-16 12:10                     ` Florian Weimer
  2016-09-16 17:39                       ` Dave Martin
  0 siblings, 1 reply; 22+ messages in thread
From: Florian Weimer @ 2016-09-16 12:10 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/15/2016 06:45 PM, Dave Martin wrote:
> On Tue, Sep 13, 2016 at 06:02:45PM +0200, Florian Weimer wrote:
>> On 09/13/2016 05:52 PM, Dave Martin wrote:
>>
>>> Agreed.  I'll need to think some more about how this should work in
>>> general.
>>
>> Thanks.
>>
>> Depending on some SVE implementations details (which I know nothing about, I
>> only saw some public overview slides), we may also need additional storage
>> space to preserve SVE registers in the dynamic linker.  Due to lazy binding,
>> this code cn be called from a signal handler, so this needs to be factored
>> into stack size requirements as well.
>
> Yes and no.  The kernel SIGSTKSZ constants don't care about ld.so --
> that's userspace overhead, not kernel overhead.

Well yes, so is jmp_buf.  But I think you still should consider the full 
picture. :)

At least the lazy binding stack overhead can be avoided with LD_BIND_NOW=1.

>> Problematic are register width extensions used for argument passing and
>> callee-saved registers whose width has been extended.  Both are particularly
>> challenging to deal with if existing vector instructions clear the extension
>> part (which may be desirable for other reasons).
>>
>> The size of the jmp_buf type is a concern as well.
>
> The default PCS for SVE will not be introducing any extra save/restore
> requirements for SVE -- i.e., everything is caller-save at public
> interfaces, except for the FPSIMD register bits that are already callee-
> save under the existing PCS.

Is it possible to pass arguments in the register extension parts?  If 
existing non-SVE code can clobber these bits, we need some adjustments 
in libc to save and restore SVE state in some places (which may need 
further stack space).

Again, for lazy binding, LD_BIND_NOW=1 will help temporarily, but 
profiling code may need a separate fix.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/5] arm64: Signal context expansion
  2016-09-16 12:10                     ` Florian Weimer
@ 2016-09-16 17:39                       ` Dave Martin
  0 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2016-09-16 17:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 16, 2016 at 02:10:53PM +0200, Florian Weimer wrote:
> On 09/15/2016 06:45 PM, Dave Martin wrote:
> >On Tue, Sep 13, 2016 at 06:02:45PM +0200, Florian Weimer wrote:
> >>On 09/13/2016 05:52 PM, Dave Martin wrote:
> >>
> >>>Agreed.  I'll need to think some more about how this should work in
> >>>general.
> >>
> >>Thanks.
> >>
> >>Depending on some SVE implementations details (which I know nothing about, I
> >>only saw some public overview slides), we may also need additional storage
> >>space to preserve SVE registers in the dynamic linker.  Due to lazy binding,
> >>this code cn be called from a signal handler, so this needs to be factored
> >>into stack size requirements as well.
> >
> >Yes and no.  The kernel SIGSTKSZ constants don't care about ld.so --
> >that's userspace overhead, not kernel overhead.
> 
> Well yes, so is jmp_buf.  But I think you still should consider the full
> picture. :)
> 
> At least the lazy binding stack overhead can be avoided with LD_BIND_NOW=1.

I am considering the wider picture, but we're a bit out of scope for the
patch series, except that I will split out the ability to grow the
signal frame out as a separate patch.

> 
> >>Problematic are register width extensions used for argument passing and
> >>callee-saved registers whose width has been extended.  Both are particularly
> >>challenging to deal with if existing vector instructions clear the extension
> >>part (which may be desirable for other reasons).
> >>
> >>The size of the jmp_buf type is a concern as well.
> >
> >The default PCS for SVE will not be introducing any extra save/restore
> >requirements for SVE -- i.e., everything is caller-save at public
> >interfaces, except for the FPSIMD register bits that are already callee-
> >save under the existing PCS.
> 
> Is it possible to pass arguments in the register extension parts?  If
> existing non-SVE code can clobber these bits, we need some adjustments in
> libc to save and restore SVE state in some places (which may need further
> stack space).

Arguments can be passed the the extended vector registers in theory, but
this is expected only at private interfaces and within compilation units
to begin with.  General-purpose libraries including libc would not be
using those bits at their interfaces, at least for now.

When calling any public interface using the default PCS, all the SVE
registers can be clobbered -- the caller must compensate for this if
it cares.  So, the compensation is always in the _new_ binaries where
code gen in the compiler can take care of it automatically -- not
the current and non-SVE binaries.  libc needs to update for this
specifically, so long as only the default PCS is supported for all
dynamic symbols (which I think will be the case to begin with).

A similar argument applies to all libc entry points, including
setjmp()/longjmp()/jmp_buf.

Notwithstanding this, a binary built for SVE _may_ use more stack
than a binary built without SVE support.  This can always happen when
compiling with different options or even after a minor compiler or
library upgrade, so any software that doesn't already have some safety
buffer in its stack allocations is running a risk even without SVE.  SVE
will probably tend to grow frames rather than shrink them, so there will
be increased pressure on stack allocations, but I don't have a good
answer about the scale of the impact at this point.

For now, I don't expect to fix everything in one go.

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-09-16 17:39 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-09 14:15 [RFC PATCH 0/5] arm64: Signal context expansion Dave Martin
2016-09-09 14:15 ` [RFC PATCH 1/5] arm64: signal: Refactor sigcontext parsing in rt_sigreturn Dave Martin
2016-09-09 14:15 ` [RFC PATCH 2/5] arm64: signal: factor frame layout and population into separate passes Dave Martin
2016-09-09 14:15 ` [RFC PATCH 3/5] arm64: signal: factor out signal frame record allocation Dave Martin
2016-09-09 14:15 ` [RFC PATCH 4/5] arm64: signal: Allocate extra sigcontext space as needed Dave Martin
2016-09-09 14:15 ` [RFC PATCH 5/5] arm64: signal: Parse extra_context during sigreturn Dave Martin
2016-09-09 14:39 ` [RFC PATCH 0/5] arm64: Signal context expansion Florian Weimer
2016-09-09 15:21   ` Dave Martin
2016-09-09 17:01     ` Florian Weimer
2016-09-12 11:17       ` Dave Martin
2016-09-12 12:49         ` Florian Weimer
2016-09-12 15:21           ` Dave Martin
2016-09-12 15:01         ` Szabolcs Nagy
2016-09-12 15:30           ` Dave Martin
2016-09-12 16:44             ` Szabolcs Nagy
2016-09-12 17:24               ` Dave Martin
2016-09-13  9:28             ` Florian Weimer
2016-09-13 15:52               ` Dave Martin
2016-09-13 16:02                 ` Florian Weimer
2016-09-15 16:45                   ` Dave Martin
2016-09-16 12:10                     ` Florian Weimer
2016-09-16 17:39                       ` Dave Martin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.