linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/4] x86: sigcontext fixes, again
@ 2016-02-16 23:09 Andy Lutomirski
  2016-02-16 23:09 ` [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs Andy Lutomirski
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Andy Lutomirski @ 2016-02-16 23:09 UTC (permalink / raw)
  To: x86
  Cc: Denys Vlasenko, Stas Sergeev, Cyrill Gorcunov, Pavel Emelyanov,
	Brian Gerst, linux-kernel, Linus Torvalds, Oleg Nesterov,
	Borislav Petkov, Andy Lutomirski

This is take 2 at fixing x86 64-bit signals wrt SS.  After a lot of
thought, this is not controlled by any flags -- I would much prefer
to avoid opt-in behavior.  Instead, it just tries hard to avoid
triggering the cases that break DOSEMU.

Stas, this now seems to pass the test you sent me.  It works with
stock dosemu2 (I haven't tested classic dosemu because I can't get it
to work regardless).  It also works with a patched dosemu2 that bypasses
the userspace trampoline:

https://github.com/amluto/dosemu2/commit/571b4d08dc885b7a133e444a2ad23e0d21366206

With this applied, all of the x86 selftests pass on x86_64.  That
wasn't the case before -- ldt_gdt_64 was broken.

This is a bit risky, and another option would be to do nothing at
all.  Then we'd disable the problematic self-tests (sigh), and
DOSEMU and similar tools will be stuck using gross hacks even on new
kernels.

Changes from v4:
 - More comment fixes from Borislav and Ingo.

Changes from v3:
 - Comment fixes from Borislav.
 - Improve AR_xyz comments to better match SDM.  (Borislav, is this good?)
 - Get rid of duplicate AR_xyz defines in the selftest.
 - (no non-comment code changes)

Changes from v2:
 - Rebased (which wasn't quite trivial since the headers were rearranged)
 - Slightly improved a comment in patch 1.
 - Tidy up changelogs a bit.

Changes from v1:
 - Comment fixes
 - Fix screwed up uaccess that broke things

Andy Lutomirski (4):
  x86/signal/64: Add a comment about sigcontext->fs and gs
  x86/signal/64: Fix SS if needed when delivering a 64-bit signal
  x86/signal/64: Re-add support for SS in the 64-bit signal context
  selftests/x86: Add tests for UC_SIGCONTEXT_SS and UC_STRICT_RESTORE_SS

 arch/x86/include/asm/desc_defs.h        |  23 ++++
 arch/x86/include/asm/sighandling.h      |   1 -
 arch/x86/include/uapi/asm/sigcontext.h  |  32 ++++-
 arch/x86/include/uapi/asm/ucontext.h    |  53 +++++++-
 arch/x86/kernel/signal.c                | 114 +++++++++++++---
 tools/testing/selftests/x86/Makefile    |   7 +-
 tools/testing/selftests/x86/sigreturn.c | 230 ++++++++++++++++++++++++++++----
 7 files changed, 400 insertions(+), 60 deletions(-)

-- 
2.5.0

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs
  2016-02-16 23:09 [PATCH v5 0/4] x86: sigcontext fixes, again Andy Lutomirski
@ 2016-02-16 23:09 ` Andy Lutomirski
  2016-02-17  7:21   ` Ingo Molnar
  2016-02-17 12:10   ` [tip:x86/asm] x86/signal/64: Add a comment about sigcontext-> fs " tip-bot for Andy Lutomirski
  2016-02-16 23:09 ` [PATCH v5 2/4] x86/signal/64: Fix SS if needed when delivering a 64-bit signal Andy Lutomirski
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 12+ messages in thread
From: Andy Lutomirski @ 2016-02-16 23:09 UTC (permalink / raw)
  To: x86
  Cc: Denys Vlasenko, Stas Sergeev, Cyrill Gorcunov, Pavel Emelyanov,
	Brian Gerst, linux-kernel, Linus Torvalds, Oleg Nesterov,
	Borislav Petkov, Andy Lutomirski

These fields have a strange history.  This tries to document it.

This borrows from 9a036b93a344 ("x86/signal/64: Remove 'fs' and 'gs'
from sigcontext"), which was reverted by ed596cde9425 ("Revert x86
sigcontext cleanups").

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/uapi/asm/sigcontext.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index d485232f1e9f..702c40468859 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -341,6 +341,31 @@ struct sigcontext {
 	__u64				rip;
 	__u64				eflags;		/* RFLAGS */
 	__u16				cs;
+
+	/*
+	 * Prior to 2.5.64 ("[PATCH] x86-64 updates for 2.5.64-bk3"),
+	 * Linux saved and restored fs and gs in these slots.  This
+	 * was counterproductive, as fsbase and gsbase were never
+	 * saved, so arch_prctl was presumably unreliable.
+	 *
+	 * These slots should never be reused without extreme caution:
+	 *
+	 *  - Some DOSEMU versions stash fs and gs in these slots manually,
+	 *    thus overwriting anything the kernel expects to be preserved
+	 *    in these slots.
+	 *
+	 *  - If these slots are ever needed for any other purpose,
+	 *    there is some risk that very old 64-bit binaries could get
+	 *    confused.  I doubt that many such binaries still work,
+	 *    though, since the same patch in 2.5.64 also removed the
+	 *    64-bit set_thread_area syscall, so it appears that there
+	 *    is no TLS API beyond modify_ldt that works in both pre-
+	 *    and post-2.5.64 kernels.
+	 *
+	 * If the kernel ever adds explicit fs, gs, fsbase, and gsbase
+	 * save/restore, it will most likely need to be opt-in and use
+	 * different context slots.
+	 */
 	__u16				gs;
 	__u16				fs;
 	__u16				__pad0;
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 2/4] x86/signal/64: Fix SS if needed when delivering a 64-bit signal
  2016-02-16 23:09 [PATCH v5 0/4] x86: sigcontext fixes, again Andy Lutomirski
  2016-02-16 23:09 ` [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs Andy Lutomirski
@ 2016-02-16 23:09 ` Andy Lutomirski
  2016-02-17 12:10   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
  2016-02-16 23:09 ` [PATCH v5 3/4] x86/signal/64: Re-add support for SS in the 64-bit signal context Andy Lutomirski
  2016-02-16 23:09 ` [PATCH v5 4/4] selftests/x86: Add tests for UC_SIGCONTEXT_SS and UC_STRICT_RESTORE_SS Andy Lutomirski
  3 siblings, 1 reply; 12+ messages in thread
From: Andy Lutomirski @ 2016-02-16 23:09 UTC (permalink / raw)
  To: x86
  Cc: Denys Vlasenko, Stas Sergeev, Cyrill Gorcunov, Pavel Emelyanov,
	Brian Gerst, linux-kernel, Linus Torvalds, Oleg Nesterov,
	Borislav Petkov, Andy Lutomirski

Signals are always delivered to 64-bit tasks with CS set to a long
mode segment.  In long mode, SS doesn't matter as long as it's a
present writable segment.

If SS starts out invalid (this can happen if the signal was caused
by an IRET fault or was delivered on the way out of set_thread_area
or modify_ldt), then IRET to the signal handler can fail, eventually
killing the task.

The straightforward fix would be to simply reset SS when delivering
a signal.  That breaks DOSEMU, though: 64-bit builds of DOSEMU rely
on SS being set to the faulting SS when signals are delivered.

As a compromise, this patch leaves SS alone so long as it's valid.

The net effect should be that the behavior of successfully delivered
signals is unchanged.  Some signals that would previously have
failed to be delivered will now be delivered successfully.

This has no effect for x32 or 32-bit tasks: their signal handlers
were already called with SS == __USER_DS.

(On Xen, there's a slight hole: if a task sets SS to a writable
 *kernel* data segment, then we will fail to identify it as invalid
 and we'll still kill the task.  If anyone cares, this could be fixed
 with a new paravirt hook.)

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/desc_defs.h | 23 ++++++++++++++++++
 arch/x86/kernel/signal.c         | 51 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/desc_defs.h b/arch/x86/include/asm/desc_defs.h
index 278441f39856..eb5deb42484d 100644
--- a/arch/x86/include/asm/desc_defs.h
+++ b/arch/x86/include/asm/desc_defs.h
@@ -98,4 +98,27 @@ struct desc_ptr {
 
 #endif /* !__ASSEMBLY__ */
 
+/* Access rights as returned by LAR */
+#define AR_TYPE_RODATA		(0 * (1 << 9))
+#define AR_TYPE_RWDATA		(1 * (1 << 9))
+#define AR_TYPE_RODATA_EXPDOWN	(2 * (1 << 9))
+#define AR_TYPE_RWDATA_EXPDOWN	(3 * (1 << 9))
+#define AR_TYPE_XOCODE		(4 * (1 << 9))
+#define AR_TYPE_XRCODE		(5 * (1 << 9))
+#define AR_TYPE_XOCODE_CONF	(6 * (1 << 9))
+#define AR_TYPE_XRCODE_CONF	(7 * (1 << 9))
+#define AR_TYPE_MASK		(7 * (1 << 9))
+
+#define AR_DPL0			(0 * (1 << 13))
+#define AR_DPL3			(3 * (1 << 13))
+#define AR_DPL_MASK		(3 * (1 << 13))
+
+#define AR_A			(1 << 8)   /* "Accessed" */
+#define AR_S			(1 << 12)  /* If clear, "System" segment */
+#define AR_P			(1 << 15)  /* "Present" */
+#define AR_AVL			(1 << 20)  /* "AVaiLable" (no HW effect) */
+#define AR_L			(1 << 21)  /* "Long mode" for code segments */
+#define AR_DB			(1 << 22)  /* D/B, effect depends on type */
+#define AR_G			(1 << 23)  /* "Granularity" (limit in pages) */
+
 #endif /* _ASM_X86_DESC_DEFS_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index cb6282c3638f..bb3e4208d90d 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -61,6 +61,35 @@
 	regs->seg = GET_SEG(seg) | 3;			\
 } while (0)
 
+#ifdef CONFIG_X86_64
+/*
+ * If regs->ss will cause an IRET fault, change it.  Otherwise leave it
+ * alone.  Using this generally makes no sense unless
+ * user_64bit_mode(regs) would return true.
+ */
+static void force_valid_ss(struct pt_regs *regs)
+{
+	u32 ar;
+	asm volatile ("lar %[old_ss], %[ar]\n\t"
+		      "jz 1f\n\t"		/* If invalid: */
+		      "xorl %[ar], %[ar]\n\t"	/* set ar = 0 */
+		      "1:"
+		      : [ar] "=r" (ar)
+		      : [old_ss] "rm" ((u16)regs->ss));
+
+	/*
+	 * For a valid 64-bit user context, we need DPL 3, type
+	 * read-write data or read-write exp-down data, and S and P
+	 * set.  We can't use VERW because VERW doesn't check the
+	 * P bit.
+	 */
+	ar &= AR_DPL_MASK | AR_S | AR_P | AR_TYPE_MASK;
+	if (ar != (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA) &&
+	    ar != (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA_EXPDOWN))
+		regs->ss = __USER_DS;
+}
+#endif
+
 int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
 {
 	unsigned long buf_val;
@@ -459,10 +488,28 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 
 	regs->sp = (unsigned long)frame;
 
-	/* Set up the CS register to run signal handlers in 64-bit mode,
-	   even if the handler happens to be interrupting 32-bit code. */
+	/*
+	 * Set up the CS and SS registers to run signal handlers in
+	 * 64-bit mode, even if the handler happens to be interrupting
+	 * 32-bit or 16-bit code.
+	 *
+	 * SS is subtle.  In 64-bit mode, we don't need any particular
+	 * SS descriptor, but we do need SS to be valid.  It's possible
+	 * that the old SS is entirely bogus -- this can happen if the
+	 * signal we're trying to deliver is #GP or #SS caused by a bad
+	 * SS value.  We also have a compatbility issue here: DOSEMU
+	 * relies on the contents of the SS register indicating the
+	 * SS value at the time of the signal, even though that code in
+	 * DOSEMU predates sigreturn's ability to restore SS.  (DOSEMU
+	 * avoids relying on sigreturn to restore SS; instead it uses
+	 * a trampoline.)  So we do our best: if the old SS was valid,
+	 * we keep it.  Otherwise we replace it.
+	 */
 	regs->cs = __USER_CS;
 
+	if (unlikely(regs->ss != __USER_DS))
+		force_valid_ss(regs);
+
 	return 0;
 }
 #endif /* CONFIG_X86_32 */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 3/4] x86/signal/64: Re-add support for SS in the 64-bit signal context
  2016-02-16 23:09 [PATCH v5 0/4] x86: sigcontext fixes, again Andy Lutomirski
  2016-02-16 23:09 ` [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs Andy Lutomirski
  2016-02-16 23:09 ` [PATCH v5 2/4] x86/signal/64: Fix SS if needed when delivering a 64-bit signal Andy Lutomirski
@ 2016-02-16 23:09 ` Andy Lutomirski
  2016-02-17 12:11   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
  2016-02-16 23:09 ` [PATCH v5 4/4] selftests/x86: Add tests for UC_SIGCONTEXT_SS and UC_STRICT_RESTORE_SS Andy Lutomirski
  3 siblings, 1 reply; 12+ messages in thread
From: Andy Lutomirski @ 2016-02-16 23:09 UTC (permalink / raw)
  To: x86
  Cc: Denys Vlasenko, Stas Sergeev, Cyrill Gorcunov, Pavel Emelyanov,
	Brian Gerst, linux-kernel, Linus Torvalds, Oleg Nesterov,
	Borislav Petkov, Andy Lutomirski

This is a second attempt to make the improvements from c6f2062935c8
("x86/signal/64: Fix SS handling for signals delivered to 64-bit
programs"), which was reverted by 51adbfbba5c6 ("x86/signal/64: Add
support for SS in the 64-bit signal context").

This adds two new uc_flags flags.  UC_SIGCONTEXT_SS will be set for
all 64-bit signals (including x32).  It indicates that the saved SS
field is valid and that the kernel supports the new behavior.

The goal is to fix a problems with signal handling in 64-bit tasks:
SS wasn't saved in the 64-bit signal context, making it awkward to
determine what SS was at the time of signal delivery and making it
impossible to return to a non-flat SS (as calling sigreturn clobbers
SS).

This also made it extremely difficult for 64-bit tasks to return to
fully-defined 16-bit contexts, because only the kernel can easily do
espfix64, but sigreturn was unable to set a non-flag SS:ESP.
(DOSEMU has a monstrous hack to partially work around this
limitation.)

If we could go back in time, the correct fix would be to make 64-bit
signals work just like 32-bit signals with respect to SS: save it
in signal context, reset it when delivering a signal, and restore
it in sigreturn.

Unfortunately, doing that (as I tried originally) breaks DOSEMU:
DOSEMU wouldn't reset the signal context's SS when clearing the LDT
and changing the saved CS to 64-bit mode, since it predates the SS
context field existing in the first place.

This patch is a bit more complicated, and it tries to balance a
bunch of goals.  It makes most cases of changing ucontext->ss during
signal handling work as expected.

I do this by special-casing the interesting case.  On sigreturn,
ucontext->ss will be honored by default, unless the ucontext was
created from scratch by an old program and had a 64-bit CS
(unfortunately, CRIU can do this) or was the result of changing a
32-bit signal context to 64-bit without resetting SS (as DOSEMU
does).

For the benefit of new 64-bit software that uses segmentation (new
versions of DOSEMU might), the new behavior can be detected with a
new ucontext flag UC_SIGCONTEXT_SS.

To avoid compilation issues, __pad0 is left as an alias for ss in
ucontext.

The nitty-gritty details are documented in the header file.

This patch also re-enables the sigreturn_64 and ldt_gdt_64 selftests,
as the kernel change allows both of them to pass.

Cc: Stas Sergeev <stsp@list.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Tested-by: Stas Sergeev <stsp@list.ru>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/sighandling.h     |  1 -
 arch/x86/include/uapi/asm/sigcontext.h |  7 ++--
 arch/x86/include/uapi/asm/ucontext.h   | 53 +++++++++++++++++++++++++---
 arch/x86/kernel/signal.c               | 63 ++++++++++++++++++++++++----------
 tools/testing/selftests/x86/Makefile   |  7 ++--
 5 files changed, 101 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
index 89db46752a8f..452c88b8ad06 100644
--- a/arch/x86/include/asm/sighandling.h
+++ b/arch/x86/include/asm/sighandling.h
@@ -13,7 +13,6 @@
 			 X86_EFLAGS_CF | X86_EFLAGS_RF)
 
 void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
-int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc);
 int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 		     struct pt_regs *regs, unsigned long mask);
 
diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index 702c40468859..db8248f52007 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -256,7 +256,7 @@ struct sigcontext_64 {
 	__u16				cs;
 	__u16				gs;
 	__u16				fs;
-	__u16				__pad0;
+	__u16				ss;
 	__u64				err;
 	__u64				trapno;
 	__u64				oldmask;
@@ -368,7 +368,10 @@ struct sigcontext {
 	 */
 	__u16				gs;
 	__u16				fs;
-	__u16				__pad0;
+	union {
+		__u16			ss;	/* If UC_SIGCONTEXT_SS */
+		__u16			__pad0;	/* If !UC_SIGCONTEXT_SS */
+	};
 	__u64				err;
 	__u64				trapno;
 	__u64				oldmask;
diff --git a/arch/x86/include/uapi/asm/ucontext.h b/arch/x86/include/uapi/asm/ucontext.h
index b7c29c8017f2..e3d1ec90616e 100644
--- a/arch/x86/include/uapi/asm/ucontext.h
+++ b/arch/x86/include/uapi/asm/ucontext.h
@@ -1,11 +1,54 @@
 #ifndef _ASM_X86_UCONTEXT_H
 #define _ASM_X86_UCONTEXT_H
 
-#define UC_FP_XSTATE	0x1	/* indicates the presence of extended state
-				 * information in the memory layout pointed
-				 * by the fpstate pointer in the ucontext's
-				 * sigcontext struct (uc_mcontext).
-				 */
+/*
+ * Indicates the presence of extended state information in the memory
+ * layout pointed by the fpstate pointer in the ucontext's sigcontext
+ * struct (uc_mcontext).
+ */
+#define UC_FP_XSTATE	0x1
+
+#ifdef __x86_64__
+/*
+ * UC_SIGCONTEXT_SS will be set when delivering 64-bit or x32 signals on
+ * kernels that save SS in the sigcontext.  All kernels that set
+ * UC_SIGCONTEXT_SS will correctly restore at least the low 32 bits of esp
+ * regardless of SS (i.e. they implement espfix).
+ *
+ * Kernels that set UC_SIGCONTEXT_SS will also set UC_STRICT_RESTORE_SS
+ * when delivering a signal that came from 64-bit code.
+ *
+ * Sigreturn restores SS as follows:
+ *
+ * if (saved SS is valid || UC_STRICT_RESTORE_SS is set ||
+ *     saved CS is not 64-bit)
+ *         new SS = saved SS  (will fail IRET and signal if invalid)
+ * else
+ *         new SS = a flat 32-bit data segment
+ *
+ * This behavior serves three purposes:
+ *
+ * - Legacy programs that construct a 64-bit sigcontext from scratch
+ *   with zero or garbage in the SS slot (e.g. old CRIU) and call
+ *   sigreturn will still work.
+ *
+ * - Old DOSEMU versions sometimes catch a signal from a segmented
+ *   context, delete the old SS segment (with modify_ldt), and change
+ *   the saved CS to a 64-bit segment.  These DOSEMU versions expect
+ *   sigreturn to send them back to 64-bit mode without killing them,
+ *   despite the fact that the SS selector when the signal was raised is
+ *   no longer valid.  UC_STRICT_RESTORE_SS will be clear, so the kernel
+ *   will fix up SS for these DOSEMU versions.
+ *
+ * - Old and new programs that catch a signal and return without
+ *   modifying the saved context will end up in exactly the state they
+ *   started in, even if they were running in a segmented context when
+ *   the signal was raised..  Old kernels would lose track of the
+ *   previous SS value.
+ */
+#define UC_SIGCONTEXT_SS	0x2
+#define UC_STRICT_RESTORE_SS	0x4
+#endif
 
 #include <asm-generic/ucontext.h>
 
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index bb3e4208d90d..32725f6a2932 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -90,7 +90,9 @@ static void force_valid_ss(struct pt_regs *regs)
 }
 #endif
 
-int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
+static int restore_sigcontext(struct pt_regs *regs,
+			      struct sigcontext __user *sc,
+			      unsigned long uc_flags)
 {
 	unsigned long buf_val;
 	void __user *buf;
@@ -123,15 +125,18 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
 		COPY(r15);
 #endif /* CONFIG_X86_64 */
 
-#ifdef CONFIG_X86_32
 		COPY_SEG_CPL3(cs);
 		COPY_SEG_CPL3(ss);
-#else /* !CONFIG_X86_32 */
-		/* Kernel saves and restores only the CS segment register on signals,
-		 * which is the bare minimum needed to allow mixed 32/64-bit code.
-		 * App's signal handler can save/restore other segments if needed. */
-		COPY_SEG_CPL3(cs);
-#endif /* CONFIG_X86_32 */
+
+#ifdef CONFIG_X86_64
+		/*
+		 * Fix up SS if needed for the benefit of old DOSEMU and
+		 * CRIU.
+		 */
+		if (unlikely(!(uc_flags & UC_STRICT_RESTORE_SS) &&
+			     user_64bit_mode(regs)))
+			force_valid_ss(regs);
+#endif
 
 		get_user_ex(tmpflags, &sc->flags);
 		regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS);
@@ -194,6 +199,7 @@ int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 		put_user_ex(regs->cs, &sc->cs);
 		put_user_ex(0, &sc->gs);
 		put_user_ex(0, &sc->fs);
+		put_user_ex(regs->ss, &sc->ss);
 #endif /* CONFIG_X86_32 */
 
 		put_user_ex(fpstate, &sc->fpstate);
@@ -432,6 +438,21 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 	return 0;
 }
 #else /* !CONFIG_X86_32 */
+static unsigned long frame_uc_flags(struct pt_regs *regs)
+{
+	unsigned long flags;
+
+	if (cpu_has_xsave)
+		flags = UC_FP_XSTATE | UC_SIGCONTEXT_SS;
+	else
+		flags = UC_SIGCONTEXT_SS;
+
+	if (likely(user_64bit_mode(regs)))
+		flags |= UC_STRICT_RESTORE_SS;
+
+	return flags;
+}
+
 static int __setup_rt_frame(int sig, struct ksignal *ksig,
 			    sigset_t *set, struct pt_regs *regs)
 {
@@ -451,10 +472,7 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 
 	put_user_try {
 		/* Create the ucontext.  */
-		if (cpu_has_xsave)
-			put_user_ex(UC_FP_XSTATE, &frame->uc.uc_flags);
-		else
-			put_user_ex(0, &frame->uc.uc_flags);
+		put_user_ex(frame_uc_flags(regs), &frame->uc.uc_flags);
 		put_user_ex(0, &frame->uc.uc_link);
 		save_altstack_ex(&frame->uc.uc_stack, regs->sp);
 
@@ -536,10 +554,7 @@ static int x32_setup_rt_frame(struct ksignal *ksig,
 
 	put_user_try {
 		/* Create the ucontext.  */
-		if (cpu_has_xsave)
-			put_user_ex(UC_FP_XSTATE, &frame->uc.uc_flags);
-		else
-			put_user_ex(0, &frame->uc.uc_flags);
+		put_user_ex(frame_uc_flags(regs), &frame->uc.uc_flags);
 		put_user_ex(0, &frame->uc.uc_link);
 		compat_save_altstack_ex(&frame->uc.uc_stack, regs->sp);
 		put_user_ex(0, &frame->uc.uc__pad0);
@@ -601,7 +616,11 @@ asmlinkage unsigned long sys_sigreturn(void)
 
 	set_current_blocked(&set);
 
-	if (restore_sigcontext(regs, &frame->sc))
+	/*
+	 * x86_32 has no uc_flags bits relevant to restore_sigcontext.
+	 * Save a few cycles by skipping the __get_user.
+	 */
+	if (restore_sigcontext(regs, &frame->sc, 0))
 		goto badframe;
 	return regs->ax;
 
@@ -617,16 +636,19 @@ asmlinkage long sys_rt_sigreturn(void)
 	struct pt_regs *regs = current_pt_regs();
 	struct rt_sigframe __user *frame;
 	sigset_t set;
+	unsigned long uc_flags;
 
 	frame = (struct rt_sigframe __user *)(regs->sp - sizeof(long));
 	if (!access_ok(VERIFY_READ, frame, sizeof(*frame)))
 		goto badframe;
 	if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))
 		goto badframe;
+	if (__get_user(uc_flags, &frame->uc.uc_flags))
+		goto badframe;
 
 	set_current_blocked(&set);
 
-	if (restore_sigcontext(regs, &frame->uc.uc_mcontext))
+	if (restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags))
 		goto badframe;
 
 	if (restore_altstack(&frame->uc.uc_stack))
@@ -810,6 +832,7 @@ asmlinkage long sys32_x32_rt_sigreturn(void)
 	struct pt_regs *regs = current_pt_regs();
 	struct rt_sigframe_x32 __user *frame;
 	sigset_t set;
+	unsigned long uc_flags;
 
 	frame = (struct rt_sigframe_x32 __user *)(regs->sp - 8);
 
@@ -817,10 +840,12 @@ asmlinkage long sys32_x32_rt_sigreturn(void)
 		goto badframe;
 	if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))
 		goto badframe;
+	if (__get_user(uc_flags, &frame->uc.uc_flags))
+		goto badframe;
 
 	set_current_blocked(&set);
 
-	if (restore_sigcontext(regs, &frame->uc.uc_mcontext))
+	if (restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags))
 		goto badframe;
 
 	if (compat_restore_altstack(&frame->uc.uc_stack))
diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index d0c473f65850..f5a02f19546c 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -4,10 +4,11 @@ include ../lib.mk
 
 .PHONY: all all_32 all_64 warn_32bit_failure clean
 
-TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall
-TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault sigreturn test_syscall_vdso unwind_vdso \
+TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall \
+	sigreturn \
+	ldt_gdt
+TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
-			ldt_gdt \
 			vdso_restorer
 
 TARGETS_C_32BIT_ALL := $(TARGETS_C_BOTHBITS) $(TARGETS_C_32BIT_ONLY)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v5 4/4] selftests/x86: Add tests for UC_SIGCONTEXT_SS and UC_STRICT_RESTORE_SS
  2016-02-16 23:09 [PATCH v5 0/4] x86: sigcontext fixes, again Andy Lutomirski
                   ` (2 preceding siblings ...)
  2016-02-16 23:09 ` [PATCH v5 3/4] x86/signal/64: Re-add support for SS in the 64-bit signal context Andy Lutomirski
@ 2016-02-16 23:09 ` Andy Lutomirski
  2016-02-17 12:11   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
  3 siblings, 1 reply; 12+ messages in thread
From: Andy Lutomirski @ 2016-02-16 23:09 UTC (permalink / raw)
  To: x86
  Cc: Denys Vlasenko, Stas Sergeev, Cyrill Gorcunov, Pavel Emelyanov,
	Brian Gerst, linux-kernel, Linus Torvalds, Oleg Nesterov,
	Borislav Petkov, Andy Lutomirski

This tests the two ABI-preserving cases that DOSEMU cares about, and
it also explicitly tests the new UC_SIGCONTEXT_SS and
UC_STRICT_RESTORE_SS flags.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 tools/testing/selftests/x86/sigreturn.c | 230 ++++++++++++++++++++++++++++----
 1 file changed, 202 insertions(+), 28 deletions(-)

diff --git a/tools/testing/selftests/x86/sigreturn.c b/tools/testing/selftests/x86/sigreturn.c
index b5aa1bab7416..8a577e7070c6 100644
--- a/tools/testing/selftests/x86/sigreturn.c
+++ b/tools/testing/selftests/x86/sigreturn.c
@@ -54,6 +54,37 @@
 #include <sys/ptrace.h>
 #include <sys/user.h>
 
+/* Pull in AR_xyz defines. */
+typedef unsigned int u32;
+typedef unsigned short u16;
+#include "../../../../arch/x86/include/asm/desc_defs.h"
+
+/*
+ * Copied from asm/ucontext.h, as asm/ucontext.h conflicts badly with the glibc
+ * headers.
+ */
+#ifdef __x86_64__
+/*
+ * UC_SIGCONTEXT_SS will be set when delivering 64-bit or x32 signals on
+ * kernels that save SS in the sigcontext.  All kernels that set
+ * UC_SIGCONTEXT_SS will correctly restore at least the low 32 bits of esp
+ * regardless of SS (i.e. they implement espfix).
+ *
+ * Kernels that set UC_SIGCONTEXT_SS will also set UC_STRICT_RESTORE_SS
+ * when delivering a signal that came from 64-bit code.
+ *
+ * Sigreturn restores SS as follows:
+ *
+ * if (saved SS is valid || UC_STRICT_RESTORE_SS is set ||
+ *     saved CS is not 64-bit)
+ *         new SS = saved SS  (will fail IRET and signal if invalid)
+ * else
+ *         new SS = a flat 32-bit data segment
+ */
+#define UC_SIGCONTEXT_SS       0x2
+#define UC_STRICT_RESTORE_SS   0x4
+#endif
+
 /*
  * In principle, this test can run on Linux emulation layers (e.g.
  * Illumos "LX branded zones").  Solaris-based kernels reserve LDT
@@ -267,6 +298,9 @@ static gregset_t initial_regs, requested_regs, resulting_regs;
 /* Instructions for the SIGUSR1 handler. */
 static volatile unsigned short sig_cs, sig_ss;
 static volatile sig_atomic_t sig_trapped, sig_err, sig_trapno;
+#ifdef __x86_64__
+static volatile sig_atomic_t sig_corrupt_final_ss;
+#endif
 
 /* Abstractions for some 32-bit vs 64-bit differences. */
 #ifdef __x86_64__
@@ -305,9 +339,105 @@ static greg_t *csptr(ucontext_t *ctx)
 }
 #endif
 
+/*
+ * Checks a given selector for its code bitness or returns -1 if it's not
+ * a usable code segment selector.
+ */
+int cs_bitness(unsigned short cs)
+{
+	uint32_t valid = 0, ar;
+	asm ("lar %[cs], %[ar]\n\t"
+	     "jnz 1f\n\t"
+	     "mov $1, %[valid]\n\t"
+	     "1:"
+	     : [ar] "=r" (ar), [valid] "+rm" (valid)
+	     : [cs] "r" (cs));
+
+	if (!valid)
+		return -1;
+
+	bool db = (ar & (1 << 22));
+	bool l = (ar & (1 << 21));
+
+	if (!(ar & (1<<11)))
+	    return -1;	/* Not code. */
+
+	if (l && !db)
+		return 64;
+	else if (!l && db)
+		return 32;
+	else if (!l && !db)
+		return 16;
+	else
+		return -1;	/* Unknown bitness. */
+}
+
+/*
+ * Checks a given selector for its code bitness or returns -1 if it's not
+ * a usable code segment selector.
+ */
+bool is_valid_ss(unsigned short cs)
+{
+	uint32_t valid = 0, ar;
+	asm ("lar %[cs], %[ar]\n\t"
+	     "jnz 1f\n\t"
+	     "mov $1, %[valid]\n\t"
+	     "1:"
+	     : [ar] "=r" (ar), [valid] "+rm" (valid)
+	     : [cs] "r" (cs));
+
+	if (!valid)
+		return false;
+
+	if ((ar & AR_TYPE_MASK) != AR_TYPE_RWDATA &&
+	    (ar & AR_TYPE_MASK) != AR_TYPE_RWDATA_EXPDOWN)
+		return false;
+
+	return (ar & AR_P);
+}
+
 /* Number of errors in the current test case. */
 static volatile sig_atomic_t nerrs;
 
+static void validate_signal_ss(int sig, ucontext_t *ctx)
+{
+#ifdef __x86_64__
+	bool was_64bit = (cs_bitness(*csptr(ctx)) == 64);
+
+	if (!(ctx->uc_flags & UC_SIGCONTEXT_SS)) {
+		printf("[FAIL]\tUC_SIGCONTEXT_SS was not set\n");
+		nerrs++;
+
+		/*
+		 * This happens on Linux 4.1.  The rest will fail, too, so
+		 * return now to reduce the noise.
+		 */
+		return;
+	}
+
+	/* UC_STRICT_RESTORE_SS is set iff we came from 64-bit mode. */
+	if (!!(ctx->uc_flags & UC_STRICT_RESTORE_SS) != was_64bit) {
+		printf("[FAIL]\tUC_STRICT_RESTORE_SS was wrong in signal %d\n",
+		       sig);
+		nerrs++;
+	}
+
+	if (is_valid_ss(*ssptr(ctx))) {
+		/*
+		 * DOSEMU was written before 64-bit sigcontext had SS, and
+		 * it tries to figure out the signal source SS by looking at
+		 * the physical register.  Make sure that keeps working.
+		 */
+		unsigned short hw_ss;
+		asm ("mov %%ss, %0" : "=rm" (hw_ss));
+		if (hw_ss != *ssptr(ctx)) {
+			printf("[FAIL]\tHW SS didn't match saved SS\n");
+			nerrs++;
+		}
+	}
+#endif
+}
+
 /*
  * SIGUSR1 handler.  Sets CS and SS as requested and points IP to the
  * int3 trampoline.  Sets SP to a large known value so that we can see
@@ -317,6 +447,8 @@ static void sigusr1(int sig, siginfo_t *info, void *ctx_void)
 {
 	ucontext_t *ctx = (ucontext_t*)ctx_void;
 
+	validate_signal_ss(sig, ctx);
+
 	memcpy(&initial_regs, &ctx->uc_mcontext.gregs, sizeof(gregset_t));
 
 	*csptr(ctx) = sig_cs;
@@ -334,13 +466,16 @@ static void sigusr1(int sig, siginfo_t *info, void *ctx_void)
 }
 
 /*
- * Called after a successful sigreturn.  Restores our state so that
- * the original raise(SIGUSR1) returns.
+ * Called after a successful sigreturn (via int3) or from a failed
+ * sigreturn (directly by kernel).  Restores our state so that the
+ * original raise(SIGUSR1) returns.
  */
 static void sigtrap(int sig, siginfo_t *info, void *ctx_void)
 {
 	ucontext_t *ctx = (ucontext_t*)ctx_void;
 
+	validate_signal_ss(sig, ctx);
+
 	sig_err = ctx->uc_mcontext.gregs[REG_ERR];
 	sig_trapno = ctx->uc_mcontext.gregs[REG_TRAPNO];
 
@@ -358,41 +493,62 @@ static void sigtrap(int sig, siginfo_t *info, void *ctx_void)
 	memcpy(&resulting_regs, &ctx->uc_mcontext.gregs, sizeof(gregset_t));
 	memcpy(&ctx->uc_mcontext.gregs, &initial_regs, sizeof(gregset_t));
 
+#ifdef __x86_64__
+	if (sig_corrupt_final_ss) {
+		if (ctx->uc_flags & UC_STRICT_RESTORE_SS) {
+			printf("[FAIL]\tUC_STRICT_RESTORE_SS was set inappropriately\n");
+			nerrs++;
+		} else {
+			/*
+			 * DOSEMU transitions from 32-bit to 64-bit mode by
+			 * adjusting sigcontext, and it requires that this work
+			 * even if the saved SS is bogus.
+			 */
+			printf("\tCorrupting SS on return to 64-bit mode\n");
+			*ssptr(ctx) = 0;
+		}
+	}
+#endif
+
 	sig_trapped = sig;
 }
 
-/*
- * Checks a given selector for its code bitness or returns -1 if it's not
- * a usable code segment selector.
- */
-int cs_bitness(unsigned short cs)
+#ifdef __x86_64__
+/* Tests recovery if !UC_STRICT_RESTORE_SS */
+static void sigusr2(int sig, siginfo_t *info, void *ctx_void)
 {
-	uint32_t valid = 0, ar;
-	asm ("lar %[cs], %[ar]\n\t"
-	     "jnz 1f\n\t"
-	     "mov $1, %[valid]\n\t"
-	     "1:"
-	     : [ar] "=r" (ar), [valid] "+rm" (valid)
-	     : [cs] "r" (cs));
+	ucontext_t *ctx = (ucontext_t*)ctx_void;
 
-	if (!valid)
-		return -1;
+	if (!(ctx->uc_flags & UC_STRICT_RESTORE_SS)) {
+		printf("[FAIL]\traise(2) didn't set UC_STRICT_RESTORE_SS\n");
+		nerrs++;
+		return;  /* We can't do the rest. */
+	}
 
-	bool db = (ar & (1 << 22));
-	bool l = (ar & (1 << 21));
+	ctx->uc_flags &= ~UC_STRICT_RESTORE_SS;
+	*ssptr(ctx) = 0;
 
-	if (!(ar & (1<<11)))
-	    return -1;	/* Not code. */
+	/* Return.  The kernel should recover without sending another signal. */
+}
 
-	if (l && !db)
-		return 64;
-	else if (!l && db)
-		return 32;
-	else if (!l && !db)
-		return 16;
-	else
-		return -1;	/* Unknown bitness. */
+static int test_nonstrict_ss(void)
+{
+	clearhandler(SIGUSR1);
+	clearhandler(SIGTRAP);
+	clearhandler(SIGSEGV);
+	clearhandler(SIGILL);
+	sethandler(SIGUSR2, sigusr2, 0);
+
+	nerrs = 0;
+
+	printf("[RUN]\tClear UC_STRICT_RESTORE_SS and corrupt SS\n");
+	raise(SIGUSR2);
+	if (!nerrs)
+		printf("[OK]\tIt worked\n");
+
+	return nerrs;
 }
+#endif
 
 /* Finds a usable code segment of the requested bitness. */
 int find_cs(int bitness)
@@ -576,6 +732,12 @@ static int test_bad_iret(int cs_bits, unsigned short ss, int force_cs)
 		       errdesc, strsignal(sig_trapped));
 		return 0;
 	} else {
+		/*
+		 * This also implicitly tests UC_STRICT_RESTORE_SS:
+		 * We check that these signals set UC_STRICT_RESTORE_SS and,
+		 * if UC_STRICT_RESTORE_SS doesn't cause strict behavior,
+		 * then we won't get SIGSEGV.
+		 */
 		printf("[FAIL]\tDid not get SIGSEGV\n");
 		return 1;
 	}
@@ -632,6 +794,14 @@ int main()
 						    GDT3(gdt_data16_idx));
 	}
 
+#ifdef __x86_64__
+	/* Nasty ABI case: check SS corruption handling. */
+	sig_corrupt_final_ss = 1;
+	total_nerrs += test_valid_sigreturn(32, false, -1);
+	total_nerrs += test_valid_sigreturn(32, true, -1);
+	sig_corrupt_final_ss = 0;
+#endif
+
 	/*
 	 * We're done testing valid sigreturn cases.  Now we test states
 	 * for which sigreturn itself will succeed but the subsequent
@@ -680,5 +850,9 @@ int main()
 	if (gdt_npdata32_idx)
 		test_bad_iret(32, GDT3(gdt_npdata32_idx), -1);
 
+#ifdef __x86_64__
+	total_nerrs += test_nonstrict_ss();
+#endif
+
 	return total_nerrs ? 1 : 0;
 }
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs
  2016-02-16 23:09 ` [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs Andy Lutomirski
@ 2016-02-17  7:21   ` Ingo Molnar
  2016-02-17 10:03     ` Stas Sergeev
  2016-02-17 12:10   ` [tip:x86/asm] x86/signal/64: Add a comment about sigcontext-> fs " tip-bot for Andy Lutomirski
  1 sibling, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2016-02-17  7:21 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: x86, Denys Vlasenko, Stas Sergeev, Cyrill Gorcunov,
	Pavel Emelyanov, Brian Gerst, linux-kernel, Linus Torvalds,
	Oleg Nesterov, Borislav Petkov


* Andy Lutomirski <luto@kernel.org> wrote:

> +	 * If the kernel ever adds explicit fs, gs, fsbase, and gsbase
> +	 * save/restore, it will most likely need to be opt-in and use
> +	 * different context slots.

Btw., that's not necessarily true: it could also be made opt-out, and a 
modify_ldt() or any other cleanly identifiable legacy usage/signature that is 
associated with DOSEMU might trigger the opt-out automatically as well.

I.e. behaviorally it would still keep the ABI and modern default behavior could 
still be whatever we want to make it.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs
  2016-02-17  7:21   ` Ingo Molnar
@ 2016-02-17 10:03     ` Stas Sergeev
  2016-02-17 11:05       ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Stas Sergeev @ 2016-02-17 10:03 UTC (permalink / raw)
  To: Ingo Molnar, Andy Lutomirski
  Cc: x86, Denys Vlasenko, Cyrill Gorcunov, Pavel Emelyanov,
	Brian Gerst, linux-kernel, Linus Torvalds, Oleg Nesterov,
	Borislav Petkov

17.02.2016 10:21, Ingo Molnar пишет:
> 
> * Andy Lutomirski <luto@kernel.org> wrote:
> 
>> +	 * If the kernel ever adds explicit fs, gs, fsbase, and gsbase
>> +	 * save/restore, it will most likely need to be opt-in and use
>> +	 * different context slots.
> 
> Btw., that's not necessarily true: it could also be made opt-out, and a 
> modify_ldt() or any other cleanly identifiable legacy usage/signature that is 
> associated with DOSEMU might trigger the opt-out automatically as well.
But there are the new versions of dosemu that still use
"modify_ldt() or any other cleanly identifiable legacy usage/signature"
and yet wants a new functionality.
Please don't go for such an unreliable heuristic.

> I.e. behaviorally it would still keep the ABI and modern default behavior could 
> still be whatever we want to make it.
You can allow glibc to deal with the opt-ins.
It has symbol versioning system AFAIK exactly for that.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs
  2016-02-17 10:03     ` Stas Sergeev
@ 2016-02-17 11:05       ` Ingo Molnar
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2016-02-17 11:05 UTC (permalink / raw)
  To: Stas Sergeev
  Cc: Andy Lutomirski, x86, Denys Vlasenko, Cyrill Gorcunov,
	Pavel Emelyanov, Brian Gerst, linux-kernel, Linus Torvalds,
	Oleg Nesterov, Borislav Petkov


* Stas Sergeev <stsp@list.ru> wrote:

> 17.02.2016 10:21, Ingo Molnar пишет:
> > 
> > * Andy Lutomirski <luto@kernel.org> wrote:
> > 
> >> +	 * If the kernel ever adds explicit fs, gs, fsbase, and gsbase
> >> +	 * save/restore, it will most likely need to be opt-in and use
> >> +	 * different context slots.
> > 
> > Btw., that's not necessarily true: it could also be made opt-out, and a 
> > modify_ldt() or any other cleanly identifiable legacy usage/signature that is 
> > associated with DOSEMU might trigger the opt-out automatically as well.
> But there are the new versions of dosemu that still use
> "modify_ldt() or any other cleanly identifiable legacy usage/signature"
> and yet wants a new functionality.
> Please don't go for such an unreliable heuristic.

Only if implemented dumbly.

Implemented intelligently there would be 3 runtime states: enabled, disabled, 
automatic. New DOSEMU would explicitly enable it, which would override any 
automatic defaults.

But note that I was only pointing out that the comment is needlessly restrictive. 
Anyway, I have applied Andy's patches.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [tip:x86/asm] x86/signal/64: Add a comment about sigcontext-> fs and gs
  2016-02-16 23:09 ` [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs Andy Lutomirski
  2016-02-17  7:21   ` Ingo Molnar
@ 2016-02-17 12:10   ` tip-bot for Andy Lutomirski
  1 sibling, 0 replies; 12+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-02-17 12:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, luto, oleg, bp, peterz, stsp, gorcunov, hpa, luto,
	dvlasenk, brgerst, viro, xemul, mingo, torvalds, tglx

Commit-ID:  e54fdcca70a33a7e447e526b305db85e978c0563
Gitweb:     http://git.kernel.org/tip/e54fdcca70a33a7e447e526b305db85e978c0563
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Tue, 16 Feb 2016 15:09:01 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 17 Feb 2016 08:32:11 +0100

x86/signal/64: Add a comment about sigcontext->fs and gs

These fields have a strange history.  This tries to document it.

This borrows from 9a036b93a344 ("x86/signal/64: Remove 'fs' and 'gs'
from sigcontext"), which was reverted by ed596cde9425 ("Revert x86
sigcontext cleanups").

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/baa78f3c84106fa5acbc319377b1850602f5deec.1455664054.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/uapi/asm/sigcontext.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index d485232..702c404 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -341,6 +341,31 @@ struct sigcontext {
 	__u64				rip;
 	__u64				eflags;		/* RFLAGS */
 	__u16				cs;
+
+	/*
+	 * Prior to 2.5.64 ("[PATCH] x86-64 updates for 2.5.64-bk3"),
+	 * Linux saved and restored fs and gs in these slots.  This
+	 * was counterproductive, as fsbase and gsbase were never
+	 * saved, so arch_prctl was presumably unreliable.
+	 *
+	 * These slots should never be reused without extreme caution:
+	 *
+	 *  - Some DOSEMU versions stash fs and gs in these slots manually,
+	 *    thus overwriting anything the kernel expects to be preserved
+	 *    in these slots.
+	 *
+	 *  - If these slots are ever needed for any other purpose,
+	 *    there is some risk that very old 64-bit binaries could get
+	 *    confused.  I doubt that many such binaries still work,
+	 *    though, since the same patch in 2.5.64 also removed the
+	 *    64-bit set_thread_area syscall, so it appears that there
+	 *    is no TLS API beyond modify_ldt that works in both pre-
+	 *    and post-2.5.64 kernels.
+	 *
+	 * If the kernel ever adds explicit fs, gs, fsbase, and gsbase
+	 * save/restore, it will most likely need to be opt-in and use
+	 * different context slots.
+	 */
 	__u16				gs;
 	__u16				fs;
 	__u16				__pad0;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [tip:x86/asm] x86/signal/64: Fix SS if needed when delivering a 64-bit signal
  2016-02-16 23:09 ` [PATCH v5 2/4] x86/signal/64: Fix SS if needed when delivering a 64-bit signal Andy Lutomirski
@ 2016-02-17 12:10   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-02-17 12:10 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, luto, gorcunov, hpa, linux-kernel, luto, dvlasenk, oleg,
	brgerst, viro, mingo, peterz, torvalds, tglx, xemul, stsp

Commit-ID:  8ff5bd2e1e2767fbf737f84d5f92668dafe7e7b0
Gitweb:     http://git.kernel.org/tip/8ff5bd2e1e2767fbf737f84d5f92668dafe7e7b0
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Tue, 16 Feb 2016 15:09:02 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 17 Feb 2016 08:32:11 +0100

x86/signal/64: Fix SS if needed when delivering a 64-bit signal

Signals are always delivered to 64-bit tasks with CS set to a long
mode segment.  In long mode, SS doesn't matter as long as it's a
present writable segment.

If SS starts out invalid (this can happen if the signal was caused
by an IRET fault or was delivered on the way out of set_thread_area
or modify_ldt), then IRET to the signal handler can fail, eventually
killing the task.

The straightforward fix would be to simply reset SS when delivering
a signal.  That breaks DOSEMU, though: 64-bit builds of DOSEMU rely
on SS being set to the faulting SS when signals are delivered.

As a compromise, this patch leaves SS alone so long as it's valid.

The net effect should be that the behavior of successfully delivered
signals is unchanged.  Some signals that would previously have
failed to be delivered will now be delivered successfully.

This has no effect for x32 or 32-bit tasks: their signal handlers
were already called with SS == __USER_DS.

(On Xen, there's a slight hole: if a task sets SS to a writable
 *kernel* data segment, then we will fail to identify it as invalid
 and we'll still kill the task.  If anyone cares, this could be fixed
 with a new paravirt hook.)

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/163c6e1eacde41388f3ff4d2fe6769be651d7b6e.1455664054.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/desc_defs.h | 23 ++++++++++++++++++
 arch/x86/kernel/signal.c         | 51 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/desc_defs.h b/arch/x86/include/asm/desc_defs.h
index 278441f..eb5deb4 100644
--- a/arch/x86/include/asm/desc_defs.h
+++ b/arch/x86/include/asm/desc_defs.h
@@ -98,4 +98,27 @@ struct desc_ptr {
 
 #endif /* !__ASSEMBLY__ */
 
+/* Access rights as returned by LAR */
+#define AR_TYPE_RODATA		(0 * (1 << 9))
+#define AR_TYPE_RWDATA		(1 * (1 << 9))
+#define AR_TYPE_RODATA_EXPDOWN	(2 * (1 << 9))
+#define AR_TYPE_RWDATA_EXPDOWN	(3 * (1 << 9))
+#define AR_TYPE_XOCODE		(4 * (1 << 9))
+#define AR_TYPE_XRCODE		(5 * (1 << 9))
+#define AR_TYPE_XOCODE_CONF	(6 * (1 << 9))
+#define AR_TYPE_XRCODE_CONF	(7 * (1 << 9))
+#define AR_TYPE_MASK		(7 * (1 << 9))
+
+#define AR_DPL0			(0 * (1 << 13))
+#define AR_DPL3			(3 * (1 << 13))
+#define AR_DPL_MASK		(3 * (1 << 13))
+
+#define AR_A			(1 << 8)   /* "Accessed" */
+#define AR_S			(1 << 12)  /* If clear, "System" segment */
+#define AR_P			(1 << 15)  /* "Present" */
+#define AR_AVL			(1 << 20)  /* "AVaiLable" (no HW effect) */
+#define AR_L			(1 << 21)  /* "Long mode" for code segments */
+#define AR_DB			(1 << 22)  /* D/B, effect depends on type */
+#define AR_G			(1 << 23)  /* "Granularity" (limit in pages) */
+
 #endif /* _ASM_X86_DESC_DEFS_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index c07ff5d..52f82c7 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -61,6 +61,35 @@
 	regs->seg = GET_SEG(seg) | 3;			\
 } while (0)
 
+#ifdef CONFIG_X86_64
+/*
+ * If regs->ss will cause an IRET fault, change it.  Otherwise leave it
+ * alone.  Using this generally makes no sense unless
+ * user_64bit_mode(regs) would return true.
+ */
+static void force_valid_ss(struct pt_regs *regs)
+{
+	u32 ar;
+	asm volatile ("lar %[old_ss], %[ar]\n\t"
+		      "jz 1f\n\t"		/* If invalid: */
+		      "xorl %[ar], %[ar]\n\t"	/* set ar = 0 */
+		      "1:"
+		      : [ar] "=r" (ar)
+		      : [old_ss] "rm" ((u16)regs->ss));
+
+	/*
+	 * For a valid 64-bit user context, we need DPL 3, type
+	 * read-write data or read-write exp-down data, and S and P
+	 * set.  We can't use VERW because VERW doesn't check the
+	 * P bit.
+	 */
+	ar &= AR_DPL_MASK | AR_S | AR_P | AR_TYPE_MASK;
+	if (ar != (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA) &&
+	    ar != (AR_DPL3 | AR_S | AR_P | AR_TYPE_RWDATA_EXPDOWN))
+		regs->ss = __USER_DS;
+}
+#endif
+
 int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
 {
 	unsigned long buf_val;
@@ -459,10 +488,28 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 
 	regs->sp = (unsigned long)frame;
 
-	/* Set up the CS register to run signal handlers in 64-bit mode,
-	   even if the handler happens to be interrupting 32-bit code. */
+	/*
+	 * Set up the CS and SS registers to run signal handlers in
+	 * 64-bit mode, even if the handler happens to be interrupting
+	 * 32-bit or 16-bit code.
+	 *
+	 * SS is subtle.  In 64-bit mode, we don't need any particular
+	 * SS descriptor, but we do need SS to be valid.  It's possible
+	 * that the old SS is entirely bogus -- this can happen if the
+	 * signal we're trying to deliver is #GP or #SS caused by a bad
+	 * SS value.  We also have a compatbility issue here: DOSEMU
+	 * relies on the contents of the SS register indicating the
+	 * SS value at the time of the signal, even though that code in
+	 * DOSEMU predates sigreturn's ability to restore SS.  (DOSEMU
+	 * avoids relying on sigreturn to restore SS; instead it uses
+	 * a trampoline.)  So we do our best: if the old SS was valid,
+	 * we keep it.  Otherwise we replace it.
+	 */
 	regs->cs = __USER_CS;
 
+	if (unlikely(regs->ss != __USER_DS))
+		force_valid_ss(regs);
+
 	return 0;
 }
 #endif /* CONFIG_X86_32 */

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [tip:x86/asm] x86/signal/64: Re-add support for SS in the 64-bit signal context
  2016-02-16 23:09 ` [PATCH v5 3/4] x86/signal/64: Re-add support for SS in the 64-bit signal context Andy Lutomirski
@ 2016-02-17 12:11   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-02-17 12:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, xemul, linux-kernel, luto, viro, peterz, bp, oleg,
	stsp, tglx, gorcunov, hpa, mingo, brgerst, torvalds, luto

Commit-ID:  6c25da5ad55d48c41b8909bc1f4e3cd5d85bb499
Gitweb:     http://git.kernel.org/tip/6c25da5ad55d48c41b8909bc1f4e3cd5d85bb499
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Tue, 16 Feb 2016 15:09:03 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 17 Feb 2016 08:32:11 +0100

x86/signal/64: Re-add support for SS in the 64-bit signal context

This is a second attempt to make the improvements from c6f2062935c8
("x86/signal/64: Fix SS handling for signals delivered to 64-bit
programs"), which was reverted by 51adbfbba5c6 ("x86/signal/64: Add
support for SS in the 64-bit signal context").

This adds two new uc_flags flags.  UC_SIGCONTEXT_SS will be set for
all 64-bit signals (including x32).  It indicates that the saved SS
field is valid and that the kernel supports the new behavior.

The goal is to fix a problems with signal handling in 64-bit tasks:
SS wasn't saved in the 64-bit signal context, making it awkward to
determine what SS was at the time of signal delivery and making it
impossible to return to a non-flat SS (as calling sigreturn clobbers
SS).

This also made it extremely difficult for 64-bit tasks to return to
fully-defined 16-bit contexts, because only the kernel can easily do
espfix64, but sigreturn was unable to set a non-flag SS:ESP.
(DOSEMU has a monstrous hack to partially work around this
limitation.)

If we could go back in time, the correct fix would be to make 64-bit
signals work just like 32-bit signals with respect to SS: save it
in signal context, reset it when delivering a signal, and restore
it in sigreturn.

Unfortunately, doing that (as I tried originally) breaks DOSEMU:
DOSEMU wouldn't reset the signal context's SS when clearing the LDT
and changing the saved CS to 64-bit mode, since it predates the SS
context field existing in the first place.

This patch is a bit more complicated, and it tries to balance a
bunch of goals.  It makes most cases of changing ucontext->ss during
signal handling work as expected.

I do this by special-casing the interesting case.  On sigreturn,
ucontext->ss will be honored by default, unless the ucontext was
created from scratch by an old program and had a 64-bit CS
(unfortunately, CRIU can do this) or was the result of changing a
32-bit signal context to 64-bit without resetting SS (as DOSEMU
does).

For the benefit of new 64-bit software that uses segmentation (new
versions of DOSEMU might), the new behavior can be detected with a
new ucontext flag UC_SIGCONTEXT_SS.

To avoid compilation issues, __pad0 is left as an alias for ss in
ucontext.

The nitty-gritty details are documented in the header file.

This patch also re-enables the sigreturn_64 and ldt_gdt_64 selftests,
as the kernel change allows both of them to pass.

Tested-by: Stas Sergeev <stsp@list.ru>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/749149cbfc3e75cd7fcdad69a854b399d792cc6f.1455664054.git.luto@kernel.org
[ Small readability edit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/sighandling.h     |  1 -
 arch/x86/include/uapi/asm/sigcontext.h |  7 ++--
 arch/x86/include/uapi/asm/ucontext.h   | 53 +++++++++++++++++++++++++---
 arch/x86/kernel/signal.c               | 63 ++++++++++++++++++++++++----------
 tools/testing/selftests/x86/Makefile   |  5 ++-
 5 files changed, 99 insertions(+), 30 deletions(-)

diff --git a/arch/x86/include/asm/sighandling.h b/arch/x86/include/asm/sighandling.h
index 89db467..452c88b 100644
--- a/arch/x86/include/asm/sighandling.h
+++ b/arch/x86/include/asm/sighandling.h
@@ -13,7 +13,6 @@
 			 X86_EFLAGS_CF | X86_EFLAGS_RF)
 
 void signal_fault(struct pt_regs *regs, void __user *frame, char *where);
-int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc);
 int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 		     struct pt_regs *regs, unsigned long mask);
 
diff --git a/arch/x86/include/uapi/asm/sigcontext.h b/arch/x86/include/uapi/asm/sigcontext.h
index 702c404..62d4111 100644
--- a/arch/x86/include/uapi/asm/sigcontext.h
+++ b/arch/x86/include/uapi/asm/sigcontext.h
@@ -256,7 +256,7 @@ struct sigcontext_64 {
 	__u16				cs;
 	__u16				gs;
 	__u16				fs;
-	__u16				__pad0;
+	__u16				ss;
 	__u64				err;
 	__u64				trapno;
 	__u64				oldmask;
@@ -368,7 +368,10 @@ struct sigcontext {
 	 */
 	__u16				gs;
 	__u16				fs;
-	__u16				__pad0;
+	union {
+		__u16			ss;	/* If UC_SIGCONTEXT_SS */
+		__u16			__pad0;	/* Alias name for old (!UC_SIGCONTEXT_SS) user-space */
+	};
 	__u64				err;
 	__u64				trapno;
 	__u64				oldmask;
diff --git a/arch/x86/include/uapi/asm/ucontext.h b/arch/x86/include/uapi/asm/ucontext.h
index b7c29c8..e3d1ec9 100644
--- a/arch/x86/include/uapi/asm/ucontext.h
+++ b/arch/x86/include/uapi/asm/ucontext.h
@@ -1,11 +1,54 @@
 #ifndef _ASM_X86_UCONTEXT_H
 #define _ASM_X86_UCONTEXT_H
 
-#define UC_FP_XSTATE	0x1	/* indicates the presence of extended state
-				 * information in the memory layout pointed
-				 * by the fpstate pointer in the ucontext's
-				 * sigcontext struct (uc_mcontext).
-				 */
+/*
+ * Indicates the presence of extended state information in the memory
+ * layout pointed by the fpstate pointer in the ucontext's sigcontext
+ * struct (uc_mcontext).
+ */
+#define UC_FP_XSTATE	0x1
+
+#ifdef __x86_64__
+/*
+ * UC_SIGCONTEXT_SS will be set when delivering 64-bit or x32 signals on
+ * kernels that save SS in the sigcontext.  All kernels that set
+ * UC_SIGCONTEXT_SS will correctly restore at least the low 32 bits of esp
+ * regardless of SS (i.e. they implement espfix).
+ *
+ * Kernels that set UC_SIGCONTEXT_SS will also set UC_STRICT_RESTORE_SS
+ * when delivering a signal that came from 64-bit code.
+ *
+ * Sigreturn restores SS as follows:
+ *
+ * if (saved SS is valid || UC_STRICT_RESTORE_SS is set ||
+ *     saved CS is not 64-bit)
+ *         new SS = saved SS  (will fail IRET and signal if invalid)
+ * else
+ *         new SS = a flat 32-bit data segment
+ *
+ * This behavior serves three purposes:
+ *
+ * - Legacy programs that construct a 64-bit sigcontext from scratch
+ *   with zero or garbage in the SS slot (e.g. old CRIU) and call
+ *   sigreturn will still work.
+ *
+ * - Old DOSEMU versions sometimes catch a signal from a segmented
+ *   context, delete the old SS segment (with modify_ldt), and change
+ *   the saved CS to a 64-bit segment.  These DOSEMU versions expect
+ *   sigreturn to send them back to 64-bit mode without killing them,
+ *   despite the fact that the SS selector when the signal was raised is
+ *   no longer valid.  UC_STRICT_RESTORE_SS will be clear, so the kernel
+ *   will fix up SS for these DOSEMU versions.
+ *
+ * - Old and new programs that catch a signal and return without
+ *   modifying the saved context will end up in exactly the state they
+ *   started in, even if they were running in a segmented context when
+ *   the signal was raised..  Old kernels would lose track of the
+ *   previous SS value.
+ */
+#define UC_SIGCONTEXT_SS	0x2
+#define UC_STRICT_RESTORE_SS	0x4
+#endif
 
 #include <asm-generic/ucontext.h>
 
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 52f82c7..548ddf7 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -90,7 +90,9 @@ static void force_valid_ss(struct pt_regs *regs)
 }
 #endif
 
-int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
+static int restore_sigcontext(struct pt_regs *regs,
+			      struct sigcontext __user *sc,
+			      unsigned long uc_flags)
 {
 	unsigned long buf_val;
 	void __user *buf;
@@ -123,15 +125,18 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc)
 		COPY(r15);
 #endif /* CONFIG_X86_64 */
 
-#ifdef CONFIG_X86_32
 		COPY_SEG_CPL3(cs);
 		COPY_SEG_CPL3(ss);
-#else /* !CONFIG_X86_32 */
-		/* Kernel saves and restores only the CS segment register on signals,
-		 * which is the bare minimum needed to allow mixed 32/64-bit code.
-		 * App's signal handler can save/restore other segments if needed. */
-		COPY_SEG_CPL3(cs);
-#endif /* CONFIG_X86_32 */
+
+#ifdef CONFIG_X86_64
+		/*
+		 * Fix up SS if needed for the benefit of old DOSEMU and
+		 * CRIU.
+		 */
+		if (unlikely(!(uc_flags & UC_STRICT_RESTORE_SS) &&
+			     user_64bit_mode(regs)))
+			force_valid_ss(regs);
+#endif
 
 		get_user_ex(tmpflags, &sc->flags);
 		regs->flags = (regs->flags & ~FIX_EFLAGS) | (tmpflags & FIX_EFLAGS);
@@ -194,6 +199,7 @@ int setup_sigcontext(struct sigcontext __user *sc, void __user *fpstate,
 		put_user_ex(regs->cs, &sc->cs);
 		put_user_ex(0, &sc->gs);
 		put_user_ex(0, &sc->fs);
+		put_user_ex(regs->ss, &sc->ss);
 #endif /* CONFIG_X86_32 */
 
 		put_user_ex(fpstate, &sc->fpstate);
@@ -432,6 +438,21 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 	return 0;
 }
 #else /* !CONFIG_X86_32 */
+static unsigned long frame_uc_flags(struct pt_regs *regs)
+{
+	unsigned long flags;
+
+	if (cpu_has_xsave)
+		flags = UC_FP_XSTATE | UC_SIGCONTEXT_SS;
+	else
+		flags = UC_SIGCONTEXT_SS;
+
+	if (likely(user_64bit_mode(regs)))
+		flags |= UC_STRICT_RESTORE_SS;
+
+	return flags;
+}
+
 static int __setup_rt_frame(int sig, struct ksignal *ksig,
 			    sigset_t *set, struct pt_regs *regs)
 {
@@ -451,10 +472,7 @@ static int __setup_rt_frame(int sig, struct ksignal *ksig,
 
 	put_user_try {
 		/* Create the ucontext.  */
-		if (cpu_has_xsave)
-			put_user_ex(UC_FP_XSTATE, &frame->uc.uc_flags);
-		else
-			put_user_ex(0, &frame->uc.uc_flags);
+		put_user_ex(frame_uc_flags(regs), &frame->uc.uc_flags);
 		put_user_ex(0, &frame->uc.uc_link);
 		save_altstack_ex(&frame->uc.uc_stack, regs->sp);
 
@@ -536,10 +554,7 @@ static int x32_setup_rt_frame(struct ksignal *ksig,
 
 	put_user_try {
 		/* Create the ucontext.  */
-		if (cpu_has_xsave)
-			put_user_ex(UC_FP_XSTATE, &frame->uc.uc_flags);
-		else
-			put_user_ex(0, &frame->uc.uc_flags);
+		put_user_ex(frame_uc_flags(regs), &frame->uc.uc_flags);
 		put_user_ex(0, &frame->uc.uc_link);
 		compat_save_altstack_ex(&frame->uc.uc_stack, regs->sp);
 		put_user_ex(0, &frame->uc.uc__pad0);
@@ -601,7 +616,11 @@ asmlinkage unsigned long sys_sigreturn(void)
 
 	set_current_blocked(&set);
 
-	if (restore_sigcontext(regs, &frame->sc))
+	/*
+	 * x86_32 has no uc_flags bits relevant to restore_sigcontext.
+	 * Save a few cycles by skipping the __get_user.
+	 */
+	if (restore_sigcontext(regs, &frame->sc, 0))
 		goto badframe;
 	return regs->ax;
 
@@ -617,16 +636,19 @@ asmlinkage long sys_rt_sigreturn(void)
 	struct pt_regs *regs = current_pt_regs();
 	struct rt_sigframe __user *frame;
 	sigset_t set;
+	unsigned long uc_flags;
 
 	frame = (struct rt_sigframe __user *)(regs->sp - sizeof(long));
 	if (!access_ok(VERIFY_READ, frame, sizeof(*frame)))
 		goto badframe;
 	if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))
 		goto badframe;
+	if (__get_user(uc_flags, &frame->uc.uc_flags))
+		goto badframe;
 
 	set_current_blocked(&set);
 
-	if (restore_sigcontext(regs, &frame->uc.uc_mcontext))
+	if (restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags))
 		goto badframe;
 
 	if (restore_altstack(&frame->uc.uc_stack))
@@ -813,6 +835,7 @@ asmlinkage long sys32_x32_rt_sigreturn(void)
 	struct pt_regs *regs = current_pt_regs();
 	struct rt_sigframe_x32 __user *frame;
 	sigset_t set;
+	unsigned long uc_flags;
 
 	frame = (struct rt_sigframe_x32 __user *)(regs->sp - 8);
 
@@ -820,10 +843,12 @@ asmlinkage long sys32_x32_rt_sigreturn(void)
 		goto badframe;
 	if (__copy_from_user(&set, &frame->uc.uc_sigmask, sizeof(set)))
 		goto badframe;
+	if (__get_user(uc_flags, &frame->uc.uc_flags))
+		goto badframe;
 
 	set_current_blocked(&set);
 
-	if (restore_sigcontext(regs, &frame->uc.uc_mcontext))
+	if (restore_sigcontext(regs, &frame->uc.uc_mcontext, uc_flags))
 		goto badframe;
 
 	if (compat_restore_altstack(&frame->uc.uc_stack))
diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index df4f767..d5ce7d7 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -5,10 +5,9 @@ include ../lib.mk
 .PHONY: all all_32 all_64 warn_32bit_failure clean
 
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall \
-			check_initial_reg_state
-TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault sigreturn test_syscall_vdso unwind_vdso \
+			check_initial_reg_state sigreturn ldt_gdt
+TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
-			ldt_gdt \
 			vdso_restorer
 
 TARGETS_C_32BIT_ALL := $(TARGETS_C_BOTHBITS) $(TARGETS_C_32BIT_ONLY)

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [tip:x86/asm] selftests/x86: Add tests for UC_SIGCONTEXT_SS and UC_STRICT_RESTORE_SS
  2016-02-16 23:09 ` [PATCH v5 4/4] selftests/x86: Add tests for UC_SIGCONTEXT_SS and UC_STRICT_RESTORE_SS Andy Lutomirski
@ 2016-02-17 12:11   ` tip-bot for Andy Lutomirski
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Andy Lutomirski @ 2016-02-17 12:11 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, gorcunov, hpa, luto, oleg, torvalds, peterz, shuahkh,
	linux-kernel, viro, bp, luto, xemul, brgerst, stsp, dvlasenk,
	mingo

Commit-ID:  4f6c893822932622fad2b46cc30467be9f20ee5d
Gitweb:     http://git.kernel.org/tip/4f6c893822932622fad2b46cc30467be9f20ee5d
Author:     Andy Lutomirski <luto@kernel.org>
AuthorDate: Tue, 16 Feb 2016 15:09:04 -0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 17 Feb 2016 08:32:12 +0100

selftests/x86: Add tests for UC_SIGCONTEXT_SS and UC_STRICT_RESTORE_SS

This tests the two ABI-preserving cases that DOSEMU cares about, and
it also explicitly tests the new UC_SIGCONTEXT_SS and
UC_STRICT_RESTORE_SS flags.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/f3d08f98541d0bd3030ceb35e05e21f59e30232c.1455664054.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/testing/selftests/x86/sigreturn.c | 230 ++++++++++++++++++++++++++++----
 1 file changed, 202 insertions(+), 28 deletions(-)

diff --git a/tools/testing/selftests/x86/sigreturn.c b/tools/testing/selftests/x86/sigreturn.c
index b5aa1ba..8a577e7 100644
--- a/tools/testing/selftests/x86/sigreturn.c
+++ b/tools/testing/selftests/x86/sigreturn.c
@@ -54,6 +54,37 @@
 #include <sys/ptrace.h>
 #include <sys/user.h>
 
+/* Pull in AR_xyz defines. */
+typedef unsigned int u32;
+typedef unsigned short u16;
+#include "../../../../arch/x86/include/asm/desc_defs.h"
+
+/*
+ * Copied from asm/ucontext.h, as asm/ucontext.h conflicts badly with the glibc
+ * headers.
+ */
+#ifdef __x86_64__
+/*
+ * UC_SIGCONTEXT_SS will be set when delivering 64-bit or x32 signals on
+ * kernels that save SS in the sigcontext.  All kernels that set
+ * UC_SIGCONTEXT_SS will correctly restore at least the low 32 bits of esp
+ * regardless of SS (i.e. they implement espfix).
+ *
+ * Kernels that set UC_SIGCONTEXT_SS will also set UC_STRICT_RESTORE_SS
+ * when delivering a signal that came from 64-bit code.
+ *
+ * Sigreturn restores SS as follows:
+ *
+ * if (saved SS is valid || UC_STRICT_RESTORE_SS is set ||
+ *     saved CS is not 64-bit)
+ *         new SS = saved SS  (will fail IRET and signal if invalid)
+ * else
+ *         new SS = a flat 32-bit data segment
+ */
+#define UC_SIGCONTEXT_SS       0x2
+#define UC_STRICT_RESTORE_SS   0x4
+#endif
+
 /*
  * In principle, this test can run on Linux emulation layers (e.g.
  * Illumos "LX branded zones").  Solaris-based kernels reserve LDT
@@ -267,6 +298,9 @@ static gregset_t initial_regs, requested_regs, resulting_regs;
 /* Instructions for the SIGUSR1 handler. */
 static volatile unsigned short sig_cs, sig_ss;
 static volatile sig_atomic_t sig_trapped, sig_err, sig_trapno;
+#ifdef __x86_64__
+static volatile sig_atomic_t sig_corrupt_final_ss;
+#endif
 
 /* Abstractions for some 32-bit vs 64-bit differences. */
 #ifdef __x86_64__
@@ -305,9 +339,105 @@ static greg_t *csptr(ucontext_t *ctx)
 }
 #endif
 
+/*
+ * Checks a given selector for its code bitness or returns -1 if it's not
+ * a usable code segment selector.
+ */
+int cs_bitness(unsigned short cs)
+{
+	uint32_t valid = 0, ar;
+	asm ("lar %[cs], %[ar]\n\t"
+	     "jnz 1f\n\t"
+	     "mov $1, %[valid]\n\t"
+	     "1:"
+	     : [ar] "=r" (ar), [valid] "+rm" (valid)
+	     : [cs] "r" (cs));
+
+	if (!valid)
+		return -1;
+
+	bool db = (ar & (1 << 22));
+	bool l = (ar & (1 << 21));
+
+	if (!(ar & (1<<11)))
+	    return -1;	/* Not code. */
+
+	if (l && !db)
+		return 64;
+	else if (!l && db)
+		return 32;
+	else if (!l && !db)
+		return 16;
+	else
+		return -1;	/* Unknown bitness. */
+}
+
+/*
+ * Checks a given selector for its code bitness or returns -1 if it's not
+ * a usable code segment selector.
+ */
+bool is_valid_ss(unsigned short cs)
+{
+	uint32_t valid = 0, ar;
+	asm ("lar %[cs], %[ar]\n\t"
+	     "jnz 1f\n\t"
+	     "mov $1, %[valid]\n\t"
+	     "1:"
+	     : [ar] "=r" (ar), [valid] "+rm" (valid)
+	     : [cs] "r" (cs));
+
+	if (!valid)
+		return false;
+
+	if ((ar & AR_TYPE_MASK) != AR_TYPE_RWDATA &&
+	    (ar & AR_TYPE_MASK) != AR_TYPE_RWDATA_EXPDOWN)
+		return false;
+
+	return (ar & AR_P);
+}
+
 /* Number of errors in the current test case. */
 static volatile sig_atomic_t nerrs;
 
+static void validate_signal_ss(int sig, ucontext_t *ctx)
+{
+#ifdef __x86_64__
+	bool was_64bit = (cs_bitness(*csptr(ctx)) == 64);
+
+	if (!(ctx->uc_flags & UC_SIGCONTEXT_SS)) {
+		printf("[FAIL]\tUC_SIGCONTEXT_SS was not set\n");
+		nerrs++;
+
+		/*
+		 * This happens on Linux 4.1.  The rest will fail, too, so
+		 * return now to reduce the noise.
+		 */
+		return;
+	}
+
+	/* UC_STRICT_RESTORE_SS is set iff we came from 64-bit mode. */
+	if (!!(ctx->uc_flags & UC_STRICT_RESTORE_SS) != was_64bit) {
+		printf("[FAIL]\tUC_STRICT_RESTORE_SS was wrong in signal %d\n",
+		       sig);
+		nerrs++;
+	}
+
+	if (is_valid_ss(*ssptr(ctx))) {
+		/*
+		 * DOSEMU was written before 64-bit sigcontext had SS, and
+		 * it tries to figure out the signal source SS by looking at
+		 * the physical register.  Make sure that keeps working.
+		 */
+		unsigned short hw_ss;
+		asm ("mov %%ss, %0" : "=rm" (hw_ss));
+		if (hw_ss != *ssptr(ctx)) {
+			printf("[FAIL]\tHW SS didn't match saved SS\n");
+			nerrs++;
+		}
+	}
+#endif
+}
+
 /*
  * SIGUSR1 handler.  Sets CS and SS as requested and points IP to the
  * int3 trampoline.  Sets SP to a large known value so that we can see
@@ -317,6 +447,8 @@ static void sigusr1(int sig, siginfo_t *info, void *ctx_void)
 {
 	ucontext_t *ctx = (ucontext_t*)ctx_void;
 
+	validate_signal_ss(sig, ctx);
+
 	memcpy(&initial_regs, &ctx->uc_mcontext.gregs, sizeof(gregset_t));
 
 	*csptr(ctx) = sig_cs;
@@ -334,13 +466,16 @@ static void sigusr1(int sig, siginfo_t *info, void *ctx_void)
 }
 
 /*
- * Called after a successful sigreturn.  Restores our state so that
- * the original raise(SIGUSR1) returns.
+ * Called after a successful sigreturn (via int3) or from a failed
+ * sigreturn (directly by kernel).  Restores our state so that the
+ * original raise(SIGUSR1) returns.
  */
 static void sigtrap(int sig, siginfo_t *info, void *ctx_void)
 {
 	ucontext_t *ctx = (ucontext_t*)ctx_void;
 
+	validate_signal_ss(sig, ctx);
+
 	sig_err = ctx->uc_mcontext.gregs[REG_ERR];
 	sig_trapno = ctx->uc_mcontext.gregs[REG_TRAPNO];
 
@@ -358,41 +493,62 @@ static void sigtrap(int sig, siginfo_t *info, void *ctx_void)
 	memcpy(&resulting_regs, &ctx->uc_mcontext.gregs, sizeof(gregset_t));
 	memcpy(&ctx->uc_mcontext.gregs, &initial_regs, sizeof(gregset_t));
 
+#ifdef __x86_64__
+	if (sig_corrupt_final_ss) {
+		if (ctx->uc_flags & UC_STRICT_RESTORE_SS) {
+			printf("[FAIL]\tUC_STRICT_RESTORE_SS was set inappropriately\n");
+			nerrs++;
+		} else {
+			/*
+			 * DOSEMU transitions from 32-bit to 64-bit mode by
+			 * adjusting sigcontext, and it requires that this work
+			 * even if the saved SS is bogus.
+			 */
+			printf("\tCorrupting SS on return to 64-bit mode\n");
+			*ssptr(ctx) = 0;
+		}
+	}
+#endif
+
 	sig_trapped = sig;
 }
 
-/*
- * Checks a given selector for its code bitness or returns -1 if it's not
- * a usable code segment selector.
- */
-int cs_bitness(unsigned short cs)
+#ifdef __x86_64__
+/* Tests recovery if !UC_STRICT_RESTORE_SS */
+static void sigusr2(int sig, siginfo_t *info, void *ctx_void)
 {
-	uint32_t valid = 0, ar;
-	asm ("lar %[cs], %[ar]\n\t"
-	     "jnz 1f\n\t"
-	     "mov $1, %[valid]\n\t"
-	     "1:"
-	     : [ar] "=r" (ar), [valid] "+rm" (valid)
-	     : [cs] "r" (cs));
+	ucontext_t *ctx = (ucontext_t*)ctx_void;
 
-	if (!valid)
-		return -1;
+	if (!(ctx->uc_flags & UC_STRICT_RESTORE_SS)) {
+		printf("[FAIL]\traise(2) didn't set UC_STRICT_RESTORE_SS\n");
+		nerrs++;
+		return;  /* We can't do the rest. */
+	}
 
-	bool db = (ar & (1 << 22));
-	bool l = (ar & (1 << 21));
+	ctx->uc_flags &= ~UC_STRICT_RESTORE_SS;
+	*ssptr(ctx) = 0;
 
-	if (!(ar & (1<<11)))
-	    return -1;	/* Not code. */
+	/* Return.  The kernel should recover without sending another signal. */
+}
 
-	if (l && !db)
-		return 64;
-	else if (!l && db)
-		return 32;
-	else if (!l && !db)
-		return 16;
-	else
-		return -1;	/* Unknown bitness. */
+static int test_nonstrict_ss(void)
+{
+	clearhandler(SIGUSR1);
+	clearhandler(SIGTRAP);
+	clearhandler(SIGSEGV);
+	clearhandler(SIGILL);
+	sethandler(SIGUSR2, sigusr2, 0);
+
+	nerrs = 0;
+
+	printf("[RUN]\tClear UC_STRICT_RESTORE_SS and corrupt SS\n");
+	raise(SIGUSR2);
+	if (!nerrs)
+		printf("[OK]\tIt worked\n");
+
+	return nerrs;
 }
+#endif
 
 /* Finds a usable code segment of the requested bitness. */
 int find_cs(int bitness)
@@ -576,6 +732,12 @@ static int test_bad_iret(int cs_bits, unsigned short ss, int force_cs)
 		       errdesc, strsignal(sig_trapped));
 		return 0;
 	} else {
+		/*
+		 * This also implicitly tests UC_STRICT_RESTORE_SS:
+		 * We check that these signals set UC_STRICT_RESTORE_SS and,
+		 * if UC_STRICT_RESTORE_SS doesn't cause strict behavior,
+		 * then we won't get SIGSEGV.
+		 */
 		printf("[FAIL]\tDid not get SIGSEGV\n");
 		return 1;
 	}
@@ -632,6 +794,14 @@ int main()
 						    GDT3(gdt_data16_idx));
 	}
 
+#ifdef __x86_64__
+	/* Nasty ABI case: check SS corruption handling. */
+	sig_corrupt_final_ss = 1;
+	total_nerrs += test_valid_sigreturn(32, false, -1);
+	total_nerrs += test_valid_sigreturn(32, true, -1);
+	sig_corrupt_final_ss = 0;
+#endif
+
 	/*
 	 * We're done testing valid sigreturn cases.  Now we test states
 	 * for which sigreturn itself will succeed but the subsequent
@@ -680,5 +850,9 @@ int main()
 	if (gdt_npdata32_idx)
 		test_bad_iret(32, GDT3(gdt_npdata32_idx), -1);
 
+#ifdef __x86_64__
+	total_nerrs += test_nonstrict_ss();
+#endif
+
 	return total_nerrs ? 1 : 0;
 }

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-02-17 12:13 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-16 23:09 [PATCH v5 0/4] x86: sigcontext fixes, again Andy Lutomirski
2016-02-16 23:09 ` [PATCH v5 1/4] x86/signal/64: Add a comment about sigcontext->fs and gs Andy Lutomirski
2016-02-17  7:21   ` Ingo Molnar
2016-02-17 10:03     ` Stas Sergeev
2016-02-17 11:05       ` Ingo Molnar
2016-02-17 12:10   ` [tip:x86/asm] x86/signal/64: Add a comment about sigcontext-> fs " tip-bot for Andy Lutomirski
2016-02-16 23:09 ` [PATCH v5 2/4] x86/signal/64: Fix SS if needed when delivering a 64-bit signal Andy Lutomirski
2016-02-17 12:10   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2016-02-16 23:09 ` [PATCH v5 3/4] x86/signal/64: Re-add support for SS in the 64-bit signal context Andy Lutomirski
2016-02-17 12:11   ` [tip:x86/asm] " tip-bot for Andy Lutomirski
2016-02-16 23:09 ` [PATCH v5 4/4] selftests/x86: Add tests for UC_SIGCONTEXT_SS and UC_STRICT_RESTORE_SS Andy Lutomirski
2016-02-17 12:11   ` [tip:x86/asm] " tip-bot for Andy Lutomirski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).