linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/11] x86: Supervisor Mode Access Prevention
@ 2012-09-21 19:43 H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 01/11] x86, cpufeature: Add feature bit for SMAP H. Peter Anvin
                   ` (13 more replies)
  0 siblings, 14 replies; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

Supervisor Mode Access Prevention (SMAP) is a new security feature
disclosed by Intel in revision 014 of the Intel® Architecture
Instruction Set Extensions Programming Reference:

http://software.intel.com/sites/default/files/319433-014.pdf

When SMAP is active, the kernel cannot normally access user-space
pages (U=1).  Since the kernel does need to access user-space pages
under specific circumstances, an override is provided: the kernel can
access user-space pages when EFLAGS.AC=1.  For system data structures,
e.g. descriptor tables, that are accessed by the processor directly,
SMAP is enforced even in CPL 3 regardless of EFLAGS.AC.

SMAP also adds two new instructions, STAC and CLAC, to set and clear
the AC flag quickly.
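
As an illustration of what this means for kernel code (a sketch only,
not part of this series; smap_demo() is a hypothetical function):
correct code keeps using the normal uaccess primitives, which contain
the STAC/CLAC pairs after this series, while a direct dereference of a
user pointer becomes a hard failure:

static int smap_demo(int __user *uptr)
{
	int val;

	if (get_user(val, uptr))	/* fine: STAC/CLAC live inside get_user() */
		return -EFAULT;

	val = *(int *)uptr;		/* bug: U=1 page touched with AC=0 -> fault */

	return val;
}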

Note: patch 01/11 is already in tip:x86/cpufeature.

List of patches:
      x86, cpufeature: Add feature bit for SMAP
      x86-32, mm: The WP test should be done on a kernel page
      x86, smap: Add CR4 bit for SMAP
      x86, alternative: Use .pushsection/.popsection
      x86, alternative: Add header guards to <asm/alternative-asm.h>
      x86, smap: Add a header file with macros for STAC/CLAC
      x86, uaccess: Merge prototypes for clear_user/__clear_user
      x86, smap: Add STAC and CLAC instructions to control user space access
      x86, smap: Turn on Supervisor Mode Access Prevention
      x86, smap: A page fault due to SMAP is an oops
      x86, smap: Reduce the SMAP overhead for signal handling

Diff stat:

 Documentation/kernel-parameters.txt    |    6 ++-
 arch/x86/Kconfig                       |   11 ++++
 arch/x86/ia32/ia32_signal.c            |   12 +++--
 arch/x86/ia32/ia32entry.S              |    6 ++
 arch/x86/include/asm/alternative-asm.h |    9 +++-
 arch/x86/include/asm/alternative.h     |   32 ++++++------
 arch/x86/include/asm/cpufeature.h      |    1 +
 arch/x86/include/asm/fpu-internal.h    |   10 ++--
 arch/x86/include/asm/futex.h           |   19 +++++--
 arch/x86/include/asm/processor-flags.h |    1 +
 arch/x86/include/asm/smap.h            |   91 ++++++++++++++++++++++++++++++++
 arch/x86/include/asm/uaccess.h         |   28 ++++++----
 arch/x86/include/asm/uaccess_32.h      |    3 -
 arch/x86/include/asm/uaccess_64.h      |    3 -
 arch/x86/include/asm/xsave.h           |   10 ++--
 arch/x86/kernel/cpu/common.c           |   29 ++++++++++-
 arch/x86/kernel/entry_64.S             |   11 ++++-
 arch/x86/kernel/signal.c               |   24 +++++----
 arch/x86/lib/copy_user_64.S            |    7 +++
 arch/x86/lib/copy_user_nocache_64.S    |    3 +
 arch/x86/lib/getuser.S                 |   10 ++++
 arch/x86/lib/putuser.S                 |    8 +++-
 arch/x86/lib/usercopy_32.c             |   13 ++++-
 arch/x86/lib/usercopy_64.c             |    3 +
 arch/x86/mm/fault.c                    |   18 ++++++
 arch/x86/mm/init_32.c                  |    2 +-
 26 files changed, 301 insertions(+), 69 deletions(-)

* [PATCH 01/11] x86, cpufeature: Add feature bit for SMAP
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 02/11] x86-32, mm: The WP test should be done on a kernel page H. Peter Anvin
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming

From: "H. Peter Anvin" <hpa@zytor.com>

Add CPUID feature bit for Supervisor Mode Access Prevention (SMAP).
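
For reference (not part of the patch): word 9 of the cpufeature table
corresponds to CPUID.(EAX=7,ECX=0):EBX, so this names bit 20 of that
leaf; later patches in the series test it like any other feature flag,
e.g.:

	if (cpu_has(c, X86_FEATURE_SMAP))	/* CPUID.(EAX=7,ECX=0):EBX bit 20 */
		pr_info("CPU supports SMAP\n");	/* sketch only */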

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Link: http://lkml.kernel.org/n/tip-ethzcr5nipikl6hd5q8ssepq@git.kernel.org
---
 arch/x86/include/asm/cpufeature.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 6b7ee5f..633b617 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -209,6 +209,7 @@
 #define X86_FEATURE_RTM		(9*32+11) /* Restricted Transactional Memory */
 #define X86_FEATURE_RDSEED	(9*32+18) /* The RDSEED instruction */
 #define X86_FEATURE_ADX		(9*32+19) /* The ADCX and ADOX instructions */
+#define X86_FEATURE_SMAP	(9*32+20) /* Supervisor Mode Access Prevention */
 
 #if defined(__KERNEL__) && !defined(__ASSEMBLY__)
 
-- 
1.7.6.5


* [PATCH 02/11] x86-32, mm: The WP test should be done on a kernel page
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 01/11] x86, cpufeature: Add feature bit for SMAP H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 19:58   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 03/11] x86, smap: Add CR4 bit for SMAP H. Peter Anvin
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

PAGE_READONLY includes user permission, but this is a page used
exclusively by the kernel; use PAGE_KERNEL_RO instead.
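
For background (simplified sketch, not from the patch): the bit that
matters is _PAGE_USER.

	/*
	 * PAGE_READONLY   -> read-only mapping with _PAGE_USER set   (U=1)
	 * PAGE_KERNEL_RO  -> read-only mapping with _PAGE_USER clear (U=0)
	 *
	 * Once SMAP is enforcing and EFLAGS.AC=0, only the U=0 mapping may
	 * be touched by do_test_wp_bit(), so the fixmap must use it:
	 */
	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_KERNEL_RO);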

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/mm/init_32.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 575d86f..e537b35 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -712,7 +712,7 @@ static void __init test_wp_bit(void)
   "Checking if this processor honours the WP bit even in supervisor mode...");
 
 	/* Any page-aligned address will do, the test is non-destructive */
-	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_READONLY);
+	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_KERNEL_RO);
 	boot_cpu_data.wp_works_ok = do_test_wp_bit();
 	clear_fixmap(FIX_WP_TEST);
 
-- 
1.7.6.5


* [PATCH 03/11] x86, smap: Add CR4 bit for SMAP
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 01/11] x86, cpufeature: Add feature bit for SMAP H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 02/11] x86-32, mm: The WP test should be done on a kernel page H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 19:59   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 04/11] x86, alternative: Use .pushsection/.popsection H. Peter Anvin
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

Add X86_CR4_SMAP to <asm/processor-flags.h>.
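
For context, a sketch of how this bit is consumed later in the series
(patch 09):

	/* CR4.SMAP is bit 21 (0x00200000); set it once the feature is detected. */
	if (cpu_has(c, X86_FEATURE_SMAP))
		set_in_cr4(X86_CR4_SMAP);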

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/include/asm/processor-flags.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h
index aea1d1d..680cf09 100644
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -65,6 +65,7 @@
 #define X86_CR4_PCIDE	0x00020000 /* enable PCID support */
 #define X86_CR4_OSXSAVE 0x00040000 /* enable xsave and xrestore */
 #define X86_CR4_SMEP	0x00100000 /* enable SMEP support */
+#define X86_CR4_SMAP	0x00200000 /* enable SMAP support */
 
 /*
  * x86-64 Task Priority Register, CR8
-- 
1.7.6.5


* [PATCH 04/11] x86, alternative: Use .pushsection/.popsection
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (2 preceding siblings ...)
  2012-09-21 19:43 ` [PATCH 03/11] x86, smap: Add CR4 bit for SMAP H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 20:00   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 05/11] x86, alternative: Add header guards to <asm/alternative-asm.h> H. Peter Anvin
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

.section/.previous doesn't nest.  Use .pushsection/.popsection in
<asm/alternative.h> so that they can be properly nested.
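
A standalone sketch of the difference (illustration only;
pushsection_nesting_demo() is hypothetical): .previous merely swaps
back to the previously active section, so a macro emitting
.section/.previous misbehaves when expanded inside another such macro,
whereas .pushsection/.popsection maintain a stack and nest cleanly:

static inline void pushsection_nesting_demo(void)
{
	asm volatile(".pushsection .rodata\n\t"	/* outer section switch         */
		     ".pushsection .data\n\t"	/* inner switch, safely nested  */
		     ".popsection\n\t"		/* back to .rodata              */
		     ".popsection"		/* back to the original section */
		     ::: "memory");
}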

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/include/asm/alternative-asm.h |    4 ++--
 arch/x86/include/asm/alternative.h     |   32 ++++++++++++++++----------------
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h
index 952bd01..018d29f 100644
--- a/arch/x86/include/asm/alternative-asm.h
+++ b/arch/x86/include/asm/alternative-asm.h
@@ -5,10 +5,10 @@
 #ifdef CONFIG_SMP
 	.macro LOCK_PREFIX
 672:	lock
-	.section .smp_locks,"a"
+	.pushsection .smp_locks,"a"
 	.balign 4
 	.long 672b - .
-	.previous
+	.popsection
 	.endm
 #else
 	.macro LOCK_PREFIX
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 7078068..87bc00d 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -29,10 +29,10 @@
 
 #ifdef CONFIG_SMP
 #define LOCK_PREFIX_HERE \
-		".section .smp_locks,\"a\"\n"	\
-		".balign 4\n"			\
-		".long 671f - .\n" /* offset */	\
-		".previous\n"			\
+		".pushsection .smp_locks,\"a\"\n"	\
+		".balign 4\n"				\
+		".long 671f - .\n" /* offset */		\
+		".popsection\n"				\
 		"671:"
 
 #define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock; "
@@ -99,30 +99,30 @@ static inline int alternatives_text_reserved(void *start, void *end)
 /* alternative assembly primitive: */
 #define ALTERNATIVE(oldinstr, newinstr, feature)			\
 	OLDINSTR(oldinstr)						\
-	".section .altinstructions,\"a\"\n"				\
+	".pushsection .altinstructions,\"a\"\n"				\
 	ALTINSTR_ENTRY(feature, 1)					\
-	".previous\n"							\
-	".section .discard,\"aw\",@progbits\n"				\
+	".popsection\n"							\
+	".pushsection .discard,\"aw\",@progbits\n"			\
 	DISCARD_ENTRY(1)						\
-	".previous\n"							\
-	".section .altinstr_replacement, \"ax\"\n"			\
+	".popsection\n"							\
+	".pushsection .altinstr_replacement, \"ax\"\n"			\
 	ALTINSTR_REPLACEMENT(newinstr, feature, 1)			\
-	".previous"
+	".popsection"
 
 #define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2)\
 	OLDINSTR(oldinstr)						\
-	".section .altinstructions,\"a\"\n"				\
+	".pushsection .altinstructions,\"a\"\n"				\
 	ALTINSTR_ENTRY(feature1, 1)					\
 	ALTINSTR_ENTRY(feature2, 2)					\
-	".previous\n"							\
-	".section .discard,\"aw\",@progbits\n"				\
+	".popsection\n"							\
+	".pushsection .discard,\"aw\",@progbits\n"			\
 	DISCARD_ENTRY(1)						\
 	DISCARD_ENTRY(2)						\
-	".previous\n"							\
-	".section .altinstr_replacement, \"ax\"\n"			\
+	".popsection\n"							\
+	".pushsection .altinstr_replacement, \"ax\"\n"			\
 	ALTINSTR_REPLACEMENT(newinstr1, feature1, 1)			\
 	ALTINSTR_REPLACEMENT(newinstr2, feature2, 2)			\
-	".previous"
+	".popsection"
 
 /*
  * This must be included *after* the definition of ALTERNATIVE due to
-- 
1.7.6.5


* [PATCH 05/11] x86, alternative: Add header guards to <asm/alternative-asm.h>
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (3 preceding siblings ...)
  2012-09-21 19:43 ` [PATCH 04/11] x86, alternative: Use .pushsection/.popsection H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 20:01   ` [tip:x86/smap] x86, alternative: Add header guards to <asm/alternative-asm.h> tip-bot for H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 06/11] x86, smap: Add a header file with macros for STAC/CLAC H. Peter Anvin
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

Add header guards to protect <asm/alternative-asm.h> against multiple
inclusion.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/include/asm/alternative-asm.h |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h
index 018d29f..372231c 100644
--- a/arch/x86/include/asm/alternative-asm.h
+++ b/arch/x86/include/asm/alternative-asm.h
@@ -1,3 +1,6 @@
+#ifndef _ASM_X86_ALTERNATIVE_ASM_H
+#define _ASM_X86_ALTERNATIVE_ASM_H
+
 #ifdef __ASSEMBLY__
 
 #include <asm/asm.h>
@@ -24,3 +27,5 @@
 .endm
 
 #endif  /*  __ASSEMBLY__  */
+
+#endif /* _ASM_X86_ALTERNATIVE_ASM_H */
-- 
1.7.6.5


* [PATCH 06/11] x86, smap: Add a header file with macros for STAC/CLAC
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (4 preceding siblings ...)
  2012-09-21 19:43 ` [PATCH 05/11] x86, alternative: Add header guards to <asm/alternative-asm.h> H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 20:02   ` [tip:x86/smap] x86, smap: Add a header file with macros for STAC/CLAC tip-bot for H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 07/11] x86, uaccess: Merge prototypes for clear_user/__clear_user H. Peter Anvin
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

The STAC/CLAC instructions are only available with SMAP, but they are
also not needed when SMAP is unavailable, or before we start running
user space.  Therefore, construct them as alternatives which start out
as NOPs and are patched in by the alternatives mechanism when
X86_FEATURE_SMAP is set.
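
From the caller's point of view the result looks like this (sketch
only; the access in between is a placeholder):

	stac();	/* 3-byte NOP, patched to the real STAC (0f 01 cb) on SMAP hardware */
	/* ... raw access to user-space memory ... */
	clac();	/* 3-byte NOP, patched to the real CLAC (0f 01 ca) on SMAP hardware */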

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/Kconfig            |   11 +++++
 arch/x86/include/asm/smap.h |   91 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 102 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/smap.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8ec3a1a..5ce8694 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1487,6 +1487,17 @@ config ARCH_RANDOM
 	  If supported, this is a high bandwidth, cryptographically
 	  secure hardware random number generator.
 
+config X86_SMAP
+	def_bool y
+	prompt "Supervisor Mode Access Prevention" if EXPERT
+	---help---
+	  Supervisor Mode Access Prevention (SMAP) is a security
+	  feature in newer Intel processors.  There is a small
+	  performance cost if this is enabled and turned on; there is
+	  also a small increase in the kernel size if this is enabled.
+
+	  If unsure, say Y.
+
 config EFI
 	bool "EFI runtime service support"
 	depends on ACPI
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
new file mode 100644
index 0000000..3989c24
--- /dev/null
+++ b/arch/x86/include/asm/smap.h
@@ -0,0 +1,91 @@
+/*
+ * Supervisor Mode Access Prevention support
+ *
+ * Copyright (C) 2012 Intel Corporation
+ * Author: H. Peter Anvin <hpa@linux.intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#ifndef _ASM_X86_SMAP_H
+#define _ASM_X86_SMAP_H
+
+#include <linux/stringify.h>
+#include <asm/nops.h>
+#include <asm/cpufeature.h>
+
+/* "Raw" instruction opcodes */
+#define __ASM_CLAC	.byte 0x0f,0x01,0xca
+#define __ASM_STAC	.byte 0x0f,0x01,0xcb
+
+#ifdef __ASSEMBLY__
+
+#include <asm/alternative-asm.h>
+
+#ifdef CONFIG_X86_SMAP
+
+#define ASM_CLAC							\
+	661: ASM_NOP3 ;							\
+	.pushsection .altinstr_replacement, "ax" ;			\
+	662: __ASM_CLAC ;						\
+	.popsection ;							\
+	.pushsection .altinstructions, "a" ;				\
+	altinstruction_entry 661b, 662b, X86_FEATURE_SMAP, 3, 3 ;	\
+	.popsection
+
+#define ASM_STAC							\
+	661: ASM_NOP3 ;							\
+	.pushsection .altinstr_replacement, "ax" ;			\
+	662: __ASM_STAC ;						\
+	.popsection ;							\
+	.pushsection .altinstructions, "a" ;				\
+	altinstruction_entry 661b, 662b, X86_FEATURE_SMAP, 3, 3 ;	\
+	.popsection
+
+#else /* CONFIG_X86_SMAP */
+
+#define ASM_CLAC
+#define ASM_STAC
+
+#endif /* CONFIG_X86_SMAP */
+
+#else /* __ASSEMBLY__ */
+
+#include <asm/alternative.h>
+
+#ifdef CONFIG_X86_SMAP
+
+static inline void clac(void)
+{
+	/* Note: a barrier is implicit in alternative() */
+	alternative(ASM_NOP3, __stringify(__ASM_CLAC), X86_FEATURE_SMAP);
+}
+
+static inline void stac(void)
+{
+	/* Note: a barrier is implicit in alternative() */
+	alternative(ASM_NOP3, __stringify(__ASM_STAC), X86_FEATURE_SMAP);
+}
+
+/* These macros can be used in asm() statements */
+#define ASM_CLAC \
+	ALTERNATIVE(ASM_NOP3, __stringify(__ASM_CLAC), X86_FEATURE_SMAP)
+#define ASM_STAC \
+	ALTERNATIVE(ASM_NOP3, __stringify(__ASM_STAC), X86_FEATURE_SMAP)
+
+#else /* CONFIG_X86_SMAP */
+
+static inline void clac(void) { }
+static inline void stac(void) { }
+
+#define ASM_CLAC
+#define ASM_STAC
+
+#endif /* CONFIG_X86_SMAP */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_X86_SMAP_H */
-- 
1.7.6.5


* [PATCH 07/11] x86, uaccess: Merge prototypes for clear_user/__clear_user
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (5 preceding siblings ...)
  2012-09-21 19:43 ` [PATCH 06/11] x86, smap: Add a header file with macros for STAC/CLAC H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 20:03   ` [tip:x86/smap] x86, uaccess: Merge prototypes for clear_user/__clear_user tip-bot for H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 08/11] x86, smap: Add STAC and CLAC instructions to control user space access H. Peter Anvin
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

The prototypes for clear_user() and __clear_user() are identical in
the 32- and 64-bit headers, so merge them into the common
<asm/uaccess.h>.  No functionality change.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/include/asm/uaccess.h    |    3 +++
 arch/x86/include/asm/uaccess_32.h |    3 ---
 arch/x86/include/asm/uaccess_64.h |    3 ---
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index e1f3a17..2c7df3d 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -569,6 +569,9 @@ strncpy_from_user(char *dst, const char __user *src, long count);
 extern __must_check long strlen_user(const char __user *str);
 extern __must_check long strnlen_user(const char __user *str, long n);
 
+unsigned long __must_check clear_user(void __user *mem, unsigned long len);
+unsigned long __must_check __clear_user(void __user *mem, unsigned long len);
+
 /*
  * movsl can be slow when source and dest are not both 8-byte aligned
  */
diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h
index 576e39b..7f760a9 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -213,7 +213,4 @@ static inline unsigned long __must_check copy_from_user(void *to,
 	return n;
 }
 
-unsigned long __must_check clear_user(void __user *mem, unsigned long len);
-unsigned long __must_check __clear_user(void __user *mem, unsigned long len);
-
 #endif /* _ASM_X86_UACCESS_32_H */
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index d8def8b..142810c 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -217,9 +217,6 @@ int __copy_in_user(void __user *dst, const void __user *src, unsigned size)
 	}
 }
 
-__must_check unsigned long clear_user(void __user *mem, unsigned long len);
-__must_check unsigned long __clear_user(void __user *mem, unsigned long len);
-
 static __must_check __always_inline int
 __copy_from_user_inatomic(void *dst, const void __user *src, unsigned size)
 {
-- 
1.7.6.5


* [PATCH 08/11] x86, smap: Add STAC and CLAC instructions to control user space access
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (6 preceding siblings ...)
  2012-09-21 19:43 ` [PATCH 07/11] x86, uaccess: Merge prototypes for clear_user/__clear_user H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 20:04   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
  2012-09-22  0:16   ` [tip:x86/smap] x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry tip-bot for H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 09/11] x86, smap: Turn on Supervisor Mode Access Prevention H. Peter Anvin
                   ` (5 subsequent siblings)
  13 siblings, 2 replies; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

When Supervisor Mode Access Prevention (SMAP) is enabled, access to
userspace from the kernel is controlled by the AC flag.  To make the
performance of manipulating that flag acceptable, there are two new
instructions, STAC and CLAC, to set and clear it.

This patch adds those instructions, via alternative(), to the
user-space access paths when the SMAP feature is enabled.  It also adds
X86_EFLAGS_AC unconditionally to the SYSCALL entry mask; there is
simply no reason to make that one conditional.
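
The general shape of the change to the inline-asm accessors (a
simplified sketch, not the exact macro from the patch;
demo_get_user() is hypothetical): open the AC window before the user
access, close it on the normal exit, and make the exception fixup jump
back past the CLAC so a fault also leaves AC clear.

static int demo_get_user(int *val, const int __user *uaddr)
{
	int err = 0;

	asm volatile(ASM_STAC "\n"		/* open the AC window          */
		     "1:	movl %2,%1\n"	/* the actual user-space load  */
		     "2: " ASM_CLAC "\n"	/* close it on the normal path */
		     ".section .fixup,\"ax\"\n"
		     "3:	movl %3,%0\n"	/* fault: return -EFAULT...    */
		     "	jmp 2b\n"		/* ...via the CLAC at label 2  */
		     ".previous\n"
		     _ASM_EXTABLE(1b, 3b)
		     : "=r" (err), "=r" (*val)
		     : "m" (*uaddr), "i" (-EFAULT), "0" (err));

	return err;
}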

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/ia32/ia32entry.S           |    6 ++++++
 arch/x86/include/asm/fpu-internal.h |   10 ++++++----
 arch/x86/include/asm/futex.h        |   19 +++++++++++++------
 arch/x86/include/asm/smap.h         |    4 ++--
 arch/x86/include/asm/uaccess.h      |   31 +++++++++++++++++++------------
 arch/x86/include/asm/xsave.h        |   10 ++++++----
 arch/x86/kernel/cpu/common.c        |    3 ++-
 arch/x86/kernel/entry_64.S          |   11 ++++++++++-
 arch/x86/lib/copy_user_64.S         |    7 +++++++
 arch/x86/lib/copy_user_nocache_64.S |    3 +++
 arch/x86/lib/getuser.S              |   10 ++++++++++
 arch/x86/lib/putuser.S              |    8 +++++++-
 arch/x86/lib/usercopy_32.c          |   13 ++++++++++++-
 arch/x86/lib/usercopy_64.c          |    3 +++
 14 files changed, 106 insertions(+), 32 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 20e5f7b..9c28950 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -14,6 +14,7 @@
 #include <asm/segment.h>
 #include <asm/irqflags.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 #include <linux/linkage.h>
 #include <linux/err.h>
 
@@ -146,8 +147,10 @@ ENTRY(ia32_sysenter_target)
 	SAVE_ARGS 0,1,0
  	/* no need to do an access_ok check here because rbp has been
  	   32bit zero extended */ 
+	ASM_STAC
 1:	movl	(%rbp),%ebp
 	_ASM_EXTABLE(1b,ia32_badarg)
+	ASM_CLAC
 	orl     $TS_COMPAT,TI_status+THREAD_INFO(%rsp,RIP-ARGOFFSET)
 	testl   $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
 	CFI_REMEMBER_STATE
@@ -301,8 +304,10 @@ ENTRY(ia32_cstar_target)
 	/* no need to do an access_ok check here because r8 has been
 	   32bit zero extended */ 
 	/* hardware stack frame is complete now */	
+	ASM_STAC
 1:	movl	(%r8),%r9d
 	_ASM_EXTABLE(1b,ia32_badarg)
+	ASM_CLAC
 	orl     $TS_COMPAT,TI_status+THREAD_INFO(%rsp,RIP-ARGOFFSET)
 	testl   $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
 	CFI_REMEMBER_STATE
@@ -365,6 +370,7 @@ cstar_tracesys:
 END(ia32_cstar_target)
 				
 ia32_badarg:
+	ASM_CLAC
 	movq $-EFAULT,%rax
 	jmp ia32_sysret
 	CFI_ENDPROC
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 75f4c6d..0fe1358 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -126,8 +126,9 @@ static inline int fxsave_user(struct i387_fxsave_struct __user *fx)
 
 	/* See comment in fxsave() below. */
 #ifdef CONFIG_AS_FXSAVEQ
-	asm volatile("1:  fxsaveq %[fx]\n\t"
-		     "2:\n"
+	asm volatile(ASM_STAC "\n"
+		     "1:  fxsaveq %[fx]\n\t"
+		     "2: " ASM_CLAC "\n"
 		     ".section .fixup,\"ax\"\n"
 		     "3:  movl $-1,%[err]\n"
 		     "    jmp  2b\n"
@@ -136,8 +137,9 @@ static inline int fxsave_user(struct i387_fxsave_struct __user *fx)
 		     : [err] "=r" (err), [fx] "=m" (*fx)
 		     : "0" (0));
 #else
-	asm volatile("1:  rex64/fxsave (%[fx])\n\t"
-		     "2:\n"
+	asm volatile(ASM_STAC "\n"
+		     "1:  rex64/fxsave (%[fx])\n\t"
+		     "2: " ASM_CLAC "\n"
 		     ".section .fixup,\"ax\"\n"
 		     "3:  movl $-1,%[err]\n"
 		     "    jmp  2b\n"
diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h
index 71ecbcb..f373046 100644
--- a/arch/x86/include/asm/futex.h
+++ b/arch/x86/include/asm/futex.h
@@ -9,10 +9,13 @@
 #include <asm/asm.h>
 #include <asm/errno.h>
 #include <asm/processor.h>
+#include <asm/smap.h>
 
 #define __futex_atomic_op1(insn, ret, oldval, uaddr, oparg)	\
-	asm volatile("1:\t" insn "\n"				\
-		     "2:\t.section .fixup,\"ax\"\n"		\
+	asm volatile("\t" ASM_STAC "\n"				\
+		     "1:\t" insn "\n"				\
+		     "2:\t" ASM_CLAC "\n"			\
+		     "\t.section .fixup,\"ax\"\n"		\
 		     "3:\tmov\t%3, %1\n"			\
 		     "\tjmp\t2b\n"				\
 		     "\t.previous\n"				\
@@ -21,12 +24,14 @@
 		     : "i" (-EFAULT), "0" (oparg), "1" (0))
 
 #define __futex_atomic_op2(insn, ret, oldval, uaddr, oparg)	\
-	asm volatile("1:\tmovl	%2, %0\n"			\
+	asm volatile("\t" ASM_STAC "\n"				\
+		     "1:\tmovl	%2, %0\n"			\
 		     "\tmovl\t%0, %3\n"				\
 		     "\t" insn "\n"				\
 		     "2:\t" LOCK_PREFIX "cmpxchgl %3, %2\n"	\
 		     "\tjnz\t1b\n"				\
-		     "3:\t.section .fixup,\"ax\"\n"		\
+		     "3:\t" ASM_CLAC "\n"			\
+		     "\t.section .fixup,\"ax\"\n"		\
 		     "4:\tmov\t%5, %1\n"			\
 		     "\tjmp\t3b\n"				\
 		     "\t.previous\n"				\
@@ -122,8 +127,10 @@ static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
 		return -EFAULT;
 
-	asm volatile("1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
-		     "2:\t.section .fixup, \"ax\"\n"
+	asm volatile("\t" ASM_STAC "\n"
+		     "1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
+		     "2:\t" ASM_CLAC "\n"
+		     "\t.section .fixup, \"ax\"\n"
 		     "3:\tmov     %3, %0\n"
 		     "\tjmp     2b\n"
 		     "\t.previous\n"
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 3989c24..8d3120f 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -58,13 +58,13 @@
 
 #ifdef CONFIG_X86_SMAP
 
-static inline void clac(void)
+static __always_inline void clac(void)
 {
 	/* Note: a barrier is implicit in alternative() */
 	alternative(ASM_NOP3, __stringify(__ASM_CLAC), X86_FEATURE_SMAP);
 }
 
-static inline void stac(void)
+static __always_inline void stac(void)
 {
 	/* Note: a barrier is implicit in alternative() */
 	alternative(ASM_NOP3, __stringify(__ASM_STAC), X86_FEATURE_SMAP);
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 2c7df3d..b92ece1 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -9,6 +9,7 @@
 #include <linux/string.h>
 #include <asm/asm.h>
 #include <asm/page.h>
+#include <asm/smap.h>
 
 #define VERIFY_READ 0
 #define VERIFY_WRITE 1
@@ -192,9 +193,10 @@ extern int __get_user_bad(void);
 
 #ifdef CONFIG_X86_32
 #define __put_user_asm_u64(x, addr, err, errret)			\
-	asm volatile("1:	movl %%eax,0(%2)\n"			\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	movl %%eax,0(%2)\n"			\
 		     "2:	movl %%edx,4(%2)\n"			\
-		     "3:\n"						\
+		     "3: " ASM_CLAC "\n"				\
 		     ".section .fixup,\"ax\"\n"				\
 		     "4:	movl %3,%0\n"				\
 		     "	jmp 3b\n"					\
@@ -205,9 +207,10 @@ extern int __get_user_bad(void);
 		     : "A" (x), "r" (addr), "i" (errret), "0" (err))
 
 #define __put_user_asm_ex_u64(x, addr)					\
-	asm volatile("1:	movl %%eax,0(%1)\n"			\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	movl %%eax,0(%1)\n"			\
 		     "2:	movl %%edx,4(%1)\n"			\
-		     "3:\n"						\
+		     "3: " ASM_CLAC "\n"				\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     _ASM_EXTABLE_EX(2b, 3b)				\
 		     : : "A" (x), "r" (addr))
@@ -379,8 +382,9 @@ do {									\
 } while (0)
 
 #define __get_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
-	asm volatile("1:	mov"itype" %2,%"rtype"1\n"		\
-		     "2:\n"						\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	mov"itype" %2,%"rtype"1\n"		\
+		     "2: " ASM_CLAC "\n"				\
 		     ".section .fixup,\"ax\"\n"				\
 		     "3:	mov %3,%0\n"				\
 		     "	xor"itype" %"rtype"1,%"rtype"1\n"		\
@@ -412,8 +416,9 @@ do {									\
 } while (0)
 
 #define __get_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile("1:	mov"itype" %1,%"rtype"0\n"		\
-		     "2:\n"						\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	mov"itype" %1,%"rtype"0\n"		\
+		     "2: " ASM_CLAC "\n"				\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     : ltype(x) : "m" (__m(addr)))
 
@@ -443,8 +448,9 @@ struct __large_struct { unsigned long buf[100]; };
  * aliasing issues.
  */
 #define __put_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
-	asm volatile("1:	mov"itype" %"rtype"1,%2\n"		\
-		     "2:\n"						\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	mov"itype" %"rtype"1,%2\n"		\
+		     "2: " ASM_CLAC "\n"				\
 		     ".section .fixup,\"ax\"\n"				\
 		     "3:	mov %3,%0\n"				\
 		     "	jmp 2b\n"					\
@@ -454,8 +460,9 @@ struct __large_struct { unsigned long buf[100]; };
 		     : ltype(x), "m" (__m(addr)), "i" (errret), "0" (err))
 
 #define __put_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile("1:	mov"itype" %"rtype"0,%1\n"		\
-		     "2:\n"						\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	mov"itype" %"rtype"0,%1\n"		\
+		     "2: " ASM_CLAC "\n"				\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     : : ltype(x), "m" (__m(addr)))
 
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 8a1b6f9..2a923bd 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -74,8 +74,9 @@ static inline int xsave_user(struct xsave_struct __user *buf)
 	if (unlikely(err))
 		return -EFAULT;
 
-	__asm__ __volatile__("1: .byte " REX_PREFIX "0x0f,0xae,0x27\n"
-			     "2:\n"
+	__asm__ __volatile__(ASM_STAC "\n"
+			     "1: .byte " REX_PREFIX "0x0f,0xae,0x27\n"
+			     "2: " ASM_CLAC "\n"
 			     ".section .fixup,\"ax\"\n"
 			     "3:  movl $-1,%[err]\n"
 			     "    jmp  2b\n"
@@ -97,8 +98,9 @@ static inline int xrestore_user(struct xsave_struct __user *buf, u64 mask)
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 
-	__asm__ __volatile__("1: .byte " REX_PREFIX "0x0f,0xae,0x2f\n"
-			     "2:\n"
+	__asm__ __volatile__(ASM_STAC "\n"
+			     "1: .byte " REX_PREFIX "0x0f,0xae,0x2f\n"
+			     "2: " ASM_CLAC "\n"
 			     ".section .fixup,\"ax\"\n"
 			     "3:  movl $-1,%[err]\n"
 			     "    jmp  2b\n"
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a5fbc3c..cd43e52 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1113,7 +1113,8 @@ void syscall_init(void)
 
 	/* Flags to clear on syscall */
 	wrmsrl(MSR_SYSCALL_MASK,
-	       X86_EFLAGS_TF|X86_EFLAGS_DF|X86_EFLAGS_IF|X86_EFLAGS_IOPL);
+	       X86_EFLAGS_TF|X86_EFLAGS_DF|X86_EFLAGS_IF|
+	       X86_EFLAGS_IOPL|X86_EFLAGS_AC);
 }
 
 unsigned long kernel_eflags;
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 69babd8..ce87e3d 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -56,6 +56,7 @@
 #include <asm/ftrace.h>
 #include <asm/percpu.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 #include <linux/err.h>
 
 /* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this.  */
@@ -465,7 +466,8 @@ END(ret_from_fork)
  * System call entry. Up to 6 arguments in registers are supported.
  *
  * SYSCALL does not save anything on the stack and does not change the
- * stack pointer.
+ * stack pointer.  However, it does mask the flags register for us, so
+ * CLD and CLAC are not needed.
  */
 
 /*
@@ -884,6 +886,7 @@ END(interrupt)
 	 */
 	.p2align CONFIG_X86_L1_CACHE_SHIFT
 common_interrupt:
+	ASM_CLAC
 	XCPT_FRAME
 	addq $-0x80,(%rsp)		/* Adjust vector to [-256,-1] range */
 	interrupt do_IRQ
@@ -1023,6 +1026,7 @@ END(common_interrupt)
  */
 .macro apicinterrupt num sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	INTR_FRAME
 	pushq_cfi $~(\num)
 .Lcommon_\sym:
@@ -1077,6 +1081,7 @@ apicinterrupt IRQ_WORK_VECTOR \
  */
 .macro zeroentry sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	INTR_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	pushq_cfi $-1		/* ORIG_RAX: no syscall to restart */
@@ -1094,6 +1099,7 @@ END(\sym)
 
 .macro paranoidzeroentry sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	INTR_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	pushq_cfi $-1		/* ORIG_RAX: no syscall to restart */
@@ -1112,6 +1118,7 @@ END(\sym)
 #define INIT_TSS_IST(x) PER_CPU_VAR(init_tss) + (TSS_ist + ((x) - 1) * 8)
 .macro paranoidzeroentry_ist sym do_sym ist
 ENTRY(\sym)
+	ASM_CLAC
 	INTR_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	pushq_cfi $-1		/* ORIG_RAX: no syscall to restart */
@@ -1131,6 +1138,7 @@ END(\sym)
 
 .macro errorentry sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	XCPT_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	subq $ORIG_RAX-R15, %rsp
@@ -1149,6 +1157,7 @@ END(\sym)
 	/* error code is on the stack already */
 .macro paranoiderrorentry sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	XCPT_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	subq $ORIG_RAX-R15, %rsp
diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index 5b2995f..a30ca15 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -17,6 +17,7 @@
 #include <asm/cpufeature.h>
 #include <asm/alternative-asm.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 
 /*
  * By placing feature2 after feature1 in altinstructions section, we logically
@@ -130,6 +131,7 @@ ENDPROC(bad_from_user)
  */
 ENTRY(copy_user_generic_unrolled)
 	CFI_STARTPROC
+	ASM_STAC
 	cmpl $8,%edx
 	jb 20f		/* less then 8 bytes, go to byte copy loop */
 	ALIGN_DESTINATION
@@ -177,6 +179,7 @@ ENTRY(copy_user_generic_unrolled)
 	decl %ecx
 	jnz 21b
 23:	xor %eax,%eax
+	ASM_CLAC
 	ret
 
 	.section .fixup,"ax"
@@ -232,6 +235,7 @@ ENDPROC(copy_user_generic_unrolled)
  */
 ENTRY(copy_user_generic_string)
 	CFI_STARTPROC
+	ASM_STAC
 	andl %edx,%edx
 	jz 4f
 	cmpl $8,%edx
@@ -246,6 +250,7 @@ ENTRY(copy_user_generic_string)
 3:	rep
 	movsb
 4:	xorl %eax,%eax
+	ASM_CLAC
 	ret
 
 	.section .fixup,"ax"
@@ -273,12 +278,14 @@ ENDPROC(copy_user_generic_string)
  */
 ENTRY(copy_user_enhanced_fast_string)
 	CFI_STARTPROC
+	ASM_STAC
 	andl %edx,%edx
 	jz 2f
 	movl %edx,%ecx
 1:	rep
 	movsb
 2:	xorl %eax,%eax
+	ASM_CLAC
 	ret
 
 	.section .fixup,"ax"
diff --git a/arch/x86/lib/copy_user_nocache_64.S b/arch/x86/lib/copy_user_nocache_64.S
index cacddc7..6a4f43c 100644
--- a/arch/x86/lib/copy_user_nocache_64.S
+++ b/arch/x86/lib/copy_user_nocache_64.S
@@ -15,6 +15,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/thread_info.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 
 	.macro ALIGN_DESTINATION
 #ifdef FIX_ALIGNMENT
@@ -48,6 +49,7 @@
  */
 ENTRY(__copy_user_nocache)
 	CFI_STARTPROC
+	ASM_STAC
 	cmpl $8,%edx
 	jb 20f		/* less then 8 bytes, go to byte copy loop */
 	ALIGN_DESTINATION
@@ -95,6 +97,7 @@ ENTRY(__copy_user_nocache)
 	decl %ecx
 	jnz 21b
 23:	xorl %eax,%eax
+	ASM_CLAC
 	sfence
 	ret
 
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index b33b1fb..156b9c8 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -33,6 +33,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/thread_info.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 
 	.text
 ENTRY(__get_user_1)
@@ -40,8 +41,10 @@ ENTRY(__get_user_1)
 	GET_THREAD_INFO(%_ASM_DX)
 	cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
 	jae bad_get_user
+	ASM_STAC
 1:	movzb (%_ASM_AX),%edx
 	xor %eax,%eax
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 ENDPROC(__get_user_1)
@@ -53,8 +56,10 @@ ENTRY(__get_user_2)
 	GET_THREAD_INFO(%_ASM_DX)
 	cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
 	jae bad_get_user
+	ASM_STAC
 2:	movzwl -1(%_ASM_AX),%edx
 	xor %eax,%eax
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 ENDPROC(__get_user_2)
@@ -66,8 +71,10 @@ ENTRY(__get_user_4)
 	GET_THREAD_INFO(%_ASM_DX)
 	cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
 	jae bad_get_user
+	ASM_STAC
 3:	mov -3(%_ASM_AX),%edx
 	xor %eax,%eax
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 ENDPROC(__get_user_4)
@@ -80,8 +87,10 @@ ENTRY(__get_user_8)
 	GET_THREAD_INFO(%_ASM_DX)
 	cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
 	jae	bad_get_user
+	ASM_STAC
 4:	movq -7(%_ASM_AX),%_ASM_DX
 	xor %eax,%eax
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 ENDPROC(__get_user_8)
@@ -91,6 +100,7 @@ bad_get_user:
 	CFI_STARTPROC
 	xor %edx,%edx
 	mov $(-EFAULT),%_ASM_AX
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 END(bad_get_user)
diff --git a/arch/x86/lib/putuser.S b/arch/x86/lib/putuser.S
index 7f951c8..fc6ba17 100644
--- a/arch/x86/lib/putuser.S
+++ b/arch/x86/lib/putuser.S
@@ -15,6 +15,7 @@
 #include <asm/thread_info.h>
 #include <asm/errno.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 
 
 /*
@@ -31,7 +32,8 @@
 
 #define ENTER	CFI_STARTPROC ; \
 		GET_THREAD_INFO(%_ASM_BX)
-#define EXIT	ret ; \
+#define EXIT	ASM_CLAC ;	\
+		ret ;		\
 		CFI_ENDPROC
 
 .text
@@ -39,6 +41,7 @@ ENTRY(__put_user_1)
 	ENTER
 	cmp TI_addr_limit(%_ASM_BX),%_ASM_CX
 	jae bad_put_user
+	ASM_STAC
 1:	movb %al,(%_ASM_CX)
 	xor %eax,%eax
 	EXIT
@@ -50,6 +53,7 @@ ENTRY(__put_user_2)
 	sub $1,%_ASM_BX
 	cmp %_ASM_BX,%_ASM_CX
 	jae bad_put_user
+	ASM_STAC
 2:	movw %ax,(%_ASM_CX)
 	xor %eax,%eax
 	EXIT
@@ -61,6 +65,7 @@ ENTRY(__put_user_4)
 	sub $3,%_ASM_BX
 	cmp %_ASM_BX,%_ASM_CX
 	jae bad_put_user
+	ASM_STAC
 3:	movl %eax,(%_ASM_CX)
 	xor %eax,%eax
 	EXIT
@@ -72,6 +77,7 @@ ENTRY(__put_user_8)
 	sub $7,%_ASM_BX
 	cmp %_ASM_BX,%_ASM_CX
 	jae bad_put_user
+	ASM_STAC
 4:	mov %_ASM_AX,(%_ASM_CX)
 #ifdef CONFIG_X86_32
 5:	movl %edx,4(%_ASM_CX)
diff --git a/arch/x86/lib/usercopy_32.c b/arch/x86/lib/usercopy_32.c
index 1781b2f..98f6d6b6 100644
--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -42,10 +42,11 @@ do {									\
 	int __d0;							\
 	might_fault();							\
 	__asm__ __volatile__(						\
+		ASM_STAC "\n"						\
 		"0:	rep; stosl\n"					\
 		"	movl %2,%0\n"					\
 		"1:	rep; stosb\n"					\
-		"2:\n"							\
+		"2: " ASM_CLAC "\n"					\
 		".section .fixup,\"ax\"\n"				\
 		"3:	lea 0(%2,%0,4),%0\n"				\
 		"	jmp 2b\n"					\
@@ -626,10 +627,12 @@ survive:
 		return n;
 	}
 #endif
+	stac();
 	if (movsl_is_ok(to, from, n))
 		__copy_user(to, from, n);
 	else
 		n = __copy_user_intel(to, from, n);
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_to_user_ll);
@@ -637,10 +640,12 @@ EXPORT_SYMBOL(__copy_to_user_ll);
 unsigned long __copy_from_user_ll(void *to, const void __user *from,
 					unsigned long n)
 {
+	stac();
 	if (movsl_is_ok(to, from, n))
 		__copy_user_zeroing(to, from, n);
 	else
 		n = __copy_user_zeroing_intel(to, from, n);
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_from_user_ll);
@@ -648,11 +653,13 @@ EXPORT_SYMBOL(__copy_from_user_ll);
 unsigned long __copy_from_user_ll_nozero(void *to, const void __user *from,
 					 unsigned long n)
 {
+	stac();
 	if (movsl_is_ok(to, from, n))
 		__copy_user(to, from, n);
 	else
 		n = __copy_user_intel((void __user *)to,
 				      (const void *)from, n);
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_from_user_ll_nozero);
@@ -660,6 +667,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nozero);
 unsigned long __copy_from_user_ll_nocache(void *to, const void __user *from,
 					unsigned long n)
 {
+	stac();
 #ifdef CONFIG_X86_INTEL_USERCOPY
 	if (n > 64 && cpu_has_xmm2)
 		n = __copy_user_zeroing_intel_nocache(to, from, n);
@@ -668,6 +676,7 @@ unsigned long __copy_from_user_ll_nocache(void *to, const void __user *from,
 #else
 	__copy_user_zeroing(to, from, n);
 #endif
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_from_user_ll_nocache);
@@ -675,6 +684,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocache);
 unsigned long __copy_from_user_ll_nocache_nozero(void *to, const void __user *from,
 					unsigned long n)
 {
+	stac();
 #ifdef CONFIG_X86_INTEL_USERCOPY
 	if (n > 64 && cpu_has_xmm2)
 		n = __copy_user_intel_nocache(to, from, n);
@@ -683,6 +693,7 @@ unsigned long __copy_from_user_ll_nocache_nozero(void *to, const void __user *fr
 #else
 	__copy_user(to, from, n);
 #endif
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_from_user_ll_nocache_nozero);
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index e5b130b..05928aa 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -18,6 +18,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size)
 	might_fault();
 	/* no memory constraint because it doesn't change any memory gcc knows
 	   about */
+	stac();
 	asm volatile(
 		"	testq  %[size8],%[size8]\n"
 		"	jz     4f\n"
@@ -40,6 +41,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size)
 		: [size8] "=&c"(size), [dst] "=&D" (__d0)
 		: [size1] "r"(size & 7), "[size8]" (size / 8), "[dst]"(addr),
 		  [zero] "r" (0UL), [eight] "r" (8UL));
+	clac();
 	return size;
 }
 EXPORT_SYMBOL(__clear_user);
@@ -82,5 +84,6 @@ copy_user_handle_tail(char *to, char *from, unsigned len, unsigned zerorest)
 	for (c = 0, zero_len = len; zerorest && zero_len; --zero_len)
 		if (__put_user_nocheck(c, to++, sizeof(char)))
 			break;
+	clac();
 	return len;
 }
-- 
1.7.6.5


* [PATCH 09/11] x86, smap: Turn on Supervisor Mode Access Prevention
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (7 preceding siblings ...)
  2012-09-21 19:43 ` [PATCH 08/11] x86, smap: Add STAC and CLAC instructions to control user space access H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 20:05   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 10/11] x86, smap: A page fault due to SMAP is an oops H. Peter Anvin
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

If Supervisor Mode Access Prevention is available and not disabled by
the user, turn it on.  Also fix the expansion of SMEP (Supervisor Mode
Execution Prevention) in the kernel-parameters documentation.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 Documentation/kernel-parameters.txt |    6 +++++-
 arch/x86/kernel/cpu/common.c        |   26 ++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 1 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index ad7e2e5..49c5c41 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1812,8 +1812,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			noexec=on: enable non-executable mappings (default)
 			noexec=off: disable non-executable mappings
 
+	nosmap		[X86]
+			Disable SMAP (Supervisor Mode Access Prevention)
+			even if it is supported by processor.
+
 	nosmep		[X86]
-			Disable SMEP (Supervisor Mode Execution Protection)
+			Disable SMEP (Supervisor Mode Execution Prevention)
 			even if it is supported by processor.
 
 	noexec32	[X86-64]
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cd43e52..7d35d65 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -278,6 +278,31 @@ static __cpuinit void setup_smep(struct cpuinfo_x86 *c)
 	}
 }
 
+static int disable_smap __cpuinitdata;
+static __init int setup_disable_smap(char *arg)
+{
+	disable_smap = 1;
+	return 1;
+}
+__setup("nosmap", setup_disable_smap);
+
+static __cpuinit void setup_smap(struct cpuinfo_x86 *c)
+{
+	if (cpu_has(c, X86_FEATURE_SMAP)) {
+		if (unlikely(disable_smap)) {
+			setup_clear_cpu_cap(X86_FEATURE_SMAP);
+			clear_in_cr4(X86_CR4_SMAP);
+		} else {
+			set_in_cr4(X86_CR4_SMAP);
+			/*
+			 * Don't use clac() here since alternatives
+			 * haven't run yet...
+			 */
+			asm volatile(__stringify(__ASM_CLAC) ::: "memory");
+		}
+	}
+}
+
 /*
  * Some CPU features depend on higher CPUID levels, which may not always
  * be available due to CPUID level capping or broken virtualization
@@ -713,6 +738,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 	filter_cpuid_features(c, false);
 
 	setup_smep(c);
+	setup_smap(c);
 
 	if (this_cpu->c_bsp_init)
 		this_cpu->c_bsp_init(c);
-- 
1.7.6.5


* [PATCH 10/11] x86, smap: A page fault due to SMAP is an oops
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (8 preceding siblings ...)
  2012-09-21 19:43 ` [PATCH 09/11] x86, smap: Turn on Supervisor Mode Access Prevention H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 20:06   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
  2012-09-21 19:43 ` [PATCH 11/11] x86, smap: Reduce the SMAP overhead for signal handling H. Peter Anvin
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

If we get a page fault due to SMAP, trigger an oops rather than
spinning forever.
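
To spell out the reasoning (sketch only): such a fault cannot be fixed
up -- the VMA is perfectly valid -- so without an explicit check the
handler would simply return and the same instruction would fault again
immediately.  The condition, equivalent to the smap_violation() helper
added below:

static inline bool is_smap_violation(unsigned long error_code, struct pt_regs *regs)
{
	return !(error_code & PF_USER) &&		/* fault came from the kernel...  */
	       !(!user_mode_vm(regs) &&			/* ...which had not legitimately  */
		 (regs->flags & X86_EFLAGS_AC));	/* opened an AC window (via STAC) */
}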

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/mm/fault.c |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 76dcd9d..f2fb75d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -995,6 +995,17 @@ static int fault_in_kernel_space(unsigned long address)
 	return address >= TASK_SIZE_MAX;
 }
 
+static inline bool smap_violation(int error_code, struct pt_regs *regs)
+{
+	if (error_code & PF_USER)
+		return false;
+
+	if (!user_mode_vm(regs) && (regs->flags & X86_EFLAGS_AC))
+		return false;
+
+	return true;
+}
+
 /*
  * This routine handles page faults.  It determines the address,
  * and the problem, and then passes it off to one of the appropriate
@@ -1088,6 +1099,13 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
 	if (unlikely(error_code & PF_RSVD))
 		pgtable_bad(regs, error_code, address);
 
+	if (static_cpu_has(X86_FEATURE_SMAP)) {
+		if (unlikely(smap_violation(error_code, regs))) {
+			bad_area_nosemaphore(regs, error_code, address);
+			return;
+		}
+	}
+
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
 	/*
-- 
1.7.6.5


* [PATCH 11/11] x86, smap: Reduce the SMAP overhead for signal handling
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (9 preceding siblings ...)
  2012-09-21 19:43 ` [PATCH 10/11] x86, smap: A page fault due to SMAP is an oops H. Peter Anvin
@ 2012-09-21 19:43 ` H. Peter Anvin
  2012-09-21 20:07   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
  2012-09-21 19:54 ` [PATCH 00/11] x86: Supervisor Mode Access Prevention Linus Torvalds
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:43 UTC (permalink / raw)
  To: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar, Thomas Gleixner
  Cc: Linus Torvalds, Kees Cook, Linda Wang, Matt Fleming, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

Signal handling contains a bunch of accesses to individual user space
items, which causes an excessive number of STAC and CLAC
instructions.  Instead, let get/put_user_try ... get/put_user_catch()
contain the STAC and CLAC instructions.

This means that get/put_user_try no longer nests, and furthermore that
it is no longer legal to use user-space access functions other than
__get/put_user_ex() inside those blocks.  However, these macros are
x86-specific anyway and are only used in the signal-handling paths;
simply moving the larger subroutine calls out of the try...catch
blocks resolves that problem.
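
The resulting pattern looks roughly like this (sketch; frame, sig,
info, set, regs and err come from the surrounding signal-setup
function):

	put_user_try {
		/* one STAC ... CLAC window covers all the _ex accesses */
		put_user_ex(sig, &frame->sig);
		put_user_ex(sas_ss_flags(regs->sp), &frame->uc.uc_stack.ss_flags);
	} put_user_catch(err);

	/* helpers that do their own uaccess now run outside the block */
	err |= copy_siginfo_to_user(&frame->info, info);
	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));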

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/ia32/ia32_signal.c    |   12 +++++++-----
 arch/x86/include/asm/uaccess.h |   14 ++++++--------
 arch/x86/kernel/signal.c       |   24 ++++++++++++++----------
 3 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 673ac9b..05e62a3 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -250,11 +250,12 @@ static int ia32_restore_sigcontext(struct pt_regs *regs,
 
 		get_user_ex(tmp, &sc->fpstate);
 		buf = compat_ptr(tmp);
-		err |= restore_i387_xstate_ia32(buf);
 
 		get_user_ex(*pax, &sc->ax);
 	} get_user_catch(err);
 
+	err |= restore_i387_xstate_ia32(buf);
+
 	return err;
 }
 
@@ -502,7 +503,6 @@ int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sig, &frame->sig);
 		put_user_ex(ptr_to_compat(&frame->info), &frame->pinfo);
 		put_user_ex(ptr_to_compat(&frame->uc), &frame->puc);
-		err |= copy_siginfo_to_user32(&frame->info, info);
 
 		/* Create the ucontext.  */
 		if (cpu_has_xsave)
@@ -514,9 +514,6 @@ int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sas_ss_flags(regs->sp),
 			    &frame->uc.uc_stack.ss_flags);
 		put_user_ex(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
-		err |= ia32_setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
-					     regs, set->sig[0]);
-		err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
 
 		if (ka->sa.sa_flags & SA_RESTORER)
 			restorer = ka->sa.sa_restorer;
@@ -532,6 +529,11 @@ int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(*((u64 *)&code), (u64 *)frame->retcode);
 	} put_user_catch(err);
 
+	err |= copy_siginfo_to_user32(&frame->info, info);
+	err |= ia32_setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
+				     regs, set->sig[0]);
+	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+
 	if (err)
 		return -EFAULT;
 
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index b92ece1..a91acfb 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -416,9 +416,8 @@ do {									\
 } while (0)
 
 #define __get_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile(ASM_STAC "\n"					\
-		     "1:	mov"itype" %1,%"rtype"0\n"		\
-		     "2: " ASM_CLAC "\n"				\
+	asm volatile("1:	mov"itype" %1,%"rtype"0\n"		\
+		     "2:\n"						\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     : ltype(x) : "m" (__m(addr)))
 
@@ -460,9 +459,8 @@ struct __large_struct { unsigned long buf[100]; };
 		     : ltype(x), "m" (__m(addr)), "i" (errret), "0" (err))
 
 #define __put_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile(ASM_STAC "\n"					\
-		     "1:	mov"itype" %"rtype"0,%1\n"		\
-		     "2: " ASM_CLAC "\n"				\
+	asm volatile("1:	mov"itype" %"rtype"0,%1\n"		\
+		     "2:\n"						\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     : : ltype(x), "m" (__m(addr)))
 
@@ -470,13 +468,13 @@ struct __large_struct { unsigned long buf[100]; };
  * uaccess_try and catch
  */
 #define uaccess_try	do {						\
-	int prev_err = current_thread_info()->uaccess_err;		\
 	current_thread_info()->uaccess_err = 0;				\
+	stac();								\
 	barrier();
 
 #define uaccess_catch(err)						\
+	clac();								\
 	(err) |= (current_thread_info()->uaccess_err ? -EFAULT : 0);	\
-	current_thread_info()->uaccess_err = prev_err;			\
 } while (0)
 
 /**
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index b280908..9326128 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -114,11 +114,12 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
 		regs->orig_ax = -1;		/* disable syscall checks */
 
 		get_user_ex(buf, &sc->fpstate);
-		err |= restore_i387_xstate(buf);
 
 		get_user_ex(*pax, &sc->ax);
 	} get_user_catch(err);
 
+	err |= restore_i387_xstate(buf);
+
 	return err;
 }
 
@@ -357,7 +358,6 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sig, &frame->sig);
 		put_user_ex(&frame->info, &frame->pinfo);
 		put_user_ex(&frame->uc, &frame->puc);
-		err |= copy_siginfo_to_user(&frame->info, info);
 
 		/* Create the ucontext.  */
 		if (cpu_has_xsave)
@@ -369,9 +369,6 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sas_ss_flags(regs->sp),
 			    &frame->uc.uc_stack.ss_flags);
 		put_user_ex(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
-		err |= setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
-					regs, set->sig[0]);
-		err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
 
 		/* Set up to return from userspace.  */
 		restorer = VDSO32_SYMBOL(current->mm->context.vdso, rt_sigreturn);
@@ -389,6 +386,11 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(*((u64 *)&rt_retcode), (u64 *)frame->retcode);
 	} put_user_catch(err);
 
+	err |= copy_siginfo_to_user(&frame->info, info);
+	err |= setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
+				regs, set->sig[0]);
+	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+
 	if (err)
 		return -EFAULT;
 
@@ -436,8 +438,6 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sas_ss_flags(regs->sp),
 			    &frame->uc.uc_stack.ss_flags);
 		put_user_ex(me->sas_ss_size, &frame->uc.uc_stack.ss_size);
-		err |= setup_sigcontext(&frame->uc.uc_mcontext, fp, regs, set->sig[0]);
-		err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
 
 		/* Set up to return from userspace.  If provided, use a stub
 		   already in userspace.  */
@@ -450,6 +450,9 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		}
 	} put_user_catch(err);
 
+	err |= setup_sigcontext(&frame->uc.uc_mcontext, fp, regs, set->sig[0]);
+	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+
 	if (err)
 		return -EFAULT;
 
@@ -855,9 +858,6 @@ static int x32_setup_rt_frame(int sig, struct k_sigaction *ka,
 			    &frame->uc.uc_stack.ss_flags);
 		put_user_ex(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
 		put_user_ex(0, &frame->uc.uc__pad0);
-		err |= setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
-					regs, set->sig[0]);
-		err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
 
 		if (ka->sa.sa_flags & SA_RESTORER) {
 			restorer = ka->sa.sa_restorer;
@@ -869,6 +869,10 @@ static int x32_setup_rt_frame(int sig, struct k_sigaction *ka,
 		put_user_ex(restorer, &frame->pretcode);
 	} put_user_catch(err);
 
+	err |= setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
+				regs, set->sig[0]);
+	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+
 	if (err)
 		return -EFAULT;
 
-- 
1.7.6.5


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (10 preceding siblings ...)
  2012-09-21 19:43 ` [PATCH 11/11] x86, smap: Reduce the SMAP overhead for signal handling H. Peter Anvin
@ 2012-09-21 19:54 ` Linus Torvalds
  2012-09-21 19:57   ` H. Peter Anvin
  2012-09-21 20:08   ` Ingo Molnar
  2012-09-21 22:07 ` Eric W. Biederman
  2012-09-21 22:08 ` [PATCH 00/11] x86: Supervisor Mode Access Prevention Dave Jones
  13 siblings, 2 replies; 56+ messages in thread
From: Linus Torvalds @ 2012-09-21 19:54 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Kees Cook, Linda Wang, Matt Fleming

On Fri, Sep 21, 2012 at 12:43 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
> Supervisor Mode Access Prevention (SMAP) is a new security feature
> disclosed by Intel in revision 014 of the Intel® Architecture
> Instruction Set Extensions Programming Reference:

Looks good.

Did this find any bugs, btw? We've had a few cases where we forgot to
use the proper user access function, and code just happened to work
because it all boils down to the same thing and never got any page
faults in practice anyway..

I'd obviously hope that we have caught all of them, but.. IOW, has
SMAP actually triggered for anybody in testing inside Intel?

                 Linus

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 19:54 ` [PATCH 00/11] x86: Supervisor Mode Access Prevention Linus Torvalds
@ 2012-09-21 19:57   ` H. Peter Anvin
  2012-09-21 20:08   ` Ingo Molnar
  1 sibling, 0 replies; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 19:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Linux Kernel Mailing List, Ingo Molnar,
	Thomas Gleixner, Kees Cook, Linda Wang, Matt Fleming

On 09/21/2012 12:54 PM, Linus Torvalds wrote:
> On Fri, Sep 21, 2012 at 12:43 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>> Supervisor Mode Access Prevention (SMAP) is a new security feature
>> disclosed by Intel in revision 014 of the Intel® Architecture
>> Instruction Set Extensions Programming Reference:
> 
> Looks good.
> 
> Did this find any bugs, btw? We've had a few cases where we forgot to
> use the proper user access function, and code just happened to work
> because it all boils down to the same thing and never got any page
> faults in practice anyway..
> 
> I'd obviously hope that we have caught all of them, but.. IOW, has
> SMAP actually triggered for anybody in testing inside Intel?
> 

So far, it caught the use of PAGE_READONLY instead of PAGE_KERNEL_RO for
the WP test on 32 bits (patch 02/11).

It has not had very high testing bandwidth yet, and especially the
exposure of driver code has been very limited, so I would not be at all
surprised if more crop up.
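
For the record, the class of bug SMAP flags looks roughly like the
hypothetical driver snippet below (illustrative only, not code from
this series): a direct dereference of a __user pointer from kernel
context, which happens to work without SMAP as long as the page is
mapped, but now faults -- and with patch 10/11, oopses.

	#include <linux/uaccess.h>

	struct ex_args {
		int value;
	};

	/* BUG: dereferences the user pointer directly. */
	static long ex_ioctl_broken(struct ex_args __user *uarg)
	{
		return uarg->value;
	}

	/* Correct: get_user() brackets the access with STAC/CLAC and
	 * recovers from a fault via the exception table. */
	static long ex_ioctl_fixed(struct ex_args __user *uarg)
	{
		int value;

		if (get_user(value, &uarg->value))
			return -EFAULT;
		return value;
	}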

	-hpa

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86-32, mm: The WP test should be done on a kernel page
  2012-09-21 19:43 ` [PATCH 02/11] x86-32, mm: The WP test should be done on a kernel page H. Peter Anvin
@ 2012-09-21 19:58   ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 19:58 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  8bd753be7a96443dd5111cb07ed5907a3787c978
Gitweb:     http://git.kernel.org/tip/8bd753be7a96443dd5111cb07ed5907a3787c978
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:06 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:25 -0700

x86-32, mm: The WP test should be done on a kernel page

PAGE_READONLY includes user permission, but this is a page used
exclusively by the kernel; use PAGE_KERNEL_RO instead.
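
The difference that matters here is the _PAGE_USER bit: PAGE_READONLY
carries it, so the WP-test fixmap ends up as a U=1 page and the
kernel's own probe of it faults once CR4.SMAP is set.  A minimal
illustration (hypothetical helper, not part of this patch; other bits
such as NX and GLOBAL also differ but are irrelevant here):

	#include <linux/types.h>
	#include <asm/pgtable_types.h>

	/* Does this protection value produce a user-accessible (U=1) page? */
	static inline bool ex_pgprot_is_user(pgprot_t prot)
	{
		return pgprot_val(prot) & _PAGE_USER;
	}

	/* ex_pgprot_is_user(PAGE_READONLY)  -> true:  SMAP applies      */
	/* ex_pgprot_is_user(PAGE_KERNEL_RO) -> false: supervisor-only   */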

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-3-git-send-email-hpa@linux.intel.com
---
 arch/x86/mm/init_32.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 575d86f..e537b35 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -712,7 +712,7 @@ static void __init test_wp_bit(void)
   "Checking if this processor honours the WP bit even in supervisor mode...");
 
 	/* Any page-aligned address will do, the test is non-destructive */
-	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_READONLY);
+	__set_fixmap(FIX_WP_TEST, __pa(&swapper_pg_dir), PAGE_KERNEL_RO);
 	boot_cpu_data.wp_works_ok = do_test_wp_bit();
 	clear_fixmap(FIX_WP_TEST);
 

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, smap: Add CR4 bit for SMAP
  2012-09-21 19:43 ` [PATCH 03/11] x86, smap: Add CR4 bit for SMAP H. Peter Anvin
@ 2012-09-21 19:59   ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 19:59 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  85fdf05cc395f23384cb0adb22765cbaa9653b54
Gitweb:     http://git.kernel.org/tip/85fdf05cc395f23384cb0adb22765cbaa9653b54
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:07 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:25 -0700

x86, smap: Add CR4 bit for SMAP

Add X86_CR4_SMAP to <asm/processor-flags.h>.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-4-git-send-email-hpa@linux.intel.com
---
 arch/x86/include/asm/processor-flags.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h
index aea1d1d..680cf09 100644
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -65,6 +65,7 @@
 #define X86_CR4_PCIDE	0x00020000 /* enable PCID support */
 #define X86_CR4_OSXSAVE 0x00040000 /* enable xsave and xrestore */
 #define X86_CR4_SMEP	0x00100000 /* enable SMEP support */
+#define X86_CR4_SMAP	0x00200000 /* enable SMAP support */
 
 /*
  * x86-64 Task Priority Register, CR8

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, alternative: Use .pushsection/.popsection
  2012-09-21 19:43 ` [PATCH 04/11] x86, alternative: Use .pushsection/.popsection H. Peter Anvin
@ 2012-09-21 20:00   ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 20:00 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  9cebed423c84a56b871327dd77e555d1d2186a6b
Gitweb:     http://git.kernel.org/tip/9cebed423c84a56b871327dd77e555d1d2186a6b
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:08 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:25 -0700

x86, alternative: Use .pushsection/.popsection

.section/.previous doesn't nest.  Use .pushsection/.popsection in
<asm/alternative.h> so that they can be properly nested.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-5-git-send-email-hpa@linux.intel.com
---
 arch/x86/include/asm/alternative-asm.h |    4 ++--
 arch/x86/include/asm/alternative.h     |   32 ++++++++++++++++----------------
 2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h
index 952bd01..018d29f 100644
--- a/arch/x86/include/asm/alternative-asm.h
+++ b/arch/x86/include/asm/alternative-asm.h
@@ -5,10 +5,10 @@
 #ifdef CONFIG_SMP
 	.macro LOCK_PREFIX
 672:	lock
-	.section .smp_locks,"a"
+	.pushsection .smp_locks,"a"
 	.balign 4
 	.long 672b - .
-	.previous
+	.popsection
 	.endm
 #else
 	.macro LOCK_PREFIX
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 7078068..87bc00d 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -29,10 +29,10 @@
 
 #ifdef CONFIG_SMP
 #define LOCK_PREFIX_HERE \
-		".section .smp_locks,\"a\"\n"	\
-		".balign 4\n"			\
-		".long 671f - .\n" /* offset */	\
-		".previous\n"			\
+		".pushsection .smp_locks,\"a\"\n"	\
+		".balign 4\n"				\
+		".long 671f - .\n" /* offset */		\
+		".popsection\n"				\
 		"671:"
 
 #define LOCK_PREFIX LOCK_PREFIX_HERE "\n\tlock; "
@@ -99,30 +99,30 @@ static inline int alternatives_text_reserved(void *start, void *end)
 /* alternative assembly primitive: */
 #define ALTERNATIVE(oldinstr, newinstr, feature)			\
 	OLDINSTR(oldinstr)						\
-	".section .altinstructions,\"a\"\n"				\
+	".pushsection .altinstructions,\"a\"\n"				\
 	ALTINSTR_ENTRY(feature, 1)					\
-	".previous\n"							\
-	".section .discard,\"aw\",@progbits\n"				\
+	".popsection\n"							\
+	".pushsection .discard,\"aw\",@progbits\n"			\
 	DISCARD_ENTRY(1)						\
-	".previous\n"							\
-	".section .altinstr_replacement, \"ax\"\n"			\
+	".popsection\n"							\
+	".pushsection .altinstr_replacement, \"ax\"\n"			\
 	ALTINSTR_REPLACEMENT(newinstr, feature, 1)			\
-	".previous"
+	".popsection"
 
 #define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2)\
 	OLDINSTR(oldinstr)						\
-	".section .altinstructions,\"a\"\n"				\
+	".pushsection .altinstructions,\"a\"\n"				\
 	ALTINSTR_ENTRY(feature1, 1)					\
 	ALTINSTR_ENTRY(feature2, 2)					\
-	".previous\n"							\
-	".section .discard,\"aw\",@progbits\n"				\
+	".popsection\n"							\
+	".pushsection .discard,\"aw\",@progbits\n"			\
 	DISCARD_ENTRY(1)						\
 	DISCARD_ENTRY(2)						\
-	".previous\n"							\
-	".section .altinstr_replacement, \"ax\"\n"			\
+	".popsection\n"							\
+	".pushsection .altinstr_replacement, \"ax\"\n"			\
 	ALTINSTR_REPLACEMENT(newinstr1, feature1, 1)			\
 	ALTINSTR_REPLACEMENT(newinstr2, feature2, 2)			\
-	".previous"
+	".popsection"
 
 /*
  * This must be included *after* the definition of ALTERNATIVE due to

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, alternative: Add header guards to <asm/alternative-asm.h>
  2012-09-21 19:43 ` [PATCH 05/11] x86, alternative: Add header guards to <asm/alternative-asm.h> H. Peter Anvin
@ 2012-09-21 20:01   ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 20:01 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  76f30759f690db21ca567a20665ed2679ad3235b
Gitweb:     http://git.kernel.org/tip/76f30759f690db21ca567a20665ed2679ad3235b
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:09 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:26 -0700

x86, alternative: Add header guards to <asm/alternative-asm.h>

Add header guards to protect <asm/alternative-asm.h> against multiple
inclusion.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-6-git-send-email-hpa@linux.intel.com
---
 arch/x86/include/asm/alternative-asm.h |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/alternative-asm.h b/arch/x86/include/asm/alternative-asm.h
index 018d29f..372231c 100644
--- a/arch/x86/include/asm/alternative-asm.h
+++ b/arch/x86/include/asm/alternative-asm.h
@@ -1,3 +1,6 @@
+#ifndef _ASM_X86_ALTERNATIVE_ASM_H
+#define _ASM_X86_ALTERNATIVE_ASM_H
+
 #ifdef __ASSEMBLY__
 
 #include <asm/asm.h>
@@ -24,3 +27,5 @@
 .endm
 
 #endif  /*  __ASSEMBLY__  */
+
+#endif /* _ASM_X86_ALTERNATIVE_ASM_H */

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, smap: Add a header file with macros for STAC/CLAC
  2012-09-21 19:43 ` [PATCH 06/11] x86, smap: Add a header file with macros for STAC/CLAC H. Peter Anvin
@ 2012-09-21 20:02   ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 20:02 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  51ae4a2d775e1ee456282d7c60e49693d0a8555d
Gitweb:     http://git.kernel.org/tip/51ae4a2d775e1ee456282d7c60e49693d0a8555d
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:10 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:26 -0700

x86, smap: Add a header file with macros for STAC/CLAC

The STAC/CLAC instructions are only available with SMAP, but on the
other hand they aren't needed if SMAP is not available, or before we
start to run userspace, so construct them as alternatives which start
out as noops and are enabled by the alternatives mechanism.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-7-git-send-email-hpa@linux.intel.com
---
 arch/x86/Kconfig            |   11 +++++
 arch/x86/include/asm/smap.h |   91 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 102 insertions(+), 0 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8ec3a1a..5ce8694 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1487,6 +1487,17 @@ config ARCH_RANDOM
 	  If supported, this is a high bandwidth, cryptographically
 	  secure hardware random number generator.
 
+config X86_SMAP
+	def_bool y
+	prompt "Supervisor Mode Access Prevention" if EXPERT
+	---help---
+	  Supervisor Mode Access Prevention (SMAP) is a security
+	  feature in newer Intel processors.  There is a small
+	  performance cost if this is enabled and turned on; there is
+	  also a small increase in the kernel size if this is enabled.
+
+	  If unsure, say Y.
+
 config EFI
 	bool "EFI runtime service support"
 	depends on ACPI
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
new file mode 100644
index 0000000..3989c24
--- /dev/null
+++ b/arch/x86/include/asm/smap.h
@@ -0,0 +1,91 @@
+/*
+ * Supervisor Mode Access Prevention support
+ *
+ * Copyright (C) 2012 Intel Corporation
+ * Author: H. Peter Anvin <hpa@linux.intel.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ */
+
+#ifndef _ASM_X86_SMAP_H
+#define _ASM_X86_SMAP_H
+
+#include <linux/stringify.h>
+#include <asm/nops.h>
+#include <asm/cpufeature.h>
+
+/* "Raw" instruction opcodes */
+#define __ASM_CLAC	.byte 0x0f,0x01,0xca
+#define __ASM_STAC	.byte 0x0f,0x01,0xcb
+
+#ifdef __ASSEMBLY__
+
+#include <asm/alternative-asm.h>
+
+#ifdef CONFIG_X86_SMAP
+
+#define ASM_CLAC							\
+	661: ASM_NOP3 ;							\
+	.pushsection .altinstr_replacement, "ax" ;			\
+	662: __ASM_CLAC ;						\
+	.popsection ;							\
+	.pushsection .altinstructions, "a" ;				\
+	altinstruction_entry 661b, 662b, X86_FEATURE_SMAP, 3, 3 ;	\
+	.popsection
+
+#define ASM_STAC							\
+	661: ASM_NOP3 ;							\
+	.pushsection .altinstr_replacement, "ax" ;			\
+	662: __ASM_STAC ;						\
+	.popsection ;							\
+	.pushsection .altinstructions, "a" ;				\
+	altinstruction_entry 661b, 662b, X86_FEATURE_SMAP, 3, 3 ;	\
+	.popsection
+
+#else /* CONFIG_X86_SMAP */
+
+#define ASM_CLAC
+#define ASM_STAC
+
+#endif /* CONFIG_X86_SMAP */
+
+#else /* __ASSEMBLY__ */
+
+#include <asm/alternative.h>
+
+#ifdef CONFIG_X86_SMAP
+
+static inline void clac(void)
+{
+	/* Note: a barrier is implicit in alternative() */
+	alternative(ASM_NOP3, __stringify(__ASM_CLAC), X86_FEATURE_SMAP);
+}
+
+static inline void stac(void)
+{
+	/* Note: a barrier is implicit in alternative() */
+	alternative(ASM_NOP3, __stringify(__ASM_STAC), X86_FEATURE_SMAP);
+}
+
+/* These macros can be used in asm() statements */
+#define ASM_CLAC \
+	ALTERNATIVE(ASM_NOP3, __stringify(__ASM_CLAC), X86_FEATURE_SMAP)
+#define ASM_STAC \
+	ALTERNATIVE(ASM_NOP3, __stringify(__ASM_STAC), X86_FEATURE_SMAP)
+
+#else /* CONFIG_X86_SMAP */
+
+static inline void clac(void) { }
+static inline void stac(void) { }
+
+#define ASM_CLAC
+#define ASM_STAC
+
+#endif /* CONFIG_X86_SMAP */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_X86_SMAP_H */

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, uaccess: Merge prototypes for clear_user/__clear_user
  2012-09-21 19:43 ` [PATCH 07/11] x86, uaccess: Merge prototypes for clear_user/__clear_user H. Peter Anvin
@ 2012-09-21 20:03   ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 20:03 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  a052858fabb376b695f2c125633daa6728e0f284
Gitweb:     http://git.kernel.org/tip/a052858fabb376b695f2c125633daa6728e0f284
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:11 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:26 -0700

x86, uaccess: Merge prototypes for clear_user/__clear_user

The prototypes for clear_user() and __clear_user() are identical in
the 32- and 64-bit headers.  No functionality change.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-8-git-send-email-hpa@linux.intel.com
---
 arch/x86/include/asm/uaccess.h    |    3 +++
 arch/x86/include/asm/uaccess_32.h |    3 ---
 arch/x86/include/asm/uaccess_64.h |    3 ---
 3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index e1f3a17..2c7df3d 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -569,6 +569,9 @@ strncpy_from_user(char *dst, const char __user *src, long count);
 extern __must_check long strlen_user(const char __user *str);
 extern __must_check long strnlen_user(const char __user *str, long n);
 
+unsigned long __must_check clear_user(void __user *mem, unsigned long len);
+unsigned long __must_check __clear_user(void __user *mem, unsigned long len);
+
 /*
  * movsl can be slow when source and dest are not both 8-byte aligned
  */
diff --git a/arch/x86/include/asm/uaccess_32.h b/arch/x86/include/asm/uaccess_32.h
index 576e39b..7f760a9 100644
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -213,7 +213,4 @@ static inline unsigned long __must_check copy_from_user(void *to,
 	return n;
 }
 
-unsigned long __must_check clear_user(void __user *mem, unsigned long len);
-unsigned long __must_check __clear_user(void __user *mem, unsigned long len);
-
 #endif /* _ASM_X86_UACCESS_32_H */
diff --git a/arch/x86/include/asm/uaccess_64.h b/arch/x86/include/asm/uaccess_64.h
index d8def8b..142810c 100644
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -217,9 +217,6 @@ int __copy_in_user(void __user *dst, const void __user *src, unsigned size)
 	}
 }
 
-__must_check unsigned long clear_user(void __user *mem, unsigned long len);
-__must_check unsigned long __clear_user(void __user *mem, unsigned long len);
-
 static __must_check __always_inline int
 __copy_from_user_inatomic(void *dst, const void __user *src, unsigned size)
 {

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, smap: Add STAC and CLAC instructions to control user space access
  2012-09-21 19:43 ` [PATCH 08/11] x86, smap: Add STAC and CLAC instructions to control user space access H. Peter Anvin
@ 2012-09-21 20:04   ` tip-bot for H. Peter Anvin
  2012-09-22  0:16   ` [tip:x86/smap] x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry tip-bot for H. Peter Anvin
  1 sibling, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 20:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  63bcff2a307b9bcc712a8251eb27df8b2e117967
Gitweb:     http://git.kernel.org/tip/63bcff2a307b9bcc712a8251eb27df8b2e117967
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:12 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:27 -0700

x86, smap: Add STAC and CLAC instructions to control user space access

When Supervisor Mode Access Prevention (SMAP) is enabled, access to
userspace from the kernel is controlled by the AC flag.  To make the
performance of manipulating that flag acceptable, there are two new
instructions, STAC and CLAC, to set and clear it.

This patch adds those instructions, via alternative(), when the SMAP
feature is enabled.  It also adds X86_EFLAGS_AC unconditionally to the
SYSCALL entry mask; there is simply no reason to make that one
conditional.
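
At C call sites the resulting pattern looks roughly like the sketch
below (hypothetical function; ex_raw_copy() stands in for an
open-coded access loop with its own exception-table entries, as in
usercopy_32.c).  On hardware without SMAP the alternatives machinery
leaves the 3-byte NOPs in place, so the bracketing is essentially
free:

	#include <asm/smap.h>

	/* Hypothetical raw user-access loop with its own extable entries. */
	extern unsigned long ex_raw_copy(void *to, const void __user *from,
					 unsigned long n);

	static unsigned long ex_copy_from(void *to, const void __user *from,
					  unsigned long n)
	{
		stac();				/* NOP unless X86_FEATURE_SMAP */
		n = ex_raw_copy(to, from, n);
		clac();				/* NOP unless X86_FEATURE_SMAP */
		return n;
	}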

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-9-git-send-email-hpa@linux.intel.com
---
 arch/x86/ia32/ia32entry.S           |    6 ++++++
 arch/x86/include/asm/fpu-internal.h |   10 ++++++----
 arch/x86/include/asm/futex.h        |   19 +++++++++++++------
 arch/x86/include/asm/smap.h         |    4 ++--
 arch/x86/include/asm/uaccess.h      |   31 +++++++++++++++++++------------
 arch/x86/include/asm/xsave.h        |   10 ++++++----
 arch/x86/kernel/cpu/common.c        |    3 ++-
 arch/x86/kernel/entry_64.S          |   11 ++++++++++-
 arch/x86/lib/copy_user_64.S         |    7 +++++++
 arch/x86/lib/copy_user_nocache_64.S |    3 +++
 arch/x86/lib/getuser.S              |   10 ++++++++++
 arch/x86/lib/putuser.S              |    8 +++++++-
 arch/x86/lib/usercopy_32.c          |   13 ++++++++++++-
 arch/x86/lib/usercopy_64.c          |    3 +++
 14 files changed, 106 insertions(+), 32 deletions(-)

diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
index 20e5f7b..9c28950 100644
--- a/arch/x86/ia32/ia32entry.S
+++ b/arch/x86/ia32/ia32entry.S
@@ -14,6 +14,7 @@
 #include <asm/segment.h>
 #include <asm/irqflags.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 #include <linux/linkage.h>
 #include <linux/err.h>
 
@@ -146,8 +147,10 @@ ENTRY(ia32_sysenter_target)
 	SAVE_ARGS 0,1,0
  	/* no need to do an access_ok check here because rbp has been
  	   32bit zero extended */ 
+	ASM_STAC
 1:	movl	(%rbp),%ebp
 	_ASM_EXTABLE(1b,ia32_badarg)
+	ASM_CLAC
 	orl     $TS_COMPAT,TI_status+THREAD_INFO(%rsp,RIP-ARGOFFSET)
 	testl   $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
 	CFI_REMEMBER_STATE
@@ -301,8 +304,10 @@ ENTRY(ia32_cstar_target)
 	/* no need to do an access_ok check here because r8 has been
 	   32bit zero extended */ 
 	/* hardware stack frame is complete now */	
+	ASM_STAC
 1:	movl	(%r8),%r9d
 	_ASM_EXTABLE(1b,ia32_badarg)
+	ASM_CLAC
 	orl     $TS_COMPAT,TI_status+THREAD_INFO(%rsp,RIP-ARGOFFSET)
 	testl   $_TIF_WORK_SYSCALL_ENTRY,TI_flags+THREAD_INFO(%rsp,RIP-ARGOFFSET)
 	CFI_REMEMBER_STATE
@@ -365,6 +370,7 @@ cstar_tracesys:
 END(ia32_cstar_target)
 				
 ia32_badarg:
+	ASM_CLAC
 	movq $-EFAULT,%rax
 	jmp ia32_sysret
 	CFI_ENDPROC
diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h
index 75f4c6d..0fe1358 100644
--- a/arch/x86/include/asm/fpu-internal.h
+++ b/arch/x86/include/asm/fpu-internal.h
@@ -126,8 +126,9 @@ static inline int fxsave_user(struct i387_fxsave_struct __user *fx)
 
 	/* See comment in fxsave() below. */
 #ifdef CONFIG_AS_FXSAVEQ
-	asm volatile("1:  fxsaveq %[fx]\n\t"
-		     "2:\n"
+	asm volatile(ASM_STAC "\n"
+		     "1:  fxsaveq %[fx]\n\t"
+		     "2: " ASM_CLAC "\n"
 		     ".section .fixup,\"ax\"\n"
 		     "3:  movl $-1,%[err]\n"
 		     "    jmp  2b\n"
@@ -136,8 +137,9 @@ static inline int fxsave_user(struct i387_fxsave_struct __user *fx)
 		     : [err] "=r" (err), [fx] "=m" (*fx)
 		     : "0" (0));
 #else
-	asm volatile("1:  rex64/fxsave (%[fx])\n\t"
-		     "2:\n"
+	asm volatile(ASM_STAC "\n"
+		     "1:  rex64/fxsave (%[fx])\n\t"
+		     "2: " ASM_CLAC "\n"
 		     ".section .fixup,\"ax\"\n"
 		     "3:  movl $-1,%[err]\n"
 		     "    jmp  2b\n"
diff --git a/arch/x86/include/asm/futex.h b/arch/x86/include/asm/futex.h
index 71ecbcb..f373046 100644
--- a/arch/x86/include/asm/futex.h
+++ b/arch/x86/include/asm/futex.h
@@ -9,10 +9,13 @@
 #include <asm/asm.h>
 #include <asm/errno.h>
 #include <asm/processor.h>
+#include <asm/smap.h>
 
 #define __futex_atomic_op1(insn, ret, oldval, uaddr, oparg)	\
-	asm volatile("1:\t" insn "\n"				\
-		     "2:\t.section .fixup,\"ax\"\n"		\
+	asm volatile("\t" ASM_STAC "\n"				\
+		     "1:\t" insn "\n"				\
+		     "2:\t" ASM_CLAC "\n"			\
+		     "\t.section .fixup,\"ax\"\n"		\
 		     "3:\tmov\t%3, %1\n"			\
 		     "\tjmp\t2b\n"				\
 		     "\t.previous\n"				\
@@ -21,12 +24,14 @@
 		     : "i" (-EFAULT), "0" (oparg), "1" (0))
 
 #define __futex_atomic_op2(insn, ret, oldval, uaddr, oparg)	\
-	asm volatile("1:\tmovl	%2, %0\n"			\
+	asm volatile("\t" ASM_STAC "\n"				\
+		     "1:\tmovl	%2, %0\n"			\
 		     "\tmovl\t%0, %3\n"				\
 		     "\t" insn "\n"				\
 		     "2:\t" LOCK_PREFIX "cmpxchgl %3, %2\n"	\
 		     "\tjnz\t1b\n"				\
-		     "3:\t.section .fixup,\"ax\"\n"		\
+		     "3:\t" ASM_CLAC "\n"			\
+		     "\t.section .fixup,\"ax\"\n"		\
 		     "4:\tmov\t%5, %1\n"			\
 		     "\tjmp\t3b\n"				\
 		     "\t.previous\n"				\
@@ -122,8 +127,10 @@ static inline int futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
 		return -EFAULT;
 
-	asm volatile("1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
-		     "2:\t.section .fixup, \"ax\"\n"
+	asm volatile("\t" ASM_STAC "\n"
+		     "1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
+		     "2:\t" ASM_CLAC "\n"
+		     "\t.section .fixup, \"ax\"\n"
 		     "3:\tmov     %3, %0\n"
 		     "\tjmp     2b\n"
 		     "\t.previous\n"
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 3989c24..8d3120f 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -58,13 +58,13 @@
 
 #ifdef CONFIG_X86_SMAP
 
-static inline void clac(void)
+static __always_inline void clac(void)
 {
 	/* Note: a barrier is implicit in alternative() */
 	alternative(ASM_NOP3, __stringify(__ASM_CLAC), X86_FEATURE_SMAP);
 }
 
-static inline void stac(void)
+static __always_inline void stac(void)
 {
 	/* Note: a barrier is implicit in alternative() */
 	alternative(ASM_NOP3, __stringify(__ASM_STAC), X86_FEATURE_SMAP);
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 2c7df3d..b92ece1 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -9,6 +9,7 @@
 #include <linux/string.h>
 #include <asm/asm.h>
 #include <asm/page.h>
+#include <asm/smap.h>
 
 #define VERIFY_READ 0
 #define VERIFY_WRITE 1
@@ -192,9 +193,10 @@ extern int __get_user_bad(void);
 
 #ifdef CONFIG_X86_32
 #define __put_user_asm_u64(x, addr, err, errret)			\
-	asm volatile("1:	movl %%eax,0(%2)\n"			\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	movl %%eax,0(%2)\n"			\
 		     "2:	movl %%edx,4(%2)\n"			\
-		     "3:\n"						\
+		     "3: " ASM_CLAC "\n"				\
 		     ".section .fixup,\"ax\"\n"				\
 		     "4:	movl %3,%0\n"				\
 		     "	jmp 3b\n"					\
@@ -205,9 +207,10 @@ extern int __get_user_bad(void);
 		     : "A" (x), "r" (addr), "i" (errret), "0" (err))
 
 #define __put_user_asm_ex_u64(x, addr)					\
-	asm volatile("1:	movl %%eax,0(%1)\n"			\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	movl %%eax,0(%1)\n"			\
 		     "2:	movl %%edx,4(%1)\n"			\
-		     "3:\n"						\
+		     "3: " ASM_CLAC "\n"				\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     _ASM_EXTABLE_EX(2b, 3b)				\
 		     : : "A" (x), "r" (addr))
@@ -379,8 +382,9 @@ do {									\
 } while (0)
 
 #define __get_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
-	asm volatile("1:	mov"itype" %2,%"rtype"1\n"		\
-		     "2:\n"						\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	mov"itype" %2,%"rtype"1\n"		\
+		     "2: " ASM_CLAC "\n"				\
 		     ".section .fixup,\"ax\"\n"				\
 		     "3:	mov %3,%0\n"				\
 		     "	xor"itype" %"rtype"1,%"rtype"1\n"		\
@@ -412,8 +416,9 @@ do {									\
 } while (0)
 
 #define __get_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile("1:	mov"itype" %1,%"rtype"0\n"		\
-		     "2:\n"						\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	mov"itype" %1,%"rtype"0\n"		\
+		     "2: " ASM_CLAC "\n"				\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     : ltype(x) : "m" (__m(addr)))
 
@@ -443,8 +448,9 @@ struct __large_struct { unsigned long buf[100]; };
  * aliasing issues.
  */
 #define __put_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
-	asm volatile("1:	mov"itype" %"rtype"1,%2\n"		\
-		     "2:\n"						\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	mov"itype" %"rtype"1,%2\n"		\
+		     "2: " ASM_CLAC "\n"				\
 		     ".section .fixup,\"ax\"\n"				\
 		     "3:	mov %3,%0\n"				\
 		     "	jmp 2b\n"					\
@@ -454,8 +460,9 @@ struct __large_struct { unsigned long buf[100]; };
 		     : ltype(x), "m" (__m(addr)), "i" (errret), "0" (err))
 
 #define __put_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile("1:	mov"itype" %"rtype"0,%1\n"		\
-		     "2:\n"						\
+	asm volatile(ASM_STAC "\n"					\
+		     "1:	mov"itype" %"rtype"0,%1\n"		\
+		     "2: " ASM_CLAC "\n"				\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     : : ltype(x), "m" (__m(addr)))
 
diff --git a/arch/x86/include/asm/xsave.h b/arch/x86/include/asm/xsave.h
index 8a1b6f9..2a923bd 100644
--- a/arch/x86/include/asm/xsave.h
+++ b/arch/x86/include/asm/xsave.h
@@ -74,8 +74,9 @@ static inline int xsave_user(struct xsave_struct __user *buf)
 	if (unlikely(err))
 		return -EFAULT;
 
-	__asm__ __volatile__("1: .byte " REX_PREFIX "0x0f,0xae,0x27\n"
-			     "2:\n"
+	__asm__ __volatile__(ASM_STAC "\n"
+			     "1: .byte " REX_PREFIX "0x0f,0xae,0x27\n"
+			     "2: " ASM_CLAC "\n"
 			     ".section .fixup,\"ax\"\n"
 			     "3:  movl $-1,%[err]\n"
 			     "    jmp  2b\n"
@@ -97,8 +98,9 @@ static inline int xrestore_user(struct xsave_struct __user *buf, u64 mask)
 	u32 lmask = mask;
 	u32 hmask = mask >> 32;
 
-	__asm__ __volatile__("1: .byte " REX_PREFIX "0x0f,0xae,0x2f\n"
-			     "2:\n"
+	__asm__ __volatile__(ASM_STAC "\n"
+			     "1: .byte " REX_PREFIX "0x0f,0xae,0x2f\n"
+			     "2: " ASM_CLAC "\n"
 			     ".section .fixup,\"ax\"\n"
 			     "3:  movl $-1,%[err]\n"
 			     "    jmp  2b\n"
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a5fbc3c..cd43e52 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1113,7 +1113,8 @@ void syscall_init(void)
 
 	/* Flags to clear on syscall */
 	wrmsrl(MSR_SYSCALL_MASK,
-	       X86_EFLAGS_TF|X86_EFLAGS_DF|X86_EFLAGS_IF|X86_EFLAGS_IOPL);
+	       X86_EFLAGS_TF|X86_EFLAGS_DF|X86_EFLAGS_IF|
+	       X86_EFLAGS_IOPL|X86_EFLAGS_AC);
 }
 
 unsigned long kernel_eflags;
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 69babd8..ce87e3d 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -56,6 +56,7 @@
 #include <asm/ftrace.h>
 #include <asm/percpu.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 #include <linux/err.h>
 
 /* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this.  */
@@ -465,7 +466,8 @@ END(ret_from_fork)
  * System call entry. Up to 6 arguments in registers are supported.
  *
  * SYSCALL does not save anything on the stack and does not change the
- * stack pointer.
+ * stack pointer.  However, it does mask the flags register for us, so
+ * CLD and CLAC are not needed.
  */
 
 /*
@@ -884,6 +886,7 @@ END(interrupt)
 	 */
 	.p2align CONFIG_X86_L1_CACHE_SHIFT
 common_interrupt:
+	ASM_CLAC
 	XCPT_FRAME
 	addq $-0x80,(%rsp)		/* Adjust vector to [-256,-1] range */
 	interrupt do_IRQ
@@ -1023,6 +1026,7 @@ END(common_interrupt)
  */
 .macro apicinterrupt num sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	INTR_FRAME
 	pushq_cfi $~(\num)
 .Lcommon_\sym:
@@ -1077,6 +1081,7 @@ apicinterrupt IRQ_WORK_VECTOR \
  */
 .macro zeroentry sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	INTR_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	pushq_cfi $-1		/* ORIG_RAX: no syscall to restart */
@@ -1094,6 +1099,7 @@ END(\sym)
 
 .macro paranoidzeroentry sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	INTR_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	pushq_cfi $-1		/* ORIG_RAX: no syscall to restart */
@@ -1112,6 +1118,7 @@ END(\sym)
 #define INIT_TSS_IST(x) PER_CPU_VAR(init_tss) + (TSS_ist + ((x) - 1) * 8)
 .macro paranoidzeroentry_ist sym do_sym ist
 ENTRY(\sym)
+	ASM_CLAC
 	INTR_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	pushq_cfi $-1		/* ORIG_RAX: no syscall to restart */
@@ -1131,6 +1138,7 @@ END(\sym)
 
 .macro errorentry sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	XCPT_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	subq $ORIG_RAX-R15, %rsp
@@ -1149,6 +1157,7 @@ END(\sym)
 	/* error code is on the stack already */
 .macro paranoiderrorentry sym do_sym
 ENTRY(\sym)
+	ASM_CLAC
 	XCPT_FRAME
 	PARAVIRT_ADJUST_EXCEPTION_FRAME
 	subq $ORIG_RAX-R15, %rsp
diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index 5b2995f..a30ca15 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -17,6 +17,7 @@
 #include <asm/cpufeature.h>
 #include <asm/alternative-asm.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 
 /*
  * By placing feature2 after feature1 in altinstructions section, we logically
@@ -130,6 +131,7 @@ ENDPROC(bad_from_user)
  */
 ENTRY(copy_user_generic_unrolled)
 	CFI_STARTPROC
+	ASM_STAC
 	cmpl $8,%edx
 	jb 20f		/* less then 8 bytes, go to byte copy loop */
 	ALIGN_DESTINATION
@@ -177,6 +179,7 @@ ENTRY(copy_user_generic_unrolled)
 	decl %ecx
 	jnz 21b
 23:	xor %eax,%eax
+	ASM_CLAC
 	ret
 
 	.section .fixup,"ax"
@@ -232,6 +235,7 @@ ENDPROC(copy_user_generic_unrolled)
  */
 ENTRY(copy_user_generic_string)
 	CFI_STARTPROC
+	ASM_STAC
 	andl %edx,%edx
 	jz 4f
 	cmpl $8,%edx
@@ -246,6 +250,7 @@ ENTRY(copy_user_generic_string)
 3:	rep
 	movsb
 4:	xorl %eax,%eax
+	ASM_CLAC
 	ret
 
 	.section .fixup,"ax"
@@ -273,12 +278,14 @@ ENDPROC(copy_user_generic_string)
  */
 ENTRY(copy_user_enhanced_fast_string)
 	CFI_STARTPROC
+	ASM_STAC
 	andl %edx,%edx
 	jz 2f
 	movl %edx,%ecx
 1:	rep
 	movsb
 2:	xorl %eax,%eax
+	ASM_CLAC
 	ret
 
 	.section .fixup,"ax"
diff --git a/arch/x86/lib/copy_user_nocache_64.S b/arch/x86/lib/copy_user_nocache_64.S
index cacddc7..6a4f43c 100644
--- a/arch/x86/lib/copy_user_nocache_64.S
+++ b/arch/x86/lib/copy_user_nocache_64.S
@@ -15,6 +15,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/thread_info.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 
 	.macro ALIGN_DESTINATION
 #ifdef FIX_ALIGNMENT
@@ -48,6 +49,7 @@
  */
 ENTRY(__copy_user_nocache)
 	CFI_STARTPROC
+	ASM_STAC
 	cmpl $8,%edx
 	jb 20f		/* less then 8 bytes, go to byte copy loop */
 	ALIGN_DESTINATION
@@ -95,6 +97,7 @@ ENTRY(__copy_user_nocache)
 	decl %ecx
 	jnz 21b
 23:	xorl %eax,%eax
+	ASM_CLAC
 	sfence
 	ret
 
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index b33b1fb..156b9c8 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -33,6 +33,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/thread_info.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 
 	.text
 ENTRY(__get_user_1)
@@ -40,8 +41,10 @@ ENTRY(__get_user_1)
 	GET_THREAD_INFO(%_ASM_DX)
 	cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
 	jae bad_get_user
+	ASM_STAC
 1:	movzb (%_ASM_AX),%edx
 	xor %eax,%eax
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 ENDPROC(__get_user_1)
@@ -53,8 +56,10 @@ ENTRY(__get_user_2)
 	GET_THREAD_INFO(%_ASM_DX)
 	cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
 	jae bad_get_user
+	ASM_STAC
 2:	movzwl -1(%_ASM_AX),%edx
 	xor %eax,%eax
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 ENDPROC(__get_user_2)
@@ -66,8 +71,10 @@ ENTRY(__get_user_4)
 	GET_THREAD_INFO(%_ASM_DX)
 	cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
 	jae bad_get_user
+	ASM_STAC
 3:	mov -3(%_ASM_AX),%edx
 	xor %eax,%eax
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 ENDPROC(__get_user_4)
@@ -80,8 +87,10 @@ ENTRY(__get_user_8)
 	GET_THREAD_INFO(%_ASM_DX)
 	cmp TI_addr_limit(%_ASM_DX),%_ASM_AX
 	jae	bad_get_user
+	ASM_STAC
 4:	movq -7(%_ASM_AX),%_ASM_DX
 	xor %eax,%eax
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 ENDPROC(__get_user_8)
@@ -91,6 +100,7 @@ bad_get_user:
 	CFI_STARTPROC
 	xor %edx,%edx
 	mov $(-EFAULT),%_ASM_AX
+	ASM_CLAC
 	ret
 	CFI_ENDPROC
 END(bad_get_user)
diff --git a/arch/x86/lib/putuser.S b/arch/x86/lib/putuser.S
index 7f951c8..fc6ba17 100644
--- a/arch/x86/lib/putuser.S
+++ b/arch/x86/lib/putuser.S
@@ -15,6 +15,7 @@
 #include <asm/thread_info.h>
 #include <asm/errno.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 
 
 /*
@@ -31,7 +32,8 @@
 
 #define ENTER	CFI_STARTPROC ; \
 		GET_THREAD_INFO(%_ASM_BX)
-#define EXIT	ret ; \
+#define EXIT	ASM_CLAC ;	\
+		ret ;		\
 		CFI_ENDPROC
 
 .text
@@ -39,6 +41,7 @@ ENTRY(__put_user_1)
 	ENTER
 	cmp TI_addr_limit(%_ASM_BX),%_ASM_CX
 	jae bad_put_user
+	ASM_STAC
 1:	movb %al,(%_ASM_CX)
 	xor %eax,%eax
 	EXIT
@@ -50,6 +53,7 @@ ENTRY(__put_user_2)
 	sub $1,%_ASM_BX
 	cmp %_ASM_BX,%_ASM_CX
 	jae bad_put_user
+	ASM_STAC
 2:	movw %ax,(%_ASM_CX)
 	xor %eax,%eax
 	EXIT
@@ -61,6 +65,7 @@ ENTRY(__put_user_4)
 	sub $3,%_ASM_BX
 	cmp %_ASM_BX,%_ASM_CX
 	jae bad_put_user
+	ASM_STAC
 3:	movl %eax,(%_ASM_CX)
 	xor %eax,%eax
 	EXIT
@@ -72,6 +77,7 @@ ENTRY(__put_user_8)
 	sub $7,%_ASM_BX
 	cmp %_ASM_BX,%_ASM_CX
 	jae bad_put_user
+	ASM_STAC
 4:	mov %_ASM_AX,(%_ASM_CX)
 #ifdef CONFIG_X86_32
 5:	movl %edx,4(%_ASM_CX)
diff --git a/arch/x86/lib/usercopy_32.c b/arch/x86/lib/usercopy_32.c
index 1781b2f..98f6d6b6 100644
--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -42,10 +42,11 @@ do {									\
 	int __d0;							\
 	might_fault();							\
 	__asm__ __volatile__(						\
+		ASM_STAC "\n"						\
 		"0:	rep; stosl\n"					\
 		"	movl %2,%0\n"					\
 		"1:	rep; stosb\n"					\
-		"2:\n"							\
+		"2: " ASM_CLAC "\n"					\
 		".section .fixup,\"ax\"\n"				\
 		"3:	lea 0(%2,%0,4),%0\n"				\
 		"	jmp 2b\n"					\
@@ -626,10 +627,12 @@ survive:
 		return n;
 	}
 #endif
+	stac();
 	if (movsl_is_ok(to, from, n))
 		__copy_user(to, from, n);
 	else
 		n = __copy_user_intel(to, from, n);
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_to_user_ll);
@@ -637,10 +640,12 @@ EXPORT_SYMBOL(__copy_to_user_ll);
 unsigned long __copy_from_user_ll(void *to, const void __user *from,
 					unsigned long n)
 {
+	stac();
 	if (movsl_is_ok(to, from, n))
 		__copy_user_zeroing(to, from, n);
 	else
 		n = __copy_user_zeroing_intel(to, from, n);
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_from_user_ll);
@@ -648,11 +653,13 @@ EXPORT_SYMBOL(__copy_from_user_ll);
 unsigned long __copy_from_user_ll_nozero(void *to, const void __user *from,
 					 unsigned long n)
 {
+	stac();
 	if (movsl_is_ok(to, from, n))
 		__copy_user(to, from, n);
 	else
 		n = __copy_user_intel((void __user *)to,
 				      (const void *)from, n);
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_from_user_ll_nozero);
@@ -660,6 +667,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nozero);
 unsigned long __copy_from_user_ll_nocache(void *to, const void __user *from,
 					unsigned long n)
 {
+	stac();
 #ifdef CONFIG_X86_INTEL_USERCOPY
 	if (n > 64 && cpu_has_xmm2)
 		n = __copy_user_zeroing_intel_nocache(to, from, n);
@@ -668,6 +676,7 @@ unsigned long __copy_from_user_ll_nocache(void *to, const void __user *from,
 #else
 	__copy_user_zeroing(to, from, n);
 #endif
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_from_user_ll_nocache);
@@ -675,6 +684,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocache);
 unsigned long __copy_from_user_ll_nocache_nozero(void *to, const void __user *from,
 					unsigned long n)
 {
+	stac();
 #ifdef CONFIG_X86_INTEL_USERCOPY
 	if (n > 64 && cpu_has_xmm2)
 		n = __copy_user_intel_nocache(to, from, n);
@@ -683,6 +693,7 @@ unsigned long __copy_from_user_ll_nocache_nozero(void *to, const void __user *fr
 #else
 	__copy_user(to, from, n);
 #endif
+	clac();
 	return n;
 }
 EXPORT_SYMBOL(__copy_from_user_ll_nocache_nozero);
diff --git a/arch/x86/lib/usercopy_64.c b/arch/x86/lib/usercopy_64.c
index e5b130b..05928aa 100644
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -18,6 +18,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size)
 	might_fault();
 	/* no memory constraint because it doesn't change any memory gcc knows
 	   about */
+	stac();
 	asm volatile(
 		"	testq  %[size8],%[size8]\n"
 		"	jz     4f\n"
@@ -40,6 +41,7 @@ unsigned long __clear_user(void __user *addr, unsigned long size)
 		: [size8] "=&c"(size), [dst] "=&D" (__d0)
 		: [size1] "r"(size & 7), "[size8]" (size / 8), "[dst]"(addr),
 		  [zero] "r" (0UL), [eight] "r" (8UL));
+	clac();
 	return size;
 }
 EXPORT_SYMBOL(__clear_user);
@@ -82,5 +84,6 @@ copy_user_handle_tail(char *to, char *from, unsigned len, unsigned zerorest)
 	for (c = 0, zero_len = len; zerorest && zero_len; --zero_len)
 		if (__put_user_nocheck(c, to++, sizeof(char)))
 			break;
+	clac();
 	return len;
 }

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, smap: Turn on Supervisor Mode Access Prevention
  2012-09-21 19:43 ` [PATCH 09/11] x86, smap: Turn on Supervisor Mode Access Prevention H. Peter Anvin
@ 2012-09-21 20:05   ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 20:05 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  52b6179ac87d33c2eeaff5292786a10fe98cff64
Gitweb:     http://git.kernel.org/tip/52b6179ac87d33c2eeaff5292786a10fe98cff64
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:13 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:27 -0700

x86, smap: Turn on Supervisor Mode Access Prevention

If Supervisor Mode Access Prevention is available and not disabled by
the user, turn it on.  Also fix the expansion of SMEP (Supervisor Mode
Execution Prevention.)

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-10-git-send-email-hpa@linux.intel.com
---
 Documentation/kernel-parameters.txt |    6 +++++-
 arch/x86/kernel/cpu/common.c        |   26 ++++++++++++++++++++++++++
 2 files changed, 31 insertions(+), 1 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index ad7e2e5..49c5c41 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1812,8 +1812,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			noexec=on: enable non-executable mappings (default)
 			noexec=off: disable non-executable mappings
 
+	nosmap		[X86]
+			Disable SMAP (Supervisor Mode Access Prevention)
+			even if it is supported by processor.
+
 	nosmep		[X86]
-			Disable SMEP (Supervisor Mode Execution Protection)
+			Disable SMEP (Supervisor Mode Execution Prevention)
 			even if it is supported by processor.
 
 	noexec32	[X86-64]
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index cd43e52..7d35d65 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -278,6 +278,31 @@ static __cpuinit void setup_smep(struct cpuinfo_x86 *c)
 	}
 }
 
+static int disable_smap __cpuinitdata;
+static __init int setup_disable_smap(char *arg)
+{
+	disable_smap = 1;
+	return 1;
+}
+__setup("nosmap", setup_disable_smap);
+
+static __cpuinit void setup_smap(struct cpuinfo_x86 *c)
+{
+	if (cpu_has(c, X86_FEATURE_SMAP)) {
+		if (unlikely(disable_smap)) {
+			setup_clear_cpu_cap(X86_FEATURE_SMAP);
+			clear_in_cr4(X86_CR4_SMAP);
+		} else {
+			set_in_cr4(X86_CR4_SMAP);
+			/*
+			 * Don't use clac() here since alternatives
+			 * haven't run yet...
+			 */
+			asm volatile(__stringify(__ASM_CLAC) ::: "memory");
+		}
+	}
+}
+
 /*
  * Some CPU features depend on higher CPUID levels, which may not always
  * be available due to CPUID level capping or broken virtualization
@@ -713,6 +738,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 	filter_cpuid_features(c, false);
 
 	setup_smep(c);
+	setup_smap(c);
 
 	if (this_cpu->c_bsp_init)
 		this_cpu->c_bsp_init(c);

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, smap: A page fault due to SMAP is an oops
  2012-09-21 19:43 ` [PATCH 10/11] x86, smap: A page fault due to SMAP is an oops H. Peter Anvin
@ 2012-09-21 20:06   ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 20:06 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  40d3cd6695014bf3c44e2ca66b610b18acaf923d
Gitweb:     http://git.kernel.org/tip/40d3cd6695014bf3c44e2ca66b610b18acaf923d
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:14 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:27 -0700

x86, smap: A page fault due to SMAP is an oops

If we get a page fault due to SMAP, trigger an oops rather than
spinning forever.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-11-git-send-email-hpa@linux.intel.com
---
 arch/x86/mm/fault.c |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 76dcd9d..f2fb75d 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -995,6 +995,17 @@ static int fault_in_kernel_space(unsigned long address)
 	return address >= TASK_SIZE_MAX;
 }
 
+static inline bool smap_violation(int error_code, struct pt_regs *regs)
+{
+	if (error_code & PF_USER)
+		return false;
+
+	if (!user_mode_vm(regs) && (regs->flags & X86_EFLAGS_AC))
+		return false;
+
+	return true;
+}
+
 /*
  * This routine handles page faults.  It determines the address,
  * and the problem, and then passes it off to one of the appropriate
@@ -1088,6 +1099,13 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
 	if (unlikely(error_code & PF_RSVD))
 		pgtable_bad(regs, error_code, address);
 
+	if (static_cpu_has(X86_FEATURE_SMAP)) {
+		if (unlikely(smap_violation(error_code, regs))) {
+			bad_area_nosemaphore(regs, error_code, address);
+			return;
+		}
+	}
+
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
 	/*

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, smap: Reduce the SMAP overhead for signal handling
  2012-09-21 19:43 ` [PATCH 11/11] x86, smap: Reduce the SMAP overhead for signal handling H. Peter Anvin
@ 2012-09-21 20:07   ` tip-bot for H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-21 20:07 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  5e88353d8b5f483bc1c873ad24ac2b59a6b66c73
Gitweb:     http://git.kernel.org/tip/5e88353d8b5f483bc1c873ad24ac2b59a6b66c73
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 12:43:15 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 12:45:27 -0700

x86, smap: Reduce the SMAP overhead for signal handling

Signal handling contains a bunch of accesses to individual user space
items, which causes an excessive number of STAC and CLAC
instructions.  Instead, let get/put_user_try ... get/put_user_catch()
contain the STAC and CLAC instructions.

This means that get/put_user_try no longer nests, and furthermore that
it is no longer legal to use user space access functions other than
__get/put_user_ex() inside those blocks.  However, these macros are
x86-specific anyway and are only used in the signal-handling paths; a
simple reordering that moves the larger subroutine calls out of the
try...catch blocks resolves that problem.
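
In outline (hypothetical function, heavily abridged from the signal
paths), the rule becomes: only the *_ex() accessors stay inside the
try/catch block, which now maps to a single STAC ... CLAC region,
while helpers that do their own user-access bracketing move after
put_user_catch():

	#include <linux/types.h>
	#include <linux/uaccess.h>

	static int ex_store(u32 __user *ua, void __user *dst,
			    const void *src, size_t len)
	{
		int err = 0;

		put_user_try {
			put_user_ex(1, ua);		/* cheap field stores ... */
			put_user_ex(2, ua + 1);		/* ... stay in the block  */
		} put_user_catch(err);

		if (copy_to_user(dst, src, len))	/* moved out of the block */
			err = -EFAULT;

		return err;
	}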

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-12-git-send-email-hpa@linux.intel.com
---
 arch/x86/ia32/ia32_signal.c    |   12 +++++++-----
 arch/x86/include/asm/uaccess.h |   14 ++++++--------
 arch/x86/kernel/signal.c       |   24 ++++++++++++++----------
 3 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
index 673ac9b..05e62a3 100644
--- a/arch/x86/ia32/ia32_signal.c
+++ b/arch/x86/ia32/ia32_signal.c
@@ -250,11 +250,12 @@ static int ia32_restore_sigcontext(struct pt_regs *regs,
 
 		get_user_ex(tmp, &sc->fpstate);
 		buf = compat_ptr(tmp);
-		err |= restore_i387_xstate_ia32(buf);
 
 		get_user_ex(*pax, &sc->ax);
 	} get_user_catch(err);
 
+	err |= restore_i387_xstate_ia32(buf);
+
 	return err;
 }
 
@@ -502,7 +503,6 @@ int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sig, &frame->sig);
 		put_user_ex(ptr_to_compat(&frame->info), &frame->pinfo);
 		put_user_ex(ptr_to_compat(&frame->uc), &frame->puc);
-		err |= copy_siginfo_to_user32(&frame->info, info);
 
 		/* Create the ucontext.  */
 		if (cpu_has_xsave)
@@ -514,9 +514,6 @@ int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sas_ss_flags(regs->sp),
 			    &frame->uc.uc_stack.ss_flags);
 		put_user_ex(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
-		err |= ia32_setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
-					     regs, set->sig[0]);
-		err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
 
 		if (ka->sa.sa_flags & SA_RESTORER)
 			restorer = ka->sa.sa_restorer;
@@ -532,6 +529,11 @@ int ia32_setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(*((u64 *)&code), (u64 *)frame->retcode);
 	} put_user_catch(err);
 
+	err |= copy_siginfo_to_user32(&frame->info, info);
+	err |= ia32_setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
+				     regs, set->sig[0]);
+	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+
 	if (err)
 		return -EFAULT;
 
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index b92ece1..a91acfb 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -416,9 +416,8 @@ do {									\
 } while (0)
 
 #define __get_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile(ASM_STAC "\n"					\
-		     "1:	mov"itype" %1,%"rtype"0\n"		\
-		     "2: " ASM_CLAC "\n"				\
+	asm volatile("1:	mov"itype" %1,%"rtype"0\n"		\
+		     "2:\n"						\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     : ltype(x) : "m" (__m(addr)))
 
@@ -460,9 +459,8 @@ struct __large_struct { unsigned long buf[100]; };
 		     : ltype(x), "m" (__m(addr)), "i" (errret), "0" (err))
 
 #define __put_user_asm_ex(x, addr, itype, rtype, ltype)			\
-	asm volatile(ASM_STAC "\n"					\
-		     "1:	mov"itype" %"rtype"0,%1\n"		\
-		     "2: " ASM_CLAC "\n"				\
+	asm volatile("1:	mov"itype" %"rtype"0,%1\n"		\
+		     "2:\n"						\
 		     _ASM_EXTABLE_EX(1b, 2b)				\
 		     : : ltype(x), "m" (__m(addr)))
 
@@ -470,13 +468,13 @@ struct __large_struct { unsigned long buf[100]; };
  * uaccess_try and catch
  */
 #define uaccess_try	do {						\
-	int prev_err = current_thread_info()->uaccess_err;		\
 	current_thread_info()->uaccess_err = 0;				\
+	stac();								\
 	barrier();
 
 #define uaccess_catch(err)						\
+	clac();								\
 	(err) |= (current_thread_info()->uaccess_err ? -EFAULT : 0);	\
-	current_thread_info()->uaccess_err = prev_err;			\
 } while (0)
 
 /**
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index b280908..9326128 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -114,11 +114,12 @@ int restore_sigcontext(struct pt_regs *regs, struct sigcontext __user *sc,
 		regs->orig_ax = -1;		/* disable syscall checks */
 
 		get_user_ex(buf, &sc->fpstate);
-		err |= restore_i387_xstate(buf);
 
 		get_user_ex(*pax, &sc->ax);
 	} get_user_catch(err);
 
+	err |= restore_i387_xstate(buf);
+
 	return err;
 }
 
@@ -357,7 +358,6 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sig, &frame->sig);
 		put_user_ex(&frame->info, &frame->pinfo);
 		put_user_ex(&frame->uc, &frame->puc);
-		err |= copy_siginfo_to_user(&frame->info, info);
 
 		/* Create the ucontext.  */
 		if (cpu_has_xsave)
@@ -369,9 +369,6 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sas_ss_flags(regs->sp),
 			    &frame->uc.uc_stack.ss_flags);
 		put_user_ex(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
-		err |= setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
-					regs, set->sig[0]);
-		err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
 
 		/* Set up to return from userspace.  */
 		restorer = VDSO32_SYMBOL(current->mm->context.vdso, rt_sigreturn);
@@ -389,6 +386,11 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(*((u64 *)&rt_retcode), (u64 *)frame->retcode);
 	} put_user_catch(err);
 
+	err |= copy_siginfo_to_user(&frame->info, info);
+	err |= setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
+				regs, set->sig[0]);
+	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+
 	if (err)
 		return -EFAULT;
 
@@ -436,8 +438,6 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		put_user_ex(sas_ss_flags(regs->sp),
 			    &frame->uc.uc_stack.ss_flags);
 		put_user_ex(me->sas_ss_size, &frame->uc.uc_stack.ss_size);
-		err |= setup_sigcontext(&frame->uc.uc_mcontext, fp, regs, set->sig[0]);
-		err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
 
 		/* Set up to return from userspace.  If provided, use a stub
 		   already in userspace.  */
@@ -450,6 +450,9 @@ static int __setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 		}
 	} put_user_catch(err);
 
+	err |= setup_sigcontext(&frame->uc.uc_mcontext, fp, regs, set->sig[0]);
+	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+
 	if (err)
 		return -EFAULT;
 
@@ -855,9 +858,6 @@ static int x32_setup_rt_frame(int sig, struct k_sigaction *ka,
 			    &frame->uc.uc_stack.ss_flags);
 		put_user_ex(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
 		put_user_ex(0, &frame->uc.uc__pad0);
-		err |= setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
-					regs, set->sig[0]);
-		err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
 
 		if (ka->sa.sa_flags & SA_RESTORER) {
 			restorer = ka->sa.sa_restorer;
@@ -869,6 +869,10 @@ static int x32_setup_rt_frame(int sig, struct k_sigaction *ka,
 		put_user_ex(restorer, &frame->pretcode);
 	} put_user_catch(err);
 
+	err |= setup_sigcontext(&frame->uc.uc_mcontext, fpstate,
+				regs, set->sig[0]);
+	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
+
 	if (err)
 		return -EFAULT;
 

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 19:54 ` [PATCH 00/11] x86: Supervisor Mode Access Prevention Linus Torvalds
  2012-09-21 19:57   ` H. Peter Anvin
@ 2012-09-21 20:08   ` Ingo Molnar
  2012-09-21 21:03     ` H. Peter Anvin
  1 sibling, 1 reply; 56+ messages in thread
From: Ingo Molnar @ 2012-09-21 20:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Linux Kernel Mailing List, H. Peter Anvin,
	Thomas Gleixner, Kees Cook, Linda Wang, Matt Fleming


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, Sep 21, 2012 at 12:43 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>
> > Supervisor Mode Access Prevention (SMAP) is a new security 
> > feature disclosed by Intel in revision 014 of the Intel® 
> > Architecture Instruction Set Extensions Programming 
> > Reference:
> 
> Looks good.
> 
> Did this find any bugs, btw? We've had a few cases where we 
> forgot to use the proper user access function, and code just 
> happened to work because it all boils down to the same thing 
> and never got any page faults in practice anyway..

The 4g:4g patch swept out most of the historic ones - so what 
we have are perhaps newer bugs (but those should be pretty rare, 
most new features are cross-arch).

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 20:08   ` Ingo Molnar
@ 2012-09-21 21:03     ` H. Peter Anvin
  2012-09-21 21:09       ` Linus Torvalds
  0 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 21:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Linux Kernel Mailing List, H. Peter Anvin,
	Thomas Gleixner, Kees Cook, Linda Wang, Matt Fleming

On 09/21/2012 01:08 PM, Ingo Molnar wrote:
> 
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
>> On Fri, Sep 21, 2012 at 12:43 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>>
>>> Supervisor Mode Access Prevention (SMAP) is a new security 
>>> feature disclosed by Intel in revision 014 of the Intel® 
>>> Architecture Instruction Set Extensions Programming 
>>> Reference:
>>
>> Looks good.
>>
>> Did this find any bugs, btw? We've had a few cases where we 
>> forgot to use the proper user access function, and code just 
>> happened to work because it all boils down to the same thing 
>> and never got any page faults in practice anyway..
> 
> The 4g:4g patch sweeped out most of the historic ones - so what 
> we have are perhaps newer bugs (but those should be pretty rare, 
> most new features are cross-arch).
> 

A while ago I also did a mockup patch which switched %cr3 to
swapper_pg_dir while entering the kernel (basically where the CLAC
instructions go, plus the SYSCALL path; a restore was obviously needed,
too.)  The performance was atrocious, but I don't remember running into
any problems.
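
In rough C terms, the experiment amounted to something like this on
every kernel entry (illustrative only; the real mockup patched the
entry assembly, and none of the lines below are code from that patch):

	/* Sketch of the mockup described above, not real kernel code. */
	unsigned long user_cr3 = read_cr3();	/* remember the user page tables */
	write_cr3(__pa(swapper_pg_dir));	/* switch to kernel-only tables  */
	/* ... handle the syscall or exception ... */
	write_cr3(user_cr3);			/* restore on return to user     */

Every one of those %cr3 writes flushes the TLB, which is where the
atrocious performance came from.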

	-hpa


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 21:03     ` H. Peter Anvin
@ 2012-09-21 21:09       ` Linus Torvalds
  2012-09-21 21:12         ` H. Peter Anvin
  0 siblings, 1 reply; 56+ messages in thread
From: Linus Torvalds @ 2012-09-21 21:09 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, Linux Kernel Mailing List, H. Peter Anvin,
	Thomas Gleixner, Kees Cook, Linda Wang, Matt Fleming

On Fri, Sep 21, 2012 at 2:03 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>
> A while ago I also did a mockup patch which switched %cr3 to
> swapper_pg_dir while entering the kernel (basically where the CLAC
> instructions go, plus the SYSCALL path; a restore was obviously needed,
> too.)  The performance was atrocious, but I didn't remember running into
> any problems.

Well, they are bound to be corner-cases and unusual. I was thinking of
problems like the one recently fixed in commit ed6fe9d614fc ("Fix
order of arguments to compat_put_time[spec|val]"), which really
requires compat handling of fairly unusual cases.

That's the kind of situation where I'd expect bugs might still lurk.
And it would only get triggered by some rather unusual setups.

             Linus

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 21:09       ` Linus Torvalds
@ 2012-09-21 21:12         ` H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 21:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Ingo Molnar, Linux Kernel Mailing List,
	Thomas Gleixner, Kees Cook, Linda Wang, Matt Fleming

On 09/21/2012 02:09 PM, Linus Torvalds wrote:
> On Fri, Sep 21, 2012 at 2:03 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>>
>> A while ago I also did a mockup patch which switched %cr3 to
>> swapper_pg_dir while entering the kernel (basically where the CLAC
>> instructions go, plus the SYSCALL path; a restore was obviously needed,
>> too.)  The performance was atrocious, but I didn't remember running into
>> any problems.
> 
> Well, they are bound to be corner-cases and unusual. I was thinking of
> problems like the one recently fixed in commit ed6fe9d614fc ("Fix
> order of arguments to compat_put_time[spec|val]"), which really
> requires compat handling of fairly unusual cases.
> 
> That's the kind of situation where I'd expect bugs might still lurk.
> And it would only get triggered by some rather unusual setups.
> 

Yes; in *most* cases these are exploitable security bugs on non-SMAP
hardware (which is obviously the whole point!), but there are a few
conditions where there may be issues that aren't exploitable.

	-hpa




^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (11 preceding siblings ...)
  2012-09-21 19:54 ` [PATCH 00/11] x86: Supervisor Mode Access Prevention Linus Torvalds
@ 2012-09-21 22:07 ` Eric W. Biederman
  2012-09-21 22:12   ` H. Peter Anvin
  2012-09-21 22:08 ` [PATCH 00/11] x86: Supervisor Mode Access Prevention Dave Jones
  13 siblings, 1 reply; 56+ messages in thread
From: Eric W. Biederman @ 2012-09-21 22:07 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Linus Torvalds, Kees Cook, Linda Wang,
	Matt Fleming

"H. Peter Anvin" <hpa@linux.intel.com> writes:

> Supervisor Mode Access Prevention (SMAP) is a new security feature
> disclosed by Intel in revision 014 of the Intel® Architecture
> Instruction Set Extensions Programming Reference:
>
> http://software.intel.com/sites/default/files/319433-014.pdf
>
> When SMAP is active, the kernel cannot normally access pages that are
> user space (U=1).  Since the kernel does have the need to access user
> space pages under specific circumstances, an override is provided: the
> kernel can access user space pages if EFLAGS.AC=1.  For system data
> structures, e.g. descriptor tables, that are accessed by the processor
> directly, SMAP is active even in CPL 3 regardless of EFLAGS.AC.
>
> SMAP also includes two new instructions, STAC and CLAC, to flip the AC
> flag more quickly.

Have you tested kexec in this environment?

This is the kind of cpu feature where, when we enable it, we frequently
have to do something on the kexec path.

At a quick skim it looks like the kexec path is using kernel page table
entries and clearing all bits from cr4 except X86_CR4_PAE, so I don't
actually expect this change will require anything on the kexec path.
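
In C terms the cr4 handling there boils down to roughly the following
(a sketch only; the real work is done in the relocate_kernel assembly):

	/* Sketch of the kexec-path CR4 sanitizing described above. */
	unsigned long cr4 = read_cr4();
	write_cr4(cr4 & X86_CR4_PAE);	/* keep PAE, drop every opt-in bit,
					   including SMAP/SMEP if set */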

Eric

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
                   ` (12 preceding siblings ...)
  2012-09-21 22:07 ` Eric W. Biederman
@ 2012-09-21 22:08 ` Dave Jones
  2012-09-21 22:10   ` H. Peter Anvin
  13 siblings, 1 reply; 56+ messages in thread
From: Dave Jones @ 2012-09-21 22:08 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Linus Torvalds, Kees Cook, Linda Wang,
	Matt Fleming

On Fri, Sep 21, 2012 at 12:43:04PM -0700, H. Peter Anvin wrote:
 > Supervisor Mode Access Prevention (SMAP) is a new security feature
 > disclosed by Intel in revision 014 of the Intel® Architecture
 > Instruction Set Extensions Programming Reference:
 > 
 > http://software.intel.com/sites/default/files/319433-014.pdf
 > 
 > When SMAP is active, the kernel cannot normally access pages that are
 > user space (U=1).  Since the kernel does have the need to access user
 > space pages under specific circumstances, an override is provided: the
 > kernel can access user space pages if EFLAGS.AC=1.  For system data
 > structures, e.g. descriptor tables, that are accessed by the processor
 > directly, SMAP is active even in CPL 3 regardless of EFLAGS.AC.
 > 
 > SMAP also includes two new instructions, STAC and CLAC, to flip the AC
 > flag more quickly.

Perhaps add a printk somewhere to show that it's actually been enabled?

Also, would it be feasible to add something like we have for test_nx?
If this feature regresses in some way in the future, I suspect we'd like
to know about it sooner rather than later.
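
Even something minimal would do, e.g. (just a sketch, the exact wording
and placement are up for grabs):

	/* Sketch: announce the protections that actually got enabled. */
	if (boot_cpu_has(X86_FEATURE_SMAP))
		pr_info("SMAP: Supervisor Mode Access Prevention enabled\n");
	if (boot_cpu_has(X86_FEATURE_SMEP))
		pr_info("SMEP: Supervisor Mode Execution Prevention enabled\n");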

	Dave


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 22:08 ` [PATCH 00/11] x86: Supervisor Mode Access Prevention Dave Jones
@ 2012-09-21 22:10   ` H. Peter Anvin
  2012-09-22 11:32     ` Ingo Molnar
  0 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 22:10 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel Mailing List, H. Peter Anvin,
	Ingo Molnar, Thomas Gleixner, Linus Torvalds, Kees Cook,
	Linda Wang, Matt Fleming

On 09/21/2012 03:08 PM, Dave Jones wrote:
> 
> Perhaps add a printk somewhere to show that it's actually been enabled maybe ?
> 
> Also, would it be feasible to add something like we have for test_nx ?
> If this feature regresses in some way in the future, I suspect we'd like
> to know about it sooner rather than later.
> 

Good idea... should add this both for SMEP and SMAP.

	-hpa




^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 22:07 ` Eric W. Biederman
@ 2012-09-21 22:12   ` H. Peter Anvin
  2012-09-22  0:41     ` Eric W. Biederman
  0 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-21 22:12 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Linus Torvalds, Kees Cook, Linda Wang,
	Matt Fleming

On 09/21/2012 03:07 PM, Eric W. Biederman wrote:
> 
> Have you tested kexec in this environment?
> 
> This is the kind of cpu feature that when we enable it, frequently we
> have to do something on the kexec path.
> 
> At a quick skim it looks like the kexec path is using kernel page table
> entries and clearing all bits from cr4 except X86_CR4_PAE so I don't
> actually expect this change will require anything on the kexec path.
> 

I have not, no, but as you quite correctly point out that shouldn't
affect things.

We should also change the kernel to start clean with CR4 -- the purpose
of CR4 is to indicate which CPU features the OS is opting into.

I think we do on x86-64 but not on x86-32 at the moment.

This is an unrelated problem, though, and can be addressed later.

	-hpa

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86-32, smap: Add STAC/ CLAC instructions to 32-bit kernel entry
  2012-09-21 19:43 ` [PATCH 08/11] x86, smap: Add STAC and CLAC instructions to control user space access H. Peter Anvin
  2012-09-21 20:04   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
@ 2012-09-22  0:16   ` tip-bot for H. Peter Anvin
  1 sibling, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-22  0:16 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  e59d1b0a24199db01978e6c1e89859eda93ce683
Gitweb:     http://git.kernel.org/tip/e59d1b0a24199db01978e6c1e89859eda93ce683
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Fri, 21 Sep 2012 13:58:10 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Fri, 21 Sep 2012 14:04:27 -0700

x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry

The changes to entry_32.S got missed in checkin:

63bcff2a x86, smap: Add STAC and CLAC instructions to control user space access

The resulting kernel was largely functional but SMAP protection could
have been bypassed.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348256595-29119-9-git-send-email-hpa@linux.intel.com
---
 arch/x86/kernel/entry_32.S |   26 ++++++++++++++++++++++++++
 1 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/entry_32.S b/arch/x86/kernel/entry_32.S
index 623f288..9ebbeca 100644
--- a/arch/x86/kernel/entry_32.S
+++ b/arch/x86/kernel/entry_32.S
@@ -57,6 +57,7 @@
 #include <asm/cpufeature.h>
 #include <asm/alternative-asm.h>
 #include <asm/asm.h>
+#include <asm/smap.h>
 
 /* Avoid __ASSEMBLER__'ifying <linux/audit.h> just for this.  */
 #include <linux/elf-em.h>
@@ -407,7 +408,9 @@ sysenter_past_esp:
  */
 	cmpl $__PAGE_OFFSET-3,%ebp
 	jae syscall_fault
+	ASM_STAC
 1:	movl (%ebp),%ebp
+	ASM_CLAC
 	movl %ebp,PT_EBP(%esp)
 	_ASM_EXTABLE(1b,syscall_fault)
 
@@ -488,6 +491,7 @@ ENDPROC(ia32_sysenter_target)
 	# system call handler stub
 ENTRY(system_call)
 	RING0_INT_FRAME			# can't unwind into user space anyway
+	ASM_CLAC
 	pushl_cfi %eax			# save orig_eax
 	SAVE_ALL
 	GET_THREAD_INFO(%ebp)
@@ -670,6 +674,7 @@ END(syscall_exit_work)
 
 	RING0_INT_FRAME			# can't unwind into user space anyway
 syscall_fault:
+	ASM_CLAC
 	GET_THREAD_INFO(%ebp)
 	movl $-EFAULT,PT_EAX(%esp)
 	jmp resume_userspace
@@ -825,6 +830,7 @@ END(interrupt)
  */
 	.p2align CONFIG_X86_L1_CACHE_SHIFT
 common_interrupt:
+	ASM_CLAC
 	addl $-0x80,(%esp)	/* Adjust vector into the [-256,-1] range */
 	SAVE_ALL
 	TRACE_IRQS_OFF
@@ -841,6 +847,7 @@ ENDPROC(common_interrupt)
 #define BUILD_INTERRUPT3(name, nr, fn)	\
 ENTRY(name)				\
 	RING0_INT_FRAME;		\
+	ASM_CLAC;			\
 	pushl_cfi $~(nr);		\
 	SAVE_ALL;			\
 	TRACE_IRQS_OFF			\
@@ -857,6 +864,7 @@ ENDPROC(name)
 
 ENTRY(coprocessor_error)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $0
 	pushl_cfi $do_coprocessor_error
 	jmp error_code
@@ -865,6 +873,7 @@ END(coprocessor_error)
 
 ENTRY(simd_coprocessor_error)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $0
 #ifdef CONFIG_X86_INVD_BUG
 	/* AMD 486 bug: invd from userspace calls exception 19 instead of #GP */
@@ -886,6 +895,7 @@ END(simd_coprocessor_error)
 
 ENTRY(device_not_available)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $-1			# mark this as an int
 	pushl_cfi $do_device_not_available
 	jmp error_code
@@ -906,6 +916,7 @@ END(native_irq_enable_sysexit)
 
 ENTRY(overflow)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $0
 	pushl_cfi $do_overflow
 	jmp error_code
@@ -914,6 +925,7 @@ END(overflow)
 
 ENTRY(bounds)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $0
 	pushl_cfi $do_bounds
 	jmp error_code
@@ -922,6 +934,7 @@ END(bounds)
 
 ENTRY(invalid_op)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $0
 	pushl_cfi $do_invalid_op
 	jmp error_code
@@ -930,6 +943,7 @@ END(invalid_op)
 
 ENTRY(coprocessor_segment_overrun)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $0
 	pushl_cfi $do_coprocessor_segment_overrun
 	jmp error_code
@@ -938,6 +952,7 @@ END(coprocessor_segment_overrun)
 
 ENTRY(invalid_TSS)
 	RING0_EC_FRAME
+	ASM_CLAC
 	pushl_cfi $do_invalid_TSS
 	jmp error_code
 	CFI_ENDPROC
@@ -945,6 +960,7 @@ END(invalid_TSS)
 
 ENTRY(segment_not_present)
 	RING0_EC_FRAME
+	ASM_CLAC
 	pushl_cfi $do_segment_not_present
 	jmp error_code
 	CFI_ENDPROC
@@ -952,6 +968,7 @@ END(segment_not_present)
 
 ENTRY(stack_segment)
 	RING0_EC_FRAME
+	ASM_CLAC
 	pushl_cfi $do_stack_segment
 	jmp error_code
 	CFI_ENDPROC
@@ -959,6 +976,7 @@ END(stack_segment)
 
 ENTRY(alignment_check)
 	RING0_EC_FRAME
+	ASM_CLAC
 	pushl_cfi $do_alignment_check
 	jmp error_code
 	CFI_ENDPROC
@@ -966,6 +984,7 @@ END(alignment_check)
 
 ENTRY(divide_error)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $0			# no error code
 	pushl_cfi $do_divide_error
 	jmp error_code
@@ -975,6 +994,7 @@ END(divide_error)
 #ifdef CONFIG_X86_MCE
 ENTRY(machine_check)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $0
 	pushl_cfi machine_check_vector
 	jmp error_code
@@ -984,6 +1004,7 @@ END(machine_check)
 
 ENTRY(spurious_interrupt_bug)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $0
 	pushl_cfi $do_spurious_interrupt_bug
 	jmp error_code
@@ -1207,6 +1228,7 @@ return_to_handler:
 
 ENTRY(page_fault)
 	RING0_EC_FRAME
+	ASM_CLAC
 	pushl_cfi $do_page_fault
 	ALIGN
 error_code:
@@ -1279,6 +1301,7 @@ END(page_fault)
 
 ENTRY(debug)
 	RING0_INT_FRAME
+	ASM_CLAC
 	cmpl $ia32_sysenter_target,(%esp)
 	jne debug_stack_correct
 	FIX_STACK 12, debug_stack_correct, debug_esp_fix_insn
@@ -1303,6 +1326,7 @@ END(debug)
  */
 ENTRY(nmi)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi %eax
 	movl %ss, %eax
 	cmpw $__ESPFIX_SS, %ax
@@ -1373,6 +1397,7 @@ END(nmi)
 
 ENTRY(int3)
 	RING0_INT_FRAME
+	ASM_CLAC
 	pushl_cfi $-1			# mark this as an int
 	SAVE_ALL
 	TRACE_IRQS_OFF
@@ -1393,6 +1418,7 @@ END(general_protection)
 #ifdef CONFIG_KVM_GUEST
 ENTRY(async_page_fault)
 	RING0_EC_FRAME
+	ASM_CLAC
 	pushl_cfi $do_async_page_fault
 	jmp error_code
 	CFI_ENDPROC

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 22:12   ` H. Peter Anvin
@ 2012-09-22  0:41     ` Eric W. Biederman
  2012-09-24 23:27       ` [RFC PATCH] x86-32: Start out eflags and cr4 clean H. Peter Anvin
  0 siblings, 1 reply; 56+ messages in thread
From: Eric W. Biederman @ 2012-09-22  0:41 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linux Kernel Mailing List, H. Peter Anvin, Ingo Molnar,
	Thomas Gleixner, Linus Torvalds, Kees Cook, Linda Wang,
	Matt Fleming

"H. Peter Anvin" <hpa@linux.intel.com> writes:

> On 09/21/2012 03:07 PM, Eric W. Biederman wrote:
>> 
>> Have you tested kexec in this environment?
>> 
>> This is the kind of cpu feature that when we enable it, frequently we
>> have to do something on the kexec path.
>> 
>> At a quick skim it looks like the kexec path is using kernel page table
>> entries and clearing all bits from cr4 except X86_CR4_PAE so I don't
>> actually expect this change will require anything on the kexec path.
>> 
>
> I have not, no, but as you quite correctly point out that shouldn't
> affect things.
>
> We should also change the kernel to start clean with CR4 -- the purpose
> of CR4 is to indicate which CPU features the OS is opting into.
>
> I think we do on x86-64 but not on x86-32 at the moment.
>
> This is an unrelated problem, though, and can be addressed later.

Agreed.  I was just curious where things stood.

Eric


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-21 22:10   ` H. Peter Anvin
@ 2012-09-22 11:32     ` Ingo Molnar
  2012-09-24 20:31       ` H. Peter Anvin
  0 siblings, 1 reply; 56+ messages in thread
From: Ingo Molnar @ 2012-09-22 11:32 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dave Jones, Linux Kernel Mailing List, H. Peter Anvin,
	Thomas Gleixner, Linus Torvalds, Kees Cook, Linda Wang,
	Matt Fleming


* H. Peter Anvin <hpa@linux.intel.com> wrote:

> On 09/21/2012 03:08 PM, Dave Jones wrote:
> > 
> > Perhaps add a printk somewhere to show that it's actually been enabled maybe ?
> > 
> > Also, would it be feasible to add something like we have for test_nx ?
> > If this feature regresses in some way in the future, I suspect we'd like
> > to know about it sooner rather than later.
> > 
> 
> Good idea... should add this both for SMEP and SMAP.

Very much agreed - these exploit prevention hardware features
are really useful, and it's good to inform the user that they
are active.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-22 11:32     ` Ingo Molnar
@ 2012-09-24 20:31       ` H. Peter Anvin
  2012-09-24 20:43         ` Kees Cook
  0 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-24 20:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: H. Peter Anvin, Dave Jones, Linux Kernel Mailing List,
	Thomas Gleixner, Linus Torvalds, Kees Cook, Linda Wang,
	Matt Fleming

On 09/22/2012 04:32 AM, Ingo Molnar wrote:
> 
> * H. Peter Anvin <hpa@linux.intel.com> wrote:
> 
>> On 09/21/2012 03:08 PM, Dave Jones wrote:
>>>
>>> Perhaps add a printk somewhere to show that it's actually been enabled maybe ?
>>>
>>> Also, would it be feasible to add something like we have for test_nx ?
>>> If this feature regresses in some way in the future, I suspect we'd like
>>> to know about it sooner rather than later.
>>
>> Good idea... should add this both for SMEP and SMAP.
> 
> Very much agreed - these exploit preventation hardware features 
> are really useful, and it's good to inform the user that they 
> are active.
> 

I was thinking about this: do you think a printk would be better, or a
new field in /proc/cpuinfo?

	-hpa

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-24 20:31       ` H. Peter Anvin
@ 2012-09-24 20:43         ` Kees Cook
  2012-09-24 20:51           ` H. Peter Anvin
  0 siblings, 1 reply; 56+ messages in thread
From: Kees Cook @ 2012-09-24 20:43 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, H. Peter Anvin, Dave Jones,
	Linux Kernel Mailing List, Thomas Gleixner, Linus Torvalds,
	Linda Wang, Matt Fleming

On Mon, Sep 24, 2012 at 1:31 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 09/22/2012 04:32 AM, Ingo Molnar wrote:
>>
>> * H. Peter Anvin <hpa@linux.intel.com> wrote:
>>
>>> On 09/21/2012 03:08 PM, Dave Jones wrote:
>>>>
>>>> Perhaps add a printk somewhere to show that it's actually been enabled maybe ?
>>>>
>>>> Also, would it be feasible to add something like we have for test_nx ?
>>>> If this feature regresses in some way in the future, I suspect we'd like
>>>> to know about it sooner rather than later.
>>>
>>> Good idea... should add this both for SMEP and SMAP.
>>
>> Very much agreed - these exploit preventation hardware features
>> are really useful, and it's good to inform the user that they
>> are active.
>>
>
> I was thinking about this, do you think a printk would be better, or a
> new field in /proc/cpuinfo?

We use printk for displaying the possible states of NX; however, this
is rather ephemeral and scrolls away, making it harder for an admin
to find later. It might make sense to add a new field in cpuinfo for all
three; however, unlike NX, the status of SMEP/SMAP isn't (normally)
discoverable from userspace. That said, their CPU feature flags are
already right there, and the cases where they would be disabled are
very few.

How about this...

mem protection  : nx smap smep

Maybe the "why" of a cpu feature being missing from the "mem
protection" line can stay in printk?

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH 00/11] x86: Supervisor Mode Access Prevention
  2012-09-24 20:43         ` Kees Cook
@ 2012-09-24 20:51           ` H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-24 20:51 UTC (permalink / raw)
  To: Kees Cook
  Cc: H. Peter Anvin, Ingo Molnar, Dave Jones,
	Linux Kernel Mailing List, Thomas Gleixner, Linus Torvalds,
	Linda Wang, Matt Fleming

On 09/24/2012 01:43 PM, Kees Cook wrote:
> 
> How about this...
> 
> mem protection  : nx smap smep
> 
> Maybe the "why" of a cpu feature being missing from the "mem
> protection" line can stay in printk?
> 

Come to think of it, since we use setup_set/clear_cpu_cap, we already
don't list the feature in /proc/cpuinfo if it isn't enabled.

I would therefore suggest that we simply printk a message if the feature
is disabled.
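
Roughly along these lines (a sketch; the smap_disabled flag, e.g. driven
by a "nosmap" command-line option, is hypothetical and not part of the
posted series):

	/* Sketch only -- not code from this series. */
	if (cpu_has(c, X86_FEATURE_SMAP) && smap_disabled) {
		setup_clear_cpu_cap(X86_FEATURE_SMAP);
		pr_info("SMAP: disabled\n");
	}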

	-hpa

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2012-09-22  0:41     ` Eric W. Biederman
@ 2012-09-24 23:27       ` H. Peter Anvin
  2012-09-25 13:27         ` Konrad Rzeszutek Wilk
                           ` (6 more replies)
  0 siblings, 7 replies; 56+ messages in thread
From: H. Peter Anvin @ 2012-09-24 23:27 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Ingo Molnar, Thomas Gleixner, Dave Jones, Linus Torvalds,
	Eric W. Biederman, Ian Campbell, Konrad Rzeszutek Wilk,
	Jeremy Fitzhardinge, Rusty Russell, David Woodhouse, Vivek Goyal,
	Andres Salomon, Yinghai Lu, H. Peter Anvin, H. Peter Anvin

From: "H. Peter Anvin" <hpa@linux.intel.com>

%cr4 is supposed to reflect a set of features into which the operating
system is opting in.  If the BIOS or bootloader leaks bits here, this
is not desirable.  Consider a bootloader passing in %cr4.pae set to a
legacy paging kernel, for example -- it will not have any immediate
effect, but the kernel would crash when turning paging on.

A similar argument applies to %eflags, and since we have to look for
%eflags.id being settable we can use a sequence which clears %eflags
as a side effect.
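
In C-like pseudocode (read_eflags()/write_eflags() below are just
stand-ins for the pushfl/popfl pairs, not real helpers), the new
sequence amounts to:

	unsigned long with_id, without_id;

	write_eflags(X86_EFLAGS_ID);	/* try to set only the ID bit     */
	with_id = read_eflags();
	write_eflags(0);		/* clears EFLAGS as a side effect */
	without_id = read_eflags();

	if ((with_id ^ without_id) & X86_EFLAGS_ID)
		write_cr4(mmu_cr4_features);	/* ID toggled: CPUID and CR4 exist */
	/* otherwise: no CPUID, hence no CR4 -- leave it alone */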

Note that we already do this for x86-64.

I would like opinions on this especially from the PV crowd and
nonstandard platforms (e.g. OLPC) to make sure we don't screw them up.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/kernel/head_32.S |   31 ++++++++++++++++---------------
 1 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index d42ab17..957a47a 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -287,27 +287,28 @@ ENTRY(startup_32_smp)
 	leal -__PAGE_OFFSET(%ecx),%esp
 
 default_entry:
-
 /*
  *	New page tables may be in 4Mbyte page mode and may
  *	be using the global pages. 
  *
  *	NOTE! If we are on a 486 we may have no cr4 at all!
- *	So we do not try to touch it unless we really have
- *	some bits in it to set.  This won't work if the BSP
- *	implements cr4 but this AP does not -- very unlikely
- *	but be warned!  The same applies to the pse feature
- *	if not equally supported. --macro
- *
- *	NOTE! We have to correct for the fact that we're
- *	not yet offset PAGE_OFFSET..
+ *	Specifically, cr4 exists if and only if CPUID exists,
+ *	which in turn exists if and only if EFLAGS.ID exists.
  */
-#define cr4_bits pa(mmu_cr4_features)
-	movl cr4_bits,%edx
-	andl %edx,%edx
-	jz 6f
-	movl %cr4,%eax		# Turn on paging options (PSE,PAE,..)
-	orl %edx,%eax
+	movl $X86_EFLAGS_ID,%ecx
+	pushl %ecx
+	popfl
+	pushfl
+	popl %eax
+	pushl $0
+	popfl
+	pushfl
+	popl %edx
+	xorl %edx,%eax
+	testl %ecx,%eax
+	jz 6f			# No ID flag = no CPUID = no CR4
+
+	movl pa(mmu_cr4_features),%eax
 	movl %eax,%cr4
 
 	testb $X86_CR4_PAE, %al		# check if PAE is enabled
-- 
1.7.6.5


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2012-09-24 23:27       ` [RFC PATCH] x86-32: Start out eflags and cr4 clean H. Peter Anvin
@ 2012-09-25 13:27         ` Konrad Rzeszutek Wilk
  2012-09-25 13:48         ` Ian Campbell
                           ` (5 subsequent siblings)
  6 siblings, 0 replies; 56+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-09-25 13:27 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linux Kernel Mailing List, Ingo Molnar, Thomas Gleixner,
	Dave Jones, Linus Torvalds, Eric W. Biederman, Ian Campbell,
	Konrad Rzeszutek Wilk, Jeremy Fitzhardinge, Rusty Russell,
	David Woodhouse, Vivek Goyal, Andres Salomon, Yinghai Lu,
	H. Peter Anvin

On Mon, Sep 24, 2012 at 04:27:19PM -0700, H. Peter Anvin wrote:
> From: "H. Peter Anvin" <hpa@linux.intel.com>
> 
> %cr4 is supposed to reflect a set of features into which the operating
> system is opting in.  If the BIOS or bootloader leaks bits here, this
> is not desirable.  Consider a bootloader passing in %cr4.pae set to a
> legacy paging kernel, for example -- it will not have any immediate
> effect, but the kernel would crash when turning paging on.
> 
> A similar argument applies to %eflags, and since we have to look for
> %eflags.id being settable we can use a sequence which clears %eflags
> as a side effect.
> 
> Note that we already do this for x86-64.
> 
> I would like opinions on this especially from the PV crowd and
> nonstandard platforms (e.g. OLPC) to make sure we don't screw them up.
> 

From a glance at this it looks OK, as we do not ever end up in this
function at all.  Our entry point into the kernel is startup_xen.

But just to make sure that nothing is amiss, let me take this patch
for a spin.

And thank you for CC-ing me.

> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
> ---
>  arch/x86/kernel/head_32.S |   31 ++++++++++++++++---------------
>  1 files changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
> index d42ab17..957a47a 100644
> --- a/arch/x86/kernel/head_32.S
> +++ b/arch/x86/kernel/head_32.S
> @@ -287,27 +287,28 @@ ENTRY(startup_32_smp)
>  	leal -__PAGE_OFFSET(%ecx),%esp
>  
>  default_entry:
> -
>  /*
>   *	New page tables may be in 4Mbyte page mode and may
>   *	be using the global pages. 
>   *
>   *	NOTE! If we are on a 486 we may have no cr4 at all!
> - *	So we do not try to touch it unless we really have
> - *	some bits in it to set.  This won't work if the BSP
> - *	implements cr4 but this AP does not -- very unlikely
> - *	but be warned!  The same applies to the pse feature
> - *	if not equally supported. --macro
> - *
> - *	NOTE! We have to correct for the fact that we're
> - *	not yet offset PAGE_OFFSET..
> + *	Specifically, cr4 exists if and only if CPUID exists,
> + *	which in turn exists if and only if EFLAGS.ID exists.
>   */
> -#define cr4_bits pa(mmu_cr4_features)
> -	movl cr4_bits,%edx
> -	andl %edx,%edx
> -	jz 6f
> -	movl %cr4,%eax		# Turn on paging options (PSE,PAE,..)
> -	orl %edx,%eax
> +	movl $X86_EFLAGS_ID,%ecx
> +	pushl %ecx
> +	popfl
> +	pushfl
> +	popl %eax
> +	pushl $0
> +	popfl
> +	pushfl
> +	popl %edx
> +	xorl %edx,%eax
> +	testl %ecx,%eax
> +	jz 6f			# No ID flag = no CPUID = no CR4
> +
> +	movl pa(mmu_cr4_features),%eax
>  	movl %eax,%cr4
>  
>  	testb $X86_CR4_PAE, %al		# check if PAE is enabled
> -- 
> 1.7.6.5
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2012-09-24 23:27       ` [RFC PATCH] x86-32: Start out eflags and cr4 clean H. Peter Anvin
  2012-09-25 13:27         ` Konrad Rzeszutek Wilk
@ 2012-09-25 13:48         ` Ian Campbell
  2012-09-26 11:29           ` Konrad Rzeszutek Wilk
  2012-09-27  6:11         ` [tip:x86/smap] " tip-bot for H. Peter Anvin
                           ` (4 subsequent siblings)
  6 siblings, 1 reply; 56+ messages in thread
From: Ian Campbell @ 2012-09-25 13:48 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linux Kernel Mailing List, Ingo Molnar, Thomas Gleixner,
	Dave Jones, Linus Torvalds, Eric W. Biederman,
	Konrad Rzeszutek Wilk, Jeremy Fitzhardinge, Rusty Russell,
	David Woodhouse, Vivek Goyal, Andres Salomon, Yinghai Lu,
	H. Peter Anvin

On Tue, 2012-09-25 at 00:27 +0100, H. Peter Anvin wrote:
> From: "H. Peter Anvin" <hpa@linux.intel.com>
> 
> %cr4 is supposed to reflect a set of features into which the operating
> system is opting in.  If the BIOS or bootloader leaks bits here, this
> is not desirable.  Consider a bootloader passing in %cr4.pae set to a
> legacy paging kernel, for example -- it will not have any immediate
> effect, but the kernel would crash when turning paging on.
> 
> A similar argument applies to %eflags, and since we have to look for
> %eflags.id being settable we can use a sequence which clears %eflags
> as a side effect.
> 
> Note that we already do this for x86-64.
> 
> I would like opinions on this especially from the PV crowd

Xen PV guests don't pass through this code path, so there is no danger
there AFAICT.  From that PoV:

Acked-by: Ian Campbell <ian.campbell@citrix.com>

FWIW it looks correct to me from the native PoV too, but you probably
already knew that ;-)

Ian.



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2012-09-25 13:48         ` Ian Campbell
@ 2012-09-26 11:29           ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 56+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-09-26 11:29 UTC (permalink / raw)
  To: Ian Campbell
  Cc: H. Peter Anvin, Linux Kernel Mailing List, Ingo Molnar,
	Thomas Gleixner, Dave Jones, Linus Torvalds, Eric W. Biederman,
	Konrad Rzeszutek Wilk, Jeremy Fitzhardinge, Rusty Russell,
	David Woodhouse, Vivek Goyal, Andres Salomon, Yinghai Lu,
	H. Peter Anvin

On Tue, Sep 25, 2012 at 02:48:20PM +0100, Ian Campbell wrote:
> On Tue, 2012-09-25 at 00:27 +0100, H. Peter Anvin wrote:
> > From: "H. Peter Anvin" <hpa@linux.intel.com>
> > 
> > %cr4 is supposed to reflect a set of features into which the operating
> > system is opting in.  If the BIOS or bootloader leaks bits here, this
> > is not desirable.  Consider a bootloader passing in %cr4.pae set to a
> > legacy paging kernel, for example -- it will not have any immediate
> > effect, but the kernel would crash when turning paging on.
> > 
> > A similar argument applies to %eflags, and since we have to look for
> > %eflags.id being settable we can use a sequence which clears %eflags
> > as a side effect.
> > 
> > Note that we already do this for x86-64.
> > 
> > I would like opinions on this especially from the PV crowd
> 
> Xen PV guests don't pass through this code path so there is no danger
> there AFAICT, so from that PoV:
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> FWIW it looks correct to me from the native PoV too, but you probably
> already knew that ;-)

And sanity testing confirmed it.

Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> Ian.
> 
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86-32: Start out eflags and cr4 clean
  2012-09-24 23:27       ` [RFC PATCH] x86-32: Start out eflags and cr4 clean H. Peter Anvin
  2012-09-25 13:27         ` Konrad Rzeszutek Wilk
  2012-09-25 13:48         ` Ian Campbell
@ 2012-09-27  6:11         ` tip-bot for H. Peter Anvin
  2012-11-24  3:49           ` Yuhong Bao
  2012-09-27  6:11         ` [tip:x86/smap] x86, suspend: On wakeup always initialize cr4 and EFER tip-bot for H. Peter Anvin
                           ` (3 subsequent siblings)
  6 siblings, 1 reply; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-27  6:11 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, hpa

Commit-ID:  5a5a51db78ef24aa61a4cb2ae36f07f6fa37356d
Gitweb:     http://git.kernel.org/tip/5a5a51db78ef24aa61a4cb2ae36f07f6fa37356d
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Mon, 24 Sep 2012 16:05:48 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Wed, 26 Sep 2012 15:06:22 -0700

x86-32: Start out eflags and cr4 clean

%cr4 is supposed to reflect a set of features into which the operating
system is opting in.  If the BIOS or bootloader leaks bits here, this
is not desirable.  Consider a bootloader passing in %cr4.pae set to a
legacy paging kernel, for example -- it will not have any immediate
effect, but the kernel would crash when turning paging on.

A similar argument applies to %eflags, and since we have to look for
%eflags.id being settable we can use a sequence which clears %eflags
as a side effect.

Note that we already do this for x86-64.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1348529239-17943-1-git-send-email-hpa@linux.intel.com
---
 arch/x86/kernel/head_32.S |   31 ++++++++++++++++---------------
 1 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index d42ab17..957a47a 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -287,27 +287,28 @@ ENTRY(startup_32_smp)
 	leal -__PAGE_OFFSET(%ecx),%esp
 
 default_entry:
-
 /*
  *	New page tables may be in 4Mbyte page mode and may
  *	be using the global pages. 
  *
  *	NOTE! If we are on a 486 we may have no cr4 at all!
- *	So we do not try to touch it unless we really have
- *	some bits in it to set.  This won't work if the BSP
- *	implements cr4 but this AP does not -- very unlikely
- *	but be warned!  The same applies to the pse feature
- *	if not equally supported. --macro
- *
- *	NOTE! We have to correct for the fact that we're
- *	not yet offset PAGE_OFFSET..
+ *	Specifically, cr4 exists if and only if CPUID exists,
+ *	which in turn exists if and only if EFLAGS.ID exists.
  */
-#define cr4_bits pa(mmu_cr4_features)
-	movl cr4_bits,%edx
-	andl %edx,%edx
-	jz 6f
-	movl %cr4,%eax		# Turn on paging options (PSE,PAE,..)
-	orl %edx,%eax
+	movl $X86_EFLAGS_ID,%ecx
+	pushl %ecx
+	popfl
+	pushfl
+	popl %eax
+	pushl $0
+	popfl
+	pushfl
+	popl %edx
+	xorl %edx,%eax
+	testl %ecx,%eax
+	jz 6f			# No ID flag = no CPUID = no CR4
+
+	movl pa(mmu_cr4_features),%eax
 	movl %eax,%cr4
 
 	testb $X86_CR4_PAE, %al		# check if PAE is enabled

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/smap] x86, suspend: On wakeup always initialize cr4 and EFER
  2012-09-24 23:27       ` [RFC PATCH] x86-32: Start out eflags and cr4 clean H. Peter Anvin
                           ` (2 preceding siblings ...)
  2012-09-27  6:11         ` [tip:x86/smap] " tip-bot for H. Peter Anvin
@ 2012-09-27  6:11         ` tip-bot for H. Peter Anvin
  2012-10-01 22:04         ` [tip:x86/urgent] x86, suspend: Correct the restore of CR4, EFER; skip computing EFLAGS.ID tip-bot for H. Peter Anvin
                           ` (2 subsequent siblings)
  6 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-09-27  6:11 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, rjw, hpa

Commit-ID:  73201dbec64aebf6b0dca855b523f437972dc7bb
Gitweb:     http://git.kernel.org/tip/73201dbec64aebf6b0dca855b523f437972dc7bb
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Wed, 26 Sep 2012 15:02:34 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Wed, 26 Sep 2012 15:06:22 -0700

x86, suspend: On wakeup always initialize cr4 and EFER

We already have a flag word to indicate the existence of MISC_ENABLES,
so use the same flag word to indicate existence of cr4 and EFER, and
always restore them if they exist.  That way if something passes a
nonzero value when the value *should* be zero, we will still
initialize it.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Link: http://lkml.kernel.org/r/1348529239-17943-1-git-send-email-hpa@linux.intel.com
---
 arch/x86/kernel/acpi/sleep.c      |   15 ++++++++++-----
 arch/x86/realmode/rm/wakeup.h     |    2 ++
 arch/x86/realmode/rm/wakeup_asm.S |   29 +++++++++++++++++++----------
 3 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index 1b8e5a0..11676cf 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -43,17 +43,22 @@ int acpi_suspend_lowlevel(void)
 
 	header->video_mode = saved_video_mode;
 
+	header->pmode_behavior = 0;
+
 #ifndef CONFIG_64BIT
 	store_gdt((struct desc_ptr *)&header->pmode_gdt);
 
-	if (rdmsr_safe(MSR_EFER, &header->pmode_efer_low,
-		       &header->pmode_efer_high))
-		header->pmode_efer_low = header->pmode_efer_high = 0;
+	if (!rdmsr_safe(MSR_EFER,
+			&header->pmode_efer_low,
+			&header->pmode_efer_high))
+		header->pmode_behavior |= (1 << WAKEUP_BEHAVIOR_RESTORE_EFER);
 #endif /* !CONFIG_64BIT */
 
 	header->pmode_cr0 = read_cr0();
-	header->pmode_cr4 = read_cr4_safe();
-	header->pmode_behavior = 0;
+	if (__this_cpu_read(cpu_info.cpuid_level) >= 0) {
+		header->pmode_cr4 = read_cr4();
+		header->pmode_behavior |= (1 << WAKEUP_BEHAVIOR_RESTORE_CR4);
+	}
 	if (!rdmsr_safe(MSR_IA32_MISC_ENABLE,
 			&header->pmode_misc_en_low,
 			&header->pmode_misc_en_high))
diff --git a/arch/x86/realmode/rm/wakeup.h b/arch/x86/realmode/rm/wakeup.h
index 9317e00..7dd86a4 100644
--- a/arch/x86/realmode/rm/wakeup.h
+++ b/arch/x86/realmode/rm/wakeup.h
@@ -36,5 +36,7 @@ extern struct wakeup_header wakeup_header;
 
 /* Wakeup behavior bits */
 #define WAKEUP_BEHAVIOR_RESTORE_MISC_ENABLE     0
+#define WAKEUP_BEHAVIOR_RESTORE_CR4		1
+#define WAKEUP_BEHAVIOR_RESTORE_EFER		2
 
 #endif /* ARCH_X86_KERNEL_ACPI_RM_WAKEUP_H */
diff --git a/arch/x86/realmode/rm/wakeup_asm.S b/arch/x86/realmode/rm/wakeup_asm.S
index 8905166..e56479e 100644
--- a/arch/x86/realmode/rm/wakeup_asm.S
+++ b/arch/x86/realmode/rm/wakeup_asm.S
@@ -74,9 +74,18 @@ ENTRY(wakeup_start)
 
 	lidtl	wakeup_idt
 
-	/* Clear the EFLAGS */
-	pushl	$0
+	/* Clear the EFLAGS but remember if we have EFLAGS.ID */
+	movl $X86_EFLAGS_ID, %ecx
+	pushl %ecx
 	popfl
+	pushfl
+	popl %edi
+	pushl $0
+	popfl
+	pushfl
+	popl %edx
+	xorl %edx, %edi
+	andl %ecx, %edi		/* %edi is zero iff CPUID & %cr4 are missing */
 
 	/* Check header signature... */
 	movl	signature, %eax
@@ -93,8 +102,8 @@ ENTRY(wakeup_start)
 
 	/* Restore MISC_ENABLE before entering protected mode, in case
 	   BIOS decided to clear XD_DISABLE during S3. */
-	movl	pmode_behavior, %eax
-	btl	$WAKEUP_BEHAVIOR_RESTORE_MISC_ENABLE, %eax
+	movl	pmode_behavior, %edi
+	btl	$WAKEUP_BEHAVIOR_RESTORE_MISC_ENABLE, %edi
 	jnc	1f
 
 	movl	pmode_misc_en, %eax
@@ -110,15 +119,15 @@ ENTRY(wakeup_start)
 	movl	pmode_cr3, %eax
 	movl	%eax, %cr3
 
-	movl	pmode_cr4, %ecx
-	jecxz	1f
-	movl	%ecx, %cr4
+	btl	$WAKEUP_BEHAVIOR_RESTORE_CR4, %edi
+	jz	1f
+	movl	pmode_cr4, %eax
+	movl	%eax, %cr4
 1:
+	btl	$WAKEUP_BEHAVIOR_RESTORE_EFER, %edi
+	jz	1f
 	movl	pmode_efer, %eax
 	movl	pmode_efer + 4, %edx
-	movl	%eax, %ecx
-	orl	%edx, %ecx
-	jz	1f
 	movl	$MSR_EFER, %ecx
 	wrmsr
 1:

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/urgent] x86, suspend: Correct the restore of CR4, EFER; skip computing EFLAGS.ID
  2012-09-24 23:27       ` [RFC PATCH] x86-32: Start out eflags and cr4 clean H. Peter Anvin
                           ` (3 preceding siblings ...)
  2012-09-27  6:11         ` [tip:x86/smap] x86, suspend: On wakeup always initialize cr4 and EFER tip-bot for H. Peter Anvin
@ 2012-10-01 22:04         ` tip-bot for H. Peter Anvin
  2012-10-02  6:52         ` tip-bot for H. Peter Anvin
  2012-10-10 19:59         ` [RFC PATCH] x86-32: Start out eflags and cr4 clean Andres Salomon
  6 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-10-01 22:04 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, rjw, hpa

Commit-ID:  f1d01b229294734e52630c6ad6366aa29a257475
Gitweb:     http://git.kernel.org/tip/f1d01b229294734e52630c6ad6366aa29a257475
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Mon, 1 Oct 2012 14:34:42 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Mon, 1 Oct 2012 14:45:18 -0700

x86, suspend: Correct the restore of CR4, EFER; skip computing EFLAGS.ID

The patch:

    73201dbe x86, suspend: On wakeup always initialize cr4 and EFER

... was incorrectly committed in an intermediate (unfinished) form.

- We need to test CF, not ZF, for a bit test with btl.
- We don't actually need to compute the existence of EFLAGS.ID,
  since we set a flag at suspend time if CR4 should be restored.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Link: http://lkml.kernel.org/r/1348529239-17943-1-git-send-email-hpa@linux.intel.com
---
 arch/x86/realmode/rm/wakeup_asm.S |   15 +++------------
 1 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/x86/realmode/rm/wakeup_asm.S b/arch/x86/realmode/rm/wakeup_asm.S
index e56479e..9e7e147 100644
--- a/arch/x86/realmode/rm/wakeup_asm.S
+++ b/arch/x86/realmode/rm/wakeup_asm.S
@@ -74,18 +74,9 @@ ENTRY(wakeup_start)
 
 	lidtl	wakeup_idt
 
-	/* Clear the EFLAGS but remember if we have EFLAGS.ID */
-	movl $X86_EFLAGS_ID, %ecx
-	pushl %ecx
-	popfl
-	pushfl
-	popl %edi
+	/* Clear the EFLAGS */
 	pushl $0
 	popfl
-	pushfl
-	popl %edx
-	xorl %edx, %edi
-	andl %ecx, %edi		/* %edi is zero iff CPUID & %cr4 are missing */
 
 	/* Check header signature... */
 	movl	signature, %eax
@@ -120,12 +111,12 @@ ENTRY(wakeup_start)
 	movl	%eax, %cr3
 
 	btl	$WAKEUP_BEHAVIOR_RESTORE_CR4, %edi
-	jz	1f
+	jnc	1f
 	movl	pmode_cr4, %eax
 	movl	%eax, %cr4
 1:
 	btl	$WAKEUP_BEHAVIOR_RESTORE_EFER, %edi
-	jz	1f
+	jnc	1f
 	movl	pmode_efer, %eax
 	movl	pmode_efer + 4, %edx
 	movl	$MSR_EFER, %ecx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [tip:x86/urgent] x86, suspend: Correct the restore of CR4, EFER; skip computing EFLAGS.ID
  2012-09-24 23:27       ` [RFC PATCH] x86-32: Start out eflags and cr4 clean H. Peter Anvin
                           ` (4 preceding siblings ...)
  2012-10-01 22:04         ` [tip:x86/urgent] x86, suspend: Correct the restore of CR4, EFER; skip computing EFLAGS.ID tip-bot for H. Peter Anvin
@ 2012-10-02  6:52         ` tip-bot for H. Peter Anvin
  2012-10-10 19:59         ` [RFC PATCH] x86-32: Start out eflags and cr4 clean Andres Salomon
  6 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2012-10-02  6:52 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, tglx, rjw, hpa

Commit-ID:  1396adc3c2bdc556d4cdd1cf107aa0b6d59fbb1e
Gitweb:     http://git.kernel.org/tip/1396adc3c2bdc556d4cdd1cf107aa0b6d59fbb1e
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Mon, 1 Oct 2012 14:34:42 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 2 Oct 2012 08:42:31 +0200

x86, suspend: Correct the restore of CR4, EFER; skip computing EFLAGS.ID

The patch:

    73201dbe x86, suspend: On wakeup always initialize cr4 and EFER

... was incorrectly committed in an intermediate (unfinished) form.

- We need to test CF, not ZF, for a bit test with btl.
- We don't actually need to compute the existence of EFLAGS.ID,
  since we set a flag at suspend time if CR4 should be restored.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Link: http://lkml.kernel.org/r/1348529239-17943-1-git-send-email-hpa@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/realmode/rm/wakeup_asm.S |   15 +++------------
 1 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/x86/realmode/rm/wakeup_asm.S b/arch/x86/realmode/rm/wakeup_asm.S
index e56479e..9e7e147 100644
--- a/arch/x86/realmode/rm/wakeup_asm.S
+++ b/arch/x86/realmode/rm/wakeup_asm.S
@@ -74,18 +74,9 @@ ENTRY(wakeup_start)
 
 	lidtl	wakeup_idt
 
-	/* Clear the EFLAGS but remember if we have EFLAGS.ID */
-	movl $X86_EFLAGS_ID, %ecx
-	pushl %ecx
-	popfl
-	pushfl
-	popl %edi
+	/* Clear the EFLAGS */
 	pushl $0
 	popfl
-	pushfl
-	popl %edx
-	xorl %edx, %edi
-	andl %ecx, %edi		/* %edi is zero iff CPUID & %cr4 are missing */
 
 	/* Check header signature... */
 	movl	signature, %eax
@@ -120,12 +111,12 @@ ENTRY(wakeup_start)
 	movl	%eax, %cr3
 
 	btl	$WAKEUP_BEHAVIOR_RESTORE_CR4, %edi
-	jz	1f
+	jnc	1f
 	movl	pmode_cr4, %eax
 	movl	%eax, %cr4
 1:
 	btl	$WAKEUP_BEHAVIOR_RESTORE_EFER, %edi
-	jz	1f
+	jnc	1f
 	movl	pmode_efer, %eax
 	movl	pmode_efer + 4, %edx
 	movl	$MSR_EFER, %ecx

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2012-09-24 23:27       ` [RFC PATCH] x86-32: Start out eflags and cr4 clean H. Peter Anvin
                           ` (5 preceding siblings ...)
  2012-10-02  6:52         ` tip-bot for H. Peter Anvin
@ 2012-10-10 19:59         ` Andres Salomon
  2013-01-19  0:40           ` Andres Salomon
  6 siblings, 1 reply; 56+ messages in thread
From: Andres Salomon @ 2012-10-10 19:59 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linux Kernel Mailing List, Ingo Molnar, Thomas Gleixner,
	Dave Jones, Linus Torvalds, Eric W. Biederman, Ian Campbell,
	Konrad Rzeszutek Wilk, Jeremy Fitzhardinge, Rusty Russell,
	David Woodhouse, Vivek Goyal, Yinghai Lu, H. Peter Anvin,
	techteam

Thanks for Cc'ing me.  I just tested this on an XO-1, and it doesn't
appear to break anything.  So, looks fine to me.

Acked-by: Andres Salomon <dilinger@queued.net>


On Mon, 24 Sep 2012 16:27:19 -0700 "H. Peter
Anvin" <hpa@linux.intel.com> wrote:

> From: "H. Peter Anvin" <hpa@linux.intel.com>
> 
> %cr4 is supposed to reflect a set of features into which the operating
> system is opting in.  If the BIOS or bootloader leaks bits here, this
> is not desirable.  Consider a bootloader passing in %cr4.pae set to a
> legacy paging kernel, for example -- it will not have any immediate
> effect, but the kernel would crash when turning paging on.
> 
> A similar argument applies to %eflags, and since we have to look for
> %eflags.id being settable we can use a sequence which clears %eflags
> as a side effect.
> 
> Note that we already do this for x86-64.
> 
> I would like opinions on this especially from the PV crowd and
> nonstandard platforms (e.g. OLPC) to make sure we don't screw them up.
> 
> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
> ---
>  arch/x86/kernel/head_32.S |   31 ++++++++++++++++---------------
>  1 files changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
> index d42ab17..957a47a 100644
> --- a/arch/x86/kernel/head_32.S
> +++ b/arch/x86/kernel/head_32.S
> @@ -287,27 +287,28 @@ ENTRY(startup_32_smp)
>  	leal -__PAGE_OFFSET(%ecx),%esp
>  
>  default_entry:
> -
>  /*
>   *	New page tables may be in 4Mbyte page mode and may
>   *	be using the global pages. 
>   *
>   *	NOTE! If we are on a 486 we may have no cr4 at all!
> - *	So we do not try to touch it unless we really have
> - *	some bits in it to set.  This won't work if the BSP
> - *	implements cr4 but this AP does not -- very unlikely
> - *	but be warned!  The same applies to the pse feature
> - *	if not equally supported. --macro
> - *
> - *	NOTE! We have to correct for the fact that we're
> - *	not yet offset PAGE_OFFSET..
> + *	Specifically, cr4 exists if and only if CPUID exists,
> + *	which in turn exists if and only if EFLAGS.ID exists.
>   */
> -#define cr4_bits pa(mmu_cr4_features)
> -	movl cr4_bits,%edx
> -	andl %edx,%edx
> -	jz 6f
> -	movl %cr4,%eax		# Turn on paging options
> (PSE,PAE,..)
> -	orl %edx,%eax
> +	movl $X86_EFLAGS_ID,%ecx
> +	pushl %ecx
> +	popfl
> +	pushfl
> +	popl %eax
> +	pushl $0
> +	popfl
> +	pushfl
> +	popl %edx
> +	xorl %edx,%eax
> +	testl %ecx,%eax
> +	jz 6f			# No ID flag = no CPUID = no CR4
> +
> +	movl pa(mmu_cr4_features),%eax
>  	movl %eax,%cr4
>  
>  	testb $X86_CR4_PAE, %al		# check if PAE is
> enabled

^ permalink raw reply	[flat|nested] 56+ messages in thread

* RE: [tip:x86/smap] x86-32: Start out eflags and cr4 clean
  2012-09-27  6:11         ` [tip:x86/smap] " tip-bot for H. Peter Anvin
@ 2012-11-24  3:49           ` Yuhong Bao
  2012-11-24  5:06             ` H. Peter Anvin
  0 siblings, 1 reply; 56+ messages in thread
From: Yuhong Bao @ 2012-11-24  3:49 UTC (permalink / raw)
  To: mingo, linux-tip-commits; +Cc: linux-kernel, hpa, tglx

> + * Specifically, cr4 exists if and only if CPUID exists,
> + * which in turn exists if and only if EFLAGS.ID exists.
This may be true for *Intel* 486 CPUs as VME was implemented at the same time as CPUID, but I am not sure that it is true for AMD 486 CPUs.

Yuhong Bao 		 	   		  

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [tip:x86/smap] x86-32: Start out eflags and cr4 clean
  2012-11-24  3:49           ` Yuhong Bao
@ 2012-11-24  5:06             ` H. Peter Anvin
  0 siblings, 0 replies; 56+ messages in thread
From: H. Peter Anvin @ 2012-11-24  5:06 UTC (permalink / raw)
  To: Yuhong Bao; +Cc: mingo, linux-tip-commits, linux-kernel, tglx

On 11/23/2012 07:49 PM, Yuhong Bao wrote:
>> + * Specifically, cr4 exists if and only if CPUID exists,
>> + * which in turn exists if and only if EFLAGS.ID exists.
> This may be true for *Intel* 486 CPUs as VME was implemented at the same time as CPUID, but I am not sure that it is true for AMD 486 CPUs.
>
> Yuhong Bao 		 	   		

That is the architecture; if you know of a counterexample speak up.  "I 
am not sure" doesn't count.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2012-10-10 19:59         ` [RFC PATCH] x86-32: Start out eflags and cr4 clean Andres Salomon
@ 2013-01-19  0:40           ` Andres Salomon
  2013-01-19  0:42             ` H. Peter Anvin
  0 siblings, 1 reply; 56+ messages in thread
From: Andres Salomon @ 2013-01-19  0:40 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linux Kernel Mailing List, Ingo Molnar, Thomas Gleixner,
	Dave Jones, Linus Torvalds, Eric W. Biederman, Ian Campbell,
	Konrad Rzeszutek Wilk, Jeremy Fitzhardinge, Rusty Russell,
	David Woodhouse, Vivek Goyal, Yinghai Lu, H. Peter Anvin,
	techteam

Bad news on this patch; I've been told that it breaks booting on an
XO-1.5.  Does anyone from OLPC know why yet?


On Wed, 10 Oct 2012 12:59:27 -0700
Andres Salomon <dilinger@queued.net> wrote:

> Thanks for Cc'ing me.  I just tested this on an XO-1, and it doesn't
> appear to break anything.  So, looks fine to me.
> 
> Acked-by: Andres Salomon <dilinger@queued.net>
> 
> 
> On Mon, 24 Sep 2012 16:27:19 -0700 "H. Peter
> Anvin" <hpa@linux.intel.com> wrote:
> 
> > From: "H. Peter Anvin" <hpa@linux.intel.com>
> > 
> > %cr4 is supposed to reflect a set of features into which the
> > operating system is opting in.  If the BIOS or bootloader leaks
> > bits here, this is not desirable.  Consider a bootloader passing in
> > %cr4.pae set to a legacy paging kernel, for example -- it will not
> > have any immediate effect, but the kernel would crash when turning
> > paging on.
> > 
> > A similar argument applies to %eflags, and since we have to look for
> > %eflags.id being settable we can use a sequence which clears %eflags
> > as a side effect.
> > 
> > Note that we already do this for x86-64.
> > 
> > I would like opinions on this especially from the PV crowd and
> > nonstandard platforms (e.g. OLPC) to make sure we don't screw them
> > up.
> > 
> > Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
> > ---
> >  arch/x86/kernel/head_32.S |   31 ++++++++++++++++---------------
> >  1 files changed, 16 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
> > index d42ab17..957a47a 100644
> > --- a/arch/x86/kernel/head_32.S
> > +++ b/arch/x86/kernel/head_32.S
> > @@ -287,27 +287,28 @@ ENTRY(startup_32_smp)
> >  	leal -__PAGE_OFFSET(%ecx),%esp
> >  
> >  default_entry:
> > -
> >  /*
> >   *	New page tables may be in 4Mbyte page mode and may
> >   *	be using the global pages. 
> >   *
> >   *	NOTE! If we are on a 486 we may have no cr4 at all!
> > - *	So we do not try to touch it unless we really have
> > - *	some bits in it to set.  This won't work if the BSP
> > - *	implements cr4 but this AP does not -- very unlikely
> > - *	but be warned!  The same applies to the pse feature
> > - *	if not equally supported. --macro
> > - *
> > - *	NOTE! We have to correct for the fact that we're
> > - *	not yet offset PAGE_OFFSET..
> > + *	Specifically, cr4 exists if and only if CPUID exists,
> > + *	which in turn exists if and only if EFLAGS.ID exists.
> >   */
> > -#define cr4_bits pa(mmu_cr4_features)
> > -	movl cr4_bits,%edx
> > -	andl %edx,%edx
> > -	jz 6f
> > -	movl %cr4,%eax		# Turn on paging options
> > (PSE,PAE,..)
> > -	orl %edx,%eax
> > +	movl $X86_EFLAGS_ID,%ecx
> > +	pushl %ecx
> > +	popfl
> > +	pushfl
> > +	popl %eax
> > +	pushl $0
> > +	popfl
> > +	pushfl
> > +	popl %edx
> > +	xorl %edx,%eax
> > +	testl %ecx,%eax
> > +	jz 6f			# No ID flag = no CPUID = no
> > CR4 +
> > +	movl pa(mmu_cr4_features),%eax
> >  	movl %eax,%cr4
> >  
> >  	testb $X86_CR4_PAE, %al		# check if PAE is
> > enabled

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2013-01-19  0:40           ` Andres Salomon
@ 2013-01-19  0:42             ` H. Peter Anvin
  2013-01-19  1:05               ` [Techteam] " Mitch Bradley
  0 siblings, 1 reply; 56+ messages in thread
From: H. Peter Anvin @ 2013-01-19  0:42 UTC (permalink / raw)
  To: Andres Salomon
  Cc: Linux Kernel Mailing List, Ingo Molnar, Thomas Gleixner,
	Dave Jones, Linus Torvalds, Eric W. Biederman, Ian Campbell,
	Konrad Rzeszutek Wilk, Jeremy Fitzhardinge, Rusty Russell,
	David Woodhouse, Vivek Goyal, Yinghai Lu, H. Peter Anvin,
	techteam

On 01/18/2013 04:40 PM, Andres Salomon wrote:
> Bad news on this patch; I've been told that it breaks booting on an
> XO-1.5.  Does anyone from OLPC know why yet?

What are the settings of CR0 and CR4 on kernel entry on XO-1.5?

	-hpa



^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Techteam] [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2013-01-19  0:42             ` H. Peter Anvin
@ 2013-01-19  1:05               ` Mitch Bradley
  2013-01-19  2:35                 ` H. Peter Anvin
  0 siblings, 1 reply; 56+ messages in thread
From: Mitch Bradley @ 2013-01-19  1:05 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andres Salomon, Jeremy Fitzhardinge, Ian Campbell,
	Konrad Rzeszutek Wilk, David Woodhouse, Rusty Russell,
	Linux Kernel Mailing List, Vivek Goyal, Eric W. Biederman,
	H. Peter Anvin, Dave Jones, Thomas Gleixner, Linus Torvalds,
	Ingo Molnar, techteam, Yinghai Lu



On 1/18/2013 2:42 PM, H. Peter Anvin wrote:
> On 01/18/2013 04:40 PM, Andres Salomon wrote:
>> Bad news on this patch; I've been told that it breaks booting on an
>> XO-1.5.  Does anyone from OLPC know why yet?
> 
> What are the settings of CR0 and CR4 on kernel entry on XO-1.5?


CR0 is 0x80000011
CR4 is 0x10
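
For reference, decoded against the architectural bit layout those values mean
the kernel is entered with protected mode and paging already enabled, and with
4 MB pages (CR4.PSE) in use.  A minimal sketch, not kernel code; the bit
values match the definitions in <asm/processor-flags.h>:

#include <stdio.h>

#define X86_CR0_PE	0x00000001	/* protected mode */
#define X86_CR0_ET	0x00000010	/* extension type (always set on modern CPUs) */
#define X86_CR0_PG	0x80000000	/* paging enabled */
#define X86_CR4_PSE	0x00000010	/* 4 MB page size extension */

int main(void)
{
	unsigned int cr0 = 0x80000011, cr4 = 0x10;	/* values reported for the XO-1.5 */

	printf("CR0: PE=%d ET=%d PG=%d\n",
	       !!(cr0 & X86_CR0_PE), !!(cr0 & X86_CR0_ET), !!(cr0 & X86_CR0_PG));
	printf("CR4: PSE=%d\n", !!(cr4 & X86_CR4_PSE));
	return 0;
}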

> 
> 	-hpa
> 
> 
> _______________________________________________
> Techteam mailing list
> Techteam@lists.laptop.org
> http://lists.laptop.org/listinfo/techteam
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Techteam] [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2013-01-19  1:05               ` [Techteam] " Mitch Bradley
@ 2013-01-19  2:35                 ` H. Peter Anvin
  2013-01-19  7:44                   ` Mitch Bradley
                                     ` (2 more replies)
  0 siblings, 3 replies; 56+ messages in thread
From: H. Peter Anvin @ 2013-01-19  2:35 UTC (permalink / raw)
  To: Mitch Bradley
  Cc: Andres Salomon, Jeremy Fitzhardinge, Ian Campbell,
	Konrad Rzeszutek Wilk, David Woodhouse, Rusty Russell,
	Linux Kernel Mailing List, Vivek Goyal, Eric W. Biederman,
	H. Peter Anvin, Dave Jones, Thomas Gleixner, Linus Torvalds,
	Ingo Molnar, techteam, Yinghai Lu

[-- Attachment #1: Type: text/plain, Size: 1154 bytes --]

On 01/18/2013 05:05 PM, Mitch Bradley wrote:
> 
> 
> On 1/18/2013 2:42 PM, H. Peter Anvin wrote:
>> On 01/18/2013 04:40 PM, Andres Salomon wrote:
>>> Bad news on this patch; I've been told that it breaks booting on an
>>> XO-1.5.  Does anyone from OLPC know why yet?
>>
>> What are the settings of CR0 and CR4 on kernel entry on XO-1.5?
> 
> 
> CR0 is 0x80000011
> CR4 is 0x10
> 

OK, that makes sense... the kernel doesn't enable the PSE bit yet and I
bet that's what you're using for the non-stolen page tables.

Can we simply disable paging before mucking with CR4?  The other option
that I can see is to always enable PSE and PGE, since they are simply
feature opt-ins that don't do any harm if unused.  At the same time,
though, entering the kernel through the default_entry path with paging
enabled is definitely not anything the kernel expects.

Does this patch work for you?  Since we have ditched 386 support, we can
mimic x86-64 (yay, one more difference gone!) and just use a predefined
value for %cr0 (the FPU flags need to change if we are on an FPU-less
chip, but that happens during FPU probing.)

	-hpa




[-- Attachment #2: diff --]
[-- Type: text/plain, Size: 836 bytes --]

diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 8e7f655..2713ea1 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -300,6 +300,11 @@ ENTRY(startup_32_smp)
 	leal -__PAGE_OFFSET(%ecx),%esp
 
 default_entry:
+#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
+			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
+			 X86_CR0_PG)
+	movl $(CR0_STATE & ~X86_CR0_PG),%eax
+	movl %eax,%cr0
 /*
  *	New page tables may be in 4Mbyte page mode and may
  *	be using the global pages. 
@@ -364,8 +369,7 @@ default_entry:
  */
 	movl $pa(initial_page_table), %eax
 	movl %eax,%cr3		/* set the page table pointer.. */
-	movl %cr0,%eax
-	orl  $X86_CR0_PG,%eax
+	movl $CR0_STATE,%eax
 	movl %eax,%cr0		/* ..and set paging (PG) bit */
 	ljmp $__BOOT_CS,$1f	/* Clear prefetch and normalize %eip */
 1:
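
For reference, the fixed %cr0 value that CR0_STATE defines works out as
follows; a small user-space sketch, illustrative only, reusing the bit
definitions from <asm/processor-flags.h>:

#include <stdio.h>

#define X86_CR0_PE	0x00000001
#define X86_CR0_MP	0x00000002
#define X86_CR0_ET	0x00000010
#define X86_CR0_NE	0x00000020
#define X86_CR0_WP	0x00010000
#define X86_CR0_AM	0x00040000
#define X86_CR0_PG	0x80000000

#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
			 X86_CR0_PG)

int main(void)
{
	/* first write in the patch: paging forced off, everything else final */
	printf("CR0_STATE & ~PG = 0x%08x\n", CR0_STATE & ~X86_CR0_PG);
	/* second write, once %cr3/%cr4 are set up: paging switched back on */
	printf("CR0_STATE       = 0x%08x\n", CR0_STATE);
	return 0;
}

This prints 0x00050033 and 0x80050033: the first write guarantees paging is
off before %cr3 and %cr4 are touched, and only the final write turns PG back
on.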

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [Techteam] [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2013-01-19  2:35                 ` H. Peter Anvin
@ 2013-01-19  7:44                   ` Mitch Bradley
  2013-01-19 12:34                   ` Daniel Drake
  2013-01-19 19:15                   ` [tip:x86/urgent] x86-32: Start out cr0 clean, disable paging before modifying cr3/4 tip-bot for H. Peter Anvin
  2 siblings, 0 replies; 56+ messages in thread
From: Mitch Bradley @ 2013-01-19  7:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andres Salomon, Jeremy Fitzhardinge, Ian Campbell,
	Konrad Rzeszutek Wilk, David Woodhouse, Rusty Russell,
	Linux Kernel Mailing List, Vivek Goyal, Eric W. Biederman,
	H. Peter Anvin, Dave Jones, Thomas Gleixner, Linus Torvalds,
	Ingo Molnar, techteam, Yinghai Lu



On 1/18/2013 4:35 PM, H. Peter Anvin wrote:
> On 01/18/2013 05:05 PM, Mitch Bradley wrote:
>>
>>
>> On 1/18/2013 2:42 PM, H. Peter Anvin wrote:
>>> On 01/18/2013 04:40 PM, Andres Salomon wrote:
>>>> Bad news on this patch; I've been told that it breaks booting on an
>>>> XO-1.5.  Does anyone from OLPC know why yet?
>>>
>>> What are the settings of CR0 and CR4 on kernel entry on XO-1.5?
>>
>>
>> CR0 is 0x80000011
>> CR4 is 0x10
>>
> 
> OK, that makes sense... the kernel doesn't enable the PSE bit yet and I
> bet that's what you're using for the non-stolen page tables.

Indeed, we are using 4M pages to map the firmware into high virtual
addresses.
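
For context, a 32-bit non-PAE page-directory entry for such a 4 MB mapping
looks like this; a sketch with a made-up physical address, assuming the
standard PDE layout -- the PS bit is precisely what requires CR4.PSE to stay
set while the mapping is live:

#include <stdio.h>
#include <stdint.h>

#define PDE_PRESENT	(1u << 0)
#define PDE_RW		(1u << 1)
#define PDE_PS		(1u << 7)	/* 4 MB page; only honoured with CR4.PSE=1 */

static uint32_t pde_4mb(uint32_t phys_4mb_aligned)
{
	/* bits 31:22 select the 4 MB frame, low bits are attribute flags */
	return (phys_4mb_aligned & 0xffc00000u) | PDE_PS | PDE_RW | PDE_PRESENT;
}

int main(void)
{
	/* hypothetical firmware region at 4 MB-aligned physical 0xffc00000 */
	printf("PDE = 0x%08x\n", pde_4mb(0xffc00000u));
	return 0;
}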

> 
> Can we simply disable paging before mucking with CR4?  The other option
> that I can see is to always enable PSE and PGE, since they are simply
> feature opt-ins that don't do any harm if unused.  At the same time,
> though, entering the kernel through the default_entry path with paging
> enabled is definitely not anything the kernel expects.
> 
> Does this patch work for you?  Since we have ditched 386 support, we can
> mimic x86-64 (yay, one more difference gone!) and just use a predefined
> value for %cr0 (the FPU flags need to change if we are on an FPU-less
> chip, but that happens during FPU probing.)
> 
> Does this patch work for you?

We will test it and get back to you.

> 
> 	-hpa
> 
> 
> 

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [Techteam] [RFC PATCH] x86-32: Start out eflags and cr4 clean
  2013-01-19  2:35                 ` H. Peter Anvin
  2013-01-19  7:44                   ` Mitch Bradley
@ 2013-01-19 12:34                   ` Daniel Drake
  2013-01-19 19:15                   ` [tip:x86/urgent] x86-32: Start out cr0 clean, disable paging before modifying cr3/4 tip-bot for H. Peter Anvin
  2 siblings, 0 replies; 56+ messages in thread
From: Daniel Drake @ 2013-01-19 12:34 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Mitch Bradley, Jeremy Fitzhardinge, H. Peter Anvin, Ian Campbell,
	Konrad Rzeszutek Wilk, Linus Torvalds, Rusty Russell,
	Linux Kernel Mailing List, Vivek Goyal, Eric W. Biederman,
	Dave Jones, Thomas Gleixner, David Woodhouse, Ingo Molnar,
	techteam, Yinghai Lu

On Fri, Jan 18, 2013 at 8:35 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
> Can we simply disable paging before mucking with CR4?  The other option
> that I can see is to always enable PSE and PGE, since they are simply
> feature opt-ins that don't do any harm if unused.  At the same time,
> though, entering the kernel through the default_entry path with paging
> enabled is definitely not anything the kernel expects.
>
> Does this patch work for you?  Since we have ditched 386 support, we can
> mimic x86-64 (yay, one more difference gone!) and just use a predefined
> value for %cr0 (the FPU flags need to change if we are on an FPU-less
> chip, but that happens during FPU probing.)

The patch fixes boot on XO-1.5. Thanks for the quick response!

Daniel

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [tip:x86/urgent] x86-32: Start out cr0 clean, disable paging before modifying cr3/4
  2013-01-19  2:35                 ` H. Peter Anvin
  2013-01-19  7:44                   ` Mitch Bradley
  2013-01-19 12:34                   ` Daniel Drake
@ 2013-01-19 19:15                   ` tip-bot for H. Peter Anvin
  2 siblings, 0 replies; 56+ messages in thread
From: tip-bot for H. Peter Anvin @ 2013-01-19 19:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, dsd, dilinger, tglx, hpa

Commit-ID:  021ef050fc092d5638e69868d126c18006ea7296
Gitweb:     http://git.kernel.org/tip/021ef050fc092d5638e69868d126c18006ea7296
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Sat, 19 Jan 2013 10:29:37 -0800
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Sat, 19 Jan 2013 11:01:22 -0800

x86-32: Start out cr0 clean, disable paging before modifying cr3/4

Patch

  5a5a51db78e x86-32: Start out eflags and cr4 clean

... made x86-32 match x86-64 in that we initialize %eflags and %cr4
from scratch.  This broke OLPC XO-1.5, because the XO enters the
kernel with paging enabled, which the kernel doesn't expect.

Since we no longer support 386 (the source of most of the variability
in %cr0 configuration), we can simply further match x86-64 and
initialize %cr0 to a fixed value -- the one variable part remaining in
%cr0 is for FPU control, but all that is handled later on in
initialization; in particular, configuring %cr0 as if the FPU is
present until proven otherwise is correct and necessary for the probe
to work.

To deal with the XO case sanely, explicitly disable paging in %cr0
before we muck with %cr3, %cr4 or EFER -- those operations are
inherently unsafe with paging enabled.

NOTE: There is still a lot of 386-related junk in head_32.S which we
can and should get rid of; however, this is intended as a minimal fix,
and the cleanup can be deferred to the next merge window.

Reported-by: Andres Salomon <dilinger@queued.net>
Tested-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/50FA0661.2060400@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/kernel/head_32.S | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 8e7f655..c8932c7 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -300,6 +300,12 @@ ENTRY(startup_32_smp)
 	leal -__PAGE_OFFSET(%ecx),%esp
 
 default_entry:
+#define CR0_STATE	(X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | \
+			 X86_CR0_NE | X86_CR0_WP | X86_CR0_AM | \
+			 X86_CR0_PG)
+	movl $(CR0_STATE & ~X86_CR0_PG),%eax
+	movl %eax,%cr0
+
 /*
  *	New page tables may be in 4Mbyte page mode and may
  *	be using the global pages. 
@@ -364,8 +370,7 @@ default_entry:
  */
 	movl $pa(initial_page_table), %eax
 	movl %eax,%cr3		/* set the page table pointer.. */
-	movl %cr0,%eax
-	orl  $X86_CR0_PG,%eax
+	movl $CR0_STATE,%eax
 	movl %eax,%cr0		/* ..and set paging (PG) bit */
 	ljmp $__BOOT_CS,$1f	/* Clear prefetch and normalize %eip */
 1:

^ permalink raw reply related	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2013-01-19 19:15 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-21 19:43 [PATCH 00/11] x86: Supervisor Mode Access Prevention H. Peter Anvin
2012-09-21 19:43 ` [PATCH 01/11] x86, cpufeature: Add feature bit for SMAP H. Peter Anvin
2012-09-21 19:43 ` [PATCH 02/11] x86-32, mm: The WP test should be done on a kernel page H. Peter Anvin
2012-09-21 19:58   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
2012-09-21 19:43 ` [PATCH 03/11] x86, smap: Add CR4 bit for SMAP H. Peter Anvin
2012-09-21 19:59   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
2012-09-21 19:43 ` [PATCH 04/11] x86, alternative: Use .pushsection/.popsection H. Peter Anvin
2012-09-21 20:00   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
2012-09-21 19:43 ` [PATCH 05/11] x86, alternative: Add header guards to <asm/alternative-asm.h> H. Peter Anvin
2012-09-21 20:01   ` [tip:x86/smap] x86, alternative: Add header guards to <asm/alternative-asm.h> tip-bot for H. Peter Anvin
2012-09-21 19:43 ` [PATCH 06/11] x86, smap: Add a header file with macros for STAC/CLAC H. Peter Anvin
2012-09-21 20:02   ` [tip:x86/smap] x86, smap: Add a header file with macros for STAC/CLAC tip-bot for H. Peter Anvin
2012-09-21 19:43 ` [PATCH 07/11] x86, uaccess: Merge prototypes for clear_user/__clear_user H. Peter Anvin
2012-09-21 20:03   ` [tip:x86/smap] x86, uaccess: Merge prototypes for clear_user/__clear_user tip-bot for H. Peter Anvin
2012-09-21 19:43 ` [PATCH 08/11] x86, smap: Add STAC and CLAC instructions to control user space access H. Peter Anvin
2012-09-21 20:04   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
2012-09-22  0:16   ` [tip:x86/smap] x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry tip-bot for H. Peter Anvin
2012-09-21 19:43 ` [PATCH 09/11] x86, smap: Turn on Supervisor Mode Access Prevention H. Peter Anvin
2012-09-21 20:05   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
2012-09-21 19:43 ` [PATCH 10/11] x86, smap: A page fault due to SMAP is an oops H. Peter Anvin
2012-09-21 20:06   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
2012-09-21 19:43 ` [PATCH 11/11] x86, smap: Reduce the SMAP overhead for signal handling H. Peter Anvin
2012-09-21 20:07   ` [tip:x86/smap] " tip-bot for H. Peter Anvin
2012-09-21 19:54 ` [PATCH 00/11] x86: Supervisor Mode Access Prevention Linus Torvalds
2012-09-21 19:57   ` H. Peter Anvin
2012-09-21 20:08   ` Ingo Molnar
2012-09-21 21:03     ` H. Peter Anvin
2012-09-21 21:09       ` Linus Torvalds
2012-09-21 21:12         ` H. Peter Anvin
2012-09-21 22:07 ` Eric W. Biederman
2012-09-21 22:12   ` H. Peter Anvin
2012-09-22  0:41     ` Eric W. Biederman
2012-09-24 23:27       ` [RFC PATCH] x86-32: Start out eflags and cr4 clean H. Peter Anvin
2012-09-25 13:27         ` Konrad Rzeszutek Wilk
2012-09-25 13:48         ` Ian Campbell
2012-09-26 11:29           ` Konrad Rzeszutek Wilk
2012-09-27  6:11         ` [tip:x86/smap] " tip-bot for H. Peter Anvin
2012-11-24  3:49           ` Yuhong Bao
2012-11-24  5:06             ` H. Peter Anvin
2012-09-27  6:11         ` [tip:x86/smap] x86, suspend: On wakeup always initialize cr4 and EFER tip-bot for H. Peter Anvin
2012-10-01 22:04         ` [tip:x86/urgent] x86, suspend: Correct the restore of CR4, EFER; skip computing EFLAGS.ID tip-bot for H. Peter Anvin
2012-10-02  6:52         ` tip-bot for H. Peter Anvin
2012-10-10 19:59         ` [RFC PATCH] x86-32: Start out eflags and cr4 clean Andres Salomon
2013-01-19  0:40           ` Andres Salomon
2013-01-19  0:42             ` H. Peter Anvin
2013-01-19  1:05               ` [Techteam] " Mitch Bradley
2013-01-19  2:35                 ` H. Peter Anvin
2013-01-19  7:44                   ` Mitch Bradley
2013-01-19 12:34                   ` Daniel Drake
2013-01-19 19:15                   ` [tip:x86/urgent] x86-32: Start out cr0 clean, disable paging before modifying cr3/4 tip-bot for H. Peter Anvin
2012-09-21 22:08 ` [PATCH 00/11] x86: Supervisor Mode Access Prevention Dave Jones
2012-09-21 22:10   ` H. Peter Anvin
2012-09-22 11:32     ` Ingo Molnar
2012-09-24 20:31       ` H. Peter Anvin
2012-09-24 20:43         ` Kees Cook
2012-09-24 20:51           ` H. Peter Anvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).