* [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP)
@ 2014-05-07  8:19 Feng Wu
  2014-05-07  8:19 ` [PATCH v6 01/10] x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET Feng Wu
                   ` (11 more replies)
  0 siblings, 12 replies; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Supervisor Mode Access Prevention (SMAP) is a new security
feature disclosed by Intel; please refer to the following
document:

http://software.intel.com/sites/default/files/319433-014.pdf

Every access to a linear address is either a supervisor-mode
access or a user-mode access. All accesses performed while the
current privilege level (CPL) is less than 3 are supervisor-mode
accesses. If CPL = 3, accesses are generally user-mode accesses.
However, some operations implicitly access system data structures,
and the resulting accesses to those data structures are supervisor-mode
accesses regardless of CPL. Examples of such implicit supervisor
accesses include the following: accesses to the global descriptor
table (GDT) or local descriptor table (LDT) to load a segment descriptor;
accesses to the interrupt descriptor table (IDT) when delivering an
interrupt or exception; and accesses to the task-state segment (TSS) as
part of a task switch or change of CPL.

If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
to linear addresses that are accessible in user mode. If CPL < 3,
SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
applies to all supervisor-mode data accesses (these are implicit
supervisor accesses) regardless of the value of EFLAGS.AC.
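
Expressed as code, the fault classification added later in this series
(patch 08) boils down to the following check. This is only a simplified
sketch, not the exact Xen code; page_user and cpl stand for values
gathered during the page walk:

    /* Sketch: does a page fault in Xen look like an SMAP violation? */
    static bool_t is_smap_violation(unsigned long cr4, unsigned long eflags,
                                    unsigned int error_code,
                                    bool_t page_user, unsigned int cpl)
    {
        if ( !(cr4 & X86_CR4_SMAP) || !page_user )
            return 0;                  /* SMAP off, or no user mapping   */
        if ( error_code & PFEC_user_mode )
            return 0;                  /* not a supervisor-mode access   */
        /* Implicit supervisor accesses at CPL = 3 ignore EFLAGS.AC.     */
        return (cpl == 3) || !(eflags & X86_EFLAGS_AC);
    }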

Version 1:
  * Add two macros for STAC/CLAC instructions
  * Temporary disable SMAP to legally access user pages in kernel mode
  * Enable Supervisor Mode Access Prevention (SMAP) for Xen itself
  * Add SMAP support to HVM guest
  * Disable SMAP feature when guest is in non-paging mode

Version 2:
  * Change the definition of ASM_STAC/ASM_CLAC.
  * Clear AC bit at the beginning of exception, interrupt, and hypercall handling.
  * Wrap construct_dom0() in a stac()/clac() pair as a whole.
  * Reorder some patches in the series.
  * Combine some conditionals with SMEP.
  * Typo, etc. 

Version 3:
  * Clean-ups to ASM_STAC/ASM_CLAC
  * Enable SMAP after constructing domain 0
  * Move common_interrupt to entry.S
  * Remove ASM_CLAC calls in some places where exception happens
  * Correct the logic in hvm_vcpu_has_smep()/hvm_vcpu_has_smap() 
  * Make the output message more readable when SMAP violation happens
  * Use hvm_get_segment_register() to get the guest SS in guest_walk_tables()
  * Coding style changes, etc.

Version 4:
  * Use common macro CPUINFO_features instead of CPUINFO86_ext_features in xen/arch/x86/boot/head.S
  * Make ASM_STAC/ASM_CLAC common both in assembly and C code
  * Merge xen/include/asm-x86/x86_64/asm_defns.h into xen/include/asm-x86/asm_defns.h
  * Add a parameter to SAVE_ALL to make the inclusion of ASM_CLAC in it optional
  * Remove the ASM_STAC/ASM_CLAC pair in compat_create_bounce_frame, since this chunk of
    code only accesses the pv guest's kernel stack, which is in ring 1 for 32-bit pv guests.
  * Call "setup_clear_cpu_cap(X86_FEATURE_SMAP)" before APs get brought up
  * Coding style changes.

Version 5:
  * Remove the C version of CPUINFO_FEATURE_OFFSET.
  * Implement clac()/stac() in C.
  * Set the default value of the parameter to 1 for macro SAVE_ALL.
  * Add const to the second parameter of __page_fault_type().
  * Clear SMAP bit in CR4 before construct_dom0() and set the bit back after it.
  * Coding style changes.

Version 6:
  * Add memory clobbers for clac()/stac()
  * Add ASM_CLAC at the beginning of ignore_int
  * Some comment changes

Since the Linux kernel already supports SMAP, I tested this patch
series with a Linux guest in both EPT and shadow mode, and it works
well. I also tested the failure case: I triggered an SMAP violation
in the guest kernel, and Linux successfully received the resulting
page fault.

Feng Wu (10):
  x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET
  x86: move common_interrupt to entry.S
  x86: merge stuff from asm-x86/x86_64/asm_defns.h to
    asm-x86/asm_defns.h
  x86: Add support for STAC/CLAC instructions
  Clear AC bit in RFLAGS to protect Xen itself by SMAP
  x86: Temporary disable SMAP to legally access user pages in kernel
    mode
  VMX: Disable SMAP feature when guest is in non-paging mode
  x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  x86/hvm: Add SMAP support to HVM guest
  x86/tools: Expose SMAP to HVM guests

 docs/misc/xen-command-line.markdown    |   7 +
 tools/libxc/xc_cpufeature.h            |   1 +
 tools/libxc/xc_cpuid_x86.c             |   1 +
 xen/arch/x86/boot/head.S               |   3 +-
 xen/arch/x86/hvm/hvm.c                 |   3 +
 xen/arch/x86/hvm/vmx/vmx.c             |   6 +-
 xen/arch/x86/i8259.c                   |   2 -
 xen/arch/x86/mm/guest_walk.c           |  40 ++++--
 xen/arch/x86/setup.c                   |  27 ++++
 xen/arch/x86/traps.c                   |  70 +++++++--
 xen/arch/x86/usercopy.c                |   6 +
 xen/arch/x86/x86_64/asm-offsets.c      |   2 +-
 xen/arch/x86/x86_64/compat/entry.S     |   3 +-
 xen/arch/x86/x86_64/entry.S            |  15 +-
 xen/arch/x86/x86_64/traps.c            |   2 +-
 xen/include/asm-x86/asm_defns.h        | 253 ++++++++++++++++++++++++++++++++-
 xen/include/asm-x86/cpufeature.h       |   5 +
 xen/include/asm-x86/domain.h           |   6 +-
 xen/include/asm-x86/hvm/hvm.h          |  18 ++-
 xen/include/asm-x86/uaccess.h          |   8 +-
 xen/include/asm-x86/x86_64/asm_defns.h | 231 ------------------------------
 xen/include/asm-x86/x86_64/system.h    |   4 +-
 22 files changed, 439 insertions(+), 274 deletions(-)
 delete mode 100644 xen/include/asm-x86/x86_64/asm_defns.h

-- 
1.8.3.1


* [PATCH v6 01/10] x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-07  8:19 ` [PATCH v6 02/10] x86: move common_interrupt to entry.S Feng Wu
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

This patch defines the macros CPUINFO_features and CPUINFO_FEATURE_OFFSET.
CPUINFO_features can be used as the base of the offset for CPU features,
while CPUINFO_FEATURE_OFFSET yields the right offset for a specific
CPU feature.
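
As a worked example of the arithmetic, using the feature numbering from
cpufeature.h:

    /* CPUINFO_FEATURE_OFFSET(f) == ((f >> 3) & ~3) + CPUINFO_features     */
    /* X86_FEATURE_LM   = 1*32+29 =  61 -> (61 >> 3) & ~3  =  4            */
    /*   i.e. x86_capability[1], which CPUINFO86_ext_features used to be   */
    /* X86_FEATURE_SMAP = 7*32+20 = 244 -> (244 >> 3) & ~3 = 28            */
    /*   i.e. x86_capability[7]; the bit within that word is 244 & 31 = 20 */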

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/boot/head.S          | 3 ++-
 xen/arch/x86/x86_64/asm-offsets.c | 2 +-
 xen/include/asm-x86/asm_defns.h   | 3 +++
 xen/include/asm-x86/cpufeature.h  | 4 ++++
 4 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index ca0e97d..8e4e065 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -6,6 +6,7 @@
 #include <asm/fixmap.h>
 #include <asm/page.h>
 #include <asm/msr.h>
+#include <asm/cpufeature.h>
 
         .text
         .code32
@@ -137,7 +138,7 @@ __start:
         mov     $0x80000001,%eax
         cpuid
 1:      mov     %edx,sym_phys(cpuid_ext_features)
-        mov     %edx,sym_phys(boot_cpu_data)+CPUINFO86_ext_features
+        mov     %edx,sym_phys(boot_cpu_data)+CPUINFO_FEATURE_OFFSET(X86_FEATURE_LM)
 
         /* Check for availability of long mode. */
         bt      $29,%edx
diff --git a/xen/arch/x86/x86_64/asm-offsets.c b/xen/arch/x86/x86_64/asm-offsets.c
index 9acb648..5f46969 100644
--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -160,7 +160,7 @@ void __dummy__(void)
     OFFSET(IRQSTAT_softirq_pending, irq_cpustat_t, __softirq_pending);
     BLANK();
 
-    OFFSET(CPUINFO86_ext_features, struct cpuinfo_x86, x86_capability[1]);
+    OFFSET(CPUINFO_features, struct cpuinfo_x86, x86_capability[0]);
     BLANK();
 
     OFFSET(MB_flags, multiboot_info_t, flags);
diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
index a4601ba..2b055f5 100644
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -77,6 +77,9 @@ void ret_from_intr(void);
 #define ASSERT_NOT_IN_ATOMIC
 #endif
 
+#define CPUINFO_FEATURE_OFFSET(feature)           \
+        (((feature >> 3) & ~3) + CPUINFO_features)
+
 #else
 
 #ifdef __clang__ /* clang's builtin assember can't do .subsection */
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 6a6b1ab..20881c0 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -7,7 +7,9 @@
 #ifndef __ASM_I386_CPUFEATURE_H
 #define __ASM_I386_CPUFEATURE_H
 
+#ifndef __ASSEMBLY__
 #include <xen/bitops.h>
+#endif
 
 #define NCAPINTS	8	/* N 32-bit words worth of info */
 
@@ -152,6 +154,7 @@
 #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
 #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
 
+#ifndef __ASSEMBLY__
 #define cpu_has(c, bit)		test_bit(bit, (c)->x86_capability)
 #define boot_cpu_has(bit)	test_bit(bit, boot_cpu_data.x86_capability)
 #define cpufeat_mask(idx)       (1u << ((idx) & 31))
@@ -210,6 +213,7 @@
 #define cpu_has_vmx		boot_cpu_has(X86_FEATURE_VMXE)
 
 #define cpu_has_cpuid_faulting	boot_cpu_has(X86_FEATURE_CPUID_FAULTING)
+#endif
 
 #endif /* __ASM_I386_CPUFEATURE_H */
 
-- 
1.8.3.1


* [PATCH v6 02/10] x86: move common_interrupt to entry.S
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
  2014-05-07  8:19 ` [PATCH v6 01/10] x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-07  8:19 ` [PATCH v6 03/10] x86: merge stuff from asm-x86/x86_64/asm_defns.h to asm-x86/asm_defns.h Feng Wu
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

This patch moves the common_interrupt label from asm_defns.h to entry.S.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/i8259.c                   |  2 --
 xen/arch/x86/x86_64/entry.S            |  6 +++++
 xen/include/asm-x86/x86_64/asm_defns.h | 49 ++++++++++++++--------------------
 3 files changed, 26 insertions(+), 31 deletions(-)

diff --git a/xen/arch/x86/i8259.c b/xen/arch/x86/i8259.c
index 6fdcce8..9fec490 100644
--- a/xen/arch/x86/i8259.c
+++ b/xen/arch/x86/i8259.c
@@ -35,8 +35,6 @@
 
 __asm__(".section .text");
 
-BUILD_COMMON_IRQ()
-
 #define IRQ_NAME(nr) VEC##nr##_interrupt
 
 #define BI(nr)                                           \
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 3ea4683..1c81852 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -465,6 +465,12 @@ ENTRY(dom_crash_sync_extable)
         xorl  %edi,%edi
         jmp   asm_domain_crash_synchronous /* Does not return */
 
+ENTRY(common_interrupt)
+        SAVE_ALL
+        movq %rsp,%rdi
+        callq do_IRQ
+        jmp ret_from_intr
+
 /* No special register assumptions. */
 ENTRY(ret_from_intr)
         GET_CURRENT(%rbx)
diff --git a/xen/include/asm-x86/x86_64/asm_defns.h b/xen/include/asm-x86/x86_64/asm_defns.h
index bf63ac1..aedb2da 100644
--- a/xen/include/asm-x86/x86_64/asm_defns.h
+++ b/xen/include/asm-x86/x86_64/asm_defns.h
@@ -51,27 +51,27 @@
         r__->entry_vector |= TRAP_regs_dirty; \
 })
 
-#define SAVE_ALL                                \
-        addq  $-(UREGS_error_code-UREGS_r15), %rsp; \
-        cld;                                    \
-        movq  %rdi,UREGS_rdi(%rsp);             \
-        movq  %rsi,UREGS_rsi(%rsp);             \
-        movq  %rdx,UREGS_rdx(%rsp);             \
-        movq  %rcx,UREGS_rcx(%rsp);             \
-        movq  %rax,UREGS_rax(%rsp);             \
-        movq  %r8,UREGS_r8(%rsp);               \
-        movq  %r9,UREGS_r9(%rsp);               \
-        movq  %r10,UREGS_r10(%rsp);             \
-        movq  %r11,UREGS_r11(%rsp);             \
-        movq  %rbx,UREGS_rbx(%rsp);             \
-        movq  %rbp,UREGS_rbp(%rsp);             \
-        SETUP_EXCEPTION_FRAME_POINTER(UREGS_rbp); \
-        movq  %r12,UREGS_r12(%rsp);             \
-        movq  %r13,UREGS_r13(%rsp);             \
-        movq  %r14,UREGS_r14(%rsp);             \
-        movq  %r15,UREGS_r15(%rsp);             \
-
 #ifdef __ASSEMBLY__
+.macro SAVE_ALL
+        addq  $-(UREGS_error_code-UREGS_r15), %rsp
+        cld
+        movq  %rdi,UREGS_rdi(%rsp)
+        movq  %rsi,UREGS_rsi(%rsp)
+        movq  %rdx,UREGS_rdx(%rsp)
+        movq  %rcx,UREGS_rcx(%rsp)
+        movq  %rax,UREGS_rax(%rsp)
+        movq  %r8,UREGS_r8(%rsp)
+        movq  %r9,UREGS_r9(%rsp)
+        movq  %r10,UREGS_r10(%rsp)
+        movq  %r11,UREGS_r11(%rsp)
+        movq  %rbx,UREGS_rbx(%rsp)
+        movq  %rbp,UREGS_rbp(%rsp)
+        SETUP_EXCEPTION_FRAME_POINTER(UREGS_rbp)
+        movq  %r12,UREGS_r12(%rsp)
+        movq  %r13,UREGS_r13(%rsp)
+        movq  %r14,UREGS_r14(%rsp)
+        movq  %r15,UREGS_r15(%rsp)
+.endm
 
 /*
  * Save all registers not preserved by C code or used in entry/exit code. Mark
@@ -208,15 +208,6 @@
 #define REX64_PREFIX "rex64/"
 #endif
 
-#define BUILD_COMMON_IRQ()                      \
-__asm__(                                        \
-    "\n" __ALIGN_STR"\n"                        \
-    "common_interrupt:\n\t"                     \
-    STR(SAVE_ALL) "\n\t"                        \
-    "movq %rsp,%rdi\n\t"                        \
-    "callq " STR(do_IRQ) "\n\t"                 \
-    "jmp ret_from_intr\n");
-
 #define BUILD_IRQ(nr)                           \
     "pushq $0\n\t"                              \
     "movl $"#nr",4(%rsp)\n\t"                   \
-- 
1.8.3.1


* [PATCH v6 03/10] x86: merge stuff from asm-x86/x86_64/asm_defns.h to asm-x86/asm_defns.h
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
  2014-05-07  8:19 ` [PATCH v6 01/10] x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET Feng Wu
  2014-05-07  8:19 ` [PATCH v6 02/10] x86: move common_interrupt to entry.S Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-07  8:19 ` [PATCH v6 04/10] x86: Add support for STAC/CLAC instructions Feng Wu
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

This patch moves content unchanged from asm-x86/x86_64/asm_defns.h to
asm-x86/asm_defns.h.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/include/asm-x86/asm_defns.h        | 217 +++++++++++++++++++++++++++++++-
 xen/include/asm-x86/x86_64/asm_defns.h | 222 ---------------------------------
 2 files changed, 216 insertions(+), 223 deletions(-)
 delete mode 100644 xen/include/asm-x86/x86_64/asm_defns.h

diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
index 2b055f5..c4b2bb2 100644
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -7,12 +7,65 @@
 #include <asm/asm-offsets.h>
 #endif
 #include <asm/processor.h>
+#include <asm/percpu.h>
 
 #ifndef __ASSEMBLY__
 void ret_from_intr(void);
 #endif
 
-#include <asm/x86_64/asm_defns.h>
+#ifdef CONFIG_FRAME_POINTER
+/* Indicate special exception stack frame by inverting the frame pointer. */
+#define SETUP_EXCEPTION_FRAME_POINTER(offs)     \
+        leaq  offs(%rsp),%rbp;                  \
+        notq  %rbp
+#else
+#define SETUP_EXCEPTION_FRAME_POINTER(offs)
+#endif
+
+#ifndef NDEBUG
+#define ASSERT_INTERRUPT_STATUS(x)              \
+        pushf;                                  \
+        testb $X86_EFLAGS_IF>>8,1(%rsp);        \
+        j##x  1f;                               \
+        ud2a;                                   \
+1:      addq  $8,%rsp;
+#else
+#define ASSERT_INTERRUPT_STATUS(x)
+#endif
+
+#define ASSERT_INTERRUPTS_ENABLED  ASSERT_INTERRUPT_STATUS(nz)
+#define ASSERT_INTERRUPTS_DISABLED ASSERT_INTERRUPT_STATUS(z)
+
+/*
+ * This flag is set in an exception frame when registers R12-R15 did not get
+ * saved.
+ */
+#define _TRAP_regs_partial 16
+#define TRAP_regs_partial  (1 << _TRAP_regs_partial)
+/*
+ * This flag gets set in an exception frame when registers R12-R15 possibly
+ * get modified from their originally saved values and hence need to be
+ * restored even if the normal call flow would restore register values.
+ *
+ * The flag being set implies _TRAP_regs_partial to be unset. Restoring
+ * R12-R15 thus is
+ * - required when this flag is set,
+ * - safe when _TRAP_regs_partial is unset.
+ */
+#define _TRAP_regs_dirty   17
+#define TRAP_regs_dirty    (1 << _TRAP_regs_dirty)
+
+#define mark_regs_dirty(r) ({ \
+        struct cpu_user_regs *r__ = (r); \
+        ASSERT(!((r__)->entry_vector & TRAP_regs_partial)); \
+        r__->entry_vector |= TRAP_regs_dirty; \
+})
+
+#ifdef __ASSEMBLY__
+# define _ASM_EX(p) p-.
+#else
+# define _ASM_EX(p) #p "-."
+#endif
 
 /* Exception table entry */
 #ifdef __ASSEMBLY__
@@ -106,4 +159,166 @@ void ret_from_intr(void);
 
 #endif
 
+#ifdef __ASSEMBLY__
+.macro SAVE_ALL
+        addq  $-(UREGS_error_code-UREGS_r15), %rsp
+        cld
+        movq  %rdi,UREGS_rdi(%rsp)
+        movq  %rsi,UREGS_rsi(%rsp)
+        movq  %rdx,UREGS_rdx(%rsp)
+        movq  %rcx,UREGS_rcx(%rsp)
+        movq  %rax,UREGS_rax(%rsp)
+        movq  %r8,UREGS_r8(%rsp)
+        movq  %r9,UREGS_r9(%rsp)
+        movq  %r10,UREGS_r10(%rsp)
+        movq  %r11,UREGS_r11(%rsp)
+        movq  %rbx,UREGS_rbx(%rsp)
+        movq  %rbp,UREGS_rbp(%rsp)
+        SETUP_EXCEPTION_FRAME_POINTER(UREGS_rbp)
+        movq  %r12,UREGS_r12(%rsp)
+        movq  %r13,UREGS_r13(%rsp)
+        movq  %r14,UREGS_r14(%rsp)
+        movq  %r15,UREGS_r15(%rsp)
+.endm
+
+/*
+ * Save all registers not preserved by C code or used in entry/exit code. Mark
+ * the frame as partial.
+ *
+ * @type: exception type
+ * @compat: R8-R15 don't need saving, and the frame nevertheless is complete
+ */
+.macro SAVE_VOLATILE type compat=0
+.if \compat
+        movl  $\type,UREGS_entry_vector-UREGS_error_code(%rsp)
+.else
+        movl  $\type|TRAP_regs_partial,\
+              UREGS_entry_vector-UREGS_error_code(%rsp)
+.endif
+        addq  $-(UREGS_error_code-UREGS_r15),%rsp
+        cld
+        movq  %rdi,UREGS_rdi(%rsp)
+        movq  %rsi,UREGS_rsi(%rsp)
+        movq  %rdx,UREGS_rdx(%rsp)
+        movq  %rcx,UREGS_rcx(%rsp)
+        movq  %rax,UREGS_rax(%rsp)
+.if !\compat
+        movq  %r8,UREGS_r8(%rsp)
+        movq  %r9,UREGS_r9(%rsp)
+        movq  %r10,UREGS_r10(%rsp)
+        movq  %r11,UREGS_r11(%rsp)
+.endif
+        movq  %rbx,UREGS_rbx(%rsp)
+        movq  %rbp,UREGS_rbp(%rsp)
+        SETUP_EXCEPTION_FRAME_POINTER(UREGS_rbp)
+.endm
+
+/*
+ * Complete a frame potentially only partially saved.
+ */
+.macro SAVE_PRESERVED
+        btrl  $_TRAP_regs_partial,UREGS_entry_vector(%rsp)
+        jnc   987f
+        movq  %r12,UREGS_r12(%rsp)
+        movq  %r13,UREGS_r13(%rsp)
+        movq  %r14,UREGS_r14(%rsp)
+        movq  %r15,UREGS_r15(%rsp)
+987:
+.endm
+
+/*
+ * Reload registers not preserved by C code from frame.
+ *
+ * @compat: R8-R11 don't need reloading
+ *
+ * For the way it is used in RESTORE_ALL, this macro must preserve EFLAGS.ZF.
+ */
+.macro LOAD_C_CLOBBERED compat=0
+.if !\compat
+        movq  UREGS_r11(%rsp),%r11
+        movq  UREGS_r10(%rsp),%r10
+        movq  UREGS_r9(%rsp),%r9
+        movq  UREGS_r8(%rsp),%r8
+.endif
+        movq  UREGS_rax(%rsp),%rax
+        movq  UREGS_rcx(%rsp),%rcx
+        movq  UREGS_rdx(%rsp),%rdx
+        movq  UREGS_rsi(%rsp),%rsi
+        movq  UREGS_rdi(%rsp),%rdi
+.endm
+
+/*
+ * Restore all previously saved registers.
+ *
+ * @adj: extra stack pointer adjustment to be folded into the adjustment done
+ *       anyway at the end of the macro
+ * @compat: R8-R15 don't need reloading
+ */
+.macro RESTORE_ALL adj=0 compat=0
+.if !\compat
+        testl $TRAP_regs_dirty,UREGS_entry_vector(%rsp)
+.endif
+        LOAD_C_CLOBBERED \compat
+.if !\compat
+        jz    987f
+        movq  UREGS_r15(%rsp),%r15
+        movq  UREGS_r14(%rsp),%r14
+        movq  UREGS_r13(%rsp),%r13
+        movq  UREGS_r12(%rsp),%r12
+#ifndef NDEBUG
+        .subsection 1
+987:    testl $TRAP_regs_partial,UREGS_entry_vector(%rsp)
+        jnz   987f
+        cmpq  UREGS_r15(%rsp),%r15
+        jne   789f
+        cmpq  UREGS_r14(%rsp),%r14
+        jne   789f
+        cmpq  UREGS_r13(%rsp),%r13
+        jne   789f
+        cmpq  UREGS_r12(%rsp),%r12
+        je    987f
+789:    ud2
+        .subsection 0
+#endif
+.endif
+987:    movq  UREGS_rbp(%rsp),%rbp
+        movq  UREGS_rbx(%rsp),%rbx
+        subq  $-(UREGS_error_code-UREGS_r15+\adj), %rsp
+.endm
+
+#endif
+
+#ifdef PERF_COUNTERS
+#define PERFC_INCR(_name,_idx,_cur)             \
+        pushq _cur;                             \
+        movslq VCPU_processor(_cur),_cur;       \
+        pushq %rdx;                             \
+        leaq __per_cpu_offset(%rip),%rdx;       \
+        movq (%rdx,_cur,8),_cur;                \
+        leaq per_cpu__perfcounters(%rip),%rdx;  \
+        addq %rdx,_cur;                         \
+        popq %rdx;                              \
+        incl ASM_PERFC_##_name*4(_cur,_idx,4);  \
+        popq _cur
+#else
+#define PERFC_INCR(_name,_idx,_cur)
+#endif
+
+/* Work around AMD erratum #88 */
+#define safe_swapgs                             \
+        "mfence; swapgs;"
+
+#ifdef __sun__
+#define REX64_PREFIX "rex64\\"
+#elif defined(__clang__)
+#define REX64_PREFIX ".byte 0x48; "
+#else
+#define REX64_PREFIX "rex64/"
+#endif
+
+#define BUILD_IRQ(nr)                           \
+    "pushq $0\n\t"                              \
+    "movl $"#nr",4(%rsp)\n\t"                   \
+    "jmp common_interrupt"
+
 #endif /* __X86_ASM_DEFNS_H__ */
diff --git a/xen/include/asm-x86/x86_64/asm_defns.h b/xen/include/asm-x86/x86_64/asm_defns.h
deleted file mode 100644
index aedb2da..0000000
--- a/xen/include/asm-x86/x86_64/asm_defns.h
+++ /dev/null
@@ -1,222 +0,0 @@
-#ifndef __X86_64_ASM_DEFNS_H__
-#define __X86_64_ASM_DEFNS_H__
-
-#include <asm/percpu.h>
-
-#ifdef CONFIG_FRAME_POINTER
-/* Indicate special exception stack frame by inverting the frame pointer. */
-#define SETUP_EXCEPTION_FRAME_POINTER(offs)     \
-        leaq  offs(%rsp),%rbp;                  \
-        notq  %rbp
-#else
-#define SETUP_EXCEPTION_FRAME_POINTER(offs)
-#endif
-
-#ifndef NDEBUG
-#define ASSERT_INTERRUPT_STATUS(x)              \
-        pushf;                                  \
-        testb $X86_EFLAGS_IF>>8,1(%rsp);        \
-        j##x  1f;                               \
-        ud2a;                                   \
-1:      addq  $8,%rsp;
-#else
-#define ASSERT_INTERRUPT_STATUS(x)
-#endif
-
-#define ASSERT_INTERRUPTS_ENABLED  ASSERT_INTERRUPT_STATUS(nz)
-#define ASSERT_INTERRUPTS_DISABLED ASSERT_INTERRUPT_STATUS(z)
-
-/*
- * This flag is set in an exception frame when registers R12-R15 did not get
- * saved.
- */
-#define _TRAP_regs_partial 16
-#define TRAP_regs_partial  (1 << _TRAP_regs_partial)
-/*
- * This flag gets set in an exception frame when registers R12-R15 possibly
- * get modified from their originally saved values and hence need to be
- * restored even if the normal call flow would restore register values.
- *
- * The flag being set implies _TRAP_regs_partial to be unset. Restoring
- * R12-R15 thus is
- * - required when this flag is set,
- * - safe when _TRAP_regs_partial is unset.
- */
-#define _TRAP_regs_dirty   17
-#define TRAP_regs_dirty    (1 << _TRAP_regs_dirty)
-
-#define mark_regs_dirty(r) ({ \
-        struct cpu_user_regs *r__ = (r); \
-        ASSERT(!((r__)->entry_vector & TRAP_regs_partial)); \
-        r__->entry_vector |= TRAP_regs_dirty; \
-})
-
-#ifdef __ASSEMBLY__
-.macro SAVE_ALL
-        addq  $-(UREGS_error_code-UREGS_r15), %rsp
-        cld
-        movq  %rdi,UREGS_rdi(%rsp)
-        movq  %rsi,UREGS_rsi(%rsp)
-        movq  %rdx,UREGS_rdx(%rsp)
-        movq  %rcx,UREGS_rcx(%rsp)
-        movq  %rax,UREGS_rax(%rsp)
-        movq  %r8,UREGS_r8(%rsp)
-        movq  %r9,UREGS_r9(%rsp)
-        movq  %r10,UREGS_r10(%rsp)
-        movq  %r11,UREGS_r11(%rsp)
-        movq  %rbx,UREGS_rbx(%rsp)
-        movq  %rbp,UREGS_rbp(%rsp)
-        SETUP_EXCEPTION_FRAME_POINTER(UREGS_rbp)
-        movq  %r12,UREGS_r12(%rsp)
-        movq  %r13,UREGS_r13(%rsp)
-        movq  %r14,UREGS_r14(%rsp)
-        movq  %r15,UREGS_r15(%rsp)
-.endm
-
-/*
- * Save all registers not preserved by C code or used in entry/exit code. Mark
- * the frame as partial.
- *
- * @type: exception type
- * @compat: R8-R15 don't need saving, and the frame nevertheless is complete
- */
-.macro SAVE_VOLATILE type compat=0
-.if \compat
-        movl  $\type,UREGS_entry_vector-UREGS_error_code(%rsp)
-.else
-        movl  $\type|TRAP_regs_partial,\
-              UREGS_entry_vector-UREGS_error_code(%rsp)
-.endif
-        addq  $-(UREGS_error_code-UREGS_r15),%rsp
-        cld
-        movq  %rdi,UREGS_rdi(%rsp)
-        movq  %rsi,UREGS_rsi(%rsp)
-        movq  %rdx,UREGS_rdx(%rsp)
-        movq  %rcx,UREGS_rcx(%rsp)
-        movq  %rax,UREGS_rax(%rsp)
-.if !\compat
-        movq  %r8,UREGS_r8(%rsp)
-        movq  %r9,UREGS_r9(%rsp)
-        movq  %r10,UREGS_r10(%rsp)
-        movq  %r11,UREGS_r11(%rsp)
-.endif
-        movq  %rbx,UREGS_rbx(%rsp)
-        movq  %rbp,UREGS_rbp(%rsp)
-        SETUP_EXCEPTION_FRAME_POINTER(UREGS_rbp)
-.endm
-
-/*
- * Complete a frame potentially only partially saved.
- */
-.macro SAVE_PRESERVED
-        btrl  $_TRAP_regs_partial,UREGS_entry_vector(%rsp)
-        jnc   987f
-        movq  %r12,UREGS_r12(%rsp)
-        movq  %r13,UREGS_r13(%rsp)
-        movq  %r14,UREGS_r14(%rsp)
-        movq  %r15,UREGS_r15(%rsp)
-987:
-.endm
-
-/*
- * Reload registers not preserved by C code from frame.
- *
- * @compat: R8-R11 don't need reloading
- *
- * For the way it is used in RESTORE_ALL, this macro must preserve EFLAGS.ZF.
- */
-.macro LOAD_C_CLOBBERED compat=0
-.if !\compat
-        movq  UREGS_r11(%rsp),%r11
-        movq  UREGS_r10(%rsp),%r10
-        movq  UREGS_r9(%rsp),%r9
-        movq  UREGS_r8(%rsp),%r8
-.endif
-        movq  UREGS_rax(%rsp),%rax
-        movq  UREGS_rcx(%rsp),%rcx
-        movq  UREGS_rdx(%rsp),%rdx
-        movq  UREGS_rsi(%rsp),%rsi
-        movq  UREGS_rdi(%rsp),%rdi
-.endm
-
-/*
- * Restore all previously saved registers.
- *
- * @adj: extra stack pointer adjustment to be folded into the adjustment done
- *       anyway at the end of the macro
- * @compat: R8-R15 don't need reloading
- */
-.macro RESTORE_ALL adj=0 compat=0
-.if !\compat
-        testl $TRAP_regs_dirty,UREGS_entry_vector(%rsp)
-.endif
-        LOAD_C_CLOBBERED \compat
-.if !\compat
-        jz    987f
-        movq  UREGS_r15(%rsp),%r15
-        movq  UREGS_r14(%rsp),%r14
-        movq  UREGS_r13(%rsp),%r13
-        movq  UREGS_r12(%rsp),%r12
-#ifndef NDEBUG
-        .subsection 1
-987:    testl $TRAP_regs_partial,UREGS_entry_vector(%rsp)
-        jnz   987f
-        cmpq  UREGS_r15(%rsp),%r15
-        jne   789f
-        cmpq  UREGS_r14(%rsp),%r14
-        jne   789f
-        cmpq  UREGS_r13(%rsp),%r13
-        jne   789f
-        cmpq  UREGS_r12(%rsp),%r12
-        je    987f
-789:    ud2
-        .subsection 0
-#endif
-.endif
-987:    movq  UREGS_rbp(%rsp),%rbp
-        movq  UREGS_rbx(%rsp),%rbx
-        subq  $-(UREGS_error_code-UREGS_r15+\adj), %rsp
-.endm
-
-#endif
-
-#ifdef PERF_COUNTERS
-#define PERFC_INCR(_name,_idx,_cur)             \
-        pushq _cur;                             \
-        movslq VCPU_processor(_cur),_cur;       \
-        pushq %rdx;                             \
-        leaq __per_cpu_offset(%rip),%rdx;       \
-        movq (%rdx,_cur,8),_cur;                \
-        leaq per_cpu__perfcounters(%rip),%rdx;  \
-        addq %rdx,_cur;                         \
-        popq %rdx;                              \
-        incl ASM_PERFC_##_name*4(_cur,_idx,4);  \
-        popq _cur
-#else
-#define PERFC_INCR(_name,_idx,_cur)
-#endif
-
-/* Work around AMD erratum #88 */
-#define safe_swapgs                             \
-        "mfence; swapgs;"
-
-#ifdef __sun__
-#define REX64_PREFIX "rex64\\"
-#elif defined(__clang__)
-#define REX64_PREFIX ".byte 0x48; "
-#else
-#define REX64_PREFIX "rex64/"
-#endif
-
-#define BUILD_IRQ(nr)                           \
-    "pushq $0\n\t"                              \
-    "movl $"#nr",4(%rsp)\n\t"                   \
-    "jmp common_interrupt"
-
-#ifdef __ASSEMBLY__
-# define _ASM_EX(p) p-.
-#else
-# define _ASM_EX(p) #p "-."
-#endif
-
-#endif /* __X86_64_ASM_DEFNS_H__ */
-- 
1.8.3.1


* [PATCH v6 04/10] x86: Add support for STAC/CLAC instructions
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (2 preceding siblings ...)
  2014-05-07  8:19 ` [PATCH v6 03/10] x86: merge stuff from asm-x86/x86_64/asm_defns.h to asm-x86/asm_defns.h Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-07  9:36   ` Andrew Cooper
  2014-05-07  8:19 ` [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP Feng Wu
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

The STAC/CLAC instructions are only available when the SMAP feature
is present. On the other hand, they aren't needed if SMAP is not
enabled, or before we start to run userspace; in those cases the
functions and macros below do nothing.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/include/asm-x86/asm_defns.h | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
index c4b2bb2..0302836 100644
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -8,6 +8,8 @@
 #endif
 #include <asm/processor.h>
 #include <asm/percpu.h>
+#include <xen/stringify.h>
+#include <asm/cpufeature.h>
 
 #ifndef __ASSEMBLY__
 void ret_from_intr(void);
@@ -159,6 +161,34 @@ void ret_from_intr(void);
 
 #endif
 
+/* "Raw" instruction opcodes */
+#define __ASM_CLAC      .byte 0x0f,0x01,0xca
+#define __ASM_STAC      .byte 0x0f,0x01,0xcb
+
+#ifdef __ASSEMBLY__
+#define ASM_AC(op)                                       \
+        btl $X86_FEATURE_SMAP & 31,                      \
+        CPUINFO_FEATURE_OFFSET(X86_FEATURE_SMAP)+boot_cpu_data(%rip); \
+        jnc 881f;                                        \
+        __ASM_##op;                                      \
+881:
+
+#define ASM_STAC ASM_AC(STAC)
+#define ASM_CLAC ASM_AC(CLAC)
+#else
+static inline void clac(void)
+{
+    if ( boot_cpu_has(X86_FEATURE_SMAP) )
+        asm volatile (__stringify(__ASM_CLAC) : : : "memory");
+}
+
+static inline void stac(void)
+{
+    if ( boot_cpu_has(X86_FEATURE_SMAP) )
+        asm volatile (__stringify(__ASM_STAC) : : : "memory");
+}
+#endif
+
 #ifdef __ASSEMBLY__
 .macro SAVE_ALL
         addq  $-(UREGS_error_code-UREGS_r15), %rsp
-- 
1.8.3.1


* [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (3 preceding siblings ...)
  2014-05-07  8:19 ` [PATCH v6 04/10] x86: Add support for STAC/CLAC instructions Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-07  9:44   ` Andrew Cooper
  2014-05-07  8:19 ` [PATCH v6 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode Feng Wu
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Clear the AC bit in RFLAGS at the beginning of exception, interrupt and
hypercall handling, so Xen itself can be protected by the SMAP mechanism.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/x86_64/compat/entry.S | 3 ++-
 xen/arch/x86/x86_64/entry.S        | 7 +++++--
 xen/arch/x86/x86_64/traps.c        | 2 +-
 xen/include/asm-x86/asm_defns.h    | 5 ++++-
 4 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 32b3bcc..00dd0f3 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -13,6 +13,7 @@
 #include <irq_vectors.h>
 
 ENTRY(compat_hypercall)
+        ASM_CLAC
         pushq $0
         SAVE_VOLATILE type=TRAP_syscall compat=1
 
@@ -178,7 +179,7 @@ ENTRY(compat_restore_all_guest)
 
 .section .fixup,"ax"
 .Lfx0:  sti
-        SAVE_ALL
+        SAVE_ALL 0
         movq  UREGS_error_code(%rsp),%rsi
         movq  %rsp,%rax
         andq  $~0xf,%rsp
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 1c81852..205251d 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -68,7 +68,7 @@ iret_exit_to_guest:
 
 .section .fixup,"ax"
 .Lfx0:  sti
-        SAVE_ALL
+        SAVE_ALL 0
         movq  UREGS_error_code(%rsp),%rsi
         movq  %rsp,%rax
         andq  $~0xf,%rsp
@@ -273,6 +273,7 @@ ENTRY(sysenter_entry)
         pushq $0
         pushfq
 GLOBAL(sysenter_eflags_saved)
+        ASM_CLAC
         pushq $3 /* ring 3 null cs */
         pushq $0 /* null rip */
         pushq $0
@@ -309,6 +310,7 @@ UNLIKELY_END(sysenter_gpf)
         jmp   .Lbounce_exception
 
 ENTRY(int80_direct_trap)
+        ASM_CLAC
         pushq $0
         SAVE_VOLATILE 0x80
 
@@ -621,7 +623,7 @@ ENTRY(double_fault)
 
         .pushsection .init.text, "ax", @progbits
 ENTRY(early_page_fault)
-        SAVE_ALL
+        SAVE_ALL 0
         movq  %rsp,%rdi
         call  do_early_page_fault
         jmp   restore_all_xen
@@ -673,6 +675,7 @@ ENTRY(nmi_crash)
         ud2
 
 ENTRY(machine_check)
+        ASM_CLAC
         pushq $0
         movl  $TRAP_machine_check,4(%rsp)
         jmp   handle_ist_exception
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index 90072c1..b87b33e 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -437,7 +437,7 @@ void __devinit subarch_percpu_traps_init(void)
     /* Common SYSCALL parameters. */
     wrmsr(MSR_STAR, 0, (FLAT_RING3_CS32<<16) | __HYPERVISOR_CS);
     wrmsr(MSR_SYSCALL_MASK,
-          X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
+          X86_EFLAGS_AC|X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
           X86_EFLAGS_DF|X86_EFLAGS_IF|X86_EFLAGS_TF,
           0U);
 }
diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
index 0302836..9aa3657 100644
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -190,7 +190,10 @@ static inline void stac(void)
 #endif
 
 #ifdef __ASSEMBLY__
-.macro SAVE_ALL
+.macro SAVE_ALL clear_ac=1
+.if \clear_ac
+        ASM_CLAC
+.endif
         addq  $-(UREGS_error_code-UREGS_r15), %rsp
         cld
         movq  %rdi,UREGS_rdi(%rsp)
-- 
1.8.3.1


* [PATCH v6 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (4 preceding siblings ...)
  2014-05-07  8:19 ` [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-07  9:49   ` Andrew Cooper
  2014-05-08  1:14   ` Tian, Kevin
  2014-05-07  8:19 ` [PATCH v6 07/10] VMX: Disable SMAP feature when guest is in non-paging mode Feng Wu
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Use STAC/CLAC to temporarily disable SMAP so that legal accesses to
user pages in kernel mode are allowed.

STAC/CLAC is not needed for compat_create_bounce_frame, since this
chunk of code only accesses the pv guest's kernel stack, which is
in ring 1 for 32-bit pv guests.
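
For illustration, the pattern applied throughout this patch is simply
the following (a sketch only; the real call sites are in the hunks
below, and the AC=1 window is kept as narrow as possible so that
everything else keeps running with SMAP protection in force):

    stac();   /* EFLAGS.AC = 1: supervisor accesses to user pages allowed */
    /* ... the fixup-protected access to the user-mode mapping ... */
    clac();   /* EFLAGS.AC = 0: SMAP protection re-established            */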

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/traps.c                | 12 ++++++++++++
 xen/arch/x86/usercopy.c             |  6 ++++++
 xen/arch/x86/x86_64/entry.S         |  2 ++
 xen/include/asm-x86/uaccess.h       |  8 ++++++--
 xen/include/asm-x86/x86_64/system.h |  4 +++-
 5 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 45070bb..19c96d5 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -3811,6 +3811,18 @@ unsigned long do_get_debugreg(int reg)
 
 void asm_domain_crash_synchronous(unsigned long addr)
 {
+    /*
+     * We need clear AC bit here because in entry.S AC is set
+     * by ASM_STAC to temporarily allow accesses to user pages
+     * which is prevented by SMAP by default.
+     *
+     * For some code paths, where this function is called, clac()
+     * is not needed, but adding clac() here instead of each place
+     * asm_domain_crash_synchronous() is called can reduce the code
+     * redundancy, and it is harmless as well.
+     */
+    clac();
+
     if ( addr == 0 )
         addr = this_cpu(last_extable_addr);
 
diff --git a/xen/arch/x86/usercopy.c b/xen/arch/x86/usercopy.c
index b79202b..4cc78f5 100644
--- a/xen/arch/x86/usercopy.c
+++ b/xen/arch/x86/usercopy.c
@@ -14,6 +14,7 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned n)
 {
     unsigned long __d0, __d1, __d2, __n = n;
 
+    stac();
     asm volatile (
         "    cmp  $"STR(2*BYTES_PER_LONG-1)",%0\n"
         "    jbe  1f\n"
@@ -42,6 +43,7 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned n)
         : "=&c" (__n), "=&D" (__d0), "=&S" (__d1), "=&r" (__d2)
         : "0" (__n), "1" (to), "2" (from), "3" (__n)
         : "memory" );
+    clac();
 
     return __n;
 }
@@ -51,6 +53,7 @@ __copy_from_user_ll(void *to, const void __user *from, unsigned n)
 {
     unsigned long __d0, __d1, __d2, __n = n;
 
+    stac();
     asm volatile (
         "    cmp  $"STR(2*BYTES_PER_LONG-1)",%0\n"
         "    jbe  1f\n"
@@ -85,6 +88,7 @@ __copy_from_user_ll(void *to, const void __user *from, unsigned n)
         : "=&c" (__n), "=&D" (__d0), "=&S" (__d1), "=&r" (__d2)
         : "0" (__n), "1" (to), "2" (from), "3" (__n)
         : "memory" );
+    clac();
 
     return __n;
 }
@@ -113,6 +117,7 @@ copy_to_user(void __user *to, const void *from, unsigned n)
 #define __do_clear_user(addr,size)					\
 do {									\
 	long __d0;							\
+	stac();								\
 	__asm__ __volatile__(						\
 		"0:	rep; stosl\n"					\
 		"	movl %2,%0\n"					\
@@ -126,6 +131,7 @@ do {									\
 		_ASM_EXTABLE(1b,2b)					\
 		: "=&c"(size), "=&D" (__d0)				\
 		: "r"(size & 3), "0"(size / 4), "1"((long)addr), "a"(0));	\
+	clac();								\
 } while (0)
 
 /**
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 205251d..dce6eb5 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -379,6 +379,7 @@ __UNLIKELY_END(create_bounce_frame_bad_sp)
         movb  TRAPBOUNCE_flags(%rdx),%cl
         subq  $40,%rsi
         movq  UREGS_ss+8(%rsp),%rax
+        ASM_STAC
 .Lft2:  movq  %rax,32(%rsi)             # SS
         movq  UREGS_rsp+8(%rsp),%rax
 .Lft3:  movq  %rax,24(%rsi)             # RSP
@@ -423,6 +424,7 @@ UNLIKELY_END(bounce_failsafe)
 .Lft12: movq  %rax,8(%rsi)              # R11
         movq  UREGS_rcx+8(%rsp),%rax
 .Lft13: movq  %rax,(%rsi)               # RCX
+        ASM_CLAC
         /* Rewrite our stack frame and return to guest-OS mode. */
         /* IA32 Ref. Vol. 3: TF, VM, RF and NT flags are cleared on trap. */
         /* Also clear AC: alignment checks shouldn't trigger in kernel mode. */
diff --git a/xen/include/asm-x86/uaccess.h b/xen/include/asm-x86/uaccess.h
index 88b4ba2..92bc322 100644
--- a/xen/include/asm-x86/uaccess.h
+++ b/xen/include/asm-x86/uaccess.h
@@ -146,6 +146,7 @@ struct __large_struct { unsigned long buf[100]; };
  * aliasing issues.
  */
 #define __put_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
+	stac();								\
 	__asm__ __volatile__(						\
 		"1:	mov"itype" %"rtype"1,%2\n"			\
 		"2:\n"							\
@@ -155,9 +156,11 @@ struct __large_struct { unsigned long buf[100]; };
 		".previous\n"						\
 		_ASM_EXTABLE(1b, 3b)					\
 		: "=r"(err)						\
-		: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err))
+		: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err));	\
+	clac();
 
 #define __get_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
+	stac();								\
 	__asm__ __volatile__(						\
 		"1:	mov"itype" %2,%"rtype"1\n"			\
 		"2:\n"							\
@@ -168,7 +171,8 @@ struct __large_struct { unsigned long buf[100]; };
 		".previous\n"						\
 		_ASM_EXTABLE(1b, 3b)					\
 		: "=r"(err), ltype (x)					\
-		: "m"(__m(addr)), "i"(errret), "0"(err))
+		: "m"(__m(addr)), "i"(errret), "0"(err));		\
+	clac();
 
 /**
  * __copy_to_user: - Copy a block of data into user space, with less checking
diff --git a/xen/include/asm-x86/x86_64/system.h b/xen/include/asm-x86/x86_64/system.h
index 20f038b..b6341ad1 100644
--- a/xen/include/asm-x86/x86_64/system.h
+++ b/xen/include/asm-x86/x86_64/system.h
@@ -12,6 +12,7 @@
  * is the same as the initial value of _o then _n is written to location _p.
  */
 #define __cmpxchg_user(_p,_o,_n,_isuff,_oppre,_regtype)                 \
+    stac();                                                             \
     asm volatile (                                                      \
         "1: lock; cmpxchg"_isuff" %"_oppre"2,%3\n"                      \
         "2:\n"                                                          \
@@ -22,7 +23,8 @@
         _ASM_EXTABLE(1b, 3b)                                            \
         : "=a" (_o), "=r" (_rc)                                         \
         : _regtype (_n), "m" (*__xg((volatile void *)_p)), "0" (_o), "1" (0) \
-        : "memory");
+        : "memory");                                                    \
+    clac();
 
 #define cmpxchg_user(_p,_o,_n)                                          \
 ({                                                                      \
-- 
1.8.3.1


* [PATCH v6 07/10] VMX: Disable SMAP feature when guest is in non-paging mode
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (5 preceding siblings ...)
  2014-05-07  8:19 ` [PATCH v6 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-08  1:16   ` Tian, Kevin
  2014-05-07  8:19 ` [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen Feng Wu
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

SMAP is disabled if the CPU is in non-paging mode in hardware.
However, Xen always uses paging mode to emulate guest non-paging
mode with HAP. To emulate this behaviour, SMAP needs to be manually
disabled when the guest switches to non-paging mode.

This logic is similar to that for SMEP.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index ecdbc17..f0f6f83 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1319,12 +1319,12 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
         if ( !hvm_paging_enabled(v) )
         {
             /*
-             * SMEP is disabled if CPU is in non-paging mode in hardware.
+             * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware.
              * However Xen always uses paging mode to emulate guest non-paging
-             * mode. To emulate this behavior, SMEP needs to be manually
+             * mode. To emulate this behavior, SMEP/SMAP needs to be manually
              * disabled when guest VCPU is in non-paging mode.
              */
-            v->arch.hvm_vcpu.hw_cr[4] &= ~X86_CR4_SMEP;
+            v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP);
         }
         __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
         break;
-- 
1.8.3.1


* [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (6 preceding siblings ...)
  2014-05-07  8:19 ` [PATCH v6 07/10] VMX: Disable SMAP feature when guest is in non-paging mode Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-07 10:26   ` Andrew Cooper
  2014-05-08  1:20   ` Tian, Kevin
  2014-05-07  8:19 ` [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest Feng Wu
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Supervisor Mode Access Prevention (SMAP) is a new security
feature disclosed by Intel; please refer to the following
document:

http://software.intel.com/sites/default/files/319433-014.pdf

If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
to linear addresses that are accessible in user mode. If CPL < 3,
SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
applies to all supervisor-mode data accesses (these are implicit
supervisor accesses) regardless of the value of EFLAGS.AC.

This patch enables SMAP in Xen to prevent the hypervisor from
accessing pv guest data whose translation paging-structure entries
all have the U/S flag set.
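
With this patch SMAP is used whenever the hardware offers it; it can be
turned off from the hypervisor command line (see the
xen-command-line.markdown hunk below), e.g. by appending to the Xen
entry in the boot loader configuration (the loader line is only an
example):

    multiboot /boot/xen.gz ... smap=0

"no-smap" should work as well, following Xen's usual boolean option
syntax.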

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 docs/misc/xen-command-line.markdown |  7 +++++
 xen/arch/x86/setup.c                | 27 +++++++++++++++++
 xen/arch/x86/traps.c                | 58 +++++++++++++++++++++++++++----------
 xen/include/asm-x86/cpufeature.h    |  1 +
 xen/include/asm-x86/domain.h        |  6 ++--
 5 files changed, 82 insertions(+), 17 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 7dc938b..a7ac53d 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -912,6 +912,13 @@ Set the serial transmit buffer size.
 
 Flag to enable Supervisor Mode Execution Protection
 
+### smap
+> `= <boolean>`
+
+> Default: `true`
+
+Flag to enable Supervisor Mode Access Prevention
+
 ### snb\_igd\_quirk
 > `= <boolean>`
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 2e30701..29b22a1 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
 static bool_t __initdata disable_smep;
 invbool_param("smep", disable_smep);
 
+/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
+static bool_t __initdata disable_smap;
+invbool_param("smap", disable_smap);
+
 /* **** Linux config option: propagated to domain0. */
 /* "acpi=off":    Sisables both ACPI table parsing and interpreter. */
 /* "acpi=force":  Override the disable blacklist.                   */
@@ -1280,6 +1284,11 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     if ( cpu_has_smep )
         set_in_cr4(X86_CR4_SMEP);
 
+    if ( disable_smap )
+        setup_clear_cpu_cap(X86_FEATURE_SMAP);
+    if ( cpu_has_smap )
+        set_in_cr4(X86_CR4_SMAP);
+
     if ( cpu_has_fsgsbase )
         set_in_cr4(X86_CR4_FSGSBASE);
 
@@ -1384,6 +1393,21 @@ void __init noreturn __start_xen(unsigned long mbi_p)
         printk(XENLOG_WARNING
                "Multiple initrd candidates, picking module #%u\n",
                initrdidx);
+    /*
+     * Temporarily clear SMAP bit in CR4 to allow user-page accesses in
+     * construct_dom0(). This looks a little hackish, but using stac()/clac()
+     * to do this is not appropriate here: Since construct_dom0() calls
+     * copy_from_user(), so we cannot wrap construct_dom0() as a whole in
+     * stac()/clac() - the AC bit remaining cleared after copy_from_user()
+     * would induce problems in construct_dom0().
+     *
+     * On the other hand, if we used stac()/clac() in construct_dom0(), we
+     * would need to carefully handle some corner cases.
+     *
+     * So we clear SMAP before construct_dom0() and enable it back afterwards.
+    */
+    if ( cpu_has_smap )
+        write_cr4(read_cr4() & ~X86_CR4_SMAP);
 
     /*
      * We're going to setup domain0 using the module(s) that we stashed safely
@@ -1395,6 +1419,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                         bootstrap_map, cmdline) != 0)
         panic("Could not set up DOM0 guest OS");
 
+    if ( cpu_has_smap )
+        write_cr4(read_cr4() | X86_CR4_SMAP);
+
     /* Scrub RAM that is still free and so may go to an unprivileged domain. */
     scrub_heap_pages();
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 19c96d5..ee203cb 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1183,11 +1183,12 @@ static int handle_gdt_ldt_mapping_fault(
 enum pf_type {
     real_fault,
     smep_fault,
+    smap_fault,
     spurious_fault
 };
 
 static enum pf_type __page_fault_type(
-    unsigned long addr, unsigned int error_code)
+    unsigned long addr, const struct cpu_user_regs *regs)
 {
     unsigned long mfn, cr3 = read_cr3();
     l4_pgentry_t l4e, *l4t;
@@ -1195,6 +1196,7 @@ static enum pf_type __page_fault_type(
     l2_pgentry_t l2e, *l2t;
     l1_pgentry_t l1e, *l1t;
     unsigned int required_flags, disallowed_flags, page_user;
+    unsigned int error_code = regs->error_code;
 
     /*
      * We do not take spurious page faults in IRQ handlers as we do not
@@ -1263,19 +1265,37 @@ static enum pf_type __page_fault_type(
     page_user &= l1e_get_flags(l1e);
 
 leaf:
-    /*
-     * Supervisor Mode Execution Protection (SMEP):
-     * Disallow supervisor execution from user-accessible mappings
-     */
-    if ( (read_cr4() & X86_CR4_SMEP) && page_user &&
-         ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
-        return smep_fault;
+    if ( page_user )
+    {
+        unsigned long cr4 = read_cr4();
+        /*
+         * Supervisor Mode Execution Prevention (SMEP):
+         * Disallow supervisor execution from user-accessible mappings
+         */
+        if ( (cr4 & X86_CR4_SMEP) &&
+             ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
+            return smep_fault;
+
+        /*
+         * Supervisor Mode Access Prevention (SMAP):
+         * Disallow supervisor access user-accessible mappings
+         * A fault is considered as an SMAP violation if the following
+         * conditions are true:
+         *   - X86_CR4_SMAP is set in CR4
+         *   - A user page is being accessed
+         *   - CPL=3 or X86_EFLAGS_AC is clear
+         *   - Page fault in kernel mode
+         */
+        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode) &&
+             (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
+            return smap_fault;
+    }
 
     return spurious_fault;
 }
 
 static enum pf_type spurious_page_fault(
-    unsigned long addr, unsigned int error_code)
+    unsigned long addr, const struct cpu_user_regs *regs)
 {
     unsigned long flags;
     enum pf_type pf_type;
@@ -1285,7 +1305,7 @@ static enum pf_type spurious_page_fault(
      * page tables from becoming invalid under our feet during the walk.
      */
     local_irq_save(flags);
-    pf_type = __page_fault_type(addr, error_code);
+    pf_type = __page_fault_type(addr, regs);
     local_irq_restore(flags);
 
     return pf_type;
@@ -1380,8 +1400,14 @@ void do_page_fault(struct cpu_user_regs *regs)
 
     if ( unlikely(!guest_mode(regs)) )
     {
-        pf_type = spurious_page_fault(addr, error_code);
-        BUG_ON(pf_type == smep_fault);
+        pf_type = spurious_page_fault(addr, regs);
+        if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
+        {
+            console_start_sync();
+            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ? 'E' : 'A');
+            fatal_trap(TRAP_page_fault, regs);
+        }
+
         if ( pf_type != real_fault )
             return;
 
@@ -1407,10 +1433,12 @@ void do_page_fault(struct cpu_user_regs *regs)
 
     if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
     {
-        pf_type = spurious_page_fault(addr, error_code);
-        if ( pf_type == smep_fault )
+        pf_type = spurious_page_fault(addr, regs);
+        if ( (pf_type == smep_fault) || (pf_type == smap_fault))
         {
-            gdprintk(XENLOG_ERR, "Fatal SMEP fault\n");
+            printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
+                   current, (pf_type == smep_fault) ? 'E' : 'A');
+
             domain_crash(current->domain);
         }
         if ( pf_type != real_fault )
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 20881c0..8014241 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -190,6 +190,7 @@
 #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
 
 #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
+#define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
 #define cpu_has_fpu_sel         (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
 
 #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) \
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index c5c266f..abf55fb 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -467,12 +467,14 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu *, unsigned long guest_cr4);
     (((v)->arch.pv_vcpu.ctrlreg[4]                          \
       | (mmu_cr4_features                                   \
          & (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_SMEP |      \
-            X86_CR4_OSXSAVE | X86_CR4_FSGSBASE))            \
+            X86_CR4_SMAP | X86_CR4_OSXSAVE |                \
+            X86_CR4_FSGSBASE))                              \
       | ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0))         \
      & ~X86_CR4_DE)
 #define real_cr4_to_pv_guest_cr4(c)                         \
     ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
-             X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
+             X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
+             X86_CR4_FSGSBASE | X86_CR4_SMAP))
 
 void domain_cpuid(struct domain *d,
                   unsigned int  input,
-- 
1.8.3.1


* [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (7 preceding siblings ...)
  2014-05-07  8:19 ` [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-07 10:46   ` Andrew Cooper
  2014-05-08  1:22   ` Tian, Kevin
  2014-05-07  8:19 ` [PATCH v6 10/10] x86/tools: Expose SMAP to HVM guests Feng Wu
                   ` (2 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

New Intel CPUs support SMAP (Supervisor Mode Access Prevention).
SMAP prevents supervisor-mode accesses to any linear address with
a valid translation for which the U/S flag (bit 2) is 1 in every
paging-structure entry controlling the translation for the linear
address.
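
As a rough standalone illustration of that rule (not part of this patch;
all names below are made up), a supervisor-mode data access to a
user-accessible page is an SMAP violation when CR4.SMAP is set and either
the access happens with CPL = 3 or EFLAGS.AC is clear:

    #include <stdbool.h>
    #include <stdint.h>

    #define X86_EFLAGS_AC (1u << 18)

    /* Illustrative predicate only -- not Xen code. */
    static bool smap_violation(bool cr4_smap, unsigned int cpl,
                               uint32_t eflags, bool page_user,
                               bool user_mode_access)
    {
        if ( !cr4_smap || !page_user || user_mode_access )
            return false;
        /* Implicit supervisor accesses from CPL 3 fault regardless of AC. */
        return (cpl == 3) || !(eflags & X86_EFLAGS_AC);
    }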

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/hvm/hvm.c        |  3 +++
 xen/arch/x86/mm/guest_walk.c  | 40 ++++++++++++++++++++++++++++++----------
 xen/include/asm-x86/hvm/hvm.h | 18 +++++++++++++++++-
 3 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ac05160..76ccd07 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3039,6 +3039,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         if ( (count == 0) && !cpu_has_smep )
             *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
 
+        if ( (count == 0) && !cpu_has_smap )
+            *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
+
         /* Don't expose MPX to hvm when VMX support is not available */
         if ( (count == 0) &&
              (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 70460b6..bb38fda 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -144,7 +144,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
     guest_l4e_t *l4p;
 #endif
     uint32_t gflags, mflags, iflags, rc = 0;
-    int smep;
+    bool_t smep = 0, smap = 0;
     bool_t pse1G = 0, pse2M = 0;
     p2m_query_t qt = P2M_ALLOC | P2M_UNSHARE;
 
@@ -159,13 +159,33 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
     mflags = mandatory_flags(v, pfec);
     iflags = (_PAGE_NX_BIT | _PAGE_INVALID_BITS);
 
-    /* SMEP: kernel-mode instruction fetches from user-mode mappings
-     * should fault.  Unlike NX or invalid bits, we're looking for _all_
-     * entries in the walk to have _PAGE_USER set, so we need to do the
-     * whole walk as if it were a user-mode one and then invert the answer. */
-    smep = (is_hvm_vcpu(v) && hvm_smep_enabled(v) 
-            && (pfec & PFEC_insn_fetch) && !(pfec & PFEC_user_mode) );
-    if ( smep )
+    if ( is_hvm_vcpu(v) && !(pfec & PFEC_user_mode) )
+    {
+        struct segment_register seg;
+        const struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+        hvm_get_segment_register(v, x86_seg_ss, &seg);
+
+        /* SMEP: kernel-mode instruction fetches from user-mode mappings
+         * should fault.  Unlike NX or invalid bits, we're looking for _all_
+         * entries in the walk to have _PAGE_USER set, so we need to do the
+         * whole walk as if it were a user-mode one and then invert the answer. */
+        smep =  hvm_smep_enabled(v) && (pfec & PFEC_insn_fetch);
+
+        /*
+         * SMAP: kernel-mode data accesses from user-mode mappings should fault
+         * A fault is considered as a SMAP violation if the following
+         * conditions come true:
+         *   - X86_CR4_SMAP is set in CR4
+         *   - A user page is accessed
+         *   - CPL = 3 or X86_EFLAGS_AC is clear
+         *   - Page fault in kernel mode
+         */
+        smap = hvm_smap_enabled(v) &&
+               ((seg.attr.fields.dpl == 3) || !(regs->eflags & X86_EFLAGS_AC));
+    }
+
+    if ( smep || smap )
         mflags |= _PAGE_USER;
 
 #if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
@@ -338,8 +358,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
 set_ad:
 #endif
-    /* Now re-invert the user-mode requirement for SMEP. */
-    if ( smep ) 
+    /* Now re-invert the user-mode requirement for SMEP and SMAP */
+    if ( smep || smap )
         rc ^= _PAGE_USER;
 
     /* Go back and set accessed and dirty bits only if the walk was a
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index c373930..0ab6b70 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -260,6 +260,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
     (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PAE))
 #define hvm_smep_enabled(v) \
     (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMEP))
+#define hvm_smap_enabled(v) \
+    (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMAP))
 #define hvm_nx_enabled(v) \
     (!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
 
@@ -360,13 +362,26 @@ static inline bool_t hvm_vcpu_has_smep(void)
 
     hvm_cpuid(0, &eax, NULL, NULL, NULL);
 
-    if (eax < 7)
+    if ( eax < 7 )
         return 0;
 
     hvm_cpuid(7, NULL, &ebx, NULL, NULL);
     return !!(ebx & cpufeat_mask(X86_FEATURE_SMEP));
 }
 
+static inline bool_t hvm_vcpu_has_smap(void)
+{
+    unsigned int eax, ebx;
+
+    hvm_cpuid(0, &eax, NULL, NULL, NULL);
+
+    if ( eax < 7 )
+        return 0;
+
+    hvm_cpuid(0x7, NULL, &ebx, NULL, NULL);
+    return !!(ebx & cpufeat_mask(X86_FEATURE_SMAP));
+}
+
 /* These reserved bits in lower 32 remain 0 after any load of CR0 */
 #define HVM_CR0_GUEST_RESERVED_BITS             \
     (~((unsigned long)                          \
@@ -387,6 +402,7 @@ static inline bool_t hvm_vcpu_has_smep(void)
         X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |       \
         X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT |           \
         (hvm_vcpu_has_smep() ? X86_CR4_SMEP : 0) |      \
+        (hvm_vcpu_has_smap() ? X86_CR4_SMAP : 0) |      \
         (cpu_has_fsgsbase ? X86_CR4_FSGSBASE : 0) |     \
         ((nestedhvm_enabled((_v)->domain) && cpu_has_vmx)\
                       ? X86_CR4_VMXE : 0)  |             \
-- 
1.8.3.1

* [PATCH v6 10/10] x86/tools: Expose SMAP to HVM guests
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (8 preceding siblings ...)
  2014-05-07  8:19 ` [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest Feng Wu
@ 2014-05-07  8:19 ` Feng Wu
  2014-05-07  8:35 ` [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Jan Beulich
  2014-05-07  8:57 ` Ian Campbell
  11 siblings, 0 replies; 43+ messages in thread
From: Feng Wu @ 2014-05-07  8:19 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

This patch exposes the SMAP feature to HVM guests.
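
For illustration only (not part of this patch), a guest could verify the
exposed bit with CPUID leaf 7, sub-leaf 0: SMAP is EBX bit 20, matching the
X86_FEATURE_SMAP value added below.

    #include <stdio.h>

    /* Minimal CPUID helper; assumes x86-64 GCC/Clang inline asm. */
    static void cpuid_count(unsigned int leaf, unsigned int subleaf,
                            unsigned int *eax, unsigned int *ebx,
                            unsigned int *ecx, unsigned int *edx)
    {
        asm volatile ("cpuid"
                      : "=a" (*eax), "=b" (*ebx), "=c" (*ecx), "=d" (*edx)
                      : "0" (leaf), "2" (subleaf));
    }

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        cpuid_count(0, 0, &eax, &ebx, &ecx, &edx);
        if ( eax < 7 )
            return 1;

        cpuid_count(7, 0, &eax, &ebx, &ecx, &edx);
        printf("SMAP %s\n", (ebx & (1u << 20)) ? "visible" : "hidden");
        return 0;
    }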

Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxc/xc_cpufeature.h | 1 +
 tools/libxc/xc_cpuid_x86.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
index 6fce03b..66cc82e 100644
--- a/tools/libxc/xc_cpufeature.h
+++ b/tools/libxc/xc_cpufeature.h
@@ -140,6 +140,7 @@
 #define X86_FEATURE_RTM         11 /* Restricted Transactional Memory */
 #define X86_FEATURE_RDSEED      18 /* RDSEED instruction */
 #define X86_FEATURE_ADX         19 /* ADCX, ADOX instructions */
+#define X86_FEATURE_SMAP        20 /* Supervisor Mode Access Protection */
 
 
 #endif /* __LIBXC_CPUFEATURE_H */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 71917e3..79490da 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -375,6 +375,7 @@ static void xc_cpuid_hvm_policy(
                         bitmaskof(X86_FEATURE_RTM)  |
                         bitmaskof(X86_FEATURE_RDSEED)  |
                         bitmaskof(X86_FEATURE_ADX)  |
+                        bitmaskof(X86_FEATURE_SMAP) |
                         bitmaskof(X86_FEATURE_FSGSBASE));
         } else
             regs[1] = 0;
-- 
1.8.3.1

* Re: [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP)
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (9 preceding siblings ...)
  2014-05-07  8:19 ` [PATCH v6 10/10] x86/tools: Expose SMAP to HVM guests Feng Wu
@ 2014-05-07  8:35 ` Jan Beulich
  2014-05-07  9:00   ` Wu, Feng
  2014-05-07  8:57 ` Ian Campbell
  11 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2014-05-07  8:35 UTC (permalink / raw)
  To: Feng Wu
  Cc: kevin.tian, ian.campbell, andrew.cooper3, eddie.dong, xen-devel,
	jun.nakajima

>>> On 07.05.14 at 10:19, <feng.wu@intel.com> wrote:
> Feng Wu (10):
>   x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET
>   x86: move common_interrupt to entry.S
>   x86: merge stuff from asm-x86/x86_64/asm_defns.h to
>     asm-x86/asm_defns.h

These patches got committed already - it's pretty pointless to re-submit
such about a day later.

Jan

* Re: [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP)
  2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (10 preceding siblings ...)
  2014-05-07  8:35 ` [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Jan Beulich
@ 2014-05-07  8:57 ` Ian Campbell
  2014-05-07  8:59   ` Wu, Feng
  11 siblings, 1 reply; 43+ messages in thread
From: Ian Campbell @ 2014-05-07  8:57 UTC (permalink / raw)
  To: Feng Wu
  Cc: kevin.tian, JBeulich, andrew.cooper3, eddie.dong, xen-devel,
	jun.nakajima

On Wed, 2014-05-07 at 16:19 +0800, Feng Wu wrote:

Thank you for sorting out the threading of your patches.

Ian.

* Re: [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP)
  2014-05-07  8:57 ` Ian Campbell
@ 2014-05-07  8:59   ` Wu, Feng
  0 siblings, 0 replies; 43+ messages in thread
From: Wu, Feng @ 2014-05-07  8:59 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Tian, Kevin, JBeulich, andrew.cooper3, Dong, Eddie, xen-devel,
	Nakajima, Jun



> -----Original Message-----
> From: Ian Campbell [mailto:Ian.Campbell@citrix.com]
> Sent: Wednesday, May 07, 2014 4:57 PM
> To: Wu, Feng
> Cc: xen-devel@lists.xen.org; JBeulich@suse.com; andrew.cooper3@citrix.com;
> Tian, Kevin; Dong, Eddie; Nakajima, Jun
> Subject: Re: [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention
> (SMAP)
> 
> On Wed, 2014-05-07 at 16:19 +0800, Feng Wu wrote:
> 
> Thank you for sorting out the threading of your patches.

You are welcome. I finally found a way to handle this; thank you very much for pointing it out!

> 
> Ian.
> 

Thanks,
Feng

* Re: [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP)
  2014-05-07  8:35 ` [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Jan Beulich
@ 2014-05-07  9:00   ` Wu, Feng
  2014-05-07  9:33     ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Wu, Feng @ 2014-05-07  9:00 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, ian.campbell, andrew.cooper3, Dong, Eddie,
	xen-devel, Nakajima, Jun



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Wednesday, May 07, 2014 4:36 PM
> To: Wu, Feng
> Cc: andrew.cooper3@citrix.com; ian.campbell@citrix.com; Dong, Eddie;
> Nakajima, Jun; Tian, Kevin; xen-devel@lists.xen.org
> Subject: Re: [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention
> (SMAP)
> 
> >>> On 07.05.14 at 10:19, <feng.wu@intel.com> wrote:
> > Feng Wu (10):
> >   x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET
> >   x86: move common_interrupt to entry.S
> >   x86: merge stuff from asm-x86/x86_64/asm_defns.h to
> >     asm-x86/asm_defns.h
> 
> These patches got committed already - it's pretty pointless to re-submit
> such about a day later.
> 

Okay, I will follow this rule in the future.

> Jan

Thanks,
Feng

* Re: [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP)
  2014-05-07  9:00   ` Wu, Feng
@ 2014-05-07  9:33     ` Jan Beulich
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2014-05-07  9:33 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, ian.campbell, andrew.cooper3, Eddie Dong, xen-devel,
	Jun Nakajima

>>> On 07.05.14 at 11:00, <feng.wu@intel.com> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> Sent: Wednesday, May 07, 2014 4:36 PM
>> To: Wu, Feng
>> Cc: andrew.cooper3@citrix.com; ian.campbell@citrix.com; Dong, Eddie;
>> Nakajima, Jun; Tian, Kevin; xen-devel@lists.xen.org 
>> Subject: Re: [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention
>> (SMAP)
>> 
>> >>> On 07.05.14 at 10:19, <feng.wu@intel.com> wrote:
>> > Feng Wu (10):
>> >   x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET
>> >   x86: move common_interrupt to entry.S
>> >   x86: merge stuff from asm-x86/x86_64/asm_defns.h to
>> >     asm-x86/asm_defns.h
>> 
>> These patches got committed already - it's pretty pointless to re-submit
>> such about a day later.
> 
> Okay, I will follow this rule in future.

And btw, I'm fine now with the rest of the series, but I won't commit
this in a piecemeal approach, i.e. I'll wait for suitable acks on patches
6 and 9 (and ideally other reviews) before committing.

Jan

* Re: [PATCH v6 04/10] x86: Add support for STAC/CLAC instructions
  2014-05-07  8:19 ` [PATCH v6 04/10] x86: Add support for STAC/CLAC instructions Feng Wu
@ 2014-05-07  9:36   ` Andrew Cooper
  0 siblings, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2014-05-07  9:36 UTC (permalink / raw)
  To: Feng Wu
  Cc: kevin.tian, ian.campbell, eddie.dong, xen-devel, JBeulich, jun.nakajima

On 07/05/14 09:19, Feng Wu wrote:
> The STAC/CLAC instructions are only available when the SMAP feature is
> available, but on the other hand they aren't needed if SMAP is not
> enabled, or before we start to run userspace; in those cases, the
> functions and macros do nothing.
>
> Signed-off-by: Feng Wu <feng.wu@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

> ---
>  xen/include/asm-x86/asm_defns.h | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
> index c4b2bb2..0302836 100644
> --- a/xen/include/asm-x86/asm_defns.h
> +++ b/xen/include/asm-x86/asm_defns.h
> @@ -8,6 +8,8 @@
>  #endif
>  #include <asm/processor.h>
>  #include <asm/percpu.h>
> +#include <xen/stringify.h>
> +#include <asm/cpufeature.h>
>  
>  #ifndef __ASSEMBLY__
>  void ret_from_intr(void);
> @@ -159,6 +161,34 @@ void ret_from_intr(void);
>  
>  #endif
>  
> +/* "Raw" instruction opcodes */
> +#define __ASM_CLAC      .byte 0x0f,0x01,0xca
> +#define __ASM_STAC      .byte 0x0f,0x01,0xcb
> +
> +#ifdef __ASSEMBLY__
> +#define ASM_AC(op)                                       \
> +        btl $X86_FEATURE_SMAP & 31,                      \
> +        CPUINFO_FEATURE_OFFSET(X86_FEATURE_SMAP)+boot_cpu_data(%rip); \
> +        jnc 881f;                                        \
> +        __ASM_##op;                                      \
> +881:
> +
> +#define ASM_STAC ASM_AC(STAC)
> +#define ASM_CLAC ASM_AC(CLAC)
> +#else
> +static inline void clac(void)
> +{
> +    if ( boot_cpu_has(X86_FEATURE_SMAP) )
> +        asm volatile (__stringify(__ASM_CLAC) : : : "memory");
> +}
> +
> +static inline void stac(void)
> +{
> +    if ( boot_cpu_has(X86_FEATURE_SMAP) )
> +        asm volatile (__stringify(__ASM_STAC) : : : "memory");
> +}
> +#endif
> +
>  #ifdef __ASSEMBLY__
>  .macro SAVE_ALL
>          addq  $-(UREGS_error_code-UREGS_r15), %rsp


* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-07  8:19 ` [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP Feng Wu
@ 2014-05-07  9:44   ` Andrew Cooper
  2014-05-07 11:40     ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2014-05-07  9:44 UTC (permalink / raw)
  To: Feng Wu
  Cc: kevin.tian, ian.campbell, eddie.dong, xen-devel, JBeulich, jun.nakajima

On 07/05/14 09:19, Feng Wu wrote:
> Clear AC bit in RFLAGS at the beginning of exception, interrupt, hypercall,
> so Xen itself can be protected by SMAP mechanism.
>
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> ---
>  xen/arch/x86/x86_64/compat/entry.S | 3 ++-
>  xen/arch/x86/x86_64/entry.S        | 7 +++++--
>  xen/arch/x86/x86_64/traps.c        | 2 +-
>  xen/include/asm-x86/asm_defns.h    | 5 ++++-
>  4 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
> index 32b3bcc..00dd0f3 100644
> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -13,6 +13,7 @@
>  #include <irq_vectors.h>
>  
>  ENTRY(compat_hypercall)
> +        ASM_CLAC
>          pushq $0
>          SAVE_VOLATILE type=TRAP_syscall compat=1
>  
> @@ -178,7 +179,7 @@ ENTRY(compat_restore_all_guest)
>  
>  .section .fixup,"ax"
>  .Lfx0:  sti
> -        SAVE_ALL
> +        SAVE_ALL 0
>          movq  UREGS_error_code(%rsp),%rsi
>          movq  %rsp,%rax
>          andq  $~0xf,%rsp
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index 1c81852..205251d 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -68,7 +68,7 @@ iret_exit_to_guest:
>  
>  .section .fixup,"ax"
>  .Lfx0:  sti
> -        SAVE_ALL
> +        SAVE_ALL 0
>          movq  UREGS_error_code(%rsp),%rsi
>          movq  %rsp,%rax
>          andq  $~0xf,%rsp
> @@ -273,6 +273,7 @@ ENTRY(sysenter_entry)
>          pushq $0
>          pushfq
>  GLOBAL(sysenter_eflags_saved)
> +        ASM_CLAC
>          pushq $3 /* ring 3 null cs */
>          pushq $0 /* null rip */
>          pushq $0
> @@ -309,6 +310,7 @@ UNLIKELY_END(sysenter_gpf)
>          jmp   .Lbounce_exception
>  
>  ENTRY(int80_direct_trap)
> +        ASM_CLAC
>          pushq $0
>          SAVE_VOLATILE 0x80
>  
> @@ -621,7 +623,7 @@ ENTRY(double_fault)
>  
>          .pushsection .init.text, "ax", @progbits
>  ENTRY(early_page_fault)
> -        SAVE_ALL
> +        SAVE_ALL 0
>          movq  %rsp,%rdi
>          call  do_early_page_fault
>          jmp   restore_all_xen
> @@ -673,6 +675,7 @@ ENTRY(nmi_crash)
>          ud2
>  
>  ENTRY(machine_check)
> +        ASM_CLAC

This is not needed.  The start of handle_ist_exception has a SAVE_ALL,
which also covers the nmi entry point.

On the subject of IST exceptions, perhaps the double fault explicitly
wants a STAC to reduce the likelihood of taking a further fault while
trying to dump state?

>          pushq $0
>          movl  $TRAP_machine_check,4(%rsp)
>          jmp   handle_ist_exception
> diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
> index 90072c1..b87b33e 100644
> --- a/xen/arch/x86/x86_64/traps.c
> +++ b/xen/arch/x86/x86_64/traps.c
> @@ -437,7 +437,7 @@ void __devinit subarch_percpu_traps_init(void)
>      /* Common SYSCALL parameters. */
>      wrmsr(MSR_STAR, 0, (FLAT_RING3_CS32<<16) | __HYPERVISOR_CS);
>      wrmsr(MSR_SYSCALL_MASK,
> -          X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
> +          X86_EFLAGS_AC|X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
>            X86_EFLAGS_DF|X86_EFLAGS_IF|X86_EFLAGS_TF,
>            0U);
>  }
> diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
> index 0302836..9aa3657 100644
> --- a/xen/include/asm-x86/asm_defns.h
> +++ b/xen/include/asm-x86/asm_defns.h
> @@ -190,7 +190,10 @@ static inline void stac(void)
>  #endif
>  
>  #ifdef __ASSEMBLY__
> -.macro SAVE_ALL
> +.macro SAVE_ALL clear_ac=1
> +.if \clear_ac
> +        ASM_CLAC
> +.endif
>          addq  $-(UREGS_error_code-UREGS_r15), %rsp
>          cld
>          movq  %rdi,UREGS_rdi(%rsp)


* Re: [PATCH v6 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode
  2014-05-07  8:19 ` [PATCH v6 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode Feng Wu
@ 2014-05-07  9:49   ` Andrew Cooper
  2014-05-08  1:14   ` Tian, Kevin
  1 sibling, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2014-05-07  9:49 UTC (permalink / raw)
  To: Feng Wu
  Cc: kevin.tian, ian.campbell, eddie.dong, xen-devel, JBeulich, jun.nakajima

On 07/05/14 09:19, Feng Wu wrote:
> Use STAC/CLAC to temporarily disable SMAP to allow legal accesses to
> user pages in kernel mode
>
> STAC/CLAC is not needed for compat_create_bounce_frame, since in this
> chunk of code, it only accesses the pv guest's kernel stack, which is
> in ring 1 for 32-bit pv guests.
>
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> ---
>  xen/arch/x86/traps.c                | 12 ++++++++++++
>  xen/arch/x86/usercopy.c             |  6 ++++++
>  xen/arch/x86/x86_64/entry.S         |  2 ++
>  xen/include/asm-x86/uaccess.h       |  8 ++++++--
>  xen/include/asm-x86/x86_64/system.h |  4 +++-
>  5 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 45070bb..19c96d5 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -3811,6 +3811,18 @@ unsigned long do_get_debugreg(int reg)
>  
>  void asm_domain_crash_synchronous(unsigned long addr)
>  {
> +    /*
> +     * We need clear AC bit here because in entry.S AC is set
> +     * by ASM_STAC to temporarily allow accesses to user pages
> +     * which is prevented by SMAP by default.
> +     *
> +     * For some code paths, where this function is called, clac()
> +     * is not needed, but adding clac() here instead of each place
> +     * asm_domain_crash_synchronous() is called can reduce the code
> +     * redundancy, and it is harmless as well.
> +     */
> +    clac();
> +
>      if ( addr == 0 )
>          addr = this_cpu(last_extable_addr);
>  
> diff --git a/xen/arch/x86/usercopy.c b/xen/arch/x86/usercopy.c
> index b79202b..4cc78f5 100644
> --- a/xen/arch/x86/usercopy.c
> +++ b/xen/arch/x86/usercopy.c
> @@ -14,6 +14,7 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned n)
>  {
>      unsigned long __d0, __d1, __d2, __n = n;
>  
> +    stac();
>      asm volatile (
>          "    cmp  $"STR(2*BYTES_PER_LONG-1)",%0\n"
>          "    jbe  1f\n"
> @@ -42,6 +43,7 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned n)
>          : "=&c" (__n), "=&D" (__d0), "=&S" (__d1), "=&r" (__d2)
>          : "0" (__n), "1" (to), "2" (from), "3" (__n)
>          : "memory" );
> +    clac();
>  
>      return __n;
>  }
> @@ -51,6 +53,7 @@ __copy_from_user_ll(void *to, const void __user *from, unsigned n)
>  {
>      unsigned long __d0, __d1, __d2, __n = n;
>  
> +    stac();
>      asm volatile (
>          "    cmp  $"STR(2*BYTES_PER_LONG-1)",%0\n"
>          "    jbe  1f\n"
> @@ -85,6 +88,7 @@ __copy_from_user_ll(void *to, const void __user *from, unsigned n)
>          : "=&c" (__n), "=&D" (__d0), "=&S" (__d1), "=&r" (__d2)
>          : "0" (__n), "1" (to), "2" (from), "3" (__n)
>          : "memory" );
> +    clac();
>  
>      return __n;
>  }
> @@ -113,6 +117,7 @@ copy_to_user(void __user *to, const void *from, unsigned n)
>  #define __do_clear_user(addr,size)					\
>  do {									\
>  	long __d0;							\
> +	stac();								\
>  	__asm__ __volatile__(						\
>  		"0:	rep; stosl\n"					\
>  		"	movl %2,%0\n"					\
> @@ -126,6 +131,7 @@ do {									\
>  		_ASM_EXTABLE(1b,2b)					\
>  		: "=&c"(size), "=&D" (__d0)				\
>  		: "r"(size & 3), "0"(size / 4), "1"((long)addr), "a"(0));	\
> +	clac();								\
>  } while (0)
>  
>  /**
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index 205251d..dce6eb5 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -379,6 +379,7 @@ __UNLIKELY_END(create_bounce_frame_bad_sp)
>          movb  TRAPBOUNCE_flags(%rdx),%cl
>          subq  $40,%rsi
>          movq  UREGS_ss+8(%rsp),%rax
> +        ASM_STAC
>  .Lft2:  movq  %rax,32(%rsi)             # SS
>          movq  UREGS_rsp+8(%rsp),%rax
>  .Lft3:  movq  %rax,24(%rsi)             # RSP
> @@ -423,6 +424,7 @@ UNLIKELY_END(bounce_failsafe)
>  .Lft12: movq  %rax,8(%rsi)              # R11
>          movq  UREGS_rcx+8(%rsp),%rax
>  .Lft13: movq  %rax,(%rsi)               # RCX
> +        ASM_CLAC
>          /* Rewrite our stack frame and return to guest-OS mode. */
>          /* IA32 Ref. Vol. 3: TF, VM, RF and NT flags are cleared on trap. */
>          /* Also clear AC: alignment checks shouldn't trigger in kernel mode. */
> diff --git a/xen/include/asm-x86/uaccess.h b/xen/include/asm-x86/uaccess.h
> index 88b4ba2..92bc322 100644
> --- a/xen/include/asm-x86/uaccess.h
> +++ b/xen/include/asm-x86/uaccess.h
> @@ -146,6 +146,7 @@ struct __large_struct { unsigned long buf[100]; };
>   * aliasing issues.
>   */
>  #define __put_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
> +	stac();								\
>  	__asm__ __volatile__(						\
>  		"1:	mov"itype" %"rtype"1,%2\n"			\
>  		"2:\n"							\
> @@ -155,9 +156,11 @@ struct __large_struct { unsigned long buf[100]; };
>  		".previous\n"						\
>  		_ASM_EXTABLE(1b, 3b)					\
>  		: "=r"(err)						\
> -		: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err))
> +		: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err));	\
> +	clac();

Drop the trailing semicolon here.  The final statement of these macros
shouldn't have one.
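
As a standalone illustration of the pitfall (generic C, not Xen code):

    #include <stdio.h>

    /*
     * A statement-like macro must leave the final ';' to the caller.  With a
     * trailing ';' baked into the macro, "if (x) MACRO(y); else ..." expands
     * to an extra empty statement before the "else", which no longer parses.
     */
    #define put_checked(x)  printf("put %d\n", (x))   /* caller adds the ';' */

    int main(void)
    {
        int ok = 1;

        if ( ok )
            put_checked(1);   /* fine; a trailing ';' in the macro would orphan the else */
        else
            puts("skipped");

        return 0;
    }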

>  
>  #define __get_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
> +	stac();								\
>  	__asm__ __volatile__(						\
>  		"1:	mov"itype" %2,%"rtype"1\n"			\
>  		"2:\n"							\
> @@ -168,7 +171,8 @@ struct __large_struct { unsigned long buf[100]; };
>  		".previous\n"						\
>  		_ASM_EXTABLE(1b, 3b)					\
>  		: "=r"(err), ltype (x)					\
> -		: "m"(__m(addr)), "i"(errret), "0"(err))
> +		: "m"(__m(addr)), "i"(errret), "0"(err));		\
> +	clac();

And here...

>  
>  /**
>   * __copy_to_user: - Copy a block of data into user space, with less checking
> diff --git a/xen/include/asm-x86/x86_64/system.h b/xen/include/asm-x86/x86_64/system.h
> index 20f038b..b6341ad1 100644
> --- a/xen/include/asm-x86/x86_64/system.h
> +++ b/xen/include/asm-x86/x86_64/system.h
> @@ -12,6 +12,7 @@
>   * is the same as the initial value of _o then _n is written to location _p.
>   */
>  #define __cmpxchg_user(_p,_o,_n,_isuff,_oppre,_regtype)                 \
> +    stac();                                                             \
>      asm volatile (                                                      \
>          "1: lock; cmpxchg"_isuff" %"_oppre"2,%3\n"                      \
>          "2:\n"                                                          \
> @@ -22,7 +23,8 @@
>          _ASM_EXTABLE(1b, 3b)                                            \
>          : "=a" (_o), "=r" (_rc)                                         \
>          : _regtype (_n), "m" (*__xg((volatile void *)_p)), "0" (_o), "1" (0) \
> -        : "memory");
> +        : "memory");                                                    \
> +    clac();

And here.

~Andrew

>  
>  #define cmpxchg_user(_p,_o,_n)                                          \
>  ({                                                                      \


* Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-07  8:19 ` [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen Feng Wu
@ 2014-05-07 10:26   ` Andrew Cooper
  2014-05-07 11:44     ` Jan Beulich
  2014-05-08  2:32     ` Wu, Feng
  2014-05-08  1:20   ` Tian, Kevin
  1 sibling, 2 replies; 43+ messages in thread
From: Andrew Cooper @ 2014-05-07 10:26 UTC (permalink / raw)
  To: Feng Wu
  Cc: kevin.tian, ian.campbell, eddie.dong, xen-devel, JBeulich, jun.nakajima

On 07/05/14 09:19, Feng Wu wrote:
> Supervisor Mode Access Prevention (SMAP) is a new security
> feature disclosed by Intel, please refer to the following
> document:
>
> http://software.intel.com/sites/default/files/319433-014.pdf
>
> If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
> to linear addresses that are accessible in user mode. If CPL < 3,
> SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
> applies to all supervisor-mode data accesses (these are implicit
> supervisor accesses) regardless of the value of EFLAGS.AC.
>
> This patch enables SMAP in Xen to prevent Xen hypervisor from
> accessing pv guest data, whose translation paging-structure
> entries' U/S flags are all set.
>
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> ---
>  docs/misc/xen-command-line.markdown |  7 +++++
>  xen/arch/x86/setup.c                | 27 +++++++++++++++++
>  xen/arch/x86/traps.c                | 58 +++++++++++++++++++++++++++----------
>  xen/include/asm-x86/cpufeature.h    |  1 +
>  xen/include/asm-x86/domain.h        |  6 ++--
>  5 files changed, 82 insertions(+), 17 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 7dc938b..a7ac53d 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -912,6 +912,13 @@ Set the serial transmit buffer size.
>  
>  Flag to enable Supervisor Mode Execution Protection
>  
> +### smap

### smap (Intel)

I know it is inconsistent, but I am trying to get the vendor and arch
specific command line arguments identified.

> +> `= <boolean>`
> +
> +> Default: `true`
> +
> +Flag to enable Supervisor Mode Access Prevention
> +
>  ### snb\_igd\_quirk
>  > `= <boolean>`
>  
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 2e30701..29b22a1 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
>  static bool_t __initdata disable_smep;
>  invbool_param("smep", disable_smep);
>  
> +/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
> +static bool_t __initdata disable_smap;
> +invbool_param("smap", disable_smap);
> +
>  /* **** Linux config option: propagated to domain0. */
>  /* "acpi=off":    Sisables both ACPI table parsing and interpreter. */
>  /* "acpi=force":  Override the disable blacklist.                   */
> @@ -1280,6 +1284,11 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>      if ( cpu_has_smep )
>          set_in_cr4(X86_CR4_SMEP);
>  
> +    if ( disable_smap )
> +        setup_clear_cpu_cap(X86_FEATURE_SMAP);
> +    if ( cpu_has_smap )
> +        set_in_cr4(X86_CR4_SMAP);

There is a "pushq $0; popf" in __high_start which will clear AC for the
BSP and all APs when booting.

Is that considered a sufficient guarantee of the state of eflags for
booting purposes?  If not, you probably want an explicit clac() here and
somewhere in start_secondary().

> +
>      if ( cpu_has_fsgsbase )
>          set_in_cr4(X86_CR4_FSGSBASE);
>  
> @@ -1384,6 +1393,21 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>          printk(XENLOG_WARNING
>                 "Multiple initrd candidates, picking module #%u\n",
>                 initrdidx);
> +    /*
> +     * Temporarily clear SMAP bit in CR4 to allow user-page accesses in
> +     * construct_dom0(). This looks a little hackish, but using stac()/clac()
> +     * to do this is not appropriate here: Since construct_dom0() calls
> +     * copy_from_user(), so we cannot wrap construct_dom0() as a whole in
> +     * stac()/clac() - the AC bit remaining cleared after copy_from_user()
> +     * would induce problems in construct_dom0().
> +     *
> +     * On the other hand, if we used stac()/clac() in construct_dom0(), we
> +     * would need to carefully handle some corner cases.
> +     *
> +     * So we clear SMAP before construct_dom0() and enable it back afterwards.
> +    */

Alignment error at the end of the comment.  Furthermore, I feel it is
more verbose than it needs to be.  May I suggest:

"Temporarily clear SMAP in CR4 to allow user-accesses in
construct_dom0().  This saves a large number of corner case interactions
with copy_from_user()"

~Andrew

> +    if ( cpu_has_smap )
> +        write_cr4(read_cr4() & ~X86_CR4_SMAP);
>  
>      /*
>       * We're going to setup domain0 using the module(s) that we stashed safely
> @@ -1395,6 +1419,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>                          bootstrap_map, cmdline) != 0)
>          panic("Could not set up DOM0 guest OS");
>  
> +    if ( cpu_has_smap )
> +        write_cr4(read_cr4() | X86_CR4_SMAP);
> +
>      /* Scrub RAM that is still free and so may go to an unprivileged domain. */
>      scrub_heap_pages();
>  
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 19c96d5..ee203cb 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1183,11 +1183,12 @@ static int handle_gdt_ldt_mapping_fault(
>  enum pf_type {
>      real_fault,
>      smep_fault,
> +    smap_fault,
>      spurious_fault
>  };
>  
>  static enum pf_type __page_fault_type(
> -    unsigned long addr, unsigned int error_code)
> +    unsigned long addr, const struct cpu_user_regs *regs)
>  {
>      unsigned long mfn, cr3 = read_cr3();
>      l4_pgentry_t l4e, *l4t;
> @@ -1195,6 +1196,7 @@ static enum pf_type __page_fault_type(
>      l2_pgentry_t l2e, *l2t;
>      l1_pgentry_t l1e, *l1t;
>      unsigned int required_flags, disallowed_flags, page_user;
> +    unsigned int error_code = regs->error_code;
>  
>      /*
>       * We do not take spurious page faults in IRQ handlers as we do not
> @@ -1263,19 +1265,37 @@ static enum pf_type __page_fault_type(
>      page_user &= l1e_get_flags(l1e);
>  
>  leaf:
> -    /*
> -     * Supervisor Mode Execution Protection (SMEP):
> -     * Disallow supervisor execution from user-accessible mappings
> -     */
> -    if ( (read_cr4() & X86_CR4_SMEP) && page_user &&
> -         ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
> -        return smep_fault;
> +    if ( page_user )
> +    {
> +        unsigned long cr4 = read_cr4();
> +        /*
> +         * Supervisor Mode Execution Prevention (SMEP):
> +         * Disallow supervisor execution from user-accessible mappings
> +         */
> +        if ( (cr4 & X86_CR4_SMEP) &&
> +             ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
> +            return smep_fault;
> +
> +        /*
> +         * Supervisor Mode Access Prevention (SMAP):
> +         * Disallow supervisor access user-accessible mappings
> +         * A fault is considered as an SMAP violation if the following
> +         * conditions are true:
> +         *   - X86_CR4_SMAP is set in CR4
> +         *   - A user page is being accessed
> +         *   - CPL=3 or X86_EFLAGS_AC is clear
> +         *   - Page fault in kernel mode
> +         */
> +        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode) &&
> +             (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
> +            return smap_fault;
> +    }
>  
>      return spurious_fault;
>  }
>  
>  static enum pf_type spurious_page_fault(
> -    unsigned long addr, unsigned int error_code)
> +    unsigned long addr, const struct cpu_user_regs *regs)
>  {
>      unsigned long flags;
>      enum pf_type pf_type;
> @@ -1285,7 +1305,7 @@ static enum pf_type spurious_page_fault(
>       * page tables from becoming invalid under our feet during the walk.
>       */
>      local_irq_save(flags);
> -    pf_type = __page_fault_type(addr, error_code);
> +    pf_type = __page_fault_type(addr, regs);
>      local_irq_restore(flags);
>  
>      return pf_type;
> @@ -1380,8 +1400,14 @@ void do_page_fault(struct cpu_user_regs *regs)
>  
>      if ( unlikely(!guest_mode(regs)) )
>      {
> -        pf_type = spurious_page_fault(addr, error_code);
> -        BUG_ON(pf_type == smep_fault);
> +        pf_type = spurious_page_fault(addr, regs);
> +        if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
> +        {
> +            console_start_sync();
> +            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ? 'E' : 'A');
> +            fatal_trap(TRAP_page_fault, regs);
> +        }
> +
>          if ( pf_type != real_fault )
>              return;
>  
> @@ -1407,10 +1433,12 @@ void do_page_fault(struct cpu_user_regs *regs)
>  
>      if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
>      {
> -        pf_type = spurious_page_fault(addr, error_code);
> -        if ( pf_type == smep_fault )
> +        pf_type = spurious_page_fault(addr, regs);
> +        if ( (pf_type == smep_fault) || (pf_type == smap_fault))
>          {
> -            gdprintk(XENLOG_ERR, "Fatal SMEP fault\n");
> +            printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
> +                   current, (pf_type == smep_fault) ? 'E' : 'A');
> +
>              domain_crash(current->domain);
>          }
>          if ( pf_type != real_fault )
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index 20881c0..8014241 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -190,6 +190,7 @@
>  #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
>  
>  #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
> +#define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
>  #define cpu_has_fpu_sel         (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
>  
>  #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) \
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index c5c266f..abf55fb 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -467,12 +467,14 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu *, unsigned long guest_cr4);
>      (((v)->arch.pv_vcpu.ctrlreg[4]                          \
>        | (mmu_cr4_features                                   \
>           & (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_SMEP |      \
> -            X86_CR4_OSXSAVE | X86_CR4_FSGSBASE))            \
> +            X86_CR4_SMAP | X86_CR4_OSXSAVE |                \
> +            X86_CR4_FSGSBASE))                              \
>        | ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0))         \
>       & ~X86_CR4_DE)
>  #define real_cr4_to_pv_guest_cr4(c)                         \
>      ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
> -             X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
> +             X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
> +             X86_CR4_FSGSBASE | X86_CR4_SMAP))
>  
>  void domain_cpuid(struct domain *d,
>                    unsigned int  input,


* Re: [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest
  2014-05-07  8:19 ` [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest Feng Wu
@ 2014-05-07 10:46   ` Andrew Cooper
  2014-05-07 11:47     ` Jan Beulich
  2014-05-08  1:22   ` Tian, Kevin
  1 sibling, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2014-05-07 10:46 UTC (permalink / raw)
  To: Feng Wu
  Cc: kevin.tian, ian.campbell, eddie.dong, xen-devel, JBeulich, jun.nakajima

On 07/05/14 09:19, Feng Wu wrote:
> Intel new CPU supports SMAP (Supervisor Mode Access Prevention).
> SMAP prevents supervisor-mode accesses to any linear address with
> a valid translation for which the U/S flag (bit 2) is 1 in every
> paging-structure entry controlling the translation for the linear
> address.
>
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> ---
>  xen/arch/x86/hvm/hvm.c        |  3 +++
>  xen/arch/x86/mm/guest_walk.c  | 40 ++++++++++++++++++++++++++++++----------
>  xen/include/asm-x86/hvm/hvm.h | 18 +++++++++++++++++-
>  3 files changed, 50 insertions(+), 11 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ac05160..76ccd07 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -3039,6 +3039,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>          if ( (count == 0) && !cpu_has_smep )
>              *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
>  
> +        if ( (count == 0) && !cpu_has_smap )
> +            *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
> +
>          /* Don't expose MPX to hvm when VMX support is not available */
>          if ( (count == 0) &&
>               (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
> diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
> index 70460b6..bb38fda 100644
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -144,7 +144,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>      guest_l4e_t *l4p;
>  #endif
>      uint32_t gflags, mflags, iflags, rc = 0;
> -    int smep;
> +    bool_t smep = 0, smap = 0;
>      bool_t pse1G = 0, pse2M = 0;
>      p2m_query_t qt = P2M_ALLOC | P2M_UNSHARE;
>  
> @@ -159,13 +159,33 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>      mflags = mandatory_flags(v, pfec);
>      iflags = (_PAGE_NX_BIT | _PAGE_INVALID_BITS);
>  
> -    /* SMEP: kernel-mode instruction fetches from user-mode mappings
> -     * should fault.  Unlike NX or invalid bits, we're looking for _all_
> -     * entries in the walk to have _PAGE_USER set, so we need to do the
> -     * whole walk as if it were a user-mode one and then invert the answer. */
> -    smep = (is_hvm_vcpu(v) && hvm_smep_enabled(v) 
> -            && (pfec & PFEC_insn_fetch) && !(pfec & PFEC_user_mode) );
> -    if ( smep )
> +    if ( is_hvm_vcpu(v) && !(pfec & PFEC_user_mode) )
> +    {
> +        struct segment_register seg;
> +        const struct cpu_user_regs *regs = guest_cpu_user_regs();
> +
> +        hvm_get_segment_register(v, x86_seg_ss, &seg);
> +
> +        /* SMEP: kernel-mode instruction fetches from user-mode mappings
> +         * should fault.  Unlike NX or invalid bits, we're looking for _all_
> +         * entries in the walk to have _PAGE_USER set, so we need to do the
> +         * whole walk as if it were a user-mode one and then invert the answer. */
> +        smep =  hvm_smep_enabled(v) && (pfec & PFEC_insn_fetch);
> +
> +        /*
> +         * SMAP: kernel-mode data accesses from user-mode mappings should fault
> +         * A fault is considered as a SMAP violation if the following
> +         * conditions come true:
> +         *   - X86_CR4_SMAP is set in CR4
> +         *   - A user page is accessed
> +         *   - CPL = 3 or X86_EFLAGS_AC is clear
> +         *   - Page fault in kernel mode
> +         */
> +        smap = hvm_smap_enabled(v) &&
> +               ((seg.attr.fields.dpl == 3) || !(regs->eflags & X86_EFLAGS_AC));
> +    }
> +
> +    if ( smep || smap )
>          mflags |= _PAGE_USER;
>  
>  #if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
> @@ -338,8 +358,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
>  set_ad:
>  #endif
> -    /* Now re-invert the user-mode requirement for SMEP. */
> -    if ( smep ) 
> +    /* Now re-invert the user-mode requirement for SMEP and SMAP */
> +    if ( smep || smap )
>          rc ^= _PAGE_USER;
>  
>      /* Go back and set accessed and dirty bits only if the walk was a
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index c373930..0ab6b70 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -260,6 +260,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
>      (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PAE))
>  #define hvm_smep_enabled(v) \
>      (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMEP))
> +#define hvm_smap_enabled(v) \
> +    (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMAP))
>  #define hvm_nx_enabled(v) \
>      (!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
>  
> @@ -360,13 +362,26 @@ static inline bool_t hvm_vcpu_has_smep(void)
>  
>      hvm_cpuid(0, &eax, NULL, NULL, NULL);
>  
> -    if (eax < 7)
> +    if ( eax < 7 )
>          return 0;
>  
>      hvm_cpuid(7, NULL, &ebx, NULL, NULL);
>      return !!(ebx & cpufeat_mask(X86_FEATURE_SMEP));
>  }
>  
> +static inline bool_t hvm_vcpu_has_smap(void)
> +{
> +    unsigned int eax, ebx;
> +
> +    hvm_cpuid(0, &eax, NULL, NULL, NULL);
> +
> +    if ( eax < 7 )
> +        return 0;
> +
> +    hvm_cpuid(0x7, NULL, &ebx, NULL, NULL);

I have just noticed a bug here and in hvm_vcpu_has_smep().  CPUID.0x7
takes a count of 0 for the feature leaf.  You should not rely on the
default value of dummy inside hvm_cpuid() for this call to work
correctly, so you need to pass ecx explicitly.
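
For reference, a sketch of the more explicit form (same shape as the hunk
above; it assumes hvm_cpuid() takes the sub-leaf through its ecx pointer,
which is what the suggestion relies on):

    static inline bool_t hvm_vcpu_has_smap(void)
    {
        unsigned int eax, ebx, ecx = 0;   /* sub-leaf 0, passed explicitly */

        hvm_cpuid(0, &eax, NULL, NULL, NULL);

        if ( eax < 7 )
            return 0;

        hvm_cpuid(7, NULL, &ebx, &ecx, NULL);
        return !!(ebx & cpufeat_mask(X86_FEATURE_SMAP));
    }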

~Andrew

> +    return !!(ebx & cpufeat_mask(X86_FEATURE_SMAP));
> +}
> +
>  /* These reserved bits in lower 32 remain 0 after any load of CR0 */
>  #define HVM_CR0_GUEST_RESERVED_BITS             \
>      (~((unsigned long)                          \
> @@ -387,6 +402,7 @@ static inline bool_t hvm_vcpu_has_smep(void)
>          X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |       \
>          X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT |           \
>          (hvm_vcpu_has_smep() ? X86_CR4_SMEP : 0) |      \
> +        (hvm_vcpu_has_smap() ? X86_CR4_SMAP : 0) |      \
>          (cpu_has_fsgsbase ? X86_CR4_FSGSBASE : 0) |     \
>          ((nestedhvm_enabled((_v)->domain) && cpu_has_vmx)\
>                        ? X86_CR4_VMXE : 0)  |             \


* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-07  9:44   ` Andrew Cooper
@ 2014-05-07 11:40     ` Jan Beulich
  2014-05-07 11:53       ` Andrew Cooper
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2014-05-07 11:40 UTC (permalink / raw)
  To: Andrew Cooper, Feng Wu
  Cc: kevin.tian, eddie.dong, ian.campbell, jun.nakajima, xen-devel

>>> On 07.05.14 at 11:44, <andrew.cooper3@citrix.com> wrote:
> On 07/05/14 09:19, Feng Wu wrote:
>> @@ -673,6 +675,7 @@ ENTRY(nmi_crash)
>>          ud2
>>  
>>  ENTRY(machine_check)
>> +        ASM_CLAC
> 
> This is not needed.  the start of handle_ist_exception has a SAVE_ALL,
> which also covers the nmi entry point.
> 
> On the subject of IST exceptions, perhaps the double fault explicitly
> wants a STAC to reduce the likelihood of taking a further fault while
> trying to dump state. ?

I agree. And perhaps along with do_double_fault(), fatal_trap()
should then also get a stac() added?

Jan

* Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-07 10:26   ` Andrew Cooper
@ 2014-05-07 11:44     ` Jan Beulich
  2014-05-07 11:47       ` Andrew Cooper
  2014-05-08  2:32     ` Wu, Feng
  1 sibling, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2014-05-07 11:44 UTC (permalink / raw)
  To: Andrew Cooper, Feng Wu
  Cc: kevin.tian, eddie.dong, ian.campbell, jun.nakajima, xen-devel

>>> On 07.05.14 at 12:26, <andrew.cooper3@citrix.com> wrote:
> On 07/05/14 09:19, Feng Wu wrote:
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -912,6 +912,13 @@ Set the serial transmit buffer size.
>>  
>>  Flag to enable Supervisor Mode Execution Protection
>>  
>> +### smap
> 
> ### smap (Intel)
> 
> I know it is inconsistent, but I am trying to get the vendor and arch
> specific command line arguments identified.

Actually I disagree - there's nothing inherently Intel here, i.e. nothing
keeps AMD from adding respective functionality. It would be a different
thing for VMX or VT-d specific functionality that's being controlled via
command line options.

Jan

* Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-07 11:44     ` Jan Beulich
@ 2014-05-07 11:47       ` Andrew Cooper
  0 siblings, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2014-05-07 11:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, Feng Wu, ian.campbell, eddie.dong, xen-devel, jun.nakajima

On 07/05/14 12:44, Jan Beulich wrote:
>>>> On 07.05.14 at 12:26, <andrew.cooper3@citrix.com> wrote:
>> On 07/05/14 09:19, Feng Wu wrote:
>>> --- a/docs/misc/xen-command-line.markdown
>>> +++ b/docs/misc/xen-command-line.markdown
>>> @@ -912,6 +912,13 @@ Set the serial transmit buffer size.
>>>  
>>>  Flag to enable Supervisor Mode Execution Protection
>>>  
>>> +### smap
>> ### smap (Intel)
>>
>> I know it is inconsistent, but I am trying to get the vendor and arch
>> specific command line arguments identified.
> Actually I disagree - there's nothing inherently Intel here, i.e. nothing
> keeps AMD from adding respective functionality. It would be a different
> think for VMX or VT-d specific functionality that's being controlled via
> command line options.
>
> Jan
>

Fair enough.  I retract my correction.

~Andrew

* Re: [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest
  2014-05-07 10:46   ` Andrew Cooper
@ 2014-05-07 11:47     ` Jan Beulich
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2014-05-07 11:47 UTC (permalink / raw)
  To: Andrew Cooper, Feng Wu
  Cc: kevin.tian, eddie.dong, ian.campbell, jun.nakajima, xen-devel

>>> On 07.05.14 at 12:46, <andrew.cooper3@citrix.com> wrote:
> On 07/05/14 09:19, Feng Wu wrote:
>> @@ -360,13 +362,26 @@ static inline bool_t hvm_vcpu_has_smep(void)
>>  
>>      hvm_cpuid(0, &eax, NULL, NULL, NULL);
>>  
>> -    if (eax < 7)
>> +    if ( eax < 7 )
>>          return 0;
>>  
>>      hvm_cpuid(7, NULL, &ebx, NULL, NULL);
>>      return !!(ebx & cpufeat_mask(X86_FEATURE_SMEP));
>>  }
>>  
>> +static inline bool_t hvm_vcpu_has_smap(void)
>> +{
>> +    unsigned int eax, ebx;
>> +
>> +    hvm_cpuid(0, &eax, NULL, NULL, NULL);
>> +
>> +    if ( eax < 7 )
>> +        return 0;
>> +
>> +    hvm_cpuid(0x7, NULL, &ebx, NULL, NULL);
> 
> I have just noticed a bug here and in hvm_vcpu_have_smep().  CPUID.0x7
> takes a count of 0 for the feature leaf.  You should not rely on the
> default value of dummy inside hvm_cpuid() for this call to work
> correctly, so need to explicitly pass ecx.

While I'm fine with this being made more explicit, I don't think it's a bug
(not even a latent one) to rely on the default sub-leaf being zero.

Jan

* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-07 11:40     ` Jan Beulich
@ 2014-05-07 11:53       ` Andrew Cooper
  2014-05-08  1:41         ` Wu, Feng
  0 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2014-05-07 11:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, Feng Wu, ian.campbell, eddie.dong, xen-devel, jun.nakajima

On 07/05/14 12:40, Jan Beulich wrote:
>>>> On 07.05.14 at 11:44, <andrew.cooper3@citrix.com> wrote:
>> On 07/05/14 09:19, Feng Wu wrote:
>>> @@ -673,6 +675,7 @@ ENTRY(nmi_crash)
>>>          ud2
>>>  
>>>  ENTRY(machine_check)
>>> +        ASM_CLAC
>> This is not needed.  the start of handle_ist_exception has a SAVE_ALL,
>> which also covers the nmi entry point.
>>
>> On the subject of IST exceptions, perhaps the double fault explicitly
>> wants a STAC to reduce the likelihood of taking a further fault while
>> trying to dump state. ?
> I agree. And perhaps along with do_double_fault(), fatal_trap()
> should then also get a stac() added?
>
> Jan
>

With double_fault: being the sole caller of do_double_fault(), editing the
entry point in entry.S to "ASM_STAC; SAVE_ALL 0" is sufficient to avoid
stac() in do_double_fault() itself.

I would agree that fatal_trap() wants an stac() in it.

~Andrew

* Re: [PATCH v6 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode
  2014-05-07  8:19 ` [PATCH v6 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode Feng Wu
  2014-05-07  9:49   ` Andrew Cooper
@ 2014-05-08  1:14   ` Tian, Kevin
  1 sibling, 0 replies; 43+ messages in thread
From: Tian, Kevin @ 2014-05-08  1:14 UTC (permalink / raw)
  To: Wu, Feng, xen-devel
  Cc: andrew.cooper3, Dong, Eddie, Nakajima, Jun, JBeulich, ian.campbell

> From: Wu, Feng
> Sent: Wednesday, May 07, 2014 4:20 PM
> 
> Use STAC/CLAC to temporarily disable SMAP to allow legal accesses to
> user pages in kernel mode
> 
> STAC/CLAC is not needed for compat_create_bounce_frame, since in this
> chunk of code, it only accesses the pv guest's kernel stack, which is
> in ring 1 for 32-bit pv guests.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  xen/arch/x86/traps.c                | 12 ++++++++++++
>  xen/arch/x86/usercopy.c             |  6 ++++++
>  xen/arch/x86/x86_64/entry.S         |  2 ++
>  xen/include/asm-x86/uaccess.h       |  8 ++++++--
>  xen/include/asm-x86/x86_64/system.h |  4 +++-
>  5 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 45070bb..19c96d5 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -3811,6 +3811,18 @@ unsigned long do_get_debugreg(int reg)
> 
>  void asm_domain_crash_synchronous(unsigned long addr)
>  {
> +    /*
> +     * We need clear AC bit here because in entry.S AC is set
> +     * by ASM_STAC to temporarily allow accesses to user pages
> +     * which is prevented by SMAP by default.
> +     *
> +     * For some code paths, where this function is called, clac()
> +     * is not needed, but adding clac() here instead of each place
> +     * asm_domain_crash_synchronous() is called can reduce the code
> +     * redundancy, and it is harmless as well.
> +     */
> +    clac();
> +
>      if ( addr == 0 )
>          addr = this_cpu(last_extable_addr);
> 
> diff --git a/xen/arch/x86/usercopy.c b/xen/arch/x86/usercopy.c
> index b79202b..4cc78f5 100644
> --- a/xen/arch/x86/usercopy.c
> +++ b/xen/arch/x86/usercopy.c
> @@ -14,6 +14,7 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned n)
>  {
>      unsigned long __d0, __d1, __d2, __n = n;
> 
> +    stac();
>      asm volatile (
>          "    cmp  $"STR(2*BYTES_PER_LONG-1)",%0\n"
>          "    jbe  1f\n"
> @@ -42,6 +43,7 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned n)
>          : "=&c" (__n), "=&D" (__d0), "=&S" (__d1), "=&r" (__d2)
>          : "0" (__n), "1" (to), "2" (from), "3" (__n)
>          : "memory" );
> +    clac();
> 
>      return __n;
>  }
> @@ -51,6 +53,7 @@ __copy_from_user_ll(void *to, const void __user *from, unsigned n)
>  {
>      unsigned long __d0, __d1, __d2, __n = n;
> 
> +    stac();
>      asm volatile (
>          "    cmp  $"STR(2*BYTES_PER_LONG-1)",%0\n"
>          "    jbe  1f\n"
> @@ -85,6 +88,7 @@ __copy_from_user_ll(void *to, const void __user *from, unsigned n)
>          : "=&c" (__n), "=&D" (__d0), "=&S" (__d1), "=&r" (__d2)
>          : "0" (__n), "1" (to), "2" (from), "3" (__n)
>          : "memory" );
> +    clac();
> 
>      return __n;
>  }
> @@ -113,6 +117,7 @@ copy_to_user(void __user *to, const void *from, unsigned n)
>  #define __do_clear_user(addr,size)					\
>  do {									\
>  	long __d0;							\
> +	stac();								\
>  	__asm__ __volatile__(						\
>  		"0:	rep; stosl\n"					\
>  		"	movl %2,%0\n"					\
> @@ -126,6 +131,7 @@ do {									\
>  		_ASM_EXTABLE(1b,2b)					\
>  		: "=&c"(size), "=&D" (__d0)				\
>  		: "r"(size & 3), "0"(size / 4), "1"((long)addr), "a"(0));	\
> +	clac();								\
>  } while (0)
> 
>  /**
> diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
> index 205251d..dce6eb5 100644
> --- a/xen/arch/x86/x86_64/entry.S
> +++ b/xen/arch/x86/x86_64/entry.S
> @@ -379,6 +379,7 @@ __UNLIKELY_END(create_bounce_frame_bad_sp)
>          movb  TRAPBOUNCE_flags(%rdx),%cl
>          subq  $40,%rsi
>          movq  UREGS_ss+8(%rsp),%rax
> +        ASM_STAC
>  .Lft2:  movq  %rax,32(%rsi)             # SS
>          movq  UREGS_rsp+8(%rsp),%rax
>  .Lft3:  movq  %rax,24(%rsi)             # RSP
> @@ -423,6 +424,7 @@ UNLIKELY_END(bounce_failsafe)
>  .Lft12: movq  %rax,8(%rsi)              # R11
>          movq  UREGS_rcx+8(%rsp),%rax
>  .Lft13: movq  %rax,(%rsi)               # RCX
> +        ASM_CLAC
>          /* Rewrite our stack frame and return to guest-OS mode. */
>          /* IA32 Ref. Vol. 3: TF, VM, RF and NT flags are cleared on trap. */
>          /* Also clear AC: alignment checks shouldn't trigger in kernel mode.
> */
> diff --git a/xen/include/asm-x86/uaccess.h b/xen/include/asm-x86/uaccess.h
> index 88b4ba2..92bc322 100644
> --- a/xen/include/asm-x86/uaccess.h
> +++ b/xen/include/asm-x86/uaccess.h
> @@ -146,6 +146,7 @@ struct __large_struct { unsigned long buf[100]; };
>   * aliasing issues.
>   */
>  #define __put_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
> +	stac();								\
>  	__asm__ __volatile__(						\
>  		"1:	mov"itype" %"rtype"1,%2\n"			\
>  		"2:\n"							\
> @@ -155,9 +156,11 @@ struct __large_struct { unsigned long buf[100]; };
>  		".previous\n"						\
>  		_ASM_EXTABLE(1b, 3b)					\
>  		: "=r"(err)						\
> -		: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err))
> +		: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err));	\
> +	clac();
> 
>  #define __get_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
> +	stac();								\
>  	__asm__ __volatile__(						\
>  		"1:	mov"itype" %2,%"rtype"1\n"			\
>  		"2:\n"							\
> @@ -168,7 +171,8 @@ struct __large_struct { unsigned long buf[100]; };
>  		".previous\n"						\
>  		_ASM_EXTABLE(1b, 3b)					\
>  		: "=r"(err), ltype (x)					\
> -		: "m"(__m(addr)), "i"(errret), "0"(err))
> +		: "m"(__m(addr)), "i"(errret), "0"(err));		\
> +	clac();
> 
>  /**
>   * __copy_to_user: - Copy a block of data into user space, with less checking
> diff --git a/xen/include/asm-x86/x86_64/system.h b/xen/include/asm-x86/x86_64/system.h
> index 20f038b..b6341ad1 100644
> --- a/xen/include/asm-x86/x86_64/system.h
> +++ b/xen/include/asm-x86/x86_64/system.h
> @@ -12,6 +12,7 @@
>   * is the same as the initial value of _o then _n is written to location _p.
>   */
>  #define __cmpxchg_user(_p,_o,_n,_isuff,_oppre,_regtype)                 \
> +    stac();                                                             \
>      asm volatile (                                                      \
>          "1: lock; cmpxchg"_isuff" %"_oppre"2,%3\n"                      \
>          "2:\n"                                                          \
> @@ -22,7 +23,8 @@
>          _ASM_EXTABLE(1b, 3b)                                            \
>          : "=a" (_o), "=r" (_rc)                                         \
>          : _regtype (_n), "m" (*__xg((volatile void *)_p)), "0" (_o), "1" (0) \
> -        : "memory");
> +        : "memory");                                                    \
> +    clac();
> 
>  #define cmpxchg_user(_p,_o,_n)                                          \
>  ({                                                                      \
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 07/10] VMX: Disable SMAP feature when guest is in non-paging mode
  2014-05-07  8:19 ` [PATCH v6 07/10] VMX: Disable SMAP feature when guest is in non-paging mode Feng Wu
@ 2014-05-08  1:16   ` Tian, Kevin
  0 siblings, 0 replies; 43+ messages in thread
From: Tian, Kevin @ 2014-05-08  1:16 UTC (permalink / raw)
  To: Wu, Feng, xen-devel
  Cc: andrew.cooper3, Dong, Eddie, Nakajima, Jun, JBeulich, ian.campbell

> From: Wu, Feng
> Sent: Wednesday, May 07, 2014 4:20 PM
> 
> SMAP is disabled if CPU is in non-paging mode in hardware.
> However Xen always uses paging mode to emulate guest non-paging
> mode with HAP. To emulate this behavior, SMAP needs to be manually
> disabled when guest switches to non-paging mode.
> 
> This logic is similar to that for SMEP.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  xen/arch/x86/hvm/vmx/vmx.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index ecdbc17..f0f6f83 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1319,12 +1319,12 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
>          if ( !hvm_paging_enabled(v) )
>          {
>              /*
> -             * SMEP is disabled if CPU is in non-paging mode in hardware.
> +             * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware.
>               * However Xen always uses paging mode to emulate guest non-paging
> -             * mode. To emulate this behavior, SMEP needs to be manually
> +             * mode. To emulate this behavior, SMEP/SMAP needs to be manually
>               * disabled when guest VCPU is in non-paging mode.
>               */
> -            v->arch.hvm_vcpu.hw_cr[4] &= ~X86_CR4_SMEP;
> +            v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP);
>          }
>          __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
>          break;
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-07  8:19 ` [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen Feng Wu
  2014-05-07 10:26   ` Andrew Cooper
@ 2014-05-08  1:20   ` Tian, Kevin
  2014-05-08  6:25     ` Wu, Feng
  1 sibling, 1 reply; 43+ messages in thread
From: Tian, Kevin @ 2014-05-08  1:20 UTC (permalink / raw)
  To: Wu, Feng, xen-devel
  Cc: andrew.cooper3, Dong, Eddie, Nakajima, Jun, JBeulich, ian.campbell

> From: Wu, Feng
> Sent: Wednesday, May 07, 2014 4:20 PM
> 
> Supervisor Mode Access Prevention (SMAP) is a new security
> feature disclosed by Intel, please refer to the following
> document:
> 
> http://software.intel.com/sites/default/files/319433-014.pdf
> 
> If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
> to linear addresses that are accessible in user mode. If CPL < 3,
> SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
> applies to all supervisor-mode data accesses (these are implicit
> supervisor accesses) regardless of the value of EFLAGS.AC.
> 
> This patch enables SMAP in Xen to prevent Xen hypervisor from
> accessing pv guest data, whose translation paging-structure
> entries' U/S flags are all set.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

could you double-check whether S3 resume and CPU hotplug are covered
well?

Thanks
Kevin

> ---
>  docs/misc/xen-command-line.markdown |  7 +++++
>  xen/arch/x86/setup.c                | 27 +++++++++++++++++
>  xen/arch/x86/traps.c                | 58
> +++++++++++++++++++++++++++----------
>  xen/include/asm-x86/cpufeature.h    |  1 +
>  xen/include/asm-x86/domain.h        |  6 ++--
>  5 files changed, 82 insertions(+), 17 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.markdown
> b/docs/misc/xen-command-line.markdown
> index 7dc938b..a7ac53d 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -912,6 +912,13 @@ Set the serial transmit buffer size.
> 
>  Flag to enable Supervisor Mode Execution Protection
> 
> +### smap
> +> `= <boolean>`
> +
> +> Default: `true`
> +
> +Flag to enable Supervisor Mode Access Prevention
> +
>  ### snb\_igd\_quirk
>  > `= <boolean>`
> 
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 2e30701..29b22a1 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
>  static bool_t __initdata disable_smep;
>  invbool_param("smep", disable_smep);
> 
> +/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
> +static bool_t __initdata disable_smap;
> +invbool_param("smap", disable_smap);
> +
>  /* **** Linux config option: propagated to domain0. */
>  /* "acpi=off":    Sisables both ACPI table parsing and interpreter. */
>  /* "acpi=force":  Override the disable blacklist.                   */
> @@ -1280,6 +1284,11 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
>      if ( cpu_has_smep )
>          set_in_cr4(X86_CR4_SMEP);
> 
> +    if ( disable_smap )
> +        setup_clear_cpu_cap(X86_FEATURE_SMAP);
> +    if ( cpu_has_smap )
> +        set_in_cr4(X86_CR4_SMAP);
> +
>      if ( cpu_has_fsgsbase )
>          set_in_cr4(X86_CR4_FSGSBASE);
> 
> @@ -1384,6 +1393,21 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
>          printk(XENLOG_WARNING
>                 "Multiple initrd candidates, picking module #%u\n",
>                 initrdidx);
> +    /*
> +     * Temporarily clear SMAP bit in CR4 to allow user-page accesses in
> +     * construct_dom0(). This looks a little hackish, but using stac()/clac()
> +     * to do this is not appropriate here: Since construct_dom0() calls
> +     * copy_from_user(), so we cannot wrap construct_dom0() as a whole
> in
> +     * stac()/clac() - the AC bit remaining cleared after copy_from_user()
> +     * would induce problems in construct_dom0().
> +     *
> +     * On the other hand, if we used stac()/clac() in construct_dom0(), we
> +     * would need to carefully handle some corner cases.
> +     *
> +     * So we clear SMAP before construct_dom0() and enable it back
> afterwards.
> +    */
> +    if ( cpu_has_smap )
> +        write_cr4(read_cr4() & ~X86_CR4_SMAP);
> 
>      /*
>       * We're going to setup domain0 using the module(s) that we stashed
> safely
> @@ -1395,6 +1419,9 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
>                          bootstrap_map, cmdline) != 0)
>          panic("Could not set up DOM0 guest OS");
> 
> +    if ( cpu_has_smap )
> +        write_cr4(read_cr4() | X86_CR4_SMAP);
> +
>      /* Scrub RAM that is still free and so may go to an unprivileged domain.
> */
>      scrub_heap_pages();
> 
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 19c96d5..ee203cb 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1183,11 +1183,12 @@ static int handle_gdt_ldt_mapping_fault(
>  enum pf_type {
>      real_fault,
>      smep_fault,
> +    smap_fault,
>      spurious_fault
>  };
> 
>  static enum pf_type __page_fault_type(
> -    unsigned long addr, unsigned int error_code)
> +    unsigned long addr, const struct cpu_user_regs *regs)
>  {
>      unsigned long mfn, cr3 = read_cr3();
>      l4_pgentry_t l4e, *l4t;
> @@ -1195,6 +1196,7 @@ static enum pf_type __page_fault_type(
>      l2_pgentry_t l2e, *l2t;
>      l1_pgentry_t l1e, *l1t;
>      unsigned int required_flags, disallowed_flags, page_user;
> +    unsigned int error_code = regs->error_code;
> 
>      /*
>       * We do not take spurious page faults in IRQ handlers as we do not
> @@ -1263,19 +1265,37 @@ static enum pf_type __page_fault_type(
>      page_user &= l1e_get_flags(l1e);
> 
>  leaf:
> -    /*
> -     * Supervisor Mode Execution Protection (SMEP):
> -     * Disallow supervisor execution from user-accessible mappings
> -     */
> -    if ( (read_cr4() & X86_CR4_SMEP) && page_user &&
> -         ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) ==
> PFEC_insn_fetch) )
> -        return smep_fault;
> +    if ( page_user )
> +    {
> +        unsigned long cr4 = read_cr4();
> +        /*
> +         * Supervisor Mode Execution Prevention (SMEP):
> +         * Disallow supervisor execution from user-accessible mappings
> +         */
> +        if ( (cr4 & X86_CR4_SMEP) &&
> +             ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) ==
> PFEC_insn_fetch) )
> +            return smep_fault;
> +
> +        /*
> +         * Supervisor Mode Access Prevention (SMAP):
> +         * Disallow supervisor access user-accessible mappings
> +         * A fault is considered as an SMAP violation if the following
> +         * conditions are true:
> +         *   - X86_CR4_SMAP is set in CR4
> +         *   - A user page is being accessed
> +         *   - CPL=3 or X86_EFLAGS_AC is clear
> +         *   - Page fault in kernel mode
> +         */
> +        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode)
> &&
> +             (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
> +            return smap_fault;
> +    }
> 
>      return spurious_fault;
>  }
> 
>  static enum pf_type spurious_page_fault(
> -    unsigned long addr, unsigned int error_code)
> +    unsigned long addr, const struct cpu_user_regs *regs)
>  {
>      unsigned long flags;
>      enum pf_type pf_type;
> @@ -1285,7 +1305,7 @@ static enum pf_type spurious_page_fault(
>       * page tables from becoming invalid under our feet during the walk.
>       */
>      local_irq_save(flags);
> -    pf_type = __page_fault_type(addr, error_code);
> +    pf_type = __page_fault_type(addr, regs);
>      local_irq_restore(flags);
> 
>      return pf_type;
> @@ -1380,8 +1400,14 @@ void do_page_fault(struct cpu_user_regs *regs)
> 
>      if ( unlikely(!guest_mode(regs)) )
>      {
> -        pf_type = spurious_page_fault(addr, error_code);
> -        BUG_ON(pf_type == smep_fault);
> +        pf_type = spurious_page_fault(addr, regs);
> +        if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
> +        {
> +            console_start_sync();
> +            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ? 'E' :
> 'A');
> +            fatal_trap(TRAP_page_fault, regs);
> +        }
> +
>          if ( pf_type != real_fault )
>              return;
> 
> @@ -1407,10 +1433,12 @@ void do_page_fault(struct cpu_user_regs *regs)
> 
>      if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
>      {
> -        pf_type = spurious_page_fault(addr, error_code);
> -        if ( pf_type == smep_fault )
> +        pf_type = spurious_page_fault(addr, regs);
> +        if ( (pf_type == smep_fault) || (pf_type == smap_fault))
>          {
> -            gdprintk(XENLOG_ERR, "Fatal SMEP fault\n");
> +            printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
> +                   current, (pf_type == smep_fault) ? 'E' : 'A');
> +
>              domain_crash(current->domain);
>          }
>          if ( pf_type != real_fault )
> diff --git a/xen/include/asm-x86/cpufeature.h
> b/xen/include/asm-x86/cpufeature.h
> index 20881c0..8014241 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -190,6 +190,7 @@
>  #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
> 
>  #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
> +#define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
>  #define cpu_has_fpu_sel
> (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
> 
>  #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor ==
> X86_VENDOR_AMD) \
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index c5c266f..abf55fb 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -467,12 +467,14 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu
> *, unsigned long guest_cr4);
>      (((v)->arch.pv_vcpu.ctrlreg[4]                          \
>        | (mmu_cr4_features                                   \
>           & (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_SMEP |      \
> -            X86_CR4_OSXSAVE | X86_CR4_FSGSBASE))            \
> +            X86_CR4_SMAP | X86_CR4_OSXSAVE |                \
> +            X86_CR4_FSGSBASE))                              \
>        | ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0))         \
>       & ~X86_CR4_DE)
>  #define real_cr4_to_pv_guest_cr4(c)                         \
>      ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
> -             X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
> +             X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
> +             X86_CR4_FSGSBASE | X86_CR4_SMAP))
> 
>  void domain_cpuid(struct domain *d,
>                    unsigned int  input,
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest
  2014-05-07  8:19 ` [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest Feng Wu
  2014-05-07 10:46   ` Andrew Cooper
@ 2014-05-08  1:22   ` Tian, Kevin
  1 sibling, 0 replies; 43+ messages in thread
From: Tian, Kevin @ 2014-05-08  1:22 UTC (permalink / raw)
  To: Wu, Feng, xen-devel
  Cc: andrew.cooper3, Dong, Eddie, Nakajima, Jun, JBeulich, ian.campbell

> From: Wu, Feng
> Sent: Wednesday, May 07, 2014 4:20 PM
> Intel's new CPUs support SMAP (Supervisor Mode Access Prevention).
> SMAP prevents supervisor-mode accesses to any linear address with
> a valid translation for which the U/S flag (bit 2) is 1 in every
> paging-structure entry controlling the translation for the linear
> address.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>

> ---
>  xen/arch/x86/hvm/hvm.c        |  3 +++
>  xen/arch/x86/mm/guest_walk.c  | 40
> ++++++++++++++++++++++++++++++----------
>  xen/include/asm-x86/hvm/hvm.h | 18 +++++++++++++++++-
>  3 files changed, 50 insertions(+), 11 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index ac05160..76ccd07 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -3039,6 +3039,9 @@ void hvm_cpuid(unsigned int input, unsigned int
> *eax, unsigned int *ebx,
>          if ( (count == 0) && !cpu_has_smep )
>              *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
> 
> +        if ( (count == 0) && !cpu_has_smap )
> +            *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
> +
>          /* Don't expose MPX to hvm when VMX support is not available */
>          if ( (count == 0) &&
>               (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
> diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
> index 70460b6..bb38fda 100644
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -144,7 +144,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain
> *p2m,
>      guest_l4e_t *l4p;
>  #endif
>      uint32_t gflags, mflags, iflags, rc = 0;
> -    int smep;
> +    bool_t smep = 0, smap = 0;
>      bool_t pse1G = 0, pse2M = 0;
>      p2m_query_t qt = P2M_ALLOC | P2M_UNSHARE;
> 
> @@ -159,13 +159,33 @@ guest_walk_tables(struct vcpu *v, struct
> p2m_domain *p2m,
>      mflags = mandatory_flags(v, pfec);
>      iflags = (_PAGE_NX_BIT | _PAGE_INVALID_BITS);
> 
> -    /* SMEP: kernel-mode instruction fetches from user-mode mappings
> -     * should fault.  Unlike NX or invalid bits, we're looking for _all_
> -     * entries in the walk to have _PAGE_USER set, so we need to do the
> -     * whole walk as if it were a user-mode one and then invert the answer.
> */
> -    smep = (is_hvm_vcpu(v) && hvm_smep_enabled(v)
> -            && (pfec & PFEC_insn_fetch) && !(pfec & PFEC_user_mode) );
> -    if ( smep )
> +    if ( is_hvm_vcpu(v) && !(pfec & PFEC_user_mode) )
> +    {
> +        struct segment_register seg;
> +        const struct cpu_user_regs *regs = guest_cpu_user_regs();
> +
> +        hvm_get_segment_register(v, x86_seg_ss, &seg);
> +
> +        /* SMEP: kernel-mode instruction fetches from user-mode
> mappings
> +         * should fault.  Unlike NX or invalid bits, we're looking for _all_
> +         * entries in the walk to have _PAGE_USER set, so we need to do
> the
> +         * whole walk as if it were a user-mode one and then invert the
> answer. */
> +        smep =  hvm_smep_enabled(v) && (pfec & PFEC_insn_fetch);
> +
> +        /*
> +         * SMAP: kernel-mode data accesses from user-mode mappings
> should fault
> +         * A fault is considered as a SMAP violation if the following
> +         * conditions come true:
> +         *   - X86_CR4_SMAP is set in CR4
> +         *   - A user page is accessed
> +         *   - CPL = 3 or X86_EFLAGS_AC is clear
> +         *   - Page fault in kernel mode
> +         */
> +        smap = hvm_smap_enabled(v) &&
> +               ((seg.attr.fields.dpl == 3) || !(regs->eflags &
> X86_EFLAGS_AC));
> +    }
> +
> +    if ( smep || smap )
>          mflags |= _PAGE_USER;
> 
>  #if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
> @@ -338,8 +358,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain
> *p2m,
>  #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
>  set_ad:
>  #endif
> -    /* Now re-invert the user-mode requirement for SMEP. */
> -    if ( smep )
> +    /* Now re-invert the user-mode requirement for SMEP and SMAP */
> +    if ( smep || smap )
>          rc ^= _PAGE_USER;
> 
>      /* Go back and set accessed and dirty bits only if the walk was a
> diff --git a/xen/include/asm-x86/hvm/hvm.h
> b/xen/include/asm-x86/hvm/hvm.h
> index c373930..0ab6b70 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -260,6 +260,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d,
> uint8_t dest, uint8_t dest_mode);
>      (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] &
> X86_CR4_PAE))
>  #define hvm_smep_enabled(v) \
>      (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] &
> X86_CR4_SMEP))
> +#define hvm_smap_enabled(v) \
> +    (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] &
> X86_CR4_SMAP))
>  #define hvm_nx_enabled(v) \
>      (!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
> 
> @@ -360,13 +362,26 @@ static inline bool_t hvm_vcpu_has_smep(void)
> 
>      hvm_cpuid(0, &eax, NULL, NULL, NULL);
> 
> -    if (eax < 7)
> +    if ( eax < 7 )
>          return 0;
> 
>      hvm_cpuid(7, NULL, &ebx, NULL, NULL);
>      return !!(ebx & cpufeat_mask(X86_FEATURE_SMEP));
>  }
> 
> +static inline bool_t hvm_vcpu_has_smap(void)
> +{
> +    unsigned int eax, ebx;
> +
> +    hvm_cpuid(0, &eax, NULL, NULL, NULL);
> +
> +    if ( eax < 7 )
> +        return 0;
> +
> +    hvm_cpuid(0x7, NULL, &ebx, NULL, NULL);
> +    return !!(ebx & cpufeat_mask(X86_FEATURE_SMAP));
> +}
> +
>  /* These reserved bits in lower 32 remain 0 after any load of CR0 */
>  #define HVM_CR0_GUEST_RESERVED_BITS             \
>      (~((unsigned long)                          \
> @@ -387,6 +402,7 @@ static inline bool_t hvm_vcpu_has_smep(void)
>          X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |       \
>          X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT |           \
>          (hvm_vcpu_has_smep() ? X86_CR4_SMEP : 0) |      \
> +        (hvm_vcpu_has_smap() ? X86_CR4_SMAP : 0) |      \
>          (cpu_has_fsgsbase ? X86_CR4_FSGSBASE : 0) |     \
>          ((nestedhvm_enabled((_v)->domain) && cpu_has_vmx)\
>                        ? X86_CR4_VMXE : 0)  |             \
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-07 11:53       ` Andrew Cooper
@ 2014-05-08  1:41         ` Wu, Feng
  2014-05-08  1:57           ` Andrew Cooper
  0 siblings, 1 reply; 43+ messages in thread
From: Wu, Feng @ 2014-05-08  1:41 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich
  Cc: Tian, Kevin, Dong, Eddie, ian.campbell, Nakajima, Jun, xen-devel



> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, May 07, 2014 7:54 PM
> To: Jan Beulich
> Cc: Wu, Feng; ian.campbell@citrix.com; Dong, Eddie; Nakajima, Jun; Tian,
> Kevin; xen-devel@lists.xen.org
> Subject: Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by
> SMAP
> 
> On 07/05/14 12:40, Jan Beulich wrote:
> >>>> On 07.05.14 at 11:44, <andrew.cooper3@citrix.com> wrote:
> >> On 07/05/14 09:19, Feng Wu wrote:
> >>> @@ -673,6 +675,7 @@ ENTRY(nmi_crash)
> >>>          ud2
> >>>
> >>>  ENTRY(machine_check)
> >>> +        ASM_CLAC
> >> This is not needed.  the start of handle_ist_exception has a SAVE_ALL,
> >> which also covers the nmi entry point.
> >>
> >> On the subject of IST exceptions, perhaps the double fault explicitly
> >> wants a STAC to reduce the likelihood of taking a further fault while
> >> trying to dump state. ?
> > I agree. And perhaps along with do_double_fault(), fatal_trap()
> > should then also get a stac() added?
> >
> > Jan
> >
> 
> With double_fault: being the sole caller of do_double_fault(), editing the
> entry point in entry.S to "ASM_STAC; SAVE_ALL 0" is sufficient to avoid
> stac() in do_double_fault() itself.

I think it's better to add "ASM_STAC" just before " call  do_double_fault".
Do you think this is okay, Andrew? Thanks!

> 
> I would agree that fatal_trap() wants an stac() in it.
> 
> ~Andrew

Thanks,
Feng

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  1:41         ` Wu, Feng
@ 2014-05-08  1:57           ` Andrew Cooper
  2014-05-08  2:02             ` Wu, Feng
  0 siblings, 1 reply; 43+ messages in thread
From: Andrew Cooper @ 2014-05-08  1:57 UTC (permalink / raw)
  To: Wu, Feng, Jan Beulich
  Cc: Tian, Kevin, Dong, Eddie, ian.campbell, Nakajima, Jun, xen-devel

On 08/05/2014 02:41, Wu, Feng wrote:
>
>> -----Original Message-----
>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>> Sent: Wednesday, May 07, 2014 7:54 PM
>> To: Jan Beulich
>> Cc: Wu, Feng; ian.campbell@citrix.com; Dong, Eddie; Nakajima, Jun; Tian,
>> Kevin; xen-devel@lists.xen.org
>> Subject: Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by
>> SMAP
>>
>> On 07/05/14 12:40, Jan Beulich wrote:
>>>>>> On 07.05.14 at 11:44, <andrew.cooper3@citrix.com> wrote:
>>>> On 07/05/14 09:19, Feng Wu wrote:
>>>>> @@ -673,6 +675,7 @@ ENTRY(nmi_crash)
>>>>>          ud2
>>>>>
>>>>>  ENTRY(machine_check)
>>>>> +        ASM_CLAC
>>>> This is not needed.  the start of handle_ist_exception has a SAVE_ALL,
>>>> which also covers the nmi entry point.
>>>>
>>>> On the subject of IST exceptions, perhaps the double fault explicitly
>>>> wants a STAC to reduce the likelihood of taking a further fault while
>>>> trying to dump state. ?
>>> I agree. And perhaps along with do_double_fault(), fatal_trap()
>>> should then also get a stac() added?
>>>
>>> Jan
>>>
>> With double_fault: being the sole caller of do_double_fault(), editing the
>> entry point in entry.S to "ASM_STAC; SAVE_ALL 0" is sufficient to avoid
>> stac() in do_double_fault() itself.
> I think it's better to add "ASM_STAC" just before " call  do_double_fault".
> Do you think this is okay, Andrew? Thanks!

I am not fussed where exactly the STAC goes in the entry point, but
don't leave a CLAC in the SAVE_ALL.

~Andrew

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  1:57           ` Andrew Cooper
@ 2014-05-08  2:02             ` Wu, Feng
  2014-05-08  6:40               ` Jan Beulich
  2014-05-08  9:48               ` Andrew Cooper
  0 siblings, 2 replies; 43+ messages in thread
From: Wu, Feng @ 2014-05-08  2:02 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich
  Cc: Tian, Kevin, Dong, Eddie, ian.campbell, Nakajima, Jun, xen-devel



> -----Original Message-----
> From: Andrew Cooper [mailto:amc96@hermes.cam.ac.uk] On Behalf Of
> Andrew Cooper
> Sent: Thursday, May 08, 2014 9:58 AM
> To: Wu, Feng; Jan Beulich
> Cc: ian.campbell@citrix.com; Dong, Eddie; Nakajima, Jun; Tian, Kevin;
> xen-devel@lists.xen.org
> Subject: Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by
> SMAP
> 
> On 08/05/2014 02:41, Wu, Feng wrote:
> >
> >> -----Original Message-----
> >> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> >> Sent: Wednesday, May 07, 2014 7:54 PM
> >> To: Jan Beulich
> >> Cc: Wu, Feng; ian.campbell@citrix.com; Dong, Eddie; Nakajima, Jun; Tian,
> >> Kevin; xen-devel@lists.xen.org
> >> Subject: Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by
> >> SMAP
> >>
> >> On 07/05/14 12:40, Jan Beulich wrote:
> >>>>>> On 07.05.14 at 11:44, <andrew.cooper3@citrix.com> wrote:
> >>>> On 07/05/14 09:19, Feng Wu wrote:
> >>>>> @@ -673,6 +675,7 @@ ENTRY(nmi_crash)
> >>>>>          ud2
> >>>>>
> >>>>>  ENTRY(machine_check)
> >>>>> +        ASM_CLAC
> >>>> This is not needed.  the start of handle_ist_exception has a SAVE_ALL,
> >>>> which also covers the nmi entry point.
> >>>>
> >>>> On the subject of IST exceptions, perhaps the double fault explicitly
> >>>> wants a STAC to reduce the likelihood of taking a further fault while
> >>>> trying to dump state. ?
> >>> I agree. And perhaps along with do_double_fault(), fatal_trap()
> >>> should then also get a stac() added?
> >>>
> >>> Jan
> >>>
> >> With double_fault: being the sole caller of do_double_fault(), editing the
> >> entry point in entry.S to "ASM_STAC; SAVE_ALL 0" is sufficient to avoid
> >> stac() in do_double_fault() itself.
> > I think it's better to add "ASM_STAC" just before " call  do_double_fault".
> > Do you think this is okay, Andrew? Thanks!
> 
> I am not fussed where exactly the STAC goes in the entry point, but
> don't leave a CLAC in the SAVE_ALL.

Sure, thanks for the suggestion! I will pass 0 to SAVE_ALL. 
My point is that ASM_STAC should be deferred as much as possible.

> 
> ~Andrew

Thanks,
Feng

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-07 10:26   ` Andrew Cooper
  2014-05-07 11:44     ` Jan Beulich
@ 2014-05-08  2:32     ` Wu, Feng
  1 sibling, 0 replies; 43+ messages in thread
From: Wu, Feng @ 2014-05-08  2:32 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Tian, Kevin, ian.campbell, Dong, Eddie, xen-devel, JBeulich,
	Nakajima, Jun



> -----Original Message-----
> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Wednesday, May 07, 2014 6:26 PM
> To: Wu, Feng
> Cc: xen-devel@lists.xen.org; JBeulich@suse.com; Tian, Kevin; Dong, Eddie;
> Nakajima, Jun; ian.campbell@citrix.com
> Subject: Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention
> (SMAP) for Xen
> 
> On 07/05/14 09:19, Feng Wu wrote:
> > Supervisor Mode Access Prevention (SMAP) is a new security
> > feature disclosed by Intel, please refer to the following
> > document:
> >
> > http://software.intel.com/sites/default/files/319433-014.pdf
> >
> > If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
> > to linear addresses that are accessible in user mode. If CPL < 3,
> > SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
> > applies to all supervisor-mode data accesses (these are implicit
> > supervisor accesses) regardless of the value of EFLAGS.AC.
> >
> > This patch enables SMAP in Xen to prevent Xen hypervisor from
> > accessing pv guest data, whose translation paging-structure
> > entries' U/S flags are all set.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > Reviewed-by: Jan Beulich <jbeulich@suse.com>
> > ---
> >  docs/misc/xen-command-line.markdown |  7 +++++
> >  xen/arch/x86/setup.c                | 27 +++++++++++++++++
> >  xen/arch/x86/traps.c                | 58
> +++++++++++++++++++++++++++----------
> >  xen/include/asm-x86/cpufeature.h    |  1 +
> >  xen/include/asm-x86/domain.h        |  6 ++--
> >  5 files changed, 82 insertions(+), 17 deletions(-)
> >
> > diff --git a/docs/misc/xen-command-line.markdown
> b/docs/misc/xen-command-line.markdown
> > index 7dc938b..a7ac53d 100644
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -912,6 +912,13 @@ Set the serial transmit buffer size.
> >
> >  Flag to enable Supervisor Mode Execution Protection
> >
> > +### smap
> 
> ### smap (Intel)
> 
> I know it is inconsistent, but I am trying to get the vendor and arch
> specific command line arguments identified.
> 
> > +> `= <boolean>`
> > +
> > +> Default: `true`
> > +
> > +Flag to enable Supervisor Mode Access Prevention
> > +
> >  ### snb\_igd\_quirk
> >  > `= <boolean>`
> >
> > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> > index 2e30701..29b22a1 100644
> > --- a/xen/arch/x86/setup.c
> > +++ b/xen/arch/x86/setup.c
> > @@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
> >  static bool_t __initdata disable_smep;
> >  invbool_param("smep", disable_smep);
> >
> > +/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
> > +static bool_t __initdata disable_smap;
> > +invbool_param("smap", disable_smap);
> > +
> >  /* **** Linux config option: propagated to domain0. */
> >  /* "acpi=off":    Sisables both ACPI table parsing and interpreter. */
> >  /* "acpi=force":  Override the disable blacklist.                   */
> > @@ -1280,6 +1284,11 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
> >      if ( cpu_has_smep )
> >          set_in_cr4(X86_CR4_SMEP);
> >
> > +    if ( disable_smap )
> > +        setup_clear_cpu_cap(X86_FEATURE_SMAP);
> > +    if ( cpu_has_smap )
> > +        set_in_cr4(X86_CR4_SMAP);
> 
> There is a "pushq $0; popf" in __high_start which will clear AC for the
> BSP and all APs when booting.
> 
> Is that considered a sufficient guarantee of the state of eflags for
> booting purposes?  If not, you probably want an explicit clac() here and
> somewhere in start_secondary().

I think that should be enough for the state of EFLAGS. Do you prefer to add clac() in
these two places? Thanks!

> 
> > +
> >      if ( cpu_has_fsgsbase )
> >          set_in_cr4(X86_CR4_FSGSBASE);
> >
> > @@ -1384,6 +1393,21 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
> >          printk(XENLOG_WARNING
> >                 "Multiple initrd candidates, picking module #%u\n",
> >                 initrdidx);
> > +    /*
> > +     * Temporarily clear SMAP bit in CR4 to allow user-page accesses in
> > +     * construct_dom0(). This looks a little hackish, but using stac()/clac()
> > +     * to do this is not appropriate here: Since construct_dom0() calls
> > +     * copy_from_user(), so we cannot wrap construct_dom0() as a whole
> in
> > +     * stac()/clac() - the AC bit remaining cleared after copy_from_user()
> > +     * would induce problems in construct_dom0().
> > +     *
> > +     * On the other hand, if we used stac()/clac() in construct_dom0(), we
> > +     * would need to carefully handle some corner cases.
> > +     *
> > +     * So we clear SMAP before construct_dom0() and enable it back
> afterwards.
> > +    */
> 
> Alignment error on the end of the comment.  Furthermore, I feel it is
> more verbose than it needs to be.  May I suggest:
> 
> "Temporarily clear SMAP in CR4 to allow user-accesses in
> construct_dom0().  This saves a large number of corner case interactions
> with copy_from_user()"
> 
> ~Andrew

Sure, I will simplify the comment as you suggested.

> 
> > +    if ( cpu_has_smap )
> > +        write_cr4(read_cr4() & ~X86_CR4_SMAP);
> >
> >      /*
> >       * We're going to setup domain0 using the module(s) that we stashed
> safely
> > @@ -1395,6 +1419,9 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
> >                          bootstrap_map, cmdline) != 0)
> >          panic("Could not set up DOM0 guest OS");
> >
> > +    if ( cpu_has_smap )
> > +        write_cr4(read_cr4() | X86_CR4_SMAP);
> > +
> >      /* Scrub RAM that is still free and so may go to an unprivileged domain.
> */
> >      scrub_heap_pages();
> >
> > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> > index 19c96d5..ee203cb 100644
> > --- a/xen/arch/x86/traps.c
> > +++ b/xen/arch/x86/traps.c
> > @@ -1183,11 +1183,12 @@ static int handle_gdt_ldt_mapping_fault(
> >  enum pf_type {
> >      real_fault,
> >      smep_fault,
> > +    smap_fault,
> >      spurious_fault
> >  };
> >
> >  static enum pf_type __page_fault_type(
> > -    unsigned long addr, unsigned int error_code)
> > +    unsigned long addr, const struct cpu_user_regs *regs)
> >  {
> >      unsigned long mfn, cr3 = read_cr3();
> >      l4_pgentry_t l4e, *l4t;
> > @@ -1195,6 +1196,7 @@ static enum pf_type __page_fault_type(
> >      l2_pgentry_t l2e, *l2t;
> >      l1_pgentry_t l1e, *l1t;
> >      unsigned int required_flags, disallowed_flags, page_user;
> > +    unsigned int error_code = regs->error_code;
> >
> >      /*
> >       * We do not take spurious page faults in IRQ handlers as we do not
> > @@ -1263,19 +1265,37 @@ static enum pf_type __page_fault_type(
> >      page_user &= l1e_get_flags(l1e);
> >
> >  leaf:
> > -    /*
> > -     * Supervisor Mode Execution Protection (SMEP):
> > -     * Disallow supervisor execution from user-accessible mappings
> > -     */
> > -    if ( (read_cr4() & X86_CR4_SMEP) && page_user &&
> > -         ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) ==
> PFEC_insn_fetch) )
> > -        return smep_fault;
> > +    if ( page_user )
> > +    {
> > +        unsigned long cr4 = read_cr4();
> > +        /*
> > +         * Supervisor Mode Execution Prevention (SMEP):
> > +         * Disallow supervisor execution from user-accessible mappings
> > +         */
> > +        if ( (cr4 & X86_CR4_SMEP) &&
> > +             ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) ==
> PFEC_insn_fetch) )
> > +            return smep_fault;
> > +
> > +        /*
> > +         * Supervisor Mode Access Prevention (SMAP):
> > +         * Disallow supervisor access user-accessible mappings
> > +         * A fault is considered as an SMAP violation if the following
> > +         * conditions are true:
> > +         *   - X86_CR4_SMAP is set in CR4
> > +         *   - A user page is being accessed
> > +         *   - CPL=3 or X86_EFLAGS_AC is clear
> > +         *   - Page fault in kernel mode
> > +         */
> > +        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode)
> &&
> > +             (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
> > +            return smap_fault;
> > +    }
> >
> >      return spurious_fault;
> >  }
> >
> >  static enum pf_type spurious_page_fault(
> > -    unsigned long addr, unsigned int error_code)
> > +    unsigned long addr, const struct cpu_user_regs *regs)
> >  {
> >      unsigned long flags;
> >      enum pf_type pf_type;
> > @@ -1285,7 +1305,7 @@ static enum pf_type spurious_page_fault(
> >       * page tables from becoming invalid under our feet during the walk.
> >       */
> >      local_irq_save(flags);
> > -    pf_type = __page_fault_type(addr, error_code);
> > +    pf_type = __page_fault_type(addr, regs);
> >      local_irq_restore(flags);
> >
> >      return pf_type;
> > @@ -1380,8 +1400,14 @@ void do_page_fault(struct cpu_user_regs *regs)
> >
> >      if ( unlikely(!guest_mode(regs)) )
> >      {
> > -        pf_type = spurious_page_fault(addr, error_code);
> > -        BUG_ON(pf_type == smep_fault);
> > +        pf_type = spurious_page_fault(addr, regs);
> > +        if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
> > +        {
> > +            console_start_sync();
> > +            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ? 'E' :
> 'A');
> > +            fatal_trap(TRAP_page_fault, regs);
> > +        }
> > +
> >          if ( pf_type != real_fault )
> >              return;
> >
> > @@ -1407,10 +1433,12 @@ void do_page_fault(struct cpu_user_regs *regs)
> >
> >      if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
> >      {
> > -        pf_type = spurious_page_fault(addr, error_code);
> > -        if ( pf_type == smep_fault )
> > +        pf_type = spurious_page_fault(addr, regs);
> > +        if ( (pf_type == smep_fault) || (pf_type == smap_fault))
> >          {
> > -            gdprintk(XENLOG_ERR, "Fatal SMEP fault\n");
> > +            printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
> > +                   current, (pf_type == smep_fault) ? 'E' : 'A');
> > +
> >              domain_crash(current->domain);
> >          }
> >          if ( pf_type != real_fault )
> > diff --git a/xen/include/asm-x86/cpufeature.h
> b/xen/include/asm-x86/cpufeature.h
> > index 20881c0..8014241 100644
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -190,6 +190,7 @@
> >  #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
> >
> >  #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
> > +#define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
> >  #define cpu_has_fpu_sel
> (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
> >
> >  #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor ==
> X86_VENDOR_AMD) \
> > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> > index c5c266f..abf55fb 100644
> > --- a/xen/include/asm-x86/domain.h
> > +++ b/xen/include/asm-x86/domain.h
> > @@ -467,12 +467,14 @@ unsigned long pv_guest_cr4_fixup(const struct
> vcpu *, unsigned long guest_cr4);
> >      (((v)->arch.pv_vcpu.ctrlreg[4]                          \
> >        | (mmu_cr4_features                                   \
> >           & (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_SMEP |      \
> > -            X86_CR4_OSXSAVE | X86_CR4_FSGSBASE))            \
> > +            X86_CR4_SMAP | X86_CR4_OSXSAVE |                \
> > +            X86_CR4_FSGSBASE))                              \
> >        | ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0))         \
> >       & ~X86_CR4_DE)
> >  #define real_cr4_to_pv_guest_cr4(c)                         \
> >      ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
> > -             X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
> > +             X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
> > +             X86_CR4_FSGSBASE | X86_CR4_SMAP))
> >
> >  void domain_cpuid(struct domain *d,
> >                    unsigned int  input,

Thanks,
Feng

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-08  1:20   ` Tian, Kevin
@ 2014-05-08  6:25     ` Wu, Feng
  2014-05-08  7:06       ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Wu, Feng @ 2014-05-08  6:25 UTC (permalink / raw)
  To: Tian, Kevin, xen-devel
  Cc: andrew.cooper3, Dong, Eddie, Nakajima, Jun, JBeulich, ian.campbell



> -----Original Message-----
> From: Tian, Kevin
> Sent: Thursday, May 08, 2014 9:21 AM
> To: Wu, Feng; xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; andrew.cooper3@citrix.com; Dong, Eddie; Nakajima,
> Jun; ian.campbell@citrix.com
> Subject: RE: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention
> (SMAP) for Xen
> 
> > From: Wu, Feng
> > Sent: Wednesday, May 07, 2014 4:20 PM
> >
> > Supervisor Mode Access Prevention (SMAP) is a new security
> > feature disclosed by Intel, please refer to the following
> > document:
> >
> > http://software.intel.com/sites/default/files/319433-014.pdf
> >
> > If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
> > to linear addresses that are accessible in user mode. If CPL < 3,
> > SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
> > applies to all supervisor-mode data accesses (these are implicit
> > supervisor accesses) regardless of the value of EFLAGS.AC.
> >
> > This patch enables SMAP in Xen to prevent Xen hypervisor from
> > accessing pv guest data, whose translation paging-structure
> > entries' U/S flags are all set.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> could you double-check whether S3 resume and CPU hotplug are covered
> well?
> 
Kevin, thank you very much for pointing this out!

I checked the following code path:
wakeup_start --> wakeup_32 --> wakeup_64 --> __ret_point --> restore_rest_processor_state()

I initially thought there was nothing SMAP-specific to do here, but after a more careful review I think
we should set X86_EFLAGS_AC in MSR_SYSCALL_MASK in restore_rest_processor_state(), so that the
AC bit is cleared when entering kernel mode via SYSCALL after S3 resume. However, I think this
change should be included in the patch "Clear AC bit in RFLAGS to protect Xen itself by SMAP",
which does the same thing for MSR_SYSCALL_MASK in subarch_percpu_traps_init().
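
Something along these lines is what I have in mind - a rough sketch only to illustrate the idea, not
the final patch (the exact flag list is my assumption, taken from the mask currently written in
subarch_percpu_traps_init(), with X86_EFLAGS_AC added):

void restore_rest_processor_state(void)
{
    /* ... existing MSR/state restore ... */

    /*
     * Sketch: mirror subarch_percpu_traps_init() so that a SYSCALL
     * taken after S3 resume also starts with EFLAGS.AC clear.
     */
    wrmsrl(MSR_SYSCALL_MASK,
           X86_EFLAGS_VM | X86_EFLAGS_RF | X86_EFLAGS_NT |
           X86_EFLAGS_DF | X86_EFLAGS_IF | X86_EFLAGS_TF |
           X86_EFLAGS_AC);
}

Using the same flag set in both places would make a SYSCALL taken right after resume behave the
same as one taken after a normal boot.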

I also checked the following code path, for which it seems we don't need to do anything special for SMAP:
cpu_up() --> __cpu_up() / notifier_call_chain(&cpu_chain, CPU_ONLINE, hcpu, NULL)

Thanks,
Feng

> Thanks
> Kevin
> 
> > ---
> >  docs/misc/xen-command-line.markdown |  7 +++++
> >  xen/arch/x86/setup.c                | 27 +++++++++++++++++
> >  xen/arch/x86/traps.c                | 58
> > +++++++++++++++++++++++++++----------
> >  xen/include/asm-x86/cpufeature.h    |  1 +
> >  xen/include/asm-x86/domain.h        |  6 ++--
> >  5 files changed, 82 insertions(+), 17 deletions(-)
> >
> > diff --git a/docs/misc/xen-command-line.markdown
> > b/docs/misc/xen-command-line.markdown
> > index 7dc938b..a7ac53d 100644
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -912,6 +912,13 @@ Set the serial transmit buffer size.
> >
> >  Flag to enable Supervisor Mode Execution Protection
> >
> > +### smap
> > +> `= <boolean>`
> > +
> > +> Default: `true`
> > +
> > +Flag to enable Supervisor Mode Access Prevention
> > +
> >  ### snb\_igd\_quirk
> >  > `= <boolean>`
> >
> > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> > index 2e30701..29b22a1 100644
> > --- a/xen/arch/x86/setup.c
> > +++ b/xen/arch/x86/setup.c
> > @@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
> >  static bool_t __initdata disable_smep;
> >  invbool_param("smep", disable_smep);
> >
> > +/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
> > +static bool_t __initdata disable_smap;
> > +invbool_param("smap", disable_smap);
> > +
> >  /* **** Linux config option: propagated to domain0. */
> >  /* "acpi=off":    Sisables both ACPI table parsing and interpreter. */
> >  /* "acpi=force":  Override the disable blacklist.                   */
> > @@ -1280,6 +1284,11 @@ void __init noreturn __start_xen(unsigned long
> > mbi_p)
> >      if ( cpu_has_smep )
> >          set_in_cr4(X86_CR4_SMEP);
> >
> > +    if ( disable_smap )
> > +        setup_clear_cpu_cap(X86_FEATURE_SMAP);
> > +    if ( cpu_has_smap )
> > +        set_in_cr4(X86_CR4_SMAP);
> > +
> >      if ( cpu_has_fsgsbase )
> >          set_in_cr4(X86_CR4_FSGSBASE);
> >
> > @@ -1384,6 +1393,21 @@ void __init noreturn __start_xen(unsigned long
> > mbi_p)
> >          printk(XENLOG_WARNING
> >                 "Multiple initrd candidates, picking module #%u\n",
> >                 initrdidx);
> > +    /*
> > +     * Temporarily clear SMAP bit in CR4 to allow user-page accesses in
> > +     * construct_dom0(). This looks a little hackish, but using stac()/clac()
> > +     * to do this is not appropriate here: Since construct_dom0() calls
> > +     * copy_from_user(), so we cannot wrap construct_dom0() as a whole
> > in
> > +     * stac()/clac() - the AC bit remaining cleared after copy_from_user()
> > +     * would induce problems in construct_dom0().
> > +     *
> > +     * On the other hand, if we used stac()/clac() in construct_dom0(), we
> > +     * would need to carefully handle some corner cases.
> > +     *
> > +     * So we clear SMAP before construct_dom0() and enable it back
> > afterwards.
> > +    */
> > +    if ( cpu_has_smap )
> > +        write_cr4(read_cr4() & ~X86_CR4_SMAP);
> >
> >      /*
> >       * We're going to setup domain0 using the module(s) that we stashed
> > safely
> > @@ -1395,6 +1419,9 @@ void __init noreturn __start_xen(unsigned long
> > mbi_p)
> >                          bootstrap_map, cmdline) != 0)
> >          panic("Could not set up DOM0 guest OS");
> >
> > +    if ( cpu_has_smap )
> > +        write_cr4(read_cr4() | X86_CR4_SMAP);
> > +
> >      /* Scrub RAM that is still free and so may go to an unprivileged
> domain.
> > */
> >      scrub_heap_pages();
> >
> > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> > index 19c96d5..ee203cb 100644
> > --- a/xen/arch/x86/traps.c
> > +++ b/xen/arch/x86/traps.c
> > @@ -1183,11 +1183,12 @@ static int handle_gdt_ldt_mapping_fault(
> >  enum pf_type {
> >      real_fault,
> >      smep_fault,
> > +    smap_fault,
> >      spurious_fault
> >  };
> >
> >  static enum pf_type __page_fault_type(
> > -    unsigned long addr, unsigned int error_code)
> > +    unsigned long addr, const struct cpu_user_regs *regs)
> >  {
> >      unsigned long mfn, cr3 = read_cr3();
> >      l4_pgentry_t l4e, *l4t;
> > @@ -1195,6 +1196,7 @@ static enum pf_type __page_fault_type(
> >      l2_pgentry_t l2e, *l2t;
> >      l1_pgentry_t l1e, *l1t;
> >      unsigned int required_flags, disallowed_flags, page_user;
> > +    unsigned int error_code = regs->error_code;
> >
> >      /*
> >       * We do not take spurious page faults in IRQ handlers as we do not
> > @@ -1263,19 +1265,37 @@ static enum pf_type __page_fault_type(
> >      page_user &= l1e_get_flags(l1e);
> >
> >  leaf:
> > -    /*
> > -     * Supervisor Mode Execution Protection (SMEP):
> > -     * Disallow supervisor execution from user-accessible mappings
> > -     */
> > -    if ( (read_cr4() & X86_CR4_SMEP) && page_user &&
> > -         ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) ==
> > PFEC_insn_fetch) )
> > -        return smep_fault;
> > +    if ( page_user )
> > +    {
> > +        unsigned long cr4 = read_cr4();
> > +        /*
> > +         * Supervisor Mode Execution Prevention (SMEP):
> > +         * Disallow supervisor execution from user-accessible mappings
> > +         */
> > +        if ( (cr4 & X86_CR4_SMEP) &&
> > +             ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) ==
> > PFEC_insn_fetch) )
> > +            return smep_fault;
> > +
> > +        /*
> > +         * Supervisor Mode Access Prevention (SMAP):
> > +         * Disallow supervisor access user-accessible mappings
> > +         * A fault is considered as an SMAP violation if the following
> > +         * conditions are true:
> > +         *   - X86_CR4_SMAP is set in CR4
> > +         *   - A user page is being accessed
> > +         *   - CPL=3 or X86_EFLAGS_AC is clear
> > +         *   - Page fault in kernel mode
> > +         */
> > +        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode)
> > &&
> > +             (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
> > +            return smap_fault;
> > +    }
> >
> >      return spurious_fault;
> >  }
> >
> >  static enum pf_type spurious_page_fault(
> > -    unsigned long addr, unsigned int error_code)
> > +    unsigned long addr, const struct cpu_user_regs *regs)
> >  {
> >      unsigned long flags;
> >      enum pf_type pf_type;
> > @@ -1285,7 +1305,7 @@ static enum pf_type spurious_page_fault(
> >       * page tables from becoming invalid under our feet during the walk.
> >       */
> >      local_irq_save(flags);
> > -    pf_type = __page_fault_type(addr, error_code);
> > +    pf_type = __page_fault_type(addr, regs);
> >      local_irq_restore(flags);
> >
> >      return pf_type;
> > @@ -1380,8 +1400,14 @@ void do_page_fault(struct cpu_user_regs *regs)
> >
> >      if ( unlikely(!guest_mode(regs)) )
> >      {
> > -        pf_type = spurious_page_fault(addr, error_code);
> > -        BUG_ON(pf_type == smep_fault);
> > +        pf_type = spurious_page_fault(addr, regs);
> > +        if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
> > +        {
> > +            console_start_sync();
> > +            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ?
> 'E' :
> > 'A');
> > +            fatal_trap(TRAP_page_fault, regs);
> > +        }
> > +
> >          if ( pf_type != real_fault )
> >              return;
> >
> > @@ -1407,10 +1433,12 @@ void do_page_fault(struct cpu_user_regs *regs)
> >
> >      if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
> >      {
> > -        pf_type = spurious_page_fault(addr, error_code);
> > -        if ( pf_type == smep_fault )
> > +        pf_type = spurious_page_fault(addr, regs);
> > +        if ( (pf_type == smep_fault) || (pf_type == smap_fault))
> >          {
> > -            gdprintk(XENLOG_ERR, "Fatal SMEP fault\n");
> > +            printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
> > +                   current, (pf_type == smep_fault) ? 'E' : 'A');
> > +
> >              domain_crash(current->domain);
> >          }
> >          if ( pf_type != real_fault )
> > diff --git a/xen/include/asm-x86/cpufeature.h
> > b/xen/include/asm-x86/cpufeature.h
> > index 20881c0..8014241 100644
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -190,6 +190,7 @@
> >  #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
> >
> >  #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
> > +#define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
> >  #define cpu_has_fpu_sel
> > (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
> >
> >  #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor ==
> > X86_VENDOR_AMD) \
> > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> > index c5c266f..abf55fb 100644
> > --- a/xen/include/asm-x86/domain.h
> > +++ b/xen/include/asm-x86/domain.h
> > @@ -467,12 +467,14 @@ unsigned long pv_guest_cr4_fixup(const struct
> vcpu
> > *, unsigned long guest_cr4);
> >      (((v)->arch.pv_vcpu.ctrlreg[4]                          \
> >        | (mmu_cr4_features                                   \
> >           & (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_SMEP |      \
> > -            X86_CR4_OSXSAVE | X86_CR4_FSGSBASE))            \
> > +            X86_CR4_SMAP | X86_CR4_OSXSAVE |                \
> > +            X86_CR4_FSGSBASE))                              \
> >        | ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0))         \
> >       & ~X86_CR4_DE)
> >  #define real_cr4_to_pv_guest_cr4(c)                         \
> >      ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
> > -             X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
> > +             X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
> > +             X86_CR4_FSGSBASE | X86_CR4_SMAP))
> >
> >  void domain_cpuid(struct domain *d,
> >                    unsigned int  input,
> > --
> > 1.8.3.1

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  2:02             ` Wu, Feng
@ 2014-05-08  6:40               ` Jan Beulich
  2014-05-08  6:49                 ` Wu, Feng
  2014-05-08  9:48               ` Andrew Cooper
  1 sibling, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2014-05-08  6:40 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, ian.campbell, Andrew Cooper, Eddie Dong, xen-devel,
	Jun Nakajima

>>> On 08.05.14 at 04:02, <feng.wu@intel.com> wrote:
>> From: Andrew Cooper [mailto:amc96@hermes.cam.ac.uk] On Behalf Of
>> I am not fussed where exactly the STAC goes in the entry point, but
>> don't leave a CLAC in the SAVE_ALL.
> 
> Sure, thanks for the suggestion! I will pass 0 to SAVE_ALL. 
> My point is that ASM_STAC should be deferred as much as possible.

There's no point in deferring it as much as possible if you don't CLAC
earlier on. In fact I'm inclined to ask to make the SAVE_ALL parameter
a tristate, allowing both CLAC and STAC to be done right there.

Jan

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  6:40               ` Jan Beulich
@ 2014-05-08  6:49                 ` Wu, Feng
  2014-05-08  6:54                   ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Wu, Feng @ 2014-05-08  6:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, ian.campbell, Andrew Cooper, Dong, Eddie, xen-devel,
	Nakajima, Jun



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, May 08, 2014 2:41 PM
> To: Wu, Feng
> Cc: Andrew Cooper; ian.campbell@citrix.com; Dong, Eddie; Nakajima, Jun; Tian,
> Kevin; xen-devel@lists.xen.org
> Subject: RE: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by
> SMAP
> 
> >>> On 08.05.14 at 04:02, <feng.wu@intel.com> wrote:
> >> From: Andrew Cooper [mailto:amc96@hermes.cam.ac.uk] On Behalf Of
> >> I am not fussed where exactly the STAC goes in the entry point, but
> >> don't leave a CLAC in the SAVE_ALL.
> >
> > Sure, thanks for the suggestion! I will pass 0 to SAVE_ALL.
> > My point is that ASM_STAC should be deferred as much as possible.
> 
> There's no point in deferring it as much as possible if you don't CLAC
> earlier on. In fact I'm inclined to ask to make the SAVE_ALL parameter
> a tristate, allowing both CLAC and STAC to be done right there.
> 
> Jan

Currently there is only one place where we need to set AC around SAVE_ALL: double_fault.
Do we really need to make SAVE_ALL work that way?

Thanks,
Feng

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  6:49                 ` Wu, Feng
@ 2014-05-08  6:54                   ` Jan Beulich
  2014-05-08  6:58                     ` Wu, Feng
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2014-05-08  6:54 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, ian.campbell, Andrew Cooper, Eddie Dong, xen-devel,
	Jun Nakajima

>>> On 08.05.14 at 08:49, <feng.wu@intel.com> wrote:
>> From: Jan Beulich [mailto:JBeulich@suse.com]
>> >>> On 08.05.14 at 04:02, <feng.wu@intel.com> wrote:
>> > Sure, thanks for the suggestion! I will pass 0 to SAVE_ALL.
>> > My point is that ASM_STAC should be deferred as much as possible.
>> 
>> There's no point in deferring it as much as possible if you don't CLAC
>> earlier on. In fact I'm inclined to ask to make the SAVE_ALL parameter
>> a tristate, allowing both CLAC and STAC to be done right there.
> 
> Currently, there is only one place where we need to set AC around SAVE_ALL, 
> it is double_fault.
> Do we really need to make SAVE_ALL that way?

Note that I wrote "I'm inclined to ask ...", not "I'd like to ask ...", i.e.
if you feel uncomfortable with this, I'm still fine with leaving the macro
as is. But my argument stands that deferring STAC here as much as
possible is pointless.

Jan


* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  6:54                   ` Jan Beulich
@ 2014-05-08  6:58                     ` Wu, Feng
  2014-05-08  7:08                       ` Jan Beulich
  0 siblings, 1 reply; 43+ messages in thread
From: Wu, Feng @ 2014-05-08  6:58 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, ian.campbell, Andrew Cooper, Dong, Eddie, xen-devel,
	Nakajima, Jun



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, May 08, 2014 2:54 PM
> To: Wu, Feng
> Cc: Andrew Cooper; ian.campbell@citrix.com; Dong, Eddie; Nakajima, Jun; Tian,
> Kevin; xen-devel@lists.xen.org
> Subject: RE: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by
> SMAP
> 
> >>> On 08.05.14 at 08:49, <feng.wu@intel.com> wrote:
> >> From: Jan Beulich [mailto:JBeulich@suse.com]
> >> >>> On 08.05.14 at 04:02, <feng.wu@intel.com> wrote:
> >> > Sure, thanks for the suggestion! I will pass 0 to SAVE_ALL.
> >> > My point is that ASM_STAC should be deferred as much as possible.
> >>
> >> There's no point in deferring it as much as possible if you don't CLAC
> >> earlier on. In fact I'm inclined to ask to make the SAVE_ALL parameter
> >> a tristate, allowing both CLAC and STAC to be done right there.
> >
> > Currently, there is only one place where we need to set AC around SAVE_ALL,
> > it is double_fault.
> > Do we really need to make SAVE_ALL that way?
> 
> Note that I wrote "I'm inclined to ask ...", not "I'd like to ask ...", i.e.
> if you feel uncomfortable with this, I'm still fine with leaving the macro
> as is. But my argument stands that deferring STAC here as much as
> possible is pointless.
> 
> Jan

Yes, I agree that it is pointless to defer STAC here. I am definitely fine with making the SAVE_ALL
parameter a tristate. If you think it is better, I am willing to change the patch. However, my
question is simply whether it is better to implement SAVE_ALL that way - just a technical question! :)

Thanks,
Feng


* Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-08  6:25     ` Wu, Feng
@ 2014-05-08  7:06       ` Jan Beulich
  0 siblings, 0 replies; 43+ messages in thread
From: Jan Beulich @ 2014-05-08  7:06 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, ian.campbell, andrew.cooper3, Eddie Dong, xen-devel,
	Jun Nakajima

>>> On 08.05.14 at 08:25, <feng.wu@intel.com> wrote:
>> From: Tian, Kevin
>> could you double-check whether S3 resume and CPU hotplug are covered
>> well?
>> 
> Kevin, thank you very much for pointing this out!
> 
> I checked the following code path:
> wakeup_start --> wakeup_32 --> wakeup_64 --> __ret_point --> 
> restore_rest_processor_state()
> 
> I thought there was nothing special to do with SMAP here, but after careful
> review, I think we should set X86_EFLAGS_AC in MSR_SYSCALL_MASK in
> restore_rest_processor_state(), so that the AC bit is cleared when entering
> kernel mode via SYSCALL after S3 resume. However, I think this change should
> be included in the patch "Clear AC bit in RFLAGS to protect Xen itself by
> SMAP", in which we did the same thing for MSR_SYSCALL_MASK in
> subarch_percpu_traps_init().

Yes, that's indeed where it should be done.

Jan
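
(Background for the exchange above: bits set in MSR_SYSCALL_MASK, the SFMASK MSR, are cleared from RFLAGS by the CPU on every SYSCALL, so including X86_EFLAGS_AC there guarantees AC is clear on kernel entry via SYSCALL, including the first entry after S3 resume. A hedged sketch of the fragment being discussed for restore_rest_processor_state() / subarch_percpu_traps_init(), with the pre-existing flag list abbreviated:)

    /* Bits set here are cleared from RFLAGS by the CPU on SYSCALL. */
    wrmsrl(MSR_SYSCALL_MASK,
           X86_EFLAGS_IF | X86_EFLAGS_DF | X86_EFLAGS_TF |  /* existing bits, abbreviated */
           X86_EFLAGS_AC);                                   /* new: keep SMAP armed on entry */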


* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  6:58                     ` Wu, Feng
@ 2014-05-08  7:08                       ` Jan Beulich
  2014-05-08  7:13                         ` Wu, Feng
  0 siblings, 1 reply; 43+ messages in thread
From: Jan Beulich @ 2014-05-08  7:08 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, ian.campbell, Andrew Cooper, Eddie Dong, xen-devel,
	Jun Nakajima

>>> On 08.05.14 at 08:58, <feng.wu@intel.com> wrote:
> Yes, I agree that it is pointless to defer STAC here. I am definitely fine
> with making the SAVE_ALL parameter a tristate. If you think it is better, I am
> willing to change the patch. However, my question is simply whether it is
> better to implement SAVE_ALL that way - just a technical question! :)

It's not really a technical one, it's a matter of taste. And _I_ would be
doing it that way, but I guess others may frown upon such tristates.

Jan


* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  7:08                       ` Jan Beulich
@ 2014-05-08  7:13                         ` Wu, Feng
  0 siblings, 0 replies; 43+ messages in thread
From: Wu, Feng @ 2014-05-08  7:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, ian.campbell, Andrew Cooper, Dong, Eddie, xen-devel,
	Nakajima, Jun



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, May 08, 2014 3:09 PM
> To: Wu, Feng
> Cc: Andrew Cooper; ian.campbell@citrix.com; Dong, Eddie; Nakajima, Jun; Tian,
> Kevin; xen-devel@lists.xen.org
> Subject: RE: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by
> SMAP
> 
> >>> On 08.05.14 at 08:58, <feng.wu@intel.com> wrote:
> > Yes, I agree that it is pointless to defer STAC here. I am definitely fine
> > with making the SAVE_ALL parameter a tristate. If you think it is better, I
> > am willing to change the patch. However, my question is simply whether it is
> > better to implement SAVE_ALL that way - just a technical question! :)
> 
> It's not really a technical one, it's a matter of taste. And _I_ would be
> doing it that way, but I guess others may frown upon such tristates.
> 
> Jan

That's fine with me; I will change it in the next post.

Thanks,
Feng


* Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  2:02             ` Wu, Feng
  2014-05-08  6:40               ` Jan Beulich
@ 2014-05-08  9:48               ` Andrew Cooper
  1 sibling, 0 replies; 43+ messages in thread
From: Andrew Cooper @ 2014-05-08  9:48 UTC (permalink / raw)
  To: Wu, Feng, Jan Beulich
  Cc: Tian, Kevin, Dong, Eddie, ian.campbell, Nakajima, Jun, xen-devel

On 08/05/2014 03:02, Wu, Feng wrote:
>> -----Original Message-----
>> From: Andrew Cooper [mailto:amc96@hermes.cam.ac.uk] On Behalf Of
>> Andrew Cooper
>> Sent: Thursday, May 08, 2014 9:58 AM
>> To: Wu, Feng; Jan Beulich
>> Cc: ian.campbell@citrix.com; Dong, Eddie; Nakajima, Jun; Tian, Kevin;
>> xen-devel@lists.xen.org
>> Subject: Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by
>> SMAP
>>
>> On 08/05/2014 02:41, Wu, Feng wrote:
>>>> -----Original Message-----
>>>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>>>> Sent: Wednesday, May 07, 2014 7:54 PM
>>>> To: Jan Beulich
>>>> Cc: Wu, Feng; ian.campbell@citrix.com; Dong, Eddie; Nakajima, Jun; Tian,
>>>> Kevin; xen-devel@lists.xen.org
>>>> Subject: Re: [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by
>>>> SMAP
>>>>
>>>> On 07/05/14 12:40, Jan Beulich wrote:
>>>>>>>> On 07.05.14 at 11:44, <andrew.cooper3@citrix.com> wrote:
>>>>>> On 07/05/14 09:19, Feng Wu wrote:
>>>>>>> @@ -673,6 +675,7 @@ ENTRY(nmi_crash)
>>>>>>>          ud2
>>>>>>>
>>>>>>>  ENTRY(machine_check)
>>>>>>> +        ASM_CLAC
>>>>>> This is not needed.  The start of handle_ist_exception has a SAVE_ALL,
>>>>>> which also covers the nmi entry point.
>>>>>>
>>>>>> On the subject of IST exceptions, perhaps the double fault explicitly
>>>>>> wants a STAC to reduce the likelihood of taking a further fault while
>>>>>> trying to dump state?
>>>>> I agree. And perhaps along with do_double_fault(), fatal_trap()
>>>>> should then also get a stac() added?
>>>>>
>>>>> Jan
>>>>>
>>>> With double_fault: being the sole caller of do_double_fault(), editing the
>>>> entry point in entry.S to "ASM_STAC; SAVE_ALL 0" is sufficient to avoid
>>>> stac() in do_double_fault() itself.
>>> I think it's better to add "ASM_STAC" just before " call  do_double_fault".
>>> Do you think this is okay, Andrew? Thanks!
>> I am not fussed where exactly the STAC goes in the entry point, but
>> don't leave a CLAC in the SAVE_ALL.
> Sure, thanks for the suggestion! I will pass 0 to SAVE_ALL. 
> My point is that ASM_STAC should be deferred as much as possible.

In the double_fault entry, currently the only way to get an SMAP violation
would be if the Interrupt Stack Table mechanism landed us on a stack with
user mappings, at which point all bets are already off with respect to
dumping state and rebooting.

In the case that our stack is actually good, a pagefault from trying to
peek at boot_cpu_data would land us in a loop of nested pagefaults until
we wandered off the stack, as each entry point attempted to CLAC.

From a safety point of view clearing CR4.SMAP would be the best action,
as we can manage that safely on the stack before looking outside.

On the other hand, if we fault when looking at boot_cpu_data, the chances
of successfully calling do_page_fault() are practically zero.

In the past while hacking^W debugging, I have noticed that the double
fault handling in Xen does have a tendency to end up in infinite
loops.  I don't think this series is the appropriate place to shore this
up (I might find some particularly distant copious free time), although
not leaving it in a worse state than it currently is would be nice.

~Andrew
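
(A sketch, in C, of the "clear CR4.SMAP" option mentioned above - purely hypothetical, not something this series does, and cpu_has_smap is an assumed feature predicate rather than an identifier from the patches:)

    /* In a fatal path that is about to dump state, drop SMAP entirely so a
     * stray user mapping cannot turn the crash dump into a nested-#PF loop. */
    if ( cpu_has_smap )                        /* assumed predicate */
        write_cr4(read_cr4() & ~X86_CR4_SMAP);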



Thread overview: 43+ messages
2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
2014-05-07  8:19 ` [PATCH v6 01/10] x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET Feng Wu
2014-05-07  8:19 ` [PATCH v6 02/10] x86: move common_interrupt to entry.S Feng Wu
2014-05-07  8:19 ` [PATCH v6 03/10] x86: merge stuff from asm-x86/x86_64/asm_defns.h to asm-x86/asm_defns.h Feng Wu
2014-05-07  8:19 ` [PATCH v6 04/10] x86: Add support for STAC/CLAC instructions Feng Wu
2014-05-07  9:36   ` Andrew Cooper
2014-05-07  8:19 ` [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP Feng Wu
2014-05-07  9:44   ` Andrew Cooper
2014-05-07 11:40     ` Jan Beulich
2014-05-07 11:53       ` Andrew Cooper
2014-05-08  1:41         ` Wu, Feng
2014-05-08  1:57           ` Andrew Cooper
2014-05-08  2:02             ` Wu, Feng
2014-05-08  6:40               ` Jan Beulich
2014-05-08  6:49                 ` Wu, Feng
2014-05-08  6:54                   ` Jan Beulich
2014-05-08  6:58                     ` Wu, Feng
2014-05-08  7:08                       ` Jan Beulich
2014-05-08  7:13                         ` Wu, Feng
2014-05-08  9:48               ` Andrew Cooper
2014-05-07  8:19 ` [PATCH v6 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode Feng Wu
2014-05-07  9:49   ` Andrew Cooper
2014-05-08  1:14   ` Tian, Kevin
2014-05-07  8:19 ` [PATCH v6 07/10] VMX: Disable SMAP feature when guest is in non-paging mode Feng Wu
2014-05-08  1:16   ` Tian, Kevin
2014-05-07  8:19 ` [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen Feng Wu
2014-05-07 10:26   ` Andrew Cooper
2014-05-07 11:44     ` Jan Beulich
2014-05-07 11:47       ` Andrew Cooper
2014-05-08  2:32     ` Wu, Feng
2014-05-08  1:20   ` Tian, Kevin
2014-05-08  6:25     ` Wu, Feng
2014-05-08  7:06       ` Jan Beulich
2014-05-07  8:19 ` [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest Feng Wu
2014-05-07 10:46   ` Andrew Cooper
2014-05-07 11:47     ` Jan Beulich
2014-05-08  1:22   ` Tian, Kevin
2014-05-07  8:19 ` [PATCH v6 10/10] x86/tools: Expose SMAP to HVM guests Feng Wu
2014-05-07  8:35 ` [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Jan Beulich
2014-05-07  9:00   ` Wu, Feng
2014-05-07  9:33     ` Jan Beulich
2014-05-07  8:57 ` Ian Campbell
2014-05-07  8:59   ` Wu, Feng
