* [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP)
@ 2014-05-08  8:15 Feng Wu
  2014-05-08  8:15 ` [PATCH v7 05/10] x86: Clear AC bit in RFLAGS to protect Xen itself by SMAP Feng Wu
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Feng Wu @ 2014-05-08  8:15 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Supervisor Mode Access Prevention (SMAP) is a new security
feature disclosed by Intel; please refer to the following
document:

http://software.intel.com/sites/default/files/319433-014.pdf

Every access to a linear address is either a supervisor-mode
access or a user-mode access. All accesses performed while the
current privilege level (CPL) is less than 3 are supervisor-mode
accesses. If CPL = 3, accesses are generally user-mode accesses.
However, some operations implicitly access system data structures,
and the resulting accesses to those data structures are supervisor-mode
accesses regardless of CPL. Examples of such implicit supervisor
accesses include the following: accesses to the global descriptor
table (GDT) or local descriptor table (LDT) to load a segment descriptor;
accesses to the interrupt descriptor table (IDT) when delivering an
interrupt or exception; and accesses to the task-state segment (TSS) as
part of a task switch or change of CPL.

If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
to linear addresses that are accessible in user mode. If CPL < 3,
SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
applies to all supervisor-mode data accesses (these are implicit
supervisor accesses) regardless of the value of EFLAGS.AC.
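
To make the rules above concrete, here is a minimal illustrative C sketch
(not part of this series; the struct and field names are made up for the
example) of when SMAP blocks an access:

/* Illustrative only: models the SMAP rule described above. */
#include <stdbool.h>
#include <stdio.h>

struct access {
    bool user_page;   /* linear address is user-accessible (U/S = 1)    */
    bool supervisor;  /* supervisor-mode data access                    */
    bool implicit;    /* implicit supervisor access (GDT/LDT/IDT/TSS)   */
    bool eflags_ac;   /* EFLAGS.AC at the time of the access            */
    bool cr4_smap;    /* CR4.SMAP                                       */
    unsigned int cpl; /* current privilege level                        */
};

static bool smap_blocks(const struct access *a)
{
    if ( !a->cr4_smap || !a->supervisor || !a->user_page )
        return false;
    if ( a->cpl == 3 )
        /* Only implicit supervisor accesses exist at CPL = 3; AC is ignored. */
        return a->implicit;
    /* At CPL < 3, EFLAGS.AC = 1 disables SMAP protection. */
    return !a->eflags_ac;
}

int main(void)
{
    struct access a = { .user_page = true, .supervisor = true,
                        .cr4_smap = true, .cpl = 0, .eflags_ac = false };
    printf("blocked: %d\n", smap_blocks(&a)); /* 1: kernel access, AC clear */
    a.eflags_ac = true;
    printf("blocked: %d\n", smap_blocks(&a)); /* 0: AC set at CPL < 3 */
    return 0;
}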

Version 1:
  * Add two macros for STAC/CLAC instructions
  * Temporary disable SMAP to legally access user pages in kernel mode
  * Enable Supervisor Mode Access Prevention (SMAP) for Xen itself
  * Add SMAP support to HVM guest
  * Disable SMAP feature when guest is in non-paging mode

Version 2:
  * Change the definition of ASM_STAC/ASM_CLAC.
  * Clear the AC bit at the beginning of exception, interrupt, and hypercall handling.
  * Wrap construct_dom0() in a stac()/clac() pair as a whole.
  * Reorder some patches in the series.
  * Combine some conditionals with SMEP.
  * Typo, etc. 

Version 3:
  * Clean-ups to ASM_STAC/ASM_CLAC
  * Enable SMAP after constructing domain 0
  * Move common_interrupt to entry.S
  * Remove ASM_CLAC calls in some places where exception happens
  * Correct the logic in hvm_vcpu_has_smep()/hvm_vcpu_has_smap() 
  * Make the output message more readable when SMAP violation happens
  * Use hvm_get_segment_register() to get the guest SS in guest_walk_tables()
  * Coding style changes, etc.

Version 4:
  * Use common macro CPUINFO_features instead of CPUINFO86_ext_features in xen/arch/x86/boot/head.S
  * Make ASM_STAC/ASM_CLAC common both in assembly and C code
  * Merge xen/include/asm-x86/x86_64/asm_defns.h into xen/include/asm-x86/asm_defns.h
  * Add a parameter to SAVE_ALL to make including ASM_CLAC optional
  * Remove the ASM_STAC/ASM_CLAC pair in compat_create_bounce_frame, since this chunk of
    code only accesses the pv guest's kernel stack, which is in ring 1 for 32-bit pv guests.
  * Call "setup_clear_cpu_cap(X86_FEATURE_SMAP)" before APs get brought up
  * Coding style changes.

Version 5:
  * Remove C version of CPUINFO_FEATURE_OFFSET.
  * Implement clac()/stac() in C.
  * Set the default value of the parameter to 1 for macro SAVE_ALL.
  * Add const to the second parameter of __page_fault_type().
  * Clear SMAP bit in CR4 before construct_dom0() and set the bit back after it.
  * Coding style changes.

Version 6:
  * Add memory clobbers for clac()/stac()
  * Add ASM_CLAC at the beginning of ignore_int
  * Some comment changes

Version 7:
  Patch "x86: Clear AC bit in RFLAGS to protect Xen itself by SMAP"
    * Remove ASM_CLAC for machine_check
    * Add ASM_STAC/stac() for double_fault and fatal_trap()
    * Set X86_EFLAGS_AC in MSR_SYSCALL_MASK after S3 resume
    * Make SAVE_ALL parameter a tristate, allowing both CLAC and STAC to be done right there

  Patch "x86: Temporary disable SMAP to legally access user pages in kernel mode"
    * Remove the unnecessary trailing semicolons from the macros __put_user_asm, __get_user_asm, and __cmpxchg_user.

  Patch "x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen"
    * Change some comments

  Patch "x86/hvm: Add SMAP support to HVM guest"
    * Pass ecx explicitly to hvm_cpuid() for CPUID.0x7

Feng Wu (10):
  x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET
  x86: move common_interrupt to entry.S
  x86: merge stuff from asm-x86/x86_64/asm_defns.h to
    asm-x86/asm_defns.h
  x86: Add support for STAC/CLAC instructions
  x86: Clear AC bit in RFLAGS to protect Xen itself by SMAP
  x86: Temporary disable SMAP to legally access user pages in kernel
    mode
  VMX: Disable SMAP feature when guest is in non-paging mode
  x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  x86/hvm: Add SMAP support to HVM guest
  x86/tools: Expose SMAP to HVM guests

 docs/misc/xen-command-line.markdown    |   7 +
 tools/libxc/xc_cpufeature.h            |   1 +
 tools/libxc/xc_cpuid_x86.c             |   1 +
 xen/arch/x86/acpi/suspend.c            |   2 +-
 xen/arch/x86/boot/head.S               |   3 +-
 xen/arch/x86/hvm/hvm.c                 |   3 +
 xen/arch/x86/hvm/vmx/vmx.c             |   6 +-
 xen/arch/x86/i8259.c                   |   2 -
 xen/arch/x86/mm/guest_walk.c           |  40 +++--
 xen/arch/x86/setup.c                   |  20 +++
 xen/arch/x86/traps.c                   |  76 ++++++++--
 xen/arch/x86/usercopy.c                |   6 +
 xen/arch/x86/x86_64/asm-offsets.c      |   2 +-
 xen/arch/x86/x86_64/compat/entry.S     |   3 +-
 xen/arch/x86/x86_64/entry.S            |  20 ++-
 xen/arch/x86/x86_64/traps.c            |   2 +-
 xen/include/asm-x86/asm_defns.h        | 263 ++++++++++++++++++++++++++++++++-
 xen/include/asm-x86/cpufeature.h       |   5 +
 xen/include/asm-x86/domain.h           |   6 +-
 xen/include/asm-x86/hvm/hvm.h          |  22 ++-
 xen/include/asm-x86/uaccess.h          |   8 +-
 xen/include/asm-x86/x86_64/asm_defns.h | 231 -----------------------------
 xen/include/asm-x86/x86_64/system.h    |   4 +-
 23 files changed, 455 insertions(+), 278 deletions(-)
 delete mode 100644 xen/include/asm-x86/x86_64/asm_defns.h

-- 
1.8.3.1


* [PATCH v7 05/10] x86: Clear AC bit in RFLAGS to protect Xen itself by SMAP
  2014-05-08  8:15 [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
@ 2014-05-08  8:15 ` Feng Wu
  2014-05-08  8:15 ` [PATCH v7 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode Feng Wu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Feng Wu @ 2014-05-08  8:15 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Clear the AC bit in RFLAGS at the beginning of exception, interrupt, and
hypercall handling, so Xen itself can be protected by the SMAP mechanism.
This patch also sets the AC bit at the beginning of double_fault and
fatal_trap() to reduce the likelihood of taking a further fault while
trying to dump state.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 xen/arch/x86/acpi/suspend.c        |  2 +-
 xen/arch/x86/traps.c               |  6 ++++++
 xen/arch/x86/x86_64/compat/entry.S |  3 ++-
 xen/arch/x86/x86_64/entry.S        | 12 +++++++++---
 xen/arch/x86/x86_64/traps.c        |  2 +-
 xen/include/asm-x86/asm_defns.h    | 15 ++++++++++++++-
 6 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/acpi/suspend.c b/xen/arch/x86/acpi/suspend.c
index a373e9a..acf667d 100644
--- a/xen/arch/x86/acpi/suspend.c
+++ b/xen/arch/x86/acpi/suspend.c
@@ -57,7 +57,7 @@ void restore_rest_processor_state(void)
     wrmsrl(MSR_CSTAR, saved_cstar);
     wrmsr(MSR_STAR, 0, (FLAT_RING3_CS32<<16) | __HYPERVISOR_CS);
     wrmsr(MSR_SYSCALL_MASK,
-          X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
+          X86_EFLAGS_AC|X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
           X86_EFLAGS_DF|X86_EFLAGS_IF|X86_EFLAGS_TF,
           0U);
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 45070bb..3800b6f 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -402,6 +402,12 @@ void fatal_trap(int trapnr, struct cpu_user_regs *regs)
     static DEFINE_PER_CPU(char, depth);
 
     /*
+     * Set AC bit to reduce the likelihood of taking a further fault
+     * while trying to dump state.
+     */
+    stac();
+
+    /*
      * In some cases, we can end up in a vicious cycle of fatal_trap()s
      * within fatal_trap()s. We give the problem a couple of iterations to
      * bottom out, and then we just panic.
diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
index 32b3bcc..1d1d3d6 100644
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -13,6 +13,7 @@
 #include <irq_vectors.h>
 
 ENTRY(compat_hypercall)
+        ASM_CLAC
         pushq $0
         SAVE_VOLATILE type=TRAP_syscall compat=1
 
@@ -178,7 +179,7 @@ ENTRY(compat_restore_all_guest)
 
 .section .fixup,"ax"
 .Lfx0:  sti
-        SAVE_ALL
+        SAVE_ALL 2
         movq  UREGS_error_code(%rsp),%rsi
         movq  %rsp,%rax
         andq  $~0xf,%rsp
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index 1c81852..ed7b96f 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -68,7 +68,7 @@ iret_exit_to_guest:
 
 .section .fixup,"ax"
 .Lfx0:  sti
-        SAVE_ALL
+        SAVE_ALL 2
         movq  UREGS_error_code(%rsp),%rsi
         movq  %rsp,%rax
         andq  $~0xf,%rsp
@@ -273,6 +273,7 @@ ENTRY(sysenter_entry)
         pushq $0
         pushfq
 GLOBAL(sysenter_eflags_saved)
+        ASM_CLAC
         pushq $3 /* ring 3 null cs */
         pushq $0 /* null rip */
         pushq $0
@@ -309,6 +310,7 @@ UNLIKELY_END(sysenter_gpf)
         jmp   .Lbounce_exception
 
 ENTRY(int80_direct_trap)
+        ASM_CLAC
         pushq $0
         SAVE_VOLATILE 0x80
 
@@ -614,14 +616,18 @@ ENTRY(spurious_interrupt_bug)
 
 ENTRY(double_fault)
         movl  $TRAP_double_fault,4(%rsp)
-        SAVE_ALL
+        /*
+         * Set AC bit to reduce the likelihood of taking a further fault
+         * while trying to dump state.
+         */
+        SAVE_ALL 1
         movq  %rsp,%rdi
         call  do_double_fault
         ud2
 
         .pushsection .init.text, "ax", @progbits
 ENTRY(early_page_fault)
-        SAVE_ALL
+        SAVE_ALL 2
         movq  %rsp,%rdi
         call  do_early_page_fault
         jmp   restore_all_xen
diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
index 90072c1..b87b33e 100644
--- a/xen/arch/x86/x86_64/traps.c
+++ b/xen/arch/x86/x86_64/traps.c
@@ -437,7 +437,7 @@ void __devinit subarch_percpu_traps_init(void)
     /* Common SYSCALL parameters. */
     wrmsr(MSR_STAR, 0, (FLAT_RING3_CS32<<16) | __HYPERVISOR_CS);
     wrmsr(MSR_SYSCALL_MASK,
-          X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
+          X86_EFLAGS_AC|X86_EFLAGS_VM|X86_EFLAGS_RF|X86_EFLAGS_NT|
           X86_EFLAGS_DF|X86_EFLAGS_IF|X86_EFLAGS_TF,
           0U);
 }
diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
index 0302836..dadc251 100644
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -190,7 +190,20 @@ static inline void stac(void)
 #endif
 
 #ifdef __ASSEMBLY__
-.macro SAVE_ALL
+/*
+ * Save all registers.
+ *
+ * @ac:
+ *     0 - clear AC bit
+ *     1 - set AC bit
+ *     others - don't modify AC bit
+ */
+.macro SAVE_ALL ac=0
+.if \ac == 0
+        ASM_CLAC
+.elseif \ac == 1
+        ASM_STAC
+.endif
         addq  $-(UREGS_error_code-UREGS_r15), %rsp
         cld
         movq  %rdi,UREGS_rdi(%rsp)
-- 
1.8.3.1


* [PATCH v7 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode
  2014-05-08  8:15 [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
  2014-05-08  8:15 ` [PATCH v7 05/10] x86: Clear AC bit in RFLAGS to protect Xen itself by SMAP Feng Wu
@ 2014-05-08  8:15 ` Feng Wu
  2014-05-08  8:15 ` [PATCH v7 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen Feng Wu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Feng Wu @ 2014-05-08  8:15 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Use STAC/CLAC to temporarily disable SMAP to allow legal accesses to
user pages in kernel mode.

STAC/CLAC are not needed for compat_create_bounce_frame, since this
chunk of code only accesses the pv guest's kernel stack, which is in
ring 1 for 32-bit pv guests.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/traps.c                | 12 ++++++++++++
 xen/arch/x86/usercopy.c             |  6 ++++++
 xen/arch/x86/x86_64/entry.S         |  2 ++
 xen/include/asm-x86/uaccess.h       |  8 ++++++--
 xen/include/asm-x86/x86_64/system.h |  4 +++-
 5 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 3800b6f..56ab444 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -3817,6 +3817,18 @@ unsigned long do_get_debugreg(int reg)
 
 void asm_domain_crash_synchronous(unsigned long addr)
 {
+    /*
+     * We need to clear the AC bit here because in entry.S AC is
+     * set by ASM_STAC to temporarily allow accesses to user pages,
+     * which are prevented by SMAP by default.
+     *
+     * On some code paths where this function is called, clac()
+     * is not needed, but adding clac() here rather than at each
+     * call site of asm_domain_crash_synchronous() reduces code
+     * redundancy, and it is harmless as well.
+     */
+    clac();
+
     if ( addr == 0 )
         addr = this_cpu(last_extable_addr);
 
diff --git a/xen/arch/x86/usercopy.c b/xen/arch/x86/usercopy.c
index b79202b..4cc78f5 100644
--- a/xen/arch/x86/usercopy.c
+++ b/xen/arch/x86/usercopy.c
@@ -14,6 +14,7 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned n)
 {
     unsigned long __d0, __d1, __d2, __n = n;
 
+    stac();
     asm volatile (
         "    cmp  $"STR(2*BYTES_PER_LONG-1)",%0\n"
         "    jbe  1f\n"
@@ -42,6 +43,7 @@ unsigned long __copy_to_user_ll(void __user *to, const void *from, unsigned n)
         : "=&c" (__n), "=&D" (__d0), "=&S" (__d1), "=&r" (__d2)
         : "0" (__n), "1" (to), "2" (from), "3" (__n)
         : "memory" );
+    clac();
 
     return __n;
 }
@@ -51,6 +53,7 @@ __copy_from_user_ll(void *to, const void __user *from, unsigned n)
 {
     unsigned long __d0, __d1, __d2, __n = n;
 
+    stac();
     asm volatile (
         "    cmp  $"STR(2*BYTES_PER_LONG-1)",%0\n"
         "    jbe  1f\n"
@@ -85,6 +88,7 @@ __copy_from_user_ll(void *to, const void __user *from, unsigned n)
         : "=&c" (__n), "=&D" (__d0), "=&S" (__d1), "=&r" (__d2)
         : "0" (__n), "1" (to), "2" (from), "3" (__n)
         : "memory" );
+    clac();
 
     return __n;
 }
@@ -113,6 +117,7 @@ copy_to_user(void __user *to, const void *from, unsigned n)
 #define __do_clear_user(addr,size)					\
 do {									\
 	long __d0;							\
+	stac();								\
 	__asm__ __volatile__(						\
 		"0:	rep; stosl\n"					\
 		"	movl %2,%0\n"					\
@@ -126,6 +131,7 @@ do {									\
 		_ASM_EXTABLE(1b,2b)					\
 		: "=&c"(size), "=&D" (__d0)				\
 		: "r"(size & 3), "0"(size / 4), "1"((long)addr), "a"(0));	\
+	clac();								\
 } while (0)
 
 /**
diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
index ed7b96f..a81566e 100644
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -379,6 +379,7 @@ __UNLIKELY_END(create_bounce_frame_bad_sp)
         movb  TRAPBOUNCE_flags(%rdx),%cl
         subq  $40,%rsi
         movq  UREGS_ss+8(%rsp),%rax
+        ASM_STAC
 .Lft2:  movq  %rax,32(%rsi)             # SS
         movq  UREGS_rsp+8(%rsp),%rax
 .Lft3:  movq  %rax,24(%rsi)             # RSP
@@ -423,6 +424,7 @@ UNLIKELY_END(bounce_failsafe)
 .Lft12: movq  %rax,8(%rsi)              # R11
         movq  UREGS_rcx+8(%rsp),%rax
 .Lft13: movq  %rax,(%rsi)               # RCX
+        ASM_CLAC
         /* Rewrite our stack frame and return to guest-OS mode. */
         /* IA32 Ref. Vol. 3: TF, VM, RF and NT flags are cleared on trap. */
         /* Also clear AC: alignment checks shouldn't trigger in kernel mode. */
diff --git a/xen/include/asm-x86/uaccess.h b/xen/include/asm-x86/uaccess.h
index 88b4ba2..947470d 100644
--- a/xen/include/asm-x86/uaccess.h
+++ b/xen/include/asm-x86/uaccess.h
@@ -146,6 +146,7 @@ struct __large_struct { unsigned long buf[100]; };
  * aliasing issues.
  */
 #define __put_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
+	stac();								\
 	__asm__ __volatile__(						\
 		"1:	mov"itype" %"rtype"1,%2\n"			\
 		"2:\n"							\
@@ -155,9 +156,11 @@ struct __large_struct { unsigned long buf[100]; };
 		".previous\n"						\
 		_ASM_EXTABLE(1b, 3b)					\
 		: "=r"(err)						\
-		: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err))
+		: ltype (x), "m"(__m(addr)), "i"(errret), "0"(err));	\
+	clac()
 
 #define __get_user_asm(x, addr, err, itype, rtype, ltype, errret)	\
+	stac();								\
 	__asm__ __volatile__(						\
 		"1:	mov"itype" %2,%"rtype"1\n"			\
 		"2:\n"							\
@@ -168,7 +171,8 @@ struct __large_struct { unsigned long buf[100]; };
 		".previous\n"						\
 		_ASM_EXTABLE(1b, 3b)					\
 		: "=r"(err), ltype (x)					\
-		: "m"(__m(addr)), "i"(errret), "0"(err))
+		: "m"(__m(addr)), "i"(errret), "0"(err));		\
+	clac()
 
 /**
  * __copy_to_user: - Copy a block of data into user space, with less checking
diff --git a/xen/include/asm-x86/x86_64/system.h b/xen/include/asm-x86/x86_64/system.h
index 20f038b..662813a 100644
--- a/xen/include/asm-x86/x86_64/system.h
+++ b/xen/include/asm-x86/x86_64/system.h
@@ -12,6 +12,7 @@
  * is the same as the initial value of _o then _n is written to location _p.
  */
 #define __cmpxchg_user(_p,_o,_n,_isuff,_oppre,_regtype)                 \
+    stac();                                                             \
     asm volatile (                                                      \
         "1: lock; cmpxchg"_isuff" %"_oppre"2,%3\n"                      \
         "2:\n"                                                          \
@@ -22,7 +23,8 @@
         _ASM_EXTABLE(1b, 3b)                                            \
         : "=a" (_o), "=r" (_rc)                                         \
         : _regtype (_n), "m" (*__xg((volatile void *)_p)), "0" (_o), "1" (0) \
-        : "memory");
+        : "memory");                                                    \
+    clac()
 
 #define cmpxchg_user(_p,_o,_n)                                          \
 ({                                                                      \
-- 
1.8.3.1


* [PATCH v7 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-08  8:15 [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
  2014-05-08  8:15 ` [PATCH v7 05/10] x86: Clear AC bit in RFLAGS to protect Xen itself by SMAP Feng Wu
  2014-05-08  8:15 ` [PATCH v7 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode Feng Wu
@ 2014-05-08  8:15 ` Feng Wu
  2014-05-08  8:15 ` [PATCH v7 09/10] x86/hvm: Add SMAP support to HVM guest Feng Wu
  2014-05-08  8:40 ` [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Jan Beulich
  4 siblings, 0 replies; 8+ messages in thread
From: Feng Wu @ 2014-05-08  8:15 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Supervisor Mode Access Prevention (SMAP) is a new security
feature disclosed by Intel; please refer to the following
document:

http://software.intel.com/sites/default/files/319433-014.pdf

If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
to linear addresses that are accessible in user mode. If CPL < 3,
SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
applies to all supervisor-mode data accesses (these are implicit
supervisor accesses) regardless of the value of EFLAGS.AC.

This patch enables SMAP in Xen to prevent the hypervisor from
accessing pv guest data whose translation paging-structure
entries all have the U/S flag set.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 docs/misc/xen-command-line.markdown |  7 +++++
 xen/arch/x86/setup.c                | 20 +++++++++++++
 xen/arch/x86/traps.c                | 58 +++++++++++++++++++++++++++----------
 xen/include/asm-x86/cpufeature.h    |  1 +
 xen/include/asm-x86/domain.h        |  6 ++--
 5 files changed, 75 insertions(+), 17 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 7dc938b..a7ac53d 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -912,6 +912,13 @@ Set the serial transmit buffer size.
 
 Flag to enable Supervisor Mode Execution Protection
 
+### smap
+> `= <boolean>`
+
+> Default: `true`
+
+Flag to enable Supervisor Mode Access Prevention
+
 ### snb\_igd\_quirk
 > `= <boolean>`
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 2e30701..d8598a3 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
 static bool_t __initdata disable_smep;
 invbool_param("smep", disable_smep);
 
+/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
+static bool_t __initdata disable_smap;
+invbool_param("smap", disable_smap);
+
 /* **** Linux config option: propagated to domain0. */
 /* "acpi=off":    Disables both ACPI table parsing and interpreter. */
 /* "acpi=force":  Override the disable blacklist.                   */
@@ -1280,6 +1284,11 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     if ( cpu_has_smep )
         set_in_cr4(X86_CR4_SMEP);
 
+    if ( disable_smap )
+        setup_clear_cpu_cap(X86_FEATURE_SMAP);
+    if ( cpu_has_smap )
+        set_in_cr4(X86_CR4_SMAP);
+
     if ( cpu_has_fsgsbase )
         set_in_cr4(X86_CR4_FSGSBASE);
 
@@ -1386,6 +1395,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                initrdidx);
 
     /*
+     * Temporarily clear SMAP in CR4 to allow user accesses in construct_dom0().
+     * This saves a large number of corner-case interactions with
+     * copy_from_user().
+     */
+    if ( cpu_has_smap )
+        write_cr4(read_cr4() & ~X86_CR4_SMAP);
+
+    /*
      * We're going to setup domain0 using the module(s) that we stashed safely
      * above our heap. The second module, if present, is an initrd ramdisk.
      */
@@ -1395,6 +1412,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                         bootstrap_map, cmdline) != 0)
         panic("Could not set up DOM0 guest OS");
 
+    if ( cpu_has_smap )
+        write_cr4(read_cr4() | X86_CR4_SMAP);
+
     /* Scrub RAM that is still free and so may go to an unprivileged domain. */
     scrub_heap_pages();
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 56ab444..19a161f 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1189,11 +1189,12 @@ static int handle_gdt_ldt_mapping_fault(
 enum pf_type {
     real_fault,
     smep_fault,
+    smap_fault,
     spurious_fault
 };
 
 static enum pf_type __page_fault_type(
-    unsigned long addr, unsigned int error_code)
+    unsigned long addr, const struct cpu_user_regs *regs)
 {
     unsigned long mfn, cr3 = read_cr3();
     l4_pgentry_t l4e, *l4t;
@@ -1201,6 +1202,7 @@ static enum pf_type __page_fault_type(
     l2_pgentry_t l2e, *l2t;
     l1_pgentry_t l1e, *l1t;
     unsigned int required_flags, disallowed_flags, page_user;
+    unsigned int error_code = regs->error_code;
 
     /*
      * We do not take spurious page faults in IRQ handlers as we do not
@@ -1269,19 +1271,37 @@ static enum pf_type __page_fault_type(
     page_user &= l1e_get_flags(l1e);
 
 leaf:
-    /*
-     * Supervisor Mode Execution Protection (SMEP):
-     * Disallow supervisor execution from user-accessible mappings
-     */
-    if ( (read_cr4() & X86_CR4_SMEP) && page_user &&
-         ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
-        return smep_fault;
+    if ( page_user )
+    {
+        unsigned long cr4 = read_cr4();
+        /*
+         * Supervisor Mode Execution Prevention (SMEP):
+         * Disallow supervisor execution from user-accessible mappings
+         */
+        if ( (cr4 & X86_CR4_SMEP) &&
+             ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
+            return smep_fault;
+
+        /*
+         * Supervisor Mode Access Prevention (SMAP):
+         * Disallow supervisor access to user-accessible mappings.
+         * A fault is considered an SMAP violation if the following
+         * conditions are true:
+         *   - X86_CR4_SMAP is set in CR4
+         *   - A user page is being accessed
+         *   - CPL=3 or X86_EFLAGS_AC is clear
+         *   - Page fault in kernel mode
+         */
+        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode) &&
+             (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
+            return smap_fault;
+    }
 
     return spurious_fault;
 }
 
 static enum pf_type spurious_page_fault(
-    unsigned long addr, unsigned int error_code)
+    unsigned long addr, const struct cpu_user_regs *regs)
 {
     unsigned long flags;
     enum pf_type pf_type;
@@ -1291,7 +1311,7 @@ static enum pf_type spurious_page_fault(
      * page tables from becoming invalid under our feet during the walk.
      */
     local_irq_save(flags);
-    pf_type = __page_fault_type(addr, error_code);
+    pf_type = __page_fault_type(addr, regs);
     local_irq_restore(flags);
 
     return pf_type;
@@ -1386,8 +1406,14 @@ void do_page_fault(struct cpu_user_regs *regs)
 
     if ( unlikely(!guest_mode(regs)) )
     {
-        pf_type = spurious_page_fault(addr, error_code);
-        BUG_ON(pf_type == smep_fault);
+        pf_type = spurious_page_fault(addr, regs);
+        if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
+        {
+            console_start_sync();
+            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ? 'E' : 'A');
+            fatal_trap(TRAP_page_fault, regs);
+        }
+
         if ( pf_type != real_fault )
             return;
 
@@ -1413,10 +1439,12 @@ void do_page_fault(struct cpu_user_regs *regs)
 
     if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
     {
-        pf_type = spurious_page_fault(addr, error_code);
-        if ( pf_type == smep_fault )
+        pf_type = spurious_page_fault(addr, regs);
+        if ( (pf_type == smep_fault) || (pf_type == smap_fault))
         {
-            gdprintk(XENLOG_ERR, "Fatal SMEP fault\n");
+            printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
+                   current, (pf_type == smep_fault) ? 'E' : 'A');
+
             domain_crash(current->domain);
         }
         if ( pf_type != real_fault )
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 20881c0..8014241 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -190,6 +190,7 @@
 #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
 
 #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
+#define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
 #define cpu_has_fpu_sel         (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
 
 #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) \
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index c5c266f..abf55fb 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -467,12 +467,14 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu *, unsigned long guest_cr4);
     (((v)->arch.pv_vcpu.ctrlreg[4]                          \
       | (mmu_cr4_features                                   \
          & (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_SMEP |      \
-            X86_CR4_OSXSAVE | X86_CR4_FSGSBASE))            \
+            X86_CR4_SMAP | X86_CR4_OSXSAVE |                \
+            X86_CR4_FSGSBASE))                              \
       | ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0))         \
      & ~X86_CR4_DE)
 #define real_cr4_to_pv_guest_cr4(c)                         \
     ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
-             X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
+             X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
+             X86_CR4_FSGSBASE | X86_CR4_SMAP))
 
 void domain_cpuid(struct domain *d,
                   unsigned int  input,
-- 
1.8.3.1


* [PATCH v7 09/10] x86/hvm: Add SMAP support to HVM guest
  2014-05-08  8:15 [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (2 preceding siblings ...)
  2014-05-08  8:15 ` [PATCH v7 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen Feng Wu
@ 2014-05-08  8:15 ` Feng Wu
  2014-05-08  8:40 ` [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Jan Beulich
  4 siblings, 0 replies; 8+ messages in thread
From: Feng Wu @ 2014-05-08  8:15 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Intel's new CPUs support SMAP (Supervisor Mode Access Prevention).
SMAP prevents supervisor-mode accesses to any linear address with
a valid translation for which the U/S flag (bit 2) is 1 in every
paging-structure entry controlling the translation for that linear
address.
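
As an aside (purely an illustrative sketch, not part of this patch; the
function and macro names below are made up), the "U/S flag is 1 in every
paging-structure entry" rule means the effective user-accessibility of a
linear address is the AND of the U/S bits over all levels of the walk.
The walker changed below achieves the same effect by temporarily treating
_PAGE_USER as a mandatory flag at every level and then inverting the
answer; the sketch just states the rule directly:

/* Illustrative only: effective user-accessibility of a translation. */
#include <stdbool.h>
#include <stddef.h>

#define PAGE_USER 0x4UL   /* U/S is bit 2 of a paging-structure entry */

bool user_accessible(const unsigned long *entries, size_t levels)
{
    unsigned long user = PAGE_USER;
    size_t i;

    for ( i = 0; i < levels; i++ )
        user &= entries[i];       /* accumulate U/S across all levels */

    return user == PAGE_USER;     /* SMAP/SMEP apply only when this holds */
}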

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/hvm/hvm.c        |  3 +++
 xen/arch/x86/mm/guest_walk.c  | 40 ++++++++++++++++++++++++++++++----------
 xen/include/asm-x86/hvm/hvm.h | 22 +++++++++++++++++++---
 3 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index ac05160..76ccd07 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3039,6 +3039,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         if ( (count == 0) && !cpu_has_smep )
             *ebx &= ~cpufeat_mask(X86_FEATURE_SMEP);
 
+        if ( (count == 0) && !cpu_has_smap )
+            *ebx &= ~cpufeat_mask(X86_FEATURE_SMAP);
+
         /* Don't expose MPX to hvm when VMX support is not available */
         if ( (count == 0) &&
              (!(vmx_vmexit_control & VM_EXIT_CLEAR_BNDCFGS) ||
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 70460b6..bb38fda 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -144,7 +144,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
     guest_l4e_t *l4p;
 #endif
     uint32_t gflags, mflags, iflags, rc = 0;
-    int smep;
+    bool_t smep = 0, smap = 0;
     bool_t pse1G = 0, pse2M = 0;
     p2m_query_t qt = P2M_ALLOC | P2M_UNSHARE;
 
@@ -159,13 +159,33 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
     mflags = mandatory_flags(v, pfec);
     iflags = (_PAGE_NX_BIT | _PAGE_INVALID_BITS);
 
-    /* SMEP: kernel-mode instruction fetches from user-mode mappings
-     * should fault.  Unlike NX or invalid bits, we're looking for _all_
-     * entries in the walk to have _PAGE_USER set, so we need to do the
-     * whole walk as if it were a user-mode one and then invert the answer. */
-    smep = (is_hvm_vcpu(v) && hvm_smep_enabled(v) 
-            && (pfec & PFEC_insn_fetch) && !(pfec & PFEC_user_mode) );
-    if ( smep )
+    if ( is_hvm_vcpu(v) && !(pfec & PFEC_user_mode) )
+    {
+        struct segment_register seg;
+        const struct cpu_user_regs *regs = guest_cpu_user_regs();
+
+        hvm_get_segment_register(v, x86_seg_ss, &seg);
+
+        /* SMEP: kernel-mode instruction fetches from user-mode mappings
+         * should fault.  Unlike NX or invalid bits, we're looking for _all_
+         * entries in the walk to have _PAGE_USER set, so we need to do the
+         * whole walk as if it were a user-mode one and then invert the answer. */
+        smep =  hvm_smep_enabled(v) && (pfec & PFEC_insn_fetch);
+
+        /*
+         * SMAP: kernel-mode data accesses from user-mode mappings should fault.
+         * A fault is considered an SMAP violation if the following
+         * conditions are true:
+         *   - X86_CR4_SMAP is set in CR4
+         *   - A user page is accessed
+         *   - CPL = 3 or X86_EFLAGS_AC is clear
+         *   - Page fault in kernel mode
+         */
+        smap = hvm_smap_enabled(v) &&
+               ((seg.attr.fields.dpl == 3) || !(regs->eflags & X86_EFLAGS_AC));
+    }
+
+    if ( smep || smap )
         mflags |= _PAGE_USER;
 
 #if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
@@ -338,8 +358,8 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
 set_ad:
 #endif
-    /* Now re-invert the user-mode requirement for SMEP. */
-    if ( smep ) 
+    /* Now re-invert the user-mode requirement for SMEP and SMAP */
+    if ( smep || smap )
         rc ^= _PAGE_USER;
 
     /* Go back and set accessed and dirty bits only if the walk was a
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index c373930..7feaaa2 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -260,6 +260,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
     (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PAE))
 #define hvm_smep_enabled(v) \
     (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMEP))
+#define hvm_smap_enabled(v) \
+    (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMAP))
 #define hvm_nx_enabled(v) \
     (!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
 
@@ -356,17 +358,30 @@ static inline int hvm_event_pending(struct vcpu *v)
 
 static inline bool_t hvm_vcpu_has_smep(void)
 {
-    unsigned int eax, ebx;
+    unsigned int eax, ebx, ecx = 0;
 
     hvm_cpuid(0, &eax, NULL, NULL, NULL);
 
-    if (eax < 7)
+    if ( eax < 7 )
         return 0;
 
-    hvm_cpuid(7, NULL, &ebx, NULL, NULL);
+    hvm_cpuid(7, NULL, &ebx, &ecx, NULL);
     return !!(ebx & cpufeat_mask(X86_FEATURE_SMEP));
 }
 
+static inline bool_t hvm_vcpu_has_smap(void)
+{
+    unsigned int eax, ebx, ecx = 0;
+
+    hvm_cpuid(0, &eax, NULL, NULL, NULL);
+
+    if ( eax < 7 )
+        return 0;
+
+    hvm_cpuid(7, NULL, &ebx, &ecx, NULL);
+    return !!(ebx & cpufeat_mask(X86_FEATURE_SMAP));
+}
+
 /* These reserved bits in lower 32 remain 0 after any load of CR0 */
 #define HVM_CR0_GUEST_RESERVED_BITS             \
     (~((unsigned long)                          \
@@ -387,6 +402,7 @@ static inline bool_t hvm_vcpu_has_smep(void)
         X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE |       \
         X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT |           \
         (hvm_vcpu_has_smep() ? X86_CR4_SMEP : 0) |      \
+        (hvm_vcpu_has_smap() ? X86_CR4_SMAP : 0) |      \
         (cpu_has_fsgsbase ? X86_CR4_FSGSBASE : 0) |     \
         ((nestedhvm_enabled((_v)->domain) && cpu_has_vmx)\
                       ? X86_CR4_VMXE : 0)  |             \
-- 
1.8.3.1


* Re: [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP)
  2014-05-08  8:15 [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
                   ` (3 preceding siblings ...)
  2014-05-08  8:15 ` [PATCH v7 09/10] x86/hvm: Add SMAP support to HVM guest Feng Wu
@ 2014-05-08  8:40 ` Jan Beulich
  2014-05-08  8:59   ` Wu, Feng
  4 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2014-05-08  8:40 UTC (permalink / raw)
  To: Feng Wu
  Cc: kevin.tian, ian.campbell, andrew.cooper3, eddie.dong, xen-devel,
	jun.nakajima

>>> On 08.05.14 at 10:15, <feng.wu@intel.com> wrote:
> Feng Wu (10):
>   x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET
>   x86: move common_interrupt to entry.S
>   x86: merge stuff from asm-x86/x86_64/asm_defns.h to
>     asm-x86/asm_defns.h
>   x86: Add support for STAC/CLAC instructions
>   x86: Clear AC bit in RFLAGS to protect Xen itself by SMAP
>   x86: Temporary disable SMAP to legally access user pages in kernel
>     mode
>   VMX: Disable SMAP feature when guest is in non-paging mode
>   x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
>   x86/hvm: Add SMAP support to HVM guest
>   x86/tools: Expose SMAP to HVM guests

This list doesn't appear to match what you actually sent - all I see
are patches 5, 6, 8, and 9. Maybe the delivery of 4, 7, and 10 got
delayed, but it seems pretty clear you didn't resend 1-3 (as you
were asked), so why are these still listed here, and why is the
total count still 10 rather than 7 (or lower)?

Jan


* Re: [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP)
  2014-05-08  8:40 ` [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Jan Beulich
@ 2014-05-08  8:59   ` Wu, Feng
  0 siblings, 0 replies; 8+ messages in thread
From: Wu, Feng @ 2014-05-08  8:59 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, ian.campbell, andrew.cooper3, Dong, Eddie,
	xen-devel, Nakajima, Jun



> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Thursday, May 08, 2014 4:40 PM
> To: Wu, Feng
> Cc: andrew.cooper3@citrix.com; ian.campbell@citrix.com; Dong, Eddie;
> Nakajima, Jun; Tian, Kevin; xen-devel@lists.xen.org
> Subject: Re: [PATCH v7 00/10] x86: Enable Supervisor Mode Access Prevention
> (SMAP)
> 
> >>> On 08.05.14 at 10:15, <feng.wu@intel.com> wrote:
> > Feng Wu (10):
> >   x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET
> >   x86: move common_interrupt to entry.S
> >   x86: merge stuff from asm-x86/x86_64/asm_defns.h to
> >     asm-x86/asm_defns.h
> >   x86: Add support for STAC/CLAC instructions
> >   x86: Clear AC bit in RFLAGS to protect Xen itself by SMAP
> >   x86: Temporary disable SMAP to legally access user pages in kernel
> >     mode
> >   VMX: Disable SMAP feature when guest is in non-paging mode
> >   x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
> >   x86/hvm: Add SMAP support to HVM guest
> >   x86/tools: Expose SMAP to HVM guests
> 
> This list doesn't appear to match what you actually sent - all I see
> are patches 5, 6, 8, and 9. Maybe the delivery of 4, 7, and 10 got
> delayed, but it seems pretty clear you didn't resend 1-3 (as you
> were asked), so why are these still listed here, and why is the
> total count still 10 rather than 7 (or lower)?
> 
> Jan

Actually I only sent patches 5, 6, 8, and 9 in this post. I think I didn't get the
mail formatting right, so I didn't send 4, 7, and 10. Sorry for that! I will rebase
the patch set on the latest upstream and resend the remaining 7 patches.

Thanks,
Feng


* [PATCH v7 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
  2014-05-08  8:13 [PATCH v7 05/10] x86: Clear AC bit in RFLAGS to protect Xen itself by SMAP Feng Wu
@ 2014-05-08  8:13 ` Feng Wu
  0 siblings, 0 replies; 8+ messages in thread
From: Feng Wu @ 2014-05-08  8:13 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, Feng Wu, JBeulich, andrew.cooper3, eddie.dong,
	jun.nakajima, ian.campbell

Supervisor Mode Access Prevention (SMAP) is a new security
feature disclosed by Intel; please refer to the following
document:

http://software.intel.com/sites/default/files/319433-014.pdf

If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
to linear addresses that are accessible in user mode. If CPL < 3,
SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
applies to all supervisor-mode data accesses (these are implicit
supervisor accesses) regardless of the value of EFLAGS.AC.

This patch enables SMAP in Xen to prevent the hypervisor from
accessing pv guest data whose translation paging-structure
entries all have the U/S flag set.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 docs/misc/xen-command-line.markdown |  7 +++++
 xen/arch/x86/setup.c                | 20 +++++++++++++
 xen/arch/x86/traps.c                | 58 +++++++++++++++++++++++++++----------
 xen/include/asm-x86/cpufeature.h    |  1 +
 xen/include/asm-x86/domain.h        |  6 ++--
 5 files changed, 75 insertions(+), 17 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 7dc938b..a7ac53d 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -912,6 +912,13 @@ Set the serial transmit buffer size.
 
 Flag to enable Supervisor Mode Execution Protection
 
+### smap
+> `= <boolean>`
+
+> Default: `true`
+
+Flag to enable Supervisor Mode Access Prevention
+
 ### snb\_igd\_quirk
 > `= <boolean>`
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 2e30701..d8598a3 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
 static bool_t __initdata disable_smep;
 invbool_param("smep", disable_smep);
 
+/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
+static bool_t __initdata disable_smap;
+invbool_param("smap", disable_smap);
+
 /* **** Linux config option: propagated to domain0. */
 /* "acpi=off":    Disables both ACPI table parsing and interpreter. */
 /* "acpi=force":  Override the disable blacklist.                   */
@@ -1280,6 +1284,11 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     if ( cpu_has_smep )
         set_in_cr4(X86_CR4_SMEP);
 
+    if ( disable_smap )
+        setup_clear_cpu_cap(X86_FEATURE_SMAP);
+    if ( cpu_has_smap )
+        set_in_cr4(X86_CR4_SMAP);
+
     if ( cpu_has_fsgsbase )
         set_in_cr4(X86_CR4_FSGSBASE);
 
@@ -1386,6 +1395,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                initrdidx);
 
     /*
+     * Temporarily clear SMAP in CR4 to allow user accesses in construct_dom0().
+     * This saves a large number of corner-case interactions with
+     * copy_from_user().
+     */
+    if ( cpu_has_smap )
+        write_cr4(read_cr4() & ~X86_CR4_SMAP);
+
+    /*
      * We're going to setup domain0 using the module(s) that we stashed safely
      * above our heap. The second module, if present, is an initrd ramdisk.
      */
@@ -1395,6 +1412,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                         bootstrap_map, cmdline) != 0)
         panic("Could not set up DOM0 guest OS");
 
+    if ( cpu_has_smap )
+        write_cr4(read_cr4() | X86_CR4_SMAP);
+
     /* Scrub RAM that is still free and so may go to an unprivileged domain. */
     scrub_heap_pages();
 
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 56ab444..19a161f 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1189,11 +1189,12 @@ static int handle_gdt_ldt_mapping_fault(
 enum pf_type {
     real_fault,
     smep_fault,
+    smap_fault,
     spurious_fault
 };
 
 static enum pf_type __page_fault_type(
-    unsigned long addr, unsigned int error_code)
+    unsigned long addr, const struct cpu_user_regs *regs)
 {
     unsigned long mfn, cr3 = read_cr3();
     l4_pgentry_t l4e, *l4t;
@@ -1201,6 +1202,7 @@ static enum pf_type __page_fault_type(
     l2_pgentry_t l2e, *l2t;
     l1_pgentry_t l1e, *l1t;
     unsigned int required_flags, disallowed_flags, page_user;
+    unsigned int error_code = regs->error_code;
 
     /*
      * We do not take spurious page faults in IRQ handlers as we do not
@@ -1269,19 +1271,37 @@ static enum pf_type __page_fault_type(
     page_user &= l1e_get_flags(l1e);
 
 leaf:
-    /*
-     * Supervisor Mode Execution Protection (SMEP):
-     * Disallow supervisor execution from user-accessible mappings
-     */
-    if ( (read_cr4() & X86_CR4_SMEP) && page_user &&
-         ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
-        return smep_fault;
+    if ( page_user )
+    {
+        unsigned long cr4 = read_cr4();
+        /*
+         * Supervisor Mode Execution Prevention (SMEP):
+         * Disallow supervisor execution from user-accessible mappings
+         */
+        if ( (cr4 & X86_CR4_SMEP) &&
+             ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) == PFEC_insn_fetch) )
+            return smep_fault;
+
+        /*
+         * Supervisor Mode Access Prevention (SMAP):
+         * Disallow supervisor access to user-accessible mappings.
+         * A fault is considered an SMAP violation if the following
+         * conditions are true:
+         *   - X86_CR4_SMAP is set in CR4
+         *   - A user page is being accessed
+         *   - CPL=3 or X86_EFLAGS_AC is clear
+         *   - Page fault in kernel mode
+         */
+        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode) &&
+             (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
+            return smap_fault;
+    }
 
     return spurious_fault;
 }
 
 static enum pf_type spurious_page_fault(
-    unsigned long addr, unsigned int error_code)
+    unsigned long addr, const struct cpu_user_regs *regs)
 {
     unsigned long flags;
     enum pf_type pf_type;
@@ -1291,7 +1311,7 @@ static enum pf_type spurious_page_fault(
      * page tables from becoming invalid under our feet during the walk.
      */
     local_irq_save(flags);
-    pf_type = __page_fault_type(addr, error_code);
+    pf_type = __page_fault_type(addr, regs);
     local_irq_restore(flags);
 
     return pf_type;
@@ -1386,8 +1406,14 @@ void do_page_fault(struct cpu_user_regs *regs)
 
     if ( unlikely(!guest_mode(regs)) )
     {
-        pf_type = spurious_page_fault(addr, error_code);
-        BUG_ON(pf_type == smep_fault);
+        pf_type = spurious_page_fault(addr, regs);
+        if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
+        {
+            console_start_sync();
+            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ? 'E' : 'A');
+            fatal_trap(TRAP_page_fault, regs);
+        }
+
         if ( pf_type != real_fault )
             return;
 
@@ -1413,10 +1439,12 @@ void do_page_fault(struct cpu_user_regs *regs)
 
     if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
     {
-        pf_type = spurious_page_fault(addr, error_code);
-        if ( pf_type == smep_fault )
+        pf_type = spurious_page_fault(addr, regs);
+        if ( (pf_type == smep_fault) || (pf_type == smap_fault))
         {
-            gdprintk(XENLOG_ERR, "Fatal SMEP fault\n");
+            printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
+                   current, (pf_type == smep_fault) ? 'E' : 'A');
+
             domain_crash(current->domain);
         }
         if ( pf_type != real_fault )
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 20881c0..8014241 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -190,6 +190,7 @@
 #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
 
 #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
+#define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
 #define cpu_has_fpu_sel         (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
 
 #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) \
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index c5c266f..abf55fb 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -467,12 +467,14 @@ unsigned long pv_guest_cr4_fixup(const struct vcpu *, unsigned long guest_cr4);
     (((v)->arch.pv_vcpu.ctrlreg[4]                          \
       | (mmu_cr4_features                                   \
          & (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_SMEP |      \
-            X86_CR4_OSXSAVE | X86_CR4_FSGSBASE))            \
+            X86_CR4_SMAP | X86_CR4_OSXSAVE |                \
+            X86_CR4_FSGSBASE))                              \
       | ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0))         \
      & ~X86_CR4_DE)
 #define real_cr4_to_pv_guest_cr4(c)                         \
     ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
-             X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
+             X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
+             X86_CR4_FSGSBASE | X86_CR4_SMAP))
 
 void domain_cpuid(struct domain *d,
                   unsigned int  input,
-- 
1.8.3.1


