* [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
@ 2015-11-16 10:31 Huaitong Han
  2015-11-16 10:31 ` [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
                   ` (10 more replies)
  0 siblings, 11 replies; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

The protection-key feature provides an additional mechanism by which IA-32e
paging controls access to usermode addresses.

Hardware support for protection keys for user pages is enumerated with CPUID
feature flag CPUID.7.0.ECX[3]:PKU. OS support is enumerated with
CPUID.7.0.ECX[4]:OSPKE, which reflects the setting of CR4.PKE (bit 22).

When CR4.PKE = 1, every linear address is associated with the 4-bit protection
key located in bits 62:59 of the paging-structure entry that mapped the page
containing the linear address. The PKRU register determines, for each
protection key, whether user-mode addresses with that protection key may be
read or written.

The PKRU register (protection key rights for user pages) is a 32-bit register
with the following format: for each i (0 ≤ i ≤ 15), PKRU[2i] is the
access-disable bit for protection key i (ADi); PKRU[2i+1] is the write-disable
bit for protection key i (WDi).

Software can use the RDPKRU and WRPKRU instructions with ECX = 0 to read and
write PKRU. In addition, the PKRU register is XSAVE-managed state and can thus
be read and written by instructions in the XSAVE feature set.
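
For illustration only, a minimal sketch (not part of the patches below; the
names rdpkru/pkey_ad/pkey_wd exist only for this example, mirroring the
read_pkru() helper added later in the series) of reading PKRU and testing the
AD/WD bits for key i:

    static inline unsigned int rdpkru(void)
    {
        unsigned int pkru;

        /* RDPKRU (ECX must be 0); emitted as raw bytes for older assemblers. */
        asm volatile ( ".byte 0x0f,0x01,0xee"
                       : "=a" (pkru) : "c" (0) : "dx" );
        return pkru;
    }

    /* PKRU[2i] is the access-disable bit, PKRU[2i+1] the write-disable bit. */
    #define pkey_ad(pkru, i) (((pkru) >> (2 * (i))) & 1)
    #define pkey_wd(pkru, i) (((pkru) >> (2 * (i) + 1)) & 1)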

PFEC.PK (bit 5) is defined to report protection-key violations.

The specification of protection keys can be found in the Intel SDM (Volume 3,
Section 4.6.2):
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf.

Huaitong Han (10):
  x86/hvm: pkeys, add pkeys support for cpuid handling
  x86/hvm: pkeys, add pku support for x86_capability
  x86/hvm: pkeys, add the flag to enable Memory Protection Keys
  x86/hvm: pkeys, add pkeys support when setting CR4
  x86/hvm: pkeys, disable pkeys for guests in non-paging mode
  x86/hvm: pkeys, add functions to get pkeys value from PTE
  x86/hvm: pkeys, add functions to support PKRU access/write
  x86/hvm: pkeys, add pkeys support for do_page_fault
  x86/hvm: pkeys, add pkeys support for guest_walk_tables
  x86/hvm: pkeys, add xstate support for pkeys

 docs/misc/xen-command-line.markdown |  7 +++++
 tools/libxc/xc_cpufeature.h         |  2 ++
 tools/libxc/xc_cpuid_x86.c          |  6 ++--
 xen/arch/x86/cpu/common.c           |  5 ++--
 xen/arch/x86/hvm/hvm.c              | 11 ++++++-
 xen/arch/x86/hvm/vmx/vmx.c          | 11 +++----
 xen/arch/x86/mm/guest_walk.c        | 57 ++++++++++++++++++++++++++++++++++++-
 xen/arch/x86/setup.c                |  9 ++++++
 xen/arch/x86/traps.c                | 47 +++++++++++++++++++++++++-----
 xen/include/asm-x86/cpufeature.h    |  7 ++++-
 xen/include/asm-x86/guest_pt.h      | 11 +++++++
 xen/include/asm-x86/hvm/hvm.h       |  2 ++
 xen/include/asm-x86/page.h          |  7 +++++
 xen/include/asm-x86/processor.h     | 53 +++++++++++++++++++++++++++++-----
 xen/include/asm-x86/x86_64/page.h   | 14 +++++++++
 xen/include/asm-x86/xstate.h        |  3 +-
 16 files changed, 225 insertions(+), 27 deletions(-)

-- 
2.4.3


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 12:00   ` Andrew Cooper
  2015-11-16 16:58   ` Wei Liu
  2015-11-16 10:31 ` [PATCH 02/10] x86/hvm: pkeys, add pku support for x86_capability Huaitong Han
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds pkeys support for cpuid handling.

Pkeys hardware support is enumerated by CPUID.7.0.ECX[3]:PKU; software support
is CPUID.7.0.ECX[4]:OSPKE, which reflects the setting of CR4.PKE.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
index c3ddc80..f6a9778 100644
--- a/tools/libxc/xc_cpufeature.h
+++ b/tools/libxc/xc_cpufeature.h
@@ -141,5 +141,7 @@
 #define X86_FEATURE_ADX         19 /* ADCX, ADOX instructions */
 #define X86_FEATURE_SMAP        20 /* Supervisor Mode Access Protection */
 
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
+#define X86_FEATURE_PKU     3
 
 #endif /* __LIBXC_CPUFEATURE_H */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index e146a3e..34bb964 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -367,9 +367,11 @@ static void xc_cpuid_hvm_policy(
                         bitmaskof(X86_FEATURE_ADX)  |
                         bitmaskof(X86_FEATURE_SMAP) |
                         bitmaskof(X86_FEATURE_FSGSBASE));
+            regs[2] &= bitmaskof(X86_FEATURE_PKU);
         } else
-            regs[1] = 0;
-        regs[0] = regs[2] = regs[3] = 0;
+            regs[1] = regs[2] = 0;
+
+        regs[0] = regs[3] = 0;
         break;
 
     case 0x0000000d:
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 615fa89..66917ff 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4518,6 +4518,12 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         /* Don't expose INVPCID to non-hap hvm. */
         if ( (count == 0) && !hap_enabled(d) )
             *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+
+        if ( (count == 0) && !(cpu_has_pku && hap_enabled(d)) )
+            *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
+        if ( (count == 0) && cpu_has_pku )
+            *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
+                     cpufeat_mask(X86_FEATURE_OSPKE) : 0;
         break;
     case 0xb:
         /* Fix the x2APIC identifier. */
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 9a01563..3c3b95f 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -154,6 +154,10 @@
 #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
 #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
 
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 8 */
+#define X86_FEATURE_PKU	(8*32+ 3) /* Protection Keys for Userspace */
+#define X86_FEATURE_OSPKE	(8*32+ 4) /* OS Protection Keys Enable */
+
 #if !defined(__ASSEMBLY__) && !defined(X86_FEATURES_ONLY)
 #include <xen/bitops.h>
 
@@ -193,6 +197,7 @@
 
 #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
 #define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
+#define cpu_has_pku             boot_cpu_has(X86_FEATURE_PKU)
 #define cpu_has_fpu_sel         (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
 
 #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD) \
-- 
2.4.3


* [PATCH 02/10] x86/hvm: pkeys, add pku support for x86_capability
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
  2015-11-16 10:31 ` [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 13:35   ` Andrew Cooper
  2015-11-16 10:31 ` [PATCH 03/10] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds pku support for x86_capability.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 35ef21b..04bf4fb 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -211,7 +211,7 @@ static void __init early_cpu_detect(void)
 
 static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 {
-	u32 tfms, capability, excap, ebx;
+	u32 tfms, capability, excap, ebx, ecx;
 
 	/* Get vendor name */
 	cpuid(0x00000000, &c->cpuid_level,
@@ -258,8 +258,9 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 	/* Intel-defined flags: level 0x00000007 */
 	if ( c->cpuid_level >= 0x00000007 ) {
 		u32 dummy;
-		cpuid_count(0x00000007, 0, &dummy, &ebx, &dummy, &dummy);
+		cpuid_count(0x00000007, 0, &dummy, &ebx, &ecx, &dummy);
 		c->x86_capability[X86_FEATURE_FSGSBASE / 32] = ebx;
+		c->x86_capability[X86_FEATURE_PKU / 32] = ecx;
 	}
 }
 
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 3c3b95f..b2899e3 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -9,7 +9,7 @@
 #define __ASM_I386_CPUFEATURE_H
 #endif
 
-#define NCAPINTS	8	/* N 32-bit words worth of info */
+#define NCAPINTS	9	/* N 32-bit words worth of info */
 
 /* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
 #define X86_FEATURE_FPU		(0*32+ 0) /* Onboard FPU */
-- 
2.4.3


* [PATCH 03/10] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
  2015-11-16 10:31 ` [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
  2015-11-16 10:31 ` [PATCH 02/10] x86/hvm: pkeys, add pku support for x86_capability Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 13:56   ` Andrew Cooper
  2015-11-16 10:31 ` [PATCH 04/10] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds the flag to enable Memory Protection Keys.
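
For example (illustrative only, not part of the code change; the exact
bootloader configuration will vary), protection keys can be turned off again
by appending the new option to the hypervisor command line:

    pku=0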

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index a565c1b..0ded4bf 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1303,6 +1303,13 @@ Flag to enable Supervisor Mode Execution Protection
 
 Flag to enable Supervisor Mode Access Prevention
 
+### pku
+> `= <boolean>>`
+
+> Default: `true`
+
+Flag to enable Memory Protection Keys
+
 ### snb\_igd\_quirk
 > `= <boolean> | cap | <integer>`
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 3946e4c..c1f924e 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -67,6 +67,10 @@ invbool_param("smep", disable_smep);
 static bool_t __initdata disable_smap;
 invbool_param("smap", disable_smap);
 
+/* pku: Enable/disable Memory Protection Keys (default on). */
+static bool_t __initdata disable_pku;
+invbool_param("pku", disable_pku);
+
 /* Boot dom0 in pvh mode */
 static bool_t __initdata opt_dom0pvh;
 boolean_param("dom0pvh", opt_dom0pvh);
@@ -1304,6 +1308,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     if ( cpu_has_smap )
         set_in_cr4(X86_CR4_SMAP);
 
+    if ( disable_pku )
+        setup_clear_cpu_cap(X86_FEATURE_PKU);
+
     if ( cpu_has_fsgsbase )
         set_in_cr4(X86_CR4_FSGSBASE);
 
-- 
2.4.3


* [PATCH 04/10] x86/hvm: pkeys, add pkeys support when setting CR4
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (2 preceding siblings ...)
  2015-11-16 10:31 ` [PATCH 03/10] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 14:02   ` Andrew Cooper
  2015-11-20  1:16   ` Wu, Feng
  2015-11-16 10:31 ` [PATCH 05/10] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
                   ` (6 subsequent siblings)
  10 siblings, 2 replies; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds pkeys support when setting CR4.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 66917ff..953047f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1911,6 +1911,7 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
         leaf1_edx = boot_cpu_data.x86_capability[X86_FEATURE_VME / 32];
         leaf1_ecx = boot_cpu_data.x86_capability[X86_FEATURE_PCID / 32];
         leaf7_0_ebx = boot_cpu_data.x86_capability[X86_FEATURE_FSGSBASE / 32];
+        leaf7_0_ecx = boot_cpu_data.x86_capability[X86_FEATURE_PKU / 32];
     }
 
     return ~(unsigned long)
@@ -1946,7 +1947,9 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
              (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ?
               X86_CR4_SMEP : 0) |
              (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ?
-              X86_CR4_SMAP : 0));
+              X86_CR4_SMAP : 0) |
+             (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ?
+              X86_CR4_PKE : 0));
 }
 
 static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index c1f924e..8101a1b 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1310,6 +1310,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     if ( disable_pku )
         setup_clear_cpu_cap(X86_FEATURE_PKU);
+    if ( cpu_has_pku )
+        set_in_cr4(X86_CR4_PKE);
 
     if ( cpu_has_fsgsbase )
         set_in_cr4(X86_CR4_FSGSBASE);
-- 
2.4.3


* [PATCH 05/10] x86/hvm: pkeys, disable pkeys for guests in non-paging mode
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (3 preceding siblings ...)
  2015-11-16 10:31 ` [PATCH 04/10] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 14:03   ` Andrew Cooper
  2015-11-16 10:31 ` [PATCH 06/10] x86/hvm: pkeys, add functions to get pkeys value from PTE Huaitong Han
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch disables pkeys for guests in non-paging mode. Hardware disables
pkeys when the CPU is in non-paging mode, but Xen always uses paging mode to
emulate guest non-paging mode; to emulate this behavior, pkeys need to be
manually disabled when the guest switches to non-paging mode.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2582cdd..bc9c4b0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1367,12 +1367,13 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
         if ( !hvm_paging_enabled(v) )
         {
             /*
-             * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware.
-             * However Xen always uses paging mode to emulate guest non-paging
-             * mode. To emulate this behavior, SMEP/SMAP needs to be manually
-             * disabled when guest VCPU is in non-paging mode.
+             * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
+             * hardware. However Xen always uses paging mode to emulate guest
+             * non-paging mode. To emulate this behavior, SMEP/SMAP/PKU needs
+             * to be manually disabled when guest VCPU is in non-paging mode.
              */
-            v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP);
+            v->arch.hvm_vcpu.hw_cr[4] &=
+                ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
         }
         __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
         break;
-- 
2.4.3


* [PATCH 06/10] x86/hvm: pkeys, add functions to get pkeys value from PTE
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (4 preceding siblings ...)
  2015-11-16 10:31 ` [PATCH 05/10] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 14:16   ` Andrew Cooper
  2015-11-16 10:31 ` [PATCH 07/10] x86/hvm: pkeys, add functions to support PKRU access/write Huaitong Han
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds functions to get pkeys value from PTE.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index 87b3341..1cdbfc8 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -93,6 +93,13 @@
 #define l3e_get_flags(x)           (get_pte_flags((x).l3))
 #define l4e_get_flags(x)           (get_pte_flags((x).l4))
 
+/* Get pte pkeys (unsigned int). */
+#define l1e_get_pkeys(x)           (get_pte_pkeys((x).l1))
+#define l2e_get_pkeys(x)           (get_pte_pkeys((x).l2))
+#define l3e_get_pkeys(x)           (get_pte_pkeys((x).l3))
+#define l4e_get_pkeys(x)           (get_pte_pkeys((x).l4))
+
+
 /* Construct an empty pte. */
 #define l1e_empty()                ((l1_pgentry_t) { 0 })
 #define l2e_empty()                ((l2_pgentry_t) { 0 })
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 19ab4d0..03418ba 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -134,6 +134,18 @@ typedef l4_pgentry_t root_pgentry_t;
 #define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
 #define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
 
+/*
+* Protection keys define a new 4-bit protection key field
+* (PKEY) in bits 62:59 of leaf entries of the page tables.
+* This corresponds to bit 22:19 of a 24-bit flags.
+*/
+#define _PAGE_PKEY_BIT0 19       /* Protection Keys, bit 1/4 */
+#define _PAGE_PKEY_BIT1 20       /* Protection Keys, bit 2/4 */
+#define _PAGE_PKEY_BIT2 21       /* Protection Keys, bit 3/4 */
+#define _PAGE_PKEY_BIT3 22       /* Protection Keys, bit 4/4 */
+
+#define get_pte_pkeys(x) ((int)(get_pte_flags(x) >> _PAGE_PKEY_BIT0) & 0xF)
+
 /* Bit 23 of a 24-bit flag mask. This corresponds to bit 63 of a pte.*/
 #define _PAGE_NX_BIT (1U<<23)
 
-- 
2.4.3


* [PATCH 07/10] x86/hvm: pkeys, add functions to support PKRU access/write
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (5 preceding siblings ...)
  2015-11-16 10:31 ` [PATCH 06/10] x86/hvm: pkeys, add functions to get pkeys value from PTE Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 15:09   ` Andrew Cooper
  2015-11-16 10:31 ` [PATCH 08/10] x86/hvm: pkeys, add pkeys support for do_page_fault Huaitong Han
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds functions to support PKRU access/write.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index f507f5e..427eb84 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -336,6 +336,44 @@ static inline void write_cr4(unsigned long val)
     asm volatile ( "mov %0,%%cr4" : : "r" (val) );
 }
 
+static inline unsigned int read_pkru(void)
+{
+    unsigned int eax, edx;
+    unsigned int ecx = 0;
+    unsigned int pkru;
+
+    asm volatile(".byte 0x0f,0x01,0xee\n\t"
+                 : "=a" (eax), "=d" (edx)
+                 : "c" (ecx));
+    pkru = eax;
+    return pkru;
+}
+
+static inline void write_pkru(unsigned int pkru)
+{
+    unsigned int eax = pkru;
+    unsigned int ecx = 0;
+    unsigned int edx = 0;
+
+    asm volatile(".byte 0x0f,0x01,0xef\n\t"
+                 : : "a" (eax), "c" (ecx), "d" (edx));
+}
+
+/* macros for pkru */
+#define PKRU_READ  0
+#define PKRU_WRITE 1
+#define PKRU_ATTRS 2
+
+/*
+ * PKRU is a 32-bit register holding two attribute bits (AD, WD) for each
+ * of the 16 protection keys, so bit (pte_pkeys * PKRU_ATTRS + PKRU_READ
+ * or PKRU_WRITE) is the AD or WD attribute of the key taken from the pte.
+ */
+#define READ_PKRU_AD(x) ((read_pkru() >> (x * PKRU_ATTRS + PKRU_READ)) & 1)
+#define READ_PKRU_WD(x) ((read_pkru() >> (x * PKRU_ATTRS + PKRU_WRITE)) & 1)
+
+
+
 /* Clear and set 'TS' bit respectively */
 static inline void clts(void) 
 {
-- 
2.4.3


* [PATCH 08/10] x86/hvm: pkeys, add pkeys support for do_page_fault
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (6 preceding siblings ...)
  2015-11-16 10:31 ` [PATCH 07/10] x86/hvm: pkeys, add functions to support PKRU access/write Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 15:25   ` Andrew Cooper
  2015-11-16 10:31 ` [PATCH 09/10] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds pkeys support for do_page_fault.

The protection-keys architecture defines a new status bit in the PFEC: PFEC.PK
(bit 5) is set to 1 if and only if protection keys block the access.

Protection keys block an access and induce a page fault if and only if
1. Protection keys are enabled (CR4.PKE=1 and EFER.LMA=1), and
2. The page has a valid translation (the page is present with no reserved-bit
   violations), and
3. The access is not an instruction fetch, and
4. The access is to a user page, and
5. At least one of the following restrictions applies:
    -- The access is a data read or data write and AD=1
    -- The access is a data write and WD=1 and either CR0.WP=1 or (CR0.WP=0
       and it is a user access)
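
Condensed, the above amounts to a predicate along the following lines (an
illustrative sketch only; the parameter names simply stand for the
corresponding PFEC bits, control-register bits and the PKRU attributes of the
page's protection key):

    static int pk_violation(int cr4_pke, int efer_lma, int present, int rsvd,
                            int fetch, int user_page, int write,
                            int user_access, int cr0_wp, int ad, int wd)
    {
        return cr4_pke && efer_lma &&      /* 1. protection keys enabled     */
               present && !rsvd &&         /* 2. valid translation           */
               !fetch &&                   /* 3. not an instruction fetch    */
               user_page &&                /* 4. access to a user page       */
               (ad ||                      /* 5. AD=1, or ...                */
                (wd && write && (cr0_wp || user_access)));
    }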

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 9f5a6c6..73abb3b 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1287,7 +1287,7 @@ enum pf_type {
     spurious_fault
 };
 
-static enum pf_type __page_fault_type(
+static enum pf_type __page_fault_type(struct vcpu *vcpu,
     unsigned long addr, const struct cpu_user_regs *regs)
 {
     unsigned long mfn, cr3 = read_cr3();
@@ -1295,7 +1295,7 @@ static enum pf_type __page_fault_type(
     l3_pgentry_t l3e, *l3t;
     l2_pgentry_t l2e, *l2t;
     l1_pgentry_t l1e, *l1t;
-    unsigned int required_flags, disallowed_flags, page_user;
+    unsigned int required_flags, disallowed_flags, page_user, pte_pkeys;
     unsigned int error_code = regs->error_code;
 
     /*
@@ -1340,6 +1340,7 @@ static enum pf_type __page_fault_type(
     if ( ((l3e_get_flags(l3e) & required_flags) != required_flags) ||
          (l3e_get_flags(l3e) & disallowed_flags) )
         return real_fault;
+    pte_pkeys = l3e_get_pkeys(l3e);
     page_user &= l3e_get_flags(l3e);
     if ( l3e_get_flags(l3e) & _PAGE_PSE )
         goto leaf;
@@ -1351,6 +1352,7 @@ static enum pf_type __page_fault_type(
     if ( ((l2e_get_flags(l2e) & required_flags) != required_flags) ||
          (l2e_get_flags(l2e) & disallowed_flags) )
         return real_fault;
+    pte_pkeys = l2e_get_pkeys(l2e);
     page_user &= l2e_get_flags(l2e);
     if ( l2e_get_flags(l2e) & _PAGE_PSE )
         goto leaf;
@@ -1362,12 +1364,22 @@ static enum pf_type __page_fault_type(
     if ( ((l1e_get_flags(l1e) & required_flags) != required_flags) ||
          (l1e_get_flags(l1e) & disallowed_flags) )
         return real_fault;
+    pte_pkeys = l1e_get_pkeys(l1e);
     page_user &= l1e_get_flags(l1e);
 
 leaf:
     if ( page_user )
     {
         unsigned long cr4 = read_cr4();
+        unsigned int ff, wf, uf, rsvdf, pkuf;
+        unsigned int pkru_ad, pkru_wd;
+
+        uf = error_code & PFEC_user_mode;
+        wf = error_code & PFEC_write_access;
+        rsvdf = error_code & PFEC_reserved_bit;
+        ff = error_code & PFEC_insn_fetch;
+        pkuf = error_code & PFEC_protection_key;
+
         /*
          * Supervisor Mode Execution Prevention (SMEP):
          * Disallow supervisor execution from user-accessible mappings
@@ -1386,15 +1398,35 @@ leaf:
          *   - CPL=3 or X86_EFLAGS_AC is clear
          *   - Page fault in kernel mode
          */
-        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode) &&
+        if ( (cr4 & X86_CR4_SMAP) && !uf &&
              (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
             return smap_fault;
+        /*
+         * PKU: additional mechanism by which paging controls access to
+         * user-mode addresses based on the value in the PKRU register.
+         * A fault is considered a PKU violation if all of the following
+         * conditions are true:
+         * 1. CR4.PKE=1.
+         * 2. EFER.LMA=1.
+         * 3. The page is present with no reserved bit violations.
+         * 4. The access is not an instruction fetch.
+         * 5. The access is to a user page.
+         * 6. PKRU.AD=1,
+         *    or the access is a data write and PKRU.WD=1 and either
+         *    CR0.WP=1 or it is a user access.
+         */
+        pkru_ad = READ_PKRU_AD(pte_pkeys);
+        pkru_wd = READ_PKRU_WD(pte_pkeys);
+        if ( pkuf && (cr4 & X86_CR4_PKE) && hvm_long_mode_enabled(vcpu) &&
+             !rsvdf && !ff && (pkru_ad ||
+              (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf))) )
+            return real_fault;
     }
 
     return spurious_fault;
 }
 
-static enum pf_type spurious_page_fault(
+static enum pf_type spurious_page_fault(struct vcpu *vcpu,
     unsigned long addr, const struct cpu_user_regs *regs)
 {
     unsigned long flags;
@@ -1405,7 +1437,7 @@ static enum pf_type spurious_page_fault(
      * page tables from becoming invalid under our feet during the walk.
      */
     local_irq_save(flags);
-    pf_type = __page_fault_type(addr, regs);
+    pf_type = __page_fault_type(vcpu, addr, regs);
     local_irq_restore(flags);
 
     return pf_type;
@@ -1479,6 +1511,7 @@ static int fixup_page_fault(unsigned long addr, struct cpu_user_regs *regs)
  *  Bit 2: User mode (=1) ; Supervisor mode (=0)
  *  Bit 3: Reserved bit violation
  *  Bit 4: Instruction fetch
+ *  Bit 5: Protection-key violations
  */
 void do_page_fault(struct cpu_user_regs *regs)
 {
@@ -1500,7 +1533,7 @@ void do_page_fault(struct cpu_user_regs *regs)
 
     if ( unlikely(!guest_mode(regs)) )
     {
-        pf_type = spurious_page_fault(addr, regs);
+        pf_type = spurious_page_fault(current, addr, regs);
         if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
         {
             console_start_sync();
@@ -1533,7 +1566,7 @@ void do_page_fault(struct cpu_user_regs *regs)
 
     if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
     {
-        pf_type = spurious_page_fault(addr, regs);
+        pf_type = spurious_page_fault(current, addr, regs);
         if ( (pf_type == smep_fault) || (pf_type == smap_fault))
         {
             printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 427eb84..e8090fb 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -135,13 +135,14 @@
 #define TF_kernel_mode         (1<<_TF_kernel_mode)
 
 /* #PF error code values. */
-#define PFEC_page_present   (1U<<0)
-#define PFEC_write_access   (1U<<1)
-#define PFEC_user_mode      (1U<<2)
-#define PFEC_reserved_bit   (1U<<3)
-#define PFEC_insn_fetch     (1U<<4)
-#define PFEC_page_paged     (1U<<5)
-#define PFEC_page_shared    (1U<<6)
+#define PFEC_page_present       (1U<<0)
+#define PFEC_write_access       (1U<<1)
+#define PFEC_user_mode          (1U<<2)
+#define PFEC_reserved_bit       (1U<<3)
+#define PFEC_insn_fetch         (1U<<4)
+#define PFEC_protection_key     (1U<<5)
+#define PFEC_page_paged         (1U<<6)
+#define PFEC_page_shared        (1U<<7)
 
 #define XEN_MINIMAL_CR4 (X86_CR4_PGE | X86_CR4_PAE)
 
-- 
2.4.3


* [PATCH 09/10] x86/hvm: pkeys, add pkeys support for guest_walk_tables
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (7 preceding siblings ...)
  2015-11-16 10:31 ` [PATCH 08/10] x86/hvm: pkeys, add pkeys support for do_page_fault Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 16:52   ` Andrew Cooper
  2015-11-16 16:59   ` Andrew Cooper
  2015-11-16 10:31 ` [PATCH 10/10] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
  2015-11-16 17:45 ` [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Andrew Cooper
  10 siblings, 2 replies; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds pkeys support for guest_walk_tables.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 773454d..7a7ae96 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -124,6 +124,46 @@ void *map_domain_gfn(struct p2m_domain *p2m, gfn_t gfn, mfn_t *mfn,
     return map;
 }
 
+#if GUEST_PAGING_LEVELS >= 4
+uint32_t leaf_pte_pkeys_check(struct vcpu *vcpu, uint32_t pfec,
+                uint32_t pte_access, uint32_t pte_pkeys)
+{
+    unsigned int pkru_ad, pkru_wd;
+    unsigned int ff, wf, uf, rsvdf, pkuf;
+
+    uf = pfec & PFEC_user_mode;
+    wf = pfec & PFEC_write_access;
+    rsvdf = pfec & PFEC_reserved_bit;
+    ff = pfec & PFEC_insn_fetch;
+    pkuf = pfec & PFEC_protection_key;
+
+    if ( !pkuf )
+        return 0;
+
+    /*
+     * PKU: additional mechanism by which paging controls access to
+     * user-mode addresses based on the value in the PKRU register.
+     * A fault is considered a PKU violation if all of the following
+     * conditions are true:
+     * 1. CR4.PKE=1.
+     * 2. EFER.LMA=1.
+     * 3. The page is present with no reserved bit violations.
+     * 4. The access is not an instruction fetch.
+     * 5. The access is to a user page.
+     * 6. PKRU.AD=1,
+     *    or the access is a data write and PKRU.WD=1 and either
+     *    CR0.WP=1 or it is a user access.
+     */
+    pkru_ad = READ_PKRU_AD(pte_pkeys);
+    pkru_wd = READ_PKRU_WD(pte_pkeys);
+    if ( hvm_pku_enabled(vcpu) && hvm_long_mode_enabled(vcpu) &&
+         !rsvdf && !ff && (pkru_ad ||
+          (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf))) )
+        return 1;
+
+    return 0;
+}
+#endif
 
 /* Walk the guest pagetables, after the manner of a hardware walker. */
 /* Because the walk is essentially random, it can cause a deadlock 
@@ -141,6 +181,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
     guest_l3e_t *l3p = NULL;
     guest_l4e_t *l4p;
+    uint32_t pkeys;
 #endif
     uint32_t gflags, mflags, iflags, rc = 0;
     bool_t smep = 0, smap = 0;
@@ -225,6 +266,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
         goto out;
     /* Get the l3e and check its flags*/
     gw->l3e = l3p[guest_l3_table_offset(va)];
+    pkeys = guest_l3e_get_pkeys(gw->l3e);
     gflags = guest_l3e_get_flags(gw->l3e) ^ iflags;
     if ( !(gflags & _PAGE_PRESENT) ) {
         rc |= _PAGE_PRESENT;
@@ -234,6 +276,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
     
     pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v); 
 
+    if ( pse1G && leaf_pte_pkeys_check(v, pfec, gflags, pkeys) )
+        rc |= _PAGE_PK_BIT;
+
     if ( pse1G )
     {
         /* Generate a fake l1 table entry so callers don't all 
@@ -295,7 +340,6 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
     gw->l2e = l2p[guest_l2_table_offset(va)];
 
 #endif /* All levels... */
-
     gflags = guest_l2e_get_flags(gw->l2e) ^ iflags;
     if ( !(gflags & _PAGE_PRESENT) ) {
         rc |= _PAGE_PRESENT;
@@ -305,6 +349,12 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 
     pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
 
+#if GUEST_PAGING_LEVELS >= 4
+    pkeys = guest_l2e_get_pkeys(gw->l2e);
+    if ( pse2M && leaf_pte_pkeys_check(v, pfec, gflags, pkeys) )
+        rc |= _PAGE_PK_BIT;
+#endif
+
     if ( pse2M )
     {
         /* Special case: this guest VA is in a PSE superpage, so there's
@@ -365,6 +415,11 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
             goto out;
         }
         rc |= ((gflags & mflags) ^ mflags);
+#if GUEST_PAGING_LEVELS >= 4
+        pkeys = guest_l1e_get_pkeys(gw->l1e);
+        if ( leaf_pte_pkeys_check(v, pfec, gflags, pkeys) )
+            rc |= _PAGE_PK_BIT;
+#endif
     }
 
 #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index f8a0d76..1c0f050 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -154,6 +154,17 @@ static inline u32 guest_l4e_get_flags(guest_l4e_t gl4e)
 { return l4e_get_flags(gl4e); }
 #endif
 
+static inline u32 guest_l1e_get_pkeys(guest_l1e_t gl1e)
+{ return l1e_get_pkeys(gl1e); }
+static inline u32 guest_l2e_get_pkeys(guest_l2e_t gl2e)
+{ return l2e_get_pkeys(gl2e); }
+static inline u32 guest_l3e_get_pkeys(guest_l3e_t gl3e)
+{ return l3e_get_pkeys(gl3e); }
+#if GUEST_PAGING_LEVELS >= 4
+static inline u32 guest_l4e_get_pkeys(guest_l4e_t gl4e)
+{ return l4e_get_pkeys(gl4e); }
+#endif
+
 static inline guest_l1e_t guest_l1e_from_gfn(gfn_t gfn, u32 flags)
 { return l1e_from_pfn(gfn_x(gfn), flags); }
 static inline guest_l2e_t guest_l2e_from_gfn(gfn_t gfn, u32 flags)
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 68b216c..e421a9d 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -273,6 +273,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
     (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMAP))
 #define hvm_nx_enabled(v) \
     (!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
+#define hvm_pku_enabled(v) \
+    (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE))
 
 /* Can we use superpages in the HAP p2m table? */
 #define hvm_hap_has_1gb(d) \
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 03418ba..7bb5d2d 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -146,6 +146,8 @@ typedef l4_pgentry_t root_pgentry_t;
 
 #define get_pte_pkeys(x) ((int)(get_pte_flags(x) >> _PAGE_PKEY_BIT0) & 0xF)
 
+#define _PAGE_PK_BIT (1U<<_PAGE_PKEY_BIT0)
+
 /* Bit 23 of a 24-bit flag mask. This corresponds to bit 63 of a pte.*/
 #define _PAGE_NX_BIT (1U<<23)
 
-- 
2.4.3


* [PATCH 10/10] x86/hvm: pkeys, add xstate support for pkeys
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (8 preceding siblings ...)
  2015-11-16 10:31 ` [PATCH 09/10] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2015-11-16 10:31 ` Huaitong Han
  2015-11-16 16:52   ` Andrew Cooper
  2015-11-16 17:45 ` [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Andrew Cooper
  10 siblings, 1 reply; 36+ messages in thread
From: Huaitong Han @ 2015-11-16 10:31 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds xstate support for pkeys.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>

diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 4c690db..5674f3e 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -33,13 +33,14 @@
 #define XSTATE_OPMASK  (1ULL << 5)
 #define XSTATE_ZMM     (1ULL << 6)
 #define XSTATE_HI_ZMM  (1ULL << 7)
+#define XSTATE_PKRU    (1ULL << 9)
 #define XSTATE_LWP     (1ULL << 62) /* AMD lightweight profiling */
 #define XSTATE_FP_SSE  (XSTATE_FP | XSTATE_SSE)
 #define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
                         XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
 
 #define XSTATE_ALL     (~(1ULL << 63))
-#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
+#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | XSTATE_PKRU)
 #define XSTATE_LAZY    (XSTATE_ALL & ~XSTATE_NONLAZY)
 
 extern u64 xfeature_mask;
-- 
2.4.3


* Re: [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-11-16 10:31 ` [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
@ 2015-11-16 12:00   ` Andrew Cooper
  2015-11-19 14:39     ` Wu, Feng
  2015-11-16 16:58   ` Wei Liu
  1 sibling, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 12:00 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch adds pkeys support for cpuid handling.
>
> Pkeys hardware support is enumerated by CPUID.7.0.ECX[3]:PKU; software support
> is CPUID.7.0.ECX[4]:OSPKE, which reflects the setting of CR4.PKE.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
>
> diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
> index c3ddc80..f6a9778 100644
> --- a/tools/libxc/xc_cpufeature.h
> +++ b/tools/libxc/xc_cpufeature.h
> @@ -141,5 +141,7 @@
>  #define X86_FEATURE_ADX         19 /* ADCX, ADOX instructions */
>  #define X86_FEATURE_SMAP        20 /* Supervisor Mode Access Protection */
>  
> +/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
> +#define X86_FEATURE_PKU     3
>  
>  #endif /* __LIBXC_CPUFEATURE_H */
> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> index e146a3e..34bb964 100644
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -367,9 +367,11 @@ static void xc_cpuid_hvm_policy(
>                          bitmaskof(X86_FEATURE_ADX)  |
>                          bitmaskof(X86_FEATURE_SMAP) |
>                          bitmaskof(X86_FEATURE_FSGSBASE));
> +            regs[2] &= bitmaskof(X86_FEATURE_PKU);
>          } else
> -            regs[1] = 0;
> -        regs[0] = regs[2] = regs[3] = 0;
> +            regs[1] = regs[2] = 0;
> +
> +        regs[0] = regs[3] = 0;
>          break;
>  
>      case 0x0000000d:
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 615fa89..66917ff 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4518,6 +4518,12 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>          /* Don't expose INVPCID to non-hap hvm. */
>          if ( (count == 0) && !hap_enabled(d) )
>              *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> +
> +        if ( (count == 0) && !(cpu_has_pku && hap_enabled(d)) )
> +            *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> +        if ( (count == 0) && cpu_has_pku )
> +            *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
> +                     cpufeat_mask(X86_FEATURE_OSPKE) : 0;

This logic (all being gated on count == 0 && !hap_enabled() ) should
extend the INVPCID if() statement.

Setting OSPKE should be gated on *ecx having PKU and guest CR4 alone. 
As it currently stands, a guest could end up observing OSPKE but not PKU.
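
E.g. something along these lines (rough, untested sketch):

    if ( (count == 0) &&
         (*ecx & cpufeat_mask(X86_FEATURE_PKU)) &&
         (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) )
        *ecx |= cpufeat_mask(X86_FEATURE_OSPKE);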

>          break;
>      case 0xb:
>          /* Fix the x2APIC identifier. */
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index 9a01563..3c3b95f 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -154,6 +154,10 @@
>  #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
>  #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
>  
> +/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 8 */
> +#define X86_FEATURE_PKU	(8*32+ 3) /* Protection Keys for Userspace */
> +#define X86_FEATURE_OSPKE	(8*32+ 4) /* OS Protection Keys Enable */
> +
>  #if !defined(__ASSEMBLY__) && !defined(X86_FEATURES_ONLY)
>  #include <xen/bitops.h>
>  
> @@ -193,6 +197,7 @@
>  
>  #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
>  #define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
> +#define cpu_has_pku             boot_cpu_has(X86_FEATURE_PKU)

This read overflows c->x86_capability[], as you didn't bump NCAPINTS.

I see that you bump it in the following patch, but you must move that
hunk forwards to this patch.

~Andrew


* Re: [PATCH 02/10] x86/hvm: pkeys, add pku support for x86_capability
  2015-11-16 10:31 ` [PATCH 02/10] x86/hvm: pkeys, add pku support for x86_capability Huaitong Han
@ 2015-11-16 13:35   ` Andrew Cooper
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 13:35 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch adds pku support for x86_capability.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>

Please rebase your series (preferably on staging).  There are some
recent changes in this area.

As rebasing (and correcting patch 1) will drop this patch to a single
hunk, I would just fold it into patch 1.

~Andrew


* Re: [PATCH 03/10] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
  2015-11-16 10:31 ` [PATCH 03/10] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
@ 2015-11-16 13:56   ` Andrew Cooper
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 13:56 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch adds the flag to enable Memory Protection Keys.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index a565c1b..0ded4bf 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1303,6 +1303,13 @@ Flag to enable Supervisor Mode Execution Protection
>  
>  Flag to enable Supervisor Mode Access Prevention
>  
> +### pku

Options should be in alphabetical order please.

> +> `= <boolean>>`

Extra closing arrow.

> +
> +> Default: `true`
> +
> +Flag to enable Memory Protection Keys

I know there are a number of bad examples in this file, but please avoid
adding to the problem.

It would be useful to have a very brief description of what PKU is, and
which hardware it is available on.  See the 'psr' option as an example.

> +
>  ### snb\_igd\_quirk
>  > `= <boolean> | cap | <integer>`
>  
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 3946e4c..c1f924e 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -67,6 +67,10 @@ invbool_param("smep", disable_smep);
>  static bool_t __initdata disable_smap;
>  invbool_param("smap", disable_smap);
>  
> +/* pku: Enable/disable Memory Protection Keys (default on). */
> +static bool_t __initdata disable_pku;
> +invbool_param("pku", disable_pku);

I am going to submit a patch removing invbool_param(), as it is barely
used and adds unnecessary cognitive overhead.

Please use:

static bool_t __initdata opt_pku = 1;
boolean_param("pku", opt_pku);

instead.

~Andrew

> +
>  /* Boot dom0 in pvh mode */
>  static bool_t __initdata opt_dom0pvh;
>  boolean_param("dom0pvh", opt_dom0pvh);
> @@ -1304,6 +1308,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>      if ( cpu_has_smap )
>          set_in_cr4(X86_CR4_SMAP);
>  
> +    if ( disable_pku )
> +        setup_clear_cpu_cap(X86_FEATURE_PKU);
> +
>      if ( cpu_has_fsgsbase )
>          set_in_cr4(X86_CR4_FSGSBASE);
>  


* Re: [PATCH 04/10] x86/hvm: pkeys, add pkeys support when setting CR4
  2015-11-16 10:31 ` [PATCH 04/10] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
@ 2015-11-16 14:02   ` Andrew Cooper
  2015-11-20  1:16   ` Wu, Feng
  1 sibling, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 14:02 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch adds pkeys support when setting CR4
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>

Again, this patch is in need of a rebase (although it should be entirely
mechanical).

~Andrew

>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 66917ff..953047f 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1911,6 +1911,7 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
>          leaf1_edx = boot_cpu_data.x86_capability[X86_FEATURE_VME / 32];
>          leaf1_ecx = boot_cpu_data.x86_capability[X86_FEATURE_PCID / 32];
>          leaf7_0_ebx = boot_cpu_data.x86_capability[X86_FEATURE_FSGSBASE / 32];
> +        leaf7_0_ecx = boot_cpu_data.x86_capability[X86_FEATURE_PKU / 32];
>      }
>  
>      return ~(unsigned long)
> @@ -1946,7 +1947,9 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
>               (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ?
>                X86_CR4_SMEP : 0) |
>               (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ?
> -              X86_CR4_SMAP : 0));
> +              X86_CR4_SMAP : 0) |
> +             (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ?
> +              X86_CR4_PKE : 0));
>  }
>  
>  static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index c1f924e..8101a1b 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1310,6 +1310,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>  
>      if ( disable_pku )
>          setup_clear_cpu_cap(X86_FEATURE_PKU);
> +    if ( cpu_has_pku )
> +        set_in_cr4(X86_CR4_PKE);
>  
>      if ( cpu_has_fsgsbase )
>          set_in_cr4(X86_CR4_FSGSBASE);


* Re: [PATCH 05/10] x86/hvm: pkeys, disable pkeys for guests in non-paging mode
  2015-11-16 10:31 ` [PATCH 05/10] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
@ 2015-11-16 14:03   ` Andrew Cooper
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 14:03 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch disables pkeys for guests in non-paging mode. Hardware disables
> pkeys when the CPU is in non-paging mode, but Xen always uses paging mode to
> emulate guest non-paging mode; to emulate this behavior, pkeys need to be
> manually disabled when the guest switches to non-paging mode.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH 06/10] x86/hvm: pkeys, add functions to get pkeys value from PTE
  2015-11-16 10:31 ` [PATCH 06/10] x86/hvm: pkeys, add functions to get pkeys value from PTE Huaitong Han
@ 2015-11-16 14:16   ` Andrew Cooper
  2015-11-16 14:42     ` Jan Beulich
  0 siblings, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 14:16 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch adds functions to get pkeys value from PTE.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
>
> diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
> index 87b3341..1cdbfc8 100644
> --- a/xen/include/asm-x86/page.h
> +++ b/xen/include/asm-x86/page.h
> @@ -93,6 +93,13 @@
>  #define l3e_get_flags(x)           (get_pte_flags((x).l3))
>  #define l4e_get_flags(x)           (get_pte_flags((x).l4))
>  
> +/* Get pte pkeys (unsigned int). */
> +#define l1e_get_pkeys(x)           (get_pte_pkeys((x).l1))
> +#define l2e_get_pkeys(x)           (get_pte_pkeys((x).l2))
> +#define l3e_get_pkeys(x)           (get_pte_pkeys((x).l3))
> +#define l4e_get_pkeys(x)           (get_pte_pkeys((x).l4))
> +
> +

Please only insert a single line here.

>  /* Construct an empty pte. */
>  #define l1e_empty()                ((l1_pgentry_t) { 0 })
>  #define l2e_empty()                ((l2_pgentry_t) { 0 })
> diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
> index 19ab4d0..03418ba 100644
> --- a/xen/include/asm-x86/x86_64/page.h
> +++ b/xen/include/asm-x86/x86_64/page.h
> @@ -134,6 +134,18 @@ typedef l4_pgentry_t root_pgentry_t;
>  #define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
>  #define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
>  
> +/*
> +* Protection keys define a new 4-bit protection key field
> +* (PKEY) in bits 62:59 of leaf entries of the page tables.
> +* This corresponds to bit 22:19 of a 24-bit flags.
> +*/

Please align all the *'s of the comment vertically (i.e. with a space on
the very left hand side)

> +#define _PAGE_PKEY_BIT0 19       /* Protection Keys, bit 1/4 */
> +#define _PAGE_PKEY_BIT1 20       /* Protection Keys, bit 2/4 */
> +#define _PAGE_PKEY_BIT2 21       /* Protection Keys, bit 3/4 */
> +#define _PAGE_PKEY_BIT3 22       /* Protection Keys, bit 4/4 */
> +
> +#define get_pte_pkeys(x) ((int)(get_pte_flags(x) >> _PAGE_PKEY_BIT0) & 0xF)

Please implemented these as masks, for consistency with the other
_PAGE_* entries.  I would recommend also having a _SHIFT and _MASK define.

> +
>  /* Bit 23 of a 24-bit flag mask. This corresponds to bit 63 of a pte.*/
>  #define _PAGE_NX_BIT (1U<<23)
>  

There is however another issue. Xen currently uses bit 62 for software
purposes, which is incompatible with enabling PKE.  _PAGE_GNTTAB needs
moving to a different, software-available bit.  I would recommend bit 13
of flags, adjacent to _PAGE_GUEST_KERNEL which is our other
software-used flag.

Also, the changes to _PAGE_GNTTAB must be ahead of patch 4 which enables
CR4.PKE for Xen.

~Andrew


* Re: [PATCH 06/10] x86/hvm: pkeys, add functions to get pkeys value from PTE
  2015-11-16 14:16   ` Andrew Cooper
@ 2015-11-16 14:42     ` Jan Beulich
  0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2015-11-16 14:42 UTC (permalink / raw)
  To: Andrew Cooper, ian.campbell, wei.liu2, george.dunlap,
	ian.jackson, stefano.stabellini, eddie.dong, Huaitong Han,
	jun.nakajima, kevin.tian, keir
  Cc: xen-devel

>>> On 16.11.15 at 15:16, <andrew.cooper3@citrix.com> wrote:
> On 16/11/15 10:31, Huaitong Han wrote:
>> +#define _PAGE_PKEY_BIT0 19       /* Protection Keys, bit 1/4 */
>> +#define _PAGE_PKEY_BIT1 20       /* Protection Keys, bit 2/4 */
>> +#define _PAGE_PKEY_BIT2 21       /* Protection Keys, bit 3/4 */
>> +#define _PAGE_PKEY_BIT3 22       /* Protection Keys, bit 4/4 */
>> +
>> +#define get_pte_pkeys(x) ((int)(get_pte_flags(x) >> _PAGE_PKEY_BIT0) & 0xF)
> 
> Please implemented these as masks, for consistency with the other
> _PAGE_* entries.  I would recommend also having a _SHIFT and _MASK define.

Since _SHIFT can always be calculated from _MASK, and since
we have MASK_{INSR,EXTR}(), I'd prefer just _MASK ones to
be added, unless this results in significantly less readable code.
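
E.g. (sketch only, using a single mask constant):

    #define _PAGE_PKEY_MASK  (0xFU << 19) /* flag bits 22:19 <=> pte bits 62:59 */

    #define get_pte_pkeys(x) MASK_EXTR(get_pte_flags(x), _PAGE_PKEY_MASK)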

>> +
>>  /* Bit 23 of a 24-bit flag mask. This corresponds to bit 63 of a pte.*/
>>  #define _PAGE_NX_BIT (1U<<23)
>>  
> 
> There is however another issue. Xen currently uses bit 62 for software
> purposes, which is incompatible with enabling PKE.  _PAGE_GNTTAB needs
> moving to a different, software-available bit.  I would recommend bit 13
> of flags, adjacent to _PAGE_GUEST_KERNEL which is our other
> software-used flag.

Can we in fact do that? _PAGE_GNTTAB is visible to PV guests, and
those guests need to avoid using that flag for their own purposes.

Jan


* Re: [PATCH 07/10] x86/hvm: pkeys, add functions to support PKRU access/write
  2015-11-16 10:31 ` [PATCH 07/10] x86/hvm: pkeys, add functions to support PKRU access/write Huaitong Han
@ 2015-11-16 15:09   ` Andrew Cooper
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 15:09 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch adds functions to support PKRU access/write
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
>
> diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
> index f507f5e..427eb84 100644
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -336,6 +336,44 @@ static inline void write_cr4(unsigned long val)
>      asm volatile ( "mov %0,%%cr4" : : "r" (val) );
>  }
>  
> +static inline unsigned int read_pkru(void)
> +{
> +    unsigned int eax, edx;
> +    unsigned int ecx = 0;
> +    unsigned int pkru;
> +
> +    asm volatile(".byte 0x0f,0x01,0xee\n\t"
> +                 : "=a" (eax), "=d" (edx)
> +                 : "c" (ecx));
> +    pkru = eax;
> +    return pkru;

This can be far more simple.

{
    unsigned int pkru;

    asm volatile (".byte 0x0f,0x01,0xee"
                  : "=a" (pkru) : "c" (0) : "dx");

    return pkru;
}

> +}
> +
> +static inline void write_pkru(unsigned int pkru)
> +{
> +    unsigned int eax = pkru;
> +    unsigned int ecx = 0;
> +    unsigned int edx = 0;
> +
> +    asm volatile(".byte 0x0f,0x01,0xef\n\t"
> +                 : : "a" (eax), "c" (ecx), "d" (edx));
> +}

I don't see any need for Xen to use WRPKRU, and this helper isn't used
anywhere in your series.  Please drop it (unless I have overlooked
something).

> +
> +/* macros for pkru */
> +#define PKRU_READ  0
> +#define PKRU_WRITE 1
> +#define PKRU_ATTRS 2
> +
> +/*
> +* PKRU defines 32 bits, there are 16 domains and 2 attribute bits per
> +* domain in pkru, pkeys is index to a defined domain, so the value of
> +* pte_pkeys * PKRU_ATTRS + R/W is offset of a defined domain attribute.
> +*/
> +#define READ_PKRU_AD(x) ((read_pkru() >> (x * PKRU_ATTRS + PKRU_READ)) & 1)
> +#define READ_PKRU_WD(x) ((read_pkru() >> (x * PKRU_ATTRS + PKRU_WRITE)) & 1)

Sorry, but no.  This hides expensive rdpkru instructions in innocuous code.

Instead, a function manipulating the keys should cache rdpkru once, and
use the following helpers:

static inline bool_t pkru_ad(unsigned int pkru, unsigned int key);
static inline bool_t pkru_wd(unsigned int pkru, unsigned int key);
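
A minimal sketch of such helpers, reusing the PKRU_ATTRS/PKRU_READ/PKRU_WRITE
constants from the hunk above (illustrative only, not the final implementation):

static inline bool_t pkru_ad(unsigned int pkru, unsigned int key)
{
    /* Access-disable bit for this protection key. */
    return (pkru >> (key * PKRU_ATTRS + PKRU_READ)) & 1;
}

static inline bool_t pkru_wd(unsigned int pkru, unsigned int key)
{
    /* Write-disable bit for this protection key. */
    return (pkru >> (key * PKRU_ATTRS + PKRU_WRITE)) & 1;
}

A caller would then read PKRU once ("pkru = read_pkru();") and test
pkru_ad()/pkru_wd() against each key it needs.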

~Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 08/10] x86/hvm: pkeys, add pkeys support for do_page_fault
  2015-11-16 10:31 ` [PATCH 08/10] x86/hvm: pkeys, add pkeys support for do_page_fault Huaitong Han
@ 2015-11-16 15:25   ` Andrew Cooper
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 15:25 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch adds pkeys support for do_page_fault.
>
> The protection-keys architecture defines a new status bit in the PFEC. PFEC.PK
> (bit 5) is set to 1 if and only if protection keys block the access.
>
> Protection keys block an access and induce a page fault if and only if
> 1.Protection keys are enabled (CR4.PKE=1 and EFER.LMA=1), and
> 2.The page has a valid translation (page is present with no reserved bit
>   violations), and
> 3.The access is not an instruction fetch, and
> 4.The access is to a user page, and
> 5.At least one of the following restrictions apply:
>     --The access is a data read or data write and AD=1
>     --The access is a data write and WD=1 and either CR0.WP=1 or (CR0.WP=0 and
>     it is a user access)
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
>
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 9f5a6c6..73abb3b 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1287,7 +1287,7 @@ enum pf_type {
>      spurious_fault
>  };
>  
> -static enum pf_type __page_fault_type(
> +static enum pf_type __page_fault_type(struct vcpu *vcpu,
>      unsigned long addr, const struct cpu_user_regs *regs)
>  {
>      unsigned long mfn, cr3 = read_cr3();
> @@ -1295,7 +1295,7 @@ static enum pf_type __page_fault_type(
>      l3_pgentry_t l3e, *l3t;
>      l2_pgentry_t l2e, *l2t;
>      l1_pgentry_t l1e, *l1t;
> -    unsigned int required_flags, disallowed_flags, page_user;
> +    unsigned int required_flags, disallowed_flags, page_user, pte_pkeys;
>      unsigned int error_code = regs->error_code;
>  
>      /*hvm_wp_enabled
>
> @@ -1340,6 +1340,7 @@ static enum pf_type __page_fault_type(
>      if ( ((l3e_get_flags(l3e) & required_flags) != required_flags) ||
>           (l3e_get_flags(l3e) & disallowed_flags) )
>          return real_fault;
> +    pte_pkeys = l3e_get_pkeys(l3e);
>      page_user &= l3e_get_flags(l3e);
>      if ( l3e_get_flags(l3e) & _PAGE_PSE )
>          goto leaf;
> @@ -1351,6 +1352,7 @@ static enum pf_type __page_fault_type(
>      if ( ((l2e_get_flags(l2e) & required_flags) != required_flags) ||
>           (l2e_get_flags(l2e) & disallowed_flags) )
>          return real_fault;
> +    pte_pkeys = l2e_get_pkeys(l2e);
>      page_user &= l2e_get_flags(l2e);
>      if ( l2e_get_flags(l2e) & _PAGE_PSE )
>          goto leaf;
> @@ -1362,12 +1364,22 @@ static enum pf_type __page_fault_type(
>      if ( ((l1e_get_flags(l1e) & required_flags) != required_flags) ||
>           (l1e_get_flags(l1e) & disallowed_flags) )
>          return real_fault;
> +    pte_pkeys = l1e_get_pkeys(l1e);
>      page_user &= l1e_get_flags(l1e);
>  
>  leaf:
>      if ( page_user )
>      {
>          unsigned long cr4 = read_cr4();
> +        unsigned int ff, wf, uf, rsvdf, pkuf;
> +        unsigned int pkru_ad, pkru_wd;
> +
> +        uf = error_code & PFEC_user_mode;
> +        wf = error_code & PFEC_write_access;
> +        rsvdf = error_code & PFEC_reserved_bit;
> +        ff = error_code & PFEC_insn_fetch;
> +        pkuf = error_code & PFEC_protection_key;

These should be bool_t's rather than unsigned int, but I don't really
see the point of breaking them all out.

> +
>          /*
>           * Supervisor Mode Execution Prevention (SMEP):
>           * Disallow supervisor execution from user-accessible mappings
> @@ -1386,15 +1398,35 @@ leaf:
>           *   - CPL=3 or X86_EFLAGS_AC is clear
>           *   - Page fault in kernel mode
>           */
> -        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode) &&
> +        if ( (cr4 & X86_CR4_SMAP) && !uf &&
>               (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
>              return smap_fault;
> +         /*
> +         * PKU:  additional mechanism by which the paging controls
> +         * access to user-mode addresses based on the value in the
> +         * PKRU register. A fault is considered as a PKU violation if all
> +         * of the following conditions are true:
> +         * 1.CR4_PKE=1.
> +         * 2.EFER_LMA=1.
> +         * 3.page is present with no reserved bit violations.
> +         * 4.the access is not an instruction fetch.
> +         * 5.the access is to a user page.
> +         * 6.PKRU.AD=1
> +         *       or The access is a data write and PKRU.WD=1
> +         *           and either CR0.WP=1 or it is a user access.
> +         */
> +         pkru_ad = READ_PKRU_AD(pte_pkeys);
> +         pkru_wd = READ_PKRU_AD(pte_pkeys);
> +         if ( pkuf && (cr4 & X86_CR4_PKE) && hvm_long_mode_enabled(vcpu) &&
> +             !rsvdf && !ff && (pkru_ad ||
> +             (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf))))
> +             return real_fault;

This has clearly never been tested on a non-Skylake processor.  The
rdpkru instruction must be after the CR4_PKE check, or it will kill Xen
with a #UD fault.

The hvm_long_mode_enabled() and hvm_wp_enabled() are not valid without a
has_hvm_container() check.  However, you are on a PV-only codepath so can
guarantee that the vcpu is not an hvm one (All HVM pagefaults are
handled via paging_fault()).  You can also guarantee that CR0.WP is 1,
and don't actually need to pass a vcpu pointer into here at all.
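
A sketch of what the corrected flow for this PV-only path could look like,
using the pkru_ad()/pkru_wd() helpers suggested for patch 7 and only reading
PKRU once CR4.PKE is known to be set (illustrative only; with CR0.WP
guaranteed to be 1, the WP/user-access clause drops out):

        if ( pkuf && (cr4 & X86_CR4_PKE) && !rsvdf && !ff )
        {
            unsigned int pkru = read_pkru(); /* safe: CR4.PKE is set */

            if ( pkru_ad(pkru, pte_pkeys) ||
                 (wf && pkru_wd(pkru, pte_pkeys)) )
                return real_fault;
        }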

>      }
>  
>      return spurious_fault;
>  }
>  
> -static enum pf_type spurious_page_fault(
> +static enum pf_type spurious_page_fault(struct vcpu *vcpu,
>      unsigned long addr, const struct cpu_user_regs *regs)
>  {
>      unsigned long flags;
> @@ -1405,7 +1437,7 @@ static enum pf_type spurious_page_fault(
>       * page tables from becoming invalid under our feet during the walk.
>       */
>      local_irq_save(flags);
> -    pf_type = __page_fault_type(addr, regs);
> +    pf_type = __page_fault_type(vcpu, addr, regs);
>      local_irq_restore(flags);
>  
>      return pf_type;
> @@ -1479,6 +1511,7 @@ static int fixup_page_fault(unsigned long addr, struct cpu_user_regs *regs)
>   *  Bit 2: User mode (=1) ; Supervisor mode (=0)
>   *  Bit 3: Reserved bit violation
>   *  Bit 4: Instruction fetch
> + *  Bit 5: Protection-key violations
>   */
>  void do_page_fault(struct cpu_user_regs *regs)
>  {
> @@ -1500,7 +1533,7 @@ void do_page_fault(struct cpu_user_regs *regs)
>  
>      if ( unlikely(!guest_mode(regs)) )
>      {
> -        pf_type = spurious_page_fault(addr, regs);
> +        pf_type = spurious_page_fault(current, addr, regs);
>          if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
>          {
>              console_start_sync();
> @@ -1533,7 +1566,7 @@ void do_page_fault(struct cpu_user_regs *regs)
>  
>      if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
>      {
> -        pf_type = spurious_page_fault(addr, regs);
> +        pf_type = spurious_page_fault(current, addr, regs);
>          if ( (pf_type == smep_fault) || (pf_type == smap_fault))
>          {
>              printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
> diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
> index 427eb84..e8090fb 100644
> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -135,13 +135,14 @@
>  #define TF_kernel_mode         (1<<_TF_kernel_mode)
>  
>  /* #PF error code values. */
> -#define PFEC_page_present   (1U<<0)
> -#define PFEC_write_access   (1U<<1)
> -#define PFEC_user_mode      (1U<<2)
> -#define PFEC_reserved_bit   (1U<<3)
> -#define PFEC_insn_fetch     (1U<<4)
> -#define PFEC_page_paged     (1U<<5)
> -#define PFEC_page_shared    (1U<<6)
> +#define PFEC_page_present       (1U<<0)
> +#define PFEC_write_access       (1U<<1)
> +#define PFEC_user_mode          (1U<<2)
> +#define PFEC_reserved_bit       (1U<<3)
> +#define PFEC_insn_fetch         (1U<<4)
> +#define PFEC_protection_key     (1U<<5)
> +#define PFEC_page_paged         (1U<<6)
> +#define PFEC_page_shared        (1U<<7)

Again, this needs a rebase, and you will find that PFEC_prot_key is
already included.

~Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 09/10] x86/hvm: pkeys, add pkeys support for guest_walk_tables
  2015-11-16 10:31 ` [PATCH 09/10] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2015-11-16 16:52   ` Andrew Cooper
  2015-11-16 16:59   ` Andrew Cooper
  1 sibling, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 16:52 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch adds pkeys support for guest_walk_tables.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
>
> diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
> index 773454d..7a7ae96 100644
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -124,6 +124,46 @@ void *map_domain_gfn(struct p2m_domain *p2m, gfn_t gfn, mfn_t *mfn,
>      return map;
>  }
>  
> +#if GUEST_PAGING_LEVELS >= 4
> +uint32_t leaf_pte_pkeys_check(struct vcpu *vcpu, uint32_t pfec,
> +                uint32_t pte_access, uint32_t pte_pkeys)
> +{
> +    unsigned int pkru_ad, pkru_wd;
> +    unsigned int ff, wf, uf, rsvdf, pkuf;
> +
> +    uf = pfec & PFEC_user_mode;
> +    wf = pfec & PFEC_write_access;
> +    rsvdf = pfec & PFEC_reserved_bit;
> +    ff = pfec & PFEC_insn_fetch;
> +    pkuf = pfec & PFEC_protection_key;
> +
> +    if (!pkuf)
> +        return 0;
> +
> +    /*
> +     * PKU:  additional mechanism by which the paging controls
> +    * access to user-mode addresses based on the value in the
> +    * PKRU register. A fault is considered as a PKU violation if all
> +    * of the following conditions are true:
> +    * 1.CR4_PKE=1.
> +    * 2.EFER_LMA=1.
> +    * 3.page is present with no reserved bit violations.
> +    * 4.the access is not an instruction fetch.
> +    * 5.the access is to a user page.
> +    * 6.PKRU.AD=1
> +    *       or The access is a data write and PKRU.WD=1
> +    *            and either CR0.WP=1 or it is a user access.
> +    */

Please fix the alignment of the comment.

> +    pkru_ad = READ_PKRU_AD(pte_pkeys);
> +    pkru_wd = READ_PKRU_AD(pte_pkeys);
> +    if ( hvm_pku_enabled(vcpu) && hvm_long_mode_enabled(vcpu) &&
> +        !rsvdf && !ff && (pkru_ad ||
> +        (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf))))
> +        return 1;

Same comments as patch 8 for the PV case.

> +
> +    return 0;
> +}
> +#endif
>  
>  /* Walk the guest pagetables, after the manner of a hardware walker. */
>  /* Because the walk is essentially random, it can cause a deadlock 
> @@ -141,6 +181,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
>      guest_l3e_t *l3p = NULL;
>      guest_l4e_t *l4p;
> +    uint32_t pkeys;
>  #endif
>      uint32_t gflags, mflags, iflags, rc = 0;
>      bool_t smep = 0, smap = 0;
> @@ -225,6 +266,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>          goto out;
>      /* Get the l3e and check its flags*/
>      gw->l3e = l3p[guest_l3_table_offset(va)];
> +    pkeys = guest_l3e_get_pkeys(gw->l3e);
>      gflags = guest_l3e_get_flags(gw->l3e) ^ iflags;
>      if ( !(gflags & _PAGE_PRESENT) ) {
>          rc |= _PAGE_PRESENT;
> @@ -234,6 +276,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>      
>      pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v); 
>  
> +    if (pse1G && leaf_pte_pkeys_check(v, pfec, gflags, pkeys))
> +        rc |= _PAGE_PK_BIT;
> +
>      if ( pse1G )
>      {
>          /* Generate a fake l1 table entry so callers don't all 
> @@ -295,7 +340,6 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>      gw->l2e = l2p[guest_l2_table_offset(va)];
>  
>  #endif /* All levels... */
> -

Spurious deletion.

>      gflags = guest_l2e_get_flags(gw->l2e) ^ iflags;
>      if ( !(gflags & _PAGE_PRESENT) ) {
>          rc |= _PAGE_PRESENT;
> @@ -305,6 +349,12 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  
>      pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
>  
> +#if GUEST_PAGING_LEVELS >= 4
> +    pkeys = guest_l2e_get_pkeys(gw->l2e);
> +    if (pse2M && leaf_pte_pkeys_check(v, pfec, gflags, pkeys))
> +        rc |= _PAGE_PK_BIT;
> +#endif
> +
>      if ( pse2M )
>      {
>          /* Special case: this guest VA is in a PSE superpage, so there's
> @@ -365,6 +415,11 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>              goto out;
>          }
>          rc |= ((gflags & mflags) ^ mflags);
> +#if GUEST_PAGING_LEVELS >= 4
> +        pkeys = guest_l1e_get_pkeys(gw->l1e);
> +        if (leaf_pte_pkeys_check(v, pfec, gflags, pkeys))
> +            rc |= _PAGE_PK_BIT;
> +#endif

Without modifying the caller's logic, setting _PAGE_PK_BIT in the return
value isn't going to get propagated to the guest correctly.

~Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 10/10] x86/hvm: pkeys, add xstate support for pkeys
  2015-11-16 10:31 ` [PATCH 10/10] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2015-11-16 16:52   ` Andrew Cooper
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 16:52 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> This patch adds xstate support for pkeys.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-11-16 10:31 ` [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
  2015-11-16 12:00   ` Andrew Cooper
@ 2015-11-16 16:58   ` Wei Liu
  1 sibling, 0 replies; 36+ messages in thread
From: Wei Liu @ 2015-11-16 16:58 UTC (permalink / raw)
  To: Huaitong Han
  Cc: kevin.tian, wei.liu2, jbeulich, stefano.stabellini,
	george.dunlap, andrew.cooper3, eddie.dong, xen-devel,
	jun.nakajima, keir, ian.jackson, ian.campbell

On Mon, Nov 16, 2015 at 06:31:48PM +0800, Huaitong Han wrote:
> This patch adds pkeys support for cpuid handling.
> 
> Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. Software support is
> CPUID.7.0.ECX[4]:OSPKE and it reflects the setting of CR4.PKE.
> 
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> 

None of the patches in your series contains a diffstat. Could you include
one in your patches in the future?

> diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
> index c3ddc80..f6a9778 100644
> --- a/tools/libxc/xc_cpufeature.h
> +++ b/tools/libxc/xc_cpufeature.h
> @@ -141,5 +141,7 @@
>  #define X86_FEATURE_ADX         19 /* ADCX, ADOX instructions */
>  #define X86_FEATURE_SMAP        20 /* Supervisor Mode Access Protection */
>  
> +/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
> +#define X86_FEATURE_PKU     3
>  
>  #endif /* __LIBXC_CPUFEATURE_H */
> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> index e146a3e..34bb964 100644
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -367,9 +367,11 @@ static void xc_cpuid_hvm_policy(
>                          bitmaskof(X86_FEATURE_ADX)  |
>                          bitmaskof(X86_FEATURE_SMAP) |
>                          bitmaskof(X86_FEATURE_FSGSBASE));
> +            regs[2] &= bitmaskof(X86_FEATURE_PKU);
>          } else
> -            regs[1] = 0;
> -        regs[0] = regs[2] = regs[3] = 0;
> +            regs[1] = regs[2] = 0;
> +
> +        regs[0] = regs[3] = 0;
>          break;

As far as I can tell this snippet is the only place toolstack code is
touched in this series. I'm going to delegate reviewing to x86
maintainers. :-)

Wei.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 09/10] x86/hvm: pkeys, add pkeys support for guest_walk_tables
  2015-11-16 10:31 ` [PATCH 09/10] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
  2015-11-16 16:52   ` Andrew Cooper
@ 2015-11-16 16:59   ` Andrew Cooper
  1 sibling, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 16:59 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
>  #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
> diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
> index f8a0d76..1c0f050 100644
> --- a/xen/include/asm-x86/guest_pt.h
> +++ b/xen/include/asm-x86/guest_pt.h
> @@ -154,6 +154,17 @@ static inline u32 guest_l4e_get_flags(guest_l4e_t gl4e)
>  { return l4e_get_flags(gl4e); }
>  #endif
>  
> +static inline u32 guest_l1e_get_pkeys(guest_l1e_t gl1e)
> +{ return l1e_get_pkeys(gl1e); }
> +static inline u32 guest_l2e_get_pkeys(guest_l2e_t gl2e)
> +{ return l2e_get_pkeys(gl2e); }
> +static inline u32 guest_l3e_get_pkeys(guest_l3e_t gl3e)
> +{ return l3e_get_pkeys(gl3e); }
> +#if GUEST_PAGING_LEVELS >= 4
> +static inline u32 guest_l4e_get_pkeys(guest_l4e_t gl4e)
> +{ return l4e_get_pkeys(gl4e); }
> +#endif

Actually, the manual indicates that protection keys only exist in L3, L2
and L1 entries.  Therefore, guest_l4e_get_pkeys() is not valid and
should be removed, as should l4e_get_pkeys() in patch 6.

~Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
  2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (9 preceding siblings ...)
  2015-11-16 10:31 ` [PATCH 10/10] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2015-11-16 17:45 ` Andrew Cooper
  2015-11-17 10:26   ` Jan Beulich
  10 siblings, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2015-11-16 17:45 UTC (permalink / raw)
  To: Huaitong Han, jbeulich, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: xen-devel

On 16/11/15 10:31, Huaitong Han wrote:
> The protection-key feature provides an additional mechanism by which IA-32e
> paging controls access to usermode addresses.
>
> Hardware support for protection keys for user pages is enumerated with CPUID
> feature flag CPUID.7.0.ECX[3]:PKU. Software support is CPUID.7.0.ECX[4]:OSPKE
> with the setting of CR4.PKE(bit 22).
>
> When CR4.PKE = 1, every linear address is associated with the 4-bit protection
> key located in bits 62:59 of the paging-structure entry that mapped the page
> containing the linear address. The PKRU register determines, for each
> protection key, whether user-mode addresses with that protection key may be
> read or written.
>
> The PKRU register (protection key rights for user pages) is a 32-bit register
> with the following format: for each i (0 ≤ i ≤ 15), PKRU[2i] is the
> access-disable bit for protection key i (ADi); PKRU[2i+1] is the write-disable
> bit for protection key i (WDi).
>
> Software can use the RDPKRU and WRPKRU instructions with ECX = 0 to read and
> write PKRU. In addition, the PKRU register is XSAVE-managed state and can thus
> be read and written by instructions in the XSAVE feature set.
>
> PFEC.PK (bit 5) is defined as protection key violations.
>
> The specification of Protection Keys can be found at SDM (4.6.2, volume 3)
> http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf.

Thank you for this series.  On the whole, it is fairly good.

However, a couple of issues have surfaced.  The use of the
software-defined bits is an ABI with PV guests.  (As with many things from
those days, nothing is actually written down about this ABI.)

As such, it does not appear to be safe to enable PKE in the context of
an unaware PV guest.

Furthermore, it is unclear (given the unwritten ABI) whether it is even
safe to move _PAGE_GNTTAB out of the way, as this is visible to a PV guest.


Finally, do you have some example use cases for PKE?  As WRPKRU lacks a
CPL check, any entity on the system can arbitrarily change the
protection keys, meaning that it can't be used for any system-level
enforcement.  I can't currently see a situation where it is actually a
useful feature, but I presume there is one.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
  2015-11-16 17:45 ` [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Andrew Cooper
@ 2015-11-17 10:26   ` Jan Beulich
  2015-11-17 16:24     ` Andrew Cooper
  2015-11-18  9:12     ` Wu, Feng
  0 siblings, 2 replies; 36+ messages in thread
From: Jan Beulich @ 2015-11-17 10:26 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, ian.jackson, xen-devel, jun.nakajima,
	Huaitong Han, keir

>>> On 16.11.15 at 18:45, <andrew.cooper3@citrix.com> wrote:
> Furthermore, it is unclear (given the unwritten ABI) whether it is even
> safe to move _PAGE_GNTTAB out of the way, as this is visible to a PV guest.

It seems pretty clear to me that this would be unsafe: It being
part of L1_DISALLOW_MASK, if it moved and a guest used the
bit for its own purposes, the guest would break. I guess we'll
need an ELF note by which the guest can advertise which of the
available bits it doesn't care about itself.
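
Purely as a hypothetical illustration of such an advertisement (no such note
exists at this point; the note name and payload here are invented), a guest
kernel might carry something like:

    /* Hypothetical: mask of software-available PTE bits the kernel leaves unused. */
    ELFNOTE(Xen, XEN_ELFNOTE_UNUSED_PTE_BITS, .quad (1 << 62))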

Jan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
  2015-11-17 10:26   ` Jan Beulich
@ 2015-11-17 16:24     ` Andrew Cooper
  2015-11-17 16:36       ` Jan Beulich
  2015-11-18  9:12     ` Wu, Feng
  1 sibling, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2015-11-17 16:24 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, ian.jackson, xen-devel, jun.nakajima,
	Huaitong Han, keir

On 17/11/15 10:26, Jan Beulich wrote:
>>>> On 16.11.15 at 18:45, <andrew.cooper3@citrix.com> wrote:
>> Furthermore, it is unclear (given the unwritten ABI) whether it is even
>> safe to move _PAGE_GNTTAB out of the way, as this is visible to a PV guest.
> It seems pretty clear to me that this would be unsafe: It being
> part of L1_DISALLOW_MASK, if it moved and a guest used the
> bit for its own purposes, the guest would break. I guess we'll
> need an ELF note by which the guest can advertise which of the
> available bits it doesn't care about itself.

Well - it depends whether any of these bits actually get used.

If none actually do get used, we would be better to retro-fit a real ABI
in place, without requiring all new guests to opt-in to get new features.

~Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
  2015-11-17 16:24     ` Andrew Cooper
@ 2015-11-17 16:36       ` Jan Beulich
  0 siblings, 0 replies; 36+ messages in thread
From: Jan Beulich @ 2015-11-17 16:36 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, ian.jackson, xen-devel, jun.nakajima,
	Huaitong Han, keir

>>> On 17.11.15 at 17:24, <andrew.cooper3@citrix.com> wrote:
> On 17/11/15 10:26, Jan Beulich wrote:
>>>>> On 16.11.15 at 18:45, <andrew.cooper3@citrix.com> wrote:
>>> Furthermore, it is unclear (given the unwritten ABI) whether it is even
>>> safe to move _PAGE_GNTTAB out of the way, as this is visible to a PV guest.
>> It seems pretty clear to me that this would be unsafe: It being
>> part of L1_DISALLOW_MASK, if it moved and a guest used the
>> bit for its own purposes, the guest would break. I guess we'll
>> need an ELF note by which the guest can advertise which of the
>> available bits it doesn't care about itself.
> 
> Well - it depends whether any of these bits actually get used.
> 
> If none actually do get used, we would be better to retro-fit a real ABI
> in place, without requiring all new guests to opt-in to get new features.

And how exactly do you propose we check _all_ existing kernels?

Jan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
  2015-11-17 10:26   ` Jan Beulich
  2015-11-17 16:24     ` Andrew Cooper
@ 2015-11-18  9:12     ` Wu, Feng
  2015-11-18 10:10       ` Andrew Cooper
  1 sibling, 1 reply; 36+ messages in thread
From: Wu, Feng @ 2015-11-18  9:12 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, ian.jackson, xen-devel, Nakajima, Jun, Han,
	Huaitong, keir, Wu, Feng



> >>> On 16.11.15 at 18:45, <andrew.cooper3@citrix.com> wrote:
> > Furthermore, it is unclear (given the unwritten ABI) whether it is even
> > safe to move _PAGE_GNTTAB out of the way, as this is visible to a PV guest.
> 
> It seems pretty clear to me that this would be unsafe: It being
> part of L1_DISALLOW_MASK, if it moved and a guest used the
> bit for its own purposes, the guest would break. I guess we'll
> need an ELF note by which the guest can advertise which of the
> available bits it doesn't care about itself.

Actually, we don't expose this feature to PV guests, we only expose it
to HVM. In that case, are there still issues like the ones you discussed above?

Thanks,
Feng

> 
> Jan
> 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
  2015-11-18  9:12     ` Wu, Feng
@ 2015-11-18 10:10       ` Andrew Cooper
  2015-11-19  7:44         ` Wu, Feng
  0 siblings, 1 reply; 36+ messages in thread
From: Andrew Cooper @ 2015-11-18 10:10 UTC (permalink / raw)
  To: Wu, Feng, Jan Beulich
  Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, ian.jackson, xen-devel, Nakajima, Jun, Han,
	Huaitong, keir

On 18/11/15 09:12, Wu, Feng wrote:
>
>>>>> On 16.11.15 at 18:45, <andrew.cooper3@citrix.com> wrote:
>>> Furthermore, it is unclear (given the unwritten ABI) whether it is even
>>> safe to move _PAGE_GNTTAB out of the way, as this is visible to a PV guest.
>> It seems pretty clear to me that this would be unsafe: It being
>> part of L1_DISALLOW_MASK, if it moved and a guest used the
>> bit for its own purposes, the guest would break. I guess we'll
>> need an ELF note by which the guest can advertise which of the
>> available bits it doesn't care about itself.
> Actually, we don't expose this feature to PV guests, we only expose it
> to HVM. In that case, are there still issues like the ones you discussed above?

You have turned on CR4.PKE, and _PAGE_GNTTAB is bit 62 in a PTE.
Furthermore, you don't prevent/audit a PV guest's use of the PK bits.

This makes it usable by PV guests, even if the feature isn't advertised.

~Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
  2015-11-18 10:10       ` Andrew Cooper
@ 2015-11-19  7:44         ` Wu, Feng
  2015-11-19  8:44           ` Jan Beulich
  0 siblings, 1 reply; 36+ messages in thread
From: Wu, Feng @ 2015-11-19  7:44 UTC (permalink / raw)
  To: Andrew Cooper, Jan Beulich
  Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, ian.jackson, xen-devel, Nakajima, Jun, Han,
	Huaitong, keir, Wu, Feng



> >
> >> -----Original Message-----
> >> From: xen-devel-bounces@lists.xen.org [mailto:xen-devel-
> >> bounces@lists.xen.org] On Behalf Of Jan Beulich
> >> Sent: Tuesday, November 17, 2015 6:26 PM
> >> To: Andrew Cooper <andrew.cooper3@citrix.com>
> >> Cc: Tian, Kevin <kevin.tian@intel.com>; wei.liu2@citrix.com;
> >> ian.campbell@citrix.com; stefano.stabellini@eu.citrix.com;
> >> george.dunlap@eu.citrix.com; ian.jackson@eu.citrix.com; xen-
> >> devel@lists.xen.org; Nakajima, Jun <jun.nakajima@intel.com>; Han,
> >> Huaitong <huaitong.han@intel.com>; keir@xen.org
> >> Subject: Re: [Xen-devel] [PATCH 00/10] x86/hvm: pkeys, add memory
> >> protection-key support
> >>
> >>>>> On 16.11.15 at 18:45, <andrew.cooper3@citrix.com> wrote:
> >>> Furthermore, it is unclear (given the unwritten ABI) whether it is even
> >>> safe to move _PAGE_GNTTAB out of the way, as this is visible to a PV
> guest.
> >> It seems pretty clear to me that this would be unsafe: It being
> >> part of L1_DISALLOW_MASK, if it moved and a guest used the
> >> bit for its own purposes, the guest would break. I guess we'll
> >> need an ELF note by which the guest can advertise which of the
> >> available bits it doesn't care about itself.
> > Actually, we don't expose this feature to PV guests, we only expose it
> > to HVM. In that case, are there still issues like the ones you discussed above?
> 
> You have turned on CR4.PKE, and _PAGE_GNTTAB is bit 62 in a PTE.

Oh, yes, actually, we shouldn't turn on CR4.PKE for Xen, since we don't
actually enable it for Xen itself (No such usage model).

> Furthermore, you don't prevent/audit a PV guest's use of the PK bits.

I think the guest (HVM or PV) should use the PK bits only when Pkey
is enabled (CR4.PKE set) by the kernel, Xen cannot control it, right?

Thanks,
Feng

> 
> This makes it usable by PV guests, even if the feature isn't advertised.
> 
> ~Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
  2015-11-19  7:44         ` Wu, Feng
@ 2015-11-19  8:44           ` Jan Beulich
  2015-11-19  8:49             ` Wu, Feng
  0 siblings, 1 reply; 36+ messages in thread
From: Jan Beulich @ 2015-11-19  8:44 UTC (permalink / raw)
  To: Feng Wu
  Cc: Kevin Tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, Andrew Cooper, ian.jackson, xen-devel,
	Jun Nakajima, Huaitong Han, keir

>>> On 19.11.15 at 08:44, <feng.wu@intel.com> wrote:

> 
>> On 18/11/15 09:12, Wu, Feng wrote:
>> >
>> >>
>> >>>>> On 16.11.15 at 18:45, <andrew.cooper3@citrix.com> wrote:
>> >>> Furthermore, it is unclear (given the unwritten ABI) whether it is even
>> >>> safe to move _PAGE_GNTTAB out of the way, as this is visible to a PV
>> guest.
>> >> It seems pretty clear to me that this would be unsafe: It being
>> >> part of L1_DISALLOW_MASK, if it moved and a guest used the
>> >> bit for its own purposes, the guest would break. I guess we'll
>> >> need an ELF note by which the guest can advertise which of the
>> >> available bits it doesn't care about itself.
>> > Actually, we don't expose this feature to PV guests, we only expose it
>> > to HVM. In that case, are there still issues like the ones you discussed above?
>> 
>> You have turned on CR4.PKE, and _PAGE_GNTTAB is bit 62 in a PTE.
> 
> Oh, yes, actually, we shouldn't turn on CR4.PKE for Xen, since we don't
> actually enable it for Xen itself (No such usage model).
> 
>> Furthermore, you don't prevent/audit a PV guest's use of the PK bits.
> 
> I think the guest (HVM or PV) should use the PK bits only when Pkey
> is enabled (CR4.PKE set) by the kernel, Xen cannot control it, right?

Correct, but this is connected to the earlier item. I.e. if only you make
sure Xen and PV guests get run with CR4.PKE always clear, then no
extra auditing will be necessary.

Jan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support
  2015-11-19  8:44           ` Jan Beulich
@ 2015-11-19  8:49             ` Wu, Feng
  0 siblings, 0 replies; 36+ messages in thread
From: Wu, Feng @ 2015-11-19  8:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, Andrew Cooper, ian.jackson, xen-devel, Nakajima,
	Jun, Han, Huaitong, keir, Wu, Feng



> >>> On 19.11.15 at 08:44, <feng.wu@intel.com> wrote:
> 
> >
> Correct, but this is connected to the earlier item. I.e. if only you make
> sure Xen and PV guests get run with CR4.PKE always clear, then no
> extra auditing will be necessary.

Yes, I think we will make it clear. We only expose this to HVM. And for
Xen and PV, the CR4.PKE should be zero.

Thanks,
Feng

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-11-16 12:00   ` Andrew Cooper
@ 2015-11-19 14:39     ` Wu, Feng
  0 siblings, 0 replies; 36+ messages in thread
From: Wu, Feng @ 2015-11-19 14:39 UTC (permalink / raw)
  To: Andrew Cooper, Han, Huaitong, jbeulich, Nakajima, Jun, Dong,
	Eddie, Tian, Kevin, george.dunlap, ian.jackson,
	stefano.stabellini, ian.campbell, wei.liu2, keir
  Cc: Wu, Feng, xen-devel



> On 16/11/15 10:31, Huaitong Han wrote:
> > This patch adds pkeys support for cpuid handling.
> >
> > Pkeys hardware support is CPUID.7.0.ECX[3]:PKU. Software support is
> > CPUID.7.0.ECX[4]:OSPKE and it reflects the setting of CR4.PKE.
> >
> > Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> >
> > diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
> > index c3ddc80..f6a9778 100644
> > --- a/tools/libxc/xc_cpufeature.h
> > +++ b/tools/libxc/xc_cpufeature.h
> > @@ -141,5 +141,7 @@
> >  #define X86_FEATURE_ADX         19 /* ADCX, ADOX instructions */
> >  #define X86_FEATURE_SMAP        20 /* Supervisor Mode Access Protection
> */
> >
> > +/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
> > +#define X86_FEATURE_PKU     3
> >
> >  #endif /* __LIBXC_CPUFEATURE_H */
> > diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> > index e146a3e..34bb964 100644
> > --- a/tools/libxc/xc_cpuid_x86.c
> > +++ b/tools/libxc/xc_cpuid_x86.c
> > @@ -367,9 +367,11 @@ static void xc_cpuid_hvm_policy(
> >                          bitmaskof(X86_FEATURE_ADX)  |
> >                          bitmaskof(X86_FEATURE_SMAP) |
> >                          bitmaskof(X86_FEATURE_FSGSBASE));
> > +            regs[2] &= bitmaskof(X86_FEATURE_PKU);
> >          } else
> > -            regs[1] = 0;
> > -        regs[0] = regs[2] = regs[3] = 0;
> > +            regs[1] = regs[2] = 0;
> > +
> > +        regs[0] = regs[3] = 0;
> >          break;
> >
> >      case 0x0000000d:
> > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> > index 615fa89..66917ff 100644
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -4518,6 +4518,12 @@ void hvm_cpuid(unsigned int input, unsigned
> int *eax, unsigned int *ebx,
> >          /* Don't expose INVPCID to non-hap hvm. */
> >          if ( (count == 0) && !hap_enabled(d) )
> >              *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> > +
> > +        if ( (count == 0) && !(cpu_has_pku && hap_enabled(d)) )
> > +            *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> > +        if ( (count == 0) && cpu_has_pku )
> > +            *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
> > +                     cpufeat_mask(X86_FEATURE_OSPKE) : 0;
> 
> This logic (all being gated on count == 0 && !hap_enabled() ) should
> extend the INVPCID if() statement.
> 
> Setting OSPKE should be gated on *ecx having PKU and guest CR4 alone.

I agree that it is better to check *ecx for PKU here. However,

> As it currently stands, a guest could end up observing OSPKE but not PKU.

I am wondering how this can happen. If the guest cannot see PKU, which means
the feature is not exposed to it, the guest will not set the PKE bit in CR4
(guests set PKE only when CPUID says the feature is supported), hence it
cannot see the OSPKE bit.

BTW, Huaitong, it is better to put this patch at the end of this series, since
we usually expose a feature only after everything else is in place.

Thanks,
Feng

> 
> >          break;
> >      case 0xb:
> >          /* Fix the x2APIC identifier. */
> > diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-
> x86/cpufeature.h
> > index 9a01563..3c3b95f 100644
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -154,6 +154,10 @@
> >  #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions
> */
> >  #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access
> Prevention */
> >
> > +/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 8 */
> > +#define X86_FEATURE_PKU	(8*32+ 3) /* Protection Keys for Userspace */
> > +#define X86_FEATURE_OSPKE	(8*32+ 4) /* OS Protection Keys
> Enable */
> > +
> >  #if !defined(__ASSEMBLY__) && !defined(X86_FEATURES_ONLY)
> >  #include <xen/bitops.h>
> >
> > @@ -193,6 +197,7 @@
> >
> >  #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
> >  #define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
> > +#define cpu_has_pku             boot_cpu_has(X86_FEATURE_PKU)
> 
> This read overflows c->x86_capabilities, as you didn't bump NCAPINTs
> 
> I see that you bump it in the following patch, but you must move that
> hunk forwards to this patch.
> 
> ~Andrew
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 04/10] x86/hvm: pkeys, add pkeys support when setting CR4
  2015-11-16 10:31 ` [PATCH 04/10] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
  2015-11-16 14:02   ` Andrew Cooper
@ 2015-11-20  1:16   ` Wu, Feng
  2015-11-20 10:41     ` Andrew Cooper
  1 sibling, 1 reply; 36+ messages in thread
From: Wu, Feng @ 2015-11-20  1:16 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, Nakajima, Jun, Dong, Eddie, Tian,
	Kevin, george.dunlap, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2, keir
  Cc: Han, Huaitong, Wu, Feng, xen-devel



> This patch adds pkeys support when setting CR4
> 
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 66917ff..953047f 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -1911,6 +1911,7 @@ static unsigned long
> hvm_cr4_guest_reserved_bits(const struct vcpu *v,
>          leaf1_edx = boot_cpu_data.x86_capability[X86_FEATURE_VME / 32];
>          leaf1_ecx = boot_cpu_data.x86_capability[X86_FEATURE_PCID / 32];
>          leaf7_0_ebx = boot_cpu_data.x86_capability[X86_FEATURE_FSGSBASE /
> 32];
> +        leaf7_0_ecx = boot_cpu_data.x86_capability[X86_FEATURE_PKU / 32];

What is the purpose of the above change?

>      }
> 
>      return ~(unsigned long)
> @@ -1946,7 +1947,9 @@ static unsigned long
> hvm_cr4_guest_reserved_bits(const struct vcpu *v,
>               (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ?
>                X86_CR4_SMEP : 0) |
>               (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ?
> -              X86_CR4_SMAP : 0));
> +              X86_CR4_SMAP : 0) |
> +             (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ?
> +              X86_CR4_PKE : 0));
>  }
> 
>  static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index c1f924e..8101a1b 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -1310,6 +1310,8 @@ void __init noreturn __start_xen(unsigned long
> mbi_p)
> 
>      if ( disable_pku )
>          setup_clear_cpu_cap(X86_FEATURE_PKU);
> +    if ( cpu_has_pku )
> +        set_in_cr4(X86_CR4_PKE);

As mentioned in [00/10] of this series, we shouldn't set PKE in CR4 for Xen
itself.

Thanks,
Feng

> 
>      if ( cpu_has_fsgsbase )
>          set_in_cr4(X86_CR4_FSGSBASE);
> --
> 2.4.3
> 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 04/10] x86/hvm: pkeys, add pkeys support when setting CR4
  2015-11-20  1:16   ` Wu, Feng
@ 2015-11-20 10:41     ` Andrew Cooper
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2015-11-20 10:41 UTC (permalink / raw)
  To: Wu, Feng, Han, Huaitong, jbeulich, Nakajima, Jun, Dong, Eddie,
	Tian, Kevin, george.dunlap, ian.jackson, stefano.stabellini,
	ian.campbell, wei.liu2, keir
  Cc: xen-devel

On 20/11/15 01:16, Wu, Feng wrote:
>> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
>> index 66917ff..953047f 100644
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -1911,6 +1911,7 @@ static unsigned long
>> hvm_cr4_guest_reserved_bits(const struct vcpu *v,
>>          leaf1_edx = boot_cpu_data.x86_capability[X86_FEATURE_VME / 32];
>>          leaf1_ecx = boot_cpu_data.x86_capability[X86_FEATURE_PCID / 32];
>>          leaf7_0_ebx = boot_cpu_data.x86_capability[X86_FEATURE_FSGSBASE /
>> 32];
>> +        leaf7_0_ecx = boot_cpu_data.x86_capability[X86_FEATURE_PKU / 32];
> What is the purpose of the above change?

So the hunk below can correctly audit a guest's attempt to set CR4.PKE.

~Andrew

>
>>      }
>>
>>      return ~(unsigned long)
>> @@ -1946,7 +1947,9 @@ static unsigned long
>> hvm_cr4_guest_reserved_bits(const struct vcpu *v,
>>               (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ?
>>                X86_CR4_SMEP : 0) |
>>               (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ?
>> -              X86_CR4_SMAP : 0));
>> +              X86_CR4_SMAP : 0) |
>> +             (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ?
>> +              X86_CR4_PKE : 0));
>>  }
>>
>>  static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2015-11-20 10:41 UTC | newest]

Thread overview: 36+ messages
2015-11-16 10:31 [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2015-11-16 10:31 ` [PATCH 01/10] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
2015-11-16 12:00   ` Andrew Cooper
2015-11-19 14:39     ` Wu, Feng
2015-11-16 16:58   ` Wei Liu
2015-11-16 10:31 ` [PATCH 02/10] x86/hvm: pkeys, add pku support for x86_capability Huaitong Han
2015-11-16 13:35   ` Andrew Cooper
2015-11-16 10:31 ` [PATCH 03/10] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
2015-11-16 13:56   ` Andrew Cooper
2015-11-16 10:31 ` [PATCH 04/10] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
2015-11-16 14:02   ` Andrew Cooper
2015-11-20  1:16   ` Wu, Feng
2015-11-20 10:41     ` Andrew Cooper
2015-11-16 10:31 ` [PATCH 05/10] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
2015-11-16 14:03   ` Andrew Cooper
2015-11-16 10:31 ` [PATCH 06/10] x86/hvm: pkeys, add functions to get pkeys value from PTE Huaitong Han
2015-11-16 14:16   ` Andrew Cooper
2015-11-16 14:42     ` Jan Beulich
2015-11-16 10:31 ` [PATCH 07/10] x86/hvm: pkeys, add functions to support PKRU access/write Huaitong Han
2015-11-16 15:09   ` Andrew Cooper
2015-11-16 10:31 ` [PATCH 08/10] x86/hvm: pkeys, add pkeys support for do_page_fault Huaitong Han
2015-11-16 15:25   ` Andrew Cooper
2015-11-16 10:31 ` [PATCH 09/10] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
2015-11-16 16:52   ` Andrew Cooper
2015-11-16 16:59   ` Andrew Cooper
2015-11-16 10:31 ` [PATCH 10/10] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
2015-11-16 16:52   ` Andrew Cooper
2015-11-16 17:45 ` [PATCH 00/10] x86/hvm: pkeys, add memory protection-key support Andrew Cooper
2015-11-17 10:26   ` Jan Beulich
2015-11-17 16:24     ` Andrew Cooper
2015-11-17 16:36       ` Jan Beulich
2015-11-18  9:12     ` Wu, Feng
2015-11-18 10:10       ` Andrew Cooper
2015-11-19  7:44         ` Wu, Feng
2015-11-19  8:44           ` Jan Beulich
2015-11-19  8:49             ` Wu, Feng
