* [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support
@ 2015-12-21  7:21 Huaitong Han
  2015-12-21  7:21 ` [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Huaitong Han @ 2015-12-21  7:21 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

Changes in v4:
*Delete the gva2gfn patch; when the page is present, PFEC_prot_key is now always checked.
*Use RDPKRU instead of xsave_read because RDPKRU costs less.
*Squash the pkeys and pkru patches into the guest_walk_tables patch.

Changes in v3:
*Make CPUID:OSPKE depend on the guest cpuid instead of the host hardware
capability, and move the cpuid patch to the end of the series.
*Move opt_pku to cpu/common.c.
*Use MASK_EXTR for get_pte_pkeys.
*Add quoting for the pkru macro, and use static inline pkru_read functions.
*Rebase get_xsave_addr on the updated code, and add uncompressed format support
for xsave state.
*Use fpu_xsave instead of vcpu_save_fpu, and adjust the code style of
leaf_pte_pkeys_check.
*Add parentheses for PFEC_prot_key in the gva2gfn functions.

Changes in v2:
*Rebase all patches on the staging branch.
*Disable X86_CR4_PKE in the hypervisor, delete the pkru_read/write functions,
and read the xsave state to get the pkru value.
*Delete the patch that adds pkeys support for do_page_fault.
*Add pkeys support for gva2gfn so that setting _PAGE_PK_BIT in the return
value can get propagated to the guest correctly.

The protection-key feature provides an additional mechanism by which IA-32e
paging controls access to usermode addresses.

Hardware support for protection keys for user pages is enumerated with CPUID
feature flag CPUID.7.0.ECX[3]:PKU. Software support is enumerated with
CPUID.7.0.ECX[4]:OSPKE, which reflects the setting of CR4.PKE (bit 22).

When CR4.PKE = 1, every linear address is associated with the 4-bit protection
key located in bits 62:59 of the paging-structure entry that mapped the page
containing the linear address. The PKRU register determines, for each
protection key, whether user-mode addresses with that protection key may be
read or written.
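For illustration only (this helper is not part of the series; the series uses
get_pte_pkey/MASK_EXTR instead), extracting that 4-bit key from a raw 64-bit
leaf page-table entry could look like:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch: pull the 4-bit protection key out of bits 62:59
 * of a 64-bit leaf page-table entry, per the layout described above. */
static inline unsigned int pte_pkey(uint64_t pte)
{
    return (unsigned int)((pte >> 59) & 0xf);
}
```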

The PKRU register (protection key rights for user pages) is a 32-bit register
with the following format: for each i (0 ≤ i ≤ 15), PKRU[2i] is the
access-disable bit for protection key i (ADi); PKRU[2i+1] is the write-disable
bit for protection key i (WDi).
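A minimal sketch of that layout, assuming nothing beyond the bit positions just
described (these helper names are illustrative, not from the series):

```c
#include <assert.h>

/* PKRU[2i] is the access-disable bit (ADi) for protection key i. */
static inline unsigned int pkru_ad(unsigned int pkru, unsigned int pkey)
{
    return (pkru >> (2 * pkey)) & 1;
}

/* PKRU[2i+1] is the write-disable bit (WDi) for protection key i. */
static inline unsigned int pkru_wd(unsigned int pkru, unsigned int pkey)
{
    return (pkru >> (2 * pkey + 1)) & 1;
}
```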

Software can use the RDPKRU and WRPKRU instructions with ECX = 0 to read and
write PKRU. In addition, the PKRU register is XSAVE-managed state and can thus
be read and written by instructions in the XSAVE feature set.

PFEC.PK (bit 5) is defined to report protection-key violations.

The specification of Protection Keys can be found in the Intel SDM (Volume 3,
Section 4.6.2):
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf.

Huaitong Han (6):
  x86/hvm: pkeys, add the flag to enable Memory Protection Keys
  x86/hvm: pkeys, add pkeys support when setting CR4
  x86/hvm: pkeys, disable pkeys for guests in non-paging mode
  x86/hvm: pkeys, add pkeys support for guest_walk_tables
  x86/hvm: pkeys, add xstate support for pkeys
  x86/hvm: pkeys, add pkeys support for cpuid handling

 docs/misc/xen-command-line.markdown | 10 ++++++
 tools/libxc/xc_cpufeature.h         |  2 ++
 tools/libxc/xc_cpuid_x86.c          |  6 ++--
 xen/arch/x86/cpu/common.c           | 10 +++++-
 xen/arch/x86/hvm/hvm.c              | 15 +++++++--
 xen/arch/x86/hvm/vmx/vmx.c          | 11 ++++---
 xen/arch/x86/mm/guest_walk.c        | 65 +++++++++++++++++++++++++++++++++++++
 xen/arch/x86/mm/hap/guest_walk.c    |  3 ++
 xen/include/asm-x86/cpufeature.h    |  6 +++-
 xen/include/asm-x86/guest_pt.h      |  7 ++++
 xen/include/asm-x86/hvm/hvm.h       |  2 ++
 xen/include/asm-x86/page.h          |  5 +++
 xen/include/asm-x86/processor.h     | 39 ++++++++++++++++++++++
 xen/include/asm-x86/x86_64/page.h   | 12 +++++++
 xen/include/asm-x86/xstate.h        |  4 ++-
 15 files changed, 185 insertions(+), 12 deletions(-)

-- 
2.4.3


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
  2015-12-21  7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
@ 2015-12-21  7:21 ` Huaitong Han
  2015-12-21 12:58   ` Jan Beulich
  2015-12-21  7:21 ` [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21  7:21 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds the flag to enable Memory Protection Keys.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 docs/misc/xen-command-line.markdown | 10 ++++++++++
 xen/arch/x86/cpu/common.c           | 10 +++++++++-
 xen/include/asm-x86/cpufeature.h    |  6 +++++-
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index c103894..36ecf80 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1177,6 +1177,16 @@ This option can be specified more than once (up to 8 times at present).
 ### ple\_window
 > `= <integer>`
 
+### pku
+> `= <boolean>`
+
+> Default: `true`
+
+Flag to enable Memory Protection Keys.
+
+The protection-key feature provides an additional mechanism by which IA-32e
+paging controls access to usermode addresses.
+
 ### psr (Intel)
 > `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> | cdp:<boolean> )`
 
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 310ec85..7d03e52 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -22,6 +22,10 @@ boolean_param("xsave", use_xsave);
 bool_t opt_arat = 1;
 boolean_param("arat", opt_arat);
 
+/* pku: Flag to enable Memory Protection Keys (default on). */
+bool_t opt_pku = 1;
+boolean_param("pku", opt_pku);
+
 unsigned int opt_cpuid_mask_ecx = ~0u;
 integer_param("cpuid_mask_ecx", opt_cpuid_mask_ecx);
 unsigned int opt_cpuid_mask_edx = ~0u;
@@ -270,7 +274,8 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
 	if ( c->cpuid_level >= 0x00000007 )
 		cpuid_count(0x00000007, 0, &tmp,
 			    &c->x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)],
-			    &tmp, &tmp);
+			    &c->x86_capability[cpufeat_word(X86_FEATURE_PKU)],
+			    &tmp);
 }
 
 /*
@@ -323,6 +328,9 @@ void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
 	if ( cpu_has_xsave )
 		xstate_init(c);
 
+   	if ( !opt_pku )
+		setup_clear_cpu_cap(X86_FEATURE_PKU);
+
 	/*
 	 * The vendor-specific functions might have changed features.  Now
 	 * we do "generic changes."
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index af127cf..ef96514 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -11,7 +11,7 @@
 
 #include <xen/const.h>
 
-#define NCAPINTS	8	/* N 32-bit words worth of info */
+#define NCAPINTS	9	/* N 32-bit words worth of info */
 
 /* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
 #define X86_FEATURE_FPU		(0*32+ 0) /* Onboard FPU */
@@ -163,6 +163,10 @@
 #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
 #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
 
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 8 */
+#define X86_FEATURE_PKU	(8*32+ 3) /* Protection Keys for Userspace */
+#define X86_FEATURE_OSPKE	(8*32+ 4) /* OS Protection Keys Enable */
+
 #define cpufeat_word(idx)	((idx) / 32)
 #define cpufeat_bit(idx)	((idx) % 32)
 #define cpufeat_mask(idx)	(_AC(1, U) << cpufeat_bit(idx))
-- 
2.4.3


* [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4
  2015-12-21  7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
  2015-12-21  7:21 ` [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
@ 2015-12-21  7:21 ` Huaitong Han
  2015-12-21 15:00   ` Jan Beulich
  2015-12-21  7:21 ` [PATCH V4 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21  7:21 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds pkeys support when setting CR4.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/hvm.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index db0aeba..59916ed 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1924,6 +1924,7 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
         leaf1_edx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_VME)];
         leaf1_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PCID)];
         leaf7_0_ebx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)];
+        leaf7_0_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PKU)];
     }
 
     return ~(unsigned long)
@@ -1959,7 +1960,9 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
              (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ?
               X86_CR4_SMEP : 0) |
              (leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ?
-              X86_CR4_SMAP : 0));
+              X86_CR4_SMAP : 0) |
+              (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ?
+              X86_CR4_PKE : 0));
 }
 
 static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
-- 
2.4.3


* [PATCH V4 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode
  2015-12-21  7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
  2015-12-21  7:21 ` [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
  2015-12-21  7:21 ` [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
@ 2015-12-21  7:21 ` Huaitong Han
  2015-12-21  7:21 ` [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 18+ messages in thread
From: Huaitong Han @ 2015-12-21  7:21 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch disables pkeys for guests in non-paging mode. SMEP/SMAP/PKU are
disabled by hardware when the CPU is in non-paging mode, but Xen always uses
paging mode to emulate guest non-paging mode. To emulate that behavior, pkeys
need to be manually disabled when the guest switches to non-paging mode.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/hvm/vmx/vmx.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2581e97..7123912 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1380,12 +1380,13 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
         if ( !hvm_paging_enabled(v) )
         {
             /*
-             * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware.
-             * However Xen always uses paging mode to emulate guest non-paging
-             * mode. To emulate this behavior, SMEP/SMAP needs to be manually
-             * disabled when guest VCPU is in non-paging mode.
+             * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
+             * hardware. However Xen always uses paging mode to emulate guest
+             * non-paging mode. To emulate this behavior, SMEP/SMAP/PKU needs
+             * to be manually disabled when guest VCPU is in non-paging mode.
              */
-            v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP);
+            v->arch.hvm_vcpu.hw_cr[4] &=
+                ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
         }
         __vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
         break;
-- 
2.4.3


* [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
  2015-12-21  7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (2 preceding siblings ...)
  2015-12-21  7:21 ` [PATCH V4 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
@ 2015-12-21  7:21 ` Huaitong Han
  2015-12-21 15:32   ` Jan Beulich
  2015-12-21  7:21 ` [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
  2015-12-21  7:21 ` [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
  5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21  7:21 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

Protection keys define a new 4-bit protection key field (PKEY) in bits 62:59
of leaf entries of the page tables.

The PKRU register defines 32 bits: 16 domains with 2 attribute bits per
domain. For each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for
protection key i (ADi) and PKRU[2i+1] is the write-disable bit for protection
key i (WDi). PKEY is an index into these domains.

A fault is considered a PKU violation if all of the following conditions are
true:
1. CR4.PKE=1.
2. EFER.LMA=1.
3. The page is present with no reserved bit violations.
4. The access is not an instruction fetch.
5. The access is to a user page.
6. PKRU.AD=1,
    or the access is a data write and PKRU.WD=1
        and either CR0.WP=1 or it is a user access.
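The conditions above can be condensed into a standalone predicate. This is an
illustrative, hypervisor-independent sketch only (all names and the flattened
guest_state struct are assumptions for the example); the real implementation
in this series is leaf_pte_pkeys_check(), which takes the vcpu and PFEC bits:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the relevant guest control state. */
typedef struct {
    bool cr4_pke, efer_lma, cr0_wp;
} guest_state;

static bool pkey_fault(const guest_state *g,
                       bool present, bool rsvd_violation,
                       bool insn_fetch, bool user_page,
                       bool write, bool user_access,
                       unsigned int pkru, unsigned int pkey)
{
    /* Conditions 1-5 from the list above. */
    if ( !g->cr4_pke || !g->efer_lma || !present || rsvd_violation ||
         insn_fetch || !user_page )
        return false;

    /* PKRU[2i] = ADi, PKRU[2i+1] = WDi. */
    bool ad = (pkru >> (2 * pkey)) & 1;
    bool wd = (pkru >> (2 * pkey + 1)) & 1;

    /* Condition 6. */
    return ad || (wd && write && (g->cr0_wp || user_access));
}
```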

Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
 xen/arch/x86/mm/guest_walk.c      | 65 +++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/mm/hap/guest_walk.c  |  3 ++
 xen/include/asm-x86/guest_pt.h    |  7 +++++
 xen/include/asm-x86/hvm/hvm.h     |  2 ++
 xen/include/asm-x86/page.h        |  5 +++
 xen/include/asm-x86/processor.h   | 39 +++++++++++++++++++++++
 xen/include/asm-x86/x86_64/page.h | 12 ++++++++
 7 files changed, 133 insertions(+)

diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 18d1acf..f65ba27 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
     return 0;
 }
 
+#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS
+bool_t leaf_pte_pkeys_check(struct vcpu *vcpu,
+        uint32_t pfec, uint32_t pte_pkey)
+{
+    unsigned int pkru = 0;
+    bool_t pkru_ad, pkru_wd;
+
+    bool_t pf = !!(pfec & PFEC_page_present);
+    bool_t uf = !!(pfec & PFEC_user_mode);
+    bool_t wf = !!(pfec & PFEC_write_access);
+    bool_t ff = !!(pfec & PFEC_insn_fetch);
+    bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
+
+    /* When the page is present, PFEC_prot_key is always checked. */
+    if ( !pf || is_pv_vcpu(vcpu) )
+        return 0;
+
+    /*
+     * PKU: an additional mechanism by which paging controls access
+     * to user-mode addresses based on the value in the PKRU
+     * register. A fault is considered a PKU violation if all
+     * of the following conditions are true:
+     * 1.CR4_PKE=1.
+     * 2.EFER_LMA=1.
+     * 3.page is present with no reserved bit violations.
+     * 4.the access is not an instruction fetch.
+     * 5.the access is to a user page.
+     * 6.PKRU.AD=1
+     *       or The access is a data write and PKRU.WD=1
+     *            and either CR0.WP=1 or it is a user access.
+     */
+    if ( !hvm_pku_enabled(vcpu) ||
+            !hvm_long_mode_enabled(vcpu) || rsvdf || ff )
+        return 0;
+
+    pkru = read_pkru();
+    if ( unlikely(pkru) )
+    {
+        pkru_ad = read_pkru_ad(pkru, pte_pkey);
+        pkru_wd = read_pkru_wd(pkru, pte_pkey);
+        /* Condition 6 */
+        if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf)))
+            return 1;
+    }
+
+    return 0;
+}
+#endif
+
 /* Walk the guest pagetables, after the manner of a hardware walker. */
 /* Because the walk is essentially random, it can cause a deadlock 
  * warning in the p2m locking code. Highly unlikely this is an actual
@@ -106,6 +155,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
     guest_l3e_t *l3p = NULL;
     guest_l4e_t *l4p;
+    unsigned int pkey;
 #endif
     uint32_t gflags, mflags, iflags, rc = 0;
     bool_t smep = 0, smap = 0;
@@ -190,6 +240,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
         goto out;
     /* Get the l3e and check its flags*/
     gw->l3e = l3p[guest_l3_table_offset(va)];
+    pkey = guest_l3e_get_pkey(gw->l3e);
     gflags = guest_l3e_get_flags(gw->l3e) ^ iflags;
     if ( !(gflags & _PAGE_PRESENT) ) {
         rc |= _PAGE_PRESENT;
@@ -199,6 +250,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
     
     pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v); 
 
+    if ( pse1G && leaf_pte_pkeys_check(v, pfec, pkey) )
+        rc |= _PAGE_PKEY_BITS;
+
     if ( pse1G )
     {
         /* Generate a fake l1 table entry so callers don't all 
@@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
 
     pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
 
+#if GUEST_PAGING_LEVELS >= 4
+    pkey = guest_l2e_get_pkey(gw->l2e);
+    if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
+        rc |= _PAGE_PKEY_BITS;
+#endif
+
     if ( pse2M )
     {
         /* Special case: this guest VA is in a PSE superpage, so there's
@@ -330,6 +390,11 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
             goto out;
         }
         rc |= ((gflags & mflags) ^ mflags);
+#if GUEST_PAGING_LEVELS >= 4
+        pkey = guest_l1e_get_pkey(gw->l1e);
+        if ( leaf_pte_pkeys_check(v, pfec, pkey) )
+            rc |= _PAGE_PKEY_BITS;
+#endif
     }
 
 #if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c
index 11c1b35..49d0328 100644
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -130,6 +130,9 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
     if ( missing & _PAGE_INVALID_BITS ) 
         pfec[0] |= PFEC_reserved_bit;
 
+    if ( missing & _PAGE_PKEY_BITS )
+        pfec[0] |= PFEC_prot_key;
+
     if ( missing & _PAGE_PAGED )
         pfec[0] = PFEC_page_paged;
 
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index 3447973..d02f296 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -154,6 +154,13 @@ static inline u32 guest_l4e_get_flags(guest_l4e_t gl4e)
 { return l4e_get_flags(gl4e); }
 #endif
 
+static inline u32 guest_l1e_get_pkey(guest_l1e_t gl1e)
+{ return l1e_get_pkey(gl1e); }
+static inline u32 guest_l2e_get_pkey(guest_l2e_t gl2e)
+{ return l2e_get_pkey(gl2e); }
+static inline u32 guest_l3e_get_pkey(guest_l3e_t gl3e)
+{ return l3e_get_pkey(gl3e); }
+
 static inline guest_l1e_t guest_l1e_from_gfn(gfn_t gfn, u32 flags)
 { return l1e_from_pfn(gfn_x(gfn), flags); }
 static inline guest_l2e_t guest_l2e_from_gfn(gfn_t gfn, u32 flags)
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index f80e143..79b3421 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -275,6 +275,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
     (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMAP))
 #define hvm_nx_enabled(v) \
     (!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
+#define hvm_pku_enabled(v) \
+    (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE))
 
 /* Can we use superpages in the HAP p2m table? */
 #define hap_has_1gb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_1GB))
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index a095a93..d69d91d 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -93,6 +93,11 @@
 #define l3e_get_flags(x)           (get_pte_flags((x).l3))
 #define l4e_get_flags(x)           (get_pte_flags((x).l4))
 
+/* Get pte pkeys (unsigned int). */
+#define l1e_get_pkey(x)           (get_pte_pkey((x).l1))
+#define l2e_get_pkey(x)           (get_pte_pkey((x).l2))
+#define l3e_get_pkey(x)           (get_pte_pkey((x).l3))
+
 /* Construct an empty pte. */
 #define l1e_empty()                ((l1_pgentry_t) { 0 })
 #define l2e_empty()                ((l2_pgentry_t) { 0 })
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 3f8411f..3f251aa 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -373,6 +373,45 @@ static always_inline void clear_in_cr4 (unsigned long mask)
     write_cr4(read_cr4() & ~mask);
 }
 
+static inline unsigned int read_pkru(void)
+{
+    unsigned int pkru;
+
+    /*
+     * _PAGE_PKEY_BITS conflict with _PAGE_GNTTAB used by PV guests, so
+     * X86_CR4_PKE is disabled in the hypervisor; RDPKRU can still be
+     * used by temporarily setting CR4.PKE.
+     */
+    set_in_cr4(X86_CR4_PKE);
+    asm volatile (".byte 0x0f,0x01,0xee"
+        : "=a" (pkru) : "c" (0) : "dx");
+    clear_in_cr4(X86_CR4_PKE);
+
+    return pkru;
+}
+
+/* Macros for PKRU domain */
+#define PKRU_READ  (0)
+#define PKRU_WRITE (1)
+#define PKRU_ATTRS (2)
+
+/*
+ * PKRU defines 32 bits: 16 domains with 2 attribute bits per domain.
+ * A pkey indexes a domain, so pkey * PKRU_ATTRS + PKRU_READ/PKRU_WRITE
+ * is the bit offset of that domain's attribute.
+ */
+static inline bool_t read_pkru_ad(unsigned int pkru, unsigned int pkey)
+{
+    ASSERT(pkey < 16);
+    return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
+}
+
+static inline bool_t read_pkru_wd(unsigned int pkru, unsigned int pkey)
+{
+    ASSERT(pkey < 16);
+    return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
+}
+
 /*
  *      NSC/Cyrix CPU configuration register indexes
  */
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 19ab4d0..86abb94 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -134,6 +134,18 @@ typedef l4_pgentry_t root_pgentry_t;
 #define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
 #define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
 
+/*
+ * Protection keys define a new 4-bit protection key field
+ * (PKEY) in bits 62:59 of leaf entries of the page tables.
+ * This corresponds to bit 22:19 of a 24-bit flags.
+ *
+ * Notice: Bit 22 is used by _PAGE_GNTTAB which is visible to PV guests,
+ * so Protection keys must be disabled on PV guests.
+ */
+#define _PAGE_PKEY_BITS  (0x780000)	 /* Protection Keys, 22:19 */
+
+#define get_pte_pkey(x) (MASK_EXTR(get_pte_flags(x), _PAGE_PKEY_BITS))
+
 /* Bit 23 of a 24-bit flag mask. This corresponds to bit 63 of a pte.*/
 #define _PAGE_NX_BIT (1U<<23)
 
-- 
2.4.3




* [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys
  2015-12-21  7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (3 preceding siblings ...)
  2015-12-21  7:21 ` [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2015-12-21  7:21 ` Huaitong Han
  2015-12-21 15:09   ` Jan Beulich
  2015-12-21  7:21 ` [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
  5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21  7:21 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds xstate support for pkeys.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/include/asm-x86/xstate.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 12d939b..f7c41ba 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -34,13 +34,15 @@
 #define XSTATE_OPMASK  (1ULL << 5)
 #define XSTATE_ZMM     (1ULL << 6)
 #define XSTATE_HI_ZMM  (1ULL << 7)
+#define XSTATE_PKRU    (1ULL << 9)
 #define XSTATE_LWP     (1ULL << 62) /* AMD lightweight profiling */
 #define XSTATE_FP_SSE  (XSTATE_FP | XSTATE_SSE)
 #define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
                         XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
 
 #define XSTATE_ALL     (~(1ULL << 63))
-#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
+#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
+                        XSTATE_PKRU)
 #define XSTATE_LAZY    (XSTATE_ALL & ~XSTATE_NONLAZY)
 #define XSTATE_COMPACTION_ENABLED  (1ULL << 63)
 
-- 
2.4.3


* [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-12-21  7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
                   ` (4 preceding siblings ...)
  2015-12-21  7:21 ` [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2015-12-21  7:21 ` Huaitong Han
  2015-12-21 15:07   ` Jan Beulich
  5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21  7:21 UTC (permalink / raw)
  To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
	george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
	wei.liu2, keir
  Cc: Huaitong Han, xen-devel

This patch adds pkeys support for cpuid handling.

Pkeys hardware support is enumerated with CPUID.7.0.ECX[3]:PKU. Software
support is CPUID.7.0.ECX[4]:OSPKE, which reflects the setting of CR4.PKE.

Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
 tools/libxc/xc_cpufeature.h |  2 ++
 tools/libxc/xc_cpuid_x86.c  |  6 ++++--
 xen/arch/x86/hvm/hvm.c      | 10 +++++++++-
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
index c3ddc80..f6a9778 100644
--- a/tools/libxc/xc_cpufeature.h
+++ b/tools/libxc/xc_cpufeature.h
@@ -141,5 +141,7 @@
 #define X86_FEATURE_ADX         19 /* ADCX, ADOX instructions */
 #define X86_FEATURE_SMAP        20 /* Supervisor Mode Access Protection */
 
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
+#define X86_FEATURE_PKU     3
 
 #endif /* __LIBXC_CPUFEATURE_H */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 8882c01..1ce979b 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -427,9 +427,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
                         bitmaskof(X86_FEATURE_ADX)  |
                         bitmaskof(X86_FEATURE_SMAP) |
                         bitmaskof(X86_FEATURE_FSGSBASE));
+            regs[2] &= bitmaskof(X86_FEATURE_PKU);
         } else
-            regs[1] = 0;
-        regs[0] = regs[2] = regs[3] = 0;
+            regs[1] = regs[2] = 0;
+
+        regs[0] = regs[3] = 0;
         break;
 
     case 0x0000000d:
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 59916ed..05821ed 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4572,7 +4572,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
             __clear_bit(X86_FEATURE_APIC & 31, edx);
 
         /* Fix up OSXSAVE. */
-        if ( cpu_has_xsave )
+        if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
             *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ?
                      cpufeat_mask(X86_FEATURE_OSXSAVE) : 0;
 
@@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         /* Don't expose INVPCID to non-hap hvm. */
         if ( (count == 0) && !hap_enabled(d) )
             *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+
+        /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
+        if ( (count == 0) && !hap_enabled(d) )
+            *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
+
+        if ( (count == 0) && (*ecx & cpufeat_mask(X86_FEATURE_PKU)) )
+            *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
+                     cpufeat_mask(X86_FEATURE_OSPKE) : 0;
         break;
     case 0xb:
         /* Fix the x2APIC identifier. */
-- 
2.4.3


* Re: [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
  2015-12-21  7:21 ` [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
@ 2015-12-21 12:58   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 12:58 UTC (permalink / raw)
  To: Huaitong Han
  Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
	jun.nakajima, keir

>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:

First of all - please trim your Cc lists. I don't see, for example, why all
the tools maintainers needed to be Cc-ed on this patch.

> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -22,6 +22,10 @@ boolean_param("xsave", use_xsave);
>  bool_t opt_arat = 1;
>  boolean_param("arat", opt_arat);
>  
> +/* pku: Flag to enable Memory Protection Keys (default on). */
> +bool_t opt_pku = 1;

I would have thought that I had commented on this wanting to be
static already, but looks like I'm misremembering.

Jan


* Re: [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4
  2015-12-21  7:21 ` [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
@ 2015-12-21 15:00   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 15:00 UTC (permalink / raw)
  To: Huaitong Han
  Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
	jun.nakajima, keir

>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> This patch adds pkeys support when setting CR4.

Btw there's little point in a commit message simply repeating the
title.

Jan


* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-12-21  7:21 ` [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
@ 2015-12-21 15:07   ` Jan Beulich
  2015-12-22  2:54     ` Han, Huaitong
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 15:07 UTC (permalink / raw)
  To: Huaitong Han
  Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
	jun.nakajima, keir

>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>          /* Don't expose INVPCID to non-hap hvm. */
>          if ( (count == 0) && !hap_enabled(d) )
>              *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> +
> +        /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
> +        if ( (count == 0) && !hap_enabled(d) )
> +            *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);

I'm still missing the xsave dependency here.

> +        if ( (count == 0) && (*ecx & cpufeat_mask(X86_FEATURE_PKU)) )
> +            *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
> +                     cpufeat_mask(X86_FEATURE_OSPKE) : 0;

Also all (would now be 6!) count == 0 dependencies should be
combined instead of repeating that condition so many times.

Jan
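
[Editor's sketch: the consolidation asked for above could look like the
following. The feature masks and the helper name are illustrative
stand-ins (using the architectural CPUID.7.0 bit positions), not the
actual Xen code.]

```c
#include <assert.h>

/* CPUID.7 subleaf-0 feature bits (architectural positions; the
 * cpufeat_mask() names in the patch are replaced by stand-ins here). */
#define FEAT_INVPCID (1u << 10)  /* CPUID.7.0:EBX[10] */
#define FEAT_PKU     (1u <<  3)  /* CPUID.7.0:ECX[3]  */
#define FEAT_OSPKE   (1u <<  4)  /* CPUID.7.0:ECX[4]  */

/* One shared "count == 0" block instead of repeating the condition. */
static void leaf7_subleaf0_fixup(unsigned int count, int hap_enabled,
                                 int guest_cr4_pke,
                                 unsigned int *ebx, unsigned int *ecx)
{
    if ( count != 0 )
        return;                  /* all fixups below apply to subleaf 0 */

    if ( !hap_enabled )
    {
        *ebx &= ~FEAT_INVPCID;   /* don't expose INVPCID to non-hap hvm */
        *ecx &= ~FEAT_PKU;       /* PKU not implemented for shadow paging */
    }

    /* OSPKE mirrors guest CR4.PKE, and only if PKU itself is visible. */
    if ( *ecx & FEAT_PKU )
        *ecx |= guest_cr4_pke ? FEAT_OSPKE : 0;
    else
        *ecx &= ~FEAT_OSPKE;
}
```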


* Re: [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys
  2015-12-21  7:21 ` [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2015-12-21 15:09   ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 15:09 UTC (permalink / raw)
  To: Huaitong Han
  Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
	jun.nakajima, keir

>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> This patch adds xstate support for pkeys.
> 
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
>  xen/include/asm-x86/xstate.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
> index 12d939b..f7c41ba 100644
> --- a/xen/include/asm-x86/xstate.h
> +++ b/xen/include/asm-x86/xstate.h
> @@ -34,13 +34,15 @@
>  #define XSTATE_OPMASK  (1ULL << 5)
>  #define XSTATE_ZMM     (1ULL << 6)
>  #define XSTATE_HI_ZMM  (1ULL << 7)
> +#define XSTATE_PKRU    (1ULL << 9)
>  #define XSTATE_LWP     (1ULL << 62) /* AMD lightweight profiling */
>  #define XSTATE_FP_SSE  (XSTATE_FP | XSTATE_SSE)
>  #define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
>                          XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
>  
>  #define XSTATE_ALL     (~(1ULL << 63))
> -#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
> +#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
> +                        XSTATE_PKRU)

Is it really a good idea to do this uniformly despite PV not going to
have pkeys support?

Jan


* Re: [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
  2015-12-21  7:21 ` [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2015-12-21 15:32   ` Jan Beulich
  2015-12-22  8:12     ` Han, Huaitong
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 15:32 UTC (permalink / raw)
  To: Huaitong Han
  Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
	jun.nakajima, keir

>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
>      return 0;
>  }
>  
> +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS

GUEST_PAGING_LEVELS >= 4 (just like further down)

> +bool_t leaf_pte_pkeys_check(struct vcpu *vcpu,
> +        uint32_t pfec, uint32_t pte_pkey)
> +{
> +    unsigned int pkru = 0;
> +    bool_t pkru_ad, pkru_wd;
> +

Stray blank line.

> +    bool_t pf = !!(pfec & PFEC_page_present);
> +    bool_t uf = !!(pfec & PFEC_user_mode);
> +    bool_t wf = !!(pfec & PFEC_write_access);
> +    bool_t ff = !!(pfec & PFEC_insn_fetch);
> +    bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
> +
> +    /* When page is present,  PFEC_prot_key is always checked */
> +    if ( !pf || is_pv_vcpu(vcpu) )
> +        return 0;

I think for a function called "check" together with how its callers use
it the return value meaning is inverted here. Also the comment seems
inverted wrt the actual check (and is missing a full stop). And doesn't
key 0 have static meaning, in which case you could bail early (and
namely avoid the expensive RDPKRU further down)?

> +    /*
> +     * PKU:  additional mechanism by which the paging controls
> +     * access to user-mode addresses based on the value in the
> +     * PKRU register. A fault is considered as a PKU violation if all
> +     * of the following conditions are true:
> +     * 1.CR4_PKE=1.
> +     * 2.EFER_LMA=1.
> +     * 3.page is present with no reserved bit violations.
> +     * 4.the access is not an instruction fetch.
> +     * 5.the access is to a user page.
> +     * 6.PKRU.AD=1
> +     *       or The access is a data write and PKRU.WD=1
> +     *            and either CR0.WP=1 or it is a user access.
> +     */
> +    if ( !hvm_pku_enabled(vcpu) ||
> +            !hvm_long_mode_enabled(vcpu) || rsvdf || ff )

Where's the "user page" check? Also - indentation.

> +        return 0;
> +
> +    pkru = read_pkru();
> +    if ( unlikely(pkru) )
> +    {
> +        pkru_ad = read_pkru_ad(pkru, pte_pkey);
> +        pkru_wd = read_pkru_wd(pkru, pte_pkey);
> +        /* Condition 6 */
> +        if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf)))

Ah, uf is being checked here. But according to the comment it could
(and should, again to avoid the RDPKRU) move up.

> @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>  
>      pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
>  
> +#if GUEST_PAGING_LEVELS >= 4
> +    pkey = guest_l2e_get_pkey(gw->l2e);
> +    if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
> +        rc |= _PAGE_PKEY_BITS;
> +#endif

I think the #ifdef isn't really needed here, if you moved the one
around leaf_pte_pkeys_check() into that function, and if you
perhaps also dropped the "pkey" local variable.

Jan
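
[Editor's sketch: for reference, the PKRU layout behind the
read_pkru_ad()/read_pkru_wd() helpers discussed above gives each of the
16 protection keys a 2-bit field: bit 2*key is AD (access disable) and
bit 2*key+1 is WD (write disable). The function names match the patch,
but the bodies below are an illustration, not the Xen implementation.]

```c
#include <stdint.h>

/* PKRU holds 16 two-bit entries, one per protection key:
 *   bit 2*key     = AD (access-disable)
 *   bit 2*key + 1 = WD (write-disable)
 * Illustrative bodies for the helpers named in the patch. */
static inline unsigned int read_pkru_ad(uint32_t pkru, unsigned int pkey)
{
    return (pkru >> (pkey * 2)) & 1;
}

static inline unsigned int read_pkru_wd(uint32_t pkru, unsigned int pkey)
{
    return (pkru >> (pkey * 2 + 1)) & 1;
}
```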


* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-12-21 15:07   ` Jan Beulich
@ 2015-12-22  2:54     ` Han, Huaitong
  2015-12-22  8:30       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Han, Huaitong @ 2015-12-22  2:54 UTC (permalink / raw)
  To: JBeulich
  Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel, Nakajima,
	Jun, keir

On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
> > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
> >          /* Don't expose INVPCID to non-hap hvm. */
> >          if ( (count == 0) && !hap_enabled(d) )
> >              *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> > +
> > +        /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
> > +        if ( (count == 0) && !hap_enabled(d) )
> > +            *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> 
> I'm still missing the xsave dependency here.
The XSAVE dependency was deleted because we now use RDPKRU to get the
PKRU register value instead of XSAVE.
> 
> > +        if ( (count == 0) && (*ecx & cpufeat_mask(X86_FEATURE_PKU)) )
> > +            *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
> > +                     cpufeat_mask(X86_FEATURE_OSPKE) : 0;
> 
> Also all (would now be 6!) count == 0 dependencies should be
> combined instead of repeating that condition so many times.
> 
> Jan
> 


* Re: [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
  2015-12-21 15:32   ` Jan Beulich
@ 2015-12-22  8:12     ` Han, Huaitong
  2015-12-22  8:43       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Han, Huaitong @ 2015-12-22  8:12 UTC (permalink / raw)
  To: JBeulich
  Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel, Nakajima,
	Jun, keir

On Mon, 2015-12-21 at 08:32 -0700, Jan Beulich wrote:
> > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> > --- a/xen/arch/x86/mm/guest_walk.c
> > +++ b/xen/arch/x86/mm/guest_walk.c
> > @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
> >      return 0;
> >  }
> >  
> > +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS
> 
> GUEST_PAGING_LEVELS >= 4 (just like further down)
The code is modified according to Andrew's comments:
"
  This is a latent linking bug for the future 
  when 5 levels comes along.

  It will probably be best to use the same trick
  as gw_page_flags to compile it once but use it
  multiple times.
"

> 
> > +bool_t leaf_pte_pkeys_check(struct vcpu *vcpu,
> > +        uint32_t pfec, uint32_t pte_pkey)
> > +{
> > +    unsigned int pkru = 0;
> > +    bool_t pkru_ad, pkru_wd;
> > +
> 
> Stray blank line.
> 
> > +    bool_t pf = !!(pfec & PFEC_page_present);
> > +    bool_t uf = !!(pfec & PFEC_user_mode);
> > +    bool_t wf = !!(pfec & PFEC_write_access);
> > +    bool_t ff = !!(pfec & PFEC_insn_fetch);
> > +    bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
> > +
> > +    /* When page is present,  PFEC_prot_key is always checked */
> > +    if ( !pf || is_pv_vcpu(vcpu) )
> > +        return 0;
> 
> I think for a function called "check" together with how its callers use
> it the return value meaning is inverted here.
I will change the function name to "pkey_page_fault".

> Also the comment seems inverted wrt the actual check (and is missing a
> full stop). And doesn't key 0 have static meaning, in which case you
> could bail early (and namely avoid the expensive RDPKRU further down)?
Key 0 has no static meaning; the default key may differ between OSes.
> 
> > +    /*
> > +     * PKU:  additional mechanism by which the paging controls
> > +     * access to user-mode addresses based on the value in the
> > +     * PKRU register. A fault is considered as a PKU violation if all
> > +     * of the following conditions are true:
> > +     * 1.CR4_PKE=1.
> > +     * 2.EFER_LMA=1.
> > +     * 3.page is present with no reserved bit violations.
> > +     * 4.the access is not an instruction fetch.
> > +     * 5.the access is to a user page.
> > +     * 6.PKRU.AD=1
> > +     *       or The access is a data write and PKRU.WD=1
> > +     *            and either CR0.WP=1 or it is a user access.
> > +     */
> > +    if ( !hvm_pku_enabled(vcpu) ||
> > +            !hvm_long_mode_enabled(vcpu) || rsvdf || ff )
> 
> Where's the "user page" check? Also - indentation.
it should be:
    if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
            rsvdf || ff || !(pte_flags & _PAGE_USER) )
> 
> > +        return 0;
> > +
> > +    pkru = read_pkru();
> > +    if ( unlikely(pkru) )
> > +    {
> > +        pkru_ad = read_pkru_ad(pkru, pte_pkey);
> > +        pkru_wd = read_pkru_wd(pkru, pte_pkey);
> > +        /* Condition 6 */
> > +        if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf)))
> 
> Ah, uf is being checked here. But according to the comment it could
> (and should, again to avoid the RDPKRU) move up.
> 
> > @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
> >  
> >      pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
> >  
> > +#if GUEST_PAGING_LEVELS >= 4
> > +    pkey = guest_l2e_get_pkey(gw->l2e);
> > +    if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
> > +        rc |= _PAGE_PKEY_BITS;
> > +#endif
> 
> I think the #ifdef isn't really needed here, if you moved the one
> around leaf_pte_pkeys_check() into that function, and if you
> perhaps also dropped the "pkey" local variable.
guest_l2e_get_pkey has a different macro depending on GUEST_PAGING_LEVELS
too, and I think it's better to keep it.
> 
> Jan
> 


* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-12-22  2:54     ` Han, Huaitong
@ 2015-12-22  8:30       ` Jan Beulich
  2015-12-22  9:25         ` Han, Huaitong
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2015-12-22  8:30 UTC (permalink / raw)
  To: Huaitong Han
  Cc: Kevin Tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
	Jun Nakajima, keir

>>> On 22.12.15 at 03:54, <huaitong.han@intel.com> wrote:
> On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
>> > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
>> > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>> >          /* Don't expose INVPCID to non-hap hvm. */
>> >          if ( (count == 0) && !hap_enabled(d) )
>> >              *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
>> > +
>> > +        /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
>> > +        if ( (count == 0) && !hap_enabled(d) )
>> > +            *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
>> 
>> I'm still missing the xsave dependency here.
> The XSAVE dependency was deleted because we now use RDPKRU to get the
> PKRU register value instead of XSAVE.

What the hypervisor does doesn't matter here. The question is
whether from an architectural standpoint XSAVE is a prerequisite.
If it is, then you need to clear PKU when _guest_ XSAVE is clear.

Jan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
  2015-12-22  8:12     ` Han, Huaitong
@ 2015-12-22  8:43       ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-22  8:43 UTC (permalink / raw)
  To: Huaitong Han
  Cc: Kevin Tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
	Jun Nakajima, keir

>>> On 22.12.15 at 09:12, <huaitong.han@intel.com> wrote:
> On Mon, 2015-12-21 at 08:32 -0700, Jan Beulich wrote:
>> > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
>> > --- a/xen/arch/x86/mm/guest_walk.c
>> > +++ b/xen/arch/x86/mm/guest_walk.c
>> > @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
>> >      return 0;
>> >  }
>> >  
>> > +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS
>> 
>> GUEST_PAGING_LEVELS >= 4 (just like further down)
> The code is modified according to Andrew's comments:
> "
>   This is a latent linking bug for the future 
>   when 5 levels comes along.
> 
>   It will probably be best to use the same trick
>   as gw_page_flags to compile it once but use it
>   multiple times.
> "

Okay, understood. But then you should follow _that_ model (using
== instead of >=) instead of inventing a 3rd variant. Similar things
should be done in similar ways so that they can be easily recognized
being similar.

Otoh the function isn't being called from code other than
GUEST_PAGING_LEVELS == 4, so at present it could be static,
which would take care of the latent linking issue Andrew referred to.

>> > +    bool_t pf = !!(pfec & PFEC_page_present);
>> > +    bool_t uf = !!(pfec & PFEC_user_mode);
>> > +    bool_t wf = !!(pfec & PFEC_write_access);
>> > +    bool_t ff = !!(pfec & PFEC_insn_fetch);
>> > +    bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
>> > +
>> > +    /* When page is present,  PFEC_prot_key is always checked */
>> > +    if ( !pf || is_pv_vcpu(vcpu) )
>> > +        return 0;
>> 
>> I think for a function called "check" together with how its callers use
>> it the return value meaning is inverted here.
> I will change the function name to "pkey_page_fault".

Perhaps just pkey_fault() then, since it's clear there's no other fault
possible here besides a page fault?

>> Also the comment seems inverted wrt the actual check (and is missing a
>> full stop). And doesn't key 0 have static meaning, in which case you
>> could bail early (and namely avoid the expensive RDPKRU further down)?
> Key 0 has no static meaning; the default key may differ between OSes.

Okay - I thought I had seen something like this mentioned in
another sub-thread.

>> > @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>> >  
>> >      pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v); 
>> >  
>> > +#if GUEST_PAGING_LEVELS >= 4
>> > +    pkey = guest_l2e_get_pkey(gw->l2e);
>> > +    if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
>> > +        rc |= _PAGE_PKEY_BITS;
>> > +#endif
>> 
>> I think the #ifdef isn't really needed here, if you moved the one
>> around leaf_pte_pkeys_check() into that function, and if you
>> perhaps also dropped the "pkey" local variable.
> guest_l2e_get_pkey has a different macro depending on GUEST_PAGING_LEVELS
> too, and I think it's better to keep it.

I didn't suggest to drop guest_l2e_get_pkey(). I'd like to see the
#ifdef-s go away if at all possible, to help code readability.

Jan
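
[Editor's sketch: putting the review points together (early user-page
test, PKRU consulted only after the cheap checks pass), a hypothetical
pkey_fault() could look like the following. The PFEC_* values are the
architectural x86 page-fault error-code bits; pke/lma/wp stand in for
the hvm_pku_enabled/hvm_long_mode_enabled/hvm_wp_enabled predicates,
and the PKRU value is passed in where the real code would issue RDPKRU.]

```c
#include <stdint.h>

/* x86 page-fault error-code bits. */
#define PFEC_page_present  (1u << 0)
#define PFEC_write_access  (1u << 1)
#define PFEC_user_mode     (1u << 2)
#define PFEC_reserved_bit  (1u << 3)
#define PFEC_insn_fetch    (1u << 4)

#define PAGE_USER          (1u << 2)   /* _PAGE_USER */

/* Reordered check covering the six conditions listed in the patch
 * comment; the PKRU read happens only after the cheap tests. */
static int pkey_fault(uint32_t pfec, uint32_t pte_flags, unsigned int pkey,
                      int pke, int lma, int wp, uint32_t pkru)
{
    int uf = !!(pfec & PFEC_user_mode);
    int wf = !!(pfec & PFEC_write_access);
    int ff = !!(pfec & PFEC_insn_fetch);
    int rsvdf = !!(pfec & PFEC_reserved_bit);

    /* Keys are only consulted for present user-mode pages of a
     * long-mode guest with CR4.PKE set, and never for fetches. */
    if ( !(pfec & PFEC_page_present) || !pke || !lma || rsvdf || ff ||
         !(pte_flags & PAGE_USER) )
        return 0;

    /* Only now would RDPKRU be issued in the real code. */
    if ( (pkru >> (pkey * 2)) & 1 )                            /* AD */
        return 1;
    if ( ((pkru >> (pkey * 2 + 1)) & 1) && wf && (wp || uf) )  /* WD */
        return 1;

    return 0;
}
```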


* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-12-22  8:30       ` Jan Beulich
@ 2015-12-22  9:25         ` Han, Huaitong
  2015-12-22  9:51           ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Han, Huaitong @ 2015-12-22  9:25 UTC (permalink / raw)
  To: JBeulich
  Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel, Nakajima,
	Jun, keir

On Tue, 2015-12-22 at 01:30 -0700, Jan Beulich wrote:
> > > > On 22.12.15 at 03:54, <huaitong.han@intel.com> wrote:
> > On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
> > > > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> > > > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
> > > >          /* Don't expose INVPCID to non-hap hvm. */
> > > >          if ( (count == 0) && !hap_enabled(d) )
> > > >              *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> > > > +
> > > > +        /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
> > > > +        if ( (count == 0) && !hap_enabled(d) )
> > > > +            *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> > > 
> > > I'm still missing the xsave dependency here.
> > The XSAVE dependency was deleted because we now use RDPKRU to get the
> > PKRU register value instead of XSAVE.
> 
> What the hypervisor does doesn't matter here. The question is
> whether from an architectural standpoint XSAVE is a prerequisite.
> If it is, then you need to clear PKU when _guest_ XSAVE is clear.
No, XSAVE is not a prerequisite.
> 
> Jan
> 


* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
  2015-12-22  9:25         ` Han, Huaitong
@ 2015-12-22  9:51           ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-22  9:51 UTC (permalink / raw)
  To: Huaitong Han
  Cc: Kevin Tian, wei.liu2, ian.campbell, stefano.stabellini,
	george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
	Jun Nakajima, keir

>>> On 22.12.15 at 10:25, <huaitong.han@intel.com> wrote:
> On Tue, 2015-12-22 at 01:30 -0700, Jan Beulich wrote:
>> > > > On 22.12.15 at 03:54, <huaitong.han@intel.com> wrote:
>> > On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
>> > > > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
>> > > > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>> > > >          /* Don't expose INVPCID to non-hap hvm. */
>> > > >          if ( (count == 0) && !hap_enabled(d) )
>> > > >              *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
>> > > > +
>> > > > +        /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
>> > > > +        if ( (count == 0) && !hap_enabled(d) )
>> > > > +            *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
>> > > 
>> > > I'm still missing the xsave dependency here.
>> > The XSAVE dependency was deleted because we now use RDPKRU to get the
>> > PKRU register value instead of XSAVE.
>> 
>> What the hypervisor does doesn't matter here. The question is
>> whether from an architectural standpoint XSAVE is a prerequisite.
>> If it is, then you need to clear PKU when _guest_ XSAVE is clear.
> No, XSAVE is not a prerequisite.

I.e. OSes are expected to deal with both cases (PKU w/ XSAVE and
PKU w/o XSAVE)? Sounds like a recipe for trouble, but well - okay.

Jan


end of thread, other threads:[~2015-12-22  9:51 UTC | newest]

Thread overview: 18+ messages
2015-12-21  7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2015-12-21  7:21 ` [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
2015-12-21 12:58   ` Jan Beulich
2015-12-21  7:21 ` [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
2015-12-21 15:00   ` Jan Beulich
2015-12-21  7:21 ` [PATCH V4 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
2015-12-21  7:21 ` [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
2015-12-21 15:32   ` Jan Beulich
2015-12-22  8:12     ` Han, Huaitong
2015-12-22  8:43       ` Jan Beulich
2015-12-21  7:21 ` [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
2015-12-21 15:09   ` Jan Beulich
2015-12-21  7:21 ` [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
2015-12-21 15:07   ` Jan Beulich
2015-12-22  2:54     ` Han, Huaitong
2015-12-22  8:30       ` Jan Beulich
2015-12-22  9:25         ` Han, Huaitong
2015-12-22  9:51           ` Jan Beulich
