* [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
2015-12-21 7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
@ 2015-12-21 7:21 ` Huaitong Han
2015-12-21 12:58 ` Jan Beulich
2015-12-21 7:21 ` [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
` (4 subsequent siblings)
5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21 7:21 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch adds the flag to enable Memory Protection Keys.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
docs/misc/xen-command-line.markdown | 10 ++++++++++
xen/arch/x86/cpu/common.c | 10 +++++++++-
xen/include/asm-x86/cpufeature.h | 6 +++++-
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index c103894..36ecf80 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1177,6 +1177,16 @@ This option can be specified more than once (up to 8 times at present).
### ple\_window
> `= <integer>`
+### pku
+> `= <boolean>`
+
+> Default: `true`
+
+Flag to enable Memory Protection Keys.
+
+The protection-key feature provides an additional mechanism by which IA-32e
+paging controls access to usermode addresses.
+
### psr (Intel)
> `= List of ( cmt:<boolean> | rmid_max:<integer> | cat:<boolean> | cos_max:<integer> | cdp:<boolean> )`
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 310ec85..7d03e52 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -22,6 +22,10 @@ boolean_param("xsave", use_xsave);
bool_t opt_arat = 1;
boolean_param("arat", opt_arat);
+/* pku: Flag to enable Memory Protection Keys (default on). */
+bool_t opt_pku = 1;
+boolean_param("pku", opt_pku);
+
unsigned int opt_cpuid_mask_ecx = ~0u;
integer_param("cpuid_mask_ecx", opt_cpuid_mask_ecx);
unsigned int opt_cpuid_mask_edx = ~0u;
@@ -270,7 +274,8 @@ static void __cpuinit generic_identify(struct cpuinfo_x86 *c)
if ( c->cpuid_level >= 0x00000007 )
cpuid_count(0x00000007, 0, &tmp,
&c->x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)],
- &tmp, &tmp);
+ &c->x86_capability[cpufeat_word(X86_FEATURE_PKU)],
+ &tmp);
}
/*
@@ -323,6 +328,9 @@ void __cpuinit identify_cpu(struct cpuinfo_x86 *c)
if ( cpu_has_xsave )
xstate_init(c);
+ if ( !opt_pku )
+ setup_clear_cpu_cap(X86_FEATURE_PKU);
+
/*
* The vendor-specific functions might have changed features. Now
* we do "generic changes."
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index af127cf..ef96514 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -11,7 +11,7 @@
#include <xen/const.h>
-#define NCAPINTS 8 /* N 32-bit words worth of info */
+#define NCAPINTS 9 /* N 32-bit words worth of info */
/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
#define X86_FEATURE_FPU (0*32+ 0) /* Onboard FPU */
@@ -163,6 +163,10 @@
#define X86_FEATURE_ADX (7*32+19) /* ADCX, ADOX instructions */
#define X86_FEATURE_SMAP (7*32+20) /* Supervisor Mode Access Prevention */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 8 */
+#define X86_FEATURE_PKU (8*32+ 3) /* Protection Keys for Userspace */
+#define X86_FEATURE_OSPKE (8*32+ 4) /* OS Protection Keys Enable */
+
#define cpufeat_word(idx) ((idx) / 32)
#define cpufeat_bit(idx) ((idx) % 32)
#define cpufeat_mask(idx) (_AC(1, U) << cpufeat_bit(idx))
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys
2015-12-21 7:21 ` [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
@ 2015-12-21 12:58 ` Jan Beulich
0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 12:58 UTC (permalink / raw)
To: Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
jun.nakajima, keir
>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
First of all - please trim your Cc lists. I don't see, for example, why all
the tools maintainers needed to be Cc-ed on this patch.
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -22,6 +22,10 @@ boolean_param("xsave", use_xsave);
> bool_t opt_arat = 1;
> boolean_param("arat", opt_arat);
>
> +/* pku: Flag to enable Memory Protection Keys (default on). */
> +bool_t opt_pku = 1;
I would have thought that I had commented on this wanting to be
static already, but looks like I'm misremembering.
Jan
* [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4
2015-12-21 7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2015-12-21 7:21 ` [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
@ 2015-12-21 7:21 ` Huaitong Han
2015-12-21 15:00 ` Jan Beulich
2015-12-21 7:21 ` [PATCH V4 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
` (3 subsequent siblings)
5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21 7:21 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch adds pkeys support when setting CR4.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/hvm/hvm.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index db0aeba..59916ed 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1924,6 +1924,7 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
leaf1_edx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_VME)];
leaf1_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PCID)];
leaf7_0_ebx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_FSGSBASE)];
+ leaf7_0_ecx = boot_cpu_data.x86_capability[cpufeat_word(X86_FEATURE_PKU)];
}
return ~(unsigned long)
@@ -1959,7 +1960,9 @@ static unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v,
(leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMEP) ?
X86_CR4_SMEP : 0) |
(leaf7_0_ebx & cpufeat_mask(X86_FEATURE_SMAP) ?
- X86_CR4_SMAP : 0));
+ X86_CR4_SMAP : 0) |
+ (leaf7_0_ecx & cpufeat_mask(X86_FEATURE_PKU) ?
+ X86_CR4_PKE : 0));
}
static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
--
2.4.3
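The mask arithmetic in hvm_cr4_guest_reserved_bits() can be sketched on its
own: each CPUID feature flag unlocks one CR4 bit, and the complement of all
unlocked bits is the guest-reserved set. A simplified, hypothetical sketch
(most of the other feature-conditional bits are elided), not the actual Xen
function:

```c
#include <assert.h>

/* Architectural CR4 bit positions for the three features involved. */
#define X86_CR4_SMEP (1UL << 20)
#define X86_CR4_SMAP (1UL << 21)
#define X86_CR4_PKE  (1UL << 22)

/* Build the guest-reserved CR4 mask from feature availability. */
unsigned long cr4_reserved_bits(int has_smep, int has_smap, int has_pku)
{
    unsigned long allowed = 0;

    /* ... many other feature-conditional bits elided ... */
    allowed |= has_smep ? X86_CR4_SMEP : 0;
    allowed |= has_smap ? X86_CR4_SMAP : 0;
    allowed |= has_pku  ? X86_CR4_PKE  : 0;  /* the bit this patch adds */

    return ~allowed;  /* everything not explicitly allowed is reserved */
}
```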
* Re: [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4
2015-12-21 7:21 ` [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
@ 2015-12-21 15:00 ` Jan Beulich
0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 15:00 UTC (permalink / raw)
To: Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
jun.nakajima, keir
>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> This patch adds pkeys support when setting CR4.
Btw there's little point in a commit message simply repeating the
title.
Jan
* [PATCH V4 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode
2015-12-21 7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
2015-12-21 7:21 ` [PATCH V4 1/6] x86/hvm: pkeys, add the flag to enable Memory Protection Keys Huaitong Han
2015-12-21 7:21 ` [PATCH V4 2/6] x86/hvm: pkeys, add pkeys support when setting CR4 Huaitong Han
@ 2015-12-21 7:21 ` Huaitong Han
2015-12-21 7:21 ` [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
` (2 subsequent siblings)
5 siblings, 0 replies; 18+ messages in thread
From: Huaitong Han @ 2015-12-21 7:21 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch disables pkeys for guests in non-paging mode. Hardware disables
pkeys when the CPU is in non-paging mode, but Xen always uses paging mode to
emulate guest non-paging mode; to emulate that behavior, pkeys need to be
manually disabled when the guest switches to non-paging mode.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/arch/x86/hvm/vmx/vmx.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 2581e97..7123912 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1380,12 +1380,13 @@ static void vmx_update_guest_cr(struct vcpu *v, unsigned int cr)
if ( !hvm_paging_enabled(v) )
{
/*
- * SMEP/SMAP is disabled if CPU is in non-paging mode in hardware.
- * However Xen always uses paging mode to emulate guest non-paging
- * mode. To emulate this behavior, SMEP/SMAP needs to be manually
- * disabled when guest VCPU is in non-paging mode.
+ * SMEP/SMAP/PKU is disabled if CPU is in non-paging mode in
+ * hardware. However Xen always uses paging mode to emulate guest
+ * non-paging mode. To emulate this behavior, SMEP/SMAP/PKU needs
+ * to be manually disabled when guest VCPU is in non-paging mode.
*/
- v->arch.hvm_vcpu.hw_cr[4] &= ~(X86_CR4_SMEP | X86_CR4_SMAP);
+ v->arch.hvm_vcpu.hw_cr[4] &=
+ ~(X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_PKE);
}
__vmwrite(GUEST_CR4, v->arch.hvm_vcpu.hw_cr[4]);
break;
--
2.4.3
* [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-21 7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (2 preceding siblings ...)
2015-12-21 7:21 ` [PATCH V4 3/6] x86/hvm: pkeys, disable pkeys for guests in non-paging mode Huaitong Han
@ 2015-12-21 7:21 ` Huaitong Han
2015-12-21 15:32 ` Jan Beulich
2015-12-21 7:21 ` [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
2015-12-21 7:21 ` [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21 7:21 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
Protection keys define a new 4-bit protection key field (PKEY) in bits 62:59 of
leaf entries of the page tables.
The PKRU register defines 32 bits: 16 domains with 2 attribute bits per
domain. For each i (0 ≤ i ≤ 15), PKRU[2i] is the access-disable bit for
protection key i (ADi) and PKRU[2i+1] is the write-disable bit for protection
key i (WDi). PKEY is an index into these domains.
A fault is considered a PKU violation if all of the following conditions are
true:
1. CR4_PKE=1.
2. EFER_LMA=1.
3. The page is present with no reserved bit violations.
4. The access is not an instruction fetch.
5. The access is to a user page.
6. PKRU.AD=1,
   or the access is a data write and PKRU.WD=1
   and either CR0.WP=1 or it is a user access.
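Restated outside Xen, the six conditions reduce to a small pure predicate.
This is a hypothetical standalone sketch for illustration only; every name
below is invented, and the inputs stand in for state that the real walker
reads from the vCPU and the page-fault error code:

```c
#include <assert.h>
#include <stdint.h>

/* Guest state relevant to the PKU check (illustrative stand-in only). */
typedef struct {
    int cr4_pke;   /* condition 1: CR4.PKE enabled */
    int efer_lma;  /* condition 2: 64-bit (long) mode active */
    int cr0_wp;    /* feeds condition 6: CR0.WP */
} guest_state;

/* Returns 1 iff the access is a PKU violation per conditions 1-6 above. */
int is_pkey_fault(const guest_state *g, int present, int rsvd_violation,
                  int insn_fetch, int user_page, int user_access,
                  int data_write, int pkru_ad, int pkru_wd)
{
    if ( !g->cr4_pke || !g->efer_lma )               /* conditions 1-2 */
        return 0;
    if ( !present || rsvd_violation || insn_fetch )  /* conditions 3-4 */
        return 0;
    if ( !user_page )                                /* condition 5 */
        return 0;
    /* Condition 6: AD always faults; WD faults writes when WP is set
     * or the access itself comes from user mode. */
    return pkru_ad ||
           (pkru_wd && data_write && (g->cr0_wp || user_access));
}
```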
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
xen/arch/x86/mm/guest_walk.c | 65 +++++++++++++++++++++++++++++++++++++++
xen/arch/x86/mm/hap/guest_walk.c | 3 ++
xen/include/asm-x86/guest_pt.h | 7 +++++
xen/include/asm-x86/hvm/hvm.h | 2 ++
xen/include/asm-x86/page.h | 5 +++
xen/include/asm-x86/processor.h | 39 +++++++++++++++++++++++
xen/include/asm-x86/x86_64/page.h | 12 ++++++++
7 files changed, 133 insertions(+)
diff --git a/xen/arch/x86/mm/guest_walk.c b/xen/arch/x86/mm/guest_walk.c
index 18d1acf..f65ba27 100644
--- a/xen/arch/x86/mm/guest_walk.c
+++ b/xen/arch/x86/mm/guest_walk.c
@@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
return 0;
}
+#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS
+bool_t leaf_pte_pkeys_check(struct vcpu *vcpu,
+ uint32_t pfec, uint32_t pte_pkey)
+{
+ unsigned int pkru = 0;
+ bool_t pkru_ad, pkru_wd;
+
+ bool_t pf = !!(pfec & PFEC_page_present);
+ bool_t uf = !!(pfec & PFEC_user_mode);
+ bool_t wf = !!(pfec & PFEC_write_access);
+ bool_t ff = !!(pfec & PFEC_insn_fetch);
+ bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
+
+ /* When page is present, PFEC_prot_key is always checked */
+ if ( !pf || is_pv_vcpu(vcpu) )
+ return 0;
+
+ /*
+ * PKU: additional mechanism by which the paging controls
+ * access to user-mode addresses based on the value in the
+ * PKRU register. A fault is considered as a PKU violation if all
+ * of the following conditions are ture:
+ * 1.CR4_PKE=1.
+ * 2.EFER_LMA=1.
+ * 3.page is present with no reserved bit violations.
+ * 4.the access is not an instruction fetch.
+ * 5.the access is to a user page.
+ * 6.PKRU.AD=1
+ * or The access is a data write and PKRU.WD=1
+ * and either CR0.WP=1 or it is a user access.
+ */
+ if ( !hvm_pku_enabled(vcpu) ||
+ !hvm_long_mode_enabled(vcpu) || rsvdf || ff )
+ return 0;
+
+ pkru = read_pkru();
+ if ( unlikely(pkru) )
+ {
+ pkru_ad = read_pkru_ad(pkru, pte_pkey);
+ pkru_wd = read_pkru_wd(pkru, pte_pkey);
+ /* Condition 6 */
+ if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf)))
+ return 1;
+ }
+
+ return 0;
+}
+#endif
+
/* Walk the guest pagetables, after the manner of a hardware walker. */
/* Because the walk is essentially random, it can cause a deadlock
* warning in the p2m locking code. Highly unlikely this is an actual
@@ -106,6 +155,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
#if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
guest_l3e_t *l3p = NULL;
guest_l4e_t *l4p;
+ unsigned int pkey;
#endif
uint32_t gflags, mflags, iflags, rc = 0;
bool_t smep = 0, smap = 0;
@@ -190,6 +240,7 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
goto out;
/* Get the l3e and check its flags*/
gw->l3e = l3p[guest_l3_table_offset(va)];
+ pkey = guest_l3e_get_pkey(gw->l3e);
gflags = guest_l3e_get_flags(gw->l3e) ^ iflags;
if ( !(gflags & _PAGE_PRESENT) ) {
rc |= _PAGE_PRESENT;
@@ -199,6 +250,9 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
pse1G = (gflags & _PAGE_PSE) && guest_supports_1G_superpages(v);
+ if ( pse1G && leaf_pte_pkeys_check(v, pfec, pkey) )
+ rc |= _PAGE_PKEY_BITS;
+
if ( pse1G )
{
/* Generate a fake l1 table entry so callers don't all
@@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v);
+#if GUEST_PAGING_LEVELS >= 4
+ pkey = guest_l2e_get_pkey(gw->l2e);
+ if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
+ rc |= _PAGE_PKEY_BITS;
+#endif
+
if ( pse2M )
{
/* Special case: this guest VA is in a PSE superpage, so there's
@@ -330,6 +390,11 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
goto out;
}
rc |= ((gflags & mflags) ^ mflags);
+#if GUEST_PAGING_LEVELS >= 4
+ pkey = guest_l1e_get_pkey(gw->l1e);
+ if ( leaf_pte_pkeys_check(v, pfec, pkey) )
+ rc |= _PAGE_PKEY_BITS;
+#endif
}
#if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
diff --git a/xen/arch/x86/mm/hap/guest_walk.c b/xen/arch/x86/mm/hap/guest_walk.c
index 11c1b35..49d0328 100644
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -130,6 +130,9 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PAGING_LEVELS)(
if ( missing & _PAGE_INVALID_BITS )
pfec[0] |= PFEC_reserved_bit;
+ if ( missing & _PAGE_PKEY_BITS )
+ pfec[0] |= PFEC_prot_key;
+
if ( missing & _PAGE_PAGED )
pfec[0] = PFEC_page_paged;
diff --git a/xen/include/asm-x86/guest_pt.h b/xen/include/asm-x86/guest_pt.h
index 3447973..d02f296 100644
--- a/xen/include/asm-x86/guest_pt.h
+++ b/xen/include/asm-x86/guest_pt.h
@@ -154,6 +154,13 @@ static inline u32 guest_l4e_get_flags(guest_l4e_t gl4e)
{ return l4e_get_flags(gl4e); }
#endif
+static inline u32 guest_l1e_get_pkey(guest_l1e_t gl1e)
+{ return l1e_get_pkey(gl1e); }
+static inline u32 guest_l2e_get_pkey(guest_l2e_t gl2e)
+{ return l2e_get_pkey(gl2e); }
+static inline u32 guest_l3e_get_pkey(guest_l3e_t gl3e)
+{ return l3e_get_pkey(gl3e); }
+
static inline guest_l1e_t guest_l1e_from_gfn(gfn_t gfn, u32 flags)
{ return l1e_from_pfn(gfn_x(gfn), flags); }
static inline guest_l2e_t guest_l2e_from_gfn(gfn_t gfn, u32 flags)
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index f80e143..79b3421 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -275,6 +275,8 @@ int hvm_girq_dest_2_vcpu_id(struct domain *d, uint8_t dest, uint8_t dest_mode);
(hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_SMAP))
#define hvm_nx_enabled(v) \
(!!((v)->arch.hvm_vcpu.guest_efer & EFER_NX))
+#define hvm_pku_enabled(v) \
+ (hvm_paging_enabled(v) && ((v)->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE))
/* Can we use superpages in the HAP p2m table? */
#define hap_has_1gb (!!(hvm_funcs.hap_capabilities & HVM_HAP_SUPERPAGE_1GB))
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index a095a93..d69d91d 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -93,6 +93,11 @@
#define l3e_get_flags(x) (get_pte_flags((x).l3))
#define l4e_get_flags(x) (get_pte_flags((x).l4))
+/* Get pte pkeys (unsigned int). */
+#define l1e_get_pkey(x) (get_pte_pkey((x).l1))
+#define l2e_get_pkey(x) (get_pte_pkey((x).l2))
+#define l3e_get_pkey(x) (get_pte_pkey((x).l3))
+
/* Construct an empty pte. */
#define l1e_empty() ((l1_pgentry_t) { 0 })
#define l2e_empty() ((l2_pgentry_t) { 0 })
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 3f8411f..3f251aa 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -373,6 +373,45 @@ static always_inline void clear_in_cr4 (unsigned long mask)
write_cr4(read_cr4() & ~mask);
}
+static inline unsigned int read_pkru(void)
+{
+ unsigned int pkru;
+
+ /*
+ * _PAGE_PKEY_BITS conflict with _PAGE_GNTTAB, which is used by PV guests,
+ * so X86_CR4_PKE is kept disabled in the hypervisor; the RDPKRU
+ * instruction can only be used by temporarily setting CR4.PKE.
+ */
+ set_in_cr4(X86_CR4_PKE);
+ asm volatile (".byte 0x0f,0x01,0xee"
+ : "=a" (pkru) : "c" (0) : "dx");
+ clear_in_cr4(X86_CR4_PKE);
+
+ return pkru;
+}
+
+/* Macros for PKRU domain */
+#define PKRU_READ (0)
+#define PKRU_WRITE (1)
+#define PKRU_ATTRS (2)
+
+/*
+ * PKRU defines 32 bits: 16 domains with 2 attribute bits per domain. A
+ * pkey is an index into these domains, so pkey * PKRU_ATTRS + PKRU_READ
+ * (or PKRU_WRITE) is the bit offset of that domain's attribute.
+ */
+static inline bool_t read_pkru_ad(unsigned int pkru, unsigned int pkey)
+{
+ ASSERT(pkey < 16);
+ return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
+}
+
+static inline bool_t read_pkru_wd(unsigned int pkru, unsigned int pkey)
+{
+ ASSERT(pkey < 16);
+ return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
+}
+
/*
* NSC/Cyrix CPU configuration register indexes
*/
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 19ab4d0..86abb94 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -134,6 +134,18 @@ typedef l4_pgentry_t root_pgentry_t;
#define get_pte_flags(x) (((int)((x) >> 40) & ~0xFFF) | ((int)(x) & 0xFFF))
#define put_pte_flags(x) (((intpte_t)((x) & ~0xFFF) << 40) | ((x) & 0xFFF))
+/*
+ * Protection keys define a new 4-bit protection key field
+ * (PKEY) in bits 62:59 of leaf entries of the page tables.
+ * This corresponds to bit 22:19 of a 24-bit flags.
+ *
+ * Notice: Bit 22 is used by _PAGE_GNTTAB which is visible to PV guests,
+ * so Protection keys must be disabled on PV guests.
+ */
+#define _PAGE_PKEY_BITS (0x780000) /* Protection Keys, 22:19 */
+
+#define get_pte_pkey(x) (MASK_EXTR(get_pte_flags(x), _PAGE_PKEY_BITS))
+
/* Bit 23 of a 24-bit flag mask. This corresponds to bit 63 of a pte.*/
#define _PAGE_NX_BIT (1U<<23)
--
2.4.3
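The PKRU accessors and the PTE key extraction introduced above can be
mirrored in a few lines of standalone C. A hypothetical sketch, not Xen code;
the names follow the patch but the signatures are simplified:

```c
#include <assert.h>
#include <stdint.h>

#define PKRU_READ  0  /* offset of the access-disable (AD) bit */
#define PKRU_WRITE 1  /* offset of the write-disable (WD) bit */
#define PKRU_ATTRS 2  /* 2 attribute bits per protection key */

/* PKRU[2i] = ADi, PKRU[2i+1] = WDi, for keys 0..15. */
int pkru_ad(uint32_t pkru, unsigned int pkey)
{
    return (pkru >> (pkey * PKRU_ATTRS + PKRU_READ)) & 1;
}

int pkru_wd(uint32_t pkru, unsigned int pkey)
{
    return (pkru >> (pkey * PKRU_ATTRS + PKRU_WRITE)) & 1;
}

/* The protection key lives in bits 62:59 of a 64-bit leaf PTE. */
unsigned int pte_pkey(uint64_t pte)
{
    return (unsigned int)((pte >> 59) & 0xf);
}
```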
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
* Re: [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-21 7:21 ` [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2015-12-21 15:32 ` Jan Beulich
2015-12-22 8:12 ` Han, Huaitong
0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 15:32 UTC (permalink / raw)
To: Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
jun.nakajima, keir
>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> --- a/xen/arch/x86/mm/guest_walk.c
> +++ b/xen/arch/x86/mm/guest_walk.c
> @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void *walk_p, int set_dirty)
> return 0;
> }
>
> +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS
GUEST_PAGING_LEVELS >= 4 (just like further down)
> +bool_t leaf_pte_pkeys_check(struct vcpu *vcpu,
> + uint32_t pfec, uint32_t pte_pkey)
> +{
> + unsigned int pkru = 0;
> + bool_t pkru_ad, pkru_wd;
> +
Stray blank line.
> + bool_t pf = !!(pfec & PFEC_page_present);
> + bool_t uf = !!(pfec & PFEC_user_mode);
> + bool_t wf = !!(pfec & PFEC_write_access);
> + bool_t ff = !!(pfec & PFEC_insn_fetch);
> + bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
> +
> + /* When page is present, PFEC_prot_key is always checked */
> + if ( !pf || is_pv_vcpu(vcpu) )
> + return 0;
I think for a function called "check" together with how its callers use
it the return value meaning is inverted here. Also the comment seems
inverted wrt the actual check (and is missing a full stop). And doesn't
key 0 have static meaning, in which case you could bail early (and
namely avoid the expensive RDPKRU further down)?
> + /*
> + * PKU: additional mechanism by which the paging controls
> + * access to user-mode addresses based on the value in the
> + * PKRU register. A fault is considered as a PKU violation if all
> + * of the following conditions are ture:
> + * 1.CR4_PKE=1.
> + * 2.EFER_LMA=1.
> + * 3.page is present with no reserved bit violations.
> + * 4.the access is not an instruction fetch.
> + * 5.the access is to a user page.
> + * 6.PKRU.AD=1
> + * or The access is a data write and PKRU.WD=1
> + * and either CR0.WP=1 or it is a user access.
> + */
> + if ( !hvm_pku_enabled(vcpu) ||
> + !hvm_long_mode_enabled(vcpu) || rsvdf || ff )
Where's the "user page" check? Also - indentation.
> + return 0;
> +
> + pkru = read_pkru();
> + if ( unlikely(pkru) )
> + {
> + pkru_ad = read_pkru_ad(pkru, pte_pkey);
> + pkru_wd = read_pkru_wd(pkru, pte_pkey);
> + /* Condition 6 */
> + if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) || uf)))
Ah, uf is being checked here. But according to the comment it could
(and should, again to avoid the RDPKRU) move up.
> @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct p2m_domain *p2m,
>
> pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v);
>
> +#if GUEST_PAGING_LEVELS >= 4
> + pkey = guest_l2e_get_pkey(gw->l2e);
> + if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
> + rc |= _PAGE_PKEY_BITS;
> +#endif
I think the #ifdef isn't really needed here, if you moved the one
around leaf_pte_pkeys_check() into that function, and if you
perhaps also dropped the "pkey" local variable.
Jan
* Re: [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-21 15:32 ` Jan Beulich
@ 2015-12-22 8:12 ` Han, Huaitong
2015-12-22 8:43 ` Jan Beulich
0 siblings, 1 reply; 18+ messages in thread
From: Han, Huaitong @ 2015-12-22 8:12 UTC (permalink / raw)
To: JBeulich
Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel, Nakajima,
Jun, keir
On Mon, 2015-12-21 at 08:32 -0700, Jan Beulich wrote:
> > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> > --- a/xen/arch/x86/mm/guest_walk.c
> > +++ b/xen/arch/x86/mm/guest_walk.c
> > @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void
> > *walk_p, int set_dirty)
> > return 0;
> > }
> >
> > +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS
>
> GUEST_PAGING_LEVELS >= 4 (just like further down)
The code was modified according to Andrew's comments:
"
This is a latent linking bug for the future
when 5 levels comes along.
It will probably be best to use the same trick
as gw_page_flags to compile it once but use it
multiple times.
"
>
> > +bool_t leaf_pte_pkeys_check(struct vcpu *vcpu,
> > + uint32_t pfec, uint32_t pte_pkey)
> > +{
> > + unsigned int pkru = 0;
> > + bool_t pkru_ad, pkru_wd;
> > +
>
> Stray blank line.
>
> > + bool_t pf = !!(pfec & PFEC_page_present);
> > + bool_t uf = !!(pfec & PFEC_user_mode);
> > + bool_t wf = !!(pfec & PFEC_write_access);
> > + bool_t ff = !!(pfec & PFEC_insn_fetch);
> > + bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
> > +
> > + /* When page is present, PFEC_prot_key is always checked */
> > + if ( !pf || is_pv_vcpu(vcpu) )
> > + return 0;
>
> I think for a function called "check" together with how its callers
> use
> it the return value meaning is inverted here.
I will change the function name to "pkey_page_fault".
> Also the comment seems
> inverted wrt the actual check (and is missing a full stop). And
> doesn't
> key 0 have static meaning, in which case you could bail early (and
> namely avoid the expensive RDPKRU further down)?
Key 0 has no static meaning; the default key may differ between
OSes.
>
> > + /*
> > + * PKU: additional mechanism by which the paging controls
> > + * access to user-mode addresses based on the value in the
> > + * PKRU register. A fault is considered as a PKU violation if
> > all
> > + * of the following conditions are ture:
> > + * 1.CR4_PKE=1.
> > + * 2.EFER_LMA=1.
> > + * 3.page is present with no reserved bit violations.
> > + * 4.the access is not an instruction fetch.
> > + * 5.the access is to a user page.
> > + * 6.PKRU.AD=1
> > + * or The access is a data write and PKRU.WD=1
> > + * and either CR0.WP=1 or it is a user access.
> > + */
> > + if ( !hvm_pku_enabled(vcpu) ||
> > + !hvm_long_mode_enabled(vcpu) || rsvdf || ff )
>
> Where's the "user page" check? Also - indentation.
it should be:
if ( !hvm_pku_enabled(vcpu) || !hvm_long_mode_enabled(vcpu) ||
rsvdf || ff || !(pte_flags & _PAGE_USER) )
>
> > + return 0;
> > +
> > + pkru = read_pkru();
> > + if ( unlikely(pkru) )
> > + {
> > + pkru_ad = read_pkru_ad(pkru, pte_pkey);
> > + pkru_wd = read_pkru_wd(pkru, pte_pkey);
> > + /* Condition 6 */
> > + if ( pkru_ad || (pkru_wd && wf && (hvm_wp_enabled(vcpu) ||
> > uf)))
>
> Ah, uf is being checked here. But according to the comment it could
> (and should, again to avoid the RDPKRU) move up.
>
> > @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct
> > p2m_domain *p2m,
> >
> > pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v);
> >
> > +#if GUEST_PAGING_LEVELS >= 4
> > + pkey = guest_l2e_get_pkey(gw->l2e);
> > + if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
> > + rc |= _PAGE_PKEY_BITS;
> > +#endif
>
> I think the #ifdef isn't really needed here, if you moved the one
> around leaf_pte_pkeys_check() into that function, and if you
> perhaps also dropped the "pkey" local variable.
guest_l2e_get_pkey has a different macro depending on GUEST_PAGING_LEVELS
too, and I think it's better to keep it.
>
> Jan
>
* Re: [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables
2015-12-22 8:12 ` Han, Huaitong
@ 2015-12-22 8:43 ` Jan Beulich
0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-22 8:43 UTC (permalink / raw)
To: Huaitong Han
Cc: Kevin Tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
Jun Nakajima, keir
>>> On 22.12.15 at 09:12, <huaitong.han@intel.com> wrote:
> On Mon, 2015-12-21 at 08:32 -0700, Jan Beulich wrote:
>> > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
>> > --- a/xen/arch/x86/mm/guest_walk.c
>> > +++ b/xen/arch/x86/mm/guest_walk.c
>> > @@ -90,6 +90,55 @@ static uint32_t set_ad_bits(void *guest_p, void
>> > *walk_p, int set_dirty)
>> > return 0;
>> > }
>> >
>> > +#if GUEST_PAGING_LEVELS >= CONFIG_PAGING_LEVELS
>>
>> GUEST_PAGING_LEVELS >= 4 (just like further down)
> The code is modified according Andrew's comments:
> "
> This is a latent linking bug for the future
> when 5 levels comes along.
>
> It will probably be best to use the same trick
> as gw_page_flags to compile it once but use it
> multiple times.
> "
Okay, understood. But then you should follow _that_ model (using
== instead of >=) instead of inventing a 3rd variant. Similar things
should be done in similar ways so that they can be easily recognized
being similar.
Otoh the function isn't being called from code other than
GUEST_PAGING_LEVELS == 4, so at present it could be static,
which would take care of the latent linking issue Andrew referred to.
>> > + bool_t pf = !!(pfec & PFEC_page_present);
>> > + bool_t uf = !!(pfec & PFEC_user_mode);
>> > + bool_t wf = !!(pfec & PFEC_write_access);
>> > + bool_t ff = !!(pfec & PFEC_insn_fetch);
>> > + bool_t rsvdf = !!(pfec & PFEC_reserved_bit);
>> > +
>> > + /* When page is present, PFEC_prot_key is always checked */
>> > + if ( !pf || is_pv_vcpu(vcpu) )
>> > + return 0;
>>
>> I think for a function called "check" together with how its callers
>> use
>> it the return value meaning is inverted here.
> I will change the function name to "pkey_page_fault".
Perhaps just pkey_fault() then, since it's clear there's no other fault
possible here besides a page fault?
>> Also the comment seems
>> inverted wrt the actual check (and is missing a full stop). And
>> doesn't
>> key 0 have static meaning, in which case you could bail early (and
>> namely avoid the expensive RDPKRU further down)?
> Key 0 have no static meaning, the default key maybe different in
> different OS.
Okay - I thought I had seen something like this mentioned in
another sub-thread.
>> > @@ -270,6 +324,12 @@ guest_walk_tables(struct vcpu *v, struct
>> > p2m_domain *p2m,
>> >
>> > pse2M = (gflags & _PAGE_PSE) && guest_supports_superpages(v);
>> >
>> > +#if GUEST_PAGING_LEVELS >= 4
>> > + pkey = guest_l2e_get_pkey(gw->l2e);
>> > + if ( pse2M && leaf_pte_pkeys_check(v, pfec, pkey) )
>> > + rc |= _PAGE_PKEY_BITS;
>> > +#endif
>>
>> I think the #ifdef isn't really needed here, if you moved the one
>> around leaf_pte_pkeys_check() into that function, and if you
>> perhaps also dropped the "pkey" local variable.
> guest_l2e_get_pkey has different macro depend on GUEST_PAGING_LEVELS
> too, and I think it's better to keep it.
I didn't suggest to drop guest_l2e_get_pkey(). I'd like to see the
#ifdef-s go away if at all possible, to help code readability.
Jan
* [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys
2015-12-21 7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (3 preceding siblings ...)
2015-12-21 7:21 ` [PATCH V4 4/6] x86/hvm: pkeys, add pkeys support for guest_walk_tables Huaitong Han
@ 2015-12-21 7:21 ` Huaitong Han
2015-12-21 15:09 ` Jan Beulich
2015-12-21 7:21 ` [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21 7:21 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch adds xstate support for pkeys.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
xen/include/asm-x86/xstate.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 12d939b..f7c41ba 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -34,13 +34,15 @@
#define XSTATE_OPMASK (1ULL << 5)
#define XSTATE_ZMM (1ULL << 6)
#define XSTATE_HI_ZMM (1ULL << 7)
+#define XSTATE_PKRU (1ULL << 9)
#define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */
#define XSTATE_FP_SSE (XSTATE_FP | XSTATE_SSE)
#define XCNTXT_MASK (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
#define XSTATE_ALL (~(1ULL << 63))
-#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
+#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
+ XSTATE_PKRU)
#define XSTATE_LAZY (XSTATE_ALL & ~XSTATE_NONLAZY)
#define XSTATE_COMPACTION_ENABLED (1ULL << 63)
--
2.4.3
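As a sanity check of the mask arithmetic in this hunk, the lazy/non-lazy split can be modelled standalone (bit values copied from the patch and the Intel SDM state-component numbering; this is a sketch, not Xen code):

```c
#include <stdint.h>

/* XSAVE state-component bits, as in the patched header. */
#define XSTATE_FP      (1ULL << 0)
#define XSTATE_SSE     (1ULL << 1)
#define XSTATE_BNDREGS (1ULL << 3)   /* MPX bound registers */
#define XSTATE_BNDCSR  (1ULL << 4)   /* MPX config/status */
#define XSTATE_PKRU    (1ULL << 9)   /* protection-key rights register */
#define XSTATE_LWP     (1ULL << 62)  /* AMD lightweight profiling */

#define XSTATE_ALL     (~(1ULL << 63))  /* bit 63 = compaction flag */
#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
                        XSTATE_PKRU)
#define XSTATE_LAZY    (XSTATE_ALL & ~XSTATE_NONLAZY)
```

The effect of the one-line change is that PKRU moves from the lazy set to the non-lazy set, so it is saved/restored eagerly on context switch rather than on first FPU use.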
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys
2015-12-21 7:21 ` [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2015-12-21 15:09 ` Jan Beulich
0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 15:09 UTC (permalink / raw)
To: Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
jun.nakajima, keir
>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> This patch adds xstate support for pkeys.
>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> xen/include/asm-x86/xstate.h | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
> index 12d939b..f7c41ba 100644
> --- a/xen/include/asm-x86/xstate.h
> +++ b/xen/include/asm-x86/xstate.h
> @@ -34,13 +34,15 @@
> #define XSTATE_OPMASK (1ULL << 5)
> #define XSTATE_ZMM (1ULL << 6)
> #define XSTATE_HI_ZMM (1ULL << 7)
> +#define XSTATE_PKRU (1ULL << 9)
> #define XSTATE_LWP (1ULL << 62) /* AMD lightweight profiling */
> #define XSTATE_FP_SSE (XSTATE_FP | XSTATE_SSE)
> #define XCNTXT_MASK (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
> XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
>
> #define XSTATE_ALL (~(1ULL << 63))
> -#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
> +#define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR | \
> + XSTATE_PKRU)
Is it really a good idea to do this uniformly despite PV not going to
have pkeys support?
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
2015-12-21 7:21 [PATCH V4 0/6] x86/hvm: pkeys, add memory protection-key support Huaitong Han
` (4 preceding siblings ...)
2015-12-21 7:21 ` [PATCH V4 5/6] x86/hvm: pkeys, add xstate support for pkeys Huaitong Han
@ 2015-12-21 7:21 ` Huaitong Han
2015-12-21 15:07 ` Jan Beulich
5 siblings, 1 reply; 18+ messages in thread
From: Huaitong Han @ 2015-12-21 7:21 UTC (permalink / raw)
To: jbeulich, andrew.cooper3, jun.nakajima, eddie.dong, kevin.tian,
george.dunlap, ian.jackson, stefano.stabellini, ian.campbell,
wei.liu2, keir
Cc: Huaitong Han, xen-devel
This patch adds pkeys support for cpuid handling.
Pkeys hardware support is reported via CPUID.7.0.ECX[3]:PKU. Software
support is reported via CPUID.7.0.ECX[4]:OSPKE, which reflects the
setting of CR4.PKE.
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
---
tools/libxc/xc_cpufeature.h | 2 ++
tools/libxc/xc_cpuid_x86.c | 6 ++++--
xen/arch/x86/hvm/hvm.c | 10 +++++++++-
3 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/tools/libxc/xc_cpufeature.h b/tools/libxc/xc_cpufeature.h
index c3ddc80..f6a9778 100644
--- a/tools/libxc/xc_cpufeature.h
+++ b/tools/libxc/xc_cpufeature.h
@@ -141,5 +141,7 @@
#define X86_FEATURE_ADX 19 /* ADCX, ADOX instructions */
#define X86_FEATURE_SMAP 20 /* Supervisor Mode Access Protection */
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx) */
+#define X86_FEATURE_PKU 3
#endif /* __LIBXC_CPUFEATURE_H */
diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index 8882c01..1ce979b 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -427,9 +427,11 @@ static void xc_cpuid_hvm_policy(xc_interface *xch,
bitmaskof(X86_FEATURE_ADX) |
bitmaskof(X86_FEATURE_SMAP) |
bitmaskof(X86_FEATURE_FSGSBASE));
+ regs[2] &= bitmaskof(X86_FEATURE_PKU);
} else
- regs[1] = 0;
- regs[0] = regs[2] = regs[3] = 0;
+ regs[1] = regs[2] = 0;
+
+ regs[0] = regs[3] = 0;
break;
case 0x0000000d:
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 59916ed..05821ed 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4572,7 +4572,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
__clear_bit(X86_FEATURE_APIC & 31, edx);
/* Fix up OSXSAVE. */
- if ( cpu_has_xsave )
+ if ( *ecx & cpufeat_mask(X86_FEATURE_XSAVE) )
*ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_OSXSAVE) ?
cpufeat_mask(X86_FEATURE_OSXSAVE) : 0;
@@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
/* Don't expose INVPCID to non-hap hvm. */
if ( (count == 0) && !hap_enabled(d) )
*ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
+
+ /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
+ if ( (count == 0) && !hap_enabled(d) )
+ *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
+
+ if ( (count == 0) && (*ecx & cpufeat_mask(X86_FEATURE_PKU)) )
+ *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
+ cpufeat_mask(X86_FEATURE_OSPKE) : 0;
break;
case 0xb:
/* Fix the x2APIC identifier. */
--
2.4.3
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
2015-12-21 7:21 ` [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling Huaitong Han
@ 2015-12-21 15:07 ` Jan Beulich
2015-12-22 2:54 ` Han, Huaitong
0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2015-12-21 15:07 UTC (permalink / raw)
To: Huaitong Han
Cc: kevin.tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
jun.nakajima, keir
>>> On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
> /* Don't expose INVPCID to non-hap hvm. */
> if ( (count == 0) && !hap_enabled(d) )
> *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> +
> + /* X86_FEATURE_PKU is not yet implemented for shadow paging. */
> + if ( (count == 0) && !hap_enabled(d) )
> + *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
I'm still missing the xsave dependency here.
> + if ( (count == 0) && (*ecx & cpufeat_mask(X86_FEATURE_PKU)) )
> + *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
> + cpufeat_mask(X86_FEATURE_OSPKE) : 0;
Also all (would now be 6!) count == 0 dependencies should be
combined instead of repeating that condition so many times.
Jan
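Jan's point about folding the repeated `count == 0` conditions together could look roughly like this (a sketch with simplified types and only the two adjustments from this hunk; the real code has several more leaf-7 fixups, and the helper name is hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

#define bitmask(b) (1u << (b))
#define FEAT_INVPCID 10   /* CPUID.7.0:EBX bit 10, per the SDM */
#define FEAT_PKU      3   /* CPUID.7.0:ECX bit 3 */
#define FEAT_OSPKE    4   /* CPUID.7.0:ECX bit 4 */

/* One combined subleaf-0 block instead of repeating (count == 0). */
static void adjust_leaf7(unsigned int count, bool hap, bool cr4_pke,
                         uint32_t *ebx, uint32_t *ecx)
{
    if ( count == 0 )
    {
        if ( !hap )
        {
            *ebx &= ~bitmask(FEAT_INVPCID); /* no INVPCID w/o HAP */
            *ecx &= ~bitmask(FEAT_PKU);     /* no pkeys w/o HAP */
        }
        if ( *ecx & bitmask(FEAT_PKU) )
            *ecx |= cr4_pke ? bitmask(FEAT_OSPKE) : 0;
    }
}
```

Nesting the per-feature fixups under a single `count == 0` check keeps the condition in one place and makes the ordering (clear PKU first, then derive OSPKE) explicit.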
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
2015-12-21 15:07 ` Jan Beulich
@ 2015-12-22 2:54 ` Han, Huaitong
2015-12-22 8:30 ` Jan Beulich
0 siblings, 1 reply; 18+ messages in thread
From: Han, Huaitong @ 2015-12-22 2:54 UTC (permalink / raw)
To: JBeulich
Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel, Nakajima,
Jun, keir
On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
> > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned
> > int *eax, unsigned int *ebx,
> > /* Don't expose INVPCID to non-hap hvm. */
> > if ( (count == 0) && !hap_enabled(d) )
> > *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> > +
> > + /* X86_FEATURE_PKU is not yet implemented for shadow
> > paging. */
> > + if ( (count == 0) && !hap_enabled(d) )
> > + *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
>
> I'm still missing the xsave dependency here.
The XSAVE dependency was dropped because we now use RDPKRU to read
the PKRU register value instead of XSAVE.
>
> > + if ( (count == 0) && (*ecx &
> > cpufeat_mask(X86_FEATURE_PKU)) )
> > + *ecx |= (v->arch.hvm_vcpu.guest_cr[4] & X86_CR4_PKE) ?
> > + cpufeat_mask(X86_FEATURE_OSPKE) : 0;
>
> Also all (would now be 6!) count == 0 dependencies should be
> combined instead of repeating that condition so many times.
>
> Jan
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
2015-12-22 2:54 ` Han, Huaitong
@ 2015-12-22 8:30 ` Jan Beulich
2015-12-22 9:25 ` Han, Huaitong
0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2015-12-22 8:30 UTC (permalink / raw)
To: Huaitong Han
Cc: Kevin Tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
Jun Nakajima, keir
>>> On 22.12.15 at 03:54, <huaitong.han@intel.com> wrote:
> On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
>> > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
>> > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input, unsigned
>> > int *eax, unsigned int *ebx,
>> > /* Don't expose INVPCID to non-hap hvm. */
>> > if ( (count == 0) && !hap_enabled(d) )
>> > *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
>> > +
>> > + /* X86_FEATURE_PKU is not yet implemented for shadow
>> > paging. */
>> > + if ( (count == 0) && !hap_enabled(d) )
>> > + *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
>>
>> I'm still missing the xsave dependency here.
> The XSAVE dependency was dropped because we now use RDPKRU to read
> the PKRU register value instead of XSAVE.
What the hypervisor does doesn't matter here. The question is
whether, from an architectural standpoint, XSAVE is a prerequisite.
If it is, then you need to clear PKU when _guest_ XSAVE is clear.
Jan
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
2015-12-22 8:30 ` Jan Beulich
@ 2015-12-22 9:25 ` Han, Huaitong
2015-12-22 9:51 ` Jan Beulich
0 siblings, 1 reply; 18+ messages in thread
From: Han, Huaitong @ 2015-12-22 9:25 UTC (permalink / raw)
To: JBeulich
Cc: Tian, Kevin, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel, Nakajima,
Jun, keir
On Tue, 2015-12-22 at 01:30 -0700, Jan Beulich wrote:
> > > > On 22.12.15 at 03:54, <huaitong.han@intel.com> wrote:
> > On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
> > > > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
> > > > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input,
> > > > unsigned
> > > > int *eax, unsigned int *ebx,
> > > > /* Don't expose INVPCID to non-hap hvm. */
> > > > if ( (count == 0) && !hap_enabled(d) )
> > > > *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
> > > > +
> > > > + /* X86_FEATURE_PKU is not yet implemented for shadow
> > > > paging. */
> > > > + if ( (count == 0) && !hap_enabled(d) )
> > > > + *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
> > >
> > > I'm still missing the xsave dependency here.
> > The XSAVE dependency was dropped because we now use RDPKRU to read
> > the PKRU register value instead of XSAVE.
>
> What the hypervisor does doesn't matter here. The question is
> whether, from an architectural standpoint, XSAVE is a prerequisite.
> If it is, then you need to clear PKU when _guest_ XSAVE is clear.
No, XSAVE is not a prerequisite.
>
> Jan
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH V4 6/6] x86/hvm: pkeys, add pkeys support for cpuid handling
2015-12-22 9:25 ` Han, Huaitong
@ 2015-12-22 9:51 ` Jan Beulich
0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2015-12-22 9:51 UTC (permalink / raw)
To: Huaitong Han
Cc: Kevin Tian, wei.liu2, ian.campbell, stefano.stabellini,
george.dunlap, andrew.cooper3, ian.jackson, xen-devel,
Jun Nakajima, keir
>>> On 22.12.15 at 10:25, <huaitong.han@intel.com> wrote:
> On Tue, 2015-12-22 at 01:30 -0700, Jan Beulich wrote:
>> > > > On 22.12.15 at 03:54, <huaitong.han@intel.com> wrote:
>> > On Mon, 2015-12-21 at 08:07 -0700, Jan Beulich wrote:
>> > > > > > On 21.12.15 at 08:21, <huaitong.han@intel.com> wrote:
>> > > > @@ -4600,6 +4600,14 @@ void hvm_cpuid(unsigned int input,
>> > > > unsigned
>> > > > int *eax, unsigned int *ebx,
>> > > > /* Don't expose INVPCID to non-hap hvm. */
>> > > > if ( (count == 0) && !hap_enabled(d) )
>> > > > *ebx &= ~cpufeat_mask(X86_FEATURE_INVPCID);
>> > > > +
>> > > > + /* X86_FEATURE_PKU is not yet implemented for shadow
>> > > > paging. */
>> > > > + if ( (count == 0) && !hap_enabled(d) )
>> > > > + *ecx &= ~cpufeat_mask(X86_FEATURE_PKU);
>> > >
>> > > I'm still missing the xsave dependency here.
>> > The XSAVE dependency was dropped because we now use RDPKRU to read
>> > the PKRU register value instead of XSAVE.
>>
>> What the hypervisor does doesn't matter here. The question is
>> whether, from an architectural standpoint, XSAVE is a prerequisite.
>> If it is, then you need to clear PKU when _guest_ XSAVE is clear.
> No, XSAVE is not a prerequisite.
I.e. OSes are expected to deal with both cases (PKU w/ XSAVE and
PKU w/o XSAVE)? Sounds like a recipe for trouble, but well - okay.
Jan
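For completeness, the dependency Jan describes, had it been architecturally required, would be a one-line fixup at CPUID-adjustment time: hide PKU whenever the guest's XSAVE feature bit is clear. A hedged sketch only, since the thread concludes XSAVE is not a prerequisite (the helper name is hypothetical; the bit positions are from the SDM, with XSAVE in CPUID.1:ECX[26] and PKU in CPUID.7.0:ECX[3]):

```c
#include <stdint.h>

#define FEAT1_XSAVE (1u << 26)  /* CPUID.1:ECX[26] */
#define FEAT7_PKU   (1u <<  3)  /* CPUID.7.0:ECX[3] */

/* Hypothetical fixup: hide PKU if the guest has no XSAVE. */
static uint32_t fixup_leaf7_ecx(uint32_t leaf1_ecx, uint32_t leaf7_ecx)
{
    if ( !(leaf1_ecx & FEAT1_XSAVE) )
        leaf7_ecx &= ~FEAT7_PKU;
    return leaf7_ecx;
}
```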
^ permalink raw reply [flat|nested] 18+ messages in thread