* [PATCH V4 0/4] add xsaves/xrstors support
@ 2015-08-25 10:54 Shuai Ruan
  2015-08-25 10:54 ` [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves Shuai Ruan
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-25 10:54 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	jun.nakajima, andrew.cooper3, ian.jackson, eddie.dong, jbeulich,
	keir

Changes in v4:
* Address comments from Andrew, mainly:
* No xsaves support for PV guests.
* Use xstate_sizes instead of domain_cpuid in hvm_cpuid in patch 3.
* Add support for format translation when performing PV guest migration in patch 2.
* Fix some code errors.

Changes in v3:
* Address comments from Jan/Konrad
* Change the bits of eax/ebx/ecx/edx passed to guest in patch 4.
* Add code to verify whether host supports xsaves or not in patch 1.

Changes in v2:
* Address comments from Andrew/Chao/Jan, mainly:
* Add detailed information about the xsaves/xrstors feature.
* Test migration between an xsaves-supporting machine and a non-xsaves-supporting machine.
* Move XGETBV1/XSAVEC/XSAVEOPT out of 'else' in patch 3.
* Change macro name XGETBV to XGETBV1 in patch 4.

This patchset enables the xsaves/xrstors feature, which will be available
on Intel Skylake and later platforms. Like xsave/xrstor, xsaves/xrstors
save and load processor state from a region of memory called the
XSAVE area. Unlike xsave/xrstor, xsaves/xrstors:

a) use the compacted format, and only for the extended region
   of the XSAVE area, which saves memory;
b) can operate on supervisor state components, so the feature
   is a prerequisite for supporting new supervisor state components;
c) execute only at CPL=0.

Detailed hardware specifications can be found in chapter 13 (sections
13.11 and 13.12) of the Intel SDM [1].
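
To illustrate (a): in the compacted format, each enabled extended
component is packed immediately after the previous one rather than
sitting at its fixed architectural offset. A minimal sketch of the
offset computation (names are illustrative, and the optional 64-byte
per-component alignment is ignored):

    /* Sketch: compacted-format offsets; sizes come from CPUID.0xD[i].eax. */
    unsigned int i, offset = 512 + 64;   /* legacy region + xsave header */

    for ( i = 2; i < 63; i++ )
        if ( enabled_mask & (1ull << i) )
        {
            comp_offset[i] = offset;
            offset += component_size[i];
        }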

patch 1: add basic definitions/helpers to support xsaves
patch 2: add xsaves/xrstors support for Xen
patches 3-4: add xsaves/xrstors support for HVM guests


[1] Intel SDM (http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf)

Shuai Ruan (4):
  x86/xsaves: add basic definitions/helpers to support xsaves
  x86/xsaves: enable xsaves/xrstors in xen
  x86/xsaves: enable xsaves/xrstors for hvm guest
  libxc: expose xsaves/xgetbv1/xsavec to hvm guest

 tools/libxc/xc_cpuid_x86.c         |  13 +-
 xen/arch/x86/domain.c              |   2 +
 xen/arch/x86/domctl.c              |  34 ++++-
 xen/arch/x86/hvm/hvm.c             |  62 ++++++++-
 xen/arch/x86/hvm/vmx/vmcs.c        |   6 +-
 xen/arch/x86/hvm/vmx/vmx.c         |  20 +++
 xen/arch/x86/i387.c                |   4 +
 xen/arch/x86/traps.c               |   7 +-
 xen/arch/x86/xstate.c              | 253 ++++++++++++++++++++++++++++++++-----
 xen/include/asm-x86/cpufeature.h   |   4 +
 xen/include/asm-x86/domain.h       |   1 +
 xen/include/asm-x86/hvm/vmx/vmcs.h |   7 +
 xen/include/asm-x86/hvm/vmx/vmx.h  |   2 +
 xen/include/asm-x86/msr-index.h    |   2 +
 xen/include/asm-x86/xstate.h       |  12 +-
 15 files changed, 374 insertions(+), 55 deletions(-)

-- 
1.9.1


* [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves
  2015-08-25 10:54 [PATCH V4 0/4] add xsaves/xrstors support Shuai Ruan
@ 2015-08-25 10:54 ` Shuai Ruan
  2015-08-26  9:47   ` Andrew Cooper
  2015-08-26 12:55   ` Jan Beulich
  2015-08-25 10:54 ` [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen Shuai Ruan
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-25 10:54 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	jun.nakajima, andrew.cooper3, ian.jackson, eddie.dong, jbeulich,
	keir

This patch adds basic definitions/helpers which will be used in
later patches.

Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
---
 xen/arch/x86/xstate.c            | 137 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/cpufeature.h |   4 ++
 xen/include/asm-x86/domain.h     |   1 +
 xen/include/asm-x86/msr-index.h  |   2 +
 xen/include/asm-x86/xstate.h     |  11 +++-
 5 files changed, 154 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index d5f5e3b..3986515 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -29,6 +29,10 @@ static u32 __read_mostly xsave_cntxt_size;
 /* A 64-bit bitmask of the XSAVE/XRSTOR features supported by processor. */
 u64 __read_mostly xfeature_mask;
 
+unsigned int *xstate_offsets, *xstate_sizes;
+static unsigned int xstate_features;
+static unsigned int xstate_comp_offsets[sizeof(xfeature_mask)*8];
+
 /* Cached xcr0 for fast read */
 static DEFINE_PER_CPU(uint64_t, xcr0);
 
@@ -65,6 +69,139 @@ uint64_t get_xcr0(void)
     return this_cpu(xcr0);
 }
 
+static void setup_xstate_features(void)
+{
+    unsigned int eax, ebx, ecx, edx, leaf = 0x2;
+    unsigned int i;
+
+    xstate_features = fls(xfeature_mask);
+    xstate_offsets = xzalloc_array(unsigned int, xstate_features);
+    xstate_sizes = xzalloc_array(unsigned int, xstate_features);
+
+    for (i = 0; i < xstate_features ; i++)
+    {
+        cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);
+
+        xstate_offsets[leaf] = ebx;
+        xstate_sizes[leaf] = eax;
+
+        leaf++;
+    }
+}
+
+static void setup_xstate_comp(void)
+{
+    unsigned int xstate_comp_sizes[sizeof(xfeature_mask)*8];
+    unsigned int i;
+
+    /*
+     * The FP xstates and SSE xstates are legacy states. They are always
+     * in the fixed offsets in the xsave area in either compacted form
+     * or standard form.
+     */
+    xstate_comp_offsets[0] = 0;
+    xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
+
+    xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+
+    for (i = 2; i < xstate_features; i++)
+    {
+        if ( (1ull << i) & xfeature_mask )
+            xstate_comp_sizes[i] = xstate_sizes[i];
+        else
+            xstate_comp_sizes[i] = 0;
+
+        if ( i > 2 )
+            xstate_comp_offsets[i] = xstate_comp_offsets[i-1]
+                                    + xstate_comp_sizes[i-1];
+    }
+}
+
+static void *get_xsave_addr(struct xsave_struct *xsave, int xstate)
+{
+    int feature = fls(xstate) - 1;
+    if ( !(1ull << feature & xfeature_mask) )
+        return NULL;
+
+    return (void *)xsave + xstate_comp_offsets[feature];
+}
+
+void save_xsave_states(struct vcpu *v, void *dest ,unsigned int size)
+{
+    struct xsave_struct *xsave = v->arch.xsave_area;
+    u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
+    u64 valid;
+
+    /*
+     * Copy legacy XSAVE area, to avoid complications with CPUID
+     * leaves 0 and 1 in the loop below.
+     */
+    memcpy(dest, xsave, FXSAVE_SIZE);
+
+    /* Set XSTATE_BV */
+    *(u64 *)(dest + XSAVE_HDR_OFFSET) = xstate_bv;
+
+    /*
+     * Copy each region from the possibly compacted offset to the
+     * non-compacted offset.
+     */
+    valid = xstate_bv & ~XSTATE_FP_SSE;
+    while ( valid )
+    {
+        u64 feature = valid & -valid;
+        int index = fls(feature) - 1;
+        void *src = get_xsave_addr(xsave, feature);
+
+        if ( src )
+        {
+            if ( xstate_offsets[index] + xstate_sizes[index] > size )
+                BUG();
+            memcpy(dest + xstate_offsets[index], src, xstate_sizes[index]);
+        }
+
+        valid -= feature;
+    }
+
+}
+
+void load_xsave_states(struct vcpu *v, void *src, unsigned int size)
+{
+    struct xsave_struct *xsave = v->arch.xsave_area;
+    u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
+    u64 valid;
+
+    /*
+     * Copy legacy XSAVE area, to avoid complications with CPUID
+     * leaves 0 and 1 in the loop below.
+     */
+    memcpy(xsave, src, FXSAVE_SIZE);
+
+    /* Set XSTATE_BV and possibly XCOMP_BV.  */
+    xsave->xsave_hdr.xstate_bv = xstate_bv;
+    xsave->xsave_hdr.xcomp_bv = v->arch.xcr0_accum | XSTATE_COMPACTION_ENABLED;
+
+    /*
+     * Copy each region from the non-compacted offset to the
+     * possibly compacted offset.
+     */
+    valid = xstate_bv & ~XSTATE_FP_SSE;
+    while ( valid )
+    {
+        u64 feature = valid & -valid;
+        int index = fls(feature) - 1;
+        void *dest = get_xsave_addr(xsave, feature);
+
+        if (dest)
+        {
+            if ( xstate_offsets[index] + xstate_sizes[index] > size )
+                BUG();
+            memcpy(dest, src + xstate_offsets[index], xstate_sizes[index]);
+        }
+
+        valid -= feature;
+    }
+}
+
 void xsave(struct vcpu *v, uint64_t mask)
 {
     struct xsave_struct *ptr = v->arch.xsave_area;
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 9a01563..16b1386 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -153,6 +153,10 @@
 #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
 #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
 #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
+#define XSAVEOPT		(1 << 0)
+#define XSAVEC			(1 << 1)
+#define XGETBV1		(1 << 2)
+#define XSAVES			(1 << 3)
 
 #if !defined(__ASSEMBLY__) && !defined(X86_FEATURES_ONLY)
 #include <xen/bitops.h>
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 0fce09e..588a4dc 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -506,6 +506,7 @@ struct arch_vcpu
      */
     struct xsave_struct *xsave_area;
     uint64_t xcr0;
+    uint64_t msr_ia32_xss;
     /* Accumulated eXtended features mask for using XSAVE/XRESTORE by Xen
      * itself, as we can never know whether guest OS depends on content
      * preservation whenever guest OS clears one feature flag (for example,
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index e9c4723..4e5b31f 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -58,6 +58,8 @@
 
 #define MSR_IA32_BNDCFGS		0x00000D90
 
+#define MSR_IA32_XSS			0x00000da0
+
 #define MSR_MTRRfix64K_00000		0x00000250
 #define MSR_MTRRfix16K_80000		0x00000258
 #define MSR_MTRRfix16K_A0000		0x00000259
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 4c690db..a68e18b 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -22,7 +22,11 @@
 
 #define XCR_XFEATURE_ENABLED_MASK 0x00000000  /* index of XCR0 */
 
+#define XSAVE_HDR_SIZE            64
+#define XSAVE_SSE_OFFSET          160
 #define XSTATE_YMM_SIZE           256
+#define FXSAVE_SIZE               512
+#define XSAVE_HDR_OFFSET          FXSAVE_SIZE
 #define XSTATE_AREA_MIN_SIZE      (512 + 64)  /* FP/SSE + XSAVE.HEADER */
 
 #define XSTATE_FP      (1ULL << 0)
@@ -41,9 +45,11 @@
 #define XSTATE_ALL     (~(1ULL << 63))
 #define XSTATE_NONLAZY (XSTATE_LWP | XSTATE_BNDREGS | XSTATE_BNDCSR)
 #define XSTATE_LAZY    (XSTATE_ALL & ~XSTATE_NONLAZY)
+#define XSTATE_COMPACTION_ENABLED  (1ULL << 63)
 
 extern u64 xfeature_mask;
 extern bool_t cpu_has_xsaves, cpu_has_xgetbv1;
+extern unsigned int *xstate_offsets, *xstate_sizes;
 
 /* extended state save area */
 struct __packed __attribute__((aligned (64))) xsave_struct
@@ -72,7 +78,8 @@ struct __packed __attribute__((aligned (64))) xsave_struct
 
     struct {
         u64 xstate_bv;
-        u64 reserved[7];
+        u64 xcomp_bv;
+        u64 reserved[6];
     } xsave_hdr;                             /* The 64-byte header */
 
     struct { char x[XSTATE_YMM_SIZE]; } ymm; /* YMM */
@@ -87,6 +94,8 @@ void xrstor(struct vcpu *v, uint64_t mask);
 bool_t xsave_enabled(const struct vcpu *v);
 int __must_check validate_xstate(u64 xcr0, u64 xcr0_accum, u64 xstate_bv);
 int __must_check handle_xsetbv(u32 index, u64 new_bv);
+void save_xsave_states(struct vcpu *v, void *dest, unsigned int size);
+void load_xsave_states(struct vcpu *v, void *src, unsigned int size);
 
 /* extended state init and cleanup functions */
 void xstate_free_save_area(struct vcpu *v);
-- 
1.9.1
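
[ Usage note: save_xsave_states() expands the (possibly compacted)
hardware layout into the standard layout, and load_xsave_states()
performs the inverse, so a migration stream always carries the
standard format. A hypothetical round-trip, harness only and not part
of the patch, would look like:

    static void xsave_roundtrip_check(struct vcpu *v, unsigned int size)
    {
        void *flat = xmalloc_bytes(size);

        if ( !flat )
            return;

        save_xsave_states(v, flat, size);   /* compacted -> standard  */
        load_xsave_states(v, flat, size);   /* standard  -> compacted */
        xfree(flat);
    }
]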


* [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
  2015-08-25 10:54 [PATCH V4 0/4] add xsaves/xrstors support Shuai Ruan
  2015-08-25 10:54 ` [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves Shuai Ruan
@ 2015-08-25 10:54 ` Shuai Ruan
  2015-08-26 10:12   ` Andrew Cooper
  2015-08-26 13:06   ` Jan Beulich
  2015-08-25 10:54 ` [PATCH V4 3/4] x86/xsaves: enable xsaves/xrstors for hvm guest Shuai Ruan
  2015-08-25 10:54 ` [PATCH V4 4/4] libxc: expose xsaves/xgetbv1/xsavec to " Shuai Ruan
  3 siblings, 2 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-25 10:54 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	jun.nakajima, andrew.cooper3, ian.jackson, eddie.dong, jbeulich,
	keir

This patch uses xsaves/xrstors instead of xsaveopt/xrstor
to perform the xsave_area switching, so that Xen itself
can benefit from them when available.

Since xsaves/xrstors only use the compacted format, format
conversion support is added for guest OS migration.

Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
---
 xen/arch/x86/domain.c  |   2 +
 xen/arch/x86/domctl.c  |  34 ++++++++++++---
 xen/arch/x86/hvm/hvm.c |  19 ++++++---
 xen/arch/x86/i387.c    |   4 ++
 xen/arch/x86/traps.c   |   7 ++--
 xen/arch/x86/xstate.c  | 112 ++++++++++++++++++++++++++++++++++---------------
 6 files changed, 132 insertions(+), 46 deletions(-)

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 045f6ff..7b8f649 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -1529,6 +1529,8 @@ static void __context_switch(void)
             if ( xcr0 != get_xcr0() && !set_xcr0(xcr0) )
                 BUG();
         }
+        if ( cpu_has_xsaves )
+            wrmsrl(MSR_IA32_XSS, n->arch.msr_ia32_xss);
         vcpu_restore_fpu_eager(n);
         n->arch.ctxt_switch_to(n);
     }
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index bf62a88..da136876 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -867,6 +867,7 @@ long arch_do_domctl(
         if ( domctl->cmd == XEN_DOMCTL_getvcpuextstate )
         {
             unsigned int size;
+            void * xsave_area;
 
             ret = 0;
             vcpu_pause(v);
@@ -896,9 +897,28 @@ long arch_do_domctl(
                 ret = -EFAULT;
 
             offset += sizeof(v->arch.xcr0_accum);
-            if ( !ret && copy_to_guest_offset(evc->buffer, offset,
-                                              (void *)v->arch.xsave_area,
-                                              size - 2 * sizeof(uint64_t)) )
+
+            if ( cpu_has_xsaves )
+            {
+                xsave_area = xmalloc_bytes(size);
+                if ( !xsave_area )
+                {
+                    ret = -ENOMEM;
+                    goto vcpuextstate_out;
+                }
+
+                save_xsave_states(v, xsave_area,
+                                  evc->size - 2*sizeof(uint64_t));
+
+                if ( !ret && copy_to_guest_offset(evc->buffer, offset,
+					          xsave_area, size -
+		                                  2 * sizeof(uint64_t)) )
+                     ret = -EFAULT;
+                 xfree(xsave_area);
+           }
+           else if ( !ret && copy_to_guest_offset(evc->buffer, offset,
+                                                 (void *)v->arch.xsave_area,
+                                                 size - 2 * sizeof(uint64_t)) )
                 ret = -EFAULT;
 
             vcpu_unpause(v);
@@ -954,8 +974,12 @@ long arch_do_domctl(
                 v->arch.xcr0_accum = _xcr0_accum;
                 if ( _xcr0_accum & XSTATE_NONLAZY )
                     v->arch.nonlazy_xstate_used = 1;
-                memcpy(v->arch.xsave_area, _xsave_area,
-                       evc->size - 2 * sizeof(uint64_t));
+                if ( cpu_has_xsaves )
+                    load_xsave_states(v, (void *)_xsave_area,
+                                      evc->size - 2*sizeof(uint64_t));
+                else
+                    memcpy(v->arch.xsave_area, (void *)_xsave_area,
+                           evc->size - 2 * sizeof(uint64_t));
                 vcpu_unpause(v);
             }
             else
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index c957610..dc444ac 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2148,8 +2148,12 @@ static int hvm_save_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
         ctxt->xfeature_mask = xfeature_mask;
         ctxt->xcr0 = v->arch.xcr0;
         ctxt->xcr0_accum = v->arch.xcr0_accum;
-        memcpy(&ctxt->save_area, v->arch.xsave_area,
-               size - offsetof(struct hvm_hw_cpu_xsave, save_area));
+	if ( cpu_has_xsaves )
+            save_xsave_states(v, (void *)&ctxt->save_area,
+                              size - offsetof(struct hvm_hw_cpu_xsave,save_area));
+        else
+            memcpy(&ctxt->save_area, v->arch.xsave_area,
+                   size - offsetof(struct hvm_hw_cpu_xsave, save_area));
     }
 
     return 0;
@@ -2248,9 +2252,14 @@ static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
     v->arch.xcr0_accum = ctxt->xcr0_accum;
     if ( ctxt->xcr0_accum & XSTATE_NONLAZY )
         v->arch.nonlazy_xstate_used = 1;
-    memcpy(v->arch.xsave_area, &ctxt->save_area,
-           min(desc->length, size) - offsetof(struct hvm_hw_cpu_xsave,
-           save_area));
+    if ( cpu_has_xsaves )
+        load_xsave_states(v, (void *)&ctxt->save_area,
+                          min(desc->length, size) -
+                          offsetof(struct hvm_hw_cpu_xsave,save_area));
+    else
+        memcpy(v->arch.xsave_area, &ctxt->save_area,
+               min(desc->length, size) - offsetof(struct hvm_hw_cpu_xsave,
+               save_area));
 
     return 0;
 }
diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index 14f2a79..b60b194 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -309,7 +309,11 @@ int vcpu_init_fpu(struct vcpu *v)
         return rc;
 
     if ( v->arch.xsave_area )
+    {
         v->arch.fpu_ctxt = &v->arch.xsave_area->fpu_sse;
+        if ( cpu_has_xsaves )
+            v->arch.xsave_area->xsave_hdr.xcomp_bv |= XSTATE_COMPACTION_ENABLED;
+    }
     else
     {
         v->arch.fpu_ctxt = _xzalloc(sizeof(v->arch.xsave_area->fpu_sse), 16);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 9f5a6c6..e9beec1 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -936,9 +936,10 @@ void pv_cpuid(struct cpu_user_regs *regs)
         if ( regs->_ecx == 1 )
         {
             a &= XSTATE_FEATURE_XSAVEOPT |
-                 XSTATE_FEATURE_XSAVEC |
-                 (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
-                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0);
+                 XSTATE_FEATURE_XSAVEC;
+	         /* PV guest will not support xsaves. */
+                 /* (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
+                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0); */
             if ( !cpu_has_xsaves )
                 b = c = d = 0;
         }
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 3986515..9050607 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -214,6 +214,11 @@ void xsave(struct vcpu *v, uint64_t mask)
         typeof(ptr->fpu_sse.fip.sel) fcs = ptr->fpu_sse.fip.sel;
         typeof(ptr->fpu_sse.fdp.sel) fds = ptr->fpu_sse.fdp.sel;
 
+        if ( cpu_has_xsaves )
+            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
+                           : "=m" (*ptr)
+                           : "a" (lmask), "d" (hmask), "D" (ptr) );
+        else
         if ( cpu_has_xsaveopt )
         {
             /*
@@ -267,6 +272,11 @@ void xsave(struct vcpu *v, uint64_t mask)
     }
     else
     {
+        if ( cpu_has_xsaves )
+            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
+                           : "=m" (*ptr)
+                           : "a" (lmask), "d" (hmask), "D" (ptr) );
+        else
         if ( cpu_has_xsaveopt )
             asm volatile ( ".byte 0x0f,0xae,0x37"
                            : "=m" (*ptr)
@@ -310,36 +320,68 @@ void xrstor(struct vcpu *v, uint64_t mask)
     switch ( __builtin_expect(ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET], 8) )
     {
     default:
-        asm volatile ( "1: .byte 0x48,0x0f,0xae,0x2f\n"
-                       ".section .fixup,\"ax\"      \n"
-                       "2: mov %5,%%ecx             \n"
-                       "   xor %1,%1                \n"
-                       "   rep stosb                \n"
-                       "   lea %2,%0                \n"
-                       "   mov %3,%1                \n"
-                       "   jmp 1b                   \n"
-                       ".previous                   \n"
-                       _ASM_EXTABLE(1b, 2b)
-                       : "+&D" (ptr), "+&a" (lmask)
-                       : "m" (*ptr), "g" (lmask), "d" (hmask),
-                         "m" (xsave_cntxt_size)
-                       : "ecx" );
-        break;
+        if ( cpu_has_xsaves )
+            asm volatile ( "1: .byte 0x48,0x0f,0xc7,0x1f\n"
+                           ".section .fixup,\"ax\"      \n"
+                           "2: mov %5,%%ecx             \n"
+                           "   xor %1,%1                \n"
+                           "   rep stosb                \n"
+                           "   lea %2,%0                \n"
+                           "   mov %3,%1                \n"
+                           "   jmp 1b                   \n"
+                           ".previous                   \n"
+                           _ASM_EXTABLE(1b, 2b)
+                           : "+&D" (ptr), "+&a" (lmask)
+                           : "m" (*ptr), "g" (lmask), "d" (hmask),
+                             "m" (xsave_cntxt_size)
+                           : "ecx" );
+	else
+            asm volatile ( "1: .byte 0x48,0x0f,0xae,0x2f\n"
+                           ".section .fixup,\"ax\"      \n"
+                           "2: mov %5,%%ecx             \n"
+                           "   xor %1,%1                \n"
+                           "   rep stosb                \n"
+                           "   lea %2,%0                \n"
+                           "   mov %3,%1                \n"
+                           "   jmp 1b                   \n"
+                           ".previous                   \n"
+                           _ASM_EXTABLE(1b, 2b)
+                           : "+&D" (ptr), "+&a" (lmask)
+                           : "m" (*ptr), "g" (lmask), "d" (hmask),
+                             "m" (xsave_cntxt_size)
+                           : "ecx" );
+            break;
     case 4: case 2:
-        asm volatile ( "1: .byte 0x0f,0xae,0x2f\n"
-                       ".section .fixup,\"ax\" \n"
-                       "2: mov %5,%%ecx        \n"
-                       "   xor %1,%1           \n"
-                       "   rep stosb           \n"
-                       "   lea %2,%0           \n"
-                       "   mov %3,%1           \n"
-                       "   jmp 1b              \n"
-                       ".previous              \n"
-                       _ASM_EXTABLE(1b, 2b)
-                       : "+&D" (ptr), "+&a" (lmask)
-                       : "m" (*ptr), "g" (lmask), "d" (hmask),
-                         "m" (xsave_cntxt_size)
-                       : "ecx" );
+        if ( cpu_has_xsaves )
+            asm volatile ( "1: .byte 0x48,0x0f,0xc7,0x1f\n"
+                           ".section .fixup,\"ax\"      \n"
+                           "2: mov %5,%%ecx             \n"
+                           "   xor %1,%1                \n"
+                           "   rep stosb                \n"
+                           "   lea %2,%0                \n"
+                           "   mov %3,%1                \n"
+                           "   jmp 1b                   \n"
+                           ".previous                   \n"
+                           _ASM_EXTABLE(1b, 2b)
+                           : "+&D" (ptr), "+&a" (lmask)
+                           : "m" (*ptr), "g" (lmask), "d" (hmask),
+                             "m" (xsave_cntxt_size)
+                           : "ecx" );
+	else
+            asm volatile ( "1: .byte 0x0f,0xae,0x2f\n"
+                           ".section .fixup,\"ax\" \n"
+                           "2: mov %5,%%ecx        \n"
+                           "   xor %1,%1           \n"
+                           "   rep stosb           \n"
+                           "   lea %2,%0           \n"
+                           "   mov %3,%1           \n"
+                           "   jmp 1b              \n"
+                           ".previous              \n"
+                           _ASM_EXTABLE(1b, 2b)
+                           : "+&D" (ptr), "+&a" (lmask)
+                           : "m" (*ptr), "g" (lmask), "d" (hmask),
+                             "m" (xsave_cntxt_size)
+                           : "ecx" );
         break;
     }
 }
@@ -466,16 +508,20 @@ void xstate_init(bool_t bsp)
     {
         cpu_has_xsaveopt = !!(eax & XSTATE_FEATURE_XSAVEOPT);
         cpu_has_xsavec = !!(eax & XSTATE_FEATURE_XSAVEC);
-        /* XXX cpu_has_xgetbv1 = !!(eax & XSTATE_FEATURE_XGETBV1); */
-        /* XXX cpu_has_xsaves = !!(eax & XSTATE_FEATURE_XSAVES); */
+        cpu_has_xgetbv1 = !!(eax & XSTATE_FEATURE_XGETBV1);
+        cpu_has_xsaves = !!(eax & XSTATE_FEATURE_XSAVES);
     }
     else
     {
         BUG_ON(!cpu_has_xsaveopt != !(eax & XSTATE_FEATURE_XSAVEOPT));
         BUG_ON(!cpu_has_xsavec != !(eax & XSTATE_FEATURE_XSAVEC));
-        /* XXX BUG_ON(!cpu_has_xgetbv1 != !(eax & XSTATE_FEATURE_XGETBV1)); */
-        /* XXX BUG_ON(!cpu_has_xsaves != !(eax & XSTATE_FEATURE_XSAVES)); */
+        BUG_ON(!cpu_has_xgetbv1 != !(eax & XSTATE_FEATURE_XGETBV1));
+        BUG_ON(!cpu_has_xsaves != !(eax & XSTATE_FEATURE_XSAVES));
     }
+
+    setup_xstate_features();
+    if ( cpu_has_xsaves )
+        setup_xstate_comp();
 }
 
 static bool_t valid_xcr0(u64 xcr0)
-- 
1.9.1
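
[ For reference: the raw .byte sequences above are hand-assembled
because assemblers of the era lack the mnemonics. Decoding them:

    48 0f c7 2f : REX.W + 0F C7 /5, ModRM 0x2f (reg=101b, rm=111b)
                  => xsaves64 (%rdi)
    48 0f c7 1f : REX.W + 0F C7 /3, ModRM 0x1f (reg=011b, rm=111b)
                  => xrstors64 (%rdi)

With a sufficiently new binutils these could be written directly as
"xsaves (%rdi)" / "xrstors (%rdi)". ]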


* [PATCH V4 3/4] x86/xsaves: enable xsaves/xrstors for hvm guest
  2015-08-25 10:54 [PATCH V4 0/4] add xsaves/xrstors support Shuai Ruan
  2015-08-25 10:54 ` [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves Shuai Ruan
  2015-08-25 10:54 ` [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen Shuai Ruan
@ 2015-08-25 10:54 ` Shuai Ruan
  2015-08-26 10:36   ` Andrew Cooper
  2015-08-26 13:14   ` Jan Beulich
  2015-08-25 10:54 ` [PATCH V4 4/4] libxc: expose xsaves/xgetbv1/xsavec to " Shuai Ruan
  3 siblings, 2 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-25 10:54 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	jun.nakajima, andrew.cooper3, ian.jackson, eddie.dong, jbeulich,
	keir

This patch enables xsaves for HVM guests. It includes:
1. handling xsaves VMCS init and vmexit;
2. adding logic to write/read the XSS MSR.

Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
---
 xen/arch/x86/hvm/hvm.c             | 43 ++++++++++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/vmx/vmcs.c        |  6 ++++--
 xen/arch/x86/hvm/vmx/vmx.c         | 20 ++++++++++++++++++
 xen/arch/x86/xstate.c              |  4 ++--
 xen/include/asm-x86/hvm/vmx/vmcs.h |  7 +++++++
 xen/include/asm-x86/hvm/vmx/vmx.h  |  2 ++
 xen/include/asm-x86/xstate.h       |  1 +
 7 files changed, 79 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index dc444ac..57612de 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4540,6 +4540,33 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                     *ebx = _eax + _ebx;
             }
         }
+        if ( count == 1 )
+        {
+            if ( cpu_has_xsaves )
+            {
+                *ebx = XSTATE_AREA_MIN_SIZE;
+                if ( v->arch.xcr0 | v->arch.msr_ia32_xss )
+                    for ( sub_leaf = 2; sub_leaf < 63; sub_leaf++ )
+                    {
+                        if ( !((v->arch.xcr0 | v->arch.msr_ia32_xss)
+			   & (1ULL << sub_leaf)) )
+                            continue;
+                        _eax = xstate_sizes[sub_leaf];
+                        *ebx =  *ebx + _eax;
+                    }
+            }
+            else
+            {
+                *eax &= ~XSAVES;
+                *ebx = *ecx = *edx = 0;
+            }
+            if ( !cpu_has_xgetbv1 )
+                *eax &= ~XGETBV1;
+            if ( !cpu_has_xsavec )
+                *eax &= ~XSAVEC;
+            if ( !cpu_has_xsaveopt )
+                *eax &= ~XSAVEOPT;
+        }
         break;
 
     case 0x80000001:
@@ -4639,6 +4666,12 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         *msr_content = v->arch.hvm_vcpu.guest_efer;
         break;
 
+    case MSR_IA32_XSS:
+        if ( !cpu_has_xsaves )
+            goto gp_fault;
+        *msr_content = v->arch.msr_ia32_xss;
+        break;
+
     case MSR_IA32_TSC:
         *msr_content = _hvm_rdtsc_intercept();
         break;
@@ -4771,6 +4804,16 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
            return X86EMUL_EXCEPTION;
         break;
 
+    case MSR_IA32_XSS:
+	/*
+	 * skylake only support bit 8 , but it is
+	 * not support in xen.
+	 */
+        if ( !cpu_has_xsaves || msr_content != 0 )
+            goto gp_fault;
+        v->arch.msr_ia32_xss = msr_content;
+        break;
+
     case MSR_IA32_TSC:
         hvm_set_guest_tsc(v, msr_content);
         break;
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index a0a97e7..924ebd0 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -236,7 +236,8 @@ static int vmx_init_vmcs_config(void)
                SECONDARY_EXEC_PAUSE_LOOP_EXITING |
                SECONDARY_EXEC_ENABLE_INVPCID |
                SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
-               SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS);
+               SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS |
+               SECONDARY_EXEC_XSAVES);
         rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
         if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
             opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
@@ -1238,7 +1239,8 @@ static int construct_vmcs(struct vcpu *v)
         __vmwrite(HOST_PAT, host_pat);
         __vmwrite(GUEST_PAT, guest_pat);
     }
-
+    if ( cpu_has_vmx_xsaves )
+        __vmwrite(XSS_EXIT_BITMAP, VMX_XSS_EXIT_BITMAP);
     vmx_vmcs_exit(v);
 
     /* PVH: paging mode is updated by arch_set_info_guest(). */
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index c32d863..42e08da 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2830,6 +2830,18 @@ static void vmx_idtv_reinject(unsigned long idtv_info)
     }
 }
 
+static void vmx_handle_xsaves(void)
+{
+    gdprintk(XENLOG_ERR, "xsaves should not cause vmexit\n");
+    domain_crash(current->domain);
+}
+
+static void vmx_handle_xrstors(void)
+{
+    gdprintk(XENLOG_ERR, "xrstors should not cause vmexit\n");
+    domain_crash(current->domain);
+}
+
 static int vmx_handle_apic_write(void)
 {
     unsigned long exit_qualification;
@@ -3401,6 +3413,14 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs)
         vmx_vcpu_flush_pml_buffer(v);
         break;
 
+    case EXIT_REASON_XSAVES:
+        vmx_handle_xsaves();
+        break;
+
+    case EXIT_REASON_XRSTORS:
+        vmx_handle_xrstors();
+        break;
+
     case EXIT_REASON_ACCESS_GDTR_OR_IDTR:
     case EXIT_REASON_ACCESS_LDTR_OR_TR:
     case EXIT_REASON_VMX_PREEMPTION_TIMER_EXPIRED:
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 9050607..2cc2870 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -14,8 +14,8 @@
 #include <asm/xstate.h>
 #include <asm/asm_defns.h>
 
-static bool_t __read_mostly cpu_has_xsaveopt;
-static bool_t __read_mostly cpu_has_xsavec;
+bool_t __read_mostly cpu_has_xsaveopt;
+bool_t __read_mostly cpu_has_xsavec;
 bool_t __read_mostly cpu_has_xgetbv1;
 bool_t __read_mostly cpu_has_xsaves;
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index f1126d4..65947c4 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -225,6 +225,7 @@ extern u32 vmx_vmentry_control;
 #define SECONDARY_EXEC_ENABLE_VMCS_SHADOWING    0x00004000
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
 #define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS   0x00040000
+#define SECONDARY_EXEC_XSAVES                   0x00100000
 extern u32 vmx_secondary_exec_control;
 
 #define VMX_EPT_EXEC_ONLY_SUPPORTED             0x00000001
@@ -240,6 +241,8 @@ extern u32 vmx_secondary_exec_control;
 
 #define VMX_MISC_VMWRITE_ALL                    0x20000000
 
+#define VMX_XSS_EXIT_BITMAP                     0
+
 #define VMX_VPID_INVVPID_INSTRUCTION                        0x100000000ULL
 #define VMX_VPID_INVVPID_INDIVIDUAL_ADDR                    0x10000000000ULL
 #define VMX_VPID_INVVPID_SINGLE_CONTEXT                     0x20000000000ULL
@@ -291,6 +294,8 @@ extern u32 vmx_secondary_exec_control;
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS)
 #define cpu_has_vmx_pml \
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
+#define cpu_has_vmx_xsaves \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_XSAVES)
 
 #define VMCS_RID_TYPE_MASK              0x80000000
 
@@ -365,6 +370,8 @@ enum vmcs_field {
     VMREAD_BITMAP                   = 0x00002026,
     VMWRITE_BITMAP                  = 0x00002028,
     VIRT_EXCEPTION_INFO             = 0x0000202a,
+    XSS_EXIT_BITMAP                 = 0x0000202c,
+    XSS_EXIT_BITMAP_HIGH            = 0x0000202d,
     GUEST_PHYSICAL_ADDRESS          = 0x00002400,
     VMCS_LINK_POINTER               = 0x00002800,
     GUEST_IA32_DEBUGCTL             = 0x00002802,
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index 3fbfa44..bfc2368 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -188,6 +188,8 @@ static inline unsigned long pi_get_pir(struct pi_desc *pi_desc, int group)
 #define EXIT_REASON_INVPCID             58
 #define EXIT_REASON_VMFUNC              59
 #define EXIT_REASON_PML_FULL            62
+#define EXIT_REASON_XSAVES              63
+#define EXIT_REASON_XRSTORS             64
 
 /*
  * Interruption-information format
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index a68e18b..201a109 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -49,6 +49,7 @@
 
 extern u64 xfeature_mask;
 extern bool_t cpu_has_xsaves, cpu_has_xgetbv1;
+extern bool_t cpu_has_xsavec, cpu_has_xsaveopt;
 extern unsigned int *xstate_offsets, *xstate_sizes;
 
 /* extended state save area */
-- 
1.9.1
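
[ Guest-visible effect of this patch: CPUID.(EAX=0Dh,ECX=1).EAX bit 3
advertises XSAVES, IA32_XSS (0xDA0) becomes readable, and setting any
bit in it faults, since no supervisor states are offered yet. A
hypothetical guest-side probe (rdmsr/wrmsr helpers assumed,
illustrative only):

    uint32_t eax, ebx, ecx, edx;

    cpuid_count(0xd, 1, &eax, &ebx, &ecx, &edx);
    if ( eax & (1u << 3) )              /* XSAVES advertised? */
    {
        uint64_t xss = rdmsr(0xda0);    /* MSR_IA32_XSS, reads as 0  */
        wrmsr(0xda0, xss);              /* writing 0 back succeeds;  */
    }                                   /* any set bit would #GP     */
]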


* [PATCH V4 4/4] libxc: expose xsaves/xgetbv1/xsavec to hvm guest
  2015-08-25 10:54 [PATCH V4 0/4] add xsaves/xrstors support Shuai Ruan
                   ` (2 preceding siblings ...)
  2015-08-25 10:54 ` [PATCH V4 3/4] x86/xsaves: enable xsaves/xrstors for hvm guest Shuai Ruan
@ 2015-08-25 10:54 ` Shuai Ruan
  2015-08-26 10:43   ` Andrew Cooper
  3 siblings, 1 reply; 25+ messages in thread
From: Shuai Ruan @ 2015-08-25 10:54 UTC (permalink / raw)
  To: xen-devel
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	jun.nakajima, andrew.cooper3, ian.jackson, eddie.dong, jbeulich,
	keir

This patch exposes xsaves/xgetbv1/xsavec to HVM guests.
The reserved bits of eax/ebx/ecx/edx must be cleared
when cpuid(0dh) is called with sub-leaf 1 or 2..63.

According to the spec, the following bits are reserved:
For sub-leaf 1, bits 03-04 and 08-31 of ecx are reserved; edx is reserved.
For sub-leaves 2...63, bits 01-31 of ecx are reserved; edx is reserved.

Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
---
 tools/libxc/xc_cpuid_x86.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
index e146a3e..73908b0 100644
--- a/tools/libxc/xc_cpuid_x86.c
+++ b/tools/libxc/xc_cpuid_x86.c
@@ -210,6 +210,9 @@ static void intel_xc_cpuid_policy(
 }
 
 #define XSAVEOPT        (1 << 0)
+#define XSAVEC          (1 << 1)
+#define XGETBV1         (1 << 2)
+#define XSAVES          (1 << 3)
 /* Configure extended state enumeration leaves (0x0000000D for xsave) */
 static void xc_cpuid_config_xsave(
     xc_interface *xch, domid_t domid, uint64_t xfeature_mask,
@@ -246,8 +249,9 @@ static void xc_cpuid_config_xsave(
         regs[1] = 512 + 64; /* FP/SSE + XSAVE.HEADER */
         break;
     case 1: /* leaf 1 */
-        regs[0] &= XSAVEOPT;
-        regs[1] = regs[2] = regs[3] = 0;
+        regs[0] &= (XSAVEOPT | XSAVEC | XGETBV1 | XSAVES);
+        regs[2] &= 0xe7;
+        regs[3] = 0;
         break;
     case 2 ... 63: /* sub-leaves */
         if ( !(xfeature_mask & (1ULL << input[1])) )
@@ -255,8 +259,9 @@ static void xc_cpuid_config_xsave(
             regs[0] = regs[1] = regs[2] = regs[3] = 0;
             break;
         }
-        /* Don't touch EAX, EBX. Also cleanup ECX and EDX */
-        regs[2] = regs[3] = 0;
+        /* Don't touch EAX, EBX. Also cleanup EDX and bits 01-31 of ECX. */
+        regs[2] &= 0x1;
+        regs[3] = 0;
         break;
     }
 }
-- 
1.9.1
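
[ The two ECX masks follow directly from the reserved-bit list in the
commit message, worked out here for clarity:

    leaf 1:       keep bits 0-2 and 5-7  ->  1110 0111b = 0xe7
    leaves 2-63:  keep bit 0 only        ->  0000 0001b = 0x1
]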


* Re: [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves
  2015-08-25 10:54 ` [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves Shuai Ruan
@ 2015-08-26  9:47   ` Andrew Cooper
  2015-08-26 11:41     ` Jan Beulich
  2015-08-28  2:49     ` Shuai Ruan
  2015-08-26 12:55   ` Jan Beulich
  1 sibling, 2 replies; 25+ messages in thread
From: Andrew Cooper @ 2015-08-26  9:47 UTC (permalink / raw)
  To: Shuai Ruan, xen-devel
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	eddie.dong, ian.jackson, jbeulich, jun.nakajima, keir

On 25/08/2015 11:54, Shuai Ruan wrote:
> This patch adds basic definitions/helpers which will be used in
> later patches.
>
> Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>

Thank you - this series is looking far better now.

> ---
>  xen/arch/x86/xstate.c            | 137 +++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/cpufeature.h |   4 ++
>  xen/include/asm-x86/domain.h     |   1 +
>  xen/include/asm-x86/msr-index.h  |   2 +
>  xen/include/asm-x86/xstate.h     |  11 +++-
>  5 files changed, 154 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> index d5f5e3b..3986515 100644
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -29,6 +29,10 @@ static u32 __read_mostly xsave_cntxt_size;
>  /* A 64-bit bitmask of the XSAVE/XRSTOR features supported by processor. */
>  u64 __read_mostly xfeature_mask;
>  
> +unsigned int *xstate_offsets, *xstate_sizes;
> +static unsigned int xstate_features;
> +static unsigned int xstate_comp_offsets[sizeof(xfeature_mask)*8];

All of these should be tagged as __read_mostly.

> +
>  /* Cached xcr0 for fast read */
>  static DEFINE_PER_CPU(uint64_t, xcr0);
>  
> @@ -65,6 +69,139 @@ uint64_t get_xcr0(void)
>      return this_cpu(xcr0);
>  }
>  
> +static void setup_xstate_features(void)
> +{
> +    unsigned int eax, ebx, ecx, edx, leaf = 0x2;
> +    unsigned int i;
> +
> +    xstate_features = fls(xfeature_mask);
> +    xstate_offsets = xzalloc_array(unsigned int, xstate_features);
> +    xstate_sizes = xzalloc_array(unsigned int, xstate_features);

These can both be plain xmalloc_array(), as we unconditionally set every
member below.

As a separate patch (or possibly this one if it isn't too big), you now
need to modify the xstate initialisation to return int rather than
void.  You must also check for allocation failures and bail with
-ENOMEM.  It is fine for an error like this to be fatal to system or AP
startup.

Also, merging review from patch 2, this function gets called once per
pcpu, meaning that you waste a moderate quantity of memory and
repeatedly clobber the xstate_{offsets,sizes} pointers.  You should pass
in bool_t bsp and, if bsp, allocate the arrays and read out, while if
not bsp, check that the ap is identical to the bsp.
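
Roughly (an untested sketch combining the suggestions above):

    static int setup_xstate_features(bool_t bsp)
    {
        unsigned int leaf, tmp, eax, ebx;

        if ( bsp )
        {
            xstate_features = fls(xfeature_mask);
            xstate_offsets = xmalloc_array(unsigned int, xstate_features);
            xstate_sizes = xmalloc_array(unsigned int, xstate_features);
            if ( !xstate_offsets || !xstate_sizes )
                return -ENOMEM;
        }

        for ( leaf = 2; leaf < xstate_features; ++leaf )
        {
            if ( bsp )
                cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
                            &xstate_offsets[leaf], &tmp, &tmp);
            else
            {
                cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &tmp, &tmp);
                if ( eax != xstate_sizes[leaf] ||
                     ebx != xstate_offsets[leaf] )
                    return -EINVAL;
            }
        }

        return 0;
    }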

> +
> +    for (i = 0; i < xstate_features ; i++)

Style: spaces inside the parentheses, and there shouldn't be one before
the ";" in "xstate_features ;"

Also, you can remove i entirely, and use "for ( leaf = 2; leaf <
xstate_features; ++leaf )"

> +    {
> +        cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);
> +
> +        xstate_offsets[leaf] = ebx;
> +        xstate_sizes[leaf] = eax;

You can drop the ebx and eax temporaries, and as ecx and edx are only
around as discard variables, you can get away with having a single "tmp"
variable.

Therefore, something like:

cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
&xstate_offsets[leaf], &tmp, &tmp);

which also drops the body of the function to a single statement,
allowing you to drop the braces.

> +
> +        leaf++;
> +    }
> +}
> +
> +static void setup_xstate_comp(void)

This function can get away with being __init, as it is based solely on
data written by the BSP.

Also merging review from patch 2, it only needs calling once.

> +{
> +    unsigned int xstate_comp_sizes[sizeof(xfeature_mask)*8];
> +    unsigned int i;
> +
> +    /*
> +     * The FP xstates and SSE xstates are legacy states. They are always
> +     * in the fixed offsets in the xsave area in either compacted form
> +     * or standard form.
> +     */
> +    xstate_comp_offsets[0] = 0;
> +    xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
> +
> +    xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> +
> +    for (i = 2; i < xstate_features; i++)

Style - spaces inside braces.

> +    {
> +        if ( (1ull << i) & xfeature_mask )
> +            xstate_comp_sizes[i] = xstate_sizes[i];
> +        else
> +            xstate_comp_sizes[i] = 0;
> +
> +        if ( i > 2 )
> +            xstate_comp_offsets[i] = xstate_comp_offsets[i-1]
> +                                    + xstate_comp_sizes[i-1];

The entire xstate_comp_sizes array (which is quite large) can be removed
if you rearrange the loop body as:

xstate_comp_offsets[i] = xstate_comp_offsets[i-1] + (((1ul << (i-1)) &
xfeature_mask) ? xstate_sizes[i-1] : 0);

which also removes the "if ( i > 2 )"

> +    }

At this point, it is probably worth matching xstate_comp_offsets[i-1]
against cpuid.0xd[0].ecx.  If hardware and Xen disagree on the maximum
size of an xsave area, we shouldn't continue.

> +}
> +
> +static void *get_xsave_addr(struct xsave_struct *xsave, int xstate)
> +{
> +    int feature = fls(xstate) - 1;

In each callsite, you have already calculated fls(xstate) - 1.  It would
be better to pass an "unsigned int xfeature_idx" and avoid the repeated
calculation.

> +    if ( !(1ull << feature & xfeature_mask) )

Style.  This is a mix of binary operators so needs more brackets than this.

if ( !((1ul << feature) & xfeature_mask) )

> +        return NULL;
> +
> +    return (void *)xsave + xstate_comp_offsets[feature];
> +}
> +
> +void save_xsave_states(struct vcpu *v, void *dest ,unsigned int size)

s/ ,/, /

> +{
> +    struct xsave_struct *xsave = v->arch.xsave_area;
> +    u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
> +    u64 valid;
> +
> +    /*
> +     * Copy legacy XSAVE area, to avoid complications with CPUID
> +     * leaves 0 and 1 in the loop below.
> +     */
> +    memcpy(dest, xsave, FXSAVE_SIZE);
> +
> +    /* Set XSTATE_BV */
> +    *(u64 *)(dest + XSAVE_HDR_OFFSET) = xstate_bv;
> +
> +    /*
> +     * Copy each region from the possibly compacted offset to the
> +     * non-compacted offset.
> +     */
> +    valid = xstate_bv & ~XSTATE_FP_SSE;
> +    while ( valid )
> +    {
> +        u64 feature = valid & -valid;
> +        int index = fls(feature) - 1;
> +        void *src = get_xsave_addr(xsave, feature);
> +
> +        if ( src )
> +        {
> +            if ( xstate_offsets[index] + xstate_sizes[index] > size )
> +                BUG();

On second thoughts, this should probably turn into

ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);

The size parameter is from a trusted caller, and it will be more obvious
in the crash report if it does somehow get broken.

> +            memcpy(dest + xstate_offsets[index], src, xstate_sizes[index]);
> +        }
> +
> +        valid -= feature;
> +    }
> +
> +}
> +
> +void load_xsave_states(struct vcpu *v, void *src, unsigned int size)

const void *src

> +{
> +    struct xsave_struct *xsave = v->arch.xsave_area;
> +    u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
> +    u64 valid;
> +
> +    /*
> +     * Copy legacy XSAVE area, to avoid complications with CPUID
> +     * leaves 0 and 1 in the loop below.
> +     */
> +    memcpy(xsave, src, FXSAVE_SIZE);
> +
> +    /* Set XSTATE_BV and possibly XCOMP_BV.  */
> +    xsave->xsave_hdr.xstate_bv = xstate_bv;
> +    xsave->xsave_hdr.xcomp_bv = v->arch.xcr0_accum | XSTATE_COMPACTION_ENABLED;
> +
> +    /*
> +     * Copy each region from the non-compacted offset to the
> +     * possibly compacted offset.
> +     */
> +    valid = xstate_bv & ~XSTATE_FP_SSE;
> +    while ( valid )
> +    {
> +        u64 feature = valid & -valid;
> +        int index = fls(feature) - 1;
> +        void *dest = get_xsave_addr(xsave, feature);
> +
> +        if (dest)

Style.

> +        {
> +            if ( xstate_offsets[index] + xstate_sizes[index] > size )
> +                BUG();
> +            memcpy(dest, src + xstate_offsets[index], xstate_sizes[index]);
> +        }
> +
> +        valid -= feature;
> +    }
> +}
> +
>  void xsave(struct vcpu *v, uint64_t mask)
>  {
>      struct xsave_struct *ptr = v->arch.xsave_area;
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index 9a01563..16b1386 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -153,6 +153,10 @@
>  #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
>  #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
>  #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
> +#define XSAVEOPT		(1 << 0)
> +#define XSAVEC			(1 << 1)
> +#define XGETBV1		(1 << 2)
> +#define XSAVES			(1 << 3)

This absolutely must be X86_FEATURE_* for consistency, and you will need
to introduce a new capability word.
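
Something along these lines (sketch only - the word number is
illustrative, and NCAPINTS would need bumping to match):

    #define X86_FEATURE_XSAVEOPT  (8*32+ 0) /* XSAVEOPT instruction */
    #define X86_FEATURE_XSAVEC    (8*32+ 1) /* XSAVEC/compacted XRSTOR */
    #define X86_FEATURE_XGETBV1   (8*32+ 2) /* XGETBV with ECX=1 */
    #define X86_FEATURE_XSAVES    (8*32+ 3) /* XSAVES/XRSTORS */

with the new word filled from CPUID.0xD[1].EAX during CPU
identification.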

~Andrew


* Re: [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
  2015-08-25 10:54 ` [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen Shuai Ruan
@ 2015-08-26 10:12   ` Andrew Cooper
  2015-08-26 11:50     ` Jan Beulich
  2015-08-28  5:22     ` Shuai Ruan
  2015-08-26 13:06   ` Jan Beulich
  1 sibling, 2 replies; 25+ messages in thread
From: Andrew Cooper @ 2015-08-26 10:12 UTC (permalink / raw)
  To: Shuai Ruan, xen-devel
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	eddie.dong, ian.jackson, jbeulich, jun.nakajima, keir

On 25/08/15 11:54, Shuai Ruan wrote:
> This patch uses xsaves/xrstors instead of xsaveopt/xrstor
> to perform the xsave_area switching, so that Xen itself
> can benefit from them when available.
>
> Since xsaves/xrstors only use the compacted format, format
> conversion support is added for guest OS migration.
>
> Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
> ---
>  xen/arch/x86/domain.c  |   2 +
>  xen/arch/x86/domctl.c  |  34 ++++++++++++---
>  xen/arch/x86/hvm/hvm.c |  19 ++++++---
>  xen/arch/x86/i387.c    |   4 ++
>  xen/arch/x86/traps.c   |   7 ++--
>  xen/arch/x86/xstate.c  | 112 ++++++++++++++++++++++++++++++++++---------------
>  6 files changed, 132 insertions(+), 46 deletions(-)
>
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index 045f6ff..7b8f649 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -1529,6 +1529,8 @@ static void __context_switch(void)
>              if ( xcr0 != get_xcr0() && !set_xcr0(xcr0) )
>                  BUG();
>          }
> +        if ( cpu_has_xsaves )
> +            wrmsrl(MSR_IA32_XSS, n->arch.msr_ia32_xss);

This is still not appropriate.  You need a per cpu variable and only
perform lazy writes of the MSR.
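
For example (untested sketch, mirroring the existing xcr0 caching in
xstate.c):

    static DEFINE_PER_CPU(uint64_t, xss);

    static void set_msr_xss(uint64_t val)
    {
        uint64_t *this_xss = &this_cpu(xss);

        if ( *this_xss != val )     /* skip the WRMSR when unchanged */
        {
            wrmsrl(MSR_IA32_XSS, val);
            *this_xss = val;
        }
    }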

>          vcpu_restore_fpu_eager(n);
>          n->arch.ctxt_switch_to(n);
>      }
> diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> index bf62a88..da136876 100644
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -867,6 +867,7 @@ long arch_do_domctl(
>          if ( domctl->cmd == XEN_DOMCTL_getvcpuextstate )
>          {
>              unsigned int size;
> +            void * xsave_area;

Unnecessary space.

>  
>              ret = 0;
>              vcpu_pause(v);
> @@ -896,9 +897,28 @@ long arch_do_domctl(
>                  ret = -EFAULT;
>  
>              offset += sizeof(v->arch.xcr0_accum);
> -            if ( !ret && copy_to_guest_offset(evc->buffer, offset,
> -                                              (void *)v->arch.xsave_area,
> -                                              size - 2 * sizeof(uint64_t)) )
> +
> +            if ( cpu_has_xsaves )

I think this would be more correct to gate on v->arch.xsave_area
actually being compressed.

> +            {
> +                xsave_area = xmalloc_bytes(size);
> +                if ( !xsave_area )
> +                {
> +                    ret = -ENOMEM;
> +                    goto vcpuextstate_out;
> +                }
> +
> +                save_xsave_states(v, xsave_area,
> +                                  evc->size - 2*sizeof(uint64_t));

And on a similar note, save_xsave_states() is really
uncompress_xsave_area().  On further consideration, it should ASSERT()
that the source is currently compressed.

> +
> +                if ( !ret && copy_to_guest_offset(evc->buffer, offset,
> +					          xsave_area, size -
> +		                                  2 * sizeof(uint64_t)) )
> +                     ret = -EFAULT;
> +                 xfree(xsave_area);
> +           }
> +           else if ( !ret && copy_to_guest_offset(evc->buffer, offset,
> +                                                 (void *)v->arch.xsave_area,
> +                                                 size - 2 * sizeof(uint64_t)) )
>                  ret = -EFAULT;
>  
>              vcpu_unpause(v);
> @@ -954,8 +974,12 @@ long arch_do_domctl(
>                  v->arch.xcr0_accum = _xcr0_accum;
>                  if ( _xcr0_accum & XSTATE_NONLAZY )
>                      v->arch.nonlazy_xstate_used = 1;
> -                memcpy(v->arch.xsave_area, _xsave_area,
> -                       evc->size - 2 * sizeof(uint64_t));
> +                if ( cpu_has_xsaves )
> +                    load_xsave_states(v, (void *)_xsave_area,
> +                                      evc->size - 2*sizeof(uint64_t));

Similarly, load_xsave_states() is really compress_xsave_area().

You should check to see whether the area is already compressed, and just
memcpy() in that case.

> +                else
> +                    memcpy(v->arch.xsave_area, (void *)_xsave_area,
> +                           evc->size - 2 * sizeof(uint64_t));
>                  vcpu_unpause(v);
>              }
>              else
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index c957610..dc444ac 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2148,8 +2148,12 @@ static int hvm_save_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
>          ctxt->xfeature_mask = xfeature_mask;
>          ctxt->xcr0 = v->arch.xcr0;
>          ctxt->xcr0_accum = v->arch.xcr0_accum;
> -        memcpy(&ctxt->save_area, v->arch.xsave_area,
> -               size - offsetof(struct hvm_hw_cpu_xsave, save_area));
> +	if ( cpu_has_xsaves )

Stray hard tab.

> +            save_xsave_states(v, (void *)&ctxt->save_area,
> +                              size - offsetof(struct hvm_hw_cpu_xsave,save_area));

These offsetof()s can become far shorter by using offsetof(*ctxt,
save_area).

> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 9f5a6c6..e9beec1 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -936,9 +936,10 @@ void pv_cpuid(struct cpu_user_regs *regs)
>          if ( regs->_ecx == 1 )
>          {
>              a &= XSTATE_FEATURE_XSAVEOPT |
> -                 XSTATE_FEATURE_XSAVEC |
> -                 (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
> -                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0);
> +                 XSTATE_FEATURE_XSAVEC;
> +	         /* PV guest will not support xsaves. */
> +                 /* (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
> +                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0); */

Don't leave this code commented out like this.  Just delete it.

>              if ( !cpu_has_xsaves )
>                  b = c = d = 0;
>          }
> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> index 3986515..9050607 100644
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -214,6 +214,11 @@ void xsave(struct vcpu *v, uint64_t mask)
>          typeof(ptr->fpu_sse.fip.sel) fcs = ptr->fpu_sse.fip.sel;
>          typeof(ptr->fpu_sse.fdp.sel) fds = ptr->fpu_sse.fdp.sel;
>  
> +        if ( cpu_has_xsaves )
> +            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
> +                           : "=m" (*ptr)
> +                           : "a" (lmask), "d" (hmask), "D" (ptr) );
> +        else
>          if ( cpu_has_xsaveopt )

Please join the if() onto the previous line.

>          {
>              /*
> @@ -267,6 +272,11 @@ void xsave(struct vcpu *v, uint64_t mask)
>      }
>      else
>      {
> +        if ( cpu_has_xsaves )
> +            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
> +                           : "=m" (*ptr)
> +                           : "a" (lmask), "d" (hmask), "D" (ptr) );
> +        else
>          if ( cpu_has_xsaveopt )

And here.

> @@ -466,16 +508,20 @@ void xstate_init(bool_t bsp)
>      {
>          cpu_has_xsaveopt = !!(eax & XSTATE_FEATURE_XSAVEOPT);
>          cpu_has_xsavec = !!(eax & XSTATE_FEATURE_XSAVEC);
> -        /* XXX cpu_has_xgetbv1 = !!(eax & XSTATE_FEATURE_XGETBV1); */
> -        /* XXX cpu_has_xsaves = !!(eax & XSTATE_FEATURE_XSAVES); */
> +        cpu_has_xgetbv1 = !!(eax & XSTATE_FEATURE_XGETBV1);
> +        cpu_has_xsaves = !!(eax & XSTATE_FEATURE_XSAVES);
>      }
>      else
>      {
>          BUG_ON(!cpu_has_xsaveopt != !(eax & XSTATE_FEATURE_XSAVEOPT));
>          BUG_ON(!cpu_has_xsavec != !(eax & XSTATE_FEATURE_XSAVEC));
> -        /* XXX BUG_ON(!cpu_has_xgetbv1 != !(eax & XSTATE_FEATURE_XGETBV1)); */
> -        /* XXX BUG_ON(!cpu_has_xsaves != !(eax & XSTATE_FEATURE_XSAVES)); */
> +        BUG_ON(!cpu_has_xgetbv1 != !(eax & XSTATE_FEATURE_XGETBV1));
> +        BUG_ON(!cpu_has_xsaves != !(eax & XSTATE_FEATURE_XSAVES));
>      }
> +
> +    setup_xstate_features();
> +    if ( cpu_has_xsaves )
> +        setup_xstate_comp();

setup_xstate_comp() should only be called on the bsp, and
setup_xstate_features() should have a similar bsp/ap split with
generating or validating state.

~Andrew


* Re: [PATCH V4 3/4] x86/xsaves: enable xsaves/xrstors for hvm guest
  2015-08-25 10:54 ` [PATCH V4 3/4] x86/xsaves: enable xsaves/xrstors for hvm guest Shuai Ruan
@ 2015-08-26 10:36   ` Andrew Cooper
  2015-08-28  2:52     ` Shuai Ruan
  2015-08-26 13:14   ` Jan Beulich
  1 sibling, 1 reply; 25+ messages in thread
From: Andrew Cooper @ 2015-08-26 10:36 UTC (permalink / raw)
  To: Shuai Ruan, xen-devel
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	eddie.dong, ian.jackson, jbeulich, jun.nakajima, keir

On 25/08/15 11:54, Shuai Ruan wrote:
> This patch enables xsaves for HVM guests. It includes:
> 1. handling xsaves VMCS init and vmexit;
> 2. adding logic to write/read the XSS MSR.
>
> Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, given two
corrections...

> ---
>  xen/arch/x86/hvm/hvm.c             | 43 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/vmx/vmcs.c        |  6 ++++--
>  xen/arch/x86/hvm/vmx/vmx.c         | 20 ++++++++++++++++++
>  xen/arch/x86/xstate.c              |  4 ++--
>  xen/include/asm-x86/hvm/vmx/vmcs.h |  7 +++++++
>  xen/include/asm-x86/hvm/vmx/vmx.h  |  2 ++
>  xen/include/asm-x86/xstate.h       |  1 +
>  7 files changed, 79 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index dc444ac..57612de 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4540,6 +4540,33 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>                      *ebx = _eax + _ebx;
>              }
>          }
> +        if ( count == 1 )
> +        {
> +            if ( cpu_has_xsaves )
> +            {
> +                *ebx = XSTATE_AREA_MIN_SIZE;
> +                if ( v->arch.xcr0 | v->arch.msr_ia32_xss )
> +                    for ( sub_leaf = 2; sub_leaf < 63; sub_leaf++ )
> +                    {
> +                        if ( !((v->arch.xcr0 | v->arch.msr_ia32_xss)
> +			   & (1ULL << sub_leaf)) )

Stray hard tabs.

> @@ -4771,6 +4804,16 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
>             return X86EMUL_EXCEPTION;
>          break;
>  
> +    case MSR_IA32_XSS:
> +	/*
> +	 * skylake only support bit 8 , but it is
> +	 * not support in xen.
> +	 */

I would alter this comment to

/* No XSS features currently supported for guests. */

The Skylake bits are covered by cpu_has_xsaves.

~Andrew


* Re: [PATCH V4 4/4] libxc: expose xsaves/xgetbv1/xsavec to hvm guest
  2015-08-25 10:54 ` [PATCH V4 4/4] libxc: expose xsaves/xgetbv1/xsavec to " Shuai Ruan
@ 2015-08-26 10:43   ` Andrew Cooper
  2015-08-26 12:03     ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Cooper @ 2015-08-26 10:43 UTC (permalink / raw)
  To: Shuai Ruan, xen-devel
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	eddie.dong, ian.jackson, jbeulich, jun.nakajima, keir

On 25/08/15 11:54, Shuai Ruan wrote:
> This patch exposes xsaves/xgetbv1/xsavec to hvm guest.
> The reserved bits of eax/ebx/ecx/edx must be cleaned up
> when calling cpuid(0dh) with leaf 1 or 2..63.
>
> According to the spec the following bits must be reserved:
> For leaf 1, bits 03-04/08-31 of ecx are reserved. Edx is reserved.
> For leaf 2...63, bits 01-31 of ecx are reserved. Edx is reserved.
>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
> ---
>  tools/libxc/xc_cpuid_x86.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/tools/libxc/xc_cpuid_x86.c b/tools/libxc/xc_cpuid_x86.c
> index e146a3e..73908b0 100644
> --- a/tools/libxc/xc_cpuid_x86.c
> +++ b/tools/libxc/xc_cpuid_x86.c
> @@ -210,6 +210,9 @@ static void intel_xc_cpuid_policy(
>  }
>  
>  #define XSAVEOPT        (1 << 0)
> +#define XSAVEC          (1 << 1)
> +#define XGETBV1         (1 << 2)
> +#define XSAVES          (1 << 3)
>  /* Configure extended state enumeration leaves (0x0000000D for xsave) */
>  static void xc_cpuid_config_xsave(
>      xc_interface *xch, domid_t domid, uint64_t xfeature_mask,
> @@ -246,8 +249,9 @@ static void xc_cpuid_config_xsave(
>          regs[1] = 512 + 64; /* FP/SSE + XSAVE.HEADER */
>          break;
>      case 1: /* leaf 1 */
> -        regs[0] &= XSAVEOPT;
> -        regs[1] = regs[2] = regs[3] = 0;
> +        regs[0] &= (XSAVEOPT | XSAVEC | XGETBV1 | XSAVES);
> +        regs[2] &= 0xe7;

Shouldn't this 0xe7 be a mask of xstate_feature bits?

~Andrew


* Re: [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves
  2015-08-26  9:47   ` Andrew Cooper
@ 2015-08-26 11:41     ` Jan Beulich
  2015-08-26 12:53       ` Jan Beulich
  2015-08-28  2:49     ` Shuai Ruan
  1 sibling, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2015-08-26 11:41 UTC (permalink / raw)
  To: Andrew Cooper, Shuai Ruan
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	ian.jackson, xen-devel, jun.nakajima, keir

>>> On 26.08.15 at 11:47, <andrew.cooper3@citrix.com> wrote:
> On 25/08/2015 11:54, Shuai Ruan wrote:
>> --- a/xen/include/asm-x86/cpufeature.h
>> +++ b/xen/include/asm-x86/cpufeature.h
>> @@ -153,6 +153,10 @@
>>  #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
>>  #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
>>  #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
>> +#define XSAVEOPT		(1 << 0)
>> +#define XSAVEC			(1 << 1)
>> +#define XGETBV1		(1 << 2)
>> +#define XSAVES			(1 << 3)
> 
> This absolutely must be X86_FEATURE_* for consistency, and you will need
> to introduce a new capability word.

Or even better re-use word 2.

Jan


* Re: [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
  2015-08-26 10:12   ` Andrew Cooper
@ 2015-08-26 11:50     ` Jan Beulich
  2015-08-26 12:05       ` Andrew Cooper
  2015-08-28  5:22     ` Shuai Ruan
  1 sibling, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2015-08-26 11:50 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	eddie.dong, Shuai Ruan, ian.jackson, xen-devel, jun.nakajima,
	keir

>>> On 26.08.15 at 12:12, <andrew.cooper3@citrix.com> wrote:
> On 25/08/15 11:54, Shuai Ruan wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -2148,8 +2148,12 @@ static int hvm_save_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
>>          ctxt->xfeature_mask = xfeature_mask;
>>          ctxt->xcr0 = v->arch.xcr0;
>>          ctxt->xcr0_accum = v->arch.xcr0_accum;
>> -        memcpy(&ctxt->save_area, v->arch.xsave_area,
>> -               size - offsetof(struct hvm_hw_cpu_xsave, save_area));
>> +	if ( cpu_has_xsaves )
> 
> Stray hard tab.
> 
>> +            save_xsave_states(v, (void *)&ctxt->save_area,
>> +                              size - offsetof(struct hvm_hw_cpu_xsave,save_area));
> 
> These offsetof()s can become far shorter by using offsetof(*ctxt,
> save_area).

Are you mixing this up with sizeof()? If anything, offsetof(typeof(),).

>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -936,9 +936,10 @@ void pv_cpuid(struct cpu_user_regs *regs)
>>          if ( regs->_ecx == 1 )
>>          {
>>              a &= XSTATE_FEATURE_XSAVEOPT |
>> -                 XSTATE_FEATURE_XSAVEC |
>> -                 (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
>> -                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0);
>> +                 XSTATE_FEATURE_XSAVEC;
>> +	         /* PV guest will not support xsaves. */
>> +                 /* (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
>> +                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0); */
> 
> Don't leave this code commented out like this.  Just delete it.

Agreed, but - mind reminding me again why supporting them for
PV guests isn't going to work?

Jan


* Re: [PATCH V4 4/4] libxc: expose xsaves/xgetbv1/xsavec to hvm guest
  2015-08-26 10:43   ` Andrew Cooper
@ 2015-08-26 12:03     ` Jan Beulich
  0 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2015-08-26 12:03 UTC (permalink / raw)
  To: Shuai Ruan
  Cc: kevin.tian, wei.liu2, eddie.dong, stefano.stabellini,
	Andrew Cooper, ian.jackson, xen-devel, jun.nakajima, keir,
	Ian.Campbell

>>> On 26.08.15 at 12:43, <andrew.cooper3@citrix.com> wrote:
> On 25/08/15 11:54, Shuai Ruan wrote:
>> @@ -246,8 +249,9 @@ static void xc_cpuid_config_xsave(
>>          regs[1] = 512 + 64; /* FP/SSE + XSAVE.HEADER */
>>          break;
>>      case 1: /* leaf 1 */
>> -        regs[0] &= XSAVEOPT;
>> -        regs[1] = regs[2] = regs[3] = 0;
>> +        regs[0] &= (XSAVEOPT | XSAVEC | XGETBV1 | XSAVES);
>> +        regs[2] &= 0xe7;
> 
> Shouldn't this 0xe7 be mask of xstate_feature bits?

And a few lines down there's also a similar hard-coded 1. Please
stay away from adding new literal numbers - if there are no suitable
ones, introduce manifest constants.
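
E.g. (a sketch only - the name below is made up, and the value merely
preserves the patch's 0xe7, i.e. the CPUID.0xD[1].ECX bits the commit
message describes as defined):

    #define XSTATE_LEAF1_ECX_MASK 0xe7u

    regs[2] &= XSTATE_LEAF1_ECX_MASK;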

Jan


* Re: [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
  2015-08-26 11:50     ` Jan Beulich
@ 2015-08-26 12:05       ` Andrew Cooper
  2015-08-26 12:35         ` Jan Beulich
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Cooper @ 2015-08-26 12:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	eddie.dong, Shuai Ruan, ian.jackson, xen-devel, jun.nakajima,
	keir

On 26/08/15 12:50, Jan Beulich wrote:
>>>> On 26.08.15 at 12:12, <andrew.cooper3@citrix.com> wrote:
>> On 25/08/15 11:54, Shuai Ruan wrote:
>>> --- a/xen/arch/x86/hvm/hvm.c
>>> +++ b/xen/arch/x86/hvm/hvm.c
>>> @@ -2148,8 +2148,12 @@ static int hvm_save_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
>>>          ctxt->xfeature_mask = xfeature_mask;
>>>          ctxt->xcr0 = v->arch.xcr0;
>>>          ctxt->xcr0_accum = v->arch.xcr0_accum;
>>> -        memcpy(&ctxt->save_area, v->arch.xsave_area,
>>> -               size - offsetof(struct hvm_hw_cpu_xsave, save_area));
>>> +	if ( cpu_has_xsaves )
>> Stray hard tab.
>>
>>> +            save_xsave_states(v, (void *)&ctxt->save_area,
>>> +                              size - offsetof(struct hvm_hw_cpu_xsave,save_area));
>> These offsetof()s can become far shorter by using offsetof(*ctxt,
>> save_area).
> Are you mixing this up with sizeof()? If anything, offsetof(typeof(),).

Oops, sorry - yes.  I did mean to include the use of typeof(), which still
makes it shorter.
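
I.e.

    save_xsave_states(v, (void *)&ctxt->save_area,
                      size - offsetof(typeof(*ctxt), save_area));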

>
>>> --- a/xen/arch/x86/traps.c
>>> +++ b/xen/arch/x86/traps.c
>>> @@ -936,9 +936,10 @@ void pv_cpuid(struct cpu_user_regs *regs)
>>>          if ( regs->_ecx == 1 )
>>>          {
>>>              a &= XSTATE_FEATURE_XSAVEOPT |
>>> -                 XSTATE_FEATURE_XSAVEC |
>>> -                 (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
>>> -                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0);
>>> +                 XSTATE_FEATURE_XSAVEC;
>>> +	         /* PV guest will not support xsaves. */
>>> +                 /* (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
>>> +                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0); */
>> Don't leave this code commented out like this.  Just delete it.
> Agreed, but - mind reminding me again why supporting them for
> PV guests isn't going to work?

xsaves is a cpl0 instruction used to manage state which userspace can't
be trusted to handle alone.  Xen therefore can't trust PV guests to use
it either.

The first of these features is Processor Trace.  A PV guest able to use
xsaves/xrstors would be able to gather trace data of hypervisor
execution, or cause the trace buffers to clobber arbitrary physical memory.

The features covered by MSR_IA32_XSS can safely be exposed to PV guests,
but via a hypercall interface.

~Andrew


* Re: [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
  2015-08-26 12:05       ` Andrew Cooper
@ 2015-08-26 12:35         ` Jan Beulich
  2015-08-28  5:25           ` Shuai Ruan
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2015-08-26 12:35 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	eddie.dong, Shuai Ruan, ian.jackson, xen-devel, jun.nakajima,
	keir

>>> On 26.08.15 at 14:05, <andrew.cooper3@citrix.com> wrote:
> On 26/08/15 12:50, Jan Beulich wrote:
>>>>> On 26.08.15 at 12:12, <andrew.cooper3@citrix.com> wrote:
>>> On 25/08/15 11:54, Shuai Ruan wrote:
>>>> --- a/xen/arch/x86/traps.c
>>>> +++ b/xen/arch/x86/traps.c
>>>> @@ -936,9 +936,10 @@ void pv_cpuid(struct cpu_user_regs *regs)
>>>>          if ( regs->_ecx == 1 )
>>>>          {
>>>>              a &= XSTATE_FEATURE_XSAVEOPT |
>>>> -                 XSTATE_FEATURE_XSAVEC |
>>>> -                 (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
>>>> -                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0);
>>>> +                 XSTATE_FEATURE_XSAVEC;
>>>> +	         /* PV guest will not support xsaves. */
>>>> +                 /* (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
>>>> +                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0); */
>>> Don't leave this code commented out like this.  Just delete it.
>> Agreed, but - mind reminding me again why supporting them for
>> PV guests isn't going to work?
> 
> xsaves is a cpl0 instruction used to manage state which userspace can't
> be trusted to handle alone.  Xen therefore can't trust PV guests to use
> it either.
> 
> The first of these features is Processor Trace.  A PV guest able to use
> xsaves/xrstors would be able to gather trace data of hypervisor
> execution, or cause the trace buffers to clobber arbitrary physical memory.
> 
> The features covered by MSR_IA32_XSS can safely be exposed to PV guests,
> but via a hypercall interface.

That covers xsaves, but not xgetbv1 (which also covers XSAVEOPT).

Jan


* Re: [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves
  2015-08-26 11:41     ` Jan Beulich
@ 2015-08-26 12:53       ` Jan Beulich
  2015-08-28  5:34         ` Shuai Ruan
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2015-08-26 12:53 UTC (permalink / raw)
  To: Andrew Cooper, Shuai Ruan
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	ian.jackson, xen-devel, jun.nakajima, keir

>>> On 26.08.15 at 13:41, <JBeulich@suse.com> wrote:
>>>> On 26.08.15 at 11:47, <andrew.cooper3@citrix.com> wrote:
>> On 25/08/2015 11:54, Shuai Ruan wrote:
>>> --- a/xen/include/asm-x86/cpufeature.h
>>> +++ b/xen/include/asm-x86/cpufeature.h
>>> @@ -153,6 +153,10 @@
>>>  #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
>>>  #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
>>>  #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
>>> +#define XSAVEOPT		(1 << 0)
>>> +#define XSAVEC			(1 << 1)
>>> +#define XGETBV1		(1 << 2)
>>> +#define XSAVES			(1 << 3)
>> 
>> This absolutely must be X86_FEATURE_* for consistency, and you will need
>> to introduce a new capability word.
> 
> Or even better re-use word 2.

And make sure to replace the then redundant XSTATE_FEATURE_*
definitions in xstate.h.

Jan


* Re: [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves
  2015-08-25 10:54 ` [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves Shuai Ruan
  2015-08-26  9:47   ` Andrew Cooper
@ 2015-08-26 12:55   ` Jan Beulich
  1 sibling, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2015-08-26 12:55 UTC (permalink / raw)
  To: Shuai Ruan
  Cc: kevin.tian, wei.liu2, eddie.dong, stefano.stabellini,
	andrew.cooper3, ian.jackson, xen-devel, jun.nakajima, keir,
	Ian.Campbell

>>> On 25.08.15 at 12:54, <shuai.ruan@linux.intel.com> wrote:
> +void save_xsave_states(struct vcpu *v, void *dest ,unsigned int size)
> +{
> +    struct xsave_struct *xsave = v->arch.xsave_area;
> +    u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
> +    u64 valid;
> +
> +    /*
> +     * Copy legacy XSAVE area, to avoid complications with CPUID
> +     * leaves 0 and 1 in the loop below.
> +     */
> +    memcpy(dest, xsave, FXSAVE_SIZE);
> +
> +    /* Set XSTATE_BV */
> +    *(u64 *)(dest + XSAVE_HDR_OFFSET) = xstate_bv;

Please try to avoid such (dangerous) casts.
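
E.g. a memcpy() would avoid the cast altogether (just a sketch):

    memcpy(dest + XSAVE_HDR_OFFSET, &xstate_bv, sizeof(xstate_bv));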

> +    /*
> +     * Copy each region from the possibly compacted offset to the
> +     * non-compacted offset.
> +     */
> +    valid = xstate_bv & ~XSTATE_FP_SSE;
> +    while ( valid )
> +    {
> +        u64 feature = valid & -valid;
> +        int index = fls(feature) - 1;

find_first_set_bit() (or ffsl()) would make this more readable.
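
I.e. something like

    unsigned int index = find_first_set_bit(feature);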

> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -506,6 +506,7 @@ struct arch_vcpu
>       */
>      struct xsave_struct *xsave_area;
>      uint64_t xcr0;
> +    uint64_t msr_ia32_xss;

Considering that you mean to not support this for PV guests, this
should go into struct hvm_vcpu. And I think naming it just xss
or at most msr_xss would be quite fine.

Jan


* Re: [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
  2015-08-25 10:54 ` [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen Shuai Ruan
  2015-08-26 10:12   ` Andrew Cooper
@ 2015-08-26 13:06   ` Jan Beulich
  2015-08-28 10:54     ` Shuai Ruan
       [not found]     ` <20150828105408.GA18437@shuai.ruan@linux.intel.com>
  1 sibling, 2 replies; 25+ messages in thread
From: Jan Beulich @ 2015-08-26 13:06 UTC (permalink / raw)
  To: Shuai Ruan
  Cc: kevin.tian, wei.liu2, eddie.dong, stefano.stabellini,
	andrew.cooper3, ian.jackson, xen-devel, jun.nakajima, keir,
	Ian.Campbell

>>> On 25.08.15 at 12:54, <shuai.ruan@linux.intel.com> wrote:
> @@ -896,9 +897,28 @@ long arch_do_domctl(
>                  ret = -EFAULT;
>  
>              offset += sizeof(v->arch.xcr0_accum);
> -            if ( !ret && copy_to_guest_offset(evc->buffer, offset,
> -                                              (void *)v->arch.xsave_area,
> -                                              size - 2 * sizeof(uint64_t)) )
> +
> +            if ( cpu_has_xsaves )
> +            {

There is no point in entering this code when ret != 0.
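
I.e.

    if ( !ret && cpu_has_xsaves )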

> +                xsave_area = xmalloc_bytes(size);
> +                if ( !xsave_area )
> +                {
> +                    ret = -ENOMEM;
> +                    goto vcpuextstate_out;

As can even be seen from the patch context, this bypasses the
vcpu_unpause().
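
I.e. the error path wants to be something like (sketch):

    if ( !xsave_area )
    {
        ret = -ENOMEM;
        vcpu_unpause(v);
        goto vcpuextstate_out;
    }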

> +                }
> +
> +                save_xsave_states(v, xsave_area,
> +                                  evc->size - 2*sizeof(uint64_t));
> +
> +                if ( !ret && copy_to_guest_offset(evc->buffer, offset,
> +					          xsave_area, size -
> +		                                  2 * sizeof(uint64_t)) )
> +                     ret = -EFAULT;
> +                 xfree(xsave_area);
> +           }
> +           else if ( !ret && copy_to_guest_offset(evc->buffer, offset,
> +                                                 (void *)v->arch.xsave_area,
> +                                                 size - 2 * sizeof(uint64_t)) )
>                  ret = -EFAULT;
>  
>              vcpu_unpause(v);
> @@ -954,8 +974,12 @@ long arch_do_domctl(
>                  v->arch.xcr0_accum = _xcr0_accum;
>                  if ( _xcr0_accum & XSTATE_NONLAZY )
>                      v->arch.nonlazy_xstate_used = 1;
> -                memcpy(v->arch.xsave_area, _xsave_area,
> -                       evc->size - 2 * sizeof(uint64_t));
> +                if ( cpu_has_xsaves )
> +                    load_xsave_states(v, (void *)_xsave_area,

Casts like this and ...

> +                                      evc->size - 2*sizeof(uint64_t));
> +                else
> +                    memcpy(v->arch.xsave_area, (void *)_xsave_area,
> +                           evc->size - 2 * sizeof(uint64_t));
>                  vcpu_unpause(v);
>              }
>              else
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -2148,8 +2148,12 @@ static int hvm_save_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
>          ctxt->xfeature_mask = xfeature_mask;
>          ctxt->xcr0 = v->arch.xcr0;
>          ctxt->xcr0_accum = v->arch.xcr0_accum;
> -        memcpy(&ctxt->save_area, v->arch.xsave_area,
> -               size - offsetof(struct hvm_hw_cpu_xsave, save_area));
> +	if ( cpu_has_xsaves )
> +            save_xsave_states(v, (void *)&ctxt->save_area,

... this and ...

> +                              size - offsetof(struct hvm_hw_cpu_xsave,save_area));
> +        else
> +            memcpy(&ctxt->save_area, v->arch.xsave_area,
> +                   size - offsetof(struct hvm_hw_cpu_xsave, save_area));
>      }
>  
>      return 0;
> @@ -2248,9 +2252,14 @@ static int hvm_load_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
>      v->arch.xcr0_accum = ctxt->xcr0_accum;
>      if ( ctxt->xcr0_accum & XSTATE_NONLAZY )
>          v->arch.nonlazy_xstate_used = 1;
> -    memcpy(v->arch.xsave_area, &ctxt->save_area,
> -           min(desc->length, size) - offsetof(struct hvm_hw_cpu_xsave,
> -           save_area));
> +    if ( cpu_has_xsaves )
> +        load_xsave_states(v, (void *)&ctxt->save_area,

... this are bogus and should be avoided if at all possible.
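
E.g. with load_xsave_states() taking a const void * (as already asked
for against patch 1), the conversions can simply go away:

    load_xsave_states(v, &ctxt->save_area,
                      min(desc->length, size) -
                      offsetof(struct hvm_hw_cpu_xsave, save_area));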

> --- a/xen/arch/x86/i387.c
> +++ b/xen/arch/x86/i387.c
> @@ -309,7 +309,11 @@ int vcpu_init_fpu(struct vcpu *v)
>          return rc;
>  
>      if ( v->arch.xsave_area )
> +    {
>          v->arch.fpu_ctxt = &v->arch.xsave_area->fpu_sse;
> +        if ( cpu_has_xsaves )

Why (only) dependent on XSAVES and not (also) XSAVEC?

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -936,9 +936,10 @@ void pv_cpuid(struct cpu_user_regs *regs)
>          if ( regs->_ecx == 1 )
>          {
>              a &= XSTATE_FEATURE_XSAVEOPT |
> -                 XSTATE_FEATURE_XSAVEC |
> -                 (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
> -                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0);
> +                 XSTATE_FEATURE_XSAVEC;
> +	         /* PV guest will not support xsaves. */
> +                 /* (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
> +                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0); */

As already said to Andrew - fine for XSAVES, but why also for
XGETBV1?

> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -214,6 +214,11 @@ void xsave(struct vcpu *v, uint64_t mask)
>          typeof(ptr->fpu_sse.fip.sel) fcs = ptr->fpu_sse.fip.sel;
>          typeof(ptr->fpu_sse.fdp.sel) fds = ptr->fpu_sse.fdp.sel;
>  
> +        if ( cpu_has_xsaves )
> +            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
> +                           : "=m" (*ptr)
> +                           : "a" (lmask), "d" (hmask), "D" (ptr) );
> +        else
>          if ( cpu_has_xsaveopt )

Same question as above - why not also use XSAVEC when
available?

Jan


* Re: [PATCH V4 3/4] x86/xsaves: enable xsaves/xrstors for hvm guest
  2015-08-25 10:54 ` [PATCH V4 3/4] x86/xsaves: enable xsaves/xrstors for hvm guest Shuai Ruan
  2015-08-26 10:36   ` Andrew Cooper
@ 2015-08-26 13:14   ` Jan Beulich
  1 sibling, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2015-08-26 13:14 UTC (permalink / raw)
  To: Shuai Ruan
  Cc: kevin.tian, wei.liu2, eddie.dong, stefano.stabellini,
	andrew.cooper3, ian.jackson, xen-devel, jun.nakajima, keir,
	Ian.Campbell

>>> On 25.08.15 at 12:54, <shuai.ruan@linux.intel.com> wrote:
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -4540,6 +4540,33 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
>                      *ebx = _eax + _ebx;
>              }
>          }
> +        if ( count == 1 )
> +        {
> +            if ( cpu_has_xsaves )
> +            {
> +                *ebx = XSTATE_AREA_MIN_SIZE;
> +                if ( v->arch.xcr0 | v->arch.msr_ia32_xss )
> +                    for ( sub_leaf = 2; sub_leaf < 63; sub_leaf++ )
> +                    {
> +                        if ( !((v->arch.xcr0 | v->arch.msr_ia32_xss)
> +			   & (1ULL << sub_leaf)) )
> +                            continue;
> +                        _eax = xstate_sizes[sub_leaf];
> +                        *ebx =  *ebx + _eax;

Why not just "*ebx += xstate_sizes[sub_leaf]"?

> +                    }
> +            }
> +            else
> +            {
> +                *eax &= ~XSAVES;
> +                *ebx = *ecx = *edx = 0;
> +            }
> +            if ( !cpu_has_xgetbv1 )
> +                *eax &= ~XGETBV1;
> +            if ( !cpu_has_xsavec )
> +                *eax &= ~XSAVEC;
> +            if ( !cpu_has_xsaveopt )
> +                *eax &= ~XSAVEOPT;

Are all these really necessary? And if they are, why isn't the
XSAVEOPT one needed already before this series?

> @@ -1238,7 +1239,8 @@ static int construct_vmcs(struct vcpu *v)
>          __vmwrite(HOST_PAT, host_pat);
>          __vmwrite(GUEST_PAT, guest_pat);
>      }
> -
> +    if ( cpu_has_vmx_xsaves )
> +        __vmwrite(XSS_EXIT_BITMAP, VMX_XSS_EXIT_BITMAP);

This reads as if you were writing an architecture-mandated value
into an architecture-defined VMCS slot. I.e. the second constant
had better be renamed, imo.

> @@ -365,6 +370,8 @@ enum vmcs_field {
>      VMREAD_BITMAP                   = 0x00002026,
>      VMWRITE_BITMAP                  = 0x00002028,
>      VIRT_EXCEPTION_INFO             = 0x0000202a,
> +    XSS_EXIT_BITMAP                 = 0x0000202c,
> +    XSS_EXIT_BITMAP_HIGH            = 0x0000202d,

Did you not notice that all the ..._HIGH values are gone in favor of
using VMCS_HIGH() in the few cases where a 64-bit hypervisor really
needs to access (or rather, emulate) them?

Jan


* Re: [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves
  2015-08-26  9:47   ` Andrew Cooper
  2015-08-26 11:41     ` Jan Beulich
@ 2015-08-28  2:49     ` Shuai Ruan
  1 sibling, 0 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-28  2:49 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	jun.nakajima, ian.jackson, eddie.dong, xen-devel, jbeulich, keir

On Wed, Aug 26, 2015 at 10:47:22AM +0100, Andrew Cooper wrote:
> On 25/08/2015 11:54, Shuai Ruan wrote:
> > This patch add basic definitions/helpers which will be used in
> > later patches.
> >
> > Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
> 
> Thank you - this series is looking far better now.
> 
> > ---
> >  xen/arch/x86/xstate.c            | 137 +++++++++++++++++++++++++++++++++++++++
> >  xen/include/asm-x86/cpufeature.h |   4 ++
> >  xen/include/asm-x86/domain.h     |   1 +
> >  xen/include/asm-x86/msr-index.h  |   2 +
> >  xen/include/asm-x86/xstate.h     |  11 +++-
> >  5 files changed, 154 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> > index d5f5e3b..3986515 100644
> > --- a/xen/arch/x86/xstate.c
> > +++ b/xen/arch/x86/xstate.c
> > @@ -29,6 +29,10 @@ static u32 __read_mostly xsave_cntxt_size;
> >  /* A 64-bit bitmask of the XSAVE/XRSTOR features supported by processor. */
> >  u64 __read_mostly xfeature_mask;
> >  
> > +unsigned int *xstate_offsets, *xstate_sizes;
> > +static unsigned int xstate_features;
> > +static unsigned int xstate_comp_offsets[sizeof(xfeature_mask)*8];
> 
> All of these should be tagged as __read_mostly.
> 
Ok.
> > +
> >  /* Cached xcr0 for fast read */
> >  static DEFINE_PER_CPU(uint64_t, xcr0);
> >  
> > @@ -65,6 +69,139 @@ uint64_t get_xcr0(void)
> >      return this_cpu(xcr0);
> >  }
> >  
> > +static void setup_xstate_features(void)
> > +{
> > +    unsigned int eax, ebx, ecx, edx, leaf = 0x2;
> > +    unsigned int i;
> > +
> > +    xstate_features = fls(xfeature_mask);
> > +    xstate_offsets = xzalloc_array(unsigned int, xstate_features);
> > +    xstate_sizes = xzalloc_array(unsigned int, xstate_features);
> 
> These can both be plain xmalloc_array(), as we unconditionally set every
> member below.
> 
> As a separate patch (or possibly this one if it isn't too big), you now
> need to modify the xstate initialisation to return int rather than
> void.  You must also check for allocation failures and bail with
> -ENOMEM.  It is fine for an error like this to be fatal to system or AP
> startup.
> 
> Also, merging review from patch 2, this function gets called once per
> pcpu, meaning that you waste a moderate quantity of memory and
> repeatedly clobber the xstate_{offsets,sizes} pointers.  You should pass
> in bool_t bsp and, if bsp, allocate the arrays and read out, while if
> not bsp, check that the ap is identical to the bsp.
> 
Ok.
> > +
> > +    for (i = 0; i < xstate_features ; i++)
> 
> Style: spaces inside braces, and there shouldn't be one between
> "xstate_features;"
> 
> Also, you can remove i entirely, and use "for ( leaf = 2; leaf <
> xstate_features; ++leaf )"
> 
Sorry.
> > +    {
> > +        cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);
> > +
> > +        xstate_offsets[leaf] = ebx;
> > +        xstate_sizes[leaf] = eax;
> 
> You can drop the ebx and eax temporaries, and as ecx and edx are only
> around as discard variables, you can get away with having a single "tmp"
> variable.
> 
> Therefore, something like:
> 
> cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
> &xstate_offsets[leaf], &tmp, &tmp);
> 
> which also drops the body of the function to a single statement,
> allowing you to drop the braces.
> 
Ok.
> > +
> > +        leaf++;
> > +    }
> > +}
> > +
> > +static void setup_xstate_comp(void)
> 
> This function can get away with being __init, as it is based solely on
> data written by the BSP.
> 
> Also merging review from patch 2, it only needs calling once.
> 
ok.
> > +{
> > +    unsigned int xstate_comp_sizes[sizeof(xfeature_mask)*8];
> > +    unsigned int i;
> > +
> > +    /*
> > +     * The FP xstates and SSE xstates are legacy states. They are always
> > +     * in the fixed offsets in the xsave area in either compacted form
> > +     * or standard form.
> > +     */
> > +    xstate_comp_offsets[0] = 0;
> > +    xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
> > +
> > +    xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> > +
> > +    for (i = 2; i < xstate_features; i++)
> 
> Style - spaces inside braces.
> 
Sorry.
> > +    {
> > +        if ( (1ull << i) & xfeature_mask )
> > +            xstate_comp_sizes[i] = xstate_sizes[i];
> > +        else
> > +            xstate_comp_sizes[i] = 0;
> > +
> > +        if ( i > 2 )
> > +            xstate_comp_offsets[i] = xstate_comp_offsets[i-1]
> > +                                    + xstate_comp_sizes[i-1];
> 
> The entire xstate_comp_sizes array (which is quite large) can be removed
> if you rearrange the loop body as:
> 
> xstate_comp_offsets[i] = xstate_comp_offsets[i-1] + (((1ul << (i-1)) &
> xfeature_mask) ? xstate_sizes[i-1] : 0);
> 
> which also removes the "if ( i > 2 )"
> 
> > +    }
> 
> At this point, it is probably worth matching xstate_comp_offsets[i-1]
> against cpuid.0xd[0].ecx.  If hardware and Xen disagree on the maximum
> size of an xsave area, we shouldn't continue.
> 
Ok.
> > +}
> > +
> > +static void *get_xsave_addr(struct xsave_struct *xsave, int xstate)
> > +{
> > +    int feature = fls(xstate) - 1;
> 
> In each callsite, you have already calculated fls(xstate) - 1.  It would
> be better to pass an "unsigned int xfeature_idx" and avoid the repeated
> calculation.
> 
Ok.
> > +    if ( !(1ull << feature & xfeature_mask) )
> 
> Style.  This is a mix of binary operators so needs more brackets than this.
> 
> if ( !((1ul << feature) & xfeature_mask) )
> 
Ok.
> > +        return NULL;
> > +
> > +    return (void *)xsave + xstate_comp_offsets[feature];
> > +}
> > +
> > +void save_xsave_states(struct vcpu *v, void *dest ,unsigned int size)
> 
> s/ ,/, /
> 
> > +{
> > +    struct xsave_struct *xsave = v->arch.xsave_area;
> > +    u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
> > +    u64 valid;
> > +
> > +    /*
> > +     * Copy legacy XSAVE area, to avoid complications with CPUID
> > +     * leaves 0 and 1 in the loop below.
> > +     */
> > +    memcpy(dest, xsave, FXSAVE_SIZE);
> > +
> > +    /* Set XSTATE_BV */
> > +    *(u64 *)(dest + XSAVE_HDR_OFFSET) = xstate_bv;
> > +
> > +    /*
> > +     * Copy each region from the possibly compacted offset to the
> > +     * non-compacted offset.
> > +     */
> > +    valid = xstate_bv & ~XSTATE_FP_SSE;
> > +    while ( valid )
> > +    {
> > +        u64 feature = valid & -valid;
> > +        int index = fls(feature) - 1;
> > +        void *src = get_xsave_addr(xsave, feature);
> > +
> > +        if ( src )
> > +        {
> > +            if ( xstate_offsets[index] + xstate_sizes[index] > size )
> > +                BUG();
> 
> On second thoughts, this should probably turn into
> 
> ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);
> 
> The size parameter is from a trusted caller, and it will be more obvious
> in the crash report if it does somehow get broken.
> 
Ok
> > +            memcpy(dest + xstate_offsets[index], src, xstate_sizes[index]);
> > +        }
> > +
> > +        valid -= feature;
> > +    }
> > +
> > +}
> > +
> > +void load_xsave_states(struct vcpu *v, void *src, unsigned int size)
> 
> const void *src
> 
Ok.
> > +{
> > +    struct xsave_struct *xsave = v->arch.xsave_area;
> > +    u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
> > +    u64 valid;
> > +
> > +    /*
> > +     * Copy legacy XSAVE area, to avoid complications with CPUID
> > +     * leaves 0 and 1 in the loop below.
> > +     */
> > +    memcpy(xsave, src, FXSAVE_SIZE);
> > +
> > +    /* Set XSTATE_BV and possibly XCOMP_BV.  */
> > +    xsave->xsave_hdr.xstate_bv = xstate_bv;
> > +    xsave->xsave_hdr.xcomp_bv = v->arch.xcr0_accum | XSTATE_COMPACTION_ENABLED;
> > +
> > +    /*
> > +     * Copy each region from the non-compacted offset to the
> > +     * possibly compacted offset.
> > +     */
> > +    valid = xstate_bv & ~XSTATE_FP_SSE;
> > +    while ( valid )
> > +    {
> > +        u64 feature = valid & -valid;
> > +        int index = fls(feature) - 1;
> > +        void *dest = get_xsave_addr(xsave, feature);
> > +
> > +        if (dest)
> 
> Style.
> 
> > +        {
> > +            if ( xstate_offsets[index] + xstate_sizes[index] > size )
> > +                BUG();
> > +            memcpy(dest, src + xstate_offsets[index], xstate_sizes[index]);
> > +        }
> > +
> > +        valid -= feature;
> > +    }
> > +}
> > +
> >  void xsave(struct vcpu *v, uint64_t mask)
> >  {
> >      struct xsave_struct *ptr = v->arch.xsave_area;
> > diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> > index 9a01563..16b1386 100644
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -153,6 +153,10 @@
> >  #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
> >  #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
> >  #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
> > +#define XSAVEOPT		(1 << 0)
> > +#define XSAVEC			(1 << 1)
> > +#define XGETBV1		(1 << 2)
> > +#define XSAVES			(1 << 3)
> 
> This absolutely must be X86_FEATURE_* for consistency, and you will need
> to introduce a new capability word.
> 
Ok. I will change this in the next version.
> ~Andrew
> 
Thanks for the review, Andrew.


* Re: [PATCH V4 3/4] x86/xsaves: enable xsaves/xrstors for hvm guest
  2015-08-26 10:36   ` Andrew Cooper
@ 2015-08-28  2:52     ` Shuai Ruan
  0 siblings, 0 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-28  2:52 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	jun.nakajima, ian.jackson, eddie.dong, xen-devel, jbeulich, keir

On Wed, Aug 26, 2015 at 11:36:53AM +0100, Andrew Cooper wrote:
> On 25/08/15 11:54, Shuai Ruan wrote:
> > This patch enables xsaves for hvm guests, including:
> > 1.handle xsaves vmcs init and vmexit.
> > 2.add logic to write/read the XSS msr.
> >
> > Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>, given two
> corrections...
> 
Thanks.
> > ---
> >  xen/arch/x86/hvm/hvm.c             | 43 ++++++++++++++++++++++++++++++++++++++
> >  xen/arch/x86/hvm/vmx/vmcs.c        |  6 ++++--
> >  xen/arch/x86/hvm/vmx/vmx.c         | 20 ++++++++++++++++++
> >  xen/arch/x86/xstate.c              |  4 ++--
> >  xen/include/asm-x86/hvm/vmx/vmcs.h |  7 +++++++
> >  xen/include/asm-x86/hvm/vmx/vmx.h  |  2 ++
> >  xen/include/asm-x86/xstate.h       |  1 +
> >  7 files changed, 79 insertions(+), 4 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> > index dc444ac..57612de 100644
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -4540,6 +4540,33 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
> >                      *ebx = _eax + _ebx;
> >              }
> >          }
> > +        if ( count == 1 )
> > +        {
> > +            if ( cpu_has_xsaves )
> > +            {
> > +                *ebx = XSTATE_AREA_MIN_SIZE;
> > +                if ( v->arch.xcr0 | v->arch.msr_ia32_xss )
> > +                    for ( sub_leaf = 2; sub_leaf < 63; sub_leaf++ )
> > +                    {
> > +                        if ( !((v->arch.xcr0 | v->arch.msr_ia32_xss)
> > +			   & (1ULL << sub_leaf)) )
> 
> Stray hard tabs.
> 
Ok.
> > @@ -4771,6 +4804,16 @@ int hvm_msr_write_intercept(unsigned int msr, uint64_t msr_content,
> >             return X86EMUL_EXCEPTION;
> >          break;
> >  
> > +    case MSR_IA32_XSS:
> > +	/*
> > +	 * skylake only support bit 8 , but it is
> > +	 * not support in xen.
> > +	 */
> 
> I would alter this comment to
> 
> /* No XSS features currently supported for guests. */
> 
> The skylake bits are covered by cpu_has_xsaves.
> 
> ~Andrew
> 
Ok.
Thanks for your review, Andrew.


* Re: [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
  2015-08-26 10:12   ` Andrew Cooper
  2015-08-26 11:50     ` Jan Beulich
@ 2015-08-28  5:22     ` Shuai Ruan
  1 sibling, 0 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-28  5:22 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	jun.nakajima, eddie.dong, ian.jackson, xen-devel, jbeulich, keir

On Wed, Aug 26, 2015 at 11:12:02AM +0100, Andrew Cooper wrote:
> On 25/08/15 11:54, Shuai Ruan wrote:
> > This patch uses xsaves/xrstors instead of xsaveopt/xrstor
> > to perform the xsave_area switching so that xen itself
> > can benefit from them when available.
> >
> > For xsaves/xrstors only use compact format. Add format conversion
> > support when performing guest OS migration.
> >
> > Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
> > ---
> >  xen/arch/x86/domain.c  |   2 +
> >  xen/arch/x86/domctl.c  |  34 ++++++++++++---
> >  xen/arch/x86/hvm/hvm.c |  19 ++++++---
> >  xen/arch/x86/i387.c    |   4 ++
> >  xen/arch/x86/traps.c   |   7 ++--
> >  xen/arch/x86/xstate.c  | 112 ++++++++++++++++++++++++++++++++++---------------
> >  6 files changed, 132 insertions(+), 46 deletions(-)
> >
> > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> > index 045f6ff..7b8f649 100644
> > --- a/xen/arch/x86/domain.c
> > +++ b/xen/arch/x86/domain.c
> > @@ -1529,6 +1529,8 @@ static void __context_switch(void)
> >              if ( xcr0 != get_xcr0() && !set_xcr0(xcr0) )
> >                  BUG();
> >          }
> > +        if ( cpu_has_xsaves )
> > +            wrmsrl(MSR_IA32_XSS, n->arch.msr_ia32_xss);
> 
> This is still not appropriate.  You need a per cpu variable and only
> perform lazy writes of the MSR.
> 
Ok. I will introduce two helpers,

    void set_xss_msr(uint64_t msr_xss);
    uint64_t get_xss_msr(void);

to perform lazy writes of the MSR.
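
A minimal sketch of what I have in mind (a per-cpu shadow value so
redundant wrmsr-s get skipped):

    static DEFINE_PER_CPU(uint64_t, xss);

    void set_xss_msr(uint64_t msr_xss)
    {
        if ( this_cpu(xss) != msr_xss )
        {
            wrmsrl(MSR_IA32_XSS, msr_xss);
            this_cpu(xss) = msr_xss;
        }
    }

    uint64_t get_xss_msr(void)
    {
        return this_cpu(xss);
    }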
> >          vcpu_restore_fpu_eager(n);
> >          n->arch.ctxt_switch_to(n);
> >      }
> > diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
> > index bf62a88..da136876 100644
> > --- a/xen/arch/x86/domctl.c
> > +++ b/xen/arch/x86/domctl.c
> > @@ -867,6 +867,7 @@ long arch_do_domctl(
> >          if ( domctl->cmd == XEN_DOMCTL_getvcpuextstate )
> >          {
> >              unsigned int size;
> > +            void * xsave_area;
> 
> Unnecessary space.
> 
Sorry.
> >  
> >              ret = 0;
> >              vcpu_pause(v);
> > @@ -896,9 +897,28 @@ long arch_do_domctl(
> >                  ret = -EFAULT;
> >  
> >              offset += sizeof(v->arch.xcr0_accum);
> > -            if ( !ret && copy_to_guest_offset(evc->buffer, offset,
> > -                                              (void *)v->arch.xsave_area,
> > -                                              size - 2 * sizeof(uint64_t)) )
> > +
> > +            if ( cpu_has_xsaves )
> 
> I think this would be more correct to gate on v->arch.xsave_area
> actually being compressed.
> 
I will introduce a helper called 
bool_t xsave_area_compressed(struct xsave_struct *xsave_area)
to check whether the area is compressed or not.
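
Roughly (a sketch, keying off the compaction bit in xcomp_bv that this
series sets):

    static bool_t xsave_area_compressed(struct xsave_struct *xsave_area)
    {
        return xsave_area && (xsave_area->xsave_hdr.xcomp_bv
                              & XSTATE_COMPACTION_ENABLED);
    }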
> > +            {
> > +                xsave_area = xmalloc_bytes(size);
> > +                if ( !xsave_area )
> > +                {
> > +                    ret = -ENOMEM;
> > +                    goto vcpuextstate_out;
> > +                }
> > +
> > +                save_xsave_states(v, xsave_area,
> > +                                  evc->size - 2*sizeof(uint64_t));
> 
> And on a similar note, save_xsave_states() is really
> uncompress_xsave_area().  On further consideration, it should ASSERT()
> that the source is currently compressed.
> 
Ok.
> > +
> > +                if ( !ret && copy_to_guest_offset(evc->buffer, offset,
> > +					          xsave_area, size -
> > +		                                  2 * sizeof(uint64_t)) )
> > +                     ret = -EFAULT;
> > +                 xfree(xsave_area);
> > +           }
> > +           else if ( !ret && copy_to_guest_offset(evc->buffer, offset,
> > +                                                 (void *)v->arch.xsave_area,
> > +                                                 size - 2 * sizeof(uint64_t)) )
> >                  ret = -EFAULT;
> >  
> >              vcpu_unpause(v);
> > @@ -954,8 +974,12 @@ long arch_do_domctl(
> >                  v->arch.xcr0_accum = _xcr0_accum;
> >                  if ( _xcr0_accum & XSTATE_NONLAZY )
> >                      v->arch.nonlazy_xstate_used = 1;
> > -                memcpy(v->arch.xsave_area, _xsave_area,
> > -                       evc->size - 2 * sizeof(uint64_t));
> > +                if ( cpu_has_xsaves )
> > +                    load_xsave_states(v, (void *)_xsave_area,
> > +                                      evc->size - 2*sizeof(uint64_t));
> 
> Similarly, load_xsave_states() is really compress_xsave_area().
> 
> You should check to see whether the area is already compressed, and just
> memcpy() in that case.
> 
Ok.
> > +                else
> > +                    memcpy(v->arch.xsave_area, (void *)_xsave_area,
> > +                           evc->size - 2 * sizeof(uint64_t));
> >                  vcpu_unpause(v);
> >              }
> >              else
> > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> > index c957610..dc444ac 100644
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -2148,8 +2148,12 @@ static int hvm_save_cpu_xsave_states(struct domain *d, hvm_domain_context_t *h)
> >          ctxt->xfeature_mask = xfeature_mask;
> >          ctxt->xcr0 = v->arch.xcr0;
> >          ctxt->xcr0_accum = v->arch.xcr0_accum;
> > -        memcpy(&ctxt->save_area, v->arch.xsave_area,
> > -               size - offsetof(struct hvm_hw_cpu_xsave, save_area));
> > +	if ( cpu_has_xsaves )
> 
> Stray hard tab.
> 
Ok.
> > +            save_xsave_states(v, (void *)&ctxt->save_area,
> > +                              size - offsetof(struct hvm_hw_cpu_xsave,save_area));
> 
> These offsetof()s can become far shorter by using offsetof(*ctxt,
> save_area).
> 
> > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> > index 9f5a6c6..e9beec1 100644
> > --- a/xen/arch/x86/traps.c
> > +++ b/xen/arch/x86/traps.c
> > @@ -936,9 +936,10 @@ void pv_cpuid(struct cpu_user_regs *regs)
> >          if ( regs->_ecx == 1 )
> >          {
> >              a &= XSTATE_FEATURE_XSAVEOPT |
> > -                 XSTATE_FEATURE_XSAVEC |
> > -                 (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
> > -                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0);
> > +                 XSTATE_FEATURE_XSAVEC;
> > +	         /* PV guest will not support xsaves. */
> > +                 /* (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
> > +                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0); */
> 
> Don't leave this code commented out like this.  Just delete it.
> 
Sorry.
> >              if ( !cpu_has_xsaves )
> >                  b = c = d = 0;
> >          }
> > diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> > index 3986515..9050607 100644
> > --- a/xen/arch/x86/xstate.c
> > +++ b/xen/arch/x86/xstate.c
> > @@ -214,6 +214,11 @@ void xsave(struct vcpu *v, uint64_t mask)
> >          typeof(ptr->fpu_sse.fip.sel) fcs = ptr->fpu_sse.fip.sel;
> >          typeof(ptr->fpu_sse.fdp.sel) fds = ptr->fpu_sse.fdp.sel;
> >  
> > +        if ( cpu_has_xsaves )
> > +            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
> > +                           : "=m" (*ptr)
> > +                           : "a" (lmask), "d" (hmask), "D" (ptr) );
> > +        else
> >          if ( cpu_has_xsaveopt )
> 
> Please join the if() onto the previous line.
> 
Ok.
> >          {
> >              /*
> > @@ -267,6 +272,11 @@ void xsave(struct vcpu *v, uint64_t mask)
> >      }
> >      else
> >      {
> > +        if ( cpu_has_xsaves )
> > +            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
> > +                           : "=m" (*ptr)
> > +                           : "a" (lmask), "d" (hmask), "D" (ptr) );
> > +        else
> >          if ( cpu_has_xsaveopt )
> 
> And here.
> 
> > @@ -466,16 +508,20 @@ void xstate_init(bool_t bsp)
> >      {
> >          cpu_has_xsaveopt = !!(eax & XSTATE_FEATURE_XSAVEOPT);
> >          cpu_has_xsavec = !!(eax & XSTATE_FEATURE_XSAVEC);
> > -        /* XXX cpu_has_xgetbv1 = !!(eax & XSTATE_FEATURE_XGETBV1); */
> > -        /* XXX cpu_has_xsaves = !!(eax & XSTATE_FEATURE_XSAVES); */
> > +        cpu_has_xgetbv1 = !!(eax & XSTATE_FEATURE_XGETBV1);
> > +        cpu_has_xsaves = !!(eax & XSTATE_FEATURE_XSAVES);
> >      }
> >      else
> >      {
> >          BUG_ON(!cpu_has_xsaveopt != !(eax & XSTATE_FEATURE_XSAVEOPT));
> >          BUG_ON(!cpu_has_xsavec != !(eax & XSTATE_FEATURE_XSAVEC));
> > -        /* XXX BUG_ON(!cpu_has_xgetbv1 != !(eax & XSTATE_FEATURE_XGETBV1)); */
> > -        /* XXX BUG_ON(!cpu_has_xsaves != !(eax & XSTATE_FEATURE_XSAVES)); */
> > +        BUG_ON(!cpu_has_xgetbv1 != !(eax & XSTATE_FEATURE_XGETBV1));
> > +        BUG_ON(!cpu_has_xsaves != !(eax & XSTATE_FEATURE_XSAVES));
> >      }
> > +
> > +    setup_xstate_features();
> > +    if ( cpu_has_xsaves )
> > +        setup_xstate_comp();
> 
> setup_state_comp() should only be called on the bsp, and
> setup_xstate_features() should have a similar bsp/ap split with
> generating or validating state.
> 
Ok.
> ~Andrew
> 
Thanks for your review, Andrew


* Re: [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
  2015-08-26 12:35         ` Jan Beulich
@ 2015-08-28  5:25           ` Shuai Ruan
  0 siblings, 0 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-28  5:25 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	Andrew Cooper, eddie.dong, xen-devel, jun.nakajima, keir,
	ian.jackson

On Wed, Aug 26, 2015 at 06:35:51AM -0600, Jan Beulich wrote:
> >>> On 26.08.15 at 14:05, <andrew.cooper3@citrix.com> wrote:
> > On 26/08/15 12:50, Jan Beulich wrote:
> >>>>> On 26.08.15 at 12:12, <andrew.cooper3@citrix.com> wrote:
> >>> On 25/08/15 11:54, Shuai Ruan wrote:
> >>>> --- a/xen/arch/x86/traps.c
> >>>> +++ b/xen/arch/x86/traps.c
> >>>> @@ -936,9 +936,10 @@ void pv_cpuid(struct cpu_user_regs *regs)
> >>>>          if ( regs->_ecx == 1 )
> >>>>          {
> >>>>              a &= XSTATE_FEATURE_XSAVEOPT |
> >>>> -                 XSTATE_FEATURE_XSAVEC |
> >>>> -                 (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
> >>>> -                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0);
> >>>> +                 XSTATE_FEATURE_XSAVEC;
> >>>> +	         /* PV guest will not support xsaves. */
> >>>> +                 /* (cpu_has_xgetbv1 ? XSTATE_FEATURE_XGETBV1 : 0) |
> >>>> +                 (cpu_has_xsaves ? XSTATE_FEATURE_XSAVES : 0); */
> >>> Don't leave this code commented out like this.  Just delete it.
> >> Agreed, but - mind reminding me again why supporting them for
> >> PV guests isn't going to work?
> > 
> > xsaves is a cpl0 instruction used to manage state which userspace can't
> > be trusted to handle alone.  Xen therefore can't trust PV guests to use
> > it either.
> > 
> > The first of these features is Processor Trace.  A PV guest able to use
> > xsaves/xrstors would be able to gather trace data of hypervisor
> > execution, or cause the trace buffers to clobber arbitrary physical memory.
> > 
> > The features covered by MSR_IA32_XSS can safely be exposed to PV guests,
> > but via a hypercall interface.
> 
> That covers xsaves, but not xgetbv1 (which also covers XSAVEOPT).
> 
That is my error. I will fix it in the next version.
> Jan
> 
> 
Thanks for your review, Jan.


* Re: [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves
  2015-08-26 12:53       ` Jan Beulich
@ 2015-08-28  5:34         ` Shuai Ruan
  0 siblings, 0 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-28  5:34 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	Andrew Cooper, ian.jackson, xen-devel, jun.nakajima, keir

On Wed, Aug 26, 2015 at 06:53:06AM -0600, Jan Beulich wrote:
> >>> On 26.08.15 at 13:41, <JBeulich@suse.com> wrote:
> >>>> On 26.08.15 at 11:47, <andrew.cooper3@citrix.com> wrote:
> >> On 25/08/2015 11:54, Shuai Ruan wrote:
> >>> --- a/xen/include/asm-x86/cpufeature.h
> >>> +++ b/xen/include/asm-x86/cpufeature.h
> >>> @@ -153,6 +153,10 @@
> >>>  #define X86_FEATURE_RDSEED	(7*32+18) /* RDSEED instruction */
> >>>  #define X86_FEATURE_ADX		(7*32+19) /* ADCX, ADOX instructions */
> >>>  #define X86_FEATURE_SMAP	(7*32+20) /* Supervisor Mode Access Prevention */
> >>> +#define XSAVEOPT		(1 << 0)
> >>> +#define XSAVEC			(1 << 1)
> >>> +#define XGETBV1		(1 << 2)
> >>> +#define XSAVES			(1 << 3)
> >> 
> >> This absolutely must be X86_FEATURE_* for consistency, and you will need
> >> to introduce a new capability word.
> > 
> > Or even better re-use word 2.
> 
> And make sure to replace the then redundant XSTATE_FEATURE_*
> definitions in xstate.h.
> 
Ok, thanks. I will fix these in the next version.
> Jan
> 


* Re: [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
  2015-08-26 13:06   ` Jan Beulich
@ 2015-08-28 10:54     ` Shuai Ruan
       [not found]     ` <20150828105408.GA18437@shuai.ruan@linux.intel.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Shuai Ruan @ 2015-08-28 10:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: kevin.tian, wei.liu2, Ian.Campbell, stefano.stabellini,
	andrew.cooper3, eddie.dong, xen-devel, jun.nakajima, keir,
	ian.jackson

On Wed, Aug 26, 2015 at 07:06:00AM -0600, Jan Beulich wrote:
> >>> On 25.08.15 at 12:54, <shuai.ruan@linux.intel.com> wrote:
> 
> > +++ b/xen/arch/x86/xstate.c
> > @@ -214,6 +214,11 @@ void xsave(struct vcpu *v, uint64_t mask)
> >          typeof(ptr->fpu_sse.fip.sel) fcs = ptr->fpu_sse.fip.sel;
> >          typeof(ptr->fpu_sse.fdp.sel) fds = ptr->fpu_sse.fdp.sel;
> >  
> > +        if ( cpu_has_xsaves )
> > +            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
> > +                           : "=m" (*ptr)
> > +                           : "a" (lmask), "d" (hmask), "D" (ptr) );
> > +        else
> >          if ( cpu_has_xsaveopt )
> 
> Same question as above - why not also use XSAVEC when
> available?
> 
In practice, no real processor exists that implements only one of
xsavec/xsaves.
> Jan
> 


* Re: [PATCH V4 2/4] x86/xsaves: enable xsaves/xrstors in xen
       [not found]     ` <20150828105408.GA18437@shuai.ruan@linux.intel.com>
@ 2015-08-28 11:27       ` Jan Beulich
  0 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2015-08-28 11:27 UTC (permalink / raw)
  To: Shuai Ruan
  Cc: kevin.tian, wei.liu2, eddie.dong, stefano.stabellini,
	andrew.cooper3, ian.jackson, xen-devel, jun.nakajima, keir,
	Ian.Campbell

>>> On 28.08.15 at 12:54, <shuai.ruan@linux.intel.com> wrote:
> On Wed, Aug 26, 2015 at 07:06:00AM -0600, Jan Beulich wrote:
>> >>> On 25.08.15 at 12:54, <shuai.ruan@linux.intel.com> wrote:
>> 
>> > +++ b/xen/arch/x86/xstate.c
>> > @@ -214,6 +214,11 @@ void xsave(struct vcpu *v, uint64_t mask)
>> >          typeof(ptr->fpu_sse.fip.sel) fcs = ptr->fpu_sse.fip.sel;
>> >          typeof(ptr->fpu_sse.fdp.sel) fds = ptr->fpu_sse.fdp.sel;
>> >  
>> > +        if ( cpu_has_xsaves )
>> > +            asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
>> > +                           : "=m" (*ptr)
>> > +                           : "a" (lmask), "d" (hmask), "D" (ptr) );
>> > +        else
>> >          if ( cpu_has_xsaveopt )
>> 
>> Same question as above - why not also use XSAVEC when
>> available?
>> 
> In practice, no real processor exists that implements only one of
> xsavec/xsaves.

Not today. And in any event - there are two feature flags, so both
should be honored independently.
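
I.e. the dispatch would become (sketch; 0x27 is the modrm byte
selecting /4, i.e. xsavec64, alongside the 0x2f used for xsaves64):

    if ( cpu_has_xsaves )
        asm volatile ( ".byte 0x48,0x0f,0xc7,0x2f"
                       : "=m" (*ptr)
                       : "a" (lmask), "d" (hmask), "D" (ptr) );
    else if ( cpu_has_xsavec )
        asm volatile ( ".byte 0x48,0x0f,0xc7,0x27"
                       : "=m" (*ptr)
                       : "a" (lmask), "d" (hmask), "D" (ptr) );
    else if ( cpu_has_xsaveopt )
        /* existing xsaveopt path */;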

Jan
