xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86/cpuid: fix dom0 crash on skylake machine
@ 2016-06-01  4:58 Luwei Kang
  2016-06-01  5:54 ` Han, Huaitong
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Luwei Kang @ 2016-06-01  4:58 UTC (permalink / raw)
  To: xen-devel; +Cc: andrew.cooper3, Luwei Kang, yong.y.wang, jbeulich, huaitong.han

CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
will crash on skylake machine with PKRU support.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 xen/arch/x86/traps.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 1ef8401..5e72e44 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1100,6 +1100,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
              */
             if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
                 cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
+
+            /* PV is not supported by XSTATE_PKRU. */
+            a &= ~XSTATE_PKRU;
             break;
         }
 
-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  4:58 [PATCH] x86/cpuid: fix dom0 crash on skylake machine Luwei Kang
@ 2016-06-01  5:54 ` Han, Huaitong
  2016-06-01  8:49   ` Jan Beulich
  2016-06-01  9:03 ` Andrew Cooper
  2016-06-01  9:04 ` Jan Beulich
  2 siblings, 1 reply; 24+ messages in thread
From: Han, Huaitong @ 2016-06-01  5:54 UTC (permalink / raw)
  To: Kang, Luwei; +Cc: andrew.cooper3, wei.liu2, Wang, Yong Y, jbeulich, xen-devel

On Wed, 2016-06-01 at 12:58 +0800, Luwei Kang wrote:
> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
> with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
> ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
> will crash on skylake machine with PKRU support.
> 
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> ---
>  xen/arch/x86/traps.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 1ef8401..5e72e44 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1100,6 +1100,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
>               */
>              if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
>                  cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
> +
> +            /* PV is not supported by XSTATE_PKRU. */
> +            a &= ~XSTATE_PKRU;
>              break;
>          }
>  

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  5:54 ` Han, Huaitong
@ 2016-06-01  8:49   ` Jan Beulich
  2016-06-01  9:00     ` Han, Huaitong
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2016-06-01  8:49 UTC (permalink / raw)
  To: Huaitong Han; +Cc: andrew.cooper3, wei.liu2, Yong Y Wang, Luwei Kang, xen-devel

>>> On 01.06.16 at 07:54, <huaitong.han@intel.com> wrote:
> On Wed, 2016-06-01 at 12:58 +0800, Luwei Kang wrote:
>> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
>> with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
>> ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
>> will crash on skylake machine with PKRU support.
>> 
>> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> Signed-off-by: Huaitong Han <huaitong.han@intel.com>

I don't understand this: Did you mean Reviewed-by? Or else did
Luwei forget to mention your co-authorship (albeit co-authoring
on this small a patch seems unlikely), or the patch having flown
from you to him or the other way around?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  8:49   ` Jan Beulich
@ 2016-06-01  9:00     ` Han, Huaitong
  0 siblings, 0 replies; 24+ messages in thread
From: Han, Huaitong @ 2016-06-01  9:00 UTC (permalink / raw)
  To: JBeulich; +Cc: andrew.cooper3, wei.liu2, Wang, Yong Y, Kang, Luwei, xen-devel

On Wed, 2016-06-01 at 02:49 -0600, Jan Beulich wrote:
> >>> On 01.06.16 at 07:54, <huaitong.han@intel.com> wrote:
> > On Wed, 2016-06-01 at 12:58 +0800, Luwei Kang wrote:
> >> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
> >> with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
> >> ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
> >> will crash on skylake machine with PKRU support.
> >> 
> >> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > Signed-off-by: Huaitong Han <huaitong.han@intel.com>
> 
> I don't understand this: Did you mean Reviewed-by? Or else did
> Luwei forget to mention your co-authorship (albeit co-authoring
> on this small a patch seems unlikely), or the patch having flown
> from you to him or the other way around?
Reviewed-by: Huaitong Han <huaitong.han@intel.com>
> 
> Jan
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  4:58 [PATCH] x86/cpuid: fix dom0 crash on skylake machine Luwei Kang
  2016-06-01  5:54 ` Han, Huaitong
@ 2016-06-01  9:03 ` Andrew Cooper
  2016-06-01  9:17   ` Jan Beulich
                     ` (3 more replies)
  2016-06-01  9:04 ` Jan Beulich
  2 siblings, 4 replies; 24+ messages in thread
From: Andrew Cooper @ 2016-06-01  9:03 UTC (permalink / raw)
  To: Luwei Kang, xen-devel; +Cc: huaitong.han, yong.y.wang, jbeulich

[-- Attachment #1: Type: text/plain, Size: 1119 bytes --]

On 01/06/16 05:58, Luwei Kang wrote:
> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
> with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
> ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
> will crash on skylake machine with PKRU support.
>
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> ---
>  xen/arch/x86/traps.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> index 1ef8401..5e72e44 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1100,6 +1100,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
>               */
>              if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
>                  cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
> +
> +            /* PV is not supported by XSTATE_PKRU. */
> +            a &= ~XSTATE_PKRU;
>              break;
>          }
>  

While this does work, it undoes some of the work I started with my cpuid
improvements in 4.7

Does the attached patch also resolve your issue?

~Andrew

[-- Attachment #2: 0001-xen-x86-Clip-guests-view-of-xfeature_mask-at-the-per.patch --]
[-- Type: text/x-diff, Size: 4491 bytes --]

>From 3cbb2a63fe368ac185e483b9ef5a8504340a3702 Mon Sep 17 00:00:00 2001
From: Andrew Cooper <andrew.cooper3@citrix.com>
Date: Wed, 1 Jun 2016 10:00:12 +0100
Subject: [PATCH] xen/x86: Clip guests view of xfeature_mask at the per-domain
 maximum

---
 xen/arch/x86/cpuid.c        | 33 ++++++++++++++++++++++++++++++---
 xen/arch/x86/hvm/hvm.c      |  3 +++
 xen/arch/x86/traps.c        |  3 +++
 xen/include/asm-x86/cpuid.h |  2 ++
 4 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
index e1e0e44..0a75d4a 100644
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -4,6 +4,7 @@
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/vmx/vmcs.h>
 #include <asm/processor.h>
+#include <asm/xstate.h>
 
 const uint32_t known_features[] = INIT_KNOWN_FEATURES;
 const uint32_t special_features[] = INIT_SPECIAL_FEATURES;
@@ -17,11 +18,14 @@ uint32_t __read_mostly raw_featureset[FSCAPINTS];
 uint32_t __read_mostly pv_featureset[FSCAPINTS];
 uint32_t __read_mostly hvm_featureset[FSCAPINTS];
 
-static void __init sanitise_featureset(uint32_t *fs)
+uint64_t __read_mostly pv_xfeature_mask, __read_mostly hvm_xfeature_mask;
+
+static void __init sanitise_featureset(uint32_t *fs, uint64_t *xfeature_mask)
 {
     /* for_each_set_bit() uses unsigned longs.  Extend with zeroes. */
     uint32_t disabled_features[
         ROUNDUP(FSCAPINTS, sizeof(unsigned long)/sizeof(uint32_t))] = {};
+    uint64_t mask = 0;
     unsigned int i;
 
     for ( i = 0; i < FSCAPINTS; ++i )
@@ -62,6 +66,29 @@ static void __init sanitise_featureset(uint32_t *fs)
      */
     fs[FEATURESET_e1d] = ((fs[FEATURESET_1d]  &  CPUID_COMMON_1D_FEATURES) |
                           (fs[FEATURESET_e1d] & ~CPUID_COMMON_1D_FEATURES));
+
+    /*
+     * Calculate the maximum applicable xfeature mask, based on the
+     * featureset.
+     */
+    if ( test_bit(X86_FEATURE_XSAVE, fs) )
+    {
+        mask = XSTATE_SSE | XSTATE_FP;
+
+        if ( test_bit(X86_FEATURE_AVX, fs) )
+            mask |= XSTATE_YMM;
+
+        if ( test_bit(X86_FEATURE_MPX, fs) )
+            mask |= XSTATE_BNDREGS | XSTATE_BNDCSR;
+
+        if ( test_bit(X86_FEATURE_PKU, fs) )
+            mask |= XSTATE_PKRU;
+
+        if ( test_bit(X86_FEATURE_LWP, fs) )
+            mask |= XSTATE_LWP;
+    }
+
+    *xfeature_mask = mask;
 }
 
 static void __init calculate_raw_featureset(void)
@@ -119,7 +146,7 @@ static void __init calculate_pv_featureset(void)
     __set_bit(X86_FEATURE_X2APIC, pv_featureset);
     __set_bit(X86_FEATURE_CMP_LEGACY, pv_featureset);
 
-    sanitise_featureset(pv_featureset);
+    sanitise_featureset(pv_featureset, &pv_xfeature_mask);
 }
 
 static void __init calculate_hvm_featureset(void)
@@ -179,7 +206,7 @@ static void __init calculate_hvm_featureset(void)
             __clear_bit(X86_FEATURE_PCOMMIT, hvm_featureset);
     }
 
-    sanitise_featureset(hvm_featureset);
+    sanitise_featureset(hvm_featureset, & hvm_xfeature_mask);
 }
 
 void __init calculate_featuresets(void)
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5040a5c..8678da9 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3448,6 +3448,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                 if ( (_eax + _ebx) > *ebx )
                     *ebx = _eax + _ebx;
             }
+
+            *eax &= (uint32_t)hvm_xfeature_mask;
+            *edx &= (uint32_t)(hvm_xfeature_mask >> 32);
         }
 
         if ( count == 1 )
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 1ef8401..81935b8 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1101,6 +1101,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
             if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
                 cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
             break;
+
+            a &= (uint32_t)pv_xfeature_mask;
+            d &= (uint32_t)(pv_xfeature_mask >> 32);
         }
 
         case 1:
diff --git a/xen/include/asm-x86/cpuid.h b/xen/include/asm-x86/cpuid.h
index 9a21c25..231c6d4 100644
--- a/xen/include/asm-x86/cpuid.h
+++ b/xen/include/asm-x86/cpuid.h
@@ -29,6 +29,8 @@ extern uint32_t raw_featureset[FSCAPINTS];
 extern uint32_t pv_featureset[FSCAPINTS];
 extern uint32_t hvm_featureset[FSCAPINTS];
 
+extern uint64_t pv_xfeature_mask, hvm_xfeature_mask;
+
 void calculate_featuresets(void);
 
 const uint32_t *lookup_deep_deps(uint32_t feature);
-- 
2.1.4


[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  4:58 [PATCH] x86/cpuid: fix dom0 crash on skylake machine Luwei Kang
  2016-06-01  5:54 ` Han, Huaitong
  2016-06-01  9:03 ` Andrew Cooper
@ 2016-06-01  9:04 ` Jan Beulich
  2 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2016-06-01  9:04 UTC (permalink / raw)
  To: Luwei Kang; +Cc: andrew.cooper3, xen-devel, yong.y.wang, huaitong.han

>>> On 01.06.16 at 06:58, <luwei.kang@intel.com> wrote:
> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
> with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
> ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
> will crash on skylake machine with PKRU support.

I have some difficulty following this description (albeit I think I see
what is going wrong): handle_xsetbv() doesn't _ignore_
XSTATE_PKRU for PV guests, it _fails_ in that case. And along those
lines the patch title should also be a little more specific.

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1100,6 +1100,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
>               */
>              if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
>                  cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
> +
> +            /* PV is not supported by XSTATE_PKRU. */
> +            a &= ~XSTATE_PKRU;

Okay, so this is the trivial immediate fix to deal with the problem. Did
you, however, think about it in some more generic terms? For
example, MPX isn't supported for PV guests either, so
handle_xsetbv() should refuse requests to set the respective two
bits too. Which in turn would call for abstracting things via a new
#define in xstate.h.

Yet taking one more step, HVM guests may have PKRU and MPX
(and in fact any other of the features connected to the various
XSTATE_* bits) disabled too, in which case requests to enable
those in XCR0 should be refused too. Which in turn gets me to
ask how Dom0 actually learned of (in your case) XCR0.PKRU
being modifiable: Andrew's new CPUID handling should, I would
hope, make the XSTATE leaf not report any unavailable features.
And remember that PV Dom0 is _required_ to use the PV CPUID
mechanism to obtain available features, so if I am ti trust the
initial part of your description, the bug really is in Dom0 (and no
hypervisor change is necessary at all). Please clarify.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  9:03 ` Andrew Cooper
@ 2016-06-01  9:17   ` Jan Beulich
  2016-06-01  9:34     ` Andrew Cooper
  2016-06-01  9:21   ` Han, Huaitong
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2016-06-01  9:17 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

>>> On 01.06.16 at 11:03, <andrew.cooper3@citrix.com> wrote:
> While this does work, it undoes some of the work I started with my cpuid
> improvements in 4.7
> 
> Does the attached patch also resolve your issue?

While that's much better than the original, I don't think it's quite
enough. The rest of the domain policy should be taken into account
(and I think I had suggested to do so during review of your CPUID
rework series), i.e. this can't be calculated once for every domain.
And then, as said in reply to the original patch, handle_xsetbv()'s
checking should be generalized from the special casing of PKRU (or
if we don't want that, then that special case would better get
removed for consistency reasons).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  9:03 ` Andrew Cooper
  2016-06-01  9:17   ` Jan Beulich
@ 2016-06-01  9:21   ` Han, Huaitong
  2016-06-01  9:30   ` Wei Liu
  2016-06-01 10:54   ` Kang, Luwei
  3 siblings, 0 replies; 24+ messages in thread
From: Han, Huaitong @ 2016-06-01  9:21 UTC (permalink / raw)
  To: andrew.cooper3, Kang, Luwei; +Cc: Wang, Yong Y, jbeulich, xen-devel

Y, I think it works well, and more better.

to Luwei: you can test if the problem is solved. 

On Wed, 2016-06-01 at 10:03 +0100, Andrew Cooper wrote:
> On 01/06/16 05:58, Luwei Kang wrote:
> > CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will xsetbv
> > with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but handle_xsetbv has
> > ingored XSTATE_PKRU with hardware protection fault emulation, so dom0 kernel
> > will crash on skylake machine with PKRU support.
> >
> > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > ---
> >  xen/arch/x86/traps.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> > index 1ef8401..5e72e44 100644
> > --- a/xen/arch/x86/traps.c
> > +++ b/xen/arch/x86/traps.c
> > @@ -1100,6 +1100,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
> >               */
> >              if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
> >                  cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
> > +
> > +            /* PV is not supported by XSTATE_PKRU. */
> > +            a &= ~XSTATE_PKRU;
> >              break;
> >          }
> >  
> 
> While this does work, it undoes some of the work I started with my cpuid
> improvements in 4.7
> 
> Does the attached patch also resolve your issue?
> 
> ~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  9:03 ` Andrew Cooper
  2016-06-01  9:17   ` Jan Beulich
  2016-06-01  9:21   ` Han, Huaitong
@ 2016-06-01  9:30   ` Wei Liu
  2016-06-01  9:45     ` Andrew Cooper
  2016-06-01 10:54   ` Kang, Luwei
  3 siblings, 1 reply; 24+ messages in thread
From: Wei Liu @ 2016-06-01  9:30 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Wei Liu, yong.y.wang, Luwei Kang, xen-devel, jbeulich, huaitong.han

On Wed, Jun 01, 2016 at 10:03:44AM +0100, Andrew Cooper wrote:
>  
>  static void __init calculate_hvm_featureset(void)
> @@ -179,7 +206,7 @@ static void __init calculate_hvm_featureset(void)
>              __clear_bit(X86_FEATURE_PCOMMIT, hvm_featureset);
>      }
>  
> -    sanitise_featureset(hvm_featureset);
> +    sanitise_featureset(hvm_featureset, & hvm_xfeature_mask);

Nit: extraneous space after "&".

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  9:17   ` Jan Beulich
@ 2016-06-01  9:34     ` Andrew Cooper
  2016-06-01  9:43       ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2016-06-01  9:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

On 01/06/16 10:17, Jan Beulich wrote:
>>>> On 01.06.16 at 11:03, <andrew.cooper3@citrix.com> wrote:
>> While this does work, it undoes some of the work I started with my cpuid
>> improvements in 4.7
>>
>> Does the attached patch also resolve your issue?
> While that's much better than the original, I don't think it's quite
> enough. The rest of the domain policy should be taken into account
> (and I think I had suggested to do so during review of your CPUID
> rework series), i.e. this can't be calculated once for every domain.

Like the current use of {hvm,pv}_featureset, as an upper bound, this is
just a stopgap fix.

Fixing this in a properly per-domain way is part of my further plans for
cpuid improvements.  The reason it isn't done like this yet is because
there is a substantial quantity of work required to make this function.

> And then, as said in reply to the original patch, handle_xsetbv()'s
> checking should be generalized from the special casing of PKRU (or
> if we don't want that, then that special case would better get
> removed for consistency reasons).

Oh - I hadn't even noticed that.  How about this incremental change?

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index a0cfcc2..67c0e4b 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -658,8 +658,8 @@ int handle_xsetbv(u32 index, u64 new_bv)
     if ( (new_bv & ~xfeature_mask) || !valid_xcr0(new_bv) )
         return -EINVAL;
 
-    /* XCR0.PKRU is disabled on PV mode. */
-    if ( is_pv_vcpu(curr) && (new_bv & XSTATE_PKRU) )
+    /* Sanity check against domain maximum. */
+    if ( new_bv & ~(is_pv_vcpu(curr) ? pv_xfeature_mask :
hvm_xfeature_mask) )
         return -EOPNOTSUPP;
 
     if ( !set_xcr0(new_bv) )


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  9:34     ` Andrew Cooper
@ 2016-06-01  9:43       ` Jan Beulich
  2016-06-01 11:27         ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2016-06-01  9:43 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

>>> On 01.06.16 at 11:34, <andrew.cooper3@citrix.com> wrote:
> On 01/06/16 10:17, Jan Beulich wrote:
>>>>> On 01.06.16 at 11:03, <andrew.cooper3@citrix.com> wrote:
>>> While this does work, it undoes some of the work I started with my cpuid
>>> improvements in 4.7
>>>
>>> Does the attached patch also resolve your issue?
>> While that's much better than the original, I don't think it's quite
>> enough. The rest of the domain policy should be taken into account
>> (and I think I had suggested to do so during review of your CPUID
>> rework series), i.e. this can't be calculated once for every domain.
> 
> Like the current use of {hvm,pv}_featureset, as an upper bound, this is
> just a stopgap fix.
> 
> Fixing this in a properly per-domain way is part of my further plans for
> cpuid improvements.  The reason it isn't done like this yet is because
> there is a substantial quantity of work required to make this function.

What substantial work other than XSTATE leaf handling inquiring
other leaves do you see?

>> And then, as said in reply to the original patch, handle_xsetbv()'s
>> checking should be generalized from the special casing of PKRU (or
>> if we don't want that, then that special case would better get
>> removed for consistency reasons).
> 
> Oh - I hadn't even noticed that.  How about this incremental change?
> 
> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> index a0cfcc2..67c0e4b 100644
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -658,8 +658,8 @@ int handle_xsetbv(u32 index, u64 new_bv)
>      if ( (new_bv & ~xfeature_mask) || !valid_xcr0(new_bv) )
>          return -EINVAL;
>  
> -    /* XCR0.PKRU is disabled on PV mode. */
> -    if ( is_pv_vcpu(curr) && (new_bv & XSTATE_PKRU) )
> +    /* Sanity check against domain maximum. */
> +    if ( new_bv & ~(is_pv_vcpu(curr) ? pv_xfeature_mask : hvm_xfeature_mask) )
>          return -EOPNOTSUPP;

Looks good.

But one other aspect: MPX is tagged S, while PKU is tagged H. Do
we perhaps need three masks to get the above right at least with
just the global masking?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  9:30   ` Wei Liu
@ 2016-06-01  9:45     ` Andrew Cooper
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cooper @ 2016-06-01  9:45 UTC (permalink / raw)
  To: Wei Liu; +Cc: huaitong.han, Luwei Kang, yong.y.wang, jbeulich, xen-devel

On 01/06/16 10:30, Wei Liu wrote:
> On Wed, Jun 01, 2016 at 10:03:44AM +0100, Andrew Cooper wrote:
>>  
>>  static void __init calculate_hvm_featureset(void)
>> @@ -179,7 +206,7 @@ static void __init calculate_hvm_featureset(void)
>>              __clear_bit(X86_FEATURE_PCOMMIT, hvm_featureset);
>>      }
>>  
>> -    sanitise_featureset(hvm_featureset);
>> +    sanitise_featureset(hvm_featureset, & hvm_xfeature_mask);
> Nit: extraneous space after "&".

Yeah - I noticed that and fixed it up after posting the email.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  9:03 ` Andrew Cooper
                     ` (2 preceding siblings ...)
  2016-06-01  9:30   ` Wei Liu
@ 2016-06-01 10:54   ` Kang, Luwei
  2016-06-01 10:57     ` Andrew Cooper
  3 siblings, 1 reply; 24+ messages in thread
From: Kang, Luwei @ 2016-06-01 10:54 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel; +Cc: Han, Huaitong, Wang, Yong Y, jbeulich

Thank  you Andrew Cooper, this patch indeed resolve my issue and  two point need modify.

The code need  move ahead of "break;"
@@ -1101,6 +1101,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
             if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
                 cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
             break;
+
+            a &= (uint32_t)pv_xfeature_mask;
+            d &= (uint32_t)(pv_xfeature_mask >> 32);
         }

extraneous space after "&".
-    sanitise_featureset(hvm_featureset);
+    sanitise_featureset(hvm_featureset, & hvm_xfeature_mask);


-----Original Message-----
From: Andrew Cooper [mailto:andrew.cooper3@citrix.com] 
Sent: Wednesday, June 1, 2016 5:04 PM
To: Kang, Luwei <luwei.kang@intel.com>; xen-devel@lists.xen.org
Cc: jbeulich@suse.com; Han, Huaitong <huaitong.han@intel.com>; Wang, Yong Y <yong.y.wang@intel.com>
Subject: Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine

On 01/06/16 05:58, Luwei Kang wrote:
> CPUID.0XD.0X0.EAX is from machine value for dom0, and dom0 kernel will 
> xsetbv with xfeatures_mask that is from CPUID.0XD.0X0.EAX, but 
> handle_xsetbv has ingored XSTATE_PKRU with hardware protection fault 
> emulation, so dom0 kernel will crash on skylake machine with PKRU support.
>
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> ---
>  xen/arch/x86/traps.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index 
> 1ef8401..5e72e44 100644
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -1100,6 +1100,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
>               */
>              if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
>                  cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
> +
> +            /* PV is not supported by XSTATE_PKRU. */
> +            a &= ~XSTATE_PKRU;
>              break;
>          }
>  

While this does work, it undoes some of the work I started with my cpuid improvements in 4.7

Does the attached patch also resolve your issue?

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01 10:54   ` Kang, Luwei
@ 2016-06-01 10:57     ` Andrew Cooper
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cooper @ 2016-06-01 10:57 UTC (permalink / raw)
  To: Kang, Luwei, xen-devel; +Cc: Han, Huaitong, Wang, Yong Y, jbeulich

On 01/06/16 11:54, Kang, Luwei wrote:
> Thank  you Andrew Cooper, this patch indeed resolve my issue and  two point need modify.
>
> The code need  move ahead of "break;"
> @@ -1101,6 +1101,9 @@ void pv_cpuid(struct cpu_user_regs *regs)
>              if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
>                  cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
>              break;
> +
> +            a &= (uint32_t)pv_xfeature_mask;
> +            d &= (uint32_t)(pv_xfeature_mask >> 32);
>          }

Ah of course.  That is quite a silly mistake on my behalf.

>
> extraneous space after "&".
> -    sanitise_featureset(hvm_featureset);
> +    sanitise_featureset(hvm_featureset, & hvm_xfeature_mask);

I had already spotted and fixed this.

I will collect all feedback and post a formal patch to the list.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01  9:43       ` Jan Beulich
@ 2016-06-01 11:27         ` Andrew Cooper
  2016-06-01 11:38           ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2016-06-01 11:27 UTC (permalink / raw)
  To: Jan Beulich; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

On 01/06/16 10:43, Jan Beulich wrote:
>>>> On 01.06.16 at 11:34, <andrew.cooper3@citrix.com> wrote:
>> On 01/06/16 10:17, Jan Beulich wrote:
>>>>>> On 01.06.16 at 11:03, <andrew.cooper3@citrix.com> wrote:
>>>> While this does work, it undoes some of the work I started with my cpuid
>>>> improvements in 4.7
>>>>
>>>> Does the attached patch also resolve your issue?
>>> While that's much better than the original, I don't think it's quite
>>> enough. The rest of the domain policy should be taken into account
>>> (and I think I had suggested to do so during review of your CPUID
>>> rework series), i.e. this can't be calculated once for every domain.
>> Like the current use of {hvm,pv}_featureset, as an upper bound, this is
>> just a stopgap fix.
>>
>> Fixing this in a properly per-domain way is part of my further plans for
>> cpuid improvements.  The reason it isn't done like this yet is because
>> there is a substantial quantity of work required to make this function.
> What substantial work other than XSTATE leaf handling inquiring
> other leaves do you see?

I want to adjust the representation of cpuid information in struct
domain. The current loop in domain_cpuid() causes an O(N) overhead for
every query, which is very poor for actions which really should be a
single bit test at a fixed offset.

This needs to be combined with properly splitting the per-domain and
per-vcpu information, which requires knowing the expected vcpu topology
during domain creation.

On top of that, there needs to be verification logic to check the
correctness of information passed from the toolstack.

All of these areas are covered in the "known issues" section of the
feature doc, and I do plan to fix them all.  However, it isn't a couple
of hours worth of work.

>
>>> And then, as said in reply to the original patch, handle_xsetbv()'s
>>> checking should be generalized from the special casing of PKRU (or
>>> if we don't want that, then that special case would better get
>>> removed for consistency reasons).
>> Oh - I hadn't even noticed that.  How about this incremental change?
>>
>> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
>> index a0cfcc2..67c0e4b 100644
>> --- a/xen/arch/x86/xstate.c
>> +++ b/xen/arch/x86/xstate.c
>> @@ -658,8 +658,8 @@ int handle_xsetbv(u32 index, u64 new_bv)
>>      if ( (new_bv & ~xfeature_mask) || !valid_xcr0(new_bv) )
>>          return -EINVAL;
>>  
>> -    /* XCR0.PKRU is disabled on PV mode. */
>> -    if ( is_pv_vcpu(curr) && (new_bv & XSTATE_PKRU) )
>> +    /* Sanity check against domain maximum. */
>> +    if ( new_bv & ~(is_pv_vcpu(curr) ? pv_xfeature_mask : hvm_xfeature_mask) )
>>          return -EOPNOTSUPP;
> Looks good.
>
> But one other aspect: MPX is tagged S, while PKU is tagged H. Do
> we perhaps need three masks to get the above right at least with
> just the global masking?

So it is.  I wonder why.  (It was previously like that, and I didn't
alter it as part of my cpuid work).  I can't see anything in MPX which
makes it unsafe to use with shadow.  All control state works on linear
addresses rather than physical addresses.

For now, I will implement the same kind of runtime check as hides MPX
from shadow guests in hvm_cpuid().  This again will eventually be
subsumed by the further work.

I am also wondering whether we should disable MPX by default for
domains.  We know it won't work properly for anything which ends up
being emulated.  Users who specifically want to try it should be able to
turn it on in their domain configuration file, but it really shouldn't
be on my default.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01 11:27         ` Andrew Cooper
@ 2016-06-01 11:38           ` Jan Beulich
  2016-06-01 11:45             ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2016-06-01 11:38 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

>>> On 01.06.16 at 13:27, <andrew.cooper3@citrix.com> wrote:
> On 01/06/16 10:43, Jan Beulich wrote:
>>>>> On 01.06.16 at 11:34, <andrew.cooper3@citrix.com> wrote:
>>> On 01/06/16 10:17, Jan Beulich wrote:
>>>>>>> On 01.06.16 at 11:03, <andrew.cooper3@citrix.com> wrote:
>>>>> While this does work, it undoes some of the work I started with my cpuid
>>>>> improvements in 4.7
>>>>>
>>>>> Does the attached patch also resolve your issue?
>>>> While that's much better than the original, I don't think it's quite
>>>> enough. The rest of the domain policy should be taken into account
>>>> (and I think I had suggested to do so during review of your CPUID
>>>> rework series), i.e. this can't be calculated once for every domain.
>>> Like the current use of {hvm,pv}_featureset, as an upper bound, this is
>>> just a stopgap fix.
>>>
>>> Fixing this in a properly per-domain way is part of my further plans for
>>> cpuid improvements.  The reason it isn't done like this yet is because
>>> there is a substantial quantity of work required to make this function.
>> What substantial work other than XSTATE leaf handling inquiring
>> other leaves do you see?
> 
> I want to adjust the representation of cpuid information in struct
> domain. The current loop in domain_cpuid() causes an O(N) overhead for
> every query, which is very poor for actions which really should be a
> single bit test at a fixed offset.
> 
> This needs to be combined with properly splitting the per-domain and
> per-vcpu information, which requires knowing the expected vcpu topology
> during domain creation.
> 
> On top of that, there needs to be verification logic to check the
> correctness of information passed from the toolstack.
> 
> All of these areas are covered in the "known issues" section of the
> feature doc, and I do plan to fix them all.  However, it isn't a couple
> of hours worth of work.

All understood, yet not to the point: The original remark was that
the very XSTATE handling could be done better with far not as much
of a change, at least afaict without having tried.

>>>> And then, as said in reply to the original patch, handle_xsetbv()'s
>>>> checking should be generalized from the special casing of PKRU (or
>>>> if we don't want that, then that special case would better get
>>>> removed for consistency reasons).
>>> Oh - I hadn't even noticed that.  How about this incremental change?
>>>
>>> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
>>> index a0cfcc2..67c0e4b 100644
>>> --- a/xen/arch/x86/xstate.c
>>> +++ b/xen/arch/x86/xstate.c
>>> @@ -658,8 +658,8 @@ int handle_xsetbv(u32 index, u64 new_bv)
>>>      if ( (new_bv & ~xfeature_mask) || !valid_xcr0(new_bv) )
>>>          return -EINVAL;
>>>  
>>> -    /* XCR0.PKRU is disabled on PV mode. */
>>> -    if ( is_pv_vcpu(curr) && (new_bv & XSTATE_PKRU) )
>>> +    /* Sanity check against domain maximum. */
>>> +    if ( new_bv & ~(is_pv_vcpu(curr) ? pv_xfeature_mask : hvm_xfeature_mask) )
>>>          return -EOPNOTSUPP;
>> Looks good.
>>
>> But one other aspect: MPX is tagged S, while PKU is tagged H. Do
>> we perhaps need three masks to get the above right at least with
>> just the global masking?
> 
> So it is.  I wonder why.  (It was previously like that, and I didn't
> alter it as part of my cpuid work).  I can't see anything in MPX which
> makes it unsafe to use with shadow.  All control state works on linear
> addresses rather than physical addresses.

Well, it's MPX that's tagged as available for shadow guests, and
PKU as unavailable.

> For now, I will implement the same kind of runtime check as hides MPX
> from shadow guests in hvm_cpuid().  This again will eventually be
> subsumed by the further work.
> 
> I am also wondering whether we should disable MPX by default for
> domains.  We know it won't work properly for anything which ends up
> being emulated.  Users who specifically want to try it should be able to
> turn it on in their domain configuration file, but it really shouldn't
> be on my default.

For guests being introspected I agree. For others I'm not sure, as
normally we shouldn't hit the instruction emulator for branches (the
main exception being PAE mode shadow guests).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01 11:38           ` Jan Beulich
@ 2016-06-01 11:45             ` Andrew Cooper
  2016-06-01 12:01               ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2016-06-01 11:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

On 01/06/16 12:38, Jan Beulich wrote:
>>>> On 01.06.16 at 13:27, <andrew.cooper3@citrix.com> wrote:
>> On 01/06/16 10:43, Jan Beulich wrote:
>>>>>> On 01.06.16 at 11:34, <andrew.cooper3@citrix.com> wrote:
>>>> On 01/06/16 10:17, Jan Beulich wrote:
>>>>>>>> On 01.06.16 at 11:03, <andrew.cooper3@citrix.com> wrote:
>>>>>> While this does work, it undoes some of the work I started with my cpuid
>>>>>> improvements in 4.7
>>>>>>
>>>>>> Does the attached patch also resolve your issue?
>>>>> While that's much better than the original, I don't think it's quite
>>>>> enough. The rest of the domain policy should be taken into account
>>>>> (and I think I had suggested to do so during review of your CPUID
>>>>> rework series), i.e. this can't be calculated once for every domain.
>>>> Like the current use of {hvm,pv}_featureset, as an upper bound, this is
>>>> just a stopgap fix.
>>>>
>>>> Fixing this in a properly per-domain way is part of my further plans for
>>>> cpuid improvements.  The reason it isn't done like this yet is because
>>>> there is a substantial quantity of work required to make this function.
>>> What substantial work other than XSTATE leaf handling inquiring
>>> other leaves do you see?
>> I want to adjust the representation of cpuid information in struct
>> domain. The current loop in domain_cpuid() causes an O(N) overhead for
>> every query, which is very poor for actions which really should be a
>> single bit test at a fixed offset.
>>
>> This needs to be combined with properly splitting the per-domain and
>> per-vcpu information, which requires knowing the expected vcpu topology
>> during domain creation.
>>
>> On top of that, there needs to be verification logic to check the
>> correctness of information passed from the toolstack.
>>
>> All of these areas are covered in the "known issues" section of the
>> feature doc, and I do plan to fix them all.  However, it isn't a couple
>> of hours worth of work.
> All understood, yet not to the point: The original remark was that
> the very XSTATE handling could be done better with far not as much
> of a change, at least afaict without having tried.

In which case I don't know what you were suggesting.

>
>>>>> And then, as said in reply to the original patch, handle_xsetbv()'s
>>>>> checking should be generalized from the special casing of PKRU (or
>>>>> if we don't want that, then that special case would better get
>>>>> removed for consistency reasons).
>>>> Oh - I hadn't even noticed that.  How about this incremental change?
>>>>
>>>> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
>>>> index a0cfcc2..67c0e4b 100644
>>>> --- a/xen/arch/x86/xstate.c
>>>> +++ b/xen/arch/x86/xstate.c
>>>> @@ -658,8 +658,8 @@ int handle_xsetbv(u32 index, u64 new_bv)
>>>>      if ( (new_bv & ~xfeature_mask) || !valid_xcr0(new_bv) )
>>>>          return -EINVAL;
>>>>  
>>>> -    /* XCR0.PKRU is disabled on PV mode. */
>>>> -    if ( is_pv_vcpu(curr) && (new_bv & XSTATE_PKRU) )
>>>> +    /* Sanity check against domain maximum. */
>>>> +    if ( new_bv & ~(is_pv_vcpu(curr) ? pv_xfeature_mask : hvm_xfeature_mask) )
>>>>          return -EOPNOTSUPP;
>>> Looks good.
>>>
>>> But one other aspect: MPX is tagged S, while PKU is tagged H. Do
>>> we perhaps need three masks to get the above right at least with
>>> just the global masking?
>> So it is.  I wonder why.  (It was previously like that, and I didn't
>> alter it as part of my cpuid work).  I can't see anything in MPX which
>> makes it unsafe to use with shadow.  All control state works on linear
>> addresses rather than physical addresses.
> Well, it's MPX that's tagged as available for shadow guests, and
> PKU as unavailable.

Ah yes - my confusion.  We should dynamically hide PKU not MPX.

>
>> For now, I will implement the same kind of runtime check as hides MPX
>> from shadow guests in hvm_cpuid().  This again will eventually be
>> subsumed by the further work.
>>
>> I am also wondering whether we should disable MPX by default for
>> domains.  We know it won't work properly for anything which ends up
>> being emulated.  Users who specifically want to try it should be able to
>> turn it on in their domain configuration file, but it really shouldn't
>> be on my default.
> For guests being introspected I agree. For others I'm not sure, as
> normally we shouldn't hit the instruction emulator for branches (the
> main exception being PAE mode shadow guests).

MPX can be used for any memory reference, and one stated purpose is for
buffer overflow detection.  While the instruction emulator doesn't
understand MPX, any MMIO or shadow pagetable updates are liable to be
incorrect.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01 11:45             ` Andrew Cooper
@ 2016-06-01 12:01               ` Jan Beulich
  2016-06-01 13:03                 ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2016-06-01 12:01 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

>>> On 01.06.16 at 13:45, <andrew.cooper3@citrix.com> wrote:
> On 01/06/16 12:38, Jan Beulich wrote:
>>>>> On 01.06.16 at 13:27, <andrew.cooper3@citrix.com> wrote:
>>> On 01/06/16 10:43, Jan Beulich wrote:
>>>>>>> On 01.06.16 at 11:34, <andrew.cooper3@citrix.com> wrote:
>>>>> On 01/06/16 10:17, Jan Beulich wrote:
>>>>>>>>> On 01.06.16 at 11:03, <andrew.cooper3@citrix.com> wrote:
>>>>>>> While this does work, it undoes some of the work I started with my cpuid
>>>>>>> improvements in 4.7
>>>>>>>
>>>>>>> Does the attached patch also resolve your issue?
>>>>>> While that's much better than the original, I don't think it's quite
>>>>>> enough. The rest of the domain policy should be taken into account
>>>>>> (and I think I had suggested to do so during review of your CPUID
>>>>>> rework series), i.e. this can't be calculated once for every domain.
>>>>> Like the current use of {hvm,pv}_featureset, as an upper bound, this is
>>>>> just a stopgap fix.
>>>>>
>>>>> Fixing this in a properly per-domain way is part of my further plans for
>>>>> cpuid improvements.  The reason it isn't done like this yet is because
>>>>> there is a substantial quantity of work required to make this function.
>>>> What substantial work other than XSTATE leaf handling inquiring
>>>> other leaves do you see?
>>> I want to adjust the representation of cpuid information in struct
>>> domain. The current loop in domain_cpuid() causes an O(N) overhead for
>>> every query, which is very poor for actions which really should be a
>>> single bit test at a fixed offset.
>>>
>>> This needs to be combined with properly splitting the per-domain and
>>> per-vcpu information, which requires knowing the expected vcpu topology
>>> during domain creation.
>>>
>>> On top of that, there needs to be verification logic to check the
>>> correctness of information passed from the toolstack.
>>>
>>> All of these areas are covered in the "known issues" section of the
>>> feature doc, and I do plan to fix them all.  However, it isn't a couple
>>> of hours worth of work.
>> All understood, yet not to the point: The original remark was that
>> the very XSTATE handling could be done better with far not as much
>> of a change, at least afaict without having tried.
> 
> In which case I don't know what you were suggesting.

Make {hvm,pv}_cpuid() invoke themselves recursively to
determine what bits to mask off from CPUID[0xd].EAX.

>>> I am also wondering whether we should disable MPX by default for
>>> domains.  We know it won't work properly for anything which ends up
>>> being emulated.  Users who specifically want to try it should be able to
>>> turn it on in their domain configuration file, but it really shouldn't
>>> be on my default.
>> For guests being introspected I agree. For others I'm not sure, as
>> normally we shouldn't hit the instruction emulator for branches (the
>> main exception being PAE mode shadow guests).
> 
> MPX can be used for any memory reference, and one stated purpose is for
> buffer overflow detection.  While the instruction emulator doesn't
> understand MPX, any MMIO or shadow pagetable updates are liable to be
> incorrect.

No, memory references are completely unaffected by MPX. Code
needs to explicitly use BNDC{L,N,U} to invoke bounds checking.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01 12:01               ` Jan Beulich
@ 2016-06-01 13:03                 ` Andrew Cooper
  2016-06-01 13:28                   ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2016-06-01 13:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

On 01/06/16 13:01, Jan Beulich wrote:
>>
>>>> I want to adjust the representation of cpuid information in struct
>>>> domain. The current loop in domain_cpuid() causes an O(N) overhead for
>>>> every query, which is very poor for actions which really should be a
>>>> single bit test at a fixed offset.
>>>>
>>>> This needs to be combined with properly splitting the per-domain and
>>>> per-vcpu information, which requires knowing the expected vcpu topology
>>>> during domain creation.
>>>>
>>>> On top of that, there needs to be verification logic to check the
>>>> correctness of information passed from the toolstack.
>>>>
>>>> All of these areas are covered in the "known issues" section of the
>>>> feature doc, and I do plan to fix them all.  However, it isn't a couple
>>>> of hours worth of work.
>>> All understood, yet not to the point: The original remark was that
>>> the very XSTATE handling could be done better with far not as much
>>> of a change, at least afaict without having tried.
>> In which case I don't know what you were suggesting.
> Make {hvm,pv}_cpuid() invoke themselves recursively to
> determine what bits to mask off from CPUID[0xd].EAX.

So that would work.  However, to do this, you need to query leaves 1,
0x80000001 and 7, all of which will hit the O(N) loop in domain_cpuid()

Luckily, none of those specific paths further recurse into {hvm,pv}_cpuid().

I am unsure which to go with.  My gut feel is that this would be quite a
performance hit, but I have no evidence either way.  OTOH, it will give
the correct answer, rather than an approximation.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01 13:03                 ` Andrew Cooper
@ 2016-06-01 13:28                   ` Jan Beulich
  2016-06-02 11:12                     ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2016-06-01 13:28 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

>>> On 01.06.16 at 15:03, <andrew.cooper3@citrix.com> wrote:
> On 01/06/16 13:01, Jan Beulich wrote:
>>>
>>>>> I want to adjust the representation of cpuid information in struct
>>>>> domain. The current loop in domain_cpuid() causes an O(N) overhead for
>>>>> every query, which is very poor for actions which really should be a
>>>>> single bit test at a fixed offset.
>>>>>
>>>>> This needs to be combined with properly splitting the per-domain and
>>>>> per-vcpu information, which requires knowing the expected vcpu topology
>>>>> during domain creation.
>>>>>
>>>>> On top of that, there needs to be verification logic to check the
>>>>> correctness of information passed from the toolstack.
>>>>>
>>>>> All of these areas are covered in the "known issues" section of the
>>>>> feature doc, and I do plan to fix them all.  However, it isn't a couple
>>>>> of hours worth of work.
>>>> All understood, yet not to the point: The original remark was that
>>>> the very XSTATE handling could be done better with far not as much
>>>> of a change, at least afaict without having tried.
>>> In which case I don't know what you were suggesting.
>> Make {hvm,pv}_cpuid() invoke themselves recursively to
>> determine what bits to mask off from CPUID[0xd].EAX.
> 
> So that would work.  However, to do this, you need to query leaves 1,
> 0x80000001 and 7, all of which will hit the O(N) loop in domain_cpuid()
> 
> Luckily, none of those specific paths further recurse into {hvm,pv}_cpuid().
> 
> I am unsure which to go with.  My gut feel is that this would be quite a
> performance hit, but I have no evidence either way.  OTOH, it will give
> the correct answer, rather than an approximation.

Not only since I believe performance is very close to irrelevant for
CPUID leaf 0xD invocations, I think I'd prefer correctness over
performance (as would be basically always the case). How about
you?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-01 13:28                   ` Jan Beulich
@ 2016-06-02 11:12                     ` Andrew Cooper
  2016-06-02 11:34                       ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2016-06-02 11:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

[-- Attachment #1: Type: text/plain, Size: 2343 bytes --]

On 01/06/16 14:28, Jan Beulich wrote:
>>>> On 01.06.16 at 15:03, <andrew.cooper3@citrix.com> wrote:
>> On 01/06/16 13:01, Jan Beulich wrote:
>>>>>> I want to adjust the representation of cpuid information in struct
>>>>>> domain. The current loop in domain_cpuid() causes an O(N) overhead for
>>>>>> every query, which is very poor for actions which really should be a
>>>>>> single bit test at a fixed offset.
>>>>>>
>>>>>> This needs to be combined with properly splitting the per-domain and
>>>>>> per-vcpu information, which requires knowing the expected vcpu topology
>>>>>> during domain creation.
>>>>>>
>>>>>> On top of that, there needs to be verification logic to check the
>>>>>> correctness of information passed from the toolstack.
>>>>>>
>>>>>> All of these areas are covered in the "known issues" section of the
>>>>>> feature doc, and I do plan to fix them all.  However, it isn't a couple
>>>>>> of hours worth of work.
>>>>> All understood, yet not to the point: The original remark was that
>>>>> the very XSTATE handling could be done better with far not as much
>>>>> of a change, at least afaict without having tried.
>>>> In which case I don't know what you were suggesting.
>>> Make {hvm,pv}_cpuid() invoke themselves recursively to
>>> determine what bits to mask off from CPUID[0xd].EAX.
>> So that would work.  However, to do this, you need to query leaves 1,
>> 0x80000001 and 7, all of which will hit the O(N) loop in domain_cpuid()
>>
>> Luckily, none of those specific paths further recurse into {hvm,pv}_cpuid().
>>
>> I am unsure which to go with.  My gut feel is that this would be quite a
>> performance hit, but I have no evidence either way.  OTOH, it will give
>> the correct answer, rather than an approximation.
> Not only since I believe performance is very close to irrelevant for
> CPUID leaf 0xD invocations, I think I'd prefer correctness over
> performance (as would be basically always the case). How about
> you?

Right - this is the alternative, doing the calculation in
{hvm,pv}_cpuid(), based on top of your cleanup from yesterday.

There is a bugfix in the PV side (pv_featureset[FEATURESET_1c] should be
taken into account even for control/hardware domain accesses), and a
preemptive fix on the HVM side to avoid advertising any XSS states, as
we don't support any yet.

Thoughts?

~Andrew

[-- Attachment #2: 0001-xen-x86-Clip-guests-view-of-xfeature_mask.patch --]
[-- Type: text/x-diff, Size: 8366 bytes --]

>From be056a4b78929e428d50ca386ec56b42c9cde9ac Mon Sep 17 00:00:00 2001
From: Andrew Cooper <andrew.cooper3@citrix.com>
Date: Thu, 2 Jun 2016 12:08:42 +0100
Subject: [PATCH] xen/x86: Clip guests view of xfeature_mask

---
 xen/arch/x86/hvm/hvm.c       | 46 +++++++++++++++++++++++++++++--
 xen/arch/x86/traps.c         | 65 +++++++++++++++++++++++++++++++++++++-------
 xen/include/asm-x86/xstate.h | 31 ++++++++++++++-------
 3 files changed, 120 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index bb98051..53f2bdd 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -3362,7 +3362,7 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
 
     switch ( input )
     {
-        unsigned int _ecx, _edx;
+        unsigned int _ebx, _ecx, _edx;
 
     case 0x1:
         /* Fix up VLAPIC details. */
@@ -3443,6 +3443,44 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
         switch ( count )
         {
         case 0:
+        {
+            uint64_t xfeature_mask = XSTATE_SSE | XSTATE_FP;
+            uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
+
+            if ( _ecx & cpufeat_mask(X86_FEATURE_AVX) )
+            {
+                xfeature_mask |= XSTATE_YMM;
+                xstate_size += xstate_sizes[_XSTATE_YMM];
+            }
+
+            _ecx = 0;
+            hvm_cpuid(7, NULL, &_ebx, &_ecx, NULL);
+
+            if ( _ebx & cpufeat_mask(X86_FEATURE_MPX) )
+            {
+                xfeature_mask |= XSTATE_BNDREGS | XSTATE_BNDCSR;
+                xstate_size += xstate_sizes[_XSTATE_BNDREGS] +
+                    xstate_sizes[_XSTATE_BNDCSR];
+            }
+
+            if ( _ebx & cpufeat_mask(X86_FEATURE_PKU) )
+            {
+                xfeature_mask |= XSTATE_PKRU;
+                xstate_size += xstate_sizes[_XSTATE_PKRU];
+            }
+
+            hvm_cpuid(0x80000001, NULL, NULL, &_ecx, NULL);
+
+            if ( _ecx & cpufeat_mask(X86_FEATURE_LWP) )
+            {
+                xfeature_mask |= XSTATE_LWP;
+                xstate_size += xstate_sizes[_XSTATE_LWP];
+            }
+
+            *eax = (uint32_t)xfeature_mask;
+            *edx = (uint32_t)(xfeature_mask >> 32);
+            *ecx = xstate_size;
+
             /*
              * Always read CPUID[0xD,0].EBX from hardware, rather than domain
              * policy.  It varies with enabled xstate, and the correct xcr0 is
@@ -3450,6 +3488,8 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
              */
             cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
             break;
+        }
+
         case 1:
             *eax &= hvm_featureset[FEATURESET_Da1];
 
@@ -3463,7 +3503,9 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx,
                 cpuid_count(input, count, &dummy, ebx, &dummy, &dummy);
             }
             else
-                *ebx = *ecx = *edx = 0;
+                *ebx = 0;
+
+            *ecx = *edx = 0;
             break;
         }
         break;
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 5d7232d..4ba90a4 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -928,7 +928,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
 
     switch ( leaf )
     {
-        uint32_t tmp;
+        uint32_t tmp, _ebx, _ecx, _edx;
 
     case 0x00000001:
         c &= pv_featureset[FEATURESET_1c];
@@ -1087,19 +1087,63 @@ void pv_cpuid(struct cpu_user_regs *regs)
         break;
 
     case XSTATE_CPUID:
-        if ( !((!is_control_domain(currd) && !is_hardware_domain(currd)
-                ? ({
-                    uint32_t ecx;
-
-                    domain_cpuid(currd, 1, 0, &tmp, &tmp, &ecx, &tmp);
-                    ecx & pv_featureset[FEATURESET_1c];
-                  })
-                : cpuid_ecx(1)) & cpufeat_mask(X86_FEATURE_XSAVE)) ||
-             subleaf >= 63 )
+
+        if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
+            domain_cpuid(currd, 1, 0, &tmp, &tmp, &_ecx, &tmp);
+        else
+            _ecx = cpuid_ecx(1);
+        _ecx &= pv_featureset[FEATURESET_1c];
+
+        if ( !(_ecx & cpufeat_mask(X86_FEATURE_XSAVE)) || subleaf >= 63 )
             goto unsupported;
         switch ( subleaf )
         {
         case 0:
+        {
+            uint64_t xfeature_mask = XSTATE_SSE | XSTATE_FP;
+            uint32_t xstate_size = XSTATE_AREA_MIN_SIZE;
+
+            if ( _ecx & cpufeat_mask(X86_FEATURE_AVX) )
+            {
+                xfeature_mask |= XSTATE_YMM;
+                xstate_size += xstate_sizes[_XSTATE_YMM];
+            }
+
+            if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
+                domain_cpuid(currd, 7, 0, &tmp, &_ebx, &tmp, &tmp);
+            else
+                cpuid_count(7, 0, &tmp, &_ebx, &tmp, &tmp);
+            _ebx &= pv_featureset[FEATURESET_7b0];
+
+            if ( _ebx & cpufeat_mask(X86_FEATURE_MPX) )
+            {
+                xfeature_mask |= XSTATE_BNDREGS | XSTATE_BNDCSR;
+                xstate_size += xstate_sizes[_XSTATE_BNDREGS] +
+                    xstate_sizes[_XSTATE_BNDCSR];
+            }
+
+            if ( _ebx & cpufeat_mask(X86_FEATURE_PKU) )
+            {
+                xfeature_mask |= XSTATE_PKRU;
+                xstate_size += xstate_sizes[_XSTATE_PKRU];
+            }
+
+            if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
+                domain_cpuid(currd, 0x80000001, 0, &tmp, &tmp, &tmp, &_edx);
+            else
+                _edx = cpuid_edx(0x80000001);
+            _edx &= pv_featureset[FEATURESET_e1d];
+
+            if ( _ecx & cpufeat_mask(X86_FEATURE_LWP) )
+            {
+                xfeature_mask |= XSTATE_LWP;
+                xstate_size += xstate_sizes[_XSTATE_LWP];
+            }
+
+            a = (uint32_t)xfeature_mask;
+            d = (uint32_t)(xfeature_mask >> 32);
+            c = xstate_size;
+
             /*
              * Always read CPUID.0xD[ECX=0].EBX from hardware, rather than
              * domain policy.  It varies with enabled xstate, and the correct
@@ -1108,6 +1152,7 @@ void pv_cpuid(struct cpu_user_regs *regs)
             if ( !is_control_domain(currd) && !is_hardware_domain(currd) )
                 cpuid_count(leaf, subleaf, &tmp, &b, &tmp, &tmp);
             break;
+        }
 
         case 1:
             a &= pv_featureset[FEATURESET_Da1];
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index 4535354..4b0c33e 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -26,16 +26,27 @@
 #define XSAVE_HDR_OFFSET          FXSAVE_SIZE
 #define XSTATE_AREA_MIN_SIZE      (FXSAVE_SIZE + XSAVE_HDR_SIZE)
 
-#define XSTATE_FP      (1ULL << 0)
-#define XSTATE_SSE     (1ULL << 1)
-#define XSTATE_YMM     (1ULL << 2)
-#define XSTATE_BNDREGS (1ULL << 3)
-#define XSTATE_BNDCSR  (1ULL << 4)
-#define XSTATE_OPMASK  (1ULL << 5)
-#define XSTATE_ZMM     (1ULL << 6)
-#define XSTATE_HI_ZMM  (1ULL << 7)
-#define XSTATE_PKRU    (1ULL << 9)
-#define XSTATE_LWP     (1ULL << 62) /* AMD lightweight profiling */
+#define _XSTATE_FP                0
+#define XSTATE_FP                 (1ULL << _XSTATE_FP)
+#define _XSTATE_SSE               1
+#define XSTATE_SSE                (1ULL << _XSTATE_SSE)
+#define _XSTATE_YMM               2
+#define XSTATE_YMM                (1ULL << _XSTATE_YMM)
+#define _XSTATE_BNDREGS           3
+#define XSTATE_BNDREGS            (1ULL << _XSTATE_BNDREGS)
+#define _XSTATE_BNDCSR            4
+#define XSTATE_BNDCSR             (1ULL << _XSTATE_BNDCSR)
+#define _XSTATE_OPMASK            5
+#define XSTATE_OPMASK             (1ULL << _XSTATE_OPMASK)
+#define _XSTATE_ZMM               6
+#define XSTATE_ZMM                (1ULL << _XSTATE_ZMM)
+#define _XSTATE_HI_ZMM            7
+#define XSTATE_HI_ZMM             (1ULL << _XSTATE_HI_ZMM)
+#define _XSTATE_PKRU              9
+#define XSTATE_PKRU               (1ULL << _XSTATE_PKRU)
+#define _XSTATE_LWP               62
+#define XSTATE_LWP                (1ULL << _XSTATE_LWP)
+
 #define XSTATE_FP_SSE  (XSTATE_FP | XSTATE_SSE)
 #define XCNTXT_MASK    (XSTATE_FP | XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | \
                         XSTATE_ZMM | XSTATE_HI_ZMM | XSTATE_NONLAZY)
-- 
2.1.4


[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-02 11:12                     ` Andrew Cooper
@ 2016-06-02 11:34                       ` Jan Beulich
  2016-06-02 11:44                         ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2016-06-02 11:34 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

>>> On 02.06.16 at 13:12, <andrew.cooper3@citrix.com> wrote:
> On 01/06/16 14:28, Jan Beulich wrote:
>>>>> On 01.06.16 at 15:03, <andrew.cooper3@citrix.com> wrote:
>>> On 01/06/16 13:01, Jan Beulich wrote:
>>>>>>> I want to adjust the representation of cpuid information in struct
>>>>>>> domain. The current loop in domain_cpuid() causes an O(N) overhead for
>>>>>>> every query, which is very poor for actions which really should be a
>>>>>>> single bit test at a fixed offset.
>>>>>>>
>>>>>>> This needs to be combined with properly splitting the per-domain and
>>>>>>> per-vcpu information, which requires knowing the expected vcpu topology
>>>>>>> during domain creation.
>>>>>>>
>>>>>>> On top of that, there needs to be verification logic to check the
>>>>>>> correctness of information passed from the toolstack.
>>>>>>>
>>>>>>> All of these areas are covered in the "known issues" section of the
>>>>>>> feature doc, and I do plan to fix them all.  However, it isn't a couple
>>>>>>> of hours worth of work.
>>>>>> All understood, yet not to the point: The original remark was that
>>>>>> the very XSTATE handling could be done better with far not as much
>>>>>> of a change, at least afaict without having tried.
>>>>> In which case I don't know what you were suggesting.
>>>> Make {hvm,pv}_cpuid() invoke themselves recursively to
>>>> determine what bits to mask off from CPUID[0xd].EAX.
>>> So that would work.  However, to do this, you need to query leaves 1,
>>> 0x80000001 and 7, all of which will hit the O(N) loop in domain_cpuid()
>>>
>>> Luckily, none of those specific paths further recurse into {hvm,pv}_cpuid().
>>>
>>> I am unsure which to go with.  My gut feel is that this would be quite a
>>> performance hit, but I have no evidence either way.  OTOH, it will give
>>> the correct answer, rather than an approximation.
>> Not only since I believe performance is very close to irrelevant for
>> CPUID leaf 0xD invocations, I think I'd prefer correctness over
>> performance (as would be basically always the case). How about
>> you?
> 
> Right - this is the alternative, doing the calculation in
> {hvm,pv}_cpuid(), based on top of your cleanup from yesterday.

Please use XSTATE_FP_SSE instead of open coding it.

Is the accumulation logic for xstate_size really correct? Doesn't the
uncompressed area including, say, PKRU, have the same size no
matter whether AVX or MPX are available? I.e. I think you need

                xstate_size = xstate_offsets[...] + xstate_sizes[...];

everywhere.

Why are you dealing with MPX and PKU in pv_cpuid()? They're
always off for PV guests.

> There is a bugfix in the PV side (pv_featureset[FEATURESET_1c] should be
> taken into account even for control/hardware domain accesses),

Ouch - I had thought of this yesterday night, and then forgot before
committing.

> and a
> preemptive fix on the HVM side to avoid advertising any XSS states, as
> we don't support any yet.

I don't think I really like this part. What's wrong with keeping
things the way they are?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-02 11:34                       ` Jan Beulich
@ 2016-06-02 11:44                         ` Andrew Cooper
  2016-06-02 12:15                           ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2016-06-02 11:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

On 02/06/16 12:34, Jan Beulich wrote:
>>>> On 02.06.16 at 13:12, <andrew.cooper3@citrix.com> wrote:
>> On 01/06/16 14:28, Jan Beulich wrote:
>>>>>> On 01.06.16 at 15:03, <andrew.cooper3@citrix.com> wrote:
>>>> On 01/06/16 13:01, Jan Beulich wrote:
>>>>>>>> I want to adjust the representation of cpuid information in struct
>>>>>>>> domain. The current loop in domain_cpuid() causes an O(N) overhead for
>>>>>>>> every query, which is very poor for actions which really should be a
>>>>>>>> single bit test at a fixed offset.
>>>>>>>>
>>>>>>>> This needs to be combined with properly splitting the per-domain and
>>>>>>>> per-vcpu information, which requires knowing the expected vcpu topology
>>>>>>>> during domain creation.
>>>>>>>>
>>>>>>>> On top of that, there needs to be verification logic to check the
>>>>>>>> correctness of information passed from the toolstack.
>>>>>>>>
>>>>>>>> All of these areas are covered in the "known issues" section of the
>>>>>>>> feature doc, and I do plan to fix them all.  However, it isn't a couple
>>>>>>>> of hours worth of work.
>>>>>>> All understood, yet not to the point: The original remark was that
>>>>>>> the very XSTATE handling could be done better with far not as much
>>>>>>> of a change, at least afaict without having tried.
>>>>>> In which case I don't know what you were suggesting.
>>>>> Make {hvm,pv}_cpuid() invoke themselves recursively to
>>>>> determine what bits to mask off from CPUID[0xd].EAX.
>>>> So that would work.  However, to do this, you need to query leaves 1,
>>>> 0x80000001 and 7, all of which will hit the O(N) loop in domain_cpuid()
>>>>
>>>> Luckily, none of those specific paths further recurse into {hvm,pv}_cpuid().
>>>>
>>>> I am unsure which to go with.  My gut feel is that this would be quite a
>>>> performance hit, but I have no evidence either way.  OTOH, it will give
>>>> the correct answer, rather than an approximation.
>>> Not only since I believe performance is very close to irrelevant for
>>> CPUID leaf 0xD invocations, I think I'd prefer correctness over
>>> performance (as would be basically always the case). How about
>>> you?
>> Right - this is the alternative, doing the calculation in
>> {hvm,pv}_cpuid(), based on top of your cleanup from yesterday.
> Please use XSTATE_FP_SSE instead of open coding it.

Ok.

>
> Is the accumulation logic for xstate_size really correct? Doesn't the
> uncompressed area including, say, PKRU, have the same size no
> matter whether AVX or MPX are available? I.e. I think you need
>
>                 xstate_size = xstate_offsets[...] + xstate_sizes[...];

You are right.

>
> everywhere.
>
> Why are you dealing with MPX and PKU in pv_cpuid()? They're
> always off for PV guests.

Too much copy&paste.  I will back some of it out.

>
>> There is a bugfix in the PV side (pv_featureset[FEATURESET_1c] should be
>> taken into account even for control/hardware domain accesses),
> Ouch - I had thought of this yesterday night, and then forgot before
> committing.
>
>> and a
>> preemptive fix on the HVM side to avoid advertising any XSS states, as
>> we don't support any yet.
> I don't think I really like this part. What's wrong with keeping
> things the way they are?

We currently blindly trust the toolstack-provided values for
CPUID.7[1].ECX/EDX, which are the valid XSS bits.

Given that the bug trying to be fixed here is that Linux writes
CPUID.7[0].EAX/EDX directly into %xcr0, I felt it prudent to make the
same fix for XSS.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] x86/cpuid: fix dom0 crash on skylake machine
  2016-06-02 11:44                         ` Andrew Cooper
@ 2016-06-02 12:15                           ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2016-06-02 12:15 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: huaitong.han, Luwei Kang, yong.y.wang, xen-devel

>>> On 02.06.16 at 13:44, <andrew.cooper3@citrix.com> wrote:
> On 02/06/16 12:34, Jan Beulich wrote:
>>>>> On 02.06.16 at 13:12, <andrew.cooper3@citrix.com> wrote:
>>> and a
>>> preemptive fix on the HVM side to avoid advertising any XSS states, as
>>> we don't support any yet.
>> I don't think I really like this part. What's wrong with keeping
>> things the way they are?
> 
> We currently blindly trust the toolstack-provided values for
> CPUID.7[1].ECX/EDX, which are the valid XSS bits.
> 
> Given that the bug trying to be fixed here is that Linux writes
> CPUID.7[0].EAX/EDX directly into %xcr0, I felt it prudent to make the
> same fix for XSS.

Oh, you're right - this is actually well in line with the rest of the
change. Keep it.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2016-06-02 12:15 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-01  4:58 [PATCH] x86/cpuid: fix dom0 crash on skylake machine Luwei Kang
2016-06-01  5:54 ` Han, Huaitong
2016-06-01  8:49   ` Jan Beulich
2016-06-01  9:00     ` Han, Huaitong
2016-06-01  9:03 ` Andrew Cooper
2016-06-01  9:17   ` Jan Beulich
2016-06-01  9:34     ` Andrew Cooper
2016-06-01  9:43       ` Jan Beulich
2016-06-01 11:27         ` Andrew Cooper
2016-06-01 11:38           ` Jan Beulich
2016-06-01 11:45             ` Andrew Cooper
2016-06-01 12:01               ` Jan Beulich
2016-06-01 13:03                 ` Andrew Cooper
2016-06-01 13:28                   ` Jan Beulich
2016-06-02 11:12                     ` Andrew Cooper
2016-06-02 11:34                       ` Jan Beulich
2016-06-02 11:44                         ` Andrew Cooper
2016-06-02 12:15                           ` Jan Beulich
2016-06-01  9:21   ` Han, Huaitong
2016-06-01  9:30   ` Wei Liu
2016-06-01  9:45     ` Andrew Cooper
2016-06-01 10:54   ` Kang, Luwei
2016-06-01 10:57     ` Andrew Cooper
2016-06-01  9:04 ` Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).