* [PATCH 0/3] x86: Fix boot time detection of SMT settings
@ 2019-04-13 16:22 ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2019-04-13 16:22 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monné

It turns out that on Intel hardware, CPUID.0xb.smt_shift represents the
hardware capability, not the current configuration.  As a result, the nagging
L1TF warning triggers even when the admin has disabled hyperthreading in the
firmware.

The fix is to use MSR_INTEL_CORE_THREAD_COUNT instead.  Several updates to
the Intel MSR documentation are expected to be forthcoming.

See individual patches for details.
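
As a rough sketch of the approach (patch 3 contains the real logic;
rdmsr_safe() and MASK_EXTR() are the helpers Xen already uses for this,
and error handling plus the ACPI fallback are omitted):

    uint64_t val;

    /* Read the current, rather than maximum, core/thread counts. */
    if ( !rdmsr_safe(MSR_INTEL_CORE_THREAD_COUNT, val) )
    {
        unsigned int threads = MASK_EXTR(val, MSR_CTC_THREAD_MASK);
        unsigned int cores   = MASK_EXTR(val, MSR_CTC_CORE_MASK);

        /* Unequal counts => SMT is active in the current configuration. */
        bool smt_active = threads != cores;
    }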

Andrew Cooper (3):
  x86/spec-ctrl: Reposition the XPTI command line parsing logic
  x86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT
  x86/boot: Detect the firmware SMT setting correctly on Intel hardware

 xen/arch/x86/cpu/amd.c          |   2 +-
 xen/arch/x86/msr.c              |   2 +
 xen/arch/x86/spec_ctrl.c        | 182 ++++++++++++++++++++++++----------------
 xen/include/asm-x86/msr-index.h |   4 +
 4 files changed, 119 insertions(+), 71 deletions(-)

-- 
2.1.4


* [PATCH 1/3] x86/spec-ctrl: Reposition the XPTI command line parsing logic
@ 2019-04-13 16:22   ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2019-04-13 16:22 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monné

It has ended up in the middle of the mitigation calculation logic.  Move it to
be beside the other command line parsing.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/spec_ctrl.c | 134 +++++++++++++++++++++++------------------------
 1 file changed, 67 insertions(+), 67 deletions(-)

diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 4fd09f8..88b56f9 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -173,6 +173,73 @@ static int __init parse_spec_ctrl(const char *s)
 }
 custom_param("spec-ctrl", parse_spec_ctrl);
 
+int8_t __read_mostly opt_xpti_hwdom = -1;
+int8_t __read_mostly opt_xpti_domu = -1;
+
+static __init void xpti_init_default(uint64_t caps)
+{
+    if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+        caps = ARCH_CAPS_RDCL_NO;
+
+    if ( caps & ARCH_CAPS_RDCL_NO )
+    {
+        if ( opt_xpti_hwdom < 0 )
+            opt_xpti_hwdom = 0;
+        if ( opt_xpti_domu < 0 )
+            opt_xpti_domu = 0;
+    }
+    else
+    {
+        if ( opt_xpti_hwdom < 0 )
+            opt_xpti_hwdom = 1;
+        if ( opt_xpti_domu < 0 )
+            opt_xpti_domu = 1;
+    }
+}
+
+static __init int parse_xpti(const char *s)
+{
+    const char *ss;
+    int val, rc = 0;
+
+    /* Interpret 'xpti' alone in its positive boolean form. */
+    if ( *s == '\0' )
+        opt_xpti_hwdom = opt_xpti_domu = 1;
+
+    do {
+        ss = strchr(s, ',');
+        if ( !ss )
+            ss = strchr(s, '\0');
+
+        switch ( parse_bool(s, ss) )
+        {
+        case 0:
+            opt_xpti_hwdom = opt_xpti_domu = 0;
+            break;
+
+        case 1:
+            opt_xpti_hwdom = opt_xpti_domu = 1;
+            break;
+
+        default:
+            if ( !strcmp(s, "default") )
+                opt_xpti_hwdom = opt_xpti_domu = -1;
+            else if ( (val = parse_boolean("dom0", s, ss)) >= 0 )
+                opt_xpti_hwdom = val;
+            else if ( (val = parse_boolean("domu", s, ss)) >= 0 )
+                opt_xpti_domu = val;
+            else if ( *s )
+                rc = -EINVAL;
+            break;
+        }
+
+        s = ss + 1;
+    } while ( *ss );
+
+    return rc;
+}
+custom_param("xpti", parse_xpti);
+
 int8_t __read_mostly opt_pv_l1tf_hwdom = -1;
 int8_t __read_mostly opt_pv_l1tf_domu = -1;
 
@@ -634,73 +701,6 @@ static __init void l1tf_calculations(uint64_t caps)
                                             : (3ul << (paddr_bits - 2))));
 }
 
-int8_t __read_mostly opt_xpti_hwdom = -1;
-int8_t __read_mostly opt_xpti_domu = -1;
-
-static __init void xpti_init_default(uint64_t caps)
-{
-    if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
-        caps = ARCH_CAPS_RDCL_NO;
-
-    if ( caps & ARCH_CAPS_RDCL_NO )
-    {
-        if ( opt_xpti_hwdom < 0 )
-            opt_xpti_hwdom = 0;
-        if ( opt_xpti_domu < 0 )
-            opt_xpti_domu = 0;
-    }
-    else
-    {
-        if ( opt_xpti_hwdom < 0 )
-            opt_xpti_hwdom = 1;
-        if ( opt_xpti_domu < 0 )
-            opt_xpti_domu = 1;
-    }
-}
-
-static __init int parse_xpti(const char *s)
-{
-    const char *ss;
-    int val, rc = 0;
-
-    /* Interpret 'xpti' alone in its positive boolean form. */
-    if ( *s == '\0' )
-        opt_xpti_hwdom = opt_xpti_domu = 1;
-
-    do {
-        ss = strchr(s, ',');
-        if ( !ss )
-            ss = strchr(s, '\0');
-
-        switch ( parse_bool(s, ss) )
-        {
-        case 0:
-            opt_xpti_hwdom = opt_xpti_domu = 0;
-            break;
-
-        case 1:
-            opt_xpti_hwdom = opt_xpti_domu = 1;
-            break;
-
-        default:
-            if ( !strcmp(s, "default") )
-                opt_xpti_hwdom = opt_xpti_domu = -1;
-            else if ( (val = parse_boolean("dom0", s, ss)) >= 0 )
-                opt_xpti_hwdom = val;
-            else if ( (val = parse_boolean("domu", s, ss)) >= 0 )
-                opt_xpti_domu = val;
-            else if ( *s )
-                rc = -EINVAL;
-            break;
-        }
-
-        s = ss + 1;
-    } while ( *ss );
-
-    return rc;
-}
-custom_param("xpti", parse_xpti);
-
 void __init init_speculation_mitigations(void)
 {
     enum ind_thunk thunk = THUNK_DEFAULT;
-- 
2.1.4


* [PATCH 2/3] x86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT
@ 2019-04-13 16:22   ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2019-04-13 16:22 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monné

This is a model specific register which details the current configuration of
cores and threads in the package.  Because of how Hyperthreading and Core
configuration works in firmware, the MSR is de-facto constant and will remain
unchanged until the next system reset.

It is a read-only MSR.  For now, reject guest attempts to read it, to avoid
the system setting leaking into guest context.  Further CPUID/MSR work is
required before we can start virtualising a consistent topology to the guest.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>

Observant people will notice that in the latest SDM at the time of writing,
version 069, this MSR is listed in a single table for Haswell-generation
Xeons, and that the thread and core count fields are the wrong way around.

I'm informed that this MSR has existed since the Nehalem era (except for some
of the early in-order Atoms), has had consistent behaviour in that time, and
that the documentation will be addressed in future SDM revisions.
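
Until then, the intended decomposition with the new constants is as
follows (a sketch reflecting the observed behaviour, not the current SDM
table):

    uint64_t val;

    rdmsrl(MSR_INTEL_CORE_THREAD_COUNT, val);

    unsigned int threads = MASK_EXTR(val, MSR_CTC_THREAD_MASK); /* bits 15:0 */
    unsigned int cores   = MASK_EXTR(val, MSR_CTC_CORE_MASK);   /* bits 31:16 */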
---
 xen/arch/x86/msr.c              | 2 ++
 xen/include/asm-x86/msr-index.h | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 815d599..948d07d 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -131,6 +131,7 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_PRED_CMD:
     case MSR_FLUSH_CMD:
         /* Write-only */
+    case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_TSX_FORCE_ABORT:
         /* Not offered to guests. */
         goto gp_fault;
@@ -267,6 +268,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     {
         uint64_t rsvd;
 
+    case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_INTEL_PLATFORM_INFO:
     case MSR_ARCH_CAPABILITIES:
         /* Read-only */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 11512d4..389f95f 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -32,6 +32,10 @@
 #define EFER_KNOWN_MASK		(EFER_SCE | EFER_LME | EFER_LMA | EFER_NX | \
 				 EFER_SVME | EFER_FFXSE)
 
+#define MSR_INTEL_CORE_THREAD_COUNT     0x00000035
+#define MSR_CTC_THREAD_MASK             0x0000ffff
+#define MSR_CTC_CORE_MASK               0xffff0000
+
 /* Speculation Controls. */
 #define MSR_SPEC_CTRL			0x00000048
 #define SPEC_CTRL_IBRS			(_AC(1, ULL) << 0)
-- 
2.1.4


* [PATCH 3/3] x86/boot: Detect the firmware SMT setting correctly on Intel hardware
@ 2019-04-13 16:22   ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2019-04-13 16:22 UTC (permalink / raw)
  To: Xen-devel; +Cc: Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monné

While boot_cpu_data.x86_num_siblings is an accurate value to use on AMD
hardware, it isn't on Intel when the user has disabled Hyperthreading in the
firmware.  As a result, a user who has chosen to disable HT still gets nagged
on L1TF-vulnerable hardware when they haven't chosen an explicit smt=<bool>
setting.

When booting on real hardware, make use of the largely-undocumented
MSR_INTEL_CORE_THREAD_COUNT, which has in practice existed since Nehalem.
Fall back to using the APIC IDs from the ACPI tables.

While adjusting this logic, fix a latent bug in amd_get_topology().  The
thread count field in CPUID.0x8000001e.ebx is documented as 8 bits wide,
rather than 2 bits wide.
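
For the fallback path, a worked sketch (it assumes, as the CPUID topology
leaves enumerate, that x86_num_siblings is rounded up to a power of two,
so the low APIC ID bits form the thread ID within a core; e.g. with two
siblings, any odd APIC ID indicates an active sibling thread):

    unsigned int cpu;

    for_each_present_cpu ( cpu )
        if ( x86_cpu_to_apicid[cpu] & (boot_cpu_data.x86_num_siblings - 1) )
        {
            hw_smt_enabled = true;
            break;
        }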

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/cpu/amd.c   |  2 +-
 xen/arch/x86/spec_ctrl.c | 50 +++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
index e19a5ea..23de258 100644
--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -507,7 +507,7 @@ static void amd_get_topology(struct cpuinfo_x86 *c)
                 u32 eax, ebx, ecx, edx;
 
                 cpuid(0x8000001e, &eax, &ebx, &ecx, &edx);
-                c->x86_num_siblings = ((ebx >> 8) & 0x3) + 1;
+                c->x86_num_siblings = ((ebx >> 8) & 0xff) + 1;
 
                 if (c->x86 < 0x17)
                         c->compute_unit_id = ebx & 0xFF;
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 88b56f9..be7f896 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -58,7 +58,7 @@ uint8_t __read_mostly default_xen_spec_ctrl;
 uint8_t __read_mostly default_spec_ctrl_flags;
 
 paddr_t __read_mostly l1tf_addr_mask, __read_mostly l1tf_safe_maddr;
-static bool __initdata cpu_has_bug_l1tf;
+static bool __initdata cpu_has_bug_l1tf, hw_smt_enabled;
 static unsigned int __initdata l1d_maxphysaddr;
 
 static int __init parse_spec_ctrl(const char *s)
@@ -284,6 +284,49 @@ static __init int parse_pv_l1tf(const char *s)
 }
 custom_param("pv-l1tf", parse_pv_l1tf);
 
+static void __init detect_smt_status(void)
+{
+    uint64_t val;
+    unsigned int cpu;
+
+    /*
+     * x86_num_siblings defaults to 1 in the absence of other information, and
+     * is adjusted based on other topology information found in CPUID leaves.
+     *
+     * On AMD hardware, it will be the current SMT configuration.  On Intel
+     * hardware, it will represent the maximum capability, rather than the
+     * current configuration.
+     */
+    if ( boot_cpu_data.x86_num_siblings < 2 )
+        return;
+
+    /*
+     * Intel Nehalem and later hardware does have an MSR which reports the
+     * current count of cores/threads in the package.
+     *
+     * At the time of writing, it is almost completely undocumented, so isn't
+     * virtualised reliably.
+     */
+    if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && !cpu_has_hypervisor &&
+         !rdmsr_safe(MSR_INTEL_CORE_THREAD_COUNT, val) )
+    {
+        hw_smt_enabled = (MASK_EXTR(val, MSR_CTC_CORE_MASK) !=
+                          MASK_EXTR(val, MSR_CTC_THREAD_MASK));
+        return;
+    }
+
+    /*
+     * Search over the CPUs reported in the ACPI tables.  Any whose APIC ID
+     * has a non-zero thread id component indicates that SMT is active.
+     */
+    for_each_present_cpu ( cpu )
+        if ( x86_cpu_to_apicid[cpu] & (boot_cpu_data.x86_num_siblings - 1) )
+        {
+            hw_smt_enabled = true;
+            return;
+        }
+}
+
 static void __init print_details(enum ind_thunk thunk, uint64_t caps)
 {
     unsigned int _7d0 = 0, e8b = 0, tmp;
@@ -710,6 +753,8 @@ void __init init_speculation_mitigations(void)
     if ( boot_cpu_has(X86_FEATURE_ARCH_CAPS) )
         rdmsrl(MSR_ARCH_CAPABILITIES, caps);
 
+    detect_smt_status();
+
     /*
      * Has the user specified any custom BTI mitigations?  If so, follow their
      * instructions exactly and disable all heuristics.
@@ -886,8 +931,7 @@ void __init init_speculation_mitigations(void)
      * However, if we are on affected hardware, with HT enabled, and the user
      * hasn't explicitly chosen whether to use HT or not, nag them to do so.
      */
-    if ( opt_smt == -1 && cpu_has_bug_l1tf && !pv_shim &&
-         boot_cpu_data.x86_num_siblings > 1 )
+    if ( opt_smt == -1 && cpu_has_bug_l1tf && !pv_shim && hw_smt_enabled )
         warning_add(
             "Booted on L1TF-vulnerable hardware with SMT/Hyperthreading\n"
             "enabled.  Please assess your configuration and choose an\n"
-- 
2.1.4


* Re: [PATCH 1/3] x86/spec-ctrl: Reposition the XPTI command line parsing logic
@ 2019-04-15  7:58     ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2019-04-15  7:58 UTC (permalink / raw)
  To: andrew.cooper3; +Cc: xen-devel, wei.liu2, roger.pau

>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/13/19 6:22 PM >>>
>It has ended up in the middle of the mitigation calculation logic.  Move it to
>be beside the other command line parsing.
>
>No functional change.
>
>Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>



* Re: [PATCH 2/3] x86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT
@ 2019-04-15  8:09     ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2019-04-15  8:09 UTC (permalink / raw)
  To: andrew.cooper3; +Cc: xen-devel, wei.liu2, roger.pau

>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/13/19 6:22 PM >>>
>--- a/xen/arch/x86/msr.c
>+++ b/xen/arch/x86/msr.c
>@@ -131,6 +131,7 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
>     case MSR_PRED_CMD:
>     case MSR_FLUSH_CMD:
>         /* Write-only */
>+    case MSR_INTEL_CORE_THREAD_COUNT:
>     case MSR_TSX_FORCE_ABORT:
>         /* Not offered to guests. */
>         goto gp_fault;

In a private talk we had, didn't you mention there is at least one OS (was it OSX)
that unconditionally reads this MSR? Despite assuming there are other issues
with OSX, wouldn't it be better to avoid giving #GP(0) back here? I realize we can't
yet give back a fully consistent value here, but looking at what conclusions Linux
draws from other CPUID output, couldn't 0x00010001 be used as a "fake" value?
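
That would amount to something like this in guest_rdmsr() (a hypothetical
sketch of the suggestion, not a tested change):

    case MSR_INTEL_CORE_THREAD_COUNT:
        /* Fake value: 1 core, 1 thread in the package. */
        *val = 0x00010001;
        break;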


Nevertheless, if you're convinced it should be this way, and if this doesn't actively
break any guest OS (due to introducing this as the _only_ thing causing their boot
to fail):

Acked-by: Jan Beulich <jbeulich@suse.com>

Jan


* Re: [PATCH 3/3] x86/boot: Detect the firmware SMT setting correctly on Intel hardware
@ 2019-04-15  8:25     ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2019-04-15  8:25 UTC (permalink / raw)
  To: andrew.cooper3; +Cc: xen-devel, wei.liu2, roger.pau

>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/13/19 6:22 PM >>>
>--- a/xen/arch/x86/spec_ctrl.c
>+++ b/xen/arch/x86/spec_ctrl.c
>@@ -58,7 +58,7 @@ uint8_t __read_mostly default_xen_spec_ctrl;
> uint8_t __read_mostly default_spec_ctrl_flags;
>
> paddr_t __read_mostly l1tf_addr_mask, __read_mostly l1tf_safe_maddr;
>-static bool __initdata cpu_has_bug_l1tf;
>+static bool __initdata cpu_has_bug_l1tf, hw_smt_enabled;

I wonder whether this wouldn't better start out as "true" and get set to false when
we're certain SMT is off. I'd rather see the nagging one time too often, than get a
report that someone's still vulnerable because of having SMT enabled without
noticing.
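
I.e. something like this as the starting point (a sketch of the suggestion):

    /* Assume vulnerable-by-SMT until positively determined otherwise. */
    static bool __initdata hw_smt_enabled = true;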


>+static void __init detect_smt_status(void)
>+{
>+    uint64_t val;
>+    unsigned int cpu;
>+
>+    /*
>+     * x86_num_siblings defaults to 1 in the absence of other information, and
>+     * is adjusted based on other topology information found in CPUID leaves.
>+     *
>+     * On AMD hardware, it will be the current SMT configuration.  On Intel
>+     * hardware, it will represent the maximum capability, rather than the
>+     * current configuration.
>+     */
>+    if ( boot_cpu_data.x86_num_siblings < 2 )
>+        return;
>+
>+    /*
>+     * Intel Nehalem and later hardware does have an MSR which reports the
>+     * current count of cores/threads in the package.
>+     *
>+     * At the time of writing, it is almost completely undocumented, so isn't
>+     * virtualised reliably.
>+     */
>+    if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && !cpu_has_hypervisor &&

While you clarify why you have the last part of the condition in the comment, I
still wonder whether it wouldn't be worthwhile to assume it might be (properly)
virtualized, and rather do a little more sanity checking ...


>+         !rdmsr_safe(MSR_INTEL_CORE_THREAD_COUNT, val) )
>+    {
>+        hw_smt_enabled = (MASK_EXTR(val, MSR_CTC_CORE_MASK) !=
>+                          MASK_EXTR(val, MSR_CTC_THREAD_MASK));
>+        return;

... before this: Verify that both fields are non-zero, that core count is no greater
than thread count, and perhaps the latter also divides evenly by the former.
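
Something along these lines, perhaps (a sketch of the suggested checks,
using the names from the patch):

    unsigned int threads = MASK_EXTR(val, MSR_CTC_THREAD_MASK);
    unsigned int cores   = MASK_EXTR(val, MSR_CTC_CORE_MASK);

    /* Only trust the MSR when the fields look plausible. */
    if ( cores && threads && cores <= threads && !(threads % cores) )
    {
        hw_smt_enabled = threads != cores;
        return;
    }
    /* Otherwise fall through to the APIC ID scan. */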

However, this being an improvement on its own already, I could accept it going
in as is, so even in its current shape (albeit preferably with the start-out-as-true
adjustment mentioned above):

Acked-by: Jan Beulich <jbeulich@suse.com>

>@@ -886,8 +931,7 @@ void __init init_speculation_mitigations(void)
>     * However, if we are on affected hardware, with HT enabled, and the user
>     * hasn't explicitly chosen whether to use HT or not, nag them to do so.
>     */
>-    if ( opt_smt == -1 && cpu_has_bug_l1tf && !pv_shim &&
>-         boot_cpu_data.x86_num_siblings > 1 )
>+    if ( opt_smt == -1 && cpu_has_bug_l1tf && !pv_shim && hw_smt_enabled )

It feels like the pv_shim part should have become redundant now, but I'm
uncertain whether it really does.

Jan



* Re: [PATCH 2/3] x86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT
@ 2019-04-15  8:29       ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2019-04-15  8:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, wei.liu2, roger.pau

On 15/04/2019 09:09, Jan Beulich wrote:
>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/13/19 6:22 PM >>>
>> --- a/xen/arch/x86/msr.c
>> +++ b/xen/arch/x86/msr.c
>> @@ -131,6 +131,7 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
>>     case MSR_PRED_CMD:
>>     case MSR_FLUSH_CMD:
>>         /* Write-only */
>> +    case MSR_INTEL_CORE_THREAD_COUNT:
>>     case MSR_TSX_FORCE_ABORT:
>>         /* Not offered to guests. */
>>         goto gp_fault;
> In a private talk we had, didn't you mention there is at least one OS (was it OSX)
> that unconditionally reads this MSR? Despite assuming there are other issues
> with OSX, wouldn't it be better to avoid giving #GP(0) back here? I realize we can't
> yet give back a fully consistent value here, but looking at what conclusions Linux
> draws from other CPUID output, couldn't 0x00010001 be used as a "fake" value?

I did consider that option, but for anyone looking at this MSR, it is
likely to make the situation worse rather than better.

The other option to consider is retaining the current leaky behaviour. 
There haven't been obvious problems thus far, and the accurate topology
information is going to be arriving shortly to mesh in with the core
scheduling work.

~Andrew

* Re: [PATCH 3/3] x86/boot: Detect the firmware SMT setting correctly on Intel hardware
@ 2019-04-15 12:43       ` Andrew Cooper
  0 siblings, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2019-04-15 12:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, wei.liu2, roger.pau

On 15/04/2019 09:25, Jan Beulich wrote:
>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/13/19 6:22 PM >>>
>> --- a/xen/arch/x86/spec_ctrl.c
>> +++ b/xen/arch/x86/spec_ctrl.c
>> @@ -58,7 +58,7 @@ uint8_t __read_mostly default_xen_spec_ctrl;
>> uint8_t __read_mostly default_spec_ctrl_flags;
>>
>> paddr_t __read_mostly l1tf_addr_mask, __read_mostly l1tf_safe_maddr;
>> -static bool __initdata cpu_has_bug_l1tf;
>> +static bool __initdata cpu_has_bug_l1tf, hw_smt_enabled;
> I wonder whether this wouldn't better start out as "true" and get set to false when
> we're certain SMT is off. I'd rather see the nagging one time too often, than get a
> report that someone's still vulnerable because of having SMT enabled without
> noticing.

Actually on second thoughts, I've switched it to being local to
init_speculation_mitigations(), and...

>> +static void __init detect_smt_status(void)

... this to returning bool.

>> +{
>> +    uint64_t val;
>> +    unsigned int cpu;
>> +
>> +    /*
>> +     * x86_num_siblings defaults to 1 in the absence of other information, and
>> +     * is adjusted based on other topology information found in CPUID leaves.
>> +     *
>> +     * On AMD hardware, it will be the current SMT configuration.  On Intel
>> +     * hardware, it will represent the maximum capability, rather than the
>> +     * current configuration.
>> +     */
>> +    if ( boot_cpu_data.x86_num_siblings < 2 )
>> +        return;
>> +
>> +    /*
>> +     * Intel Nehalem and later hardware does have an MSR which reports the
>> +     * current count of cores/threads in the package.
>> +     *
>> +     * At the time of writing, it is almost completely undocumented, so isn't
>> +     * virtualised reliably.
>> +     */
>> +    if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL && !cpu_has_hypervisor &&
> While you clarify why you have the last part of the condition in the comment, I
> still wonder whether it wouldn't be worthwhile to assume it might be (properly)
> virtualized, and rather do a little more sanity checking ...
>
>
>> +         !rdmsr_safe(MSR_INTEL_CORE_THREAD_COUNT, val) )
>> +    {
>> +        hw_smt_enabled = (MASK_EXTR(val, MSR_CTC_CORE_MASK) !=
>> +                          MASK_EXTR(val, MSR_CTC_THREAD_MASK));
>> +        return;
> ... before this: Verify that both fields are non-zero, that core count is no greater
> than thread count, and perhaps the latter also divides evenly by the former.

This plan won't work when nested under older Xen, and potentially other
hypervisors.  The problem is that when the real hardware value is leaked
in, it bypasses the behaviour of the L0 scheduler which may be doing
core scheduling and removing the host HT-ness from the equation.

>
> However, this being an improvement on its own already, I could accept it going
> in as is, so even in its current shape (albeit preferably with the start-out-as-true
> adjustment mentioned above)
>
> Acked-by: Jan Beulich <jbeulich@suse.com>

Thanks.

>
>> @@ -886,8 +931,7 @@ void __init init_speculation_mitigations(void)
>>      * However, if we are on affected hardware, with HT enabled, and the user
>>      * hasn't explicitly chosen whether to use HT or not, nag them to do so.
>>      */
>> -    if ( opt_smt == -1 && cpu_has_bug_l1tf && !pv_shim &&
>> -         boot_cpu_data.x86_num_siblings > 1 )
>> +    if ( opt_smt == -1 && cpu_has_bug_l1tf && !pv_shim && hw_smt_enabled )
> It feels like the pv_shim part should have become redundant now, but I'm
> uncertain whether it really does.

No - it's not redundant.  Even with HT enabled in the shim case, we have
argued ourselves to be safe because the guest can only attack itself and
the shim Xen, so shouldn't get the nagging warning.

~Andrew

* Re: [PATCH 2/3] x86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT
@ 2019-04-15 15:18         ` Jan Beulich
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Beulich @ 2019-04-15 15:18 UTC (permalink / raw)
  To: andrew.cooper3; +Cc: xen-devel, wei.liu2, roger.pau

>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/15/19 10:30 AM >>>
>On 15/04/2019 09:09, Jan Beulich wrote:
>>>> Andrew Cooper <andrew.cooper3@citrix.com> 04/13/19 6:22 PM >>>
>>> --- a/xen/arch/x86/msr.c
>>> +++ b/xen/arch/x86/msr.c
>>> @@ -131,6 +131,7 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
>>>     case MSR_PRED_CMD:
>>>     case MSR_FLUSH_CMD:
>>>         /* Write-only */
>>> +    case MSR_INTEL_CORE_THREAD_COUNT:
>>>     case MSR_TSX_FORCE_ABORT:
>>>         /* Not offered to guests. */
>>>         goto gp_fault;
>> In a private talk we had, didn't you mention there is at least one OS (was it OSX)
>> that unconditionally reads this MSR? Despite assuming there are other issues
>> with OSX, wouldn't it be better to avoid giving #GP(0) back here? I realize we can't
>> yet give back a fully consistent value here, but looking at what conclusions Linux
>> draws from other CPUID output, couldn't 0x00010001 be used as a "fake" value?
>
>I did consider that option, but for anyone looking at this MSR, it is
>likely to make the situation worse rather than better.
>
>The other option to consider is retaining the current leaky behaviour. 
>There haven't been obvious problems thus far, and the accurate topology
>information is going to be arriving shortly to mesh in with the core
>scheduling work.

Imo this would still be better than delivering #GP(0).

Jan




end of thread, other threads:[~2019-04-15 15:15 UTC | newest]

Thread overview:
2019-04-13 16:22 [PATCH 0/3] x86: Fix boot time detection of SMT settings Andrew Cooper
2019-04-13 16:22 ` [PATCH 1/3] x86/spec-ctrl: Reposition the XPTI command line parsing logic Andrew Cooper
2019-04-15  7:58   ` Jan Beulich
2019-04-13 16:22 ` [PATCH 2/3] x86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT Andrew Cooper
2019-04-15  8:09   ` Jan Beulich
2019-04-15  8:29     ` Andrew Cooper
2019-04-15 15:18       ` Jan Beulich
2019-04-13 16:22 ` [PATCH 3/3] x86/boot: Detect the firmware SMT setting correctly on Intel hardware Andrew Cooper
2019-04-15  8:25   ` Jan Beulich
2019-04-15 12:43     ` Andrew Cooper
