* [patch V6 00/14] MDS basics 0
@ 2019-03-01 21:47 Thomas Gleixner
2019-03-01 21:47 ` [patch V6 01/14] MDS basics 1 Thomas Gleixner
` (15 more replies)
0 siblings, 16 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Changes vs. V5:
- Fix tools/ build (Josh)
- Dropped the AIRMONT_MID change as it needs confirmation from Intel
- Made the consolidated whitelist more readable and correct
- Added the MSBDS only quirk for XEON PHI, made the idle flush
depend on it and updated the sysfs output accordingly.
- Fixed the protection matrix in the admin documentation and clarified
the SMT situation vs. MSBDS only.
- Updated the KVM/VMX changelog.
Delta patch against V5 below.
Available from git:
cvs.ou.linutronix.de:linux/speck/linux WIP.mds
The linux-4.20.y, linux-4.19.y and linux-4.14.y branches are updated as
well and contain the untested backports of the pile for reference.
I'll send git bundles of the pile as well.
Thanks,
tglx
8<---------------------------
diff --git a/Documentation/admin-guide/hw-vuln/mds.rst b/Documentation/admin-guide/hw-vuln/mds.rst
index 73cdc390aece..1de29d28903d 100644
--- a/Documentation/admin-guide/hw-vuln/mds.rst
+++ b/Documentation/admin-guide/hw-vuln/mds.rst
@@ -23,6 +23,10 @@ vulnerability is not present on:
Whether a processor is affected or not can be read out from the MDS
vulnerability file in sysfs. See :ref:`mds_sys_info`.
+Not all processors are affected by all variants of MDS, but the mitigation
+is identical for all of them so the kernel treats them as a single
+vulnerability.
+
Related CVEs
------------
@@ -112,6 +116,7 @@ to the above information:
======================== ============================================
'SMT vulnerable' SMT is enabled
+ 'SMT mitigated' SMT is enabled and mitigated
'SMT disabled' SMT is disabled
'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown
======================== ============================================
@@ -153,8 +158,12 @@ CPU buffer clearing
The mitigation for MDS clears the affected CPU buffers on return to user
space and when entering a guest.
- If SMT is enabled it also clears the buffers on idle entry, but that's not
- a sufficient SMT protection for all MDS variants; it covers solely MSBDS.
+ If SMT is enabled it also clears the buffers on idle entry when the CPU
+ is only affected by MSBDS and not any other MDS variant, because the
+ other variants cannot be protected against cross Hyper-Thread attacks.
+
+ For CPUs which are only affected by MSBDS the user space, guest and idle
+ transition mitigations are sufficient and SMT is not affected.
.. _virt_mechanism:
@@ -168,8 +177,10 @@ Virtualization mitigation
If the L1D flush mitigation is enabled and up to date microcode is
available, the L1D flush mitigation is automatically protecting the
- guest transition. If the L1D flush mitigation is disabled the MDS
- mitigation is disabled as well.
+ guest transition.
+
+ If the L1D flush mitigation is disabled then the MDS mitigation is
+ invoked explicitly when the host MDS mitigation is enabled.
For details on L1TF and virtualization see:
:ref:`Documentation/admin-guide/hw-vuln//l1tf.rst <mitigation_control_kvm>`.
@@ -177,16 +188,18 @@ Virtualization mitigation
- CPU is not affected by L1TF:
CPU buffers are flushed before entering the guest when the host MDS
- protection is enabled.
+ mitigation is enabled.
The resulting MDS protection matrix for the host to guest transition:
============ ===== ============= ============ =================
- L1TF MDS VMX-L1FLUSH Host MDS State
+ L1TF MDS VMX-L1FLUSH Host MDS MDS-State
Don't care No Don't care N/A Not affected
- Yes Yes Disabled Don't care Vulnerable
+ Yes Yes Disabled Off Vulnerable
+
+ Yes Yes Disabled Full Mitigated
Yes Yes Enabled Don't care Mitigated
@@ -196,7 +209,7 @@ Virtualization mitigation
============ ===== ============= ============ =================
This only covers the host to guest transition, i.e. prevents leakage from
- host to guest, but does not protect the guest internally. Guest need to
+ host to guest, but does not protect the guest internally. Guests need to
have their own protections.
.. _xeon_phi:
@@ -210,14 +223,22 @@ XEON PHI specific considerations
for malicious user space. The exposure can be disabled on the kernel
command line with the 'ring3mwait=disable' command line option.
+ XEON PHI is not affected by the other MDS variants and MSBDS is mitigated
+ before the CPU enters an idle state. As XEON PHI is not affected by L1TF
+ either, disabling SMT is not required for full protection.
+
.. _mds_smt_control:
SMT control
^^^^^^^^^^^
- To prevent the SMT issues of MDS it might be necessary to disable SMT
- completely. Disabling SMT can have a significant performance impact, but
- the impact depends on the type of workloads.
+ All MDS variants except MSBDS can be exploited across Hyper-Threads. That
+ means on CPUs which are affected by MFBDS or MLPDS it is necessary to
+ disable SMT for full protection. These are most of the affected CPUs; the
+ exception is XEON PHI, see :ref:`xeon_phi`.
+
+ Disabling SMT can have a significant performance impact, but the impact
+ depends on the type of workloads.
See the relevant chapter in the L1TF mitigation documentation for details:
:ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <smt_control>`.
@@ -260,9 +281,7 @@ Mitigation selection guide
2. Virtualization with trusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- The same considerations as above versus trusted user space apply. See
- also: :ref:`Documentation/admin-guide/hw-vuln//l1tf.rst <mitigation_selection>`.
-
+ The same considerations as above versus trusted user space apply.
3. Virtualization with untrusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -270,6 +289,8 @@ Mitigation selection guide
The protection depends on the state of the L1TF mitigations.
See :ref:`virt_mechanism`.
+ If the MDS mitigation is enabled and SMT is disabled, guest to host and
+ guest to guest attacks are prevented.
.. _mds_default_mitigations:
diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst
index b050623c869c..3d6f943f1afb 100644
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -107,19 +107,19 @@ user space or VM guests.
Kernel internal mitigation modes
--------------------------------
- ======= ===========================================================
- off Mitigation is disabled. Either the CPU is not affected or
- mds=off is supplied on the kernel command line
+ ======= ============================================================
+ off Mitigation is disabled. Either the CPU is not affected or
+ mds=off is supplied on the kernel command line
- full Mitigation is eanbled. CPU is affected and MD_CLEAR is
- advertised in CPUID.
+ full Mitigation is enabled. CPU is affected and MD_CLEAR is
+ advertised in CPUID.
- vmwerv Mitigation is enabled. CPU is affected and MD_CLEAR is not
- advertised in CPUID. That is mainly for virtualization
- scenarios where the host has the updated microcode but the
- hypervisor does not expose MD_CLEAR in CPUID. It's a best
- effort approach without guarantee.
- ======= ===========================================================
+ vmwerv Mitigation is enabled. CPU is affected and MD_CLEAR is not
+ advertised in CPUID. That is mainly for virtualization
+ scenarios where the host has the updated microcode but the
+ hypervisor does not expose MD_CLEAR in CPUID. It's a best
+ effort approach without guarantee.
+ ======= ============================================================
If the CPU is affected and mds=off is not supplied on the kernel command
line then the kernel selects the appropriate mitigation mode depending on
@@ -189,6 +189,13 @@ Mitigation points
When SMT is inactive, i.e. either the CPU does not support it or all
sibling threads are offline CPU buffer clearing is not required.
+ The idle clearing is enabled on CPUs which are only affected by MSBDS
+ and not by any other MDS variant. The other MDS variants cannot be
+ protected against cross Hyper-Thread attacks because the Fill Buffer and
+ the Load Ports are shared. So on CPUs affected by other variants, the
+ idle clearing would be a window dressing exercise and is therefore not
+ activated.
+
The invocation is controlled by the static key mds_idle_clear which is
switched depending on the chosen mitigation mode and the SMT state of
the system.
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index ae3f987b24f1..bdcea163850a 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -383,5 +383,6 @@
#define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */
#define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
+#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSBDS variant of BUG_MDS */
#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index aea871e69d64..e11654f93e71 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -667,6 +667,15 @@ static void update_indir_branch_cond(void)
/* Update the static key controlling the MDS CPU buffer clear in idle */
static void update_mds_branch_idle(void)
{
+ /*
+ * Enable the idle clearing on CPUs which are affected only by
+ * MSBDS and not any other MDS variant. The other variants cannot
+ * be mitigated when SMT is enabled, so clearing the buffers on
+ * idle would be a window dressing exercise.
+ */
+ if (!boot_cpu_has(X86_BUG_MSBDS_ONLY))
+ return;
+
if (sched_smt_active())
static_branch_enable(&mds_idle_clear);
else
@@ -1174,6 +1183,11 @@ static ssize_t mds_show_state(char *buf)
mds_strings[mds_mitigation]);
}
+ if (boot_cpu_has(X86_BUG_MSBDS_ONLY)) {
+ return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
+ sched_smt_active() ? "mitigated" : "disabled");
+ }
+
return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
sched_smt_active() ? "vulnerable" : "disabled");
}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 389853338c2f..71d953a2c4db 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -953,38 +953,57 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
#define NO_SSB BIT(2)
#define NO_L1TF BIT(3)
#define NO_MDS BIT(4)
+#define MSBDS_ONLY BIT(5)
+
+#define VULNWL(_vendor, _family, _model, _whitelist) \
+ { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
+
+#define VULNWL_INTEL(model, whitelist) \
+ VULNWL(INTEL, 6, INTEL_FAM6_##model, whitelist)
+
+#define VULNWL_AMD(family, whitelist) \
+ VULNWL(AMD, family, X86_MODEL_ANY, whitelist)
+
+#define VULNWL_HYGON(family, whitelist) \
+ VULNWL(HYGON, family, X86_MODEL_ANY, whitelist)
static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
- { X86_VENDOR_ANY, 4, X86_MODEL_ANY, X86_FEATURE_ANY, NO_SPECULATION },
- { X86_VENDOR_CENTAUR, 5, X86_MODEL_ANY, X86_FEATURE_ANY, NO_SPECULATION },
- { X86_VENDOR_INTEL, 5, X86_MODEL_ANY, X86_FEATURE_ANY, NO_SPECULATION },
- { X86_VENDOR_NSC, 5, X86_MODEL_ANY, X86_FEATURE_ANY, NO_SPECULATION },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL, X86_FEATURE_ANY, NO_SPECULATION },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL_TABLET, X86_FEATURE_ANY, NO_SPECULATION },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_BONNELL_MID, X86_FEATURE_ANY, NO_SPECULATION },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL_MID, X86_FEATURE_ANY, NO_SPECULATION },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_BONNELL, X86_FEATURE_ANY, NO_SPECULATION },
-
- { X86_VENDOR_AMD, X86_FAMILY_ANY, X86_MODEL_ANY, X86_FEATURE_ANY, NO_MELTDOWN | NO_L1TF },
- { X86_VENDOR_HYGON, X86_FAMILY_ANY, X86_MODEL_ANY, X86_FEATURE_ANY, NO_MELTDOWN | NO_L1TF },
-
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT, X86_FEATURE_ANY, NO_SSB | NO_L1TF },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_X, X86_FEATURE_ANY, NO_SSB | NO_L1TF },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_MID, X86_FEATURE_ANY, NO_SSB | NO_L1TF },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT, X86_FEATURE_ANY, NO_SSB | NO_L1TF },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT_MID, X86_FEATURE_ANY, NO_SSB | NO_L1TF },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH, X86_FEATURE_ANY, NO_SSB | NO_L1TF },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL, X86_FEATURE_ANY, NO_SSB | NO_L1TF },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM, X86_FEATURE_ANY, NO_SSB | NO_L1TF },
-
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT, X86_FEATURE_ANY, NO_L1TF | NO_MDS },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT_X, X86_FEATURE_ANY, NO_L1TF | NO_MDS },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT_PLUS, X86_FEATURE_ANY, NO_L1TF | NO_MDS },
-
- { X86_VENDOR_AMD, 0x0f, X86_MODEL_ANY, X86_FEATURE_ANY, NO_SSB },
- { X86_VENDOR_AMD, 0x10, X86_MODEL_ANY, X86_FEATURE_ANY, NO_SSB },
- { X86_VENDOR_AMD, 0x11, X86_MODEL_ANY, X86_FEATURE_ANY, NO_SSB },
- { X86_VENDOR_AMD, 0x12, X86_MODEL_ANY, X86_FEATURE_ANY, NO_SSB },
+ VULNWL(ANY, 4, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(CENTAUR, 5, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(INTEL, 5, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION),
+
+ /* Intel Family 6 */
+ VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION),
+
+ VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY),
+
+ VULNWL_INTEL(CORE_YONAH, NO_SSB),
+
+ VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF),
+
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF),
+
+ /* AMD Family 0xf - 0x12 */
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+
+ /* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
+ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS),
+ VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS),
{}
};
@@ -1015,8 +1034,11 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
if (ia32_cap & ARCH_CAP_IBRS_ALL)
setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
- if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO))
+ if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO)) {
setup_force_cpu_bug(X86_BUG_MDS);
+ if (cpu_matches(MSBDS_ONLY))
+ setup_force_cpu_bug(X86_BUG_MSBDS_ONLY);
+ }
if (cpu_matches(NO_MELTDOWN))
return;
diff --git a/tools/power/x86/turbostat/Makefile b/tools/power/x86/turbostat/Makefile
index 1598b4fa0b11..045f5f7d68ab 100644
--- a/tools/power/x86/turbostat/Makefile
+++ b/tools/power/x86/turbostat/Makefile
@@ -9,7 +9,7 @@ ifeq ("$(origin O)", "command line")
endif
turbostat : turbostat.c
-override CFLAGS += -Wall
+override CFLAGS += -Wall -I../../../include
override CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'
override CFLAGS += -DINTEL_FAMILY_HEADER='"../../../../arch/x86/include/asm/intel-family.h"'
diff --git a/tools/power/x86/x86_energy_perf_policy/Makefile b/tools/power/x86/x86_energy_perf_policy/Makefile
index ae7a0e09b722..1fdeef864e7c 100644
--- a/tools/power/x86/x86_energy_perf_policy/Makefile
+++ b/tools/power/x86/x86_energy_perf_policy/Makefile
@@ -9,7 +9,7 @@ ifeq ("$(origin O)", "command line")
endif
x86_energy_perf_policy : x86_energy_perf_policy.c
-override CFLAGS += -Wall
+override CFLAGS += -Wall -I../../../include
override CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'
%: %.c
* [patch V6 01/14] MDS basics 1
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-02 0:06 ` [MODERATED] " Frederic Weisbecker
2019-03-01 21:47 ` [patch V6 02/14] MDS basics 2 Thomas Gleixner
` (14 subsequent siblings)
15 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 01/14] x86/msr-index: Cleanup bit defines
From: Thomas Gleixner <tglx@linutronix.de>
Greg pointed out that speculation related bit defines are using (1 << N)
format instead of BIT(N). Aside of that (1 << N) is wrong as it should use
1UL at least.
Clean it up.
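As an illustration of the 1UL point: a bit 31 define built from a plain int
shift sign-extends when widened to a 64-bit variable. A minimal user-space
sketch (not part of the patch; BIT() simplified from <linux/bits.h>):

#include <stdio.h>
#include <stdint.h>

#define BIT(n)	(1UL << (n))	/* simplified <linux/bits.h> form */

int main(void)
{
	/*
	 * 1 << 31 overflows the signed int promotion; typical compilers
	 * yield INT_MIN, which sign-extends when widened to 64 bit.
	 */
	uint64_t bad  = 1 << 31;
	uint64_t good = BIT(31);

	printf("1 << 31: %#llx\n", (unsigned long long)bad);	/* 0xffffffff80000000 */
	printf("BIT(31): %#llx\n", (unsigned long long)good);	/* 0x80000000 */
	return 0;
}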
[ Josh Poimboeuf: Fix tools build ]
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V5 -> V6: Fix tools build (Josh)
---
arch/x86/include/asm/msr-index.h | 34 ++++++++++++------------
tools/power/x86/turbostat/Makefile | 2 -
tools/power/x86/x86_energy_perf_policy/Makefile | 2 -
3 files changed, 20 insertions(+), 18 deletions(-)
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -2,6 +2,8 @@
#ifndef _ASM_X86_MSR_INDEX_H
#define _ASM_X86_MSR_INDEX_H
+#include <linux/bits.h>
+
/*
* CPU model specific register (MSR) numbers.
*
@@ -40,14 +42,14 @@
/* Intel MSRs. Some also available on other CPUs */
#define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */
-#define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */
+#define SPEC_CTRL_IBRS BIT(0) /* Indirect Branch Restricted Speculation */
#define SPEC_CTRL_STIBP_SHIFT 1 /* Single Thread Indirect Branch Predictor (STIBP) bit */
-#define SPEC_CTRL_STIBP (1 << SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
+#define SPEC_CTRL_STIBP BIT(SPEC_CTRL_STIBP_SHIFT) /* STIBP mask */
#define SPEC_CTRL_SSBD_SHIFT 2 /* Speculative Store Bypass Disable bit */
-#define SPEC_CTRL_SSBD (1 << SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
+#define SPEC_CTRL_SSBD BIT(SPEC_CTRL_SSBD_SHIFT) /* Speculative Store Bypass Disable */
#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
-#define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */
+#define PRED_CMD_IBPB BIT(0) /* Indirect Branch Prediction Barrier */
#define MSR_PPIN_CTL 0x0000004e
#define MSR_PPIN 0x0000004f
@@ -69,20 +71,20 @@
#define MSR_MTRRcap 0x000000fe
#define MSR_IA32_ARCH_CAPABILITIES 0x0000010a
-#define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */
-#define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */
-#define ARCH_CAP_SKIP_VMENTRY_L1DFLUSH (1 << 3) /* Skip L1D flush on vmentry */
-#define ARCH_CAP_SSB_NO (1 << 4) /*
- * Not susceptible to Speculative Store Bypass
- * attack, so no Speculative Store Bypass
- * control required.
- */
+#define ARCH_CAP_RDCL_NO BIT(0) /* Not susceptible to Meltdown */
+#define ARCH_CAP_IBRS_ALL BIT(1) /* Enhanced IBRS support */
+#define ARCH_CAP_SKIP_VMENTRY_L1DFLUSH BIT(3) /* Skip L1D flush on vmentry */
+#define ARCH_CAP_SSB_NO BIT(4) /*
+ * Not susceptible to Speculative Store Bypass
+ * attack, so no Speculative Store Bypass
+ * control required.
+ */
#define MSR_IA32_FLUSH_CMD 0x0000010b
-#define L1D_FLUSH (1 << 0) /*
- * Writeback and invalidate the
- * L1 data cache.
- */
+#define L1D_FLUSH BIT(0) /*
+ * Writeback and invalidate the
+ * L1 data cache.
+ */
#define MSR_IA32_BBL_CR_CTL 0x00000119
#define MSR_IA32_BBL_CR_CTL3 0x0000011e
--- a/tools/power/x86/turbostat/Makefile
+++ b/tools/power/x86/turbostat/Makefile
@@ -9,7 +9,7 @@ ifeq ("$(origin O)", "command line")
endif
turbostat : turbostat.c
-override CFLAGS += -Wall
+override CFLAGS += -Wall -I../../../include
override CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'
override CFLAGS += -DINTEL_FAMILY_HEADER='"../../../../arch/x86/include/asm/intel-family.h"'
--- a/tools/power/x86/x86_energy_perf_policy/Makefile
+++ b/tools/power/x86/x86_energy_perf_policy/Makefile
@@ -9,7 +9,7 @@ ifeq ("$(origin O)", "command line")
endif
x86_energy_perf_policy : x86_energy_perf_policy.c
-override CFLAGS += -Wall
+override CFLAGS += -Wall -I../../../include
override CFLAGS += -DMSRHEADER='"../../../../arch/x86/include/asm/msr-index.h"'
%: %.c
* [MODERATED] Re: [patch V6 01/14] MDS basics 1
2019-03-01 21:47 ` [patch V6 01/14] MDS basics 1 Thomas Gleixner
@ 2019-03-02 0:06 ` Frederic Weisbecker
0 siblings, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2019-03-02 0:06 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:39PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 01/14] x86/msr-index: Cleanup bit defines
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Greg pointed out that speculation related bit defines are using (1 << N)
> format instead of BIT(N). Aside from that, (1 << N) is wrong as it should use
> at least 1UL.
>
> Clean it up.
>
> [ Josh Poimboeuf: Fix tools build ]
>
> Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Reviewed-by: Borislav Petkov <bp@suse.de>
> ---
> V5 -> V6: Fix tools build (Josh)
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* [patch V6 02/14] MDS basics 2
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
2019-03-01 21:47 ` [patch V6 01/14] MDS basics 1 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-02 0:34 ` [MODERATED] " Frederic Weisbecker
` (2 more replies)
2019-03-01 21:47 ` [patch V6 03/14] MDS basics 3 Thomas Gleixner
` (13 subsequent siblings)
15 siblings, 3 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 02/14] x86/speculation: Consolidate CPU whitelists
From: Thomas Gleixner <tglx@linutronix.de>
The CPU vulnerability whitelists have some overlap and there are more
whitelists coming along.
Use the driver_data field in the x86_cpu_id struct to denote the
whitelisted vulnerabilities and combine all whitelists into one.
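For readers without the tree at hand, the pattern boils down to one table
whose driver_data field carries flag bits, plus one lookup helper. A rough,
hypothetical user-space rendering (vendor numbers, flag names and entries
here are made up for illustration, not the kernel's):

#include <stdbool.h>
#include <stdio.h>

#define NO_MELTDOWN	(1UL << 0)
#define NO_L1TF		(1UL << 1)

/* Stand-in for struct x86_cpu_id with its driver_data bitmask */
struct vuln_id {
	int vendor, family;
	unsigned long driver_data;
};

static const struct vuln_id whitelist[] = {
	{ 2, 0x0f, NO_MELTDOWN | NO_L1TF },	/* hypothetical entry */
	{ 0, 0, 0 }				/* terminator */
};

/* Analogue of the cpu_matches() helper introduced below */
static bool cpu_matches(int vendor, int family, unsigned long which)
{
	const struct vuln_id *m;

	for (m = whitelist; m->driver_data; m++) {
		if (m->vendor == vendor && m->family == family)
			return m->driver_data & which;
	}
	return false;
}

int main(void)
{
	printf("%d\n", cpu_matches(2, 0x0f, NO_L1TF));	/* 1: whitelisted */
	printf("%d\n", cpu_matches(2, 0x17, NO_L1TF));	/* 0: not listed */
	return 0;
}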
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V5 --> V6: Use a helper macro to make it more readable
Fix the AMD family 0xf-0x12 vs. ANY ordering
---
arch/x86/kernel/cpu/common.c | 110 +++++++++++++++++++++++--------------------
1 file changed, 60 insertions(+), 50 deletions(-)
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -948,61 +948,72 @@ static void identify_cpu_without_cpuid(s
#endif
}
-static const __initconst struct x86_cpu_id cpu_no_speculation[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL_TABLET, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_BONNELL_MID, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SALTWELL_MID, X86_FEATURE_ANY },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_BONNELL, X86_FEATURE_ANY },
- { X86_VENDOR_CENTAUR, 5 },
- { X86_VENDOR_INTEL, 5 },
- { X86_VENDOR_NSC, 5 },
- { X86_VENDOR_ANY, 4 },
+#define NO_SPECULATION BIT(0)
+#define NO_MELTDOWN BIT(1)
+#define NO_SSB BIT(2)
+#define NO_L1TF BIT(3)
+
+#define VULNWL(_vendor, _family, _model, _whitelist) \
+ { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
+
+#define VULNWL_INTEL(model, whitelist) \
+ VULNWL(INTEL, 6, INTEL_FAM6_##model, whitelist)
+
+#define VULNWL_AMD(family, whitelist) \
+ VULNWL(AMD, family, X86_MODEL_ANY, whitelist)
+
+#define VULNWL_HYGON(family, whitelist) \
+ VULNWL(HYGON, family, X86_MODEL_ANY, whitelist)
+
+static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
+ VULNWL(ANY, 4, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(CENTAUR, 5, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(INTEL, 5, X86_MODEL_ANY, NO_SPECULATION),
+ VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION),
+
+ VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION),
+ VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION),
+
+ VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF),
+
+ VULNWL_INTEL(CORE_YONAH, NO_SSB),
+
+ VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_X, NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_L1TF),
+
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF),
+ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF),
+ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF),
+ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF),
+
+ /* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
+ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF),
+ VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF),
{}
};
-static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
- { X86_VENDOR_AMD },
- { X86_VENDOR_HYGON },
- {}
-};
-
-/* Only list CPUs which speculate but are non susceptible to SSB */
-static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_X },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_MID },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
- { X86_VENDOR_AMD, 0x12, },
- { X86_VENDOR_AMD, 0x11, },
- { X86_VENDOR_AMD, 0x10, },
- { X86_VENDOR_AMD, 0xf, },
- {}
-};
+static bool __init cpu_matches(unsigned long which)
+{
+ const struct x86_cpu_id *m = x86_match_cpu(cpu_vuln_whitelist);
-static const __initconst struct x86_cpu_id cpu_no_l1tf[] = {
- /* in addition to cpu_no_speculation */
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_X },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT_MID },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT_MID },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT_X },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_GOLDMONT_PLUS },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
- {}
-};
+ return m && !!(m->driver_data & which);
+}
static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
{
u64 ia32_cap = 0;
- if (x86_match_cpu(cpu_no_speculation))
+ if (cpu_matches(NO_SPECULATION))
return;
setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
@@ -1011,15 +1022,14 @@ static void __init cpu_set_bug_bits(stru
if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
- if (!x86_match_cpu(cpu_no_spec_store_bypass) &&
- !(ia32_cap & ARCH_CAP_SSB_NO) &&
+ if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) &&
!cpu_has(c, X86_FEATURE_AMD_SSB_NO))
setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
if (ia32_cap & ARCH_CAP_IBRS_ALL)
setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
- if (x86_match_cpu(cpu_no_meltdown))
+ if (cpu_matches(NO_MELTDOWN))
return;
/* Rogue Data Cache Load? No! */
@@ -1028,7 +1038,7 @@ static void __init cpu_set_bug_bits(stru
setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
- if (x86_match_cpu(cpu_no_l1tf))
+ if (cpu_matches(NO_L1TF))
return;
setup_force_cpu_bug(X86_BUG_L1TF);
* [MODERATED] Re: [patch V6 02/14] MDS basics 2
2019-03-01 21:47 ` [patch V6 02/14] MDS basics 2 Thomas Gleixner
@ 2019-03-02 0:34 ` Frederic Weisbecker
2019-03-02 8:34 ` Greg KH
2019-03-05 17:54 ` Borislav Petkov
2 siblings, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2019-03-02 0:34 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:40PM +0100, speck for Thomas Gleixner wrote:
> static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
> {
> u64 ia32_cap = 0;
>
> - if (x86_match_cpu(cpu_no_speculation))
> + if (cpu_matches(NO_SPECULATION))
> return;
>
> setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
> @@ -1011,15 +1022,14 @@ static void __init cpu_set_bug_bits(stru
> if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
> rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
>
> - if (!x86_match_cpu(cpu_no_spec_store_bypass) &&
> - !(ia32_cap & ARCH_CAP_SSB_NO) &&
> + if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) &&
Much clearer and well unified.
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* [MODERATED] Re: [patch V6 02/14] MDS basics 2
2019-03-01 21:47 ` [patch V6 02/14] MDS basics 2 Thomas Gleixner
2019-03-02 0:34 ` [MODERATED] " Frederic Weisbecker
@ 2019-03-02 8:34 ` Greg KH
2019-03-05 17:54 ` Borislav Petkov
2 siblings, 0 replies; 89+ messages in thread
From: Greg KH @ 2019-03-02 8:34 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:40PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 02/14] x86/speculation: Consolidate CPU whitelists
> From: Thomas Gleixner <tglx@linutronix.de>
>
> The CPU vulnerability whitelists have some overlap and there are more
> whitelists coming along.
>
> Use the driver_data field in the x86_cpu_id struct to denote the
> whitelisted vulnerabilities and combine all whitelists into one.
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* [MODERATED] Re: [patch V6 02/14] MDS basics 2
2019-03-01 21:47 ` [patch V6 02/14] MDS basics 2 Thomas Gleixner
2019-03-02 0:34 ` [MODERATED] " Frederic Weisbecker
2019-03-02 8:34 ` Greg KH
@ 2019-03-05 17:54 ` Borislav Petkov
2 siblings, 0 replies; 89+ messages in thread
From: Borislav Petkov @ 2019-03-05 17:54 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:40PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 02/14] x86/speculation: Consolidate CPU whitelists
> From: Thomas Gleixner <tglx@linutronix.de>
>
> The CPU vulnerability whitelists have some overlap and there are more
> whitelists coming along.
>
> Use the driver_data field in the x86_cpu_id struct to denote the
> whitelisted vulnerabilities and combine all whitelists into one.
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>
> V5 --> V6: Use a helper macro to make it more readable
> Fix the AMD family 0xf-0x12 vs. ANY ordering
> ---
> arch/x86/kernel/cpu/common.c | 110 +++++++++++++++++++++++--------------------
> 1 file changed, 60 insertions(+), 50 deletions(-)
Yap, nice and clean.
Reviewed-by: Borislav Petkov <bp@suse.de>
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
* [patch V6 03/14] MDS basics 3
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
2019-03-01 21:47 ` [patch V6 01/14] MDS basics 1 Thomas Gleixner
2019-03-01 21:47 ` [patch V6 02/14] MDS basics 2 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-02 1:12 ` [MODERATED] " Frederic Weisbecker
2019-03-01 21:47 ` [patch V6 04/14] MDS basics 4 Thomas Gleixner
` (12 subsequent siblings)
15 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 03/14] x86/speculation/mds: Add basic bug infrastructure for MDS
From: Andi Kleen <ak@linux.intel.com>
Microarchitectural Data Sampling (MDS) is a class of side channel attacks
on internal buffers in Intel CPUs. The variants are:
- Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
- Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
- Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
dependent load (store-to-load forwarding) as an optimization. The forward
can also happen to a faulting or assisting load operation for a different
memory address, which can be exploited under certain conditions. Store
buffers are partitioned between Hyper-Threads so cross thread forwarding is
not possible. But if a thread enters or exits a sleep state the store
buffer is repartitioned which can expose data from one thread to the other.
MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
L1 miss situations and to hold data which is returned or sent in response
to a memory or I/O operation. Fill buffers can forward data to a load
operation and also write data to the cache. When the fill buffer is
deallocated it can retain the stale data of the preceding operations which
can then be forwarded to a faulting or assisting load operation, which can
be exploited under certain conditions. Fill buffers are shared between
Hyper-Threads so cross thread leakage is possible.
MLPDS leaks Load Port Data. Load ports are used to perform load operations
from memory or I/O. The received data is then forwarded to the register
file or a subsequent operation. In some implementations the Load Port can
contain stale data from a previous operation which can be forwarded to
faulting or assisting loads under certain conditions, which again can be
exploited eventually. Load ports are shared between Hyper-Threads so cross
thread leakage is possible.
All variants have the same mitigation for single CPU thread case (SMT off),
so the kernel can treat them as one MDS issue.
Add the basic infrastructure to detect if the current CPU is affected by
MDS.
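The new ARCH_CAP_MDS_NO bit lives in MSR_IA32_ARCH_CAPABILITIES (0x10a).
A hedged user-space sketch of peeking at it through the msr driver
(requires root and a loaded msr module; the read simply fails on CPUs
without the ARCH_CAPABILITIES MSR):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_IA32_ARCH_CAPABILITIES	0x10a
#define ARCH_CAP_MDS_NO			(1ULL << 5)

int main(void)
{
	uint64_t cap = 0;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	/* The msr driver maps the MSR number to the file offset */
	if (fd < 0 || pread(fd, &cap, sizeof(cap), MSR_IA32_ARCH_CAPABILITIES) != sizeof(cap))
		return 1;

	printf("MDS_NO is %s\n", (cap & ARCH_CAP_MDS_NO) ? "set" : "clear");
	close(fd);
	return 0;
}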
[ tglx: Rewrote changelog ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
V5: Adopt to the consolidated quirk table
V3: Addressed Borislav's review comments
---
arch/x86/include/asm/cpufeatures.h | 2 ++
arch/x86/include/asm/msr-index.h | 5 +++++
arch/x86/kernel/cpu/common.c | 27 +++++++++++++++++----------
3 files changed, 24 insertions(+), 10 deletions(-)
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -344,6 +344,7 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_MD_CLEAR (18*32+10) /* VERW clears CPU buffers */
#define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
#define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
#define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
@@ -381,5 +382,6 @@
#define X86_BUG_SPECTRE_V2 X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
#define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */
#define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
+#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
#endif /* _ASM_X86_CPUFEATURES_H */
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -79,6 +79,11 @@
* attack, so no Speculative Store Bypass
* control required.
*/
+#define ARCH_CAP_MDS_NO BIT(5) /*
+ * Not susceptible to
+ * Microarchitectural Data
+ * Sampling (MDS) vulnerabilities.
+ */
#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -952,6 +952,7 @@ static void identify_cpu_without_cpuid(s
#define NO_MELTDOWN BIT(1)
#define NO_SSB BIT(2)
#define NO_L1TF BIT(3)
+#define NO_MDS BIT(4)
#define VULNWL(_vendor, _family, _model, _whitelist) \
{ X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
@@ -971,6 +972,7 @@ static const __initconst struct x86_cpu_
VULNWL(INTEL, 5, X86_MODEL_ANY, NO_SPECULATION),
VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION),
+ /* Intel Family 6 */
VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION),
VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION),
VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION),
@@ -987,18 +989,20 @@ static const __initconst struct x86_cpu_
VULNWL_INTEL(CORE_YONAH, NO_SSB),
VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF),
- VULNWL_INTEL(ATOM_GOLDMONT, NO_L1TF),
- VULNWL_INTEL(ATOM_GOLDMONT_X, NO_L1TF),
- VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_L1TF),
-
- VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF),
- VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF),
- VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF),
- VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF),
+
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_X, NO_MDS | NO_L1TF),
+ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF),
+
+ /* AMD Family 0xf - 0x12 */
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
+ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
- VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF),
- VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF),
+ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS),
+ VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS),
{}
};
@@ -1029,6 +1033,9 @@ static void __init cpu_set_bug_bits(stru
if (ia32_cap & ARCH_CAP_IBRS_ALL)
setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
+ if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO))
+ setup_force_cpu_bug(X86_BUG_MDS);
+
if (cpu_matches(NO_MELTDOWN))
return;
* [MODERATED] Re: [patch V6 03/14] MDS basics 3
2019-03-01 21:47 ` [patch V6 03/14] MDS basics 3 Thomas Gleixner
@ 2019-03-02 1:12 ` Frederic Weisbecker
0 siblings, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2019-03-02 1:12 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:41PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 03/14] x86/speculation/mds: Add basic bug infrastructure for MDS
> + /* AMD Family 0xf - 0x12 */
> + VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
> + VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
> + VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
> + VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS),
Lucky guys :-)
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* [patch V6 04/14] MDS basics 4
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (2 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 03/14] MDS basics 3 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-02 1:28 ` [MODERATED] " Frederic Weisbecker
` (2 more replies)
2019-03-01 21:47 ` [patch V6 05/14] MDS basics 5 Thomas Gleixner
` (11 subsequent siblings)
15 siblings, 3 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 04/14] x86/speculation/mds: Add BUG_MSBDS_ONLY
From: Thomas Gleixner <tglx@linutronix.de>
This bug bit is set on CPUs which are only affected by Microarchitectural
Store Buffer Data Sampling (MSBDS) and not by any other MDS variant.
This is important because the Store Buffers are partitioned between
Hyper-Threads so cross thread forwarding is not possible. But if a thread
enters or exits a sleep state the store buffer is repartitioned which can
expose data from one thread to the other. This transition can be mitigated.
That means that for CPUs which are only affected by MSBDS, SMT can be
enabled if the CPU is not affected by other SMT sensitive vulnerabilities,
e.g. L1TF. The XEON PHI variants fall into that category.
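The distinction becomes user-visible through the MDS sysfs file once the
sysfs reporting later in this series is applied; a trivial sketch of
reading it (the exact mitigation string shown is illustrative):

#include <stdio.h>

int main(void)
{
	char line[128];
	FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/mds", "r");

	/* e.g. "Mitigation: Clear CPU buffers; SMT mitigated" on an
	 * MSBDS-only part with SMT enabled */
	if (f && fgets(line, sizeof(line), f))
		fputs(line, stdout);
	if (f)
		fclose(f);
	return 0;
}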
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/common.c | 10 +++++++---
2 files changed, 8 insertions(+), 3 deletions(-)
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -383,5 +383,6 @@
#define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */
#define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
+#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSBDS variant of BUG_MDS */
#endif /* _ASM_X86_CPUFEATURES_H */
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -953,6 +953,7 @@ static void identify_cpu_without_cpuid(s
#define NO_SSB BIT(2)
#define NO_L1TF BIT(3)
#define NO_MDS BIT(4)
+#define MSBDS_ONLY BIT(5)
#define VULNWL(_vendor, _family, _model, _whitelist) \
{ X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
@@ -983,8 +984,8 @@ static const __initconst struct x86_cpu_
VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF),
VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF),
VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF),
- VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF),
- VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF),
+ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY),
+ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY),
VULNWL_INTEL(CORE_YONAH, NO_SSB),
@@ -1033,8 +1034,11 @@ static void __init cpu_set_bug_bits(stru
if (ia32_cap & ARCH_CAP_IBRS_ALL)
setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
- if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO))
+ if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO)) {
setup_force_cpu_bug(X86_BUG_MDS);
+ if (cpu_matches(MSBDS_ONLY))
+ setup_force_cpu_bug(X86_BUG_MSBDS_ONLY);
+ }
if (cpu_matches(NO_MELTDOWN))
return;
* [MODERATED] Re: [patch V6 04/14] MDS basics 4
2019-03-01 21:47 ` [patch V6 04/14] MDS basics 4 Thomas Gleixner
@ 2019-03-02 1:28 ` Frederic Weisbecker
2019-03-05 14:52 ` Thomas Gleixner
2019-03-06 20:00 ` [MODERATED] " Andrew Cooper
2019-03-07 23:56 ` [MODERATED] " Andi Kleen
2 siblings, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2019-03-02 1:28 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:42PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 04/14] x86/speculation/mds: Add BUG_MSBDS_ONLY
> From: Thomas Gleixner <tglx@linutronix.de>
>
> This bug bit is set on CPUs which are only affected by Microarchitectural
> Store Buffer Data Sampling (MSBDS) and not by any other MDS variant.
>
> This is important because the Store Buffers are partitioned between
> Hyper-Threads so cross thread forwarding is not possible. But if a thread
> enters or exits a sleep state the store buffer is repartitioned which can
> expose data from one thread to the other. This transition can be mitigated.
>
> That means that for CPUs which are only affected by MSBDS SMT can be
> enabled, if the CPU is not affected by other SMT sensitive vulnerabilities,
> e.g. L1TF. The XEON PHI variants fall into that category.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/kernel/cpu/common.c | 10 +++++++---
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -383,5 +383,6 @@
> #define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */
> #define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
> #define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
> +#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSBDS variant of BUG_MDS */
>
> #endif /* _ASM_X86_CPUFEATURES_H */
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -953,6 +953,7 @@ static void identify_cpu_without_cpuid(s
> #define NO_SSB BIT(2)
> #define NO_L1TF BIT(3)
> #define NO_MDS BIT(4)
> +#define MSBDS_ONLY BIT(5)
>
> #define VULNWL(_vendor, _family, _model, _whitelist) \
> { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
> @@ -983,8 +984,8 @@ static const __initconst struct x86_cpu_
> VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF),
> VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF),
> VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF),
> - VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF),
> - VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF),
> + VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY),
> + VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY),
>
> VULNWL_INTEL(CORE_YONAH, NO_SSB),
>
> @@ -1033,8 +1034,11 @@ static void __init cpu_set_bug_bits(stru
> if (ia32_cap & ARCH_CAP_IBRS_ALL)
> setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
>
> - if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO))
> + if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO)) {
> setup_force_cpu_bug(X86_BUG_MDS);
> + if (cpu_matches(MSBDS_ONLY))
> + setup_force_cpu_bug(X86_BUG_MSBDS_ONLY);
> + }
>
> if (cpu_matches(NO_MELTDOWN))
> return;
>
It looks weird to have it as a separate bug flag and not as a subset of full
MDS such as:
#define NO_IDLE_SHARED_MDS BIT(4)
#define NO_SHARED_MDS BIT(5)
#define NO_MDS (NO_IDLE_SHARED_MDS | NO_SHARED_MDS)
Now that would probably make sense only if the mitigation of full MDS required
to also imply a VERW before entering idle (that's the mitigation of MSBDS_ONLY, right?).
Turning off SMT removes the need to do that so the layout seem to make sense as is.
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* Re: [patch V6 04/14] MDS basics 4
2019-03-02 1:28 ` [MODERATED] " Frederic Weisbecker
@ 2019-03-05 14:52 ` Thomas Gleixner
0 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-05 14:52 UTC (permalink / raw)
To: speck
On Sat, 2 Mar 2019, speck for Frederic Weisbecker wrote:
> On Fri, Mar 01, 2019 at 10:47:42PM +0100, speck for Thomas Gleixner wrote:
> > if (cpu_matches(NO_MELTDOWN))
> > return;
> >
>
> It looks weird to have it as a separate bug flag and not as a subset of full
> MDS such as:
>
> #define NO_IDLE_SHARED_MDS BIT(4)
> #define NO_SHARED_MDS BIT(5)
> #define NO_MDS (NO_IDLE_SHARED_MDS | NO_SHARED_MDS)
>
> Now that would probably make sense only if the mitigation of full MDS required
> to also imply a VERW before entering idle (that's the mitigation of MSBDS_ONLY, right?).
> Turning off SMT removes the need to do that so the layout seem to make sense as is.
Yeah, I had several variants of the theme, but all of them sucked in one
way or the other.
Thanks,
tglx
* [MODERATED] Re: [patch V6 04/14] MDS basics 4
2019-03-01 21:47 ` [patch V6 04/14] MDS basics 4 Thomas Gleixner
2019-03-02 1:28 ` [MODERATED] " Frederic Weisbecker
@ 2019-03-06 20:00 ` Andrew Cooper
2019-03-06 20:32 ` Thomas Gleixner
2019-03-07 23:56 ` [MODERATED] " Andi Kleen
2 siblings, 1 reply; 89+ messages in thread
From: Andrew Cooper @ 2019-03-06 20:00 UTC (permalink / raw)
To: speck
On 01/03/2019 21:47, speck for Thomas Gleixner wrote:
> Subject: [patch V6 04/14] x86/speculation/mds: Add BUG_MSBDS_ONLY
> From: Thomas Gleixner <tglx@linutronix.de>
>
> This bug bit is set on CPUs which are only affected by Microarchitectural
> Store Buffer Data Sampling (MSBDS) and not by any other MDS variant.
>
> This is important because the Store Buffers are partitioned between
> Hyper-Threads so cross thread forwarding is not possible. But if a thread
> enters or exits a sleep state the store buffer is repartitioned which can
> expose data from one thread to the other. This transition can be mitigated.
>
> That means that for CPUs which are only affected by MSBDS SMT can be
> enabled, if the CPU is not affected by other SMT sensitive vulnerabilities,
> e.g. L1TF. The XEON PHI variants fall into that category.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/kernel/cpu/common.c | 10 +++++++---
> 2 files changed, 8 insertions(+), 3 deletions(-)
>
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -383,5 +383,6 @@
> #define X86_BUG_SPEC_STORE_BYPASS X86_BUG(17) /* CPU is affected by speculative store bypass attack */
> #define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
> #define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
> +#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSBDS variant of BUG_MDS */
>
> #endif /* _ASM_X86_CPUFEATURES_H */
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -953,6 +953,7 @@ static void identify_cpu_without_cpuid(s
> #define NO_SSB BIT(2)
> #define NO_L1TF BIT(3)
> #define NO_MDS BIT(4)
> +#define MSBDS_ONLY BIT(5)
>
> #define VULNWL(_vendor, _family, _model, _whitelist) \
> { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
> @@ -983,8 +984,8 @@ static const __initconst struct x86_cpu_
> VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF),
> VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF),
> VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF),
> - VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF),
> - VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF),
> + VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY),
> + VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY),
Looking at the table in the magic PDF, Silvermont/Airmont are MSBDS_ONLY
as well.
The model numbers listed in the Silvermont/Airmont category are 37, 4a,
4c, 4d, 5a, 5d, 6e, 65, 75.
The first 5 of those models match up with Linux's Silvermont/Airmont
names, while the last 4 are unknown. I can't locate them anywhere and
have requested clarification.
~Andrew
* Re: [patch V6 04/14] MDS basics 4
2019-03-06 20:00 ` [MODERATED] " Andrew Cooper
@ 2019-03-06 20:32 ` Thomas Gleixner
0 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-06 20:32 UTC (permalink / raw)
To: speck
On Wed, 6 Mar 2019, speck for Andrew Cooper wrote:
> On 01/03/2019 21:47, speck for Thomas Gleixner wrote:
> > #define VULNWL(_vendor, _family, _model, _whitelist) \
> > { X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
> > @@ -983,8 +984,8 @@ static const __initconst struct x86_cpu_
> > VULNWL_INTEL(ATOM_SILVERMONT_X, NO_SSB | NO_L1TF),
> > VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF),
> > VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF),
> > - VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF),
> > - VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF),
> > + VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY),
> > + VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY),
>
> Looking at the table in the magic PDF, Silvermont/Airmont are MDBDS_ONLY
> as well.
>
> The model numbers listed in the Silvermont/Airmont category are 37, 4a,
> 4c, 4d, 5a, 5d, 6e, 65, 75.
>
> The first 5 of those models match up with Linux's Silvermont/Airmont
> names, while the last 4 are unknown. I can't locate them anywhere and
> have requested clarification.
Yeah, forgot about the Silvermonts. Though the SMT problem does not exist
there as these beasts do not have HT AFAICT.
Thanks,
tglx
* [MODERATED] Re: [patch V6 04/14] MDS basics 4
2019-03-01 21:47 ` [patch V6 04/14] MDS basics 4 Thomas Gleixner
2019-03-02 1:28 ` [MODERATED] " Frederic Weisbecker
2019-03-06 20:00 ` [MODERATED] " Andrew Cooper
@ 2019-03-07 23:56 ` Andi Kleen
2019-03-08 0:36 ` Linus Torvalds
2 siblings, 1 reply; 89+ messages in thread
From: Andi Kleen @ 2019-03-07 23:56 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:42PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 04/14] x86/speculation/mds: Add BUG_MSBDS_ONLY
> From: Thomas Gleixner <tglx@linutronix.de>
>
> This bug bit is set on CPUs which are only affected by Microarchitectural
> Store Buffer Data Sampling (MSBDS) and not by any other MDS variant.
This patch is pointless. It won't have VERW support and we don't have mitigation
for Xeon Phi because Linus rejected software sequences.
Xeon Phi will simply not be mitigated. However Xeon PHIs are not widely
used, and those that are deployed can be handled in different ways.
-Andi
* [MODERATED] Re: [patch V6 04/14] MDS basics 4
2019-03-07 23:56 ` [MODERATED] " Andi Kleen
@ 2019-03-08 0:36 ` Linus Torvalds
0 siblings, 0 replies; 89+ messages in thread
From: Linus Torvalds @ 2019-03-08 0:36 UTC (permalink / raw)
To: speck
On Thu, Mar 7, 2019 at 3:56 PM speck for Andi Kleen <speck@linutronix.de> wrote:
>
> Xeon Phi will simply not be mitigated. However Xeon PHIs are not widely
> used,
Heh. Understatement of the year.
> and those that are deployed can be handled in different ways.
I don't think anybody uses them in situations that would care.
The main target was HPC, I think.
So I think the "handled in different ways" ends up being "ignored", I suspect.
Linus
* [patch V6 05/14] MDS basics 5
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (3 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 04/14] MDS basics 4 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-02 1:37 ` [MODERATED] " Frederic Weisbecker
2019-03-07 23:59 ` Andi Kleen
2019-03-01 21:47 ` [patch V6 06/14] MDS basics 6 Thomas Gleixner
` (10 subsequent siblings)
15 siblings, 2 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
From: Andi Kleen <ak@linux.intel.com>
Subject: [patch V6 05/14] x86/kvm: Expose X86_FEATURE_MD_CLEAR to guests
X86_FEATURE_MD_CLEAR is a new CPUID bit which is set when microcode
provides a mechanism to flush various exploitable CPU buffers by
invoking the VERW instruction.
Hand it through to guests so they can adjust their mitigations.
This also requires corresponding qemu changes, which are available
separately.
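MD_CLEAR is enumerated in CPUID.(EAX=7,ECX=0):EDX[10], matching the
(18*32+10) cpufeatures word/bit above. A guest can probe it from user
space along these lines (sketch, using the compiler's <cpuid.h>):

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;

	/* CPUID leaf 7, subleaf 0: EDX bit 10 is MD_CLEAR */
	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 1;

	printf("MD_CLEAR %s\n", (edx & (1u << 10)) ? "present" : "absent");
	return 0;
}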
[ tglx: Massaged changelog ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/x86/kvm/cpuid.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -410,7 +410,8 @@ static inline int __do_cpuid_ent(struct
/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
- F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP);
+ F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
+ F(MD_CLEAR);
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
* [MODERATED] Re: [patch V6 05/14] MDS basics 5
2019-03-01 21:47 ` [patch V6 05/14] MDS basics 5 Thomas Gleixner
@ 2019-03-02 1:37 ` Frederic Weisbecker
2019-03-07 23:59 ` Andi Kleen
1 sibling, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2019-03-02 1:37 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:43PM +0100, speck for Thomas Gleixner wrote:
> From: Andi Kleen <ak@linux.intel.com>
> Subject: [patch V6 05/14] x86/kvm: Expose X86_FEATURE_MD_CLEAR to guests
>
> X86_FEATURE_MD_CLEAR is a new CPUID bit which is set when microcode
> provides the mechanism to invoke a flush of various exploitable CPU buffers
> by invoking the VERW instruction.
>
> Hand it through to guests so they can adjust their mitigations.
>
> This also requires corresponding qemu changes, which are available
> separately.
>
> [ tglx: Massaged changelog ]
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Borislav Petkov <bp@suse.de>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V6 05/14] MDS basics 5
2019-03-01 21:47 ` [patch V6 05/14] MDS basics 5 Thomas Gleixner
2019-03-02 1:37 ` [MODERATED] " Frederic Weisbecker
@ 2019-03-07 23:59 ` Andi Kleen
2019-03-08 6:37 ` Thomas Gleixner
1 sibling, 1 reply; 89+ messages in thread
From: Andi Kleen @ 2019-03-07 23:59 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:43PM +0100, speck for Thomas Gleixner wrote:
> From: Andi Kleen <ak@linux.intel.com>
> Subject: [patch V6 05/14] x86/kvm: Expose X86_FEATURE_MD_CLEAR to guests
>
> X86_FEATURE_MD_CLEAR is a new CPUID bit which is set when microcode
> provides the mechanism to invoke a flush of various exploitable CPU buffers
> by invoking the VERW instruction.
>
> Hand it through to guests so they can adjust their mitigations.
>
> This also requires corresponding qemu changes, which are available
> separately.
This patch is not complete. You also need some variant of
x86/speculation/mds: Handle VMENTRY clear for CPUs without l1tf
in my patch kit.
-Andi
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [patch V6 05/14] MDS basics 5
2019-03-07 23:59 ` Andi Kleen
@ 2019-03-08 6:37 ` Thomas Gleixner
0 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-08 6:37 UTC (permalink / raw)
To: speck
On Thu, 7 Mar 2019, speck for Andi Kleen wrote:
> On Fri, Mar 01, 2019 at 10:47:43PM +0100, speck for Thomas Gleixner wrote:
> > From: Andi Kleen <ak@linux.intel.com>
> > Subject: [patch V6 05/14] x86/kvm: Expose X86_FEATURE_MD_CLEAR to guests
> >
> > X86_FEATURE_MD_CLEAR is a new CPUID bit which is set when microcode
> > provides the mechanism to invoke a flush of various exploitable CPU buffers
> > by invoking the VERW instruction.
> >
> > Hand it through to guests so they can adjust their mitigations.
> >
> > This also requires corresponding qemu changes, which are available
> > separately.
>
> This patch is not complete. You also need some variant of
>
> x86/speculation/mds: Handle VMENTRY clear for CPUs without l1tf
650b68a0622f ("x86/kvm/vmx: Add MDS protection when L1D Flush is not active")
Thanks,
tglx
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V6 06/14] MDS basics 6
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (4 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 05/14] MDS basics 5 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-04 6:28 ` [MODERATED] Encrypted Message Jon Masters
2019-03-01 21:47 ` [patch V6 07/14] MDS basics 7 Thomas Gleixner
` (9 subsequent siblings)
15 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 06/14] x86/speculation/mds: Add mds_clear_cpu_buffers()
From: Thomas Gleixner <tglx@linutronix.de>
The Microarchitectural Data Sampling (MDS) vulnerabilities are mitigated by
clearing the affected CPU buffers. The mechanism for clearing the buffers
uses the unused and obsolete VERW instruction in combination with a
microcode update which triggers a CPU buffer clear when VERW is executed.
Provide an inline function with the assembly magic. The argument of the VERW
instruction must be a memory operand as documented:
"MD_CLEAR enumerates that the memory-operand variant of VERW (for
example, VERW m16) has been extended to also overwrite buffers affected
by MDS. This buffer overwriting functionality is not guaranteed for the
register operand variant of VERW."
Documentation also recommends using a writable data segment selector:
"The buffer overwriting occurs regardless of the result of the VERW
permission check, as well as when the selector is null or causes a
descriptor load segment violation. However, for lowest latency we
recommend using a selector that indicates a valid writable data
segment."
Add x86 specific documentation about MDS and the internal workings of the
mitigation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
---
V4 --> V5: Fix typos and remove the conditional mode reference.
V3 --> V4: Document the segment selector choice as well.
V2 --> V3: Add VERW documentation and fix typos/grammar..., dropped 'i(0)'
Add more details to the documentation file
V1 --> V2: Add "cc" clobber and documentation
---
Documentation/index.rst | 1
Documentation/x86/conf.py | 10 +++
Documentation/x86/index.rst | 8 ++
Documentation/x86/mds.rst | 99 +++++++++++++++++++++++++++++++++++
arch/x86/include/asm/nospec-branch.h | 25 ++++++++
5 files changed, 143 insertions(+)
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -101,6 +101,7 @@ implementation.
:maxdepth: 2
sh/index
+ x86/index
Filesystem Documentation
------------------------
--- /dev/null
+++ b/Documentation/x86/conf.py
@@ -0,0 +1,10 @@
+# -*- coding: utf-8; mode: python -*-
+
+project = "X86 architecture specific documentation"
+
+tags.add("subproject")
+
+latex_documents = [
+ ('index', 'x86.tex', project,
+ 'The kernel development community', 'manual'),
+]
--- /dev/null
+++ b/Documentation/x86/index.rst
@@ -0,0 +1,8 @@
+==========================
+x86 architecture specifics
+==========================
+
+.. toctree::
+ :maxdepth: 1
+
+ mds
--- /dev/null
+++ b/Documentation/x86/mds.rst
@@ -0,0 +1,99 @@
+Microarchitectural Data Sampling (MDS) mitigation
+=================================================
+
+.. _mds:
+
+Overview
+--------
+
+Microarchitectural Data Sampling (MDS) is a family of side channel attacks
+on internal buffers in Intel CPUs. The variants are:
+
+ - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
+ - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
+ - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
+
+MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
+dependent load (store-to-load forwarding) as an optimization. The forward
+can also happen to a faulting or assisting load operation for a different
+memory address, which can be exploited under certain conditions. Store
+buffers are partitioned between Hyper-Threads so cross thread forwarding is
+not possible. But if a thread enters or exits a sleep state the store
+buffer is repartitioned which can expose data from one thread to the other.
+
+MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
+L1 miss situations and to hold data which is returned or sent in response
+to a memory or I/O operation. Fill buffers can forward data to a load
+operation and also write data to the cache. When the fill buffer is
+deallocated it can retain the stale data of the preceding operations which
+can then be forwarded to a faulting or assisting load operation, which can
+be exploited under certain conditions. Fill buffers are shared between
+Hyper-Threads so cross thread leakage is possible.
+
+MLPDS leaks Load Port Data. Load ports are used to perform load operations
+from memory or I/O. The received data is then forwarded to the register
+file or a subsequent operation. In some implementations the Load Port can
+contain stale data from a previous operation which can be forwarded to
+faulting or assisting loads under certain conditions, which again can be
+exploited eventually. Load ports are shared between Hyper-Threads so cross
+thread leakage is possible.
+
+
+Exposure assumptions
+--------------------
+
+It is assumed that attack code resides in user space or in a guest with one
+exception. The rationale behind this assumption is that the code construct
+needed for exploiting MDS requires:
+
+ - to control the load to trigger a fault or assist
+
+ - to have a disclosure gadget which exposes the speculatively accessed
+ data for consumption through a side channel.
+
+ - to control the pointer through which the disclosure gadget exposes the
+ data
+
+The existence of such a construct in the kernel cannot be excluded with
+100% certainty, but the complexity involved makes it extremely unlikely.
+
+There is one exception, which is untrusted BPF. The functionality of
+untrusted BPF is limited, but it needs to be thoroughly investigated
+whether it can be used to create such a construct.
+
+
+Mitigation strategy
+-------------------
+
+All variants have the same mitigation strategy at least for the single CPU
+thread case (SMT off): Force the CPU to clear the affected buffers.
+
+This is achieved by using the otherwise unused and obsolete VERW
+instruction in combination with a microcode update. The microcode clears
+the affected CPU buffers when the VERW instruction is executed.
+
+For virtualization there are two ways to achieve CPU buffer
+clearing: either via the modified VERW instruction or via the L1D Flush
+command. The latter is issued when L1TF mitigation is enabled so the extra
+VERW can be avoided. If the CPU is not affected by L1TF then VERW needs to
+be issued.
+
+If the VERW instruction with the supplied segment selector argument is
+executed on a CPU without the microcode update there is no side effect
+other than a small number of pointlessly wasted CPU cycles.
+
+This does not protect against cross Hyper-Thread attacks except for MSBDS
+which is only exploitable cross Hyper-thread when one of the Hyper-Threads
+enters a C-state.
+
+The kernel provides a function to invoke the buffer clearing:
+
+ mds_clear_cpu_buffers()
+
+The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state
+(idle) transitions.
+
+According to current knowledge additional mitigations inside the kernel
+itself are not required because the necessary gadgets to expose the leaked
+data cannot be controlled in a way which allows exploitation from malicious
+user space or VM guests.
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,6 +318,31 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_
DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+#include <asm/segment.h>
+
+/**
+ * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * This uses the otherwise unused and obsolete VERW instruction in
+ * combination with microcode which triggers a CPU buffer flush when the
+ * instruction is executed.
+ */
+static inline void mds_clear_cpu_buffers(void)
+{
+ static const u16 ds = __KERNEL_DS;
+
+ /*
+ * Has to be the memory-operand variant because only that
+ * guarantees the CPU buffer flush functionality according to
+ * documentation. The register-operand variant does not.
+ * Works with any segment selector, but a valid writable
+ * data segment is the fastest variant.
+ *
+ * "cc" clobber is required because VERW modifies ZF.
+ */
+ asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
+}
+
#endif /* __ASSEMBLY__ */
/*
^ permalink raw reply [flat|nested] 89+ messages in thread
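The same memory-operand VERW can be exercised from user space as well; the
buffer overwrite happens regardless of the permission-check outcome, per
the documentation quoted above. A standalone sketch, not part of the
patch, which substitutes the x86_64 Linux user data segment selector 0x2b
for the kernel-internal __KERNEL_DS:

#include <stdint.h>

static inline void clear_cpu_buffers(void)
{
        /*
         * The memory-operand variant is required for the buffer clear;
         * a valid writable data segment is merely the lowest-latency
         * choice. VERW modifies ZF, hence the "cc" clobber.
         */
        static const uint16_t ds = 0x2b;

        asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
}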
* [MODERATED] Encrypted Message
2019-03-01 21:47 ` [patch V6 06/14] MDS basics 6 Thomas Gleixner
@ 2019-03-04 6:28 ` Jon Masters
2019-03-05 14:55 ` Thomas Gleixner
0 siblings, 1 reply; 89+ messages in thread
From: Jon Masters @ 2019-03-04 6:28 UTC (permalink / raw)
To: speck
From: Jon Masters <jcm@redhat.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: [patch V6 06/14] MDS basics 6
On 3/1/19 4:47 PM, speck for Thomas Gleixner wrote:
> Provide an inline function with the assembly magic. The argument of the VERW
> instruction must be a memory operand as documented:
>
> "MD_CLEAR enumerates that the memory-operand variant of VERW (for
> example, VERW m16) has been extended to also overwrite buffers affected
> by MDS. This buffer overwriting functionality is not guaranteed for the
> register operand variant of VERW."
>
> Documentation also recommends using a writable data segment selector:
>
> "The buffer overwriting occurs regardless of the result of the VERW
> permission check, as well as when the selector is null or causes a
> descriptor load segment violation. However, for lowest latency we
> recommend using a selector that indicates a valid writable data
> segment."
Note that we raised this again with Intel last week amid Andrew's
results and they are going to get back to us if this guidance changes as
a result of further measurements on their end. It's a few cycles
difference in the Coffeelake case, but it could always be higher.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Encrypted Message
2019-03-04 6:28 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-03-05 14:55 ` Thomas Gleixner
0 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-05 14:55 UTC (permalink / raw)
To: speck
On Mon, 4 Mar 2019, speck for Jon Masters wrote:
> > Documentation also recommends using a writable data segment selector:
> >
> > "The buffer overwriting occurs regardless of the result of the VERW
> > permission check, as well as when the selector is null or causes a
> > descriptor load segment violation. However, for lowest latency we
> > recommend using a selector that indicates a valid writable data
> > segment."
>
> Note that we raised this again with Intel last week amid Andrew's
> results and they are going to get back to us if this guidance changes as
> a result of further measurements on their end. It's a few cycles
> difference in the Coffeelake case, but it could always be higher.
Ok. We can fix that up on top once we have final answers.
Thanks,
tglx
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V6 07/14] MDS basics 7
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (5 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 06/14] MDS basics 6 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-02 2:22 ` [MODERATED] " Frederic Weisbecker
2019-03-06 5:21 ` Borislav Petkov
2019-03-01 21:47 ` [patch V6 08/14] MDS basics 8 Thomas Gleixner
` (8 subsequent siblings)
15 siblings, 2 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 07/14] x86/speculation/mds: Clear CPU buffers on exit to user
From: Thomas Gleixner <tglx@linutronix.de>
Add a static key which controls the invocation of the CPU buffer clear
mechanism on exit to user space and add the call into
prepare_exit_to_usermode() and do_nmi() right before actually returning.
Add documentation on which kernel to user space transitions this covers
and explain why some corner cases are not mitigated.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
V4 --> V5: Use an inline helper instead of open coding it.
Rework the documentation paragraph about exceptions.
V3 --> V4: Add #DF mitigation and document that the #MC corner case
is really not interesting.
V3: Add NMI conditional on user regs and update documentation accordingly.
Use the static branch scheme suggested by Peter. Fix typos ...
---
Documentation/x86/mds.rst | 52 +++++++++++++++++++++++++++++++++++
arch/x86/entry/common.c | 3 ++
arch/x86/include/asm/nospec-branch.h | 13 ++++++++
arch/x86/kernel/cpu/bugs.c | 3 ++
arch/x86/kernel/nmi.c | 4 ++
arch/x86/kernel/traps.c | 7 ++++
6 files changed, 82 insertions(+)
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -97,3 +97,55 @@ According to current knowledge additiona
itself are not required because the necessary gadgets to expose the leaked
data cannot be controlled in a way which allows exploitation from malicious
user space or VM guests.
+
+Mitigation points
+-----------------
+
+1. Return to user space
+^^^^^^^^^^^^^^^^^^^^^^^
+
+ When transitioning from kernel to user space the CPU buffers are flushed
+ on affected CPUs when the mitigation is not disabled on the kernel
+ command line. The mitigation is enabled through the static key
+ mds_user_clear.
+
+ The mitigation is invoked in prepare_exit_to_usermode() which covers
+ most of the kernel to user space transitions. There are a few exceptions
+ which do not invoke prepare_exit_to_usermode() on return to user
+ space. These exceptions use the paranoid exit code.
+
+ - Non Maskable Interrupt (NMI):
+
+ Access to sensitive data like keys or credentials in the NMI context is
+ mostly theoretical: The CPU can do prefetching or execute a
+ misspeculated code path and thereby fetch data which might end up
+ leaking through a buffer.
+
+ But for mounting other attacks the kernel stack address of the task is
+ already valuable information. So in full mitigation mode, the NMI is
+ mitigated on the return from do_nmi() to provide almost complete
+ coverage.
+
+ - Double fault (#DF):
+
+ A double fault is usually fatal, but the ESPFIX workaround, which can
+ be triggered from user space through modify_ldt(2), is a recoverable
+ double fault. #DF uses the paranoid exit path, so explicit mitigation
+ in the double fault handler is required.
+
+ - Machine Check Exception (#MC):
+
+ Another corner case is a #MC which hits between the CPU buffer clear
+ invocation and the actual return to user. As this still is in kernel
+ space it takes the paranoid exit path which does not clear the CPU
+ buffers. So the #MC handler repopulates the buffers to some
+ extent. Machine checks are not reliably controllable and the window is
+ extremely small, so mitigation would just tick a checkbox that this
+ theoretical corner case is covered. To keep the amount of special
+ cases small, ignore #MC.
+
+ - Debug Exception (#DB):
+
+ This takes the paranoid exit path only when the INT1 breakpoint is in
+ kernel space. #DB on a user space address takes the regular exit path,
+ so no extra mitigation required.
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -31,6 +31,7 @@
#include <asm/vdso.h>
#include <linux/uaccess.h>
#include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>
#define CREATE_TRACE_POINTS
#include <trace/events/syscalls.h>
@@ -212,6 +213,8 @@ static void exit_to_usermode_loop(struct
#endif
user_enter_irqoff();
+
+ mds_user_clear_cpu_buffers();
}
#define SYSCALL_EXIT_WORK_FLAGS \
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,6 +318,8 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_
DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+DECLARE_STATIC_KEY_FALSE(mds_user_clear);
+
#include <asm/segment.h>
/**
@@ -343,6 +345,17 @@ static inline void mds_clear_cpu_buffers
asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
}
+/**
+ * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * Clear CPU buffers if the corresponding static key is enabled
+ */
+static inline void mds_user_clear_cpu_buffers(void)
+{
+ if (static_branch_likely(&mds_user_clear))
+ mds_clear_cpu_buffers();
+}
+
#endif /* __ASSEMBLY__ */
/*
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -63,6 +63,9 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_i
/* Control unconditional IBPB in switch_mm() */
DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+/* Control MDS CPU buffer clear before returning to user space */
+DEFINE_STATIC_KEY_FALSE(mds_user_clear);
+
void __init check_bugs(void)
{
identify_boot_cpu();
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -34,6 +34,7 @@
#include <asm/x86_init.h>
#include <asm/reboot.h>
#include <asm/cache.h>
+#include <asm/nospec-branch.h>
#define CREATE_TRACE_POINTS
#include <trace/events/nmi.h>
@@ -533,6 +534,9 @@ do_nmi(struct pt_regs *regs, long error_
write_cr2(this_cpu_read(nmi_cr2));
if (this_cpu_dec_return(nmi_state))
goto nmi_restart;
+
+ if (user_mode(regs))
+ mds_user_clear_cpu_buffers();
}
NOKPROBE_SYMBOL(do_nmi);
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -366,6 +366,13 @@ dotraplinkage void do_double_fault(struc
regs->ip = (unsigned long)general_protection;
regs->sp = (unsigned long)&gpregs->orig_ax;
+ /*
+ * This situation can be triggered by userspace via
+ * modify_ldt(2) and the return does not take the regular
+ * user space exit, so a CPU buffer clear is required when
+ * MDS mitigation is enabled.
+ */
+ mds_user_clear_cpu_buffers();
return;
}
#endif
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V6 07/14] MDS basics 7
2019-03-01 21:47 ` [patch V6 07/14] MDS basics 7 Thomas Gleixner
@ 2019-03-02 2:22 ` Frederic Weisbecker
2019-03-05 15:30 ` Thomas Gleixner
2019-03-06 5:21 ` Borislav Petkov
1 sibling, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2019-03-02 2:22 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:45PM +0100, speck for Thomas Gleixner wrote:
> +
> + - Debug Exception (#DB):
> +
> + This takes the paranoid exit path only when the INT1 breakpoint is in
> + kernel space. #DB on a user space address takes the regular exit path,
> + so no extra mitigation required.
I can't find that part in this patch, maybe it's further in the series?
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> @@ -34,6 +34,7 @@
> #include <asm/x86_init.h>
> #include <asm/reboot.h>
> #include <asm/cache.h>
> +#include <asm/nospec-branch.h>
>
> #define CREATE_TRACE_POINTS
> #include <trace/events/nmi.h>
> @@ -533,6 +534,9 @@ do_nmi(struct pt_regs *regs, long error_
> write_cr2(this_cpu_read(nmi_cr2));
> if (this_cpu_dec_return(nmi_state))
> goto nmi_restart;
> +
> + if (user_mode(regs))
> + mds_user_clear_cpu_buffers();
What if the NMI fires after a call to prepare_exit_to_usermode()
but before the actual return to usermode, would that be a problem?
Thanks.
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [patch V6 07/14] MDS basics 7
2019-03-02 2:22 ` [MODERATED] " Frederic Weisbecker
@ 2019-03-05 15:30 ` Thomas Gleixner
2019-03-06 15:49 ` [MODERATED] " Frederic Weisbecker
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-05 15:30 UTC (permalink / raw)
To: speck
On Sat, 2 Mar 2019, speck for Frederic Weisbecker wrote:
> On Fri, Mar 01, 2019 at 10:47:45PM +0100, speck for Thomas Gleixner wrote:
> > +
> > + - Debug Exception (#DB):
> > +
> > + This takes the paranoid exit path only when the INT1 breakpoint is in
> > + kernel space. #DB on a user space address takes the regular exit path,
> > + so no extra mitigation required.
>
> I can't find that part in this patch, maybe it's further in the series?
There is no patch. #DB is not interesting as explained above.
> > --- a/arch/x86/kernel/nmi.c
> > +++ b/arch/x86/kernel/nmi.c
> > @@ -34,6 +34,7 @@
> > #include <asm/x86_init.h>
> > #include <asm/reboot.h>
> > #include <asm/cache.h>
> > +#include <asm/nospec-branch.h>
> >
> > #define CREATE_TRACE_POINTS
> > #include <trace/events/nmi.h>
> > @@ -533,6 +534,9 @@ do_nmi(struct pt_regs *regs, long error_
> > write_cr2(this_cpu_read(nmi_cr2));
> > if (this_cpu_dec_return(nmi_state))
> > goto nmi_restart;
> > +
> > + if (user_mode(regs))
> > + mds_user_clear_cpu_buffers();
>
> What if the NMI fires after a call to prepare_exit_to_usermode()
> but before the actual return to usermode, would that be a problem?
Yes, it's a hole in the protection, but you would need to be able to
orchestrate that as a user, which I doubt you can. So the thought was that
we'd rather avoid the penalty for perf when it hits kernel space, which
requires root ....
Thanks,
tglx
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V6 07/14] MDS basics 7
2019-03-05 15:30 ` Thomas Gleixner
@ 2019-03-06 15:49 ` Frederic Weisbecker
0 siblings, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2019-03-06 15:49 UTC (permalink / raw)
To: speck
On Tue, Mar 05, 2019 at 04:30:38PM +0100, speck for Thomas Gleixner wrote:
> On Sat, 2 Mar 2019, speck for Frederic Weisbecker wrote:
>
> > On Fri, Mar 01, 2019 at 10:47:45PM +0100, speck for Thomas Gleixner wrote:
> > > +
> > > + - Debug Exception (#DB):
> > > +
> > > + This takes the paranoid exit path only when the INT1 breakpoint is in
> > > + kernel space. #DB on a user space address takes the regular exit path,
> > > + so no extra mitigation required.
> >
> > I can't find that part in this patch, maybe it's further in the series?
>
> There is no patch. #DB is not interesting as explained above.
Oh right, my brainfart...
>
> > > --- a/arch/x86/kernel/nmi.c
> > > +++ b/arch/x86/kernel/nmi.c
> > > @@ -34,6 +34,7 @@
> > > #include <asm/x86_init.h>
> > > #include <asm/reboot.h>
> > > #include <asm/cache.h>
> > > +#include <asm/nospec-branch.h>
> > >
> > > #define CREATE_TRACE_POINTS
> > > #include <trace/events/nmi.h>
> > > @@ -533,6 +534,9 @@ do_nmi(struct pt_regs *regs, long error_
> > > write_cr2(this_cpu_read(nmi_cr2));
> > > if (this_cpu_dec_return(nmi_state))
> > > goto nmi_restart;
> > > +
> > > + if (user_mode(regs))
> > > + mds_user_clear_cpu_buffers();
> >
> > What if the NMI fires after a call to prepare_exit_to_usermode()
> > but before the actual return to usermode, would that be a problem?
>
> Yes, it's a hole in the protection, but you would need to be able to
> orchestrate that as a user, which I doubt you can. So the thought was that
> we'd rather avoid the penalty for perf when it hits kernel space, which
> requires root ....
Fair enough.
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V6 07/14] MDS basics 7
2019-03-01 21:47 ` [patch V6 07/14] MDS basics 7 Thomas Gleixner
2019-03-02 2:22 ` [MODERATED] " Frederic Weisbecker
@ 2019-03-06 5:21 ` Borislav Petkov
1 sibling, 0 replies; 89+ messages in thread
From: Borislav Petkov @ 2019-03-06 5:21 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:45PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 07/14] x86/speculation/mds: Clear CPU buffers on exit to user
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Add a static key which controls the invocation of the CPU buffer clear
> mechanism on exit to user space and add the call into
> prepare_exit_to_usermode() and do_nmi() right before actually returning.
>
> Add documentation on which kernel to user space transitions this covers
> and explain why some corner cases are not mitigated.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> ---
> V4 --> V5: Use an inline helper instead of open coding it.
> Rework the documentation paragraph about exceptions.
>
> V3 --> V4: Add #DF mitigation and document that the #MC corner case
> is really not interesting.
>
> V3: Add NMI conditional on user regs and update documentation accordingly.
> Use the static branch scheme suggested by Peter. Fix typos ...
> ---
> Documentation/x86/mds.rst | 52 +++++++++++++++++++++++++++++++++++
> arch/x86/entry/common.c | 3 ++
> arch/x86/include/asm/nospec-branch.h | 13 ++++++++
> arch/x86/kernel/cpu/bugs.c | 3 ++
> arch/x86/kernel/nmi.c | 4 ++
> arch/x86/kernel/traps.c | 7 ++++
> 6 files changed, 82 insertions(+)
...
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -366,6 +366,13 @@ dotraplinkage void do_double_fault(struc
> regs->ip = (unsigned long)general_protection;
> regs->sp = (unsigned long)&gpregs->orig_ax;
>
> + /*
> + * This situation can be triggered by userspace via
> + * modify_ldt(2) and the return does not take the regular
> + * user space exit, so a CPU buffer clear is required when
> + * MDS mitigation is enabled.
> + */
> + mds_user_clear_cpu_buffers();
> return;
> }
> #endif
Looks like the traps.c change is missing a hunk, see below. Without it:
arch/x86/kernel/traps.c: In function ‘do_double_fault’:
arch/x86/kernel/traps.c:375:3: error: implicit declaration of function ‘mds_user_clear_cpu_buffers’ [-Werror=implicit-function-declaration]
mds_user_clear_cpu_buffers();
^~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:276: arch/x86/kernel/traps.o] Error 1
make[2]: *** Waiting for unfinished jobs....
make[1]: *** [scripts/Makefile.build:492: arch/x86/kernel] Error 2
make: *** [Makefile:1043: arch/x86] Error 2
make: *** Waiting for unfinished jobs....
---
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 5942060dba9a..ce33f7f672d6 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -61,6 +61,7 @@
#include <asm/mpx.h>
#include <asm/vm86.h>
#include <asm/umip.h>
+#include <asm/nospec-branch.h>
#ifdef CONFIG_X86_64
#include <asm/x86_init.h>
---
with that
Reviewed-by: Borislav Petkov <bp@suse.de>
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [patch V6 08/14] MDS basics 8
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (6 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 07/14] MDS basics 7 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-03 2:54 ` [MODERATED] " Frederic Weisbecker
` (2 more replies)
2019-03-01 21:47 ` [patch V6 09/14] MDS basics 9 Thomas Gleixner
` (7 subsequent siblings)
15 siblings, 3 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
CPUs which are affected by L1TF and MDS mitigate MDS with the L1D Flush on
VMENTER when updated microcode is installed.
If a CPU is not affected by L1TF or if the L1D Flush is not in use, then
MDS mitigation needs to be invoked explicit.
For these cases, follow the host mitigation state and invoke the MDS
mitigation before VMENTER.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
V4 --> V5: Fix changelog
---
arch/x86/kernel/cpu/bugs.c | 1 +
arch/x86/kvm/vmx/vmx.c | 2 ++
2 files changed, 3 insertions(+)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -65,6 +65,7 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_always
/* Control MDS CPU buffer clear before returning to user space */
DEFINE_STATIC_KEY_FALSE(mds_user_clear);
+EXPORT_SYMBOL_GPL(mds_user_clear);
void __init check_bugs(void)
{
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6371,6 +6371,8 @@ static void __vmx_vcpu_run(struct kvm_vc
if (static_branch_unlikely(&vmx_l1d_should_flush))
vmx_l1d_flush(vcpu);
+ else if (static_branch_unlikely(&mds_user_clear))
+ mds_clear_cpu_buffers();
asm(
/* Store host registers */
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V6 08/14] MDS basics 8
2019-03-01 21:47 ` [patch V6 08/14] MDS basics 8 Thomas Gleixner
@ 2019-03-03 2:54 ` Frederic Weisbecker
2019-03-04 6:57 ` [MODERATED] Encrypted Message Jon Masters
2019-03-06 14:11 ` [MODERATED] Re: [patch V6 08/14] MDS basics 8 Borislav Petkov
2 siblings, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2019-03-03 2:54 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:46PM +0100, speck for Thomas Gleixner wrote:
> CPUs which are affected by L1TF and MDS mitigate MDS with the L1D Flush on
> VMENTER when updated microcode is installed.
>
> If a CPU is not affected by L1TF or if the L1D Flush is not in use, then
> MDS mitigation needs to be invoked explicit.
>
> For these cases, follow the host mitigation state and invoke the MDS
> mitigation before VMENTER.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
> V4 --> V5: Fix changelog
> ---
> arch/x86/kernel/cpu/bugs.c | 1 +
> arch/x86/kvm/vmx/vmx.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -65,6 +65,7 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_always
>
> /* Control MDS CPU buffer clear before returning to user space */
> DEFINE_STATIC_KEY_FALSE(mds_user_clear);
> +EXPORT_SYMBOL_GPL(mds_user_clear);
>
> void __init check_bugs(void)
> {
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6371,6 +6371,8 @@ static void __vmx_vcpu_run(struct kvm_vc
We may want to add a comment below to summarize what's explained
in the changelog. git blame tends to lose the primary history after even
the most insignificant future variable rename. Something like:
+ /* l1tf mitigation, if present, spares us mds mitigation */
> if (static_branch_unlikely(&vmx_l1d_should_flush))
> vmx_l1d_flush(vcpu);
> + else if (static_branch_unlikely(&mds_user_clear))
> + mds_clear_cpu_buffers();
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Thanks.
^ permalink raw reply [flat|nested] 89+ messages in thread
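With such a comment folded in, the VMENTER-path decision would read
roughly as below (a sketch; the final comment wording is up to Thomas):

        /*
         * The L1D flush, if active, already overwrites the CPU buffers
         * on VMENTER and thus spares the explicit MDS clear.
         */
        if (static_branch_unlikely(&vmx_l1d_should_flush))
                vmx_l1d_flush(vcpu);
        else if (static_branch_unlikely(&mds_user_clear))
                mds_clear_cpu_buffers();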
* [MODERATED] Encrypted Message
2019-03-01 21:47 ` [patch V6 08/14] MDS basics 8 Thomas Gleixner
2019-03-03 2:54 ` [MODERATED] " Frederic Weisbecker
@ 2019-03-04 6:57 ` Jon Masters
2019-03-04 7:06 ` Jon Masters
2019-03-05 15:34 ` Thomas Gleixner
2019-03-06 14:11 ` [MODERATED] Re: [patch V6 08/14] MDS basics 8 Borislav Petkov
2 siblings, 2 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-04 6:57 UTC (permalink / raw)
To: speck
From: Jon Masters <jcm@redhat.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: [patch V6 08/14] MDS basics 8
On 3/1/19 4:47 PM, speck for Thomas Gleixner wrote:
> if (static_branch_unlikely(&vmx_l1d_should_flush))
> vmx_l1d_flush(vcpu);
> + else if (static_branch_unlikely(&mds_user_clear))
> + mds_clear_cpu_buffers();
Does this cover the case where we have older ucode installed that does
L1D flush but NOT the MD_CLEAR? I'm about to go check to see if there's
logic handling this but wanted to call it out.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-03-04 6:57 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-03-04 7:06 ` Jon Masters
2019-03-04 8:12 ` Jon Masters
2019-03-05 15:34 ` Thomas Gleixner
1 sibling, 1 reply; 89+ messages in thread
From: Jon Masters @ 2019-03-04 7:06 UTC (permalink / raw)
To: speck
From: Jon Masters <jcm@redhat.com>
To: speck for Jon Masters <speck@linutronix.de>
Subject: Re: [patch V6 08/14] MDS basics 8
On 3/4/19 1:57 AM, speck for Jon Masters wrote:
> On 3/1/19 4:47 PM, speck for Thomas Gleixner wrote:
>> if (static_branch_unlikely(&vmx_l1d_should_flush))
>> vmx_l1d_flush(vcpu);
>> + else if (static_branch_unlikely(&mds_user_clear))
>> + mds_clear_cpu_buffers();
>
> Does this cover the case where we have older ucode installed that does
> L1D flush but NOT the MD_CLEAR? I'm about to go check to see if there's
> logic handling this but wanted to call it out.
Aside from the above question, I've reviewed all of the patches
extensively at this point. Feel free to add a Reviewed-by or Tested-by
according to your preference. I've a bunch of further tests running,
including on AMD platforms just to check nothing broke with those
platforms that are not susceptible to MDS.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: Encrypted Message
2019-03-04 6:57 ` [MODERATED] Encrypted Message Jon Masters
2019-03-04 7:06 ` Jon Masters
@ 2019-03-05 15:34 ` Thomas Gleixner
2019-03-06 16:21 ` [MODERATED] " Jon Masters
1 sibling, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-05 15:34 UTC (permalink / raw)
To: speck
On Mon, 4 Mar 2019, speck for Jon Masters wrote:
> On 3/1/19 4:47 PM, speck for Thomas Gleixner wrote:
> > if (static_branch_unlikely(&vmx_l1d_should_flush))
> > vmx_l1d_flush(vcpu);
> > + else if (static_branch_unlikely(&mds_user_clear))
> > + mds_clear_cpu_buffers();
>
> Does this cover the case where we have older ucode installed that does
> L1D flush but NOT the MD_CLEAR? I'm about to go check to see if there's
> logic handling this but wanted to call it out.
If no updated microcode is available then it's pretty irrelevant which code
path you take. None of them will mitigate MDS.
Thanks,
tglx
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V6 08/14] MDS basics 8
2019-03-01 21:47 ` [patch V6 08/14] MDS basics 8 Thomas Gleixner
2019-03-03 2:54 ` [MODERATED] " Frederic Weisbecker
2019-03-04 6:57 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-03-06 14:11 ` Borislav Petkov
2 siblings, 0 replies; 89+ messages in thread
From: Borislav Petkov @ 2019-03-06 14:11 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:46PM +0100, speck for Thomas Gleixner wrote:
> CPUs which are affected by L1TF and MDS mitigate MDS with the L1D Flush on
> VMENTER when updated microcode is installed.
>
> If a CPU is not affected by L1TF or if the L1D Flush is not in use, then
> MDS mitigation needs to be invoked explicit.
explicitly.
>
> For these cases, follow the host mitigation state and invoke the MDS
> mitigation before VMENTER.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
> V4 --> V5: Fix changelog
> ---
> arch/x86/kernel/cpu/bugs.c | 1 +
> arch/x86/kvm/vmx/vmx.c | 2 ++
> 2 files changed, 3 insertions(+)
With that:
Reviewed-by: Borislav Petkov <bp@suse.de>
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V6 09/14] MDS basics 9
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (7 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 08/14] MDS basics 8 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-06 16:14 ` [MODERATED] " Frederic Weisbecker
2019-03-01 21:47 ` [patch V6 10/14] MDS basics 10 Thomas Gleixner
` (6 subsequent siblings)
15 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 09/14] x86/speculation/mds: Conditionally clear CPU buffers on idle entry
From: Thomas Gleixner <tglx@linutronix.de>
Add a static key which controls the invocation of the CPU buffer clear
mechanism on idle entry. This is independent of other MDS mitigations
because the idle entry invocation to mitigate the potential leakage due to
store buffer repartitioning is only necessary on SMT systems.
Add the actual invocations to the different halt/mwait variants which
covers all usage sites. mwaitx is not patched as it's not available on
Intel CPUs.
The buffer clear is only invoked before entering the C-State to prevent
stale data from the idling CPU from being spilled to the Hyper-Thread
sibling after the store buffer got repartitioned and all entries are
available to the non-idle sibling.
When coming out of idle the store buffer is partitioned again so each
sibling has half of it available. Now the CPU which returned from idle
could be speculatively exposed to contents of the sibling, but the buffers
are flushed either on exit to user space or on VMENTER.
When later on conditional buffer clearing is implemented on top of this,
then there is no action required either because before returning to user
space the context switch will set the condition flag which causes a flush
on the return to user path.
Note that the buffer clearing on idle is only sensible on CPUs which are
solely affected by MSBDS and not any other variant of MDS because the other
MDS variants cannot be mitigated when SMT is enabled, so the buffer
clearing on idle would be a window dressing exercise.
This intentionally does not handle the case in the acpi/processor_idle
driver which uses the legacy IO port interface for C-State transitions for
two reasons:
- The acpi/processor_idle driver was replaced by the intel_idle driver
almost a decade ago. Anything Nehalem upwards supports it and defaults
to that new driver.
- The legacy IO port interface is likely to be used on older and therefore
unaffected CPUs or on systems which do not receive microcode updates
anymore, so there is no point in adding that.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
V4: Export mds_idle_clear
V3: Adjust document wording
---
Documentation/x86/mds.rst | 42 +++++++++++++++++++++++++++++++++++
arch/x86/include/asm/irqflags.h | 4 +++
arch/x86/include/asm/mwait.h | 7 +++++
arch/x86/include/asm/nospec-branch.h | 12 ++++++++++
arch/x86/kernel/cpu/bugs.c | 3 ++
5 files changed, 68 insertions(+)
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -149,3 +149,45 @@ Mitigation points
This takes the paranoid exit path only when the INT1 breakpoint is in
kernel space. #DB on a user space address takes the regular exit path,
so no extra mitigation required.
+
+
+2. C-State transition
+^^^^^^^^^^^^^^^^^^^^^
+
+ When a CPU goes idle and enters a C-State the CPU buffers need to be
+ cleared on affected CPUs when SMT is active. This addresses the
+ repartitioning of the store buffer when one of the Hyper-Threads enters
+ a C-State.
+
+ When SMT is inactive, i.e. either the CPU does not support it or all
+ sibling threads are offline, CPU buffer clearing is not required.
+
+ The idle clearing is enabled on CPUs which are only affected by MSBDS
+ and not by any other MDS variant. The other MDS variants cannot be
+ protected against cross Hyper-Thread attacks because the Fill Buffer and
+ the Load Ports are shared. So on CPUs affected by other variants, the
+ idle clearing would be a window dressing exercise and is therefore not
+ activated.
+
+ The invocation is controlled by the static key mds_idle_clear which is
+ switched depending on the chosen mitigation mode and the SMT state of
+ the system.
+
+ The buffer clear is only invoked before entering the C-State to prevent
+ stale data from the idling CPU from spilling to the Hyper-Thread
+ sibling after the store buffer got repartitioned and all entries are
+ available to the non idle sibling.
+
+ When coming out of idle the store buffer is partitioned again so each
+ sibling has half of it available. The CPU coming back from idle could
+ then be speculatively exposed to contents of the sibling. The buffers are
+ flushed either on exit to user space or on VMENTER so malicious code
+ in user space or the guest cannot speculatively access them.
+
+ The mitigation is hooked into all variants of halt()/mwait(), but does
+ not cover the legacy ACPI IO-Port mechanism because the ACPI idle driver
+ was superseded by the intel_idle driver around 2010 and is
+ preferred on all affected CPUs which are expected to gain the MD_CLEAR
+ functionality in microcode. Aside from that, the IO-Port mechanism is a
+ legacy interface which is only used on older systems which are either
+ not affected or do not receive microcode updates anymore.
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -6,6 +6,8 @@
#ifndef __ASSEMBLY__
+#include <asm/nospec-branch.h>
+
/* Provide __cpuidle; we can't safely include <linux/cpu.h> */
#define __cpuidle __attribute__((__section__(".cpuidle.text")))
@@ -54,11 +56,13 @@ static inline void native_irq_enable(voi
static inline __cpuidle void native_safe_halt(void)
{
+ mds_idle_clear_cpu_buffers();
asm volatile("sti; hlt": : :"memory");
}
static inline __cpuidle void native_halt(void)
{
+ mds_idle_clear_cpu_buffers();
asm volatile("hlt": : :"memory");
}
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -6,6 +6,7 @@
#include <linux/sched/idle.h>
#include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>
#define MWAIT_SUBSTATE_MASK 0xf
#define MWAIT_CSTATE_MASK 0xf
@@ -40,6 +41,8 @@ static inline void __monitorx(const void
static inline void __mwait(unsigned long eax, unsigned long ecx)
{
+ mds_idle_clear_cpu_buffers();
+
/* "mwait %eax, %ecx;" */
asm volatile(".byte 0x0f, 0x01, 0xc9;"
:: "a" (eax), "c" (ecx));
@@ -74,6 +77,8 @@ static inline void __mwait(unsigned long
static inline void __mwaitx(unsigned long eax, unsigned long ebx,
unsigned long ecx)
{
+ /* No MDS buffer clear as this is AMD/HYGON only */
+
/* "mwaitx %eax, %ebx, %ecx;" */
asm volatile(".byte 0x0f, 0x01, 0xfb;"
:: "a" (eax), "b" (ebx), "c" (ecx));
@@ -81,6 +86,8 @@ static inline void __mwaitx(unsigned lon
static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
{
+ mds_idle_clear_cpu_buffers();
+
trace_hardirqs_on();
/* "mwait %eax, %ecx;" */
asm volatile("sti; .byte 0x0f, 0x01, 0xc9;"
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -319,6 +319,7 @@ DECLARE_STATIC_KEY_FALSE(switch_mm_cond_
DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
DECLARE_STATIC_KEY_FALSE(mds_user_clear);
+DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
#include <asm/segment.h>
@@ -356,6 +357,17 @@ static inline void mds_user_clear_cpu_bu
mds_clear_cpu_buffers();
}
+/**
+ * mds_idle_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * Clear CPU buffers if the corresponding static key is enabled
+ */
+static inline void mds_idle_clear_cpu_buffers(void)
+{
+ if (static_branch_likely(&mds_idle_clear))
+ mds_clear_cpu_buffers();
+}
+
#endif /* __ASSEMBLY__ */
/*
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -66,6 +66,9 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_always
/* Control MDS CPU buffer clear before returning to user space */
DEFINE_STATIC_KEY_FALSE(mds_user_clear);
EXPORT_SYMBOL_GPL(mds_user_clear);
+/* Control MDS CPU buffer clear before idling (halt, mwait) */
+DEFINE_STATIC_KEY_FALSE(mds_idle_clear);
+EXPORT_SYMBOL_GPL(mds_idle_clear);
void __init check_bugs(void)
{
^ permalink raw reply [flat|nested] 89+ messages in thread
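To make the ordering argument concrete, the patched native_safe_halt()
boils down to the following (condensed sketch; the __cpuidle annotation
is dropped for brevity):

static inline void native_safe_halt(void)
{
        /*
         * Clear the buffers before this thread's store buffer half is
         * handed over to the sibling; a no-op unless the mds_idle_clear
         * static key is enabled.
         */
        mds_idle_clear_cpu_buffers();
        asm volatile("sti; hlt" : : : "memory");
}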
* [MODERATED] Re: [patch V6 09/14] MDS basics 9
2019-03-01 21:47 ` [patch V6 09/14] MDS basics 9 Thomas Gleixner
@ 2019-03-06 16:14 ` Frederic Weisbecker
0 siblings, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2019-03-06 16:14 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:47PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 09/14] x86/speculation/mds: Conditionally clear CPU buffers on idle entry
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Add a static key which controls the invocation of the CPU buffer clear
> mechanism on idle entry. This is independent of other MDS mitigations
> because the idle entry invocation to mitigate the potential leakage due to
> store buffer repartitioning is only necessary on SMT systems.
>
> Add the actual invocations to the different halt/mwait variants which
> covers all usage sites. mwaitx is not patched as it's not available on
> Intel CPUs.
>
> The buffer clear is only invoked before entering the C-State to prevent
> stale data from the idling CPU from being spilled to the Hyper-Thread
> sibling after the store buffer got repartitioned and all entries are
> available to the non-idle sibling.
>
> When coming out of idle the store buffer is partitioned again so each
> sibling has half of it available. Now the CPU which returned from idle
> could be speculatively exposed to contents of the sibling, but the buffers
> are flushed either on exit to user space or on VMENTER.
>
> When later on conditional buffer clearing is implemented on top of this,
> then there is no action required either because before returning to user
> space the context switch will set the condition flag which causes a flush
> on the return to user path.
>
> Note that the buffer clearing on idle is only sensible on CPUs which are
> solely affected by MSBDS and not any other variant of MDS because the other
> MDS variants cannot be mitigated when SMT is enabled, so the buffer
> clearing on idle would be a window dressing exercise.
>
> This intentionally does not handle the case in the acpi/processor_idle
> driver which uses the legacy IO port interface for C-State transitions for
> two reasons:
>
> - The acpi/processor_idle driver was replaced by the intel_idle driver
> almost a decade ago. Anything Nehalem upwards supports it and defaults
> to that new driver.
>
> - The legacy IO port interface is likely to be used on older and therefore
> unaffected CPUs or on systems which do not receive microcode updates
> anymore, so there is no point in adding that.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Borislav Petkov <bp@suse.de>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V6 10/14] MDS basics 10
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (8 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 09/14] MDS basics 9 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-04 6:45 ` [MODERATED] Encrypted Message Jon Masters
` (2 more replies)
2019-03-01 21:47 ` [patch V6 11/14] MDS basics 11 Thomas Gleixner
` (5 subsequent siblings)
15 siblings, 3 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 10/14] x86/speculation/mds: Add mitigation control for MDS
From: Thomas Gleixner <tglx@linutronix.de>
Now that the mitigations are in place, add a command line parameter to
control the mitigation, a mitigation selector function and an SMT update
mechanism.
This is the minimal straightforward initial implementation which just
provides an always on/off mode. The command line parameter is:
mds=[full|off]
This is consistent with the existing mitigations for other speculative
hardware vulnerabilities.
The idle invocation is dynamically updated according to the SMT state of
the system similar to the dynamic update of the STIBP mitigation. The idle
mitigation is limited to CPUs which are only affected by MSBDS and not any
other variant, because the other variants cannot be mitigated on SMT
enabled systems.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V5 --> V6: Make idle clearing depend on BUG_MSBDS_ONLY
V4 --> V5: Remove 'auto'
---
Documentation/admin-guide/kernel-parameters.txt | 22 +++++++
arch/x86/include/asm/processor.h | 5 +
arch/x86/kernel/cpu/bugs.c | 68 ++++++++++++++++++++++++
3 files changed, 95 insertions(+)
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2356,6 +2356,28 @@
Format: <first>,<last>
Specifies range of consoles to be captured by the MDA.
+ mds= [X86,INTEL]
+ Control mitigation for the Micro-architectural Data
+ Sampling (MDS) vulnerability.
+
+ Certain CPUs are vulnerable to an exploit against CPU
+ internal buffers which can forward information to a
+ disclosure gadget under certain conditions.
+
+ In vulnerable processors, the speculatively
+ forwarded data can be used in a cache side channel
+ attack, to access data to which the attacker does
+ not have direct access.
+
+ This parameter controls the MDS mitigation. The
+ options are:
+
+ full - Enable MDS mitigation on vulnerable CPUs
+ off - Unconditionally disable MDS mitigation
+
+ Not specifying this option is equivalent to
+ mds=full.
+
mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
Amount of memory to be used when the kernel is not able
to see the whole system memory or for test.
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -992,4 +992,9 @@ enum l1tf_mitigations {
extern enum l1tf_mitigations l1tf_mitigation;
+enum mds_mitigations {
+ MDS_MITIGATION_OFF,
+ MDS_MITIGATION_FULL,
+};
+
#endif /* _ASM_X86_PROCESSOR_H */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -37,6 +37,7 @@
static void __init spectre_v2_select_mitigation(void);
static void __init ssb_select_mitigation(void);
static void __init l1tf_select_mitigation(void);
+static void __init mds_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
u64 x86_spec_ctrl_base;
@@ -108,6 +109,8 @@ void __init check_bugs(void)
l1tf_select_mitigation();
+ mds_select_mitigation();
+
#ifdef CONFIG_X86_32
/*
* Check whether we are able to run this kernel safely on SMP.
@@ -214,6 +217,50 @@ static void x86_amd_ssb_disable(void)
}
#undef pr_fmt
+#define pr_fmt(fmt) "MDS: " fmt
+
+/* Default mitigation for MDS-affected CPUs */
+static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_FULL;
+
+static const char * const mds_strings[] = {
+ [MDS_MITIGATION_OFF] = "Vulnerable",
+ [MDS_MITIGATION_FULL] = "Mitigation: Clear CPU buffers"
+};
+
+static void mds_select_mitigation(void)
+{
+ if (!boot_cpu_has_bug(X86_BUG_MDS)) {
+ mds_mitigation = MDS_MITIGATION_OFF;
+ return;
+ }
+
+ if (mds_mitigation == MDS_MITIGATION_FULL) {
+ if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
+ static_branch_enable(&mds_user_clear);
+ else
+ mds_mitigation = MDS_MITIGATION_OFF;
+ }
+ pr_info("%s\n", mds_strings[mds_mitigation]);
+}
+
+static int __init mds_cmdline(char *str)
+{
+ if (!boot_cpu_has_bug(X86_BUG_MDS))
+ return 0;
+
+ if (!str)
+ return -EINVAL;
+
+ if (!strcmp(str, "off"))
+ mds_mitigation = MDS_MITIGATION_OFF;
+ else if (!strcmp(str, "full"))
+ mds_mitigation = MDS_MITIGATION_FULL;
+
+ return 0;
+}
+early_param("mds", mds_cmdline);
+
+#undef pr_fmt
#define pr_fmt(fmt) "Spectre V2 : " fmt
static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
@@ -617,6 +664,24 @@ static void update_indir_branch_cond(voi
static_branch_disable(&switch_to_cond_stibp);
}
+/* Update the static key controlling the MDS CPU buffer clear in idle */
+static void update_mds_branch_idle(void)
+{
+ /*
+ * Enable the idle clearing on CPUs which are affected only by
+ * MDBDS and not any other MDS variant. The other variants cannot
+ * be mitigated when SMT is enabled, so clearing the buffers on
+ * idle would be a window dressing exercise.
+ */
+ if (!boot_cpu_has(X86_BUG_MSBDS_ONLY))
+ return;
+
+ if (sched_smt_active())
+ static_branch_enable(&mds_idle_clear);
+ else
+ static_branch_disable(&mds_idle_clear);
+}
+
void arch_smt_update(void)
{
/* Enhanced IBRS implies STIBP. No update required. */
@@ -638,6 +703,9 @@ void arch_smt_update(void)
break;
}
+ if (mds_mitigation == MDS_MITIGATION_FULL)
+ update_mds_branch_idle();
+
mutex_unlock(&spec_ctrl_mutex);
}
^ permalink raw reply [flat|nested] 89+ messages in thread
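Exercising the new switch on an affected machine is straightforward; the
sysfs file shown below is added by the reporting patch later in this
series, so the path is stated here under that assumption:

        # boot with the mitigation disabled
        mds=off

        # read back the resulting state after boot
        cat /sys/devices/system/cpu/vulnerabilities/mds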
* [MODERATED] Encrypted Message
2019-03-01 21:47 ` [patch V6 10/14] MDS basics 10 Thomas Gleixner
@ 2019-03-04 6:45 ` Jon Masters
2019-03-05 18:42 ` [MODERATED] Re: [patch V6 10/14] MDS basics 10 Andrea Arcangeli
2019-03-06 14:31 ` [MODERATED] " Borislav Petkov
2 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-04 6:45 UTC (permalink / raw)
To: speck
From: Jon Masters <jcm@redhat.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: [patch V6 10/14] MDS basics 10
On 3/1/19 4:47 PM, speck for Thomas Gleixner wrote:
> + /*
> + * Enable the idle clearing on CPUs which are affected only by
> + * MDBDS and not any other MDS variant. The other variants cannot
^^^^^
MSBDS
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V6 10/14] MDS basics 10
2019-03-01 21:47 ` [patch V6 10/14] MDS basics 10 Thomas Gleixner
2019-03-04 6:45 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-03-05 18:42 ` Andrea Arcangeli
2019-03-06 19:15 ` Thomas Gleixner
2019-03-06 14:31 ` [MODERATED] " Borislav Petkov
2 siblings, 1 reply; 89+ messages in thread
From: Andrea Arcangeli @ 2019-03-05 18:42 UTC (permalink / raw)
To: speck
Hi Thomas,
On Fri, Mar 01, 2019 at 10:47:48PM +0100, speck for Thomas Gleixner wrote:
> +/* Update the static key controlling the MDS CPU buffer clear in idle */
> +static void update_mds_branch_idle(void)
> +{
> + /*
> + * Enable the idle clearing on CPUs which are affected only by
> + * MDBDS and not any other MDS variant. The other variants cannot
> + * be mitigated when SMT is enabled, so clearing the buffers on
> + * idle would be a window dressing exercise.
> + */
> + if (!boot_cpu_has(X86_BUG_MSBDS_ONLY))
> + return;
> +
> + if (sched_smt_active())
> + static_branch_enable(&mds_idle_clear);
Do you think it's worth also clearing
MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT by setting
ring3mwait_disabled when sched_smt_active() is true above?
I don't expect anybody will pass manually ring3mwait=disable to the
kernel on XEON_PHI_KNL/XEON_PHI_KNM. I'm not aware of any app using
the user mwait, which also makes this not a big deal.. but it goes
both ways, it's also not a big deal for userland to turn it off when
we report SMT is enabled and safe in sysfs.
Thanks,
Andrea
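A rough sketch of the suggested tweak, assuming the per-CPU MSR shadow and bit name used by the existing ring3mwait support (hypothetical and untested, not part of this series):

  /*
   * Hypothetical: revoke the user space MWAIT opt-in when SMT is
   * active, since the kernel cannot clear the CPU buffers when user
   * space enters MWAIT directly from ring 3.
   */
  if (sched_smt_active() && boot_cpu_has(X86_FEATURE_RING3MWAIT)) {
  	u64 msrval;

  	msrval = this_cpu_read(msr_misc_features_shadow);
  	msrval &= ~(1ULL << MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT);
  	this_cpu_write(msr_misc_features_shadow, msrval);
  	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
  }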
* Re: [patch V6 10/14] MDS basics 10
2019-03-05 18:42 ` [MODERATED] Re: [patch V6 10/14] MDS basics 10 Andrea Arcangeli
@ 2019-03-06 19:15 ` Thomas Gleixner
0 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-06 19:15 UTC (permalink / raw)
To: speck
Andrea,
On Tue, 5 Mar 2019, speck for Andrea Arcangeli wrote:
> Hi Thomas,
>
> On Fri, Mar 01, 2019 at 10:47:48PM +0100, speck for Thomas Gleixner wrote:
> > +/* Update the static key controlling the MDS CPU buffer clear in idle */
> > +static void update_mds_branch_idle(void)
> > +{
> > + /*
> > + * Enable the idle clearing on CPUs which are affected only by
> > + * MDBDS and not any other MDS variant. The other variants cannot
> > + * be mitigated when SMT is enabled, so clearing the buffers on
> > + * idle would be a window dressing exercise.
> > + */
> > + if (!boot_cpu_has(X86_BUG_MSBDS_ONLY))
> > + return;
> > +
> > + if (sched_smt_active())
> > + static_branch_enable(&mds_idle_clear);
>
> Do you think it's worth also clearing
> MSR_MISC_FEATURES_ENABLES_RING3MWAIT_BIT by setting
> ring3mwait_disabled when sched_smt_active() is true above?
Not sure.
> I don't expect anybody will pass manually ring3mwait=disable to the
> kernel on XEON_PHI_KNL/XEON_PHI_KNM. I'm not aware of any app using
> the user mwait, which also makes this not a big deal.. but it goes
> both ways, it's also not a big deal for userland to turn it off when
> we report SMT is enabled and safe in sysfs.
True, and as usual we don't really know what people are doing;
wrecking existing applications which rely on that would not be nice.
Thanks,
tglx
* [MODERATED] Re: [patch V6 10/14] MDS basics 10
2019-03-01 21:47 ` [patch V6 10/14] MDS basics 10 Thomas Gleixner
2019-03-04 6:45 ` [MODERATED] Encrypted Message Jon Masters
2019-03-05 18:42 ` [MODERATED] Re: [patch V6 10/14] MDS basics 10 Andrea Arcangeli
@ 2019-03-06 14:31 ` Borislav Petkov
2019-03-06 15:30 ` Thomas Gleixner
2 siblings, 1 reply; 89+ messages in thread
From: Borislav Petkov @ 2019-03-06 14:31 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:48PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 10/14] x86/speculation/mds: Add mitigation control for MDS
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Now that the mitigations are in place, add a command line parameter to
> control the mitigation, a mitigation selector function and a SMT update
> mechanism.
>
> This is the minimal straight forward initial implementation which just
> provides an always on/off mode. The command line parameter is:
>
> mds=[full|off]
>
> This is consistent with the existing mitigations for other speculative
> hardware vulnerabilities.
>
> The idle invocation is dynamically updated according to the SMT state of
> the system similar to the dynamic update of the STIBP mitigation. The idle
> mitigation is limited to CPUs which are only affected by MSBDS and not any
> other variant, because the other variants cannot be mitigated on SMT
> enabled systems.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V5 --> V6: Make idle clearing depend on BUG_MSBDS_ONLY
> V4 --> V5: Remove 'auto'
> ---
...
> @@ -617,6 +664,24 @@ static void update_indir_branch_cond(voi
> static_branch_disable(&switch_to_cond_stibp);
> }
>
> +/* Update the static key controlling the MDS CPU buffer clear in idle */
> +static void update_mds_branch_idle(void)
> +{
> + /*
> + * Enable the idle clearing on CPUs which are affected only by
> + * MDBDS and not any other MDS variant. The other variants cannot
> + * be mitigated when SMT is enabled,
... but we're not enabling the key when SMT on those is disabled,
AFAICT. Or is that coming later?
> so clearing the buffers on
> + * idle would be a window dressing exercise.
> + */
> + if (!boot_cpu_has(X86_BUG_MSBDS_ONLY))
if (!boot_cpu_has_bug
> + return;
> +
> + if (sched_smt_active())
> + static_branch_enable(&mds_idle_clear);
> + else
> + static_branch_disable(&mds_idle_clear);
> +}
> +
> void arch_smt_update(void)
> {
> /* Enhanced IBRS implies STIBP. No update required. */
> @@ -638,6 +703,9 @@ void arch_smt_update(void)
> break;
> }
>
> + if (mds_mitigation == MDS_MITIGATION_FULL)
> + update_mds_branch_idle();
> +
> mutex_unlock(&spec_ctrl_mutex);
> }
>
>
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
* Re: [patch V6 10/14] MDS basics 10
2019-03-06 14:31 ` [MODERATED] " Borislav Petkov
@ 2019-03-06 15:30 ` Thomas Gleixner
2019-03-06 18:35 ` Thomas Gleixner
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-06 15:30 UTC (permalink / raw)
To: speck
On Wed, 6 Mar 2019, speck for Borislav Petkov wrote:
> On Fri, Mar 01, 2019 at 10:47:48PM +0100, speck for Thomas Gleixner wrote:
> > +/* Update the static key controlling the MDS CPU buffer clear in idle */
> > +static void update_mds_branch_idle(void)
> > +{
> > + /*
> > + * Enable the idle clearing on CPUs which are affected only by
> > + * MDBDS and not any other MDS variant. The other variants cannot
> > + * be mitigated when SMT is enabled,
>
> ... but we're not enabling the key when SMT on those is disabled,
> AFAICT. Or is that coming later?
Five lines down ....
> > so clearing the buffers on
> > + * idle would be a window dressing exercise.
> > + */
> > + if (!boot_cpu_has(X86_BUG_MSBDS_ONLY))
>
> if (!boot_cpu_has_bug
Fixed.
> > + return;
> > +
> > + if (sched_smt_active())
... here is the decision whether to enable or disable.
> > + static_branch_enable(&mds_idle_clear);
> > + else
> > + static_branch_disable(&mds_idle_clear);
> > +}
Thanks,
tglx
* Re: [patch V6 10/14] MDS basics 10
2019-03-06 15:30 ` Thomas Gleixner
@ 2019-03-06 18:35 ` Thomas Gleixner
2019-03-06 19:34 ` [MODERATED] Re: " Borislav Petkov
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-06 18:35 UTC (permalink / raw)
To: speck
On Wed, 6 Mar 2019, speck for Thomas Gleixner wrote:
> On Wed, 6 Mar 2019, speck for Borislav Petkov wrote:
> > On Fri, Mar 01, 2019 at 10:47:48PM +0100, speck for Thomas Gleixner wrote:
> > > +/* Update the static key controlling the MDS CPU buffer clear in idle */
> > > +static void update_mds_branch_idle(void)
> > > +{
> > > + /*
> > > + * Enable the idle clearing on CPUs which are affected only by
> > > + * MDBDS and not any other MDS variant. The other variants cannot
> > > + * be mitigated when SMT is enabled,
> >
> > ... but we're not enabling the key when SMT on those is disabled,
> > AFAICT. Or is that coming later?
>
> Five lines down ....
Following up on our conversation on IRC, I've reworded the comment:
/*
* Enable the idle clearing if SMT is active on CPUs which are
* affected only by MSBDS and not any other MDS variant.
*
* The other variants cannot be mitigated when SMT is enabled, so
* clearing the buffers on idle just to prevent the Store Buffer
* repartitioning leak would be a window dressing exercise.
*/
if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY))
return;
Thanks,
tglx
* [MODERATED] Re: Re: [patch V6 10/14] MDS basics 10
2019-03-06 18:35 ` Thomas Gleixner
@ 2019-03-06 19:34 ` Borislav Petkov
0 siblings, 0 replies; 89+ messages in thread
From: Borislav Petkov @ 2019-03-06 19:34 UTC (permalink / raw)
To: speck
On Wed, Mar 06, 2019 at 07:35:26PM +0100, speck for Thomas Gleixner wrote:
> Following up on our conversation on IRC, I've reworded the comment:
>
> /*
> * Enable the idle clearing if SMT is active on CPUs which are
> * affected only by MSBDS and not any other MDS variant.
> *
> * The other variants cannot be mitigated when SMT is enabled, so
> * clearing the buffers on idle just to prevent the Store Buffer
> * repartitioning leak would be a window dressing exercise.
> */
> if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY))
> return;
Yap, looks good.
With that addressed:
Reviewed-by: Borislav Petkov <bp@suse.de>
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
* [patch V6 11/14] MDS basics 11
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (9 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 10/14] MDS basics 10 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-01 21:47 ` [patch V6 12/14] MDS basics 12 Thomas Gleixner
` (4 subsequent siblings)
15 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 11/14] x86/speculation/mds: Add sysfs reporting for MDS
From: Thomas Gleixner <tglx@linutronix.de>
Add the sysfs reporting file for MDS. It exposes the vulnerability and
mitigation state similar to the existing files for the other speculative
hardware vulnerabilities.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Copy & Paste done right :(
---
Documentation/ABI/testing/sysfs-devices-system-cpu | 1
arch/x86/kernel/cpu/bugs.c | 25 +++++++++++++++++++++
drivers/base/cpu.c | 8 ++++++
include/linux/cpu.h | 2 +
4 files changed, 36 insertions(+)
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -484,6 +484,7 @@ What: /sys/devices/system/cpu/vulnerabi
/sys/devices/system/cpu/vulnerabilities/spectre_v2
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
/sys/devices/system/cpu/vulnerabilities/l1tf
+ /sys/devices/system/cpu/vulnerabilities/mds
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Information about CPU vulnerabilities
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1170,6 +1170,22 @@ static ssize_t l1tf_show_state(char *buf
}
#endif
+static ssize_t mds_show_state(char *buf)
+{
+ if (!hypervisor_is_type(X86_HYPER_NATIVE)) {
+ return sprintf(buf, "%s; SMT Host state unknown\n",
+ mds_strings[mds_mitigation]);
+ }
+
+ if (boot_cpu_has(X86_BUG_MSBDS_ONLY)) {
+ return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
+ sched_smt_active() ? "mitigated" : "disabled");
+ }
+
+ return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
+ sched_smt_active() ? "vulnerable" : "disabled");
+}
+
static char *stibp_state(void)
{
if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
@@ -1236,6 +1252,10 @@ static ssize_t cpu_show_common(struct de
if (boot_cpu_has(X86_FEATURE_L1TF_PTEINV))
return l1tf_show_state(buf);
break;
+
+ case X86_BUG_MDS:
+ return mds_show_state(buf);
+
default:
break;
}
@@ -1267,4 +1287,9 @@ ssize_t cpu_show_l1tf(struct device *dev
{
return cpu_show_common(dev, attr, buf, X86_BUG_L1TF);
}
+
+ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_MDS);
+}
#endif
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -546,11 +546,18 @@ ssize_t __weak cpu_show_l1tf(struct devi
return sprintf(buf, "Not affected\n");
}
+ssize_t __weak cpu_show_mds(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sprintf(buf, "Not affected\n");
+}
+
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL);
+static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
@@ -558,6 +565,7 @@ static struct attribute *cpu_root_vulner
&dev_attr_spectre_v2.attr,
&dev_attr_spec_store_bypass.attr,
&dev_attr_l1tf.attr,
+ &dev_attr_mds.attr,
NULL
};
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -57,6 +57,8 @@ extern ssize_t cpu_show_spec_store_bypas
struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_l1tf(struct device *dev,
struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_mds(struct device *dev,
+ struct device_attribute *attr, char *buf);
extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,
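With this in place an affected machine reports its state via the new file; for example, with the full mitigation active and SMT enabled:

  $ cat /sys/devices/system/cpu/vulnerabilities/mds
  Mitigation: Clear CPU buffers; SMT vulnerable

An MSBDS-only part reports 'SMT mitigated' instead, and a kernel running in a VM appends 'SMT Host state unknown', matching the branches in mds_show_state() above.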
* [patch V6 12/14] MDS basics 12
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (10 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 11/14] MDS basics 11 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-04 5:47 ` [MODERATED] Encrypted Message Jon Masters
` (2 more replies)
2019-03-01 21:47 ` [patch V6 13/14] MDS basics 13 Thomas Gleixner
` (3 subsequent siblings)
15 siblings, 3 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 12/14] x86/speculation/mds: Add mitigation mode VMWERV
From: Thomas Gleixner <tglx@linutronix.de>
In virtualized environments it can happen that the host has the microcode
update which utilizes the VERW instruction to clear CPU buffers, but the
hypervisor is not yet updated to expose the X86_FEATURE_MD_CLEAR CPUID bit
to guests.
Introduce an internal mitigation mode VWWERV which enables the invocation
of the CPU buffer clearing even if X86_FEATURE_MD_CLEAR is not set. If the
system has no updated microcode this results in a pointless execution of
the VERW instruction wasting a few CPU cycles. If the microcode is updated,
but not exposed to a guest then the CPU buffers will be cleared.
That said: Virtual Machines Will Eventually Receive Vaccine
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2 -> V3: Rename mode.
---
Documentation/x86/mds.rst | 27 +++++++++++++++++++++++++++
arch/x86/include/asm/processor.h | 1 +
arch/x86/kernel/cpu/bugs.c | 18 ++++++++++++------
3 files changed, 40 insertions(+), 6 deletions(-)
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -93,11 +93,38 @@ enters a C-state.
The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state
(idle) transitions.
+As a special quirk to address virtualization scenarios where the host has
+the microcode updated, but the hypervisor does not (yet) expose the
+MD_CLEAR CPUID bit to guests, the kernel issues the VERW instruction in the
+hope that it might actually clear the buffers. The state is reflected
+accordingly.
+
According to current knowledge additional mitigations inside the kernel
itself are not required because the necessary gadgets to expose the leaked
data cannot be controlled in a way which allows exploitation from malicious
user space or VM guests.
+Kernel internal mitigation modes
+--------------------------------
+
+ ======= ============================================================
+ off Mitigation is disabled. Either the CPU is not affected or
+ mds=off is supplied on the kernel command line
+
+ full Mitigation is enabled. CPU is affected and MD_CLEAR is
+ advertised in CPUID.
+
+ vmwerv Mitigation is enabled. CPU is affected and MD_CLEAR is not
+ advertised in CPUID. That is mainly for virtualization
+ scenarios where the host has the updated microcode but the
+ hypervisor does not expose MD_CLEAR in CPUID. It's a best
+ effort approach without guarantee.
+ ======= ============================================================
+
+If the CPU is affected and mds=off is not supplied on the kernel command
+line then the kernel selects the appropriate mitigation mode depending on
+the availability of the MD_CLEAR CPUID bit.
+
Mitigation points
-----------------
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -995,6 +995,7 @@ extern enum l1tf_mitigations l1tf_mitiga
enum mds_mitigations {
MDS_MITIGATION_OFF,
MDS_MITIGATION_FULL,
+ MDS_MITIGATION_VMWERV,
};
#endif /* _ASM_X86_PROCESSOR_H */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -224,7 +224,8 @@ static enum mds_mitigations mds_mitigati
static const char * const mds_strings[] = {
[MDS_MITIGATION_OFF] = "Vulnerable",
- [MDS_MITIGATION_FULL] = "Mitigation: Clear CPU buffers"
+ [MDS_MITIGATION_FULL] = "Mitigation: Clear CPU buffers",
+ [MDS_MITIGATION_VMWERV] = "Vulnerable: Clear CPU buffers attempted, no microcode",
};
static void mds_select_mitigation(void)
@@ -235,10 +236,9 @@ static void mds_select_mitigation(void)
}
if (mds_mitigation == MDS_MITIGATION_FULL) {
- if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
- static_branch_enable(&mds_user_clear);
- else
- mds_mitigation = MDS_MITIGATION_OFF;
+ if (!boot_cpu_has(X86_FEATURE_MD_CLEAR))
+ mds_mitigation = MDS_MITIGATION_VMWERV;
+ static_branch_enable(&mds_user_clear);
}
pr_info("%s\n", mds_strings[mds_mitigation]);
}
@@ -703,8 +703,14 @@ void arch_smt_update(void)
break;
}
- if (mds_mitigation == MDS_MITIGATION_FULL)
+ switch(mds_mitigation) {
+ case MDS_MITIGATION_FULL:
+ case MDS_MITIGATION_VMWERV:
update_mds_branch_idle();
+ break;
+ case MDS_MITIGATION_OFF:
+ break;
+ }
mutex_unlock(&spec_ctrl_mutex);
}
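The net effect is that in VMWERV mode the mds_user_clear key is enabled even though MD_CLEAR is not advertised, so the same transition points attempt the clear. A minimal sketch of the user return consumer, with helper names as used earlier in this series:

  /* On the return to user space path: */
  static inline void mds_user_clear_cpu_buffers(void)
  {
  	/* VERW; just wasted cycles without updated microcode */
  	if (static_branch_likely(&mds_user_clear))
  		mds_clear_cpu_buffers();
  }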
* [MODERATED] Encrypted Message
2019-03-01 21:47 ` [patch V6 12/14] MDS basics 12 Thomas Gleixner
@ 2019-03-04 5:47 ` Jon Masters
2019-03-05 16:04 ` Thomas Gleixner
2019-03-05 16:40 ` [MODERATED] Re: [patch V6 12/14] MDS basics 12 mark gross
2019-03-06 14:42 ` Borislav Petkov
2 siblings, 1 reply; 89+ messages in thread
From: Jon Masters @ 2019-03-04 5:47 UTC (permalink / raw)
To: speck
From: Jon Masters <jcm@redhat.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: [patch V6 12/14] MDS basics 12
On 3/1/19 4:47 PM, speck for Thomas Gleixner wrote:
> Subject: [patch V6 12/14] x86/speculation/mds: Add mitigation mode VMWERV
> From: Thomas Gleixner <tglx@linutronix.de>
>
> In virtualized environments it can happen that the host has the microcode
> update which utilizes the VERW instruction to clear CPU buffers, but the
> hypervisor is not yet updated to expose the X86_FEATURE_MD_CLEAR CPUID bit
> to guests.
>
> Introduce an internal mitigation mode VWWERV which enables the invocation
> of the CPU buffer clearing even if X86_FEATURE_MD_CLEAR is not set. If the
> system has no updated microcode this results in a pointless execution of
> the VERW instruction wasting a few CPU cycles. If the microcode is updated,
> but not exposed to a guest then the CPU buffers will be cleared.
>
> That said: Virtual Machines Will Eventually Receive Vaccine
The effect of this patch, currently, is that a (bare metal) machine
without updated ucode will print the following:
[ 1.576602] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
The intention of the patch is to say "hey, you might be on a VM, so
we'll try anyway in case we didn't get told you had MD_CLEAR". But the
effect on bare metal might be ambiguous. It's reasonable (for someone
else) to assume we might be using a software sequence to try flushing.
Perhaps the wording should convey something like:
"MDS: Vulnerable: Clear CPU buffers may not work, no microcode"
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
* Re: Encrypted Message
2019-03-04 5:47 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-03-05 16:04 ` Thomas Gleixner
0 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-05 16:04 UTC (permalink / raw)
To: speck
On Mon, 4 Mar 2019, speck for Jon Masters wrote:
> > That said: Virtual Machines Will Eventually Receive Vaccine
>
> The effect of this patch, currently, is that a (bare metal) machine
> without updated ucode will print the following:
>
> [ 1.576602] MDS: Vulnerable: Clear CPU buffers attempted, no microcode
>
> The intention of the patch is to say "hey, you might be on a VM, so
> we'll try anyway in case we didn't get told you had MD_CLEAR". But the
> effect on bare metal might be ambiguous. It's reasonable (for someone
> else) to assume we might be using a software sequence to try flushing.
>
> Perhaps the wording should convey something like:
>
> "MDS: Vulnerable: Clear CPU buffers may not work, no microcode"
Yeah, we also could do something like the delta patch below:
Thanks,
tglx
8<------------------
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -228,18 +228,28 @@ static const char * const mds_strings[]
[MDS_MITIGATION_VMWERV] = "Vulnerable: Clear CPU buffers attempted, no microcode",
};
-static void mds_select_mitigation(void)
+static void __init mds_check_md_clear(void)
+{
+ if (!boot_cpu_has(X86_FEATURE_MD_CLEAR)) {
+ if (hypervisor_is_type(X86_HYPER_NATIVE)) {
+ mds_mitigation = MDS_MITIGATION_OFF;
+ return;
+ }
+ mds_mitigation = MDS_MITIGATION_VMWERV;
+ }
+ static_branch_enable(&mds_user_clear);
+}
+
+static void __init mds_select_mitigation(void)
{
if (!boot_cpu_has_bug(X86_BUG_MDS)) {
mds_mitigation = MDS_MITIGATION_OFF;
return;
}
- if (mds_mitigation == MDS_MITIGATION_FULL) {
- if (!boot_cpu_has(X86_FEATURE_MD_CLEAR))
- mds_mitigation = MDS_MITIGATION_VMWERV;
- static_branch_enable(&mds_user_clear);
- }
+ if (mds_mitigation == MDS_MITIGATION_FULL)
+ mds_check_md_clear();
+
pr_info("%s\n", mds_strings[mds_mitigation]);
}
* [MODERATED] Re: [patch V6 12/14] MDS basics 12
2019-03-01 21:47 ` [patch V6 12/14] MDS basics 12 Thomas Gleixner
2019-03-04 5:47 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-03-05 16:40 ` mark gross
2019-03-06 14:42 ` Borislav Petkov
2 siblings, 0 replies; 89+ messages in thread
From: mark gross @ 2019-03-05 16:40 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:50PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 12/14] x86/speculation/mds: Add mitigation mode VMWERV
> From: Thomas Gleixner <tglx@linutronix.de>
>
> In virtualized environments it can happen that the host has the microcode
> update which utilizes the VERW instruction to clear CPU buffers, but the
> hypervisor is not yet updated to expose the X86_FEATURE_MD_CLEAR CPUID bit
> to guests.
>
> Introduce an internal mitigation mode VWWERV which enables the invocation
minor type-oh. s/VWWERV/VMWERV/
--mark
* [MODERATED] Re: [patch V6 12/14] MDS basics 12
2019-03-01 21:47 ` [patch V6 12/14] MDS basics 12 Thomas Gleixner
2019-03-04 5:47 ` [MODERATED] Encrypted Message Jon Masters
2019-03-05 16:40 ` [MODERATED] Re: [patch V6 12/14] MDS basics 12 mark gross
@ 2019-03-06 14:42 ` Borislav Petkov
2 siblings, 0 replies; 89+ messages in thread
From: Borislav Petkov @ 2019-03-06 14:42 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:50PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 12/14] x86/speculation/mds: Add mitigation mode VMWERV
> From: Thomas Gleixner <tglx@linutronix.de>
>
> In virtualized environments it can happen that the host has the microcode
> update which utilizes the VERW instruction to clear CPU buffers, but the
> hypervisor is not yet updated to expose the X86_FEATURE_MD_CLEAR CPUID bit
> to guests.
>
> Introduce an internal mitigation mode VWWERV which enables the invocation
> of the CPU buffer clearing even if X86_FEATURE_MD_CLEAR is not set. If the
> system has no updated microcode this results in a pointless execution of
> the VERW instruction wasting a few CPU cycles. If the microcode is updated,
> but not exposed to a guest then the CPU buffers will be cleared.
>
> That said: Virtual Machines Will Eventually Receive Vaccine
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2 -> V3: Rename mode.
> ---
> Documentation/x86/mds.rst | 27 +++++++++++++++++++++++++++
> arch/x86/include/asm/processor.h | 1 +
> arch/x86/kernel/cpu/bugs.c | 18 ++++++++++++------
> 3 files changed, 40 insertions(+), 6 deletions(-)
...
> @@ -235,10 +236,9 @@ static void mds_select_mitigation(void)
> }
>
> if (mds_mitigation == MDS_MITIGATION_FULL) {
> - if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
> - static_branch_enable(&mds_user_clear);
> - else
> - mds_mitigation = MDS_MITIGATION_OFF;
> + if (!boot_cpu_has(X86_FEATURE_MD_CLEAR))
> + mds_mitigation = MDS_MITIGATION_VMWERV;
> + static_branch_enable(&mds_user_clear);
> }
> pr_info("%s\n", mds_strings[mds_mitigation]);
> }
> @@ -703,8 +703,14 @@ void arch_smt_update(void)
> break;
> }
>
> - if (mds_mitigation == MDS_MITIGATION_FULL)
> + switch(mds_mitigation) {
ERROR: space required before the open parenthesis '('
#119: FILE: arch/x86/kernel/cpu/bugs.c:706:
+ switch(mds_mitigation) {
with that addressed:
Reviewed-by: Borislav Petkov <bp@suse.de>
--
Regards/Gruss,
Boris.
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--
* [patch V6 13/14] MDS basics 13
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (11 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 12/14] MDS basics 12 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-03 4:01 ` [MODERATED] " Josh Poimboeuf
2019-03-05 16:43 ` [MODERATED] " mark gross
2019-03-01 21:47 ` [patch V6 14/14] MDS basics 14 Thomas Gleixner
` (2 subsequent siblings)
15 siblings, 2 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 13/14] Documentation: Move L1TF to separate directory
From: Thomas Gleixner <tglx@linutronix.de>
Move L1TF to a separate directory so the MDS stuff can be added at the
side. Otherwise all hardware vulnerabilities would have their own top level
entry. Should have done that right away.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
Documentation/admin-guide/hw-vuln/index.rst | 12
Documentation/admin-guide/hw-vuln/l1tf.rst | 614 ++++++++++++++++++++++++++++
Documentation/admin-guide/index.rst | 6
Documentation/admin-guide/l1tf.rst | 614 ----------------------------
4 files changed, 628 insertions(+), 618 deletions(-)
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -0,0 +1,12 @@
+========================
+Hardware vulnerabilities
+========================
+
+This section describes CPU vulnerabilities and provides an overview of the
+possible mitigations along with guidance for selecting mitigations if they
+are configurable at compile, boot or run time.
+
+.. toctree::
+ :maxdepth: 1
+
+ l1tf
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/l1tf.rst
@@ -0,0 +1,614 @@
+L1TF - L1 Terminal Fault
+========================
+
+L1 Terminal Fault is a hardware vulnerability which allows unprivileged
+speculative access to data which is available in the Level 1 Data Cache
+when the page table entry controlling the virtual address, which is used
+for the access, has the Present bit cleared or other reserved bits set.
+
+Affected processors
+-------------------
+
+This vulnerability affects a wide range of Intel processors. The
+vulnerability is not present on:
+
+ - Processors from AMD, Centaur and other non Intel vendors
+
+ - Older processor models, where the CPU family is < 6
+
+ - A range of Intel ATOM processors (Cedarview, Cloverview, Lincroft,
+ Penwell, Pineview, Silvermont, Airmont, Merrifield)
+
+ - The Intel XEON PHI family
+
+ - Intel processors which have the ARCH_CAP_RDCL_NO bit set in the
+ IA32_ARCH_CAPABILITIES MSR. If the bit is set the CPU is not affected
+ by the Meltdown vulnerability either. These CPUs should become
+ available by end of 2018.
+
+Whether a processor is affected or not can be read out from the L1TF
+vulnerability file in sysfs. See :ref:`l1tf_sys_info`.
+
+Related CVEs
+------------
+
+The following CVE entries are related to the L1TF vulnerability:
+
+ ============= ================= ==============================
+ CVE-2018-3615 L1 Terminal Fault SGX related aspects
+ CVE-2018-3620 L1 Terminal Fault OS, SMM related aspects
+ CVE-2018-3646 L1 Terminal Fault Virtualization related aspects
+ ============= ================= ==============================
+
+Problem
+-------
+
+If an instruction accesses a virtual address for which the relevant page
+table entry (PTE) has the Present bit cleared or other reserved bits set,
+then speculative execution ignores the invalid PTE and loads the referenced
+data if it is present in the Level 1 Data Cache, as if the page referenced
+by the address bits in the PTE was still present and accessible.
+
+While this is a purely speculative mechanism and the instruction will raise
+a page fault when it is retired eventually, the pure act of loading the
+data and making it available to other speculative instructions opens up the
+opportunity for side channel attacks to unprivileged malicious code,
+similar to the Meltdown attack.
+
+While Meltdown breaks the user space to kernel space protection, L1TF
+allows to attack any physical memory address in the system and the attack
+works across all protection domains. It allows an attack of SGX and also
+works from inside virtual machines because the speculation bypasses the
+extended page table (EPT) protection mechanism.
+
+
+Attack scenarios
+----------------
+
+1. Malicious user space
+^^^^^^^^^^^^^^^^^^^^^^^
+
+ Operating Systems store arbitrary information in the address bits of a
+ PTE which is marked non present. This allows a malicious user space
+ application to attack the physical memory to which these PTEs resolve.
+ In some cases user-space can maliciously influence the information
+ encoded in the address bits of the PTE, thus making attacks more
+ deterministic and more practical.
+
+ The Linux kernel contains a mitigation for this attack vector, PTE
+ inversion, which is permanently enabled and has no performance
+ impact. The kernel ensures that the address bits of PTEs, which are not
+ marked present, never point to cacheable physical memory space.
+
+ A system with an up to date kernel is protected against attacks from
+ malicious user space applications.
+
+2. Malicious guest in a virtual machine
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The fact that L1TF breaks all domain protections allows malicious guest
+ OSes, which can control the PTEs directly, and malicious guest user
+ space applications, which run on an unprotected guest kernel lacking the
+ PTE inversion mitigation for L1TF, to attack physical host memory.
+
+ A special aspect of L1TF in the context of virtualization is symmetric
+ multi threading (SMT). The Intel implementation of SMT is called
+ HyperThreading. The fact that Hyperthreads on the affected processors
+ share the L1 Data Cache (L1D) is important for this. As the flaw allows
+ only to attack data which is present in L1D, a malicious guest running
+ on one Hyperthread can attack the data which is brought into the L1D by
+ the context which runs on the sibling Hyperthread of the same physical
+ core. This context can be host OS, host user space or a different guest.
+
+ If the processor does not support Extended Page Tables, the attack is
+ only possible, when the hypervisor does not sanitize the content of the
+ effective (shadow) page tables.
+
+ While solutions exist to mitigate these attack vectors fully, these
+ mitigations are not enabled by default in the Linux kernel because they
+ can affect performance significantly. The kernel provides several
+ mechanisms which can be utilized to address the problem depending on the
+ deployment scenario. The mitigations, their protection scope and impact
+ are described in the next sections.
+
+ The default mitigations and the rationale for choosing them are explained
+ at the end of this document. See :ref:`default_mitigations`.
+
+.. _l1tf_sys_info:
+
+L1TF system information
+-----------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current L1TF
+status of the system: whether the system is vulnerable, and which
+mitigations are active. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/l1tf
+
+The possible values in this file are:
+
+ =========================== ===============================
+ 'Not affected' The processor is not vulnerable
+ 'Mitigation: PTE Inversion' The host protection is active
+ =========================== ===============================
+
+If KVM/VMX is enabled and the processor is vulnerable then the following
+information is appended to the 'Mitigation: PTE Inversion' part:
+
+ - SMT status:
+
+ ===================== ================
+ 'VMX: SMT vulnerable' SMT is enabled
+ 'VMX: SMT disabled' SMT is disabled
+ ===================== ================
+
+ - L1D Flush mode:
+
+ ================================ ====================================
+ 'L1D vulnerable' L1D flushing is disabled
+
+ 'L1D conditional cache flushes' L1D flush is conditionally enabled
+
+ 'L1D cache flushes' L1D flush is unconditionally enabled
+ ================================ ====================================
+
+The resulting grade of protection is discussed in the following sections.
+
+
+Host mitigation mechanism
+-------------------------
+
+The kernel is unconditionally protected against L1TF attacks from malicious
+user space running on the host.
+
+
+Guest mitigation mechanisms
+---------------------------
+
+.. _l1d_flush:
+
+1. L1D flush on VMENTER
+^^^^^^^^^^^^^^^^^^^^^^^
+
+ To make sure that a guest cannot attack data which is present in the L1D
+ the hypervisor flushes the L1D before entering the guest.
+
+ Flushing the L1D evicts not only the data which should not be accessed
+ by a potentially malicious guest, it also flushes the guest
+ data. Flushing the L1D has a performance impact as the processor has to
+ bring the flushed guest data back into the L1D. Depending on the
+ frequency of VMEXIT/VMENTER and the type of computations in the guest
+ performance degradation in the range of 1% to 50% has been observed. For
+ scenarios where guest VMEXIT/VMENTER are rare the performance impact is
+ minimal. Virtio and mechanisms like posted interrupts are designed to
+ confine the VMEXITs to a bare minimum, but specific configurations and
+ application scenarios might still suffer from a high VMEXIT rate.
+
+ The kernel provides two L1D flush modes:
+ - conditional ('cond')
+ - unconditional ('always')
+
+ The conditional mode avoids L1D flushing after VMEXITs which execute
+ only audited code paths before the corresponding VMENTER. These code
+ paths have been verified not to expose secrets or other
+ interesting data to an attacker, but they can leak information about the
+ address space layout of the hypervisor.
+
+ Unconditional mode flushes L1D on all VMENTER invocations and provides
+ maximum protection. It has a higher overhead than the conditional
+ mode. The overhead cannot be quantified correctly as it depends on the
+ workload scenario and the resulting number of VMEXITs.
+
+ The general recommendation is to enable L1D flush on VMENTER. The kernel
+ defaults to conditional mode on affected processors.
+
+ **Note** that L1D flush does not prevent the SMT problem because the
+ sibling thread will also bring back its data into the L1D which makes it
+ attackable again.
+
+ L1D flush can be controlled by the administrator via the kernel command
+ line and sysfs control files. See :ref:`mitigation_control_command_line`
+ and :ref:`mitigation_control_kvm`.
+
+.. _guest_confinement:
+
+2. Guest VCPU confinement to dedicated physical cores
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ To address the SMT problem, it is possible to make a guest or a group of
+ guests affine to one or more physical cores. The proper mechanism for
+ that is to utilize exclusive cpusets to ensure that no other guest or
+ host tasks can run on these cores.
+
+ If only a single guest or related guests run on sibling SMT threads on
+ the same physical core then they can only attack their own memory and
+ restricted parts of the host memory.
+
+ Host memory is attackable when one of the sibling SMT threads runs in
+ host OS (hypervisor) context and the other in guest context. The amount
+ of valuable information from the host OS context depends on the context
+ which the host OS executes, i.e. interrupts, soft interrupts and kernel
+ threads. The amount of valuable data from these contexts cannot be
+ declared as non-interesting for an attacker without deep inspection of
+ the code.
+
+ **Note** that assigning guests to a fixed set of physical cores affects
+ the ability of the scheduler to do load balancing and might have
+ negative effects on CPU utilization depending on the hosting
+ scenario. Disabling SMT might be a viable alternative for particular
+ scenarios.
+
+ For further information about confining guests to a single or to a group
+ of cores consult the cpusets documentation:
+
+ https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt
+
+.. _interrupt_isolation:
+
+3. Interrupt affinity
+^^^^^^^^^^^^^^^^^^^^^
+
+ Interrupts can be made affine to logical CPUs. This is not universally
+ true because there are types of interrupts which are truly per CPU
+ interrupts, e.g. the local timer interrupt. Aside from that, multi queue
+ devices affine their interrupts to single CPUs or groups of CPUs per
+ queue without allowing the administrator to control the affinities.
+
+ Moving the interrupts, which can be affinity controlled, away from CPUs
+ which run untrusted guests, reduces the attack vector space.
+
+ Whether the interrupts which are affine to CPUs which run untrusted
+ guests provide interesting data for an attacker depends on the system
+ configuration and the scenarios which run on the system. While for some
+ of the interrupts it can be assumed that they won't expose interesting
+ information beyond exposing hints about the host OS memory layout, there
+ is no way to make general assumptions.
+
+ Interrupt affinity can be controlled by the administrator via the
+ /proc/irq/$NR/smp_affinity[_list] files. Limited documentation is
+ available at:
+
+ https://www.kernel.org/doc/Documentation/IRQ-affinity.txt
+
+.. _smt_control:
+
+4. SMT control
+^^^^^^^^^^^^^^
+
+ To prevent the SMT issues of L1TF it might be necessary to disable SMT
+ completely. Disabling SMT can have a significant performance impact, but
+ the impact depends on the hosting scenario and the type of workloads.
+ The impact of disabling SMT also needs to be weighed against the impact
+ of other mitigation solutions like confining guests to dedicated cores.
+
+ The kernel provides a sysfs interface to retrieve the status of SMT and
+ to control it. It also provides a kernel command line interface to
+ control SMT.
+
+ The kernel command line interface consists of the following options:
+
+ =========== ==========================================================
+ nosmt Affects the bring up of the secondary CPUs during boot. The
+ kernel tries to bring all present CPUs online during the
+ boot process. "nosmt" makes sure that from each physical
+ core only one - the so called primary (hyper) thread is
+ activated. Due to a design flaw of Intel processors related
+ to Machine Check Exceptions the non primary siblings have
+ to be brought up at least partially and are then shut down
+ again. "nosmt" can be undone via the sysfs interface.
+
+ nosmt=force Has the same effect as "nosmt" but it does not allow to
+ undo the SMT disable via the sysfs interface.
+ =========== ==========================================================
+
+ The sysfs interface provides two files:
+
+ - /sys/devices/system/cpu/smt/control
+ - /sys/devices/system/cpu/smt/active
+
+ /sys/devices/system/cpu/smt/control:
+
+ This file allows to read out the SMT control state and provides the
+ ability to disable or (re)enable SMT. The possible states are:
+
+ ============== ===================================================
+ on SMT is supported by the CPU and enabled. All
+ logical CPUs can be onlined and offlined without
+ restrictions.
+
+ off SMT is supported by the CPU and disabled. Only
+ the so called primary SMT threads can be onlined
+ and offlined without restrictions. An attempt to
+ online a non-primary sibling is rejected
+
+ forceoff Same as 'off' but the state cannot be controlled.
+ Attempts to write to the control file are rejected.
+
+ notsupported The processor does not support SMT. It's therefore
+ not affected by the SMT implications of L1TF.
+ Attempts to write to the control file are rejected.
+ ============== ===================================================
+
+ The possible states which can be written into this file to control SMT
+ state are:
+
+ - on
+ - off
+ - forceoff
+
+ /sys/devices/system/cpu/smt/active:
+
+ This file reports whether SMT is enabled and active, i.e. if on any
+ physical core two or more sibling threads are online.
+
+ SMT control is also possible at boot time via the l1tf kernel command
+ line parameter in combination with L1D flush control. See
+ :ref:`mitigation_control_command_line`.
+
+5. Disabling EPT
+^^^^^^^^^^^^^^^^
+
+ Disabling EPT for virtual machines provides full mitigation for L1TF even
+ with SMT enabled, because the effective page tables for guests are
+ managed and sanitized by the hypervisor. Though disabling EPT has a
+ significant performance impact especially when the Meltdown mitigation
+ KPTI is enabled.
+
+ EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter.
+
+There is ongoing research and development for new mitigation mechanisms to
+address the performance impact of disabling SMT or EPT.
+
+.. _mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows to control the L1TF mitigations at boot
+time with the option "l1tf=". The valid arguments for this option are:
+
+ ============ =============================================================
+ full Provides all available mitigations for the L1TF
+ vulnerability. Disables SMT and enables all mitigations in
+ the hypervisors, i.e. unconditional L1D flushing
+
+ SMT control and L1D flush control via the sysfs interface
+ is still possible after boot. Hypervisors will issue a
+ warning when the first VM is started in a potentially
+ insecure configuration, i.e. SMT enabled or L1D flush
+ disabled.
+
+ full,force Same as 'full', but disables SMT and L1D flush runtime
+ control. Implies the 'nosmt=force' command line option.
+ (i.e. sysfs control of SMT is disabled.)
+
+ flush Leaves SMT enabled and enables the default hypervisor
+ mitigation, i.e. conditional L1D flushing
+
+ SMT control and L1D flush control via the sysfs interface
+ is still possible after boot. Hypervisors will issue a
+ warning when the first VM is started in a potentially
+ insecure configuration, i.e. SMT enabled or L1D flush
+ disabled.
+
+ flush,nosmt Disables SMT and enables the default hypervisor mitigation,
+ i.e. conditional L1D flushing.
+
+ SMT control and L1D flush control via the sysfs interface
+ is still possible after boot. Hypervisors will issue a
+ warning when the first VM is started in a potentially
+ insecure configuration, i.e. SMT enabled or L1D flush
+ disabled.
+
+ flush,nowarn Same as 'flush', but hypervisors will not warn when a VM is
+ started in a potentially insecure configuration.
+
+ off Disables hypervisor mitigations and doesn't emit any
+ warnings.
+ It also drops the swap size and available RAM limit restrictions
+ on both hypervisor and bare metal.
+
+ ============ =============================================================
+
+The default is 'flush'. For details about L1D flushing see :ref:`l1d_flush`.
+
+
+.. _mitigation_control_kvm:
+
+Mitigation control for KVM - module parameter
+-------------------------------------------------------------
+
+The KVM hypervisor mitigation mechanism, flushing the L1D cache when
+entering a guest, can be controlled with a module parameter.
+
+The option/parameter is "kvm-intel.vmentry_l1d_flush=". It takes the
+following arguments:
+
+ ============ ==============================================================
+ always L1D cache flush on every VMENTER.
+
+ cond Flush L1D on VMENTER only when the code between VMEXIT and
+ VMENTER can leak host memory which is considered
+ interesting for an attacker. This still can leak host memory
+ which allows e.g. to determine the host's address space layout.
+
+ never Disables the mitigation
+ ============ ==============================================================
+
+The parameter can be provided on the kernel command line, as a module
+parameter when loading the modules and at runtime modified via the sysfs
+file:
+
+/sys/module/kvm_intel/parameters/vmentry_l1d_flush
+
+The default is 'cond'. If 'l1tf=full,force' is given on the kernel command
+line, then 'always' is enforced and the kvm-intel.vmentry_l1d_flush
+module parameter is ignored and writes to the sysfs file are rejected.
+
+
+Mitigation selection guide
+--------------------------
+
+1. No virtualization in use
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The system is protected by the kernel unconditionally and no further
+ action is required.
+
+2. Virtualization with trusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ If the guest comes from a trusted source and the guest OS kernel is
+ guaranteed to have the L1TF mitigations in place the system is fully
+ protected against L1TF and no further action is required.
+
+ To avoid the overhead of the default L1D flushing on VMENTER the
+ administrator can disable the flushing via the kernel command line and
+ sysfs control files. See :ref:`mitigation_control_command_line` and
+ :ref:`mitigation_control_kvm`.
+
+
+3. Virtualization with untrusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+3.1. SMT not supported or disabled
+""""""""""""""""""""""""""""""""""
+
+ If SMT is not supported by the processor or disabled in the BIOS or by
+ the kernel, it's only required to enforce L1D flushing on VMENTER.
+
+ Conditional L1D flushing is the default behaviour and can be tuned. See
+ :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`.
+
+3.2. EPT not supported or disabled
+""""""""""""""""""""""""""""""""""
+
+ If EPT is not supported by the processor or disabled in the hypervisor,
+ the system is fully protected. SMT can stay enabled and L1D flushing on
+ VMENTER is not required.
+
+ EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter.
+
+3.3. SMT and EPT supported and active
+"""""""""""""""""""""""""""""""""""""
+
+ If SMT and EPT are supported and active then various degrees of
+ mitigations can be employed:
+
+ - L1D flushing on VMENTER:
+
+ L1D flushing on VMENTER is the minimal protection requirement, but it
+ is only potent in combination with other mitigation methods.
+
+ Conditional L1D flushing is the default behaviour and can be tuned. See
+ :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`.
+
+ - Guest confinement:
+
+ Confinement of guests to a single or a group of physical cores which
+ are not running any other processes, can reduce the attack surface
+ significantly, but interrupts, soft interrupts and kernel threads can
+ still expose valuable data to a potential attacker. See
+ :ref:`guest_confinement`.
+
+ - Interrupt isolation:
+
+ Isolating the guest CPUs from interrupts can reduce the attack surface
+ further, but still allows a malicious guest to explore a limited amount
+ of host physical memory. This can at least be used to gain knowledge
+ about the host address space layout. The interrupts which have a fixed
+ affinity to the CPUs which run the untrusted guests can, depending on
+ the scenario, still trigger soft interrupts and schedule kernel threads
+ which might expose valuable information. See
+ :ref:`interrupt_isolation`.
+
+The above three mitigation methods combined can provide protection to a
+certain degree, but the risk of the remaining attack surface has to be
+carefully analyzed. For full protection the following methods are
+available:
+
+ - Disabling SMT:
+
+ Disabling SMT and enforcing the L1D flushing provides the maximum
+ amount of protection. This mitigation does not depend on any of the
+ above mitigation methods.
+
+ SMT control and L1D flushing can be tuned by the command line
+ parameters 'nosmt', 'l1tf', 'kvm-intel.vmentry_l1d_flush' and at run
+ time with the matching sysfs control files. See :ref:`smt_control`,
+ :ref:`mitigation_control_command_line` and
+ :ref:`mitigation_control_kvm`.
+
+ - Disabling EPT:
+
+ Disabling EPT provides the maximum amount of protection as well. It
+ does not depend on any of the above mitigation methods. SMT can stay
+ enabled and L1D flushing is not required, but the performance impact is
+ significant.
+
+ EPT can be disabled in the hypervisor via the 'kvm-intel.ept'
+ parameter.
+
+3.4. Nested virtual machines
+""""""""""""""""""""""""""""
+
+When nested virtualization is in use, three operating systems are involved:
+the bare metal hypervisor, the nested hypervisor and the nested virtual
+machine. VMENTER operations from the nested hypervisor into the nested
+guest will always be processed by the bare metal hypervisor. If KVM is the
+bare metal hypervisor it will:
+
+ - Flush the L1D cache on every switch from the nested hypervisor to the
+ nested virtual machine, so that the nested hypervisor's secrets are not
+ exposed to the nested virtual machine;
+
+ - Flush the L1D cache on every switch from the nested virtual machine to
+ the nested hypervisor; this is a complex operation, and flushing the L1D
+ cache prevents the bare metal hypervisor's secrets from being exposed to
+ the nested virtual machine;
+
+ - Instruct the nested hypervisor to not perform any L1D cache flush. This
+ is an optimization to avoid double L1D flushing.
+
+
+.. _default_mitigations:
+
+Default mitigations
+-------------------
+
+ The kernel default mitigations for vulnerable processors are:
+
+ - PTE inversion to protect against malicious user space. This is done
+ unconditionally and cannot be controlled. The swap storage is limited
+ to ~16TB.
+
+ - L1D conditional flushing on VMENTER when EPT is enabled for
+ a guest.
+
+ The kernel does not by default enforce the disabling of SMT, which leaves
+ SMT systems vulnerable when running untrusted guests with EPT enabled.
+
+ The rationale for this choice is:
+
+ - Force disabling SMT can break existing setups, especially with
+ unattended updates.
+
+ - If regular users run untrusted guests on their machine, then L1TF is
+ just an add on to other malware which might be embedded in an untrusted
+ guest, e.g. spam-bots or attacks on the local network.
+
+ There is no technical way to prevent a user from running untrusted code
+ on their machines blindly.
+
+ - It's technically extremely unlikely and from today's knowledge even
+ impossible that L1TF can be exploited via the most popular attack
+ mechanisms like JavaScript because these mechanisms have no way to
+ control PTEs. If that were possible and no other mitigation were
+ available, then the default might be different.
+
+ - The administrators of cloud and hosting setups have to carefully
+ analyze the risk for their scenarios and make the appropriate
+ mitigation choices, which might even vary across their deployed
+ machines and also result in other changes of their overall setup.
+ There is no way for the kernel to provide a sensible default for this
+ kind of scenario.
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -17,14 +17,12 @@ etc.
kernel-parameters
devices
-This section describes CPU vulnerabilities and provides an overview of the
-possible mitigations along with guidance for selecting mitigations if they
-are configurable at compile, boot or run time.
+This section describes CPU vulnerabilities and their mitigations.
.. toctree::
:maxdepth: 1
- l1tf
+ hw-vuln/index
Here is a set of documents aimed at users who are trying to track down
problems and bugs in particular.
--- a/Documentation/admin-guide/l1tf.rst
+++ /dev/null
@@ -1,614 +0,0 @@
-L1TF - L1 Terminal Fault
-========================
-
-L1 Terminal Fault is a hardware vulnerability which allows unprivileged
-speculative access to data which is available in the Level 1 Data Cache
-when the page table entry controlling the virtual address, which is used
-for the access, has the Present bit cleared or other reserved bits set.
-
-Affected processors
--------------------
-
-This vulnerability affects a wide range of Intel processors. The
-vulnerability is not present on:
-
- - Processors from AMD, Centaur and other non Intel vendors
-
- - Older processor models, where the CPU family is < 6
-
- - A range of Intel ATOM processors (Cedarview, Cloverview, Lincroft,
- Penwell, Pineview, Silvermont, Airmont, Merrifield)
-
- - The Intel XEON PHI family
-
- - Intel processors which have the ARCH_CAP_RDCL_NO bit set in the
- IA32_ARCH_CAPABILITIES MSR. If the bit is set the CPU is not affected
- by the Meltdown vulnerability either. These CPUs should become
- available by end of 2018.
-
-Whether a processor is affected or not can be read out from the L1TF
-vulnerability file in sysfs. See :ref:`l1tf_sys_info`.
-
-Related CVEs
-------------
-
-The following CVE entries are related to the L1TF vulnerability:
-
- ============= ================= ==============================
- CVE-2018-3615 L1 Terminal Fault SGX related aspects
- CVE-2018-3620 L1 Terminal Fault OS, SMM related aspects
- CVE-2018-3646 L1 Terminal Fault Virtualization related aspects
- ============= ================= ==============================
-
-Problem
--------
-
-If an instruction accesses a virtual address for which the relevant page
-table entry (PTE) has the Present bit cleared or other reserved bits set,
-then speculative execution ignores the invalid PTE and loads the referenced
-data if it is present in the Level 1 Data Cache, as if the page referenced
-by the address bits in the PTE was still present and accessible.
-
-While this is a purely speculative mechanism and the instruction will raise
-a page fault when it is retired eventually, the pure act of loading the
-data and making it available to other speculative instructions opens up the
-opportunity for side channel attacks to unprivileged malicious code,
-similar to the Meltdown attack.
-
-While Meltdown breaks the user space to kernel space protection, L1TF
-allows to attack any physical memory address in the system and the attack
-works across all protection domains. It allows an attack of SGX and also
-works from inside virtual machines because the speculation bypasses the
-extended page table (EPT) protection mechanism.
-
-
-Attack scenarios
-----------------
-
-1. Malicious user space
-^^^^^^^^^^^^^^^^^^^^^^^
-
- Operating Systems store arbitrary information in the address bits of a
- PTE which is marked non present. This allows a malicious user space
- application to attack the physical memory to which these PTEs resolve.
- In some cases user-space can maliciously influence the information
- encoded in the address bits of the PTE, thus making attacks more
- deterministic and more practical.
-
- The Linux kernel contains a mitigation for this attack vector, PTE
- inversion, which is permanently enabled and has no performance
- impact. The kernel ensures that the address bits of PTEs, which are not
- marked present, never point to cacheable physical memory space.
-
- A system with an up to date kernel is protected against attacks from
- malicious user space applications.
-
-2. Malicious guest in a virtual machine
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- The fact that L1TF breaks all domain protections allows malicious guest
- OSes, which can control the PTEs directly, and malicious guest user
- space applications, which run on an unprotected guest kernel lacking the
- PTE inversion mitigation for L1TF, to attack physical host memory.
-
- A special aspect of L1TF in the context of virtualization is symmetric
- multi threading (SMT). The Intel implementation of SMT is called
- HyperThreading. The fact that HyperThreads on the affected processors
- share the L1 Data Cache (L1D) is important for this. As the flaw allows
- only to attack data which is present in L1D, a malicious guest running
- on one Hyperthread can attack the data which is brought into the L1D by
- the context which runs on the sibling Hyperthread of the same physical
- core. This context can be host OS, host user space or a different guest.
-
- If the processor does not support Extended Page Tables, the attack is
- only possible when the hypervisor does not sanitize the content of the
- effective (shadow) page tables.
-
- While solutions exist to mitigate these attack vectors fully, these
- mitigations are not enabled by default in the Linux kernel because they
- can affect performance significantly. The kernel provides several
- mechanisms which can be utilized to address the problem depending on the
- deployment scenario. The mitigations, their protection scope and impact
- are described in the next sections.
-
- The default mitigations and the rationale for choosing them are explained
- at the end of this document. See :ref:`default_mitigations`.
-
-.. _l1tf_sys_info:
-
-L1TF system information
------------------------
-
-The Linux kernel provides a sysfs interface to enumerate the current L1TF
-status of the system: whether the system is vulnerable, and which
-mitigations are active. The relevant sysfs file is:
-
-/sys/devices/system/cpu/vulnerabilities/l1tf
-
-The possible values in this file are:
-
- =========================== ===============================
- 'Not affected' The processor is not vulnerable
- 'Mitigation: PTE Inversion' The host protection is active
- =========================== ===============================
-
-If KVM/VMX is enabled and the processor is vulnerable then the following
-information is appended to the 'Mitigation: PTE Inversion' part:
-
- - SMT status:
-
- ===================== ================
- 'VMX: SMT vulnerable' SMT is enabled
- 'VMX: SMT disabled' SMT is disabled
- ===================== ================
-
- - L1D Flush mode:
-
- ================================ ====================================
- 'L1D vulnerable' L1D flushing is disabled
-
- 'L1D conditional cache flushes' L1D flush is conditionally enabled
-
- 'L1D cache flushes' L1D flush is unconditionally enabled
- ================================ ====================================
-
-The resulting grade of protection is discussed in the following sections.
-
-
-Host mitigation mechanism
--------------------------
-
-The kernel is unconditionally protected against L1TF attacks from malicious
-user space running on the host.
-
-
-Guest mitigation mechanisms
----------------------------
-
-.. _l1d_flush:
-
-1. L1D flush on VMENTER
-^^^^^^^^^^^^^^^^^^^^^^^
-
- To make sure that a guest cannot attack data which is present in the L1D
- the hypervisor flushes the L1D before entering the guest.
-
- Flushing the L1D evicts not only the data which should not be accessed
- by a potentially malicious guest, it also flushes the guest
- data. Flushing the L1D has a performance impact as the processor has to
- bring the flushed guest data back into the L1D. Depending on the
- frequency of VMEXIT/VMENTER and the type of computations in the guest
- performance degradation in the range of 1% to 50% has been observed. For
- scenarios where guest VMEXIT/VMENTER are rare the performance impact is
- minimal. Virtio and mechanisms like posted interrupts are designed to
- confine the VMEXITs to a bare minimum, but specific configurations and
- application scenarios might still suffer from a high VMEXIT rate.
-
- The kernel provides two L1D flush modes:
- - conditional ('cond')
- - unconditional ('always')
-
- The conditional mode avoids L1D flushing after VMEXITs which execute
- only audited code paths before the corresponding VMENTER. These code
- paths have been verified that they cannot expose secrets or other
- interesting data to an attacker, but they can leak information about the
- address space layout of the hypervisor.
-
- Unconditional mode flushes L1D on all VMENTER invocations and provides
- maximum protection. It has a higher overhead than the conditional
- mode. The overhead cannot be quantified correctly as it depends on the
- workload scenario and the resulting number of VMEXITs.
-
- The general recommendation is to enable L1D flush on VMENTER. The kernel
- defaults to conditional mode on affected processors.
-
- **Note** that L1D flush does not prevent the SMT problem because the
- sibling thread will also bring back its data into the L1D which makes it
- attackable again.
-
- L1D flush can be controlled by the administrator via the kernel command
- line and sysfs control files. See :ref:`mitigation_control_command_line`
- and :ref:`mitigation_control_kvm`.
-
-.. _guest_confinement:
-
-2. Guest VCPU confinement to dedicated physical cores
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- To address the SMT problem, it is possible to make a guest or a group of
- guests affine to one or more physical cores. The proper mechanism for
- that is to utilize exclusive cpusets to ensure that no other guest or
- host tasks can run on these cores.
-
- If only a single guest or related guests run on sibling SMT threads on
- the same physical core then they can only attack their own memory and
- restricted parts of the host memory.
-
- Host memory is attackable, when one of the sibling SMT threads runs in
- host OS (hypervisor) context and the other in guest context. The amount
- of valuable information from the host OS context depends on the context
- which the host OS executes, i.e. interrupts, soft interrupts and kernel
- threads. The amount of valuable data from these contexts cannot be
- declared as non-interesting for an attacker without deep inspection of
- the code.
-
- **Note** that assigning guests to a fixed set of physical cores affects
- the ability of the scheduler to do load balancing and might have
- negative effects on CPU utilization depending on the hosting
- scenario. Disabling SMT might be a viable alternative for particular
- scenarios.
-
- For further information about confining guests to a single or to a group
- of cores consult the cpusets documentation:
-
- https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt
-
-.. _interrupt_isolation:
-
-3. Interrupt affinity
-^^^^^^^^^^^^^^^^^^^^^
-
- Interrupts can be made affine to logical CPUs. This is not universally
- true because there are types of interrupts which are truly per CPU
- interrupts, e.g. the local timer interrupt. Aside from that, multi queue
- devices affine their interrupts to single CPUs or groups of CPUs per
- queue without allowing the administrator to control the affinities.
-
- Moving the interrupts, which can be affinity controlled, away from CPUs
- which run untrusted guests, reduces the attack vector space.
-
- Whether the interrupts which are affine to CPUs running untrusted
- guests provide interesting data for an attacker depends on the system
- configuration and the scenarios which run on the system. While for some
- of the interrupts it can be assumed that they won't expose interesting
- information beyond exposing hints about the host OS memory layout, there
- is no way to make general assumptions.
-
- Interrupt affinity can be controlled by the administrator via the
- /proc/irq/$NR/smp_affinity[_list] files. Limited documentation is
- available at:
-
- https://www.kernel.org/doc/Documentation/IRQ-affinity.txt
-
-.. _smt_control:
-
-4. SMT control
-^^^^^^^^^^^^^^
-
- To prevent the SMT issues of L1TF it might be necessary to disable SMT
- completely. Disabling SMT can have a significant performance impact, but
- the impact depends on the hosting scenario and the type of workloads.
- The impact of disabling SMT also needs to be weighed against the impact
- of other mitigation solutions like confining guests to dedicated cores.
-
- The kernel provides a sysfs interface to retrieve the status of SMT and
- to control it. It also provides a kernel command line interface to
- control SMT.
-
- The kernel command line interface consists of the following options:
-
- =========== ==========================================================
- nosmt Affects the bring up of the secondary CPUs during boot. The
- kernel tries to bring all present CPUs online during the
- boot process. "nosmt" makes sure that from each physical
- core only one - the so-called primary (hyper) thread - is
- activated. Due to a design flaw of Intel processors related
- to Machine Check Exceptions the non-primary siblings have
- to be brought up at least partially and are then shut down
- again. "nosmt" can be undone via the sysfs interface.
-
- nosmt=force Has the same effect as "nosmt" but it does not allow
- undoing the SMT disable via the sysfs interface.
- =========== ==========================================================
-
- The sysfs interface provides two files:
-
- - /sys/devices/system/cpu/smt/control
- - /sys/devices/system/cpu/smt/active
-
- /sys/devices/system/cpu/smt/control:
-
- This file allows reading out the SMT control state and provides the
- ability to disable or (re)enable SMT. The possible states are:
-
- ============== ===================================================
- on SMT is supported by the CPU and enabled. All
- logical CPUs can be onlined and offlined without
- restrictions.
-
- off SMT is supported by the CPU and disabled. Only
- the so-called primary SMT threads can be onlined
- and offlined without restrictions. An attempt to
- online a non-primary sibling is rejected.
-
- forceoff Same as 'off' but the state cannot be controlled.
- Attempts to write to the control file are rejected.
-
- notsupported The processor does not support SMT. It's therefore
- not affected by the SMT implications of L1TF.
- Attempts to write to the control file are rejected.
- ============== ===================================================
-
- The possible states which can be written into this file to control SMT
- state are:
-
- - on
- - off
- - forceoff
-
- /sys/devices/system/cpu/smt/active:
-
- This file reports whether SMT is enabled and active, i.e. if on any
- physical core two or more sibling threads are online.
-
- SMT control is also possible at boot time via the l1tf kernel command
- line parameter in combination with L1D flush control. See
- :ref:`mitigation_control_command_line`.
-
-5. Disabling EPT
-^^^^^^^^^^^^^^^^
-
- Disabling EPT for virtual machines provides full mitigation for L1TF even
- with SMT enabled, because the effective page tables for guests are
- managed and sanitized by the hypervisor. However, disabling EPT has a
- significant performance impact, especially when the Meltdown mitigation
- KPTI is enabled.
-
- EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter.
-
-There is ongoing research and development for new mitigation mechanisms to
-address the performance impact of disabling SMT or EPT.
-
-.. _mitigation_control_command_line:
-
-Mitigation control on the kernel command line
----------------------------------------------
-
-The kernel command line allows controlling the L1TF mitigations at boot
-time with the option "l1tf=". The valid arguments for this option are:
-
- ============ =============================================================
- full Provides all available mitigations for the L1TF
- vulnerability. Disables SMT and enables all mitigations in
- the hypervisors, i.e. unconditional L1D flushing
-
- SMT control and L1D flush control via the sysfs interface
- is still possible after boot. Hypervisors will issue a
- warning when the first VM is started in a potentially
- insecure configuration, i.e. SMT enabled or L1D flush
- disabled.
-
- full,force Same as 'full', but disables SMT and L1D flush runtime
- control. Implies the 'nosmt=force' command line option.
- (i.e. sysfs control of SMT is disabled.)
-
- flush Leaves SMT enabled and enables the default hypervisor
- mitigation, i.e. conditional L1D flushing
-
- SMT control and L1D flush control via the sysfs interface
- is still possible after boot. Hypervisors will issue a
- warning when the first VM is started in a potentially
- insecure configuration, i.e. SMT enabled or L1D flush
- disabled.
-
- flush,nosmt Disables SMT and enables the default hypervisor mitigation,
- i.e. conditional L1D flushing.
-
- SMT control and L1D flush control via the sysfs interface
- is still possible after boot. Hypervisors will issue a
- warning when the first VM is started in a potentially
- insecure configuration, i.e. SMT enabled or L1D flush
- disabled.
-
- flush,nowarn Same as 'flush', but hypervisors will not warn when a VM is
- started in a potentially insecure configuration.
-
- off Disables hypervisor mitigations and doesn't emit any
- warnings.
- It also drops the swap size and available RAM limit restrictions
- on both hypervisor and bare metal.
-
- ============ =============================================================
-
-The default is 'flush'. For details about L1D flushing see :ref:`l1d_flush`.
-
-
-.. _mitigation_control_kvm:
-
-Mitigation control for KVM - module parameter
--------------------------------------------------------------
-
-The KVM hypervisor mitigation mechanism, flushing the L1D cache when
-entering a guest, can be controlled with a module parameter.
-
-The option/parameter is "kvm-intel.vmentry_l1d_flush=". It takes the
-following arguments:
-
- ============ ==============================================================
- always L1D cache flush on every VMENTER.
-
- cond Flush L1D on VMENTER only when the code between VMEXIT and
- VMENTER can leak host memory which is considered
- interesting for an attacker. This still can leak host memory
- which allows e.g. determining the host's address space layout.
-
- never Disables the mitigation
- ============ ==============================================================
-
-The parameter can be provided on the kernel command line, as a module
-parameter when loading the modules and at runtime modified via the sysfs
-file:
-
-/sys/module/kvm_intel/parameters/vmentry_l1d_flush
-
-The default is 'cond'. If 'l1tf=full,force' is given on the kernel command
-line, then 'always' is enforced and the kvm-intel.vmentry_l1d_flush
-module parameter is ignored and writes to the sysfs file are rejected.
-
-
-Mitigation selection guide
---------------------------
-
-1. No virtualization in use
-^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- The system is protected by the kernel unconditionally and no further
- action is required.
-
-2. Virtualization with trusted guests
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
- If the guest comes from a trusted source and the guest OS kernel is
- guaranteed to have the L1TF mitigations in place the system is fully
- protected against L1TF and no further action is required.
-
- To avoid the overhead of the default L1D flushing on VMENTER the
- administrator can disable the flushing via the kernel command line and
- sysfs control files. See :ref:`mitigation_control_command_line` and
- :ref:`mitigation_control_kvm`.
-
-
-3. Virtualization with untrusted guests
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-3.1. SMT not supported or disabled
-""""""""""""""""""""""""""""""""""
-
- If SMT is not supported by the processor or disabled in the BIOS or by
- the kernel, it's only required to enforce L1D flushing on VMENTER.
-
- Conditional L1D flushing is the default behaviour and can be tuned. See
- :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`.
-
-3.2. EPT not supported or disabled
-""""""""""""""""""""""""""""""""""
-
- If EPT is not supported by the processor or disabled in the hypervisor,
- the system is fully protected. SMT can stay enabled and L1D flushing on
- VMENTER is not required.
-
- EPT can be disabled in the hypervisor via the 'kvm-intel.ept' parameter.
-
-3.3. SMT and EPT supported and active
-"""""""""""""""""""""""""""""""""""""
-
- If SMT and EPT are supported and active then various degrees of
- mitigations can be employed:
-
- - L1D flushing on VMENTER:
-
- L1D flushing on VMENTER is the minimal protection requirement, but it
- is only potent in combination with other mitigation methods.
-
- Conditional L1D flushing is the default behaviour and can be tuned. See
- :ref:`mitigation_control_command_line` and :ref:`mitigation_control_kvm`.
-
- - Guest confinement:
-
- Confinement of guests to a single or a group of physical cores which
- are not running any other processes, can reduce the attack surface
- significantly, but interrupts, soft interrupts and kernel threads can
- still expose valuable data to a potential attacker. See
- :ref:`guest_confinement`.
-
- - Interrupt isolation:
-
- Isolating the guest CPUs from interrupts can reduce the attack surface
- further, but still allows a malicious guest to explore a limited amount
- of host physical memory. This can at least be used to gain knowledge
- about the host address space layout. The interrupts which have a fixed
- affinity to the CPUs which run the untrusted guests can, depending on
- the scenario, still trigger soft interrupts and schedule kernel threads
- which might expose valuable information. See
- :ref:`interrupt_isolation`.
-
-The above three mitigation methods combined can provide protection to a
-certain degree, but the risk of the remaining attack surface has to be
-carefully analyzed. For full protection the following methods are
-available:
-
- - Disabling SMT:
-
- Disabling SMT and enforcing the L1D flushing provides the maximum
- amount of protection. This mitigation does not depend on any of the
- above mitigation methods.
-
- SMT control and L1D flushing can be tuned by the command line
- parameters 'nosmt', 'l1tf', 'kvm-intel.vmentry_l1d_flush' and at run
- time with the matching sysfs control files. See :ref:`smt_control`,
- :ref:`mitigation_control_command_line` and
- :ref:`mitigation_control_kvm`.
-
- - Disabling EPT:
-
- Disabling EPT provides the maximum amount of protection as well. It
- does not depend on any of the above mitigation methods. SMT can stay
- enabled and L1D flushing is not required, but the performance impact is
- significant.
-
- EPT can be disabled in the hypervisor via the 'kvm-intel.ept'
- parameter.
-
-3.4. Nested virtual machines
-""""""""""""""""""""""""""""
-
-When nested virtualization is in use, three operating systems are involved:
-the bare metal hypervisor, the nested hypervisor and the nested virtual
-machine. VMENTER operations from the nested hypervisor into the nested
-guest will always be processed by the bare metal hypervisor. If KVM is the
-bare metal hypervisor it will:
-
- - Flush the L1D cache on every switch from the nested hypervisor to the
- nested virtual machine, so that the nested hypervisor's secrets are not
- exposed to the nested virtual machine;
-
- - Flush the L1D cache on every switch from the nested virtual machine to
- the nested hypervisor; this is a complex operation, and flushing the L1D
- cache avoids that the bare metal hypervisor's secrets are exposed to the
- nested virtual machine;
-
- - Instruct the nested hypervisor to not perform any L1D cache flush. This
- is an optimization to avoid double L1D flushing.
-
-
-.. _default_mitigations:
-
-Default mitigations
--------------------
-
- The kernel default mitigations for vulnerable processors are:
-
- - PTE inversion to protect against malicious user space. This is done
- unconditionally and cannot be controlled. The swap storage is limited
- to ~16TB.
-
- - L1D conditional flushing on VMENTER when EPT is enabled for
- a guest.
-
- The kernel does not by default enforce the disabling of SMT, which leaves
- SMT systems vulnerable when running untrusted guests with EPT enabled.
-
- The rationale for this choice is:
-
- - Force disabling SMT can break existing setups, especially with
- unattended updates.
-
- - If regular users run untrusted guests on their machine, then L1TF is
- just an add on to other malware which might be embedded in an untrusted
- guest, e.g. spam-bots or attacks on the local network.
-
- There is no technical way to prevent a user from running untrusted code
- on their machines blindly.
-
- - It's technically extremely unlikely and from today's knowledge even
- impossible that L1TF can be exploited via the most popular attack
- mechanisms like JavaScript because these mechanisms have no way to
- control PTEs. If this were possible and no other mitigation were
- available, then the default might be different.
-
- - The administrators of cloud and hosting setups have to carefully
- analyze the risk for their scenarios and make the appropriate
- mitigation choices, which might even vary across their deployed
- machines and also result in other changes of their overall setup.
- There is no way for the kernel to provide a sensible default for this
- kind of scenario.
^ permalink raw reply [flat|nested] 89+ messages in thread
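As a side note to the (moved) document above: the l1tf sysfs state it
describes can be queried with plain file I/O. A minimal sketch, not part of
the patch set, assuming a kernel which exposes the vulnerabilities directory:

/*
 * Minimal sketch (not from the patch set): read the L1TF mitigation
 * state via sysfs, as described in the documentation above.
 */
#include <stdio.h>

int main(void)
{
	char buf[256];
	FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/l1tf", "r");

	if (!f) {
		perror("l1tf sysfs file");	/* older kernels may lack it */
		return 1;
	}
	if (fgets(buf, sizeof(buf), f))
		fputs(buf, stdout);	/* e.g. "Mitigation: PTE Inversion; ..." */
	fclose(f);
	return 0;
}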
* [MODERATED] Re: [patch V6 13/14] MDS basics 13
2019-03-01 21:47 ` [patch V6 13/14] MDS basics 13 Thomas Gleixner
@ 2019-03-03 4:01 ` Josh Poimboeuf
2019-03-05 16:04 ` Thomas Gleixner
2019-03-05 16:43 ` [MODERATED] " mark gross
1 sibling, 1 reply; 89+ messages in thread
From: Josh Poimboeuf @ 2019-03-03 4:01 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:51PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 13/14] Documentation: Move L1TF to separate directory
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Move L!TF to a separate directory so the MDS stuff can be added at the
> side. Otherwise all hardware vulnerabilities have their own top level
> entry. Should have done that right away.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
$ git grep admin-guide |grep l1tf |grep -v hw-vuln
Documentation/ABI/testing/sysfs-devices-system-cpu: Documentation/admin-guide/l1tf.rst
Documentation/admin-guide/kernel-parameters.txt: For details see: Documentation/admin-guide/l1tf.rst
arch/x86/kernel/cpu/bugs.c: pr_info("Reading https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html might help you decide.\n");
arch/x86/kvm/vmx/vmx.c:#define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html for details.\n"
arch/x86/kvm/vmx/vmx.c:#define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html for details.\n"
--
Josh
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [patch V6 13/14] MDS basics 13
2019-03-03 4:01 ` [MODERATED] " Josh Poimboeuf
@ 2019-03-05 16:04 ` Thomas Gleixner
0 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-05 16:04 UTC (permalink / raw)
To: speck
On Sat, 2 Mar 2019, speck for Josh Poimboeuf wrote:
> On Fri, Mar 01, 2019 at 10:47:51PM +0100, speck for Thomas Gleixner wrote:
> > Subject: [patch V6 13/14] Documentation: Move L1TF to separate directory
> > From: Thomas Gleixner <tglx@linutronix.de>
> >
> > Move L!TF to a separate directory so the MDS stuff can be added at the
> > side. Otherwise all hardware vulnerabilities have their own top level
> > entry. Should have done that right away.
> >
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> $ git grep admin-guide |grep l1tf |grep -v hw-vuln
> Documentation/ABI/testing/sysfs-devices-system-cpu: Documentation/admin-guide/l1tf.rst
> Documentation/admin-guide/kernel-parameters.txt: For details see: Documentation/admin-guide/l1tf.rst
> arch/x86/kernel/cpu/bugs.c: pr_info("Reading https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html might help you decide.\n");
> arch/x86/kvm/vmx/vmx.c:#define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html for details.\n"
> arch/x86/kvm/vmx/vmx.c:#define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/l1tf.html for details.\n"
>
Ah. Indeed....
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V6 13/14] MDS basics 13
2019-03-01 21:47 ` [patch V6 13/14] MDS basics 13 Thomas Gleixner
2019-03-03 4:01 ` [MODERATED] " Josh Poimboeuf
@ 2019-03-05 16:43 ` mark gross
1 sibling, 0 replies; 89+ messages in thread
From: mark gross @ 2019-03-05 16:43 UTC (permalink / raw)
To: speck
On Fri, Mar 01, 2019 at 10:47:51PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V6 13/14] Documentation: Move L1TF to separate directory
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Move L!TF to a separate directory so the MDS stuff can be added at the
s/L!TF/L1TF
--mark
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V6 14/14] MDS basics 14
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (12 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 13/14] MDS basics 13 Thomas Gleixner
@ 2019-03-01 21:47 ` Thomas Gleixner
2019-03-01 23:48 ` [patch V6 00/14] MDS basics 0 Thomas Gleixner
2019-03-04 5:30 ` [MODERATED] Encrypted Message Jon Masters
15 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 21:47 UTC (permalink / raw)
To: speck
Subject: [patch V6 14/14] Documentation: Add MDS vulnerability documentation
From: Thomas Gleixner <tglx@linutronix.de>
Add the initial MDS vulnerability documentation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V5 --> V6: Fix the protection matrix, minor tweaks vs. idle mitigation
and MSBDS only systems.
V4 --> V5: Remove 'auto' option. Adjust virt mitigation info.
V1 --> V4: Added the missing pieces
---
Documentation/admin-guide/hw-vuln/index.rst | 1
Documentation/admin-guide/hw-vuln/l1tf.rst | 1
Documentation/admin-guide/hw-vuln/mds.rst | 307 ++++++++++++++++++++++++++++
3 files changed, 309 insertions(+)
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -10,3 +10,4 @@ are configurable at compile, boot or run
:maxdepth: 1
l1tf
+ mds
--- a/Documentation/admin-guide/hw-vuln/l1tf.rst
+++ b/Documentation/admin-guide/hw-vuln/l1tf.rst
@@ -445,6 +445,7 @@ The default is 'cond'. If 'l1tf=full,for
line, then 'always' is enforced and the kvm-intel.vmentry_l1d_flush
module parameter is ignored and writes to the sysfs file are rejected.
+.. _mitigation_selection:
Mitigation selection guide
--------------------------
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/mds.rst
@@ -0,0 +1,307 @@
+MDS - Microarchitectural Data Sampling
+======================================
+
+Microarchitectural Data Sampling is a hardware vulnerability which allows
+unprivileged speculative access to data which is available in various CPU
+internal buffers.
+
+Affected processors
+-------------------
+
+This vulnerability affects a wide range of Intel processors. The
+vulnerability is not present on:
+
+ - Processors from AMD, Centaur and other non Intel vendors
+
+ - Older processor models, where the CPU family is < 6
+
+ - Some Atoms (Bonnell, Saltwell, Goldmont, GoldmontPlus)
+
+ - Intel processors which have the ARCH_CAP_MDS_NO bit set in the
+ IA32_ARCH_CAPABILITIES MSR.
+
+Whether a processor is affected or not can be read out from the MDS
+vulnerability file in sysfs. See :ref:`mds_sys_info`.
+
+Not all processors are affected by all variants of MDS, but the mitigation
+is identical for all of them so the kernel treats them as a single
+vulnerability.
+
+Related CVEs
+------------
+
+The following CVE entries are related to the MDS vulnerability:
+
+ ============== ===== ==============================================
+ CVE-2018-12126 MSBDS Microarchitectural Store Buffer Data Sampling
+ CVE-2018-12130 MFBDS Microarchitectural Fill Buffer Data Sampling
+ CVE-2018-12127 MLPDS Microarchitectural Load Port Data Sampling
+ ============== ===== ==============================================
+
+Problem
+-------
+
+When performing store, load or L1 refill operations, processors write data
+into temporary microarchitectural structures (buffers). The data in the
+buffer can be forwarded to load operations as an optimization.
+
+Under certain conditions, usually a fault/assist caused by a load
+operation, data unrelated to the load memory address can be speculatively
+forwarded from the buffers. Because the load operation causes a fault or
+assist and its result will be discarded, the forwarded data will not cause
+incorrect program execution or state changes. But a malicious operation
+may be able to forward this speculative data to a disclosure gadget which
+in turn allows inferring the value via a cache side channel attack.
+
+Because the buffers are potentially shared between Hyper-Threads, cross
+Hyper-Thread attacks are possible.
+
+Deeper technical information is available in the MDS specific x86
+architecture section: :ref:`Documentation/x86/mds.rst <mds>`.
+
+
+Attack scenarios
+----------------
+
+Attacks against the MDS vulnerabilities can be mounted from malicious,
+non-privileged user space applications running on hosts or guests.
+Malicious guest OSes can obviously mount attacks as well.
+
+Contrary to other speculation based vulnerabilities the MDS vulnerability
+does not allow the attacker to control the memory target address. As a
+consequence the attacks are purely sampling based, but as demonstrated
+with the TLBleed attack, samples can be postprocessed successfully.
+
+Web-Browsers
+^^^^^^^^^^^^
+
+ It's unclear whether attacks through Web-Browsers are possible at
+ all. The exploitation through JavaScript is considered very unlikely,
+ but other widely used web technologies like WebAssembly could possibly
+ be abused.
+
+
+.. _mds_sys_info:
+
+MDS system information
+-----------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current MDS
+status of the system: whether the system is vulnerable, and which
+mitigations are active. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/mds
+
+The possible values in this file are:
+
+ ========================================= =================================
+ 'Not affected' The processor is not vulnerable
+
+ 'Vulnerable' The processor is vulnerable,
+ but no mitigation enabled
+
+ 'Vulnerable: Clear CPU buffers attempted' The processor is vulnerable but
+ microcode is not updated.
+ The mitigation is enabled on a
+ best effort basis.
+ See :ref:`vmwerv`
+
+ 'Mitigation: CPU buffer clear' The processor is vulnerable and the
+ CPU buffer clearing mitigation is
+ enabled.
+ ========================================= =================================
+
+If the processor is vulnerable then the following information is appended
+to the above information:
+
+ ======================== ============================================
+ 'SMT vulnerable' SMT is enabled
+ 'SMT mitigated' SMT is enabled and mitigated
+ 'SMT disabled' SMT is disabled
+ 'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown
+ ======================== ============================================
+
+.. _vmwerv:
+
+Best effort mitigation mode
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ If the processor is vulnerable, but the availability of the microcode based
+ mitigation mechanism is not advertised via CPUID, the kernel selects a best
+ effort mitigation mode. This mode invokes the mitigation instructions
+ without a guarantee that they clear the CPU buffers.
+
+ This is done to address virtualization scenarios where the host has the
+ microcode update applied, but the hypervisor is not yet updated to expose
+ the CPUID to the guest. If the host has updated microcode the protection
+ takes effect; otherwise a few CPU cycles are wasted pointlessly.
+
+ The state in the mds sysfs file reflects this situation accordingly.
+
+
+Mitigation mechanism
+-------------------------
+
+The kernel detects the affected CPUs and the presence of the required
+microcode.
+
+If a CPU is affected and the microcode is available, then the kernel
+enables the mitigation by default. The mitigation can be controlled at boot
+time via a kernel command line option. See
+:ref:`mds_mitigation_control_command_line`.
+
+.. _cpu_buffer_clear:
+
+CPU buffer clearing
+^^^^^^^^^^^^^^^^^^^
+
+ The mitigation for MDS clears the affected CPU buffers on return to user
+ space and when entering a guest.
+
+ If SMT is enabled it also clears the buffers on idle entry when the CPU
+ is only affected by MSBDS and not any other MDS variant, because the
+ other variants cannot be protected against cross Hyper-Thread attacks.
+
+ For CPUs which are only affected by MSBDS the user space, guest and idle
+ transition mitigations are sufficient and SMT does not need to be disabled.
+
+.. _virt_mechanism:
+
+Virtualization mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The protection for host to guest transition depends on the L1TF
+ vulnerability of the CPU:
+
+ - CPU is affected by L1TF:
+
+ If the L1D flush mitigation is enabled and up to date microcode is
+ available, the L1D flush mitigation is automatically protecting the
+ guest transition.
+
+ If the L1D flush mitigation is disabled then the MDS mitigation is
+ invoked explicitly when the host MDS mitigation is enabled.
+
+ For details on L1TF and virtualization see:
+ :ref:`Documentation/admin-guide/hw-vuln//l1tf.rst <mitigation_control_kvm>`.
+
+ - CPU is not affected by L1TF:
+
+ CPU buffers are flushed before entering the guest when the host MDS
+ mitigation is enabled.
+
+ The resulting MDS protection matrix for the host to guest transition:
+
+ ============ ===== ============= ============ =================
+ L1TF MDS VMX-L1FLUSH Host MDS MDS-State
+
+ Don't care No Don't care N/A Not affected
+
+ Yes Yes Disabled Off Vulnerable
+
+ Yes Yes Disabled Full Mitigated
+
+ Yes Yes Enabled Don't care Mitigated
+
+ No Yes N/A Off Vulnerable
+
+ No Yes N/A Full Mitigated
+ ============ ===== ============= ============ =================
+
+ This only covers the host to guest transition, i.e. prevents leakage from
+ host to guest, but does not protect the guest internally. Guests need to
+ have their own protections.
+
+.. _xeon_phi:
+
+XEON PHI specific considerations
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The XEON PHI processor family is affected by MSBDS which can be exploited
+ cross Hyper-Threads when entering idle states. Some XEON PHI variants allow
+ to use MWAIT in user space (Ring 3) which opens a potential attack vector
+ for malicious user space. The exposure can be disabled on the kernel
+ command line with the 'ring3mwait=disable' command line option.
+
+ XEON PHI is not affected by the other MDS variants and MSBDS is mitigated
+ before the CPU enters an idle state. As XEON PHI is not affected by L1TF
+ either, disabling SMT is not required for full protection.
+
+.. _mds_smt_control:
+
+SMT control
+^^^^^^^^^^^
+
+ All MDS variants except MSBDS can be attacked cross Hyper-Threads. That
+ means on CPUs which are affected by MFBDS or MLPDS it is necessary to
+ disable SMT for full protection. These are most of the affected CPUs; the
+ exception is XEON PHI, see :ref:`xeon_phi`.
+
+ Disabling SMT can have a significant performance impact, but the impact
+ depends on the type of workloads.
+
+ See the relevant chapter in the L1TF mitigation documentation for details:
+ :ref:`Documentation/admin-guide/hw-vuln/l1tf.rst <smt_control>`.
+
+
+.. _mds_mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows controlling the MDS mitigations at boot
+time with the option "mds=". The valid arguments for this option are:
+
+ ============ =============================================================
+ full If the CPU is vulnerable, enable all available mitigations
+ for the MDS vulnerability, CPU buffer clearing on exit to
+ userspace and when entering a VM. Idle transitions are
+ protected as well if SMT is enabled.
+
+ It does not automatically disable SMT.
+
+ off Disables MDS mitigations completely.
+
+ ============ =============================================================
+
+Not specifying this option is equivalent to "mds=full".
+
+
+Mitigation selection guide
+--------------------------
+
+1. Trusted userspace
+^^^^^^^^^^^^^^^^^^^^
+
+ If all userspace applications are from a trusted source and do not
+ execute untrusted code which is supplied externally, then the mitigation
+ can be disabled.
+
+
+2. Virtualization with trusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The same considerations as for trusted user space above apply.
+
+3. Virtualization with untrusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The protection depends on the state of the L1TF mitigations.
+ See :ref:`virt_mechanism`.
+
+ If the MDS mitigation is enabled and SMT is disabled, guest to host and
+ guest to guest attacks are prevented.
+
+.. _mds_default_mitigations:
+
+Default mitigations
+-------------------
+
+ The kernel default mitigations for vulnerable processors are:
+
+ - Enable CPU buffer clearing
+
+ The kernel does not by default enforce the disabling of SMT, which leaves
+ SMT systems vulnerable when running untrusted code. The same rationale as
+ for L1TF applies.
+ See :ref:`Documentation/admin-guide/hw-vuln//l1tf.rst <default_mitigations>`.
^ permalink raw reply [flat|nested] 89+ messages in thread
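The "CPU buffer clearing" mechanism the document above refers to boils down
to executing the VERW instruction, whose MD_CLEAR-updated microcode semantics
flush the affected buffers. A sketch of the helper, close to what the series
adds on the x86 side (the exact name and location may differ from the final
patches):

/*
 * Sketch of the VERW based clearing helper: with MD_CLEAR capable
 * microcode, VERW with a valid selector clears the affected CPU
 * buffers as a side effect.
 */
static inline void mds_clear_cpu_buffers(void)
{
	static const u16 ds = __KERNEL_DS;

	/*
	 * Has to be the memory operand form; only that form is
	 * documented to trigger the buffer clearing side effect.
	 */
	asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
}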
* Re: [patch V6 00/14] MDS basics 0
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (13 preceding siblings ...)
2019-03-01 21:47 ` [patch V6 14/14] MDS basics 14 Thomas Gleixner
@ 2019-03-01 23:48 ` Thomas Gleixner
2019-03-04 5:30 ` [MODERATED] Encrypted Message Jon Masters
15 siblings, 0 replies; 89+ messages in thread
From: Thomas Gleixner @ 2019-03-01 23:48 UTC (permalink / raw)
To: speck
[-- Attachment #1: Type: text/plain, Size: 127 bytes --]
On Fri, 1 Mar 2019, speck for Thomas Gleixner wrote:
>
> I'll send git bundles of the pile as well.
Attached.
Thanks,
tglx
[-- Attachment #2: Type: application/octet-stream, Size: 36555 bytes --]
[-- Attachment #3: Type: application/octet-stream, Size: 37958 bytes --]
[-- Attachment #4: Type: application/octet-stream, Size: 44988 bytes --]
[-- Attachment #5: Type: application/octet-stream, Size: 47246 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
` (14 preceding siblings ...)
2019-03-01 23:48 ` [patch V6 00/14] MDS basics 0 Thomas Gleixner
@ 2019-03-04 5:30 ` Jon Masters
15 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-04 5:30 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 130 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: [patch V6 00/14] MDS basics 0
[-- Attachment #2: Type: text/plain, Size: 1408 bytes --]
On 3/1/19 4:47 PM, speck for Thomas Gleixner wrote:
> Changes vs. V5:
>
> - Fix tools/ build (Josh)
>
> - Dropped the AIRMONT_MID change as it needs confirmation from Intel
>
> - Made the consolidated whitelist more readable and correct
>
> - Added the MSBDS only quirk for XEON PHI, made the idle flush
> depend on it and updated the sysfs output accordingly.
>
> - Fixed the protection matrix in the admin documentation and clarified
> the SMT situation vs. MSBDS only.
>
> - Updated the KVM/VMX changelog.
>
> Delta patch against V5 below.
>
> Available from git:
>
> cvs.ou.linutronix.de:linux/speck/linux WIP.mds
>
> The linux-4.20.y, linux-4.19.y and linux-4.14.y branches are updated as
> well and contain the untested backports of the pile for reference.
>
> I'll send git bundles of the pile as well.
Tested on Coffeelake with updated ucode successfully:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i7-8086K CPU @ 4.00GHz
stepping : 10
microcode : 0xae
[jcm@stephen ~]$ dmesg|grep MDS
[ 1.633165] MDS: Mitigation: Clear CPU buffers
[jcm@stephen ~]$ cat /sys/devices/system/cpu/vulnerabilities/mds
Mitigation: Clear CPU buffers; SMT vulnerable
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Starting to go public?
@ 2019-03-05 16:43 Linus Torvalds
2019-03-05 17:02 ` [MODERATED] " Andrew Cooper
2019-03-05 17:10 ` Jon Masters
0 siblings, 2 replies; 89+ messages in thread
From: Linus Torvalds @ 2019-03-05 16:43 UTC (permalink / raw)
To: speck
Looks like the papers are starting to leak:
https://arxiv.org/pdf/1903.00446.pdf
yes, yes, a lot of the attack seems to be about rowhammer, but the
"spolier" part looks like MDS.
Linus
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: Starting to go public?
2019-03-05 16:43 [MODERATED] Starting to go public? Linus Torvalds
@ 2019-03-05 17:02 ` Andrew Cooper
2019-03-05 20:36 ` Jiri Kosina
2019-03-05 17:10 ` Jon Masters
1 sibling, 1 reply; 89+ messages in thread
From: Andrew Cooper @ 2019-03-05 17:02 UTC (permalink / raw)
To: speck
[-- Attachment #1: Type: text/plain, Size: 598 bytes --]
On 05/03/2019 16:43, speck for Linus Torvalds wrote:
> Looks like the papers are starting to leak:
>
> https://arxiv.org/pdf/1903.00446.pdf
>
> yes, yes, a lot of the attack seems to be about rowhammer, but the
> "spolier" part looks like MDS.
So Intel was aware of that paper, but wasn't expecting it to go public
today.
From their point of view, it is a traditional timing sidechannel on a
piece of the pipeline (which happens to be a component which exists for
speculative memory disambiguation).
There are no proposed changes to the MDS timeline at this point.
~Andrew
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: Starting to go public?
2019-03-05 17:02 ` [MODERATED] " Andrew Cooper
@ 2019-03-05 20:36 ` Jiri Kosina
2019-03-05 22:31 ` Andrew Cooper
0 siblings, 1 reply; 89+ messages in thread
From: Jiri Kosina @ 2019-03-05 20:36 UTC (permalink / raw)
To: speck
On Tue, 5 Mar 2019, speck for Andrew Cooper wrote:
> > Looks like the papers are starting to leak:
> >
> > https://arxiv.org/pdf/1903.00446.pdf
> >
> > yes, yes, a lot of the attack seems to be about rowhammer, but the
> > "spoiler" part looks like MDS.
>
> So Intel was aware of that paper, but wasn't expecting it to go public
> today.
>
> From their point of view, it is a traditional timing sidechannel on a
> piece of the pipeline (which happens to be a component which exists for
> speculative memory disambiguation).
>
> There are no proposed changes to the MDS timeline at this point.
So this is not the paper that caused the panic fearing that PSF might leak
earlier than the rest of the issues in mid-February (which a few days later
Intel claimed to have successfully negotiated with the researchers not to
publish before the CRD)?
Thanks,
--
Jiri Kosina
SUSE Labs
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: Starting to go public?
2019-03-05 20:36 ` Jiri Kosina
@ 2019-03-05 22:31 ` Andrew Cooper
2019-03-06 16:18 ` [MODERATED] Encrypted Message Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Andrew Cooper @ 2019-03-05 22:31 UTC (permalink / raw)
To: speck
[-- Attachment #1: Type: text/plain, Size: 1109 bytes --]
On 05/03/2019 20:36, speck for Jiri Kosina wrote:
> On Tue, 5 Mar 2019, speck for Andrew Cooper wrote:
>
>>> Looks like the papers are starting to leak:
>>>
>>> https://arxiv.org/pdf/1903.00446.pdf
>>>
>>> yes, yes, a lot of the attack seems to be about rowhammer, but the
>>> "spoiler" part looks like MDS.
>> So Intel was aware of that paper, but wasn't expecting it to go public
>> today.
>>
>> From their point of view, it is a traditional timing sidechannel on a
>> piece of the pipeline (which happens to be a component which exists for
>> speculative memory disambiguation).
>>
>> There are no proposed changes to the MDS timeline at this point.
> So this is not the paper that caused the panic fearing that PSF might leak
> earlier than the rest of the issues in mid-February (which a few days later
> Intel claimed to have successfully negotiated with the researchers not to
> publish before the CRD)?
Correct.
The incident you are referring to is a researcher who definitely found
PSF, contacted Intel and was initially displeased at the proposed embargo.
~Andrew
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-03-05 22:31 ` Andrew Cooper
@ 2019-03-06 16:18 ` Jon Masters
0 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-06 16:18 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 121 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Andrew Cooper <speck@linutronix.de>
Subject: Re: Starting to go public?
[-- Attachment #2: Type: text/plain, Size: 1380 bytes --]
On 3/5/19 5:31 PM, speck for Andrew Cooper wrote:
> On 05/03/2019 20:36, speck for Jiri Kosina wrote:
>> On Tue, 5 Mar 2019, speck for Andrew Cooper wrote:
>>
>>>> Looks like the papers are starting to leak:
>>>>
>>>> https://arxiv.org/pdf/1903.00446.pdf
>>>>
>>>> yes, yes, a lot of the attack seems to be about rowhammer, but the
>>>> "spoiler" part looks like MDS.
>>> So Intel was aware of that paper, but wasn't expecting it to go public
>>> today.
>>>
>>> From their point of view, it is a traditional timing sidechannel on a
>>> piece of the pipeline (which happens to be a component which exists for
>>> speculative memory disambiguation).
>>>
>>> There are no proposed changes to the MDS timeline at this point.
>> So this is not the paper that caused the panic fearing that PSF might leak
>> earlier than the rest of the issues in mid-February (which a few days later
>> Intel claimed to have successfully negotiated with the researchers not to
>> publish before the CRD)?
>
> Correct.
>
> The incident you are referring to is a researcher who definitely found
> PSF, contacted Intel and was initially displeased at the proposed embargo.
Indeed. There are at least three different teams with papers that read
on MDS, and all of them are holding to the embargo.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-03-05 16:43 [MODERATED] Starting to go public? Linus Torvalds
2019-03-05 17:02 ` [MODERATED] " Andrew Cooper
@ 2019-03-05 17:10 ` Jon Masters
1 sibling, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-05 17:10 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 135 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Linus Torvalds <speck@linutronix.de>
Subject: NOT PUBLIC - Re: Starting to go public?
[-- Attachment #2: Type: text/plain, Size: 796 bytes --]
On 3/5/19 11:43 AM, speck for Linus Torvalds wrote:
> Looks like the papers are starting to leak:
>
> https://arxiv.org/pdf/1903.00446.pdf
>
> yes, yes, a lot of the attack seems to be about rowhammer, but the
> "spolier" part looks like MDS.
It's not, but it is close to finding PSF behavior. The thing they found
is described separately in one of the original Intel store patents. So we
are at risk but should not panic.
I've spoken with several researchers sitting on MDS papers and confirmed
that they are NOT concerned at this stage. Of course everyone is
carefully watching, and that's why we need to have a contingency plan. People
will start looking in this area (I know of three teams doing so) now.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH RFC 0/4] Proposed cmdline improvements
@ 2019-03-04 1:21 Josh Poimboeuf
2019-03-04 1:23 ` [MODERATED] [PATCH RFC 1/4] 1 Josh Poimboeuf
` (2 more replies)
0 siblings, 3 replies; 89+ messages in thread
From: Josh Poimboeuf @ 2019-03-04 1:21 UTC (permalink / raw)
To: speck
For MDS and SMT, I'd propose that we do something similar to what we did
for L1TF: a) add an mds=full,nosmt option; and b) add a printk warning
if SMT is enabled. That's the first three patches.
The last patch proposes a meta-option which is intended to make it
easier for users to choose sane mitigation defaults for all the
speculative vulnerabilities at once.
Josh Poimboeuf (4):
x86/speculation/mds: Add mds=full,nosmt cmdline option
x86/speculation: Move arch_smt_update() call to after mitigation
decisions
x86/speculation/mds: Add SMT warning message
x86/speculation: Add 'cpu_spec_mitigations=' cmdline options
Documentation/admin-guide/hw-vuln/mds.rst | 3 +
.../admin-guide/kernel-parameters.txt | 49 ++++++++++++-
arch/powerpc/kernel/security.c | 6 +-
arch/powerpc/kernel/setup_64.c | 2 +-
arch/s390/kernel/nospec-branch.c | 4 +-
arch/x86/include/asm/processor.h | 2 +
arch/x86/kernel/cpu/bugs.c | 68 ++++++++++++++++---
arch/x86/mm/pti.c | 3 +-
include/linux/cpu.h | 8 +++
kernel/cpu.c | 15 ++++
10 files changed, 144 insertions(+), 16 deletions(-)
--
2.17.2
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH RFC 1/4] 1
2019-03-04 1:21 [MODERATED] [PATCH RFC 0/4] Proposed cmdline improvements Josh Poimboeuf
@ 2019-03-04 1:23 ` Josh Poimboeuf
2019-03-04 3:55 ` [MODERATED] Encrypted Message Jon Masters
2019-03-04 7:30 ` [MODERATED] Re: [PATCH RFC 1/4] 1 Greg KH
2019-03-04 1:24 ` [MODERATED] [PATCH RFC 3/4] 3 Josh Poimboeuf
2019-03-04 1:25 ` [MODERATED] [PATCH RFC 4/4] 4 Josh Poimboeuf
2 siblings, 2 replies; 89+ messages in thread
From: Josh Poimboeuf @ 2019-03-04 1:23 UTC (permalink / raw)
To: speck
From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH RFC 1/4] x86/speculation/mds: Add mds=full,nosmt cmdline
option
Add the mds=full,nosmt cmdline option. This is like mds=full, but with
SMT disabled if the CPU is vulnerable.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
Documentation/admin-guide/hw-vuln/mds.rst | 3 +++
Documentation/admin-guide/kernel-parameters.txt | 6 ++++--
arch/x86/kernel/cpu/bugs.c | 10 ++++++++++
3 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/hw-vuln/mds.rst b/Documentation/admin-guide/hw-vuln/mds.rst
index 1de29d28903d..244ab47d1fb3 100644
--- a/Documentation/admin-guide/hw-vuln/mds.rst
+++ b/Documentation/admin-guide/hw-vuln/mds.rst
@@ -260,6 +260,9 @@ time with the option "mds=". The valid arguments for this option are:
It does not automatically disable SMT.
+ full,nosmt The same as mds=full, with SMT disabled on vulnerable
+ CPUs. This is the complete mitigation.
+
off Disables MDS mitigations completely.
============ =============================================================
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index dddb024eb523..55969f240f2e 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2372,8 +2372,10 @@
This parameter controls the MDS mitigation. The
options are:
- full - Enable MDS mitigation on vulnerable CPUs
- off - Unconditionally disable MDS mitigation
+ full - Enable MDS mitigation on vulnerable CPUs
+ full,nosmt - Enable MDS mitigation and disable
+ SMT on vulnerable CPUs
+ off - Unconditionally disable MDS mitigation
Not specifying this option is equivalent to
mds=full.
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index e11654f93e71..0c71ab0d57e3 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -221,6 +221,7 @@ static void x86_amd_ssb_disable(void)
/* Default mitigation for L1TF-affected CPUs */
static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_FULL;
+static bool mds_nosmt __ro_after_init = false;
static const char * const mds_strings[] = {
[MDS_MITIGATION_OFF] = "Vulnerable",
@@ -238,8 +239,13 @@ static void mds_select_mitigation(void)
if (mds_mitigation == MDS_MITIGATION_FULL) {
if (!boot_cpu_has(X86_FEATURE_MD_CLEAR))
mds_mitigation = MDS_MITIGATION_VMWERV;
+
static_branch_enable(&mds_user_clear);
+
+ if (mds_nosmt && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
+ cpu_smt_disable(false);
}
+
pr_info("%s\n", mds_strings[mds_mitigation]);
}
@@ -255,6 +261,10 @@ static int __init mds_cmdline(char *str)
mds_mitigation = MDS_MITIGATION_OFF;
else if (!strcmp(str, "full"))
mds_mitigation = MDS_MITIGATION_FULL;
+ else if (!strcmp(str, "full,nosmt")) {
+ mds_mitigation = MDS_MITIGATION_FULL;
+ mds_nosmt = true;
+ }
return 0;
}
--
2.17.2
^ permalink raw reply related [flat|nested] 89+ messages in thread
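For context: the mds_cmdline() handler extended by this patch is hooked into
boot command line processing with early_param(). A condensed sketch of the
resulting handler, reproduced from memory (the registration line sits outside
the hunk and details may differ):

static int __init mds_cmdline(char *str)
{
	if (!boot_cpu_has_bug(X86_BUG_MDS))
		return 0;

	if (!str)
		return -EINVAL;

	if (!strcmp(str, "off"))
		mds_mitigation = MDS_MITIGATION_OFF;
	else if (!strcmp(str, "full"))
		mds_mitigation = MDS_MITIGATION_FULL;
	else if (!strcmp(str, "full,nosmt")) {
		mds_mitigation = MDS_MITIGATION_FULL;
		mds_nosmt = true;
	}

	return 0;
}
early_param("mds", mds_cmdline);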
* [MODERATED] Encrypted Message
2019-03-04 1:23 ` [MODERATED] [PATCH RFC 1/4] 1 Josh Poimboeuf
@ 2019-03-04 3:55 ` Jon Masters
2019-03-04 7:30 ` [MODERATED] Re: [PATCH RFC 1/4] 1 Greg KH
1 sibling, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-04 3:55 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 117 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Josh Poimboeuf <speck@linutronix.de>
Subject: Re: [PATCH RFC 1/4] 1
[-- Attachment #2: Type: text/plain, Size: 1069 bytes --]
On 3/3/19 8:23 PM, speck for Josh Poimboeuf wrote:
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index e11654f93e71..0c71ab0d57e3 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -221,6 +221,7 @@ static void x86_amd_ssb_disable(void)
>
> /* Default mitigation for L1TF-affected CPUs */
> static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_FULL;
> +static bool mds_nosmt __ro_after_init = false;
>
> static const char * const mds_strings[] = {
> [MDS_MITIGATION_OFF] = "Vulnerable",
> @@ -238,8 +239,13 @@ static void mds_select_mitigation(void)
> if (mds_mitigation == MDS_MITIGATION_FULL) {
> if (!boot_cpu_has(X86_FEATURE_MD_CLEAR))
> mds_mitigation = MDS_MITIGATION_VMWERV;
> +
> static_branch_enable(&mds_user_clear);
> +
> + if (mds_nosmt && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
> + cpu_smt_disable(false);
Is there some logic missing here to disable SMT?
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
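For reference, cpu_smt_disable(false) is the SMT disabling logic the patch
relies on: it flips the global SMT control state so that the CPU hotplug
code refuses to online non-primary siblings. A paraphrased sketch of that
helper from kernel/cpu.c (from memory; details may differ):

void __init cpu_smt_disable(bool force)
{
	if (cpu_smt_control == CPU_SMT_FORCE_DISABLED ||
	    cpu_smt_control == CPU_SMT_NOT_SUPPORTED)
		return;

	if (force) {
		/* 'forceoff': cannot be undone via sysfs */
		pr_info("SMT: Force disabled\n");
		cpu_smt_control = CPU_SMT_FORCE_DISABLED;
	} else {
		pr_info("SMT: disabled\n");
		cpu_smt_control = CPU_SMT_DISABLED;
	}
}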
* [MODERATED] Re: [PATCH RFC 1/4] 1
2019-03-04 1:23 ` [MODERATED] [PATCH RFC 1/4] 1 Josh Poimboeuf
2019-03-04 3:55 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-03-04 7:30 ` Greg KH
2019-03-04 7:45 ` [MODERATED] Encrypted Message Jon Masters
1 sibling, 1 reply; 89+ messages in thread
From: Greg KH @ 2019-03-04 7:30 UTC (permalink / raw)
To: speck
On Sun, Mar 03, 2019 at 07:23:22PM -0600, speck for Josh Poimboeuf wrote:
> From: Josh Poimboeuf <jpoimboe@redhat.com>
> Subject: [PATCH RFC 1/4] x86/speculation/mds: Add mds=full,nosmt cmdline
> option
>
> Add the mds=full,nosmt cmdline option. This is like mds=full, but with
> SMT disabled if the CPU is vulnerable.
>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
> Documentation/admin-guide/hw-vuln/mds.rst | 3 +++
> Documentation/admin-guide/kernel-parameters.txt | 6 ++++--
> arch/x86/kernel/cpu/bugs.c | 10 ++++++++++
> 3 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/hw-vuln/mds.rst b/Documentation/admin-guide/hw-vuln/mds.rst
> index 1de29d28903d..244ab47d1fb3 100644
> --- a/Documentation/admin-guide/hw-vuln/mds.rst
> +++ b/Documentation/admin-guide/hw-vuln/mds.rst
> @@ -260,6 +260,9 @@ time with the option "mds=". The valid arguments for this option are:
>
> It does not automatically disable SMT.
>
> + full,nosmt The same as mds=full, with SMT disabled on vulnerable
> + CPUs. This is the complete mitigation.
While I understand the intention, the number of different combinations
we are "offering" to userspace here is huge, and everyone is going to be
confused as to what to do. If we really think/say that SMT is a major
issue for this, why don't we just have "full" disable SMT?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-03-04 7:30 ` [MODERATED] Re: [PATCH RFC 1/4] 1 Greg KH
@ 2019-03-04 7:45 ` Jon Masters
0 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-04 7:45 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 110 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Greg KH <speck@linutronix.de>
Subject: Re: [PATCH RFC 1/4] 1
[-- Attachment #2: Type: text/plain, Size: 1867 bytes --]
On 3/4/19 2:30 AM, speck for Greg KH wrote:
> On Sun, Mar 03, 2019 at 07:23:22PM -0600, speck for Josh Poimboeuf wrote:
>> From: Josh Poimboeuf <jpoimboe@redhat.com>
>> Subject: [PATCH RFC 1/4] x86/speculation/mds: Add mds=full,nosmt cmdline
>> option
>>
>> Add the mds=full,nosmt cmdline option. This is like mds=full, but with
>> SMT disabled if the CPU is vulnerable.
>>
>> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
>> ---
>> Documentation/admin-guide/hw-vuln/mds.rst | 3 +++
>> Documentation/admin-guide/kernel-parameters.txt | 6 ++++--
>> arch/x86/kernel/cpu/bugs.c | 10 ++++++++++
>> 3 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/hw-vuln/mds.rst b/Documentation/admin-guide/hw-vuln/mds.rst
>> index 1de29d28903d..244ab47d1fb3 100644
>> --- a/Documentation/admin-guide/hw-vuln/mds.rst
>> +++ b/Documentation/admin-guide/hw-vuln/mds.rst
>> @@ -260,6 +260,9 @@ time with the option "mds=". The valid arguments for this option are:
>>
>> It does not automatically disable SMT.
>>
>> + full,nosmt The same as mds=full, with SMT disabled on vulnerable
>> + CPUs. This is the complete mitigation.
>
> While I understand the intention, the number of different combinations
> we are "offering" to userspace here is huge, and everyone is going to be
> confused as to what to do. If we really think/say that SMT is a major
> issue for this, why don't we just have "full" disable SMT?
Frankly, it ought to, for safety (it can't be made safe). The reason cited
for not doing so (Thomas and Linus can speak up on this part) was
upgrades vs. new installs: the concern was not to break existing users by
halving their logical CPU count when they upgrade a kernel.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH RFC 3/4] 3
2019-03-04 1:21 [MODERATED] [PATCH RFC 0/4] Proposed cmdline improvements Josh Poimboeuf
2019-03-04 1:23 ` [MODERATED] [PATCH RFC 1/4] 1 Josh Poimboeuf
@ 2019-03-04 1:24 ` Josh Poimboeuf
2019-03-04 3:58 ` [MODERATED] Encrypted Message Jon Masters
2019-03-04 1:25 ` [MODERATED] [PATCH RFC 4/4] 4 Josh Poimboeuf
2 siblings, 1 reply; 89+ messages in thread
From: Josh Poimboeuf @ 2019-03-04 1:24 UTC (permalink / raw)
To: speck
From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH RFC 3/4] x86/speculation/mds: Add SMT warning message
MDS remains exploitable across SMT siblings, so the mitigation is
incomplete while SMT is enabled. Make that clear with a one-time printk
whenever SMT first gets enabled.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
arch/x86/kernel/cpu/bugs.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 9e20aef01d38..346f0f05879d 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -691,6 +691,8 @@ static void update_mds_branch_idle(void)
static_branch_disable(&mds_idle_clear);
}
+#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
+
void arch_smt_update(void)
{
/* Enhanced IBRS implies STIBP. No update required. */
@@ -715,6 +717,8 @@ void arch_smt_update(void)
switch(mds_mitigation) {
case MDS_MITIGATION_FULL:
case MDS_MITIGATION_VMWERV:
+ if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
+ pr_warn_once(MDS_MSG_SMT);
update_mds_branch_idle();
break;
case MDS_MITIGATION_OFF:
--
2.17.2
^ permalink raw reply related [flat|nested] 89+ messages in thread
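The update_mds_branch_idle() helper referenced by the hunk above is part of
the base series; it flips the mds_idle_clear static key with the SMT state,
roughly like this (sketch, reconstructed from the diff context above):

static void update_mds_branch_idle(void)
{
	if (!boot_cpu_has_bug(X86_BUG_MDS))
		return;

	/* Idle clearing is only useful while sibling threads exist */
	if (sched_smt_active())
		static_branch_enable(&mds_idle_clear);
	else
		static_branch_disable(&mds_idle_clear);
}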
* [MODERATED] Encrypted Message
2019-03-04 1:24 ` [MODERATED] [PATCH RFC 3/4] 3 Josh Poimboeuf
@ 2019-03-04 3:58 ` Jon Masters
2019-03-04 17:17 ` [MODERATED] " Josh Poimboeuf
0 siblings, 1 reply; 89+ messages in thread
From: Jon Masters @ 2019-03-04 3:58 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 117 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Josh Poimboeuf <speck@linutronix.de>
Subject: Re: [PATCH RFC 3/4] 3
[-- Attachment #2: Type: text/plain, Size: 445 bytes --]
On 3/3/19 8:24 PM, speck for Josh Poimboeuf wrote:
> + if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
> + pr_warn_once(MDS_MSG_SMT);
It's never fully safe to use SMT. I get that if we only had MSBDS then
it's unlikely we'll hit e.g. the power state change cases needed to
exploit it, but I think it would be prudent to display something anyway?
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: Encrypted Message
2019-03-04 3:58 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-03-04 17:17 ` Josh Poimboeuf
2019-03-06 16:22 ` [MODERATED] " Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Josh Poimboeuf @ 2019-03-04 17:17 UTC (permalink / raw)
To: speck
On Sun, Mar 03, 2019 at 10:58:01PM -0500, speck for Jon Masters wrote:
> On 3/3/19 8:24 PM, speck for Josh Poimboeuf wrote:
>
> > + if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
> > + pr_warn_once(MDS_MSG_SMT);
>
> It's never fully safe to use SMT. I get that if we only had MSBDS then
> it's unlikely we'll hit e.g. the power state change cases needed to
> exploit it, but I think it would be prudent to display something anyway?
My understanding is that the idle state changes are mitigated elsewhere
in the MDS patches, so it should be safe in theory.
--
Josh
^ permalink raw reply [flat|nested] 89+ messages in thread
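The "elsewhere" is the idle entry hook of the base MDS series: with the
mds_idle_clear key enabled, the idle path flushes the CPU buffers via VERW
before going to sleep. Roughly (a sketch of the series' helpers; the exact
form in the tree may differ):

static inline void mds_clear_cpu_buffers(void)
{
	static const u16 ds = __KERNEL_DS;

	/*
	 * The memory-operand form of VERW is the one documented to
	 * flush the buffers; "cc" because VERW clobbers ZF.
	 */
	asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
}

static inline void mds_idle_clear_cpu_buffers(void)
{
	if (static_branch_likely(&mds_idle_clear))
		mds_clear_cpu_buffers();
}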
* [MODERATED] Encrypted Message
2019-03-04 17:17 ` [MODERATED] " Josh Poimboeuf
@ 2019-03-06 16:22 ` Jon Masters
0 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-06 16:22 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 117 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Josh Poimboeuf <speck@linutronix.de>
Subject: Re: Encrypted Message
[-- Attachment #2: Type: text/plain, Size: 778 bytes --]
On 3/4/19 12:17 PM, speck for Josh Poimboeuf wrote:
> On Sun, Mar 03, 2019 at 10:58:01PM -0500, speck for Jon Masters wrote:
>
>> On 3/3/19 8:24 PM, speck for Josh Poimboeuf wrote:
>>
>>> + if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
>>> + pr_warn_once(MDS_MSG_SMT);
>>
>> It's never fully safe to use SMT. I get that if we only had MSBDS then
>> it's unlikely we'll hit e.g. the power state change cases needed to
>> exploit it, but I think it would be prudent to display something anyway?
>
> My understanding is that the idle state changes are mitigated elsewhere
> in the MDS patches, so it should be safe in theory.
Looked at it again. Agree. Sorry about that.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH RFC 4/4] 4
2019-03-04 1:21 [MODERATED] [PATCH RFC 0/4] Proposed cmdline improvements Josh Poimboeuf
2019-03-04 1:23 ` [MODERATED] [PATCH RFC 1/4] 1 Josh Poimboeuf
2019-03-04 1:24 ` [MODERATED] [PATCH RFC 3/4] 3 Josh Poimboeuf
@ 2019-03-04 1:25 ` Josh Poimboeuf
2019-03-04 4:07 ` [MODERATED] Encrypted Message Jon Masters
2 siblings, 1 reply; 89+ messages in thread
From: Josh Poimboeuf @ 2019-03-04 1:25 UTC (permalink / raw)
To: speck
From: Josh Poimboeuf <jpoimboe@redhat.com>
Subject: [PATCH RFC 4/4] x86/speculation: Add 'cpu_spec_mitigations=' cmdline
options
Keeping track of the number of mitigations for all the CPU speculation
bugs has become overwhelming for many users. It's getting more and more
complicated to decide what mitigations are needed for a given
architecture.
Most users fall into a few basic categories:
- want all mitigations off;
- want all reasonable mitigations on, with SMT enabled even if it's
vulnerable; or
- want all reasonable mitigations on, with SMT disabled if vulnerable.
Define a set of curated, arch-independent options, each of which is an
aggregation of existing options:
- cpu_spec_mitigations=off: Disable all mitigations.
- cpu_spec_mitigations=auto: [default] Enable all the default mitigations,
but leave SMT enabled, even if it's vulnerable.
- cpu_spec_mitigations=auto,nosmt: Enable all the default mitigations,
disabling SMT if needed by a mitigation.
See the documentation for more details.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
.../admin-guide/kernel-parameters.txt | 43 ++++++++++++++++
arch/powerpc/kernel/security.c | 6 +--
arch/powerpc/kernel/setup_64.c | 2 +-
arch/s390/kernel/nospec-branch.c | 4 +-
arch/x86/include/asm/processor.h | 2 +
arch/x86/kernel/cpu/bugs.c | 51 ++++++++++++++++---
arch/x86/mm/pti.c | 3 +-
include/linux/cpu.h | 8 +++
kernel/cpu.c | 15 ++++++
9 files changed, 122 insertions(+), 12 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 55969f240f2e..c2dba60630e4 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2537,6 +2537,49 @@
in the "bleeding edge" mini2440 support kernel at
http://repo.or.cz/w/linux-2.6/mini2440.git
+ cpu_spec_mitigations=
+ [KNL] Control mitigations for CPU speculation
+ vulnerabilities on affected CPUs. This is a set of
+ curated, arch-independent options, each of which is an
+ aggregation of existing options.
+
+ off
+ Disable all speculative CPU mitigations.
+ Equivalent to: nopti
+ nospectre_v1
+ nospectre_v2
+ spectre_v2_user=off
+ nobp=0
+ spec_store_bypass_disable=off
+ l1tf=off
+ mds=off
+
+ auto (default)
+ Mitigate all speculative CPU vulnerabilities,
+ but leave SMT enabled, even if it's vulnerable.
+ This is useful for users who don't want to be
+ surprised by SMT getting disabled across kernel
+ upgrades, or who have other ways of avoiding
+ SMT-based attacks.
+ Equivalent to: pti=auto
+ spectre_v2=auto
+ spectre_v2_user=auto
+ spec_store_bypass_disable=auto
+ l1tf=flush
+ mds=full
+
+ auto,nosmt
+ Mitigate all speculative CPU vulnerabilities,
+ disabling SMT if needed. This is for users who
+ always want to be fully mitigated, even if it
+ means losing SMT.
+ Equivalent to: pti=auto
+ spectre_v2=auto
+ spectre_v2_user=auto
+ spec_store_bypass_disable=auto
+ l1tf=flush,nosmt
+ mds=full,nosmt
+
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
parameter allows control of the logging verbosity for
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index 9b8631533e02..be4266a57e54 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -57,7 +57,7 @@ void setup_barrier_nospec(void)
enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR);
- if (!no_nospec)
+ if (!no_nospec && cpu_spec_mitigations != CPU_SPEC_MITIGATIONS_OFF)
enable_barrier_nospec(enable);
}
@@ -116,7 +116,7 @@ static int __init handle_nospectre_v2(char *p)
early_param("nospectre_v2", handle_nospectre_v2);
void setup_spectre_v2(void)
{
- if (no_spectrev2)
+ if (no_spectrev2 || cpu_spec_mitigations == CPU_SPEC_MITIGATIONS_OFF)
do_btb_flush_fixups();
else
btb_flush_enabled = true;
@@ -307,7 +307,7 @@ void setup_stf_barrier(void)
stf_enabled_flush_types = type;
- if (!no_stf_barrier)
+ if (!no_stf_barrier && cpu_spec_mitigations != CPU_SPEC_MITIGATIONS_OFF)
stf_barrier_enable(enable);
}
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 236c1151a3a7..5fe43bcde325 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -958,7 +958,7 @@ void setup_rfi_flush(enum l1d_flush_type types, bool enable)
enabled_flush_types = types;
- if (!no_rfi_flush)
+ if (!no_rfi_flush && cpu_spec_mitigations != CPU_SPEC_MITIGATIONS_OFF)
rfi_flush_enable(enable);
}
diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c
index bdddaae96559..c40eb672b43a 100644
--- a/arch/s390/kernel/nospec-branch.c
+++ b/arch/s390/kernel/nospec-branch.c
@@ -1,6 +1,7 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/module.h>
#include <linux/device.h>
+#include <linux/cpu.h>
#include <asm/nospec-branch.h>
static int __init nobp_setup_early(char *str)
@@ -58,7 +59,8 @@ early_param("nospectre_v2", nospectre_v2_setup_early);
void __init nospec_auto_detect(void)
{
- if (test_facility(156)) {
+ if (test_facility(156) ||
+ cpu_spec_mitigations == CPU_SPEC_MITIGATIONS_OFF) {
/*
* The machine supports etokens.
* Disable expolines and disable nobp.
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index aca1ef8cc79f..bb2ced3a491e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -983,6 +983,7 @@ void microcode_check(void);
enum l1tf_mitigations {
L1TF_MITIGATION_OFF,
+ L1TF_MITIGATION_DEFAULT,
L1TF_MITIGATION_FLUSH_NOWARN,
L1TF_MITIGATION_FLUSH,
L1TF_MITIGATION_FLUSH_NOSMT,
@@ -994,6 +995,7 @@ extern enum l1tf_mitigations l1tf_mitigation;
enum mds_mitigations {
MDS_MITIGATION_OFF,
+ MDS_MITIGATION_DEFAULT,
MDS_MITIGATION_FULL,
MDS_MITIGATION_VMWERV,
};
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 346f0f05879d..7354daf3555f 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -222,7 +222,7 @@ static void x86_amd_ssb_disable(void)
#define pr_fmt(fmt) "MDS: " fmt
/* Default mitigation for L1TF-affected CPUs */
-static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_FULL;
+static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_DEFAULT;
static bool mds_nosmt __ro_after_init = false;
static const char * const mds_strings[] = {
@@ -238,6 +238,20 @@ static void mds_select_mitigation(void)
return;
}
+ if (mds_mitigation == MDS_MITIGATION_DEFAULT) {
+ switch (cpu_spec_mitigations) {
+ case CPU_SPEC_MITIGATIONS_OFF:
+ mds_mitigation = MDS_MITIGATION_OFF;
+ break;
+ case CPU_SPEC_MITIGATIONS_AUTO_NOSMT:
+ mds_nosmt = true;
+ /* fallthrough */
+ case CPU_SPEC_MITIGATIONS_AUTO:
+ mds_mitigation = MDS_MITIGATION_FULL;
+ break;
+ }
+ }
+
if (mds_mitigation == MDS_MITIGATION_FULL) {
if (!boot_cpu_has(X86_FEATURE_MD_CLEAR))
mds_mitigation = MDS_MITIGATION_VMWERV;
@@ -374,8 +388,11 @@ spectre_v2_parse_user_cmdline(enum spectre_v2_mitigation_cmd v2_cmd)
ret = cmdline_find_option(boot_command_line, "spectre_v2_user",
arg, sizeof(arg));
- if (ret < 0)
+ if (ret < 0) {
+ if (cpu_spec_mitigations == CPU_SPEC_MITIGATIONS_OFF)
+ return SPECTRE_V2_USER_CMD_NONE;
return SPECTRE_V2_USER_CMD_AUTO;
+ }
for (i = 0; i < ARRAY_SIZE(v2_user_options); i++) {
if (match_option(arg, ret, v2_user_options[i].option)) {
@@ -510,8 +527,11 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
return SPECTRE_V2_CMD_NONE;
ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
- if (ret < 0)
+ if (ret < 0) {
+ if (cpu_spec_mitigations == CPU_SPEC_MITIGATIONS_OFF)
+ return SPECTRE_V2_CMD_NONE;
return SPECTRE_V2_CMD_AUTO;
+ }
for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
if (!match_option(arg, ret, mitigation_options[i].option))
@@ -716,9 +736,10 @@ void arch_smt_update(void)
switch(mds_mitigation) {
case MDS_MITIGATION_FULL:
+ case MDS_MITIGATION_DEFAULT:
case MDS_MITIGATION_VMWERV:
if (sched_smt_active() && !boot_cpu_has(X86_BUG_MSBDS_ONLY))
- pr_warn_once(MDS_MSG_SMT);
+ printk_once(KERN_WARNING MDS_MSG_SMT);
update_mds_branch_idle();
break;
case MDS_MITIGATION_OFF:
@@ -771,8 +792,11 @@ static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void)
} else {
ret = cmdline_find_option(boot_command_line, "spec_store_bypass_disable",
arg, sizeof(arg));
- if (ret < 0)
+ if (ret < 0) {
+ if (cpu_spec_mitigations == CPU_SPEC_MITIGATIONS_OFF)
+ return SPEC_STORE_BYPASS_CMD_NONE;
return SPEC_STORE_BYPASS_CMD_AUTO;
+ }
for (i = 0; i < ARRAY_SIZE(ssb_mitigation_options); i++) {
if (!match_option(arg, ret, ssb_mitigation_options[i].option))
@@ -1037,7 +1061,7 @@ void x86_spec_ctrl_setup_ap(void)
#define pr_fmt(fmt) "L1TF: " fmt
/* Default mitigation for L1TF-affected CPUs */
-enum l1tf_mitigations l1tf_mitigation __ro_after_init = L1TF_MITIGATION_FLUSH;
+enum l1tf_mitigations l1tf_mitigation __ro_after_init = L1TF_MITIGATION_DEFAULT;
#if IS_ENABLED(CONFIG_KVM_INTEL)
EXPORT_SYMBOL_GPL(l1tf_mitigation);
#endif
@@ -1092,8 +1116,23 @@ static void __init l1tf_select_mitigation(void)
override_cache_bits(&boot_cpu_data);
+ if (l1tf_mitigation == L1TF_MITIGATION_DEFAULT) {
+ switch (cpu_spec_mitigations) {
+ case CPU_SPEC_MITIGATIONS_OFF:
+ l1tf_mitigation = L1TF_MITIGATION_OFF;
+ break;
+ case CPU_SPEC_MITIGATIONS_AUTO:
+ l1tf_mitigation = L1TF_MITIGATION_FLUSH;
+ break;
+ case CPU_SPEC_MITIGATIONS_AUTO_NOSMT:
+ l1tf_mitigation = L1TF_MITIGATION_FLUSH_NOSMT;
+ break;
+ }
+ }
+
switch (l1tf_mitigation) {
case L1TF_MITIGATION_OFF:
+ case L1TF_MITIGATION_DEFAULT:
case L1TF_MITIGATION_FLUSH_NOWARN:
case L1TF_MITIGATION_FLUSH:
break;
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 4fee5c3003ed..943b641bc003 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -115,7 +115,8 @@ void __init pti_check_boottime_disable(void)
}
}
- if (cmdline_find_option_bool(boot_command_line, "nopti")) {
+ if (cmdline_find_option_bool(boot_command_line, "nopti") ||
+ cpu_spec_mitigations == CPU_SPEC_MITIGATIONS_OFF) {
pti_mode = PTI_FORCE_OFF;
pti_print_if_insecure("disabled on command line.");
return;
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 3c87ad888ed3..6cdd3d5228d3 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -189,4 +189,12 @@ static inline void cpu_smt_disable(bool force) { }
static inline void cpu_smt_check_topology(void) { }
#endif
+enum cpu_spec_mitigations {
+ CPU_SPEC_MITIGATIONS_OFF,
+ CPU_SPEC_MITIGATIONS_AUTO,
+ CPU_SPEC_MITIGATIONS_AUTO_NOSMT,
+};
+
+extern enum cpu_spec_mitigations cpu_spec_mitigations;
+
#endif /* _LINUX_CPU_H_ */
diff --git a/kernel/cpu.c b/kernel/cpu.c
index d1c6d152da89..136d33fb90e5 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2279,3 +2279,18 @@ void __init boot_cpu_hotplug_init(void)
#endif
this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
}
+
+enum cpu_spec_mitigations cpu_spec_mitigations __ro_after_init = CPU_SPEC_MITIGATIONS_AUTO;
+
+static int __init cpu_spec_mitigations_setup(char *arg)
+{
+ if (!strcmp(arg, "off"))
+ cpu_spec_mitigations = CPU_SPEC_MITIGATIONS_OFF;
+ else if (!strcmp(arg, "auto"))
+ cpu_spec_mitigations = CPU_SPEC_MITIGATIONS_AUTO;
+ else if (!strcmp(arg, "auto,nosmt"))
+ cpu_spec_mitigations = CPU_SPEC_MITIGATIONS_AUTO_NOSMT;
+
+ return 0;
+}
+early_param("cpu_spec_mitigations", cpu_spec_mitigations_setup);
--
2.17.2
^ permalink raw reply related [flat|nested] 89+ messages in thread
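A hypothetical sketch of how another architecture (e.g. arm64, once it
grows such knobs) would consume the new global switch; the my_arch_* names
are illustrative, only cpu_spec_mitigations, the enum values and
cpu_smt_disable() come from the patch:

void __init my_arch_select_mitigation(void)
{
	/* an explicit arch-specific cmdline knob always wins */
	if (my_arch_mitigation_off)
		return;

	if (cpu_spec_mitigations == CPU_SPEC_MITIGATIONS_OFF)
		return;

	my_arch_enable_mitigation();

	/* only tear down SMT when the user asked for auto,nosmt */
	if (cpu_spec_mitigations == CPU_SPEC_MITIGATIONS_AUTO_NOSMT)
		cpu_smt_disable(false);
}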
* [MODERATED] Encrypted Message
2019-03-04 1:25 ` [MODERATED] [PATCH RFC 4/4] 4 Josh Poimboeuf
@ 2019-03-04 4:07 ` Jon Masters
0 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-04 4:07 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 117 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Josh Poimboeuf <speck@linutronix.de>
Subject: Re: [PATCH RFC 4/4] 4
[-- Attachment #2: Type: text/plain, Size: 1461 bytes --]
On 3/3/19 8:25 PM, speck for Josh Poimboeuf wrote:
> From: Josh Poimboeuf <jpoimboe@redhat.com>
> Subject: [PATCH RFC 4/4] x86/speculation: Add 'cpu_spec_mitigations=' cmdline
> options
>
> Keeping track of the number of mitigations for all the CPU speculation
> bugs has become overwhelming for many users. It's getting more and more
> complicated to decide what mitigations are needed for a given
> architecture.
>
> Most users fall into a few basic categories:
>
> - want all mitigations off;
>
> - want all reasonable mitigations on, with SMT enabled even if it's
> vulnerable; or
>
> - want all reasonable mitigations on, with SMT disabled if vulnerable.
>
> Define a set of curated, arch-independent options, each of which is an
> aggregation of existing options:
>
> - cpu_spec_mitigations=off: Disable all mitigations.
>
> - cpu_spec_mitigations=auto: [default] Enable all the default mitigations,
> but leave SMT enabled, even if it's vulnerable.
>
> - cpu_spec_mitigations=auto,nosmt: Enable all the default mitigations,
> disabling SMT if needed by a mitigation.
>
> See the documentation for more details.
Looks good. There's an effort to upstream mitigation controls for the
arm64 but that's not in place yet. They'll want to wire that up later. I
actually had missed the s390x etokens work so that was fun to see here.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH v6 00/43] MDSv6
@ 2019-02-24 15:07 Andi Kleen
2019-02-24 15:07 ` [MODERATED] [PATCH v6 10/43] MDSv6 Andi Kleen
2019-02-24 15:07 ` [MODERATED] [PATCH v6 31/43] MDSv6 Andi Kleen
0 siblings, 2 replies; 89+ messages in thread
From: Andi Kleen @ 2019-02-24 15:07 UTC (permalink / raw)
To: speck; +Cc: Andi Kleen
Here's a new version of flushing CPU buffers for group 4
(single thread).
I'm mainly interested in feedback on the lazy approach,
so please focus on the later patches.
There didn't seem to be much interest in it, so I wonder whether it
still makes sense to continue with it, or whether we could
just stay with the full approach.
The lazy approach is faster, but not by that much,
and may not be worth the short- and long-term impact
all over the tree.
This version is based on my earlier base patches, with the
mds=full implementation at the beginning and a lazy
implementation building on top of it. The series can
be rebased onto the rewrite once that matures.
Even the base has some features not in Thomas' version which would
need to be ported (e.g. more complete virtualization support
and eBPF mitigation).
This series implements the "full tree audit" approach that
was suggested by several reviewers. We (Mark Gross and I)
went through most asynchronous code in the kernel and marked the
functions that touch user or IO data, so most asynchronous
interrupts etc. do not schedule a clear. However, this would
need to be continuously enforced for new code too.
It also implements various other review suggestions
and improvements; clearcpu.txt is now clarified in many ways.
Before reviewing, please read Documentation/clearcpu.txt.
Some performance data for lazy:
Kernel build: ~+1% (slightly faster, but that's within noise)
loopback apache -1% (within noise)
ebizzy -0.3% (within noise)
aim7 -5.0%
netperf rr -0.7%
netperf stream 0.0%
In comparison an older version of mds=full showed:
kernel build -2.4%
ebizzy -3.3%
apache loopback -10.0%
For networking workloads there is practically no regression now.
AIM7 is showing some regression. I assume this is due to the context
switch overhead.
mds=full is a bit slower, but not that much. The only real outlier is
apache loopback, which is probably not too realistic a workload
because it mainly does tight loops over some syscalls.
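Mechanically, the lazy approach boils down to deferring the VERW flush to
the next kernel exit behind a cheap per-cpu flag, roughly like this
(hypothetical sketch; the flag and helper names here are illustrative, the
series' real entry point is lazy_clear_cpu()):

DEFINE_PER_CPU(bool, mds_clear_pending);

/* cheap enough for hot paths: just set a flag, no VERW yet */
static inline void lazy_clear_cpu(void)
{
	this_cpu_write(mds_clear_pending, true);
}

/* run on the kernel exit / guest entry path */
static inline void mds_maybe_clear_cpu(void)
{
	if (this_cpu_read(mds_clear_pending)) {
		this_cpu_write(mds_clear_pending, false);
		mds_clear_cpu_buffers();	/* VERW-based flush */
	}
}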
No changelog against previous versions, too many changes.
Andi Kleen (42):
x86/speculation/mds: Add basic bug infrastructure for MDS
x86/speculation/mds: Clear CPU on every kernel exit
x86/speculation/mds: Clear CPU buffers on entering idle
x86/speculation/mds: Add command line options to control mds
x86/speculation/mds: Add sysfs reporting
mds: Add some administrator documentation
x86/speculation/mds: Export MD_CLEAR CPUID to KVM guests.
x86/cpufeatures: Add word 20 for additional features
x86/speculation/mds: Handle VMENTRY clear for CPUs without l1tf
mds: Add documentation for clear cpu usage
x86/speculation/mds: Introduce lazy_clear_cpu
x86/speculation/mds: Add basic implementation of mds=full
x86/speculation/mds: Check lazy clear in kernel exit
x86/speculation/mds: Add tracing for clear_cpu
x86/speculation/mds: Schedule cpu clear on context switch
mds: Force clear cpu on kernel preemption
mds: Clear cpu in memzero_explicit and kzfree
mds: Support cpu clear in interrupts
mds: Support cpu clear after tasklets
mds: Support cpu clearing in timers
mds: Clear cpu for string io/memcpy_*io in interrupts
mds: Schedule clear cpu in swiotlb
mds: Instrument skb functions to clear cpu automatically
mds: Clear cpu for kmap_atomic in interrupts
mds: Support cpu clearing for BPF
mds sweep: Schedule clear cpus in sound core
mds sweep: Make MPU401 interrupts clear cpu
mds sweep: Clear cpu on processing input layer data
mds sweep: Clear cpu for tty input
mds sweep: Clear cpu for usbmon intercepts
mds sweep: Clear cpu in some Xen drivers
mds sweep: Clear cpu in DVB software filters
mds sweep: Mark all DRM interrupts to clear cpu
mds sweep: Make all old style IDE driver interrupts clear cpu
mds sweep: Make Amazon ena driver management interrupt clear cpu
mds sweep: Make all PCMCIA interrupts clear cpu
mds sweep: Mark common functions in comedi as clear cpu
mds sweep: Make usb hcd poll clear cpu
x86/speculation/mds: Switch mds=auto to lazy
mds sweep: Mark interrupts that touch user data
mds sweep: Mark timer handlers that touch user data
mds sweep: Mark tasklets that touch user data
Mark Gross (1):
mds sweep: Clear cpu in sg_copy_from_buffer for SCSI
.../ABI/testing/sysfs-devices-system-cpu | 1 +
.../admin-guide/kernel-parameters.txt | 11 +
Documentation/admin-guide/mds.rst | 95 +++++++
Documentation/clearcpu.txt | 261 ++++++++++++++++++
arch/Kconfig | 3 +
arch/x86/Kconfig | 1 +
arch/x86/entry/common.c | 8 +
arch/x86/events/intel/uncore.c | 3 +-
arch/x86/include/asm/clearbpf.h | 29 ++
arch/x86/include/asm/clearcpu.h | 83 ++++++
arch/x86/include/asm/cpufeature.h | 6 +-
arch/x86/include/asm/cpufeatures.h | 9 +-
arch/x86/include/asm/disabled-features.h | 3 +-
arch/x86/include/asm/floppy.h | 6 +-
arch/x86/include/asm/io.h | 3 +
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/required-features.h | 3 +-
arch/x86/include/asm/trace/clearcpu.h | 27 ++
arch/x86/kernel/acpi/cstate.c | 2 +
arch/x86/kernel/cpu/bugs.c | 82 ++++++
arch/x86/kernel/cpu/common.c | 25 ++
arch/x86/kernel/kvm.c | 3 +
arch/x86/kernel/nmi.c | 6 +-
arch/x86/kernel/process.c | 5 +
arch/x86/kernel/process.h | 1 +
arch/x86/kernel/smpboot.c | 3 +
arch/x86/kvm/cpuid.c | 3 +-
arch/x86/kvm/vmx/vmx.c | 20 +-
arch/x86/mm/highmem_32.c | 3 +
arch/x86/mm/tlb.c | 14 +
drivers/acpi/acpi_pad.c | 2 +
drivers/acpi/processor_idle.c | 3 +
drivers/atm/eni.c | 3 +-
drivers/atm/he.c | 3 +-
drivers/atm/lanai.c | 4 +-
drivers/atm/nicstar.c | 4 +-
drivers/auxdisplay/img-ascii-lcd.c | 2 +-
drivers/base/cpu.c | 8 +
drivers/block/xsysace.c | 5 +-
drivers/char/ipmi/ipmi_si_intf.c | 6 +-
drivers/char/sonypi.c | 3 +-
drivers/crypto/ixp4xx_crypto.c | 3 +-
drivers/crypto/qat/qat_common/adf_isr.c | 7 +-
drivers/crypto/qat/qat_common/adf_sriov.c | 6 +-
drivers/crypto/qat/qat_common/adf_vf_isr.c | 10 +-
drivers/dma/dw/core.c | 3 +-
drivers/dma/ioat/init.c | 3 +-
drivers/dma/virt-dma.c | 3 +-
drivers/firewire/core-transaction.c | 5 +-
drivers/firewire/nosy.c | 3 +-
drivers/gpu/drm/drm_irq.c | 3 +-
drivers/gpu/drm/gma500/oaktrail_hdmi_i2c.c | 3 +-
drivers/gpu/drm/i915/i915_pmu.c | 3 +-
drivers/gpu/drm/i915/intel_lrc.c | 5 +-
.../gpu/drm/nouveau/nvkm/subdev/pci/base.c | 3 +-
drivers/hv/channel_mgmt.c | 4 +-
drivers/hv/hv.c | 4 +-
drivers/i2c/busses/i2c-emev2.c | 5 +-
drivers/i2c/busses/i2c-i801.c | 2 +-
drivers/i2c/busses/i2c-pxa.c | 4 +-
drivers/i2c/busses/i2c-rk3x.c | 3 +-
drivers/ide/ide-probe.c | 5 +-
drivers/idle/intel_idle.c | 5 +
drivers/iio/trigger/iio-trig-hrtimer.c | 3 +-
drivers/infiniband/hw/bnxt_re/qplib_fp.c | 4 +-
drivers/infiniband/hw/bnxt_re/qplib_rcfw.c | 4 +-
drivers/infiniband/hw/i40iw/i40iw_main.c | 9 +-
drivers/infiniband/hw/mthca/mthca_eq.c | 14 +-
drivers/infiniband/hw/qib/qib_sdma.c | 4 +-
drivers/infiniband/sw/rxe/rxe_cq.c | 3 +-
drivers/input/ff-memless.c | 2 +-
drivers/input/input.c | 5 +-
drivers/input/misc/xen-kbdfront.c | 3 +-
drivers/input/serio/hil_mlc.c | 2 +-
drivers/input/serio/i8042.c | 13 +-
drivers/input/serio/serio.c | 3 +
drivers/ipack/carriers/tpci200.c | 6 +-
drivers/isdn/capi/capidrv.c | 2 +-
drivers/isdn/gigaset/bas-gigaset.c | 8 +-
drivers/isdn/gigaset/common.c | 4 +-
drivers/isdn/gigaset/ser-gigaset.c | 4 +-
drivers/isdn/gigaset/usb-gigaset.c | 4 +-
drivers/isdn/hardware/avm/b1isa.c | 4 +-
drivers/isdn/hardware/avm/b1pci.c | 6 +-
drivers/isdn/hardware/avm/b1pcmcia.c | 3 +-
drivers/isdn/hardware/avm/c4.c | 3 +-
drivers/isdn/hardware/avm/t1isa.c | 4 +-
drivers/isdn/hardware/avm/t1pci.c | 3 +-
drivers/isdn/hardware/mISDN/avmfritz.c | 4 +-
drivers/isdn/hardware/mISDN/hfcmulti.c | 3 +-
drivers/isdn/hardware/mISDN/hfcpci.c | 2 +-
drivers/isdn/hardware/mISDN/mISDNinfineon.c | 3 +-
drivers/isdn/hardware/mISDN/netjet.c | 2 +-
drivers/isdn/hardware/mISDN/speedfax.c | 3 +-
drivers/isdn/hardware/mISDN/w6692.c | 2 +-
drivers/isdn/hisax/config.c | 2 +-
drivers/isdn/hisax/hfc4s8s_l1.c | 3 +-
drivers/isdn/hisax/hisax_fcpcipnp.c | 13 +-
drivers/isdn/i4l/isdn_common.c | 2 +-
drivers/media/cec/cec-pin.c | 3 +-
drivers/media/common/saa7146/saa7146_core.c | 4 +-
drivers/media/dvb-core/dvb_demux.c | 3 +
drivers/media/pci/b2c2/flexcop-pci.c | 3 +-
drivers/media/pci/bt8xx/bttv-driver.c | 2 +-
drivers/media/pci/bt8xx/bttv-input.c | 5 +-
drivers/media/pci/bt8xx/dvb-bt8xx.c | 3 +-
drivers/media/pci/cobalt/cobalt-driver.c | 3 +-
drivers/media/pci/cx18/cx18-driver.c | 3 +-
drivers/media/pci/cx25821/cx25821-core.c | 2 +-
drivers/media/pci/cx88/cx88-alsa.c | 3 +-
drivers/media/pci/cx88/cx88-mpeg.c | 2 +-
drivers/media/pci/cx88/cx88-video.c | 2 +-
drivers/media/pci/dt3155/dt3155.c | 2 +-
drivers/media/pci/intel/ipu3/ipu3-cio2.c | 2 +-
drivers/media/pci/ivtv/ivtv-driver.c | 3 +-
drivers/media/pci/mantis/mantis_dvb.c | 3 +-
drivers/media/pci/meye/meye.c | 3 +-
.../pci/netup_unidvb/netup_unidvb_core.c | 3 +-
drivers/media/pci/ngene/ngene-core.c | 5 +-
drivers/media/pci/pluto2/pluto2.c | 3 +-
drivers/media/pci/saa7134/saa7134-alsa.c | 4 +-
drivers/media/pci/saa7134/saa7134-core.c | 2 +-
drivers/media/pci/saa7134/saa7134-input.c | 3 +-
drivers/media/pci/saa7134/saa7134-ts.c | 3 +-
drivers/media/pci/saa7134/saa7134-vbi.c | 3 +-
drivers/media/pci/saa7134/saa7134-video.c | 3 +-
drivers/media/pci/saa7164/saa7164-core.c | 7 +-
drivers/media/pci/smipcie/smipcie-main.c | 3 +-
drivers/media/pci/solo6x10/solo6x10-core.c | 4 +-
drivers/media/pci/sta2x11/sta2x11_vip.c | 5 +-
drivers/media/pci/ttpci/av7110.c | 12 +-
drivers/media/pci/ttpci/av7110_ir.c | 3 +-
drivers/media/pci/ttpci/budget-ci.c | 9 +-
drivers/media/pci/ttpci/budget-core.c | 3 +-
drivers/media/pci/tw5864/tw5864-video.c | 4 +-
drivers/media/pci/tw68/tw68-core.c | 2 +-
drivers/media/pci/tw686x/tw686x-core.c | 4 +-
drivers/media/platform/aspeed-video.c | 5 +-
.../media/platform/marvell-ccic/cafe-driver.c | 3 +-
.../media/platform/marvell-ccic/mcam-core.c | 4 +-
drivers/media/radio/radio-cadet.c | 2 +-
drivers/media/radio/wl128x/fmdrv_common.c | 9 +-
drivers/media/rc/fintek-cir.c | 3 +-
drivers/media/rc/gpio-ir-recv.c | 2 +-
drivers/media/rc/img-ir/img-ir-raw.c | 2 +-
drivers/media/rc/ir-hix5hd2.c | 3 +-
drivers/media/rc/ir-rx51.c | 3 +-
drivers/media/rc/ite-cir.c | 3 +-
drivers/media/rc/nuvoton-cir.c | 3 +-
drivers/media/rc/serial_ir.c | 2 +-
drivers/media/rc/sir_ir.c | 3 +-
drivers/media/rc/winbond-cir.c | 2 +-
drivers/media/usb/au0828/au0828-video.c | 6 +-
drivers/media/usb/ttusb-dec/ttusb_dec.c | 5 +-
drivers/memstick/host/jmb38x_ms.c | 6 +-
drivers/message/fusion/mptbase.c | 3 +-
drivers/mfd/ezx-pcap.c | 4 +-
drivers/misc/ibmasm/module.c | 4 +-
drivers/misc/sgi-gru/grufile.c | 7 +-
drivers/misc/sgi-xp/xpc_uv.c | 3 +-
drivers/misc/vmw_vmci/vmci_guest.c | 8 +-
drivers/mmc/host/mtk-sd.c | 3 +-
drivers/mmc/host/wbsd.c | 20 +-
drivers/net/appletalk/cops.c | 2 +-
drivers/net/arcnet/arc-rimi.c | 2 +-
drivers/net/arcnet/com20020.c | 3 +-
drivers/net/arcnet/com90io.c | 3 +-
drivers/net/arcnet/com90xx.c | 2 +-
drivers/net/caif/caif_hsi.c | 9 +-
drivers/net/can/cc770/cc770.c | 4 +-
drivers/net/can/peak_canfd/peak_pciefd_main.c | 8 +-
drivers/net/can/sja1000/ems_pcmcia.c | 4 +-
drivers/net/can/sja1000/peak_pcmcia.c | 3 +-
drivers/net/can/sja1000/sja1000.c | 5 +-
drivers/net/ethernet/3com/3c509.c | 3 +-
drivers/net/ethernet/3com/3c515.c | 6 +-
drivers/net/ethernet/8390/axnet_cs.c | 3 +-
drivers/net/ethernet/8390/ne.c | 3 +-
drivers/net/ethernet/8390/ne2k-pci.c | 3 +-
drivers/net/ethernet/8390/pcnet_cs.c | 3 +-
drivers/net/ethernet/8390/smc-ultra.c | 3 +-
drivers/net/ethernet/8390/wd.c | 3 +-
drivers/net/ethernet/agere/et131x.c | 4 +-
drivers/net/ethernet/amazon/ena/ena_netdev.c | 5 +-
drivers/net/ethernet/amazon/ena/ena_netdev.h | 1 +
drivers/net/ethernet/amd/lance.c | 2 +-
drivers/net/ethernet/amd/ni65.c | 5 +-
drivers/net/ethernet/atheros/atlx/atl1.c | 4 +-
drivers/net/ethernet/atheros/atlx/atl2.c | 4 +-
drivers/net/ethernet/broadcom/cnic.c | 8 +-
drivers/net/ethernet/cadence/macb_main.c | 4 +-
.../net/ethernet/chelsio/cxgb3/cxgb3_main.c | 16 +-
drivers/net/ethernet/micrel/ks8842.c | 7 +-
drivers/net/ethernet/micrel/ks8851_mll.c | 3 +-
drivers/net/ethernet/microchip/lan743x_main.c | 7 +-
drivers/net/ethernet/realtek/atp.c | 3 +-
drivers/net/fddi/skfp/skfddi.c | 4 +-
drivers/net/hamradio/6pack.c | 4 +-
drivers/net/hamradio/baycom_ser_fdx.c | 3 +-
drivers/net/hamradio/baycom_ser_hdx.c | 3 +-
drivers/net/hamradio/scc.c | 8 +-
drivers/net/hamradio/yam.c | 6 +-
drivers/net/hippi/rrunner.c | 2 +-
drivers/net/ieee802154/at86rf230.c | 6 +-
drivers/net/ieee802154/ca8210.c | 10 +-
drivers/net/ieee802154/mcr20a.c | 3 +-
drivers/net/ieee802154/mrf24j40.c | 3 +-
drivers/net/ppp/ppp_async.c | 3 +-
drivers/net/ppp/ppp_synctty.c | 3 +-
drivers/net/slip/slip.c | 4 +-
drivers/net/usb/cdc_ncm.c | 3 +-
drivers/net/usb/hso.c | 6 +-
drivers/net/wan/cosa.c | 2 +-
drivers/net/wan/farsync.c | 6 +-
drivers/net/wan/hostess_sv11.c | 3 +-
drivers/net/wan/sbni.c | 2 +-
drivers/net/wan/sdla.c | 4 +-
drivers/net/wan/sealevel.c | 3 +-
drivers/net/wireless/ath/ath9k/init.c | 5 +-
drivers/net/wireless/ath/carl9170/usb.c | 4 +-
.../net/wireless/broadcom/b43legacy/main.c | 8 +-
drivers/net/wireless/broadcom/b43legacy/pio.c | 4 +-
.../broadcom/brcm80211/brcmfmac/bcmsdh.c | 7 +-
.../broadcom/brcm80211/brcmsmac/mac80211_if.c | 3 +-
drivers/net/wireless/cisco/airo.c | 4 +-
drivers/net/wireless/intel/ipw2x00/ipw2100.c | 9 +-
drivers/net/wireless/intel/ipw2x00/ipw2200.c | 8 +-
.../net/wireless/intel/iwlegacy/3945-mac.c | 9 +-
.../net/wireless/intel/iwlegacy/4965-mac.c | 9 +-
.../net/wireless/intersil/hostap/hostap_ap.c | 2 +-
.../net/wireless/intersil/hostap/hostap_pci.c | 3 +-
.../net/wireless/intersil/hostap/hostap_plx.c | 3 +-
drivers/net/wireless/intersil/orinoco/main.c | 4 +-
.../intersil/orinoco/orinoco_nortel.c | 4 +-
.../wireless/intersil/orinoco/orinoco_pci.c | 4 +-
.../wireless/intersil/orinoco/orinoco_plx.c | 4 +-
.../wireless/intersil/orinoco/orinoco_tmd.c | 4 +-
drivers/net/wireless/intersil/p54/p54pci.c | 2 +-
drivers/net/wireless/intersil/p54/p54spi.c | 4 +-
.../intersil/prism54/islpci_hotplug.c | 2 +-
drivers/net/wireless/mac80211_hwsim.c | 6 +-
drivers/net/wireless/marvell/libertas/if_cs.c | 2 +-
.../net/wireless/marvell/libertas/if_spi.c | 3 +-
.../wireless/marvell/mwifiex/11n_rxreorder.c | 3 +-
drivers/net/wireless/marvell/mwifiex/main.c | 5 +-
drivers/net/wireless/marvell/mwifiex/pcie.c | 8 +-
drivers/net/wireless/marvell/mwifiex/usb.c | 2 +-
drivers/net/wireless/marvell/mwl8k.c | 10 +-
.../net/wireless/mediatek/mt76/mt76x0/pci.c | 3 +-
.../net/wireless/mediatek/mt76/mt76x2/pci.c | 3 +-
.../quantenna/qtnfmac/pcie/pearl_pcie.c | 3 +-
.../quantenna/qtnfmac/pcie/topaz_pcie.c | 3 +-
.../net/wireless/ralink/rt2x00/rt2x00mmio.c | 6 +-
.../wireless/realtek/rtl818x/rtl8180/dev.c | 6 +-
drivers/net/wireless/realtek/rtlwifi/pci.c | 16 +-
drivers/net/wireless/ti/wl1251/sdio.c | 4 +-
drivers/net/wireless/ti/wl1251/spi.c | 5 +-
drivers/ntb/hw/amd/ntb_hw_amd.c | 12 +-
drivers/ntb/hw/intel/ntb_hw_gen1.c | 12 +-
drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 8 +-
drivers/parport/parport_ax88796.c | 3 +-
drivers/parport/parport_pc.c | 3 +-
drivers/pci/controller/pcie-xilinx.c | 2 +-
drivers/pci/controller/vmd.c | 11 +-
drivers/pci/hotplug/cpci_hotplug_core.c | 6 +-
drivers/pci/hotplug/cpqphp_core.c | 3 +-
drivers/pci/hotplug/shpchp_hpc.c | 5 +-
drivers/pci/pcie/pme.c | 3 +-
drivers/pci/switch/switchtec.c | 10 +-
drivers/pcmcia/i82092.c | 2 +-
drivers/pcmcia/i82365.c | 9 +-
drivers/pcmcia/pcmcia_resource.c | 9 +-
drivers/pcmcia/pd6729.c | 8 +-
drivers/pcmcia/tcic.c | 7 +-
drivers/pcmcia/yenta_socket.c | 7 +-
drivers/pinctrl/intel/pinctrl-intel.c | 2 +-
drivers/pinctrl/pinctrl-amd.c | 3 +-
drivers/pinctrl/pinctrl-single.c | 3 +-
drivers/platform/goldfish/goldfish_pipe.c | 6 +-
drivers/platform/mellanox/mlxreg-hotplug.c | 5 +-
drivers/platform/x86/fujitsu-tablet.c | 3 +-
drivers/platform/x86/intel_int0002_vgpio.c | 2 +-
drivers/platform/x86/intel_ips.c | 3 +-
drivers/platform/x86/intel_pmc_ipc.c | 8 +-
drivers/platform/x86/intel_punit_ipc.c | 4 +-
drivers/platform/x86/intel_scu_ipc.c | 5 +-
drivers/platform/x86/sony-laptop.c | 5 +-
drivers/pnp/resource.c | 3 +-
drivers/power/reset/ltc2952-poweroff.c | 6 +-
drivers/power/supply/act8945a_charger.c | 6 +-
drivers/power/supply/goldfish_battery.c | 5 +-
.../power/supply/max14656_charger_detector.c | 2 +-
drivers/power/supply/pda_power.c | 6 +-
drivers/power/supply/wm97xx_battery.c | 5 +-
drivers/pps/clients/pps-gpio.c | 3 +-
drivers/ptp/ptp_pch.c | 3 +-
drivers/rapidio/devices/tsi721.c | 25 +-
drivers/rapidio/devices/tsi721_dma.c | 8 +-
drivers/regulator/qcom_spmi-regulator.c | 5 +-
drivers/rpmsg/qcom_glink_native.c | 5 +-
drivers/rtc/rtc-cmos.c | 5 +-
drivers/rtc/rtc-ds1305.c | 3 +-
drivers/rtc/rtc-ds1374.c | 5 +-
drivers/rtc/rtc-ds1511.c | 3 +-
drivers/rtc/rtc-ds1553.c | 4 +-
drivers/rtc/rtc-ds1685.c | 3 +-
drivers/rtc/rtc-ftrtc010.c | 2 +-
drivers/rtc/rtc-m48t59.c | 5 +-
drivers/rtc/rtc-mrst.c | 3 +-
drivers/rtc/rtc-pcap.c | 10 +-
drivers/rtc/rtc-r7301.c | 3 +-
drivers/rtc/rtc-snvs.c | 3 +-
drivers/rtc/rtc-stk17ta8.c | 5 +-
drivers/rtc/rtc-zynqmp.c | 6 +-
drivers/scsi/3w-9xxx.c | 6 +-
drivers/scsi/3w-sas.c | 6 +-
drivers/scsi/3w-xxxx.c | 3 +-
drivers/scsi/BusLogic.c | 3 +-
drivers/scsi/a100u2w.c | 4 +-
drivers/scsi/aacraid/commsup.c | 8 +-
drivers/scsi/aacraid/rx.c | 3 +-
drivers/scsi/aacraid/sa.c | 3 +-
drivers/scsi/aacraid/src.c | 3 +-
drivers/scsi/advansys.c | 4 +-
drivers/scsi/aha152x.c | 4 +-
drivers/scsi/aha1542.c | 2 +-
drivers/scsi/aha1740.c | 3 +-
drivers/scsi/aic7xxx/aic7770_osm.c | 3 +-
drivers/scsi/aic7xxx/aic79xx_osm_pci.c | 2 +-
drivers/scsi/aic7xxx/aic7xxx_osm_pci.c | 2 +-
drivers/scsi/aic94xx/aic94xx_hwi.c | 4 +-
drivers/scsi/aic94xx/aic94xx_init.c | 5 +-
drivers/scsi/am53c974.c | 4 +-
drivers/scsi/arcmsr/arcmsr_hba.c | 3 +-
drivers/scsi/atp870u.c | 3 +-
drivers/scsi/be2iscsi/be_main.c | 12 +-
drivers/scsi/bfa/bfad.c | 9 +-
drivers/scsi/csiostor/csio_isr.c | 16 +-
drivers/scsi/dpt_i2o.c | 2 +-
drivers/scsi/esas2r/esas2r_init.c | 8 +-
drivers/scsi/fnic/fnic_isr.c | 13 +-
drivers/scsi/g_NCR5380.c | 7 +-
drivers/scsi/gdth.c | 8 +-
drivers/scsi/hpsa.c | 19 +-
drivers/scsi/hptiop.c | 3 +-
drivers/scsi/initio.c | 3 +-
drivers/scsi/ipr.c | 23 +-
drivers/scsi/ips.c | 4 +-
drivers/scsi/isci/init.c | 8 +-
drivers/scsi/lpfc/lpfc_init.c | 35 ++-
drivers/scsi/megaraid.c | 4 +-
drivers/scsi/megaraid/megaraid_mbox.c | 3 +-
drivers/scsi/megaraid/megaraid_sas_base.c | 8 +-
drivers/scsi/mpt3sas/mpt3sas_base.c | 2 +-
drivers/scsi/mvsas/mv_init.c | 4 +-
drivers/scsi/mvumi.c | 8 +-
drivers/scsi/myrb.c | 2 +-
drivers/scsi/myrs.c | 2 +-
drivers/scsi/nsp32.c | 3 +-
drivers/scsi/pcmcia/qlogic_stub.c | 2 +-
drivers/scsi/pcmcia/sym53c500_cs.c | 2 +-
drivers/scsi/pm8001/pm8001_init.c | 10 +-
drivers/scsi/pmcraid.c | 5 +-
drivers/scsi/qedf/qedf_main.c | 4 +-
drivers/scsi/qedi/qedi_main.c | 3 +-
drivers/scsi/qla1280.c | 3 +-
drivers/scsi/qla2xxx/qla_isr.c | 22 +-
drivers/scsi/qla4xxx/ql4_isr.c | 5 +-
drivers/scsi/qla4xxx/ql4_nx.c | 10 +-
drivers/scsi/qlogicfas.c | 2 +-
drivers/scsi/sim710.c | 2 +-
drivers/scsi/smartpqi/smartpqi_init.c | 11 +-
drivers/scsi/snic/snic_isr.c | 5 +-
drivers/scsi/stex.c | 4 +-
drivers/scsi/sym53c8xx_2/sym_glue.c | 3 +-
drivers/scsi/ufs/ufshcd.c | 7 +-
drivers/scsi/vmw_pvscsi.c | 9 +-
drivers/scsi/wd719x.c | 4 +-
drivers/slimbus/qcom-ctrl.c | 3 +-
drivers/spi/spi-altera.c | 3 +-
drivers/spi/spi-axi-spi-engine.c | 3 +-
drivers/spi/spi-cadence.c | 3 +-
drivers/spi/spi-dw.c | 4 +-
drivers/spi/spi-fsl-spi.c | 3 +-
drivers/spi/spi-oc-tiny.c | 3 +-
drivers/spi/spi-pxa2xx.c | 4 +-
drivers/spi/spi-topcliff-pch.c | 5 +-
drivers/spi/spi-xilinx.c | 5 +-
drivers/spi/spi-zynqmp-gqspi.c | 3 +-
drivers/staging/android/vsoc.c | 3 +-
drivers/staging/axis-fifo/axis-fifo.c | 3 +-
drivers/staging/comedi/comedi_buf.c | 5 +
.../staging/comedi/drivers/addi_apci_1032.c | 3 +-
.../staging/comedi/drivers/addi_apci_1500.c | 3 +-
.../staging/comedi/drivers/addi_apci_1564.c | 3 +-
.../staging/comedi/drivers/addi_apci_2032.c | 3 +-
.../staging/comedi/drivers/addi_apci_3120.c | 3 +-
.../staging/comedi/drivers/addi_apci_3xxx.c | 3 +-
drivers/staging/comedi/drivers/adl_pci9111.c | 3 +-
drivers/staging/comedi/drivers/adl_pci9118.c | 3 +-
drivers/staging/comedi/drivers/adv_pci1710.c | 3 +-
drivers/staging/comedi/drivers/aio_iiro_16.c | 3 +-
.../comedi/drivers/amplc_dio200_common.c | 3 +-
.../comedi/drivers/amplc_pc236_common.c | 3 +-
drivers/staging/comedi/drivers/amplc_pci224.c | 3 +-
drivers/staging/comedi/drivers/amplc_pci230.c | 3 +-
drivers/staging/comedi/drivers/cb_pcidas.c | 4 +-
drivers/staging/comedi/drivers/cb_pcidas64.c | 5 +-
.../staging/comedi/drivers/comedi_parport.c | 3 +-
drivers/staging/comedi/drivers/comedi_test.c | 6 +-
drivers/staging/comedi/drivers/das16m1.c | 3 +-
drivers/staging/comedi/drivers/das1800.c | 3 +-
drivers/staging/comedi/drivers/das6402.c | 3 +-
drivers/staging/comedi/drivers/das800.c | 5 +-
drivers/staging/comedi/drivers/dmm32at.c | 3 +-
drivers/staging/comedi/drivers/dt2811.c | 3 +-
drivers/staging/comedi/drivers/dt2814.c | 3 +-
drivers/staging/comedi/drivers/dt282x.c | 2 +-
drivers/staging/comedi/drivers/dt3000.c | 3 +-
drivers/staging/comedi/drivers/gsc_hpdi.c | 3 +-
drivers/staging/comedi/drivers/jr3_pci.c | 2 +-
drivers/staging/comedi/drivers/me4000.c | 3 +-
drivers/staging/comedi/drivers/ni_6527.c | 4 +-
drivers/staging/comedi/drivers/ni_65xx.c | 3 +-
drivers/staging/comedi/drivers/ni_660x.c | 4 +-
drivers/staging/comedi/drivers/ni_at_a2150.c | 2 +-
drivers/staging/comedi/drivers/ni_atmio.c | 3 +-
drivers/staging/comedi/drivers/ni_atmio16d.c | 3 +-
.../staging/comedi/drivers/ni_labpc_common.c | 5 +-
drivers/staging/comedi/drivers/ni_pcidio.c | 3 +-
drivers/staging/comedi/drivers/ni_pcimio.c | 3 +-
drivers/staging/comedi/drivers/pcl711.c | 3 +-
drivers/staging/comedi/drivers/pcl726.c | 3 +-
drivers/staging/comedi/drivers/pcl812.c | 3 +-
drivers/staging/comedi/drivers/pcl816.c | 2 +-
drivers/staging/comedi/drivers/pcl818.c | 3 +-
drivers/staging/comedi/drivers/pcmmio.c | 3 +-
drivers/staging/comedi/drivers/pcmuio.c | 6 +-
drivers/staging/comedi/drivers/rtd520.c | 3 +-
drivers/staging/comedi/drivers/s626.c | 3 +-
drivers/staging/gasket/gasket_interrupt.c | 3 +-
drivers/staging/goldfish/goldfish_audio.c | 5 +-
drivers/staging/iio/adc/ad7606.c | 3 +-
drivers/staging/ks7010/ks_hostif.c | 3 +-
drivers/staging/media/bcm2048/radio-bcm2048.c | 4 +-
drivers/staging/media/zoran/zoran_card.c | 2 +-
drivers/staging/most/dim2/dim2.c | 6 +-
drivers/staging/most/i2c/i2c.c | 3 +-
drivers/staging/olpc_dcon/olpc_dcon_xo_1.c | 2 +-
drivers/staging/olpc_dcon/olpc_dcon_xo_1_5.c | 2 +-
drivers/staging/pi433/pi433_if.c | 6 +-
drivers/staging/rtl8188eu/core/rtw_recv.c | 2 +-
.../staging/rtl8188eu/hal/rtl8188eu_recv.c | 6 +-
.../staging/rtl8188eu/hal/rtl8188eu_xmit.c | 6 +-
drivers/staging/rtl8188eu/os_dep/mlme_linux.c | 17 +-
drivers/staging/rtl8188eu/os_dep/recv_linux.c | 2 +-
drivers/staging/rtl8192e/rtl8192e/rtl_core.c | 20 +-
drivers/staging/rtl8192e/rtllib_softmac.c | 6 +-
.../rtl8192u/ieee80211/ieee80211_module.c | 2 +-
.../rtl8192u/ieee80211/ieee80211_softmac.c | 6 +-
drivers/staging/rtl8712/rtl8712_recv.c | 6 +-
drivers/staging/rtl8712/rtl871x_xmit.c | 6 +-
.../staging/rtl8723bs/hal/rtl8723bs_recv.c | 8 +-
drivers/staging/rtl8723bs/os_dep/mlme_linux.c | 22 +-
drivers/staging/rtl8723bs/os_dep/recv_linux.c | 2 +-
drivers/staging/rtlwifi/pci.c | 4 +-
drivers/staging/rts5208/rtsx.c | 4 +-
drivers/staging/speakup/main.c | 2 +-
drivers/staging/speakup/serialio.c | 5 +-
drivers/staging/vt6655/device_main.c | 2 +-
drivers/thermal/ti-soc-thermal/ti-bandgap.c | 6 +-
drivers/thunderbolt/nhi.c | 6 +-
drivers/tty/cyclades.c | 9 +-
drivers/tty/goldfish.c | 4 +-
drivers/tty/hvc/hvc_irq.c | 4 +-
drivers/tty/ipwireless/hardware.c | 3 +-
drivers/tty/isicom.c | 4 +-
drivers/tty/moxa.c | 2 +-
drivers/tty/mxser.c | 4 +-
drivers/tty/n_gsm.c | 5 +-
drivers/tty/n_r3964.c | 2 +-
drivers/tty/rocket.c | 2 +-
drivers/tty/serial/8250/8250_core.c | 6 +-
drivers/tty/serial/8250/8250_exar.c | 2 +-
drivers/tty/serial/8250/8250_port.c | 4 +-
drivers/tty/serial/altera_jtaguart.c | 4 +-
drivers/tty/serial/altera_uart.c | 5 +-
drivers/tty/serial/arc_uart.c | 2 +-
drivers/tty/serial/digicolor-usart.c | 3 +-
drivers/tty/serial/fsl_lpuart.c | 12 +-
drivers/tty/serial/ifx6x60.c | 11 +-
drivers/tty/serial/jsm/jsm_driver.c | 3 +-
drivers/tty/serial/max3100.c | 5 +-
drivers/tty/serial/men_z135_uart.c | 4 +-
drivers/tty/serial/mux.c | 2 +-
drivers/tty/serial/pch_uart.c | 4 +-
drivers/tty/serial/pnx8xxx_uart.c | 3 +-
drivers/tty/serial/rp2.c | 2 +-
drivers/tty/serial/sc16is7xx.c | 2 +-
drivers/tty/serial/sccnxp.c | 2 +-
drivers/tty/serial/sh-sci.c | 9 +-
drivers/tty/serial/timbuart.c | 4 +-
drivers/tty/serial/uartlite.c | 3 +-
drivers/tty/serial/xilinx_uartps.c | 4 +-
drivers/tty/synclink.c | 3 +-
drivers/tty/synclink_gt.c | 6 +-
drivers/tty/synclinkmp.c | 7 +-
drivers/tty/tty_buffer.c | 5 +-
drivers/tty/vcc.c | 4 +-
drivers/tty/vt/keyboard.c | 2 +-
drivers/uio/uio.c | 3 +-
drivers/usb/atm/usbatm.c | 6 +-
drivers/usb/c67x00/c67x00-drv.c | 3 +-
drivers/usb/chipidea/core.c | 5 +-
drivers/usb/chipidea/otg_fsm.c | 3 +-
drivers/usb/core/hcd.c | 8 +-
drivers/usb/dwc2/gadget.c | 3 +-
drivers/usb/dwc2/hcd_queue.c | 6 +-
drivers/usb/dwc2/platform.c | 3 +-
drivers/usb/gadget/function/f_midi.c | 3 +-
drivers/usb/gadget/function/f_ncm.c | 3 +-
drivers/usb/gadget/udc/amd5536udc_pci.c | 2 +-
drivers/usb/gadget/udc/bdc/bdc_udc.c | 3 +-
drivers/usb/gadget/udc/dummy_hcd.c | 4 +-
drivers/usb/gadget/udc/fotg210-udc.c | 4 +-
drivers/usb/gadget/udc/fusb300_udc.c | 6 +-
drivers/usb/gadget/udc/goku_udc.c | 3 +-
drivers/usb/gadget/udc/m66592-udc.c | 6 +-
drivers/usb/gadget/udc/mv_u3d_core.c | 3 +-
drivers/usb/gadget/udc/mv_udc_core.c | 3 +-
drivers/usb/gadget/udc/net2272.c | 3 +-
drivers/usb/gadget/udc/net2280.c | 3 +-
drivers/usb/gadget/udc/pch_udc.c | 8 +-
drivers/usb/gadget/udc/pxa27x_udc.c | 3 +-
drivers/usb/gadget/udc/r8a66597-udc.c | 7 +-
drivers/usb/gadget/udc/snps_udc_plat.c | 4 +-
drivers/usb/gadget/udc/udc-xilinx.c | 3 +-
drivers/usb/host/max3421-hcd.c | 3 +-
drivers/usb/host/xhci.c | 12 +-
drivers/usb/isp1760/isp1760-udc.c | 3 +-
drivers/usb/musb/musb_core.c | 2 +-
drivers/usb/phy/phy-gpio-vbus-usb.c | 4 +-
drivers/usb/serial/mos7720.c | 4 +-
drivers/usb/usbip/vudc_transfer.c | 2 +-
drivers/uwb/neh.c | 2 +-
drivers/uwb/rsv.c | 5 +-
drivers/uwb/whc-rc.c | 5 +-
drivers/vfio/pci/vfio_pci_intrs.c | 5 +-
drivers/video/fbdev/arcfb.c | 3 +-
drivers/video/fbdev/aty/atyfb_base.c | 2 +-
drivers/video/fbdev/goldfishfb.c | 4 +-
drivers/video/fbdev/matrox/matroxfb_base.c | 3 +-
drivers/video/fbdev/mb862xx/mb862xxfbdrv.c | 7 +-
drivers/video/fbdev/via/via-core.c | 3 +-
drivers/video/fbdev/xen-fbfront.c | 2 +
drivers/virt/vboxguest/vboxguest_linux.c | 5 +-
drivers/virtio/virtio_mmio.c | 4 +-
drivers/virtio/virtio_pci_common.c | 20 +-
drivers/visorbus/visorbus_main.c | 2 +-
drivers/vme/bridges/vme_ca91cx42.c | 5 +-
drivers/vme/bridges/vme_tsi148.c | 7 +-
drivers/w1/masters/ds1wm.c | 3 +-
drivers/xen/events/events_base.c | 12 +-
drivers/xen/platform-pci.c | 4 +-
drivers/xen/pvcalls-front.c | 2 +
drivers/xen/xen-pciback/pciback_ops.c | 6 +-
include/asm-generic/io.h | 3 +
include/linux/clearcpu.h | 36 +++
include/linux/filter.h | 21 +-
include/linux/highmem.h | 2 +
include/linux/hrtimer.h | 4 +
include/linux/interrupt.h | 18 +-
include/linux/skbuff.h | 2 +
include/linux/timer.h | 14 +-
include/linux/tty_flip.h | 4 +
include/linux/usb/hcd.h | 5 +-
kernel/bpf/core.c | 2 +
kernel/bpf/cpumap.c | 3 +
kernel/dma/swiotlb.c | 2 +
kernel/irq/handle.c | 4 +
kernel/irq/manage.c | 1 +
kernel/sched/core.c | 9 +
kernel/softirq.c | 25 +-
kernel/time/alarmtimer.c | 2 +-
kernel/time/hrtimer.c | 5 +
kernel/time/timer.c | 8 +
lib/random32.c | 2 +-
lib/scatterlist.c | 2 +
lib/string.c | 6 +
mm/slab_common.c | 5 +-
net/atm/pppoatm.c | 2 +-
net/core/skbuff.c | 32 +++
net/mac80211/main.c | 14 +-
net/rds/ib_cm.c | 8 +-
net/wireless/lib80211.c | 2 +-
net/xfrm/xfrm_state.c | 3 +-
samples/v4l/v4l2-pci-skeleton.c | 5 +-
security/keys/gc.c | 2 +-
sound/core/hrtimer.c | 3 +-
sound/core/pcm_lib.c | 3 +
sound/core/rawmidi.c | 3 +
sound/core/timer.c | 7 +-
sound/drivers/mpu401/mpu401_uart.c | 8 +-
sound/drivers/mtpav.c | 5 +-
sound/drivers/pcsp/pcsp.c | 3 +-
sound/drivers/serial-u16550.c | 6 +-
sound/isa/ad1816a/ad1816a_lib.c | 2 +-
sound/isa/es1688/es1688_lib.c | 4 +-
sound/isa/es18xx.c | 3 +-
sound/isa/gus/gus_main.c | 2 +-
sound/isa/gus/gusmax.c | 2 +-
sound/isa/gus/interwave.c | 3 +-
sound/isa/msnd/msnd_pinnacle.c | 3 +-
sound/isa/opl3sa2.c | 4 +-
sound/isa/opti9xx/opti92x-ad1848.c | 3 +-
sound/isa/sb/emu8000_pcm.c | 2 +-
sound/isa/sb/sb8_midi.c | 3 +-
sound/isa/sb/sb_common.c | 6 +-
sound/isa/wavefront/wavefront.c | 3 +-
sound/isa/wavefront/wavefront_midi.c | 4 +-
sound/isa/wss/wss_lib.c | 3 +-
sound/pci/ad1889.c | 3 +-
sound/pci/ali5451/ali5451.c | 3 +-
sound/pci/als300.c | 3 +-
sound/pci/asihpi/asihpi.c | 4 +-
sound/pci/asihpi/hpioctl.c | 3 +-
sound/pci/atiixp.c | 3 +-
sound/pci/atiixp_modem.c | 3 +-
sound/pci/aw2/aw2-alsa.c | 3 +-
sound/pci/azt3328.c | 3 +-
sound/pci/bt87x.c | 4 +-
sound/pci/ca0106/ca0106_main.c | 3 +-
sound/pci/cmipci.c | 3 +-
sound/pci/cs4281.c | 3 +-
sound/pci/cs46xx/cs46xx_lib.c | 3 +-
sound/pci/cs5535audio/cs5535audio.c | 3 +-
sound/pci/ctxfi/cthw20k1.c | 3 +-
sound/pci/ctxfi/cthw20k2.c | 3 +-
sound/pci/echoaudio/midi.c | 2 +-
sound/pci/emu10k1/emu10k1_main.c | 3 +-
sound/pci/emu10k1/emu10k1x.c | 3 +-
sound/pci/ens1370.c | 3 +-
sound/pci/es1938.c | 6 +-
sound/pci/es1968.c | 3 +-
sound/pci/fm801.c | 3 +-
sound/pci/hda/hda_intel.c | 4 +-
sound/pci/ice1712/ice1712.c | 3 +-
sound/pci/ice1712/ice1724.c | 3 +-
sound/pci/intel8x0.c | 6 +-
sound/pci/intel8x0m.c | 6 +-
sound/pci/korg1212/korg1212.c | 4 +-
sound/pci/lola/lola.c | 3 +-
sound/pci/maestro3.c | 3 +-
sound/pci/nm256/nm256.c | 3 +-
sound/pci/oxygen/oxygen_lib.c | 4 +-
sound/pci/riptide/riptide.c | 3 +-
sound/pci/rme32.c | 3 +-
sound/pci/rme96.c | 3 +-
sound/pci/rme9652/hdsp.c | 8 +-
sound/pci/rme9652/hdspm.c | 10 +-
sound/pci/rme9652/rme9652.c | 3 +-
sound/pci/sis7019.c | 7 +-
sound/pci/sonicvibes.c | 3 +-
sound/pci/trident/trident_main.c | 3 +-
sound/pci/via82xx.c | 6 +-
sound/pci/via82xx_modem.c | 3 +-
sound/pci/ymfpci/ymfpci_main.c | 3 +-
sound/soc/amd/acp-pcm-dma.c | 3 +-
sound/soc/amd/raven/acp3x-pcm-dma.c | 3 +-
sound/soc/codecs/rt5640.c | 4 +-
sound/soc/codecs/rt5651.c | 4 +-
sound/soc/codecs/rt5663.c | 4 +-
sound/soc/dwc/dwc-i2s.c | 5 +-
sound/soc/fsl/fsl_asrc.c | 3 +-
sound/soc/fsl/fsl_esai.c | 3 +-
sound/soc/fsl/fsl_sai.c | 3 +-
sound/soc/fsl/fsl_spdif.c | 3 +-
sound/soc/fsl/fsl_ssi.c | 3 +-
sound/usb/midi.c | 6 +-
sound/usb/misc/ua101.c | 4 +-
sound/x86/intel_hdmi_audio.c | 5 +-
681 files changed, 2457 insertions(+), 1308 deletions(-)
create mode 100644 Documentation/admin-guide/mds.rst
create mode 100644 Documentation/clearcpu.txt
create mode 100644 arch/x86/include/asm/clearbpf.h
create mode 100644 arch/x86/include/asm/clearcpu.h
create mode 100644 arch/x86/include/asm/trace/clearcpu.h
create mode 100644 include/linux/clearcpu.h
--
2.17.2
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH v6 10/43] MDSv6
2019-02-24 15:07 [MODERATED] [PATCH v6 00/43] MDSv6 Andi Kleen
@ 2019-02-24 15:07 ` Andi Kleen
2019-02-25 16:30 ` [MODERATED] " Greg KH
2019-02-24 15:07 ` [MODERATED] [PATCH v6 31/43] MDSv6 Andi Kleen
1 sibling, 1 reply; 89+ messages in thread
From: Andi Kleen @ 2019-02-24 15:07 UTC (permalink / raw)
To: speck; +Cc: Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Subject: mds: Add documentation for clear cpu usage
Including the theory, and some guidelines for subsystem/driver
maintainers.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
Documentation/clearcpu.txt | 261 +++++++++++++++++++++++++++++++++++++
1 file changed, 261 insertions(+)
create mode 100644 Documentation/clearcpu.txt
diff --git a/Documentation/clearcpu.txt b/Documentation/clearcpu.txt
new file mode 100644
index 000000000000..a45e5d82868a
--- /dev/null
+++ b/Documentation/clearcpu.txt
@@ -0,0 +1,261 @@
+
+Security model for Microarchitectural Data Sampling
+===================================================
+
+Some CPUs can leave read or written data in internal buffers,
+which might later be sampled through side effects.
+For more details see CVE-2018-12126, CVE-2018-12127 and CVE-2018-12130.
+
+This can be avoided by explicitly clearing the CPU state.
+
+We attempt to avoid leaking data between different processes,
+and also to avoid leaking sensitive data, like cryptographic data,
+to user space.
+
+We support three modes:
+
+(1) mitigation off (mds=off)
+(2) clear only when needed (default)
+(3) clear on every kernel exit, or guest entry (mds=full)
+
+(1) and (3) are trivial; the rest of this document discusses (2).
+
+In general option (3) is the most conservative choice. It does
+not make single-thread (ST) assumptions about leaking data.
+
+Basic requirements and assumptions
+----------------------------------
+
+Kernel addresses and kernel temporary data are not sensitive.
+
+User data is sensitive, but only for other processes.
+
+User data is anything in the user address space, or data buffers
+directly copied from/to the user (e.g. read/write). It does not
+include metadata or flag settings. For example, packet headers
+or file names are not sensitive in this model.
+
+Block IO data (but not meta data) is sensitive.
+
+Most data structures in the kernel are not sensitive.
+
+Kernel data is sensitive when it involves cryptographic keys.
+
+We consider data from input devices (such as key presses)
+sensitive. We also consider sound data or terminal
+data sensitive.
+
+We assume that only data actually accessed by the kernel through explicit
+instructions can be leaked. Note that this may not always be
+true; in theory prefetching or speculation may touch more. The assumption
+is that if that happens it will be very low bandwidth and hard
+to control, due to the existing Spectre and other mitigations
+such as memory randomization. Users concerned about this
+need to use mds=full.
+
+Guidance for driver/subsystem developers
+----------------------------------------
+
+[These generally need to be enforced in code review for new code now]
+
+When you touch user-supplied data of *other* processes in system call
+context, add lazy_clear_cpu().
+
+For the cases below we care only about data from other processes.
+Touching non-cryptographic data from the current process is always allowed.
+
+Touching only pointers to user data is always allowed.
+
+When your interrupt handler touches user data directly, mark it with
+IRQF_USER_DATA.
+
+When your tasklet touches user data directly, mark it with
+TASKLET_USER_DATA, using tasklet_init_flags() or DECLARE_TASKLET_USERDATA*.
+
+When your timer touches user data, mark it with TIMER_USER_DATA.
+If it is an hrtimer and touches user data, mark it with HRTIMER_MODE_USER_DATA.
+
+When your irq poll handler touches user data, add lazy_clear_cpu().
+
+For networking code, make sure to only touch user data through
+skb_push/put/copy [add more], unless it is data from the current
+process. If that cannot be ensured, add lazy_clear_cpu() or
+lazy_clear_cpu_interrupt().
+
+Any cryptographic code touching key data should use memzero_explicit
+or kzfree to free the data.
+
+If your RCU callback touches user data add lazy_clear_cpu().
+
+These steps are currently only needed for code that runs on MDS affected
+CPUs, which currently means only x86. But it might be worth being
+prepared in case other architectures become affected too.
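+
+For example, marking an interrupt handler whose driver copies device
+data into user visible buffers might look like this (a sketch only;
+IRQF_USER_DATA is the flag introduced by this series, the driver and
+variable names are made up):
+
+  err = request_irq(irq, example_uart_interrupt,
+                    IRQF_SHARED | IRQF_USER_DATA, "example-uart", port);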
+
+Implementation details/assumptions
+----------------------------------
+
+Any buffer clearing is done lazily on the next kernel exit. The
+lazy_clear* calls merely set a flag, which takes only a few fast
+instructions with no cache misses, so they can be used frequently
+even in fast paths.
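+
+As an illustration, a minimal sketch of such a flag based scheme,
+assuming a per-CPU flag checked on the kernel exit path (names are
+illustrative, not necessarily the exact implementation):
+
+  /* Cheap enough for fast paths: only sets a per-CPU flag */
+  static DEFINE_PER_CPU(bool, cpu_needs_clear);
+
+  static inline void lazy_clear_cpu(void)
+  {
+        this_cpu_write(cpu_needs_clear, true);
+  }
+
+  /* Invoked on the return-to-user and guest-entry paths */
+  static inline void clear_cpu_on_kernel_exit(void)
+  {
+        if (this_cpu_read(cpu_needs_clear)) {
+                this_cpu_write(cpu_needs_clear, false);
+                mds_clear_cpu_buffers();  /* VERW based buffer clear */
+        }
+  }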
+
+Protecting process data
+-----------------------
+
+If a system call touches data of its own process, CPU state does not
+need to be cleared, because the process already has access to that data.
+
+On context switch we clear data, unless the switch stays within the
+same process. We also clear after any context switch from a kernel
+thread.
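+
+A sketch of that rule, assuming a hook in the context switch path
+(illustrative only):
+
+  /* Kernel threads have no mm and always force a clear; switches
+   * between threads of the same process do not. */
+  if (!prev->mm || prev->mm != next->mm)
+        lazy_clear_cpu();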
+
+Cryptographic keys inside the kernel should be protected.
+We assume code handling them uses kzfree() or memzero_explicit() to
+clear state, so these functions trigger a CPU clear.
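+
+For example, the hook can sit directly in kzfree(); this sketch is
+based on the mainline kzfree() of this era, with the added call
+marked:
+
+  void kzfree(const void *p)
+  {
+        size_t ks;
+
+        if (unlikely(ZERO_OR_NULL_PTR(p)))
+                return;
+        ks = ksize(p);
+        memzero_explicit((void *)p, ks);
+        kfree(p);
+        lazy_clear_cpu();  /* added: key material was touched */
+  }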
+
+Hard Interrupts and tasklets
+----------------------------
+
+Most interrupt handlers for modern devices do not touch
+user data, because they rely on DMA and only manipulate
+pointers. They have been audited.
+
+Some handlers copy data, but often use strategic
+functions which can be marked with a lazy clear:
+for example memcpy_from/to_io and swiotlb (see below
+for a full list).
+
+Some handlers touch user data without using these strategic
+functions; those have to be marked with IRQF_USER_DATA.
+All in-tree handlers have been audited.
+
+Softirqs
+--------
+
+Softirqs are handled case by case:
+
+ TIMER: see timers below.
+ NET_*: see networking below.
+ BLOCK: do not touch user data, except
+ for a few using kmap_atomic. We have a lazy_clear_cpu_interrupt()
+ in kmap_atomic for this case (see the sketch below).
+
+ IRQ_POLL: generally do not touch user data
+ TASKLET: see tasklets below
+ SCHED: only touches scheduler metadata
+ RCU: RCU handlers generally only free memory.
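+
+The interrupt-only variant used by kmap_atomic() above can be as
+simple as this sketch:
+
+  static inline void lazy_clear_cpu_interrupt(void)
+  {
+        /* Only interrupt context touches data of unrelated processes */
+        if (in_interrupt())
+                lazy_clear_cpu();
+  }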
+
+Networking
+----------
+
+This is only about network code running in hard interrupt
+or softirq or timer context. Per process network code
+generally only touches data of the current process,
+so it does not need any changes.
+
+In principle packet data should be encrypted for the wire anyway,
+but we still try to avoid leaking it.
+
+For networking code, any skb functions that are likely to touch
+non-header packet data schedule a CPU clear at the next
+kernel exit. This includes skb_copy and related, skb_put/push, and the
+checksum functions. We assume that any networking code touching
+packet data uses these functions.
+
+NMIs / machine checks
+---------------------
+
+Assume they don't touch other processes' user data. Most NMI
+handlers are fairly simple and only concerned with
+some non-user hardware state. The machine check handlers and perf PMI
+handlers are complicated (e.g. perf can touch the user stack), but they
+never touch any data not belonging to the current process.
+
+Other interrupts
+----------------
+
+SMP function call interrupt callbacks have been audited and don't touch
+any user data.
+
+Clear points
+------------
+
+We schedule clears in some centralized functions to minimize impact
+on the overall code.
+
+Always clear:
+
+kernel preemption             undefined state, need to always clear
+context switch                protect user / kernel thread data
+VM entry                      protect host against guest
+
+Always schedule clear for next kernel exit:
+
+kzfree / memzero_explicit     keys and crypto data
+
+Only schedule clear for next exit when called in interrupts:
+
+kmap_atomic                   block drivers touching user process data
+memcpy_from/to_io             drivers copying IO data
+insw*, outs*
+input_event                   input drivers touching user IO data
+serio_interrupt
+tty_insert_*                  tty drivers touching user input IO data
+swiotlb                       bounce buffers touching IO data
+sg_copy_*                     scsi drivers touching IO data in interrupts
+skb_put, skb_copy_*           networking code touching IO data
+skb_*csum*
+snd_pcm_period_elapsed,
+snd_rawmidi_transmit/receive,
+snd_timer_interrupt           sound drivers touching IO data
+
+Sandboxes
+---------
+
+We don't do anything special for seccomp processes.
+
+If there is a sandbox inside the process the process should take care
+itself of clearing its own sensitive data before running sandbox
+code. This would include data touched by system calls.
+
+BPF
+---
+
+Assume BPF execution does not touch other users' data, so it does
+not need to schedule a clear for itself.
+
+BPF could attack the rest of the kernel if it can successfully
+measure side channel side effects.
+
+When the BPF program was loaded unprivileged, always clear the CPU
+to prevent any exploit written in BPF from using side channels to
+read data leaked from other kernel code.
+
+We only do this when running in an interrupt, or if a CPU clear is
+already scheduled (which means, for example, there was a context
+switch or a crypto operation before).
+
+In process context we assume the code only accesses data of the
+current user, and we check that the running BPF program was loaded
+by the same user, so even if data leaked it would not cross
+privilege boundaries.
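+
+A sketch of that check (the field layout is illustrative, not the
+exact bpf_prog structure):
+
+  /* Clear unless the program was loaded by the user we run as */
+  if (in_interrupt() || !uid_eq(prog->aux->user->uid, current_uid()))
+        lazy_clear_cpu();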
+
+Technically we would only need to do this if the BPF program
+contains conditional branches and loads dominated by them, but
+let's assume that nearly all do.
+
+This could be further optimized by batching clears for
+many similar eBPF executions in a row (e.g. for packet
+processing). This would need to ensure that no sensitive
+data is touched in between the eBPF executions, and also
+that all eBPF programs are set up by the same uid.
+We could add such optimizations later based on
+profile data.
+
+Virtualization
+--------------
+
+When entering a guest in KVM we clear the CPU buffers to avoid any
+leakage to the guest. Normally this is done implicitly as part of
+the L1TF mitigation, except on a few CPUs that are not vulnerable to
+L1TF and need an explicit clear; this relies on the L1TF mitigation
+being enabled. It also uses the "fast exit" optimization that only
+clears if an interrupt or context switch happened during the VMexit,
+unless mds=full is used.
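+
+A sketch of that logic, reusing the lazy flag from above;
+l1d_flush_will_run() stands in for however the L1TF mitigation state
+is queried (illustrative only):
+
+  /* Before VMENTER */
+  if (!l1d_flush_will_run() &&
+      (mds_mode == MDS_FULL || this_cpu_read(cpu_needs_clear)))
+        mds_clear_cpu_buffers();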
--
2.17.2
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [MODERATED] Re: [PATCH v6 10/43] MDSv6
2019-02-24 15:07 ` [MODERATED] [PATCH v6 10/43] MDSv6 Andi Kleen
@ 2019-02-25 16:30 ` Greg KH
2019-02-25 16:41 ` [MODERATED] Encrypted Message Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Greg KH @ 2019-02-25 16:30 UTC (permalink / raw)
To: speck
I'm sorry, I just can't stop...
On Sun, Feb 24, 2019 at 07:07:16AM -0800, speck for Andi Kleen wrote:
> +Guidance for driver/subsystem developers
> +----------------------------------------
> +
> +[These generally need to be enforced in code review for new code now]
> +
> +When you touch user supplied data of *other* processes in system call
> +context add lazy_clear_cpu().
What do you mean by "user supplied data of *other* processes"? I think
I understand the kernel, and I have no idea what this means.
> +For the cases below we care only about data from other processes.
> +Touching non cryptographic data from the current process is always allowed.
Define "non cryptographic data". Is data coming across a serial port
crypto data? From a camera?
> +
> +Touching only pointers to user data is always allowed.
But not touching the data the pointer points to?
> +
> +When your interrupt does touch user data directly mark it with IRQF_USER_DATA.
I still don't know what "user data" means. Is a serial stream coming
from a bluetooth device "user data"? Is a program that talks directly
to a USB device without a special kernel driver reading "user data"? Is a
serial port data stream "user data"?
> +
> +When your tasklet does touch user data directly, mark it TASKLET_USER_DATA
> +using tasklet_init_flags/or DECLARE_TASKLET_USERDATA*.
Same as above.
> +
> +When your timer does touch user data mark it with TIMER_USER_DATA
> +If it is a hrtimer and touches user data, mark it with HRTIMER_MODE_USER_DATA.
Mark what, the timer function?
> +When your irq poll handler does touch user data, mark it lazy_clear_cpu().
Mark what? That's a function call?
> +For networking code, make sure to only touch user data through
> +skb_push/put/copy [add more], unless it is data from the current
> +process. If that is not ensured add lazy_clear_cpu or
> +lazy_clear_cpu_interrupt.
How do you know if data coming across the network is for the "current"
process? What does "current process" even mean here?
How about UIO drivers, how does their data get classified? Lots of
networking stacks use UIO now... virtual io channels?
> +Any cryptographic code touching key data should use memzero_explicit
> +or kzfree to free the data.
We do that today, right? If not, that needs to be done regardless.
And what about password data? I _think_ we got most of that now out of
the tty layer, but we could be wrong :)
> +
> +If your RCU callback touches user data add lazy_clear_cpu().
Ugh, really?
> +
> +These steps are currently only needed for code that runs on MDS affected
> +CPUs, which is currently only x86. But might be worth being prepared
> +if other architectures become affected too.
As someone who probably reviews more new drivers than anyone else right
now, my first recommendation is going to be, "Buy a non-Intel
processor, we have no idea what they expect from driver writers now,
just give up trying to appease them." Because if I, the person that is
responsible for reviewing those drivers, has no idea what to do here, how
can some random driver author be expected to know?
Seriously, this is crazy.
And if you all are going to expect me to start auditing all new drivers
based on these new rules, I need a _big_ raise. I somehow doubt you are
asking all other OS vendors to do all of this crud.
> +Implementation details/assumptions
> +----------------------------------
> +
> +Any buffer clearing is done lazily on next kernel exit. lazy_clear*
> +is only a few fast instructions with no cache misses setting
> +a flag and can be used frequently even in fast paths.
I can not parse this paragraph at all, what are you trying to say?
> +Protecting process data
> +-----------------------
> +
> +If a system call touches data of its own process, CPU state does not
> +need to be cleared, because it has already access to it.
How do you know if it is its own process or not?
What about something like IPC data?
> +On context switching we clear data, unless the context switch is
> +inside a process. We also clear after any context switches from kernel
> +threads.
> +
> +Cryptographic keys inside the kernel should be protected.
"Protected"? How? Huh? ugh.
<big snip as I got tired>
> +Sandboxes
> +---------
> +
> +We don't do anything special for seccomp processes
> +
> +If there is a sandbox inside the process the process should take care
> +itself of clearing its own sensitive data before running sandbox
> +code. This would include data touched by system calls.
i.e. "Userspace code is hosed, sorry."?
> +BPF
> +---
> +
> +Assume BPF execution does not touch other user's data, so does
> +not need to schedule a clear for itself.
Can you assume that?
> +BPF could attack the rest of the kernel if it can successfully
> +measure side channel side effects.
Can it do such a measurement?
> +When the BPF program was loaded unprivileged, always clear the CPU
> +to prevent any exploits written in BPF using side channels to read
> +data leaked from other kernel code
> +
> +We only do this when running in an interrupt, or if an clear cpu is
> +already scheduled (which means for example there was a context
> +switch, or crypto operation before)
> +
> +In process context we assume the code only accesses data of the
> +current user and check that the BPF running was loaded by the
> +same user so even if data leaked it would not cross privilege
> +boundaries.
> +
> +Technically we would only need to do this if the BPF program
> +contains conditional branches and loads dominated by them, but
> +let's assume that nearly all do.
> +
> +This could be further optimized by batching clears for
> +many similar EBPF executions in a row (e.g. for packet
> +processing). This would need ensuring that no sensitive
> +data is touched inbetween the EBPF executions, and also
> +that all EBPF scripts are set up by the same uid.
> +We could add such optimizations later based on
> +profile data.
Please contact the BPF people before writing any of the above. The fact
that you all have not done so is scandalous.
greg k-h
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH v6 31/43] MDSv6
2019-02-24 15:07 [MODERATED] [PATCH v6 00/43] MDSv6 Andi Kleen
2019-02-24 15:07 ` [MODERATED] [PATCH v6 10/43] MDSv6 Andi Kleen
@ 2019-02-24 15:07 ` Andi Kleen
2019-02-25 15:19 ` [MODERATED] " Greg KH
1 sibling, 1 reply; 89+ messages in thread
From: Andi Kleen @ 2019-02-24 15:07 UTC (permalink / raw)
To: speck; +Cc: Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Subject: mds sweep: Clear cpu for usbmon intercepts
usbmon touches user data in interrupts that otherwise don't
touch user data. Automatically schedule a clear cpu if
usbmon is called from an interrupt.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
include/linux/usb/hcd.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/usb/hcd.h b/include/linux/usb/hcd.h
index 7dc3a411bece..7f37056fe973 100644
--- a/include/linux/usb/hcd.h
+++ b/include/linux/usb/hcd.h
@@ -25,6 +25,7 @@
#include <linux/rwsem.h>
#include <linux/interrupt.h>
#include <linux/idr.h>
+#include <linux/clearcpu.h>
#define MAX_TOPO_LEVEL 6
@@ -688,8 +689,10 @@ static inline void usbmon_urb_submit_error(struct usb_bus *bus, struct urb *urb,
static inline void usbmon_urb_complete(struct usb_bus *bus, struct urb *urb,
int status)
{
- if (bus->monitored)
+ if (bus->monitored) {
(*mon_ops->urb_complete)(bus, urb, status);
+ lazy_clear_cpu_interrupt();
+ }
}
int usb_mon_register(const struct usb_mon_operations *ops);
--
2.17.2
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [MODERATED] Re: [PATCH v6 31/43] MDSv6
2019-02-24 15:07 ` [MODERATED] [PATCH v6 31/43] MDSv6 Andi Kleen
@ 2019-02-25 15:19 ` Greg KH
2019-02-25 15:34 ` Andi Kleen
0 siblings, 1 reply; 89+ messages in thread
From: Greg KH @ 2019-02-25 15:19 UTC (permalink / raw)
To: speck
On Sun, Feb 24, 2019 at 07:07:37AM -0800, speck for Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> Subject: mds sweep: Clear cpu for usbmon intercepts
>
> usbmon touches user data in interrupts that otherwise don't
> touch user data. Automatically schedule a clear cpu if
> usbmon is called from an interrupt.
I have written a long and very satisfying rant about this patch, that I
then deleted, as it made me feel much better, but probably would not
have helped anyone else out.
In turn, I need you to properly justify this patch as these two tiny
sentences, and this small patch make no sense to me at all. Please
explain _WHY_ this is needed in this specific location. Before
responding, I would strongly recommend reading up on exactly what usbmon
is and who is allowed to use it. If after doing that, you still feel
this patch is needed (and it might be, I still can not tell for sure),
please reply with enough detail that anyone who does not know what
usbmon is, or what mds really is, can understand why this patch is
needed.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [PATCH v6 31/43] MDSv6
2019-02-25 15:19 ` [MODERATED] " Greg KH
@ 2019-02-25 15:34 ` Andi Kleen
2019-02-25 15:49 ` Greg KH
0 siblings, 1 reply; 89+ messages in thread
From: Andi Kleen @ 2019-02-25 15:34 UTC (permalink / raw)
To: speck
On Mon, Feb 25, 2019 at 04:19:35PM +0100, speck for Greg KH wrote:
> On Sun, Feb 24, 2019 at 07:07:37AM -0800, speck for Andi Kleen wrote:
> > From: Andi Kleen <ak@linux.intel.com>
> > Subject: mds sweep: Clear cpu for usbmon intercepts
> >
> > usbmon touches user data in interrupts that otherwise don't
> > touch user data. Automatically schedule a clear cpu if
> > usbmon is called from an interrupt.
>
> I have written a long and very satisfying rant about this patch, that I
> then deleted, as it made me feel much better, but probably would not
> have helped anyone else out.
>
> In turn, I need you to properly justify this patch as these two tiny
> sentences, and this small patch make no sense to me at all. Please
> explain _WHY_ this is needed in this specific location. Before
> responding, I would strongly recommend reading up on exactly what usbmon
> is and who is allowed to use it. If after doing that, you still feel
Right, it's root only.
But this is not about leaking data to the root monitoring user
(who can see the data anyway), but to unrelated processes
which are not root, but happen to be interrupted by the USB
interrupt.
> this patch is needed (and it might be, I still can not tell for sure),
Anything that touches user data in an interrupt needs to be marked
with the lazy approach.
I can write more on this instance.
However I will probably not be able to write a detailed
description for each of the interrupt handlers changed because
there are just too many.
-Andi
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [PATCH v6 31/43] MDSv6
2019-02-25 15:34 ` Andi Kleen
@ 2019-02-25 15:49 ` Greg KH
2019-02-25 15:52 ` [MODERATED] Encrypted Message Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Greg KH @ 2019-02-25 15:49 UTC (permalink / raw)
To: speck
On Mon, Feb 25, 2019 at 07:34:11AM -0800, speck for Andi Kleen wrote:
> On Mon, Feb 25, 2019 at 04:19:35PM +0100, speck for Greg KH wrote:
> > On Sun, Feb 24, 2019 at 07:07:37AM -0800, speck for Andi Kleen wrote:
> > > From: Andi Kleen <ak@linux.intel.com>
> > > Subject: mds sweep: Clear cpu for usbmon intercepts
> > >
> > > usbmon touches user data in interrupts that otherwise don't
> > > touch user data. Automatically schedule a clear cpu if
> > > usbmon is called from an interrupt.
> >
> > I have written a long and very satisfying rant about this patch, that I
> > then deleted, as it made me feel much better, but probably would not
> > have helped anyone else out.
> >
> > In turn, I need you to properly justify this patch as these two tiny
> > sentences, and this small patch make no sense to me at all. Please
> > explain _WHY_ this is needed in this specific location. Before
> > responding, I would strongly recommend reading up on exactly what usbmon
> > is and who is allowed to use it. If after doing that, you still feel
>
> Right it's root only.
>
> But this is not about leaking data to the root monitoring user
> (who can see the data anyways), but to unrelated processes
> which are not root, but happen to be interrupted by the USB
> interrupt.
Then why are you messing around with the usbmon callback? It has
nothing to do with anything here. By hooking it here, you now have 2
calls to this function on the USB urb callback path.
The fact that a root process happens to be watching the USB data flowing
through the system, or not, should have no effect on anything here, as
the data flow is still the same (with the exception that an extra copy in
the irq could happen). Do multiple copies matter or not? I can't find
anything in the documentation we have about this, am I missing it?
> > this patch is needed (and it might be, I still can not tell for sure),
>
> Anything that touches user data in an interrupt needs to be marked
> with the lazy approach.
As I asked with the hcd change, what is "user data"?
> I can write more on this instance.
I nicely asked for that in the past but was ignored twice. Do I need to
ask for it again in a non-nice manner?
Without that information, this patchset is pretty impossible to review.
> However I will probably not be able to write a detailed
> description for each of the interrupt handlers changed because
> there are just too many.
Then how do you expect each subsystem / driver author to know if this is
an acceptable change or not? How do you expect to educate driver
authors to have them determine if they need to do this on their new
drivers or not? Are you going to hand-audit each new driver that gets
added to the kernel for forever?
Without this type of information, this seems like a futile exercise.
greg k-h
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-02-25 15:49 ` Greg KH
@ 2019-02-25 15:52 ` Jon Masters
2019-02-25 16:00 ` [MODERATED] " Greg KH
0 siblings, 1 reply; 89+ messages in thread
From: Jon Masters @ 2019-02-25 15:52 UTC (permalink / raw)
To: speck
From: Jon Masters <jcm@redhat.com>
To: speck for Greg KH <speck@linutronix.de>
Subject: Re: [PATCH v6 31/43] MDSv6
On 2/25/19 10:49 AM, speck for Greg KH wrote:
> On Mon, Feb 25, 2019 at 07:34:11AM -0800, speck for Andi Kleen wrote:
>> However I will probably not be able to write a detailed
>> description for each of the interrupt handlers changed because
>> there are just too many.
>
> Then how do you expect each subsystem / driver author to know if this is
> an acceptable change or not? How do you expect to educate driver
> authors to have them determine if they need to do this on their new
> drivers or not? Are you going to hand-audit each new driver that gets
> added to the kernel for forever?
>
> Without this type of information, this seems like a futile exercise.
Forgive me if I'm being too cautious here, but it seems to make most
sense to have the basic MDS infrastructure in place at unembargo. Unless
it's very clear how the auto stuff can be safe, and the audit
comprehensive, I wonder if that shouldn't just be done after.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: Encrypted Message
2019-02-25 15:52 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-02-25 16:00 ` Greg KH
2019-02-25 16:19 ` [MODERATED] " Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Greg KH @ 2019-02-25 16:00 UTC (permalink / raw)
To: speck
On Mon, Feb 25, 2019 at 10:52:30AM -0500, speck for Jon Masters wrote:
> From: Jon Masters <jcm@redhat.com>
> To: speck for Greg KH <speck@linutronix.de>
> Subject: Re: [PATCH v6 31/43] MDSv6
> On 2/25/19 10:49 AM, speck for Greg KH wrote:
> > On Mon, Feb 25, 2019 at 07:34:11AM -0800, speck for Andi Kleen wrote:
>
>
> >> However I will probably not be able to write a detailed
> >> description for each of the interrupt handlers changed because
> >> there are just too many.
> >
> > Then how do you expect each subsystem / driver author to know if this is
> > an acceptable change or not? How do you expect to educate driver
> > authors to have them determine if they need to do this on their new
> > drivers or not? Are you going to hand-audit each new driver that gets
> > added to the kernel for forever?
> >
> > Without this type of information, this seems like a futile exercise.
>
> Forgive me if I'm being too cautious here, but it seems to make most
> sense to have the basic MDS infrastructure in place at unembargo. Unless
> it's very clear how the auto stuff can be safe, and the audit
> comprehensive, I wonder if that shouldn't just be done after.
I thought that was what Thomas's patchset provided and is what was
alluded to in patch 00/43 of this series.
greg k-h
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-02-25 16:00 ` [MODERATED] " Greg KH
@ 2019-02-25 16:19 ` Jon Masters
0 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-02-25 16:19 UTC (permalink / raw)
To: speck
From: Jon Masters <jcm@redhat.com>
To: speck for Greg KH <speck@linutronix.de>
Subject: Re: Encrypted Message
On 2/25/19 11:00 AM, speck for Greg KH wrote:
> On Mon, Feb 25, 2019 at 10:52:30AM -0500, speck for Jon Masters wrote:
>> From: Jon Masters <jcm@redhat.com>
>> To: speck for Greg KH <speck@linutronix.de>
>> Subject: Re: [PATCH v6 31/43] MDSv6
>
>> On 2/25/19 10:49 AM, speck for Greg KH wrote:
>>> On Mon, Feb 25, 2019 at 07:34:11AM -0800, speck for Andi Kleen wrote:
>>
>>
>>>> However I will probably not be able to write a detailed
>>>> description for each of the interrupt handlers changed because
>>>> there are just too many.
>>>
>>> Then how do you expect each subsystem / driver author to know if this is
>>> an acceptable change or not? How do you expect to educate driver
>>> authors to have them determine if they need to do this on their new
>>> drivers or not? Are you going to hand-audit each new driver that gets
>>> added to the kernel for forever?
>>>
>>> Without this type of information, this seems like a futile exercise.
>>
>> Forgive me if I'm being too cautious here, but it seems to make most
>> sense to have the basic MDS infrastructure in place at unembargo. Unless
>> it's very clear how the auto stuff can be safe, and the audit
>> comprehensive, I wonder if that shouldn't just be done after.
>
> I thought that was what Thomas's patchset provided and is what was
> alluded to in patch 00/43 of this series.
Indeed. I'm asking whether we're trying to figure out the "auto" stuff
as well before unembargo, or is the other discussion just for planning?
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V4 00/11] MDS basics
@ 2019-02-22 22:24 Thomas Gleixner
2019-02-22 22:24 ` [patch V4 04/11] x86/speculation/mds: Add mds_clear_cpu_buffer() Thomas Gleixner
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-02-22 22:24 UTC (permalink / raw)
To: speck
Hi!
Another day, another update.
Changes since V3:
- Add the #DF mitigation and document why I can't be bothered
to sprinkle the buffer clear into #MC
- Add a comment about the segment selector choice. It makes sense on its
own but it won't prevent anyone from thinking that we're crazy.
- Addressed the review feedback vs. documentation
- Resurrected the admin documentation patch, tidied it up and filled the
gaps.
Delta patch without the admin documentation parts below.
Git tree WIP.mds branch is updated as well.
If any of the people new to this needs access to the git repo,
please send me a public SSH key so I can add to the gitolite config.
There is one point left which I did not look into yet and I'm happy to
delegate that to the virtualization wizards:
XEON PHI is not affected by L1TF, so it won't get the L1TF
mitigations. But it is affected by MSBDS, so it needs separate
mitigation, i.e. clearing CPU buffers on VMENTER.
Thanks,
Thomas
8<-------------------
Documentation/ABI/testing/sysfs-devices-system-cpu | 1
Documentation/admin-guide/hw-vuln/index.rst | 13 +
Documentation/admin-guide/hw-vuln/l1tf.rst | 1
Documentation/admin-guide/hw-vuln/mds.rst | 258 +++++++++++++++++++++
Documentation/admin-guide/index.rst | 6
Documentation/admin-guide/kernel-parameters.txt | 27 ++
Documentation/index.rst | 1
Documentation/x86/conf.py | 10
Documentation/x86/index.rst | 8
Documentation/x86/mds.rst | 205 ++++++++++++++++
arch/x86/entry/common.c | 10
arch/x86/include/asm/cpufeatures.h | 2
arch/x86/include/asm/irqflags.h | 4
arch/x86/include/asm/msr-index.h | 39 +--
arch/x86/include/asm/mwait.h | 7
arch/x86/include/asm/nospec-branch.h | 39 +++
arch/x86/include/asm/processor.h | 7
arch/x86/kernel/cpu/bugs.c | 105 ++++++++
arch/x86/kernel/cpu/common.c | 13 +
arch/x86/kernel/nmi.c | 6
arch/x86/kernel/traps.c | 9
arch/x86/kvm/cpuid.c | 3
drivers/base/cpu.c | 8
include/linux/cpu.h | 2
24 files changed, 762 insertions(+), 22 deletions(-)
diff --git a/Documentation/x86/mds.rst b/Documentation/x86/mds.rst
index 0c0d802367e6..ce3dbddbd3b8 100644
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -1,7 +1,12 @@
Microarchitectural Data Sampling (MDS) mitigation
=================================================
-Microarchitectural Data Sampling (MDS) is a class of side channel attacks
+.. _mds:
+
+Overview
+--------
+
+Microarchitectural Data Sampling (MDS) is a family of side channel attacks
on internal buffers in Intel CPUs. The variants are:
- Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
@@ -33,6 +38,7 @@ faulting or assisting loads under certain conditions, which again can be
exploited eventually. Load ports are shared between Hyper-Threads so cross
thread leakage is possible.
+
Exposure assumptions
--------------------
@@ -48,7 +54,7 @@ needed for exploiting MDS requires:
- to control the pointer through which the disclosure gadget exposes the
data
-The existance of such a construct cannot be excluded with 100% certainty,
+The existence of such a construct cannot be excluded with 100% certainty,
but the complexity involved makes it extremely unlikely.
There is one exception, which is untrusted BPF. The functionality of
@@ -91,13 +97,37 @@ the invocation can be enforced or conditional.
As a special quirk to address virtualization scenarios where the host has
the microcode updated, but the hypervisor does not (yet) expose the
MD_CLEAR CPUID bit to guests, the kernel issues the VERW instruction in the
-hope that it might work. The state is reflected accordingly.
+hope that it might actually clear the buffers. The state is reflected
+accordingly.
According to current knowledge additional mitigations inside the kernel
itself are not required because the necessary gadgets to expose the leaked
data cannot be controlled in a way which allows exploitation from malicious
user space or VM guests.
+
+Kernel internal mitigation modes
+--------------------------------
+
+ ======= ===========================================================
+ off Mitigation is disabled. Either the CPU is not affected or
+ mds=off is supplied on the kernel command line
+
+ full Mitigation is enabled. CPU is affected and MD_CLEAR is
+ advertised in CPUID.
+
+ vmwerv Mitigation is enabled. CPU is affected and MD_CLEAR is not
+ advertised in CPUID. That is mainly for virtualization
+ scenarios where the host has the updated microcode but the
+ hypervisor does not expose MD_CLEAR in CPUID. It's a best
+ effort approach without guarantee.
+ ======= ===========================================================
+
+If the CPU is affected and mds=off is not supplied on the kernel
+command line then the kernel selects the appropriate mitigation mode
+depending on the availability of the MD_CLEAR CPUID bit.
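+
+The selection boils down to this sketch; mds_off_cmdline stands in
+for the command line handling, the flag and mode names are the ones
+used by this series:
+
+  if (!boot_cpu_has_bug(X86_BUG_MDS) || mds_off_cmdline)
+        mds_mitigation = MDS_MITIGATION_OFF;
+  else if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
+        mds_mitigation = MDS_MITIGATION_FULL;
+  else
+        mds_mitigation = MDS_MITIGATION_VMWERV;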
+
+
Mitigation points
-----------------
@@ -128,8 +158,16 @@ Mitigation points
coverage.
There is one non maskable exception which returns through paranoid exit
- and is not mitigated: #DF. If user space is able to trigger a double
- fault the possible MDS leakage is the least problem to worry about.
+ and is to some extent controllable from user space through
+ modify_ldt(2): #DF. So mitigation is required in the double fault
+ handler as well.
+
+ Another corner case is a #MC which hits between the buffer clear and the
+ actual return to user. As this still is in kernel space it takes the
+ paranoid exit path which does not clear the CPU buffers. So the #MC
+ handler repopulates the buffers to some extent. Machine checks are not
+ reliably controllable and the window is extremely small so mitigation
+ would just tick a checkbox that this theoretical corner case is covered.
2. C-State transition
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 8be9158d848e..3e27ccd6d5c5 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -338,6 +338,8 @@ static inline void mds_clear_cpu_buffers(void)
* Has to be the memory-operand variant because only that
* guarantees the CPU buffer flush functionality according to
* documentation. The register-operand variant does not.
+ * Works with any segment selector, but a valid writable
+ * data segment is the fastest variant.
*
* "cc" clobber is required because VERW modifies ZF.
*/
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 0fb241a78de3..83b19bb54093 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -68,6 +68,7 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
DEFINE_STATIC_KEY_FALSE(mds_user_clear);
/* Control MDS CPU buffer clear before idling (halt, mwait) */
DEFINE_STATIC_KEY_FALSE(mds_idle_clear);
+EXPORT_SYMBOL_GPL(mds_idle_clear);
void __init check_bugs(void)
{
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 9b7c4ca8f0a7..d2779f4730f5 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -366,6 +366,15 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
regs->ip = (unsigned long)general_protection;
regs->sp = (unsigned long)&gpregs->orig_ax;
+ /*
+ * This situation can be triggered by userspace via
+ * modify_ldt(2) and the return does not take the regular
+ * user space exit, so a CPU buffer clear is required when
+ * MDS mitigation is enabled.
+ */
+ if (static_branch_unlikely(&mds_user_clear))
+ mds_clear_cpu_buffers();
+
return;
}
#endif
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [patch V4 04/11] x86/speculation/mds: Add mds_clear_cpu_buffer()
2019-02-22 22:24 [patch V4 00/11] MDS basics Thomas Gleixner
@ 2019-02-22 22:24 ` Thomas Gleixner
2019-02-26 14:19 ` [MODERATED] " Josh Poimboeuf
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-02-22 22:24 UTC (permalink / raw)
To: speck; +Cc: Borislav Petkov, Greg Kroah-Hartman
From: Thomas Gleixner <tglx@linutronix.de>
The Microarchitectural Data Sampling (MDS) vulnerabilities are mitigated by
clearing the affected CPU buffers. The mechanism for clearing the buffers
uses the unused and obsolete VERW instruction in combination with a
microcode update which triggers a CPU buffer clear when VERW is executed.
Provide an inline function with the assembly magic. The argument of the VERW
instruction must be a memory operand as documented:
"MD_CLEAR enumerates that the memory-operand variant of VERW (for
example, VERW m16) has been extended to also overwrite buffers affected
by MDS. This buffer overwriting functionality is not guaranteed for the
register operand variant of VERW."
Documentation also recommends to use a writable data segment selector:
"The buffer overwriting occurs regardless of the result of the VERW
permission check, as well as when the selector is null or causes a
descriptor load segment violation. However, for lowest latency we
recommend using a selector that indicates a valid writable data
segment."
Add x86 specific documentation about MDS and the internal workings of the
mitigation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
V3 --> V4: Document the segment selector choice as well.
V2 --> V3: Add VERW documentation and fix typos/grammar..., dropped 'i(0)'
Add more details to the documentation file
V1 --> V2: Add "cc" clobber and documentation
---
Documentation/index.rst | 1
Documentation/x86/conf.py | 10 +++
Documentation/x86/index.rst | 8 ++
Documentation/x86/mds.rst | 100 +++++++++++++++++++++++++++++++++++
arch/x86/include/asm/nospec-branch.h | 25 ++++++++
5 files changed, 144 insertions(+)
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -101,6 +101,7 @@ implementation.
:maxdepth: 2
sh/index
+ x86/index
Filesystem Documentation
------------------------
--- /dev/null
+++ b/Documentation/x86/conf.py
@@ -0,0 +1,10 @@
+# -*- coding: utf-8; mode: python -*-
+
+project = "X86 architecture specific documentation"
+
+tags.add("subproject")
+
+latex_documents = [
+ ('index', 'x86.tex', project,
+ 'The kernel development community', 'manual'),
+]
--- /dev/null
+++ b/Documentation/x86/index.rst
@@ -0,0 +1,8 @@
+==========================
+x86 architecture specifics
+==========================
+
+.. toctree::
+ :maxdepth: 1
+
+ mds
--- /dev/null
+++ b/Documentation/x86/mds.rst
@@ -0,0 +1,100 @@
+Microarchitectural Data Sampling (MDS) mitigation
+=================================================
+
+.. _mds:
+
+Overview
+--------
+
+Microarchitectural Data Sampling (MDS) is a family of side channel attacks
+on internal buffers in Intel CPUs. The variants are:
+
+ - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
+ - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
+ - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
+
+MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
+dependent load (store-to-load forwarding) as an optimization. The forward
+can also happen to a faulting or assisting load operation for a different
+memory address, which can be exploited under certain conditions. Store
+buffers are partitioned between Hyper-Threads so cross thread forwarding is
+not possible. But if a thread enters or exits a sleep state the store
+buffer is repartitioned which can expose data from one thread to the other.
+
+MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
+L1 miss situations and to hold data which is returned or sent in response
+to a memory or I/O operation. Fill buffers can forward data to a load
+operation and also write data to the cache. When the fill buffer is
+deallocated it can retain the stale data of the preceding operations which
+can then be forwarded to a faulting or assisting load operation, which can
+be exploited under certain conditions. Fill buffers are shared between
+Hyper-Threads so cross thread leakage is possible.
+
+MLDPS leaks Load Port Data. Load ports are used to perform load operations
+from memory or I/O. The received data is then forwarded to the register
+file or a subsequent operation. In some implementations the Load Port can
+contain stale data from a previous operation which can be forwarded to
+faulting or assisting loads under certain conditions, which again can be
+exploited eventually. Load ports are shared between Hyper-Threads so cross
+thread leakage is possible.
+
+
+Exposure assumptions
+--------------------
+
+It is assumed that attack code resides in user space or in a guest with one
+exception. The rationale behind this assumption is that the code construct
+needed for exploiting MDS requires:
+
+ - to control the load to trigger a fault or assist
+
+ - to have a disclosure gadget which exposes the speculatively accessed
+ data for consumption through a side channel.
+
+ - to control the pointer through which the disclosure gadget exposes the
+ data
+
+The existence of such a construct cannot be excluded with 100% certainty,
+but the complexity involved makes it extremely unlikely.
+
+There is one exception, which is untrusted BPF. The functionality of
+untrusted BPF is limited, but it needs to be thoroughly investigated
+whether it can be used to create such a construct.
+
+
+Mitigation strategy
+-------------------
+
+All variants have the same mitigation strategy at least for the single CPU
+thread case (SMT off): Force the CPU to clear the affected buffers.
+
+This is achieved by using the otherwise unused and obsolete VERW
+instruction in combination with a microcode update. The microcode clears
+the affected CPU buffers when the VERW instruction is executed.
+
+For virtualization there are two ways to achieve CPU buffer
+clearing. Either the modified VERW instruction or via the L1D Flush
+command. The latter is issued when L1TF mitigation is enabled so the extra
+VERW can be avoided. If the CPU is not affected by L1TF then VERW needs to
+be issued.
+
+If the VERW instruction with the supplied segment selector argument is
+executed on a CPU without the microcode update there is no side effect
+other than a small number of pointlessly wasted CPU cycles.
+
+This does not protect against cross Hyper-Thread attacks except for MSBDS
+which is only exploitable cross Hyper-thread when one of the Hyper-Threads
+enters a C-state.
+
+The kernel provides a function to invoke the buffer clearing:
+
+ mds_clear_cpu_buffers()
+
+The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state
+(idle) transitions. Depending on the mitigation mode and the system state
+the invocation can be enforced or conditional.
+
+According to current knowledge additional mitigations inside the kernel
+itself are not required because the necessary gadgets to expose the leaked
+data cannot be controlled in a way which allows exploitation from malicious
+user space or VM guests.
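+
+A typical mitigation point then uses the pattern from the entry and
+trap code of this series:
+
+  if (static_branch_unlikely(&mds_user_clear))
+          mds_clear_cpu_buffers();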
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,6 +318,31 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_
DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+#include <asm/segment.h>
+
+/**
+ * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * This uses the otherwise unused and obsolete VERW instruction in
+ * combination with microcode which triggers a CPU buffer flush when the
+ * instruction is executed.
+ */
+static inline void mds_clear_cpu_buffers(void)
+{
+ static const u16 ds = __KERNEL_DS;
+
+ /*
+ * Has to be the memory-operand variant because only that
+ * guarantees the CPU buffer flush functionality according to
+ * documentation. The register-operand variant does not.
+ * Works with any segment selector, but a valid writable
+ * data segment is the fastest variant.
+ *
+ * "cc" clobber is required because VERW modifies ZF.
+ */
+ asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
+}
+
#endif /* __ASSEMBLY__ */
/*
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V4 04/11] x86/speculation/mds: Add mds_clear_cpu_buffer()
2019-02-22 22:24 ` [patch V4 04/11] x86/speculation/mds: Add mds_clear_cpu_buffer() Thomas Gleixner
@ 2019-02-26 14:19 ` Josh Poimboeuf
2019-03-01 20:58 ` [MODERATED] Encrypted Message Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Josh Poimboeuf @ 2019-02-26 14:19 UTC (permalink / raw)
To: speck
On Fri, Feb 22, 2019 at 11:24:22PM +0100, speck for Thomas Gleixner wrote:
> +MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
> +L1 miss situations and to hold data which is returned or sent in response
> +to a memory or I/O operation. Fill buffers can forward data to a load
> +operation and also write data to the cache. When the fill buffer is
> +deallocated it can retain the stale data of the preceding operations which
> +can then be forwarded to a faulting or assisting load operation, which can
> +be exploited under certain conditions. Fill buffers are shared between
> +Hyper-Threads so cross thread leakage is possible.
> +
> +MLDPS leaks Load Port Data. Load ports are used to perform load operations
MLPDS
> +from memory or I/O. The received data is then forwarded to the register
> +file or a subsequent operation. In some implementations the Load Port can
> +contain stale data from a previous operation which can be forwarded to
> +faulting or assisting loads under certain conditions, which again can be
> +exploited eventually. Load ports are shared between Hyper-Threads so cross
> +thread leakage is possible.
> +
> +
> +Exposure assumptions
> +--------------------
> +
> +It is assumed that attack code resides in user space or in a guest with one
> +exception. The rationale behind this assumption is that the code construct
> +needed for exploiting MDS requires:
> +
> + - to control the load to trigger a fault or assist
> +
> + - to have a disclosure gadget which exposes the speculatively accessed
> + data for consumption through a side channel.
> +
> + - to control the pointer through which the disclosure gadget exposes the
> + data
> +
> +The existence of such a construct cannot be excluded with 100% certainty,
> +but the complexity involved makes it extremely unlikely.
The existence of such a construct *in the kernel* cannot be excluded...
> +There is one exception, which is untrusted BPF. The functionality of
> +untrusted BPF is limited, but it needs to be thoroughly investigated
> +whether it can be used to create such a construct.
> +
> +
> +Mitigation strategy
> +-------------------
> +
> +All variants have the same mitigation strategy at least for the single CPU
> +thread case (SMT off): Force the CPU to clear the affected buffers.
> +
> +This is achieved by using the otherwise unused and obsolete VERW
> +instruction in combination with a microcode update. The microcode clears
> +the affected CPU buffers when the VERW instruction is executed.
> +
> +For virtualization there are two ways to achieve CPU buffer
> +clearing. Either the modified VERW instruction or via the L1D Flush
> +command. The latter is issued when L1TF mitigation is enabled so the extra
> +VERW can be avoided. If the CPU is not affected by L1TF then VERW needs to
> +be issued.
> +
> +If the VERW instruction with the supplied segment selector argument is
> +executed on a CPU without the microcode update there is no side effect
> +other than a small number of pointlessly wasted CPU cycles.
> +
> +This does not protect against cross Hyper-Thread attacks except for MSBDS
> +which is only exploitable cross Hyper-thread when one of the Hyper-Threads
> +enters a C-state.
> +
> +The kernel provides a function to invoke the buffer clearing:
> +
> + mds_clear_cpu_buffers()
> +
> +The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state
> +(idle) transitions. Depending on the mitigation mode and the system state
> +the invocation can be enforced or conditional.
The conditional bit isn't true (yet?).
What does "enforced" mean in this context? s/enforced/unconditional ?
Maybe the last sentence can be removed entirely.
--
Josh
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-02-26 14:19 ` [MODERATED] " Josh Poimboeuf
@ 2019-03-01 20:58 ` Jon Masters
2019-03-01 22:14 ` Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Jon Masters @ 2019-03-01 20:58 UTC (permalink / raw)
To: speck
From: Jon Masters <jcm@redhat.com>
To: speck for Josh Poimboeuf <speck@linutronix.de>
Subject: Re: [patch V4 04/11] x86/speculation/mds: Add mds_clear_cpu_buffer()
On 2/26/19 9:19 AM, speck for Josh Poimboeuf wrote:
> On Fri, Feb 22, 2019 at 11:24:22PM +0100, speck for Thomas Gleixner wrote:
>> +MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
>> +L1 miss situations and to hold data which is returned or sent in response
>> +to a memory or I/O operation. Fill buffers can forward data to a load
>> +operation and also write data to the cache. When the fill buffer is
>> +deallocated it can retain the stale data of the preceding operations which
>> +can then be forwarded to a faulting or assisting load operation, which can
>> +be exploited under certain conditions. Fill buffers are shared between
>> +Hyper-Threads so cross thread leakage is possible.
The fill buffers sit opposite the L1D$ and participate in coherency
directly. They supply data directly to the load store units. Here's the
internal summary I wrote (feel free to use any of it that is useful):
"Intel processors utilize fill buffers to perform loads of data when a
miss occurs in the Level 1 data cache. The fill buffer allows the
processor to implement a non-blocking cache, continuing with other
operations while the necessary cache data “line” is loaded from a higher
level cache or from memory. It also allows the result of the fill to be
forwarded directly to the EU (Execution Unit) requiring the load,
without waiting for it to be written into the L1 Data Cache.
A load operation is not decoupled in the same way that a store is, but
it does involve an AGU (Address Generation Unit) operation. If the AGU
generates a fault (#PF, etc.) or an assist (A/D bits) then the classical
Intel design would block the load and later reissue it. In contemporary
designs, it instead allows subsequent speculation operations to
temporarily see a forwarded data value from the fill buffer slot prior
to the load actually taking place. Thus it is possible to read data that
was recently accessed by another thread, if the fill buffer entry is not
reused.
It is this attack that allows cross-thread SMT leakage and breaks HT
without recourse other than to disable it or to implement core
scheduling in the Linux kernel.
Variants of this include loads that cross cache or page boundaries due
to further optimizations in Intel’s implementation. For example, Intel
incorporate logic to guess at address generation prior to determining
whether it crosses such a boundary (covered in US5335333A) and will
forward this to the TLB/load logic prior to resolving the full address.
They will retry the load by re-issuing uops in the case of a cross
cacheline/page boundary but in that case will leak state as well."
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-03-01 20:58 ` [MODERATED] Encrypted Message Jon Masters
@ 2019-03-01 22:14 ` Jon Masters
0 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-03-01 22:14 UTC (permalink / raw)
To: speck
From: Jon Masters <jcm@redhat.com>
To: speck for Jon Masters <speck@linutronix.de>
Subject: Re: [patch V4 04/11] x86/speculation/mds: Add mds_clear_cpu_buffer()
On 3/1/19 3:58 PM, speck for Jon Masters wrote:
> On 2/26/19 9:19 AM, speck for Josh Poimboeuf wrote:
>
>> On Fri, Feb 22, 2019 at 11:24:22PM +0100, speck for Thomas Gleixner wrote:
>>> +MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
>>> +L1 miss situations and to hold data which is returned or sent in response
>>> +to a memory or I/O operation. Fill buffers can forward data to a load
>>> +operation and also write data to the cache. When the fill buffer is
>>> +deallocated it can retain the stale data of the preceding operations which
>>> +can then be forwarded to a faulting or assisting load operation, which can
>>> +be exploited under certain conditions. Fill buffers are shared between
>>> +Hyper-Threads so cross thread leakage is possible.
>
> The fill buffers sit opposite the L1D$ and participate in coherency
> directly. They supply data directly to the load store units. Here's the
> internal summary I wrote (feel free to use any of it that is useful):
>
> "Intel processors utilize fill buffers to perform loads of data when a
> miss occurs in the Level 1 data cache. The fill buffer allows the
> processor to implement a non-blocking cache, continuing with other
> operations while the necessary cache data “line” is loaded from a higher
> level cache or from memory. It also allows the result of the fill to be
> forwarded directly to the EU (Execution Unit) requiring the load,
> without waiting for it to be written into the L1 Data Cache.
>
> A load operation is not decoupled in the same way that a store is, but
> it does involve an AGU (Address Generation Unit) operation. If the AGU
> generates a fault (#PF, etc.) or an assist (A/D bits) then the classical
> Intel design would block the load and later reissue it. In contemporary
> designs, it instead allows subsequent speculation operations to
> temporarily see a forwarded data value from the fill buffer slot prior
> to the load actually taking place. Thus it is possible to read data that
> was recently accessed by another thread, if the fill buffer entry is not
> reused.
>
> It is this attack that allows cross-thread SMT leakage and breaks HT
> without recourse other than to disable it or to implement core
> scheduling in the Linux kernel.
>
> Variants of this include loads that cross cache or page boundaries due
> to further optimizations in Intel’s implementation. For example, Intel
> incorporate logic to guess at address generation prior to determining
> whether it crosses such a boundary (covered in US5335333A) and will
> forward this to the TLB/load logic prior to resolving the full address.
> They will retry the load by re-issuing uops in the case of a cross
> cacheline/page boundary but in that case will leak state as well."
Btw, I've various reproducers here that I'm happy to share if useful
with the right folks. Thomas and Linus should already have my IFU one
for later testing of that; I also have e.g. an FBBF one. Currently it just
spews whatever it sees from the other threads, but in the next few days
I'll have it cleaned up to send/receive specific messages - then can
just wrap it with a bow so it can print yes/no vulnerable.
Ping if you have a need for a repro (keybase/email) and I'll go through
our process for sharing as appropriate.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V3 0/9] MDS basics 0
@ 2019-02-21 23:44 Thomas Gleixner
2019-02-21 23:44 ` [patch V3 4/9] MDS basics 4 Thomas Gleixner
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
To: speck
Hi!
Thanks for the valuable feedback to everyone!
Changes since V2:
- Added the NMI mitigation and added an explanation. Thanks Andi and
Kees.
- Fixed the VERW asm magic as pointed out by Andrew and added
more explanation as requested by Borislav and Andrew.
- Adopted Peter's static branch suggestions
- Renamed the _HOPE mode to _VMWERV along with an explanation of the
acronym in the changelog. Thanks Mark for the inspiration.
- Updated documentation. The return to user section has changed a
lot. Added some explanation about assumptions and hopefully fixed all
issues mentioned by Borislav, Andrew, Greg....
- Cleaned up the bitmask issues in the speculation MSR defines as
pointed out by Greg.
- Got the Copy & Paste in the sysfs code right this time.
- Dropped the conditional mode stuff for now. Needs more thought on
all ends and I wish we just don't need it at all :)
- Collected a few Reviewed-by tags, but not for the patches which
have significant changes.
The admin documentation is still WIP, so not included.
It's also available through the git repository in the force updated
branch: WIP.mds
Thanks,
tglx
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V3 4/9] MDS basics 4
2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
2019-02-22 7:45 ` [MODERATED] Encrypted Message Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
To: speck
Subject: [patch V3 4/9] x86/speculation/mds: Add mds_clear_cpu_buffer()
From: Thomas Gleixner <tglx@linutronix.de>
The Microarchitectural Data Sampling (MDS) vulnerabilities are mitigated by
clearing the affected CPU buffers. The mechanism for clearing the buffers
uses the unused and obsolete VERW instruction in combination with a
microcode update which triggers a CPU buffer clear when VERW is executed.
Provide an inline function with the assembly magic. The argument of the VERW
instruction must be a memory operand as documented:
"MD_CLEAR enumerates that the memory-operand variant of VERW (for
example, VERW m16) has been extended to also overwrite buffers affected
by MDS. This buffer overwriting functionality is not guaranteed for the register
operand variant of VERW."
Add x86 specific documentation about MDS and the internal workings of the
mitigation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2 --> V3: Add VERW documentation and fix typos/grammar..., dropped 'i(0)'
Add more details to the documentation file
V1 --> V2: Add "cc" clobber and documentation
---
Documentation/index.rst | 1
Documentation/x86/conf.py | 10 +++
Documentation/x86/index.rst | 8 ++
Documentation/x86/mds.rst | 94 +++++++++++++++++++++++++++++++++++
arch/x86/include/asm/nospec-branch.h | 23 ++++++++
5 files changed, 136 insertions(+)
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -101,6 +101,7 @@ implementation.
:maxdepth: 2
sh/index
+ x86/index
Filesystem Documentation
------------------------
--- /dev/null
+++ b/Documentation/x86/conf.py
@@ -0,0 +1,10 @@
+# -*- coding: utf-8; mode: python -*-
+
+project = "X86 architecture specific documentation"
+
+tags.add("subproject")
+
+latex_documents = [
+ ('index', 'x86.tex', project,
+ 'The kernel development community', 'manual'),
+]
--- /dev/null
+++ b/Documentation/x86/index.rst
@@ -0,0 +1,8 @@
+==========================
+x86 architecture specifics
+==========================
+
+.. toctree::
+ :maxdepth: 1
+
+ mds
--- /dev/null
+++ b/Documentation/x86/mds.rst
@@ -0,0 +1,94 @@
+Microarchitectural Data Sampling (MDS) mitigation
+=================================================
+
+Microarchitectural Data Sampling (MDS) is a class of side channel attacks
+on internal buffers in Intel CPUs. The variants are:
+
+ - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
+ - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
+ - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
+
+MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
+dependent load (store-to-load forwarding) as an optimization. The forward
+can also happen to a faulting or assisting load operation for a different
+memory address, which can be exploited under certain conditions. Store
+buffers are partitioned between Hyper-Threads so cross thread forwarding is
+not possible. But if a thread enters or exits a sleep state the store
+buffer is repartitioned which can expose data from one thread to the other.
+
+MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
+L1 miss situations and to hold data which is returned or sent in response
+to a memory or I/O operation. Fill buffers can forward data to a load
+operation and also write data to the cache. When the fill buffer is
+deallocated it can retain the stale data of the preceding operations which
+can then be forwarded to a faulting or assisting load operation, which can
+be exploited under certain conditions. Fill buffers are shared between
+Hyper-Threads so cross thread leakage is possible.
+
+MLPDS leaks Load Port Data. Load ports are used to perform load operations
+from memory or I/O. The received data is then forwarded to the register
+file or a subsequent operation. In some implementations the Load Port can
+contain stale data from a previous operation which can be forwarded to
+faulting or assisting loads under certain conditions, which again can be
+exploited eventually. Load ports are shared between Hyper-Threads so cross
+thread leakage is possible.
+
+Exposure assumptions
+--------------------
+
+It is assumed that attack code resides in user space or in a guest with one
+exception. The rationale behind this assumption is that the code construct
+needed for exploiting MDS requires:
+
+ - to control the load to trigger a fault or assist
+
+ - to have a disclosure gadget which exposes the speculatively accessed
+ data for consumption through a side channel.
+
+ - to control the pointer through which the disclosure gadget exposes the
+ data
+
+The existence of such a construct cannot be excluded with 100% certainty,
+but the complexity involved makes it extremely unlikely.
+
+There is one exception, which is untrusted BPF. The functionality of
+untrusted BPF is limited, but it needs to be thoroughly investigated
+whether it can be used to create such a construct.
+
+
+Mitigation strategy
+-------------------
+
+All variants have the same mitigation strategy at least for the single CPU
+thread case (SMT off): Force the CPU to clear the affected buffers.
+
+This is achieved by using the otherwise unused and obsolete VERW
+instruction in combination with a microcode update. The microcode clears
+the affected CPU buffers when the VERW instruction is executed.
+
+For virtualization there are two ways to achieve CPU buffer clearing:
+either via the modified VERW instruction or via the L1D Flush command.
+The latter is issued when the L1TF mitigation is enabled, so the extra
+VERW can be avoided. If the CPU is not affected by L1TF then VERW needs
+to be issued.
+
+If the VERW instruction with the supplied segment selector argument is
+executed on a CPU without the microcode update there is no side effect
+other than a small number of pointlessly wasted CPU cycles.
+
+This does not protect against cross Hyper-Thread attacks except for MSBDS
+which is only exploitable cross Hyper-thread when one of the Hyper-Threads
+enters a C-state.
+
+The kernel provides a function to invoke the buffer clearing:
+
+ mds_clear_cpu_buffers()
+
+The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state
+(idle) transitions. Depending on the mitigation mode and the system state
+the invocation can be enforced or conditional.
+
+According to current knowledge additional mitigations inside the kernel
+itself are not required because the necessary gadgets to expose the leaked
+data cannot be controlled in a way which allows exploitation from malicious
+user space or VM guests.
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,6 +318,29 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_
DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+#include <asm/segment.h>
+
+/**
+ * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * This uses the otherwise unused and obsolete VERW instruction in
+ * combination with microcode which triggers a CPU buffer flush when the
+ * instruction is executed.
+ */
+static inline void mds_clear_cpu_buffers(void)
+{
+ static const u16 ds = __KERNEL_DS;
+
+ /*
+ * Has to be the memory-operand variant because only that
+ * guarantees the CPU buffer flush functionality according to
+ * documentation. The register-operand variant does not.
+ *
+ * "cc" clobber is required because VERW modifies ZF.
+ */
+ asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
+}
+
#endif /* __ASSEMBLY__ */
/*
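A minimal usage sketch (not part of this patch): the helper is meant to be
invoked on the exit-to-user path behind a static key, and a later patch in
this series adds exactly this hookup:
	static inline void mds_user_clear_cpu_buffers(void)
	{
		if (static_branch_likely(&mds_user_clear_always))
			mds_clear_cpu_buffers();
	}
The static key keeps the overhead down to a patched-out branch when the
mitigation is disabled.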
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V2 00/10] MDS basics+ 0
@ 2019-02-20 15:07 Thomas Gleixner
2019-02-20 15:07 ` [patch V2 04/10] MDS basics+ 4 Thomas Gleixner
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-02-20 15:07 UTC (permalink / raw)
To: speck
Hi!
This is an update to yesterday's series with the following changes:
- Addressed review comments (on/off list)
- Changed the approach with static keys slightly
- Added "cc" clobber to the VERW asm magic (spotted by Peterz)
- Added x86 specific documentation which explains the mitigation methods
and details on why particular code paths are excluded.
- Added an internal 'HOPE' mitigation mode to address the VMWare wish.
- Added the basic infrastructure for conditional mode
Dropped the documentation patch for now as I'm not done with updating it
and I have to run now and attend my grandson's birthday party.
Thanks,
tglx
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V2 04/10] MDS basics+ 4
2019-02-20 15:07 [patch V2 00/10] MDS basics+ 0 Thomas Gleixner
@ 2019-02-20 15:07 ` Thomas Gleixner
2019-02-20 17:10 ` [MODERATED] " mark gross
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-02-20 15:07 UTC (permalink / raw)
To: speck
Subject: [patch V2 04/10] x86/speculation/mds: Clear CPU buffers on exit to user
From: Thomas Gleixner <tglx@linutronix.de>
Add a static key which controls the invocation of the CPU buffer clear
mechanism on exit to user space and add the call into
prepare_exit_to_usermode() right before actually returning.
Add documentation which kernel to user space transition this covers and
explain in detail why those which are not mitigated do not need it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
Documentation/x86/mds.rst | 79 +++++++++++++++++++++++++++++++++++
arch/x86/entry/common.c | 9 +++
arch/x86/include/asm/nospec-branch.h | 2
arch/x86/kernel/cpu/bugs.c | 4 +
4 files changed, 93 insertions(+), 1 deletion(-)
--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -64,3 +64,82 @@ itself are not required because the nece
data cannot be controlled in a way which allows exploitation from malicious
user space or VM guests.
+Mitigation points
+-----------------
+
+1. Return to user space
+^^^^^^^^^^^^^^^^^^^^^^^
+ When transitioning from kernel to user space the CPU buffers are flushed
+ on affected CPUs:
+
+ - always when the mitigation mode is full. In this case the invocation
+ depends on the static key mds_user_clear_always.
+
+ - depending on executed functions between entering kernel space and
+ returning to user space. This is not yet implemented.
+
+ This covers transitions from kernel to user space through a return to
+ user space from a syscall and from an interrupt or a regular exception.
+
+ There are other kernel to user space transitions which are not covered
+ by this: NMIs and all non maskable exceptions which go through the
+ paranoid exit, which means that they are not going to the regular
+ prepare_exit_to_usermode() exit path which handles the CPU buffer
+ clearing.
+
+ The occasional non maskable exceptions which go through paranoid exit
+ are not controllable by user space in any way and most of these
+ exceptions cannot expose any valuable information either.
+
+ Neither can NMIs be reliably controlled by a non-privileged attacker
+ and their exposure to sensitive data is very limited. NMIs originate
+ from:
+
+ - Performance monitoring.
+
+ Performance monitoring is restricted by various mechanisms, i.e. a
+ regular user on a properly secured system can - if at all - only
+ monitor its own user space processes. The performance monitoring
+ NMI surely executes privileged kernel code and accesses kernel
+ internal data structures, which might be exploitable to break the
+ kernel's address space layout randomization, which is a non-issue
+ on affected CPUs as there are simpler ways to achieve that.
+
+ - Watchdog
+
+ The kernel uses - if enabled - a performance monitoring event to
+ trigger NMIs periodically which allow detection of hard lockups in
+ kernel space due to deadlocks or other issues.
+
+ The watchdog period is a multiple of seconds and the code path
+ executed cannot expose any secret information other than kernel
+ address space layout. Due to the low frequency and the limited
+ ability of a potential attacker to align with the watchdog period,
+ the attack surface is close to zero.
+
+ - Legacy oprofile NMI handler
+
+ Similar to performance monitoring, albeit potentially less
+ restricted, but has been widely replaced by the performance
+ monitoring interface perf. State of the art systems will not expose
+ the oprofile interface and even if exposed the potentially
+ exploitable information is accessible by other and simpler means.
+
+ - KGDB
+
+ If the kernel debugger is accessible by an unprivileged attacker,
+ then the NMI handler is the least of the problems.
+
+ - ACPI/GHES
+
+ A firmware based error reporting mechanism which uses NMIs for
+ notification. Similar to Machine Check Exceptions there is no known
+ way for an attacker to reliably control and trigger errors which
+ would cause NMIs. Even if that would be the case the potentially
+ exploitable data, e.g. kernel address space layout, would be
+ accessible by simpler means.
+
+ - IPMI, vendor specific NMIs, forced shutdown NMI
+
+ None of those are controllable by unprivileged attackers to form a
+ reliable exploit surface.
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -31,6 +31,7 @@
#include <asm/vdso.h>
#include <linux/uaccess.h>
#include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>
#define CREATE_TRACE_POINTS
#include <trace/events/syscalls.h>
@@ -180,6 +181,12 @@ static void exit_to_usermode_loop(struct
}
}
+static inline void mds_user_clear_cpu_buffers(void)
+{
+ if (static_branch_likely(&mds_user_clear_always))
+ mds_clear_cpu_buffers();
+}
+
/* Called with IRQs disabled. */
__visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
{
@@ -212,6 +219,8 @@ static void exit_to_usermode_loop(struct
#endif
user_enter_irqoff();
+
+ mds_user_clear_cpu_buffers();
}
#define SYSCALL_EXIT_WORK_FLAGS \
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,6 +318,8 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_
DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+DECLARE_STATIC_KEY_FALSE(mds_user_clear_always);
+
#include <asm/segment.h>
/**
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -63,10 +63,12 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_i
/* Control unconditional IBPB in switch_mm() */
DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
+/* Control MDS CPU buffer clear before returning to user space */
+DEFINE_STATIC_KEY_FALSE(mds_user_clear_always);
+
void __init check_bugs(void)
{
identify_boot_cpu();
-
/*
* identify_boot_cpu() initialized SMT support information, let the
* core code know.
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [patch V2 04/10] MDS basics+ 4
2019-02-20 15:07 ` [patch V2 04/10] MDS basics+ 4 Thomas Gleixner
@ 2019-02-20 17:10 ` mark gross
2019-02-21 19:26 ` [MODERATED] Encrypted Message Tim Chen
0 siblings, 1 reply; 89+ messages in thread
From: mark gross @ 2019-02-20 17:10 UTC (permalink / raw)
To: speck
On Wed, Feb 20, 2019 at 04:07:57PM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V2 04/10] x86/speculation/mds: Clear CPU buffers on exit to user
> From: Thomas Gleixner <tglx@linutronix.de>
>
> Add a static key which controls the invocation of the CPU buffer clear
> mechanism on exit to user space and add the call into
> prepare_exit_to_usermode() right before actually returning.
>
> Add documentation which kernel to user space transition this covers and
> explain in detail why those which are not mitigated do not need it.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> Documentation/x86/mds.rst | 79 +++++++++++++++++++++++++++++++++++
> arch/x86/entry/common.c | 9 +++
> arch/x86/include/asm/nospec-branch.h | 2
> arch/x86/kernel/cpu/bugs.c | 4 +
> 4 files changed, 93 insertions(+), 1 deletion(-)
>
> --- a/Documentation/x86/mds.rst
> +++ b/Documentation/x86/mds.rst
> @@ -64,3 +64,82 @@ itself are not required because the nece
> data cannot be controlled in a way which allows exploitation from malicious
> user space or VM guests.
>
> +Mitigation points
> +-----------------
> +
> +1. Return to user space
> +^^^^^^^^^^^^^^^^^^^^^^^
> + When transitioning from kernel to user space the CPU buffers are flushed
> + on affected CPUs:
> +
> + - always when the mitigation mode is full. In this case the invocation
> + depends on the static key mds_user_clear_always.
> +
> + - depending on executed functions between entering kernel space and
> + returning to user space. This is not yet implemented.
> +
> + This covers transitions from kernel to user space through a return to
> + user space from a syscall and from an interrupt or a regular exception.
> +
> + There are other kernel to user space transitions which are not covered
> + by this: NMIs and all non maskable exceptions which go through the
> + paranoid exit, which means that they are not going to the regular
> + prepare_exit_to_usermode() exit path which handles the CPU buffer
> + clearing.
> +
> + The occasional non maskable exceptions which go through paranoid exit
> + are not controllable by user space in any way and most of these
> + exceptions cannot expose any valuable information either.
> +
> + Neither can NMIs be reliably controlled by a non-privileged attacker
> + and their exposure to sensitive data is very limited. NMIs originate
> + from:
> +
> + - Performance monitoring.
> +
> + Performance monitoring is restricted by various mechanisms, i.e. a
> + regular user on a properly secured system can - if at all - only
> + monitor its own user space processes. The performance monitoring
> + NMI surely executes privileged kernel code and accesses kernel
> + internal data structures, which might be exploitable to break the
> + kernel's address space layout randomization, which is a non-issue
> + on affected CPUs as there are simpler ways to achieve that.
> +
> + - Watchdog
> +
> + The kernel uses - if enabled - a performance monitoring event to
> + trigger NMIs periodically which allow detection of hard lockups in
> + kernel space due to deadlocks or other issues.
> +
> + The watchdog period is a multiple of seconds and the code path
> + executed cannot expose any secret information other than kernel
> + address space layout. Due to the low frequency and the limited
> + ability of a potential attacker to align with the watchdog period,
> + the attack surface is close to zero.
> +
> + - Legacy oprofile NMI handler
> +
> + Similar to performance monitoring, albeit potentially less
> + restricted, but has been widely replaced by the performance
> + monitoring interface perf. State of the art systems will not expose
> + the oprofile interface and even if exposed the potentially
> + exploitable information is accessible by other and simpler means.
> +
> + - KGDB
> +
> + If the kernel debugger is accessible by an unprivileged attacker,
> + then the NMI handler is the least of the problems.
> +
> + - ACPI/GHES
> +
> + A firmware based error reporting mechanism which uses NMIs for
> + notification. Similar to Machine Check Exceptions there is no known
> + way for an attacker to reliably control and trigger errors which
> + would cause NMIs. Even if that would be the case the potentially
> + exploitable data, e.g. kernel address space layout, would be
> + accessible by simpler means.
> +
> + - IPMI, vendor specific NMIs, forced shutdown NMI
> +
> + None of those are controllable by unprivileged attackers to form a
> + reliable exploit surface.
I agree we need some balance between paranoia and reality.
However, if I'm being pedantic, the attacker-lacks-controllability aspect
of your argument can apply to most aspects of the MDS vulnerability. I think
that's why its name uses "data sampling". Also, I need to ask the chip heads
whether this list of NMIs is complete and can be expected to stay that way
across processor and platform generations.
--mark
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -31,6 +31,7 @@
> #include <asm/vdso.h>
> #include <linux/uaccess.h>
> #include <asm/cpufeature.h>
> +#include <asm/nospec-branch.h>
>
> #define CREATE_TRACE_POINTS
> #include <trace/events/syscalls.h>
> @@ -180,6 +181,12 @@ static void exit_to_usermode_loop(struct
> }
> }
>
> +static inline void mds_user_clear_cpu_buffers(void)
> +{
> + if (static_branch_likely(&mds_user_clear_always))
> + mds_clear_cpu_buffers();
> +}
> +
> /* Called with IRQs disabled. */
> __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
> {
> @@ -212,6 +219,8 @@ static void exit_to_usermode_loop(struct
> #endif
>
> user_enter_irqoff();
> +
> + mds_user_clear_cpu_buffers();
> }
>
> #define SYSCALL_EXIT_WORK_FLAGS \
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -318,6 +318,8 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_
> DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
> DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
>
> +DECLARE_STATIC_KEY_FALSE(mds_user_clear_always);
> +
> #include <asm/segment.h>
>
> /**
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -63,10 +63,12 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_i
> /* Control unconditional IBPB in switch_mm() */
> DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
>
> +/* Control MDS CPU buffer clear before returning to user space */
> +DEFINE_STATIC_KEY_FALSE(mds_user_clear_always);
> +
> void __init check_bugs(void)
> {
> identify_boot_cpu();
> -
> /*
> * identify_boot_cpu() initialized SMT support information, let the
> * core code know.
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch 0/8] MDS basics 0
@ 2019-02-19 12:44 Thomas Gleixner
2019-02-21 16:14 ` [MODERATED] Encrypted Message Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2019-02-19 12:44 UTC (permalink / raw)
To: speck
Subject: [patch 0/8] MDS basics
From: Thomas Gleixner <tglx@linutronix.de>
Hi!
I got the following information yesterday night:
"All - FYI. There has been some chatter/ discussion on the subject.
Hopefully this note will help clarify. We received a report from a
researcher who independently identified what we formerly referred to as
PSF (aka Microarchitectural Store Buffer Data Sampling). There were
some initial indications (this week) this researcher would elect to
release a paper publicly PRIOR to the May 14 embargo was lifted.
We have been working closely with them, and it appears for now that will
NOT be the case. Were that to happen however, we DID begin prepping
materials to disclose PSF ONLY. I.e. we would disclose only that
particular issue after having consulted with this team. This includes a
modified/ reduced section of the existing whitepaper, press statement
and standard security advisory language. We are finalizing this
material and will then hold it in reserve.
As we have done in the past, we would convene a meeting of reps from
this group before activating those assets. I will keep you apprised of
any change in the situation, and can provide those assets for your use/
adaptation once finalized."
This was posted on that keybase.io chat on Friday night and of course not
made available to those who are not part of that. Even people who are
subscribed there missed the message because it scrolled away due to
other chit-chat.
Now we maybe got lucky this time, but I wouldn't hold my breath, as the
probability that other people will figure that out as well is surely way
larger than zero.
If that happens, then it makes exactly ZERO sense to expose only the
MSBDS part as everything else is lumped together with this. But why am
I still trying to make sense of all this?
So while being grumpy about this communication fail, I'm even more
grumpy about the fact that we don't have even the minimal full/off
mitigation in place in a workable form. I asked specifically for this
weeks ago just for the case that the embargo breaks early so we don't
stand there with pants down.
So being grumpy as hell made me sit down and write the basic
mitigation implementation myself (again).
It reuses a single patch from that Intel pile which is defining the
bug and MSR bits. Guess what, it took me less than 4 hours to do so
and another 2 hours in the morning to write at least the basic admin
documentation. The latter surely needs some work still, but I wanted
to get the patches out. There is also another TODO mentioned further
down.
The series comes with:
- A consistent command line interface
- A consistent sysfs interface
- Static key based control for the exit to user and idle invocations
- Dynamic update of the idle invocation key according to the actual SMT
state, similar to the STIBP update.
- Idle invocations are inside the halt/mwait inlines and not randomly
sprinkled all over the kernel tree (see the sketch below).
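A rough sketch of that idle hook; the helper and key names are
illustrative, assuming a static branch which is flipped together with the
SMT state:
	static inline void mds_idle_clear_cpu_buffers(void)
	{
		if (static_branch_likely(&mds_idle_clear))
			mds_clear_cpu_buffers();
	}
	/* Invoked from the halt/mwait inlines, e.g.: */
	static inline void native_safe_halt(void)
	{
		mds_idle_clear_cpu_buffers();
		asm volatile("sti; hlt": : :"memory");
	}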
It builds and boots and while I was able to emit the VERW instruction by
hacking the mitigation selection to omit the MD_CLEAR supported check, I
have no access to real hardware with updated microcode.
This is how it should have looked from the very beginning and the extra
bits and pieces (cond mode) can be built on top of it. Please review and
give it a testride when you have a machine with updated microcode
available.
The lot is also available from the speck git tree in the WIP.mds
branch.
Note that I moved the L1TF document to a separate folder so the hw
vulnerabilities are not showing up at the top level index of the admin
guide as separate items. Should have thought about that back then
already...
TODO:
For CPUs which are not affected by L1TF but are affected by MDS there
needs to be a CPU buffer clearing mitigation at VMENTER. That applies at
least to XEON PHI, SILVERMONT and AIRMONT and probably to some of the
newer models which have RDCL_NO set.
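The shape of the missing piece is roughly the following sketch; the key
name is illustrative and the real condition has to depend on the
mitigation selection:
	/* Sketch: clear CPU buffers right before VMENTER when the
	 * L1D flush mitigation does not already cover the transition.
	 */
	if (static_branch_unlikely(&mds_guest_clear))
		mds_clear_cpu_buffers();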
Thanks,
tglx
8<-----------------------
Documentation/ABI/testing/sysfs-devices-system-cpu | 1
Documentation/admin-guide/hw-vuln/index.rst | 13 +
Documentation/admin-guide/hw-vuln/l1tf.rst | 1
Documentation/admin-guide/hw-vuln/mds.rst | 230 +++++++++++++++++++++
Documentation/admin-guide/index.rst | 6
Documentation/admin-guide/kernel-parameters.txt | 27 ++
arch/x86/entry/common.c | 3
arch/x86/include/asm/cpufeatures.h | 2
arch/x86/include/asm/irqflags.h | 4
arch/x86/include/asm/msr-index.h | 5
arch/x86/include/asm/mwait.h | 7
arch/x86/include/asm/nospec-branch.h | 22 ++
arch/x86/include/asm/processor.h | 6
arch/x86/kernel/cpu/bugs.c | 102 +++++++++
arch/x86/kernel/cpu/common.c | 13 +
drivers/base/cpu.c | 6
include/linux/cpu.h | 2
17 files changed, 443 insertions(+), 7 deletions(-)
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH v3 0/6] PERFv3
@ 2019-02-07 23:41 Andi Kleen
2019-02-07 23:41 ` [MODERATED] [PATCH v3 2/6] PERFv3 Andi Kleen
0 siblings, 1 reply; 89+ messages in thread
From: Andi Kleen @ 2019-02-07 23:41 UTC (permalink / raw)
To: speck; +Cc: Andi Kleen
Walnut is a functional (not security) issue with TSX. The upcoming
microcode updates on Skylake may corrupt perfmon counter 3
when RTM transactions are used.
There is a new MSR that allows forcing RTM transactions to abort,
which frees up counter 3.
The following patchkit adds support to perf to avoid
using counter 3, or to disable TSX when counter 3 is needed
for perf.
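The core of the mechanism, as patch 2/6 below implements it: set the force
abort bit whenever PMC3 is in use, clear it otherwise, and keep a per-cpu
shadow of the MSR value so redundant (slow) MSR writes are avoided:
	u64 val = MSR_TFA_RTM_FORCE_ABORT * test_bit(3, cpuc->active_mask);
	if (cpuc->tfa_shadow != val) {
		cpuc->tfa_shadow = val;
		wrmsrl(MSR_TSX_FORCE_ABORT, val);
	}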
There are per perf event and global options to set the
default.
This patch sets the default to TSX enabled, but
that could be easily changed.
We can have a discussion on the trade offs of the default
setting. I suspect it's a decision that should be made by Linus,
as it may impact user programs either way.
The trade offs for setting the option default are:
Using 4 (or 8 with HT off) events in perf versus
allowing RTM usage while perf is active.
- Existing programs that use perf groups with 4 counters
may not retrieve perfmon data anymore. Perf usages
that use fewer than four (or seven with HT off) counters
are not impacted. Perf usages that don't use groups
will still work, but will see increased multiplexing.
- TSX programs should not functionally break from
forcing RTM to abort because they always need a valid
fall back path. However they will see significantly
lower performance if they rely on TSX for performance
(all RTM transactions will run and only abort at the end),
potentially slowing them down so much that it is
equivalent to functional breakage.
Patches are against tip/perf/core as of
commit ca3bb3d027f69ac3ab1dafb32bde2f5a3a44439c (tip/perf/core)
Author: Elena Reshetova <elena.reshetova@intel.com>
-Andi
v1: Initial post
v2: Minor updates in code (see individual patches)
Removed optimization to not change MSR for update. This caused missing
MSR updates in some cases.
Redid KVM code to always intercept MSR and pass correct flag
to host perf.
v3: Use Peter's scheduling patch, with some changes and cleanups.
Dropped some obsolete patches.
KVM now always forces the guest state and doesn't rely on the host state.
Andi Kleen (6):
x86/pmu/intel: Export number of counters in caps
x86/pmu/intel: Handle TSX with counter 3 on Skylake
x86/pmu/intel: Add perf event attribute to control RTM
perf stat: Make all existing groups weak
perf stat: Don't count EL for --transaction with three counters
kvm: vmx: Support TSX_FORCE_ABORT in KVM guests
arch/x86/events/core.c | 24 ++++++++
arch/x86/events/intel/core.c | 94 +++++++++++++++++++++++++++++-
arch/x86/events/perf_event.h | 13 ++++-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/include/asm/msr-index.h | 5 ++
arch/x86/kvm/cpuid.c | 3 +-
arch/x86/kvm/pmu.c | 19 +++---
arch/x86/kvm/pmu.h | 6 +-
arch/x86/kvm/pmu_amd.c | 2 +-
arch/x86/kvm/vmx/pmu_intel.c | 20 ++++++-
tools/perf/builtin-stat.c | 38 ++++++++----
tools/perf/util/pmu.c | 10 ++++
tools/perf/util/pmu.h | 1 +
14 files changed, 211 insertions(+), 26 deletions(-)
--
2.17.2
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH v3 2/6] PERFv3
2019-02-07 23:41 [MODERATED] [PATCH v3 0/6] PERFv3 Andi Kleen
@ 2019-02-07 23:41 ` Andi Kleen
2019-02-08 0:51 ` [MODERATED] Re: [SUSPECTED SPAM][PATCH " Andrew Cooper
0 siblings, 1 reply; 89+ messages in thread
From: Andi Kleen @ 2019-02-07 23:41 UTC (permalink / raw)
To: speck; +Cc: Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Subject: x86/pmu/intel: Handle TSX with counter 3 on Skylake
Most of the code is from Peter Zijlstra at this point,
based on earlier code from AK.
On Skylake with recent microcode updates, due to erratum XXX,
perfmon general purpose counter 3 can be corrupted when RTM transactions
are executed.
The microcode provides a new MSR to force disable RTM
(make all RTM transactions abort).
This patch adds the low level code to manage this MSR.
Depending on a global flag (/sys/devices/cpu/enable_all_counters),
events are or are not scheduled on generic counter 3.
When the flag is set and an event uses counter 3, TSX is disabled
while the event is active.
This patch assumes that the kernel is using
RETPOLINE (or IBRS), otherwise speculative execution could
still corrupt counter 3 in very unlikely cases.
The enable_all_counters flag default is set to zero in this
patch. This default could be changed.
The trade offs for setting the option default are:
Using 4 (or 8 with HT off) events in perf versus
allowing RTM usage while perf is active.
- Existing programs that use perf groups with 4 counters
may not retrieve perfmon data anymore. Perf usages
that use fewer than four (or seven with HT off) counters
are not impacted. Perf usages that don't use groups
will still work, but will see increased multiplexing.
- TSX programs should not functionally break from
forcing RTM to abort because they always need a valid
fall back path. However they will see significantly
lower performance if they rely on TSX for performance
(all RTM transactions will run and only abort at the end),
potentially slowing them down so much that it is
equivalent to functional breakage.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Use u8 instead of bool
Rename force_rtm_abort_active.
v3:
Use correct patch version that actually compiles.
v4:
Switch to Peter's implementation with some updates by AK.
Now the TFA state is checked for in enable_all,
and the extra mask is handled by get_constraint
Use a temporary constraint instead of modifying the globals.
---
arch/x86/events/core.c | 6 ++-
arch/x86/events/intel/core.c | 64 +++++++++++++++++++++++++++++-
arch/x86/events/perf_event.h | 10 ++++-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 5 +++
5 files changed, 83 insertions(+), 3 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 58e659bfc2d9..f5d1435c6071 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2252,7 +2252,11 @@ static ssize_t num_counter_show(struct device *cdev,
struct device_attribute *attr,
char *buf)
{
- return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu.num_counters);
+ int num = x86_pmu.num_counters;
+ if (boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT) &&
+ perf_enable_all_counters && num > 0)
+ num--;
+ return snprintf(buf, PAGE_SIZE, "%d\n", num);
}
static DEVICE_ATTR_RO(num_counter);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index daafb893449b..b4162b4b0899 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -1999,6 +1999,30 @@ static void intel_pmu_nhm_enable_all(int added)
intel_pmu_enable_all(added);
}
+static void intel_skl_pmu_enable_all(int added)
+{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ u64 val;
+
+ /*
+ * The perf code is not expected to execute RTM instructions
+ * (and also cannot misspeculate into them due to RETPOLINE
+ * use), so PMC3 should be 'stable'; IOW the values we
+ * just potentially programmed into it, should still be there.
+ *
+ * If we programmed PMC3, make sure to set TFA before we make
+ * things go and possibly encounter RTM instructions.
+ * Similarly, if PMC3 got unused, make sure to clear TFA.
+ */
+ val = MSR_TFA_RTM_FORCE_ABORT * test_bit(3, cpuc->active_mask);
+ if (cpuc->tfa_shadow != val) {
+ cpuc->tfa_shadow = val;
+ wrmsrl(MSR_TSX_FORCE_ABORT, val);
+ }
+
+ intel_pmu_enable_all(added);
+}
+
static void enable_counter_freeze(void)
{
update_debugctlmsr(get_debugctlmsr() |
@@ -3345,6 +3369,34 @@ glp_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
return c;
}
+bool perf_enable_all_counters __read_mostly;
+
+/*
+ * On Skylake counter 3 may get corrupted when RTM is used.
+ * Either avoid counter 3, or disable RTM when counter 3 used.
+ */
+
+static struct event_constraint *
+skl_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
+ struct perf_event *event)
+{
+ struct event_constraint *c;
+
+ c = hsw_get_event_constraints(cpuc, idx, event);
+
+ if (!perf_enable_all_counters) {
+ cpuc->counter3_constraint = *c;
+ c = &cpuc->counter3_constraint;
+
+ /*
+ * Without TFA we must not use PMC3.
+ */
+ __clear_bit(3, c->idxmsk);
+ }
+
+ return c;
+}
+
/*
* Broadwell:
*
@@ -4061,8 +4113,11 @@ static struct attribute *intel_pmu_caps_attrs[] = {
NULL
};
+DEVICE_BOOL_ATTR(enable_all_counters, 0644, perf_enable_all_counters);
+
static struct attribute *intel_pmu_attrs[] = {
&dev_attr_freeze_on_smi.attr,
+ NULL, /* May be overridden with enable_all_counters */
NULL,
};
@@ -4543,9 +4598,16 @@ __init int intel_pmu_init(void)
/* all extra regs are per-cpu when HT is on */
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
+ if (boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
+ x86_pmu.enable_all = intel_skl_pmu_enable_all;
+ intel_pmu_attrs[1] = &dev_attr_enable_all_counters.attr.attr;
+ x86_pmu.get_event_constraints = skl_get_event_constraints;
+ /* Could add checking&warning for !RETPOLINE here */
+ } else {
+ x86_pmu.get_event_constraints = hsw_get_event_constraints;
+ }
x86_pmu.hw_config = hsw_hw_config;
- x86_pmu.get_event_constraints = hsw_get_event_constraints;
extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
hsw_format_attr : nhm_format_attr;
extra_attr = merge_attr(extra_attr, skl_format_attr);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 78d7b7031bfc..2474ebfad961 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -70,7 +70,7 @@ struct event_constraint {
#define PERF_X86_EVENT_EXCL_ACCT 0x0200 /* accounted EXCL event */
#define PERF_X86_EVENT_AUTO_RELOAD 0x0400 /* use PEBS auto-reload */
#define PERF_X86_EVENT_LARGE_PEBS 0x0800 /* use large PEBS */
-
+#define PERF_X86_EVENT_ABORT_TSX 0x1000 /* force abort TSX */
struct amd_nb {
int nb_id; /* NorthBridge id */
@@ -242,6 +242,12 @@ struct cpu_hw_events {
struct intel_excl_cntrs *excl_cntrs;
int excl_thread_id; /* 0 or 1 */
+ /*
+ * Manage using counter 3 on Skylake with TSX.
+ */
+ int tfa_shadow;
+ struct event_constraint counter3_constraint;
+
/*
* AMD specific bits
*/
@@ -998,6 +1004,8 @@ static inline int is_ht_workaround_enabled(void)
return !!(x86_pmu.flags & PMU_FL_EXCL_ENABLED);
}
+extern bool perf_enable_all_counters;
+
#else /* CONFIG_CPU_SUP_INTEL */
static inline void reserve_ds_buffers(void)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 6d6122524711..981ff9479648 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -344,6 +344,7 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_TSX_FORCE_ABORT (18*32+13) /* "" TSX_FORCE_ABORT */
#define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
#define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
#define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 8e40c2446fd1..492b18720dba 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -666,6 +666,11 @@
#define MSR_IA32_TSC_DEADLINE 0x000006E0
+#define MSR_TSX_FORCE_ABORT 0x0000010F
+
+#define MSR_TFA_RTM_FORCE_ABORT_BIT 0
+#define MSR_TFA_RTM_FORCE_ABORT BIT_ULL(MSR_TFA_RTM_FORCE_ABORT_BIT)
+
/* P4/Xeon+ specific */
#define MSR_IA32_MCG_EAX 0x00000180
#define MSR_IA32_MCG_EBX 0x00000181
--
2.17.2
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [MODERATED] Re: [SUSPECTED SPAM][PATCH v3 2/6] PERFv3
2019-02-07 23:41 ` [MODERATED] [PATCH v3 2/6] PERFv3 Andi Kleen
@ 2019-02-08 0:51 ` Andrew Cooper
2019-02-08 9:01 ` Peter Zijlstra
0 siblings, 1 reply; 89+ messages in thread
From: Andrew Cooper @ 2019-02-08 0:51 UTC (permalink / raw)
To: speck
[-- Attachment #1: Type: text/plain, Size: 522 bytes --]
On 07/02/2019 23:41, speck for Andi Kleen wrote:
> This patch assumes that the kernel is using
> RETPOLINE (or IBRS), otherwise speculative execution could
> still corrupt counter 3 in very unlikely cases.
What has the kernel configuration got to do with it?
It is my understanding that any execution of an XBEGIN instruction, even
speculatively, even in userspace, will result in PMC3 getting modified.
A CPU either has force abort mode active, or PMC3 can be changed behind
the kernel's back.
~Andrew
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [SUSPECTED SPAM][PATCH v3 2/6] PERFv3
2019-02-08 0:51 ` [MODERATED] Re: [SUSPECTED SPAM][PATCH " Andrew Cooper
@ 2019-02-08 9:01 ` Peter Zijlstra
2019-02-08 9:39 ` Peter Zijlstra
0 siblings, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2019-02-08 9:01 UTC (permalink / raw)
To: speck
On Fri, Feb 08, 2019 at 12:51:01AM +0000, speck for Andrew Cooper wrote:
> On 07/02/2019 23:41, speck for Andi Kleen wrote:
> > This patch assumes that the kernel is using
> > RETPOLINE (or IBRS), otherwise speculative execution could
> > still corrupt counter 3 in very unlikely cases.
>
> What has the kernel configuration got to do with it?
>
> It is my understanding that any execution of an XBEGIN instruction, even
> speculatively, even in userspace, will result in PMC3 getting modified.
>
> A CPU either has force abort mode active, or PMC3 can be changed behind
> the kernel's back.
We are executing kernel code; therefore any user RTM will have aborted
and is irrelevant.
So what the kernel does is:
/*
* And as noted; userspace transactions will be aborted by
* having entered the kernel. The kernel does not use RTM
* itself.
*/
/*
* stops all counters; irrespective of ucode using PMC3 or not
*/
GLOBAL_CTRL = 0;
/*
* program PMC3
*/
CTRVAL3 = x;
EVTSEL3 = y;
/*
* Set the TFA bit to make ucode not touch PMC3; since there has
* not been an RTM instruction between GLOBAL_CTRL=0 and here,
* PMC3 will still be {x,y} as we just wrote.
*
* This is what requires RETPOLINE/IBRS; because otherwise
* speculation could see a partial kernel instruction that looks
* like RTM, which would mess things up.
*/
WRMSR(MSR_TFA, 1);
/*
* Let 'er rip.
*/
GLOBAL_CTRL = ~0ULL;
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Re: [SUSPECTED SPAM][PATCH v3 2/6] PERFv3
2019-02-08 9:01 ` Peter Zijlstra
@ 2019-02-08 9:39 ` Peter Zijlstra
2019-02-08 10:53 ` [MODERATED] [RFC][PATCH] performance walnuts Peter Zijlstra
0 siblings, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2019-02-08 9:39 UTC (permalink / raw)
To: speck
On Fri, Feb 08, 2019 at 10:01:47AM +0100, Peter Zijlstra wrote:
> On Fri, Feb 08, 2019 at 12:51:01AM +0000, speck for Andrew Cooper wrote:
> > On 07/02/2019 23:41, speck for Andi Kleen wrote:
> > > This patch assumes that the kernel is using
> > > RETPOLINE (or IBRS), otherwise speculative execution could
> > > still corrupt counter 3 in very unlikely cases.
> >
> > What has the kernel configuration got to do with it?
> >
> > It is my understanding that any execution of an XBEGIN instruction, even
> > speculatively, even in userspace, will result in PMC3 getting modified.
> >
> > A CPU either has force abort mode active, or PMC3 can be changed behind
> > the kernel's back.
>
> We are executing kernel code; therefore any user RTM will have aborted
> and is irrelevant.
>
> So what the kernel does is:
>
> /*
> * And as noted; userspace transactions will be aborted by
> * having entered the kernel. The kernel does not use RTM
> * itself.
> */
>
>
> /*
> * stops all counters; irrespective of ucode using PMC3 or not
> */
> GLOBAL_CTRL = 0;
>
> /*
> * program PMC3
> */
> CTRVAL3 = x;
> EVTSEL3 = y;
>
> /*
> * Set the TFA bit to make ucode not touch PMC3; since there has
> * not been an RTM instruction between GLOBAL_CTRL=0 and here,
> * PMC3 will still be {x,y} as we just wrote.
> *
> * This is what requires RETPOLINE/IBRS; because otherwise
> * speculation could see a partial kernel instruction that looks
> * like RTM, which would mess things up.
> */
> WRMSR(MSR_TFA, 1);
>
> /*
> * Let 'er rip.
> */
> GLOBAL_CTRL = ~0ULL;
Ah, I think I found a way to avoid having to rely on this. Let me try.
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [RFC][PATCH] performance walnuts
2019-02-08 9:39 ` Peter Zijlstra
@ 2019-02-08 10:53 ` Peter Zijlstra
2019-02-15 23:45 ` [MODERATED] Encrypted Message Jon Masters
0 siblings, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2019-02-08 10:53 UTC (permalink / raw)
To: speck
On Fri, Feb 08, 2019 at 10:39:50AM +0100, Peter Zijlstra wrote:
> Ah, I think I found a way to avoid having to rely on this. Let me try.
Something like so. Can someone with access to a relevant machine test
this?
If it works, I'll write a Changelog and this'll be it.
---
arch/x86/events/intel/core.c | 127 ++++++++++++++++++++++++++++++-------
arch/x86/events/perf_event.h | 6 ++
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 6 ++
4 files changed, 116 insertions(+), 24 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index e0232bdb7aff..8352c2647a1b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -1999,6 +1999,39 @@ static void intel_pmu_nhm_enable_all(int added)
intel_pmu_enable_all(added);
}
+static void intel_set_tfa(struct cpu_hw_events *cpuc, bool on)
+{
+ u64 val = MSR_TFA_RTM_FORCE_ABORT * on;
+
+ if (cpuc->tfa_shadow != val) {
+ cpuc->tfa_shadow = val;
+ wrmsrl(MSR_TSX_FORCE_ABORT, val);
+ }
+}
+
+static void intel_skl_commit_scheduling(struct cpu_hw_events *cpuc, int idx, int cntr)
+{
+ /*
+ * We're going to use PMC3, make sure TFA is set before we touch it.
+ */
+ if (cntr == 3)
+ intel_set_tfa(cpuc, true);
+}
+
+static void intel_skl_pmu_enable_all(int added)
+{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+ /*
+ * If we find PMC3 is no longer used when we enable the PMU, we can
+ * clear TFA.
+ */
+ if (!test_bit(3, cpuc->active_mask))
+ intel_set_tfa(cpuc, false);
+
+ intel_pmu_enable_all(added);
+}
+
static void enable_counter_freeze(void)
{
update_debugctlmsr(get_debugctlmsr() |
@@ -2768,6 +2801,35 @@ intel_stop_scheduling(struct cpu_hw_events *cpuc)
raw_spin_unlock(&excl_cntrs->lock);
}
+static struct event_constraint *
+dyn_constraint(struct cpu_hw_events *cpuc, struct event_constraint *c, int idx)
+{
+ WARN_ON_ONCE(!cpuc->constraint_list);
+
+ if (!(c->flags & PERF_X86_EVENT_DYNAMIC)) {
+ struct event_constraint *cx;
+
+ /*
+ * grab pre-allocated constraint entry
+ */
+ cx = &cpuc->constraint_list[idx];
+
+ /*
+ * initialize dynamic constraint
+ * with static constraint
+ */
+ *cx = *c;
+
+ /*
+ * mark constraint as dynamic
+ */
+ cx->flags |= PERF_X86_EVENT_DYNAMIC;
+ c = cx;
+ }
+
+ return c;
+}
+
static struct event_constraint *
intel_get_excl_constraints(struct cpu_hw_events *cpuc, struct perf_event *event,
int idx, struct event_constraint *c)
@@ -2798,27 +2860,7 @@ intel_get_excl_constraints(struct cpu_hw_events *cpuc, struct perf_event *event,
* only needed when constraint has not yet
* been cloned (marked dynamic)
*/
- if (!(c->flags & PERF_X86_EVENT_DYNAMIC)) {
- struct event_constraint *cx;
-
- /*
- * grab pre-allocated constraint entry
- */
- cx = &cpuc->constraint_list[idx];
-
- /*
- * initialize dynamic constraint
- * with static constraint
- */
- *cx = *c;
-
- /*
- * mark constraint as dynamic, so we
- * can free it later on
- */
- cx->flags |= PERF_X86_EVENT_DYNAMIC;
- c = cx;
- }
+ c = dyn_constraint(cpuc, c, idx);
/*
* From here on, the constraint is dynamic.
@@ -3345,6 +3387,26 @@ glp_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
return c;
}
+static bool allow_tsx_force_abort = true;
+
+static struct event_constraint *
+skl_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
+ struct perf_event *event)
+{
+ struct event_constraint *c = hsw_get_event_constraints(cpuc, idx, event);
+
+ /*
+ * Without TFA we must not use PMC3.
+ */
+ if (!allow_tsx_force_abort && test_bit(3, c->idxmsk)) {
+ c = dyn_constraint(cpuc, c, idx);
+ c->idxmsk64 &= ~(1ULL << 3);
+ c->weight = hweight64(c->idxmsk64);
+ }
+
+ return c;
+}
+
/*
* Broadwell:
*
@@ -3440,13 +3502,15 @@ static int intel_pmu_cpu_prepare(int cpu)
goto err;
}
- if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) {
+ if (x86_pmu.flags & (PMU_FL_EXCL_CNTRS | PMU_FL_WALNUT)) {
size_t sz = X86_PMC_IDX_MAX * sizeof(struct event_constraint);
cpuc->constraint_list = kzalloc(sz, GFP_KERNEL);
if (!cpuc->constraint_list)
goto err_shared_regs;
+ }
+ if (x86_pmu.flags & PMU_FL_EXCL_CNTRS) {
cpuc->excl_cntrs = allocate_excl_cntrs(cpu);
if (!cpuc->excl_cntrs)
goto err_constraint_list;
@@ -3552,9 +3616,10 @@ static void free_excl_cntrs(int cpu)
if (c->core_id == -1 || --c->refcnt == 0)
kfree(c);
cpuc->excl_cntrs = NULL;
- kfree(cpuc->constraint_list);
- cpuc->constraint_list = NULL;
}
+
+ kfree(cpuc->constraint_list);
+ cpuc->constraint_list = NULL;
}
static void intel_pmu_cpu_dying(int cpu)
@@ -4061,9 +4126,12 @@ static struct attribute *intel_pmu_caps_attrs[] = {
NULL
};
+DEVICE_BOOL_ATTR(allow_tsx_force_abort, 0644, allow_tsx_force_abort);
+
static struct attribute *intel_pmu_attrs[] = {
&dev_attr_freeze_on_smi.attr,
NULL,
+ NULL,
};
static __init struct attribute **
@@ -4546,6 +4614,7 @@ __init int intel_pmu_init(void)
x86_pmu.flags |= PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_NO_HT_SHARING;
+
x86_pmu.hw_config = hsw_hw_config;
x86_pmu.get_event_constraints = hsw_get_event_constraints;
extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
@@ -4557,6 +4626,16 @@ __init int intel_pmu_init(void)
tsx_attr = hsw_tsx_events_attrs;
intel_pmu_pebs_data_source_skl(
boot_cpu_data.x86_model == INTEL_FAM6_SKYLAKE_X);
+
+ /* If our CPU haz a walnut */
+ if (boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
+ x86_pmu.flags |= PMU_FL_WALNUT;
+ x86_pmu.get_event_constraints = skl_get_event_constraints;
+ x86_pmu.enable_all = intel_skl_pmu_enable_all;
+ x86_pmu.commit_scheduling = intel_skl_commit_scheduling;
+ intel_pmu_attrs[1] = &dev_attr_allow_tsx_force_abort.attr.attr;
+ }
+
pr_cont("Skylake events, ");
name = "skylake";
break;
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 78d7b7031bfc..44b3426c618e 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -242,6 +242,11 @@ struct cpu_hw_events {
struct intel_excl_cntrs *excl_cntrs;
int excl_thread_id; /* 0 or 1 */
+ /*
+ * SKL TSX_FORCE_ABORT shadow
+ */
+ int tfa_shadow;
+
/*
* AMD specific bits
*/
@@ -676,6 +681,7 @@ do { \
#define PMU_FL_EXCL_CNTRS 0x4 /* has exclusive counter requirements */
#define PMU_FL_EXCL_ENABLED 0x8 /* exclusive counter active */
#define PMU_FL_PEBS_ALL 0x10 /* all events are valid PEBS events */
+#define PMU_FL_WALNUT 0x20 /* deal with the walnut errata */
#define EVENT_VAR(_id) event_attr_##_id
#define EVENT_PTR(_id) &event_attr_##_id.attr.attr
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 6d6122524711..981ff9479648 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -344,6 +344,7 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_TSX_FORCE_ABORT (18*32+13) /* "" TSX_FORCE_ABORT */
#define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
#define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
#define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 8e40c2446fd1..ca5bc0eacb95 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -666,6 +666,12 @@
#define MSR_IA32_TSC_DEADLINE 0x000006E0
+
+#define MSR_TSX_FORCE_ABORT 0x0000010F
+
+#define MSR_TFA_RTM_FORCE_ABORT_BIT 0
+#define MSR_TFA_RTM_FORCE_ABORT BIT_ULL(MSR_TFA_RTM_FORCE_ABORT_BIT)
+
/* P4/Xeon+ specific */
#define MSR_IA32_MCG_EAX 0x00000180
#define MSR_IA32_MCG_EBX 0x00000181
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-02-08 10:53 ` [MODERATED] [RFC][PATCH] performance walnuts Peter Zijlstra
@ 2019-02-15 23:45 ` Jon Masters
0 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-02-15 23:45 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 132 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Peter Zijlstra <speck@linutronix.de>
Subject: Re: [RFC][PATCH] performance walnuts
[-- Attachment #2: Type: text/plain, Size: 944 bytes --]
On 2/8/19 5:53 AM, speck for Peter Zijlstra wrote:
> +static void intel_set_tfa(struct cpu_hw_events *cpuc, bool on)
> +{
> + u64 val = MSR_TFA_RTM_FORCE_ABORT * on;
> +
> + if (cpuc->tfa_shadow != val) {
> + cpuc->tfa_shadow = val;
> + wrmsrl(MSR_TSX_FORCE_ABORT, val);
> + }
> +}
Ok let me ask a stupid question.
This MSR is exposed on a given core. What's the impact (if any) on
*other* cores that might be using TSX? For example, suppose I'm running
an application using RTM on one core while another application on
another core begins profiling. What impact does this MSR write have on
other cores? (Architecturally).
I'm assuming the implementation of HLE relies on whatever you're doing
fitting into the local core's cache and you just abort on any snoop,
etc., so it ought to be fairly self-contained, but I want to know.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH v4 00/28] MDSv4 2
@ 2019-01-12 1:29 Andi Kleen
2019-01-12 1:29 ` [MODERATED] [PATCH v4 05/28] MDSv4 10 Andi Kleen
2019-01-12 1:29 ` [MODERATED] [PATCH v4 10/28] MDSv4 24 Andi Kleen
0 siblings, 2 replies; 89+ messages in thread
From: Andi Kleen @ 2019-01-12 1:29 UTC (permalink / raw)
To: speck; +Cc: Andi Kleen
Here's a new version of flushing CPU buffers for group 4.
This mainly covers single thread, not SMT (except for the idle case).
I lumped all the issues together under the Microarchitectural Data
Sampling (MDS) name because they need the same mitigations,
and it doesn't seem worth duplicating the sysfs files and bug entries.
This version drops support for software sequences, and also
does VERW unconditionally unless disabled.
This version implements Linus' suggestion to only clear the CPU
buffer when needed. The patch kit is now a lot more complicated:
different subsystems determine if they might touch other users'
or sensitive data and schedule a cpu clear on the next kernel exit.
Generally process context doesn't clear (unless it is cryptographic
or does context switches), and interrupt context schedules a clear.
There are some exceptions to these rules.
For details on the security model see the Documentation/clearcpu.txt
file. In my tests the number of clears is much lower now.
For most benchmarks we tried the difference is in the noise
level now. ebizzy and loopback apache both show about 1.7%
degradation.
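A sketch of the scheme, using the helpers this series introduces
(lazy_clear_cpu(), the TIF_CLEAR_CPU flag and clear_cpu(); the exit side
wrapper name is illustrative):
	/* Code which might have touched another user's or sensitive
	 * data schedules a clear for the next kernel exit.
	 */
	static inline void lazy_clear_cpu(void)
	{
		set_thread_flag(TIF_CLEAR_CPU);
	}
	/* On the exit to user space path: */
	static inline void clear_cpu_if_scheduled(void)
	{
		if (test_and_clear_thread_flag(TIF_CLEAR_CPU))
			clear_cpu();	/* VERW based buffer clear */
	}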
It makes various assumptions on how kernel code behaves.
I did some auditing, but wasn't able to do it for everything.
Please double check the assumptions laid out in the document.
Likely a lot more interrupt and timer handlers (and tasklets
and irq poll handlers) could be whitelisted to not need a clear, but I only
did a fairly minimal set for now that I could test.
For some of the whitelisted code, especially the networking and
block softirqs, as well as the EBPF mitigation, some additional auditing that
no rules are violated would be useful.
Some notes:
- Against 5.0-rc1
Changes against previous versions:
- Remove software sequences
- Make VERW unconditional
- Improved documentation
- Some other minor changes
Changes against previous versions:
- By default now flushes only when needed
- Define security model
- New administrator document
- Added mds=verw and mds=full
- Renamed mds_disable to mds=off
- KVM virtualization much improved
- Too many others to list. Most things different now.
Andi Kleen (28):
x86/speculation/mds: Add basic bug infrastructure for MDS
x86/speculation/mds: Add mds=off
x86/speculation/mds: Support clearing CPU data on kernel exit
x86/speculation/mds: Support mds=full
x86/speculation/mds: Clear CPU buffers on entering idle
x86/speculation/mds: Add sysfs reporting
x86/speculation/mds: Support mds=full for NMIs
x86/speculation/mds: Support mds=full for 32bit NMI
x86/speculation/mds: Export MD_CLEAR CPUID to KVM guests.
mds: Add documentation for clear cpu usage
mds: Add preliminary administrator documentation
x86/speculation/mds: Introduce lazy_clear_cpu
x86/speculation/mds: Schedule cpu clear on context switch
x86/speculation/mds: Add tracing for clear_cpu
mds: Force clear cpu on kernel preemption
mds: Schedule cpu clear for memzero_explicit and kzfree
mds: Mark interrupts clear cpu, unless opted-out
mds: Clear cpu on all timers, unless the timer opts-out
mds: Clear CPU on tasklets, unless opted-out
mds: Clear CPU on irq poll, unless opted-out
mds: Clear cpu for string io/memcpy_*io in interrupts
mds: Schedule clear cpu in swiotlb
mds: Instrument skb functions to clear cpu automatically
mds: Opt out tcp tasklet to not touch user data
mds: mark kernel/* timers safe as not touching user data
mds: Mark AHCI interrupt as not needing cpu clear
mds: Mark ACPI interrupt as not needing cpu clear
mds: Mitigate BPF
.../ABI/testing/sysfs-devices-system-cpu | 1 +
.../admin-guide/kernel-parameters.txt | 8 +
Documentation/admin-guide/mds.rst | 108 +++++++++++
Documentation/clearcpu.txt | 173 ++++++++++++++++++
arch/Kconfig | 3 +
arch/x86/Kconfig | 1 +
arch/x86/entry/common.c | 13 +-
arch/x86/entry/entry_32.S | 6 +
arch/x86/entry/entry_64.S | 12 ++
arch/x86/include/asm/clearbpf.h | 29 +++
arch/x86/include/asm/clearcpu.h | 92 ++++++++++
arch/x86/include/asm/cpufeatures.h | 3 +
arch/x86/include/asm/io.h | 3 +
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/include/asm/thread_info.h | 2 +
arch/x86/include/asm/trace/clearcpu.h | 27 +++
arch/x86/kernel/acpi/cstate.c | 2 +
arch/x86/kernel/cpu/bugs.c | 46 +++++
arch/x86/kernel/cpu/common.c | 14 ++
arch/x86/kernel/kvm.c | 3 +
arch/x86/kernel/process.c | 5 +
arch/x86/kernel/process.h | 27 +++
arch/x86/kernel/smpboot.c | 3 +
arch/x86/kvm/cpuid.c | 3 +-
drivers/acpi/acpi_pad.c | 2 +
drivers/acpi/osl.c | 3 +-
drivers/acpi/processor_idle.c | 3 +
drivers/ata/ahci.c | 2 +-
drivers/ata/ahci.h | 2 +
drivers/ata/libahci.c | 40 ++--
drivers/base/cpu.c | 8 +
drivers/idle/intel_idle.c | 5 +
include/asm-generic/io.h | 3 +
include/linux/clearcpu.h | 36 ++++
include/linux/filter.h | 21 ++-
include/linux/hrtimer.h | 4 +
include/linux/interrupt.h | 18 +-
include/linux/irq_poll.h | 2 +
include/linux/skbuff.h | 2 +
include/linux/timer.h | 9 +-
kernel/bpf/core.c | 2 +
kernel/dma/swiotlb.c | 2 +
kernel/events/core.c | 6 +-
kernel/fork.c | 3 +-
kernel/futex.c | 6 +-
kernel/irq/handle.c | 8 +
kernel/irq/manage.c | 1 +
kernel/sched/core.c | 14 +-
kernel/sched/deadline.c | 6 +-
kernel/sched/fair.c | 7 +-
kernel/sched/idle.c | 3 +-
kernel/sched/rt.c | 3 +-
kernel/softirq.c | 25 ++-
kernel/time/alarmtimer.c | 2 +-
kernel/time/hrtimer.c | 11 +-
kernel/time/posix-timers.c | 6 +-
kernel/time/sched_clock.c | 3 +-
kernel/time/tick-sched.c | 6 +-
kernel/time/timer.c | 8 +
kernel/watchdog.c | 3 +-
lib/irq_poll.c | 18 +-
lib/string.c | 6 +
mm/slab_common.c | 5 +-
net/core/skbuff.c | 26 +++
net/ipv4/tcp_output.c | 5 +-
65 files changed, 869 insertions(+), 61 deletions(-)
create mode 100644 Documentation/admin-guide/mds.rst
create mode 100644 Documentation/clearcpu.txt
create mode 100644 arch/x86/include/asm/clearbpf.h
create mode 100644 arch/x86/include/asm/clearcpu.h
create mode 100644 arch/x86/include/asm/trace/clearcpu.h
create mode 100644 include/linux/clearcpu.h
--
2.17.2
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH v4 05/28] MDSv4 10
2019-01-12 1:29 [MODERATED] [PATCH v4 00/28] MDSv4 2 Andi Kleen
@ 2019-01-12 1:29 ` Andi Kleen
2019-01-14 19:20 ` [MODERATED] " Dave Hansen
2019-01-14 23:39 ` Tim Chen
2019-01-12 1:29 ` [MODERATED] [PATCH v4 10/28] MDSv4 24 Andi Kleen
1 sibling, 2 replies; 89+ messages in thread
From: Andi Kleen @ 2019-01-12 1:29 UTC (permalink / raw)
To: speck; +Cc: Andi Kleen
When entering idle the internal state of the current CPU might
become visible to the thread sibling because the CPU "frees" some
internal resources.
To ensure there is no MDS leakage, always clear the CPU state
before doing any idling. We only do this if SMT is enabled,
as otherwise no cross-thread leakage is possible.
This is not needed for idle poll, because a polling CPU does not
release its shared resources.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
arch/x86/include/asm/clearcpu.h | 19 +++++++++++++++++++
arch/x86/kernel/acpi/cstate.c | 2 ++
arch/x86/kernel/kvm.c | 3 +++
arch/x86/kernel/process.c | 5 +++++
arch/x86/kernel/smpboot.c | 3 +++
drivers/acpi/acpi_pad.c | 2 ++
drivers/acpi/processor_idle.c | 3 +++
drivers/idle/intel_idle.c | 5 +++++
kernel/sched/fair.c | 1 +
9 files changed, 43 insertions(+)
diff --git a/arch/x86/include/asm/clearcpu.h b/arch/x86/include/asm/clearcpu.h
index 3b8ee76b9c07..b83ef1a5268f 100644
--- a/arch/x86/include/asm/clearcpu.h
+++ b/arch/x86/include/asm/clearcpu.h
@@ -20,6 +20,25 @@ static inline void clear_cpu(void)
[kernelds] "m" (kernel_ds));
}
+/*
+ * Clear CPU buffers before going idle, so that no state is leaked to SMT
+ * siblings taking over thread resources.
+ * Out of line to avoid include hell.
+ *
+ * Assumes that interrupts are disabled and only get reenabled
+ * before idle, otherwise the data from a racing interrupt might not
+ * get cleared. There are some callers who violate this,
+ * but they are only used in unattackable cases.
+ */
+
+static inline void clear_cpu_idle(void)
+{
+ if (sched_smt_active()) {
+ clear_thread_flag(TIF_CLEAR_CPU);
+ clear_cpu();
+ }
+}
+
DECLARE_STATIC_KEY_FALSE(force_cpu_clear);
#endif
diff --git a/arch/x86/kernel/acpi/cstate.c b/arch/x86/kernel/acpi/cstate.c
index 158ad1483c43..48adea5afacf 100644
--- a/arch/x86/kernel/acpi/cstate.c
+++ b/arch/x86/kernel/acpi/cstate.c
@@ -14,6 +14,7 @@
#include <acpi/processor.h>
#include <asm/mwait.h>
#include <asm/special_insns.h>
+#include <asm/clearcpu.h>
/*
* Initialize bm_flags based on the CPU cache properties
@@ -157,6 +158,7 @@ void __cpuidle acpi_processor_ffh_cstate_enter(struct acpi_processor_cx *cx)
unsigned int cpu = smp_processor_id();
struct cstate_entry *percpu_entry;
+ clear_cpu_idle();
percpu_entry = per_cpu_ptr(cpu_cstate_entry, cpu);
mwait_idle_with_hints(percpu_entry->states[cx->index].eax,
percpu_entry->states[cx->index].ecx);
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index ba4bfb7f6a36..c9206ad40a5b 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -159,6 +159,7 @@ void kvm_async_pf_task_wait(u32 token, int interrupt_kernel)
/*
* We cannot reschedule. So halt.
*/
+ clear_cpu_idle();
native_safe_halt();
local_irq_disable();
}
@@ -785,6 +786,8 @@ static void kvm_wait(u8 *ptr, u8 val)
if (READ_ONCE(*ptr) != val)
goto out;
+ clear_cpu_idle();
+
/*
* halt until it's our turn and kicked. Note that we do safe halt
* for irq enabled case to avoid hang when lock info is overwritten
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 90ae0ca51083..9d9f2d2b209d 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -42,6 +42,7 @@
#include <asm/prctl.h>
#include <asm/spec-ctrl.h>
#include <asm/proto.h>
+#include <asm/clearcpu.h>
#include "process.h"
@@ -589,6 +590,8 @@ void stop_this_cpu(void *dummy)
disable_local_APIC();
mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
+ clear_cpu_idle();
+
/*
* Use wbinvd on processors that support SME. This provides support
* for performing a successful kexec when going from SME inactive
@@ -675,6 +678,8 @@ static __cpuidle void mwait_idle(void)
mb(); /* quirk */
}
+ clear_cpu_idle();
+
__monitor((void *)&current_thread_info()->flags, 0, 0);
if (!need_resched())
__sti_mwait(0, 0);
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ccd1f2a8e557..c7fff6b09253 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -81,6 +81,7 @@
#include <asm/cpu_device_id.h>
#include <asm/spec-ctrl.h>
#include <asm/hw_irq.h>
+#include <asm/clearcpu.h>
/* representing HT siblings of each logical CPU */
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
@@ -1635,6 +1636,7 @@ static inline void mwait_play_dead(void)
wbinvd();
while (1) {
+ clear_cpu_idle();
/*
* The CLFLUSH is a workaround for erratum AAI65 for
* the Xeon 7400 series. It's not clear it is actually
@@ -1662,6 +1664,7 @@ void hlt_play_dead(void)
wbinvd();
while (1) {
+ clear_cpu_idle();
native_halt();
/*
* If NMI wants to wake up CPU0, start CPU0.
diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c
index a47676a55b84..2dcbc38d0880 100644
--- a/drivers/acpi/acpi_pad.c
+++ b/drivers/acpi/acpi_pad.c
@@ -27,6 +27,7 @@
#include <linux/slab.h>
#include <linux/acpi.h>
#include <asm/mwait.h>
+#include <asm/clearcpu.h>
#include <xen/xen.h>
#define ACPI_PROCESSOR_AGGREGATOR_CLASS "acpi_pad"
@@ -175,6 +176,7 @@ static int power_saving_thread(void *data)
tick_broadcast_enable();
tick_broadcast_enter();
stop_critical_timings();
+ clear_cpu_idle();
mwait_idle_with_hints(power_saving_mwait_eax, 1);
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index b2131c4ea124..0342daa122fe 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -33,6 +33,7 @@
#include <linux/cpuidle.h>
#include <linux/cpu.h>
#include <acpi/processor.h>
+#include <asm/clearcpu.h>
/*
* Include the apic definitions for x86 to have the APIC timer related defines
@@ -120,6 +121,7 @@ static const struct dmi_system_id processor_power_dmi_table[] = {
*/
static void __cpuidle acpi_safe_halt(void)
{
+ clear_cpu_idle();
if (!tif_need_resched()) {
safe_halt();
local_irq_disable();
@@ -681,6 +683,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index)
ACPI_FLUSH_CPU_CACHE();
+ clear_cpu_idle();
while (1) {
if (cx->entry_method == ACPI_CSTATE_HALT)
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 8b5d85c91e9d..ddaa7603d53a 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -65,6 +65,7 @@
#include <asm/intel-family.h>
#include <asm/mwait.h>
#include <asm/msr.h>
+#include <asm/clearcpu.h>
#define INTEL_IDLE_VERSION "0.4.1"
@@ -933,6 +934,8 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
}
}
+ clear_cpu_idle();
+
mwait_idle_with_hints(eax, ecx);
if (!static_cpu_has(X86_FEATURE_ARAT) && tick)
@@ -953,6 +956,8 @@ static void intel_idle_s2idle(struct cpuidle_device *dev,
unsigned long ecx = 1; /* break on interrupt flag */
unsigned long eax = flg2MWAIT(drv->states[index].flags);
+ clear_cpu_idle();
+
mwait_idle_with_hints(eax, ecx);
}
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 50aa2aba69bd..b5a1bd4a1a46 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5980,6 +5980,7 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p
#ifdef CONFIG_SCHED_SMT
DEFINE_STATIC_KEY_FALSE(sched_smt_present);
+EXPORT_SYMBOL(sched_smt_present);
static inline void set_idle_cores(int cpu, int val)
{
--
2.17.2
^ permalink raw reply related [flat|nested] 89+ messages in thread
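For readers who don't have the earlier patches of this series at hand:
clear_cpu(), which clear_cpu_idle() wraps, is essentially a VERW with a
valid kernel data segment selector, which on microcode enumerating
MD_CLEAR overwrites the affected CPU buffers. A minimal sketch using the
series' naming (the real version is additionally gated behind feature
checks):

	static const u16 kernel_ds = __KERNEL_DS;

	static inline void clear_cpu(void)
	{
		/*
		 * VERW with a writable segment selector triggers the
		 * microcode assisted CPU buffer clear on MD_CLEAR parts.
		 */
		asm volatile("verw %[kernelds]"
			     : : [kernelds] "m" (kernel_ds) : "cc");
	}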
* [MODERATED] Re: [PATCH v4 05/28] MDSv4 10
2019-01-12 1:29 ` [MODERATED] [PATCH v4 05/28] MDSv4 10 Andi Kleen
@ 2019-01-14 19:20 ` Dave Hansen
2019-01-18 7:33 ` [MODERATED] Encrypted Message Jon Masters
2019-01-14 23:39 ` Tim Chen
1 sibling, 1 reply; 89+ messages in thread
From: Dave Hansen @ 2019-01-14 19:20 UTC (permalink / raw)
To: speck
[-- Attachment #1: Type: text/plain, Size: 3487 bytes --]
On 1/11/19 5:29 PM, speck for Andi Kleen wrote:
> When entering idle the internal state of the current CPU might
> become visible to the thread sibling because the CPU "frees" some
> internal resources.
Is there some documentation somewhere about what "idle" means here? It
looks like MWAIT and HLT certainly count, but is there anything else?
I'm just trying to figure out how we make sure we catch all of the
call-sites for these. This sprinkles quite a few of them around, and
I'm wondering how you found these, how we know if we missed any, and how
we keep folks from reintroducing new call-sites that would make us
vulnerable again.
I did a quick "objdump | grep mwait" and this patch appears to catch all
the functions that I encountered.
> +/*
> + * Clear CPU buffers before going idle, so that no state is leaked to SMT
> + * siblings taking over thread resources.
> + * Out of line to avoid include hell.
> + *
> + * Assumes that interrupts are disabled and only get reenabled
> + * before idle, otherwise the data from a racing interrupt might not
> + * get cleared. There are some callers who violate this,
> + * but they are only used in unattackable cases.> + */
Can we please document the unattackable cases, along with the reasons
they are unattackable? This property also keeps us from being able to
annotate this site with lockdep checks for interrupts being off, which
is a bit unfortunate.
> +static inline void clear_cpu_idle(void)
> +{
> + if (sched_smt_active()) {
> + clear_thread_flag(TIF_CLEAR_CPU);
> + clear_cpu();
> + }
> +}
...
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index b2131c4ea124..0342daa122fe 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -33,6 +33,7 @@
> #include <linux/cpuidle.h>
> #include <linux/cpu.h>
> #include <acpi/processor.h>
> +#include <asm/clearcpu.h>
>
> /*
> * Include the apic definitions for x86 to have the APIC timer related defines
> @@ -120,6 +121,7 @@ static const struct dmi_system_id processor_power_dmi_table[] = {
> */
> static void __cpuidle acpi_safe_halt(void)
> {
> + clear_cpu_idle();
> if (!tif_need_resched()) {
> safe_halt();
> local_irq_disable();
Why is this one outside the if()? Seems like it could be safely inside
next to safe_halt().
> @@ -681,6 +683,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index)
>
> ACPI_FLUSH_CPU_CACHE();
>
> + clear_cpu_idle();
> while (1) {
>
> if (cx->entry_method == ACPI_CSTATE_HALT)
At the risk of bike-shedding... Why don't we just catch all these
*play_dead() sites inside play_dead() itself, or at arch_cpu_idle_dead()?
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 8b5d85c91e9d..ddaa7603d53a 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -65,6 +65,7 @@
> #include <asm/intel-family.h>
> #include <asm/mwait.h>
> #include <asm/msr.h>
> +#include <asm/clearcpu.h>
>
> #define INTEL_IDLE_VERSION "0.4.1"
>
> @@ -933,6 +934,8 @@ static __cpuidle int intel_idle(struct cpuidle_device *dev,
> }
> }
>
> + clear_cpu_idle();
> +
> mwait_idle_with_hints(eax, ecx);
And my own bikeshed: It seems like this would be a much smaller patch,
and be less likely to let future code add vulnerabilities, if we just
patched mwait_idle_with_hints().
^ permalink raw reply [flat|nested] 89+ messages in thread
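To make that suggestion concrete: hoisting the clear into the common
helper would look roughly like the sketch below (hypothetical; the real
mwait_idle_with_hints() in arch/x86/include/asm/mwait.h has additional
quirk handling, and whether the clear belongs there is exactly the
question raised above):

	static inline void mwait_idle_with_hints(unsigned long eax,
						 unsigned long ecx)
	{
		/* One central place instead of per-driver call sites */
		clear_cpu_idle();

		__monitor((void *)&current_thread_info()->flags, 0, 0);
		if (!need_resched())
			__mwait(eax, ecx);
	}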
* [MODERATED] Encrypted Message
2019-01-14 19:20 ` [MODERATED] " Dave Hansen
@ 2019-01-18 7:33 ` Jon Masters
0 siblings, 0 replies; 89+ messages in thread
From: Jon Masters @ 2019-01-18 7:33 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 122 bytes --]
From: Jon Masters <jcm@redhat.com>
To: speck for Dave Hansen <speck@linutronix.de>
Subject: Re: [PATCH v4 05/28] MDSv4 10
[-- Attachment #2: Type: text/plain, Size: 1328 bytes --]
On 1/14/19 2:20 PM, speck for Dave Hansen wrote:
> On 1/11/19 5:29 PM, speck for Andi Kleen wrote:
>> When entering idle the internal state of the current CPU might
>> become visible to the thread sibling because the CPU "frees" some
>> internal resources.
>
> Is there some documentation somewhere about what "idle" means here? It
> looks like MWAIT and HLT certainly count, but is there anything else?
We know that power state transitions can additionally cause the peer to
dynamically sleep or wake up. MWAIT was the main example I got out of
Intel for how you'd explicitly cause a thread to be deallocated.
When Andi is talking about "frees" above he means (for example) the
dynamic allocation/deallocation of store buffer entries as threads come
and go - e.g. in Skylake there are 56 entries in a distributed store
buffer that splits into 2x28. I am not aware of fill buffer behavior
changing as threads come and go, and this isn't documented AFAICS.
I've been wondering whether we want a bit more detail in the docs. I
spent a /lot/ of time last week going through all of Intel's patents in
this area, which really help understand it. If folks feel we could do
with a bit more meaty summary, I can try to suggest something.
Jon.
--
Computer Architect | Sent with my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2019-01-12 1:29 ` [MODERATED] [PATCH v4 05/28] MDSv4 10 Andi Kleen
2019-01-14 19:20 ` [MODERATED] " Dave Hansen
@ 2019-01-14 23:39 ` Tim Chen
1 sibling, 0 replies; 89+ messages in thread
From: Tim Chen @ 2019-01-14 23:39 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 130 bytes --]
From: Tim Chen <tim.c.chen@linux.intel.com>
To: speck for Andi Kleen <speck@linutronix.de>
Subject: Re: [PATCH v4 05/28] MDSv4 10
[-- Attachment #2: Type: text/plain, Size: 526 bytes --]
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 50aa2aba69bd..b5a1bd4a1a46 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5980,6 +5980,7 @@ static inline int find_idlest_cpu(struct sched_domain *sd, struct task_struct *p
>
> #ifdef CONFIG_SCHED_SMT
> DEFINE_STATIC_KEY_FALSE(sched_smt_present);
> +EXPORT_SYMBOL(sched_smt_present);
This export is not needed since sched_smt_present is not used in the patch series.
Only sched_smt_active() is used.
Thanks.
Tim
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH v4 10/28] MDSv4 24
2019-01-12 1:29 [MODERATED] [PATCH v4 00/28] MDSv4 2 Andi Kleen
2019-01-12 1:29 ` [MODERATED] [PATCH v4 05/28] MDSv4 10 Andi Kleen
@ 2019-01-12 1:29 ` Andi Kleen
2019-01-15 1:05 ` [MODERATED] Encrypted Message Tim Chen
1 sibling, 1 reply; 89+ messages in thread
From: Andi Kleen @ 2019-01-12 1:29 UTC (permalink / raw)
To: speck; +Cc: Andi Kleen
Including the theory and some guidelines for subsystem/driver
maintainers.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
Documentation/clearcpu.txt | 173 +++++++++++++++++++++++++++++++++++++
1 file changed, 173 insertions(+)
create mode 100644 Documentation/clearcpu.txt
diff --git a/Documentation/clearcpu.txt b/Documentation/clearcpu.txt
new file mode 100644
index 000000000000..b204b1e7051c
--- /dev/null
+++ b/Documentation/clearcpu.txt
@@ -0,0 +1,173 @@
+
+Security model for Microarchitectural Data Sampling
+===================================================
+
+Some CPUs can leave read or written data in internal buffers,
+which then later might be sampled through side effects.
+For more details see CVE-2018-12126 CVE-2018-12130 CVE-2018-12127
+
+This can be avoided by explicitely clearing the CPU state.
+
+We trying to avoid leaking data between different processes,
+and also some sensitive data, like cryptographic data,
+or user data from other processes.
+
+We support three modes:
+
+(1) mitigation off (mds=off)
+(2) clear only when needed (default)
+(3) clear on every kernel exit, or guest entry (mds=full)
+
+(1) and (3) are trivial; the rest of the document discusses (2).
+
+Basic requirements and assumptions
+----------------------------------
+
+Kernel addresses and kernel temporary data are not sensitive.
+
+User data is sensitive, but only for other processes.
+
+Kernel data is sensitive when it is cryptographic keys.
+
+Guidance for driver/subsystem developers
+----------------------------------------
+
+When you touch user supplied data of *other* processes in system call
+context add lazy_clear_cpu().
+
+For the cases below we care only about data from other processes.
+Touching non cryptographic data from the current process is always allowed.
+
+Touching only pointers to user data is always allowed.
+
+When your interrupt does not touch user data directly consider marking
+it with IRQF_NO_USER.
+
+When your tasklet does not touch user data directly consider marking
+it with TASKLET_NO_USER using tasklet_init_flags/or
+DECLARE_TASKLET*_NOUSER.
+
+When your timer does not touch user data mark it with TIMER_NO_USER.
+If it is a hrtimer mark it with HRTIMER_MODE_NO_USER.
+
+When your irq poll handler does not touch user data, mark it
+with IRQ_POLL_F_NO_USER through irq_poll_init_flags.
+
+For networking code make sure to only touch user data through
+skb_push/put/copy [add more], unless it is data from the current
+process. If that is not ensured add lazy_clear_cpu or
+lazy_clear_cpu_interrupt. When the non skb data access is only in a
+hardware interrupt controlled by the driver, it can rely on not
+setting IRQF_NO_USER for that interrupt.
+
+Any cryptographic code touching key data should use memzero_explicit
+or kzfree.
+
+If your RCU callback touches user data add lazy_clear_cpu().
+
+These steps are currently only needed for code that runs on MDS affected
+CPUs, which is currently only x86. But might be worth being prepared
+if other architectures become affected too.
+
+Implementation details/assumptions
+----------------------------------
+
+If a system call touches data it is for its own process, so does not
+need to be cleared, because it has already access to it.
+
+When context switching we clear data, unless the context switch
+is inside a process, or from/to idle. We also clear after any
+context switches from kernel threads.
+
+Idle does not have sensitive data, except for in interrupts, which
+are handled separately.
+
+Cryptographic keys inside the kernel should be protected.
+We assume they use kzfree() or memzero_explicit() to clear
+state, so these functions trigger a cpu clear.
+
+Hard interrupts, tasklets, timers which can run asynchronous are
+assumed to touch random user data, unless they have been audited, and
+marked with NO_USER flags.
+
+Most interrupt handlers for modern devices should not touch
+user data because they rely on DMA and only manipulate
+pointers. This needs auditing to confirm though.
+
+For softirqs we assume that if they touch user data they use
+lazy_clear_cpu()/lazy_clear_interrupt() as needed.
+Networking is handled through skb_* below.
+Timer and Tasklets and IRQ poll are handled through opt-in.
+
+Scheduler softirq is assumed to not touch user data.
+
+Block softirq done callbacks are assumed to not touch user data.
+
+For networking code, any skb functions that are likely
+touching non header packet data schedule a clear cpu at next
+kernel exit. This includes skb_copy and related, skb_put/push,
+checksum functions. We assume that any networking code touching
+packet data uses these functions.
+
+[In principle packet data should be encrypted for the wire anyway,
+but we still try to avoid leaking it]
+
+Some IO related functions which touch data, like string PIO and
+memcpy_from/to_io, or the software PCI DMA bounce code (swiotlb),
+schedule a buffer clear.
+
+We assume NMI/machine check code does not touch other
+processes' data.
+
+Any buffer clearing is done lazily on next kernel exit, so can be
+triggered in fast paths.
+
+Sandboxes
+---------
+
+We don't do anything special for seccomp processes.
+
+If there is a sandbox inside the process the process should take care
+itself of clearing its own sensitive data before running sandbox
+code. This would include data touched by system calls.
+
+BPF
+---
+
+Assume BPF execution does not touch other user's data, so does
+not need to schedule a clear for itself.
+
+BPF could attack the rest of the kernel if it can successfully
+measure side channel side effects.
+
+When the BPF program was loaded unprivileged, always clear the CPU
+to prevent any exploits written in BPF from using side channels to read
+data leaked from other kernel code.
+
+We only do this when running in an interrupt, or if a cpu clear is
+already scheduled (which means, for example, there was a context
+switch or crypto operation before).
+
+In process context we assume the code only accesses data of the
+current user, and we check that the BPF program being run was loaded
+by the same user, so even if data leaked it would not cross privilege
+boundaries.
+
+Technically we would only need to do this if the BPF program
+contains conditional branches and loads dominated by them, but
+let's assume that near all do.
+
+This could be further optimized by allowing callers that do
+a lot of individual BPF runs and are sure they don't touch
+other user's data inbetween to do the clear only once
+at the beginning. We can add such optimizations later based on
+profile data.
+
+Virtualization
+--------------
+
+When entering a guest in KVM we clear to avoid any leakage to a guest.
+Normally this is done implicitely as part of the L1TF mitigation.
+It relies on this being enabled. It also uses the "fast exit"
+optimization that only clears if an interrupt or context switch
+happened.
--
2.17.2
^ permalink raw reply related [flat|nested] 89+ messages in thread
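To make the opt-out model above concrete, a minimal sketch of how a
driver would use the interfaces this series proposes (IRQF_NO_USER and
lazy_clear_cpu() are introduced by the series, not upstream APIs; the
foo_* names are placeholders):

	/*
	 * Interrupt handler that only rings a doorbell and moves
	 * descriptor pointers, never dereferencing user payload:
	 * safe to opt out of the CPU buffer clear.
	 */
	static irqreturn_t foo_irq(int irq, void *dev_id)
	{
		struct foo_dev *fd = dev_id;

		writel(1, fd->mmio + FOO_DOORBELL);
		return IRQ_HANDLED;
	}

	static int foo_setup_irq(struct foo_dev *fd)
	{
		return request_irq(fd->irq, foo_irq, IRQF_NO_USER,
				   "foo", fd);
	}

	/*
	 * A path that does touch another process's data schedules a
	 * lazy clear; the actual buffer clear then happens on the
	 * next kernel exit.
	 */
	static void foo_copy_remote(void *dst, const void *src, size_t len)
	{
		memcpy(dst, src, len);
		lazy_clear_cpu();
	}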
* [MODERATED] Encrypted Message
2019-01-12 1:29 ` [MODERATED] [PATCH v4 10/28] MDSv4 24 Andi Kleen
@ 2019-01-15 1:05 ` Tim Chen
0 siblings, 0 replies; 89+ messages in thread
From: Tim Chen @ 2019-01-15 1:05 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 130 bytes --]
From: Tim Chen <tim.c.chen@linux.intel.com>
To: speck for Andi Kleen <speck@linutronix.de>
Subject: Re: [PATCH v4 10/28] MDSv4 24
[-- Attachment #2: Type: text/plain, Size: 5059 bytes --]
On 1/11/19 5:29 PM, speck for Andi Kleen wrote:
> +Some CPUs can leave read or written data in internal buffers,
> +which then later might be sampled through side effects.
> +For more details see CVE-2018-12126 CVE-2018-12130 CVE-2018-12127
> +
> +This can be avoided by explicitely clearing the CPU state.
s/explicitely/explicitly
> +
> +We trying to avoid leaking data between different processes,
Suggest changing the above phrase to the below:
CPU state clearing prevents leaking data between different processes,
...
> +Basic requirements and assumptions
> +----------------------------------
> +
> +Kernel addresses and kernel temporary data are not sensitive.
> +
> +User data is sensitive, but only for other processes.
> +
> +Kernel data is sensitive when it is cryptographic keys.
s/when it is/when it involves/
> +
> +Guidance for driver/subsystem developers
> +----------------------------------------
> +
> +When you touch user supplied data of *other* processes in system call
> +context add lazy_clear_cpu().
> +
> +For the cases below we care only about data from other processes.
> +Touching non cryptographic data from the current process is always allowed.
> +
> +Touching only pointers to user data is always allowed.
> +
> +When your interrupt does not touch user data directly consider marking
Add a "," between "directly" and "consider"
> +it with IRQF_NO_USER.
> +
> +When your tasklet does not touch user data directly consider marking
Add a "," between "directly" and "consider"
> +it with TASKLET_NO_USER using tasklet_init_flags/or
> +DECLARE_TASKLET*_NOUSER.
> +
> +When your timer does not touch user data mark it with TIMER_NO_USER.
Add a "," between "data" and "mark"
> +If it is a hrtimer mark it with HRTIMER_MODE_NO_USER.
Add a "," between "hrtimer" and "mark"
> +
> +When your irq poll handler does not touch user data, mark it
> +with IRQ_POLL_F_NO_USER through irq_poll_init_flags.
> +
> +For networking code make sure to only touch user data through
Add a "," between "code" and "make"
> +skb_push/put/copy [add more], unless it is data from the current
> +process. If that is not ensured add lazy_clear_cpu or
Add a "," between "ensured" and "add"
> +lazy_clear_cpu_interrupt. When the non skb data access is only in a
> +hardware interrupt controlled by the driver, it can rely on not
> +setting IRQF_NO_USER for that interrupt.
> +
> +Any cryptographic code touching key data should use memzero_explicit
> +or kzfree.
> +
> +If your RCU callback touches user data add lazy_clear_cpu().
> +
> +These steps are currently only needed for code that runs on MDS affected
> +CPUs, which is currently only x86. But might be worth being prepared
> +if other architectures become affected too.
> +
> +Implementation details/assumptions
> +----------------------------------
> +
> +If a system call touches data it is for its own process, so does not
suggest rephrasing to
If a system call touches data of its own process, cpu state does not
> +need to be cleared, because it has already access to it.
> +
> +When context switching we clear data, unless the context switch
> +is inside a process, or from/to idle. We also clear after any
> +context switches from kernel threads.
> +
> +Idle does not have sensitive data, except for in interrupts, which
> +are handled separately.
> +
> +Cryptographic keys inside the kernel should be protected.
> +We assume they use kzfree() or memzero_explicit() to clear
> +state, so these functions trigger a cpu clear.
> +
> +Hard interrupts, tasklets, timers which can run asynchronous are
> +assumed to touch random user data, unless they have been audited, and
> +marked with NO_USER flags.
> +
> +Most interrupt handlers for modern devices should not touch
> +user data because they rely on DMA and only manipulate
> +pointers. This needs auditing to confirm though.
> +
> +For softirqs we assume that if they touch user data they use
Add "," between "data" and "they"
...
> +Technically we would only need to do this if the BPF program
> +contains conditional branches and loads dominated by them, but
> +let's assume that near all do.
s/near/nearly/
> +
> +This could be further optimized by allowing callers that do
> +a lot of individual BPF runs and are sure they don't touch
> +other user's data inbetween to do the clear only once
> +at the beginning.
Suggest breaking the above sentence. It is quite difficult to read.
> We can add such optimizations later based on
> +profile data.
> +
> +Virtualization
> +--------------
> +
> +When entering a guest in KVM we clear to avoid any leakage to a guest.
... we clear CPU state to avoid ....
> +Normally this is done implicitely as part of the L1TF mitigation.
s/implicitely/implicitly/
> +It relies on this being enabled. It also uses the "fast exit"
> +optimization that only clears if an interrupt or context switch
> +happened.
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] FYI - Reading uncached memory
@ 2018-06-12 17:29 Jon Masters
2018-06-14 16:59 ` [MODERATED] Encrypted Message Tim Chen
0 siblings, 1 reply; 89+ messages in thread
From: Jon Masters @ 2018-06-12 17:29 UTC (permalink / raw)
To: speck
FYI Graz have been able to prove that Intel processors will allow
speculative reads of /explicitly/ UC memory (e.g. marked in MTRR). I
believe they actually use the QPI SAD table to determine what memory is
speculation safe and what memory has side effects (i.e. if it's HA'able
memory then it's deemed ok to rampantly speculate from it).
Just in case anyone thought UC was safe against attacks.
Jon.
--
Computer Architect | Sent from my Fedora powered laptop
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] [PATCH 0/2] L1TF KVM 0
@ 2018-05-29 19:42 Paolo Bonzini
[not found] ` <20180529194240.7F1336110A@crypto-ml.lab.linutronix.de>
0 siblings, 1 reply; 89+ messages in thread
From: Paolo Bonzini @ 2018-05-29 19:42 UTC (permalink / raw)
To: speck
Here is the first version of the L1 terminal fault KVM mitigation patches,
adding a TLB flush on vmentry.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 89+ messages in thread
* SSB status - V18 pushed out
@ 2018-05-17 20:53 Thomas Gleixner
2018-05-18 13:54 ` [MODERATED] Is: Sleep states ?Was:Re: " Konrad Rzeszutek Wilk
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2018-05-17 20:53 UTC (permalink / raw)
To: speck
[-- Attachment #1: Type: text/plain, Size: 473 bytes --]
Folks,
we finally reached a stable state with the SSB patches. I've updated all 3
branches master/linux-4.16.y/linux-4.14.y in the repo and attached the
resulting git bundles. They merge cleanly on top of the current HEADs of
the relevant trees.
The lot survived light testing on my side and it would be great if everyone
involved could expose it to their test scenarios.
Thanks to everyone who participated in that effort (patches, review,
testing ...)!
Thanks,
tglx
[-- Attachment #2: Type: application/octet-stream, Size: 79102 bytes --]
[-- Attachment #3: Type: application/octet-stream, Size: 75724 bytes --]
[-- Attachment #4: Type: application/octet-stream, Size: 75835 bytes --]
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Is: Sleep states ?Was:Re: SSB status - V18 pushed out
2018-05-17 20:53 SSB status - V18 pushed out Thomas Gleixner
@ 2018-05-18 13:54 ` Konrad Rzeszutek Wilk
2018-05-18 14:29 ` Thomas Gleixner
0 siblings, 1 reply; 89+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-05-18 13:54 UTC (permalink / raw)
To: speck
On Thu, May 17, 2018 at 10:53:28PM +0200, speck for Thomas Gleixner wrote:
> Folks,
>
> we finally reached a stable state with the SSB patches. I've updated all 3
> branches master/linux-4.16.y/linux-4.14.y in the repo and attached the
> resulting git bundles. They merge cleanly on top of the current HEADs of
> the relevant trees.
>
> The lot survived light testing on my side and it would be great if everyone
> involved could expose it to their test scenarios.
>
> Thanks to everyone who participated in that effort (patches, review,
> testing ...)!
Yeey! Thank you.
I was reading the updated Intel doc today (instead of skim reading it) and it mentioned:
"Intel recommends that the SSBD MSR bit be cleared when in a sleep state on such processors."
We don't seem to be doing that?
To do that we would need to:
1) Revert 4b59bdb56945 x86/bugs: Remove x86_spec_ctrl_set()
2) Peppering
if (static_cpu_has(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE))
x86_spec_ctrl_set(~SPEC_CTRL_SSBD);
[when enterring sleep state]
and:
if (static_cpu_has(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE))
x86_spec_ctrl_set(SPEC_CTRL_SSBD);
[when coming out]
in mwait_idle_with_hints, mwait_idle, and native_play_dead
Or alternatively fiddle with the MSR directly.
3) And then uhuh, I am not sure how you would want to deal when the applications
are running. That is when the !static_cpu_has(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE)
but still want the MSR toggled.
>
> Thanks,
>
> tglx
^ permalink raw reply [flat|nested] 89+ messages in thread
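For illustration, the kind of toggling suggested here would look
roughly like the sketch below, using the helper names from the current
series (the Intel doc calls the bit SSBD; the series still names it
SPEC_CTRL_RDS, so SPEC_CTRL_SSBD here is an assumption). Whether this
is needed at all is what the rest of the thread resolves:

	static __cpuidle void mwait_idle(void)
	{
		u64 spec_ctrl = x86_spec_ctrl_get_default();

		/* Drop SSBD so the sibling can run at full speed... */
		if (static_cpu_has(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE))
			wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl & ~SPEC_CTRL_SSBD);

		mwait_idle_with_hints(0, 0);

		/* ...and restore it on wakeup. */
		if (static_cpu_has(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE))
			wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
	}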
* Re: Is: Sleep states ?Was:Re: SSB status - V18 pushed out
2018-05-18 13:54 ` [MODERATED] Is: Sleep states ?Was:Re: " Konrad Rzeszutek Wilk
@ 2018-05-18 14:29 ` Thomas Gleixner
2018-05-18 19:50 ` [MODERATED] Encrypted Message Tim Chen
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2018-05-18 14:29 UTC (permalink / raw)
To: speck
On Fri, 18 May 2018, speck for Konrad Rzeszutek Wilk wrote:
> On Thu, May 17, 2018 at 10:53:28PM +0200, speck for Thomas Gleixner wrote:
> > Folks,
> >
> > we finally reached a stable state with the SSB patches. I've updated all 3
> > branches master/linux-4.16.y/linux-4.14.y in the repo and attached the
> > resulting git bundles. They merge cleanly on top of the current HEADs of
> > the relevant trees.
> >
> > The lot survived light testing on my side and it would be great if everyone
> > involved could expose it to their test scenarios.
> >
> > Thanks to everyone who participated in that effort (patches, review,
> > testing ...)!
>
> Yeey! Thank you.
>
> I was reading the updated Intel doc today (instead of skim reading it) and it mentioned:
>
> "Intel recommends that the SSBD MSR bit be cleared when in a sleep state on such processors."
Well, the same recommendation was for IBRS and the reason is that with HT
enabled the other hyperthread will not be able to go full speed because the
sleeping one vanished with IBRS set. SSBD works the same way.
" SW should clear [SSBD] when enter sleep state, just as is suggested for
IBRS and STIBP on existing implementations"
and that document says:
"Enabling IBRS on one logical processor of a core with Intel
Hyper-Threading Technology may affect branch prediction on other logical
processors of the same core. For this reason, software should disable IBRS
(by clearing IA32_SPEC_CTRL.IBRS) prior to entering a sleep state (e.g.,
by executing HLT or MWAIT) and re-enable IBRS upon wakeup and prior to
executing any indirect branch."
So it's only a performance issue and not a fundamental problem to have it
on when executing HLT/MWAIT
So we have two situations here:
1) ssbd = on, i.e X86_FEATURE_SPEC_STORE_BYPASS_DISABLE
There it is irrelevant because both threads have SSBD set permanently,
so unsetting it on HLT/MWAIT is not going to lift the restriction for
the running sibling thread. And HLT/MWAIT is not going to be faster by
unsetting it and then setting it on wakeup again....
2) SSBD via prctl/seccomp
Nothing to do there, because idle task does not have TIF_SSBD set so it
never goes with SSBD set into HLT/MWAIT.
So I think we're good, but it would be nice if Intel folks would confirm
that.
Thanks,
tglx
^ permalink raw reply [flat|nested] 89+ messages in thread
* [MODERATED] Encrypted Message
2018-05-18 14:29 ` Thomas Gleixner
@ 2018-05-18 19:50 ` Tim Chen
0 siblings, 0 replies; 89+ messages in thread
From: Tim Chen @ 2018-05-18 19:50 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 163 bytes --]
From: Tim Chen <tim.c.chen@linux.intel.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: Is: Sleep states ?Was:Re: SSB status - V18 pushed out
[-- Attachment #2: Type: text/plain, Size: 2667 bytes --]
On 05/18/2018 07:29 AM, speck for Thomas Gleixner wrote:
> On Fri, 18 May 2018, speck for Konrad Rzeszutek Wilk wrote:
>> On Thu, May 17, 2018 at 10:53:28PM +0200, speck for Thomas Gleixner wrote:
>>> Folks,
>>>
>>> we finally reached a stable state with the SSB patches. I've updated all 3
>>> branches master/linux-4.16.y/linux-4.14.y in the repo and attached the
>>> resulting git bundles. They merge cleanly on top of the current HEADs of
>>> the relevant trees.
>>>
>>> The lot survived light testing on my side and it would be great if everyone
>>> involved could expose it to their test scenarios.
>>>
>>> Thanks to everyone who participated in that effort (patches, review,
>>> testing ...)!
>>
>> Yeey! Thank you.
>>
>> I was reading the updated Intel doc today (instead of skim reading it) and it mentioned:
>>
>> "Intel recommends that the SSBD MSR bit be cleared when in a sleep state on such processors."
>
> Well, the same recommendation was for IBRS and the reason is that with HT
> enabled the other hyperthread will not be able to go full speed because the
> sleeping one vanished with IBRS set. SSBD works the same way.
>
> " SW should clear [SSBD] when enter sleep state, just as is suggested for
> IBRS and STIBP on existing implementations"
>
> and that document says:
>
> "Enabling IBRS on one logical processor of a core with Intel
> Hyper-Threading Technology may affect branch prediction on other logical
> processors of the same core. For this reason, software should disable IBRS
> (by clearing IA32_SPEC_CTRL.IBRS) prior to entering a sleep state (e.g.,
> by executing HLT or MWAIT) and re-enable IBRS upon wakeup and prior to
> executing any indirect branch."
>
> So it's only a performance issue and not a fundamental problem to have it
> on when executing HLT/MWAIT
>
> So we have two situations here:
>
> 1) ssbd = on, i.e X86_FEATURE_SPEC_STORE_BYPASS_DISABLE
>
> There it is irrelevant because both threads have SSBD set permanently,
> so unsetting it on HLT/MWAIT is not going to lift the restriction for
> the running sibling thread. And HLT/MWAIT is not going to be faster by
> unsetting it and then setting it on wakeup again....
>
> 2) SSBD via prctl/seccomp
>
> Nothing to do there, because idle task does not have TIF_SSBD set so it
> never goes with SSBD set into HLT/MWAIT.
>
> So I think we're good, but it would be nice if Intel folks would confirm
> that.
Yes, we had thought about turning off SSBD in the mwait path earlier, but
decided that it was unnecessary for exactly the reasons Thomas mentioned.
Thanks.
Tim
^ permalink raw reply [flat|nested] 89+ messages in thread
* [patch V11 00/16] SSB 0
@ 2018-05-02 21:51 Thomas Gleixner
2018-05-03 4:27 ` [MODERATED] Encrypted Message Tim Chen
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2018-05-02 21:51 UTC (permalink / raw)
To: speck
Changes since V10:
- Addressed Ingos review feedback
- Picked up Reviewed-bys
Delta patch below. Bundle is coming in separate mail. Git repo branches are
updated as well. The master branch contains also the fix for the lost IBRS
issue Tim was seeing.
If there are no further issues and nitpicks, I'm going to make the
changes immutable and changes need to go incremental on top.
Thanks,
tglx
8<--------------------
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 29984fd3dd18..a8d2ae1e335b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4051,11 +4051,12 @@
on - Unconditionally disable Speculative Store Bypass
off - Unconditionally enable Speculative Store Bypass
- auto - Kernel detects whether the CPU model contains a
+ auto - Kernel detects whether the CPU model contains an
implementation of Speculative Store Bypass and
- picks the most appropriate mitigation
- prctl - Control Speculative Store Bypass for a thread
- via prctl. By default it is enabled. The state
+ picks the most appropriate mitigation.
+ prctl - Control Speculative Store Bypass per thread
+ via prctl. Speculative Store Bypass is enabled
+ for a process by default. The state of the control
is inherited on fork.
Not specifying this option is equivalent to
diff --git a/Documentation/userspace-api/spec_ctrl.rst b/Documentation/userspace-api/spec_ctrl.rst
index 8ff39a26a992..ddbebcd01208 100644
--- a/Documentation/userspace-api/spec_ctrl.rst
+++ b/Documentation/userspace-api/spec_ctrl.rst
@@ -10,7 +10,7 @@ The kernel provides mitigation for such vulnerabilities in various
forms. Some of these mitigations are compile time configurable and some on
the kernel command line.
-There is also a class of mitigations which is very expensive, but they can
+There is also a class of mitigations which are very expensive, but they can
be restricted to a certain set of processes or tasks in controlled
environments. The mechanism to control these mitigations is via
:manpage:`prctl(2)`.
@@ -25,7 +25,7 @@ PR_GET_SPECULATION_CTRL
-----------------------
PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
-which is selected with arg2 of prctl(2). The return value uses bit 0-2 with
+which is selected with arg2 of prctl(2). The return value uses bits 0-2 with
the following meaning:
==== ================ ===================================================
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 5bee7a2ca4ff..810f50bb338d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -70,7 +70,11 @@
#define MSR_IA32_ARCH_CAPABILITIES 0x0000010a
#define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */
#define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */
-#define ARCH_CAP_RDS_NO (1 << 4) /* Not susceptible to speculative store bypass */
+#define ARCH_CAP_RDS_NO (1 << 4) /*
+ * Not susceptible to Speculative Store Bypass
+ * attack, so no Reduced Data Speculation control
+ * required.
+ */
#define MSR_IA32_BBL_CR_CTL 0x00000119
#define MSR_IA32_BBL_CR_CTL3 0x0000011e
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 023e2edc0f3c..71ad01422655 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -225,8 +225,8 @@ enum spectre_v2_mitigation {
* ourselves and always use this as the base for SPEC_CTRL.
* We also use this when handling guest entry/exit as below.
*/
-extern void x86_set_spec_ctrl(u64);
-extern u64 x86_get_default_spec_ctrl(void);
+extern void x86_spec_ctrl_set(u64);
+extern u64 x86_spec_ctrl_get_default(void);
/* The Speculative Store Bypass disable variants */
enum ssb_mitigation {
@@ -285,7 +285,7 @@ static inline void indirect_branch_prediction_barrier(void)
*/
#define firmware_restrict_branch_speculation_start() \
do { \
- u64 val = x86_get_default_spec_ctrl() | SPEC_CTRL_IBRS; \
+ u64 val = x86_spec_ctrl_get_default() | SPEC_CTRL_IBRS; \
\
preempt_disable(); \
alternative_msr_write(MSR_IA32_SPEC_CTRL, val, \
@@ -294,7 +294,7 @@ do { \
#define firmware_restrict_branch_speculation_end() \
do { \
- u64 val = x86_get_default_spec_ctrl(); \
+ u64 val = x86_spec_ctrl_get_default(); \
\
alternative_msr_write(MSR_IA32_SPEC_CTRL, val, \
X86_FEATURE_USE_IBRS_FW); \
diff --git a/arch/x86/include/asm/spec-ctrl.h b/arch/x86/include/asm/spec-ctrl.h
index 607236af4008..45ef00ad5105 100644
--- a/arch/x86/include/asm/spec-ctrl.h
+++ b/arch/x86/include/asm/spec-ctrl.h
@@ -12,8 +12,8 @@
* shadowable for guests but this is not (currently) the case.
* Takes the guest view of SPEC_CTRL MSR as a parameter.
*/
-extern void x86_set_guest_spec_ctrl(u64);
-extern void x86_restore_host_spec_ctrl(u64);
+extern void x86_spec_ctrl_set_guest(u64);
+extern void x86_spec_ctrl_restore_host(u64);
/* AMD specific Speculative Store Bypass MSR data */
extern u64 x86_amd_ls_cfg_base;
@@ -30,7 +30,7 @@ static inline u64 rds_tif_to_spec_ctrl(u64 tifn)
static inline u64 rds_tif_to_amd_ls_cfg(u64 tifn)
{
- return tifn & _TIF_RDS ? x86_amd_ls_cfg_rds_mask : 0ULL;
+ return (tifn & _TIF_RDS) ? x86_amd_ls_cfg_rds_mask : 0ULL;
}
extern void speculative_store_bypass_update(void);
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 50c6ba6d031b..18efc33a8d2e 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -572,7 +572,7 @@ static void bsp_init_amd(struct cpuinfo_x86 *c)
if (!rdmsrl_safe(MSR_AMD64_LS_CFG, &x86_amd_ls_cfg_base)) {
setup_force_cpu_cap(X86_FEATURE_RDS);
setup_force_cpu_cap(X86_FEATURE_AMD_RDS);
- x86_amd_ls_cfg_rds_mask = (1ULL << bit);
+ x86_amd_ls_cfg_rds_mask = 1ULL << bit;
}
}
}
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index c28856e475c8..15f77d4518c7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -32,7 +32,7 @@ static void __init spectre_v2_select_mitigation(void);
static void __init ssb_select_mitigation(void);
/*
- * Our boot-time value of SPEC_CTRL MSR. We read it once so that any
+ * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any
* writes to SPEC_CTRL contain whatever reserved bits have been set.
*/
u64 __ro_after_init x86_spec_ctrl_base;
@@ -41,11 +41,11 @@ u64 __ro_after_init x86_spec_ctrl_base;
* The vendor and possibly platform specific bits which can be modified in
* x86_spec_ctrl_base.
*/
-static u64 __ro_after_init x86_spec_ctrl_mask = ~(SPEC_CTRL_IBRS);
+static u64 __ro_after_init x86_spec_ctrl_mask = ~SPEC_CTRL_IBRS;
/*
- * AMD specific MSR info for Store Bypass control. x86_amd_ls_cfg_rds_mask
- * is initialized in identify_boot_cpu().
+ * AMD specific MSR info for Speculative Store Bypass control.
+ * x86_amd_ls_cfg_rds_mask is initialized in identify_boot_cpu().
*/
u64 __ro_after_init x86_amd_ls_cfg_base;
u64 __ro_after_init x86_amd_ls_cfg_rds_mask;
@@ -61,7 +61,7 @@ void __init check_bugs(void)
/*
* Read the SPEC_CTRL MSR to account for reserved bits which may
- * have unknown values. AMD64_LS_CFG msr is cached in the early AMD
+ * have unknown values. AMD64_LS_CFG MSR is cached in the early AMD
* init code as it is not enumerated and depends on the family.
*/
if (boot_cpu_has(X86_FEATURE_IBRS))
@@ -131,22 +131,22 @@ static const char *spectre_v2_strings[] = {
static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
-void x86_set_spec_ctrl(u64 val)
+void x86_spec_ctrl_set(u64 val)
{
if (val & x86_spec_ctrl_mask)
WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val);
else
wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val);
}
-EXPORT_SYMBOL_GPL(x86_set_spec_ctrl);
+EXPORT_SYMBOL_GPL(x86_spec_ctrl_set);
-u64 x86_get_default_spec_ctrl(void)
+u64 x86_spec_ctrl_get_default(void)
{
return x86_spec_ctrl_base;
}
-EXPORT_SYMBOL_GPL(x86_get_default_spec_ctrl);
+EXPORT_SYMBOL_GPL(x86_spec_ctrl_get_default);
-void x86_set_guest_spec_ctrl(u64 guest_spec_ctrl)
+void x86_spec_ctrl_set_guest(u64 guest_spec_ctrl)
{
u64 host = x86_spec_ctrl_base;
@@ -159,9 +159,9 @@ void x86_set_guest_spec_ctrl(u64 guest_spec_ctrl)
if (host != guest_spec_ctrl)
wrmsrl(MSR_IA32_SPEC_CTRL, guest_spec_ctrl);
}
-EXPORT_SYMBOL_GPL(x86_set_guest_spec_ctrl);
+EXPORT_SYMBOL_GPL(x86_spec_ctrl_set_guest);
-void x86_restore_host_spec_ctrl(u64 guest_spec_ctrl)
+void x86_spec_ctrl_restore_host(u64 guest_spec_ctrl)
{
u64 host = x86_spec_ctrl_base;
@@ -174,7 +174,7 @@ void x86_restore_host_spec_ctrl(u64 guest_spec_ctrl)
if (host != guest_spec_ctrl)
wrmsrl(MSR_IA32_SPEC_CTRL, host);
}
-EXPORT_SYMBOL_GPL(x86_restore_host_spec_ctrl);
+EXPORT_SYMBOL_GPL(x86_spec_ctrl_restore_host);
static void x86_amd_rds_enable(void)
{
@@ -504,8 +504,8 @@ static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void)
switch (boot_cpu_data.x86_vendor) {
case X86_VENDOR_INTEL:
x86_spec_ctrl_base |= SPEC_CTRL_RDS;
- x86_spec_ctrl_mask &= ~(SPEC_CTRL_RDS);
- x86_set_spec_ctrl(SPEC_CTRL_RDS);
+ x86_spec_ctrl_mask &= ~SPEC_CTRL_RDS;
+ x86_spec_ctrl_set(SPEC_CTRL_RDS);
break;
case X86_VENDOR_AMD:
x86_amd_rds_enable();
@@ -560,7 +560,7 @@ static int ssb_prctl_get(void)
}
}
-int arch_prctl_set_spec_ctrl(unsigned long which, unsigned long ctrl)
+int arch_prctl_spec_ctrl_set(unsigned long which, unsigned long ctrl)
{
if (ctrl != PR_SPEC_ENABLE && ctrl != PR_SPEC_DISABLE)
return -ERANGE;
@@ -573,7 +573,7 @@ int arch_prctl_set_spec_ctrl(unsigned long which, unsigned long ctrl)
}
}
-int arch_prctl_get_spec_ctrl(unsigned long which)
+int arch_prctl_spec_ctrl_get(unsigned long which)
{
switch (which) {
case PR_SPEC_STORE_BYPASS:
@@ -583,10 +583,10 @@ int arch_prctl_get_spec_ctrl(unsigned long which)
}
}
-void x86_setup_ap_spec_ctrl(void)
+void x86_spec_ctrl_setup_ap(void)
{
if (boot_cpu_has(X86_FEATURE_IBRS))
- x86_set_spec_ctrl(x86_spec_ctrl_base & ~x86_spec_ctrl_mask);
+ x86_spec_ctrl_set(x86_spec_ctrl_base & ~x86_spec_ctrl_mask);
if (ssb_mode == SPEC_STORE_BYPASS_DISABLE)
x86_amd_rds_enable();
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f3dbdde978a4..e0517bcee446 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -848,6 +848,11 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_power = edx;
}
+ if (c->extended_cpuid_level >= 0x80000008) {
+ cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
+ c->x86_capability[CPUID_8000_0008_EBX] = ebx;
+ }
+
if (c->extended_cpuid_level >= 0x8000000a)
c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);
@@ -871,7 +876,6 @@ static void get_cpu_address_sizes(struct cpuinfo_x86 *c)
c->x86_virt_bits = (eax >> 8) & 0xff;
c->x86_phys_bits = eax & 0xff;
- c->x86_capability[CPUID_8000_0008_EBX] = ebx;
}
#ifdef CONFIG_X86_32
else if (cpu_has(c, X86_FEATURE_PAE) || cpu_has(c, X86_FEATURE_PSE36))
@@ -924,26 +928,26 @@ static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
};
static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
- { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
- { X86_VENDOR_CENTAUR, 5 },
- { X86_VENDOR_INTEL, 5 },
- { X86_VENDOR_NSC, 5 },
- { X86_VENDOR_AMD, 0xf },
- { X86_VENDOR_AMD, 0x10 },
- { X86_VENDOR_AMD, 0x11 },
- { X86_VENDOR_AMD, 0x12 },
- { X86_VENDOR_ANY, 4 },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT1 },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_AIRMONT },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_SILVERMONT2 },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_MERRIFIELD },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_CORE_YONAH },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNL },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_XEON_PHI_KNM },
+ { X86_VENDOR_CENTAUR, 5, },
+ { X86_VENDOR_INTEL, 5, },
+ { X86_VENDOR_NSC, 5, },
+ { X86_VENDOR_AMD, 0x12, },
+ { X86_VENDOR_AMD, 0x11, },
+ { X86_VENDOR_AMD, 0x10, },
+ { X86_VENDOR_AMD, 0xf, },
+ { X86_VENDOR_ANY, 4, },
{}
};
@@ -1384,7 +1388,7 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c)
#endif
mtrr_ap_init();
validate_apic_and_package_id(c);
- x86_setup_ap_spec_ctrl();
+ x86_spec_ctrl_setup_ap();
}
static __init int setup_noclflush(char *arg)
diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index faaabc160293..37672d299e35 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -50,6 +50,6 @@ extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
unsigned int aperfmperf_get_khz(int cpu);
-extern void x86_setup_ap_spec_ctrl(void);
+extern void x86_spec_ctrl_setup_ap(void);
#endif /* ARCH_X86_CPU_H */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index ba4763e9a285..437c1b371129 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5557,7 +5557,7 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
* is no need to worry about the conditional branch over the wrmsr
* being speculatively taken.
*/
- x86_set_guest_spec_ctrl(svm->spec_ctrl);
+ x86_spec_ctrl_set_guest(svm->spec_ctrl);
asm volatile (
"push %%" _ASM_BP "; \n\t"
@@ -5669,7 +5669,7 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
- x86_restore_host_spec_ctrl(svm->spec_ctrl);
+ x86_spec_ctrl_restore_host(svm->spec_ctrl);
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9744e48457d6..16a111e44691 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9722,7 +9722,7 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
* is no need to worry about the conditional branch over the wrmsr
* being speculatively taken.
*/
- x86_set_guest_spec_ctrl(vmx->spec_ctrl);
+ x86_spec_ctrl_set_guest(vmx->spec_ctrl);
vmx->__launched = vmx->loaded_vmcs->launched;
@@ -9870,7 +9870,7 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
- x86_restore_host_spec_ctrl(vmx->spec_ctrl);
+ x86_spec_ctrl_restore_host(vmx->spec_ctrl);
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();
diff --git a/include/linux/nospec.h b/include/linux/nospec.h
index 1e63a0a90e96..700bb8a4e4ea 100644
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -57,7 +57,7 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
})
/* Speculation control prctl */
-int arch_prctl_set_spec_ctrl(unsigned long which, unsigned long ctrl);
-int arch_prctl_get_spec_ctrl(unsigned long which);
+int arch_prctl_spec_ctrl_get(unsigned long which);
+int arch_prctl_spec_ctrl_set(unsigned long which, unsigned long ctrl);
#endif /* _LINUX_NOSPEC_H */
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 4e7a160d3b28..ebf057ac1346 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -208,8 +208,8 @@ struct prctl_mm_map {
# define PR_SVE_VL_INHERIT (1 << 17) /* inherit across exec */
/* Per task speculation control */
-#define PR_SET_SPECULATION_CTRL 52
-#define PR_GET_SPECULATION_CTRL 53
+#define PR_GET_SPECULATION_CTRL 52
+#define PR_SET_SPECULATION_CTRL 53
/* Speculation control variants */
# define PR_SPEC_STORE_BYPASS 0
/* Return and control values for PR_SET/GET_SPECULATION_CTRL */
diff --git a/kernel/sys.c b/kernel/sys.c
index d7afe29319f1..b76dee23bdc9 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2244,12 +2244,12 @@ static int propagate_has_child_subreaper(struct task_struct *p, void *data)
return 1;
}
-int __weak arch_prctl_set_spec_ctrl(unsigned long which, unsigned long ctrl)
+int __weak arch_prctl_spec_ctrl_get(unsigned long which)
{
return -EINVAL;
}
-int __weak arch_prctl_get_spec_ctrl(unsigned long which)
+int __weak arch_prctl_spec_ctrl_set(unsigned long which, unsigned long ctrl)
{
return -EINVAL;
}
@@ -2462,15 +2462,15 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
case PR_SVE_GET_VL:
error = SVE_GET_VL();
break;
- case PR_SET_SPECULATION_CTRL:
- if (arg4 || arg5)
- return -EINVAL;
- error = arch_prctl_set_spec_ctrl(arg2, arg3);
- break;
case PR_GET_SPECULATION_CTRL:
if (arg3 || arg4 || arg5)
return -EINVAL;
- error = arch_prctl_get_spec_ctrl(arg2);
+ error = arch_prctl_spec_ctrl_get(arg2);
+ break;
+ case PR_SET_SPECULATION_CTRL:
+ if (arg4 || arg5)
+ return -EINVAL;
+ error = arch_prctl_spec_ctrl_set(arg2, arg3);
break;
default:
error = -EINVAL;
^ permalink raw reply related [flat|nested] 89+ messages in thread
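For completeness, the userspace side of the prctl interface touched by
the PR_GET/PR_SET renumbering above is used like this; a minimal,
self-contained sketch:

	#include <stdio.h>
	#include <sys/prctl.h>
	#include <linux/prctl.h>

	int main(void)
	{
		int state;

		/*
		 * Disable speculative store bypass for this task; the
		 * state is inherited across fork().
		 */
		if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
			  PR_SPEC_DISABLE, 0, 0))
			perror("PR_SET_SPECULATION_CTRL");

		/* Returns the PR_SPEC_* state in bits 0-2 */
		state = prctl(PR_GET_SPECULATION_CTRL,
			      PR_SPEC_STORE_BYPASS, 0, 0, 0);
		printf("SSB state: 0x%x\n", state);
		return 0;
	}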
* [MODERATED] Encrypted Message
2018-05-02 21:51 [patch V11 00/16] SSB 0 Thomas Gleixner
@ 2018-05-03 4:27 ` Tim Chen
0 siblings, 0 replies; 89+ messages in thread
From: Tim Chen @ 2018-05-03 4:27 UTC (permalink / raw)
To: speck
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 133 bytes --]
From: Tim Chen <tim.c.chen@linux.intel.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: [patch V11 00/16] SSB 0
[-- Attachment #2: Type: text/plain, Size: 1580 bytes --]
On 05/02/2018 02:51 PM, speck for Thomas Gleixner wrote:
> Changes since V10:
>
> - Addressed Ingos review feedback
>
> - Picked up Reviewed-bys
>
> Delta patch below. Bundle is coming in separate mail. Git repo branches are
> updated as well. The master branch contains also the fix for the lost IBRS
> issue Tim was seeing.
>
> If there are no further issues and nitpicks, I'm going to make the
> changes immutable and changes need to go incremental on top.
>
> Thanks,
>
> tglx
>
>
I notice that this code ignores the current process's TIF_RDS setting
in the prctl case:
#define firmware_restrict_branch_speculation_end() \
do { \
u64 val = x86_get_default_spec_ctrl(); \
\
alternative_msr_write(MSR_IA32_SPEC_CTRL, val, \
X86_FEATURE_USE_IBRS_FW); \
preempt_enable(); \
} while (0)
x86_get_default_spec_ctrl() will return x86_spec_ctrl_base, which
results in x86_spec_ctrl_base being written to the MSR
in the prctl case on Intel CPUs. That incorrectly ignores the current
process's TIF_RDS setting, so the RDS bit will not be set.
Instead, the following value should have been written to the MSR
for Intel CPU:
x86_spec_ctrl_base | rds_tif_to_spec_ctrl(current_thread_info()->flags)
Thanks.
Tim
^ permalink raw reply [flat|nested] 89+ messages in thread
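The fix Tim is asking for would make the restore path honor the
per-task flag, roughly as follows (a sketch against the V11 names, not
the final patch):

	#define firmware_restrict_branch_speculation_end()		\
	do {								\
		u64 val = x86_spec_ctrl_get_default() |			\
		    rds_tif_to_spec_ctrl(current_thread_info()->flags); \
									\
		alternative_msr_write(MSR_IA32_SPEC_CTRL, val,		\
				      X86_FEATURE_USE_IBRS_FW);		\
		preempt_enable();					\
	} while (0)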
* [MODERATED] L1D-Fault KVM mitigation
@ 2018-04-24 9:06 Joerg Roedel
2018-04-24 9:35 ` [MODERATED] " Peter Zijlstra
0 siblings, 1 reply; 89+ messages in thread
From: Joerg Roedel @ 2018-04-24 9:06 UTC (permalink / raw)
To: speck
Hey,
I've been looking into the mitigation for the L1D fault issue in KVM,
and since the hardware seems to speculate with the GPA as an HPA, it
seems we have to disable SMT to be fully secure here because otherwise
two different guests running on HT siblings could spy on each other.
I'd like to discuss how we mitigate this. The big hammer would be to not
initialize the HT siblings at boot on affected machines, but that is
probably a bit too eager, as it also penalizes people not using KVM.
Another option is to just print a fat warning and/or refuse to load the
KVM modules on affected machines when HT is enabled.
So what are the opinions on how we should best mitigate this issue?
Regards,
Joerg
* [MODERATED] Re: L1D-Fault KVM mitigation
2018-04-24 9:06 [MODERATED] L1D-Fault KVM mitigation Joerg Roedel
@ 2018-04-24 9:35 ` Peter Zijlstra
2018-04-24 9:48 ` David Woodhouse
0 siblings, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2018-04-24 9:35 UTC (permalink / raw)
To: speck
On Tue, Apr 24, 2018 at 11:06:30AM +0200, speck for Joerg Roedel wrote:
> Hey,
>
> I've been looking into the mitigation for the L1D fault issue in KVM,
> and since the hardware seems to speculate with the GPA as an HPA, it
> seems we have to disable SMT to be fully secure here because otherwise
> two different guests running on HT siblings could spy on each other.
>
> I'd like to discuss how we mitigate this, the big hammer would be not
> initializing the HT siblings at boot on affected machines, but that is
> probably a bit too eager as it also penalizes people not using KVM.
>
> Another option is to just print a fat warning and/or refuse to load the
> KVM modules on affected machines when HT is enabled.
>
> So what are the opinions on how we should best mitigate this issue?
Another option, that is being explored, is to co-schedule siblings.
So ensure all siblings either run vcpus of the _same_ VM or idle.
Of course, this is all rather intrusive and ugly and brings with it
setup costs as well, because you'd have to sync up on VMENTER, VMEXIT
and interrupts (on the idle CPUs).
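To make the cost concrete, a deliberately naive sketch of the per-core
rendezvous such co-scheduling needs; all helper names are hypothetical,
and the real thing also has to deal with interrupts and idle, which this
ignores:

struct core_rendezvous {
	atomic_t	want_guest;	/* siblings ready to VMENTER */
	struct kvm	*vm;		/* VM currently owning the core */
};

static void rendezvous_before_vmenter(struct core_rendezvous *r)
{
	atomic_inc(&r->want_guest);

	/* Busy wait until the sibling also wants to enter the same VM
	 * or is known to be idle. Pure overhead, paid on every entry. */
	while (atomic_read(&r->want_guest) < 2 && !sibling_is_idle())
		cpu_relax();
}

static void rendezvous_after_vmexit(struct core_rendezvous *r)
{
	atomic_dec(&r->want_guest);

	/* Drag the sibling out of guest mode too, so no guest code runs
	 * while host data can land in the shared L1D. */
	kick_sibling_vmexit();
}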
Another complication is that on overcommitted systems the regular load
balancer will happily migrate vcpu tasks around. So it is fairly tricky
to ensure runnable vcpu threads of the same VM are in fact around to be
run on a core.
Not to mention that Linus has basically said: "No way, Jose".
I know that I worked a little with Tim on this, and I know Google did
their own thing (but have not seen patches from them -- is pjt on this
list?). I've also heard Amazon was also working on things (are they
here?). And I think RHT was also looking into something (mingo, bonzini
-- are you guys reading?)
In any case, if any of that is to go fly we need very solid numbers to
convince Linus to reconsider.
Another idea that I had was to only allow trusted guest kernels, as in
trusted computing, key-verified images etc. Of course, they too can be
compromised, but hopefully it avoids the most egregious hostile guest
scenarios.
* [MODERATED] Re: L1D-Fault KVM mitigation
2018-04-24 9:35 ` [MODERATED] " Peter Zijlstra
@ 2018-04-24 9:48 ` David Woodhouse
2018-04-24 11:04 ` Peter Zijlstra
0 siblings, 1 reply; 89+ messages in thread
From: David Woodhouse @ 2018-04-24 9:48 UTC (permalink / raw)
To: speck
On Tue, 2018-04-24 at 11:35 +0200, speck for Peter Zijlstra wrote:
>
> Another option, that is being explored, is to co-schedule siblings.
> So ensure all siblings either run vcpus of the _same_ VM or idle.
>
> Of course, this is all rather intrusive and ugly and brings with it
> setup costs as well, because you'd have to sync up on VMENTER, VMEXIT
> and interrupts (on the idle CPUs).
I hate to suggest more microcode hacks but... if there was an MSR bit
which, when set, would pause any HT sibling that was currently in VMX
non-root mode, then we could set that up to be automatically set on
vmexit and it would automatically pause the problematic siblings.
Meaning that co-ordinating vmexits with them might actually be
feasible?
The precise definition of 'pause' in the above could survive some
bikeshedding, but basically it shouldn't run any more guest
instructions, but it *should* be allowed to vmexit on interrupts, etc.
* [MODERATED] Re: L1D-Fault KVM mitigation
2018-04-24 9:48 ` David Woodhouse
@ 2018-04-24 11:04 ` Peter Zijlstra
2018-05-23 9:45 ` David Woodhouse
0 siblings, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2018-04-24 11:04 UTC (permalink / raw)
To: speck
On Tue, Apr 24, 2018 at 10:48:12AM +0100, speck for David Woodhouse wrote:
> On Tue, 2018-04-24 at 11:35 +0200, speck for Peter Zijlstra wrote:
> >
> > Another option, that is being explored, is to co-schedule siblings.
> > So ensure all siblings either run vcpus of the _same_ VM or idle.
> >
> > Of course, this is all rather intrusive and ugly and brings with it
> > setup costs as well, because you'd have to sync up on VMENTER, VMEXIT
> > and interrupts (on the idle CPUs).
>
> I hate to suggest more microcode hacks but... if there was an MSR bit
> which, when set, would pause any HT sibling that was currently in VMX
> non-root mode, then we could set that up to be automatically set on
> vmexit and it would automatically pause the problematic siblings.
> Meaning that co-ordinating vmexits with them might actually be
> feasible?
Not sure I'm following. The above assumes a sibling is running a VCPU of
another VM, right? But it could equally well run any regular old task
(including idle).
So only pausing siblings in VMX mode wouldn't help anything. The !VMX
tasks could still be loading stuff into L1.
* [MODERATED] Re: L1D-Fault KVM mitigation
2018-04-24 11:04 ` Peter Zijlstra
@ 2018-05-23 9:45 ` David Woodhouse
2018-05-24 9:45 ` Peter Zijlstra
0 siblings, 1 reply; 89+ messages in thread
From: David Woodhouse @ 2018-05-23 9:45 UTC (permalink / raw)
To: speck
On Tue, 2018-04-24 at 13:04 +0200, speck for Peter Zijlstra wrote:
> On Tue, Apr 24, 2018 at 10:48:12AM +0100, speck for David Woodhouse wrote:
> >
> > On Tue, 2018-04-24 at 11:35 +0200, speck for Peter Zijlstra wrote:
> > >
> > >
> > > Another option, that is being explored, is to co-schedule siblings.
> > > So ensure all siblings either run vcpus of the _same_ VM or idle.
> > >
> > > Of course, this is all rather intrusive and ugly and brings with it
> > > setup costs as well, because you'd have to sync up on VMENTER, VMEXIT
> > > and interrupts (on the idle CPUs).
>
> > I hate to suggest more microcode hacks but... if there was an MSR bit
> > which, when set, would pause any HT sibling that was currently in VMX
> > non-root mode, then we could set that up to be automatically set on
> > vmexit and it would automatically pause the problematic siblings.
> > Meaning that co-ordinating vmexits with them might actually be
> > feasible?
> Not sure I'm following. The above assumes a sibling is running a VCPU of
> another VM, right? But it could equally well run any regular old task
> (including idle).
>
> So only pausing siblings in VMX mode wouldn't help anything. The !VMX
> tasks could still be loading stuff into L1.
That's OK because it's only the VMX tasks which can abuse it, isn't it?
Let's assume we've fixed the problem for normal tasks, by flipping the
top bit in absent PTEs that actually contain swap pointers, etc.
The only thing we have left is VM guests. The microcode bit would say
that *if* a CPU thread is in non-root mode then *it* gets paused unless
its sibling is also in non-root mode for the same VMID.
So when both siblings are actually in the VM, they get to run. If one
sibling comes *out* of the VM to the host kernel or to run (host)
userspace, then the other one doesn't execute any guest instructions.
It can take exceptions which cause a vmexit though.
We'd also want a vCPU to be able to run if its sibling is actually in
the host but *idle* (and has flushed the L1; perhaps we actually
automatically flush the L1 when resuming a sibling that got paused).
It does still depend on gang scheduling (or at least forced sibling
idle which is a subset of that), or a singleton vCPU might *never* get
run. But we were going to have to do something along those lines
anyway. The microcode trick just makes it a lot easier because we don't
have to *explicitly* pause the sibling vCPUs and manage their state on
every vmexit/entry. And avoids potential race conditions with managing
that in software.
* [MODERATED] Re: L1D-Fault KVM mitigation
2018-05-23 9:45 ` David Woodhouse
@ 2018-05-24 9:45 ` Peter Zijlstra
2018-05-24 15:04 ` Thomas Gleixner
0 siblings, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2018-05-24 9:45 UTC (permalink / raw)
To: speck
On Wed, May 23, 2018 at 10:45:45AM +0100, speck for David Woodhouse wrote:
> That's OK because it's only the VMX tasks which can abuse it, isn't it?
If, like you outline below, this is an (optional) ucode assist to
co-scheduling matching VCPU threads, then yes.
> Let's assume we've fixed the problem for normal tasks, by flipping the
> top bit in absent PTEs that actually contain swap pointers, etc.
>
> The only thing we have left is VM guests. The microcode bit would say
> that *if* a CPU thread is in non-root mode then *it* gets paused unless
> its sibling is also in non-root mode for the same VMID.
>
> So when both siblings are actually in the VM, they get to run. If one
> sibling comes *out* of the VM to the host kernel or to run (host)
> userspace, then the other one doesn't execute any guest instructions.
> It can take exceptions which cause a vmexit though.
Would it make sense to time-limit being 'stuck', much like PLE?
> We'd also want a vCPU to be able to run if its sibling is actually in
> the host but *idle* (and has flushed the L1. Perhaps we actually
> automatically flush the L1 when resuming a sibling that got paused).
Right, idle is a wildcard which matches with any VCPU. We don't care
about the cache state of the sibling though. L1 is shared and since
VMENTER must flush L1, that is sufficient.
> It does still depend on gang scheduling (or at least forced sibling
> idle which is a subset of that), or a singleton vCPU might *never* get
> run. But we were going to have to do something along those lines
> anyway.
Linus has opinions on that... but yes, without that all that remains is
disabling HT afaict.
> The microcode trick just makes it a lot easier because we don't
> have to *explicitly* pause the sibling vCPUs and manage their state on
> every vmexit/entry. And avoids potential race conditions with managing
> that in software.
Yes, it would certainly help and avoid a fair bit of ugly. It would, for
instance, avoid having to modify irq_enter() / irq_exit(), which would
otherwise be required (and possibly leak all data touched up until that
point is reached).
But even with all that, adding L1-flush to every VMENTER will hurt lots.
Consider for example the PIO emulation used when booting a guest from a
disk image. That causes VMEXIT/VMENTER at stupendous rates.
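For reference, the flush itself is cheap to express; a minimal sketch,
assuming the IA32_FLUSH_CMD MSR interface plus a software fallback (buffer
size and feature plumbing simplified). The expensive part is not the write
but the refill traffic afterwards:

#define MSR_IA32_FLUSH_CMD	0x0000010b
#define L1D_FLUSH		(1ULL << 0)

static void l1d_flush_before_vmenter(void)
{
	static u8 flush_buf[64 * 1024] __aligned(4096);
	int i;

	/* Ucode-assisted path: one MSR write flushes the L1D. */
	if (boot_cpu_has(X86_FEATURE_FLUSH_L1D)) {
		wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
		return;
	}

	/* Fallback: read a buffer twice the L1D size to displace the
	 * current contents, one cache line at a time. */
	for (i = 0; i < 64 * 1024; i += 64)
		READ_ONCE(flush_buf[i]);
}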
Also, none of this readily addresses the problem of load-balancing
shredding the VCPU localities required for this.
* Re: L1D-Fault KVM mitigation
2018-05-24 9:45 ` Peter Zijlstra
@ 2018-05-24 15:04 ` Thomas Gleixner
2018-05-24 15:33 ` Thomas Gleixner
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2018-05-24 15:04 UTC (permalink / raw)
To: speck
On Thu, 24 May 2018, speck for Peter Zijlstra wrote:
> On Wed, May 23, 2018 at 10:45:45AM +0100, speck for David Woodhouse wrote:
> > The microcode trick just makes it a lot easier because we don't
> > have to *explicitly* pause the sibling vCPUs and manage their state on
> > every vmexit/entry. And avoids potential race conditions with managing
> > that in software.
>
> Yes, it would certainly help and avoid a fair bit of ugly. It would, for
> instance, avoid having to modify irq_enter() / irq_exit(), which would
> otherwise be required (and possibly leak all data touched up until that
> point is reached).
>
> But even with all that, adding L1-flush to every VMENTER will hurt lots.
> Consider for example the PIO emulation used when booting a guest from a
> disk image. That causes VMEXIT/VMENTER at stupendous rates.
Just did a test on SKL Client where I have ucode. It does not have HT so
it's not suffering from any HT side effects when L1D is flushed.
Boot time from a disk image is ~1s measured from the first vcpu enter.
With L1D Flush on vmenter the boot time is about 5-10% slower. And that has
lots of PIO operations in the early boot.
For a kernel build the L1D Flush has an overhead of < 1%.
Netperf guest to host has a slight drop of the throughput in the 2%
range. Host to guest surprisingly goes up by ~3%. Fun stuff!
Now I isolated two host CPUs and pinned the two vCPUs on them to be able to
measure the overhead. Running cyclictest with a period of 25us in the guest
on an isolated guest CPU and monitoring the behaviour with perf on the host
for the corresponding host CPU gives:

            No Flush                         Flush

  1.31 insn per cycle              1.14 insn per cycle
  2e6  L1-dcache-load-misses/sec   26e6 L1-dcache-load-misses/sec
In that simple test the L1D misses go up by a factor of 13.
Now with the whole gang scheduling the numbers I heard through the
grapevine are in the range of factor 130, i.e. 13k% for a simple boot from
disk image. 13 minutes instead of 6 seconds...
That's not surprising at all, though the magnitude is way higher than I
expected. I don't see a realistic chance for vmexit heavy workloads to work
with that synchronization thing at all, whether it's ucode assisted or not.
The only workload types which will ever benefit from that co-scheduling
stuff are CPU bound workloads which more or less never vmexit. But are
those workloads really workloads which benefit from HT? Compute workloads
tend to use floating point or vector instructions which are not really HT
friendly.
Can the virt folks who know what runs on their cloudy offerings please shed
some light on this? Has anyone made a proper analysis of cloud workloads
and their behaviour on HT and their vmexit rates?
Thanks,
tglx
* Re: L1D-Fault KVM mitigation
2018-05-24 15:04 ` Thomas Gleixner
@ 2018-05-24 15:33 ` Thomas Gleixner
2018-05-24 23:18 ` [MODERATED] Encrypted Message Tim Chen
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2018-05-24 15:33 UTC (permalink / raw)
To: speck
On Thu, 24 May 2018, speck for Thomas Gleixner wrote:
> On Thu, 24 May 2018, speck for Peter Zijlstra wrote:
> > On Wed, May 23, 2018 at 10:45:45AM +0100, speck for David Woodhouse wrote:
> > > The microcode trick just makes it a lot easier because we don't
> > > have to *explicitly* pause the sibling vCPUs and manage their state on
> > > every vmexit/entry. And avoids potential race conditions with managing
> > > that in software.
> >
> > Yes, it would certainly help and avoid a fair bit of ugly. It would, for
> > instance, avoid having to modify irq_enter() / irq_exit(), which would
> > otherwise be required (and possibly leak all data touched up until that
> > point is reached).
> >
> > But even with all that, adding L1-flush to every VMENTER will hurt lots.
> > Consider for example the PIO emulation used when booting a guest from a
> > disk image. That causes VMEXIT/VMENTER at stupendous rates.
>
> Just did a test on SKL Client where I have ucode. It does not have HT so
> it's not suffering from any HT side effects when L1D is flushed.
>
> Boot time from a disk image is ~1s measured from the first vcpu enter.
>
> With L1D Flush on vmenter the boot time is about 5-10% slower. And that has
> lots of PIO operations in the early boot.
>
> For a kernel build the L1D Flush has an overhead of < 1%.
>
> Netperf guest to host has a slight drop of the throughput in the 2%
> range. Host to guest surprisingly goes up by ~3%. Fun stuff!
>
> Now I isolated two host CPUs and pinned the two vCPUs on them to be able to
> measure the overhead. Running cyclictest with a period of 25us in the guest
> on an isolated guest CPU and monitoring the behaviour with perf on the host
> for the corresponding host CPU gives
>
>             No Flush                         Flush
>
>   1.31 insn per cycle              1.14 insn per cycle
>
>   2e6  L1-dcache-load-misses/sec   26e6 L1-dcache-load-misses/sec
>
> In that simple test the L1D misses go up by a factor of 13.
>
> Now with the whole gang scheduling the numbers I heard through the
> grapevine are in the range of factor 130, i.e. 13k% for a simple boot from
> disk image. 13 minutes instead of 6 seconds...
>
> That's not surprising at all, though the magnitude is way higher than I
> expected. I don't see a realistic chance for vmexit heavy workloads to work
> with that synchronization thing at all, whether it's ucode assisted or not.
That said, I think we should stage the host side mitigations plus the L1
flush on vmenter ASAP so we are not standing there with our pants down when
the cat comes out of the bag early. That means HT off, but it's still
better than having absolutely nothing.
The gang scheduling nonsense can be added on top if it should
surprisingly turn out to be usable at all.
Thanks,
tglx
* [MODERATED] Encrypted Message
2018-05-24 15:33 ` Thomas Gleixner
@ 2018-05-24 23:18 ` Tim Chen
2018-05-25 18:22 ` Tim Chen
2018-05-26 19:14 ` L1D-Fault KVM mitigation Thomas Gleixner
0 siblings, 2 replies; 89+ messages in thread
From: Tim Chen @ 2018-05-24 23:18 UTC (permalink / raw)
To: speck
From: Tim Chen <tim.c.chen@linux.intel.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: L1D-Fault KVM mitigation
On 05/24/2018 08:33 AM, speck for Thomas Gleixner wrote:
> On Thu, 24 May 2018, speck for Thomas Gleixner wrote:
>> On Thu, 24 May 2018, speck for Peter Zijlstra wrote:
>>> On Wed, May 23, 2018 at 10:45:45AM +0100, speck for David Woodhouse wrote:
>>>> The microcode trick just makes it a lot easier because we don't
>>>> have to *explicitly* pause the sibling vCPUs and manage their state on
>>>> every vmexit/entry. And avoids potential race conditions with managing
>>>> that in software.
>>>
>>> Yes, it would certainly help and avoid a fair bit of ugly. It would, for
>>> instance, avoid having to modify irq_enter() / irq_exit(), which would
>>> otherwise be required (and possibly leak all data touched up until that
>>> point is reached).
>>>
>>> But even with all that, adding L1-flush to every VMENTER will hurt lots.
>>> Consider for example the PIO emulation used when booting a guest from a
>>> disk image. That causes VMEXIT/VMENTER at stupendous rates.
>>
>> Just did a test on SKL Client where I have ucode. It does not have HT so
>> it's not suffering from any HT side effects when L1D is flushed.
>>
>> Boot time from a disk image is ~1s measured from the first vcpu enter.
>>
>> With L1D Flush on vmenter the boot time is about 5-10% slower. And that has
>> lots of PIO operations in the early boot.
>>
>> For a kernel build the L1D Flush has an overhead of < 1%.
>>
>> Netperf guest to host has a slight drop of the throughput in the 2%
>> range. Host to guest surprisingly goes up by ~3%. Fun stuff!
>>
>> Now I isolated two host CPUs and pinned the two vCPUs on them to be able to
>> measure the overhead. Running cyclictest with a period of 25us in the guest
>> on an isolated guest CPU and monitoring the behaviour with perf on the host
>> for the corresponding host CPU gives
>>
>>             No Flush                         Flush
>>
>>   1.31 insn per cycle              1.14 insn per cycle
>>
>>   2e6  L1-dcache-load-misses/sec   26e6 L1-dcache-load-misses/sec
>>
>> In that simple test the L1D misses go up by a factor of 13.
>>
>> Now with the whole gang scheduling the numbers I heard through the
>> grapevine are in the range of factor 130, i.e. 13k% for a simple boot from
>> disk image. 13 minutes instead of 6 seconds...
The performance is highly dependent on how often we VM exit.
Working with Peter Z on his prototype, the performance ranges from
no regression for a network loopback, through ~20% regression for a kernel
compile, to ~100% regression on file IO. PIO brings out the worst aspect
of the synchronization overhead, as we VM exit on every dword PIO read in; the
kernel and initrd images were about 50 MB for the experiment, which led to
13 min of load time.
We may need to do the co-scheduling only when the VM exit rate is low, and
turn off SMT when the VM exit rate becomes too high.
(Note: I haven't added in the L1 flush on VM entry for my experiment; that is on
the todo list.)
Tim
>>
>> That's not surprising at all, though the magnitude is way higher than I
>> expected. I don't see a realistic chance for vmexit heavy workloads to work
>> with that synchronization thing at all, whether it's ucode assisted or not.
>
> That said, I think we should stage the host side mitigations plus the L1
> flush on vmenter ASAP so we are not standing there with our pants down when
> the cat comes out of the bag early. That means HT off, but it's still
> better than having absolutely nothing.
>
> The gang scheduling nonsense can be added on top if it should
> surprisingly turn out to be usable at all.
>
> Thanks,
>
> tglx
>
* [MODERATED] Encrypted Message
2018-05-24 23:18 ` [MODERATED] Encrypted Message Tim Chen
@ 2018-05-25 18:22 ` Tim Chen
2018-05-26 19:14 ` L1D-Fault KVM mitigation Thomas Gleixner
1 sibling, 0 replies; 89+ messages in thread
From: Tim Chen @ 2018-05-25 18:22 UTC (permalink / raw)
To: speck
From: Tim Chen <tim.c.chen@linux.intel.com>
To: speck for Tim Chen <speck@linutronix.de>
Subject: Re: L1D-Fault KVM mitigation
On 05/24/2018 04:18 PM, speck for Tim Chen wrote:
> On 05/24/2018 08:33 AM, speck for Thomas Gleixner wrote:
>> On Thu, 24 May 2018, speck for Thomas Gleixner wrote:
>>> On Thu, 24 May 2018, speck for Peter Zijlstra wrote:
>>>> On Wed, May 23, 2018 at 10:45:45AM +0100, speck for David Woodhouse wrote:
>>>>> The microcode trick just makes it a lot easier because we don't
>>>>> have to *explicitly* pause the sibling vCPUs and manage their state on
>>>>> every vmexit/entry. And avoids potential race conditions with managing
>>>>> that in software.
>>>>
>>>> Yes, it would certainly help and avoid a fair bit of ugly. It would, for
>>>> instance, avoid having to modify irq_enter() / irq_exit(), which would
>>>> otherwise be required (and possibly leak all data touched up until that
>>>> point is reached).
>>>>
>>>> But even with all that, adding L1-flush to every VMENTER will hurt lots.
>>>> Consider for example the PIO emulation used when booting a guest from a
>>>> disk image. That causes VMEXIT/VMENTER at stupendous rates.
>>>
>>> Just did a test on SKL Client where I have ucode. It does not have HT so
>>> it's not suffering from any HT side effects when L1D is flushed.
>>>
>>> Boot time from a disk image is ~1s measured from the first vcpu enter.
>>>
>>> With L1D Flush on vmenter the boot time is about 5-10% slower. And that has
>>> lots of PIO operations in the early boot.
>>>
>>> For a kernel build the L1D Flush has an overhead of < 1%.
>>>
>>> Netperf guest to host has a slight drop of the throughput in the 2%
>>> range. Host to guest surprisingly goes up by ~3%. Fun stuff!
>>>
>>> Now I isolated two host CPUs and pinned the two vCPUs on them to be able to
>>> measure the overhead. Running cyclictest with a period of 25us in the guest
>>> on an isolated guest CPU and monitoring the behaviour with perf on the host
>>> for the corresponding host CPU gives
>>>
>>>             No Flush                         Flush
>>>
>>>   1.31 insn per cycle              1.14 insn per cycle
>>>
>>>   2e6  L1-dcache-load-misses/sec   26e6 L1-dcache-load-misses/sec
>>>
>>> In that simple test the L1D misses go up by a factor of 13.
>>>
>>> Now with the whole gang scheduling the numbers I heard through the
>>> grapevine are in the range of factor 130, i.e. 13k% for a simple boot from
>>> disk image. 13 minutes instead of 6 seconds...
>
> The performance is highly dependent on how often we VM exit.
> Working with Peter Z on his prototype, the performance ranges from
> no regression for a network loopback, through ~20% regression for a kernel
> compile, to ~100% regression on file IO. PIO brings out the worst aspect
> of the synchronization overhead, as we VM exit on every dword PIO read in; the
> kernel and initrd images were about 50 MB for the experiment, which led to
> 13 min of load time.
>
> We may need to do the co-scheduling only when the VM exit rate is low, and
> turn off SMT when the VM exit rate becomes too high.
>
> (Note: I haven't added in the L1 flush on VM entry for my experiment; that is on
> the todo list.)
As a postscript, I added in the L1 flush and the performance numbers
stayed pretty much the same. So the synchronization overhead is
dominant and the L1 flush overhead is secondary.
Tim
* Re: L1D-Fault KVM mitigation
2018-05-24 23:18 ` [MODERATED] Encrypted Message Tim Chen
2018-05-25 18:22 ` Tim Chen
@ 2018-05-26 19:14 ` Thomas Gleixner
2018-05-29 19:29 ` [MODERATED] Encrypted Message Tim Chen
1 sibling, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2018-05-26 19:14 UTC (permalink / raw)
To: speck
On Thu, 24 May 2018, speck for Tim Chen wrote:
>> Now with the whole gang scheduling the numbers I heard through the
>> grapevine are in the range of factor 130, i.e. 13k% for a simple boot from
>> disk image. 13 minutes instead of 6 seconds...
> The performance is highly dependent on how often we VM exit.
That's pretty obvious.
> Working with Peter Z on his prototype, the performance ranges from
> no regression for a network loopback, through ~20% regression for a kernel
> compile, to ~100% regression on file IO.
These numbers are not that interesting when you do not provide comparisons
vs. single threaded. See below.
> PIO brings out the worst aspect of the synchronization overhead, as we VM
> exit on every dword PIO read in; the kernel and initrd images were
> about 50 MB for the experiment, which led to 13 min of load time.
>
> We may need to do the co-scheduling only when the VM exit rate is low, and
> turn off SMT when the VM exit rate becomes too high.
You cannot do that during runtime. That will destroy placement schemes and
whatever. The SMT off decision needs to be done at a quiescent moment,
i.e. before starting VMs.
The PIO case _IS_ interesting because it highlights the problem with the
synchronization overhead. And it does not matter at all whether you VMEXIT
because of a PIO access or due to any other reason. So even if you optimize
it, you still have a gazillion vm_exits on boot. The simple boot
tests I did have ~250k vm_exits in 5 seconds and only half of them are PIO.
Removing the PIO access makes the boot faster because you avoid 50% of the
vmexits, but the rest of the vmexits will still get a massive overhead,
unless you have a scenario where two vCPUs of a guest are runnable and
ready to enter at the same time and vmexit at the same time. Any other
scenario will lose due to the busy waiting synchronization overhead. Just
look at traces and do the math.
I did the following test:
- Two CPUs (siblings) on the host (HSW-EX) fully isolated
- One guest with two vCPUs affine to the isolated host CPUs. idle=poll on
the guest command line to avoid the single vCPU case.
- No L1 Flush
- Running a kernel compile on the guest in the regular virtio disk backed
filesystem. Modified the build script to stop before the final linkage
because that is single threaded.
Time: 88 seconds
vmexits: vCPU0   86,218
         vCPU1   85,703
         total  171,921
That's about 2 vmexits per ms.
Running the same compile single threaded (offlining vCPU1 in the guest)
increases the time to 107 seconds.
107 / 88 = 1.22
I.e. it's 20% slower than the one using two threads. That means that it is
the same slowdown as having two threads synchronized (your number).
So if I take the above example and assume that the overhead of
synchronization is ~20% then the average vmenter/vmexit time is close to
50us.
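Spelling that out: 0.20 * 88 s ~= 17.6 s of synchronization overhead spread
over the 171,921 vmexits above, which is ~100 us per vmexit/vmenter round
trip, i.e. ~50 us for each transition.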
Next I did an experiment with synchronizing the vmenter/vmexit. It's
probably more stupid than what you have as the overhead I observe is way
higher, but then I don't know how and what you tested exactly, so it's hard
to compare.
Nevertheless it gave me very interesting insights via tracing the
synchronization mechanics. The interesting thing is that halfway
synchronous vmexits on both vCPUs are rather cheap. The slightly async ones
make the big difference and at some points in the trace the stuff starts to
ping pong in and out of guest mode without really making progress for a
while. So there is not only the overhead itself, it's timing-dependent
overhead which can accumulate rather fast. And there is absolutely nothing
you can do about that.
So I can see the usefulness for scenarios which David Woodhouse described,
where vCPU and host CPU have a fixed relationship and the guests exit once
in a while. But that should really be done with ucode assistance which
avoids all the nasty synchronization hackery more or less completely.
But if anyone believes that the gang scheduling scheme with full software
synchronization can be applied to random usecases, then he's probably
working for the marketing department and authoring the L1 terminal fuckup
press release and whitepaper.
I'm surely open to a surprisingly clever trick which makes this all work, but
I certainly won't hold my breath.
Thanks,
tglx
* [MODERATED] Encrypted Message
2018-05-26 19:14 ` L1D-Fault KVM mitigation Thomas Gleixner
@ 2018-05-29 19:29 ` Tim Chen
2018-05-29 21:14 ` L1D-Fault KVM mitigation Thomas Gleixner
0 siblings, 1 reply; 89+ messages in thread
From: Tim Chen @ 2018-05-29 19:29 UTC (permalink / raw)
To: speck
From: Tim Chen <tim.c.chen@linux.intel.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: L1D-Fault KVM mitigation
On 05/26/2018 12:14 PM, speck for Thomas Gleixner wrote:
> On Thu, 24 May 2018, speck for Tim Chen wrote:
>
>>
>> We may need to do the co-scheduling only when the VM exit rate is low, and
>> turn off SMT when the VM exit rate becomes too high.
>
> You cannot do that during runtime. That will destroy placement schemes and
> whatever. The SMT off decision needs to be done at a quiescent moment,
> i.e. before starting VMs.
Taking SMT offline is a bit much and too big a hammer. Andi and I thought about
having the scheduler force the other thread idle instead for the high
VM exit rate scenario. We wouldn't have
to bother about syncing with the other, idle thread.
But we have fairness issues, as we would be starving the
other run queue.
>
> Running the same compile single threaded (offlining vCPU1 in the guest)
> increases the time to 107 seconds.
>
> 107 / 88 = 1.22
>
> I.e. it's 20% slower than the one using two threads. That means that it is
> the same slowdown as having two threads synchronized (your number).
Yes, with the compile workload, the HT speedup was mostly eaten up by
the overhead.
>
> So if I take the above example and assume that the overhead of
> synchronization is ~20% then the average vmenter/vmexit time is close to
> 50us.
>
>
> So I can see the usefulness for scenarios which David Woodhouse described,
> where vCPU and host CPU have a fixed relationship and the guests exit once
> in a while. But that should really be done with ucode assistance which
> avoids all the nasty synchronization hackery more or less completely.
The ucode guys are looking into such possibilities. It is tough, as they
have to work within the constraints of limited ucode headroom.
Thanks.
Tim
* Re: L1D-Fault KVM mitigation
2018-05-29 19:29 ` [MODERATED] Encrypted Message Tim Chen
@ 2018-05-29 21:14 ` Thomas Gleixner
2018-05-30 16:38 ` [MODERATED] Encrypted Message Tim Chen
0 siblings, 1 reply; 89+ messages in thread
From: Thomas Gleixner @ 2018-05-29 21:14 UTC (permalink / raw)
To: speck
On Tue, 29 May 2018, speck for Tim Chen wrote:
> On 05/26/2018 12:14 PM, speck for Thomas Gleixner wrote:
> > On Thu, 24 May 2018, speck for Tim Chen wrote:
> >
> > > We may need to do the co-scheduling only when the VM exit rate is low, and
> > > turn off SMT when the VM exit rate becomes too high.
> >
> > You cannot do that during runtime. That will destroy placement schemes and
> > whatever. The SMT off decision needs to be done at a quiescent moment,
> > i.e. before starting VMs.
> Taking SMT offline is a bit much and too big a hammer.
Sorry, that's bullshit. It massively depends on the workload and the
scenario. I've explained it a gazillion times by now that there are enough
workloads which will massively lose with SMT on and the extra overhead. It's
trivial enough to figure that out without implementing all the bells and
whistles.
> Andi and I thought about having the scheduler force the other thread
> idle instead for the high VM exit rate scenario. We wouldn't have to bother
> about syncing with the other, idle thread.
You still have to make sure that the other idle thread _IS_ idle. It's not
the full synchronization scheme, but it's extra work in a hotpath when the
guest is exit heavy. And you still have the problem of interrupt and
softirqs being served on the 'idle' sibling. It's not that simple.
> But we have fairness issues, as we would be starving the
> other run queue.
That's more than obvious. And you will create even worse issues because
workloads which have a placement scheme, i.e. vCPU affinities, will have no
chance to migrate to another CPU. Not to mention wrecking the load
balancer completely.
> > I.e. it's 20% slower than the one using two threads. That means that it is
> > the same slowdown as having two threads synchronized (your number).
> Yes, with the compile workload, the HT speedup was mostly eaten up by
> the overhead.
So what is the point of the exercise?
You will not find a generic solution for this problem ever simply because
the workloads and guest scenarios are too different. There are clearly
scenarios which can benefit, but at the same time there are scenarios which
will be way worse off than with SMT disabled.
I completely understand that Intel wants to avoid the 'disable SMT'
solution by all means, but this cannot be done with something which is
obviously creating more problems than it solves in the first place.
At some point reality has to kick in and you have to admit that there is no
generic solution and the only solution for a lot of use cases will be to
disable SMT. Special workloads like the fully partitioned
ones David mentioned do not need the extra mess all over the place,
especially not when there is ucode assist, at least to the extent that it fits
into the patch space. Some of it really should not take a huge amount of
effort, like the forced sibling vmexit to avoid the whole IPI machinery.
Thanks,
tglx
* [MODERATED] Encrypted Message
2018-05-29 21:14 ` L1D-Fault KVM mitigation Thomas Gleixner
@ 2018-05-30 16:38 ` Tim Chen
0 siblings, 0 replies; 89+ messages in thread
From: Tim Chen @ 2018-05-30 16:38 UTC (permalink / raw)
To: speck
From: Tim Chen <tim.c.chen@linux.intel.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: L1D-Fault KVM mitigation
On 05/29/2018 02:14 PM, speck for Thomas Gleixner wrote:
>
>> Yes, with the compile workload, the HT speedup was mostly eaten up by
>> the overhead.
>
> So what is the point of the exercise?
>
> You will not find a generic solution for this problem ever simply because
> the workloads and guest scenarios are too different. There are clearly
> scenarios which can benefit, but at the same time there are scenarios which
> will be way worse off than with SMT disabled.
>
> I completely understand that Intel wants to avoid the 'disable SMT'
> solution by all means, but this cannot be done with something which is
> obviously creating more problems than it solves in the first place.
>
> At some point reality has to kick in and you have to admit that there is no
> generic solution and the only solution for a lot of use cases will be to
> disable SMT. Special workloads like the fully partitioned
> ones David mentioned do not need the extra mess all over the place,
> especially not when there is ucode assist, at least to the extent that it fits
> into the patch space. Some of it really should not take a huge amount of
> effort, like the forced sibling vmexit to avoid the whole IPI machinery.
>
Having to sync on VM entry, on VM exit, and on interrupts to the idle sibling
sucks. Hopefully the ucode guys can come up with something
to provide an option that forces the sibling to vmexit on vmexit,
and on interrupts to the idle sibling. This should cut the sync overhead in half.
Then only VM entry would need to be synced, should we still want to
do co-scheduling.
Thanks.
Tim
end of thread, other threads: [~2019-03-08 6:37 UTC | newest]
Thread overview: 89+ messages
2019-03-01 21:47 [patch V6 00/14] MDS basics 0 Thomas Gleixner
2019-03-01 21:47 ` [patch V6 01/14] MDS basics 1 Thomas Gleixner
2019-03-02 0:06 ` [MODERATED] " Frederic Weisbecker
2019-03-01 21:47 ` [patch V6 02/14] MDS basics 2 Thomas Gleixner
2019-03-02 0:34 ` [MODERATED] " Frederic Weisbecker
2019-03-02 8:34 ` Greg KH
2019-03-05 17:54 ` Borislav Petkov
2019-03-01 21:47 ` [patch V6 03/14] MDS basics 3 Thomas Gleixner
2019-03-02 1:12 ` [MODERATED] " Frederic Weisbecker
2019-03-01 21:47 ` [patch V6 04/14] MDS basics 4 Thomas Gleixner
2019-03-02 1:28 ` [MODERATED] " Frederic Weisbecker
2019-03-05 14:52 ` Thomas Gleixner
2019-03-06 20:00 ` [MODERATED] " Andrew Cooper
2019-03-06 20:32 ` Thomas Gleixner
2019-03-07 23:56 ` [MODERATED] " Andi Kleen
2019-03-08 0:36 ` Linus Torvalds
2019-03-01 21:47 ` [patch V6 05/14] MDS basics 5 Thomas Gleixner
2019-03-02 1:37 ` [MODERATED] " Frederic Weisbecker
2019-03-07 23:59 ` Andi Kleen
2019-03-08 6:37 ` Thomas Gleixner
2019-03-01 21:47 ` [patch V6 06/14] MDS basics 6 Thomas Gleixner
2019-03-04 6:28 ` [MODERATED] Encrypted Message Jon Masters
2019-03-05 14:55 ` Thomas Gleixner
2019-03-01 21:47 ` [patch V6 07/14] MDS basics 7 Thomas Gleixner
2019-03-02 2:22 ` [MODERATED] " Frederic Weisbecker
2019-03-05 15:30 ` Thomas Gleixner
2019-03-06 15:49 ` [MODERATED] " Frederic Weisbecker
2019-03-06 5:21 ` Borislav Petkov
2019-03-01 21:47 ` [patch V6 08/14] MDS basics 8 Thomas Gleixner
2019-03-03 2:54 ` [MODERATED] " Frederic Weisbecker
2019-03-04 6:57 ` [MODERATED] Encrypted Message Jon Masters
2019-03-04 7:06 ` Jon Masters
2019-03-04 8:12 ` Jon Masters
2019-03-05 15:34 ` Thomas Gleixner
2019-03-06 16:21 ` [MODERATED] " Jon Masters
2019-03-06 14:11 ` [MODERATED] Re: [patch V6 08/14] MDS basics 8 Borislav Petkov
2019-03-01 21:47 ` [patch V6 09/14] MDS basics 9 Thomas Gleixner
2019-03-06 16:14 ` [MODERATED] " Frederic Weisbecker
2019-03-01 21:47 ` [patch V6 10/14] MDS basics 10 Thomas Gleixner
2019-03-04 6:45 ` [MODERATED] Encrypted Message Jon Masters
2019-03-05 18:42 ` [MODERATED] Re: [patch V6 10/14] MDS basics 10 Andrea Arcangeli
2019-03-06 19:15 ` Thomas Gleixner
2019-03-06 14:31 ` [MODERATED] " Borislav Petkov
2019-03-06 15:30 ` Thomas Gleixner
2019-03-06 18:35 ` Thomas Gleixner
2019-03-06 19:34 ` [MODERATED] Re: " Borislav Petkov
2019-03-01 21:47 ` [patch V6 11/14] MDS basics 11 Thomas Gleixner
2019-03-01 21:47 ` [patch V6 12/14] MDS basics 12 Thomas Gleixner
2019-03-04 5:47 ` [MODERATED] Encrypted Message Jon Masters
2019-03-05 16:04 ` Thomas Gleixner
2019-03-05 16:40 ` [MODERATED] Re: [patch V6 12/14] MDS basics 12 mark gross
2019-03-06 14:42 ` Borislav Petkov
2019-03-01 21:47 ` [patch V6 13/14] MDS basics 13 Thomas Gleixner
2019-03-03 4:01 ` [MODERATED] " Josh Poimboeuf
2019-03-05 16:04 ` Thomas Gleixner
2019-03-05 16:43 ` [MODERATED] " mark gross
2019-03-01 21:47 ` [patch V6 14/14] MDS basics 14 Thomas Gleixner
2019-03-01 23:48 ` [patch V6 00/14] MDS basics 0 Thomas Gleixner
2019-03-04 5:30 ` [MODERATED] Encrypted Message Jon Masters
-- strict thread matches above, loose matches on Subject: below --
2019-03-05 16:43 [MODERATED] Starting to go public? Linus Torvalds
2019-03-05 17:02 ` [MODERATED] " Andrew Cooper
2019-03-05 20:36 ` Jiri Kosina
2019-03-05 22:31 ` Andrew Cooper
2019-03-06 16:18 ` [MODERATED] Encrypted Message Jon Masters
2019-03-05 17:10 ` Jon Masters
2019-03-04 1:21 [MODERATED] [PATCH RFC 0/4] Proposed cmdline improvements Josh Poimboeuf
2019-03-04 1:23 ` [MODERATED] [PATCH RFC 1/4] 1 Josh Poimboeuf
2019-03-04 3:55 ` [MODERATED] Encrypted Message Jon Masters
2019-03-04 7:30 ` [MODERATED] Re: [PATCH RFC 1/4] 1 Greg KH
2019-03-04 7:45 ` [MODERATED] Encrypted Message Jon Masters
2019-03-04 1:24 ` [MODERATED] [PATCH RFC 3/4] 3 Josh Poimboeuf
2019-03-04 3:58 ` [MODERATED] Encrypted Message Jon Masters
2019-03-04 17:17 ` [MODERATED] " Josh Poimboeuf
2019-03-06 16:22 ` [MODERATED] " Jon Masters
2019-03-04 1:25 ` [MODERATED] [PATCH RFC 4/4] 4 Josh Poimboeuf
2019-03-04 4:07 ` [MODERATED] Encrypted Message Jon Masters
2019-02-24 15:07 [MODERATED] [PATCH v6 00/43] MDSv6 Andi Kleen
2019-02-24 15:07 ` [MODERATED] [PATCH v6 10/43] MDSv6 Andi Kleen
2019-02-25 16:30 ` [MODERATED] " Greg KH
2019-02-25 16:41 ` [MODERATED] Encrypted Message Jon Masters
2019-02-24 15:07 ` [MODERATED] [PATCH v6 31/43] MDSv6 Andi Kleen
2019-02-25 15:19 ` [MODERATED] " Greg KH
2019-02-25 15:34 ` Andi Kleen
2019-02-25 15:49 ` Greg KH
2019-02-25 15:52 ` [MODERATED] Encrypted Message Jon Masters
2019-02-25 16:00 ` [MODERATED] " Greg KH
2019-02-25 16:19 ` [MODERATED] " Jon Masters
2019-02-22 22:24 [patch V4 00/11] MDS basics Thomas Gleixner
2019-02-22 22:24 ` [patch V4 04/11] x86/speculation/mds: Add mds_clear_cpu_buffer() Thomas Gleixner
2019-02-26 14:19 ` [MODERATED] " Josh Poimboeuf
2019-03-01 20:58 ` [MODERATED] Encrypted Message Jon Masters
2019-03-01 22:14 ` Jon Masters
2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
2019-02-21 23:44 ` [patch V3 4/9] MDS basics 4 Thomas Gleixner
2019-02-22 7:45 ` [MODERATED] Encrypted Message Jon Masters
2019-02-20 15:07 [patch V2 00/10] MDS basics+ 0 Thomas Gleixner
2019-02-20 15:07 ` [patch V2 04/10] MDS basics+ 4 Thomas Gleixner
2019-02-20 17:10 ` [MODERATED] " mark gross
2019-02-21 19:26 ` [MODERATED] Encrypted Message Tim Chen
2019-02-19 12:44 [patch 0/8] MDS basics 0 Thomas Gleixner
2019-02-21 16:14 ` [MODERATED] Encrypted Message Jon Masters
2019-02-07 23:41 [MODERATED] [PATCH v3 0/6] PERFv3 Andi Kleen
2019-02-07 23:41 ` [MODERATED] [PATCH v3 2/6] PERFv3 Andi Kleen
2019-02-08 0:51 ` [MODERATED] Re: [SUSPECTED SPAM][PATCH " Andrew Cooper
2019-02-08 9:01 ` Peter Zijlstra
2019-02-08 9:39 ` Peter Zijlstra
2019-02-08 10:53 ` [MODERATED] [RFC][PATCH] performance walnuts Peter Zijlstra
2019-02-15 23:45 ` [MODERATED] Encrypted Message Jon Masters
2019-01-12 1:29 [MODERATED] [PATCH v4 00/28] MDSv4 2 Andi Kleen
2019-01-12 1:29 ` [MODERATED] [PATCH v4 05/28] MDSv4 10 Andi Kleen
2019-01-14 19:20 ` [MODERATED] " Dave Hansen
2019-01-18 7:33 ` [MODERATED] Encrypted Message Jon Masters
2019-01-14 23:39 ` Tim Chen
2019-01-12 1:29 ` [MODERATED] [PATCH v4 10/28] MDSv4 24 Andi Kleen
2019-01-15 1:05 ` [MODERATED] Encrypted Message Tim Chen
2018-06-12 17:29 [MODERATED] FYI - Reading uncached memory Jon Masters
2018-06-14 16:59 ` [MODERATED] Encrypted Message Tim Chen
2018-05-29 19:42 [MODERATED] [PATCH 0/2] L1TF KVM 0 Paolo Bonzini
[not found] ` <20180529194240.7F1336110A@crypto-ml.lab.linutronix.de>
2018-05-29 22:49 ` [PATCH 1/2] L1TF KVM 1 Thomas Gleixner
2018-05-29 23:54 ` [MODERATED] " Andrew Cooper
2018-05-30 9:01 ` Paolo Bonzini
2018-06-04 8:24 ` [MODERATED] " Martin Pohlack
2018-06-04 13:11 ` [MODERATED] Is: Tim, Q to you. Was:Re: " Konrad Rzeszutek Wilk
2018-06-04 17:59 ` [MODERATED] Encrypted Message Tim Chen
2018-06-05 23:34 ` Tim Chen
2018-06-05 23:37 ` Tim Chen
2018-06-07 19:11 ` Tim Chen
2018-05-17 20:53 SSB status - V18 pushed out Thomas Gleixner
2018-05-18 13:54 ` [MODERATED] Is: Sleep states ?Was:Re: " Konrad Rzeszutek Wilk
2018-05-18 14:29 ` Thomas Gleixner
2018-05-18 19:50 ` [MODERATED] Encrypted Message Tim Chen
2018-05-02 21:51 [patch V11 00/16] SSB 0 Thomas Gleixner
2018-05-03 4:27 ` [MODERATED] Encrypted Message Tim Chen
2018-04-24 9:06 [MODERATED] L1D-Fault KVM mitigation Joerg Roedel
2018-04-24 9:35 ` [MODERATED] " Peter Zijlstra
2018-04-24 9:48 ` David Woodhouse
2018-04-24 11:04 ` Peter Zijlstra
2018-05-23 9:45 ` David Woodhouse
2018-05-24 9:45 ` Peter Zijlstra
2018-05-24 15:04 ` Thomas Gleixner
2018-05-24 15:33 ` Thomas Gleixner
2018-05-24 23:18 ` [MODERATED] Encrypted Message Tim Chen
2018-05-25 18:22 ` Tim Chen
2018-05-26 19:14 ` L1D-Fault KVM mitigation Thomas Gleixner
2018-05-29 19:29 ` [MODERATED] Encrypted Message Tim Chen
2018-05-29 21:14 ` L1D-Fault KVM mitigation Thomas Gleixner
2018-05-30 16:38 ` [MODERATED] Encrypted Message Tim Chen