historical-speck.lore.kernel.org archive mirror
* [patch V3 0/9] MDS basics 0
@ 2019-02-21 23:44 Thomas Gleixner
  2019-02-21 23:44 ` [patch V3 1/9] MDS basics 1 Thomas Gleixner
                   ` (8 more replies)
  0 siblings, 9 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

Hi!

Thanks to everyone for the valuable feedback!

Changes since V2:

 - Added the NMI mitigation and added an explanation. Thanks Andi and
   Kees.

 - Fixed the VERW asm magic as pointed out by Andrew and added
   more explanation as requested by Borislav and Andrew.

 - Adopted Peter's static branch suggestions

 - Renamed the _HOPE mode to _VMWERV along with an explanation of the
   acronym in the changelog. Thanks Mark for the inspiration.

 - Updated documentation. The return to user section has changed a
   lot. Added some explanation about assumptions and hopefully fixed all
   issues mentioned by Borislav, Andrew, Greg....

 - Cleaned up the bitmask issues in the speculation MSR defines as
   pointed out by Greg.

 - Got the Copy & Paste in the sysfs code right this time.

 - Dropped the conditional mode stuff for now. Needs more thought on
   all ends and I wish we just don't need it at all :)

 - Collected a few Reviewed-by tags, but not for the patches which
   have significant changes.

The admin documentation is still WIP, so not included.

It's also available through the git repository in the force updated
branch: WIP.mds

Thanks,

	tglx


 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [patch V3 1/9] MDS basics 1
  2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
  2019-02-22  6:53   ` [MODERATED] " Greg KH
  2019-02-22  7:30   ` Borislav Petkov
  2019-02-21 23:44 ` [patch V3 2/9] MDS basics 2 Thomas Gleixner
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

Subject: [patch V3 1/9] x86/msr-index: Cleanup bit defines
From: Thomas Gleixner <tglx@linutronix.de>

Greg pointed out that speculation related bit defines are using (1 << N)
format instead of BIT(N). Aside from that, (1 << N) is wrong as it should
be 1UL at least.

Clean it up.

Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/msr-index.h |   34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_X86_MSR_INDEX_H
 #define _ASM_X86_MSR_INDEX_H
 
+#include <linux/bits.h>
+
 /*
  * CPU model specific register (MSR) numbers.
  *
@@ -40,14 +42,14 @@
 /* Intel MSRs. Some also available on other CPUs */
 
 #define MSR_IA32_SPEC_CTRL		0x00000048 /* Speculation Control */
-#define SPEC_CTRL_IBRS			(1 << 0)   /* Indirect Branch Restricted Speculation */
+#define SPEC_CTRL_IBRS			BIT(0)	   /* Indirect Branch Restricted Speculation */
 #define SPEC_CTRL_STIBP_SHIFT		1	   /* Single Thread Indirect Branch Predictor (STIBP) bit */
-#define SPEC_CTRL_STIBP			(1 << SPEC_CTRL_STIBP_SHIFT)	/* STIBP mask */
+#define SPEC_CTRL_STIBP			BIT(SPEC_CTRL_STIBP_SHIFT)	/* STIBP mask */
 #define SPEC_CTRL_SSBD_SHIFT		2	   /* Speculative Store Bypass Disable bit */
-#define SPEC_CTRL_SSBD			(1 << SPEC_CTRL_SSBD_SHIFT)	/* Speculative Store Bypass Disable */
+#define SPEC_CTRL_SSBD			BIT(SPEC_CTRL_SSBD_SHIFT)	/* Speculative Store Bypass Disable */
 
 #define MSR_IA32_PRED_CMD		0x00000049 /* Prediction Command */
-#define PRED_CMD_IBPB			(1 << 0)   /* Indirect Branch Prediction Barrier */
+#define PRED_CMD_IBPB			BIT(0)	   /* Indirect Branch Prediction Barrier */
 
 #define MSR_PPIN_CTL			0x0000004e
 #define MSR_PPIN			0x0000004f
@@ -69,20 +71,20 @@
 #define MSR_MTRRcap			0x000000fe
 
 #define MSR_IA32_ARCH_CAPABILITIES	0x0000010a
-#define ARCH_CAP_RDCL_NO		(1 << 0)   /* Not susceptible to Meltdown */
-#define ARCH_CAP_IBRS_ALL		(1 << 1)   /* Enhanced IBRS support */
-#define ARCH_CAP_SKIP_VMENTRY_L1DFLUSH	(1 << 3)   /* Skip L1D flush on vmentry */
-#define ARCH_CAP_SSB_NO			(1 << 4)   /*
-						    * Not susceptible to Speculative Store Bypass
-						    * attack, so no Speculative Store Bypass
-						    * control required.
-						    */
+#define ARCH_CAP_RDCL_NO		BIT(0)	/* Not susceptible to Meltdown */
+#define ARCH_CAP_IBRS_ALL		BIT(1)	/* Enhanced IBRS support */
+#define ARCH_CAP_SKIP_VMENTRY_L1DFLUSH	BIT(3)	/* Skip L1D flush on vmentry */
+#define ARCH_CAP_SSB_NO			BIT(4)	/*
+						 * Not susceptible to Speculative Store Bypass
+						 * attack, so no Speculative Store Bypass
+						 * control required.
+						 */
 
 #define MSR_IA32_FLUSH_CMD		0x0000010b
-#define L1D_FLUSH			(1 << 0)   /*
-						    * Writeback and invalidate the
-						    * L1 data cache.
-						    */
+#define L1D_FLUSH			BIT(0)	/*
+						 * Writeback and invalidate the
+						 * L1 data cache.
+						 */
 
 #define MSR_IA32_BBL_CR_CTL		0x00000119
 #define MSR_IA32_BBL_CR_CTL3		0x0000011e
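
For illustration, the difference between the two forms can be sketched in
plain user space C. DEMO_BIT(), demo_msr_bit() and
demo_signed_bit31_widened() are hypothetical stand-ins and not kernel
symbols; the kernel's BIT() comes from <linux/bits.h> and is built on an
unsigned long constant. The sign extension case is modeled with INT32_MIN
rather than by evaluating (1 << 31), because shifting into the sign bit of
an int is not something portable C should rely on.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical user space stand-in for the kernel's BIT() macro. */
#define DEMO_BIT(nr)	(1UL << (nr))

/* Build the mask for bit @nr of a 64-bit MSR value. */
static inline uint64_t demo_msr_bit(unsigned int nr)
{
	/* The shift count must stay below the width of unsigned long. */
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	return (uint64_t)DEMO_BIT(nr);
}

/*
 * What a signed 32-bit value with only bit 31 set turns into when it is
 * widened to 64 bits: the sign bit is copied into the upper half.
 */
static inline uint64_t demo_signed_bit31_widened(void)
{
	return (uint64_t)(int32_t)INT32_MIN;
}
```

demo_msr_bit(31) yields 0x80000000, while demo_signed_bit31_widened()
yields 0xffffffff80000000, which would set unrelated upper bits if it were
used as a mask for a 64-bit MSR value.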


* [patch V3 2/9] MDS basics 2
  2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
  2019-02-21 23:44 ` [patch V3 1/9] MDS basics 1 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
  2019-02-21 23:44 ` [patch V3 3/9] MDS basics 3 Thomas Gleixner
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

Subject: [patch V3 2/9] x86/speculation/mds: Add basic bug infrastructure for MDS
From: Andi Kleen <ak@linux.intel.com>

Microarchitectural Data Sampling (MDS) is a class of side channel attacks
on internal buffers in Intel CPUs. The variants are:

 - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
 - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
 - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)

MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
dependent load (store-to-load forwarding) as an optimization. The forward
can also happen to a faulting or assisting load operation for a different
memory address, which can be exploited under certain conditions. Store
buffers are partitioned between Hyper-Threads so cross thread forwarding is
not possible. But if a thread enters or exits a sleep state the store
buffer is repartitioned which can expose data from one thread to the other.

MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
L1 miss situations and to hold data which is returned or sent in response
to a memory or I/O operation. Fill buffers can forward data to a load
operation and also write data to the cache. When the fill buffer is
deallocated it can retain the stale data of the preceding operations which
can then be forwarded to a faulting or assisting load operation, which can
be exploited under certain conditions. Fill buffers are shared between
Hyper-Threads so cross thread leakage is possible.

MLPDS leaks Load Port Data. Load ports are used to perform load operations
from memory or I/O. The received data is then forwarded to the register
file or a subsequent operation. In some implementations the Load Port can
contain stale data from a previous operation which can be forwarded to
faulting or assisting loads under certain conditions, which again can be
exploited eventually. Load ports are shared between Hyper-Threads so cross
thread leakage is possible.

All variants have the same mitigation for the single CPU thread case (SMT
off), so the kernel can treat them as one MDS issue.

Add the basic infrastructure to detect if the current CPU is affected by
MDS.

[ tglx: Rewrote changelog ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
V3: Addressed Borislav's review comments
---
 arch/x86/include/asm/cpufeatures.h |    2 ++
 arch/x86/include/asm/msr-index.h   |    5 +++++
 arch/x86/kernel/cpu/common.c       |   13 +++++++++++++
 3 files changed, 20 insertions(+)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -344,6 +344,7 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
 #define X86_FEATURE_AVX512_4VNNIW	(18*32+ 2) /* AVX-512 Neural Network Instructions */
 #define X86_FEATURE_AVX512_4FMAPS	(18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_MD_CLEAR		(18*32+10) /* VERW clears CPU buffers */
 #define X86_FEATURE_PCONFIG		(18*32+18) /* Intel PCONFIG */
 #define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */
 #define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */
@@ -381,5 +382,6 @@
 #define X86_BUG_SPECTRE_V2		X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
 #define X86_BUG_SPEC_STORE_BYPASS	X86_BUG(17) /* CPU is affected by speculative store bypass attack */
 #define X86_BUG_L1TF			X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
+#define X86_BUG_MDS			X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -79,6 +79,11 @@
 						 * attack, so no Speculative Store Bypass
 						 * control required.
 						 */
+#define ARCH_CAP_MDS_NO			BIT(5)   /*
+						  * Not susceptible to
+						  * Microarchitectural Data
+						  * Sampling (MDS) vulnerabilities.
+						  */
 
 #define MSR_IA32_FLUSH_CMD		0x0000010b
 #define L1D_FLUSH			BIT(0)	/*
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -998,6 +998,14 @@ static const __initconst struct x86_cpu_
 	{}
 };
 
+static const __initconst struct x86_cpu_id cpu_no_mds[] = {
+	/* in addition to cpu_no_speculation */
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_GOLDMONT	},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_GOLDMONT_X	},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_GOLDMONT_PLUS	},
+	{}
+};
+
 static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 {
 	u64 ia32_cap = 0;
@@ -1019,6 +1027,11 @@ static void __init cpu_set_bug_bits(stru
 	if (ia32_cap & ARCH_CAP_IBRS_ALL)
 		setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
 
+	if ((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+	    !x86_match_cpu(cpu_no_mds)) &&
+	    !(ia32_cap & ARCH_CAP_MDS_NO))
+		setup_force_cpu_bug(X86_BUG_MDS);
+
 	if (x86_match_cpu(cpu_no_meltdown))
 		return;
 

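The decision made in cpu_set_bug_bits() above can be modeled in user space
as follows. This is a minimal sketch: struct demo_cpu and
demo_cpu_has_mds_bug() are hypothetical names, and the two boolean fields
stand in for the vendor check and the x86_match_cpu(cpu_no_mds) lookup.
Only the bit position of the MDS_NO capability is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as ARCH_CAP_MDS_NO in the patch. */
#define DEMO_ARCH_CAP_MDS_NO	(UINT64_C(1) << 5)

/* Hypothetical, simplified view of the data the kernel looks at. */
struct demo_cpu {
	bool		is_intel;	/* vendor == X86_VENDOR_INTEL */
	bool		on_no_mds_list;	/* model matched the cpu_no_mds table */
	uint64_t	arch_cap;	/* IA32_ARCH_CAPABILITIES, 0 if absent */
};

/* Returns true when the bug bit would be set for this CPU. */
static bool demo_cpu_has_mds_bug(const struct demo_cpu *c)
{
	if (!c->is_intel)
		return false;
	if (c->on_no_mds_list)
		return false;
	/* The CPU itself can declare that it is not affected. */
	return !(c->arch_cap & DEMO_ARCH_CAP_MDS_NO);
}
```

An Intel CPU which is neither on the list nor advertises MDS_NO is treated
as affected, so an unknown model defaults to being mitigated.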

* [patch V3 3/9] MDS basics 3
  2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
  2019-02-21 23:44 ` [patch V3 1/9] MDS basics 1 Thomas Gleixner
  2019-02-21 23:44 ` [patch V3 2/9] MDS basics 2 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
  2019-02-21 23:44 ` [patch V3 4/9] MDS basics 4 Thomas Gleixner
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

From: Andi Kleen <ak@linux.intel.com>
Subject: [patch V3 3/9] x86/kvm: Expose X86_FEATURE_MD_CLEAR to guests

X86_FEATURE_MD_CLEAR is a new CPUID bit which is set when microcode
provides the mechanism to invoke a flush of various exploitable CPU buffers
by invoking the VERW instruction.

Hand it through to guests so they can adjust their mitigations.

This also requires corresponding qemu changes, which are available
separately.

[ tglx: Massaged changelog ]

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kvm/cpuid.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -409,7 +409,8 @@ static inline int __do_cpuid_ent(struct
 	/* cpuid 7.0.edx*/
 	const u32 kvm_cpuid_7_0_edx_x86_features =
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
-		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP);
+		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
+		F(MD_CLEAR);
 
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();


* [patch V3 4/9] MDS basics 4
  2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
                   ` (2 preceding siblings ...)
  2019-02-21 23:44 ` [patch V3 3/9] MDS basics 3 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
  2019-02-22  6:58   ` [MODERATED] " Greg KH
                     ` (2 more replies)
  2019-02-21 23:44 ` [patch V3 5/9] MDS basics 5 Thomas Gleixner
                   ` (4 subsequent siblings)
  8 siblings, 3 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

Subject: [patch V3 4/9] x86/speculation/mds: Add mds_clear_cpu_buffer()
From: Thomas Gleixner <tglx@linutronix.de>

The Microarchitectural Data Sampling (MDS) vulnerabilities are mitigated by
clearing the affected CPU buffers. The mechanism for clearing the buffers
uses the unused and obsolete VERW instruction in combination with a
microcode update which triggers a CPU buffer clear when VERW is executed.

Provide an inline function with the assembly magic. The argument of the VERW
instruction must be a memory operand as documented:

  "MD_CLEAR enumerates that the memory-operand variant of VERW (for
   example, VERW m16) has been extended to also overwrite buffers affected
   by MDS. This buffer overwriting functionality is not guaranteed for the register
   operand variant of VERW."

Add x86 specific documentation about MDS and the internal workings of the
mitigation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2 --> V3: Add VERW documentation and fix typos/grammar..., dropped 'i(0)'
       	   Add more details to the documentation file

V1 --> V2: Add "cc" clobber and documentation
---
 Documentation/index.rst              |    1 
 Documentation/x86/conf.py            |   10 +++
 Documentation/x86/index.rst          |    8 ++
 Documentation/x86/mds.rst            |   94 +++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/nospec-branch.h |   23 ++++++++
 5 files changed, 136 insertions(+)

--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -101,6 +101,7 @@ implementation.
    :maxdepth: 2
 
    sh/index
+   x86/index
 
 Filesystem Documentation
 ------------------------
--- /dev/null
+++ b/Documentation/x86/conf.py
@@ -0,0 +1,10 @@
+# -*- coding: utf-8; mode: python -*-
+
+project = "X86 architecture specific documentation"
+
+tags.add("subproject")
+
+latex_documents = [
+    ('index', 'x86.tex', project,
+     'The kernel development community', 'manual'),
+]
--- /dev/null
+++ b/Documentation/x86/index.rst
@@ -0,0 +1,8 @@
+==========================
+x86 architecture specifics
+==========================
+
+.. toctree::
+   :maxdepth: 1
+
+   mds
--- /dev/null
+++ b/Documentation/x86/mds.rst
@@ -0,0 +1,94 @@
+Microarchitectural Data Sampling (MDS) mitigation
+=================================================
+
+Microarchitectural Data Sampling (MDS) is a class of side channel attacks
+on internal buffers in Intel CPUs. The variants are:
+
+ - Microarchitectural Store Buffer Data Sampling (MSBDS) (CVE-2018-12126)
+ - Microarchitectural Fill Buffer Data Sampling (MFBDS) (CVE-2018-12130)
+ - Microarchitectural Load Port Data Sampling (MLPDS) (CVE-2018-12127)
+
+MSBDS leaks Store Buffer Entries which can be speculatively forwarded to a
+dependent load (store-to-load forwarding) as an optimization. The forward
+can also happen to a faulting or assisting load operation for a different
+memory address, which can be exploited under certain conditions. Store
+buffers are partitioned between Hyper-Threads so cross thread forwarding is
+not possible. But if a thread enters or exits a sleep state the store
+buffer is repartitioned which can expose data from one thread to the other.
+
+MFBDS leaks Fill Buffer Entries. Fill buffers are used internally to manage
+L1 miss situations and to hold data which is returned or sent in response
+to a memory or I/O operation. Fill buffers can forward data to a load
+operation and also write data to the cache. When the fill buffer is
+deallocated it can retain the stale data of the preceding operations which
+can then be forwarded to a faulting or assisting load operation, which can
+be exploited under certain conditions. Fill buffers are shared between
+Hyper-Threads so cross thread leakage is possible.
+
+MLPDS leaks Load Port Data. Load ports are used to perform load operations
+from memory or I/O. The received data is then forwarded to the register
+file or a subsequent operation. In some implementations the Load Port can
+contain stale data from a previous operation which can be forwarded to
+faulting or assisting loads under certain conditions, which again can be
+exploited eventually. Load ports are shared between Hyper-Threads so cross
+thread leakage is possible.
+
+Exposure assumptions
+--------------------
+
+It is assumed that attack code resides in user space or in a guest with one
+exception. The rationale behind this assumption is that the code construct
+needed for exploiting MDS requires:
+
+ - to control the load to trigger a fault or assist
+
+ - to have a disclosure gadget which exposes the speculatively accessed
+   data for consumption through a side channel.
+
+ - to control the pointer through which the disclosure gadget exposes the
+   data
+
+The existence of such a construct cannot be excluded with 100% certainty,
+but the complexity involved makes it extremely unlikely.
+
+There is one exception, which is untrusted BPF. The functionality of
+untrusted BPF is limited, but it needs to be thoroughly investigated
+whether it can be used to create such a construct.
+
+
+Mitigation strategy
+-------------------
+
+All variants have the same mitigation strategy at least for the single CPU
+thread case (SMT off): Force the CPU to clear the affected buffers.
+
+This is achieved by using the otherwise unused and obsolete VERW
+instruction in combination with a microcode update. The microcode clears
+the affected CPU buffers when the VERW instruction is executed.
+
+For virtualization there are two ways to achieve CPU buffer
+clearing. Either the modified VERW instruction or via the L1D Flush
+command. The latter is issued when L1TF mitigation is enabled so the extra
+VERW can be avoided. If the CPU is not affected by L1TF then VERW needs to
+be issued.
+
+If the VERW instruction with the supplied segment selector argument is
+executed on a CPU without the microcode update there is no side effect
+other than a small number of pointlessly wasted CPU cycles.
+
+This does not protect against cross Hyper-Thread attacks except for MSBDS
+which is only exploitable cross Hyper-thread when one of the Hyper-Threads
+enters a C-state.
+
+The kernel provides a function to invoke the buffer clearing:
+
+    mds_clear_cpu_buffers()
+
+The mitigation is invoked on kernel/userspace, hypervisor/guest and C-state
+(idle) transitions. Depending on the mitigation mode and the system state
+the invocation can be enforced or conditional.
+
+According to current knowledge additional mitigations inside the kernel
+itself are not required because the necessary gadgets to expose the leaked
+data cannot be controlled in a way which allows exploitation from malicious
+user space or VM guests.
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,6 +318,29 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_
 DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
 DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
 
+#include <asm/segment.h>
+
+/**
+ * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * This uses the otherwise unused and obsolete VERW instruction in
+ * combination with microcode which triggers a CPU buffer flush when the
+ * instruction is executed.
+ */
+static inline void mds_clear_cpu_buffers(void)
+{
+	static const u16 ds = __KERNEL_DS;
+
+	/*
+	 * Has to be the memory-operand variant because only that
+	 * guarantees the CPU buffer flush functionality according to
+	 * documentation. The register-operand variant does not.
+	 *
+	 * "cc" clobber is required because VERW modifies ZF.
+	 */
+	asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
+}
+
 #endif /* __ASSEMBLY__ */
 
 /*
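
The operand form can be tried out in user space as well, since VERW is not
a privileged instruction. The following is a hedged sketch and not the
kernel function: demo_verw() is a hypothetical name, GCC or Clang style
extended asm is assumed, and the selector is passed in by the caller
instead of using __KERNEL_DS. Without the microcode update the instruction
only performs its legacy segment check.

```c
#include <stdint.h>

/*
 * Hypothetical user space sketch. Returns 1 if VERW was issued, 0 when
 * built for a non-x86 target where the instruction does not exist.
 */
static inline int demo_verw(uint16_t sel)
{
#if defined(__x86_64__) || defined(__i386__)
	/*
	 * The "m" constraint forces the memory-operand form of VERW.
	 * "cc" is listed as clobbered because VERW writes ZF.
	 */
	__asm__ volatile("verw %[sel]" : : [sel] "m" (sel) : "cc");
	return 1;
#else
	(void)sel;
	return 0;
#endif
}
```

A "r" constraint would select the register-operand form, which according
to the documentation quoted above does not guarantee the buffer overwrite.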


* [patch V3 5/9] MDS basics 5
  2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
                   ` (3 preceding siblings ...)
  2019-02-21 23:44 ` [patch V3 4/9] MDS basics 4 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
  2019-02-22  0:46   ` [MODERATED] " Andrew Cooper
  2019-02-21 23:44 ` [patch V3 6/9] MDS basics 6 Thomas Gleixner
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

Subject: [patch V3 5/9] x86/speculation/mds: Clear CPU buffers on exit to user
From: Thomas Gleixner <tglx@linutronix.de>

Add a static key which controls the invocation of the CPU buffer clear
mechanism on exit to user space and add the call into
prepare_exit_to_usermode() and do_nmi() right before actually returning.

Add documentation on which kernel to user space transitions this covers and
explain why some corner cases are not mitigated.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V3: Add NMI conditional on user regs and update documentation accordingly.
    Use the static branch scheme suggested by Peter. Fix typos ...
---
 Documentation/x86/mds.rst            |   33 +++++++++++++++++++++++++++++++++
 arch/x86/entry/common.c              |   10 ++++++++++
 arch/x86/include/asm/nospec-branch.h |    2 ++
 arch/x86/kernel/cpu/bugs.c           |    4 +++-
 arch/x86/kernel/nmi.c                |    6 ++++++
 5 files changed, 54 insertions(+), 1 deletion(-)

--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -92,3 +92,36 @@ According to current knowledge additiona
 itself are not required because the necessary gadgets to expose the leaked
 data cannot be controlled in a way which allows exploitation from malicious
 user space or VM guests.
+
+Mitigation points
+-----------------
+
+1. Return to user space
+^^^^^^^^^^^^^^^^^^^^^^^
+   When transitioning from kernel to user space the CPU buffers are flushed
+   on affected CPUs:
+
+   - always when the mitigation mode is full. The mitigation is enabled
+     through the static key mds_user_clear.
+
+   This covers transitions from kernel to user space through a return to
+   user space from a syscall and from an interrupt or a regular exception.
+
+   There are other kernel to user space transitions which are not covered
+   by this: NMIs and all non maskable exceptions which go through the
+   paranoid exit, which means that they are not invoking the regular
+   prepare_exit_to_usermode() which handles the CPU buffer clearing.
+
+   Access to sensitive data like keys, credentials in the NMI context is
+   mostly theoretical: The CPU can do prefetching or execute a
+   misspeculated code path and thereby fetch data which might end up
+   leaking through a buffer.
+
+   But for mounting other attacks the kernel stack address of the task is
+   already valuable information. So in full mitigation mode, the NMI is
+   mitigated on the return from do_nmi() to provide almost complete
+   coverage.
+
+   There is one non maskable exception which returns through paranoid exit
+   and is not mitigated: #DF. If user space is able to trigger a double
+   fault the possible MDS leakage is the least problem to worry about.
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -31,6 +31,7 @@
 #include <asm/vdso.h>
 #include <linux/uaccess.h>
 #include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/syscalls.h>
@@ -180,6 +181,13 @@ static void exit_to_usermode_loop(struct
 	}
 }
 
+static inline void mds_user_clear_cpu_buffers(void)
+{
+	if (!static_branch_likely(&mds_user_clear))
+		return;
+	mds_clear_cpu_buffers();
+}
+
 /* Called with IRQs disabled. */
 __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
 {
@@ -212,6 +220,8 @@ static void exit_to_usermode_loop(struct
 #endif
 
 	user_enter_irqoff();
+
+	mds_user_clear_cpu_buffers();
 }
 
 #define SYSCALL_EXIT_WORK_FLAGS				\
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -318,6 +318,8 @@ DECLARE_STATIC_KEY_FALSE(switch_to_cond_
 DECLARE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
 DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
 
+DECLARE_STATIC_KEY_FALSE(mds_user_clear);
+
 #include <asm/segment.h>
 
 /**
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -63,10 +63,12 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_i
 /* Control unconditional IBPB in switch_mm() */
 DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
 
+/* Control MDS CPU buffer clear before returning to user space */
+DEFINE_STATIC_KEY_FALSE(mds_user_clear);
+
 void __init check_bugs(void)
 {
 	identify_boot_cpu();
-
 	/*
 	 * identify_boot_cpu() initialized SMT support information, let the
 	 * core code know.
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -34,6 +34,7 @@
 #include <asm/x86_init.h>
 #include <asm/reboot.h>
 #include <asm/cache.h>
+#include <asm/nospec-branch.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/nmi.h>
@@ -533,6 +534,11 @@ do_nmi(struct pt_regs *regs, long error_
 		write_cr2(this_cpu_read(nmi_cr2));
 	if (this_cpu_dec_return(nmi_state))
 		goto nmi_restart;
+
+	if (!static_branch_likely(&mds_user_clear))
+		return;
+	if (user_mode(regs))
+		mds_clear_cpu_buffers();
 }
 NOKPROBE_SYMBOL(do_nmi);
 

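The NMI exit condition added to do_nmi() can be modeled in user space like
this. It is a sketch under assumptions: the static key is replaced by a
plain bool, the buffer clear by a counter, and demo_user_mode() tests the
privilege level bits of the saved CS selector, which approximates what
user_mode() checks on 64-bit kernels. All demo_ names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct pt_regs; only CS matters here. */
struct demo_regs {
	unsigned long cs;
};

static bool demo_mds_user_clear;		/* models the static key */
static unsigned int demo_user_clear_count;	/* counts simulated VERWs */

static void demo_user_clear_buffers(void)
{
	demo_user_clear_count++;
}

/* Non-zero privilege level bits in CS mean the NMI hit user space. */
static bool demo_user_mode(const struct demo_regs *regs)
{
	return (regs->cs & 3) != 0;
}

/* Mirrors the tail of do_nmi() in the patch. */
static void demo_nmi_exit(const struct demo_regs *regs)
{
	if (!demo_mds_user_clear)
		return;
	if (demo_user_mode(regs))
		demo_user_clear_buffers();
}
```

An NMI which interrupted kernel code returns to the kernel, so nothing is
cleared at that point; the buffers are cleared later when the interrupted
context goes through prepare_exit_to_usermode().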

* [patch V3 6/9] MDS basics 6
  2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
                   ` (4 preceding siblings ...)
  2019-02-21 23:44 ` [patch V3 5/9] MDS basics 5 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
  2019-02-21 23:44 ` [patch V3 7/9] MDS basics 7 Thomas Gleixner
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

Subject: [patch V3 6/9] x86/speculation/mds: Conditionally clear CPU buffers on idle entry
From: Thomas Gleixner <tglx@linutronix.de>

Add a static key which controls the invocation of the CPU buffer clear
mechanism on idle entry. This is independent of other MDS mitigations
because the idle entry invocation to mitigate the potential leakage due to
store buffer repartitioning is only necessary on SMT systems.

Add the actual invocations to the different halt/mwait variants which
cover all usage sites. mwaitx is not patched as it's not available on
Intel CPUs.

The buffer clear is only invoked before entering the C-State to prevent
stale data from the idling CPU from being spilled to the Hyper-Thread
sibling after the store buffer got repartitioned and all entries are
available to the non idle sibling.

When coming out of idle the store buffer is partitioned again so each
sibling has half of it available. The CPU which returned from idle could
then be speculatively exposed to contents of the sibling, but the buffers
are flushed either on exit to user space or on VMENTER.

When later on conditional buffer clearing is implemented on top of this,
then there is no action required either because before returning to user
space the context switch will set the condition flag which causes a flush
on the return to user path.

This intentionally does not handle the case in the acpi/processor_idle
driver which uses the legacy IO port interface for C-State transitions for
two reasons:

 - The acpi/processor_idle driver was replaced by the intel_idle driver
   almost a decade ago. Anything Nehalem upwards supports it and defaults
   to that new driver.

 - The legacy IO port interface is likely to be used on older and therefore
   unaffected CPUs or on systems which do not receive microcode updates
   anymore, so there is no point in adding that.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
V3: Adjust document wording
---
 Documentation/x86/mds.rst            |   35 +++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/irqflags.h      |    4 ++++
 arch/x86/include/asm/mwait.h         |    7 +++++++
 arch/x86/include/asm/nospec-branch.h |   12 ++++++++++++
 arch/x86/kernel/cpu/bugs.c           |    2 ++
 5 files changed, 60 insertions(+)

--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -125,3 +125,38 @@ Mitigation points
    There is one non maskable exception which returns through paranoid exit
    and is not mitigated: #DF. If user space is able to trigger a double
    fault the possible MDS leakage is the least problem to worry about.
+
+
+2. C-State transition
+^^^^^^^^^^^^^^^^^^^^^
+
+   When a CPU goes idle and enters a C-State the CPU buffers need to be
+   cleared on affected CPUs when SMT is active. This addresses the
+   repartitioning of the store buffer when one of the Hyper-Threads enters
+   a C-State.
+
+   When SMT is inactive, i.e. either the CPU does not support it or all
+   sibling threads are offline, CPU buffer clearing is not required.
+
+   The invocation is controlled by the static key mds_idle_clear which is
+   switched depending on the chosen mitigation mode and the SMT state of
+   the system.
+
+   The buffer clear is only invoked before entering the C-State to prevent
+   stale data from the idling CPU from being spilled to the Hyper-Thread
+   sibling after the store buffer got repartitioned and all entries are
+   available to the non idle sibling.
+
+   When coming out of idle the store buffer is partitioned again so each
+   sibling has half of it available. The CPU returning from idle could then
+   be speculatively exposed to contents of the sibling. The buffers are
+   flushed either on exit to user space or on VMENTER so malicious code
+   in user space or the guest cannot speculatively access them.
+
+   The mitigation is hooked into all variants of halt()/mwait(), but does
+   not cover the legacy ACPI IO-Port mechanism because the ACPI idle driver
+   has been superseded by the intel_idle driver around 2010 and is
+   preferred on all affected CPUs which are expected to gain the MD_CLEAR
+   functionality in microcode. Aside of that the IO-Port mechanism is a
+   legacy interface which is only used on older systems which are either
+   not affected or do not receive microcode updates anymore.
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -6,6 +6,8 @@
 
 #ifndef __ASSEMBLY__
 
+#include <asm/nospec-branch.h>
+
 /* Provide __cpuidle; we can't safely include <linux/cpu.h> */
 #define __cpuidle __attribute__((__section__(".cpuidle.text")))
 
@@ -54,11 +56,13 @@ static inline void native_irq_enable(voi
 
 static inline __cpuidle void native_safe_halt(void)
 {
+	mds_idle_clear_cpu_buffers();
 	asm volatile("sti; hlt": : :"memory");
 }
 
 static inline __cpuidle void native_halt(void)
 {
+	mds_idle_clear_cpu_buffers();
 	asm volatile("hlt": : :"memory");
 }
 
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -6,6 +6,7 @@
 #include <linux/sched/idle.h>
 
 #include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>
 
 #define MWAIT_SUBSTATE_MASK		0xf
 #define MWAIT_CSTATE_MASK		0xf
@@ -40,6 +41,8 @@ static inline void __monitorx(const void
 
 static inline void __mwait(unsigned long eax, unsigned long ecx)
 {
+	mds_idle_clear_cpu_buffers();
+
 	/* "mwait %eax, %ecx;" */
 	asm volatile(".byte 0x0f, 0x01, 0xc9;"
 		     :: "a" (eax), "c" (ecx));
@@ -74,6 +77,8 @@ static inline void __mwait(unsigned long
 static inline void __mwaitx(unsigned long eax, unsigned long ebx,
 			    unsigned long ecx)
 {
+	/* No MDS buffer clear as this is AMD/HYGON only */
+
 	/* "mwaitx %eax, %ebx, %ecx;" */
 	asm volatile(".byte 0x0f, 0x01, 0xfb;"
 		     :: "a" (eax), "b" (ebx), "c" (ecx));
@@ -81,6 +86,8 @@ static inline void __mwaitx(unsigned lon
 
 static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
 {
+	mds_idle_clear_cpu_buffers();
+
 	trace_hardirqs_on();
 	/* "mwait %eax, %ecx;" */
 	asm volatile("sti; .byte 0x0f, 0x01, 0xc9;"
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -319,6 +319,7 @@ DECLARE_STATIC_KEY_FALSE(switch_mm_cond_
 DECLARE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
 
 DECLARE_STATIC_KEY_FALSE(mds_user_clear);
+DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
 
 #include <asm/segment.h>
 
@@ -343,6 +344,17 @@ static inline void mds_clear_cpu_buffers
 	asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
 }
 
+/**
+ * mds_idle_clear_cpu_buffers - Mitigation for MDS vulnerability
+ *
+ * Clear CPU buffers if the corresponding static key is enabled
+ */
+static inline void mds_idle_clear_cpu_buffers(void)
+{
+	if (static_branch_likely(&mds_idle_clear))
+		mds_clear_cpu_buffers();
+}
+
 #endif /* __ASSEMBLY__ */
 
 /*
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -65,6 +65,8 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_always
 
 /* Control MDS CPU buffer clear before returning to user space */
 DEFINE_STATIC_KEY_FALSE(mds_user_clear);
+/* Control MDS CPU buffer clear before idling (halt, mwait) */
+DEFINE_STATIC_KEY_FALSE(mds_idle_clear);
 
 void __init check_bugs(void)
 {
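
The gating added by this patch can be modelled in plain user-space C. This
is only a sketch under stated assumptions: the mds_idle_clear static key is
replaced by a bool, the VERW based clear by a counter, and every model_*
name is hypothetical rather than kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the mds_idle_clear static key (a plain bool here). */
static bool model_idle_clear_key;

/* Counts how often the modelled buffer clear ran. */
static unsigned int model_clear_count;

/* Stand-in for mds_clear_cpu_buffers(); the real one executes VERW. */
static void model_clear_cpu_buffers(void)
{
	model_clear_count++;
}

/* Mirrors mds_idle_clear_cpu_buffers(): clear only if the key is on. */
static void model_idle_clear_cpu_buffers(void)
{
	if (model_idle_clear_key)
		model_clear_cpu_buffers();
}

/* Mirrors native_halt(): clear first, then "halt" (a no-op here). */
static void model_halt(void)
{
	model_idle_clear_cpu_buffers();
}
```

With the key off, model_halt() leaves the counter untouched; with the key
on, every call bumps it once. This matches the intent that the clear costs
nothing on systems where the idle mitigation is not armed.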

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [patch V3 7/9] MDS basics 7
  2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
                   ` (5 preceding siblings ...)
  2019-02-21 23:44 ` [patch V3 6/9] MDS basics 6 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
  2019-02-22  7:08   ` [MODERATED] " Greg KH
  2019-02-21 23:44 ` [patch V3 8/9] MDS basics 8 Thomas Gleixner
  2019-02-21 23:44 ` [patch V3 9/9] MDS basics 9 Thomas Gleixner
  8 siblings, 1 reply; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

Subject: [patch V3 7/9] x86/speculation/mds: Add mitigation control for MDS
From: Thomas Gleixner <tglx@linutronix.de>

Now that the mitigations are in place, add a command line parameter to
control the mitigation, a mitigation selector function and a SMT update
mechanism.

This is the minimal straightforward initial implementation which just
provides an always on/off mode. The command line parameter is:

  mds=[full|off|auto]

This is consistent with the existing mitigations for other speculative
hardware vulnerabilities.

The idle invocation is dynamically updated according to the SMT state of
the system similar to the dynamic update of the STIBP mitigation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
---
 Documentation/admin-guide/kernel-parameters.txt |   27 ++++++++
 arch/x86/include/asm/processor.h                |    6 +
 arch/x86/kernel/cpu/bugs.c                      |   76 ++++++++++++++++++++++++
 3 files changed, 109 insertions(+)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2356,6 +2356,33 @@
 			Format: <first>,<last>
 			Specifies range of consoles to be captured by the MDA.
 
+	mds=		[X86,INTEL]
+			Control mitigation for the Micro-architectural Data
+			Sampling (MDS) vulnerability.
+
+			Certain CPUs are vulnerable to an exploit against CPU
+			internal buffers which can forward information to a
+			disclosure gadget under certain conditions.
+
+			In vulnerable processors, the speculatively
+			forwarded data can be used in a cache side channel
+			attack, to access data to which the attacker does
+			not have direct access.
+
+			This parameter controls the MDS mitigation. The
+			options are:
+
+			full    - Unconditionally enable MDS mitigation
+			off     - Unconditionally disable MDS mitigation
+			auto    - Kernel detects whether the CPU model is
+				  vulnerable to MDS and picks the most
+				  appropriate mitigation. If the CPU is not
+				  vulnerable, "off" is selected. If the CPU
+				  is vulnerable "full" is selected.
+
+			Not specifying this option is equivalent to
+			mds=auto.
+
 	mem=nn[KMG]	[KNL,BOOT] Force usage of a specific amount of memory
 			Amount of memory to be used when the kernel is not able
 			to see the whole system memory or for test.
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -992,4 +992,10 @@ enum l1tf_mitigations {
 
 extern enum l1tf_mitigations l1tf_mitigation;
 
+enum mds_mitigations {
+	MDS_MITIGATION_OFF,
+	MDS_MITIGATION_AUTO,
+	MDS_MITIGATION_FULL,
+};
+
 #endif /* _ASM_X86_PROCESSOR_H */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -37,6 +37,7 @@
 static void __init spectre_v2_select_mitigation(void);
 static void __init ssb_select_mitigation(void);
 static void __init l1tf_select_mitigation(void);
+static void __init mds_select_mitigation(void);
 
 /* The base value of the SPEC_CTRL MSR that always has to be preserved. */
 u64 x86_spec_ctrl_base;
@@ -105,6 +106,8 @@ void __init check_bugs(void)
 
 	l1tf_select_mitigation();
 
+	mds_select_mitigation();
+
 #ifdef CONFIG_X86_32
 	/*
 	 * Check whether we are able to run this kernel safely on SMP.
@@ -211,6 +214,59 @@ static void x86_amd_ssb_disable(void)
 }
 
 #undef pr_fmt
+#define pr_fmt(fmt)	"MDS: " fmt
+
+/* Default mitigation for MDS-affected CPUs */
+static enum mds_mitigations mds_mitigation __ro_after_init = MDS_MITIGATION_AUTO;
+
+static const char * const mds_strings[] = {
+	[MDS_MITIGATION_OFF]	= "Vulnerable",
+	[MDS_MITIGATION_FULL]	= "Mitigation: Clear CPU buffers"
+};
+
+static void mds_select_mitigation(void)
+{
+	if (!boot_cpu_has_bug(X86_BUG_MDS)) {
+		mds_mitigation = MDS_MITIGATION_OFF;
+		return;
+	}
+
+	switch (mds_mitigation) {
+	case MDS_MITIGATION_OFF:
+		break;
+	case MDS_MITIGATION_AUTO:
+	case MDS_MITIGATION_FULL:
+		if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) {
+			mds_mitigation = MDS_MITIGATION_FULL;
+			static_branch_enable(&mds_user_clear);
+		} else {
+			mds_mitigation = MDS_MITIGATION_OFF;
+		}
+		break;
+	}
+	pr_info("%s\n", mds_strings[mds_mitigation]);
+}
+
+static int __init mds_cmdline(char *str)
+{
+	if (!boot_cpu_has_bug(X86_BUG_MDS))
+		return 0;
+
+	if (!str)
+		return -EINVAL;
+
+	if (!strcmp(str, "off"))
+		mds_mitigation = MDS_MITIGATION_OFF;
+	else if (!strcmp(str, "auto"))
+		mds_mitigation = MDS_MITIGATION_AUTO;
+	else if (!strcmp(str, "full"))
+		mds_mitigation = MDS_MITIGATION_FULL;
+
+	return 0;
+}
+early_param("mds", mds_cmdline);
+
+#undef pr_fmt
 #define pr_fmt(fmt)     "Spectre V2 : " fmt
 
 static enum spectre_v2_mitigation spectre_v2_enabled __ro_after_init =
@@ -614,6 +670,15 @@ static void update_indir_branch_cond(voi
 		static_branch_disable(&switch_to_cond_stibp);
 }
 
+/* Update the static key controlling the MDS CPU buffer clear in idle */
+static void update_mds_branch_idle(void)
+{
+	if (sched_smt_active())
+		static_branch_enable(&mds_idle_clear);
+	else
+		static_branch_disable(&mds_idle_clear);
+}
+
 void arch_smt_update(void)
 {
 	/* Enhanced IBRS implies STIBP. No update required. */
@@ -635,6 +700,17 @@ void arch_smt_update(void)
 		break;
 	}
 
+	switch (mds_mitigation) {
+	case MDS_MITIGATION_OFF:
+		break;
+	case MDS_MITIGATION_FULL:
+		update_mds_branch_idle();
+		break;
+	/* Keep GCC happy */
+	case MDS_MITIGATION_AUTO:
+		break;
+	}
+
 	mutex_unlock(&spec_ctrl_mutex);
 }
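
The parameter parsing, mode selection and SMT update from this patch can be
modelled in user space. This is a minimal sketch, assuming plain bools in
place of boot_cpu_has_bug(), boot_cpu_has() and the static key; all model_*
and MODEL_* names are hypothetical. The early return for unaffected CPUs in
mds_cmdline() is left out, and -1 stands in for -EINVAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Same three modes as enum mds_mitigations in this patch. */
enum model_mds_mitigation {
	MODEL_MDS_OFF,
	MODEL_MDS_AUTO,
	MODEL_MDS_FULL,
};

/*
 * Mirrors mds_cmdline(): map the mds= argument to a mode. Unknown
 * strings leave *mode untouched, as in the patch. Returns -1 for a
 * missing argument.
 */
static int model_mds_cmdline(const char *str, enum model_mds_mitigation *mode)
{
	if (!str)
		return -1;

	if (!strcmp(str, "off"))
		*mode = MODEL_MDS_OFF;
	else if (!strcmp(str, "auto"))
		*mode = MODEL_MDS_AUTO;
	else if (!strcmp(str, "full"))
		*mode = MODEL_MDS_FULL;

	return 0;
}

/*
 * Mirrors mds_select_mitigation() as of this patch: unaffected CPUs and
 * CPUs without MD_CLEAR end up "off"; otherwise auto and full both
 * resolve to "full".
 */
static enum model_mds_mitigation
model_mds_select(enum model_mds_mitigation requested, bool cpu_has_bug,
		 bool cpu_has_md_clear)
{
	if (!cpu_has_bug || requested == MODEL_MDS_OFF)
		return MODEL_MDS_OFF;

	return cpu_has_md_clear ? MODEL_MDS_FULL : MODEL_MDS_OFF;
}

/*
 * Mirrors the arch_smt_update() hunk: only "full" touches the idle key,
 * which then follows the SMT state. Other modes keep the current value.
 */
static bool model_idle_clear_update(enum model_mds_mitigation mode,
				    bool smt_active, bool current_key)
{
	if (mode == MODEL_MDS_FULL)
		return smt_active;

	return current_key;
}
```

Feeding "auto" through model_mds_cmdline() and then model_mds_select() with
both CPU flags set yields MODEL_MDS_FULL, which is the documented default
behaviour for a vulnerable CPU with updated microcode.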
 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [patch V3 8/9] MDS basics 8
  2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
                   ` (6 preceding siblings ...)
  2019-02-21 23:44 ` [patch V3 7/9] MDS basics 7 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
  2019-02-22  7:14   ` [MODERATED] " Greg KH
  2019-02-22  8:55   ` Borislav Petkov
  2019-02-21 23:44 ` [patch V3 9/9] MDS basics 9 Thomas Gleixner
  8 siblings, 2 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

Subject: [patch V3 8/9] x86/speculation/mds: Add sysfs reporting for MDS
From: Thomas Gleixner <tglx@linutronix.de>

Add the sysfs reporting file for MDS. It exposes the vulnerability and
mitigation state similar to the existing files for the other speculative
hardware vulnerabilities.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V3: Copy & Paste done right :(
---
 Documentation/ABI/testing/sysfs-devices-system-cpu |    1 +
 arch/x86/kernel/cpu/bugs.c                         |   20 ++++++++++++++++++++
 drivers/base/cpu.c                                 |    8 ++++++++
 include/linux/cpu.h                                |    2 ++
 4 files changed, 31 insertions(+)

--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -484,6 +484,7 @@ What:		/sys/devices/system/cpu/vulnerabi
 		/sys/devices/system/cpu/vulnerabilities/spectre_v2
 		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
 		/sys/devices/system/cpu/vulnerabilities/l1tf
+		/sys/devices/system/cpu/vulnerabilities/mds
 Date:		January 2018
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Information about CPU vulnerabilities
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1175,6 +1175,17 @@ static ssize_t l1tf_show_state(char *buf
 }
 #endif
 
+static ssize_t mds_show_state(char *buf)
+{
+	if (!hypervisor_is_type(X86_HYPER_NATIVE)) {
+		return sprintf(buf, "%s; SMT Host state unknown\n",
+			       mds_strings[mds_mitigation]);
+	}
+
+	return sprintf(buf, "%s; SMT %s\n", mds_strings[mds_mitigation],
+		       sched_smt_active() ? "vulnerable" : "disabled");
+}
+
 static char *stibp_state(void)
 {
 	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
@@ -1241,6 +1252,10 @@ static ssize_t cpu_show_common(struct de
 		if (boot_cpu_has(X86_FEATURE_L1TF_PTEINV))
 			return l1tf_show_state(buf);
 		break;
+
+	case X86_BUG_MDS:
+		return mds_show_state(buf);
+
 	default:
 		break;
 	}
@@ -1272,4 +1287,9 @@ ssize_t cpu_show_l1tf(struct device *dev
 {
 	return cpu_show_common(dev, attr, buf, X86_BUG_L1TF);
 }
+
+ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return cpu_show_common(dev, attr, buf, X86_BUG_MDS);
+}
 #endif
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -546,11 +546,18 @@ ssize_t __weak cpu_show_l1tf(struct devi
 	return sprintf(buf, "Not affected\n");
 }
 
+ssize_t __weak cpu_show_mds(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "Not affected\n");
+}
+
 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
 static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
 static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
 static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL);
+static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);
 
 static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_meltdown.attr,
@@ -558,6 +565,7 @@ static struct attribute *cpu_root_vulner
 	&dev_attr_spectre_v2.attr,
 	&dev_attr_spec_store_bypass.attr,
 	&dev_attr_l1tf.attr,
+	&dev_attr_mds.attr,
 	NULL
 };
 
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -57,6 +57,8 @@ extern ssize_t cpu_show_spec_store_bypas
 					  struct device_attribute *attr, char *buf);
 extern ssize_t cpu_show_l1tf(struct device *dev,
 			     struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_mds(struct device *dev,
+			    struct device_attribute *attr, char *buf);
 
 extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,
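
The strings which end up in /sys/devices/system/cpu/vulnerabilities/mds can
be reproduced with a small user-space model of mds_show_state(). This is a
sketch, not the kernel code: model_mds_show_state() is a hypothetical name,
the three inputs are passed as arguments instead of being queried from the
kernel, and snprintf() replaces sprintf() to keep the buffer bounded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Mirrors mds_show_state(): "state" stands in for
 * mds_strings[mds_mitigation], "native" for
 * hypervisor_is_type(X86_HYPER_NATIVE) and "smt_active" for
 * sched_smt_active(). Returns what snprintf() returns.
 */
static int model_mds_show_state(char *buf, size_t len, const char *state,
				bool native, bool smt_active)
{
	/* In a guest the SMT state of the host cannot be known. */
	if (!native)
		return snprintf(buf, len, "%s; SMT Host state unknown\n", state);

	return snprintf(buf, len, "%s; SMT %s\n", state,
			smt_active ? "vulnerable" : "disabled");
}
```

On bare metal with SMT enabled and the mitigation active, the model
produces "Mitigation: Clear CPU buffers; SMT vulnerable".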

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [patch V3 9/9] MDS basics 9
  2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
                   ` (7 preceding siblings ...)
  2019-02-21 23:44 ` [patch V3 8/9] MDS basics 8 Thomas Gleixner
@ 2019-02-21 23:44 ` Thomas Gleixner
  2019-02-22  7:50   ` [MODERATED] " Greg KH
  2019-02-22 15:54   ` [MODERATED] " Borislav Petkov
  8 siblings, 2 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-21 23:44 UTC (permalink / raw)
  To: speck

Subject: [patch V3 9/9] x86/speculation/mds: Add mitigation mode VMWERV
From: Thomas Gleixner <tglx@linutronix.de>

In virtualized environments it can happen that the host has the microcode
update which utilizes the VERW instruction to clear CPU buffers, but the
hypervisor is not yet updated to expose the X86_FEATURE_MD_CLEAR CPUID bit
to guests.

Introduce an internal mitigation mode 'VMWERV' which enables the invocation
of the CPU buffer clearing even if X86_FEATURE_MD_CLEAR is not set. If the
system has no updated microcode this results in a pointless execution of
the VERW instruction wasting a few CPU cycles. If the microcode is updated,
but not exposed to a guest then the CPU buffers will be cleared.

That said: Virtual Machines Will Eventually Receive Vaccine

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2 -> V3: Rename mode.
---
 Documentation/x86/mds.rst        |    5 +++++
 arch/x86/include/asm/processor.h |    1 +
 arch/x86/kernel/cpu/bugs.c       |   14 ++++++++------
 3 files changed, 14 insertions(+), 6 deletions(-)

--- a/Documentation/x86/mds.rst
+++ b/Documentation/x86/mds.rst
@@ -88,6 +88,11 @@ The mitigation is invoked on kernel/user
 (idle) transitions. Depending on the mitigation mode and the system state
 the invocation can be enforced or conditional.
 
+As a special quirk to address virtualization scenarios where the host has
+the microcode updated, but the hypervisor does not (yet) expose the
+MD_CLEAR CPUID bit to guests, the kernel issues the VERW instruction in the
+hope that it might work. The state is reflected accordingly.
+
 According to current knowledge additional mitigations inside the kernel
 itself are not required because the necessary gadgets to expose the leaked
 data cannot be controlled in a way which allows exploitation from malicious
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -996,6 +996,7 @@ enum mds_mitigations {
 	MDS_MITIGATION_OFF,
 	MDS_MITIGATION_AUTO,
 	MDS_MITIGATION_FULL,
+	MDS_MITIGATION_VMWERV,
 };
 
 #endif /* _ASM_X86_PROCESSOR_H */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -221,7 +221,8 @@ static enum mds_mitigations mds_mitigati
 
 static const char * const mds_strings[] = {
 	[MDS_MITIGATION_OFF]	= "Vulnerable",
-	[MDS_MITIGATION_FULL]	= "Mitigation: Clear CPU buffers"
+	[MDS_MITIGATION_FULL]	= "Mitigation: Clear CPU buffers",
+	[MDS_MITIGATION_VMWERV]	= "Vulnerable: Clear CPU buffers attempted, no microcode",
 };
 
 static void mds_select_mitigation(void)
@@ -236,12 +237,12 @@ static void mds_select_mitigation(void)
 		break;
 	case MDS_MITIGATION_AUTO:
 	case MDS_MITIGATION_FULL:
-		if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) {
+	case MDS_MITIGATION_VMWERV:
+		if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
 			mds_mitigation = MDS_MITIGATION_FULL;
-			static_branch_enable(&mds_user_clear);
-		} else {
-			mds_mitigation = MDS_MITIGATION_OFF;
-		}
+		else
+			mds_mitigation = MDS_MITIGATION_VMWERV;
+		static_branch_enable(&mds_user_clear);
 		break;
 	}
 	pr_info("%s\n", mds_strings[mds_mitigation]);
@@ -704,6 +705,7 @@ void arch_smt_update(void)
 	case MDS_MITIGATION_OFF:
 		break;
 	case MDS_MITIGATION_FULL:
+	case MDS_MITIGATION_VMWERV:
 		update_mds_branch_idle();
 		break;
 	/* Keep GCC happy */
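
The effect of this patch on the mode selection can be written down as a
small user-space model. This is a sketch under assumptions: the CPU checks
are plain bools, the mds_user_clear static key is a struct member, and all
sketch_* and SKETCH_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Same four modes as enum mds_mitigations after this patch. */
enum sketch_mds_mode {
	SKETCH_MDS_OFF,
	SKETCH_MDS_AUTO,
	SKETCH_MDS_FULL,
	SKETCH_MDS_VMWERV,
};

struct sketch_mds_result {
	enum sketch_mds_mode mode;	/* resulting mds_mitigation */
	bool user_clear;		/* mds_user_clear key enabled? */
};

/*
 * Mirrors mds_select_mitigation() after this patch: on an affected CPU
 * every mode except "off" enables the user space clear. A missing
 * MD_CLEAR bit now selects VMWERV instead of falling back to "off".
 */
static struct sketch_mds_result
sketch_mds_select(enum sketch_mds_mode requested, bool cpu_has_bug,
		  bool cpu_has_md_clear)
{
	struct sketch_mds_result res = { SKETCH_MDS_OFF, false };

	if (!cpu_has_bug || requested == SKETCH_MDS_OFF)
		return res;

	res.mode = cpu_has_md_clear ? SKETCH_MDS_FULL : SKETCH_MDS_VMWERV;
	res.user_clear = true;
	return res;
}
```

In this model an affected CPU without MD_CLEAR and with the default "auto"
ends up in VMWERV with the clear enabled, so only an explicit mds=off
disables the VERW invocation.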

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 5/9] MDS basics 5
  2019-02-21 23:44 ` [patch V3 5/9] MDS basics 5 Thomas Gleixner
@ 2019-02-22  0:46   ` Andrew Cooper
  2019-02-22  7:00     ` Thomas Gleixner
  2019-02-22  9:20     ` [MODERATED] " Peter Zijlstra
  0 siblings, 2 replies; 32+ messages in thread
From: Andrew Cooper @ 2019-02-22  0:46 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 556 bytes --]

On 21/02/2019 23:44, speck for Thomas Gleixner wrote:
> +   There is one non maskable exception which returns through paranoid exit
> +   and is not mitigated: #DF. If user space is able to trigger a double
> +   fault the possible MDS leakage is the least problem to worry about.

What about espfix64?  An IRET fault from that ends up at #DF, and
purposefully recovers.  It is trigger-able from at least modify_ldt().

The #DF path is normally fatal, but in the cases that it's not, an extra
VERW isn't going to be the slow part.

~Andrew


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 1/9] MDS basics 1
  2019-02-21 23:44 ` [patch V3 1/9] MDS basics 1 Thomas Gleixner
@ 2019-02-22  6:53   ` Greg KH
  2019-02-22  7:30   ` Borislav Petkov
  1 sibling, 0 replies; 32+ messages in thread
From: Greg KH @ 2019-02-22  6:53 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:44:32AM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V3 1/9] x86/msr-index: Cleanup bit defines
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Greg pointed out that speculation related bit defines are using (1 << N)
> format instead of BIT(N). Aside from that, (1 << N) is wrong as it should
> be 1UL at least.
> 
> Clean it up.
> 
> Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Thanks for cleaning this up!

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
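
The type issue behind the cleanup can be shown in a few lines of user-space
C. This is a sketch: MODEL_BIT() is a hypothetical stand-in with the same
shape as the kernel's BIT() helper, and the model_* functions exist only to
expose the types. The literal 1 is an int, so (1 << N) is an int as well;
with a 32-bit int, shifting into bit 31 is undefined behaviour and a
negative result would sign-extend when widened to a 64-bit MSR value.

```c
#include <assert.h>

/* Same shape as the kernel's BIT() helper: an unsigned long constant. */
#define MODEL_BIT(nr)		(1UL << (nr))

/* The old style: the literal 1 is an int, so the result is an int. */
#define MODEL_INT_SHIFT(nr)	(1 << (nr))

/* Size in bytes of the type produced by the old style define. */
static unsigned long model_int_shift_size(void)
{
	return sizeof(MODEL_INT_SHIFT(3));
}

/* Size in bytes of the type produced by the BIT() style define. */
static unsigned long model_bit_size(void)
{
	return sizeof(MODEL_BIT(3));
}
```

MODEL_BIT(31) is a well defined unsigned value on any platform, which is
the property the speculation MSR defines rely on.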

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 4/9] MDS basics 4
  2019-02-21 23:44 ` [patch V3 4/9] MDS basics 4 Thomas Gleixner
@ 2019-02-22  6:58   ` Greg KH
  2019-02-22 10:44     ` Thomas Gleixner
  2019-02-22  7:45   ` [MODERATED] Encrypted Message Jon Masters
  2019-02-22  7:50   ` [MODERATED] Re: [patch V3 4/9] MDS basics 4 Borislav Petkov
  2 siblings, 1 reply; 32+ messages in thread
From: Greg KH @ 2019-02-22  6:58 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:44:35AM +0100, speck for Thomas Gleixner wrote:
> +Exposure assumptions
> +--------------------
> +
> +It is assumed that attack code resides in user space or in a guest with one
> +exception. The rationale behind this assumption is that the code construct
> +needed for exploiting MDS requires:
> +
> + - to control the load to trigger a fault or assist
> +
> + - to have a disclosure gadget which exposes the speculatively accessed
> +   data for consumption through a side channel.
> +
> + - to control the pointer through which the disclosure gadget exposes the
> +   data
> +
> +The existance of such a construct cannot be excluded with 100% certainty,
> +but the complexity involved makes it extremly unlikely.
> +
> +There is one exception, which is untrusted BPF. The functionality of
> +untrusted BPF is limited, but it needs to be thoroughly investigated
> +whether it can be used to create such a construct.

A meta-comment, is anyone looking at the untrusted BPF issue?  Do we
have the BPF developers on this list so that they have the chance to
figure this out?

Anyway, this looks great, thanks for summarizing all of this in a
readable way, I now know more about the insides of Intel cpus than I
ever wanted to:

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch V3 5/9] MDS basics 5
  2019-02-22  0:46   ` [MODERATED] " Andrew Cooper
@ 2019-02-22  7:00     ` Thomas Gleixner
  2019-02-22  9:20     ` [MODERATED] " Peter Zijlstra
  1 sibling, 0 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-22  7:00 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 736 bytes --]

On Fri, 22 Feb 2019, speck for Andrew Cooper wrote:

> On 21/02/2019 23:44, speck for Thomas Gleixner wrote:
> > +   There is one non maskable exception which returns through paranoid exit
> > +   and is not mitigated: #DF. If user space is able to trigger a double
> > +   fault the possible MDS leakage is the least problem to worry about.
> 
> What about espfix64?  An IRET fault from that ends up at #DF, and
> purposefully recovers.  It is trigger-able from at least modify_ldt().
> 
> The #DF path is normally fatal, but in the cases that it's not, an extra
> VERW isn't going to be the slow part.

Right you are. I stared at that for quite a while and did not make the
connection.

Thanks!

	tglx


       

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 7/9] MDS basics 7
  2019-02-21 23:44 ` [patch V3 7/9] MDS basics 7 Thomas Gleixner
@ 2019-02-22  7:08   ` Greg KH
  0 siblings, 0 replies; 32+ messages in thread
From: Greg KH @ 2019-02-22  7:08 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:44:38AM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V3 7/9] x86/speculation/mds: Add mitigation control for MDS
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Now that the mitigations are in place, add a command line parameter to
> control the mitigation, a mitigation selector function and a SMT update
> mechanism.
> 
> This is the minimal straightforward initial implementation which just
> provides an always on/off mode. The command line parameter is:
> 
>   mds=[full|off|auto]
> 
> This is consistent with the existing mitigations for other speculative
> hardware vulnerabilities.
> 
> The idle invocation is dynamically updated according to the SMT state of
> the system similar to the dynamic update of the STIBP mitigation.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Borislav Petkov <bp@suse.de>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 8/9] MDS basics 8
  2019-02-21 23:44 ` [patch V3 8/9] MDS basics 8 Thomas Gleixner
@ 2019-02-22  7:14   ` Greg KH
  2019-02-22  8:55   ` Borislav Petkov
  1 sibling, 0 replies; 32+ messages in thread
From: Greg KH @ 2019-02-22  7:14 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:44:39AM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V3 8/9] x86/speculation/mds: Add sysfs reporting for MDS
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Add the sysfs reporting file for MDS. It exposes the vulnerability and
> mitigation state similar to the existing files for the other speculative
> hardware vulnerabilities.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 1/9] MDS basics 1
  2019-02-21 23:44 ` [patch V3 1/9] MDS basics 1 Thomas Gleixner
  2019-02-22  6:53   ` [MODERATED] " Greg KH
@ 2019-02-22  7:30   ` Borislav Petkov
  1 sibling, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2019-02-22  7:30 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:44:32AM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V3 1/9] x86/msr-index: Cleanup bit defines
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Greg pointed out that speculation related bit defines are using (1 << N)
> format instead of BIT(N). Aside from that, (1 << N) is wrong as it should
> be 1UL at least.
> 
> Clean it up.
> 
> Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  arch/x86/include/asm/msr-index.h |   34 ++++++++++++++++++----------------
>  1 file changed, 18 insertions(+), 16 deletions(-)

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Encrypted Message
  2019-02-21 23:44 ` [patch V3 4/9] MDS basics 4 Thomas Gleixner
  2019-02-22  6:58   ` [MODERATED] " Greg KH
@ 2019-02-22  7:45   ` Jon Masters
  2019-02-22 17:16     ` [MODERATED] " Linus Torvalds
  2019-02-22  7:50   ` [MODERATED] Re: [patch V3 4/9] MDS basics 4 Borislav Petkov
  2 siblings, 1 reply; 32+ messages in thread
From: Jon Masters @ 2019-02-22  7:45 UTC (permalink / raw)
  To: speck

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/rfc822-headers; protected-headers="v1", Size: 128 bytes --]

From: Jon Masters <jcm@redhat.com>
To: speck for Thomas Gleixner <speck@linutronix.de>
Subject: Re: [patch V3 4/9] MDS basics 4

[-- Attachment #2: Type: text/plain, Size: 653 bytes --]

On 2/21/19 6:44 PM, speck for Thomas Gleixner wrote:
> +#include <asm/segment.h>
> +
> +/**
> + * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
> + *
> + * This uses the otherwise unused and obsolete VERW instruction in
> + * combination with microcode which triggers a CPU buffer flush when the
> + * instruction is executed.
> + */
> +static inline void mds_clear_cpu_buffers(void)
> +{
> +	static const u16 ds = __KERNEL_DS;

Dunno if it's worth documenting that using a specifically valid segment
is faster than a zero selector according to Intel.

Jon.

-- 
Computer Architect | Sent with my Fedora powered laptop


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 9/9] MDS basics 9
  2019-02-21 23:44 ` [patch V3 9/9] MDS basics 9 Thomas Gleixner
@ 2019-02-22  7:50   ` Greg KH
  2019-02-22 10:38     ` Thomas Gleixner
  2019-02-22 15:54   ` [MODERATED] " Borislav Petkov
  1 sibling, 1 reply; 32+ messages in thread
From: Greg KH @ 2019-02-22  7:50 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:44:40AM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V3 9/9] x86/speculation/mds: Add mitigation mode VMWERV
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> In virtualized environments it can happen that the host has the microcode
> update which utilizes the VERW instruction to clear CPU buffers, but the
> hypervisor is not yet updated to expose the X86_FEATURE_MD_CLEAR CPUID bit
> to guests.
> 
> Introduce an internal mitigation mode 'VMWERV' which enables the invocation
> of the CPU buffer clearing even if X86_FEATURE_MD_CLEAR is not set. If the
> system has no updated microcode this results in a pointless execution of
> the VERW instruction wasting a few CPU cycles. If the microcode is updated,
> but not exposed to a guest then the CPU buffers will be cleared.
> 
> That said: Virtual Machines Will Eventually Receive Vaccine
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2 -> V3: Rename mode.

Nice name :)

> ---
>  Documentation/x86/mds.rst        |    5 +++++
>  arch/x86/include/asm/processor.h |    1 +
>  arch/x86/kernel/cpu/bugs.c       |   14 ++++++++------
>  3 files changed, 14 insertions(+), 6 deletions(-)
> 
> --- a/Documentation/x86/mds.rst
> +++ b/Documentation/x86/mds.rst
> @@ -88,6 +88,11 @@ The mitigation is invoked on kernel/user
>  (idle) transitions. Depending on the mitigation mode and the system state
>  the invocation can be enforced or conditional.
>  
> +As a special quirk to address virtualization scenarios where the host has
> +the microcode updated, but the hypervisor does not (yet) expose the
> +MD_CLEAR CPUID bit to guests, the kernel issues the VERW instruction in the
> +hope that it might work. The state is reflected accordingly.
> +
>  According to current knowledge additional mitigations inside the kernel
>  itself are not required because the necessary gadgets to expose the leaked
>  data cannot be controlled in a way which allows exploitation from malicious
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -996,6 +996,7 @@ enum mds_mitigations {
>  	MDS_MITIGATION_OFF,
>  	MDS_MITIGATION_AUTO,
>  	MDS_MITIGATION_FULL,
> +	MDS_MITIGATION_VMWERV,
>  };
>  
>  #endif /* _ASM_X86_PROCESSOR_H */
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -221,7 +221,8 @@ static enum mds_mitigations mds_mitigati
>  
>  static const char * const mds_strings[] = {
>  	[MDS_MITIGATION_OFF]	= "Vulnerable",
> -	[MDS_MITIGATION_FULL]	= "Mitigation: Clear CPU buffers"
> +	[MDS_MITIGATION_FULL]	= "Mitigation: Clear CPU buffers",
> +	[MDS_MITIGATION_VMWERV]	= "Vulnerable: Clear CPU buffers attempted, no microcode",
>  };
>  
>  static void mds_select_mitigation(void)
> @@ -236,12 +237,12 @@ static void mds_select_mitigation(void)
>  		break;
>  	case MDS_MITIGATION_AUTO:
>  	case MDS_MITIGATION_FULL:
> -		if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) {
> +	case MDS_MITIGATION_VMWERV:
> +		if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
>  			mds_mitigation = MDS_MITIGATION_FULL;
> -			static_branch_enable(&mds_user_clear);
> -		} else {
> -			mds_mitigation = MDS_MITIGATION_OFF;
> -		}
> +		else
> +			mds_mitigation = MDS_MITIGATION_VMWERV;
> +		static_branch_enable(&mds_user_clear);

So did we just lose the ability at "auto" to turn this off if present,
because we really do not know if we can turn it off automatically?

Or am I reading this code wrong?

Should there be a new userspace command line option for "vmwerv"?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 4/9] MDS basics 4
  2019-02-21 23:44 ` [patch V3 4/9] MDS basics 4 Thomas Gleixner
  2019-02-22  6:58   ` [MODERATED] " Greg KH
  2019-02-22  7:45   ` [MODERATED] Encrypted Message Jon Masters
@ 2019-02-22  7:50   ` Borislav Petkov
  2 siblings, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2019-02-22  7:50 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:44:35AM +0100, speck for Thomas Gleixner wrote:
> +Exposure assumptions
> +--------------------
> +
> +It is assumed that attack code resides in user space or in a guest with one
> +exception. The rationale behind this assumption is that the code construct
> +needed for exploiting MDS requires:
> +
> + - to control the load to trigger a fault or assist
> +
> + - to have a disclosure gadget which exposes the speculatively accessed
> +   data for consumption through a side channel.
> +
> + - to control the pointer through which the disclosure gadget exposes the
> +   data
> +
> +The existance of such a construct cannot be excluded with 100% certainty,

WARNING: 'existance' may be misspelled - perhaps 'existence'?

With that fixed:

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 8/9] MDS basics 8
  2019-02-21 23:44 ` [patch V3 8/9] MDS basics 8 Thomas Gleixner
  2019-02-22  7:14   ` [MODERATED] " Greg KH
@ 2019-02-22  8:55   ` Borislav Petkov
  1 sibling, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2019-02-22  8:55 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:44:39AM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V3 8/9] x86/speculation/mds: Add sysfs reporting for MDS
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Add the sysfs reporting file for MDS. It exposes the vulnerability and
> mitigation state similar to the existing files for the other speculative
> hardware vulnerabilities.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V3: Copy & Paste done right :(
> ---
>  Documentation/ABI/testing/sysfs-devices-system-cpu |    1 +
>  arch/x86/kernel/cpu/bugs.c                         |   20 ++++++++++++++++++++
>  drivers/base/cpu.c                                 |    8 ++++++++
>  include/linux/cpu.h                                |    2 ++
>  4 files changed, 31 insertions(+)

Reviewed-by: Borislav Petkov <bp@suse.de>
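
For readers following along, a minimal user-space sketch of how such a
reporting file could be consumed. The helper name read_vuln_state() is
hypothetical, and the path shown in the usage note is an assumption based
on the naming of the existing vulnerability files; the patch itself is the
authority on both the location and the reported strings.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the first line of a sysfs-style state file
 * into buf and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 */
static int read_vuln_state(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;

    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    /* sysfs attributes end with a newline; drop it for comparison. */
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

A caller would pass something like
"/sys/devices/system/cpu/vulnerabilities/mds" (assumed path) and print or
compare the returned string; a return value of -1 simply means the running
kernel does not provide the file.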

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 5/9] MDS basics 5
  2019-02-22  0:46   ` [MODERATED] " Andrew Cooper
  2019-02-22  7:00     ` Thomas Gleixner
@ 2019-02-22  9:20     ` Peter Zijlstra
  2019-02-22 10:23       ` Thomas Gleixner
  1 sibling, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2019-02-22  9:20 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:46:46AM +0000, speck for Andrew Cooper wrote:
> On 21/02/2019 23:44, speck for Thomas Gleixner wrote:
> > +   There is one non maskable exception which returns through paranoid exit
> > +   and is not mitigated: #DF. If user space is able to trigger a double
> > +   fault the possible MDS leakage is the least problem to worry about.
> 
> What about espfix64?  An IRET fault from that ends up at #DF, and
> purposefully recovers.  It is trigger-able from at least modify_ldt().
> 
> The #DF path is normally fatal, but in the cases that it's not, an extra
> VERW isn't going to be the slow part.

What about the #MC, do_mce has paranoid=1 on.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch V3 5/9] MDS basics 5
  2019-02-22  9:20     ` [MODERATED] " Peter Zijlstra
@ 2019-02-22 10:23       ` Thomas Gleixner
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-22 10:23 UTC (permalink / raw)
  To: speck

On Fri, 22 Feb 2019, speck for Peter Zijlstra wrote:
> On Fri, Feb 22, 2019 at 12:46:46AM +0000, speck for Andrew Cooper wrote:
> > On 21/02/2019 23:44, speck for Thomas Gleixner wrote:
> > > +   There is one non maskable exception which returns through paranoid exit
> > > +   and is not mitigated: #DF. If user space is able to trigger a double
> > > +   fault the possible MDS leakage is the least problem to worry about.
> > 
> > What about espfix64?  An IRET fault from that ends up at #DF, and
> > purposefully recovers.  It is trigger-able from at least modify_ldt().
> > 
> > The #DF path is normally fatal, but in the cases that it's not, an extra
> > VERW isn't going to be the slow part.
> 
> What about the #MC, do_mce has paranoid=1 on.

#MC takes the regular error_enter/exit_path when it comes from user
space. The paranoid path is taken when it hits in kernel mode. There is a small
window though between the buffer clear and the actual return to user, which
is not covered. Sure we can slap a clear_buffer() into #MC as well, but
that's just for checking the tickbox on the spreadsheet. TBH, either way is
fine for me.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch V3 9/9] MDS basics 9
  2019-02-22  7:50   ` [MODERATED] " Greg KH
@ 2019-02-22 10:38     ` Thomas Gleixner
  2019-02-22 14:44       ` [MODERATED] " Greg KH
  2019-02-22 15:53       ` [MODERATED] " Borislav Petkov
  0 siblings, 2 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-22 10:38 UTC (permalink / raw)
  To: speck

On Fri, 22 Feb 2019, speck for Greg KH wrote:
> On Fri, Feb 22, 2019 at 12:44:40AM +0100, speck for Thomas Gleixner wrote:
> >  static void mds_select_mitigation(void)
> > @@ -236,12 +237,12 @@ static void mds_select_mitigation(void)
> >  		break;
> >  	case MDS_MITIGATION_AUTO:
> >  	case MDS_MITIGATION_FULL:
> > -		if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) {
> > +	case MDS_MITIGATION_VMWERV:
> > +		if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
> >  			mds_mitigation = MDS_MITIGATION_FULL;
> > -			static_branch_enable(&mds_user_clear);
> > -		} else {
> > -			mds_mitigation = MDS_MITIGATION_OFF;
> > -		}
> > +		else
> > +			mds_mitigation = MDS_MITIGATION_VMWERV;
> > +		static_branch_enable(&mds_user_clear);
> 
> > So did we just lose the ability at "auto" to turn this off if present,
> > because we really do not know if we can turn it off automatically?
> > 
> > Or am I reading this code wrong?
> > 
> > Should there be a new userspace command line option for "vmwerv"?

I'd prefer not. So the logic here is:

if CPU not affected:
   do nothing

if 'off':
   do nothing

if 'auto' or 'full':
   enable VERW

The latter has two variants:

  1) cpuid MD_CLEAR is set

     Switches the internal mode to FULL and VERW provides real protection.

  2) cpuid MD_CLEAR is not set

     Switches the internal mode to VMWERV, issues VERW which protects or
     not. If microcode is not updated the VERW wastes a few cpu cycles
     pointlessly.

The internal state is there so the dmesg/sysfs output reflects the
protection state real vs. lottery.
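
The logic above can be modeled in a few lines of plain C. This is only a
user-space sketch of the decision table, not the code in bugs.c: the enum,
struct and function names are made up for illustration, "not affected" is
assumed to map to the OFF state, and VMWERV is assumed to be internal-only
(no command line option), as preferred above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the MDS_MITIGATION_* modes. */
enum mds_mode { MDS_OFF, MDS_AUTO, MDS_FULL, MDS_VMWERV };

struct mds_state {
    enum mds_mode mode;  /* internal mode, as reflected in dmesg/sysfs */
    bool user_clear;     /* true if VERW is issued on return to user */
};

static struct mds_state mds_select(bool cpu_affected,
                                   enum mds_mode requested,
                                   bool has_md_clear)
{
    struct mds_state st = { MDS_OFF, false };

    /* Assumption: only 'off', 'auto' and 'full' can be requested. */
    assert(requested != MDS_VMWERV);

    /* CPU not affected, or 'off' requested: do nothing. */
    if (!cpu_affected || requested == MDS_OFF)
        return st;

    /*
     * 'auto' or 'full': always enable VERW. The internal mode only
     * records whether MD_CLEAR is enumerated (real protection) or
     * not (VERW may or may not clear the buffers).
     */
    st.mode = has_md_clear ? MDS_FULL : MDS_VMWERV;
    st.user_clear = true;
    return st;
}
```

With this model, mds_select(true, MDS_AUTO, false) yields the VMWERV mode
with VERW enabled, which is the case the question was about.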

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch V3 4/9] MDS basics 4
  2019-02-22  6:58   ` [MODERATED] " Greg KH
@ 2019-02-22 10:44     ` Thomas Gleixner
  2019-02-22 14:36       ` [MODERATED] " Greg KH
  0 siblings, 1 reply; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-22 10:44 UTC (permalink / raw)
  To: speck

On Fri, 22 Feb 2019, speck for Greg KH wrote:
> On Fri, Feb 22, 2019 at 12:44:35AM +0100, speck for Thomas Gleixner wrote:
> > +There is one exception, which is untrusted BPF. The functionality of
> > +untrusted BPF is limited, but it needs to be thoroughly investigated
> > +whether it can be used to create such a construct.
> 
> A meta-comment, is anyone looking at the untrusted BPF issue?  Do we
> have the BPF developers on this list so that they have the chance to
> figure this out?

I assume this is a rhetorical question :(

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 4/9] MDS basics 4
  2019-02-22 10:44     ` Thomas Gleixner
@ 2019-02-22 14:36       ` Greg KH
  2019-02-22 22:38         ` Thomas Gleixner
  0 siblings, 1 reply; 32+ messages in thread
From: Greg KH @ 2019-02-22 14:36 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 11:44:13AM +0100, speck for Thomas Gleixner wrote:
> On Fri, 22 Feb 2019, speck for Greg KH wrote:
> > On Fri, Feb 22, 2019 at 12:44:35AM +0100, speck for Thomas Gleixner wrote:
> > > +There is one exception, which is untrusted BPF. The functionality of
> > > +untrusted BPF is limited, but it needs to be thoroughly investigated
> > > +whether it can be used to create such a construct.
> > 
> > A meta-comment, is anyone looking at the untrusted BPF issue?  Do we
> > have the BPF developers on this list so that they have the chance to
> > figure this out?
> 
> I assume this is a rhetorical question :(

No, it wasn't, I was really hoping that the BPF developers were notified
of all of this.

Ugh, this whole mess is horrible...

greg k-h

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 9/9] MDS basics 9
  2019-02-22 10:38     ` Thomas Gleixner
@ 2019-02-22 14:44       ` Greg KH
  2019-02-22 15:53       ` [MODERATED] " Borislav Petkov
  1 sibling, 0 replies; 32+ messages in thread
From: Greg KH @ 2019-02-22 14:44 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 11:38:36AM +0100, speck for Thomas Gleixner wrote:
> On Fri, 22 Feb 2019, speck for Greg KH wrote:
> > On Fri, Feb 22, 2019 at 12:44:40AM +0100, speck for Thomas Gleixner wrote:
> > >  static void mds_select_mitigation(void)
> > > @@ -236,12 +237,12 @@ static void mds_select_mitigation(void)
> > >  		break;
> > >  	case MDS_MITIGATION_AUTO:
> > >  	case MDS_MITIGATION_FULL:
> > > -		if (boot_cpu_has(X86_FEATURE_MD_CLEAR)) {
> > > +	case MDS_MITIGATION_VMWERV:
> > > +		if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
> > >  			mds_mitigation = MDS_MITIGATION_FULL;
> > > -			static_branch_enable(&mds_user_clear);
> > > -		} else {
> > > -			mds_mitigation = MDS_MITIGATION_OFF;
> > > -		}
> > > +		else
> > > +			mds_mitigation = MDS_MITIGATION_VMWERV;
> > > +		static_branch_enable(&mds_user_clear);
> > 
> > > So did we just lose the ability at "auto" to turn this off if present,
> > > because we really do not know if we can turn it off automatically?
> > > 
> > > Or am I reading this code wrong?
> > > 
> > > Should there be a new userspace command line option for "vmwerv"?
> 
> I'd prefer not. So the logic here is:
> 
> if CPU not affected:
>    do nothing
> 
> if 'off':
>    do nothing
> 
> if 'auto' or 'full':
>    enable VERW
> 
> The latter has two variants:
> 
>   1) cpuid MD_CLEAR is set
> 
>      Switches the internal mode to FULL and VERW provides real protection.
> 
>   2) cpuid MD_CLEAR is not set
> 
>      Switches the internal mode to VMWERV, issues VERW which protects or
>      not. If microcode is not updated the VERW wastes a few cpu cycles
>      pointlessly.
> 
> The internal state is there so the dmesg/sysfs output reflects the
> protection state real vs. lottery.

Ok, thanks, that makes more sense.  That might want to go into the
documentation somewhere :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: Re: [patch V3 9/9] MDS basics 9
  2019-02-22 10:38     ` Thomas Gleixner
  2019-02-22 14:44       ` [MODERATED] " Greg KH
@ 2019-02-22 15:53       ` Borislav Petkov
  1 sibling, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2019-02-22 15:53 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 11:38:36AM +0100, speck for Thomas Gleixner wrote:
> The internal state is there so the dmesg/sysfs output reflects the
> protection state real vs. lottery.

MDS_MITIGATION_LOTTERY is starting to sound ok too, all of a sudden. :-P

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: [patch V3 9/9] MDS basics 9
  2019-02-21 23:44 ` [patch V3 9/9] MDS basics 9 Thomas Gleixner
  2019-02-22  7:50   ` [MODERATED] " Greg KH
@ 2019-02-22 15:54   ` Borislav Petkov
  1 sibling, 0 replies; 32+ messages in thread
From: Borislav Petkov @ 2019-02-22 15:54 UTC (permalink / raw)
  To: speck

On Fri, Feb 22, 2019 at 12:44:40AM +0100, speck for Thomas Gleixner wrote:
> Subject: [patch V3 9/9] x86/speculation/mds: Add mitigation mode VMWERV
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> In virtualized environments it can happen that the host has the microcode
> update which utilizes the VERW instruction to clear CPU buffers, but the
> hypervisor is not yet updated to expose the X86_FEATURE_MD_CLEAR CPUID bit
> to guests.
> 
> Introduce an internal mitigation mode 'VWWERV' which enables the invocation
					 ^
					 VMWERV


> of the CPU buffer clearing even if X86_FEATURE_MD_CLEAR is not set. If the
> system has no updated microcode this results in a pointless execution of
> the VERW instruction wasting a few CPU cycles. If the microcode is updated,
> but not exposed to a guest then the CPU buffers will be cleared.
> 
> That said: Virtual Machines Will Eventually Receive Vaccine

Haha.

> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2 -> V3: Rename mode.
> ---
>  Documentation/x86/mds.rst        |    5 +++++
>  arch/x86/include/asm/processor.h |    1 +
>  arch/x86/kernel/cpu/bugs.c       |   14 ++++++++------
>  3 files changed, 14 insertions(+), 6 deletions(-)
> 
> --- a/Documentation/x86/mds.rst
> +++ b/Documentation/x86/mds.rst
> @@ -88,6 +88,11 @@ The mitigation is invoked on kernel/user
>  (idle) transitions. Depending on the mitigation mode and the system state
>  the invocation can be enforced or conditional.
>  
> +As a special quirk to address virtualization scenarios where the host has
> +the microcode updated, but the hypervisor does not (yet) expose the
> +MD_CLEAR CPUID bit to guests, the kernel issues the VERW instruction in the
> +hope that it might work. The state is reflected accordingly.

"... in the hope that it would clear the buffers, additionally."

It will work, the question is how much more will it do. :)

In any case:

Reviewed-by: Borislav Petkov <bp@suse.de>

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [MODERATED] Re: Encrypted Message
  2019-02-22  7:45   ` [MODERATED] Encrypted Message Jon Masters
@ 2019-02-22 17:16     ` Linus Torvalds
  2019-02-22 17:40       ` Thomas Gleixner
  0 siblings, 1 reply; 32+ messages in thread
From: Linus Torvalds @ 2019-02-22 17:16 UTC (permalink / raw)
  To: speck

On Thu, Feb 21, 2019 at 11:46 PM speck for Jon Masters
<speck@linutronix.de> wrote:
>
> Dunno if it's worth documenting that using a specifically valid segment
> is faster than a zero selector according to Intel.

It probably is worth documenting, because it's so non-intuitive.

In a sane world, Intel would have realized that

 (a) nobody uses verw for anything else

 (b) a zero selector doesn't need any LDT/GDT loads

and optimized for that case. Preferably allowing a register operand too.

So the fact that the verw has to have a memory op and is faster for a
non-zero segment descriptor is all kinds of crazy.

And being crazy, a comment about it is worth it, since otherwise it
looks like _we_ are the crazy ones.
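
To make the operand point concrete, here is a small user-space sketch,
assuming GCC or Clang inline asm on x86. It is not the kernel's
implementation; read_ss() and verw_is_writable() are hypothetical names.
It only shows that VERW is fed a selector through a memory operand and
reports via ZF whether that selector refers to a writable segment. Whether
the instruction also clears CPU buffers depends on the microcode and cannot
be observed from this code.

```c
#include <stdint.h>

/* Read the current stack segment selector; returns 0 on non-x86 builds. */
static uint16_t read_ss(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint16_t sel;

    __asm__ volatile("mov %%ss, %0" : "=r"(sel));
    return sel;
#else
    return 0;
#endif
}

/*
 * Execute VERW on the selector stored in memory.
 * Returns 1 if VERW set ZF (segment is writable), 0 if it cleared ZF
 * (for example for a zero selector), and -1 on non-x86 builds.
 */
static int verw_is_writable(uint16_t sel)
{
#if defined(__x86_64__) || defined(__i386__)
    uint8_t zf;

    /* "m": memory operand form; "cc": VERW modifies ZF. */
    __asm__ volatile("verw %[sel]\n\t"
                     "setz %[zf]"
                     : [zf] "=q"(zf)
                     : [sel] "m"(sel)
                     : "cc");
    return zf;
#else
    (void)sel;
    return -1;
#endif
}
```

Calling verw_is_writable(0) exercises the zero selector case and
verw_is_writable(read_ss()) the valid selector case; the speed difference
between the two is the non-intuitive part that deserves the comment.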

              Linus

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Encrypted Message
  2019-02-22 17:16     ` [MODERATED] " Linus Torvalds
@ 2019-02-22 17:40       ` Thomas Gleixner
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-22 17:40 UTC (permalink / raw)
  To: speck

On Fri, 22 Feb 2019, speck for Linus Torvalds wrote:
> On Thu, Feb 21, 2019 at 11:46 PM speck for Jon Masters
> <speck@linutronix.de> wrote:
> >
> > Dunno if it's worth documenting that using a specifically valid segment
> > is faster than a zero selector according to Intel.
> 
> It probably is worth documenting, because it's so non-intuitive.
> 
> In a sane world, Intel would have realized that
> 
>  (a) nobody uses verw for anything else
> 
>  (b) a zero selector doesn't need any LDT/GDT loads
> 
> and optimized for that case. Preferably allowing a register operand too.
> 
> So the fact that the verw has to have a memory op and is faster for a
> non-zero segment descriptor is all kinds of crazy.

Will add one.

> And being crazy, a comment about it is worth it, since otherwise it
> looks like _we_ are the crazy ones.

Some time ago you stated at the kernel summit, that everyone in the room is
crazy: https://quotes.yourdictionary.com/author/linus-torvalds/190350

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch V3 4/9] MDS basics 4
  2019-02-22 14:36       ` [MODERATED] " Greg KH
@ 2019-02-22 22:38         ` Thomas Gleixner
  0 siblings, 0 replies; 32+ messages in thread
From: Thomas Gleixner @ 2019-02-22 22:38 UTC (permalink / raw)
  To: speck

On Fri, 22 Feb 2019, speck for Greg KH wrote:

> On Fri, Feb 22, 2019 at 11:44:13AM +0100, speck for Thomas Gleixner wrote:
> > On Fri, 22 Feb 2019, speck for Greg KH wrote:
> > > On Fri, Feb 22, 2019 at 12:44:35AM +0100, speck for Thomas Gleixner wrote:
> > > > +There is one exception, which is untrusted BPF. The functionality of
> > > > +untrusted BPF is limited, but it needs to be thoroughly investigated
> > > > +whether it can be used to create such a construct.
> > > 
> > > A meta-comment, is anyone looking at the untrusted BPF issue?  Do we
> > > have the BPF developers on this list so that they have the chance to
> > > figure this out?
> > 
> > > I assume this is a rhetorical question :(
> 
> No, it wasn't, I was really hoping that the BPF developers were notified
> of all of this.
> 
> Ugh, this whole mess is horrible...

FYI, I've asked Intel to either provide an authoritative answer to that
question or to agree that we bring at least Alexei on board. Hopefully this
won't take 8+ weeks as last time we tried ...

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 32+ messages in thread

Thread overview: 32+ messages
2019-02-21 23:44 [patch V3 0/9] MDS basics 0 Thomas Gleixner
2019-02-21 23:44 ` [patch V3 1/9] MDS basics 1 Thomas Gleixner
2019-02-22  6:53   ` [MODERATED] " Greg KH
2019-02-22  7:30   ` Borislav Petkov
2019-02-21 23:44 ` [patch V3 2/9] MDS basics 2 Thomas Gleixner
2019-02-21 23:44 ` [patch V3 3/9] MDS basics 3 Thomas Gleixner
2019-02-21 23:44 ` [patch V3 4/9] MDS basics 4 Thomas Gleixner
2019-02-22  6:58   ` [MODERATED] " Greg KH
2019-02-22 10:44     ` Thomas Gleixner
2019-02-22 14:36       ` [MODERATED] " Greg KH
2019-02-22 22:38         ` Thomas Gleixner
2019-02-22  7:45   ` [MODERATED] Encrypted Message Jon Masters
2019-02-22 17:16     ` [MODERATED] " Linus Torvalds
2019-02-22 17:40       ` Thomas Gleixner
2019-02-22  7:50   ` [MODERATED] Re: [patch V3 4/9] MDS basics 4 Borislav Petkov
2019-02-21 23:44 ` [patch V3 5/9] MDS basics 5 Thomas Gleixner
2019-02-22  0:46   ` [MODERATED] " Andrew Cooper
2019-02-22  7:00     ` Thomas Gleixner
2019-02-22  9:20     ` [MODERATED] " Peter Zijlstra
2019-02-22 10:23       ` Thomas Gleixner
2019-02-21 23:44 ` [patch V3 6/9] MDS basics 6 Thomas Gleixner
2019-02-21 23:44 ` [patch V3 7/9] MDS basics 7 Thomas Gleixner
2019-02-22  7:08   ` [MODERATED] " Greg KH
2019-02-21 23:44 ` [patch V3 8/9] MDS basics 8 Thomas Gleixner
2019-02-22  7:14   ` [MODERATED] " Greg KH
2019-02-22  8:55   ` Borislav Petkov
2019-02-21 23:44 ` [patch V3 9/9] MDS basics 9 Thomas Gleixner
2019-02-22  7:50   ` [MODERATED] " Greg KH
2019-02-22 10:38     ` Thomas Gleixner
2019-02-22 14:44       ` [MODERATED] " Greg KH
2019-02-22 15:53       ` [MODERATED] " Borislav Petkov
2019-02-22 15:54   ` [MODERATED] " Borislav Petkov