historical-speck.lore.kernel.org archive mirror
 help / color / mirror / Atom feed
* [MODERATED] [PATCH 2/3] v4 more sampling fun 2
  2020-03-18 21:27 [MODERATED] [PATCH 0/3] v4 more sampling fun 0 mark gross
@ 2020-01-16 22:16 ` mark gross
  2020-01-30 19:12 ` [MODERATED] [PATCH 3/3] v4 more sampling fun 3 mark gross
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: mark gross @ 2020-01-16 22:16 UTC (permalink / raw)
  To: speck

From: mark gross <mgross@linux.intel.com>
Subject: [PATCH 2/3] x86/speculation: Special Register Buffer Data Sampling
 (SRBDS) mitigation control.

SRBDS is an MDS-like speculative side channel that can leak bits from
the RNG across cores and threads. New microcode serializes the processor
access during the execution of RDRAND and RDSEED.  This ensures that the
shared buffer is overwritten before it is released for reuse.

While it is present on all affected CPU models, the microcode mitigation
is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
cases where TSX is not supported or has been disabled with TSX_CTRL.

The mitigation is activated by default on affected processors and it
increases latency for RDRAND and RDSEED instructions.  Among other
effects this will reduce throughput from /dev/urandom.

* enable administrator to configure the mitigation off when desired
  using either mitigations=off or srbds=off.
* export vulnerability status via sysfs
* rename file scoped macros to apply for non-whitelist table
  initializations.

Signed-off-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 .../ABI/testing/sysfs-devices-system-cpu      |   1 +
 .../admin-guide/kernel-parameters.txt         |  11 ++
 arch/x86/include/asm/cpufeatures.h            |   2 +
 arch/x86/include/asm/msr-index.h              |   4 +
 arch/x86/kernel/cpu/bugs.c                    | 112 +++++++++++++++
 arch/x86/kernel/cpu/common.c                  | 128 +++++++++++-------
 arch/x86/kernel/cpu/cpu.h                     |  13 ++
 arch/x86/kernel/cpu/intel.c                   |   2 +
 arch/x86/kernel/cpu/match.c                   |   6 +-
 drivers/base/cpu.c                            |   8 ++
 include/linux/mod_devicetable.h               |   2 +-
 11 files changed, 236 insertions(+), 53 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 2e0e3b45d02a..a890b3769335 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -492,6 +492,7 @@ What:		/sys/devices/system/cpu/vulnerabilities
 		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
 		/sys/devices/system/cpu/vulnerabilities/l1tf
 		/sys/devices/system/cpu/vulnerabilities/mds
+		/sys/devices/system/cpu/vulnerabilities/special_register_data_sampling
 		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
 		/sys/devices/system/cpu/vulnerabilities/itlb_multihit
 Date:		January 2018
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index c07815d230bc..71cf7e7eae45 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4659,6 +4659,17 @@
 	spia_pedr=
 	spia_peddr=
 
+	srbds=		[x86]
+			Special Register Buffer Data Sampling mitigation
+			control.
+
+			On CPUs vulnerable to this issue and users impacted by
+			the mitigation slowing down RDRAND and RDSEED can
+			disable by setting this:
+
+			off:	disable mitigation and remove performance
+				impact to rdrand and rdseed
+
 	srcutree.counter_wrap_check [KNL]
 			Specifies how frequently to check for
 			grace-period sequence counter wrap for the
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index f3327cb56edf..69f7dcb1fa5c 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -360,6 +360,7 @@
 #define X86_FEATURE_AVX512_4FMAPS	(18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
 #define X86_FEATURE_FSRM		(18*32+ 4) /* Fast Short Rep Mov */
 #define X86_FEATURE_AVX512_VP2INTERSECT (18*32+ 8) /* AVX-512 Intersect for D/Q */
+#define X86_FEATURE_SRBDS_CTRL		(18*32+ 9) /* "" SRBDS mitigation MSR available */
 #define X86_FEATURE_MD_CLEAR		(18*32+10) /* VERW clears CPU buffers */
 #define X86_FEATURE_TSX_FORCE_ABORT	(18*32+13) /* "" TSX_FORCE_ABORT */
 #define X86_FEATURE_PCONFIG		(18*32+18) /* Intel PCONFIG */
@@ -404,5 +405,6 @@
 #define X86_BUG_SWAPGS			X86_BUG(21) /* CPU is affected by speculation through SWAPGS */
 #define X86_BUG_TAA			X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */
 #define X86_BUG_ITLB_MULTIHIT		X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */
+#define X86_BUG_SRBDS			X86_BUG(24) /* CPU may leak RNG bits if not mitigated */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index d5e517d1c3dd..af64c8e80ff4 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -119,6 +119,10 @@
 #define TSX_CTRL_RTM_DISABLE		BIT(0)	/* Disable RTM feature */
 #define TSX_CTRL_CPUID_CLEAR		BIT(1)	/* Disable TSX enumeration */
 
+/* SRBDS support */
+#define MSR_IA32_MCU_OPT_CTRL		0x00000123
+#define RNGDS_MITG_DIS			BIT(0)
+
 #define MSR_IA32_SYSENTER_CS		0x00000174
 #define MSR_IA32_SYSENTER_ESP		0x00000175
 #define MSR_IA32_SYSENTER_EIP		0x00000176
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index ed54b3b21c39..eff2b8cbf8bb 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -41,6 +41,7 @@ static void __init l1tf_select_mitigation(void);
 static void __init mds_select_mitigation(void);
 static void __init mds_print_mitigation(void);
 static void __init taa_select_mitigation(void);
+static void __init srbds_select_mitigation(void);
 
 /* The base value of the SPEC_CTRL MSR that always has to be preserved. */
 u64 x86_spec_ctrl_base;
@@ -108,6 +109,7 @@ void __init check_bugs(void)
 	l1tf_select_mitigation();
 	mds_select_mitigation();
 	taa_select_mitigation();
+	srbds_select_mitigation();
 
 	/*
 	 * As MDS and TAA mitigations are inter-related, print MDS
@@ -397,6 +399,103 @@ static int __init tsx_async_abort_parse_cmdline(char *str)
 }
 early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
 
+#undef pr_fmt
+#define pr_fmt(fmt)	"SRBDS: " fmt
+
+enum srbds_mitigations srbds_mitigation __ro_after_init = SRBDS_MITIGATION_FULL;
+static const char * const srbds_strings[] = {
+	[SRBDS_NOT_AFFECTED]		= "Not affected",
+	[SRBDS_MITIGATION_OFF]		= "Vulnerable",
+	[SRBDS_MITIGATION_UCODE_NEEDED]	= "Vulnerable: no microcode",
+	[SRBDS_MITIGATION_FULL]		= "Mitigated",
+	[SRBDS_NOT_AFFECTED_TSX_OFF]	= "Not affected (TSX disabled)",
+	[SRBDS_HYPERVISOR]		= "Unknown",
+};
+
+static bool srbds_off;
+
+void srbds_configure_mitigation(void)
+{
+	u64 mcu_ctrl;
+
+	if (srbds_mitigation == SRBDS_NOT_AFFECTED)
+		return;
+
+	if (srbds_mitigation == SRBDS_HYPERVISOR)
+		return;
+
+	if (srbds_mitigation == SRBDS_MITIGATION_UCODE_NEEDED)
+		return;
+
+	rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
+
+	switch (srbds_mitigation) {
+	case SRBDS_MITIGATION_OFF:
+	case SRBDS_NOT_AFFECTED_TSX_OFF:
+		mcu_ctrl |= RNGDS_MITG_DIS;
+		break;
+	case SRBDS_MITIGATION_FULL:
+		mcu_ctrl &= ~RNGDS_MITG_DIS;
+		break;
+	default:
+		break;
+	}
+
+	wrmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_ctrl);
+}
+
+static void __init srbds_select_mitigation(void)
+{
+	u64 ia32_cap;
+
+	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
+		srbds_mitigation = SRBDS_HYPERVISOR;
+		return;
+	}
+
+	if (!boot_cpu_has_bug(X86_BUG_SRBDS)) {
+		srbds_mitigation = SRBDS_NOT_AFFECTED;
+		return;
+	}
+
+	if (!boot_cpu_has(X86_FEATURE_SRBDS_CTRL)) {
+		srbds_mitigation = SRBDS_MITIGATION_UCODE_NEEDED;
+		return;
+	}
+
+	if (boot_cpu_has_bug(X86_BUG_SRBDS)) {
+		srbds_mitigation = SRBDS_MITIGATION_FULL;
+		/*
+		 * Check to see if this is one of the MDS_NO systems supporting
+		 * TSX that are only exposed to SRBDS when TSX is enabled.
+		 */
+		ia32_cap = x86_read_arch_cap_msr();
+		if (ia32_cap & ARCH_CAP_MDS_NO) {
+			if (!boot_cpu_has(X86_FEATURE_RTM))
+				srbds_mitigation = SRBDS_NOT_AFFECTED_TSX_OFF;
+		}
+	}
+
+	if (cpu_mitigations_off() || srbds_off) {
+		if (srbds_mitigation != SRBDS_NOT_AFFECTED_TSX_OFF)
+			srbds_mitigation = SRBDS_MITIGATION_OFF;
+	}
+
+	srbds_configure_mitigation();
+}
+
+static int __init srbds_parse_cmdline(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	if (!strcmp(str, "off"))
+		srbds_off = true;
+
+	return 0;
+}
+early_param("srbds", srbds_parse_cmdline);
+
 #undef pr_fmt
 #define pr_fmt(fmt)     "Spectre V1 : " fmt
 
@@ -1528,6 +1627,11 @@ static char *ibpb_state(void)
 	return "";
 }
 
+static ssize_t srbds_show_state(char *buf)
+{
+	return sprintf(buf, "%s\n", srbds_strings[srbds_mitigation]);
+}
+
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
 			       char *buf, unsigned int bug)
 {
@@ -1572,6 +1676,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
 	case X86_BUG_ITLB_MULTIHIT:
 		return itlb_multihit_show_state(buf);
 
+	case X86_BUG_SRBDS:
+		return srbds_show_state(buf);
+
 	default:
 		break;
 	}
@@ -1618,4 +1725,9 @@ ssize_t cpu_show_itlb_multihit(struct device *dev, struct device_attribute *attr
 {
 	return cpu_show_common(dev, attr, buf, X86_BUG_ITLB_MULTIHIT);
 }
+
+ssize_t cpu_show_special_register_data_sampling(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return cpu_show_common(dev, attr, buf, X86_BUG_SRBDS);
+}
 #endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 4cdb123ff66a..d8263d8ed072 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1007,47 +1007,51 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
 #define NO_SWAPGS		BIT(6)
 #define NO_ITLB_MULTIHIT	BIT(7)
 #define NO_SPECTRE_V2		BIT(8)
+#define SRBDS			BIT(9)
 
-#define VULNWL(_vendor, _family, _model, _whitelist)	\
+#define VULNHW(_vendor, _family, _model, _whitelist)	\
 	{ X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
 
-#define VULNWL_INTEL(model, whitelist)		\
-	VULNWL(INTEL, 6, INTEL_FAM6_##model, whitelist)
+#define VULNHW_INTEL(model, whitelist)		\
+	VULNHW(INTEL, 6, INTEL_FAM6_##model, whitelist)
 
-#define VULNWL_AMD(family, whitelist)		\
-	VULNWL(AMD, family, X86_MODEL_ANY, whitelist)
+#define VULNHW_AMD(family, whitelist)		\
+	VULNHW(AMD, family, X86_MODEL_ANY, whitelist)
 
-#define VULNWL_HYGON(family, whitelist)		\
-	VULNWL(HYGON, family, X86_MODEL_ANY, whitelist)
+#define VULNHW_HYGON(family, whitelist)		\
+	VULNHW(HYGON, family, X86_MODEL_ANY, whitelist)
+
+#define VULNHW_INTEL_STEPPING(model, steppings, issues)	\
+	{ X86_VENDOR_INTEL, 6, INTEL_FAM6_##model, X86_FEATURE_ANY, issues, steppings }
 
 static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
-	VULNWL(ANY,	4, X86_MODEL_ANY,	NO_SPECULATION),
-	VULNWL(CENTAUR,	5, X86_MODEL_ANY,	NO_SPECULATION),
-	VULNWL(INTEL,	5, X86_MODEL_ANY,	NO_SPECULATION),
-	VULNWL(NSC,	5, X86_MODEL_ANY,	NO_SPECULATION),
+	VULNHW(ANY,	4, X86_MODEL_ANY,	NO_SPECULATION),
+	VULNHW(CENTAUR,	5, X86_MODEL_ANY,	NO_SPECULATION),
+	VULNHW(INTEL,	5, X86_MODEL_ANY,	NO_SPECULATION),
+	VULNHW(NSC,	5, X86_MODEL_ANY,	NO_SPECULATION),
 
 	/* Intel Family 6 */
-	VULNWL_INTEL(ATOM_SALTWELL,		NO_SPECULATION | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_SALTWELL_TABLET,	NO_SPECULATION | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_SALTWELL_MID,		NO_SPECULATION | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_BONNELL,		NO_SPECULATION | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_BONNELL_MID,		NO_SPECULATION | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_SALTWELL,		NO_SPECULATION | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_SALTWELL_TABLET,	NO_SPECULATION | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_SALTWELL_MID,		NO_SPECULATION | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_BONNELL,		NO_SPECULATION | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_BONNELL_MID,		NO_SPECULATION | NO_ITLB_MULTIHIT),
 
-	VULNWL_INTEL(ATOM_SILVERMONT,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_SILVERMONT_D,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_SILVERMONT_MID,	NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_AIRMONT,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(XEON_PHI_KNL,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(XEON_PHI_KNM,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_SILVERMONT,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_SILVERMONT_D,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_SILVERMONT_MID,	NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_AIRMONT,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(XEON_PHI_KNL,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(XEON_PHI_KNM,		NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
 
-	VULNWL_INTEL(CORE_YONAH,		NO_SSB),
+	VULNHW_INTEL(CORE_YONAH,		NO_SSB),
 
-	VULNWL_INTEL(ATOM_AIRMONT_MID,		NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_AIRMONT_NP,		NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_AIRMONT_MID,		NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_AIRMONT_NP,		NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
 
-	VULNWL_INTEL(ATOM_GOLDMONT,		NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_GOLDMONT_D,		NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_INTEL(ATOM_GOLDMONT_PLUS,	NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_GOLDMONT,		NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_GOLDMONT_D,		NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_GOLDMONT_PLUS,	NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
 
 	/*
 	 * Technically, swapgs isn't serializing on AMD (despite it previously
@@ -1057,27 +1061,47 @@ static const __initconst struct x86_cpu_id cpu_vuln_whitelist[] = {
 	 * good enough for our purposes.
 	 */
 
-	VULNWL_INTEL(ATOM_TREMONT_D,		NO_ITLB_MULTIHIT),
+	VULNHW_INTEL(ATOM_TREMONT_D,		NO_ITLB_MULTIHIT),
 
 	/* AMD Family 0xf - 0x12 */
-	VULNWL_AMD(0x0f,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_AMD(0x10,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_AMD(0x11,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_AMD(0x12,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_AMD(0x0f,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_AMD(0x10,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_AMD(0x11,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_AMD(0x12,	NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
 
 	/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
-	VULNWL_AMD(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
-	VULNWL_HYGON(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_AMD(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+	VULNHW_HYGON(X86_FAMILY_ANY,	NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
 
 	/* Zhaoxin Family 7 */
-	VULNWL(CENTAUR,	7, X86_MODEL_ANY,	NO_SPECTRE_V2 | NO_SWAPGS),
-	VULNWL(ZHAOXIN,	7, X86_MODEL_ANY,	NO_SPECTRE_V2 | NO_SWAPGS),
+	VULNHW(CENTAUR,	7, X86_MODEL_ANY,	NO_SPECTRE_V2 | NO_SWAPGS),
+	VULNHW(ZHAOXIN,	7, X86_MODEL_ANY,	NO_SPECTRE_V2 | NO_SWAPGS),
 	{}
 };
 
-static bool __init cpu_matches(unsigned long which)
+/*
+ * List affected CPU's for issues that cannot be enumerated.  The last 4
+ * entries could be collapsed into 2 but, to make it easier to review against
+ * the Intel white paper on SRBDS they are listed out in a similar manner to
+ * the white paper.
+ */
+static const struct x86_cpu_id affected_cpus[] __initconst = {
+	VULNHW_INTEL(IVYBRIDGE,		SRBDS),
+	VULNHW_INTEL(HASWELL,		SRBDS),
+	VULNHW_INTEL(HASWELL_L,		SRBDS),
+	VULNHW_INTEL(HASWELL_G,		SRBDS),
+	VULNHW_INTEL(BROADWELL_G,	SRBDS),
+	VULNHW_INTEL(BROADWELL,		SRBDS),
+	VULNHW_INTEL(SKYLAKE_L,		SRBDS),
+	VULNHW_INTEL(SKYLAKE,		SRBDS),
+	VULNHW_INTEL_STEPPING(KABYLAKE_L, GENMASK(0xC, 0), SRBDS), /*06_8E steppings <=C*/
+	VULNHW_INTEL_STEPPING(KABYLAKE, GENMASK(0xD, 0),   SRBDS), /*06_9E steppings <=D*/
+	{}
+};
+
+static bool __init cpu_matches(unsigned long which, const struct x86_cpu_id *table)
 {
-	const struct x86_cpu_id *m = x86_match_cpu(cpu_vuln_whitelist);
+	const struct x86_cpu_id *m = x86_match_cpu(table);
 
 	return m && !!(m->driver_data & which);
 }
@@ -1097,33 +1121,39 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 	u64 ia32_cap = x86_read_arch_cap_msr();
 
 	/* Set ITLB_MULTIHIT bug if cpu is not in the whitelist and not mitigated */
-	if (!cpu_matches(NO_ITLB_MULTIHIT) && !(ia32_cap & ARCH_CAP_PSCHANGE_MC_NO))
+	if (!cpu_matches(NO_ITLB_MULTIHIT, cpu_vuln_whitelist) &&
+	    !(ia32_cap & ARCH_CAP_PSCHANGE_MC_NO))
 		setup_force_cpu_bug(X86_BUG_ITLB_MULTIHIT);
 
-	if (cpu_matches(NO_SPECULATION))
+	if (cpu_matches(NO_SPECULATION, cpu_vuln_whitelist))
 		return;
 
 	setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
 
-	if (!cpu_matches(NO_SPECTRE_V2))
+	if (!cpu_matches(NO_SPECTRE_V2, cpu_vuln_whitelist))
 		setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
 
-	if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) &&
-	   !cpu_has(c, X86_FEATURE_AMD_SSB_NO))
+	if (!cpu_matches(NO_SSB, cpu_vuln_whitelist) &&
+	    (ia32_cap & ARCH_CAP_SSB_NO) &&
+	    !cpu_has(c, X86_FEATURE_AMD_SSB_NO))
 		setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
 
 	if (ia32_cap & ARCH_CAP_IBRS_ALL)
 		setup_force_cpu_cap(X86_FEATURE_IBRS_ENHANCED);
 
-	if (!cpu_matches(NO_MDS) && !(ia32_cap & ARCH_CAP_MDS_NO)) {
+	if (!cpu_matches(NO_MDS, cpu_vuln_whitelist) &&
+	    !(ia32_cap & ARCH_CAP_MDS_NO)) {
 		setup_force_cpu_bug(X86_BUG_MDS);
-		if (cpu_matches(MSBDS_ONLY))
+		if (cpu_matches(MSBDS_ONLY, cpu_vuln_whitelist))
 			setup_force_cpu_bug(X86_BUG_MSBDS_ONLY);
 	}
 
-	if (!cpu_matches(NO_SWAPGS))
+	if (!cpu_matches(NO_SWAPGS, cpu_vuln_whitelist))
 		setup_force_cpu_bug(X86_BUG_SWAPGS);
 
+	if (cpu_matches(SRBDS, affected_cpus))
+		setup_force_cpu_bug(X86_BUG_SRBDS);
+
 	/*
 	 * When the CPU is not mitigated for TAA (TAA_NO=0) set TAA bug when:
 	 *	- TSX is supported or
@@ -1139,7 +1169,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 	     (ia32_cap & ARCH_CAP_TSX_CTRL_MSR)))
 		setup_force_cpu_bug(X86_BUG_TAA);
 
-	if (cpu_matches(NO_MELTDOWN))
+	if (cpu_matches(NO_MELTDOWN, cpu_vuln_whitelist))
 		return;
 
 	/* Rogue Data Cache Load? No! */
@@ -1148,7 +1178,7 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 
 	setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
 
-	if (cpu_matches(NO_L1TF))
+	if (cpu_matches(NO_L1TF, cpu_vuln_whitelist))
 		return;
 
 	setup_force_cpu_bug(X86_BUG_L1TF);
diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index 37fdefd14f28..0691cf09093e 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -44,7 +44,20 @@ struct _tlb_table {
 extern const struct cpu_dev *const __x86_cpu_dev_start[],
 			    *const __x86_cpu_dev_end[];
 
+enum srbds_mitigations {
+	SRBDS_NOT_AFFECTED,
+	SRBDS_MITIGATION_OFF,
+	SRBDS_MITIGATION_UCODE_NEEDED,
+	SRBDS_MITIGATION_FULL,
+	SRBDS_NOT_AFFECTED_TSX_OFF,
+	SRBDS_HYPERVISOR,
+};
+
+extern __ro_after_init enum srbds_mitigations srbds_mitigation;
+void srbds_configure_mitigation(void);
+
 #ifdef CONFIG_CPU_SUP_INTEL
+
 enum tsx_ctrl_states {
 	TSX_CTRL_ENABLE,
 	TSX_CTRL_DISABLE,
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index be82cd5841c3..1b083a2a415b 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -684,6 +684,8 @@ static void init_intel(struct cpuinfo_x86 *c)
 		tsx_enable();
 	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
 		tsx_disable();
+
+	srbds_configure_mitigation();
 }
 
 #ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/cpu/match.c b/arch/x86/kernel/cpu/match.c
index 129df1a959e9..8e176db2eb9a 100644
--- a/arch/x86/kernel/cpu/match.c
+++ b/arch/x86/kernel/cpu/match.c
@@ -35,7 +35,7 @@ const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
 	for (m = match; m->vendor | m->family | m->model | m->feature |
-							   m->stepping; m++) {
+							   m->steppings; m++) {
 		if (m->vendor != X86_VENDOR_ANY && c->x86_vendor != m->vendor)
 			continue;
 		if (m->family != X86_FAMILY_ANY && c->x86 != m->family)
@@ -44,8 +44,8 @@ const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
 			continue;
 		if (m->feature != X86_FEATURE_ANY && !cpu_has(c, m->feature))
 			continue;
-		if (m->stepping != X86_STEPPING_ANY &&
-				!(c->x86_stepping & m->stepping))
+		if (m->steppings != X86_STEPPING_ANY &&
+		    !(BIT(c->x86_stepping) & m->steppings))
 			continue;
 		return m;
 	}
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 6265871a4af2..d69e094e790c 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -567,6 +567,12 @@ ssize_t __weak cpu_show_itlb_multihit(struct device *dev,
 	return sprintf(buf, "Not affected\n");
 }
 
+ssize_t __weak cpu_show_special_register_data_sampling(struct device *dev,
+						       struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "Not affected\n");
+}
+
 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
 static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
@@ -575,6 +581,7 @@ static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL);
 static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);
 static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL);
 static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL);
+static DEVICE_ATTR(special_register_data_sampling, 0444, cpu_show_special_register_data_sampling, NULL);
 
 static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_meltdown.attr,
@@ -585,6 +592,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_mds.attr,
 	&dev_attr_tsx_async_abort.attr,
 	&dev_attr_itlb_multihit.attr,
+	&dev_attr_special_register_data_sampling.attr,
 	NULL
 };
 
diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
index 340ad760a47a..b32b032edf4c 100644
--- a/include/linux/mod_devicetable.h
+++ b/include/linux/mod_devicetable.h
@@ -665,7 +665,7 @@ struct x86_cpu_id {
 	__u16 model;
 	__u16 feature;	/* bit index */
 	kernel_ulong_t driver_data;
-	__u16 stepping;
+	__u16 steppings; /* bit mask of steppings */
 };
 
 #define X86_FEATURE_MATCH(x) \
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [MODERATED] [PATCH 3/3] v4 more sampling fun 3
  2020-03-18 21:27 [MODERATED] [PATCH 0/3] v4 more sampling fun 0 mark gross
  2020-01-16 22:16 ` [MODERATED] [PATCH 2/3] v4 more sampling fun 2 mark gross
@ 2020-01-30 19:12 ` mark gross
  2020-03-17  0:56 ` [MODERATED] [PATCH 1/3] v4 more sampling fun 1 mark gross
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: mark gross @ 2020-01-30 19:12 UTC (permalink / raw)
  To: speck

From: mark gross <mgross@linux.intel.com>
Subject: [PATCH 3/3] x86/speculation: SRBDS vulnerability and mitigation
 documentation

Add documentation for the SRBDS vulnerability and its mitigation.

Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Mark Gross <mgross@linux.intel.com>
---
 Documentation/admin-guide/hw-vuln/index.rst   |   2 +
 .../special-register-buffer-data-sampling.rst | 150 ++++++++++++++++++
 2 files changed, 152 insertions(+)
 create mode 100644 Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst

diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index 0795e3c2643f..99d5b3244ef9 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -14,3 +14,5 @@ are configurable at compile, boot or run time.
    mds
    tsx_async_abort
    multihit.rst
+   special-register-buffer-data-sampling.rst
+
diff --git a/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
new file mode 100644
index 000000000000..1fae5f455eb8
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
@@ -0,0 +1,150 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+SRBDS - Special Register Buffer Data Sampling
+=============================================
+
+SRBDS is a hardware vulnerability that allows MDS :doc:`mds` techniques to
+infer values returned from special register accesses.  Special register
+accesses are accesses to off core registers.  According to Intels evaluation,
+the special register reads that have a security expectation of privacy are:
+RDRAND, RDSEED and SGX EGETKEY.
+
+When RDRAND RDSEED and EGETKEY instructions are used the data is moved to the
+core through the special register mechanism that is susceptible to MDS attacks.
+
+Affected processors
+--------------------
+Core models (desktop, mobile, Xeon-E3) that implement RDRAND and/or RDSEED and
+are vulnerable to MFBDS (Micro architectural Fill Buffer Data Sampling) variant
+of MDS (Micro architectural Data Sampling) or to the TAA (TSX Asynchronous
+Abort) when TSX is enabled,
+
+  =============  ============  ==========================
+  common name    Family_Model  Stepping
+  =============  ============  ==========================
+  Ivybridge      06_3AH        All
+
+  Haswell        06_3CH        All
+  Haswell_L      06_45H        All
+  Haswell_G      06_46H        All
+
+  Broadwell_G    06_47H        All
+  Broadwell      06_3DH        All
+
+  Skylake_L      06_4EH        All
+  Skylake        06_5EH        All
+
+  Kabylake_L     06_8EH        <=A
+  Kabylake_L     06_8EH        0xB only if TSX is enabled
+  Kabylake_L     06_8EH        0xC only if TSX is enabled
+
+  Kabylake       06_9EH        <=B
+  Kabylake       06_9EH        0xC only if TSX is enabled
+  Kabylake       06_9EH        0xD only if TSX is enabled
+  =============  ============  ==========================
+
+Related CVEs
+------------
+
+The following CVE entry is related to this SRBDS issue:
+
+    ==============  =====  =====================================
+    CVE-2020-0543   SRBDS  Speical Register Buffer Data Sampling
+    ==============  =====  =====================================
+
+Attack scenarios
+----------------
+An unprivileged user can extract returned values from RDRAND and RDSEED
+executed on another core or sibling thread using MDS techniques.
+
+
+Mitigtion mechanism
+-------------------
+Intel will release microcode updates that modify the RDRAND, RDSEED, and
+EGETKEY instructions to overwrite secret special register data in the shared
+staging buffer before the secret data can be accessed by another logical
+processor.
+
+During execution of the RDRAND, RDSEED, or EGETKEY instruction, off-core
+accesses from other logical processors will be delayed until the special
+register read is complete and the secret data in the shared staging buffer is
+overwritten.
+
+This has three effects on performance:
+
+#. RDRAND, RDSEED, or EGETKEY instruction have higher latency.
+
+#. Executing RDRAND at the same time on multiple logical processors will be
+   serialized, resulting in an overall reduction in the maximum RDRAND
+   bandwidth.
+
+#. Executing RDRAND, RDSEED or EGETKEY will delay memory accesses from other
+   logical processors that miss their core caches, with an impact similar to
+   legacy locked cache-line-split accesses.
+
+Because of the performance impact of the mitigation, the microcode updates also
+provide an opt-out mechanism (RNGDS_MITG_DIS) to disable the mitigation for
+RDRAND and RDSEED instructions executed outside of Intel Software Guard
+Extensions (Intel SGX) enclaves. On logical processors that disable the
+mitigation using this opt-out mechanism, RDRAND and RDSEED do not take longer
+to execute and do not impact performance of sibling logical processors memory
+accesses. The opt-out mechanism does not affect Intel SGX enclaves (including
+execution of RDRAND or RDSEED inside of an enclave, as well as EGETKEY
+execution).
+
+IA32_MCU_OPT_CTRL MSR Definition
+--------------------------------
+Along with the mitigation for this issue, Intel added a new thread-scope
+IA32_MCU_OPT_CTRL MSR, (address 0x123). The presence of this MSR and
+RNGDS_MITG_DIS (bit 0) is enumerated by CPUID.(EAX=07H,ECX=0).EDX[SRBDS_CTRL =
+9]==1. This MSR is introduced through the microcode update.
+
+Setting IA32_MCU_OPT_CTRL[0] (RNGDS_MITG_DIS) to 1 for a logical processor
+disables the mitigation for RDRAND and RDSEED executed outside of an Intel SGX
+enclave on that logical processor. Opting out of the mitigation for a
+particular logical processor does not affect the RDRAND and RDSEED mitigations
+for other logical processors.
+
+Note that inside of an Intel SGX enclave, the mitigation is applied regardless
+of the value of RNGDS_MITG_DS.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+The kernel command line allows control over the SRBDS mitigation at boot time
+with the option "srbds=".  The option for this is:
+
+  ============= =============================================================
+  off           This option disables SRBS mitigation for RDRAND and RDSEED on
+                affected platforms.
+  ============= =============================================================
+
+SRBS System Information
+-----------------------
+The Linux kernel provides vulnerability status information through sysfs.  For
+SRBDS this can be accessed by the following sysfs file:
+/sys/devices/system/cpu/vulnerabilities/special_register_data_sampling
+
+The possible values contained in this file are:
+
+ ============================== =============================================
+ Not affected                   Processor not vulnerable
+ Vulnerable                     Processor vulnerable and mitigation disabled
+ Vulnerable: no microcode       Processor vulnerable and microcode is missing
+                                mitigation
+ Mitigated                      Processor is vulnerable and mitigation is in
+                                effect.
+ Not affected (TSX disabled)    Processor is only vulnerable when TSX is
+                                enabled while this system was booted with TSX
+                                disabled.
+ Unknown                        Running on virtual guest processor that is
+                                affected but with no way to know if host
+                                processor is mitigated or vulnerable.
+ ============================== =============================================
+
+SRBDS Default mitigation
+------------------------
+This new microcode serializes processor access during execution of RDRAND,
+RDSEED ensures that the shared buffer is overwritten before it is released for
+reuse use the "srbds=off" kernel command line to disable the mitigation for
+RDRAND and RDSEED.
+
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [MODERATED] [PATCH 1/3] v4 more sampling fun 1
  2020-03-18 21:27 [MODERATED] [PATCH 0/3] v4 more sampling fun 0 mark gross
  2020-01-16 22:16 ` [MODERATED] [PATCH 2/3] v4 more sampling fun 2 mark gross
  2020-01-30 19:12 ` [MODERATED] [PATCH 3/3] v4 more sampling fun 3 mark gross
@ 2020-03-17  0:56 ` mark gross
       [not found] ` <5e7296c7.1c69fb81.f9a2f.00ebSMTPIN_ADDED_BROKEN@mx.google.com>
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: mark gross @ 2020-03-17  0:56 UTC (permalink / raw)
  To: speck

From: mark gross <mgross@linux.intel.com>
Subject: [PATCH 1/3] x86/cpu: Add stepping field to x86_cpu_id structure

Intel uses the same family/model for several CPUs. Sometimes
the stepping must be checked to tell them apart.

Note that to keep this patch simple the new field has been added at the
end to avoid churn with all the pre-C99 initialized uses of this
structure. Such legacy usage will result in a "0" value for the stepping
field, hence X86_STEPPING_ANY is defined as "0".

Signed-off-by: Mark Gross <mgross@.linux.intel.com>
Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kernel/cpu/match.c     | 6 +++++-
 include/linux/mod_devicetable.h | 2 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/match.c b/arch/x86/kernel/cpu/match.c
index 6dd78d8235e4..129df1a959e9 100644
--- a/arch/x86/kernel/cpu/match.c
+++ b/arch/x86/kernel/cpu/match.c
@@ -34,7 +34,8 @@ const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
 	const struct x86_cpu_id *m;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	for (m = match; m->vendor | m->family | m->model | m->feature; m++) {
+	for (m = match; m->vendor | m->family | m->model | m->feature |
+							   m->stepping; m++) {
 		if (m->vendor != X86_VENDOR_ANY && c->x86_vendor != m->vendor)
 			continue;
 		if (m->family != X86_FAMILY_ANY && c->x86 != m->family)
@@ -43,6 +44,9 @@ const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
 			continue;
 		if (m->feature != X86_FEATURE_ANY && !cpu_has(c, m->feature))
 			continue;
+		if (m->stepping != X86_STEPPING_ANY &&
+				!(c->x86_stepping & m->stepping))
+			continue;
 		return m;
 	}
 	return NULL;
diff --git a/include/linux/mod_devicetable.h b/include/linux/mod_devicetable.h
index e3596db077dc..340ad760a47a 100644
--- a/include/linux/mod_devicetable.h
+++ b/include/linux/mod_devicetable.h
@@ -665,6 +665,7 @@ struct x86_cpu_id {
 	__u16 model;
 	__u16 feature;	/* bit index */
 	kernel_ulong_t driver_data;
+	__u16 stepping;
 };
 
 #define X86_FEATURE_MATCH(x) \
@@ -674,6 +675,7 @@ struct x86_cpu_id {
 #define X86_FAMILY_ANY 0
 #define X86_MODEL_ANY  0
 #define X86_FEATURE_ANY 0	/* Same as FPU, you can't test for that */
+#define X86_STEPPING_ANY 0
 
 /*
  * Generic table type for matching CPU features.
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [MODERATED] [PATCH 0/3] v4 more sampling fun 0
@ 2020-03-18 21:27 mark gross
  2020-01-16 22:16 ` [MODERATED] [PATCH 2/3] v4 more sampling fun 2 mark gross
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: mark gross @ 2020-03-18 21:27 UTC (permalink / raw)
  To: speck

From: mark gross <mgross@linux.intel.com>
Subject: [PATCH 0/3] v4 more sampling fun

This version of the Special Register Buffer Data Sampling has been updated to
address feedback gotten.  Including:
* changes x86_cpu_id as a prerequisite patch similar to the initial version
  posted a few weeks ago.
* simplified the affected processor table
* renamed the VULNWL* macros to VULNHW*
* added an ABI note introducing the sysfs addition
* assorted clean ups and some additional inline commentary.

---

Special Register Buffer Data Sampling is a sampling type of vulnerability that
leaks data across cores sharing the HW-RNG for vulnerable processors.

This leak is fixed by a microcode update and is enabled by default.

This new microcode serializes processor access during execution of RDRAND or
RDSEED. It ensures that the shared buffer is overwritten before it is released
for reuse.

The mitigation impacts the throughput of the RDRAND and RDSEED instructions and
latency of RT processing running on the socket while executing RDRAND or
RDSEED.  The micro benchmark of calling RDRAND many times shows a slowdown.

This patch set enables kernel command line control of this mitigation and
exports vulnerability and mitigation status.
This patch set includes 3 patches:
* The first patch extends match.c and mod-devicetable.h to include steppings.
* The second patch enables the command line control of the mitigation as well as
  the sysfs export of vulnerability status.
* The third patch has the Documentation/admin-guide/hw-vuln documentation for
  this issue and the control over the mitigation.

The microcode defaults to enabling the mitigation.

mark gross (3):
  x86/cpu: Add stepping field to x86_cpu_id structure
  x86/speculation: Special Register Buffer Data Sampling (SRBDS)
    mitigation control.
  x86/speculation: SRBDS vulnerability and mitigation documentation

 .../ABI/testing/sysfs-devices-system-cpu      |   1 +
 Documentation/admin-guide/hw-vuln/index.rst   |   2 +
 .../special-register-buffer-data-sampling.rst | 150 ++++++++++++++++++
 .../admin-guide/kernel-parameters.txt         |  11 ++
 arch/x86/include/asm/cpufeatures.h            |   2 +
 arch/x86/include/asm/msr-index.h              |   4 +
 arch/x86/kernel/cpu/bugs.c                    | 112 +++++++++++++
 arch/x86/kernel/cpu/common.c                  | 128 +++++++++------
 arch/x86/kernel/cpu/cpu.h                     |  13 ++
 arch/x86/kernel/cpu/intel.c                   |   2 +
 arch/x86/kernel/cpu/match.c                   |   6 +-
 drivers/base/cpu.c                            |   8 +
 include/linux/mod_devicetable.h               |   2 +
 13 files changed, 391 insertions(+), 50 deletions(-)
 create mode 100644 Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst

-- 
2.17.1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 1/3] v4 more sampling fun 1
       [not found] ` <5e7296c7.1c69fb81.f9a2f.00ebSMTPIN_ADDED_BROKEN@mx.google.com>
@ 2020-03-19  8:50   ` Greg KH
  2020-03-19 15:40     ` mark gross
  2020-03-19 18:13     ` Thomas Gleixner
  0 siblings, 2 replies; 17+ messages in thread
From: Greg KH @ 2020-03-19  8:50 UTC (permalink / raw)
  To: speck

On Mon, Mar 16, 2020 at 05:56:27PM -0700, speck for mark gross wrote:
> From: mark gross <mgross@linux.intel.com>
> Subject: [PATCH 1/3] x86/cpu: Add stepping field to x86_cpu_id structure
> 
> Intel uses the same family/model for several CPUs. Sometimes
> the stepping must be checked to tell them apart.
> 
> Note that to keep this patch simple the new field has been added at the
> end to avoid churn with all the pre-C99 initialized uses of this
> structure. Such legacy usage will result in a "0" value for the stepping
> field, hence X86_STEPPING_ANY is defined as "0".

Are you sure about the "avoid churn" stuff?  Can't you just fix up the
X86_FEATURE_MATCH() #define to do that?

Also INTEL_CPU_FAM_ANY() already uses named identifiers, so you are ok
with that usage.  How many other places is this "open coded" that would
need to be fixed up.  Can't be more than 10-20, right?

> --- a/include/linux/mod_devicetable.h
> +++ b/include/linux/mod_devicetable.h
> @@ -665,6 +665,7 @@ struct x86_cpu_id {
>  	__u16 model;
>  	__u16 feature;	/* bit index */
>  	kernel_ulong_t driver_data;
> +	__u16 stepping;
>  };

That just makes my eyes hurt, stepping really should be before
driver_data.

The macros should be fixed up anyway to use named identifiers, no time
like the present :)

And this, and the cleanups, can be done in public, like I thought we
asked for before, right?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 1/3] v4 more sampling fun 1
  2020-03-19  8:50   ` [MODERATED] " Greg KH
@ 2020-03-19 15:40     ` mark gross
  2020-03-19 15:50       ` Luck, Tony
  2020-03-19 18:13     ` Thomas Gleixner
  1 sibling, 1 reply; 17+ messages in thread
From: mark gross @ 2020-03-19 15:40 UTC (permalink / raw)
  To: speck

On Thu, Mar 19, 2020 at 09:50:40AM +0100, speck for Greg KH wrote:
> On Mon, Mar 16, 2020 at 05:56:27PM -0700, speck for mark gross wrote:
> > From: mark gross <mgross@linux.intel.com>
> > Subject: [PATCH 1/3] x86/cpu: Add stepping field to x86_cpu_id structure
> > 
> > Intel uses the same family/model for several CPUs. Sometimes
> > the stepping must be checked to tell them apart.
> > 
> > Note that to keep this patch simple the new field has been added at the
> > end to avoid churn with all the pre-C99 initialized uses of this
> > structure. Such legacy usage will result in a "0" value for the stepping
> > field, hence X86_STEPPING_ANY is defined as "0".
> 
> Are you sure about the "avoid churn" stuff?  Can't you just fix up the
> X86_FEATURE_MATCH() #define to do that?
I wasn't confident about avoiding churn but I was hoping too.
Yes, I am pretty sure I can fix up that macro.

> Also INTEL_CPU_FAM_ANY() already uses named identifiers, so you are ok
> with that usage.  How many other places is this "open coded" that would
> need to be fixed up.  Can't be more than 10-20, right?
git grep X86_FEATURE_MATCH |wc
    15      37    1110

Right.

> 
> > --- a/include/linux/mod_devicetable.h
> > +++ b/include/linux/mod_devicetable.h
> > @@ -665,6 +665,7 @@ struct x86_cpu_id {
> >  	__u16 model;
> >  	__u16 feature;	/* bit index */
> >  	kernel_ulong_t driver_data;
> > +	__u16 stepping;
> >  };
> 
> That just makes my eyes hurt, stepping really should be before
> driver_data.
Well, it wasnt' my first choice but, it did minimize wripple.

> 
> The macros should be fixed up anyway to use named identifiers, no time
> like the present :)
Works for me.

> 
> And this, and the cleanups, can be done in public, like I thought we
> asked for before, right?

Yes, I think you and Boris both and maybe a few others did.  I wanted to
pre-flight things here as much as possible given the design level changes I've
made to date on this.  Unless I get massive pushback on this version of the
patch set I think now is an ok time to push a patch to add steppings  along
with using named fields clean up on the public list.

Thank your for the feedback.  I will post a public version of this dependency
for the SRBDS mitigation control before my next patch post to the spec list.

--mark
> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 1/3] v4 more sampling fun 1
  2020-03-19 15:40     ` mark gross
@ 2020-03-19 15:50       ` Luck, Tony
  2020-03-19 16:34         ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Luck, Tony @ 2020-03-19 15:50 UTC (permalink / raw)
  To: speck

On Thu, Mar 19, 2020 at 08:40:46AM -0700, speck for mark gross wrote:
> On Thu, Mar 19, 2020 at 09:50:40AM +0100, speck for Greg KH wrote:
> > Are you sure about the "avoid churn" stuff?  Can't you just fix up the
> > X86_FEATURE_MATCH() #define to do that?
> I wasn't confident about avoiding churn but I was hoping too.
> Yes, I am pretty sure I can fix up that macro.

My fault ... I thought it might be easier on the backport to
all the stable and long term kernels if we kept this small at
this stage.  If upstream wants to cleanup all the users in a
subsequent patch, then we can do that too.

Alternatively there is Thomas' suggestion to encode the
range of steppings in the upper bits of driver_data and
avoid touching this structure altogether.

-Tony

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 1/3] v4 more sampling fun 1
  2020-03-19 15:50       ` Luck, Tony
@ 2020-03-19 16:34         ` Greg KH
  0 siblings, 0 replies; 17+ messages in thread
From: Greg KH @ 2020-03-19 16:34 UTC (permalink / raw)
  To: speck

On Thu, Mar 19, 2020 at 08:50:28AM -0700, speck for Luck, Tony wrote:
> On Thu, Mar 19, 2020 at 08:40:46AM -0700, speck for mark gross wrote:
> > On Thu, Mar 19, 2020 at 09:50:40AM +0100, speck for Greg KH wrote:
> > > Are you sure about the "avoid churn" stuff?  Can't you just fix up the
> > > X86_FEATURE_MATCH() #define to do that?
> > I wasn't confident about avoiding churn but I was hoping too.
> > Yes, I am pretty sure I can fix up that macro.
> 
> My fault ... I thought it might be easier on the backport to
> all the stable and long term kernels if we kept this small at
> this stage.  If upstream wants to cleanup all the users in a
> subsequent patch, then we can do that too.

Don't worry about stable/longterm, do the right thing first.  We can
worry about backports later.

> Alternatively there is Thomas' suggestion to encode the
> range of steppings in the upper bits of driver_data and
> avoid touching this structure altogether.

Ick, ick ick...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/3] v4 more sampling fun 1
  2020-03-19  8:50   ` [MODERATED] " Greg KH
  2020-03-19 15:40     ` mark gross
@ 2020-03-19 18:13     ` Thomas Gleixner
  1 sibling, 0 replies; 17+ messages in thread
From: Thomas Gleixner @ 2020-03-19 18:13 UTC (permalink / raw)
  To: speck

speck for Greg KH <speck@linutronix.de> writes:
> On Mon, Mar 16, 2020 at 05:56:27PM -0700, speck for mark gross wrote:
>> From: mark gross <mgross@linux.intel.com>
>> Subject: [PATCH 1/3] x86/cpu: Add stepping field to x86_cpu_id structure
>> 
>> Intel uses the same family/model for several CPUs. Sometimes
>> the stepping must be checked to tell them apart.
>> 
>> Note that to keep this patch simple the new field has been added at the
>> end to avoid churn with all the pre-C99 initialized uses of this
>> structure. Such legacy usage will result in a "0" value for the stepping
>> field, hence X86_STEPPING_ANY is defined as "0".
>
> Are you sure about the "avoid churn" stuff?  Can't you just fix up the
> X86_FEATURE_MATCH() #define to do that?
>
> Also INTEL_CPU_FAM_ANY() already uses named identifiers, so you are ok
> with that usage.  How many other places is this "open coded" that would
> need to be fixed up.  Can't be more than 10-20, right?
>
>> --- a/include/linux/mod_devicetable.h
>> +++ b/include/linux/mod_devicetable.h
>> @@ -665,6 +665,7 @@ struct x86_cpu_id {
>>  	__u16 model;
>>  	__u16 feature;	/* bit index */
>>  	kernel_ulong_t driver_data;
>> +	__u16 stepping;
>>  };
>
> That just makes my eyes hurt, stepping really should be before
> driver_data.
>
> The macros should be fixed up anyway to use named identifiers, no time
> like the present :)
>
> And this, and the cleanups, can be done in public, like I thought we
> asked for before, right?

I have played with this already. That's a pretty amazing amount of
creative macro games all over the place and about 55+ files to cleanup.

Let me dust off the patches a bit and send them as reference to LKML.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 2/3] v4 more sampling fun 2
  2020-03-18 21:27 [MODERATED] [PATCH 0/3] v4 more sampling fun 0 mark gross
                   ` (3 preceding siblings ...)
       [not found] ` <5e7296c7.1c69fb81.f9a2f.00ebSMTPIN_ADDED_BROKEN@mx.google.com>
@ 2020-03-26  3:19 ` Josh Poimboeuf
  2020-03-27 16:20   ` mark gross
  2020-03-26  3:25 ` [MODERATED] Re: [PATCH 3/3] v4 more sampling fun 3 Josh Poimboeuf
  5 siblings, 1 reply; 17+ messages in thread
From: Josh Poimboeuf @ 2020-03-26  3:19 UTC (permalink / raw)
  To: speck

On Thu, Jan 16, 2020 at 02:16:07PM -0800, speck for mark gross wrote:
> From: mark gross <mgross@linux.intel.com>
> Subject: [PATCH 2/3] x86/speculation: Special Register Buffer Data Sampling
>  (SRBDS) mitigation control.
> 
> SRBDS is an MDS-like speculative side channel that can leak bits from
> the RNG across cores and threads. New microcode serializes the processor
> access during the execution of RDRAND and RDSEED.  This ensures that the
> shared buffer is overwritten before it is released for reuse.
> 
> While it is present on all affected CPU models, the microcode mitigation
> is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
> cases where TSX is not supported or has been disabled with TSX_CTRL.
> 
> The mitigation is activated by default on affected processors and it
> increases latency for RDRAND and RDSEED instructions.  Among other
> effects this will reduce throughput from /dev/urandom.
> 
> * enable administrator to configure the mitigation off when desired
>   using either mitigations=off or srbds=off.
> * export vulnerability status via sysfs
> * rename file scoped macros to apply for non-whitelist table
>   initializations.
> 
> Signed-off-by: Mark Gross <mgross@linux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>

Sorry for being a little late to the review.

> ---
>  .../ABI/testing/sysfs-devices-system-cpu      |   1 +
>  .../admin-guide/kernel-parameters.txt         |  11 ++
>  arch/x86/include/asm/cpufeatures.h            |   2 +
>  arch/x86/include/asm/msr-index.h              |   4 +
>  arch/x86/kernel/cpu/bugs.c                    | 112 +++++++++++++++
>  arch/x86/kernel/cpu/common.c                  | 128 +++++++++++-------
>  arch/x86/kernel/cpu/cpu.h                     |  13 ++
>  arch/x86/kernel/cpu/intel.c                   |   2 +
>  arch/x86/kernel/cpu/match.c                   |   6 +-
>  drivers/base/cpu.c                            |   8 ++
>  include/linux/mod_devicetable.h               |   2 +-
>  11 files changed, 236 insertions(+), 53 deletions(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index 2e0e3b45d02a..a890b3769335 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -492,6 +492,7 @@ What:		/sys/devices/system/cpu/vulnerabilities
>  		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
>  		/sys/devices/system/cpu/vulnerabilities/l1tf
>  		/sys/devices/system/cpu/vulnerabilities/mds
> +		/sys/devices/system/cpu/vulnerabilities/special_register_data_sampling

That's not the actual name of the vulnerability.  The file should match
the actual name (or my personal preference, the "srbds" abbreviation).

>  		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
>  		/sys/devices/system/cpu/vulnerabilities/itlb_multihit
>  Date:		January 2018
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index c07815d230bc..71cf7e7eae45 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4659,6 +4659,17 @@
>  	spia_pedr=
>  	spia_peddr=
>  
> +	srbds=		[x86]

			[X86,INTEL]

(note the capital X for consistency with other entries and
kernel-parameters.rst)

> +			Special Register Buffer Data Sampling mitigation
> +			control.

			Control mitigation for the Special Register
			Buffer Data Sampling (SRBDS) mitigation.

> +
> +			On CPUs vulnerable to this issue and users impacted by
> +			the mitigation slowing down RDRAND and RDSEED can
> +			disable by setting this:

-ENOPARSE

How about:

			Certain CPUs are vulnerable to an MDS-like
			exploit which can leak bits from the random
			number generator.

			By default, this issue is mitigated by
			microcode.  However, the microcode fix can cause
			the RDRAND and RDSEED instructions to become
			much slower.  Among other effects, this will
			result in reduced throughput from /dev/urandom.

			The microcode mitigation can be disabled with
			the following option:

			off:	Disable mitigation and remove
				performance impact to RDRAND and RDSEED

> +
> +			off:	disable mitigation and remove performance
> +				impact to rdrand and rdseed
> +
>  	srcutree.counter_wrap_check [KNL]
>  			Specifies how frequently to check for
>  			grace-period sequence counter wrap for the
> @@ -397,6 +399,103 @@ static int __init tsx_async_abort_parse_cmdline(char *str)
>  }
>  early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
>  
> +#undef pr_fmt
> +#define pr_fmt(fmt)	"SRBDS: " fmt
> +
> +enum srbds_mitigations srbds_mitigation __ro_after_init = SRBDS_MITIGATION_FULL;
> +static const char * const srbds_strings[] = {
> +	[SRBDS_NOT_AFFECTED]		= "Not affected",

I don't think the SRBDS_NOT_AFFECTED enum is needed.

The "Not affected" string is already printed by cpu_show_common().

And srbds_configure_mitigation() can just check for
boot_cpu_has_bug(X86_BUG_SRBDS) instead of this enum value.

> +	[SRBDS_MITIGATION_OFF]		= "Vulnerable",
> +	[SRBDS_MITIGATION_UCODE_NEEDED]	= "Vulnerable: no microcode",
> +	[SRBDS_MITIGATION_FULL]		= "Mitigated",
> +	[SRBDS_NOT_AFFECTED_TSX_OFF]	= "Not affected (TSX disabled)",
> +	[SRBDS_HYPERVISOR]		= "Unknown",
> +};
> +
> +static bool srbds_off;
> +
> +void srbds_configure_mitigation(void)

To me, "configure" sounds like it could be a software thing.

How about update_srbds_msr(), which is also more consistent with some of
the other naming in the file.

> +static void __init srbds_select_mitigation(void)
> +{
> +	u64 ia32_cap;
> +
> +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> +		srbds_mitigation = SRBDS_HYPERVISOR;
> +		return;
> +	}

It's confusing that this comes before the X86_BUG_SRBDS check.  Is that
on purpose?

If so, I presume it doesn't have the intended effect, because if booting
in a guest on a non-vulnerable CPU, srbds_mitigation will get set to
SRBDS_HYPERVISOR, but the shown state will still be "Not affected"
because of how cpu_show_common() checks for the bug at the beginning.

> +
> +	if (!boot_cpu_has_bug(X86_BUG_SRBDS)) {
> +		srbds_mitigation = SRBDS_NOT_AFFECTED;
> +		return;
> +	}
> +
> +	if (!boot_cpu_has(X86_FEATURE_SRBDS_CTRL)) {
> +		srbds_mitigation = SRBDS_MITIGATION_UCODE_NEEDED;
> +		return;
> +	}
> +
> +	if (boot_cpu_has_bug(X86_BUG_SRBDS)) {
> +		srbds_mitigation = SRBDS_MITIGATION_FULL;
> +		/*
> +		 * Check to see if this is one of the MDS_NO systems supporting
> +		 * TSX that are only exposed to SRBDS when TSX is enabled.
> +		 */
> +		ia32_cap = x86_read_arch_cap_msr();
> +		if (ia32_cap & ARCH_CAP_MDS_NO) {
> +			if (!boot_cpu_has(X86_FEATURE_RTM))
> +				srbds_mitigation = SRBDS_NOT_AFFECTED_TSX_OFF;
> +		}

I wonder if this would be equivalent?

		if (!boot_cpu_has_bug(X86_BUG_MDS) && boot_cpu_has(X86_BUG_TAA) &&
		    !boot_cpu_has(X86_FEATURE_RTM))
		    	srbds_mitigation = SRBDS_NOT_AFFECTED_TSX_OFF;

> +	}
> +
> +	if (cpu_mitigations_off() || srbds_off) {
> +		if (srbds_mitigation != SRBDS_NOT_AFFECTED_TSX_OFF)
> +			srbds_mitigation = SRBDS_MITIGATION_OFF;
> +	}
> +
> +	srbds_configure_mitigation();
> +}

The other mitigations do a printk with the mitigation status here.
Should we not do the same here?

> +
> +static int __init srbds_parse_cmdline(char *str)
> +{
> +	if (!str)
> +		return -EINVAL;
> +
> +	if (!strcmp(str, "off"))
> +		srbds_off = true;
> +
> +	return 0;
> +}
> +early_param("srbds", srbds_parse_cmdline);
> +
>  #undef pr_fmt
>  #define pr_fmt(fmt)     "Spectre V1 : " fmt
>  
> @@ -1528,6 +1627,11 @@ static char *ibpb_state(void)
>  	return "";
>  }
>  
> +static ssize_t srbds_show_state(char *buf)
> +{
> +	return sprintf(buf, "%s\n", srbds_strings[srbds_mitigation]);
> +}
> +
>  static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
>  			       char *buf, unsigned int bug)
>  {
> @@ -1572,6 +1676,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
>  	case X86_BUG_ITLB_MULTIHIT:
>  		return itlb_multihit_show_state(buf);
>  
> +	case X86_BUG_SRBDS:
> +		return srbds_show_state(buf);
> +
>  	default:
>  		break;
>  	}
> @@ -1618,4 +1725,9 @@ ssize_t cpu_show_itlb_multihit(struct device *dev, struct device_attribute *attr
>  {
>  	return cpu_show_common(dev, attr, buf, X86_BUG_ITLB_MULTIHIT);
>  }
> +
> +ssize_t cpu_show_special_register_data_sampling(struct device *dev, struct device_attribute *attr, char *buf)
> +{

This function name can be renamed to match whatever the filename turns
out to be:

  cpu_show_srbds()

or

  cpu_show_special_register_data_buffer_sampling()

> +	return cpu_show_common(dev, attr, buf, X86_BUG_SRBDS);
> +}
>  #endif
> --- a/arch/x86/kernel/cpu/cpu.h
> +++ b/arch/x86/kernel/cpu/cpu.h
> @@ -44,7 +44,20 @@ struct _tlb_table {
>  extern const struct cpu_dev *const __x86_cpu_dev_start[],
>  			    *const __x86_cpu_dev_end[];
>  
> +enum srbds_mitigations {
> +	SRBDS_NOT_AFFECTED,
> +	SRBDS_MITIGATION_OFF,
> +	SRBDS_MITIGATION_UCODE_NEEDED,
> +	SRBDS_MITIGATION_FULL,
> +	SRBDS_NOT_AFFECTED_TSX_OFF,
> +	SRBDS_HYPERVISOR,
> +};
> +
> +extern __ro_after_init enum srbds_mitigations srbds_mitigation;

srbds_mitigation isn't used outside of bugs.c, so it can just be a
static variable and the enum definition can live there as well.

> +void srbds_configure_mitigation(void);
> +
>  #ifdef CONFIG_CPU_SUP_INTEL
> +
>  enum tsx_ctrl_states {
>  	TSX_CTRL_ENABLE,
>  	TSX_CTRL_DISABLE,
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index be82cd5841c3..1b083a2a415b 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -684,6 +684,8 @@ static void init_intel(struct cpuinfo_x86 *c)
>  		tsx_enable();
>  	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
>  		tsx_disable();
> +
> +	srbds_configure_mitigation();
>  }

I'm not sure this is the right place to call this.  On the boot CPU,
this gets called *before* the mitigation is selected, and then again
from srbds_select_mitigation() itself.

Maybe it should instead be called from identify_secondary_cpu(), while
keeping the existing call from srbds_select_mitigation() so the boot CPU
also gets updated.

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 3/3] v4 more sampling fun 3
  2020-03-18 21:27 [MODERATED] [PATCH 0/3] v4 more sampling fun 0 mark gross
                   ` (4 preceding siblings ...)
  2020-03-26  3:19 ` [MODERATED] Re: [PATCH 2/3] v4 more sampling fun 2 Josh Poimboeuf
@ 2020-03-26  3:25 ` Josh Poimboeuf
  2020-03-27 16:28   ` mark gross
  5 siblings, 1 reply; 17+ messages in thread
From: Josh Poimboeuf @ 2020-03-26  3:25 UTC (permalink / raw)
  To: speck

On Thu, Jan 30, 2020 at 11:12:02AM -0800, speck for mark gross wrote:
> From: mark gross <mgross@linux.intel.com>
> Subject: [PATCH 3/3] x86/speculation: SRBDS vulnerability and mitigation
>  documentation
> 
> Add documentation for the SRBDS vulnerability and its mitigation.
> 
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Mark Gross <mgross@linux.intel.com>
> ---
>  Documentation/admin-guide/hw-vuln/index.rst   |   2 +
>  .../special-register-buffer-data-sampling.rst | 150 ++++++++++++++++++
>  2 files changed, 152 insertions(+)
>  create mode 100644 Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
> 
> diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
> index 0795e3c2643f..99d5b3244ef9 100644
> --- a/Documentation/admin-guide/hw-vuln/index.rst
> +++ b/Documentation/admin-guide/hw-vuln/index.rst
> @@ -14,3 +14,5 @@ are configurable at compile, boot or run time.
>     mds
>     tsx_async_abort
>     multihit.rst
> +   special-register-buffer-data-sampling.rst
> +
> diff --git a/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
> new file mode 100644
> index 000000000000..1fae5f455eb8
> --- /dev/null
> +++ b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
> @@ -0,0 +1,150 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +SRBDS - Special Register Buffer Data Sampling
> +=============================================
> +
> +SRBDS is a hardware vulnerability that allows MDS :doc:`mds` techniques to
> +infer values returned from special register accesses.  Special register
> +accesses are accesses to off core registers.  According to Intels evaluation,

Intel's

> +the special register reads that have a security expectation of privacy are:
> +RDRAND, RDSEED and SGX EGETKEY.
> +
> +When RDRAND RDSEED and EGETKEY instructions are used the data is moved to the
> +core through the special register mechanism that is susceptible to MDS attacks.
> +
> +Affected processors
> +--------------------
> +Core models (desktop, mobile, Xeon-E3) that implement RDRAND and/or RDSEED and
> +are vulnerable to MFBDS (Micro architectural Fill Buffer Data Sampling) variant
> +of MDS (Micro architectural Data Sampling) or to the TAA (TSX Asynchronous
> +Abort) when TSX is enabled,
> +
> +  =============  ============  ==========================
> +  common name    Family_Model  Stepping
> +  =============  ============  ==========================
> +  Ivybridge      06_3AH        All
> +
> +  Haswell        06_3CH        All
> +  Haswell_L      06_45H        All
> +  Haswell_G      06_46H        All
> +
> +  Broadwell_G    06_47H        All
> +  Broadwell      06_3DH        All
> +
> +  Skylake_L      06_4EH        All
> +  Skylake        06_5EH        All
> +
> +  Kabylake_L     06_8EH        <=A
> +  Kabylake_L     06_8EH        0xB only if TSX is enabled
> +  Kabylake_L     06_8EH        0xC only if TSX is enabled
> +
> +  Kabylake       06_9EH        <=B
> +  Kabylake       06_9EH        0xC only if TSX is enabled
> +  Kabylake       06_9EH        0xD only if TSX is enabled
> +  =============  ============  ==========================
> +
> +Related CVEs
> +------------
> +
> +The following CVE entry is related to this SRBDS issue:
> +
> +    ==============  =====  =====================================
> +    CVE-2020-0543   SRBDS  Speical Register Buffer Data Sampling
> +    ==============  =====  =====================================
> +
> +Attack scenarios
> +----------------
> +An unprivileged user can extract returned values from RDRAND and RDSEED
> +executed on another core or sibling thread using MDS techniques.
> +
> +
> +Mitigtion mechanism

Mitigation

> +-------------------
> +Intel will release microcode updates that modify the RDRAND, RDSEED, and
> +EGETKEY instructions to overwrite secret special register data in the shared
> +staging buffer before the secret data can be accessed by another logical
> +processor.
> +
> +During execution of the RDRAND, RDSEED, or EGETKEY instruction, off-core
> +accesses from other logical processors will be delayed until the special
> +register read is complete and the secret data in the shared staging buffer is
> +overwritten.
> +
> +This has three effects on performance:
> +
> +#. RDRAND, RDSEED, or EGETKEY instruction have higher latency.
> +
> +#. Executing RDRAND at the same time on multiple logical processors will be
> +   serialized, resulting in an overall reduction in the maximum RDRAND
> +   bandwidth.
> +
> +#. Executing RDRAND, RDSEED or EGETKEY will delay memory accesses from other
> +   logical processors that miss their core caches, with an impact similar to
> +   legacy locked cache-line-split accesses.
> +
> +Because of the performance impact of the mitigation, the microcode updates also
> +provide an opt-out mechanism (RNGDS_MITG_DIS) to disable the mitigation for
> +RDRAND and RDSEED instructions executed outside of Intel Software Guard
> +Extensions (Intel SGX) enclaves. On logical processors that disable the
> +mitigation using this opt-out mechanism, RDRAND and RDSEED do not take longer
> +to execute and do not impact performance of sibling logical processors memory
> +accesses. The opt-out mechanism does not affect Intel SGX enclaves (including
> +execution of RDRAND or RDSEED inside of an enclave, as well as EGETKEY
> +execution).
> +
> +IA32_MCU_OPT_CTRL MSR Definition
> +--------------------------------
> +Along with the mitigation for this issue, Intel added a new thread-scope
> +IA32_MCU_OPT_CTRL MSR, (address 0x123). The presence of this MSR and
> +RNGDS_MITG_DIS (bit 0) is enumerated by CPUID.(EAX=07H,ECX=0).EDX[SRBDS_CTRL =
> +9]==1. This MSR is introduced through the microcode update.
> +
> +Setting IA32_MCU_OPT_CTRL[0] (RNGDS_MITG_DIS) to 1 for a logical processor
> +disables the mitigation for RDRAND and RDSEED executed outside of an Intel SGX
> +enclave on that logical processor. Opting out of the mitigation for a
> +particular logical processor does not affect the RDRAND and RDSEED mitigations
> +for other logical processors.
> +
> +Note that inside of an Intel SGX enclave, the mitigation is applied regardless
> +of the value of RNGDS_MITG_DS.
> +
> +Mitigation control on the kernel command line
> +---------------------------------------------
> +The kernel command line allows control over the SRBDS mitigation at boot time
> +with the option "srbds=".  The option for this is:
> +
> +  ============= =============================================================
> +  off           This option disables SRBS mitigation for RDRAND and RDSEED on
> +                affected platforms.
> +  ============= =============================================================
> +
> +SRBS System Information

SRBDS

> +-----------------------
> +The Linux kernel provides vulnerability status information through sysfs.  For
> +SRBDS this can be accessed by the following sysfs file:
> +/sys/devices/system/cpu/vulnerabilities/special_register_data_sampling

should be the full name, or "srbds" as I said in my review for patch 2

> +
> +The possible values contained in this file are:
> +
> + ============================== =============================================
> + Not affected                   Processor not vulnerable
> + Vulnerable                     Processor vulnerable and mitigation disabled
> + Vulnerable: no microcode       Processor vulnerable and microcode is missing
> +                                mitigation
> + Mitigated                      Processor is vulnerable and mitigation is in
> +                                effect.
> + Not affected (TSX disabled)    Processor is only vulnerable when TSX is
> +                                enabled while this system was booted with TSX
> +                                disabled.
> + Unknown                        Running on virtual guest processor that is
> +                                affected but with no way to know if host
> +                                processor is mitigated or vulnerable.
> + ============================== =============================================
> +
> +SRBDS Default mitigation
> +------------------------
> +This new microcode serializes processor access during execution of RDRAND,
> +RDSEED ensures that the shared buffer is overwritten before it is released for
> +reuse use the "srbds=off" kernel command line to disable the mitigation for
> +RDRAND and RDSEED.

Run-on sentence...

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 2/3] v4 more sampling fun 2
  2020-03-26  3:19 ` [MODERATED] Re: [PATCH 2/3] v4 more sampling fun 2 Josh Poimboeuf
@ 2020-03-27 16:20   ` mark gross
  2020-03-27 17:23     ` Luck, Tony
  2020-03-27 17:37     ` Josh Poimboeuf
  0 siblings, 2 replies; 17+ messages in thread
From: mark gross @ 2020-03-27 16:20 UTC (permalink / raw)
  To: speck

On Wed, Mar 25, 2020 at 10:19:23PM -0500, speck for Josh Poimboeuf wrote:
> On Thu, Jan 16, 2020 at 02:16:07PM -0800, speck for mark gross wrote:
> > From: mark gross <mgross@linux.intel.com>
> > Subject: [PATCH 2/3] x86/speculation: Special Register Buffer Data Sampling
> >  (SRBDS) mitigation control.
> > 
> > SRBDS is an MDS-like speculative side channel that can leak bits from
> > the RNG across cores and threads. New microcode serializes the processor
> > access during the execution of RDRAND and RDSEED.  This ensures that the
> > shared buffer is overwritten before it is released for reuse.
> > 
> > While it is present on all affected CPU models, the microcode mitigation
> > is not needed on models that enumerate ARCH_CAPABILITIES[MDS_NO] in the
> > cases where TSX is not supported or has been disabled with TSX_CTRL.
> > 
> > The mitigation is activated by default on affected processors and it
> > increases latency for RDRAND and RDSEED instructions.  Among other
> > effects this will reduce throughput from /dev/urandom.
> > 
> > * enable administrator to configure the mitigation off when desired
> >   using either mitigations=off or srbds=off.
> > * export vulnerability status via sysfs
> > * rename file scoped macros to apply for non-whitelist table
> >   initializations.
> > 
> > Signed-off-by: Mark Gross <mgross@linux.intel.com>
> > Reviewed-by: Tony Luck <tony.luck@intel.com>
> > Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> > Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
> 
> Sorry for being a little late to the review.
Thanks for the reivew!

> 
> > ---
> >  .../ABI/testing/sysfs-devices-system-cpu      |   1 +
> >  .../admin-guide/kernel-parameters.txt         |  11 ++
> >  arch/x86/include/asm/cpufeatures.h            |   2 +
> >  arch/x86/include/asm/msr-index.h              |   4 +
> >  arch/x86/kernel/cpu/bugs.c                    | 112 +++++++++++++++
> >  arch/x86/kernel/cpu/common.c                  | 128 +++++++++++-------
> >  arch/x86/kernel/cpu/cpu.h                     |  13 ++
> >  arch/x86/kernel/cpu/intel.c                   |   2 +
> >  arch/x86/kernel/cpu/match.c                   |   6 +-
> >  drivers/base/cpu.c                            |   8 ++
> >  include/linux/mod_devicetable.h               |   2 +-
> >  11 files changed, 236 insertions(+), 53 deletions(-)
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> > index 2e0e3b45d02a..a890b3769335 100644
> > --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> > +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> > @@ -492,6 +492,7 @@ What:		/sys/devices/system/cpu/vulnerabilities
> >  		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
> >  		/sys/devices/system/cpu/vulnerabilities/l1tf
> >  		/sys/devices/system/cpu/vulnerabilities/mds
> > +		/sys/devices/system/cpu/vulnerabilities/special_register_data_sampling
> 
> That's not the actual name of the vulnerability.  The file should match
> the actual name (or my personal preference, the "srbds" abbreviation).
> 
> >  		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
> >  		/sys/devices/system/cpu/vulnerabilities/itlb_multihit
> >  Date:		January 2018
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index c07815d230bc..71cf7e7eae45 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -4659,6 +4659,17 @@
> >  	spia_pedr=
> >  	spia_peddr=
> >  
> > +	srbds=		[x86]
> 
> 			[X86,INTEL]
> 
> (note the capital X for consistency with other entries and
> kernel-parameters.rst)
> 
> > +			Special Register Buffer Data Sampling mitigation
> > +			control.
> 
> 			Control mitigation for the Special Register
> 			Buffer Data Sampling (SRBDS) mitigation.
> 
> > +
> > +			On CPUs vulnerable to this issue and users impacted by
> > +			the mitigation slowing down RDRAND and RDSEED can
> > +			disable by setting this:
> 
> -ENOPARSE
> 
> How about:
> 
> 			Certain CPUs are vulnerable to an MDS-like
> 			exploit which can leak bits from the random
> 			number generator.
> 
> 			By default, this issue is mitigated by
> 			microcode.  However, the microcode fix can cause
> 			the RDRAND and RDSEED instructions to become
> 			much slower.  Among other effects, this will
> 			result in reduced throughput from /dev/urandom.
> 
> 			The microcode mitigation can be disabled with
> 			the following option:
> 
> 			off:	Disable mitigation and remove
> 				performance impact to RDRAND and RDSEED
> 
Looks like fine update to include in the next version to me, I'll use your
version.

> > +
> > +			off:	disable mitigation and remove performance
> > +				impact to rdrand and rdseed
> > +
> >  	srcutree.counter_wrap_check [KNL]
> >  			Specifies how frequently to check for
> >  			grace-period sequence counter wrap for the
> > @@ -397,6 +399,103 @@ static int __init tsx_async_abort_parse_cmdline(char *str)
> >  }
> >  early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
> >  
> > +#undef pr_fmt
> > +#define pr_fmt(fmt)	"SRBDS: " fmt
> > +
> > +enum srbds_mitigations srbds_mitigation __ro_after_init = SRBDS_MITIGATION_FULL;
> > +static const char * const srbds_strings[] = {
> > +	[SRBDS_NOT_AFFECTED]		= "Not affected",
> 
> I don't think the SRBDS_NOT_AFFECTED enum is needed.
> 
> The "Not affected" string is already printed by cpu_show_common().
> 
> And srbds_configure_mitigation() can just check for
> boot_cpu_has_bug(X86_BUG_SRBDS) instead of this enum value.

True.  I prefer having the complete state machine represented by the
enumeration but, if you feel strongly I'll try to remove it.

> 
> > +	[SRBDS_MITIGATION_OFF]		= "Vulnerable",
> > +	[SRBDS_MITIGATION_UCODE_NEEDED]	= "Vulnerable: no microcode",
> > +	[SRBDS_MITIGATION_FULL]		= "Mitigated",
> > +	[SRBDS_NOT_AFFECTED_TSX_OFF]	= "Not affected (TSX disabled)",
> > +	[SRBDS_HYPERVISOR]		= "Unknown",
> > +};
> > +
> > +static bool srbds_off;
> > +
> > +void srbds_configure_mitigation(void)
> 
> To me, "configure" sounds like it could be a software thing.
We are configuring the meditation by honoring the kernel command line in this
function.

> 
> How about update_srbds_msr(), which is also more consistent with some of
> the other naming in the file.

If you like but, I was mimicking the TAA implementation when choosing the name
of this function.

> 
> > +static void __init srbds_select_mitigation(void)
> > +{
> > +	u64 ia32_cap;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> > +		srbds_mitigation = SRBDS_HYPERVISOR;
> > +		return;
> > +	}
> 
> It's confusing that this comes before the X86_BUG_SRBDS check.  Is that
> on purpose?
VM vendors on keybase wanted it so.

> 
> If so, I presume it doesn't have the intended effect, because if booting
> in a guest on a non-vulnerable CPU, srbds_mitigation will get set to
> SRBDS_HYPERVISOR, but the shown state will still be "Not affected"
> because of how cpu_show_common() checks for the bug at the beginning.

The intended effect is to report "Unknown" for the vulnerability status of
SRBDS for all guest VM's running this code and to never touch the MSR
controlling the mitigation.

I think this is what happens.  I'll double check and confirm with testing.

> > +
> > +	if (!boot_cpu_has_bug(X86_BUG_SRBDS)) {
> > +		srbds_mitigation = SRBDS_NOT_AFFECTED;
> > +		return;
> > +	}
> > +
> > +	if (!boot_cpu_has(X86_FEATURE_SRBDS_CTRL)) {
> > +		srbds_mitigation = SRBDS_MITIGATION_UCODE_NEEDED;
> > +		return;
> > +	}
> > +
> > +	if (boot_cpu_has_bug(X86_BUG_SRBDS)) {
> > +		srbds_mitigation = SRBDS_MITIGATION_FULL;
> > +		/*
> > +		 * Check to see if this is one of the MDS_NO systems supporting
> > +		 * TSX that are only exposed to SRBDS when TSX is enabled.
> > +		 */
> > +		ia32_cap = x86_read_arch_cap_msr();
> > +		if (ia32_cap & ARCH_CAP_MDS_NO) {
> > +			if (!boot_cpu_has(X86_FEATURE_RTM))
> > +				srbds_mitigation = SRBDS_NOT_AFFECTED_TSX_OFF;
> > +		}
> 
> I wonder if this would be equivalent?
> 
> 		if (!boot_cpu_has_bug(X86_BUG_MDS) && boot_cpu_has(X86_BUG_TAA) &&
> 		    !boot_cpu_has(X86_FEATURE_RTM))
> 		    	srbds_mitigation = SRBDS_NOT_AFFECTED_TSX_OFF;
I have no clue if its equivalent but, I find your version much harder to parse
and my version matches more closely to what the white paper states.

> 
> > +	}
> > +
> > +	if (cpu_mitigations_off() || srbds_off) {
> > +		if (srbds_mitigation != SRBDS_NOT_AFFECTED_TSX_OFF)
> > +			srbds_mitigation = SRBDS_MITIGATION_OFF;
> > +	}
> > +
> > +	srbds_configure_mitigation();
> > +}
> 
> The other mitigations do a printk with the mitigation status here.
> Should we not do the same here?
Perhaps but, I was attempting to limit the extra chatter in the kernel logs.
If you feel it adds value to the code I'll put it in.  FWIW I think the chatter
from the other places should be removed.

> 
> > +
> > +static int __init srbds_parse_cmdline(char *str)
> > +{
> > +	if (!str)
> > +		return -EINVAL;
> > +
> > +	if (!strcmp(str, "off"))
> > +		srbds_off = true;
> > +
> > +	return 0;
> > +}
> > +early_param("srbds", srbds_parse_cmdline);
> > +
> >  #undef pr_fmt
> >  #define pr_fmt(fmt)     "Spectre V1 : " fmt
> >  
> > @@ -1528,6 +1627,11 @@ static char *ibpb_state(void)
> >  	return "";
> >  }
> >  
> > +static ssize_t srbds_show_state(char *buf)
> > +{
> > +	return sprintf(buf, "%s\n", srbds_strings[srbds_mitigation]);
> > +}
> > +
> >  static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
> >  			       char *buf, unsigned int bug)
> >  {
> > @@ -1572,6 +1676,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
> >  	case X86_BUG_ITLB_MULTIHIT:
> >  		return itlb_multihit_show_state(buf);
> >  
> > +	case X86_BUG_SRBDS:
> > +		return srbds_show_state(buf);
> > +
> >  	default:
> >  		break;
> >  	}
> > @@ -1618,4 +1725,9 @@ ssize_t cpu_show_itlb_multihit(struct device *dev, struct device_attribute *attr
> >  {
> >  	return cpu_show_common(dev, attr, buf, X86_BUG_ITLB_MULTIHIT);
> >  }
> > +
> > +ssize_t cpu_show_special_register_data_sampling(struct device *dev, struct device_attribute *attr, char *buf)
> > +{
> 
> This function name can be renamed to match whatever the filename turns
> out to be:
> 
>   cpu_show_srbds()
> 
> or
> 
>   cpu_show_special_register_data_buffer_sampling()

I got some early review feedback to not use the shorter name "srbds" for the
sysfs inodes so I was trying to avoid the full spelled out the full names.

"srbds" and "special_register_data_buffer_sampling" do not imply HWRNG security
risk so IMO I'd like to just use srbds.  What do others think?


> 
> > +	return cpu_show_common(dev, attr, buf, X86_BUG_SRBDS);
> > +}
> >  #endif
> > --- a/arch/x86/kernel/cpu/cpu.h
> > +++ b/arch/x86/kernel/cpu/cpu.h
> > @@ -44,7 +44,20 @@ struct _tlb_table {
> >  extern const struct cpu_dev *const __x86_cpu_dev_start[],
> >  			    *const __x86_cpu_dev_end[];
> >  
> > +enum srbds_mitigations {
> > +	SRBDS_NOT_AFFECTED,
> > +	SRBDS_MITIGATION_OFF,
> > +	SRBDS_MITIGATION_UCODE_NEEDED,
> > +	SRBDS_MITIGATION_FULL,
> > +	SRBDS_NOT_AFFECTED_TSX_OFF,
> > +	SRBDS_HYPERVISOR,
> > +};
> > +
> > +extern __ro_after_init enum srbds_mitigations srbds_mitigation;
> 
> srbds_mitigation isn't used outside of bugs.c, so it can just be a
> static variable and the enum definition can live there as well.

True, I'll remove the extern declaration its a left over from an earlier
implementation.


> > +void srbds_configure_mitigation(void);
> > +
> >  #ifdef CONFIG_CPU_SUP_INTEL
> > +
> >  enum tsx_ctrl_states {
> >  	TSX_CTRL_ENABLE,
> >  	TSX_CTRL_DISABLE,
> > diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> > index be82cd5841c3..1b083a2a415b 100644
> > --- a/arch/x86/kernel/cpu/intel.c
> > +++ b/arch/x86/kernel/cpu/intel.c
> > @@ -684,6 +684,8 @@ static void init_intel(struct cpuinfo_x86 *c)
> >  		tsx_enable();
> >  	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
> >  		tsx_disable();
> > +
> > +	srbds_configure_mitigation();
> >  }
> 
> I'm not sure this is the right place to call this.  On the boot CPU,
> this gets called *before* the mitigation is selected, and then again
> from srbds_select_mitigation() itself.
> 
> Maybe it should instead be called from identify_secondary_cpu(), while
> keeping the existing call from srbds_select_mitigation() so the boot CPU
> also gets updated.
The SRBDS MSR needs to be poked from each CPU thread if you want to disable the
mitigation.  Does identify_secondary_cpu get called per HW thread?

I felt it would be nice to keep the call close to other mitigation setup
logic, in this case the TAA logic.

What do you think it will be more readable if I move it to
identify_secondary_cpu?

--mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 3/3] v4 more sampling fun 3
  2020-03-26  3:25 ` [MODERATED] Re: [PATCH 3/3] v4 more sampling fun 3 Josh Poimboeuf
@ 2020-03-27 16:28   ` mark gross
  0 siblings, 0 replies; 17+ messages in thread
From: mark gross @ 2020-03-27 16:28 UTC (permalink / raw)
  To: speck

On Wed, Mar 25, 2020 at 10:25:21PM -0500, speck for Josh Poimboeuf wrote:
> On Thu, Jan 30, 2020 at 11:12:02AM -0800, speck for mark gross wrote:
> > From: mark gross <mgross@linux.intel.com>
> > Subject: [PATCH 3/3] x86/speculation: SRBDS vulnerability and mitigation
> >  documentation
> > 
> > Add documentation for the SRBDS vulnerability and its mitigation.
> > 
> > Reviewed-by: Tony Luck <tony.luck@intel.com>
> > Signed-off-by: Mark Gross <mgross@linux.intel.com>
> > ---
> >  Documentation/admin-guide/hw-vuln/index.rst   |   2 +
> >  .../special-register-buffer-data-sampling.rst | 150 ++++++++++++++++++
> >  2 files changed, 152 insertions(+)
> >  create mode 100644 Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
> > 
> > diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
> > index 0795e3c2643f..99d5b3244ef9 100644
> > --- a/Documentation/admin-guide/hw-vuln/index.rst
> > +++ b/Documentation/admin-guide/hw-vuln/index.rst
> > @@ -14,3 +14,5 @@ are configurable at compile, boot or run time.
> >     mds
> >     tsx_async_abort
> >     multihit.rst
> > +   special-register-buffer-data-sampling.rst
> > +
> > diff --git a/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
> > new file mode 100644
> > index 000000000000..1fae5f455eb8
> > --- /dev/null
> > +++ b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
> > @@ -0,0 +1,150 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +SRBDS - Special Register Buffer Data Sampling
> > +=============================================
> > +
> > +SRBDS is a hardware vulnerability that allows MDS :doc:`mds` techniques to
> > +infer values returned from special register accesses.  Special register
> > +accesses are accesses to off core registers.  According to Intels evaluation,
> 
> Intel's
Thanks!
> 
> > +the special register reads that have a security expectation of privacy are:
> > +RDRAND, RDSEED and SGX EGETKEY.
> > +
> > +When RDRAND RDSEED and EGETKEY instructions are used the data is moved to the
> > +core through the special register mechanism that is susceptible to MDS attacks.
> > +
> > +Affected processors
> > +--------------------
> > +Core models (desktop, mobile, Xeon-E3) that implement RDRAND and/or RDSEED and
> > +are vulnerable to MFBDS (Micro architectural Fill Buffer Data Sampling) variant
> > +of MDS (Micro architectural Data Sampling) or to the TAA (TSX Asynchronous
> > +Abort) when TSX is enabled,
> > +
> > +  =============  ============  ==========================
> > +  common name    Family_Model  Stepping
> > +  =============  ============  ==========================
> > +  Ivybridge      06_3AH        All
> > +
> > +  Haswell        06_3CH        All
> > +  Haswell_L      06_45H        All
> > +  Haswell_G      06_46H        All
> > +
> > +  Broadwell_G    06_47H        All
> > +  Broadwell      06_3DH        All
> > +
> > +  Skylake_L      06_4EH        All
> > +  Skylake        06_5EH        All
> > +
> > +  Kabylake_L     06_8EH        <=A
> > +  Kabylake_L     06_8EH        0xB only if TSX is enabled
> > +  Kabylake_L     06_8EH        0xC only if TSX is enabled
> > +
> > +  Kabylake       06_9EH        <=B
> > +  Kabylake       06_9EH        0xC only if TSX is enabled
> > +  Kabylake       06_9EH        0xD only if TSX is enabled
> > +  =============  ============  ==========================
> > +
> > +Related CVEs
> > +------------
> > +
> > +The following CVE entry is related to this SRBDS issue:
> > +
> > +    ==============  =====  =====================================
> > +    CVE-2020-0543   SRBDS  Speical Register Buffer Data Sampling
> > +    ==============  =====  =====================================
> > +
> > +Attack scenarios
> > +----------------
> > +An unprivileged user can extract returned values from RDRAND and RDSEED
> > +executed on another core or sibling thread using MDS techniques.
> > +
> > +
> > +Mitigtion mechanism
> 
> Mitigation
Thanks!
> 
> > +-------------------
> > +Intel will release microcode updates that modify the RDRAND, RDSEED, and
> > +EGETKEY instructions to overwrite secret special register data in the shared
> > +staging buffer before the secret data can be accessed by another logical
> > +processor.
> > +
> > +During execution of the RDRAND, RDSEED, or EGETKEY instruction, off-core
> > +accesses from other logical processors will be delayed until the special
> > +register read is complete and the secret data in the shared staging buffer is
> > +overwritten.
> > +
> > +This has three effects on performance:
> > +
> > +#. RDRAND, RDSEED, or EGETKEY instruction have higher latency.
> > +
> > +#. Executing RDRAND at the same time on multiple logical processors will be
> > +   serialized, resulting in an overall reduction in the maximum RDRAND
> > +   bandwidth.
> > +
> > +#. Executing RDRAND, RDSEED or EGETKEY will delay memory accesses from other
> > +   logical processors that miss their core caches, with an impact similar to
> > +   legacy locked cache-line-split accesses.
> > +
> > +Because of the performance impact of the mitigation, the microcode updates also
> > +provide an opt-out mechanism (RNGDS_MITG_DIS) to disable the mitigation for
> > +RDRAND and RDSEED instructions executed outside of Intel Software Guard
> > +Extensions (Intel SGX) enclaves. On logical processors that disable the
> > +mitigation using this opt-out mechanism, RDRAND and RDSEED do not take longer
> > +to execute and do not impact performance of sibling logical processors memory
> > +accesses. The opt-out mechanism does not affect Intel SGX enclaves (including
> > +execution of RDRAND or RDSEED inside of an enclave, as well as EGETKEY
> > +execution).
> > +
> > +IA32_MCU_OPT_CTRL MSR Definition
> > +--------------------------------
> > +Along with the mitigation for this issue, Intel added a new thread-scope
> > +IA32_MCU_OPT_CTRL MSR, (address 0x123). The presence of this MSR and
> > +RNGDS_MITG_DIS (bit 0) is enumerated by CPUID.(EAX=07H,ECX=0).EDX[SRBDS_CTRL =
> > +9]==1. This MSR is introduced through the microcode update.
> > +
> > +Setting IA32_MCU_OPT_CTRL[0] (RNGDS_MITG_DIS) to 1 for a logical processor
> > +disables the mitigation for RDRAND and RDSEED executed outside of an Intel SGX
> > +enclave on that logical processor. Opting out of the mitigation for a
> > +particular logical processor does not affect the RDRAND and RDSEED mitigations
> > +for other logical processors.
> > +
> > +Note that inside of an Intel SGX enclave, the mitigation is applied regardless
> > +of the value of RNGDS_MITG_DS.
> > +
> > +Mitigation control on the kernel command line
> > +---------------------------------------------
> > +The kernel command line allows control over the SRBDS mitigation at boot time
> > +with the option "srbds=".  The option for this is:
> > +
> > +  ============= =============================================================
> > +  off           This option disables SRBS mitigation for RDRAND and RDSEED on
> > +                affected platforms.
> > +  ============= =============================================================
> > +
> > +SRBS System Information
> 
> SRBDS
Thanks!
> 
> > +-----------------------
> > +The Linux kernel provides vulnerability status information through sysfs.  For
> > +SRBDS this can be accessed by the following sysfs file:
> > +/sys/devices/system/cpu/vulnerabilities/special_register_data_sampling
> 
> should be the full name, or "srbds" as I said in my review for patch 2
I'll make them match.  FWIW I'd rather have srbds be the file name.

> > +
> > +The possible values contained in this file are:
> > +
> > + ============================== =============================================
> > + Not affected                   Processor not vulnerable
> > + Vulnerable                     Processor vulnerable and mitigation disabled
> > + Vulnerable: no microcode       Processor vulnerable and microcode is missing
> > +                                mitigation
> > + Mitigated                      Processor is vulnerable and mitigation is in
> > +                                effect.
> > + Not affected (TSX disabled)    Processor is only vulnerable when TSX is
> > +                                enabled while this system was booted with TSX
> > +                                disabled.
> > + Unknown                        Running on virtual guest processor that is
> > +                                affected but with no way to know if host
> > +                                processor is mitigated or vulnerable.
> > + ============================== =============================================
> > +
> > +SRBDS Default mitigation
> > +------------------------
> > +This new microcode serializes processor access during execution of RDRAND,
> > +RDSEED ensures that the shared buffer is overwritten before it is released for
> > +reuse use the "srbds=off" kernel command line to disable the mitigation for
> > +RDRAND and RDSEED.
> 
> Run-on sentence...
How about?:
This new microcode serializes processor access during execution of RDRAND,
RDSEED ensures that the shared buffer is overwritten before it is released for
reuse.


thanks,

--mark

> 
> -- 
> Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 2/3] v4 more sampling fun 2
  2020-03-27 16:20   ` mark gross
@ 2020-03-27 17:23     ` Luck, Tony
  2020-03-27 19:12       ` mark gross
  2020-03-27 17:37     ` Josh Poimboeuf
  1 sibling, 1 reply; 17+ messages in thread
From: Luck, Tony @ 2020-03-27 17:23 UTC (permalink / raw)
  To: speck

On Fri, Mar 27, 2020 at 09:20:41AM -0700, speck for mark gross wrote:
> On Wed, Mar 25, 2020 at 10:19:23PM -0500, speck for Josh Poimboeuf wrote:
> > > +static void __init srbds_select_mitigation(void)
> > > +{
> > > +	u64 ia32_cap;
> > > +
> > > +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> > > +		srbds_mitigation = SRBDS_HYPERVISOR;
> > > +		return;
> > > +	}
> > 
> > It's confusing that this comes before the X86_BUG_SRBDS check.  Is that
> > on purpose?
> VM vendors on keybase wanted it so.

No. They don't want this. It results in all multi-socket servers
(which are not affected by SRBDS) saying the mitigation status is
unknown.

The preferred order to check is:

1) Am I on the affected list?

2) Am I running under a hypervisor?


Initially Andrew Cooper had asked for the hypervisor check to come
first. But he has since relented.  Below is a comment to explain
why it is practially better to check for prescence on the affected
list *before* checking for hypervisor:

/*
 * This test relies on the CPUID values of vendor, family, model,
 * stepping which might not reflect the real hardware when we are
 * running as a guest. But VMM vendors have asked that we do this
 * before the X86_FEATURE_HYPERVISOR test since this provides better
 * guidance to users in most real situations.
 */

-Tony

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 2/3] v4 more sampling fun 2
  2020-03-27 16:20   ` mark gross
  2020-03-27 17:23     ` Luck, Tony
@ 2020-03-27 17:37     ` Josh Poimboeuf
  2020-03-27 19:27       ` mark gross
  1 sibling, 1 reply; 17+ messages in thread
From: Josh Poimboeuf @ 2020-03-27 17:37 UTC (permalink / raw)
  To: speck

On Fri, Mar 27, 2020 at 09:20:41AM -0700, speck for mark gross wrote:
> > > +
> > > +			off:	disable mitigation and remove performance
> > > +				impact to rdrand and rdseed
> > > +
> > >  	srcutree.counter_wrap_check [KNL]
> > >  			Specifies how frequently to check for
> > >  			grace-period sequence counter wrap for the
> > > @@ -397,6 +399,103 @@ static int __init tsx_async_abort_parse_cmdline(char *str)
> > >  }
> > >  early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
> > >  
> > > +#undef pr_fmt
> > > +#define pr_fmt(fmt)	"SRBDS: " fmt
> > > +
> > > +enum srbds_mitigations srbds_mitigation __ro_after_init = SRBDS_MITIGATION_FULL;
> > > +static const char * const srbds_strings[] = {
> > > +	[SRBDS_NOT_AFFECTED]		= "Not affected",
> > 
> > I don't think the SRBDS_NOT_AFFECTED enum is needed.
> > 
> > The "Not affected" string is already printed by cpu_show_common().
> > 
> > And srbds_configure_mitigation() can just check for
> > boot_cpu_has_bug(X86_BUG_SRBDS) instead of this enum value.
> 
> True.  I prefer having the complete state machine represented by the
> enumeration but, if you feel strongly I'll try to remove it.

If the state isn't actually used by the sysfs printing code then it just
serves to add confusion for a reader of the code.

And notice none of the other mitigations have this.  It's a subtle
inconsistency, which is a recipe for future bugs.

> > > +	[SRBDS_MITIGATION_OFF]		= "Vulnerable",
> > > +	[SRBDS_MITIGATION_UCODE_NEEDED]	= "Vulnerable: no microcode",
> > > +	[SRBDS_MITIGATION_FULL]		= "Mitigated",
> > > +	[SRBDS_NOT_AFFECTED_TSX_OFF]	= "Not affected (TSX disabled)",
> > > +	[SRBDS_HYPERVISOR]		= "Unknown",
> > > +};
> > > +
> > > +static bool srbds_off;
> > > +
> > > +void srbds_configure_mitigation(void)
> > 
> > To me, "configure" sounds like it could be a software thing.
> We are configuring the meditation by honoring the kernel command line in this
> function.
> 
> > 
> > How about update_srbds_msr(), which is also more consistent with some of
> > the other naming in the file.
> 
> If you like but, I was mimicking the TAA implementation when choosing the name
> of this function.

Hm, which TAA function name were you mimicking?

> > > +static void __init srbds_select_mitigation(void)
> > > +{
> > > +	u64 ia32_cap;
> > > +
> > > +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> > > +		srbds_mitigation = SRBDS_HYPERVISOR;
> > > +		return;
> > > +	}
> > 
> > It's confusing that this comes before the X86_BUG_SRBDS check.  Is that
> > on purpose?
> VM vendors on keybase wanted it so.
> 
> > 
> > If so, I presume it doesn't have the intended effect, because if booting
> > in a guest on a non-vulnerable CPU, srbds_mitigation will get set to
> > SRBDS_HYPERVISOR, but the shown state will still be "Not affected"
> > because of how cpu_show_common() checks for the bug at the beginning.
> 
> The intended effect is to report "Unknown" for the vulnerability status of
> SRBDS for all guest VM's running this code and to never touch the MSR
> controlling the mitigation.
> 
> I think this is what happens.  I'll double check and confirm with testing.

Maybe I'm blind, but given how cpu_show_common() works, I don't see how
that could possibly happen when boot_cpu_has_bug(X86_BUG_SRBDS) is
false.  If I'm wrong please enlighten me.

> > > +
> > > +	if (!boot_cpu_has_bug(X86_BUG_SRBDS)) {
> > > +		srbds_mitigation = SRBDS_NOT_AFFECTED;
> > > +		return;
> > > +	}
> > > +
> > > +	if (!boot_cpu_has(X86_FEATURE_SRBDS_CTRL)) {
> > > +		srbds_mitigation = SRBDS_MITIGATION_UCODE_NEEDED;
> > > +		return;
> > > +	}
> > > +
> > > +	if (boot_cpu_has_bug(X86_BUG_SRBDS)) {
> > > +		srbds_mitigation = SRBDS_MITIGATION_FULL;
> > > +		/*
> > > +		 * Check to see if this is one of the MDS_NO systems supporting
> > > +		 * TSX that are only exposed to SRBDS when TSX is enabled.
> > > +		 */
> > > +		ia32_cap = x86_read_arch_cap_msr();
> > > +		if (ia32_cap & ARCH_CAP_MDS_NO) {
> > > +			if (!boot_cpu_has(X86_FEATURE_RTM))
> > > +				srbds_mitigation = SRBDS_NOT_AFFECTED_TSX_OFF;
> > > +		}
> > 
> > I wonder if this would be equivalent?
> > 
> > 		if (!boot_cpu_has_bug(X86_BUG_MDS) && boot_cpu_has(X86_BUG_TAA) &&
> > 		    !boot_cpu_has(X86_FEATURE_RTM))
> > 		    	srbds_mitigation = SRBDS_NOT_AFFECTED_TSX_OFF;
> I have no clue if its equivalent but, I find your version much harder to parse
> and my version matches more closely to what the white paper states.

Ok.

> > > +	}
> > > +
> > > +	if (cpu_mitigations_off() || srbds_off) {
> > > +		if (srbds_mitigation != SRBDS_NOT_AFFECTED_TSX_OFF)
> > > +			srbds_mitigation = SRBDS_MITIGATION_OFF;
> > > +	}
> > > +
> > > +	srbds_configure_mitigation();
> > > +}
> > 
> > The other mitigations do a printk with the mitigation status here.
> > Should we not do the same here?
> Perhaps but, I was attempting to limit the extra chatter in the kernel logs.
> If you feel it adds value to the code I'll put it in.  FWIW I think the chatter
> from the other places should be removed.

It should at least be consistent with the others.

> > > +ssize_t cpu_show_special_register_data_sampling(struct device *dev, struct device_attribute *attr, char *buf)
> > > +{
> > 
> > This function name can be renamed to match whatever the filename turns
> > out to be:
> > 
> >   cpu_show_srbds()
> > 
> > or
> > 
> >   cpu_show_special_register_data_buffer_sampling()
> 
> I got some early review feedback to not use the shorter name "srbds" for the
> sysfs inodes so I was trying to avoid the full spelled out the full names.
> 
> "srbds" and "special_register_data_buffer_sampling" do not imply HWRNG security
> risk so IMO I'd like to just use srbds.  What do others think?

+1 for "srbds" for both file name and function name.

> > > +void srbds_configure_mitigation(void);
> > > +
> > >  #ifdef CONFIG_CPU_SUP_INTEL
> > > +
> > >  enum tsx_ctrl_states {
> > >  	TSX_CTRL_ENABLE,
> > >  	TSX_CTRL_DISABLE,
> > > diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> > > index be82cd5841c3..1b083a2a415b 100644
> > > --- a/arch/x86/kernel/cpu/intel.c
> > > +++ b/arch/x86/kernel/cpu/intel.c
> > > @@ -684,6 +684,8 @@ static void init_intel(struct cpuinfo_x86 *c)
> > >  		tsx_enable();
> > >  	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
> > >  		tsx_disable();
> > > +
> > > +	srbds_configure_mitigation();
> > >  }
> > 
> > I'm not sure this is the right place to call this.  On the boot CPU,
> > this gets called *before* the mitigation is selected, and then again
> > from srbds_select_mitigation() itself.
> > 
> > Maybe it should instead be called from identify_secondary_cpu(), while
> > keeping the existing call from srbds_select_mitigation() so the boot CPU
> > also gets updated.
> The SRBDS MSR needs to be poked from each CPU thread if you want to disable the
> mitigation.  Does identify_secondary_cpu get called per HW thread?

identify_secondary_cpu() gets called for every HW thread, other than the
boot one.

The boot one already does it in srbds_select_mitigation().

> I felt it would be nice to keep the call close to other mitigation setup
> logic, in this case the TAA logic.

That's not the TAA logic, but rather the TSX logic (though yes I realize
they're related).  And anyway SRBDS and TSX have different requirements.

> What do you think it will be more readable if I move it to
> identify_secondary_cpu?

As I said - it gets called *twice* on the boot CPU.  The first time is
before srbds_mitigation has even been configured.  That's confusing.

-- 
Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 2/3] v4 more sampling fun 2
  2020-03-27 17:23     ` Luck, Tony
@ 2020-03-27 19:12       ` mark gross
  0 siblings, 0 replies; 17+ messages in thread
From: mark gross @ 2020-03-27 19:12 UTC (permalink / raw)
  To: speck

On Fri, Mar 27, 2020 at 10:23:56AM -0700, speck for Luck, Tony wrote:
> On Fri, Mar 27, 2020 at 09:20:41AM -0700, speck for mark gross wrote:
> > On Wed, Mar 25, 2020 at 10:19:23PM -0500, speck for Josh Poimboeuf wrote:
> > > > +static void __init srbds_select_mitigation(void)
> > > > +{
> > > > +	u64 ia32_cap;
> > > > +
> > > > +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> > > > +		srbds_mitigation = SRBDS_HYPERVISOR;
> > > > +		return;
> > > > +	}
> > > 
> > > It's confusing that this comes before the X86_BUG_SRBDS check.  Is that
> > > on purpose?
> > VM vendors on keybase wanted it so.
> 
> No. They don't want this. It results in all multi-socket servers
> (which are not affected by SRBDS) saying the mitigation status is
> unknown.
> 
> The preferred order to check is:
> 
> 1) Am I on the affected list?
> 
> 2) Am I running under a hypervisor?
> 
> 
> Initially Andrew Cooper had asked for the hypervisor check to come
> first. But he has since relented.  Below is a comment to explain
> why it is practially better to check for prescence on the affected
> list *before* checking for hypervisor:
> 
> /*
>  * This test relies on the CPUID values of vendor, family, model,
>  * stepping which might not reflect the real hardware when we are
>  * running as a guest. But VMM vendors have asked that we do this
>  * before the X86_FEATURE_HYPERVISOR test since this provides better
>  * guidance to users in most real situations.
>  */

Ok, I'll change it.

--mark

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [MODERATED] Re: [PATCH 2/3] v4 more sampling fun 2
  2020-03-27 17:37     ` Josh Poimboeuf
@ 2020-03-27 19:27       ` mark gross
  0 siblings, 0 replies; 17+ messages in thread
From: mark gross @ 2020-03-27 19:27 UTC (permalink / raw)
  To: speck

On Fri, Mar 27, 2020 at 12:37:10PM -0500, speck for Josh Poimboeuf wrote:
> On Fri, Mar 27, 2020 at 09:20:41AM -0700, speck for mark gross wrote:
> > > > +
> > > > +			off:	disable mitigation and remove performance
> > > > +				impact to rdrand and rdseed
> > > > +
> > > >  	srcutree.counter_wrap_check [KNL]
> > > >  			Specifies how frequently to check for
> > > >  			grace-period sequence counter wrap for the
> > > > @@ -397,6 +399,103 @@ static int __init tsx_async_abort_parse_cmdline(char *str)
> > > >  }
> > > >  early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
> > > >  
> > > > +#undef pr_fmt
> > > > +#define pr_fmt(fmt)	"SRBDS: " fmt
> > > > +
> > > > +enum srbds_mitigations srbds_mitigation __ro_after_init = SRBDS_MITIGATION_FULL;
> > > > +static const char * const srbds_strings[] = {
> > > > +	[SRBDS_NOT_AFFECTED]		= "Not affected",
> > > 
> > > I don't think the SRBDS_NOT_AFFECTED enum is needed.
> > > 
> > > The "Not affected" string is already printed by cpu_show_common().
> > > 
> > > And srbds_configure_mitigation() can just check for
> > > boot_cpu_has_bug(X86_BUG_SRBDS) instead of this enum value.
> > 
> > True.  I prefer having the complete state machine represented by the
> > enumeration but, if you feel strongly I'll try to remove it.
> 
> If the state isn't actually used by the sysfs printing code then it just
> serves to add confusion for a reader of the code.
> 
> And notice none of the other mitigations have this.  It's a subtle
> inconsistency, which is a recipe for future bugs.
I'll remove it.


> > > > +	[SRBDS_MITIGATION_OFF]		= "Vulnerable",
> > > > +	[SRBDS_MITIGATION_UCODE_NEEDED]	= "Vulnerable: no microcode",
> > > > +	[SRBDS_MITIGATION_FULL]		= "Mitigated",
> > > > +	[SRBDS_NOT_AFFECTED_TSX_OFF]	= "Not affected (TSX disabled)",
> > > > +	[SRBDS_HYPERVISOR]		= "Unknown",
> > > > +};
> > > > +
> > > > +static bool srbds_off;
> > > > +
> > > > +void srbds_configure_mitigation(void)
> > > 
> > > To me, "configure" sounds like it could be a software thing.
> > We are configuring the meditation by honoring the kernel command line in this
> > function.
> > 
> > > 
> > > How about update_srbds_msr(), which is also more consistent with some of
> > > the other naming in the file.
> > 
> > If you like but, I was mimicking the TAA implementation when choosing the name
> > of this function.
> 
> Hm, which TAA function name were you mimicking?

I thought taa_select_mitigation but, I see what you are talkng about.  I split
out the mitication selection and the application of the policy.

I'll change the name to update_srbds_msr.


> 
> > > > +static void __init srbds_select_mitigation(void)
> > > > +{
> > > > +	u64 ia32_cap;
> > > > +
> > > > +	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
> > > > +		srbds_mitigation = SRBDS_HYPERVISOR;
> > > > +		return;
> > > > +	}
> > > 
> > > It's confusing that this comes before the X86_BUG_SRBDS check.  Is that
> > > on purpose?
> > VM vendors on keybase wanted it so.
> > 
> > > 
> > > If so, I presume it doesn't have the intended effect, because if booting
> > > in a guest on a non-vulnerable CPU, srbds_mitigation will get set to
> > > SRBDS_HYPERVISOR, but the shown state will still be "Not affected"
> > > because of how cpu_show_common() checks for the bug at the beginning.
> > 
> > The intended effect is to report "Unknown" for the vulnerability status of
> > SRBDS for all guest VM's running this code and to never touch the MSR
> > controlling the mitigation.
> > 
> > I think this is what happens.  I'll double check and confirm with testing.
> 
> Maybe I'm blind, but given how cpu_show_common() works, I don't see how
> that could possibly happen when boot_cpu_has_bug(X86_BUG_SRBDS) is
> false.  If I'm wrong please enlighten me.
Maybe I'm blind.  Your right.  my qemu test boot on an old server part
workstation that is not affected by SRBDS is reporting "not affected" when it
I intended it to be reporting "Unknow"  (which if you saw Tony's email is also
not what we want now that I ahve clarification from Tony.)

> > > > +
> > > > +	if (!boot_cpu_has_bug(X86_BUG_SRBDS)) {
> > > > +		srbds_mitigation = SRBDS_NOT_AFFECTED;
> > > > +		return;
> > > > +	}
> > > > +
> > > > +	if (!boot_cpu_has(X86_FEATURE_SRBDS_CTRL)) {
> > > > +		srbds_mitigation = SRBDS_MITIGATION_UCODE_NEEDED;
> > > > +		return;
> > > > +	}
> > > > +
> > > > +	if (boot_cpu_has_bug(X86_BUG_SRBDS)) {
> > > > +		srbds_mitigation = SRBDS_MITIGATION_FULL;
> > > > +		/*
> > > > +		 * Check to see if this is one of the MDS_NO systems supporting
> > > > +		 * TSX that are only exposed to SRBDS when TSX is enabled.
> > > > +		 */
> > > > +		ia32_cap = x86_read_arch_cap_msr();
> > > > +		if (ia32_cap & ARCH_CAP_MDS_NO) {
> > > > +			if (!boot_cpu_has(X86_FEATURE_RTM))
> > > > +				srbds_mitigation = SRBDS_NOT_AFFECTED_TSX_OFF;
> > > > +		}
> > > 
> > > I wonder if this would be equivalent?
> > > 
> > > 		if (!boot_cpu_has_bug(X86_BUG_MDS) && boot_cpu_has(X86_BUG_TAA) &&
> > > 		    !boot_cpu_has(X86_FEATURE_RTM))
> > > 		    	srbds_mitigation = SRBDS_NOT_AFFECTED_TSX_OFF;
> > I have no clue if its equivalent but, I find your version much harder to parse
> > and my version matches more closely to what the white paper states.
> 
> Ok.
> 
> > > > +	}
> > > > +
> > > > +	if (cpu_mitigations_off() || srbds_off) {
> > > > +		if (srbds_mitigation != SRBDS_NOT_AFFECTED_TSX_OFF)
> > > > +			srbds_mitigation = SRBDS_MITIGATION_OFF;
> > > > +	}
> > > > +
> > > > +	srbds_configure_mitigation();
> > > > +}
> > > 
> > > The other mitigations do a printk with the mitigation status here.
> > > Should we not do the same here?
> > Perhaps but, I was attempting to limit the extra chatter in the kernel logs.
> > If you feel it adds value to the code I'll put it in.  FWIW I think the chatter
> > from the other places should be removed.
> 
> It should at least be consistent with the others.
ok.


> > > > +ssize_t cpu_show_special_register_data_sampling(struct device *dev, struct device_attribute *attr, char *buf)
> > > > +{
> > > 
> > > This function name can be renamed to match whatever the filename turns
> > > out to be:
> > > 
> > >   cpu_show_srbds()
> > > 
> > > or
> > > 
> > >   cpu_show_special_register_data_buffer_sampling()
> > 
> > I got some early review feedback to not use the shorter name "srbds" for the
> > sysfs inodes so I was trying to avoid the full spelled out the full names.
> > 
> > "srbds" and "special_register_data_buffer_sampling" do not imply HWRNG security
> > risk so IMO I'd like to just use srbds.  What do others think?
> 
> +1 for "srbds" for both file name and function name.
thanks!


> > > > +void srbds_configure_mitigation(void);
> > > > +
> > > >  #ifdef CONFIG_CPU_SUP_INTEL
> > > > +
> > > >  enum tsx_ctrl_states {
> > > >  	TSX_CTRL_ENABLE,
> > > >  	TSX_CTRL_DISABLE,
> > > > diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> > > > index be82cd5841c3..1b083a2a415b 100644
> > > > --- a/arch/x86/kernel/cpu/intel.c
> > > > +++ b/arch/x86/kernel/cpu/intel.c
> > > > @@ -684,6 +684,8 @@ static void init_intel(struct cpuinfo_x86 *c)
> > > >  		tsx_enable();
> > > >  	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
> > > >  		tsx_disable();
> > > > +
> > > > +	srbds_configure_mitigation();
> > > >  }
> > > 
> > > I'm not sure this is the right place to call this.  On the boot CPU,
> > > this gets called *before* the mitigation is selected, and then again
> > > from srbds_select_mitigation() itself.
> > > 
> > > Maybe it should instead be called from identify_secondary_cpu(), while
> > > keeping the existing call from srbds_select_mitigation() so the boot CPU
> > > also gets updated.
> > The SRBDS MSR needs to be poked from each CPU thread if you want to disable the
> > mitigation.  Does identify_secondary_cpu get called per HW thread?
> 
> identify_secondary_cpu() gets called for every HW thread, other than the
> boot one.
> 
> The boot one already does it in srbds_select_mitigation().
> 
> > I felt it would be nice to keep the call close to other mitigation setup
> > logic, in this case the TAA logic.
> 
> That's not the TAA logic, but rather the TSX logic (though yes I realize
> they're related).  And anyway SRBDS and TSX have different requirements.
> 
> > What do you think it will be more readable if I move it to
> > identify_secondary_cpu?
> 
> As I said - it gets called *twice* on the boot CPU.  The first time is
> before srbds_mitigation has even been configured.  That's confusing.
I'll move it to identify_secondary_cpu.

--mark


> -- 
> Josh

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-03-27 19:27 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-18 21:27 [MODERATED] [PATCH 0/3] v4 more sampling fun 0 mark gross
2020-01-16 22:16 ` [MODERATED] [PATCH 2/3] v4 more sampling fun 2 mark gross
2020-01-30 19:12 ` [MODERATED] [PATCH 3/3] v4 more sampling fun 3 mark gross
2020-03-17  0:56 ` [MODERATED] [PATCH 1/3] v4 more sampling fun 1 mark gross
     [not found] ` <5e7296c7.1c69fb81.f9a2f.00ebSMTPIN_ADDED_BROKEN@mx.google.com>
2020-03-19  8:50   ` [MODERATED] " Greg KH
2020-03-19 15:40     ` mark gross
2020-03-19 15:50       ` Luck, Tony
2020-03-19 16:34         ` Greg KH
2020-03-19 18:13     ` Thomas Gleixner
2020-03-26  3:19 ` [MODERATED] Re: [PATCH 2/3] v4 more sampling fun 2 Josh Poimboeuf
2020-03-27 16:20   ` mark gross
2020-03-27 17:23     ` Luck, Tony
2020-03-27 19:12       ` mark gross
2020-03-27 17:37     ` Josh Poimboeuf
2020-03-27 19:27       ` mark gross
2020-03-26  3:25 ` [MODERATED] Re: [PATCH 3/3] v4 more sampling fun 3 Josh Poimboeuf
2020-03-27 16:28   ` mark gross

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).