* [MODERATED] [PATCH v5 00/11] TAAv5 0
@ 2019-10-05  6:17 Pawan Gupta
  2019-10-05  6:26 ` [MODERATED] [PATCH v5 01/11] TAAv5 1 Pawan Gupta
                   ` (19 more replies)
  0 siblings, 20 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:17 UTC (permalink / raw)
  To: speck

Changes since v4:
- Simplify TSX_CTRL enumeration, set TSX_CTRL default to NOT_SUPPORTED.
- Add new patch "Export MDS_NO=0 to guests when TSX is enabled".
- Add new patch for tsx=auto which enables TSX on unaffected platforms,
  default stays tsx=off.
- Handle kexec-like cases for TAA bug enumeration. Set X86_BUG_TAA when
  X86_FEATURE_RTM=1 or TSX_CTRL=1.
- TSX control sysfs file rename (s/tsx/hw_tx_mem/) and file creation changes.
- Dropped patch "x86/speculation/mds: Rename MDS buffer clear functions"
  It doesn't provide enough benefit compared to the amount of changes
  involved. Added code comment about using MDS mitigation.
- Add helper function read_ia32_arch_cap().
- Reorder mitigation checks in taa_select_mitigation().
- s/MSR_// for TSX_CTRL bit defines.
- Changelog, comments and documentation update.
- Rebase to v5.3.

Changes since v3:
- Disable tsx unconditionally, removed tsx=auto mode.
- Fix verw idle clear.
- Refactor TSX code into new tsx.c
- Use early_param for tsx cmdline parameter.
- Rename sysfs vulnerability file to tsx_async_abort.
- Rename common CPU buffer clear infrastructure (s/mds/verw)
- s/TAA_MITIGATION_VMWERV/TAA_MITIGATION_UCODE_NEEDED
- Rebased to v5.3-rc6
- Split patches.
- Changelog and documentation update.

Changes since v2:
- Rebased to v5.3-rc5
- Fix build for non-x86 targets.
- Commit log, code comments and documentation update.
- Minor code refactoring.

Changes since v1:
- Added TSX command line option (on|off|auto). "auto" is the
  default, which sets the TSX state as below:
	- TSX disabled on affected platforms
	- TSX enabled on unaffected platforms
- Update commit messages and documentation.
- Add support to control TSX feature from sysfs.

This patchset adds the mitigation for TSX Async Abort (TAA), which is a
side channel vulnerability to internal buffers in some Intel processors,
similar to Microarchitectural Data Sampling (MDS). Transactional Synchronization
Extensions (TSX) is a feature in Intel processors that speeds up
execution of multi-threaded software through lock elision.

During TAA, certain loads may speculatively pass invalid data to
dependent operations when an asynchronous abort condition is pending in
a TSX transaction.  An attacker can use TSX as a tool to extract
information from the microarchitectural buffers.  The victim data may be
placed into these buffers during normal execution that is unrelated to
any use of TSX.

The mitigation is to either clear the CPU buffers or disable TSX.

Pawan Gupta (11):
  x86/tsx: Add enumeration support for IA32_TSX_CTRL MSR
  x86: Add helper function read_ia32_arch_cap()
  x86/tsx: Disable TSX by default
  x86/tsx: Add cmdline parameter to control TSX
  x86/speculation/taa: Add mitigation for TSX Async Abort
  x86/speculation/taa: Add sysfs reporting for TSX Async Abort
  KVM: x86/speculation/taa: Export MDS_NO=0 to guests when TSX is
    enabled
  x86/tsx: Add "auto" option to TSX cmdline parameter
  x86/tsx: Add sysfs interface to control TSX
  x86/speculation/taa: Update TAA on TSX state change
  x86/speculation/taa: Add documentation for TSX Async Abort

 .../ABI/testing/sysfs-devices-system-cpu      |  24 ++
 Documentation/admin-guide/hw-vuln/index.rst   |   1 +
 .../admin-guide/hw-vuln/tsx_async_abort.rst   | 269 ++++++++++++++++++
 .../admin-guide/kernel-parameters.txt         |  52 ++++
 Documentation/x86/index.rst                   |   1 +
 Documentation/x86/tsx_async_abort.rst         |  54 ++++
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/msr-index.h              |   9 +
 arch/x86/include/asm/nospec-branch.h          |   4 +-
 arch/x86/include/asm/processor.h              |   7 +
 arch/x86/kernel/cpu/Makefile                  |   2 +-
 arch/x86/kernel/cpu/bugs.c                    | 169 ++++++++++-
 arch/x86/kernel/cpu/common.c                  |  31 +-
 arch/x86/kernel/cpu/cpu.h                     |   6 +
 arch/x86/kernel/cpu/intel.c                   |   2 +
 arch/x86/kernel/cpu/tsx.c                     | 258 +++++++++++++++++
 arch/x86/kvm/x86.c                            |  19 ++
 drivers/base/cpu.c                            |  41 ++-
 include/linux/cpu.h                           |   9 +
 19 files changed, 949 insertions(+), 10 deletions(-)
 create mode 100644 Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
 create mode 100644 Documentation/x86/tsx_async_abort.rst
 create mode 100644 arch/x86/kernel/cpu/tsx.c

-- 
2.20.1

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 01/11] TAAv5 1
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
@ 2019-10-05  6:26 ` Pawan Gupta
  2019-10-05  6:27 ` [MODERATED] [PATCH v5 02/11] TAAv5 2 Pawan Gupta
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:26 UTC (permalink / raw)
  To: speck

Transactional Synchronization Extensions (TSX) may be used on certain
processors as part of a speculative side channel attack.  A microcode
update for existing processors that are vulnerable to this attack will
add a new MSR, IA32_TSX_CTRL, to allow the system administrator the
option to disable TSX as one of the possible mitigations.  [Note that
future processors that are not vulnerable will also support the
IA32_TSX_CTRL MSR].  Add defines for the new IA32_TSX_CTRL MSR and its
bits.

TSX has two sub-features:

1. Restricted Transactional Memory (RTM) is an explicitly-used feature
   where new instructions begin and end TSX transactions.
2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of
   "old" style locks are used by software.

Bit 7 of the IA32_ARCH_CAPABILITIES MSR indicates the presence of the
IA32_TSX_CTRL MSR.

There are two control bits in IA32_TSX_CTRL MSR:

  Bit 0: When set, it disables the Restricted Transactional Memory (RTM)
         sub-feature of TSX (will force all transactions to abort on the
         XBEGIN instruction).

  Bit 1: When set, it disables the enumeration of the RTM feature (i.e.
         it will make CPUID(EAX=7).EBX{bit11} read as 0).

The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally
disabled but still enumerated as present by CPUID(EAX=7).EBX{bit4}.
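
For illustration only, a minimal sketch of how these defines could be used
to disable TSX (a later patch in this series adds the real tsx_disable()
implementation along these lines; ia32_cap is assumed to hold the value of
MSR_IA32_ARCH_CAPABILITIES):

	u64 tsx;

	/* Only touch the MSR when IA32_ARCH_CAPABILITIES bit 7 says it exists */
	if (ia32_cap & ARCH_CAP_TSX_CTRL_MSR) {
		rdmsrl(MSR_IA32_TSX_CTRL, tsx);
		/* Abort all RTM transactions and hide RTM from CPUID */
		tsx |= TSX_CTRL_RTM_DISABLE | TSX_CTRL_CPUID_CLEAR;
		wrmsrl(MSR_IA32_TSX_CTRL, tsx);
	}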

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 arch/x86/include/asm/msr-index.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 271d837d69a8..45b6705d9f71 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -93,6 +93,7 @@
 						  * Microarchitectural Data
 						  * Sampling (MDS) vulnerabilities.
 						  */
+#define ARCH_CAP_TSX_CTRL_MSR		BIT(7)	/* MSR for TSX control is available. */
 
 #define MSR_IA32_FLUSH_CMD		0x0000010b
 #define L1D_FLUSH			BIT(0)	/*
@@ -103,6 +104,10 @@
 #define MSR_IA32_BBL_CR_CTL		0x00000119
 #define MSR_IA32_BBL_CR_CTL3		0x0000011e
 
+#define MSR_IA32_TSX_CTRL		0x00000122
+#define TSX_CTRL_RTM_DISABLE		BIT(0)	/* Disable RTM feature */
+#define TSX_CTRL_CPUID_CLEAR		BIT(1)	/* Disable TSX enumeration */
+
 #define MSR_IA32_SYSENTER_CS		0x00000174
 #define MSR_IA32_SYSENTER_ESP		0x00000175
 #define MSR_IA32_SYSENTER_EIP		0x00000176
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 02/11] TAAv5 2
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
  2019-10-05  6:26 ` [MODERATED] [PATCH v5 01/11] TAAv5 1 Pawan Gupta
@ 2019-10-05  6:27 ` Pawan Gupta
  2019-10-05  6:28 ` [MODERATED] [PATCH v5 03/11] TAAv5 3 Pawan Gupta
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:27 UTC (permalink / raw)
  To: speck

Move the IA32_ARCH_CAPABILITIES MSR read to a helper function. If the CPU
doesn't support this MSR, return 0. Use this function in the following
commits to read the IA32_ARCH_CAPABILITIES MSR.
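
For reference, a later patch in this series uses the helper like this
(ARCH_CAP_TSX_CTRL_MSR is added by the previous patch):

	static bool tsx_ctrl_is_supported(void)
	{
		u64 ia32_cap = read_ia32_arch_cap();

		return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
	}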

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 arch/x86/kernel/cpu/common.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f125bf7ecb6f..af9e357b2ca9 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1091,19 +1091,27 @@ static bool __init cpu_matches(unsigned long which)
 	return m && !!(m->driver_data & which);
 }
 
-static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+u64 read_ia32_arch_cap(void)
 {
 	u64 ia32_cap = 0;
 
+	/* Leave the MSR set to all 0's when not supported */
+	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+	return ia32_cap;
+}
+
+static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+{
+	u64 ia32_cap = read_ia32_arch_cap();
+
 	if (cpu_matches(NO_SPECULATION))
 		return;
 
 	setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
 	setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
 
-	if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
-		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
-
 	if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) &&
 	   !cpu_has(c, X86_FEATURE_AMD_SSB_NO))
 		setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 03/11] TAAv5 3
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
  2019-10-05  6:26 ` [MODERATED] [PATCH v5 01/11] TAAv5 1 Pawan Gupta
  2019-10-05  6:27 ` [MODERATED] [PATCH v5 02/11] TAAv5 2 Pawan Gupta
@ 2019-10-05  6:28 ` Pawan Gupta
  2019-10-05  6:29 ` [MODERATED] [PATCH v5 04/11] TAAv5 4 Pawan Gupta
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:28 UTC (permalink / raw)
  To: speck

Disable TSX by default at boot. If the IA32_TSX_CTRL MSR is not present,
the TSX state stays NOT_SUPPORTED, which is the compile-time default;
otherwise change the TSX state to DISABLE. This is because on certain
processors TSX may be used as part of a speculative side channel
attack.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 arch/x86/kernel/cpu/Makefile |  2 +-
 arch/x86/kernel/cpu/cpu.h    |  4 ++
 arch/x86/kernel/cpu/intel.c  |  2 +
 arch/x86/kernel/cpu/tsx.c    | 72 ++++++++++++++++++++++++++++++++++++
 4 files changed, 79 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/kernel/cpu/tsx.c

diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
index d7a1e5a9331c..890f60083eca 100644
--- a/arch/x86/kernel/cpu/Makefile
+++ b/arch/x86/kernel/cpu/Makefile
@@ -30,7 +30,7 @@ obj-$(CONFIG_PROC_FS)	+= proc.o
 obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o
 
 ifdef CONFIG_CPU_SUP_INTEL
-obj-y			+= intel.o intel_pconfig.o
+obj-y			+= intel.o intel_pconfig.o tsx.o
 obj-$(CONFIG_PM)	+= intel_epb.o
 endif
 obj-$(CONFIG_CPU_SUP_AMD)		+= amd.o
diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index c0e2407abdd6..d864ec4180cc 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -62,4 +62,8 @@ unsigned int aperfmperf_get_khz(int cpu);
 
 extern void x86_spec_ctrl_setup_ap(void);
 
+extern void tsx_init(struct cpuinfo_x86 *c);
+
+extern u64 read_ia32_arch_cap(void);
+
 #endif /* ARCH_X86_CPU_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 8d6d92ebeb54..b1d6c96f6b88 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -761,6 +761,8 @@ static void init_intel(struct cpuinfo_x86 *c)
 		detect_tme(c);
 
 	init_intel_misc_features(c);
+
+	tsx_init(c);
 }
 
 #ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
new file mode 100644
index 000000000000..c549750dd7c8
--- /dev/null
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Transactional Synchronization Extensions (TSX) control.
+ *
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Author:
+ *	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
+ */
+
+#include <linux/processor.h>
+#include <linux/cpufeature.h>
+
+#include "cpu.h"
+
+static enum tsx_ctrl_states {
+	TSX_CTRL_ENABLE,
+	TSX_CTRL_DISABLE,
+	TSX_CTRL_NOT_SUPPORTED,
+} tsx_ctrl_state = TSX_CTRL_NOT_SUPPORTED;
+
+static void tsx_disable(void)
+{
+	u64 tsx;
+
+	rdmsrl(MSR_IA32_TSX_CTRL, tsx);
+
+	/* Force all transactions to immediately abort */
+	tsx |= TSX_CTRL_RTM_DISABLE;
+	/*
+	 * Ensure TSX support is not enumerated in CPUID.
+	 * This is visible to userspace and will ensure they
+	 * do not waste resources trying TSX transactions that
+	 * will always abort.
+	 */
+	tsx |= TSX_CTRL_CPUID_CLEAR;
+
+	wrmsrl(MSR_IA32_TSX_CTRL, tsx);
+}
+
+static bool tsx_ctrl_is_supported(void)
+{
+	u64 ia32_cap = read_ia32_arch_cap();
+
+	/*
+	 * TSX is controlled via MSR_IA32_TSX_CTRL.  However,
+	 * support for this MSR is enumerated by ARCH_CAP_TSX_MSR bit
+	 * in MSR_IA32_ARCH_CAPABILITIES.
+	 */
+	return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
+}
+
+void tsx_init(struct cpuinfo_x86 *c)
+{
+	if (!tsx_ctrl_is_supported())
+		return;
+
+	/*
+	 * Default to TSX_CTRL_DISABLE. This is because on certain processors
+	 * TSX may be used as part of a speculative side channel attack.
+	 */
+	tsx_ctrl_state = TSX_CTRL_DISABLE;
+
+	tsx_disable();
+	/*
+	 * tsx_disable() will change the state of the
+	 * RTM CPUID bit.  Clear it here since it is now
+	 * expected to be not set.
+	 */
+	clear_cpu_cap(c, X86_FEATURE_RTM);
+	setup_clear_cpu_cap(X86_FEATURE_RTM);
+}
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 04/11] TAAv5 4
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (2 preceding siblings ...)
  2019-10-05  6:28 ` [MODERATED] [PATCH v5 03/11] TAAv5 3 Pawan Gupta
@ 2019-10-05  6:29 ` Pawan Gupta
  2019-10-05  6:30 ` [MODERATED] [PATCH v5 05/11] TAAv5 5 Pawan Gupta
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:29 UTC (permalink / raw)
  To: speck

Add kernel cmdline parameter "tsx" to control the Transactional
Synchronization Extensions (TSX) feature.  On CPUs that support TSX
control, use "tsx=on|off" to enable or disable TSX.  Not specifying this
option is equivalent to "tsx=off".
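
For example, to explicitly enable TSX one would boot with (illustrative;
how the command line is set depends on the bootloader, e.g. via
GRUB_CMDLINE_LINUX in /etc/default/grub on GRUB-based systems):

	tsx=on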

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 .../admin-guide/kernel-parameters.txt         | 11 +++
 arch/x86/kernel/cpu/tsx.c                     | 97 ++++++++++++++++---
 2 files changed, 95 insertions(+), 13 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 4c1971960afa..832537d59562 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4813,6 +4813,17 @@
 			interruptions from clocksource watchdog are not
 			acceptable).
 
+	tsx=		[X86] Control Transactional Synchronization
+			Extensions (TSX) feature in Intel processors that
+			support TSX control.
+
+			This parameter controls the TSX feature. The options are:
+
+			on	- Enable TSX on the system.
+			off	- Disable TSX on the system.
+
+			Not specifying this option is equivalent to tsx=off.
+
 	turbografx.map[2|3]=	[HW,JOY]
 			TurboGraFX parallel port interface
 			Format:
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index c549750dd7c8..1c0cee7a7d46 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -19,6 +19,30 @@ static enum tsx_ctrl_states {
 	TSX_CTRL_NOT_SUPPORTED,
 } tsx_ctrl_state = TSX_CTRL_NOT_SUPPORTED;
 
+static enum tsx_user_cmds {
+	TSX_USER_CMD_NONE,
+	TSX_USER_CMD_ON,
+	TSX_USER_CMD_OFF,
+} tsx_user_cmd = TSX_USER_CMD_NONE;
+
+static int __init tsx_cmdline(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	/*
+	 * tsx_en/disable() are only called when
+	 * X86_FEATURE_RTM and TSX_CTRL MSR are supported.
+	 */
+	if (!strcmp(str, "on"))
+		tsx_user_cmd = TSX_USER_CMD_ON;
+	else if (!strcmp(str, "off"))
+		tsx_user_cmd = TSX_USER_CMD_OFF;
+
+	return 0;
+}
+early_param("tsx", tsx_cmdline);
+
 static void tsx_disable(void)
 {
 	u64 tsx;
@@ -38,6 +62,24 @@ static void tsx_disable(void)
 	wrmsrl(MSR_IA32_TSX_CTRL, tsx);
 }
 
+static void tsx_enable(void)
+{
+	u64 tsx;
+
+	rdmsrl(MSR_IA32_TSX_CTRL, tsx);
+
+	/* Enable the RTM feature in the cpu */
+	tsx &= ~TSX_CTRL_RTM_DISABLE;
+	/*
+	 * Ensure TSX support is enumerated in CPUID.
+	 * This is visible to userspace and will ensure they
+	 * can enumerate and use the TSX feature.
+	 */
+	tsx &= ~TSX_CTRL_CPUID_CLEAR;
+
+	wrmsrl(MSR_IA32_TSX_CTRL, tsx);
+}
+
 static bool tsx_ctrl_is_supported(void)
 {
 	u64 ia32_cap = read_ia32_arch_cap();
@@ -55,18 +97,47 @@ void tsx_init(struct cpuinfo_x86 *c)
 	if (!tsx_ctrl_is_supported())
 		return;
 
-	/*
-	 * Default to TSX_CTRL_DISABLE. This is because on certain processors
-	 * TSX may be used as part of a speculative side channel attack.
-	 */
-	tsx_ctrl_state = TSX_CTRL_DISABLE;
+	switch (tsx_user_cmd) {
+	case TSX_USER_CMD_ON:
+		tsx_ctrl_state = TSX_CTRL_ENABLE;
+		break;
+	case TSX_USER_CMD_OFF:
+		tsx_ctrl_state = TSX_CTRL_DISABLE;
+		break;
+	case TSX_USER_CMD_NONE:
+	default:
+		/*
+		 * If user provided an invalid option or tsx= is not provided
+		 * on cmdline default to TSX_CTRL_DISABLE. This is because on
+		 * certain processors TSX may be used as part of a speculative
+		 * side channel attack.
+		 */
+		tsx_ctrl_state = TSX_CTRL_DISABLE;
+	}
 
-	tsx_disable();
-	/*
-	 * tsx_disable() will change the state of the
-	 * RTM CPUID bit.  Clear it here since it is now
-	 * expected to be not set.
-	 */
-	clear_cpu_cap(c, X86_FEATURE_RTM);
-	setup_clear_cpu_cap(X86_FEATURE_RTM);
+	if (tsx_ctrl_state == TSX_CTRL_DISABLE) {
+		tsx_disable();
+		/*
+		 * tsx_disable() will change the state of the
+		 * RTM CPUID bit.  Clear it here since it is now
+		 * expected to be not set.
+		 */
+		clear_cpu_cap(c, X86_FEATURE_RTM);
+		setup_clear_cpu_cap(X86_FEATURE_RTM);
+	} else if (tsx_ctrl_state == TSX_CTRL_ENABLE) {
+		/*
+		 * HW defaults TSX to be enabled at bootup.
+		 * We may still need the TSX enable support
+		 * during init for special cases like
+		 * kexec after TSX is disabled.
+		 */
+		tsx_enable();
+		/*
+		 * tsx_enable() will change the state of the
+		 * RTM CPUID bit.  Force it here since it is now
+		 * expected to be set.
+		 */
+		set_cpu_cap(c, X86_FEATURE_RTM);
+		setup_force_cpu_cap(X86_FEATURE_RTM);
+	}
 }
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 05/11] TAAv5 5
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (3 preceding siblings ...)
  2019-10-05  6:29 ` [MODERATED] [PATCH v5 04/11] TAAv5 4 Pawan Gupta
@ 2019-10-05  6:30 ` Pawan Gupta
  2019-10-05  6:31 ` [MODERATED] [PATCH v5 06/11] TAAv5 6 Pawan Gupta
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:30 UTC (permalink / raw)
  To: speck

TSX Async Abort (TAA) is a side channel vulnerability to the internal
buffers in some Intel processors, similar to Microarchitectural Data
Sampling (MDS).  In this case certain loads may speculatively pass
invalid data to dependent operations when an asynchronous abort
condition is pending in a TSX transaction.  This includes loads with no
fault or assist condition.  Such loads may speculatively expose stale
data from the uarch data structures as in MDS.  The scope of exposure is
both same-thread and cross-thread.  This issue affects all current
processors that support TSX, but do not have ARCH_CAP_TAA_NO (bit 8) set
in MSR_IA32_ARCH_CAPABILITIES.

On CPUs which have the IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0 and
CPUID.MD_CLEAR=1, the MDS mitigation already clears the CPU buffers
using VERW or L1D_FLUSH, so no additional mitigation is needed for
TAA.

On affected CPUs with MDS_NO=1 this issue can be mitigated by disabling
the Transactional Synchronization Extensions (TSX) feature.  A new MSR,
IA32_TSX_CTRL, available in future processors and in current processors
after a microcode update, can be used to control the TSX feature.  The
TSX_CTRL_RTM_DISABLE bit disables the TSX sub-feature Restricted
Transactional Memory (RTM).  The TSX_CTRL_CPUID_CLEAR bit clears the RTM
enumeration in CPUID.  The other TSX sub-feature, Hardware Lock Elision
(HLE), is unconditionally disabled with updated microcode but still
enumerated as present by CPUID(EAX=7).EBX{bit4}.

The second mitigation approach is similar to MDS: clearing the affected
CPU buffers on return to user space and when entering a guest.  The
relevant microcode update is required for the mitigation to work.  More
details on this approach can be found here:
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html

The TSX feature can be controlled by the "tsx" command line parameter.
If TSX is forced on, then the "Clear CPU buffers" (MDS) mitigation is
deployed. The effective mitigation state can be read from sysfs, as
shown in the example below.
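
As an illustrative example (the sysfs reporting file is added by the next
patch; the exact output also depends on the SMT state), forcing TSX on
while keeping the buffer clearing mitigation:

	# kernel command line
	tsx=on tsx_async_abort=full

	# after boot
	$ cat /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
	Mitigation: Clear CPU buffers; SMT vulnerable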

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 arch/x86/include/asm/cpufeatures.h   |   1 +
 arch/x86/include/asm/msr-index.h     |   4 +
 arch/x86/include/asm/nospec-branch.h |   4 +-
 arch/x86/include/asm/processor.h     |   7 ++
 arch/x86/kernel/cpu/bugs.c           | 127 ++++++++++++++++++++++++++-
 arch/x86/kernel/cpu/common.c         |  15 ++++
 6 files changed, 154 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index e880f2408e29..138512ecc975 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -397,5 +397,6 @@
 #define X86_BUG_MDS			X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
 #define X86_BUG_MSBDS_ONLY		X86_BUG(20) /* CPU is only affected by the  MSDBS variant of BUG_MDS */
 #define X86_BUG_SWAPGS			X86_BUG(21) /* CPU is affected by speculation through SWAPGS */
+#define X86_BUG_TAA			X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 45b6705d9f71..4b654ae5b915 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -94,6 +94,10 @@
 						  * Sampling (MDS) vulnerabilities.
 						  */
 #define ARCH_CAP_TSX_CTRL_MSR		BIT(7)	/* MSR for TSX control is available. */
+#define ARCH_CAP_TAA_NO			BIT(8)	/*
+						 * Not susceptible to
+						 * TSX Async Abort (TAA) vulnerabilities.
+						 */
 
 #define MSR_IA32_FLUSH_CMD		0x0000010b
 #define L1D_FLUSH			BIT(0)	/*
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 80bc209c0708..5c24a7b35166 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -314,7 +314,7 @@ DECLARE_STATIC_KEY_FALSE(mds_idle_clear);
 #include <asm/segment.h>
 
 /**
- * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ * mds_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability
  *
  * This uses the otherwise unused and obsolete VERW instruction in
  * combination with microcode which triggers a CPU buffer flush when the
@@ -337,7 +337,7 @@ static inline void mds_clear_cpu_buffers(void)
 }
 
 /**
- * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability
+ * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability
  *
  * Clear CPU buffers if the corresponding static key is enabled
  */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 6e0a3b43d027..999b85039128 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -988,4 +988,11 @@ enum mds_mitigations {
 	MDS_MITIGATION_VMWERV,
 };
 
+enum taa_mitigations {
+	TAA_MITIGATION_OFF,
+	TAA_MITIGATION_UCODE_NEEDED,
+	TAA_MITIGATION_VERW,
+	TAA_MITIGATION_TSX_DISABLE,
+};
+
 #endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index c6fa3ef10b4e..75d7dd7eee1b 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -39,6 +39,7 @@ static void __init spectre_v2_select_mitigation(void);
 static void __init ssb_select_mitigation(void);
 static void __init l1tf_select_mitigation(void);
 static void __init mds_select_mitigation(void);
+static void __init taa_select_mitigation(void);
 
 /* The base value of the SPEC_CTRL MSR that always has to be preserved. */
 u64 x86_spec_ctrl_base;
@@ -105,6 +106,7 @@ void __init check_bugs(void)
 	ssb_select_mitigation();
 	l1tf_select_mitigation();
 	mds_select_mitigation();
+	taa_select_mitigation();
 
 	arch_smt_update();
 
@@ -268,6 +270,110 @@ static int __init mds_cmdline(char *str)
 }
 early_param("mds", mds_cmdline);
 
+#undef pr_fmt
+#define pr_fmt(fmt)	"TAA: " fmt
+
+/* Default mitigation for TAA-affected CPUs */
+static enum taa_mitigations taa_mitigation __ro_after_init = TAA_MITIGATION_VERW;
+static bool taa_nosmt __ro_after_init;
+
+static const char * const taa_strings[] = {
+	[TAA_MITIGATION_OFF]		= "Vulnerable",
+	[TAA_MITIGATION_UCODE_NEEDED]	= "Vulnerable: Clear CPU buffers attempted, no microcode",
+	[TAA_MITIGATION_VERW]		= "Mitigation: Clear CPU buffers",
+	[TAA_MITIGATION_TSX_DISABLE]	= "Mitigation: TSX disabled",
+};
+
+static void __init taa_select_mitigation(void)
+{
+	u64 ia32_cap = read_ia32_arch_cap();
+
+	if (!boot_cpu_has_bug(X86_BUG_TAA)) {
+		taa_mitigation = TAA_MITIGATION_OFF;
+		return;
+	}
+
+	/*
+	 * Since X86_BUG_TAA=1, the TSX feature is supported by the hardware.
+	 * If TSX was disabled (X86_FEATURE_RTM=0) earlier during tsx_init(),
+	 * select TSX_DISABLE as the mitigation.
+	 *
+	 * This check is ahead of mitigations=off and tsx_async_abort=off
+	 * because when TSX is disabled the mitigation is already in place.
+	 * This ensures sysfs doesn't show "Vulnerable" when TSX is disabled.
+	 */
+	if (!boot_cpu_has(X86_FEATURE_RTM)) {
+		taa_mitigation = TAA_MITIGATION_TSX_DISABLE;
+		pr_info("%s\n", taa_strings[taa_mitigation]);
+		return;
+	}
+
+	/* All mitigations turned off from cmdline (mitigations=off) */
+	if (cpu_mitigations_off()) {
+		taa_mitigation = TAA_MITIGATION_OFF;
+		return;
+	}
+
+	/* TAA mitigation is turned off from cmdline (tsx_async_abort=off) */
+	if (taa_mitigation == TAA_MITIGATION_OFF) {
+		pr_info("%s\n", taa_strings[taa_mitigation]);
+		return;
+	}
+
+	if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
+		taa_mitigation = TAA_MITIGATION_VERW;
+	else
+		taa_mitigation = TAA_MITIGATION_UCODE_NEEDED;
+
+	/*
+	 * VERW doesn't clear the CPU buffers when MD_CLEAR=1 and MDS_NO=1.
+	 * A microcode update fixes this behavior to clear CPU buffers.
+	 * Microcode update also adds support for MSR_IA32_TSX_CTRL which
+	 * is enumerated by ARCH_CAP_TSX_CTRL_MSR bit.
+	 *
+	 * On MDS_NO=1 CPUs if ARCH_CAP_TSX_CTRL_MSR is not set, microcode
+	 * update is required.
+	 */
+	if ((ia32_cap & ARCH_CAP_MDS_NO) &&
+	    !(ia32_cap & ARCH_CAP_TSX_CTRL_MSR))
+		taa_mitigation = TAA_MITIGATION_UCODE_NEEDED;
+
+	/*
+	 * TSX is enabled, select alternate mitigation for TAA which is
+	 * same as MDS. Enable MDS static branch to clear CPU buffers.
+	 *
+	 * For guests that can't determine whether the correct microcode is
+	 * present on host, enable the mitigation for UCODE_NEEDED as well.
+	 */
+	static_branch_enable(&mds_user_clear);
+
+	if (taa_nosmt || cpu_mitigations_auto_nosmt())
+		cpu_smt_disable(false);
+
+	pr_info("%s\n", taa_strings[taa_mitigation]);
+}
+
+static int __init tsx_async_abort_cmdline(char *str)
+{
+	if (!boot_cpu_has_bug(X86_BUG_TAA))
+		return 0;
+
+	if (!str)
+		return -EINVAL;
+
+	if (!strcmp(str, "off")) {
+		taa_mitigation = TAA_MITIGATION_OFF;
+	} else if (!strcmp(str, "full")) {
+		taa_mitigation = TAA_MITIGATION_VERW;
+	} else if (!strcmp(str, "full,nosmt")) {
+		taa_mitigation = TAA_MITIGATION_VERW;
+		taa_nosmt = true;
+	}
+
+	return 0;
+}
+early_param("tsx_async_abort", tsx_async_abort_cmdline);
+
 #undef pr_fmt
 #define pr_fmt(fmt)     "Spectre V1 : " fmt
 
@@ -765,7 +871,7 @@ static void update_indir_branch_cond(void)
 #undef pr_fmt
 #define pr_fmt(fmt) fmt
 
-/* Update the static key controlling the MDS CPU buffer clear in idle */
+/* Update the static key controlling the MDS and TAA CPU buffer clear in idle */
 static void update_mds_branch_idle(void)
 {
 	/*
@@ -775,8 +881,11 @@ static void update_mds_branch_idle(void)
 	 * The other variants cannot be mitigated when SMT is enabled, so
 	 * clearing the buffers on idle just to prevent the Store Buffer
 	 * repartitioning leak would be a window dressing exercise.
+	 *
+	 * Apply idle buffer clearing to TAA affected CPUs also.
 	 */
-	if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY))
+	if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY) &&
+	    !boot_cpu_has_bug(X86_BUG_TAA))
 		return;
 
 	if (sched_smt_active())
@@ -786,6 +895,7 @@ static void update_mds_branch_idle(void)
 }
 
 #define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
+#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n"
 
 void arch_smt_update(void)
 {
@@ -819,6 +929,19 @@ void arch_smt_update(void)
 		break;
 	}
 
+	switch (taa_mitigation) {
+	case TAA_MITIGATION_VERW:
+	case TAA_MITIGATION_UCODE_NEEDED:
+		if (sched_smt_active())
+			pr_warn_once(TAA_MSG_SMT);
+		/* TSX is enabled, apply MDS idle buffer clearing. */
+		update_mds_branch_idle();
+		break;
+	case TAA_MITIGATION_TSX_DISABLE:
+	case TAA_MITIGATION_OFF:
+		break;
+	}
+
 	mutex_unlock(&spec_ctrl_mutex);
 }
 
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index af9e357b2ca9..6b25039aa8ae 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1128,6 +1128,21 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 	if (!cpu_matches(NO_SWAPGS))
 		setup_force_cpu_bug(X86_BUG_SWAPGS);
 
+	/*
+	 * When the processor is not mitigated for TAA (TAA_NO=0), set TAA bug when:
+	 *	- TSX is supported, or
+	 *	- TSX_CTRL is supported
+	 *
+	 * The TSX_CTRL check is needed when TSX could be disabled before the
+	 * kernel boots, e.g. kexec. The TSX_CTRL check alone is not sufficient
+	 * when the microcode update is not present or when running as a guest
+	 * that doesn't get TSX_CTRL exported.
+	 */
+	if (!(ia32_cap & ARCH_CAP_TAA_NO) &&
+	    (boot_cpu_has(X86_FEATURE_RTM) ||
+	     (ia32_cap & ARCH_CAP_TSX_CTRL_MSR)))
+		setup_force_cpu_bug(X86_BUG_TAA);
+
 	if (cpu_matches(NO_MELTDOWN))
 		return;
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 06/11] TAAv5 6
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (4 preceding siblings ...)
  2019-10-05  6:30 ` [MODERATED] [PATCH v5 05/11] TAAv5 5 Pawan Gupta
@ 2019-10-05  6:31 ` Pawan Gupta
  2019-10-05  6:32 ` [MODERATED] [PATCH v5 07/11] TAAv5 7 Pawan Gupta
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:31 UTC (permalink / raw)
  To: speck

Add the sysfs reporting file for TSX Async Abort. It exposes the
vulnerability and the mitigation state, similar to the existing files for
the other hardware vulnerabilities.

The sysfs file path is:
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
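
Example output (illustrative; the first part of the string comes from
taa_strings[] in the previous patch, the SMT suffix is added by this one):

	$ cat /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
	Mitigation: TSX disabled

CPUs that are not affected report "Not affected".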

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 arch/x86/kernel/cpu/bugs.c | 23 +++++++++++++++++++++++
 drivers/base/cpu.c         |  9 +++++++++
 include/linux/cpu.h        |  3 +++
 3 files changed, 35 insertions(+)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 75d7dd7eee1b..b9b122b5bcae 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1451,6 +1451,21 @@ static ssize_t mds_show_state(char *buf)
 		       sched_smt_active() ? "vulnerable" : "disabled");
 }
 
+static ssize_t tsx_async_abort_show_state(char *buf)
+{
+	if ((taa_mitigation == TAA_MITIGATION_TSX_DISABLE) ||
+	    (taa_mitigation == TAA_MITIGATION_OFF))
+		return sprintf(buf, "%s\n", taa_strings[taa_mitigation]);
+
+	if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
+		return sprintf(buf, "%s; SMT Host state unknown\n",
+			       taa_strings[taa_mitigation]);
+	}
+
+	return sprintf(buf, "%s; SMT %s\n", taa_strings[taa_mitigation],
+		       sched_smt_active() ? "vulnerable" : "disabled");
+}
+
 static char *stibp_state(void)
 {
 	if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
@@ -1521,6 +1536,9 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
 	case X86_BUG_MDS:
 		return mds_show_state(buf);
 
+	case X86_BUG_TAA:
+		return tsx_async_abort_show_state(buf);
+
 	default:
 		break;
 	}
@@ -1557,4 +1575,9 @@ ssize_t cpu_show_mds(struct device *dev, struct device_attribute *attr, char *bu
 {
 	return cpu_show_common(dev, attr, buf, X86_BUG_MDS);
 }
+
+ssize_t cpu_show_tsx_async_abort(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return cpu_show_common(dev, attr, buf, X86_BUG_TAA);
+}
 #endif
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index cc37511de866..0fccd8c0312e 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -554,12 +554,20 @@ ssize_t __weak cpu_show_mds(struct device *dev,
 	return sprintf(buf, "Not affected\n");
 }
 
+ssize_t __weak cpu_show_tsx_async_abort(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	return sprintf(buf, "Not affected\n");
+}
+
 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
 static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
 static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
 static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL);
 static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);
+static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL);
 
 static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_meltdown.attr,
@@ -568,6 +576,7 @@ static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_spec_store_bypass.attr,
 	&dev_attr_l1tf.attr,
 	&dev_attr_mds.attr,
+	&dev_attr_tsx_async_abort.attr,
 	NULL
 };
 
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index fcb1386bb0d4..1872b15bda75 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -59,6 +59,9 @@ extern ssize_t cpu_show_l1tf(struct device *dev,
 			     struct device_attribute *attr, char *buf);
 extern ssize_t cpu_show_mds(struct device *dev,
 			    struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_tsx_async_abort(struct device *dev,
+					struct device_attribute *attr,
+					char *buf);
 
 extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 07/11] TAAv5 7
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (5 preceding siblings ...)
  2019-10-05  6:31 ` [MODERATED] [PATCH v5 06/11] TAAv5 6 Pawan Gupta
@ 2019-10-05  6:32 ` Pawan Gupta
  2019-10-05  6:33 ` [MODERATED] [PATCH v5 08/11] TAAv5 8 Pawan Gupta
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:32 UTC (permalink / raw)
  To: speck

Export the IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0 to guests on TSX Async
Abort (TAA) affected hosts that have TSX enabled and updated microcode.
This is required so that the guests don't complain,

	"Vulnerable: Clear CPU buffers attempted, no microcode"

when the host has the updated microcode to clear CPU buffers.

The microcode update also adds support for MSR_IA32_TSX_CTRL, which is
enumerated by the ARCH_CAP_TSX_CTRL bit in the IA32_ARCH_CAPABILITIES MSR.
Guests can't do this check themselves when the ARCH_CAP_TSX_CTRL bit is
not exported to them.

In this case, export MDS_NO=0 to the guests. When guests have
CPUID.MD_CLEAR=1, they deploy the MDS mitigation, which also mitigates TAA.
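
Illustrative guest-visible effect (assuming a guest kernel with the TAA
patches and CPUID.MD_CLEAR passed through; the strings come from
taa_strings[]):

	# Guest on a TAA-affected MDS_NO=1 host, before this change:
	$ cat /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
	Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown

	# Same guest with MDS_NO=0 exported by the host:
	Mitigation: Clear CPU buffers; SMT Host state unknown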

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 arch/x86/kvm/x86.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 91602d310a3f..282b909b9394 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1254,6 +1254,25 @@ static u64 kvm_get_arch_capabilities(void)
 	if (l1tf_vmx_mitigation != VMENTER_L1D_FLUSH_NEVER)
 		data |= ARCH_CAP_SKIP_VMENTRY_L1DFLUSH;
 
+	/*
+	 * On TAA affected systems, export MDS_NO=0 when:
+	 *	- TSX is enabled on host, i.e. X86_FEATURE_RTM=1.
+	 *	- Updated microcode is present. This is detected by
+	 *	  the presence of ARCH_CAP_TSX_CTRL_MSR. This ensures
+	 *	  VERW clears CPU buffers.
+	 *
+	 * When MDS_NO=0 is exported, guests deploy clear CPU buffer
+	 * mitigation and don't complain:
+	 *
+	 *	"Vulnerable: Clear CPU buffers attempted, no microcode"
+	 *
+	 * If TSX is disabled on the system, guests are also mitigated against
+	 * TAA and clear CPU buffer mitigation is not required for guests.
+	 */
+	if (boot_cpu_has_bug(X86_BUG_TAA) && boot_cpu_has(X86_FEATURE_RTM) &&
+	    (data & ARCH_CAP_TSX_CTRL_MSR))
+		data &= ~ARCH_CAP_MDS_NO;
+
 	return data;
 }
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 08/11] TAAv5 8
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (6 preceding siblings ...)
  2019-10-05  6:32 ` [MODERATED] [PATCH v5 07/11] TAAv5 7 Pawan Gupta
@ 2019-10-05  6:33 ` Pawan Gupta
  2019-10-05  6:34 ` [MODERATED] [PATCH v5 09/11] TAAv5 9 Pawan Gupta
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:33 UTC (permalink / raw)
  To: speck

Platforms which are not affected by X86_BUG_TAA may want the TSX feature
enabled. Add an "auto" option to the TSX cmdline parameter. With tsx=auto,
TSX is disabled when X86_BUG_TAA is present, otherwise TSX is enabled.

More details on X86_BUG_TAA can be found here:
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html
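
Illustrative usage (the result depends on whether the platform sets
X86_BUG_TAA):

	tsx=auto

	# On a platform without X86_BUG_TAA, TSX stays enabled:
	$ cat /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
	Not affected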

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 Documentation/admin-guide/kernel-parameters.txt | 5 +++++
 arch/x86/kernel/cpu/tsx.c                       | 9 +++++++++
 2 files changed, 14 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 832537d59562..d7b48c38c6e5 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4821,6 +4821,11 @@
 
 			on	- Enable TSX on the system.
 			off	- Disable TSX on the system.
+			auto	- Disable TSX if X86_BUG_TAA is present,
+				  otherwise enable TSX on the system.
+
+			More details on X86_BUG_TAA are here:
+			Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
 
 			Not specifying this option is equivalent to tsx=off.
 
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index 1c0cee7a7d46..73a0e5af3720 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -23,6 +23,7 @@ static enum tsx_user_cmds {
 	TSX_USER_CMD_NONE,
 	TSX_USER_CMD_ON,
 	TSX_USER_CMD_OFF,
+	TSX_USER_CMD_AUTO,
 } tsx_user_cmd = TSX_USER_CMD_NONE;
 
 static int __init tsx_cmdline(char *str)
@@ -38,6 +39,8 @@ static int __init tsx_cmdline(char *str)
 		tsx_user_cmd = TSX_USER_CMD_ON;
 	else if (!strcmp(str, "off"))
 		tsx_user_cmd = TSX_USER_CMD_OFF;
+	else if (!strcmp(str, "auto"))
+		tsx_user_cmd = TSX_USER_CMD_AUTO;
 
 	return 0;
 }
@@ -104,6 +107,12 @@ void tsx_init(struct cpuinfo_x86 *c)
 	case TSX_USER_CMD_OFF:
 		tsx_ctrl_state = TSX_CTRL_DISABLE;
 		break;
+	case TSX_USER_CMD_AUTO:
+		if (boot_cpu_has_bug(X86_BUG_TAA))
+			tsx_ctrl_state = TSX_CTRL_DISABLE;
+		else
+			tsx_ctrl_state = TSX_CTRL_ENABLE;
+		break;
 	case TSX_USER_CMD_NONE:
 	default:
 		/*
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 09/11] TAAv5 9
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (7 preceding siblings ...)
  2019-10-05  6:33 ` [MODERATED] [PATCH v5 08/11] TAAv5 8 Pawan Gupta
@ 2019-10-05  6:34 ` Pawan Gupta
  2019-10-05  6:35 ` [MODERATED] [PATCH v5 10/11] TAAv5 10 Pawan Gupta
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:34 UTC (permalink / raw)
  To: speck

Transactional Synchronization Extensions (TSX) is an extension to the
x86 instruction set architecture (ISA) that adds Hardware Transactional
Memory (HTM) support. Changing the TSX state currently requires a reboot.
This may not be desirable when rebooting imposes a huge penalty. Add
support to control the TSX feature via a new sysfs file:
/sys/devices/system/cpu/hw_tx_mem

- Writing 0|off|N|n to this file disables TSX feature on all the CPUs.
  This is equivalent to boot parameter tsx=off.
- Writing 1|on|Y|y to this file enables TSX feature on all the CPUs.
  This is equivalent to boot parameter tsx=on.
- Reading from this file returns the status of the TSX feature (see the
  example after this list).
- When TSX control is not supported, this interface is not visible in
  sysfs.
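
For example (illustrative shell session; writes require root):

	# Query the current state (1 = TSX enabled, 0 = TSX disabled):
	$ cat /sys/devices/system/cpu/hw_tx_mem
	1

	# Disable TSX on all CPUs, equivalent to booting with tsx=off:
	$ echo 0 > /sys/devices/system/cpu/hw_tx_mem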

Changing the TSX state from this interface also updates the CPUID.RTM
feature bit.  From the kernel side, this feature bit doesn't result in
any ALTERNATIVE code patching.  No memory allocations are done to
save/restore user state. No code paths outside of the tests for
vulnerability to TAA depend on the value of the feature bit.  In
general, the kernel doesn't care whether RTM is present or not.

Applications typically look at CPUID bits once at startup (or when first
calling into a library that uses the feature).  So we have a couple of
cases to cover:

1) An application started and saw that RTM was enabled, so began
   to use it. Then TSX was disabled.  The net result in this case is that
   the application will keep trying to use RTM, but every xbegin() will
   immediately abort the transaction. This has a performance impact on
   the application, but it doesn't affect correctness because all users
   of RTM must have a fallback path for when the transaction aborts (a
   minimal sketch of this pattern follows after case 2). Note that even
   if an application is in the middle of a transaction when we disable
   RTM, we are safe. The IPI that we use to update the TSX_CTRL MSR will
   abort the transaction (just as any interrupt would abort a
   transaction).

2) An application starts and sees RTM is not available. So it will
   always use alternative paths. Even if TSX is enabled and RTM is set,
   applications in general do not re-evaluate their choice so will
   continue to run in non-TSX mode.
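
To make case 1 concrete, here is a minimal user-space sketch (not part of
this patch) of the RTM pattern with the mandatory fallback path, using the
compiler's <immintrin.h> intrinsics (built with -mrtm):

	#include <immintrin.h>
	#include <pthread.h>

	static pthread_mutex_t fallback_lock = PTHREAD_MUTEX_INITIALIZER;

	static void critical_section(void)
	{
		/*
		 * With RTM disabled via TSX_CTRL, the transaction aborts
		 * immediately and execution takes the fallback branch.
		 */
		if (_xbegin() == _XBEGIN_STARTED) {
			/*
			 * ... transactional work ...
			 * A production implementation would also read the
			 * fallback lock here and abort if it is held.
			 */
			_xend();
		} else {
			/* Fallback: take a real lock, correctness is kept. */
			pthread_mutex_lock(&fallback_lock);
			/* ... same work, non-transactionally ... */
			pthread_mutex_unlock(&fallback_lock);
		}
	}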

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 .../ABI/testing/sysfs-devices-system-cpu      |  23 ++++
 arch/x86/kernel/cpu/tsx.c                     | 103 ++++++++++++++++++
 drivers/base/cpu.c                            |  32 +++++-
 include/linux/cpu.h                           |   6 +
 4 files changed, 163 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 5f7d7b14fa44..0c3c8c462285 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -562,3 +562,26 @@ Description:	Umwait control
 			  or C0.2 state. The time is an unsigned 32-bit number.
 			  Note that a value of zero means there is no limit.
 			  Low order two bits must be zero.
+
+What:		/sys/devices/system/cpu/hw_tx_mem
+Date:		August 2019
+Contact:	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
+		Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Hardware Transactional Memory (HTM) control.
+
+		Read/write interface to control HTM feature for all the CPUs in
+		the system.  This interface is only present on platforms that
+		support HTM control. HTM is a hardware feature to speed up the
+		execution of multi-threaded software through lock elision. An
+		example of HTM implementation is Intel Transactional
+		Synchronization Extensions (TSX).
+
+			Read returns the status of HTM feature.
+
+			0: HTM is disabled
+			1: HTM is enabled
+
+			Write sets the state of HTM feature.
+
+			0: Disables HTM
+			1: Enables HTM
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index 73a0e5af3720..2cea038fdcba 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -10,9 +10,12 @@
 
 #include <linux/processor.h>
 #include <linux/cpufeature.h>
+#include <linux/cpu.h>
 
 #include "cpu.h"
 
+static DEFINE_MUTEX(tsx_mutex);
+
 static enum tsx_ctrl_states {
 	TSX_CTRL_ENABLE,
 	TSX_CTRL_DISABLE,
@@ -150,3 +153,103 @@ void tsx_init(struct cpuinfo_x86 *c)
 		setup_force_cpu_cap(X86_FEATURE_RTM);
 	}
 }
+
+static void tsx_update_this_cpu(void *arg)
+{
+	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
+	unsigned long enable = (unsigned long)arg;
+
+	if (enable) {
+		tsx_enable();
+		set_cpu_cap(c, X86_FEATURE_RTM);
+		setup_force_cpu_cap(X86_FEATURE_RTM);
+	} else {
+		tsx_disable();
+		clear_cpu_cap(c, X86_FEATURE_RTM);
+		setup_clear_cpu_cap(X86_FEATURE_RTM);
+	}
+}
+
+static void tsx_update_on_each_cpu(bool val)
+{
+	get_online_cpus();
+	on_each_cpu(tsx_update_this_cpu, (void *)val, 1);
+	put_online_cpus();
+}
+
+ssize_t hw_tx_mem_show(struct device *dev, struct device_attribute *attr,
+		       char *buf)
+{
+	return sprintf(buf, "%d\n", tsx_ctrl_state == TSX_CTRL_ENABLE ? 1 : 0);
+}
+
+ssize_t hw_tx_mem_store(struct device *dev, struct device_attribute *attr,
+			const char *buf, size_t count)
+{
+	enum tsx_ctrl_states requested_state;
+	ssize_t ret;
+	bool val;
+
+	ret = kstrtobool(buf, &val);
+	if (ret)
+		return ret;
+
+	mutex_lock(&tsx_mutex);
+
+	if (val) {
+		tsx_user_cmd = TSX_USER_CMD_ON;
+		requested_state = TSX_CTRL_ENABLE;
+	} else {
+		tsx_user_cmd = TSX_USER_CMD_OFF;
+		requested_state = TSX_CTRL_DISABLE;
+	}
+
+	/* Current state is the same as the requested state, do nothing */
+	if (tsx_ctrl_state == requested_state)
+		goto exit;
+
+	tsx_ctrl_state = requested_state;
+
+	/*
+	 * Changing the TSX state from this interface also updates the CPUID.RTM
+	 * feature bit. From the kernel side, this feature bit doesn't result
+	 * in any ALTERNATIVE code patching.  No memory allocations are done to
+	 * save/restore user state. No code paths outside of the tests for
+	 * vulnerability to TAA depend on the value of the feature bit.
+	 * In general, the kernel doesn't care whether RTM is present or not.
+	 *
+	 * From the user side it is a bit fuzzier. Applications typically look
+	 * at CPUID bits once at startup (or when first calling into a library
+	 * that uses the feature).  So we have a couple of cases to cover:
+	 *
+	 * 1) An application started and saw that RTM was enabled, so began
+	 *    to use it. Then TSX was disabled.  Net result in this case is
+	 *    that the application will keep trying to use RTM, but every
+	 *    xbegin() will immediately abort the transaction. This has a
+	 *    performance impact to the application, but it doesn't affect
+	 *    correctness because all users of RTM must have a fallback path
+	 *    for when the transaction aborts. Note that even if an application
+	 *    is in the middle of a transaction when we disable RTM, we are
+	 *    safe. The IPI that we use to update the TSX_CTRL MSR will abort
+	 *    the transaction (just as any interrupt would abort a
+	 *    transaction).
+	 *
+	 * 2) An application starts and sees RTM is not available. So it will
+	 *    always use alternative paths. Even if TSX is enabled and RTM is
+	 *    set, applications in general do not re-evaluate their choice so
+	 *    will continue to run in non-TSX mode.
+	 */
+	tsx_update_on_each_cpu(val);
+exit:
+	mutex_unlock(&tsx_mutex);
+
+	return count;
+}
+
+umode_t hw_tx_mem_is_visible(void)
+{
+	if (tsx_ctrl_state == TSX_CTRL_NOT_SUPPORTED)
+		return 0;
+
+	return 0644;
+}
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 0fccd8c0312e..2148e974ab0a 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -460,6 +460,34 @@ struct device *cpu_device_create(struct device *parent, void *drvdata,
 }
 EXPORT_SYMBOL_GPL(cpu_device_create);
 
+ssize_t __weak hw_tx_mem_show(struct device *dev, struct device_attribute *a,
+			      char *buf)
+{
+	return -ENODEV;
+}
+
+ssize_t __weak hw_tx_mem_store(struct device *dev, struct device_attribute *a,
+			       const char *buf, size_t count)
+{
+	return -ENODEV;
+}
+
+DEVICE_ATTR_RW(hw_tx_mem);
+
+umode_t __weak hw_tx_mem_is_visible(void)
+{
+	return 0;
+}
+
+static umode_t cpu_root_attrs_is_visible(struct kobject *kobj,
+					 struct attribute *attr, int index)
+{
+	if (attr == &dev_attr_hw_tx_mem.attr)
+		return hw_tx_mem_is_visible();
+
+	return attr->mode;
+}
+
 #ifdef CONFIG_GENERIC_CPU_AUTOPROBE
 static DEVICE_ATTR(modalias, 0444, print_cpu_modalias, NULL);
 #endif
@@ -481,11 +509,13 @@ static struct attribute *cpu_root_attrs[] = {
 #ifdef CONFIG_GENERIC_CPU_AUTOPROBE
 	&dev_attr_modalias.attr,
 #endif
+	&dev_attr_hw_tx_mem.attr,
 	NULL
 };
 
 static struct attribute_group cpu_root_attr_group = {
-	.attrs = cpu_root_attrs,
+	.attrs		= cpu_root_attrs,
+	.is_visible	= cpu_root_attrs_is_visible,
 };
 
 static const struct attribute_group *cpu_root_attr_groups[] = {
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 1872b15bda75..820fda85fc71 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -63,6 +63,12 @@ extern ssize_t cpu_show_tsx_async_abort(struct device *dev,
 					struct device_attribute *attr,
 					char *buf);
 
+extern ssize_t hw_tx_mem_show(struct device *dev, struct device_attribute *a,
+			      char *buf);
+extern ssize_t hw_tx_mem_store(struct device *dev, struct device_attribute *a,
+			       const char *buf, size_t count);
+extern umode_t hw_tx_mem_is_visible(void);
+
 extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,
 				 const struct attribute_group **groups,
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 10/11] TAAv5 10
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (8 preceding siblings ...)
  2019-10-05  6:34 ` [MODERATED] [PATCH v5 09/11] TAAv5 9 Pawan Gupta
@ 2019-10-05  6:35 ` Pawan Gupta
  2019-10-05  6:36 ` [MODERATED] [PATCH v5 11/11] TAAv5 11 Pawan Gupta
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:35 UTC (permalink / raw)
  To: speck

When the TSX state is changed from the sysfs interface, the TSX Async Abort
(TAA) mitigation state also needs to be updated. Set the TAA mitigation
state according to the TSX and VERW static branch state.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
---
 arch/x86/kernel/cpu/bugs.c | 21 ++++++++++++++++++++-
 arch/x86/kernel/cpu/cpu.h  |  2 ++
 arch/x86/kernel/cpu/tsx.c  |  3 +++
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index b9b122b5bcae..8144f3bcc831 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -274,7 +274,7 @@ early_param("mds", mds_cmdline);
 #define pr_fmt(fmt)	"TAA: " fmt
 
 /* Default mitigation for TAA-affected CPUs */
-static enum taa_mitigations taa_mitigation __ro_after_init = TAA_MITIGATION_VERW;
+static enum taa_mitigations taa_mitigation = TAA_MITIGATION_VERW;
 static bool taa_nosmt __ro_after_init;
 
 static const char * const taa_strings[] = {
@@ -374,6 +374,25 @@ static int __init tsx_async_abort_cmdline(char *str)
 }
 early_param("tsx_async_abort", tsx_async_abort_cmdline);
 
+void taa_update_mitigation(bool tsx_enabled)
+{
+	/*
+	 * When userspace changes the TSX state, update taa_mitigation
+	 * so that the updated mitigation state is shown in:
+	 * /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+	 *
+	 * If TSX is disabled, report TSX_DISABLE.
+	 * Else, if CPU buffer clearing is enabled, report VERW.
+	 * Else, the system is vulnerable.
+	 */
+	if (!tsx_enabled)
+		taa_mitigation = TAA_MITIGATION_TSX_DISABLE;
+	else if (static_key_count(&mds_user_clear.key))
+		taa_mitigation = TAA_MITIGATION_VERW;
+	else
+		taa_mitigation = TAA_MITIGATION_OFF;
+}
+
 #undef pr_fmt
 #define pr_fmt(fmt)     "Spectre V1 : " fmt
 
diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
index d864ec4180cc..da2579edbb86 100644
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -64,6 +64,8 @@ extern void x86_spec_ctrl_setup_ap(void);
 
 extern void tsx_init(struct cpuinfo_x86 *c);
 
+extern void taa_update_mitigation(bool tsx_enabled);
+
 extern u64 read_ia32_arch_cap(void);
 
 #endif /* ARCH_X86_CPU_H */
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index 2cea038fdcba..44b973fd41f4 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -240,6 +240,9 @@ ssize_t hw_tx_mem_store(struct device *dev, struct device_attribute *attr,
 	 *    will continue to run in non-TSX mode.
 	 */
 	tsx_update_on_each_cpu(val);
+
+	if (boot_cpu_has_bug(X86_BUG_TAA))
+		taa_update_mitigation(val);
 exit:
 	mutex_unlock(&tsx_mutex);
 
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] [PATCH v5 11/11] TAAv5 11
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (9 preceding siblings ...)
  2019-10-05  6:35 ` [MODERATED] [PATCH v5 10/11] TAAv5 10 Pawan Gupta
@ 2019-10-05  6:36 ` Pawan Gupta
  2019-10-05 10:54 ` [MODERATED] Re: [PATCH v5 02/11] TAAv5 2 Borislav Petkov
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-05  6:36 UTC (permalink / raw)
  To: speck

Add the documentation for TSX Async Abort. Include a description of
the issue, how to check the mitigation state, how to control the
mitigation, and guidance for system administrators.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Co-developed-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 .../ABI/testing/sysfs-devices-system-cpu      |   1 +
 Documentation/admin-guide/hw-vuln/index.rst   |   1 +
 .../admin-guide/hw-vuln/tsx_async_abort.rst   | 269 ++++++++++++++++++
 .../admin-guide/kernel-parameters.txt         |  36 +++
 Documentation/x86/index.rst                   |   1 +
 Documentation/x86/tsx_async_abort.rst         |  54 ++++
 6 files changed, 362 insertions(+)
 create mode 100644 Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
 create mode 100644 Documentation/x86/tsx_async_abort.rst

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 0c3c8c462285..09a90be9f8cc 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -486,6 +486,7 @@ What:		/sys/devices/system/cpu/vulnerabilities
 		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
 		/sys/devices/system/cpu/vulnerabilities/l1tf
 		/sys/devices/system/cpu/vulnerabilities/mds
+		/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
 Date:		January 2018
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Information about CPU vulnerabilities
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index 49311f3da6f2..0802b1c67452 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -12,3 +12,4 @@ are configurable at compile, boot or run time.
    spectre
    l1tf
    mds
+   tsx_async_abort
diff --git a/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst b/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
new file mode 100644
index 000000000000..ad6b7ac332d2
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
@@ -0,0 +1,269 @@
+TAA - TSX Asynchronous Abort
+======================================
+
+TAA is a hardware vulnerability that allows unprivileged speculative access to
+data which is available in various CPU internal buffers by using asynchronous
+aborts within an Intel TSX transactional region.
+
+Affected processors
+-------------------
+
+This vulnerability only affects Intel processors that support Intel
+Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8)
+is 0 in the IA32_ARCH_CAPABILITIES MSR.  On processors where the MDS_NO bit
+(bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations
+also mitigate against TAA.
+
+Whether a processor is affected or not can be read out from the TAA
+vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`.
+
+Related CVEs
+------------
+
+The following CVE entry is related to this TAA issue:
+
+   ==============  =====  ===================================================
+   CVE-2019-11135  TAA    TSX Asynchronous Abort (TAA) condition on some
+                          microprocessors utilizing speculative execution may
+                          allow an authenticated user to potentially enable
+                          information disclosure via a side channel with
+                          local access.
+   ==============  =====  ===================================================
+
+Problem
+-------
+
+When performing store, load or L1 refill operations, processors write data into
+temporary microarchitectural structures (buffers). The data in the buffer can
+be forwarded to load operations as an optimization.
+
+Intel TSX is an extension to the x86 instruction set architecture that adds
+hardware transactional memory support to improve performance of multi-threaded
+software. TSX lets the processor expose and exploit concurrency hidden in an
+application by dynamically avoiding unnecessary synchronization.
+
+TSX supports atomic memory transactions that are either committed (success) or
+aborted. During an abort, operations that happened within the transactional region
+are rolled back. An asynchronous abort takes place, among other reasons, when a
+different thread accesses a cache line that is also used within the transactional
+region and that access might lead to a data race.
+
+Immediately after an uncompleted asynchronous abort, certain speculatively
+executed loads may read data from those internal buffers and pass it to dependent
+operations. This can then be used to infer the value via a cache side channel
+attack.
+
+Because the buffers are potentially shared between Hyper-Threads, cross
+Hyper-Thread attacks are possible.
+
+The victim of a malicious actor does not need to make use of TSX. Only the
+attacker needs to begin a TSX transaction and raise an asynchronous abort
+to try to leak some of the data stored in the buffers.
+
+Deeper technical information is available in the TAA specific x86 architecture
+section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.
+
+
+Attack scenarios
+----------------
+
+Attacks against the TAA vulnerability can be implemented from unprivileged
+applications running on hosts or guests.
+
+As for MDS, the attacker has no control over the memory addresses that can be
+leaked. Only the victim is responsible for bringing data to the CPU. As a
+result, the malicious actor has to first sample as much data as possible and
+then postprocess it to try to infer any useful information from it.
+
+A potential attacker only has read access to the data. Also, there is no direct
+privilege escalation by using this technique.
+
+
+.. _tsx_async_abort_sys_info:
+
+TAA system information
+-----------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current TAA status
+of mitigated systems. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+
+The possible values in this file are:
+
+.. list-table::
+
+   * - 'Vulnerable'
+     - The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied.
+   * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
+     - The system tries to clear the buffers but the microcode might not support the operation.
+   * - 'Mitigation: Clear CPU buffers'
+     - The microcode has been updated to clear the buffers. TSX is still enabled.
+   * - 'Mitigation: TSX disabled'
+     - TSX is disabled.
+   * - 'Not affected'
+     - The CPU is not affected by this issue.
+
+.. _ucode_needed:
+
+Best effort mitigation mode
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the processor is vulnerable, but the availability of the microcode-based
+mitigation mechanism is not advertised via CPUID, the kernel selects a best
+effort mitigation mode.  This mode invokes the mitigation instructions
+without a guarantee that they clear the CPU buffers.
+
+This is done to address virtualization scenarios where the host has the
+microcode update applied, but the hypervisor is not yet updated to expose the
+CPUID to the guest. If the host has updated microcode the protection takes
+effect; otherwise a few CPU cycles are wasted pointlessly.
+
+The state in the tsx_async_abort sysfs file reflects this situation
+accordingly.
+
+
+Mitigation mechanism
+--------------------
+
+The kernel detects the affected CPUs and the presence of the microcode which is
+required. If a CPU is affected and the microcode is available, then the kernel
+enables the mitigation by default.
+
+
+The mitigation can be controlled at boot time via a kernel command line option.
+See :ref:`taa_mitigation_control_command_line`. The kernel also provides a
+sysfs interface. See :ref:`taa_mitigation_sysfs`.
+
+.. _virt_mechanism:
+
+Virtualization mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Affected systems where the host has the TAA microcode and the TAA mitigation is
+ON (with TSX disabled) are not vulnerable regardless of the status of the VMs.
+
+In all other cases, if the host either does not have the TAA microcode or the
+kernel is not mitigated, the system might be vulnerable.
+
+
+.. _taa_mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows controlling the TAA mitigations at boot time with
+the option "tsx_async_abort=". The valid arguments for this option are:
+
+  ============  =============================================================
+  off		This option disables the TAA mitigation on affected platforms.
+                If the system has TSX enabled (see next parameter) and the CPU
+                is affected, the system is vulnerable.
+
+  full	        TAA mitigation is enabled. If TSX is enabled, on an affected
+                system it will clear CPU buffers on ring transitions. On
+                systems which are MDS-affected and deploy MDS mitigation,
+                TAA is also mitigated. Specifying this option on those
+                systems will have no effect.
+
+  full,nosmt    The same as tsx_async_abort=full, with SMT disabled on
+                vulnerable CPUs that have TSX enabled. This is the complete
+                mitigation. When TSX is disabled, SMT is not disabled because
+                the CPU is not vulnerable to cross-thread TAA attacks.
+  ============  =============================================================
+
+Not specifying this option is equivalent to "tsx_async_abort=full".
+
+The kernel command line also allows controlling the TSX feature using the
+parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used
+to control the TSX feature and the enumeration of the TSX feature bits (RTM
+and HLE) in CPUID.
+
+The valid options are:
+
+  ============  =============================================================
+  off		Disables TSX.
+
+  on		Enables TSX.
+
+  auto		Disables TSX on affected platforms, otherwise enables TSX.
+  ============  =============================================================
+
+Not specifying this option is equivalent to "tsx=off".
+
+The following combinations of the "tsx_async_abort" and "tsx" command line
+options are possible. On affected platforms tsx=auto is equivalent to tsx=off
+and the result will be:
+
+  =========  ====================   =========================================
+  tsx=on     tsx_async_abort=full   The system will use VERW to clear CPU
+                                    buffers.
+  tsx=on     tsx_async_abort=off    The system is vulnerable.
+  tsx=off    tsx_async_abort=full   TSX is disabled. System is not vulnerable.
+  tsx=off    tsx_async_abort=off    TSX is disabled. System is not vulnerable.
+  =========  ====================   =========================================
+
+On unaffected platforms "tsx=on" and "tsx_async_abort=full" do not clear CPU
+buffers.  On platforms without TSX control the "tsx" command line argument has
+no effect.
+
+
+.. _taa_mitigation_sysfs:
+
+Mitigation control using sysfs
+------------------------------
+
+For those affected systems that cannot be frequently rebooted to enable or
+disable TSX, sysfs can be used as an alternative after installing the updates.
+The possible values for the file /sys/devices/system/cpu/tsx are:
+
+  ============  =============================================================
+  0		Disable TSX. Upon entering a TSX transactional region, the code
+                will immediately abort, before any instruction executes within
+                the transactional region, even speculatively, and continue on
+                the fallback path. Equivalent to boot parameter "tsx=off".
+
+  1		Enable TSX. Equivalent to boot parameter "tsx=on".
+
+  ============  =============================================================
+
+Reading from this file returns the status of the TSX feature. This file is only
+present on systems that support TSX control.
+
+When disabling TSX by using the sysfs mechanism, applications that are already
+running and use TSX will see their transactional regions aborted and execution
+flow will be redirected to the fallback, losing the benefits of the
+non-blocking path. Applications that use TSX must provide fallback code to
+guarantee correct execution without transactional regions.
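+
+For illustration only, a typical RTM transaction with its required fallback
+path looks roughly like the sketch below (hypothetical application code, not
+from the kernel; built with gcc -mrtm)::
+
+	#include <immintrin.h>
+
+	static volatile int fallback_lock;
+
+	static void counter_inc(volatile long *counter)
+	{
+		if (_xbegin() == _XBEGIN_STARTED) {
+			/* Abort if the fallback lock is held so that the
+			 * two paths stay coherent. */
+			if (fallback_lock)
+				_xabort(0xff);
+			(*counter)++;
+			_xend();
+			return;
+		}
+
+		/* Fallback path: always taken once TSX is disabled, because
+		 * every _xbegin() then aborts immediately. */
+		while (__sync_lock_test_and_set(&fallback_lock, 1))
+			while (fallback_lock)
+				;	/* spin */
+		(*counter)++;
+		__sync_lock_release(&fallback_lock);
+	}
+
+Disabling TSX at runtime does not break such code; it only loses the
+transactional fast path.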
+
+
+Mitigation selection guide
+--------------------------
+
+1. Trusted userspace and guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If all user space applications are from a trusted source and do not execute
+untrusted code which is supplied externally, then the mitigation can be
+disabled. The same applies to virtualized environments with trusted guests.
+
+
+2. Untrusted userspace and guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If there are untrusted applications or guests on the system, enabling TSX
+might allow a malicious actor to leak data from the host or from other
+processes running on the same physical core.
+
+If the microcode is available and TSX is disabled on the host, attacks
+are prevented in a virtualized environment as well, even if the VMs do not
+explicitly enable the mitigation.
+
+
+.. _taa_default_mitigations:
+
+Default mitigations
+-------------------
+
+The kernel's default action for vulnerable processors is:
+
+  - Deploy TSX disable mitigation (tsx_async_abort=full).
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d7b48c38c6e5..12aea1711bbf 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2612,6 +2612,7 @@
 					       ssbd=force-off [ARM64]
 					       l1tf=off [X86]
 					       mds=off [X86]
+					       tsx_async_abort=off [X86]
 
 			auto (default)
 				Mitigate all CPU vulnerabilities, but leave SMT
@@ -2627,6 +2628,7 @@
 				be fully mitigated, even if it means losing SMT.
 				Equivalent to: l1tf=flush,nosmt [X86]
 					       mds=full,nosmt [X86]
+					       tsx_async_abort=full,nosmt [X86]
 
 	mminit_loglevel=
 			[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
@@ -4656,6 +4658,40 @@
 			neutralize any effect of /proc/sys/kernel/sysrq.
 			Useful for debugging.
 
+	tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async
+			Abort (TAA) vulnerability.
+
+			Similar to Micro-architectural Data Sampling (MDS),
+			certain CPUs that support Transactional
+			Synchronization Extensions (TSX) are vulnerable to an
+			exploit against CPU internal buffers which can forward
+			information to a disclosure gadget under certain
+			conditions.
+
+			In vulnerable processors, the speculatively forwarded
+			data can be used in a cache side channel attack, to
+			access data to which the attacker does not have direct
+			access.
+
+			This parameter controls the TAA mitigation.  The
+			options are:
+
+			full       - Enable TAA mitigation on vulnerable CPUs
+			full,nosmt - Enable TAA mitigation and disable SMT on
+				     vulnerable CPUs. If TSX is disabled, SMT
+				     is not disabled because CPU is not
+				     is not disabled because the CPU is not
+			off        - Unconditionally disable TAA mitigation
+
+			Not specifying this option is equivalent to
+			tsx_async_abort=full.  On CPUs which are MDS affected
+			and deploy MDS mitigation, TAA mitigation is not
+			required and doesn't provide any additional
+			mitigation.
+
+			For details see:
+			Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
+
 	tcpmhash_entries= [KNL,NET]
 			Set the number of tcp_metrics_hash slots.
 			Default value is 8192 or 16384 depending on total
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index af64c4bb4447..a8de2fbc1caa 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -27,6 +27,7 @@ x86-specific Documentation
    mds
    microcode
    resctrl_ui
+   tsx_async_abort
    usb-legacy-support
    i386/index
    x86_64/index
diff --git a/Documentation/x86/tsx_async_abort.rst b/Documentation/x86/tsx_async_abort.rst
new file mode 100644
index 000000000000..0bcc764c0429
--- /dev/null
+++ b/Documentation/x86/tsx_async_abort.rst
@@ -0,0 +1,54 @@
+TSX Async Abort (TAA) mitigation
+=================================================
+
+.. _tsx_async_abort:
+
+Overview
+--------
+
+TSX Async Abort (TAA) is a side channel attack on internal buffers in some
+Intel processors similar to Microarchitectural Data Sampling (MDS).  In this
+case certain loads may speculatively pass invalid data to dependent operations
+when an asynchronous abort condition is pending in a Transactional
+Synchronization Extensions (TSX) transaction.  This includes loads with no
+fault or assist condition. Such loads may speculatively expose stale data from
+the same uarch data structures as in MDS, with the same scope of exposure, i.e.
+same-thread and cross-thread. This issue affects all current processors that
+support TSX.
+
+Mitigation strategy
+-------------------
+
+a) TSX disable - One of the mitigations is to disable the TSX feature. A new
+MSR, IA32_TSX_CTRL, is available on future processors and, after a microcode
+update, on current processors.  This MSR can be used to disable the TSX
+feature and the enumeration of the TSX feature bits (RTM and HLE) in CPUID.
+
+b) CPU clear buffers - Similar to MDS, clearing the CPU buffers mitigates this
+vulnerability. More details on this approach can be found here
+https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html
+
+Kernel internal mitigation modes
+--------------------------------
+
+ =============    ============================================================
+ off              Mitigation is disabled. Either the CPU is not affected or
+                  tsx_async_abort=off is supplied on the kernel command line.
+
+ tsx disabled     Mitigation is enabled. TSX feature is disabled by default at
+                  bootup on processors that support TSX control.
+
+ verw             Mitigation is enabled. CPU is affected and MD_CLEAR is
+                  advertised in CPUID.
+
+ ucode needed     Mitigation is enabled. CPU is affected and MD_CLEAR is not
+                  advertised in CPUID. That is mainly for virtualization
+                  scenarios where the host has the updated microcode but the
+                  hypervisor does not expose MD_CLEAR in CPUID. It's a best
+                  effort approach without guarantee.
+ =============    ============================================================
+
+If the CPU is affected and the "tsx_async_abort" kernel command line parameter
+is not provided, then the kernel selects an appropriate mitigation depending on
+the status of the RTM and MD_CLEAR CPUID bits.
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 02/11] TAAv5 2
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (10 preceding siblings ...)
  2019-10-05  6:36 ` [MODERATED] [PATCH v5 11/11] TAAv5 11 Pawan Gupta
@ 2019-10-05 10:54 ` Borislav Petkov
  2019-10-07 17:48   ` Pawan Gupta
       [not found] ` <5d98396a.1c69fb81.6c7a8.23b1SMTPIN_ADDED_BROKEN@mx.google.com>
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 75+ messages in thread
From: Borislav Petkov @ 2019-10-05 10:54 UTC (permalink / raw)
  To: speck

On Fri, Oct 04, 2019 at 11:27:31PM -0700, speck for Pawan Gupta wrote:
> Move IA32_ARCH_CAPABILITIES MSR read to a helper function. If the CPU
> doesn't support this MSR return 0. Use this function in following
> commits to read IA32_ARCH_CAPABILITIES MSR.
> 
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Reviewed-by: Mark Gross <mgross@linux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
> ---
>  arch/x86/kernel/cpu/common.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index f125bf7ecb6f..af9e357b2ca9 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1091,19 +1091,27 @@ static bool __init cpu_matches(unsigned long which)
>  	return m && !!(m->driver_data & which);
>  }
>  
> -static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
> +u64 read_ia32_arch_cap(void)

If this is going to be used by other compilation units, call it
something more presentable and clear, with a proper prefix:

	x86_read_arch_cap_msr()

or so.

>  {
>  	u64 ia32_cap = 0;
>  
> +	/* Leave the MSR set to all 0's when not supported */

This comment is not really needed.

> +	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
> +		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
> +
> +	return ia32_cap;
> +}
> +
> +static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
		^^^^^^

Please audit your newly added functions to see whether they can be __init
too.

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 247165, AG München
-- 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 03/11] TAAv5 3
       [not found] ` <5d98396a.1c69fb81.6c7a8.23b1SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2019-10-05 21:43   ` Andy Lutomirski
  2019-10-07 17:50     ` Pawan Gupta
  0 siblings, 1 reply; 75+ messages in thread
From: Andy Lutomirski @ 2019-10-05 21:43 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 4527 bytes --]

On 10/4/19 11:28 PM, speck for Pawan Gupta wrote:
> Disable TSX by default on bootup. If IA32_TSX_CTRL MSR is not present,
> TSX state stays NOT_SUPPORTED which is the compile time default,
> otherwise change TSX state to DISABLE. This is because on certain
> processsors TSX may be used as a part of a speculative side channel
> attack.
> 
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Reviewed-by: Mark Gross <mgross@linux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
> ---
>  arch/x86/kernel/cpu/Makefile |  2 +-
>  arch/x86/kernel/cpu/cpu.h    |  4 ++
>  arch/x86/kernel/cpu/intel.c  |  2 +
>  arch/x86/kernel/cpu/tsx.c    | 72 ++++++++++++++++++++++++++++++++++++
>  4 files changed, 79 insertions(+), 1 deletion(-)
>  create mode 100644 arch/x86/kernel/cpu/tsx.c
> 
> diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
> index d7a1e5a9331c..890f60083eca 100644
> --- a/arch/x86/kernel/cpu/Makefile
> +++ b/arch/x86/kernel/cpu/Makefile
> @@ -30,7 +30,7 @@ obj-$(CONFIG_PROC_FS)	+= proc.o
>  obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o
>  
>  ifdef CONFIG_CPU_SUP_INTEL
> -obj-y			+= intel.o intel_pconfig.o
> +obj-y			+= intel.o intel_pconfig.o tsx.o
>  obj-$(CONFIG_PM)	+= intel_epb.o
>  endif
>  obj-$(CONFIG_CPU_SUP_AMD)		+= amd.o
> diff --git a/arch/x86/kernel/cpu/cpu.h b/arch/x86/kernel/cpu/cpu.h
> index c0e2407abdd6..d864ec4180cc 100644
> --- a/arch/x86/kernel/cpu/cpu.h
> +++ b/arch/x86/kernel/cpu/cpu.h
> @@ -62,4 +62,8 @@ unsigned int aperfmperf_get_khz(int cpu);
>  
>  extern void x86_spec_ctrl_setup_ap(void);
>  
> +extern void tsx_init(struct cpuinfo_x86 *c);
> +
> +extern u64 read_ia32_arch_cap(void);
> +
>  #endif /* ARCH_X86_CPU_H */
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index 8d6d92ebeb54..b1d6c96f6b88 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -761,6 +761,8 @@ static void init_intel(struct cpuinfo_x86 *c)
>  		detect_tme(c);
>  
>  	init_intel_misc_features(c);
> +
> +	tsx_init(c);
>  }
>  
>  #ifdef CONFIG_X86_32
> diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
> new file mode 100644
> index 000000000000..c549750dd7c8
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/tsx.c
> @@ -0,0 +1,72 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Intel Transactional Synchronization Extensions (TSX) control.
> + *
> + * Copyright (C) 2019 Intel Corporation
> + *
> + * Author:
> + *	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> + */
> +
> +#include <linux/processor.h>
> +#include <linux/cpufeature.h>
> +
> +#include "cpu.h"
> +
> +static enum tsx_ctrl_states {
> +	TSX_CTRL_ENABLE,
> +	TSX_CTRL_DISABLE,
> +	TSX_CTRL_NOT_SUPPORTED,
> +} tsx_ctrl_state = TSX_CTRL_NOT_SUPPORTED;
> +
> +static void tsx_disable(void)
> +{
> +	u64 tsx;
> +
> +	rdmsrl(MSR_IA32_TSX_CTRL, tsx);
> +
> +	/* Force all transactions to immediately abort */
> +	tsx |= TSX_CTRL_RTM_DISABLE;
> +	/*
> +	 * Ensure TSX support is not enumerated in CPUID.
> +	 * This is visible to userspace and will ensure they
> +	 * do not waste resources trying TSX transactions that
> +	 * will always abort.
> +	 */
> +	tsx |= TSX_CTRL_CPUID_CLEAR;
> +
> +	wrmsrl(MSR_IA32_TSX_CTRL, tsx);
> +}
> +
> +static bool tsx_ctrl_is_supported(void)
> +{
> +	u64 ia32_cap = read_ia32_arch_cap();
> +
> +	/*
> +	 * TSX is controlled via MSR_IA32_TSX_CTRL.  However,
> +	 * support for this MSR is enumerated by ARCH_CAP_TSX_MSR bit
> +	 * in MSR_IA32_ARCH_CAPABILITIES.
> +	 */
> +	return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
> +}
> +
> +void tsx_init(struct cpuinfo_x86 *c)
> +{
> +	if (!tsx_ctrl_is_supported())
> +		return;
> +
> +	/*
> +	 * Default to TSX_CTRL_DISABLE. This is because on certain processors
> +	 * TSX may be used as part of a speculative side channel attack.
> +	 */
> +	tsx_ctrl_state = TSX_CTRL_DISABLE;
> +
> +	tsx_disable();
> +	/*
> +	 * tsx_disable() will change the state of the
> +	 * RTM CPUID bit.  Clear it here since it is now
> +	 * expected to be not set.
> +	 */
> +	clear_cpu_cap(c, X86_FEATURE_RTM);
> +	setup_clear_cpu_cap(X86_FEATURE_RTM);

You shouldn't be doing both clear_cpu_cap *and* setup_clear_cpu_cap().

Also, please fold patch 4 into this patch or at least make the contents
of tsx_init() in this patch make some sense.


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 04/11] TAAv5 4
       [not found] ` <5d9839a4.1c69fb81.238e9.8312SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2019-10-05 21:45   ` Andy Lutomirski
  0 siblings, 0 replies; 75+ messages in thread
From: Andy Lutomirski @ 2019-10-05 21:45 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 4518 bytes --]

On 10/4/19 11:29 PM, speck for Pawan Gupta wrote:
> Add kernel cmdline parameter "tsx" to control the Transactional
> Synchronization Extensions (TSX) feature.  On CPUs that support TSX
> control, use "tsx=on|off" to enable or disable TSX.  Not specifying this
> option is equivalent to "tsx=off".
> 
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Reviewed-by: Mark Gross <mgross@linux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
> ---
>  .../admin-guide/kernel-parameters.txt         | 11 +++
>  arch/x86/kernel/cpu/tsx.c                     | 97 ++++++++++++++++---
>  2 files changed, 95 insertions(+), 13 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 4c1971960afa..832537d59562 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4813,6 +4813,17 @@
>  			interruptions from clocksource watchdog are not
>  			acceptable).
>  
> +	tsx=		[X86] Control Transactional Synchronization
> +			Extensions (TSX) feature in Intel processors that
> +			support TSX control.
> +
> +			This parameter controls the TSX feature. The options are:
> +
> +			on	- Enable TSX on the system.
> +			off	- Disable TSX on the system.
> +
> +			Not specifying this option is equivalent to tsx=off.
> +
>  	turbografx.map[2|3]=	[HW,JOY]
>  			TurboGraFX parallel port interface
>  			Format:
> diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
> index c549750dd7c8..1c0cee7a7d46 100644
> --- a/arch/x86/kernel/cpu/tsx.c
> +++ b/arch/x86/kernel/cpu/tsx.c
> @@ -19,6 +19,30 @@ static enum tsx_ctrl_states {
>  	TSX_CTRL_NOT_SUPPORTED,
>  } tsx_ctrl_state = TSX_CTRL_NOT_SUPPORTED;
>  
> +static enum tsx_user_cmds {
> +	TSX_USER_CMD_NONE,
> +	TSX_USER_CMD_ON,
> +	TSX_USER_CMD_OFF,
> +} tsx_user_cmd = TSX_USER_CMD_NONE;
> +
> +static int __init tsx_cmdline(char *str)
> +{
> +	if (!str)
> +		return -EINVAL;
> +
> +	/*
> +	 * tsx_en/disable() are only called when
> +	 * X86_FEATURE_RTM and TSX_CTRL MSR are supported.
> +	 */
> +	if (!strcmp(str, "on"))
> +		tsx_user_cmd = TSX_USER_CMD_ON;
> +	else if (!strcmp(str, "off"))
> +		tsx_user_cmd = TSX_USER_CMD_OFF;
> +
> +	return 0;
> +}
> +early_param("tsx", tsx_cmdline);
> +
>  static void tsx_disable(void)
>  {
>  	u64 tsx;
> @@ -38,6 +62,24 @@ static void tsx_disable(void)
>  	wrmsrl(MSR_IA32_TSX_CTRL, tsx);
>  }
>  
> +static void tsx_enable(void)
> +{
> +	u64 tsx;
> +
> +	rdmsrl(MSR_IA32_TSX_CTRL, tsx);
> +
> +	/* Enable the RTM feature in the cpu */
> +	tsx &= ~TSX_CTRL_RTM_DISABLE;
> +	/*
> +	 * Ensure TSX support is enumerated in CPUID.
> +	 * This is visible to userspace and will ensure they
> +	 * can enumerate and use the TSX feature.
> +	 */
> +	tsx &= ~TSX_CTRL_CPUID_CLEAR;
> +
> +	wrmsrl(MSR_IA32_TSX_CTRL, tsx);
> +}
> +
>  static bool tsx_ctrl_is_supported(void)
>  {
>  	u64 ia32_cap = read_ia32_arch_cap();
> @@ -55,18 +97,47 @@ void tsx_init(struct cpuinfo_x86 *c)
>  	if (!tsx_ctrl_is_supported())
>  		return;
>  
> -	/*
> -	 * Default to TSX_CTRL_DISABLE. This is because on certain processors
> -	 * TSX may be used as part of a speculative side channel attack.
> -	 */
> -	tsx_ctrl_state = TSX_CTRL_DISABLE;
> +	switch (tsx_user_cmd) {
> +	case TSX_USER_CMD_ON:
> +		tsx_ctrl_state = TSX_CTRL_ENABLE;
> +		break;
> +	case TSX_USER_CMD_OFF:
> +		tsx_ctrl_state = TSX_CTRL_DISABLE;
> +		break;
> +	case TSX_USER_CMD_NONE:
> +	default:
> +		/*
> +		 * If user provided an invalid option or tsx= is not provided
> +		 * on cmdline default to TSX_CTRL_DISABLE. This is because on
> +		 * certain processors TSX may be used as part of a speculative
> +		 * side channel attack.
> +		 */
> +		tsx_ctrl_state = TSX_CTRL_DISABLE;
> +	}

This code is bizarre.  Why are you separating tsx_ctrl_state from
tsx_user_cmd?  Just have one copy of this, please.  You shouldn't be
writing to a *global* cpu state variable like this in a percpu init
function.  This code should just program the MSR.

Also, the new improved just-one-enum-for-everything global variable
should be __ro_after_init.

Given that you are doing the percpu config for each cpu, setup_clear_...
should be unnecessary.  It's just a helper that automatically clears the
bit for other CPUs.
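
IOW something along these lines, as a rough and untested sketch that reuses
tsx_ctrl_is_supported()/tsx_enable()/tsx_disable() from this patch and keeps
a single state variable (tsx_apply_state() is a made-up name):

	static enum tsx_ctrl_states tsx_ctrl_state __ro_after_init = TSX_CTRL_NOT_SUPPORTED;

	/* Runs once, on the boot CPU. */
	void __init tsx_init(void)
	{
		char arg[8];

		if (!tsx_ctrl_is_supported())
			return;

		/* Default is off; "tsx=on" is the only way to keep TSX on. */
		tsx_ctrl_state = TSX_CTRL_DISABLE;
		if (cmdline_find_option(boot_command_line, "tsx", arg,
					sizeof(arg)) > 0 && !strcmp(arg, "on"))
			tsx_ctrl_state = TSX_CTRL_ENABLE;
	}

	/* Runs on every CPU (boot, secondary, hotplug): only programs the MSR. */
	void tsx_apply_state(void)
	{
		if (tsx_ctrl_state == TSX_CTRL_ENABLE)
			tsx_enable();
		else if (tsx_ctrl_state == TSX_CTRL_DISABLE)
			tsx_disable();
	}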


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 09/11] TAAv5 9
       [not found] ` <5d983ad2.1c69fb81.63edd.6575SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2019-10-05 21:49   ` Andy Lutomirski
  2019-10-07 18:35     ` Pawan Gupta
  0 siblings, 1 reply; 75+ messages in thread
From: Andy Lutomirski @ 2019-10-05 21:49 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 5693 bytes --]

On 10/4/19 11:34 PM, speck for Pawan Gupta wrote:
> Transactional Synchronization Extensions (TSX) is an extension to the
> x86 instruction set architecture (ISA) that adds Hardware Transactional
> Memory (HTM) support. Changing TSX state currently requires a reboot.
> This may not be desirable when rebooting imposes a huge penalty. Add
> support to control TSX feature via a new sysfs file:
> /sys/devices/system/cpu/hw_tx_mem
> 
> - Writing 0|off|N|n to this file disables TSX feature on all the CPUs.
>   This is equivalent to boot parameter tsx=off.
> - Writing 1|on|Y|y to this file enables TSX feature on all the CPUs.
>   This is equivalent to boot parameter tsx=on.
> - Reading from this returns the status of TSX feature.
> - When TSX control is not supported this interface is not visible in
>   sysfs.
> 
> Changing the TSX state from this interface also updates CPUID.RTM
> feature bit.  From the kernel side, this feature bit doesn't result in
> any ALTERNATIVE code patching.  No memory allocations are done to
> save/restore user state. No code paths in outside of the tests for
> vulnerability to TAA are dependent on the value of the feature bit.  In
> general the kernel doesn't care whether RTM is present or not.
> 
> Applications typically look at CPUID bits once at startup (or when first
> calling into a library that uses the feature).  So we have a couple of
> cases to cover:
> 
> 1) An application started and saw that RTM was enabled, so began
>    to use it. Then TSX was disabled.  Net result in this case is that
>    the application will keep trying to use RTM, but every xbegin() will
>    immediately abort the transaction. This has a performance impact to
>    the application, but it doesn't affect correctness because all users
>    of RTM must have a fallback path for when the transaction aborts. Note
>    that even if an application is in the middle of a transaction when we
>    disable RTM, we are safe. The XPI that we use to update the TSX_CTRL
>    MSR will abort the transaction (just as any interrupt would abort
>    a transaction).
> 
> 2) An application starts and sees RTM is not available. So it will
>    always use alternative paths. Even if TSX is enabled and RTM is set,
>    applications in general do not re-evaluate their choice so will
>    continue to run in non-TSX mode.
> 
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Reviewed-by: Mark Gross <mgross@linux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
> ---
>  .../ABI/testing/sysfs-devices-system-cpu      |  23 ++++
>  arch/x86/kernel/cpu/tsx.c                     | 103 ++++++++++++++++++
>  drivers/base/cpu.c                            |  32 +++++-
>  include/linux/cpu.h                           |   6 +
>  4 files changed, 163 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index 5f7d7b14fa44..0c3c8c462285 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -562,3 +562,26 @@ Description:	Umwait control
>  			  or C0.2 state. The time is an unsigned 32-bit number.
>  			  Note that a value of zero means there is no limit.
>  			  Low order two bits must be zero.
> +
> +What:		/sys/devices/system/cpu/hw_tx_mem
> +Date:		August 2019
> +Contact:	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> +		Linux kernel mailing list <linux-kernel@vger.kernel.org>
> +Description:	Hardware Transactional Memory (HTM) control.
> +
> +		Read/write interface to control HTM feature for all the CPUs in
> +		the system.  This interface is only present on platforms that
> +		support HTM control. HTM is a hardware feature to speed up the
> +		execution of multi-threaded software through lock elision. An
> +		example of HTM implementation is Intel Transactional
> +		Synchronization Extensions (TSX).
> +
> +			Read returns the status of HTM feature.
> +
> +			0: HTM is disabled
> +			1: HTM is enabled
> +
> +			Write sets the state of HTM feature.
> +
> +			0: Disables HTM
> +			1: Enables HTM
> diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
> index 73a0e5af3720..2cea038fdcba 100644
> --- a/arch/x86/kernel/cpu/tsx.c
> +++ b/arch/x86/kernel/cpu/tsx.c
> @@ -10,9 +10,12 @@
>  
>  #include <linux/processor.h>
>  #include <linux/cpufeature.h>
> +#include <linux/cpu.h>
>  
>  #include "cpu.h"
>  
> +static DEFINE_MUTEX(tsx_mutex);
> +
>  static enum tsx_ctrl_states {
>  	TSX_CTRL_ENABLE,
>  	TSX_CTRL_DISABLE,
> @@ -150,3 +153,103 @@ void tsx_init(struct cpuinfo_x86 *c)
>  		setup_force_cpu_cap(X86_FEATURE_RTM);
>  	}
>  }
> +
> +static void tsx_update_this_cpu(void *arg)
> +{
> +	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> +	unsigned long enable = (unsigned long)arg;
> +
> +	if (enable) {
> +		tsx_enable();
> +		set_cpu_cap(c, X86_FEATURE_RTM);
> +		setup_force_cpu_cap(X86_FEATURE_RTM);
> +	} else {
> +		tsx_disable();
> +		clear_cpu_cap(c, X86_FEATURE_RTM);
> +		setup_clear_cpu_cap(X86_FEATURE_RTM);
> +	}
> +}

You can't use these setup_... functions after boot like this.  Also, I'm
not sure you can safely even set_cpu_cap(...) after boot.

> +
> +static void tsx_update_on_each_cpu(bool val)
> +{
> +	get_online_cpus();
> +	on_each_cpu(tsx_update_this_cpu, (void *)val, 1);
> +	put_online_cpus();

I sure hope that on_each_cpu() is sensible enough to automatically do
the get_online_cpus() thing.  Is it not?


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 01/11] TAAv5 1
       [not found] ` <5d9838f1.1c69fb81.f1bab.d886SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2019-10-05 21:49   ` Andy Lutomirski
  2019-10-06 17:40     ` Andrew Cooper
  0 siblings, 1 reply; 75+ messages in thread
From: Andy Lutomirski @ 2019-10-05 21:49 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 1679 bytes --]

On 10/4/19 11:26 PM, speck for Pawan Gupta wrote:
> Transactional Synchronization Extensions (TSX) may be used on certain
> processors as part of a speculative side channel attack.  A microcode
> update for existing processors that are vulnerable to this attack will
> add a new MSR, IA32_TSX_CTRL to allow the system administrator the
> option to disable TSX as one of the possible mitigations.  [Note that
> future processors that are not vulnerable will also support the
> IA32_TSX_CTRL MSR].  Add defines for the new IA32_TSX_CTRL MSR and its
> bits.
> 
> TSX has two sub-features:
> 
> 1. Restricted Transactional Memory (RTM) is an explicitly-used feature
>    where new instructions begin and end TSX transactions.
> 2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of
>    "old" style locks are used by software.
> 
> Bit 7 of the IA32_ARCH_CAPABILITIES indicates the presence of the
> IA32_TSX_CTRL MSR.
> 
> There are two control bits in IA32_TSX_CTRL MSR:
> 
>   Bit 0: When set it disables the Restricted Transactional Memory (RTM)
>          sub-feature of TSX (will force all transactions to abort on the
> 	 XBEGIN instruction).
> 
>   Bit 1: When set it disables the enumeration of the RTM feature (i.e.
>          it will make CPUID(EAX=7).EBX{bit11} read as 0).
> 
> The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally
> disabled but still enumerated as present by CPUID(EAX=7).EBX{bit4}.

Does the kernel need to intercept CPUID to clear the HLE bit to avoid
potential performance loss?  I can imagine some programs using alternate
code paths that are better in the non-HLE case.


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 09/11] TAAv5 9
       [not found] ` <5d983ad2.1c69fb81.e6640.8f51SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2019-10-06 17:06   ` Greg KH
  2019-10-08  6:01     ` Pawan Gupta
  0 siblings, 1 reply; 75+ messages in thread
From: Greg KH @ 2019-10-06 17:06 UTC (permalink / raw)
  To: speck

On Fri, Oct 04, 2019 at 11:34:31PM -0700, speck for Pawan Gupta wrote:
> Transactional Synchronization Extensions (TSX) is an extension to the
> x86 instruction set architecture (ISA) that adds Hardware Transactional
> Memory (HTM) support. Changing TSX state currently requires a reboot.
> This may not be desirable when rebooting imposes a huge penalty. Add
> support to control TSX feature via a new sysfs file:
> /sys/devices/system/cpu/hw_tx_mem
> 
> - Writing 0|off|N|n to this file disables TSX feature on all the CPUs.
>   This is equivalent to boot parameter tsx=off.
> - Writing 1|on|Y|y to this file enables TSX feature on all the CPUs.
>   This is equivalent to boot parameter tsx=on.
> - Reading from this returns the status of TSX feature.
> - When TSX control is not supported this interface is not visible in
>   sysfs.
> 
> Changing the TSX state from this interface also updates CPUID.RTM
> feature bit.  From the kernel side, this feature bit doesn't result in
> any ALTERNATIVE code patching.  No memory allocations are done to
> save/restore user state. No code paths in outside of the tests for
> vulnerability to TAA are dependent on the value of the feature bit.  In
> general the kernel doesn't care whether RTM is present or not.
> 
> Applications typically look at CPUID bits once at startup (or when first
> calling into a library that uses the feature).  So we have a couple of
> cases to cover:
> 
> 1) An application started and saw that RTM was enabled, so began
>    to use it. Then TSX was disabled.  Net result in this case is that
>    the application will keep trying to use RTM, but every xbegin() will
>    immediately abort the transaction. This has a performance impact to
>    the application, but it doesn't affect correctness because all users
>    of RTM must have a fallback path for when the transaction aborts. Note
>    that even if an application is in the middle of a transaction when we
>    disable RTM, we are safe. The XPI that we use to update the TSX_CTRL
>    MSR will abort the transaction (just as any interrupt would abort
>    a transaction).
> 
> 2) An application starts and sees RTM is not available. So it will
>    always use alternative paths. Even if TSX is enabled and RTM is set,
>    applications in general do not re-evaluate their choice so will
>    continue to run in non-TSX mode.
> 
> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Reviewed-by: Mark Gross <mgross@linux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
> ---
>  .../ABI/testing/sysfs-devices-system-cpu      |  23 ++++
>  arch/x86/kernel/cpu/tsx.c                     | 103 ++++++++++++++++++
>  drivers/base/cpu.c                            |  32 +++++-
>  include/linux/cpu.h                           |   6 +
>  4 files changed, 163 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
> index 5f7d7b14fa44..0c3c8c462285 100644
> --- a/Documentation/ABI/testing/sysfs-devices-system-cpu
> +++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
> @@ -562,3 +562,26 @@ Description:	Umwait control
>  			  or C0.2 state. The time is an unsigned 32-bit number.
>  			  Note that a value of zero means there is no limit.
>  			  Low order two bits must be zero.
> +
> +What:		/sys/devices/system/cpu/hw_tx_mem
> +Date:		August 2019
> +Contact:	Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> +		Linux kernel mailing list <linux-kernel@vger.kernel.org>
> +Description:	Hardware Transactional Memory (HTM) control.
> +
> +		Read/write interface to control HTM feature for all the CPUs in
> +		the system.  This interface is only present on platforms that
> +		support HTM control. HTM is a hardware feature to speed up the
> +		execution of multi-threaded software through lock elision. An
> +		example of HTM implementation is Intel Transactional
> +		Synchronization Extensions (TSX).
> +
> +			Read returns the status of HTM feature.
> +
> +			0: HTM is disabled
> +			1: HTM is enabled
> +
> +			Write sets the state of HTM feature.
> +
> +			0: Disables HTM
> +			1: Enables HTM
> diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
> index 73a0e5af3720..2cea038fdcba 100644
> --- a/arch/x86/kernel/cpu/tsx.c
> +++ b/arch/x86/kernel/cpu/tsx.c
> @@ -10,9 +10,12 @@
>  
>  #include <linux/processor.h>
>  #include <linux/cpufeature.h>
> +#include <linux/cpu.h>
>  
>  #include "cpu.h"
>  
> +static DEFINE_MUTEX(tsx_mutex);

I think I asked this before, but in looking at the code I still can't
figure it out.  What exactly is this protecting?

It looks like you want to keep only one "writer" out of the sysfs store
function at a time, but:

> +ssize_t hw_tx_mem_store(struct device *dev, struct device_attribute *attr,
> +			const char *buf, size_t count)
> +{
> +	enum tsx_ctrl_states requested_state;
> +	ssize_t ret;
> +	bool val;
> +
> +	ret = kstrtobool(buf, &val);
> +	if (ret)
> +		return ret;
> +
> +	mutex_lock(&tsx_mutex);
> +
> +	if (val) {
> +		tsx_user_cmd = TSX_USER_CMD_ON;
> +		requested_state = TSX_CTRL_ENABLE;
> +	} else {
> +		tsx_user_cmd = TSX_USER_CMD_OFF;
> +		requested_state = TSX_CTRL_DISABLE;
> +	}
> +
> +	/* Current state is same as the reqested state, do nothing */
> +	if (tsx_ctrl_state == requested_state)
> +		goto exit;
> +
> +	tsx_ctrl_state = requested_state;
> +
> +	/*
> +	 * Changing the TSX state from this interface also updates CPUID.RTM
> +	 * feature  bit. From the kernel side, this feature bit doesn't result
> +	 * in any ALTERNATIVE code patching.  No memory allocations are done to
> +	 * save/restore user state. No code paths in outside of the tests for
> +	 * vulnerability to TAA are dependent on the value of the feature bit.
> +	 * In general the kernel doesn't care whether RTM is present or not.
> +	 *
> +	 * From the user side it is a bit fuzzier. Applications typically look
> +	 * at CPUID bits once at startup (or when first calling into a library
> +	 * that uses the feature).  So we have a couple of cases to cover:
> +	 *
> +	 * 1) An application started and saw that RTM was enabled, so began
> +	 *    to use it. Then TSX was disabled.  Net result in this case is
> +	 *    that the application will keep trying to use RTM, but every
> +	 *    xbegin() will immediately abort the transaction. This has a
> +	 *    performance impact to the application, but it doesn't affect
> +	 *    correctness because all users of RTM must have a fallback path
> +	 *    for when the transaction aborts. Note that even if an application
> +	 *    is in the middle of a transaction when we disable RTM, we are
> +	 *    safe. The XPI that we use to update the TSX_CTRL MSR will abort
> +	 *    the transaction (just as any interrupt would abort a
> +	 *    transaction).
> +	 *
> +	 * 2) An application starts and sees RTM is not available. So it will
> +	 *    always use alternative paths. Even if TSX is enabled and RTM is
> +	 *    set, applications in general do not re-evaluate their choice so
> +	 *    will continue to run in non-TSX mode.
> +	 */
> +	tsx_update_on_each_cpu(val);
> +exit:
> +	mutex_unlock(&tsx_mutex);

What I think you want to do is just protect the tsx_update_on_each_cpu()
function, right?

So, you are locking _outside_ of a function call?  That's a sure way to
madness over time.  If this function is so special that it can not be
called multiple times at once, then put the lock _inside_ the function,
right?

Otherwise you could have other places call that function, and this
single lock is not going to protect anything :(

And why is tsx_user_cmd needed, and global?

For a "simple" turn this feature on/off function, it feels like you have
way too many different states and variables holding them here.  Why do
you need more than just one global state, and one variable showing what
was asked for here?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 01/11] TAAv5 1
  2019-10-05 21:49   ` [MODERATED] Re: [PATCH v5 01/11] TAAv5 1 Andy Lutomirski
@ 2019-10-06 17:40     ` Andrew Cooper
  0 siblings, 0 replies; 75+ messages in thread
From: Andrew Cooper @ 2019-10-06 17:40 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 2575 bytes --]

On 05/10/2019 22:49, speck for Andy Lutomirski wrote:
> On 10/4/19 11:26 PM, speck for Pawan Gupta wrote:
>> Transactional Synchronization Extensions (TSX) may be used on certain
>> processors as part of a speculative side channel attack.  A microcode
>> update for existing processors that are vulnerable to this attack will
>> add a new MSR, IA32_TSX_CTRL to allow the system administrator the
>> option to disable TSX as one of the possible mitigations.  [Note that
>> future processors that are not vulnerable will also support the
>> IA32_TSX_CTRL MSR].  Add defines for the new IA32_TSX_CTRL MSR and its
>> bits.
>>
>> TSX has two sub-features:
>>
>> 1. Restricted Transactional Memory (RTM) is an explicitly-used feature
>>    where new instructions begin and end TSX transactions.
>> 2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of
>>    "old" style locks are used by software.
>>
>> Bit 7 of the IA32_ARCH_CAPABILITIES indicates the presence of the
>> IA32_TSX_CTRL MSR.
>>
>> There are two control bits in IA32_TSX_CTRL MSR:
>>
>>   Bit 0: When set it disables the Restricted Transactional Memory (RTM)
>>          sub-feature of TSX (will force all transactions to abort on the
>> 	 XBEGIN instruction).
>>
>>   Bit 1: When set it disables the enumeration of the RTM feature (i.e.
>>          it will make CPUID(EAX=7).EBX{bit11} read as 0).
>>
>> The other TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally
>> disabled but still enumerated as present by CPUID(EAX=7).EBX{bit4}.

After experimenting with the production CLX ucode:

(XEN) microcode: CPU0 updated from revision 0x5000024 to 0x500002b, date
= 2019-08-12
...
(XEN) CPUID.7[0].ebx before 0xd39ffffb, after 0xd39ff7eb, xor 0x00000810

Bit 1 flips both the RTM and HLE bits, which is also my understanding from
the discussion on the calls.

> Does the kernel need to intercept CPUID to clear the HLE bit to avoid
> potential performance loss?  I can imagine some programs using alternate
> code paths that are better in the non-HLE case.

HLE is a hint-only feature saying "things may get faster if you're on
TSX-capable hardware".  That said, it is not possible to implement fair
spinlocks with it (due to the requirement that the lock state on exit be
identical to entry), and it appears to have no production users.

Furthermore, it is already disabled across the board (with the feature
bit still present, for migration compatibility) for previous TSX
errata.  Any perf hit is already being taken.
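
For reference, software only opts into HLE via XACQUIRE/XRELEASE-prefixed
lock operations, e.g. through GCC's HLE memory-model bits. Rough illustration
only (needs -mhle):

#include <immintrin.h>	/* _mm_pause() */

static int lockvar;

static void hle_lock(void)
{
	while (__atomic_exchange_n(&lockvar, 1,
				   __ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE))
		_mm_pause();	/* elision aborted, spin on the real lock */
}

static void hle_unlock(void)
{
	/*
	 * The release must write back the value the lock had on entry (0)
	 * for the elided transaction to commit, which is why fair/ticket
	 * locks, which increment on unlock, cannot use HLE.
	 */
	__atomic_store_n(&lockvar, 0, __ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE);
}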

~Andrew


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 02/11] TAAv5 2
  2019-10-05 10:54 ` [MODERATED] Re: [PATCH v5 02/11] TAAv5 2 Borislav Petkov
@ 2019-10-07 17:48   ` Pawan Gupta
  0 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-07 17:48 UTC (permalink / raw)
  To: speck

> > +static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
> 		^^^^^^
> 
> Please audit your newly added functions whether they can be __init
> too.

Newly added tsx functions are called after boot as well, either from
sysfs or from the cpu hotplug path. I will see if some of them (mostly
tsx_init()) can be changed to be __init. One likely call site for
tsx_init() is identify_boot_cpu(), and then only do
tsx_enable()/disable() from init_intel().

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 03/11] TAAv5 3
  2019-10-05 21:43   ` [MODERATED] Re: [PATCH v5 03/11] TAAv5 3 Andy Lutomirski
@ 2019-10-07 17:50     ` Pawan Gupta
  0 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-07 17:50 UTC (permalink / raw)
  To: speck

> > +void tsx_init(struct cpuinfo_x86 *c)
> > +{
> > +	if (!tsx_ctrl_is_supported())
> > +		return;
> > +
> > +	/*
> > +	 * Default to TSX_CTRL_DISABLE. This is because on certain processors
> > +	 * TSX may be used as part of a speculative side channel attack.
> > +	 */
> > +	tsx_ctrl_state = TSX_CTRL_DISABLE;
> > +
> > +	tsx_disable();
> > +	/*
> > +	 * tsx_disable() will change the state of the
> > +	 * RTM CPUID bit.  Clear it here since it is now
> > +	 * expected to be not set.
> > +	 */
> > +	clear_cpu_cap(c, X86_FEATURE_RTM);
> > +	setup_clear_cpu_cap(X86_FEATURE_RTM);
> 
> You shouldn't be doing both clear_cpu_cap *and* setup_clear_cpu_cap().
> 
> Also, please fold patch 4 into this patch or at least make the contents
> of tsx_init() in this patch make some sense.

I will fold patch 4 into this.

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 09/11] TAAv5 9
  2019-10-05 21:49   ` [MODERATED] Re: [PATCH v5 09/11] TAAv5 9 Andy Lutomirski
@ 2019-10-07 18:35     ` Pawan Gupta
  0 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-07 18:35 UTC (permalink / raw)
  To: speck

> > +static void tsx_update_this_cpu(void *arg)
> > +{
> > +	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> > +	unsigned long enable = (unsigned long)arg;
> > +
> > +	if (enable) {
> > +		tsx_enable();
> > +		set_cpu_cap(c, X86_FEATURE_RTM);
> > +		setup_force_cpu_cap(X86_FEATURE_RTM);
> > +	} else {
> > +		tsx_disable();
> > +		clear_cpu_cap(c, X86_FEATURE_RTM);
> > +		setup_clear_cpu_cap(X86_FEATURE_RTM);
> > +	}
> > +}
> 
> You can't use these setup_... functions after boot like this.  Also, I'm
> not sure you can safely even set_cpu_cap(...) after boot.

The only reason they are there is because guests started after changing
the TSX state from sysfs don't get the correct X86_FEATURE_RTM (with the
qemu option "-cpu host"). Perhaps there needs to be a fix in qemu/KVM to
read the cpuid and export the correct X86_FEATURE_RTM bit.

> 
> > +
> > +static void tsx_update_on_each_cpu(bool val)
> > +{
> > +	get_online_cpus();
> > +	on_each_cpu(tsx_update_this_cpu, (void *)val, 1);
> > +	put_online_cpus();
> 
> I sure hope that on_each_cpu() is sensible enough to automatically do
> the get_online_cpus() thing.  Is it not?

I don't see cpu_hotplug_lock being taken by on_each_cpu(), which
get_online_cpus() does. This prevents CPUs from going down/up while we
are writing to the MSR.

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 05/11] TAAv5 5
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (16 preceding siblings ...)
       [not found] ` <5d983ad2.1c69fb81.e6640.8f51SMTPIN_ADDED_BROKEN@mx.google.com>
@ 2019-10-08  2:46 ` Josh Poimboeuf
  2019-10-09  1:45   ` Pawan Gupta
  2019-10-08  2:57 ` [MODERATED] Re: [PATCH v5 09/11] TAAv5 9 Josh Poimboeuf
  2019-10-09 13:12 ` [MODERATED] Re: ***UNCHECKED*** [PATCH v5 08/11] TAAv5 8 Michal Hocko
  19 siblings, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-08  2:46 UTC (permalink / raw)
  To: speck

On Fri, Oct 04, 2019 at 11:30:31PM -0700, speck for Pawan Gupta wrote:
> TSX Async Abort (TAA) is a side channel vulnerability to the internal
> buffers in some Intel processors similar to Microachitectural Data
> Sampling (MDS).  In this case certain loads may speculatively pass
> invalid data to dependent operations when an asynchronous abort
> condition is pending in a TSX transaction.  This includes loads with no
> fault or assist condition.  Such loads may speculatively expose stale
> data from the uarch data structures as in MDS.  Scope of exposure is
> within the same-thread and cross-thread.  This issue affects all current
> processors that support TSX, but do not have ARCH_CAP_TAA_NO (bit 8) set
> in MSR_IA32_ARCH_CAPABILITIES.

All the patches are missing "Subject:" lines, so when applied, they have
subjects like "TAAv5 1".

> -/* Update the static key controlling the MDS CPU buffer clear in idle */
> +/* Update the static key controlling the MDS and TAA CPU buffer clear in idle */
>  static void update_mds_branch_idle(void)
>  {
>  	/*
> @@ -775,8 +881,11 @@ static void update_mds_branch_idle(void)
>  	 * The other variants cannot be mitigated when SMT is enabled, so
>  	 * clearing the buffers on idle just to prevent the Store Buffer
>  	 * repartitioning leak would be a window dressing exercise.
> +	 *
> +	 * Apply idle buffer clearing to TAA affected CPUs also.
>  	 */
> -	if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY))
> +	if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY) &&
> +	    !boot_cpu_has_bug(X86_BUG_TAA))
>  		return;

I'm not sure this belongs here.  The existing check was for
X86_BUG_MSBDS_ONLY, which is the only MDS variant which can be mitigated
in idle with SMT enabled.  If the other MDS variants are present (which
is true for most CPUs) then this function doesn't enable the MDS idle
mitigation, because it wouldn't be sufficient.

So does this fully mitigate TAA on idle threads with SMT enabled?  If
not then this change is probably pointless.

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 09/11] TAAv5 9
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (17 preceding siblings ...)
  2019-10-08  2:46 ` [MODERATED] Re: [PATCH v5 05/11] TAAv5 5 Josh Poimboeuf
@ 2019-10-08  2:57 ` Josh Poimboeuf
  2019-10-08  6:10   ` Pawan Gupta
  2019-10-09 13:12 ` [MODERATED] Re: ***UNCHECKED*** [PATCH v5 08/11] TAAv5 8 Michal Hocko
  19 siblings, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-08  2:57 UTC (permalink / raw)
  To: speck

On Fri, Oct 04, 2019 at 11:34:31PM -0700, speck for Pawan Gupta wrote:
> Transactional Synchronization Extensions (TSX) is an extension to the
> x86 instruction set architecture (ISA) that adds Hardware Transactional
> Memory (HTM) support. Changing TSX state currently requires a reboot.
> This may not be desirable when rebooting imposes a huge penalty.

This is still missing a real world justification for the added
complexity.  Don't production users typically know at boot time whether
they plan to use TSX?

It looks like this patch should be combined with patch 10, since
otherwise this patch causes a regression in the sysfs mitigation
reporting, right?

And again... this patch needs to be last.  Even after the documentation.
If it triggers documentation changes then those changes should be
separated out and included in this patch.  Otherwise, this patch may
block the merging of the documentation patch.

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 09/11] TAAv5 9
  2019-10-06 17:06   ` [MODERATED] Re: [PATCH v5 09/11] TAAv5 9 Greg KH
@ 2019-10-08  6:01     ` Pawan Gupta
  2019-10-10 21:31       ` Pawan Gupta
  0 siblings, 1 reply; 75+ messages in thread
From: Pawan Gupta @ 2019-10-08  6:01 UTC (permalink / raw)
  To: speck

> > +static DEFINE_MUTEX(tsx_mutex);
> 
> I think I asked this before, but in looking at the code I still can't
> figure it out.  What exactly is this protecting?
> 
> It looks like you want to keep only one "writer" out of the sysfs store
> function at a time, but:
> 
> > +ssize_t hw_tx_mem_store(struct device *dev, struct device_attribute *attr,
> > +			const char *buf, size_t count)
> > +{
> > +	enum tsx_ctrl_states requested_state;
> > +	ssize_t ret;
> > +	bool val;
> > +
> > +	ret = kstrtobool(buf, &val);
> > +	if (ret)
> > +		return ret;
> > +
> > +	mutex_lock(&tsx_mutex);
> > +
> > +	if (val) {
> > +		tsx_user_cmd = TSX_USER_CMD_ON;
> > +		requested_state = TSX_CTRL_ENABLE;
> > +	} else {
> > +		tsx_user_cmd = TSX_USER_CMD_OFF;
> > +		requested_state = TSX_CTRL_DISABLE;
> > +	}
> > +
> > +	/* Current state is same as the requested state, do nothing */
> > +	if (tsx_ctrl_state == requested_state)
> > +		goto exit;
> > +
> > +	tsx_ctrl_state = requested_state;
> > +
> > +	tsx_update_on_each_cpu(val);
> > +exit:
> > +	mutex_unlock(&tsx_mutex);
> 
> What I think you want to do is just protect the tsx_update_on_each_cpu()
> function, right?

Also I believe the below two operations need to be under a lock. Without
the lock, if there are two writers and one is preempted in between these
operations, there is a possibility that tsx_ctrl_state and the TSX hardware
state could go out of sync.

	tsx_ctrl_state = requested_state;

	// 1st writer gets preempted here
	// 2nd writer flips tsx_ctrl_state and writes to the MSR.
	// 1st writer wakes up and only writes to the MSR

	tsx_update_on_each_cpu(val);
	// tsx_ctrl_state and hardware state would be different here.

Chances of this happening are rare, but it is still a possibility. The lock
would prevent such a condition.

> 
> So, you are locking _outside_ of a function call?  That's a sure way to
> madness over time.  If this function is so special that it can not be
> called multiple times at once, then put the lock _inside_ the function,
> right?
> 
> Otherwise you could have other places call that function, and this
> single lock is not going to protect anything :(

Future callers from other places should also take the lock and update
tsx_ctrl_state. Worth mentioning in a comment.
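
To make it concrete, a rough sketch of what I mean (tsx_set_state() is a
name made up for this mail, the other symbols are from the patch above),
with both the state update and the MSR writes kept under the one lock
inside a single helper, so future callers get the protection for free:

static void tsx_set_state(bool enable)
{
	enum tsx_ctrl_states requested_state =
		enable ? TSX_CTRL_ENABLE : TSX_CTRL_DISABLE;

	mutex_lock(&tsx_mutex);

	/* Nothing to do if the requested state is already in effect */
	if (tsx_ctrl_state != requested_state) {
		tsx_ctrl_state = requested_state;
		/* MSR writes on all CPUs happen with tsx_mutex still held */
		tsx_update_on_each_cpu(enable);
	}

	mutex_unlock(&tsx_mutex);
}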

> And why is tsx_user_cmd needed, and global?

This was added because cmdline_find_option() uses __initdata and can't
be called inside tsx_init() unless tsx_init() is an __init function.
tsx_init() is in the S3/S5 and CPU hotplug paths, so it can't be __init at
its current callsite. Since tsx=auto does not translate to any of the
tsx_ctrl_state values, I added tsx_user_cmd. Anyway, I will drop this.

I hope it is okay to move tsx_init() to identify_boot_cpu() to avoid
adding a new enum and writing to globals from a per-CPU function. This way
tsx_init() is only called by the boot CPU and secondary CPUs call
tsx_enable()/tsx_disable().

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 6b25039aa8ae..a4ce9e3a622c 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1576,6 +1575,8 @@ void __init identify_boot_cpu(void)
 #endif
 	cpu_detect_tlb(&boot_cpu_data);
 	setup_cr_pinning();
+
+	tsx_init();
 }
 
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index b1d6c96f6b88..96039b5cda7c 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -762,7 +762,10 @@ static void init_intel(struct cpuinfo_x86 *c)
 
 	init_intel_misc_features(c);
 
-	tsx_init(c);
+	if (tsx_ctrl_state == TSX_CTRL_ENABLE)
+		tsx_enable();
+	else if (tsx_ctrl_state == TSX_CTRL_DISABLE)
+		tsx_disable();
 }

------------------

tsx_init() can now be __init and use cmdline_find_option().

void __init tsx_init(void)
{
	char arg[20];
	int ret;

	if (!tsx_ctrl_is_supported())
		return;

	ret = cmdline_find_option(boot_command_line, "tsx", arg, sizeof(arg));
	if (ret >= 0) {
		if (!strcmp(arg, "on")) {
			tsx_ctrl_state = TSX_CTRL_ENABLE;
		} else if (!strcmp(arg, "off")) {
			tsx_ctrl_state = TSX_CTRL_DISABLE;
		} else if (!strcmp(arg, "auto")) {
			if (boot_cpu_has_bug(X86_BUG_TAA))
				tsx_ctrl_state = TSX_CTRL_DISABLE;
			else
				tsx_ctrl_state = TSX_CTRL_ENABLE;
		} else {
			tsx_ctrl_state = TSX_CTRL_DISABLE;
			pr_info("tsx: invalid option, defaulting to off\n");
		}
	} else {
		/* tsx= not provided, defaulting to off */
		tsx_ctrl_state = TSX_CTRL_DISABLE;
	}

	if (tsx_ctrl_state == TSX_CTRL_DISABLE) {
		tsx_disable();
		setup_clear_cpu_cap(X86_FEATURE_RTM);
	} else if (tsx_ctrl_state == TSX_CTRL_ENABLE) {
		tsx_enable();
		setup_force_cpu_cap(X86_FEATURE_RTM);
	}
}

I am hoping it is okay to use setup_clear_cpu_cap() in tsx_init().

Thanks,
Pawan

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 09/11] TAAv5 9
  2019-10-08  2:57 ` [MODERATED] Re: [PATCH v5 09/11] TAAv5 9 Josh Poimboeuf
@ 2019-10-08  6:10   ` Pawan Gupta
  2019-10-08 10:49     ` Jiri Kosina
  0 siblings, 1 reply; 75+ messages in thread
From: Pawan Gupta @ 2019-10-08  6:10 UTC (permalink / raw)
  To: speck

On Mon, Oct 07, 2019 at 09:57:07PM -0500, speck for Josh Poimboeuf wrote:
> On Fri, Oct 04, 2019 at 11:34:31PM -0700, speck for Pawan Gupta wrote:
> > Transactional Synchronization Extensions (TSX) is an extension to the
> > x86 instruction set architecture (ISA) that adds Hardware Transactional
> > Memory (HTM) support. Changing TSX state currently requires a reboot.
> > This may not be desirable when rebooting imposes a huge penalty.
> 
> This is still missing a real world justification for the added
> complexity.  Don't production users typically know at boot time whether
> they plan to use TSX?

I am not sure about this.

> It looks like this patch should be combined with patch 10, since
> otherwise this patch causes a regression in the sysfs mitigation
> reporting, right?
> 
> And again... this patch needs to be last.  Even after the documentation.
> If it triggers documentation changes then those changes should be
> separated out and included in this patch.  Otherwise, this patch may
> block the merging of the documentation patch.

Ok.

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 09/11] TAAv5 9
  2019-10-08  6:10   ` Pawan Gupta
@ 2019-10-08 10:49     ` Jiri Kosina
  0 siblings, 0 replies; 75+ messages in thread
From: Jiri Kosina @ 2019-10-08 10:49 UTC (permalink / raw)
  To: speck

On Mon, 7 Oct 2019, speck for Pawan Gupta wrote:

> > > Transactional Synchronization Extensions (TSX) is an extension to the
> > > x86 instruction set architecture (ISA) that adds Hardware Transactional
> > > Memory (HTM) support. Changing TSX state currently requires a reboot.
> > > This may not be desirable when rebooting imposes a huge penalty.
> > 
> > This is still missing a real world justification for the added
> > complexity.  Don't production users typically know at boot time whether
> > they plan to use TSX?
> 
> I am not sure about this.

We finally got clearance to talk to one of the few potential heavy users 
of TSX. We should know more about the use case and their expectations 
tomorrow. Either I or Michal Hocko can then send an update to speck@.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 05/11] TAAv5 5
  2019-10-08  2:46 ` [MODERATED] Re: [PATCH v5 05/11] TAAv5 5 Josh Poimboeuf
@ 2019-10-09  1:45   ` Pawan Gupta
  0 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-09  1:45 UTC (permalink / raw)
  To: speck

On Mon, Oct 07, 2019 at 09:46:42PM -0500, speck for Josh Poimboeuf wrote:
> > -/* Update the static key controlling the MDS CPU buffer clear in idle */
> > +/* Update the static key controlling the MDS and TAA CPU buffer clear in idle */
> >  static void update_mds_branch_idle(void)
> >  {
> >  	/*
> > @@ -775,8 +881,11 @@ static void update_mds_branch_idle(void)
> >  	 * The other variants cannot be mitigated when SMT is enabled, so
> >  	 * clearing the buffers on idle just to prevent the Store Buffer
> >  	 * repartitioning leak would be a window dressing exercise.
> > +	 *
> > +	 * Apply idle buffer clearing to TAA affected CPUs also.
> >  	 */
> > -	if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY))
> > +	if (!boot_cpu_has_bug(X86_BUG_MSBDS_ONLY) &&
> > +	    !boot_cpu_has_bug(X86_BUG_TAA))
> >  		return;
> 
> I'm not sure this belongs here.  The existing check was for
> X86_BUG_MSBDS_ONLY, which is the only MDS variant which can be mitigated
> in idle with SMT enabled.  If the other MDS variants are present (which
> is true for most CPUs) then this function doesn't enable the MDS idle
> mitigation, because it wouldn't be sufficient.
> 
> So does this fully mitigate TAA on idle threads with SMT enabled?  If
> not then this change is probably pointless.

It does provide protection against the "store buffer" leak, but the other
buffers (fill buffer and load port) are still vulnerable with SMT. Would you
prefer it removed?

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** [PATCH v5 08/11] TAAv5 8
  2019-10-05  6:17 [MODERATED] [PATCH v5 00/11] TAAv5 0 Pawan Gupta
                   ` (18 preceding siblings ...)
  2019-10-08  2:57 ` [MODERATED] Re: [PATCH v5 09/11] TAAv5 9 Josh Poimboeuf
@ 2019-10-09 13:12 ` Michal Hocko
  2019-10-14 19:41   ` Thomas Gleixner
  19 siblings, 1 reply; 75+ messages in thread
From: Michal Hocko @ 2019-10-09 13:12 UTC (permalink / raw)
  To: speck

On Fri 04-10-19 23:33:31, speck for Pawan Gupta wrote:
> Platforms which are not affected by X86_BUG_TAA may want the TSX feature
> enabled. Add "auto" option to the TSX cmdline parameter. When tsx=auto,
> disable TSX when X86_BUG_TAA is present, otherwise enable TSX.
> 
> More details on X86_BUG_TAA can be found here:
> https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html

This patch still keeps the default at tsx=off, unless I got lost in
the enable/disable logic, right?
The earlier argument for going this way was that there are no real
users of TSX in production. That is not really the case, though. At least
SAP HANA seems to benefit from TSX - publicly available information can
be found in [1][2][3]. We have talked to the SAP guys today and this is
still the case.

I do understand that there are likely not that many other production
users of the feature, but I believe that we should keep the default
update friendly. Sticking with the auto semantics by default is both
in line with other mitigations and also reduces the risk of regressions.

Thoughts?

[1] https://blogs.saphana.com/2015/05/05/new-intel-xeon-haswell-processor-delivers-exceptional-performance-sap-hana-platform-2/
[2] https://blogs.saphana.com/2015/06/29/impact-of-haswell-on-hana/
[3] https://www.intel.com/content/dam/www/public/us/en/documents/solution-briefs/sap-hana-real-time-analytics-solution-brief.pdf

> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 5 +++++
>  arch/x86/kernel/cpu/tsx.c                       | 9 +++++++++
>  2 files changed, 14 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 832537d59562..d7b48c38c6e5 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4821,6 +4821,11 @@
>  
>  			on	- Enable TSX on the system.
>  			off	- Disable TSX on the system.
> +			auto	- Disable TSX if X86_BUG_TAA is present,
> +				  otherwise enable TSX on the system.
> +
> +			More details on X86_BUG_TAA are here:
> +			Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
>  
>  			Not specifying this option is equivalent to tsx=off.
>  
> diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
> index 1c0cee7a7d46..73a0e5af3720 100644
> --- a/arch/x86/kernel/cpu/tsx.c
> +++ b/arch/x86/kernel/cpu/tsx.c
> @@ -23,6 +23,7 @@ static enum tsx_user_cmds {
>  	TSX_USER_CMD_NONE,
>  	TSX_USER_CMD_ON,
>  	TSX_USER_CMD_OFF,
> +	TSX_USER_CMD_AUTO,
>  } tsx_user_cmd = TSX_USER_CMD_NONE;
>  
>  static int __init tsx_cmdline(char *str)
> @@ -38,6 +39,8 @@ static int __init tsx_cmdline(char *str)
>  		tsx_user_cmd = TSX_USER_CMD_ON;
>  	else if (!strcmp(str, "off"))
>  		tsx_user_cmd = TSX_USER_CMD_OFF;
> +	else if (!strcmp(str, "auto"))
> +		tsx_user_cmd = TSX_USER_CMD_AUTO;
>  
>  	return 0;
>  }
> @@ -104,6 +107,12 @@ void tsx_init(struct cpuinfo_x86 *c)
>  	case TSX_USER_CMD_OFF:
>  		tsx_ctrl_state = TSX_CTRL_DISABLE;
>  		break;
> +	case TSX_USER_CMD_AUTO:
> +		if (boot_cpu_has_bug(X86_BUG_TAA))
> +			tsx_ctrl_state = TSX_CTRL_DISABLE;
> +		else
> +			tsx_ctrl_state = TSX_CTRL_ENABLE;
> +		break;
>  	case TSX_USER_CMD_NONE:
>  	default:
>  		/*
> -- 
> 2.20.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 09/11] TAAv5 9
  2019-10-08  6:01     ` Pawan Gupta
@ 2019-10-10 21:31       ` Pawan Gupta
  2019-10-11  8:45         ` Greg KH
  0 siblings, 1 reply; 75+ messages in thread
From: Pawan Gupta @ 2019-10-10 21:31 UTC (permalink / raw)
  To: speck

On Mon, Oct 07, 2019 at 11:01:56PM -0700, speck for Pawan Gupta wrote:
> > > +static DEFINE_MUTEX(tsx_mutex);
> > 
> > I think I asked this before, but in looking at the code I still can't
> > figure it out.  What exactly is this protecting?
> > 
> > It looks like you want to keep only one "writer" out of the sysfs store
> > function at a time, but:
> > 
> > > +ssize_t hw_tx_mem_store(struct device *dev, struct device_attribute *attr,
> > > +			const char *buf, size_t count)
> > > +{
> > > +	enum tsx_ctrl_states requested_state;
> > > +	ssize_t ret;
> > > +	bool val;
> > > +
> > > +	ret = kstrtobool(buf, &val);
> > > +	if (ret)
> > > +		return ret;
> > > +
> > > +	mutex_lock(&tsx_mutex);
> > > +
> > > +	if (val) {
> > > +		tsx_user_cmd = TSX_USER_CMD_ON;
> > > +		requested_state = TSX_CTRL_ENABLE;
> > > +	} else {
> > > +		tsx_user_cmd = TSX_USER_CMD_OFF;
> > > +		requested_state = TSX_CTRL_DISABLE;
> > > +	}
> > > +
> > > +	/* Current state is same as the requested state, do nothing */
> > > +	if (tsx_ctrl_state == requested_state)
> > > +		goto exit;
> > > +
> > > +	tsx_ctrl_state = requested_state;
> > > +
> > > +	tsx_update_on_each_cpu(val);
> > > +exit:
> > > +	mutex_unlock(&tsx_mutex);
> > 
> > What I think you want to do is just protect the tsx_update_on_each_cpu()
> > function, right?
> 
> Also I believe the below two operations need to be under a lock. Without
> the lock, if there are two writers and one is preempted in between these
> operations, there is a possibility that tsx_ctrl_state and the TSX hardware
> state could go out of sync.
> 
> 	tsx_ctrl_state = requested_state;
> 
> 	// 1st writer gets preempted here
> 	// 2nd writer flips tsx_ctrl_state and writes to the MSR.
> 	// 1st writer wakes up and only writes to the MSR
> 
> 	tsx_update_on_each_cpu(val);
> 	// tsx_ctrl_state and hardware state would be different here.
> 
> Chances of this happening are rare, but it is still a possibility. The lock
> would prevent such a condition.

Greg, isn't the above a possible scenario that requires the tsx_ctrl_state
update and the MSR write to be done under the same lock?

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 09/11] TAAv5 9
  2019-10-10 21:31       ` Pawan Gupta
@ 2019-10-11  8:45         ` Greg KH
  2019-10-21  8:00           ` Thomas Gleixner
  0 siblings, 1 reply; 75+ messages in thread
From: Greg KH @ 2019-10-11  8:45 UTC (permalink / raw)
  To: speck

On Thu, Oct 10, 2019 at 02:31:51PM -0700, speck for Pawan Gupta wrote:
> On Mon, Oct 07, 2019 at 11:01:56PM -0700, speck for Pawan Gupta wrote:
> > > > +static DEFINE_MUTEX(tsx_mutex);
> > > 
> > > I think I asked this before, but in looking at the code I still can't
> > > figure it out.  What exactly is this protecting?
> > > 
> > > It looks like you want to keep only one "writer" out of the sysfs store
> > > function at a time, but:
> > > 
> > > > +ssize_t hw_tx_mem_store(struct device *dev, struct device_attribute *attr,
> > > > +			const char *buf, size_t count)
> > > > +{
> > > > +	enum tsx_ctrl_states requested_state;
> > > > +	ssize_t ret;
> > > > +	bool val;
> > > > +
> > > > +	ret = kstrtobool(buf, &val);
> > > > +	if (ret)
> > > > +		return ret;
> > > > +
> > > > +	mutex_lock(&tsx_mutex);
> > > > +
> > > > +	if (val) {
> > > > +		tsx_user_cmd = TSX_USER_CMD_ON;
> > > > +		requested_state = TSX_CTRL_ENABLE;
> > > > +	} else {
> > > > +		tsx_user_cmd = TSX_USER_CMD_OFF;
> > > > +		requested_state = TSX_CTRL_DISABLE;
> > > > +	}
> > > > +
> > > > +	/* Current state is same as the requested state, do nothing */
> > > > +	if (tsx_ctrl_state == requested_state)
> > > > +		goto exit;
> > > > +
> > > > +	tsx_ctrl_state = requested_state;
> > > > +
> > > > +	tsx_update_on_each_cpu(val);
> > > > +exit:
> > > > +	mutex_unlock(&tsx_mutex);
> > > 
> > > What I think you want to do is just protect the tsx_update_on_each_cpu()
> > > function, right?
> > 
> > Also I believe the below two operations need to be under a lock. Without
> > the lock, if there are two writers and one is preempted in between these
> > operations, there is a possibility that tsx_ctrl_state and the TSX hardware
> > state could go out of sync.
> > 
> > 	tsx_ctrl_state = requested_state;
> > 
> > 	// 1st writer gets preempted here
> > 	// 2nd writer flips tsx_ctrl_state and writes to the MSR.
> > 	// 1st writer wakes up and only writes to the MSR
> > 
> > 	tsx_update_on_each_cpu(val);
> > 	// tsx_ctrl_state and hardware state would be different here.
> > 
> > Chances of this happening are rare, but it is still a possibility. The lock
> > would prevent such a condition.
> 
> Greg, isn't the above a possible scenario that requires the tsx_ctrl_state
> update and the MSR write to be done under the same lock?

I'm sorry, but I do not understand what you are trying to state here.

Look at the code, and the lock, and determine what it really is doing
and document that please.  Right now I can't determine it and it looks
pointless to me, so I must be missing something.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 08/11] TAAv5 8
  2019-10-09 13:12 ` [MODERATED] Re: ***UNCHECKED*** [PATCH v5 08/11] TAAv5 8 Michal Hocko
@ 2019-10-14 19:41   ` Thomas Gleixner
  2019-10-14 19:51     ` [MODERATED] " Jiri Kosina
  0 siblings, 1 reply; 75+ messages in thread
From: Thomas Gleixner @ 2019-10-14 19:41 UTC (permalink / raw)
  To: speck

Michal,

On Wed, 9 Oct 2019, speck for Michal Hocko wrote:

> On Fri 04-10-19 23:33:31, speck for Pawan Gupta wrote:
> > Platforms which are not affected by X86_BUG_TAA may want the TSX feature
> > enabled. Add "auto" option to the TSX cmdline parameter. When tsx=auto,
> > disable TSX when X86_BUG_TAA is present, otherwise enable TSX.
> > 
> > More details on X86_BUG_TAA can be found here:
> > https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html
> 
> This patch still keeps the default at tsx=off, unless I got lost in
> the enable/disable logic, right?
> The earlier argument for going this way was that there are no real
> users of TSX in production. That is not really the case, though. At least
> SAP HANA seems to benefit from TSX - publicly available information can
> be found in [1][2][3]. We have talked to the SAP guys today and this is
> still the case.
> 
> I do understand that there are likely not that many other production
> users of the feature, but I believe that we should keep the default
> update friendly. Sticking with the auto semantics by default is both
> in line with other mitigations and also reduces the risk of regressions.

While I definitely agree in general with the 'no regression' rule, making
TSX disabled by default would be a really good move.

Outside of the SAP/HANA use case the only known useful effect of TSX is to
hide and accelerate attacks of all sorts.

The real question is whether machines on which SAP/HANA runs are just
random boxes/VMs which would be affected by the 'update and reboot world'
treatment or whether SAP/HANA runs on dedicated systems which are carefully
administrated by sysadmins with brains turned on.

If it's the random box affected by 'update and reboot world' then disabling
TSX by default is obviously going to be a source of regressions and we have
to bite the bullet.

If not, then the sysadmins will anyway do some very careful evaluation and
checking, read documentation and add 'WE_REALLY_WANT_TSX' on the kernel
command line. In that case I rather disgruntle a few sysadmins and gain a
useful setup for everyone else.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-14 19:41   ` Thomas Gleixner
@ 2019-10-14 19:51     ` Jiri Kosina
  2019-10-14 21:04       ` [MODERATED] " Borislav Petkov
  2019-10-14 21:05       ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
  0 siblings, 2 replies; 75+ messages in thread
From: Jiri Kosina @ 2019-10-14 19:51 UTC (permalink / raw)
  To: speck

On Mon, 14 Oct 2019, speck for Thomas Gleixner wrote:

> The real question is whether machines on which SAP/HANA runs are just 
> random boxes/VMs which would be affected by the 'update and reboot 
> world' treatment or whether SAP/HANA runs on dedicated systems which are 
> carefully administrated by sysadmins with brains turned on.
> 
> If it's the random box affected by 'update and reboot world' then disabling
> TSX by default is obviously going to be a source of regressions and we have
> to bite the bullet.
> 
> If not, then the sysadmins will anyway do some very careful evaluation 
> and checking, read documentation and add 'WE_REALLY_WANT_TSX' on the 
> kernel command line. In that case I rather disgruntle a few sysadmins 
> and gain a useful setup for everyone else.

[ if this archive is ever going to be published, please remove this mail 
  from it, thanks ]

So we talked to SAP about exactly this, and they expressed a real worry 
along the lines of (just my paraphrase, not really verbatim quote of 
anybody):

	"of course we will put it into a SAP note (*), but realistically, 
	not everybody (by far) is reading those. If this regressess on a 
	handful of platforms, we (together with the OS vendor) could 
	probably handle the followup reports, but if it happens all across 
	the board, it'd surely be overwhelming, to put it mildly. It'll"
	basically promptly escalate to a support disaster."

(*) a kind of release/technical note, accompanying documentation for an
    update/release, where they can indicate the need for a kernel command
    line parameter to be added

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: Re: [PATCH v5 08/11] TAAv5 8
  2019-10-14 19:51     ` [MODERATED] " Jiri Kosina
@ 2019-10-14 21:04       ` Borislav Petkov
  2019-10-14 21:31         ` Jiri Kosina
  2019-10-14 21:05       ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
  1 sibling, 1 reply; 75+ messages in thread
From: Borislav Petkov @ 2019-10-14 21:04 UTC (permalink / raw)
  To: speck

On Mon, Oct 14, 2019 at 09:51:42PM +0200, speck for Jiri Kosina wrote:
> 	not everybody (by far) is reading those. If this regresses on a 
> 	handful of platforms, we (together with the OS vendor) could 
> 	probably handle the followup reports, but if it happens all across 
> 	the board, it'd surely be overwhelming, to put it mildly. It'll 
> 	basically promptly escalate to a support disaster."

Ok, how about we make it a config option: CONFIG_TSX_DEFAULT_ON

and we - distro kernels - enable it. Mainline leaves it off.

Problem solved.

Hmmm?
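
Sketch only (nothing like this is in the posted series; the option and
helper names here are made up), roughly how tsx_init() could pick the
default when tsx= is not given on the command line:

static enum tsx_ctrl_states tsx_default_state(void)
{
	/* Distro kernels would set CONFIG_TSX_DEFAULT_ON=y, mainline keeps n */
	if (IS_ENABLED(CONFIG_TSX_DEFAULT_ON))
		return TSX_CTRL_ENABLE;

	return TSX_CTRL_DISABLE;
}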

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 247165, AG München
-- 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-14 19:51     ` [MODERATED] " Jiri Kosina
  2019-10-14 21:04       ` [MODERATED] " Borislav Petkov
@ 2019-10-14 21:05       ` Michal Hocko
  1 sibling, 0 replies; 75+ messages in thread
From: Michal Hocko @ 2019-10-14 21:05 UTC (permalink / raw)
  To: speck

On Mon 14-10-19 21:51:42, speck for Jiri Kosina wrote:
> On Mon, 14 Oct 2019, speck for Thomas Gleixner wrote:
> 
> > The real question is whether machines on which SAP/HANA runs are just 
> > random boxes/VMs which would be affected by the 'update and reboot 
> > world' treatment or whether SAP/HANA runs on dedicated systems which are 
> > carefully administrated by sysadmins with brains turned on.
> > 
> > If it's the random box affected by 'update and reboot world' then disabling
> > TSX by default is obviously going to be a source of regressions and we have
> > to bite the bullet.
> > 
> > If not, then the sysadmins will anyway do some very careful evaluation 
> > and checking, read documentation and add 'WE_REALLY_WANT_TSX' on the 
> > kernel command line. In that case I rather disgruntle a few sysadmins 
> > and gain a useful setup for everyone else.
> 
> [ if this archive is ever going to be published, please remove this mail 
>   from it, thanks ]
> 
> So we talked to SAP about exactly this, and they expressed a real worry 
> along the lines of (just my paraphrase, not really verbatim quote of 
> anybody):
> 
> 	"of course we will put it into a SAP note (*), but realistically, 
> 	not everybody (by far) is reading those. If this regresses on a 
> 	handful of platforms, we (together with the OS vendor) could 
> 	probably handle the followup reports, but if it happens all across 
> 	the board, it'd surely be overwhelming, to put it mildly. It'll 
> 	basically promptly escalate to a support disaster."

And I would just add my personal experience: we have been through many
bug reports where it simply turned out that a configuration long
deprecated (by a SAP note) was still in place, because that is what was
done at the time of deployment and never changed since.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: Re: [PATCH v5 08/11] TAAv5 8
  2019-10-14 21:04       ` [MODERATED] " Borislav Petkov
@ 2019-10-14 21:31         ` Jiri Kosina
  2019-10-15  8:01           ` Thomas Gleixner
  0 siblings, 1 reply; 75+ messages in thread
From: Jiri Kosina @ 2019-10-14 21:31 UTC (permalink / raw)
  To: speck

On Mon, 14 Oct 2019, speck for Borislav Petkov wrote:

> > 	not everybody (by far) is reading those. If this regresses on a
> > 	handful of platforms, we (together with the OS vendor) could
> > 	probably handle the followup reports, but if it happens all across
> > 	the board, it'd surely be overwhelming, to put it mildly. It'll
> > 	basically promptly escalate to a support disaster."
> 
> Ok, how about we make it a config option: CONFIG_TSX_DEFAULT_ON
>
> and we - distro kernels - enable it. Mainline leaves it off.
> 
> Problem solved.
> 
> Hmmm?

If no one would strongly object to that config option, I'll happily send a 
followup patch on top of Pawan's series adding it (defaulting to 
'n', as that appears to be the common consensus for upstream, for totally 
understandable reasons).

Otherwise, some distros might just have to flip the default in the sources 
with a non-upstream patch, I am afraid (even though we haven't made it to 
the point where we are making that decision yet).

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 08/11] TAAv5 8
  2019-10-14 21:31         ` Jiri Kosina
@ 2019-10-15  8:01           ` Thomas Gleixner
  2019-10-15 10:34             ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
  0 siblings, 1 reply; 75+ messages in thread
From: Thomas Gleixner @ 2019-10-15  8:01 UTC (permalink / raw)
  To: speck

On Mon, 14 Oct 2019, speck for Jiri Kosina wrote:

> On Mon, 14 Oct 2019, speck for Borislav Petkov wrote:
> 
> > > 	not everybody (by far) is reading those. If this regresses on a
> > > 	handful of platforms, we (together with the OS vendor) could
> > > 	probably handle the followup reports, but if it happens all across
> > > 	the board, it'd surely be overwhelming, to put it mildly. It'll
> > > 	basically promptly escalate to a support disaster."

I feared that :)

> > Ok, how about we make it a config option: CONFIG_TSX_DEFAULT_ON
> >
> > and we - distro kernels - enable it. Mainline leaves it off.
> > 
> > Problem solved.
> > 
> > Hmmm?
> 
> If no one would strongly object to that config option, I'll happily send a 
> followup patch on top of Pawan's series adding it (defaulting to 
> 'n', as that appears to be the common consensus for upstream, for totally 
> understandable reasons).
> 
> Otherwise, some distros might just have to flip the default in the sources 
> with a non-upstream patch, I am afraid (even though we haven't made it to 
> the point where we are making that decision yet).

The config option is fine.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15  8:01           ` Thomas Gleixner
@ 2019-10-15 10:34             ` Michal Hocko
  2019-10-15 13:06               ` Josh Poimboeuf
                                 ` (2 more replies)
  0 siblings, 3 replies; 75+ messages in thread
From: Michal Hocko @ 2019-10-15 10:34 UTC (permalink / raw)
  To: speck

On Tue 15-10-19 10:01:40, speck for Thomas Gleixner wrote:
> On Mon, 14 Oct 2019, speck for Jiri Kosina wrote:
> 
> > On Mon, 14 Oct 2019, speck for Borislav Petkov wrote:
> > 
> > > > 	not everybody (by far) is reading those. If this regresses on a
> > > > 	handful of platforms, we (together with the OS vendor) could
> > > > 	probably handle the followup reports, but if it happens all across
> > > > 	the board, it'd surely be overwhelming, to put it mildly. It'll
> > > > 	basically promptly escalate to a support disaster."
> 
> I feared that :)
> 
> > > Ok, how about we make it a config option: CONFIG_TSX_DEFAULT_ON
> > >
> > > and we - distro kernels - enable it. Mainline leaves it off.
> > > 
> > > Problem solved.
> > > 
> > > Hmmm?
> > 
> > If no one would strongly object to that config option, I'll happily send a 
> > followup patch on top of Pawan's series adding it (defaulting to 
> > 'n', as that appears to be the common consensus for upstream, for totally 
> > understandable reasons).
> > 
> > Otherwise, some distros might just have to flip the default in the sources 
> > with a non-upstream patch, I am afraid (even though we haven't made it to 
> > the point where we are making that decision yet).
> 
> The config option is fine.

OK, so what about this patch on top of Pawan's series? I have to say I am
not really entirely happy about yet another config option. In principle
this is not much different from HT, where we decided to stay enabled even
though it is vulnerable to side channels. But I do understand that many
more people will notice HT off than TSX off.

Anyway here is the patch
---
From 9666e91b63cd6213d362d04289e1bcbbe2050bc3 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Tue, 15 Oct 2019 11:21:01 +0200
Subject: [PATCH] x86, tsx: allow to set tsx=auto by a config option

There is a general consensus that TSX usage is not widespread, while
history shows there is a non-trivial space for possible side channel
attacks. Therefore TSX is disabled by default even on platforms that
might have a safe implementation of TSX according to current knowledge.
This is a fair trade-off to make.

There are, however, workloads that really do benefit from using TSX, and
updating to a newer kernel with TSX disabled might introduce noticeable
regressions. This would especially be a problem for Linux distributions
which will provide TAA mitigations.

Introduce the X86_INTEL_ENABLE_SAFE_TSX config option to override the
default tsx=off semantics and make tsx=auto the default, which is more
update friendly.

Suggested-by: Borislav Petkov <bpetkov@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/x86/Kconfig          | 22 ++++++++++++++++++++++
 arch/x86/kernel/cpu/tsx.c | 20 ++++++++++++++------
 2 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d6e1faa28c58..9823e34b81ce 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1940,6 +1940,28 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS
 
 	  If unsure, say y.
 
+config X86_INTEL_ENABLE_SAFE_TSX
+	prompt ""
+	def_bool n
+	depends on CPU_SUP_INTEL
+	---help---
+	  Intel's TSX (Transactional Synchronization Extensions) feature
+	  allows to optimize locking protocols through lock elision which
+	  can lead to a noticeable performance boost.
+
+	  On the other hand it has been shown that TSX can be exploited
+	  to form side channel attacks (e.g. TAA) and chances are there
+	  will be more of those attacks discovered in the future.
+
+	  Therefore the TSX is not enabled by default. An admin might override
+	  this decision by tsx=on command line parameter. This has a risk that
+	  TSX will get enabled also on platforms which are known to be vulnerable
+	  to attacks like TAA and a safer option is to use tsx=auto command line
+	  parameter. Enabling this config option will make tsx=auto the default.
+	  See Documentation/admin-guide/kernel-parameters.txt for more details.
+
+	  If you really benefit from TSX then enable this option, otherwise say n.
+
 config EFI
 	bool "EFI runtime service support"
 	depends on ACPI
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index 96320449abb7..d3dc1ce5cd4b 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -69,6 +69,14 @@ static bool __init tsx_ctrl_is_supported(void)
 	return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
 }
 
+static enum tsx_ctrl_states x86_safe_tsx_mode(void)
+{
+	if (boot_cpu_has_bug(X86_BUG_TAA))
+		return TSX_CTRL_DISABLE;
+
+	return TSX_CTRL_ENABLE;
+}
+
 void __init tsx_init(void)
 {
 	char arg[20];
@@ -84,17 +92,17 @@ void __init tsx_init(void)
 		} else if (!strcmp(arg, "off")) {
 			tsx_ctrl_state = TSX_CTRL_DISABLE;
 		} else if (!strcmp(arg, "auto")) {
-			if (boot_cpu_has_bug(X86_BUG_TAA))
-				tsx_ctrl_state = TSX_CTRL_DISABLE;
-			else
-				tsx_ctrl_state = TSX_CTRL_ENABLE;
+			tsx_ctrl_state = x86_safe_tsx_mode();
 		} else {
 			tsx_ctrl_state = TSX_CTRL_DISABLE;
 			pr_info("tsx: invalid option, defaulting to off\n");
 		}
 	} else {
-		/* tsx= not provided, defaulting to off */
-		tsx_ctrl_state = TSX_CTRL_DISABLE;
+		/* tsx= not provided */
+		if (IS_ENABLED(CONFIG_X86_INTEL_ENABLE_SAFE_TSX))
+			tsx_ctrl_state = x86_safe_tsx_mode();
+		else
+			tsx_ctrl_state = TSX_CTRL_DISABLE;
 	}
 
 	if (tsx_ctrl_state == TSX_CTRL_DISABLE) {
-- 
2.20.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 10:34             ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
@ 2019-10-15 13:06               ` Josh Poimboeuf
  2019-10-15 13:10                 ` Jiri Kosina
  2019-10-15 17:47               ` Borislav Petkov
  2019-10-16  7:26               ` [MODERATED] Re: ***UNCHECKED*** " Jiri Kosina
  2 siblings, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-15 13:06 UTC (permalink / raw)
  To: speck

On Tue, Oct 15, 2019 at 12:34:54PM +0200, speck for Michal Hocko wrote:
> OK, so what about this patch on top of Pawan's series? I have to say I am
> not really entirely happy about yet another config option. In principle
> this is not much different from HT, where we decided to stay enabled even
> though it is vulnerable to side channels. But I do understand that many
> more people will notice HT off than TSX off.
> 
> Anyway here is the patch
> ---
> From 9666e91b63cd6213d362d04289e1bcbbe2050bc3 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Tue, 15 Oct 2019 11:21:01 +0200
> Subject: [PATCH] x86, tsx: allow to set tsx=auto by a config option
> 
> There is a general consensus that TSX usage is not widespread, while
> history shows there is a non-trivial space for possible side channel
> attacks. Therefore TSX is disabled by default even on platforms that
> might have a safe implementation of TSX according to current knowledge.
> This is a fair trade-off to make.
> 
> There are, however, workloads that really do benefit from using TSX, and
> updating to a newer kernel with TSX disabled might introduce noticeable
> regressions. This would especially be a problem for Linux distributions
> which will provide TAA mitigations.
> 
> Introduce the X86_INTEL_ENABLE_SAFE_TSX config option to override the
> default tsx=off semantics and make tsx=auto the default, which is more
> update friendly.
> 
> Suggested-by: Borislav Petkov <bpetkov@suse.de>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Since all (or most?) modern Intel CPUs are vulnerable to TAA, defaulting
to tsx=auto would effectively be the same as defaulting to tsx=off,
right?  How does this help with regressions?

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 13:06               ` Josh Poimboeuf
@ 2019-10-15 13:10                 ` Jiri Kosina
  2019-10-15 15:26                   ` Josh Poimboeuf
  0 siblings, 1 reply; 75+ messages in thread
From: Jiri Kosina @ 2019-10-15 13:10 UTC (permalink / raw)
  To: speck

On Tue, 15 Oct 2019, speck for Josh Poimboeuf wrote:

> Since all (or most?) modern Intel CPUs are vulnerable to TAA, defaulting 
> to tsx=auto would effectively be the same as defaulting to tsx=off, 
> right?  How does this help with regressions?

The mitigation is only needed on CPUs where verw doesn't have the buffer 
clearing semantics.

-- 
Jiri Kosina
Director, SUSE Labs Core

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 13:10                 ` Jiri Kosina
@ 2019-10-15 15:26                   ` Josh Poimboeuf
  2019-10-15 15:32                     ` Jiri Kosina
  0 siblings, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-15 15:26 UTC (permalink / raw)
  To: speck

On Tue, Oct 15, 2019 at 03:10:33PM +0200, speck for Jiri Kosina wrote:
> On Tue, 15 Oct 2019, speck for Josh Poimboeuf wrote:
> 
> > Since all (or most?) modern Intel CPUs are vulnerable to TAA, defaulting 
> > to tsx=auto would effectively be the same as defaulting to tsx=off, 
> > right?  How does this help with regressions?
> 
> The mitigation is only needed on CPUs where verw doesn't have the buffer 
> clearing semantics.

Can you elaborate?  I have no idea what you're trying to say and how it
relates to my question :-)

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 15:26                   ` Josh Poimboeuf
@ 2019-10-15 15:32                     ` Jiri Kosina
  2019-10-15 19:34                       ` Tyler Hicks
  2019-10-15 20:00                       ` Josh Poimboeuf
  0 siblings, 2 replies; 75+ messages in thread
From: Jiri Kosina @ 2019-10-15 15:32 UTC (permalink / raw)
  To: speck

On Tue, 15 Oct 2019, speck for Josh Poimboeuf wrote:

> > > Since all (or most?) modern Intel CPUs are vulnerable to TAA, 
> > > defaulting to tsx=auto would effectively be the same as defaulting 
> > > to tsx=off, right?  How does this help with regressions?
> > 
> > The mitigation is only needed on CPUs where verw doesn't have the buffer 
> > clearing semantics.
> 
> Can you elaborate?  I have no idea what you're trying to say and how it
> relates to my question :-)

Only those CPUs with TSX *and* with MDS_NO need TSX disabled in order to 
protect from this issue.

The CPUs that don't enumerate MDS_NO (and therefore got a ucode update with 
verw buffer-clearing semantics) are fully mitigated against TAA by MDS 
mitigations already.

Therefore the set of CPUs where we *really* need to turn off TSX in order 
to protect from TAA is currently rather minimal (CascadeLake-B, 
WhiskeyLake-V, CommitLake, CoffeeLake-R), so force-disabling on all CPUs 
covers a way bigger set of platforms than actually needed.
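
In code terms, the set I am talking about is roughly this (sketch only,
the helper name is made up for this mail; the feature/bug bits are the
ones used elsewhere in the series):

static bool tsx_off_needed_for_taa(void)
{
	u64 ia32_cap = 0;

	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);

	/*
	 * Only TAA-affected parts that also advertise MDS_NO (i.e. no VERW
	 * buffer-clearing fallback) really need TSX turned off.
	 */
	return boot_cpu_has_bug(X86_BUG_TAA) && (ia32_cap & ARCH_CAP_MDS_NO);
}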

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 10:34             ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
  2019-10-15 13:06               ` Josh Poimboeuf
@ 2019-10-15 17:47               ` Borislav Petkov
  2019-10-16  7:26               ` [MODERATED] Re: ***UNCHECKED*** " Jiri Kosina
  2 siblings, 0 replies; 75+ messages in thread
From: Borislav Petkov @ 2019-10-15 17:47 UTC (permalink / raw)
  To: speck

On Tue, Oct 15, 2019 at 12:34:54PM +0200, speck for Michal Hocko wrote:
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index d6e1faa28c58..9823e34b81ce 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1940,6 +1940,28 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS
>  
>  	  If unsure, say y.
>  
> +config X86_INTEL_ENABLE_SAFE_TSX
> +	prompt ""

Needs a prompt sentence, otherwise it looks like this in menuconfig:

  │ │                                 [ ] Intel Memory Protection Keys                                                                       │ │
  │ │                                 [ ]  (NEW)                                                                                             │ │
  │ │                                 [*] EFI runtime service support

> +	def_bool n
> +	depends on CPU_SUP_INTEL
> +	---help---
> +	  Intel's TSX (Transactional Synchronization Extensions) feature
> +	  allows to optimize locking protocols through lock elision which
> +	  can lead to a noticeable performance boost.
> +
> +	  On the other hand it has been shown that TSX can be exploited
> +	  to form side channel attacks (e.g. TAA) and chances are there
> +	  will be more of those attacks discovered in the future.
> +
> +	  Therefore the TSX is not enabled by default. An admin might override

s/the //

> +	  this decision by tsx=on command line parameter. This has a risk that
> +	  TSX will get enabled also on platforms which are known to be vulnerable
> +	  to attacks like TAA and a safer option is to use tsx=auto command line
> +	  parameter. Enabling this config option will make tsx=auto the default.
> +	  See Documentation/admin-guide/kernel-parameters.txt for more details.
> +
> +	  If you really benefit from TSX then enable this option, otherwise say n.
					^
					and you know what you're doing,

> +
>  config EFI
>  	bool "EFI runtime service support"
>  	depends on ACPI
> diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
> index 96320449abb7..d3dc1ce5cd4b 100644
> --- a/arch/x86/kernel/cpu/tsx.c
> +++ b/arch/x86/kernel/cpu/tsx.c
> @@ -69,6 +69,14 @@ static bool __init tsx_ctrl_is_supported(void)
>  	return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
>  }
>  
> +static enum tsx_ctrl_states x86_safe_tsx_mode(void)

x86_get_tsx_mode()

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg
-- 

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 15:32                     ` Jiri Kosina
@ 2019-10-15 19:34                       ` Tyler Hicks
  2019-10-15 20:00                       ` Josh Poimboeuf
  1 sibling, 0 replies; 75+ messages in thread
From: Tyler Hicks @ 2019-10-15 19:34 UTC (permalink / raw)
  To: speck

On 2019-10-15 17:32:08, speck for Jiri Kosina wrote:
> On Tue, 15 Oct 2019, speck for Josh Poimboeuf wrote:
> 
> > > > Since all (or most?) modern Intel CPUs are vulnerable to TAA, 
> > > > defaulting to tsx=auto would effectively be the same as defaulting 
> > > > to tsx=off, right?  How does this help with regressions?
> > > 
> > > The mitigation is only needed on CPUs where verw doesn't have the buffer 
> > > clearing semantics.
> > 
> > Can you elaborate?  I have no idea what you're trying to say and how it
> > relates to my question :-)
> 
> Only those CPUs with TSX *and* with MDS_NO need TSX disabled in order to 
> protect from this issue.
> 
> The CPUs that don't enumerate MDS_NO (and therefore got a ucode update with 
> verw buffer-clearing semantics) are fully mitigated against TAA by MDS 
> mitigations already.

I don't think "fully mitigated" is true in this case. My understanding
is that they're still vulnerable to cross-thread attacks when SMT is on.

Tyler

> 
> Therefore the set of CPUs where we *really* need to turn off TSX in order 
> to protect from TAA is currently rather minimal (CascadeLake-B, 
> WhiskeyLake-V, CommitLake, CoffeeLake-R), so force-disabling on all CPUs 
> covers a way bigger set of platforms than actually needed.
> 
> -- 
> Jiri Kosina
> SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 15:32                     ` Jiri Kosina
  2019-10-15 19:34                       ` Tyler Hicks
@ 2019-10-15 20:00                       ` Josh Poimboeuf
  2019-10-15 20:15                         ` Jiri Kosina
  1 sibling, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-15 20:00 UTC (permalink / raw)
  To: speck

On Tue, Oct 15, 2019 at 05:32:08PM +0200, speck for Jiri Kosina wrote:
> On Tue, 15 Oct 2019, speck for Josh Poimboeuf wrote:
> 
> > > > Since all (or most?) modern Intel CPUs are vulnerable to TAA, 
> > > > defaulting to tsx=auto would effectively be the same as defaulting 
> > > > to tsx=off, right?  How does this help with regressions?
> > > 
> > > The mitigation is only needed on CPUs where verw doesn't have the buffer 
> > > clearing semantics.
> > 
> > Can you elaborate?  I have no idea what you're trying to say and how it
> > relates to my question :-)
> 
> Only those CPUs with TSX *and* with MDS_NO need TSX disabled in order to 
> protect from this issue.
> 
> The CPUs that don't enumerate MDS_NO (and therefore got a ucode update with 
> verw buffer-clearing semantics) are fully mitigated against TAA by MDS 
> mitigations already.
> 
> Therefore the set of CPUs where we *really* need to turn off TSX in order 
> to protect from TAA is currently rather minimal (CascadeLake-B, 
> WhiskeyLake-V, CommitLake, CoffeeLake-R), so force-disabling on all CPUs 
> covers a way bigger set of platforms than actually needed.

Maybe I'm missing something.  Isn't there going to be a ucode update for
MDS_NO parts, which does the verw buffer clearing?  In that case there's
no need to disable TSX, and instead the verw mitigation could be used,
if desired.

AFAICT, the patch allows to set the default to tsx=auto, which disables
TSX on *all* vulnerable parts, not just the MDS_NO ones.  I don't see
how that would prevent user regressions.

It sounds like maybe you're suggesting something else, that TSX should
only be disabled on vulnerable MDS_NO parts?

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 20:00                       ` Josh Poimboeuf
@ 2019-10-15 20:15                         ` Jiri Kosina
  2019-10-15 20:35                           ` Jiri Kosina
  0 siblings, 1 reply; 75+ messages in thread
From: Jiri Kosina @ 2019-10-15 20:15 UTC (permalink / raw)
  To: speck

On Tue, 15 Oct 2019, speck for Josh Poimboeuf wrote:

> Maybe I'm missing something.  Isn't there going to be a ucode update for
> MDS_NO parts, which does the verw buffer clearing?  In that case there's
> no need to disable TSX, and instead the verw mitigation could be used,
> if desired.

My understanding was that MDS_NO CPUs will only get ucode update that 
exposes TSX control MSR, and nothing else.

> AFAICT, the patch allows to set the default to tsx=auto, which disables 
> TSX on *all* vulnerable parts, not just the MDS_NO ones.  I don't see 
> how that would prevent user regressions.
> 
> It sounds like maybe you're suggesting something else, that TSX should
> only be disabled on vulnerable MDS_NO parts?

OK, let me take a look at the code again. I definitely thought that's what 
'auto' indeed does.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 20:15                         ` Jiri Kosina
@ 2019-10-15 20:35                           ` Jiri Kosina
  2019-10-15 20:54                             ` Josh Poimboeuf
  2019-10-15 20:56                             ` [MODERATED] " Pawan Gupta
  0 siblings, 2 replies; 75+ messages in thread
From: Jiri Kosina @ 2019-10-15 20:35 UTC (permalink / raw)
  To: speck

On Tue, 15 Oct 2019, speck for Jiri Kosina wrote:

> > Maybe I'm missing something.  Isn't there going to be a ucode update for
> > MDS_NO parts, which does the verw buffer clearing?  In that case there's
> > no need to disable TSX, and instead the verw mitigation could be used,
> > if desired.
> 
> My understanding was that MDS_NO CPUs will only get ucode update that 
> exposes TSX control MSR, and nothing else.
> 
> > AFAICT, the patch allows to set the default to tsx=auto, which disables 
> > TSX on *all* vulnerable parts, not just the MDS_NO ones.  I don't see 
> > how that would prevent user regressions.
> > 
> > It sounds like maybe you're suggesting something else, that TSX should
> > only be disabled on vulnerable MDS_NO parts?
> 
> OK, let me take a look at the code again. I definitely thought that's what 
> 'auto' indeed does.

OK, so you are right and I misunderstood the logic in the code, sorry. 

Then the only purpose of 'auto' really is getting TSX enabled on future 
CPUs which would eventually have ARCH_CAP_TAA_NO=1; so pretty useless for 
preventing regressions.

So yeah, I agree, 'auto' is actually useless to prevent regressions, and I 
believe we want some other 'auto' (*), which would actually disable TSX 
only if (X86_BUG_TAA && !MD_CLEAR), agreed?

(*) I'd actually prefer to convert the current 'auto' to this new 
    semantics; it'll keep TSX enabled on future CPUs without X86_BUG_TAA,
    and it'll prevent regressions in unnecessary cases.
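
Expressed against the helpers in Michal's patch, the proposed semantics
would roughly be (sketch only, this is *not* what the posted series does,
and the function name is made up here):

static enum tsx_ctrl_states x86_auto_tsx_mode(void)
{
	/*
	 * Disable TSX only where it is the sole remaining TAA mitigation:
	 * TAA-affected parts without the VERW/MD_CLEAR based clearing.
	 */
	if (boot_cpu_has_bug(X86_BUG_TAA) &&
	    !boot_cpu_has(X86_FEATURE_MD_CLEAR))
		return TSX_CTRL_DISABLE;

	return TSX_CTRL_ENABLE;
}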

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 20:35                           ` Jiri Kosina
@ 2019-10-15 20:54                             ` Josh Poimboeuf
  2019-10-15 20:56                             ` [MODERATED] " Pawan Gupta
  1 sibling, 0 replies; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-15 20:54 UTC (permalink / raw)
  To: speck

On Tue, Oct 15, 2019 at 10:35:04PM +0200, speck for Jiri Kosina wrote:
> On Tue, 15 Oct 2019, speck for Jiri Kosina wrote:
> 
> > > Maybe I'm missing something.  Isn't there going to be a ucode update for
> > > MDS_NO parts, which does the verw buffer clearing?  In that case there's
> > > no need to disable TSX, and instead the verw mitigation could be used,
> > > if desired.
> > 
> > My understanding was that MDS_NO CPUs will only get ucode update that 
> > exposes TSX control MSR, and nothing else.

I'm looking at a Cinco slide deck from Intel, where page 11 says that
CPUs with MDS_NO=1 will get a microcode update to provide both VERW
buffer clearing support and the new MSR interface to disable TSX.

> > > AFAICT, the patch allows to set the default to tsx=auto, which disables 
> > > TSX on *all* vulnerable parts, not just the MDS_NO ones.  I don't see 
> > > how that would prevent user regressions.
> > > 
> > > It sounds like maybe you're suggesting something else, that TSX should
> > > only be disabled on vulnerable MDS_NO parts?
> > 
> > OK, let me take a look at the code again. I definitely thought that's what 
> > 'auto' indeed does.
> 
> OK, so you are right and I misunderstood the logic in the code, sorry. 
> 
> Then the only purpose of 'auto' really is getting TSX enabled on future 
> CPUs which would eventually have ARCH_CAP_TAA_NO=1; so pretty useless for 
> preventing regressions.
> 
> So yeah, I agree, 'auto' is actually useless to prevent regressions, and I 
> believe we want some other 'auto' (*), which would actually disable TSX 
> only if (X86_BUG_TAA && !MD_CLEAR), agreed?

Based on the above-mentioned slide, I don't see how

  X86_BUG_TAA && !MD_CLEAR

is a realistic scenario, if the user has updated their microcode.

> (*) I'd actually prefer to convert the current 'auto' to this new 
>     semantics; it'll keep TSX enabled on future CPUs without X86_BUG_TAA,
>     and it'll prevent regressions in unnecessary cases.

I'm also not sure I see the point of the existing tsx=auto option,
unless we can identify who's actually going to use it.

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 20:35                           ` Jiri Kosina
  2019-10-15 20:54                             ` Josh Poimboeuf
@ 2019-10-15 20:56                             ` Pawan Gupta
  2019-10-15 21:14                               ` Jiri Kosina
  1 sibling, 1 reply; 75+ messages in thread
From: Pawan Gupta @ 2019-10-15 20:56 UTC (permalink / raw)
  To: speck

On Tue, Oct 15, 2019 at 10:35:04PM +0200, speck for Jiri Kosina wrote:
> On Tue, 15 Oct 2019, speck for Jiri Kosina wrote:
> 
> > > Maybe I'm missing something.  Isn't there going to be a ucode update for
> > > MDS_NO parts, which does the verw buffer clearing?  In that case there's
> > > no need to disable TSX, and instead the verw mitigation could be used,
> > > if desired.
> > 
> > My understanding was that MDS_NO CPUs will only get ucode update that 
> > exposes TSX control MSR, and nothing else.
> > 
> > > AFAICT, the patch allows to set the default to tsx=auto, which disables 
> > > TSX on *all* vulnerable parts, not just the MDS_NO ones.  I don't see 
> > > how that would prevent user regressions.
> > > 
> > > It sounds like maybe you're suggesting something else, that TSX should
> > > only be disabled on vulnerable MDS_NO parts?
> > 
> > OK, let me take a look at the code again. I definitely thought that's what 
> > 'auto' indeed does.
> 
> OK, so you are right and I misunderstood the logic in the code, sorry. 
> 
> Then the only purpose of 'auto' really is getting TSX enabled on future 
> CPUs which would eventually have ARCH_CAP_TAA_NO=1; so pretty useless for 
> preventing regressions.
> 
> So yeah, I agree, 'auto' is actually useless to prevent regressions, and I 
> believe we want some other 'auto' (*), which would actually disable TSX 
> only if (X86_BUG_TAA && !MD_CLEAR), agreed?
> 
> (*) I'd actually prefer to convert the current 'auto' to this new 
>     semantics; it'll keep TSX enabled on future CPUs without X86_BUG_TAA,
>     and it'll prevent regressions in unnecessary cases.

tsx_ctrl_is_supported() checks for TSX_CTRL MSR support and only then
attempts to disable TSX. MDS_NO=0 CPUs will not get the ucode update for
TSX control, therefore tsx=auto won't cause a regression on older CPUs.

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 20:56                             ` [MODERATED] " Pawan Gupta
@ 2019-10-15 21:14                               ` Jiri Kosina
  2019-10-15 23:12                                 ` Josh Poimboeuf
  0 siblings, 1 reply; 75+ messages in thread
From: Jiri Kosina @ 2019-10-15 21:14 UTC (permalink / raw)
  To: speck

On Tue, 15 Oct 2019, speck for Pawan Gupta wrote:

> tsx_ctrl_is_supported() checks for TSX_CTRL MSR support and only then 
> attempts to disable TSX. MDS_NO=0 CPUs will not get the ucode update for 
> TSX control, therefore tsx=auto won't cause a regression on older CPUs.

OK, that piece of information finally made it all make sense again :)

So I believe distros still want the option (Michal's patch) to default to 
'auto', so that actual heavy users of TSX will get the right thing once 
they update their CPUs to !TAA_BUG ones, but it's less urgent than I 
originally thought.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 21:14                               ` Jiri Kosina
@ 2019-10-15 23:12                                 ` Josh Poimboeuf
  2019-10-15 23:13                                   ` [MODERATED] [AUTOREPLY] [MODERATED] [AUTOREPLY] Automatic reply: " James, Hengameh M
                                                     ` (2 more replies)
  0 siblings, 3 replies; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-15 23:12 UTC (permalink / raw)
  To: speck

On Tue, Oct 15, 2019 at 11:14:03PM +0200, speck for Jiri Kosina wrote:
> On Tue, 15 Oct 2019, speck for Pawan Gupta wrote:
> 
> > tsx_ctrl_is_supported() checks for the TSX_CTRL MSR support and then 
> > only attempts at disabling TSX. MDS_NO=0 CPUs will not get the ucode 
> > update for TSX control, therefore tsx=auto wont cause regression on 
> > older CPUs.

So just to clarify, CPUs with TAA_BUG and MDS_NO=0 will *not* have
ARCH_CAP_TSX_CTRL_MSR?

I didn't see that important detail mentioned anywhere in the patches.

At the very least, the documentation for tsx=auto and tsx=off needs to be
clarified, as neither seems to mention any MDS_NO=0 contingencies.

> OK, that piece of information finally made it to make sense again :)
> 
> So I believe distros still want the option (Michal's patch) to default to 
> 'auto', so that actual heavy users of TSX will get the right thing once 
> they update their CPUs to !TAA_BUG ones, but it's less urgent that I 
> originally thought.

So if I understand correctly, you're postulating that distros want:

a) TAA_BUG && MDS_NO=0 => TSX on
b) TAA_BUG && MDS_NO=1 => TSX off
c) !TAA_BUG            => TSX on

How are you reaching that conclusion?  It seems horribly confusing for
TSX users, but again maybe I'm missing something.

It seems to me that "heavy users of TSX" would want tsx=on, no matter
what.  And so we would need to leave that as the default in order to not
break those users.

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] [AUTOREPLY] [MODERATED] [AUTOREPLY] Automatic reply: [PATCH v5 08/11] TAAv5 8
  2019-10-15 23:12                                 ` Josh Poimboeuf
@ 2019-10-15 23:13                                   ` James, Hengameh M
  2019-10-16  4:52                                   ` [MODERATED] " Jiri Kosina
  2019-10-18  1:17                                   ` Ben Hutchings
  2 siblings, 0 replies; 75+ messages in thread
From: James, Hengameh M @ 2019-10-15 23:13 UTC (permalink / raw)
  To: speck



^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 23:12                                 ` Josh Poimboeuf
  2019-10-15 23:13                                   ` [MODERATED] [AUTOREPLY] [MODERATED] [AUTOREPLY] Automatic reply: " James, Hengameh M
@ 2019-10-16  4:52                                   ` Jiri Kosina
  2019-10-16  5:05                                     ` Jiri Kosina
  2019-10-16  7:14                                     ` Josh Poimboeuf
  2019-10-18  1:17                                   ` Ben Hutchings
  2 siblings, 2 replies; 75+ messages in thread
From: Jiri Kosina @ 2019-10-16  4:52 UTC (permalink / raw)
  To: speck

On Tue, 15 Oct 2019, speck for Josh Poimboeuf wrote:

> So if I understand correctly, you're postulating that distros want:
> 
> a) TAA_BUG && MDS_NO=0 => TSX on
> b) TAA_BUG && MDS_NO=1 => TSX off
> c) !TAA_BUG            => TSX on
> 
> How are you reaching that conclusion?  It seems horribly confusing for
> TSX users, but again maybe I'm missing something.
> 
> It seems to me that "heavy users of TSX" would want tsx=on, no matter
> what.  And so we would need to leave that as the default in order to not
> break those users.

But then we're not defaulting to safe behavior, which is confusing too, 
because we almost always did that in the past (SMT being the exception) 
for all the previous issues.

So I believe the above is the best compromise between having as few 
regressions as possible, while still maintaining security (by default).

I agree it's not nice, but nothing is, if you ask me.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-16  4:52                                   ` [MODERATED] " Jiri Kosina
@ 2019-10-16  5:05                                     ` Jiri Kosina
  2019-10-21 21:15                                       ` Luck, Tony
  2019-10-16  7:14                                     ` Josh Poimboeuf
  1 sibling, 1 reply; 75+ messages in thread
From: Jiri Kosina @ 2019-10-16  5:05 UTC (permalink / raw)
  To: speck

On Wed, 16 Oct 2019, speck for Jiri Kosina wrote:

> But then we're not defaulting to safe behavior, which is confusing too, 
> because we almost (SMT being the exception) always did in the past for all 
> the other previous issues.
> 
> So I believe the above is the best compromise between having as few 
> regressions as possible, while still maintaining security (by default).
> 
> I agree it's not nice, but nothing is, if you ask me.

But again, given Pawan's statement (surprising to me, I'd still like to find 
this detail documented somewhere in Intel's materials) that there will never 
be a situation where an older CPU with MDS_NO=0 sees ARCH_CAP_TSX_CTRL_MSR 
exposed by new ucode, I see this as a much less pressing issue, and we could 
get away with tsx=off, as the number of immediate regressions will be 
minimal.

And there will be enough time to push the information (the need for a 
command line option) out to TSX users before the future CPUs arrive.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-16  4:52                                   ` [MODERATED] " Jiri Kosina
  2019-10-16  5:05                                     ` Jiri Kosina
@ 2019-10-16  7:14                                     ` Josh Poimboeuf
  2019-10-16  7:20                                       ` Jiri Kosina
  1 sibling, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-16  7:14 UTC (permalink / raw)
  To: speck

On Wed, Oct 16, 2019 at 06:52:15AM +0200, speck for Jiri Kosina wrote:
> On Tue, 15 Oct 2019, speck for Josh Poimboeuf wrote:
> 
> > So if I understand correctly, you're postulating that distros want:
> > 
> > a) TAA_BUG && MDS_NO=0 => TSX on
> > b) TAA_BUG && MDS_NO=1 => TSX off
> > c) !TAA_BUG            => TSX on
> > 
> > How are you reaching that conclusion?  It seems horribly confusing for
> > TSX users, but again maybe I'm missing something.
> > 
> > It seems to me that "heavy users of TSX" would want tsx=on, no matter
> > what.  And so we would need to leave that as the default in order to not
> > break those users.
> 
> But then we're not defaulting to safe behavior, which is confusing too, 
> because we almost (SMT being the exception) always did in the past for all 
> the other previous issues.

Why wouldn't it be safe behavior?  For both TAA_BUG cases, we can
mitigate TSX with verw buffer clearing.
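
For context, this is roughly what the existing MDS buffer clear helper does
(modeled on mainline's mds_clear_cpu_buffers(); shown only to illustrate what
"verw buffer clearing" means here):

static inline void mds_clear_cpu_buffers(void)
{
	static const u16 ds = __KERNEL_DS;

	/*
	 * Microcode overloads VERW so that, besides its architectural job,
	 * it also flushes the affected CPU buffers.  Any valid writable
	 * segment selector works; __KERNEL_DS is simply convenient.
	 */
	asm volatile("verw %[ds]" : : [ds] "m" (ds) : "cc");
}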

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-16  7:14                                     ` Josh Poimboeuf
@ 2019-10-16  7:20                                       ` Jiri Kosina
  0 siblings, 0 replies; 75+ messages in thread
From: Jiri Kosina @ 2019-10-16  7:20 UTC (permalink / raw)
  To: speck

On Wed, 16 Oct 2019, speck for Josh Poimboeuf wrote:

> Why wouldn't it be safe behavior?  For both TAA_BUG cases, we can
> mitigate TSX with verw buffer clearing.

If ucode providing VERW clearing really does become available even for 
MDS_NO=1 systems, then yes, we would probably default to TSX on.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 10:34             ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
  2019-10-15 13:06               ` Josh Poimboeuf
  2019-10-15 17:47               ` Borislav Petkov
@ 2019-10-16  7:26               ` Jiri Kosina
  2019-10-16  7:54                 ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
  2 siblings, 1 reply; 75+ messages in thread
From: Jiri Kosina @ 2019-10-16  7:26 UTC (permalink / raw)
  To: speck

On Tue, 15 Oct 2019, speck for Michal Hocko wrote:

> From 9666e91b63cd6213d362d04289e1bcbbe2050bc3 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Tue, 15 Oct 2019 11:21:01 +0200
> Subject: [PATCH] x86, tsx: allow to set tsx=auto by a config option
> 
> There is a general consensus that TSX usage is not largely spread while
> the history shows there is a non trivial space for side channel attacks
> possible. Therefore the tsx is disabled by default even on platforms
> that might have a safe implementation of TSX according to the current
> knowledge. This is a fair trade off to make.
> 
> There are, however, workloads that really do benefit from using TSX and
> updating to a newer kernel with TSX disabled might introduce a
> noticeable regressions. This would be especially a problem for Linux
> distributions which will provide TAA mitigations.
> 
> Introduce X86_INTEL_ENABLE_SAFE_TSX config option to override the
> default tsx=off semantic and make tsx=auto a default which is more
> update friendly.

So given the recent clarifications, especially around older MDS_NO=0 CPUs, 
I think it'd actually make more sense to turn this into a config option 
that simply allows choosing what the default behavior should be, as a 
tristate of off/on/auto.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: Re: [PATCH v5 08/11] TAAv5 8
  2019-10-16  7:26               ` [MODERATED] Re: ***UNCHECKED*** " Jiri Kosina
@ 2019-10-16  7:54                 ` Michal Hocko
  2019-10-16  9:23                   ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
  0 siblings, 1 reply; 75+ messages in thread
From: Michal Hocko @ 2019-10-16  7:54 UTC (permalink / raw)
  To: speck

On Wed 16-10-19 09:26:14, speck for Jiri Kosina wrote:
> On Tue, 15 Oct 2019, speck for Michal Hocko wrote:
> 
> > From 9666e91b63cd6213d362d04289e1bcbbe2050bc3 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.com>
> > Date: Tue, 15 Oct 2019 11:21:01 +0200
> > Subject: [PATCH] x86, tsx: allow to set tsx=auto by a config option
> > 
> > There is a general consensus that TSX usage is not largely spread while
> > the history shows there is a non trivial space for side channel attacks
> > possible. Therefore the tsx is disabled by default even on platforms
> > that might have a safe implementation of TSX according to the current
> > knowledge. This is a fair trade off to make.
> > 
> > There are, however, workloads that really do benefit from using TSX and
> > updating to a newer kernel with TSX disabled might introduce a
> > noticeable regressions. This would be especially a problem for Linux
> > distributions which will provide TAA mitigations.
> > 
> > Introduce X86_INTEL_ENABLE_SAFE_TSX config option to override the
> > default tsx=off semantic and make tsx=auto a default which is more
> > update friendly.
> 
> So given the recent clarifications, especially around older MDS_NO=0 CPUs, 
> I think it'd actually make more sense to turn this into a config option 
> that would simply allow to chose what the default behavior should be, 
> chosing from tristate off/on/auto.

This on top of the commit?
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9823e34b81ce..f7f79f189359 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1940,11 +1940,11 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS
 
 	  If unsure, say y.
 
-config X86_INTEL_ENABLE_SAFE_TSX
-	prompt ""
-	def_bool n
+choice
+	prompt "TSX enable mode"
 	depends on CPU_SUP_INTEL
-	---help---
+	default X86_INTEL_TSX_MODE_OFF
+	help
 	  Intel's TSX (Transactional Synchronization Extensions) feature
 	  allows to optimize locking protocols through lock elision which
 	  can lead to a noticeable performance boost.
@@ -1953,14 +1953,37 @@ config X86_INTEL_ENABLE_SAFE_TSX
 	  to form side channel attacks (e.g. TAA) and chances are there
 	  will be more of those attacks discovered in the future.
 
-	  Therefore the TSX is not enabled by default. An admin might override
-	  this decision by tsx=on command line parameter. This has a risk that
-	  TSX will get enabled also on platforms which are known to be vulnerable
-	  to attacks like TAA and a safer option is to use tsx=auto command line
-	  parameter. Enabling this config option will make tsx=auto the default.
-	  See Documentation/admin-guide/kernel-parameters.txt for more details.
+	  Therefore TSX is not enabled by default (aka tsx=off). An admin
+	  might override this decision with the tsx=on command line parameter.
+	  This has the risk that TSX also gets enabled on platforms which are
+	  known to be vulnerable to attacks like TAA; a safer option is to
+	  use the tsx=auto command line parameter.
 
-	  If you really benefit from TSX then enable this option, otherwise say n.
+	  This option allows setting the default tsx mode to on, off or
+	  auto. See Documentation/admin-guide/kernel-parameters.txt for more
+	  details.
+
+	  Say off if unsure, auto if TSX is in use but should only be used on
+	  safe platforms, or on if TSX is in use and the security aspect of
+	  TSX is not relevant.
+
+config X86_INTEL_TSX_MODE_OFF
+	bool "off"
+	help
+	  TSX is always disabled - equals tsx=off command line parameter.
+
+config X86_INTEL_TSX_MODE_ON
+	bool "on"
+	help
+	  TSX is always enabled on TSX capable HW - equals tsx=on command line
+	  parameter.
+
+config X86_INTEL_TSX_MODE_AUTO
+	bool "auto"
+	help
+	  TSX is enabled on TSX capable HW that is believed to be safe against
+	  side channel attacks - equals tsx=auto command line parameter.
+endchoice
 
 config EFI
 	bool "EFI runtime service support"
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index d3dc1ce5cd4b..a2df0d5ffb15 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -99,10 +99,12 @@ void __init tsx_init(void)
 		}
 	} else {
 		/* tsx= not provided */
-		if (IS_ENABLED(X86_INTEL_ENABLE_SAFE_TSX))
+		if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_AUTO))
 			tsx_ctrl_state = x86_safe_tsx_mode();
-		else
+		else if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_OFF))
 			tsx_ctrl_state = TSX_CTRL_DISABLE;
+		else
+			tsx_ctrl_state = TSX_CTRL_ENABLE;
 	}
 
 	if (tsx_ctrl_state == TSX_CTRL_DISABLE) {
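
With the choice above, a distro that wants the update-friendly default would
just carry a config fragment like the following (symbol name as defined by
the choice):

	CONFIG_X86_INTEL_TSX_MODE_AUTO=y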
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: Re: Re: [PATCH v5 08/11] TAAv5 8
  2019-10-16  7:54                 ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
@ 2019-10-16  9:23                   ` Michal Hocko
  2019-10-16 12:15                     ` Thomas Gleixner
  0 siblings, 1 reply; 75+ messages in thread
From: Michal Hocko @ 2019-10-16  9:23 UTC (permalink / raw)
  To: speck

On Wed 16-10-19 09:54:34, speck for Michal Hocko wrote:
[...]
> +config X86_INTEL_TSX_MODE_OFF
> +	bool "off"
> +	help
> +	  TSX is always disabled - equals tsx=off command line parameter.
> +
> +config X86_INTEL_TSX_MODE_ON
> +	bool "on"
> +	help
> +	  TSX is always enabled on TSX capable HW - equals tsx=on command line
> +	  parameter.
> +
> +config X86_INTEL_TSX_MODE_AUTO
> +	bool "auto"
> +	help
> +	  TSX is enabled on TSX capable HW that is believed to be safe against
> +	  side channel attacks- equals tsx=auto command line parameter.
> +endchoice

I have just reread the discussion and I have to say that there is
more confusion and uncertainty than I would like to see. Can somebody
from Intel give an authoritative answer to the following question
please?

Is tsx=auto going to lead to any different state than tsx=on? In other
words does auto mode make any sense at all?

Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: ***UNCHECKED*** Re: Re: Re: [PATCH v5 08/11] TAAv5 8
  2019-10-16  9:23                   ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
@ 2019-10-16 12:15                     ` Thomas Gleixner
  2019-10-16 18:34                       ` [MODERATED] " Pawan Gupta
  2019-10-18  0:14                       ` Pawan Gupta
  0 siblings, 2 replies; 75+ messages in thread
From: Thomas Gleixner @ 2019-10-16 12:15 UTC (permalink / raw)
  To: speck

On Wed, 16 Oct 2019, speck for Michal Hocko wrote:
> On Wed 16-10-19 09:54:34, speck for Michal Hocko wrote:
> [...]
> > +config X86_INTEL_TSX_MODE_OFF
> > +	bool "off"
> > +	help
> > +	  TSX is always disabled - equals tsx=off command line parameter.
> > +
> > +config X86_INTEL_TSX_MODE_ON
> > +	bool "on"
> > +	help
> > +	  TSX is always enabled on TSX capable HW - equals tsx=on command line
> > +	  parameter.
> > +
> > +config X86_INTEL_TSX_MODE_AUTO
> > +	bool "auto"
> > +	help
> > +	  TSX is enabled on TSX capable HW that is believed to be safe against
> > +	  side channel attacks- equals tsx=auto command line parameter.
> > +endchoice
> 
> I have just reread the discussion again and I have to say that there is
> more confusion and uncertainty than I would like to see. Can somebody
> from Intel give an authoritative answer to the following question
> please?
> 
> Is tsx=auto going to lead to any different state than tsx=on? In other
> words does auto mode make any sense at all?

Is it only my vacation induced spark of mental sanity or is this whole TAA
thing a complete trainwreck again?

We're at version 5 and more than 3 months since the first RFC got posted, and
we are still debating which CPUs are affected and which mitigations are
going to be deployed depending on the CPU's advertised misfeatures and the
eventually surfacing microcode?

Can we please stop this complete waste of time right now and start over
with clarifying the situation? I.e. someone at Intel needs to sit down and
write up a matrix:

TAA-Affected | MDS_NO | VERW works | TSX_MSR | Resulting action
-------------|--------|------------|---------|-----------------
 No          |   X    |     X      |    Y    | ?
 No          |   X    |     X      |    N    | None
 Yes         |   0    |     0      |    0    | SNAFU
 ...         |        |            |         |

You surely can fill in the rest on your own, right?

And exactly that information wants to be in the admin documentation or in a
separate Documentation/x86/taa.rst as well.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-16 12:15                     ` Thomas Gleixner
@ 2019-10-16 18:34                       ` Pawan Gupta
  2019-10-18  0:14                       ` Pawan Gupta
  1 sibling, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-16 18:34 UTC (permalink / raw)
  To: speck

On Wed, Oct 16, 2019 at 02:15:08PM +0200, speck for Thomas Gleixner wrote:
> On Wed, 16 Oct 2019, speck for Michal Hocko wrote:
> > On Wed 16-10-19 09:54:34, speck for Michal Hocko wrote:
> > [...]
> > > +config X86_INTEL_TSX_MODE_OFF
> > > +	bool "off"
> > > +	help
> > > +	  TSX is always disabled - equals tsx=off command line parameter.
> > > +
> > > +config X86_INTEL_TSX_MODE_ON
> > > +	bool "on"
> > > +	help
> > > +	  TSX is always enabled on TSX capable HW - equals tsx=on command line
> > > +	  parameter.
> > > +
> > > +config X86_INTEL_TSX_MODE_AUTO
> > > +	bool "auto"
> > > +	help
> > > +	  TSX is enabled on TSX capable HW that is believed to be safe against
> > > +	  side channel attacks- equals tsx=auto command line parameter.
> > > +endchoice
> > 
> > I have just reread the discussion again and I have to say that there is
> > more confusion and uncertainty than I would like to see. Can somebody
> > from Intel give an authoritative answer to the following question
> > please?
> > 
> > Is tsx=auto going to lead to any different state than tsx=on? In other
> > words does auto mode make any sense at all?
> 
> Is it only my vacation induced spark of mental sanity or is this whole TAA
> thing a complete trainwreck again?
> 
> We're at version 5 and more than 3 month since the first RFC got posted and
> we are still debating which CPUs are affected and which migitations are
> going to be deployed depending on the CPU advertized misfeatures and the
> eventually surfacing microcode?
> 
> Can we please stop this complete waste of time right now and start over
> with clarifying the situation? I.e. someone at Intel needs to sit down and
> write up a matrix:
> 
> TAA-Affected | MDS_NO | VERW works | TSX_MSR | Resulting action
> -------------|--------|------------|---------|-----------------
>  No          |   X    |     X      |    Y    | ?
>  No          |   X    |     X      |    N    | None
>  Yes         |   0    |     0      |    0    | SNAFU
>  ...         |        |            |         |
> 
> You surely can fill in the rest on your own, right?
> 
> And exactly that information wants to be in the admin documentation or in a
> separate Documentation/x86/taa.rst as well.

I am working on this with other folks at Intel.

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-16 12:15                     ` Thomas Gleixner
  2019-10-16 18:34                       ` [MODERATED] " Pawan Gupta
@ 2019-10-18  0:14                       ` Pawan Gupta
  2019-10-21  8:09                         ` Thomas Gleixner
  2019-10-21 12:54                         ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
  1 sibling, 2 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-18  0:14 UTC (permalink / raw)
  To: speck

On Wed, Oct 16, 2019 at 02:15:08PM +0200, speck for Thomas Gleixner wrote:
> On Wed, 16 Oct 2019, speck for Michal Hocko wrote:
> > Is tsx=auto going to lead to any different state than tsx=on? In other
> > words does auto mode make any sense at all?
> 
> Is it only my vacation induced spark of mental sanity or is this whole TAA
> thing a complete trainwreck again?
> 
> We're at version 5 and more than 3 month since the first RFC got posted and
> we are still debating which CPUs are affected and which migitations are
> going to be deployed depending on the CPU advertized misfeatures and the
> eventually surfacing microcode?
> 
> Can we please stop this complete waste of time right now and start over
> with clarifying the situation? I.e. someone at Intel needs to sit down and
> write up a matrix:
> 
> TAA-Affected | MDS_NO | VERW works | TSX_MSR | Resulting action
> -------------|--------|------------|---------|-----------------
>  No          |   X    |     X      |    Y    | ?
>  No          |   X    |     X      |    N    | None
>  Yes         |   0    |     0      |    0    | SNAFU
>  ...         |        |            |         |
> 
> You surely can fill in the rest on your own, right?
> 
> And exactly that information wants to be in the admin documentation or in a
> separate Documentation/x86/taa.rst as well.

Below is the matrix for the impact of the tsx=<on|off|auto> cmdline options
on the state of TAA mitigation, VERW behavior and the TSX feature for various
combinations of MSR_IA32_ARCH_CAPABILITIES bits.

1. tsx=auto
+----------+----------+----------------+---------------+--------------+-------------------+
|  MSR_IA32_ARCH_CAPABILITIES bits     |           Result with cmdline tsx=auto           |
+----------+----------+----------------+---------------+--------------+-------------------+
|  TAA_NO  |  MDS_NO  |  TSX_CTRL_MSR  |  VERW clears  |  TSX state   |  TAA mitigation   |
|          |          |                |  CPU buffers  | after bootup |                   |
+==========+==========+================+===============+==============+===================+
|    0     |    0     |       0        |      Yes      |  HW default  |    Same as MDS    |
+----------+----------+----------------+---------------+--------------+-------------------+
|    0     |    0     |       1        | Invalid case  | Invalid case |   Invalid case    |
+----------+----------+----------------+---------------+--------------+-------------------+
|    0     |    1     |       0        |      No       |  HW default  | Need ucode update |
+----------+----------+----------------+---------------+--------------+-------------------+
|    0     |    1     |       1        |      Yes      | TSX disabled |   TSX disabled    |
+----------+----------+----------------+---------------+--------------+-------------------+
|    1     |    X     |       1        |       X       | TSX enabled  |    None needed    |
+----------+----------+----------------+---------------+--------------+-------------------+


2. tsx=on
+----------+----------+----------------+---------------+--------------+-------------------+
|  MSR_IA32_ARCH_CAPABILITIES bits     |           Result with cmdline tsx=on             |
+----------+----------+----------------+---------------+--------------+-------------------+
|  TAA_NO  |  MDS_NO  |  TSX_CTRL_MSR  |  VERW clears  |  TSX state   |  TAA mitigation   |
|          |          |                |  CPU buffers  | after bootup |                   |
+==========+==========+================+===============+==============+===================+
|    0     |    0     |       0        |      Yes      |  HW default  |    Same as MDS    |
+----------+----------+----------------+---------------+--------------+-------------------+
|    0     |    0     |       1        | Invalid case  | Invalid case |   Invalid case    |
+----------+----------+----------------+---------------+--------------+-------------------+
|    0     |    1     |       0        |      No       |  HW default  | Need ucode update |
+----------+----------+----------------+---------------+--------------+-------------------+
|    0     |    1     |       1        |      Yes      | TSX enabled  |    Same as MDS    |
+----------+----------+----------------+---------------+--------------+-------------------+
|    1     |    X     |       1        |       X       | TSX enabled  |    None needed    |
+----------+----------+----------------+---------------+--------------+-------------------+


3. tsx=off
+----------+----------+----------------+---------------+--------------+-------------------+
|  MSR_IA32_ARCH_CAPABILITIES bits     |           Result with cmdline tsx=off            |
+----------+----------+----------------+---------------+--------------+-------------------+
|  TAA_NO  |  MDS_NO  |  TSX_CTRL_MSR  |  VERW clears  |  TSX state   |  TAA mitigation   |
|          |          |                |  CPU buffers  | after bootup |                   |
+==========+==========+================+===============+==============+===================+
|    0     |    0     |       0        |      Yes      |  HW default  |    Same as MDS    |
+----------+----------+----------------+---------------+--------------+-------------------+
|    0     |    0     |       1        | Invalid case  | Invalid case |   Invalid case    |
+----------+----------+----------------+---------------+--------------+-------------------+
|    0     |    1     |       0        |      No       |  HW default  | Need ucode update |
+----------+----------+----------------+---------------+--------------+-------------------+
|    0     |    1     |       1        |      Yes      | TSX disabled |   TSX disabled    |
+----------+----------+----------------+---------------+--------------+-------------------+
|    1     |    X     |       1        |       X       | TSX disabled |    None needed    |
+----------+----------+----------------+---------------+--------------+-------------------+
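
The tsx=auto column boils down to a very small decision in the
x86_safe_tsx_mode() helper referenced elsewhere in this thread; a sketch
(the exact body is an assumption, the state names are from the series):

static enum tsx_ctrl_states x86_safe_tsx_mode(void)
{
	/*
	 * "auto" only decides the policy: disable TSX on CPUs affected by
	 * TAA, keep it enabled otherwise.  Whether the decision can
	 * actually be applied still depends on TSX_CTRL being enumerated
	 * (see tsx_ctrl_is_supported()).
	 */
	if (boot_cpu_has_bug(X86_BUG_TAA))
		return TSX_CTRL_DISABLE;

	return TSX_CTRL_ENABLE;
}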

Let me know if there are any questions.

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-15 23:12                                 ` Josh Poimboeuf
  2019-10-15 23:13                                   ` [MODERATED] [AUTOREPLY] [MODERATED] [AUTOREPLY] Automatic reply: " James, Hengameh M
  2019-10-16  4:52                                   ` [MODERATED] " Jiri Kosina
@ 2019-10-18  1:17                                   ` Ben Hutchings
  2019-10-18  4:04                                     ` Pawan Gupta
  2 siblings, 1 reply; 75+ messages in thread
From: Ben Hutchings @ 2019-10-18  1:17 UTC (permalink / raw)
  To: speck

On Tue, 2019-10-15 at 18:12 -0500, speck for Josh Poimboeuf wrote:
> On Tue, Oct 15, 2019 at 11:14:03PM +0200, speck for Jiri Kosina wrote:
[...]
> > OK, that piece of information finally made it to make sense again :)
> > 
> > So I believe distros still want the option (Michal's patch) to default to 
> > 'auto', so that actual heavy users of TSX will get the right thing once 
> > they update their CPUs to !TAA_BUG ones, but it's less urgent that I 
> > originally thought.
> 
> So if I understand correctly, you're postulating that distros want:
> 
> a) TAA_BUG && MDS_NO=0 => TSX on
> b) TAA_BUG && MDS_NO=1 => TSX off
> c) !TAA_BUG            => TSX on
[...]

I think this should be:

a) TAA_BUG && MD_CLEAR=1 => TSX on
b) TAA_BUG && MD_CLEAR=0 => TSX off
c) !TAA_BUG              => TSX on

As I understand it, with currently released microcode, no CPUs have
both MDS_NO and MD_CLEAR set.  But with the pending microcode updates,
CPUs with MDS_NO=1 will also get MD_CLEAR=1 and we can use VERW to
mitigate against TAA.

Ben.

-- 
Ben Hutchings
It's easier to fight for one's principles than to live up to them.


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-18  1:17                                   ` Ben Hutchings
@ 2019-10-18  4:04                                     ` Pawan Gupta
  0 siblings, 0 replies; 75+ messages in thread
From: Pawan Gupta @ 2019-10-18  4:04 UTC (permalink / raw)
  To: speck

On Fri, Oct 18, 2019 at 02:17:12AM +0100, speck for Ben Hutchings wrote:
> On Tue, 2019-10-15 at 18:12 -0500, speck for Josh Poimboeuf wrote:
> > On Tue, Oct 15, 2019 at 11:14:03PM +0200, speck for Jiri Kosina wrote:
> [...]
> > > OK, that piece of information finally made it to make sense again :)
> > > 
> > > So I believe distros still want the option (Michal's patch) to default to 
> > > 'auto', so that actual heavy users of TSX will get the right thing once 
> > > they update their CPUs to !TAA_BUG ones, but it's less urgent that I 
> > > originally thought.
> > 
> > So if I understand correctly, you're postulating that distros want:
> > 
> > a) TAA_BUG && MDS_NO=0 => TSX on
> > b) TAA_BUG && MDS_NO=1 => TSX off
> > c) !TAA_BUG            => TSX on
> [...]
> 
> I think this should be:
> 
> a) TAA_BUG && MD_CLEAR=1 => TSX on
> b) TAA_BUG && MD_CLEAR=0 => TSX off
> c) !TAA_BUG              => TSX on
> 
> As I understand it, with currently released microcode, no CPUs have
> both MDS_NO and MD_CLEAR set.  But with the pending microcode updates,
> CPUs with MDS_NO=1 will also get MD_CLEAR=1 and we can use VERW to
> mitigate against TAA.

With both the current and the pending microcode updates, MDS_NO and MD_CLEAR
can be set at the same time. We should not rely on MD_CLEAR. There are other
means of finding out whether VERW clears the CPU buffers, described in more
detail in the matrix I shared in the other email.

In summary, the TSX_CTRL bit in the ARCH_CAPABILITIES MSR is the true
indicator of whether VERW clears the CPU buffers on MDS_NO=1 CPUs.
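
In code terms the check could look something like this (illustrative helper,
only the ARCH_CAPABILITIES bit and feature names are real):

static bool verw_clears_cpu_buffers(u64 ia32_cap)
{
	/*
	 * MDS-affected parts: VERW flushes the buffers once the MD_CLEAR
	 * microcode (shipped May 2019) is loaded.
	 */
	if (!(ia32_cap & ARCH_CAP_MDS_NO))
		return boot_cpu_has(X86_FEATURE_MD_CLEAR);

	/*
	 * MDS_NO=1 parts: only the new microcode - which also enumerates
	 * TSX_CTRL - makes VERW flush the buffers, so TSX_CTRL_MSR is the
	 * indicator to use.
	 */
	return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
}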

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 09/11] TAAv5 9
  2019-10-11  8:45         ` Greg KH
@ 2019-10-21  8:00           ` Thomas Gleixner
  0 siblings, 0 replies; 75+ messages in thread
From: Thomas Gleixner @ 2019-10-21  8:00 UTC (permalink / raw)
  To: speck

On Fri, 11 Oct 2019, speck for Greg KH wrote:
> On Thu, Oct 10, 2019 at 02:31:51PM -0700, speck for Pawan Gupta wrote:
> I'm sorry, but I do not understand what you are trying to state here.
> 
> Look at the code, and the lock, and determine what it really is doing
> and document that please.  Right now I can't determine it and it looks
> pointless to me, so I must be missing something.

It's protecting the write to the global state along with the actual update
function which writes the MSR on each CPU depending on that global state.

That has to be one protected section for obvious reasons.
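
In other words, the pattern is roughly the following (names are illustrative,
not necessarily the exact ones in the patch):

static DEFINE_MUTEX(tsx_mutex);		/* serializes sysfs writers */

static void tsx_update_this_cpu(void *arg)
{
	/* Apply the global decision to this CPU's MSR_IA32_TSX_CTRL. */
	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
		tsx_disable();
	else
		tsx_enable();
}

static void tsx_update_all(enum tsx_ctrl_states new_state)
{
	mutex_lock(&tsx_mutex);
	tsx_ctrl_state = new_state;			/* global state ... */
	on_each_cpu(tsx_update_this_cpu, NULL, 1);	/* ... and the MSRs */
	mutex_unlock(&tsx_mutex);
}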

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 75+ messages in thread

* Re: [PATCH v5 08/11] TAAv5 8
  2019-10-18  0:14                       ` Pawan Gupta
@ 2019-10-21  8:09                         ` Thomas Gleixner
  2019-10-21 12:54                         ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
  1 sibling, 0 replies; 75+ messages in thread
From: Thomas Gleixner @ 2019-10-21  8:09 UTC (permalink / raw)
  To: speck

On Thu, 17 Oct 2019, speck for Pawan Gupta wrote:
> On Wed, Oct 16, 2019 at 02:15:08PM +0200, speck for Thomas Gleixner wrote:
> Below is the matrix for the impact of tsx=<on|off|auto> cmdline options
> on state of TAA mitigation, VERW behavior and TSX feature for various
> combinations of MSR_IA32_ARCH_CAPABILITIES bits.
> 
> 1. tsx=auto
> +----------+----------+----------------+---------------+--------------+-------------------+
> |  MSR_IA32_ARCH_CAPABILITIES bits     |           Result with cmdline tsx=auto           |
> +----------+----------+----------------+---------------+--------------+-------------------+
> |  TAA_NO  |  MDS_NO  |  TSX_CTRL_MSR  |  VERW clears  |  TSX state   |  TAA mitigation   |
> |          |          |                |  CPU buffers  | after bootup |                   |
> +==========+==========+================+===============+==============+===================+
> |    0     |    0     |       0        |      Yes      |  HW default  |    Same as MDS    |
> +----------+----------+----------------+---------------+--------------+-------------------+
> |    0     |    0     |       1        | Invalid case  | Invalid case |   Invalid case    |
> +----------+----------+----------------+---------------+--------------+-------------------+

So you're saying that if TSX_CTRL_MSR is available VERW clears CPU buffers
independent of MDS_NO being set, right? So:

  MDS_NO	MD_CLEAR	TSX_CTRL_MSR
    0		  0		    0		Vulnerable (needs ucode)
    0		  1		    0		MDS and TAA mitigated via VERW
    1		  1		    0		MDS fixed, TAA vulnerable if TSX enabled
    		  		    		because MD_CLEAR has no meaning and
						VERW is not guaranteed to clear buffers
    1		  x		    1		MDS fixed, TAA can be mitigated by
    		  		    		    	   VERW or TSX_CTRL_MSR

If so, please document all of that properly and fix up the other comments
people had on this series.
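
If the table above holds, the resulting selection is small; a sketch (helper
and state names are illustrative, the branches just restate the rows):

static void __init taa_select_mitigation_sketch(u64 ia32_cap)
{
	if (tsx_disabled()) {
		/* TSX off (tsx=off, or tsx=auto with TSX_CTRL available) */
		taa_mitigation = TAA_MITIGATION_TSX_DISABLED;
	} else if ((boot_cpu_has(X86_FEATURE_MD_CLEAR) &&
		    !(ia32_cap & ARCH_CAP_MDS_NO)) ||
		   (ia32_cap & ARCH_CAP_TSX_CTRL_MSR)) {
		/*
		 * VERW really clears the buffers: MDS ucode on MDS_NO=0
		 * parts, or the new TAA ucode (which enumerates TSX_CTRL)
		 * on MDS_NO=1 parts.
		 */
		taa_mitigation = TAA_MITIGATION_VERW;
	} else {
		/* MDS_NO=1 part still running old microcode */
		taa_mitigation = TAA_MITIGATION_UCODE_NEEDED;
	}
}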

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: ***UNCHECKED*** Re: [PATCH v5 08/11] TAAv5 8
  2019-10-18  0:14                       ` Pawan Gupta
  2019-10-21  8:09                         ` Thomas Gleixner
@ 2019-10-21 12:54                         ` Michal Hocko
  2019-10-21 20:01                           ` [MODERATED] " Pawan Gupta
  1 sibling, 1 reply; 75+ messages in thread
From: Michal Hocko @ 2019-10-21 12:54 UTC (permalink / raw)
  To: speck

On Thu 17-10-19 17:14:07, speck for Pawan Gupta wrote:
[...]
> Let me know if there are any questions.

is there any list of CPUs that would benefit from tsx=auto compared to
tsx=on? In other words do we really need both?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-21 12:54                         ` [MODERATED] Re: ***UNCHECKED*** " Michal Hocko
@ 2019-10-21 20:01                           ` Pawan Gupta
  2019-10-21 20:33                             ` Josh Poimboeuf
  0 siblings, 1 reply; 75+ messages in thread
From: Pawan Gupta @ 2019-10-21 20:01 UTC (permalink / raw)
  To: speck

On Mon, Oct 21, 2019 at 02:54:29PM +0200, speck for Michal Hocko wrote:
> On Thu 17-10-19 17:14:07, speck for Pawan Gupta wrote:
> [...]
> > Let me know if there are any questions.
> 
> is there any list of CPUs that would benefit from tsx=auto compared to
> tsx=on? In other words do we really need both?

Below is the list of TAA-affected CPUs that will get the TSX_CTRL MSR.

+----------------------------+------------------+------------+
|            Name            |  Family / model  |  Stepping  |
+============================+==================+============+
| Whiskey Lake (ULT refresh) |      06_8E       |    0xC     |
+----------------------------+------------------+------------+
|    2nd gen Cascade Lake    |      06_55       |    6, 7    |
+----------------------------+------------------+------------+
|       Coffee Lake R        |      06_9E       |    0xD     |
+----------------------------+------------------+------------+

tsx=auto on these CPUs will disable TSX.
tsx=on will keep TSX enabled.

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-21 20:34                               ` Josh Poimboeuf
@ 2019-10-21 20:33                                 ` Pawan Gupta
  2019-10-21 23:01                                   ` Andrew Cooper
  0 siblings, 1 reply; 75+ messages in thread
From: Pawan Gupta @ 2019-10-21 20:33 UTC (permalink / raw)
  To: speck

On Mon, Oct 21, 2019 at 03:34:40PM -0500, speck for Josh Poimboeuf wrote:
> On Mon, Oct 21, 2019 at 03:33:38PM -0500, Josh Poimboeuf wrote:
> > On Mon, Oct 21, 2019 at 01:01:39PM -0700, speck for Pawan Gupta wrote:
> > > On Mon, Oct 21, 2019 at 02:54:29PM +0200, speck for Michal Hocko wrote:
> > > > On Thu 17-10-19 17:14:07, speck for Pawan Gupta wrote:
> > > > [...]
> > > > > Let me know if there are any questions.
> > > > 
> > > > is there any list of CPUs that would benefit from tsx=auto compared to
> > > > tsx=on? In other words do we really need both?
> > > 
> > > Below is the list of TAA affected CPUs that will get TSX_CTRL MSR.
> > > 
> > > +----------------------------+------------------+------------+
> > > |            Name            |  Family / model  |  Stepping  |
> > > +============================+==================+============+
> > > | Whiskey Lake (ULT refresh) |      06_8E       |    0xC     |
> > > +----------------------------+------------------+------------+
> > > |    2nd gen Cascade Lake    |      06_55       |    6, 7    |
> > > +----------------------------+------------------+------------+
> > > |       Coffee Lake R        |      06_9E       |    0xD     |
> > > +----------------------------+------------------+------------+
> > > 
> > > tsx=auto on these CPUs will disable TSX.
> > > tsx=on will keep TSX enabled.
> 
> [ Please ignore my previous email, it had a glaring typo ]
> 
> Just to clarify, is this list identical to the list of CPUs which are
> vulnerable to TAA and have MDS_NO=1?

That is correct.

Thanks,
Pawan

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-21 20:01                           ` [MODERATED] " Pawan Gupta
@ 2019-10-21 20:33                             ` Josh Poimboeuf
  2019-10-21 20:34                               ` Josh Poimboeuf
  0 siblings, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-21 20:33 UTC (permalink / raw)
  To: speck

On Mon, Oct 21, 2019 at 01:01:39PM -0700, speck for Pawan Gupta wrote:
> On Mon, Oct 21, 2019 at 02:54:29PM +0200, speck for Michal Hocko wrote:
> > On Thu 17-10-19 17:14:07, speck for Pawan Gupta wrote:
> > [...]
> > > Let me know if there are any questions.
> > 
> > is there any list of CPUs that would benefit from tsx=auto compared to
> > tsx=on? In other words do we really need both?
> 
> Below is the list of TAA affected CPUs that will get TSX_CTRL MSR.
> 
> +----------------------------+------------------+------------+
> |            Name            |  Family / model  |  Stepping  |
> +============================+==================+============+
> | Whiskey Lake (ULT refresh) |      06_8E       |    0xC     |
> +----------------------------+------------------+------------+
> |    2nd gen Cascade Lake    |      06_55       |    6, 7    |
> +----------------------------+------------------+------------+
> |       Coffee Lake R        |      06_9E       |    0xD     |
> +----------------------------+------------------+------------+
> 
> tsx=auto on these CPUs will disable TSX.
> tsx=on will keep TSX enabled.

Just to clarify, is this list identical to the list of CPUs which are
vulnerable to TAA and have MDS_NO=0?

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-21 20:33                             ` Josh Poimboeuf
@ 2019-10-21 20:34                               ` Josh Poimboeuf
  2019-10-21 20:33                                 ` Pawan Gupta
  0 siblings, 1 reply; 75+ messages in thread
From: Josh Poimboeuf @ 2019-10-21 20:34 UTC (permalink / raw)
  To: speck

On Mon, Oct 21, 2019 at 03:33:38PM -0500, Josh Poimboeuf wrote:
> On Mon, Oct 21, 2019 at 01:01:39PM -0700, speck for Pawan Gupta wrote:
> > On Mon, Oct 21, 2019 at 02:54:29PM +0200, speck for Michal Hocko wrote:
> > > On Thu 17-10-19 17:14:07, speck for Pawan Gupta wrote:
> > > [...]
> > > > Let me know if there are any questions.
> > > 
> > > is there any list of CPUs that would benefit from tsx=auto compared to
> > > tsx=on? In other words do we really need both?
> > 
> > Below is the list of TAA affected CPUs that will get TSX_CTRL MSR.
> > 
> > +----------------------------+------------------+------------+
> > |            Name            |  Family / model  |  Stepping  |
> > +============================+==================+============+
> > | Whiskey Lake (ULT refresh) |      06_8E       |    0xC     |
> > +----------------------------+------------------+------------+
> > |    2nd gen Cascade Lake    |      06_55       |    6, 7    |
> > +----------------------------+------------------+------------+
> > |       Coffee Lake R        |      06_9E       |    0xD     |
> > +----------------------------+------------------+------------+
> > 
> > tsx=auto on these CPUs will disable TSX.
> > tsx=on will keep TSX enabled.

[ Please ignore my previous email, it had a glaring typo ]

Just to clarify, is this list identical to the list of CPUs which are
vulnerable to TAA and have MDS_NO=1?

-- 
Josh

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-16  5:05                                     ` Jiri Kosina
@ 2019-10-21 21:15                                       ` Luck, Tony
  0 siblings, 0 replies; 75+ messages in thread
From: Luck, Tony @ 2019-10-21 21:15 UTC (permalink / raw)
  To: speck

On Wed, Oct 16, 2019 at 07:05:33AM +0200, speck for Jiri Kosina wrote:
> But again, given Pawan's (surprising to me, I'd still like to find this 
> detail documented somewhere in Intel's materials) statement that there 
> will never be a situation when some older CPU with MDS_NO will see 
> ARCH_CAP_TSX_CTRL_MSR exposed by new ucode, I see this as much less 
> pressing issue, and we could get away with tsx=off, as the number 
> immediate regressions will be minimal.
> 

Here's a rough timeline.

"Old" CPUs = everything vulnerable to MDS before:
	Cascade Lake (model 0x55, stepping 0x6, 0x7),
	Whiskey Lake (model 0x8E, stepping 0xC),
	Coffee Lake refresh (model 0x9E, stepping 0xD)

   These don't set MDS_NO. They already have (since May 14) a microcode update to
   enable MD_CLEAR (VERW + enhanced L1D flush). They are vulnerable to TAA, but
   the existing MDS fixes also mitigate TAA. They are not going to get a
   u-code update for TAA. They will not have TSX_CTRL.


"Current" CPUs (Cascade Lake, Whiskey Lake, Coffee Lake refresh).
   Out of the box these have MDS_NO=1. They set the CPUID MD_CLEAR bit
   (even though VERW doesn't flush the store buffers, fill buffers & load ports.
   Blame the VMM folks for talking us into doing that).

   There is (Embargo: Nov 12) a microcode update for these CPUS that:
   1) Enables the VERW/L1D flush of fill buffers, store buffers, load ports
   2) Provides the TSX_CTRL MSR

   OS can choose whether to mitigate TAA by either turning off TSX, or by
   using VERW.  [Note that because MD_CLEAR isn't a useful indicator here,
   the OS can use the existence of TSX_CTRL to determine whether VERW is effective.]

"Future" CPUs
   These will set TAA_NO in addition to MDS_NO.  Intel has said that we will
   continue to support TSX_CTRL.

-Tony

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-21 20:33                                 ` Pawan Gupta
@ 2019-10-21 23:01                                   ` Andrew Cooper
  2019-10-21 23:37                                     ` Luck, Tony
  0 siblings, 1 reply; 75+ messages in thread
From: Andrew Cooper @ 2019-10-21 23:01 UTC (permalink / raw)
  To: speck

On 21/10/2019 21:33, speck for Pawan Gupta wrote:
> On Mon, Oct 21, 2019 at 03:34:40PM -0500, speck for Josh Poimboeuf wrote:
>> On Mon, Oct 21, 2019 at 03:33:38PM -0500, Josh Poimboeuf wrote:
>>> On Mon, Oct 21, 2019 at 01:01:39PM -0700, speck for Pawan Gupta wrote:
>>>> On Mon, Oct 21, 2019 at 02:54:29PM +0200, speck for Michal Hocko wrote:
>>>>> On Thu 17-10-19 17:14:07, speck for Pawan Gupta wrote:
>>>>> [...]
>>>>>> Let me know if there are any questions.
>>>>> is there any list of CPUs that would benefit from tsx=auto compared to
>>>>> tsx=on? In other words do we really need both?
>>>> Below is the list of TAA affected CPUs that will get TSX_CTRL MSR.
>>>>
>>>> +----------------------------+------------------+------------+
>>>> |            Name            |  Family / model  |  Stepping  |
>>>> +============================+==================+============+
>>>> | Whiskey Lake (ULT refresh) |      06_8E       |    0xC     |
>>>> +----------------------------+------------------+------------+
>>>> |    2nd gen Cascade Lake    |      06_55       |    6, 7    |
>>>> +----------------------------+------------------+------------+
>>>> |       Coffee Lake R        |      06_9E       |    0xD     |
>>>> +----------------------------+------------------+------------+
>>>>
>>>> tsx=auto on these CPUs will disable TSX.
>>>> tsx=on will keep TSX enabled.
>> [ Please ignore my previous email, it had a glaring typo ]
>>
>> Just to clarify, is this list identical to the list of CPUs which are
>> vulnerable to TAA and have MDS_NO=1?
> That is correct.

What about the Comet Lake parts? They are explicitly called out in
the list of MDS_NO=1, TAA-vulnerable parts in the initial whitepaper about TAA.

~Andrew


^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-21 23:01                                   ` Andrew Cooper
@ 2019-10-21 23:37                                     ` Luck, Tony
  2019-10-21 23:39                                       ` Andrew Cooper
  0 siblings, 1 reply; 75+ messages in thread
From: Luck, Tony @ 2019-10-21 23:37 UTC (permalink / raw)
  To: speck

On Tue, Oct 22, 2019 at 12:01:01AM +0100, speck for Andrew Cooper wrote:
> What about Commit Lake parts, because that is explicitly called out in
> the list of MDS_NO=1, TAA-vulnerable parts in initial whitepaper about TAA.

We've dropped Comet Lake from that list (someone pointed out that it
doesn't have TSX).

-Tony

^ permalink raw reply	[flat|nested] 75+ messages in thread

* [MODERATED] Re: [PATCH v5 08/11] TAAv5 8
  2019-10-21 23:37                                     ` Luck, Tony
@ 2019-10-21 23:39                                       ` Andrew Cooper
  0 siblings, 0 replies; 75+ messages in thread
From: Andrew Cooper @ 2019-10-21 23:39 UTC (permalink / raw)
  To: speck

On 22/10/2019 00:37, speck for Luck, Tony wrote:
> On Tue, Oct 22, 2019 at 12:01:01AM +0100, speck for Andrew Cooper wrote:
>> What about the Comet Lake parts? They are explicitly called out in
>> the list of MDS_NO=1, TAA-vulnerable parts in the initial whitepaper about TAA.
> We've dropped Comet Lake from that list (someone pointed out that it
> doesn't have TSX).

Ah.  One fewer thing to worry about.

Thanks.

~Andrew


^ permalink raw reply	[flat|nested] 75+ messages in thread
