* [patch V7 00/15] SBB 0
@ 2018-04-29 19:30 Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 01/15] SBB 1 Thomas Gleixner
                   ` (16 more replies)
  0 siblings, 17 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

This is an update based on Konrad's V6 series. The major changes are:

 - Disentangle the mitigation control from the AMD/Intel cpu init code and
   keep it confined to bugs.c. That's cleaner and required to make the
   prctl mode work properly on both AMD and Intel

 - Avoid parsing the command line when RDS is not supported at all.

 - Make all the spec ctrl msr related variables __ro_after_init instead of
   read_mostly. Nothing can fiddle with them after boot.

 - Integrate the PRCTL:

   - Split it up into separate patches

   - Use the new scheme as proposed by Jon Masters

   - Document the interface and return values

   - Implement the context magic as I suggested in the earlier review

   - Make it work for both AMD and Intel

   - Allow caching of the AMD magic MSR to avoid an RMW on context switch.

   - Implement the arch prctl according to the new scheme w/o all the magic
     nonsense of allowing writes when prctl control is disabled and reusing
     the existing functions instead of adding yet another pile.

Things which need some thought:

 - The default mode for Intel is now PRCTL controlled, which might be OK
   for enterprise distro stuff. But what is going to happen for the rest of
   the world? Are the browsers prctl enabled on CRD? I seriously doubt that,
   and I'm pondering making the default ON for that very reason. We should
   not require Joe User to add magic crap to the kernel command line to get
   protected. Enterprise admins should know how to do that.

TODOs:
  - Format the prctl documentation properly as rst
  - Write a patch for the prctl(2) man page

Thanks,

	tglx

8<--------------
 Documentation/admin-guide/kernel-parameters.txt |   36 ++
 Documentation/userspace-api/index.rst           |    1 
 arch/x86/include/asm/cpufeatures.h              |    4 
 arch/x86/include/asm/msr-index.h                |    3 
 arch/x86/include/asm/nospec-branch.h            |   32 +-
 arch/x86/include/asm/thread_info.h              |    4 
 arch/x86/kernel/cpu/amd.c                       |   21 +
 arch/x86/kernel/cpu/bugs.c                      |  337 +++++++++++++++++++++++-
 arch/x86/kernel/cpu/common.c                    |   47 ++-
 arch/x86/kernel/cpu/cpu.h                       |    2 
 arch/x86/kernel/cpu/intel.c                     |    1 
 arch/x86/kernel/process.c                       |   22 +
 arch/x86/kvm/cpuid.c                            |    2 
 arch/x86/kvm/svm.c                              |    8 
 arch/x86/kvm/vmx.c                              |   16 -
 b/Documentation/userspace-api/spec_ctrl.rst     |   65 ++++
 b/arch/x86/include/asm/specctrl.h               |   38 ++
 drivers/base/cpu.c                              |    8 
 include/linux/cpu.h                             |    2 
 include/linux/nospec.h                          |    5 
 include/uapi/linux/prctl.h                      |   11 
 kernel/sys.c                                    |   18 +
 22 files changed, 638 insertions(+), 45 deletions(-)


* [patch V7 01/15] SBB 1
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 02/15] SBB 2 Thomas Gleixner
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

Combine the various pieces of logic which iterate through all those
x86_cpu_id matching structures into one function.

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>

---
v2: New patch
v3: Added Reviewed-by
v4: Changed title of the patch
---
 arch/x86/kernel/cpu/common.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 348cf4821240..74722c38a836 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -918,21 +918,27 @@ static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
 	{}
 };
 
-static bool __init cpu_vulnerable_to_meltdown(struct cpuinfo_x86 *c)
+static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 {
 	u64 ia32_cap = 0;
 
+	if (x86_match_cpu(cpu_no_speculation))
+		return;
+
+	setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
+	setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+
 	if (x86_match_cpu(cpu_no_meltdown))
-		return false;
+		return;
 
 	if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
 		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
 
 	/* Rogue Data Cache Load? No! */
 	if (ia32_cap & ARCH_CAP_RDCL_NO)
-		return false;
+		return;
 
-	return true;
+	setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
 }
 
 /*
@@ -982,12 +988,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 
 	setup_force_cpu_cap(X86_FEATURE_ALWAYS);
 
-	if (!x86_match_cpu(cpu_no_speculation)) {
-		if (cpu_vulnerable_to_meltdown(c))
-			setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
-		setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
-		setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
-	}
+	cpu_set_bug_bits(c);
 
 	fpu__init_system(c);
 
-- 
2.14.3


* [patch V7 02/15] SBB 2
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 01/15] SBB 1 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 03/15] SBB 3 Thomas Gleixner
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

Those sysfs functions have a similar preamble; as such, add common
code to handle them.

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>

---
v2: New patch
v3: Added Reviewed-by
v4: Fixed the title
---
 arch/x86/kernel/cpu/bugs.c | 46 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 32 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bfca937bdcc3..ad613f73d071 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -314,30 +314,48 @@ static void __init spectre_v2_select_mitigation(void)
 #undef pr_fmt
 
 #ifdef CONFIG_SYSFS
-ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
+
+ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
+			char *buf, unsigned int bug)
 {
-	if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
+	if (!boot_cpu_has_bug(bug))
 		return sprintf(buf, "Not affected\n");
-	if (boot_cpu_has(X86_FEATURE_PTI))
-		return sprintf(buf, "Mitigation: PTI\n");
+
+	switch (bug) {
+	case X86_BUG_CPU_MELTDOWN:
+		if (boot_cpu_has(X86_FEATURE_PTI))
+			return sprintf(buf, "Mitigation: PTI\n");
+
+		break;
+
+	case X86_BUG_SPECTRE_V1:
+		return sprintf(buf, "Mitigation: __user pointer sanitization\n");
+
+	case X86_BUG_SPECTRE_V2:
+		return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
+			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
+			       spectre_v2_module_string());
+
+	default:
+		break;
+	}
+
 	return sprintf(buf, "Vulnerable\n");
 }
 
+ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return cpu_show_common(dev, attr, buf, X86_BUG_CPU_MELTDOWN);
+}
+
 ssize_t cpu_show_spectre_v1(struct device *dev, struct device_attribute *attr, char *buf)
 {
-	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1))
-		return sprintf(buf, "Not affected\n");
-	return sprintf(buf, "Mitigation: __user pointer sanitization\n");
+	return cpu_show_common(dev, attr, buf, X86_BUG_SPECTRE_V1);
 }
 
 ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, char *buf)
 {
-	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
-		return sprintf(buf, "Not affected\n");
-
-	return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
-		       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
-		       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
-		       spectre_v2_module_string());
+	return cpu_show_common(dev, attr, buf, X86_BUG_SPECTRE_V2);
 }
 #endif
-- 
2.14.3


* [patch V7 03/15] SBB 3
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 01/15] SBB 1 Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 02/15] SBB 2 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 23:31   ` [MODERATED] " Linus Torvalds
  2018-04-29 19:30 ` [patch V7 04/15] SBB 4 Thomas Gleixner
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

The 336996-Speculative-Execution-Side-Channel-Mitigations.pdf refers to all
the other bits as reserved. The Intel SDM glossary defines reserved as
implementation specific - aka unknown.

As such, at bootup this must be taken into account and proper masking
applied for the bits in use.

A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199511
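
To make the semantics concrete, here is a minimal sketch (illustration
only, not part of the patch) of the read-once / OR-on-write scheme which
the diff below implements:

	u64 x86_spec_ctrl_base;

	/* Capture SPEC_CTRL once at boot, unknown reserved bits included. */
	rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);

	/*
	 * Every later write ORs the requested mitigation bits into the
	 * snapshot, so reserved bits set by firmware/hardware are never
	 * cleared behind our back.
	 */
	wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | SPEC_CTRL_IBRS);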

[ tglx: Made x86_spec_ctrl_base __ro_after_init ]

Suggested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>

---
 arch/x86/include/asm/nospec-branch.h |   25 +++++++++++++++++++++----
 arch/x86/kernel/cpu/bugs.c           |   28 ++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -217,6 +217,17 @@ enum spectre_v2_mitigation {
 	SPECTRE_V2_IBRS,
 };
 
+/*
+ * The Intel specification for the SPEC_CTRL MSR requires that we
+ * preserve any already set reserved bits at boot time (e.g. for
+ * future additions that this kernel is not currently aware of).
+ * We then set any additional mitigation bits that we want
+ * ourselves and always use this as the base for SPEC_CTRL.
+ * We also use this when handling guest entry/exit as below.
+ */
+extern void x86_set_spec_ctrl(u64);
+extern u64 x86_get_default_spec_ctrl(void);
+
 extern char __indirect_thunk_start[];
 extern char __indirect_thunk_end[];
 
@@ -248,12 +259,14 @@ static inline void vmexit_fill_RSB(void)
 				 "movl $0, %%edx\n\t"		\
 				 "wrmsr",			\
 				 _feature)			\
-		     : : [msr] "i" (_msr), [val] "i" (_val)	\
+		     : : [msr] "i" (_msr), [val] "m" (_val)	\
 		     : "eax", "ecx", "edx", "memory")
 
 static inline void indirect_branch_prediction_barrier(void)
 {
-	alternative_msr_write(MSR_IA32_PRED_CMD, PRED_CMD_IBPB,
+	u64 val = PRED_CMD_IBPB;
+
+	alternative_msr_write(MSR_IA32_PRED_CMD, val,
 			      X86_FEATURE_USE_IBPB);
 }
 
@@ -265,14 +278,18 @@ static inline void indirect_branch_predi
  */
 #define firmware_restrict_branch_speculation_start()			\
 do {									\
+	u64 val = x86_get_default_spec_ctrl() | SPEC_CTRL_IBRS;		\
+									\
 	preempt_disable();						\
-	alternative_msr_write(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS,	\
+	alternative_msr_write(MSR_IA32_SPEC_CTRL, val,			\
 			      X86_FEATURE_USE_IBRS_FW);			\
 } while (0)
 
 #define firmware_restrict_branch_speculation_end()			\
 do {									\
-	alternative_msr_write(MSR_IA32_SPEC_CTRL, 0,			\
+	u64 val = x86_get_default_spec_ctrl();				\
+									\
+	alternative_msr_write(MSR_IA32_SPEC_CTRL, val,			\
 			      X86_FEATURE_USE_IBRS_FW);			\
 	preempt_enable();						\
 } while (0)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -28,6 +28,12 @@
 
 static void __init spectre_v2_select_mitigation(void);
 
+/*
+ * Our boot-time value of SPEC_CTRL MSR. We read it once so that any
+ * writes to SPEC_CTRL contain whatever reserved bits have been set.
+ */
+static u64 __ro_after_init x86_spec_ctrl_base;
+
 void __init check_bugs(void)
 {
 	identify_boot_cpu();
@@ -37,6 +43,13 @@ void __init check_bugs(void)
 		print_cpu_info(&boot_cpu_data);
 	}
 
+	/*
+	 * Read the SPEC_CTRL MSR to account for reserved bits which may
+	 * have unknown values.
+	 */
+	if (boot_cpu_has(X86_FEATURE_IBRS))
+		rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+
 	/* Select the proper spectre mitigation before patching alternatives */
 	spectre_v2_select_mitigation();
 
@@ -95,6 +108,21 @@ static const char *spectre_v2_strings[]
 
 static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
 
+void x86_set_spec_ctrl(u64 val)
+{
+	if (val & ~(SPEC_CTRL_IBRS))
+		WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val);
+	else
+		wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val);
+}
+EXPORT_SYMBOL_GPL(x86_set_spec_ctrl);
+
+u64 x86_get_default_spec_ctrl(void)
+{
+	return x86_spec_ctrl_base;
+}
+EXPORT_SYMBOL_GPL(x86_get_default_spec_ctrl);
+
 #ifdef RETPOLINE
 static bool spectre_v2_bad_module;
 


* [patch V7 04/15] SBB 4
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (2 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 03/15] SBB 3 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 05/15] SBB 5 Thomas Gleixner
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

A guest may modify the SPEC_CTRL MSR from the value used by the
kernel. Since the kernel doesn't use IBRS, this means a value of zero is
what is needed in the host.

But the 336996-Speculative-Execution-Side-Channel-Mitigations.pdf refers to
the other bits as reserved so the kernel should respect the boot time
SPEC_CTRL value and use that.

This allows dealing with future extensions to the SPEC_CTRL interface, if
any show up at all.

Note: This uses wrmsrl() instead of native_wrmsrl(). It does not make any
difference as paravirt will overwrite the callq *0xfff.. with the wrmsrl
assembler code.
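
Schematically, the call pattern this establishes around VM entry/exit looks
like the sketch below (run_guest() is a hypothetical placeholder for the
actual VMENTER/VMEXIT path in svm_vcpu_run()/vmx_vcpu_run()):

	x86_set_guest_spec_ctrl(vcpu_spec_ctrl);    /* install guest view   */
	run_guest();                                /* VMENTER ... VMEXIT   */
	x86_restore_host_spec_ctrl(vcpu_spec_ctrl); /* back to host default */

Both helpers only touch the MSR when the guest value actually differs from
x86_spec_ctrl_base, so the common case avoids the expensive MSR write.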

A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199511

[ tglx: Added a paranoia check for IBRS into the functions and simplified
  	them ]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>

---
v2: New patch
v3: Use the two accessory functions instead of poking at the global variable.
v4: Use x86_get_spec_ctrl instead of global variable.
v5: Use x86_get_default_spec_ctrl instead of x86_get_spec_ctrl
---
 arch/x86/include/asm/nospec-branch.h |   10 ++++++++++
 arch/x86/kernel/cpu/bugs.c           |   18 ++++++++++++++++++
 arch/x86/kvm/svm.c                   |    6 ++----
 arch/x86/kvm/vmx.c                   |    6 ++----
 4 files changed, 32 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -228,6 +228,16 @@ enum spectre_v2_mitigation {
 extern void x86_set_spec_ctrl(u64);
 extern u64 x86_get_default_spec_ctrl(void);
 
+/*
+ * On VMENTER we must preserve whatever view of the SPEC_CTRL MSR
+ * the guest has, while on VMEXIT we restore the host view. This
+ * would be easier if SPEC_CTRL were architecturally maskable or
+ * shadowable for guests but this is not (currently) the case.
+ * Takes the guest view of SPEC_CTRL MSR as a parameter.
+ */
+extern void x86_set_guest_spec_ctrl(u64);
+extern void x86_restore_host_spec_ctrl(u64);
+
 extern char __indirect_thunk_start[];
 extern char __indirect_thunk_end[];
 
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -123,6 +123,24 @@ u64 x86_get_default_spec_ctrl(void)
 }
 EXPORT_SYMBOL_GPL(x86_get_default_spec_ctrl);
 
+void x86_set_guest_spec_ctrl(u64 guest_spec_ctrl)
+{
+	if (!boot_cpu_has(X86_FEATURE_IBRS))
+		return;
+	if (x86_spec_ctrl_base != guest_spec_ctrl)
+		wrmsrl(MSR_IA32_SPEC_CTRL, guest_spec_ctrl);
+}
+EXPORT_SYMBOL_GPL(x86_set_guest_spec_ctrl);
+
+void x86_restore_host_spec_ctrl(u64 guest_spec_ctrl)
+{
+	if (!boot_cpu_has(X86_FEATURE_IBRS))
+		return;
+	if (x86_spec_ctrl_base != guest_spec_ctrl)
+		wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+}
+EXPORT_SYMBOL_GPL(x86_restore_host_spec_ctrl);
+
 #ifdef RETPOLINE
 static bool spectre_v2_bad_module;
 
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5557,8 +5557,7 @@ static void svm_vcpu_run(struct kvm_vcpu
 	 * is no need to worry about the conditional branch over the wrmsr
 	 * being speculatively taken.
 	 */
-	if (svm->spec_ctrl)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+	x86_set_guest_spec_ctrl(svm->spec_ctrl);
 
 	asm volatile (
 		"push %%" _ASM_BP "; \n\t"
@@ -5670,8 +5669,7 @@ static void svm_vcpu_run(struct kvm_vcpu
 	if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
 		svm->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
 
-	if (svm->spec_ctrl)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+	x86_restore_host_spec_ctrl(svm->spec_ctrl);
 
 	/* Eliminate branch target predictions from guest mode */
 	vmexit_fill_RSB();
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9726,8 +9726,7 @@ static void __noclone vmx_vcpu_run(struc
 	 * is no need to worry about the conditional branch over the wrmsr
 	 * being speculatively taken.
 	 */
-	if (vmx->spec_ctrl)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+	x86_set_guest_spec_ctrl(vmx->spec_ctrl);
 
 	vmx->__launched = vmx->loaded_vmcs->launched;
 
@@ -9875,8 +9874,7 @@ static void __noclone vmx_vcpu_run(struc
 	if (unlikely(!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL)))
 		vmx->spec_ctrl = native_read_msr(MSR_IA32_SPEC_CTRL);
 
-	if (vmx->spec_ctrl)
-		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+	x86_restore_host_spec_ctrl(vmx->spec_ctrl);
 
 	/* Eliminate branch target predictions from guest mode */
 	vmexit_fill_RSB();


* [patch V7 05/15] SBB 5
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (3 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 04/15] SBB 4 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 06/15] SBB 6 Thomas Gleixner
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

It does not do much except show the word 'Vulnerable' for recent x86
cores. Intel cores prior to Nehalem are known not to be vulnerable, and
so are some Atoms and some Xeon Phi parts.

It assumes that older Cyrix, Centaur, etc. cores are immune.
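
Once applied, the new state can be read from userspace; a minimal sketch
(the file name matches the DEVICE_ATTR added in drivers/base/cpu.c below):

	#include <stdio.h>

	int main(void)
	{
		FILE *f = fopen("/sys/devices/system/cpu/vulnerabilities/spec_store_bypass", "r");
		char buf[128];

		if (!f) {
			perror("fopen");
			return 1;
		}
		if (fgets(buf, sizeof(buf), f))
			fputs(buf, stdout);	/* e.g. "Not affected" or "Vulnerable" */
		fclose(f);
		return 0;
	}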

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>

---
v1.3: Remove AMD
    s/md/mdd/
v1.4: s/mdd/sbb/
v3: s/SSB/SPEC_STORE_BYPASS
  Rework the logic in cpu_set_bug_bits to be inverse.
v4: Expanded the not affected array
  - s/X86_BUG_CPU_SPEC_STORE_BYPASS/X86_BUG_SPEC_STORE_BYPASS/
---
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/kernel/cpu/bugs.c         |    5 +++++
 arch/x86/kernel/cpu/common.c       |   20 ++++++++++++++++++++
 drivers/base/cpu.c                 |    8 ++++++++
 include/linux/cpu.h                |    2 ++
 5 files changed, 36 insertions(+)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -362,5 +362,6 @@
 #define X86_BUG_CPU_MELTDOWN		X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */
 #define X86_BUG_SPECTRE_V1		X86_BUG(15) /* CPU is affected by Spectre variant 1 attack with conditional branches */
 #define X86_BUG_SPECTRE_V2		X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
+#define X86_BUG_SPEC_STORE_BYPASS	X86_BUG(17) /* CPU is affected by speculative store bypass attack */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -400,4 +400,9 @@ ssize_t cpu_show_spectre_v2(struct devic
 {
 	return cpu_show_common(dev, attr, buf, X86_BUG_SPECTRE_V2);
 }
+
+ssize_t cpu_show_spec_store_bypass(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return cpu_show_common(dev, attr, buf, X86_BUG_SPEC_STORE_BYPASS);
+}
 #endif
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -918,10 +918,30 @@ static const __initconst struct x86_cpu_
 	{}
 };
 
+static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
+	{ X86_VENDOR_INTEL,     6, INTEL_FAM6_ATOM_PINEVIEW },
+	{ X86_VENDOR_INTEL,     6, INTEL_FAM6_ATOM_LINCROFT },
+	{ X86_VENDOR_INTEL,     6, INTEL_FAM6_ATOM_PENWELL },
+	{ X86_VENDOR_INTEL,     6, INTEL_FAM6_ATOM_CLOVERVIEW },
+	{ X86_VENDOR_INTEL,     6, INTEL_FAM6_ATOM_CEDARVIEW },
+	{ X86_VENDOR_INTEL,     6, INTEL_FAM6_ATOM_SILVERMONT1 },
+	{ X86_VENDOR_INTEL,     6, INTEL_FAM6_ATOM_AIRMONT },
+	{ X86_VENDOR_INTEL,     6, INTEL_FAM6_XEON_PHI_KNL },
+	{ X86_VENDOR_INTEL,     6, INTEL_FAM6_XEON_PHI_KNM },
+	{ X86_VENDOR_CENTAUR,	5 },
+	{ X86_VENDOR_INTEL,	5 },
+	{ X86_VENDOR_NSC,	5 },
+	{ X86_VENDOR_ANY,	4 },
+	{}
+};
+
 static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 {
 	u64 ia32_cap = 0;
 
+	if (!x86_match_cpu(cpu_no_spec_store_bypass))
+		setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
+
 	if (x86_match_cpu(cpu_no_speculation))
 		return;
 
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -532,14 +532,22 @@ ssize_t __weak cpu_show_spectre_v2(struc
 	return sprintf(buf, "Not affected\n");
 }
 
+ssize_t __weak cpu_show_spec_store_bypass(struct device *dev,
+					  struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "Not affected\n");
+}
+
 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
 static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
+static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
 
 static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_meltdown.attr,
 	&dev_attr_spectre_v1.attr,
 	&dev_attr_spectre_v2.attr,
+	&dev_attr_spec_store_bypass.attr,
 	NULL
 };
 
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -53,6 +53,8 @@ extern ssize_t cpu_show_spectre_v1(struc
 				   struct device_attribute *attr, char *buf);
 extern ssize_t cpu_show_spectre_v2(struct device *dev,
 				   struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_spec_store_bypass(struct device *dev,
+					  struct device_attribute *attr, char *buf);
 
 extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,


* [patch V7 06/15] SBB 6
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (4 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 05/15] SBB 5 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 07/15] SBB 7 Thomas Gleixner
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

Add the CPU feature bit CPUID.7.0.EDX[31] which indicates whether the CPU
supports Reduced Data Speculation.
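
The enumeration can be sanity-checked from userspace as well; a sketch,
assuming a compiler whose <cpuid.h> provides __get_cpuid_count():

	#include <stdio.h>
	#include <cpuid.h>

	int main(void)
	{
		unsigned int eax, ebx, ecx, edx;

		if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
			return 1;
		printf("RDS (CPUID.7.0.EDX[31]): %s\n",
		       (edx & (1U << 31)) ? "supported" : "not supported");
		return 0;
	}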

[ tglx: Split it out from a later patch ]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/cpufeatures.h |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -333,6 +333,7 @@
 #define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */
 #define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */
 #define X86_FEATURE_ARCH_CAPABILITIES	(18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */
+#define X86_FEATURE_RDS			(18*32+31) /* Reduced Data Speculation */
 
 /*
  * BUG word(s)


* [patch V7 07/15] SBB 7
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (5 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 06/15] SBB 6 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 08/15] SBB 8 Thomas Gleixner
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

Contemporary high performance processors use a common industry-wide
optimization known as "Speculative Store Bypass" in which loads from
addresses to which a recent store has occurred may (speculatively) see an
older value. Intel refers to this feature as "Memory Disambiguation" which
is part of their "Smart Memory Access" capability in Nehalem and later
generation processors.

Some processors have an implementation bug that enables a cache
side-channel attack against such speculatively read values. An attacker can
create exploit code that allows them to read memory outside of a sandbox
environment (for example, malicious JavaScript in a web page), or to
perform more complex attacks against code running within the same privilege
level, e.g. via the stack.

Provide two command line control knobs:

 nospec_store_bypass_disable
 spec_store_bypass_disable=[off,auto,on]

By default affected x86 processors will power on with Speculative
Store Bypass enabled. Hence the provided kernel parameters are written
from the point of view of whether to enable a mitigation or not.
The parameters are as follows:

 - auto - kernel detects whether your CPU model contains a
	  vulnerable implementation of Speculative Store
	  Bypass and picks the most appropriate mitigation

	  Some CPUs may have a Speculative Store Bypass implementation
	  which is not vulnerable to the described attacks.  For those, the
	  'auto' (default) option will pick the right choice.

 - on   - disable Speculative Store Bypass
 - off  - enable Speculative Store Bypass
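
For example, an affected machine can boot with the mitigation forced on via:

	spec_store_bypass_disable=on

and 'nospec_store_bypass_disable' on the command line is a shorthand
equivalent to 'spec_store_bypass_disable=off'.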

[ tglx: Reordered the checks so that the whole evaluation is not done
  	when the CPU does not support RDS ]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>

---
v1.3:
 - Wrong array was used when figuring mdd_v4= arguments
 - If broken microcode found, also disable MD feature.
 - s/mdd_v4/mdd/
 - s/kernel/userpsace/
 - Fix up the commit description
v1.4:
 - Incorporate various feedback and suggestions from Jon Masters
 - Change mdd parameters to allow only big hammer on/off for now
 - Introduce "ssbd" industry standard term for cross-architecture
 - Use mdd_ and MDD_ prefixes to clearly spell out that "enable"
   means we are actually disabling a CPU processor feature (MD)
v2:
 - s/mdd/ssb/ in the x86 code to match with the 'ssb' part
v3:
 - rip out mdd.
 - remove 'Haswell' and such and replace with 'contemporary'
 - s/ssb/spec_store_bypass/ and various other cleanups (Jon Masters)
v4: Rip out 'Memory Disambiguation Disable'
 - Fix up documentation.
 - Rip out the X86_FEATURE_STBUF_BYPASS
 - s/X86_FEATURE_STBUF_MITIGATE/X86_FEATURE_STBUF_BYPASS_MITIGATE/
 - On the functions change spec_store_bypass_ to ssb_
v5: Changed title.
  - s/X86_FEATURE_STBUF_BYPASS_MITIGATE/X86_FEATURE_SPEC_STORE_BYPASS_DISABLE/
  - Add logic to get out of ssb_parse_cmdline is bug is not present
---
 Documentation/admin-guide/kernel-parameters.txt |   33 +++++++
 arch/x86/include/asm/cpufeatures.h              |    1 
 arch/x86/include/asm/nospec-branch.h            |    6 +
 arch/x86/kernel/cpu/bugs.c                      |  103 ++++++++++++++++++++++++
 4 files changed, 143 insertions(+)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2680,6 +2680,9 @@
 			allow data leaks with this option, which is equivalent
 			to spectre_v2=off.
 
+	nospec_store_bypass_disable
+			[HW] Disable all mitigations for the Speculative Store Bypass vulnerability
+
 	noxsave		[BUGS=X86] Disables x86 extended register state save
 			and restore using xsave. The kernel will fallback to
 			enabling legacy floating-point and sse state.
@@ -4025,6 +4028,36 @@
 			Not specifying this option is equivalent to
 			spectre_v2=auto.
 
+	spec_store_bypass_disable=
+			[HW] Control Speculative Store Bypass (SSB) Disable mitigation
+			(Speculative Store Bypass vulnerability)
+
+			Certain CPUs are vulnerable to an exploit against
+			a common industry-wide performance optimization known
+			as "Speculative Store Bypass" in which recent stores
+			to the same memory location may not be observed by
+			later loads during speculative execution. The idea
+			is that such stores are unlikely and that they can
+			be detected prior to instruction retirement at the
+			end of a particular speculative execution window.
+
+			In vulnerable processors, the speculatively forwarded
+			store can be used in a cache side channel attack, for
+			example to read memory to which the attacker does not
+			directly have access (e.g. inside sandboxed code).
+
+			This parameter controls whether the Speculative Store
+			Bypass optimization is used.
+
+			on     - Unconditionally disable Speculative Store Bypass
+			off    - Unconditionally enable Speculative Store Bypass
+			auto   - Kernel detects whether the CPU model contains a
+			         vulnerable implementation of Speculative Store
+			         Bypass and picks the most appropriate mitigation
+
+			Not specifying this option is equivalent to
+			spec_store_bypass_disable=auto.
+
 	spia_io_base=	[HW,MTD]
 	spia_fio_base=
 	spia_pedr=
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -214,6 +214,7 @@
 
 #define X86_FEATURE_USE_IBPB		( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */
 #define X86_FEATURE_USE_IBRS_FW		( 7*32+22) /* "" Use IBRS during runtime firmware calls */
+#define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE	( 7*32+23) /* "" Disable Speculative Store Bypass. */
 
 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW		( 8*32+ 0) /* Intel TPR Shadow */
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -238,6 +238,12 @@ extern u64 x86_get_default_spec_ctrl(voi
 extern void x86_set_guest_spec_ctrl(u64);
 extern void x86_restore_host_spec_ctrl(u64);
 
+/* The Speculative Store Bypass disable variants */
+enum ssb_mitigation {
+	SPEC_STORE_BYPASS_NONE,
+	SPEC_STORE_BYPASS_DISABLE,
+};
+
 extern char __indirect_thunk_start[];
 extern char __indirect_thunk_end[];
 
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -27,6 +27,7 @@
 #include <asm/intel-family.h>
 
 static void __init spectre_v2_select_mitigation(void);
+static void __init ssb_select_mitigation(void);
 
 /*
  * Our boot-time value of SPEC_CTRL MSR. We read it once so that any
@@ -53,6 +54,12 @@ void __init check_bugs(void)
 	/* Select the proper spectre mitigation before patching alternatives */
 	spectre_v2_select_mitigation();
 
+	/*
+	 * Select proper mitigation for any exposure to the Speculative Store
+	 * Bypass vulnerability.
+	 */
+	ssb_select_mitigation();
+
 #ifdef CONFIG_X86_32
 	/*
 	 * Check whether we are able to run this kernel safely on SMP.
@@ -358,6 +365,99 @@ static void __init spectre_v2_select_mit
 }
 
 #undef pr_fmt
+#define pr_fmt(fmt)	"Speculative Store Bypass: " fmt
+
+static enum ssb_mitigation ssb_mode = SPEC_STORE_BYPASS_NONE;
+
+/* The kernel command line selection */
+enum ssb_mitigation_cmd {
+	SPEC_STORE_BYPASS_CMD_NONE,
+	SPEC_STORE_BYPASS_CMD_AUTO,
+	SPEC_STORE_BYPASS_CMD_ON,
+};
+
+static const char *ssb_strings[] = {
+	[SPEC_STORE_BYPASS_NONE]	= "Vulnerable",
+	[SPEC_STORE_BYPASS_DISABLE]	= "Mitigation: Speculative Store Bypass disabled"
+};
+
+static const struct {
+	const char *option;
+	enum ssb_mitigation_cmd cmd;
+} ssb_mitigation_options[] = {
+	{ "auto",	SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */
+	{ "on",		SPEC_STORE_BYPASS_CMD_ON },   /* Disable Speculative Store Bypass */
+	{ "off",	SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */
+};
+
+static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void)
+{
+	enum ssb_mitigation_cmd cmd = SPEC_STORE_BYPASS_CMD_AUTO;
+	char arg[20];
+	int ret, i;
+
+	if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable")) {
+		return SPEC_STORE_BYPASS_CMD_NONE;
+	} else {
+		ret = cmdline_find_option(boot_command_line, "spec_store_bypass_disable",
+					  arg, sizeof(arg));
+		if (ret < 0)
+			return SPEC_STORE_BYPASS_CMD_AUTO;
+
+		for (i = 0; i < ARRAY_SIZE(ssb_mitigation_options); i++) {
+			if (!match_option(arg, ret, ssb_mitigation_options[i].option))
+				continue;
+
+			cmd = ssb_mitigation_options[i].cmd;
+			break;
+		}
+
+		if (i >= ARRAY_SIZE(ssb_mitigation_options)) {
+			pr_err("unknown option (%s). Switching to AUTO select\n", arg);
+			return SPEC_STORE_BYPASS_CMD_AUTO;
+		}
+	}
+
+	return cmd;
+}
+
+static enum ssb_mitigation_cmd __init __ssb_select_mitigation(void)
+{
+	enum ssb_mitigation mode = SPEC_STORE_BYPASS_NONE;
+	enum ssb_mitigation_cmd cmd;
+
+	if (!boot_cpu_has(X86_FEATURE_RDS))
+		return mode;
+
+	cmd = ssb_parse_cmdline();
+	if (!boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS) &&
+	    (cmd == SPEC_STORE_BYPASS_CMD_NONE ||
+	     cmd == SPEC_STORE_BYPASS_CMD_AUTO))
+		return mode;
+
+	switch (cmd) {
+	case SPEC_STORE_BYPASS_CMD_AUTO:
+	case SPEC_STORE_BYPASS_CMD_ON:
+		mode = SPEC_STORE_BYPASS_DISABLE;
+		break;
+	case SPEC_STORE_BYPASS_CMD_NONE:
+		break;
+	}
+
+	if (mode != SPEC_STORE_BYPASS_NONE)
+		setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE);
+	return mode;
+}
+
+static void ssb_select_mitigation(void)
+{
+	ssb_mode = __ssb_select_mitigation();
+
+	if (boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS))
+		pr_info("%s\n", ssb_strings[ssb_mode]);
+}
+
+#undef pr_fmt
 
 #ifdef CONFIG_SYSFS
 
@@ -383,6 +483,9 @@ ssize_t cpu_show_common(struct device *d
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
 			       spectre_v2_module_string());
 
+	case X86_BUG_SPEC_STORE_BYPASS:
+		return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);
+
 	default:
 		break;
 	}


* [patch V7 08/15] SBB 8
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (6 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 07/15] SBB 7 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 09/15] SBB 9 Thomas Gleixner
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

Intel CPUs expose methods to:

 - Detect whether the Reduced Data Speculation capability is available via
   CPUID.7.0.EDX[31],

 - Enable Reduced Data Speculation by setting bit 2 of the SPEC_CTRL
   MSR (0x48),

 - Indicate via bit 4 of MSR_IA32_ARCH_CAPABILITIES that there is no need
   to enable Reduced Data Speculation.

With that in mind, if spec_store_bypass_disable=[auto,on] is selected, set
the SPEC_CTRL MSR at boot time to enable Reduced Data Speculation if the
platform requires it.
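
Schematically (a sketch only; the real code is in the hunks below), the
Intel enable path reduces to:

	if (mode != SPEC_STORE_BYPASS_NONE &&
	    boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
		x86_spec_ctrl_base |= SPEC_CTRL_RDS;	/* remembered for APs */
		x86_set_spec_ctrl(SPEC_CTRL_RDS);	/* written on the BSP */
	}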

Note that this does not fix the KVM case where the SPEC_CTRL MSR is exposed
to guests which can muck with it; see the patch titled: KVM/SVM/VMX/x86/spectre_v2:
Support the combination of guest and host IBRS.

And for the firmware case (where IBRS needs to be set), see the patch titled:
 x86/spectre_v2: Read SPEC_CTRL MSR during boot and re-use reserved bits

[ tglx: Disentangled it from the intel implementation and kept the call order ]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>

---

v1.2: Expand on the commit description
  s/md_v4/mdd/
  s/spec_ctrl_msr_on/spec_ctrl_priv/
  s/spec_ctrl_msr_off/spec_ctrp_unpriv/

v1.3:
 - Add comment about privilege level changes.

v1.4: Simplify and incorporate various suggestions from Jon Masters
 - Export a single x86_spec_ctrl_base value with initial bits

v2: Rip out the c_fix_cpu.
 Depend on synthetic CPU flag
v3: Move the generic_identify to be done _after_ we figure out whether
  we can do the mitigation.
v4: s/MDD/RDS/
   s/Memory Disambiguation Disable/Reduced Data Speculation/
   Tweak the various 'disable', enabled now that it is called RDS.
   Set the x86_spec_ctrl with SPEC_CTRL_RDS if RDS is detected
   Fixup x86_set_spec_ctrl to deal with two Bitfields.
v5: s/X86_FEATURE_DISABLE_SSB/X86_FEATURE_SPEC_STORE_BYPASS_DISABLE/
   Also check MSR_IA32_ARCH_CAPABILITIES for Bit(4)
   Add documentation on what those three flags mean
   Add docs on why we set x86_spec_ctrl only on Intel
   Add extra check in ssb_parse_cmdline for RDS be available
   In init_intel drop the check for RDS as the X86_FEATURE_SPEC_STORE_BYPASS_DISABLE
    is implicitly set only iff RDS has been set in ssb_parse_cmdline.
---
 arch/x86/include/asm/msr-index.h |    2 ++
 arch/x86/kernel/cpu/bugs.c       |   30 ++++++++++++++++++++++++++++--
 arch/x86/kernel/cpu/common.c     |   10 ++++++----
 arch/x86/kernel/cpu/cpu.h        |    2 ++
 arch/x86/kernel/cpu/intel.c      |    1 +
 5 files changed, 39 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -42,6 +42,7 @@
 #define MSR_IA32_SPEC_CTRL		0x00000048 /* Speculation Control */
 #define SPEC_CTRL_IBRS			(1 << 0)   /* Indirect Branch Restricted Speculation */
 #define SPEC_CTRL_STIBP			(1 << 1)   /* Single Thread Indirect Branch Predictors */
+#define SPEC_CTRL_RDS			(1 << 2)   /* Reduced Data Speculation */
 
 #define MSR_IA32_PRED_CMD		0x00000049 /* Prediction Command */
 #define PRED_CMD_IBPB			(1 << 0)   /* Indirect Branch Prediction Barrier */
@@ -68,6 +69,7 @@
 #define MSR_IA32_ARCH_CAPABILITIES	0x0000010a
 #define ARCH_CAP_RDCL_NO		(1 << 0)   /* Not susceptible to Meltdown */
 #define ARCH_CAP_IBRS_ALL		(1 << 1)   /* Enhanced IBRS support */
+#define ARCH_CAP_RDS_NO			(1 << 4)   /* Not susceptible to speculative store bypass */
 
 #define MSR_IA32_BBL_CR_CTL		0x00000119
 #define MSR_IA32_BBL_CR_CTL3		0x0000011e
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -117,7 +117,7 @@ static enum spectre_v2_mitigation spectr
 
 void x86_set_spec_ctrl(u64 val)
 {
-	if (val & ~(SPEC_CTRL_IBRS))
+	if (val & ~(SPEC_CTRL_IBRS | SPEC_CTRL_RDS))
 		WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val);
 	else
 		wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val);
@@ -444,8 +444,28 @@ static enum ssb_mitigation_cmd __init __
 		break;
 	}
 
-	if (mode != SPEC_STORE_BYPASS_NONE)
+	/*
+	 * We have three CPU feature flags that are in play here:
+	 *  - X86_BUG_SPEC_STORE_BYPASS - CPU is susceptible.
+	 *  - X86_FEATURE_RDS - CPU is able to turn off speculative store bypass
+	 *  - X86_FEATURE_SPEC_STORE_BYPASS_DISABLE - engage the mitigation
+	 */
+	if (mode != SPEC_STORE_BYPASS_NONE) {
 		setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE);
+		/*
+		 * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD uses
+		 * a completely different MSR and bit dependent on family.
+		 */
+		switch (boot_cpu_data.x86_vendor) {
+		case X86_VENDOR_INTEL:
+			x86_spec_ctrl_base |= SPEC_CTRL_RDS;
+			x86_set_spec_ctrl(SPEC_CTRL_RDS);
+			break;
+		case X86_VENDOR_AMD:
+			break;
+		}
+	}
+
 	return mode;
 }
 
@@ -459,6 +479,12 @@ static void ssb_select_mitigation()
 
 #undef pr_fmt
 
+void x86_setup_ap_spec_ctrl(void)
+{
+	if (boot_cpu_has(X86_FEATURE_IBRS))
+		x86_set_spec_ctrl(x86_spec_ctrl_base & (SPEC_CTRL_IBRS | SPEC_CTRL_RDS));
+}
+
 #ifdef CONFIG_SYSFS
 
 ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -944,7 +944,11 @@ static void __init cpu_set_bug_bits(stru
 {
 	u64 ia32_cap = 0;
 
-	if (!x86_match_cpu(cpu_no_spec_store_bypass))
+	if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
+		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+	if (!x86_match_cpu(cpu_no_spec_store_bypass) &&
+	   !(ia32_cap & ARCH_CAP_RDS_NO))
 		setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
 
 	if (x86_match_cpu(cpu_no_speculation))
@@ -956,9 +960,6 @@ static void __init cpu_set_bug_bits(stru
 	if (x86_match_cpu(cpu_no_meltdown))
 		return;
 
-	if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
-		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
-
 	/* Rogue Data Cache Load? No! */
 	if (ia32_cap & ARCH_CAP_RDCL_NO)
 		return;
@@ -1376,6 +1377,7 @@ void identify_secondary_cpu(struct cpuin
 #endif
 	mtrr_ap_init();
 	validate_apic_and_package_id(c);
+	x86_setup_ap_spec_ctrl();
 }
 
 static __init int setup_noclflush(char *arg)
--- a/arch/x86/kernel/cpu/cpu.h
+++ b/arch/x86/kernel/cpu/cpu.h
@@ -50,4 +50,6 @@ extern void cpu_detect_cache_sizes(struc
 
 unsigned int aperfmperf_get_khz(int cpu);
 
+extern void x86_setup_ap_spec_ctrl(void);
+
 #endif /* ARCH_X86_CPU_H */
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -189,6 +189,7 @@ static void early_init_intel(struct cpui
 		setup_clear_cpu_cap(X86_FEATURE_STIBP);
 		setup_clear_cpu_cap(X86_FEATURE_SPEC_CTRL);
 		setup_clear_cpu_cap(X86_FEATURE_INTEL_STIBP);
+		setup_clear_cpu_cap(X86_FEATURE_RDS);
 	}
 
 	/*


* [patch V7 09/15] SBB 9
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (7 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 08/15] SBB 8 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 10/15] SBB 10 Thomas Gleixner
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

Intel and AMD SPEC_CTRL (0x48) MSR semantics may differ in the future (or
they may in fact use different MSRs for the same functionality).

As such, a run-time mechanism is required to whitelist the appropriate MSR
values.
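
Because the mask logic is inverted, a short sketch of the intended
semantics may help: a bit which is *cleared* in x86_spec_ctrl_mask is one
that callers are allowed to set:

	u64 x86_spec_ctrl_mask = ~SPEC_CTRL_IBRS;  /* IBRS is always writable   */

	x86_spec_ctrl_mask &= ~SPEC_CTRL_RDS;      /* whitelist RDS, e.g. Intel */

	if (val & x86_spec_ctrl_mask)              /* non-whitelisted bit set?  */
		WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val);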

[ tglx: Made the variable __ro_after_init ]

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
v5: New patch
---
 arch/x86/kernel/cpu/bugs.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -35,6 +35,12 @@ static void __init ssb_select_mitigation
  */
 static u64 __ro_after_init x86_spec_ctrl_base;
 
+/*
+ * The vendor and possibly platform specific bits which can be modified in
+ * x86_spec_ctrl_base.
+ */
+static u64 __ro_after_init x86_spec_ctrl_mask = ~(SPEC_CTRL_IBRS);
+
 void __init check_bugs(void)
 {
 	identify_boot_cpu();
@@ -117,7 +123,7 @@ static enum spectre_v2_mitigation spectr
 
 void x86_set_spec_ctrl(u64 val)
 {
-	if (val & ~(SPEC_CTRL_IBRS | SPEC_CTRL_RDS))
+	if (val & x86_spec_ctrl_mask)
 		WARN_ONCE(1, "SPEC_CTRL MSR value 0x%16llx is unknown.\n", val);
 	else
 		wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base | val);
@@ -459,6 +465,7 @@ static enum ssb_mitigation_cmd __init __
 		switch (boot_cpu_data.x86_vendor) {
 		case X86_VENDOR_INTEL:
 			x86_spec_ctrl_base |= SPEC_CTRL_RDS;
+			x86_spec_ctrl_mask &= ~(SPEC_CTRL_RDS);
 			x86_set_spec_ctrl(SPEC_CTRL_RDS);
 			break;
 		case X86_VENDOR_AMD:
@@ -482,7 +489,7 @@ static void ssb_select_mitigation()
 void x86_setup_ap_spec_ctrl(void)
 {
 	if (boot_cpu_has(X86_FEATURE_IBRS))
-		x86_set_spec_ctrl(x86_spec_ctrl_base & (SPEC_CTRL_IBRS | SPEC_CTRL_RDS));
+		x86_set_spec_ctrl(x86_spec_ctrl_base & ~x86_spec_ctrl_mask);
 }
 
 #ifdef CONFIG_SYSFS


* [patch V7 10/15] SBB 10
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (8 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 09/15] SBB 9 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-30  0:16   ` [MODERATED] " Konrad Rzeszutek Wilk
  2018-04-29 19:30 ` [patch V7 11/15] SBB 11 Thomas Gleixner
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

AMD does not need the Speculative Store Bypass mitigation to be enabled.

But the controls for it are already available and the mitigation can be
engaged via MSR C001_1020. Each family uses a different bit in that MSR
for this.

[ tglx: Expose the bit mask via a variable and move the actual MSR fiddling
  	into the bugs code as that's the right thing to do and to prepare
  	for dynamic enable/disable ]

Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
v2: New patch
v3: Use the right CPU features
    Move the whole logic in early_init_amd and init_amd
v5: Set X86_FEATURE_RDS on Fam16->17h
    Change title to have 'bugs' in it.
---
 arch/x86/include/asm/cpufeatures.h   |    1 +
 arch/x86/include/asm/nospec-branch.h |    4 ++++
 arch/x86/kernel/cpu/amd.c            |   21 +++++++++++++++++++++
 arch/x86/kernel/cpu/bugs.c           |   27 ++++++++++++++++++++++++++-
 4 files changed, 52 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -215,6 +215,7 @@
 #define X86_FEATURE_USE_IBPB		( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */
 #define X86_FEATURE_USE_IBRS_FW		( 7*32+22) /* "" Use IBRS during runtime firmware calls */
 #define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE	( 7*32+23) /* "" Disable Speculative Store Bypass. */
+#define X86_FEATURE_AMD_RDS		(7*32+24)  /* "" AMD RDS implementation */
 
 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW		( 8*32+ 0) /* Intel TPR Shadow */
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -244,6 +244,10 @@ enum ssb_mitigation {
 	SPEC_STORE_BYPASS_DISABLE,
 };
 
+/* AMD specific Speculative Store Bypass MSR data */
+extern u64 x86_amd_ls_cfg_base;
+extern u64 x86_amd_ls_cfg_rds_mask;
+
 extern char __indirect_thunk_start[];
 extern char __indirect_thunk_end[];
 
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -10,6 +10,7 @@
 #include <asm/processor.h>
 #include <asm/apic.h>
 #include <asm/cpu.h>
+#include <asm/nospec-branch.h>
 #include <asm/smp.h>
 #include <asm/pci-direct.h>
 #include <asm/delay.h>
@@ -673,6 +674,26 @@ static void early_init_amd(struct cpuinf
 		set_cpu_bug(c, X86_BUG_AMD_E400);
 
 	early_detect_mem_encrypt(c);
+
+	if (c->x86 >= 0x15 && c->x86 <= 0x17) {
+		unsigned int bit;
+
+		switch (c->x86) {
+		case 0x15: bit = 54; break;
+		case 0x16: bit = 33; break;
+		case 0x17: bit = 10; break;
+		default: return;
+		}
+		/*
+		 * Try to cache the base value so further operations can
+		 * avoid RMW. If that faults, do not enable RDS.
+		 */
+		if (!rdmsrl_safe(MSR_AMD64_LS_CFG, &x86_amd_ls_cfg_base)) {
+			set_cpu_cap(c, X86_FEATURE_RDS);
+			set_cpu_cap(c, X86_FEATURE_AMD_RDS);
+			x86_amd_ls_cfg_rds_mask = (1ULL << bit);
+		}
+	}
 }
 
 static void init_amd_k8(struct cpuinfo_x86 *c)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -41,6 +41,13 @@ static u64 __ro_after_init x86_spec_ctrl
  */
 static u64 __ro_after_init x86_spec_ctrl_mask = ~(SPEC_CTRL_IBRS);
 
+/*
+ * AMD specific MSR info for Store Bypass control.  x86_amd_ls_cfg_rds_mask
+ * is initialized in identify_boot_cpu().
+ */
+u64 __ro_after_init x86_amd_ls_cfg_base;
+u64 __ro_after_init x86_amd_ls_cfg_rds_mask;
+
 void __init check_bugs(void)
 {
 	identify_boot_cpu();
@@ -52,7 +59,8 @@ void __init check_bugs(void)
 
 	/*
 	 * Read the SPEC_CTRL MSR to account for reserved bits which may
-	 * have unknown values.
+	 * have unknown values. AMD64_LS_CFG msr is cached in the early AMD
+	 * init code as it is not enumerated and depends on the family.
 	 */
 	if (boot_cpu_has(X86_FEATURE_IBRS))
 		rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
@@ -154,6 +162,14 @@ void x86_restore_host_spec_ctrl(u64 gues
 }
 EXPORT_SYMBOL_GPL(x86_restore_host_spec_ctrl);
 
+static void x86_amd_rds_enable(void)
+{
+	u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_mask;
+
+	if (boot_cpu_has(X86_FEATURE_AMD_RDS))
+		wrmsrl(MSR_AMD64_LS_CFG, msrval);
+}
+
 #ifdef RETPOLINE
 static bool spectre_v2_bad_module;
 
@@ -443,6 +459,11 @@ static enum ssb_mitigation_cmd __init __
 
 	switch (cmd) {
 	case SPEC_STORE_BYPASS_CMD_AUTO:
+		/*
+		 * AMD platforms by default don't need SSB mitigation.
+		 */
+		if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+			break;
 	case SPEC_STORE_BYPASS_CMD_ON:
 		mode = SPEC_STORE_BYPASS_DISABLE;
 		break;
@@ -469,6 +490,7 @@ static enum ssb_mitigation_cmd __init __
 			x86_set_spec_ctrl(SPEC_CTRL_RDS);
 			break;
 		case X86_VENDOR_AMD:
+			x86_amd_rds_enable();
 			break;
 		}
 	}
@@ -490,6 +512,9 @@ void x86_setup_ap_spec_ctrl(void)
 {
 	if (boot_cpu_has(X86_FEATURE_IBRS))
 		x86_set_spec_ctrl(x86_spec_ctrl_base & ~x86_spec_ctrl_mask);
+
+	if (ssb_mode == SPEC_STORE_BYPASS_DISABLE)
+		x86_amd_rds_enable();
 }
 
 #ifdef CONFIG_SYSFS


* [patch V7 11/15] SBB 11
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (9 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 10/15] SBB 10 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-29 19:30 ` [patch V7 12/15] SBB 12 Thomas Gleixner
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

Expose the CPUID.7.EDX[31] bit to the guest, and also guard against various
combinations of SPEC_CTRL MSR values.

The handling of the MSR (to take into account the host value of SPEC_CTRL
Bit(2)) is taken care of in patch:

  KVM/SVM/VMX/x86/spectre_v2: Support the combination of guest and host IBRS

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
v2: New patch.
v4: s/MDD/RDS/
---
 arch/x86/kvm/cpuid.c |    2 +-
 arch/x86/kvm/vmx.c   |    8 +++++---
 2 files changed, 6 insertions(+), 4 deletions(-)

--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -407,7 +407,7 @@ static inline int __do_cpuid_ent(struct
 
 	/* cpuid 7.0.edx*/
 	const u32 kvm_cpuid_7_0_edx_x86_features =
-		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
+		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) | F(RDS) |
 		F(ARCH_CAPABILITIES);
 
 	/* all calls to cpuid_count() should be made on the same cpu */
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3524,7 +3524,8 @@ static int vmx_get_msr(struct kvm_vcpu *
 	case MSR_IA32_SPEC_CTRL:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_IBRS) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDS))
 			return 1;
 
 		msr_info->data = to_vmx(vcpu)->spec_ctrl;
@@ -3643,11 +3644,12 @@ static int vmx_set_msr(struct kvm_vcpu *
 	case MSR_IA32_SPEC_CTRL:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_IBRS) &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
+		    !guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_RDS))
 			return 1;
 
 		/* The STIBP bit doesn't fault even if it's not advertised */
-		if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP))
+		if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_RDS))
 			return 1;
 
 		vmx->spec_ctrl = data;


* [patch V7 12/15] SBB 12
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (10 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 11/15] SBB 11 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-30  1:33   ` [MODERATED] " Konrad Rzeszutek Wilk
  2018-04-29 19:30 ` [patch V7 13/15] SBB 13 Thomas Gleixner
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

Having everything in nospec-branch.h creates a hell of dependencies when
adding the prctl-based switching mechanism. Move everything which is not
required in nospec-branch.h to specctrl.h and fix up the includes in the
relevant files.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/nospec-branch.h |   14 --------------
 arch/x86/include/asm/specctrl.h      |   21 +++++++++++++++++++++
 arch/x86/kernel/cpu/amd.c            |    2 +-
 arch/x86/kernel/cpu/bugs.c           |    2 +-
 arch/x86/kvm/svm.c                   |    2 +-
 arch/x86/kvm/vmx.c                   |    2 +-
 6 files changed, 25 insertions(+), 18 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -228,26 +228,12 @@ enum spectre_v2_mitigation {
 extern void x86_set_spec_ctrl(u64);
 extern u64 x86_get_default_spec_ctrl(void);
 
-/*
- * On VMENTER we must preserve whatever view of the SPEC_CTRL MSR
- * the guest has, while on VMEXIT we restore the host view. This
- * would be easier if SPEC_CTRL were architecturally maskable or
- * shadowable for guests but this is not (currently) the case.
- * Takes the guest view of SPEC_CTRL MSR as a parameter.
- */
-extern void x86_set_guest_spec_ctrl(u64);
-extern void x86_restore_host_spec_ctrl(u64);
-
 /* The Speculative Store Bypass disable variants */
 enum ssb_mitigation {
 	SPEC_STORE_BYPASS_NONE,
 	SPEC_STORE_BYPASS_DISABLE,
 };
 
-/* AMD specific Speculative Store Bypass MSR data */
-extern u64 x86_amd_ls_cfg_base;
-extern u64 x86_amd_ls_cfg_rds_mask;
-
 extern char __indirect_thunk_start[];
 extern char __indirect_thunk_end[];
 
--- /dev/null
+++ b/arch/x86/include/asm/specctrl.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_SPECCTRL_H_
+#define _ASM_X86_SPECCTRL_H_
+
+#include <asm/nospec-branch.h>
+
+/*
+ * On VMENTER we must preserve whatever view of the SPEC_CTRL MSR
+ * the guest has, while on VMEXIT we restore the host view. This
+ * would be easier if SPEC_CTRL were architecturally maskable or
+ * shadowable for guests but this is not (currently) the case.
+ * Takes the guest view of SPEC_CTRL MSR as a parameter.
+ */
+extern void x86_set_guest_spec_ctrl(u64);
+extern void x86_restore_host_spec_ctrl(u64);
+
+/* AMD specific Speculative Store Bypass MSR data */
+extern u64 x86_amd_ls_cfg_base;
+extern u64 x86_amd_ls_cfg_rds_mask;
+
+#endif
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -10,7 +10,7 @@
 #include <asm/processor.h>
 #include <asm/apic.h>
 #include <asm/cpu.h>
-#include <asm/nospec-branch.h>
+#include <asm/specctrl.h>
 #include <asm/smp.h>
 #include <asm/pci-direct.h>
 #include <asm/delay.h>
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -13,7 +13,7 @@
 #include <linux/cpu.h>
 #include <linux/module.h>
 
-#include <asm/nospec-branch.h>
+#include <asm/specctrl.h>
 #include <asm/cmdline.h>
 #include <asm/bugs.h>
 #include <asm/processor.h>
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -49,7 +49,7 @@
 #include <asm/debugreg.h>
 #include <asm/kvm_para.h>
 #include <asm/irq_remapping.h>
-#include <asm/nospec-branch.h>
+#include <asm/specctrl.h>
 
 #include <asm/virtext.h>
 #include "trace.h"
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -51,7 +51,7 @@
 #include <asm/apic.h>
 #include <asm/irq_remapping.h>
 #include <asm/mmu_context.h>
-#include <asm/nospec-branch.h>
+#include <asm/specctrl.h>
 #include <asm/mshyperv.h>
 
 #include "trace.h"


* [patch V7 13/15] SBB 13
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (11 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 12/15] SBB 12 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-30  1:48   ` [MODERATED] " Konrad Rzeszutek Wilk
                     ` (3 more replies)
  2018-04-29 19:30 ` [patch V7 14/15] SBB 14 Thomas Gleixner
                   ` (3 subsequent siblings)
  16 siblings, 4 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

Add two new prctls to control aspects of speculation related vulnerabilities
and their mitigations.

PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
which is selected with arg2 of prctl(2). The return value uses bits 0-2 with
the following meaning:

PR_SPEC_PRCTL:	PRCTL per task control of the mitigation is enabled
PR_SPEC_ENABLE:	The speculation feature is enabled, mitigation is disabled
PR_SPEC_DISABLE: The speculation feature is disabled, mitigation is enabled

If all bits are 0 the CPU is not affected by the speculation misfeature. If
bit0 is set, then the per task control of the mitigation is enabled. If not
set, prctl(PR_SET_SPECULATION_CTRL) for the speculation misfeature will
fail.

PR_GET_SPECULATION_CTRL allows to control the speculation misfeature, which
is selected by arg2 of prctl(2) per task. arg3 is used to hand in either
PR_SPEC_ENABLE or PR_SPEC_DISABLE.

The common return values are:

-EINVAL: prctl is not implemented by the architecture
-ENODEV: arg2 selects an unsupported speculation misfeature

PR_SET_SPECULATION_CTRL has these additional return values:

-ERANGE: arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
-ENXIO:  prctl control of the selected speculation misfeature is disabled

The first supported controllable speculation misfeature is
PR_SPEC_STORE_BYPASS. Add the define so this can be shared between
architectures.
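
For illustration, a minimal sketch of the intended userspace call pattern
(hypothetical consumer code, not part of this series; uses the uapi values
added below):

	#include <sys/prctl.h>

	/* uapi defines from this patch */
	#define PR_SET_SPECULATION_CTRL	52
	#define PR_GET_SPECULATION_CTRL	53
	#define PR_SPEC_STORE_BYPASS	0
	#define PR_SPEC_PRCTL		(1UL << 0)
	#define PR_SPEC_DISABLE		(1UL << 2)

	int main(void)
	{
		int r = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS);

		if (r < 0)
			return 1;	/* errno: EINVAL or ENODEV, see above */
		if (r & PR_SPEC_PRCTL)
			/* Per task control is available, engage the mitigation */
			return prctl(PR_SET_SPECULATION_CTRL,
				     PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE);
		return 0;
	}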

TODO: Tidy up spec_ctrl.rst and write a man prctl(2) patch.

Based on an initial patch from Tim Chen and mostly rewritten.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 Documentation/userspace-api/index.rst     |    1 
 Documentation/userspace-api/spec_ctrl.rst |   65 ++++++++++++++++++++++++++++++
 include/linux/nospec.h                    |    5 ++
 include/uapi/linux/prctl.h                |   11 +++++
 kernel/sys.c                              |   18 ++++++++
 5 files changed, 100 insertions(+)

--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -19,6 +19,7 @@ place where this information is gathered
    no_new_privs
    seccomp_filter
    unshare
+   spec_ctrl
 
 .. only::  subproject and html
 
--- /dev/null
+++ b/Documentation/userspace-api/spec_ctrl.rst
@@ -0,0 +1,65 @@
+===================
+Speculation Control
+===================
+
+Quite some CPUs have speculation related misfeatures which are in fact
+vulnerabilities causing data leaks in various forms even across privilege
+domains.
+
+The kernel provides mitigation for such vulnerabilities in various
+forms. Some of these mitigations are compile time configurable and some on
+the kernel command line.
+
+There is also a class of mitigations which is very expensive, but they can
+be restricted to a certain set of processes or tasks in controlled
+environments. The mechanism to control these mitigations is via ``prctl(2)``.
+
+There are two prctl options which are related to this:
+
+ - PR_GET_SPECULATION_CTRL
+
+ - PR_SET_SPECULATION_CTRL
+
+PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
+which is selected with arg2 of prctl(2). The return value uses bits 0-2 with
+the following meaning:
+
+PR_SPEC_PRCTL:	PRCTL per task control of the mitigation is enabled
+PR_SPEC_ENABLE:	The speculation feature is enabled, mitigation is disabled
+PR_SPEC_DISABLE: The speculation feature is disabled, mitigation is enabled
+
+If all bits are 0 the CPU is not affected by the speculation misfeature. If
+bit0 is set, then the per task control of the mitigation is enabled. If not
+set, prctl(PR_SET_SPECULATION_CTRL) for the speculation misfeature will
+fail.
+
+PR_GET_SPECULATION_CTRL allows to control the speculation misfeature, which
+is selected by arg2 of prctl(2) per task. arg3 is used to hand in either
+PR_SPEC_ENABLE or PR_SPEC_DISABLE.
+
+The common return values are:
+
+-EINVAL: prctl is not implemented by the architecture
+-ENODEV: arg2 selects an unsupported speculation misfeature
+
+PR_SET_SPECULATION_CTRL has these additional return values:
+
+-ERANGE: arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
+-ENXIO:  prctl control of the selected speculation misfeature is disabled
+
+The following supported controllable speculation misfeatures are implemented:
+
+- PR_SPEC_STORE_BYPASS: Speculative Store Bypass
+
+  To control it the following prctl() invocations are used:
+
+    prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS);
+
+  and
+
+    prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE);
+
+    or
+
+    prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE);
+
--- a/include/linux/nospec.h
+++ b/include/linux/nospec.h
@@ -55,4 +55,9 @@ static inline unsigned long array_index_
 									\
 	(typeof(_i)) (_i & _mask);					\
 })
+
+/* Speculation control prctl */
+int arch_prctl_set_spec_ctrl(unsigned long which, unsigned long ctrl);
+int arch_prctl_get_spec_ctrl(unsigned long which);
+
 #endif /* _LINUX_NOSPEC_H */
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -207,4 +207,15 @@ struct prctl_mm_map {
 # define PR_SVE_VL_LEN_MASK		0xffff
 # define PR_SVE_VL_INHERIT		(1 << 17) /* inherit across exec */
 
+/* Per task speculation control */
+#define PR_SET_SPECULATION_CTRL		52
+#define PR_GET_SPECULATION_CTRL		53
+/* Speculation control variants */
+# define PR_SPEC_STORE_BYPASS		0
+/* Return and control values for PR_SET/GET_SPECULATION_CTRL */
+# define PR_SPEC_NOT_AFFECTED		0
+# define PR_SPEC_PRCTL			(1UL << 0)
+# define PR_SPEC_ENABLE			(1UL << 1)
+# define PR_SPEC_DISABLE		(1UL << 2)
+
 #endif /* _LINUX_PRCTL_H */
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -61,6 +61,8 @@
 #include <linux/uidgid.h>
 #include <linux/cred.h>
 
+#include <linux/nospec.h>
+
 #include <linux/kmsg_dump.h>
 /* Move somewhere else to avoid recompiling? */
 #include <generated/utsrelease.h>
@@ -2242,6 +2244,16 @@ static int propagate_has_child_subreaper
 	return 1;
 }
 
+int __weak arch_prctl_set_spec_ctrl(unsigned long which, unsigned long ctrl)
+{
+	return -EINVAL;
+}
+
+int __weak arch_prctl_get_spec_ctrl(unsigned long which)
+{
+	return -EINVAL;
+}
+
 SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		unsigned long, arg4, unsigned long, arg5)
 {
@@ -2450,6 +2462,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
 	case PR_SVE_GET_VL:
 		error = SVE_GET_VL();
 		break;
+	case PR_SET_SPECULATION_CTRL:
+		error = arch_prctl_set_spec_ctrl(arg2, arg3);
+		break;
+	case PR_GET_SPECULATION_CTRL:
+		error = arch_prctl_get_spec_ctrl(arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V7 14/15] SBB 14
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (12 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 13/15] SBB 13 Thomas Gleixner
@ 2018-04-29 19:30 ` Thomas Gleixner
  2018-04-30  2:14   ` [MODERATED] " Konrad Rzeszutek Wilk
  2018-04-29 19:31 ` [patch V7 15/15] SBB 15 Thomas Gleixner
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:30 UTC (permalink / raw)
  To: speck

The Speculative Store Bypass vulnerability can be mitigated with the
Reduced Data Speculation control (RDS). To allow finer grained control of
this potentially expensive mitigation, a per task mitigation control is
required.

Add a new TIF_RDS flag and put it into the group of TIF flags which are
evaluated for mismatch in switch_to(). If these bits differ in the previous
and the next task, then the slow path function __switch_to_xtra() is
invoked. Implement the TIF_RDS dependent mitigation control in the slow
path.

If the prctl for controlling Speculative Store Bypass is disabled or no
task uses the prctl then there is no overhead in the switch_to() fast
path.

Update the KVM related speculation control functions to take TIF_RDS into
account as well.
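
For reference, the TIF to MSR bit translation in the new specctrl.h helper
boils down to a constant right shift. A worked example with the definitions
from this patch (TIF_RDS = 5, SPEC_CTRL_RDS_SHIFT = 2):

	_TIF_RDS    = 1 << 5 = 0x20
	flag set:   (tifn & 0x20) >> (5 - 2) = 0x20 >> 3 = 0x04 = SPEC_CTRL_RDS
	flag clear: (0x00)        >> (5 - 2) = 0x00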

Based on a patch from Tim Chen. Completely rewritten.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/msr-index.h   |    3 ++-
 arch/x86/include/asm/specctrl.h    |   17 +++++++++++++++++
 arch/x86/include/asm/thread_info.h |    4 +++-
 arch/x86/kernel/cpu/bugs.c         |   22 +++++++++++++++++-----
 arch/x86/kernel/process.c          |   22 ++++++++++++++++++++++
 5 files changed, 61 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -42,7 +42,8 @@
 #define MSR_IA32_SPEC_CTRL		0x00000048 /* Speculation Control */
 #define SPEC_CTRL_IBRS			(1 << 0)   /* Indirect Branch Restricted Speculation */
 #define SPEC_CTRL_STIBP			(1 << 1)   /* Single Thread Indirect Branch Predictors */
-#define SPEC_CTRL_RDS			(1 << 2)   /* Reduced Data Speculation */
+#define SPEC_CTRL_RDS_SHIFT		2	   /* Reduced Data Speculation bit */
+#define SPEC_CTRL_RDS			(1 << SPEC_CTRL_RDS_SHIFT)   /* Reduced Data Speculation */
 
 #define MSR_IA32_PRED_CMD		0x00000049 /* Prediction Command */
 #define PRED_CMD_IBPB			(1 << 0)   /* Indirect Branch Prediction Barrier */
--- a/arch/x86/include/asm/specctrl.h
+++ b/arch/x86/include/asm/specctrl.h
@@ -2,6 +2,7 @@
 #ifndef _ASM_X86_SPECCTRL_H_
 #define _ASM_X86_SPECCTRL_H_
 
+#include <linux/thread_info.h>
 #include <asm/nospec-branch.h>
 
 /*
@@ -18,4 +19,20 @@ extern void x86_restore_host_spec_ctrl(u
 extern u64 x86_amd_ls_cfg_base;
 extern u64 x86_amd_ls_cfg_rds_mask;
 
+/* The Intel SPEC CTRL MSR base value cache */
+extern u64 x86_spec_ctrl_base;
+
+static inline u64 rds_tif_to_spec_ctrl(u64 tifn)
+{
+	BUILD_BUG_ON(TIF_RDS < SPEC_CTRL_RDS_SHIFT);
+	return (tifn & _TIF_RDS) >> (TIF_RDS - SPEC_CTRL_RDS_SHIFT);
+}
+
+static inline u64 rds_tif_to_amd_ls_cfg(u64 tifn)
+{
+	return tifn & _TIF_RDS ? x86_amd_ls_cfg_rds_mask : 0ULL;
+}
+
+extern void speculative_store_bypass_update(void);
+
 #endif
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -79,6 +79,7 @@ struct thread_info {
 #define TIF_SIGPENDING		2	/* signal pending */
 #define TIF_NEED_RESCHED	3	/* rescheduling necessary */
 #define TIF_SINGLESTEP		4	/* reenable singlestep on user return*/
+#define TIF_RDS			5	/* Reduced data speculation */
 #define TIF_SYSCALL_EMU		6	/* syscall emulation active */
 #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
 #define TIF_SECCOMP		8	/* secure computing */
@@ -105,6 +106,7 @@ struct thread_info {
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
 #define _TIF_SINGLESTEP		(1 << TIF_SINGLESTEP)
+#define _TIF_RDS		(1 << TIF_RDS)
 #define _TIF_SYSCALL_EMU	(1 << TIF_SYSCALL_EMU)
 #define _TIF_SYSCALL_AUDIT	(1 << TIF_SYSCALL_AUDIT)
 #define _TIF_SECCOMP		(1 << TIF_SECCOMP)
@@ -144,7 +146,7 @@ struct thread_info {
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW							\
-	(_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP)
+	(_TIF_IO_BITMAP|_TIF_NOCPUID|_TIF_NOTSC|_TIF_BLOCKSTEP|_TIF_RDS)
 
 #define _TIF_WORK_CTXSW_PREV (_TIF_WORK_CTXSW|_TIF_USER_RETURN_NOTIFY)
 #define _TIF_WORK_CTXSW_NEXT (_TIF_WORK_CTXSW)
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -33,7 +33,7 @@ static void __init ssb_select_mitigation
  * Our boot-time value of SPEC_CTRL MSR. We read it once so that any
  * writes to SPEC_CTRL contain whatever reserved bits have been set.
  */
-static u64 __ro_after_init x86_spec_ctrl_base;
+u64 __ro_after_init x86_spec_ctrl_base;
 
 /*
  * The vendor and possibly platform specific bits which can be modified in
@@ -146,25 +146,37 @@ EXPORT_SYMBOL_GPL(x86_get_default_spec_c
 
 void x86_set_guest_spec_ctrl(u64 guest_spec_ctrl)
 {
+	u64 host = x86_spec_ctrl_base;
+
 	if (!boot_cpu_has(X86_FEATURE_IBRS))
 		return;
-	if (x86_spec_ctrl_base != guest_spec_ctrl)
+
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+		host |=  rds_tif_to_spec_ctrl(current_thread_info()->flags);
+
+	if (host != guest_spec_ctrl)
 		wrmsrl(MSR_IA32_SPEC_CTRL, guest_spec_ctrl);
 }
 EXPORT_SYMBOL_GPL(x86_set_guest_spec_ctrl);
 
 void x86_restore_host_spec_ctrl(u64 guest_spec_ctrl)
 {
+	u64 host = x86_spec_ctrl_base;
+
 	if (!boot_cpu_has(X86_FEATURE_IBRS))
 		return;
-	if (x86_spec_ctrl_base != guest_spec_ctrl)
-		wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+		host |=  rds_tif_to_spec_ctrl(current_thread_info()->flags);
+
+	if (host != guest_spec_ctrl)
+		wrmsrl(MSR_IA32_SPEC_CTRL, host);
 }
 EXPORT_SYMBOL_GPL(x86_restore_host_spec_ctrl);
 
 static void x86_amd_rds_enable(void)
 {
-	u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_bit);
+	u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_mask;
 
 	if (boot_cpu_has(X86_FEATURE_AMD_RDS))
 		wrmsrl(MSR_AMD64_LS_CFG, msrval);
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -38,6 +38,7 @@
 #include <asm/switch_to.h>
 #include <asm/desc.h>
 #include <asm/prctl.h>
+#include <asm/specctrl.h>
 
 /*
  * per-CPU TSS segments. Threads are completely 'soft' on Linux,
@@ -278,6 +279,24 @@ static inline void switch_to_bitmap(stru
 	}
 }
 
+static __always_inline void __speculative_store_bypass_update(unsigned long tifn)
+{
+	u64 msr;
+
+	if (static_cpu_has(X86_FEATURE_AMD_RDS)) {
+		msr = x86_amd_ls_cfg_base | rds_tif_to_amd_ls_cfg(tifn);
+		wrmsrl(MSR_AMD64_LS_CFG, msr);
+	} else {
+		msr = x86_spec_ctrl_base | rds_tif_to_spec_ctrl(tifn);
+		wrmsrl(MSR_IA32_SPEC_CTRL, msr);
+	}
+}
+
+void speculative_store_bypass_update(void)
+{
+	__speculative_store_bypass_update(current_thread_info()->flags);
+}
+
 void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
 		      struct tss_struct *tss)
 {
@@ -309,6 +328,9 @@ void __switch_to_xtra(struct task_struct
 
 	if ((tifp ^ tifn) & _TIF_NOCPUID)
 		set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));
+
+	if ((tifp ^ tifn) & _TIF_RDS)
+		__speculative_store_bypass_update(tifn);
 }
 
 /*

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [patch V7 15/15] SBB 15
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (13 preceding siblings ...)
  2018-04-29 19:30 ` [patch V7 14/15] SBB 14 Thomas Gleixner
@ 2018-04-29 19:31 ` Thomas Gleixner
  2018-04-30  2:32   ` [MODERATED] " Konrad Rzeszutek Wilk
                     ` (2 more replies)
  2018-04-29 20:14 ` [patch V7 00/15] SBB 0 Thomas Gleixner
  2018-04-29 20:35 ` [MODERATED] " Borislav Petkov
  16 siblings, 3 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 19:31 UTC (permalink / raw)
  To: speck

Add prctl based control for Speculative Store Bypass mitigation and make it
the default mitigation for Intel.

Andi Kleen provided the following rationale (slightly redacted):

 There are multiple levels of impact of Speculative Store Bypass:

 1) JITed sandbox.
    It cannot invoke system calls, but can do PRIME+PROBE and may have call
    interfaces to other code

 2) Native code process.
    No protection inside the process at this level.

 3) Kernel.

 4) Between processes. 

 The prctl tries to protect against attacks from case (1).

 If the untrusted code can do random system calls then control is already
 lost in a much worse way. So there needs to be system call protection in
 some way (using a JIT not allowing them or seccomp). Or rather if the
 process can subvert its environment somehow to do the prctl it can already
 execute arbitrary code, which is much worse than SSB.

 To put it differently, the point of the prctl is to not allow JITed code
 to read data it shouldn't read from its JITed sandbox. If it already has
 escaped its sandbox then it can already read everything it wants in its
 address space, and do much worse.

 On the other hand, if there is the ability to switch it freely, there is
 more flexibility to do the protection which is needed for JITed code
 without impacting the overall system performance.

Based on an initial patch from Tim Chen. Completely rewritten.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 Documentation/admin-guide/kernel-parameters.txt |    3 
 arch/x86/include/asm/nospec-branch.h            |    1 
 arch/x86/kernel/cpu/bugs.c                      |   77 ++++++++++++++++++++++--
 3 files changed, 76 insertions(+), 5 deletions(-)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4054,6 +4054,9 @@
 			auto   - Kernel detects whether the CPU model contains a
 			         vulnerable implementation of Speculative Store
 			         Bypass and picks the most appropriate mitigation
+			prctl  - Control Speculative Store Bypass for a thread
+			         via prctl. By default it is enabled. The state
+				 is inherited on fork.
 
 			Not specifying this option is equivalent to
 			spec_store_bypass_disable=auto.
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -232,6 +232,7 @@ extern u64 x86_get_default_spec_ctrl(voi
 enum ssb_mitigation {
 	SPEC_STORE_BYPASS_NONE,
 	SPEC_STORE_BYPASS_DISABLE,
+	SPEC_STORE_BYPASS_PRCTL,
 };
 
 extern char __indirect_thunk_start[];
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -12,6 +12,8 @@
 #include <linux/utsname.h>
 #include <linux/cpu.h>
 #include <linux/module.h>
+#include <linux/nospec.h>
+#include <linux/prctl.h>
 
 #include <asm/specctrl.h>
 #include <asm/cmdline.h>
@@ -408,20 +410,23 @@ enum ssb_mitigation_cmd {
 	SPEC_STORE_BYPASS_CMD_NONE,
 	SPEC_STORE_BYPASS_CMD_AUTO,
 	SPEC_STORE_BYPASS_CMD_ON,
+	SPEC_STORE_BYPASS_CMD_PRCTL,
 };
 
 static const char *ssb_strings[] = {
 	[SPEC_STORE_BYPASS_NONE]	= "Vulnerable",
-	[SPEC_STORE_BYPASS_DISABLE]	= "Mitigation: Speculative Store Bypass disabled"
+	[SPEC_STORE_BYPASS_DISABLE]	= "Mitigation: Speculative Store Bypass disabled",
+	[SPEC_STORE_BYPASS_PRCTL]	= "Mitigation: Speculative Store Bypass disabled via prctl"
 };
 
 static const struct {
 	const char *option;
 	enum ssb_mitigation_cmd cmd;
 } ssb_mitigation_options[] = {
-	{ "auto",	SPEC_STORE_BYPASS_CMD_AUTO }, /* Platform decides */
-	{ "on",		SPEC_STORE_BYPASS_CMD_ON },   /* Disable Speculative Store Bypass */
-	{ "off",	SPEC_STORE_BYPASS_CMD_NONE }, /* Don't touch Speculative Store Bypass */
+	{ "auto",	SPEC_STORE_BYPASS_CMD_AUTO },  /* Platform decides */
+	{ "on",		SPEC_STORE_BYPASS_CMD_ON },    /* Disable Speculative Store Bypass */
+	{ "off",	SPEC_STORE_BYPASS_CMD_NONE },  /* Don't touch Speculative Store Bypass */
+	{ "prctl",	SPEC_STORE_BYPASS_CMD_PRCTL }, /* Disable Speculative Store Bypass via prctl */
 };
 
 static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void)
@@ -476,9 +481,15 @@ static enum ssb_mitigation_cmd __init __
 		 */
 		if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
 			break;
+		/* Choose prctl as the default mode */
+		mode = SPEC_STORE_BYPASS_PRCTL;
+		break;
 	case SPEC_STORE_BYPASS_CMD_ON:
 		mode = SPEC_STORE_BYPASS_DISABLE;
 		break;
+	case SPEC_STORE_BYPASS_CMD_PRCTL:
+		mode = SPEC_STORE_BYPASS_PRCTL;
+		break;
 	case SPEC_STORE_BYPASS_CMD_NONE:
 		break;
 	}
@@ -489,7 +500,7 @@ static enum ssb_mitigation_cmd __init __
 	 *  - X86_FEATURE_RDS - CPU is able to turn off speculative store bypass
 	 *  - X86_FEATURE_SPEC_STORE_BYPASS_DISABLE - engage the mitigation
 	 */
-	if (mode != SPEC_STORE_BYPASS_NONE) {
+	if (mode == SPEC_STORE_BYPASS_DISABLE) {
 		setup_force_cpu_cap(X86_FEATURE_SPEC_STORE_BYPASS_DISABLE);
 		/*
 		 * Intel uses the SPEC CTRL MSR Bit(2) for this, while AMD uses
@@ -520,6 +531,62 @@ static void ssb_select_mitigation()
 
 #undef pr_fmt
 
+static int sbb_prctl_set(unsigned long ctrl)
+{
+	bool rds = !!test_tsk_thread_flag(current, TIF_RDS);
+
+	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL)
+		return -ENXIO;
+
+	if (ctrl == PR_SPEC_ENABLE)
+		clear_tsk_thread_flag(current, TIF_RDS);
+	else
+		set_tsk_thread_flag(current, TIF_RDS);
+
+	if (rds != !!test_tsk_thread_flag(current, TIF_RDS))
+		speculative_store_bypass_update();
+	return 0;
+}
+
+static int sbb_prctl_get(void)
+{
+	switch (ssb_mode) {
+	case SPEC_STORE_BYPASS_DISABLE:
+		return PR_SPEC_DISABLE;
+	case SPEC_STORE_BYPASS_PRCTL:
+		if (test_tsk_thread_flag(current, TIF_RDS))
+			return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
+		return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
+	default:
+		if (boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS))
+			return PR_SPEC_ENABLE;
+		return PR_SPEC_NOT_AFFECTED;
+	}
+}
+
+int arch_prctl_set_spec_ctrl(unsigned long which, unsigned long ctrl)
+{
+	if (ctrl != PR_SPEC_ENABLE && ctrl != PR_SPEC_DISABLE)
+		return -ERANGE;
+
+	switch (which) {
+	case PR_SPEC_STORE_BYPASS:
+		return sbb_prctl_set(ctrl);
+	default:
+		return -ENODEV;
+	}
+}
+
+int arch_prctl_get_spec_ctrl(unsigned long which)
+{
+	switch (which) {
+	case PR_SPEC_STORE_BYPASS:
+		return sbb_prctl_get();
+	default:
+		return -ENODEV;
+	}
+}
+
 void x86_setup_ap_spec_ctrl(void)
 {
 	if (boot_cpu_has(X86_FEATURE_IBRS))

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 00/15] SBB 0
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (14 preceding siblings ...)
  2018-04-29 19:31 ` [patch V7 15/15] SBB 15 Thomas Gleixner
@ 2018-04-29 20:14 ` Thomas Gleixner
  2018-04-29 20:35 ` [MODERATED] " Borislav Petkov
  16 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 20:14 UTC (permalink / raw)
  To: speck

On Sun, 29 Apr 2018, speck for Thomas Gleixner wrote:

This SSB acronym drives me nuts. I typoed that as SBB all day long. SBB is
the official acronym for the 'Schweizer BundesBahnen', i.e. the Swiss
Federal Railways. Living close to Switzerland seems to do that to me.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 00/15] SBB 0
  2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
                   ` (15 preceding siblings ...)
  2018-04-29 20:14 ` [patch V7 00/15] SBB 0 Thomas Gleixner
@ 2018-04-29 20:35 ` Borislav Petkov
  2018-04-29 20:46   ` Konrad Rzeszutek Wilk
  2018-04-29 20:55   ` Thomas Gleixner
  16 siblings, 2 replies; 50+ messages in thread
From: Borislav Petkov @ 2018-04-29 20:35 UTC (permalink / raw)
  To: speck

On Sun, Apr 29, 2018 at 09:30:45PM +0200, speck for Thomas Gleixner wrote:
> This is an update based on Konrads V6 series. The major changes are:
> 
>  - Distangle the mitigation control from the AMD/Intel cpu init code and
>    keep it confined to bugs.c. That's cleaner and required to make the
>    prctl mode work properly on both AMD and Intel
> 
>  - Avoid parsing the command line when RDS is not supported at all.
> 
>  - Make all the spec ctrl msr related variables __ro_after_init instead of
>    read_mostly. Nothing can fiddle with them after boot.
> 
>  - Integrate the PRCTL:

Ok, now that we have this "fancy" contraption I haz a question: how are
programs supposed to use it which are closed source? Or such which can't
be recompiled? Are people supposed to do wrappers or are we saying,
tough luck to those who can't change their applications?

Thx.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 00/15] SBB 0
  2018-04-29 20:35 ` [MODERATED] " Borislav Petkov
@ 2018-04-29 20:46   ` Konrad Rzeszutek Wilk
  2018-04-29 20:57     ` Thomas Gleixner
  2018-04-29 21:40     ` [MODERATED] " Borislav Petkov
  2018-04-29 20:55   ` Thomas Gleixner
  1 sibling, 2 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-29 20:46 UTC (permalink / raw)
  To: speck

On Sun, Apr 29, 2018 at 10:35:42PM +0200, speck for Borislav Petkov wrote:
> On Sun, Apr 29, 2018 at 09:30:45PM +0200, speck for Thomas Gleixner wrote:
> > This is an update based on Konrads V6 series. The major changes are:
> > 
> >  - Distangle the mitigation control from the AMD/Intel cpu init code and
> >    keep it confined to bugs.c. That's cleaner and required to make the
> >    prctl mode work properly on both AMD and Intel
> > 
> >  - Avoid parsing the command line when RDS is not supported at all.
> > 
> >  - Make all the spec ctrl msr related variables __ro_after_init instead of
> >    read_mostly. Nothing can fiddle with them after boot.
> > 
> >  - Integrate the PRCTL:
> 
> Ok, now that we have this "fancy" contraption I haz a question: how are
> programs supposed to use it which are closed source? Or such which can't
> be recompiled? Are people supposed to do wrappers or are we saying,
> tough luck to those who can't change their applications?

spec_store_bypass_disable=on

And they are all good.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 00/15] SBB 0
  2018-04-29 20:35 ` [MODERATED] " Borislav Petkov
  2018-04-29 20:46   ` Konrad Rzeszutek Wilk
@ 2018-04-29 20:55   ` Thomas Gleixner
  2018-04-29 22:05     ` Thomas Gleixner
  1 sibling, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 20:55 UTC (permalink / raw)
  To: speck

On Sun, 29 Apr 2018, speck for Borislav Petkov wrote:
> On Sun, Apr 29, 2018 at 09:30:45PM +0200, speck for Thomas Gleixner wrote:
> > This is an update based on Konrads V6 series. The major changes are:
> > 
> >  - Distangle the mitigation control from the AMD/Intel cpu init code and
> >    keep it confined to bugs.c. That's cleaner and required to make the
> >    prctl mode work properly on both AMD and Intel
> > 
> >  - Avoid parsing the command line when RDS is not supported at all.
> > 
> >  - Make all the spec ctrl msr related variables __ro_after_init instead of
> >    read_mostly. Nothing can fiddle with them after boot.
> > 
> >  - Integrate the PRCTL:
> 
> Ok, now that we have this "fancy" contraption I haz a question: how are
> programs supposed to use it which are closed source? Or such which can't
> be recompiled? Are people supposed to do wrappers or are we saying,

No wrappers. prctl(2) operates on self, aka current ...

> tough luck to those who can't change their applications?

That's what it looks like. And that's why I was asking whether default
mitigation mode PRCTL is a brilliant choice.

It's surely a brilliant choice for Intel and AMD because it does not affect
performance and I'm sure they will publish the next 'oh it's not that bad
and it works as advertised' statement on CRD.

Whether it's a brilliant choice for the average user who cannot control the
applications and is not necessarily able to fiddle with command line
arguments is a different question. And it's one which we have to ask
ourselves seriously.

Of course I don't know how bad the attack vector through javascript is,
because there is again no useful information and no proper technical
analysis available.

So I really don't give a shit about what Intel or AMD want. We have to
decide what we think is the right thing to do. And I certainly prefer
protection by default for everyone over performance for Intel's and AMD's
sake.

This is not an Intel/AMD 'wish and we serve' store. We make our own decisions
and I'd rather let enterprise admins curse about setting the right bits on
the kernel command line when they tune their shiny new prctl enabled JVM
thingy.

Unless someone provides authoritative proof that there is no issue. I'm
looking forward to that....

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 00/15] SBB 0
  2018-04-29 20:46   ` Konrad Rzeszutek Wilk
@ 2018-04-29 20:57     ` Thomas Gleixner
  2018-04-29 21:40     ` [MODERATED] " Borislav Petkov
  1 sibling, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 20:57 UTC (permalink / raw)
  To: speck

On Sun, 29 Apr 2018, speck for Konrad Rzeszutek Wilk wrote:
> On Sun, Apr 29, 2018 at 10:35:42PM +0200, speck for Borislav Petkov wrote:
> > On Sun, Apr 29, 2018 at 09:30:45PM +0200, speck for Thomas Gleixner wrote:
> > > This is an update based on Konrads V6 series. The major changes are:
> > > 
> > >  - Distangle the mitigation control from the AMD/Intel cpu init code and
> > >    keep it confined to bugs.c. That's cleaner and required to make the
> > >    prctl mode work properly on both AMD and Intel
> > > 
> > >  - Avoid parsing the command line when RDS is not supported at all.
> > > 
> > >  - Make all the spec ctrl msr related variables __ro_after_init instead of
> > >    read_mostly. Nothing can fiddle with them after boot.
> > > 
> > >  - Integrate the PRCTL:
> > 
> > Ok, now that we have this "fancy" contraption I haz a question: how are
> > programs supposed to use it which are closed source? Or such which can't
> > be recompiled? Are people supposed to do wrappers or are we saying,
> > tough luck to those who can't change their applications?
> 
> spec_store_bypass_disable=on
> 
> And they are all good.

Sure, and who is making all the Joe Users do that? You're singing the enterprise
performance chant.

Let enterprise admins add: spec_store_bypass_disable=prctl and they are all
good.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 00/15] SBB 0
  2018-04-29 20:46   ` Konrad Rzeszutek Wilk
  2018-04-29 20:57     ` Thomas Gleixner
@ 2018-04-29 21:40     ` Borislav Petkov
  1 sibling, 0 replies; 50+ messages in thread
From: Borislav Petkov @ 2018-04-29 21:40 UTC (permalink / raw)
  To: speck

On Sun, Apr 29, 2018 at 04:46:19PM -0400, speck for Konrad Rzeszutek Wilk wrote:
> spec_store_bypass_disable=on
> 
> And they are all good.

No no, that's not the "solution". If we tell them that, then the whole
prctl bla was for nothing.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 00/15] SBB 0
  2018-04-29 20:55   ` Thomas Gleixner
@ 2018-04-29 22:05     ` Thomas Gleixner
  2018-04-30  0:06       ` [MODERATED] " Jon Masters
  0 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-29 22:05 UTC (permalink / raw)
  To: speck

On Sun, 29 Apr 2018, speck for Thomas Gleixner wrote:
> This is not an Intel/AMD 'wish and we serve' store. We make our own decisions
> and I'd rather let enterprise admins curse about setting the right bits on
> the kernel command line when they tune their shiny new prctl enabled JVM
> thingy.
> 
> Unless someone provides authoritative proof that there is no issue. I'm
> looking forward to that....

Just for the record. I was sceptical about the prctl in the beginning, but
I surely recognize the value. Though in that early discussion my concerns
about the general security problem for Joe User were mitigated with the
argument that sensible stuff like browsers are going to be covered by
seccomp() anyway. Now the seccomp() thing was given up on, so the initial
question still stands. And I think it's a legitimate one.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 03/15] SBB 3
  2018-04-29 19:30 ` [patch V7 03/15] SBB 3 Thomas Gleixner
@ 2018-04-29 23:31   ` Linus Torvalds
  2018-04-30  2:50     ` Konrad Rzeszutek Wilk
  2018-04-30  7:09     ` David Woodhouse
  0 siblings, 2 replies; 50+ messages in thread
From: Linus Torvalds @ 2018-04-29 23:31 UTC (permalink / raw)
  To: speck


[ Going through the patches more carefully because Thomas asked me to ]

On Sun, 29 Apr 2018, speck for Thomas Gleixner wrote:
>  
> @@ -248,12 +259,14 @@ static inline void vmexit_fill_RSB(void)
>  				 "movl $0, %%edx\n\t"		\
>  				 "wrmsr",			\
>  				 _feature)			\
> -		     : : [msr] "i" (_msr), [val] "i" (_val)	\
> +		     : : [msr] "i" (_msr), [val] "m" (_val)	\
>  		     : "eax", "ecx", "edx", "memory")

Oops. This is suspicious.

I suspect it all works, but it looks very non-optimal.

The alternative asm is

	"movl %[msr], %%ecx\n\t"       \
	"movl %[val], %%eax\n\t"       \
	"movl $0, %%edx\n\t"           \
	"wrmsr",                       \

and those new constraints are quite dubious:

	: : [msr] "i" (_msr), [val] "m" (_val)     \
	: "eax", "ecx", "edx", "memory")

and that *forces* gcc to spill "val" to memory (regardless of 
alternative), only for the inline asm to then load it from memory.

It would make more sense to say "g" than "m", I think. We don't much care 
where "val" comes from.

But more importantly, why do we have those odd "mov" instructions at all?

Why does this code not just do

    { unsigned int ax, dx, cx;
      asm volatile(ALTERNATIVE("","wrmsr",_feature)
	:"=a" (ax), "=c" (cx), "=d" (dx)
	:"0" (_val), "1" (_msr)
	:"memory");
    }

which seems simpler and better for gcc to set up? Sure, it makes the 
setup be unconditional, but it effectively already was (because of that 
memory setup).

But I guess it doesn't much _matter_.

            Linus

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 00/15] SBB 0
  2018-04-29 22:05     ` Thomas Gleixner
@ 2018-04-30  0:06       ` Jon Masters
  0 siblings, 0 replies; 50+ messages in thread
From: Jon Masters @ 2018-04-30  0:06 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 1338 bytes --]

On 04/29/2018 06:05 PM, speck for Thomas Gleixner wrote:
> On Sun, 29 Apr 2018, speck for Thomas Gleixner wrote:
>> This is not an Intel/AMD 'wish and we serve' store. We make our own decisions
>> and I'd rather let enterprise admins curse about setting the right bits on
>> the kernel command line when they tune their shiny new prctl enabled JVM
>> thingy.
>>
>> Unless someone provides authoritative proof that there is no issue. I'm
>> looking forward to that....
> 
> Just for the record. I was sceptical about the prctl in the beginning, but
> I surely recognize the value. Though in that early discussion my concerns
> about the general security problem for Joe User were mitigated with the
> argument that sensible stuff like browsers are going to be covered by
> seccomp() anyway. Now the seccomp() thing was given up on, so the initial
> question still stands. And I think it's a legitimate one.

I did some analysis of users of seccomp a few weeks ago when Intel folks
started suggesting that path and I found that its use is far from
universal - it wouldn't be a good thing to rely upon for users.

For users, most of the major browsers plan process isolation updates
trying to keep everything contained as much as possible that way.

Jon.

-- 
Computer Architect | Sent from my Fedora powered laptop


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 10/15] SBB 10
  2018-04-29 19:30 ` [patch V7 10/15] SBB 10 Thomas Gleixner
@ 2018-04-30  0:16   ` Konrad Rzeszutek Wilk
  2018-04-30  7:49     ` Thomas Gleixner
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30  0:16 UTC (permalink / raw)
  To: speck

..snip..
> @@ -52,7 +59,8 @@ void __init check_bugs(void)
>  
>  	/*
>  	 * Read the SPEC_CTRL MSR to account for reserved bits which may
> -	 * have unknown values.
> +	 * have unknown values. AMD64_LS_CFG msr is cached in the early AMD
> +	 * init code as it is not enumerated and depends on the family.
>  	 */
>  	if (boot_cpu_has(X86_FEATURE_IBRS))
>  		rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
> @@ -154,6 +162,14 @@ void x86_restore_host_spec_ctrl(u64 gues
>  }
>  EXPORT_SYMBOL_GPL(x86_restore_host_spec_ctrl);
>  
> +static void x86_amd_rds_enable(void)
> +{
> +	u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_bit);
                                                                 ^- that
')' shouldn't be there.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 12/15] SBB 12
  2018-04-29 19:30 ` [patch V7 12/15] SBB 12 Thomas Gleixner
@ 2018-04-30  1:33   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30  1:33 UTC (permalink / raw)
  To: speck

On Sun, Apr 29, 2018 at 09:30:57PM +0200, speck for Thomas Gleixner wrote:
> Subject: [patch V7 12/15] x86/speculation: Create specctrl.h to avoid include hell
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Having everything in nospec-branch.h creates a hell of dependencies when
> adding the prctl based switching mechanism. Move everything which is not
> required in nospec-branch.h to specctrl.h and fixup the includes in the
> relevant files.

spec-ctrl.h sounds more natural to me, but <shrugs>, either way:

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 13/15] SBB 13
  2018-04-29 19:30 ` [patch V7 13/15] SBB 13 Thomas Gleixner
@ 2018-04-30  1:48   ` Konrad Rzeszutek Wilk
  2018-04-30  2:39     ` Konrad Rzeszutek Wilk
  2018-04-30  3:17     ` Jon Masters
  2018-04-30  2:20   ` [MODERATED] " Konrad Rzeszutek Wilk
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30  1:48 UTC (permalink / raw)
  To: speck

On Sun, Apr 29, 2018 at 09:30:58PM +0200, speck for Thomas Gleixner wrote:
> Subject: [patch V7 13/15] prctl: Add speculation control prctls
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> > Add two new prctls to control aspects of speculation related vulnerabilities
> and their mitigations.
> 
> PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
> > which is selected with arg2 of prctl(2). The return value uses bits 0-2 with
> the following meaning:
> 
> PR_SPEC_PRCTL:	PRCTL per task control of the mitigation is enabled

This reads to me as "mitigation is engaged right now".

But in reality it means: "Hey, you can fiddle with the PR_SET_SPECULATION_CTRL if you
would like"

Perhaps change this to say:
	"PRCTL per task control of the mitigation can be controlled by PR_SET_SPECULATION_CTRL"

but that is not true either (see the next patch, perhaps that is a bug?)

	 
> PR_SPEC_ENABLE:	The speculation feature is enabled, mitigation is disabled
> PR_SPEC_DISABLE: The speculation feature is disabled, mitigation is enabled
> 
> If all bits are 0 the CPU is not affected by the speculation misfeature. If
> bit0 is set, then the per task control of the mitigation is enabled. If not

s/enabled/available/?

> set, prctl(PR_SET_SPECULATION_CTRL) for the speculation misfeature will
> fail.
> 
> PR_GET_SPECULATION_CTRL allows to control the speculation misfeature, which

s/GET/SET/

> is selected by arg2 of prctl(2) per task. arg3 is used to hand in either
> PR_SPEC_ENABLE or PR_SPEC_DISABLE.
> 
> The common return values are:
> 
> -EINVAL: prctl is not implemented by the architecture
> -ENODEV: arg2 selects an unsupported speculation misfeature
> 
> PR_SET_SPECULATION_CTRL has these additional return values:
> 
> -ERANGE: arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
> -ENXIO:  prctl control of the selected speculation misfeature is disabled
> 
> The first supported controllable speculation misfeature is
> PR_SPEC_STORE_BYPASS. Add the define so this can be shared between
> architectures.
> 
> TODO: Tidy up spec_ctrl.rst and write a man prctl(2) patch.
> 
> Based on an initial patch from Tim Chen and mostly rewritten.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  Documentation/userspace-api/index.rst     |    1 
>  Documentation/userspace-api/spec_ctrl.rst |   65 ++++++++++++++++++++++++++++++
>  include/linux/nospec.h                    |    5 ++
>  include/uapi/linux/prctl.h                |   11 +++++
>  kernel/sys.c                              |   18 ++++++++
>  5 files changed, 100 insertions(+)
> 
> --- a/Documentation/userspace-api/index.rst
> +++ b/Documentation/userspace-api/index.rst
> @@ -19,6 +19,7 @@ place where this information is gathered
>     no_new_privs
>     seccomp_filter
>     unshare
> +   spec_ctrl
>  
>  .. only::  subproject and html
>  
> --- /dev/null
> +++ b/Documentation/userspace-api/spec_ctrl.rst
> @@ -0,0 +1,65 @@
> +===================
> +Speculation Control
> +===================
> +
> +Quite some CPUs have speculation related misfeatures which are in fact
> +vulnerabilities causing data leaks in various forms even across privilege
> +domains.
> +
> +The kernel provides mitigation for such vulnerabilities in various
> +forms. Some of these mitigations are compile time configurable and some on
> +the kernel command line.
> +
> +There is also a class of mitigations which is very expensive, but they can
> +be restricted to a certain set of processes or tasks in controlled
> +environments. The mechanism to control these mitigations is via ``prctl(2)``.
> +
> +There are two prctl options which are related to this:
> +
> + - PR_GET_SPECULATION_CTRL
> +
> + - PR_SET_SPECULATION_CTRL
> +
> +PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
> +which is selected with arg2 of prctl(2). The return value uses bits 0-2 with
> +the following meaning:
> +
> +PR_SPEC_PRCTL:	PRCTL per task control of the mitigation is enabled

See above.
> +PR_SPEC_ENABLE:	The speculation feature is enabled, mitigation is disabled
> +PR_SPEC_DISABLE: The speculation feature is disabled, mitigation is enabled
> +
> +If all bits are 0 the CPU is not affected by the speculation misfeature. If
> +bit0 is set, then the per task control of the mitigation is enabled. If not
> +set, prctl(PR_SET_SPECULATION_CTRL) for the speculation misfeature will
> +fail.
> +
> +PR_GET_SPECULATION_CTRL allows to control the speculation misfeature, which

s/GET/SET/
> +is selected by arg2 of prctl(2) per task. arg3 is used to hand in either
> +PR_SPEC_ENABLE or PR_SPEC_DISABLE.
> +
> +The common return values are:
> +
> +-EINVAL: prctl is not implemented by the architecture
> +-ENODEV: arg2 selects an unsupported speculation misfeature
> +
> +PR_SET_SPECULATION_CTRL has these additional return values:
> +
> +-ERANGE: arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
> +-ENXIO:  prctl control of the selected speculation misfeature is disabled
> +
> +The following supported controllable speculation misfeatures are implemented:
> +
> +- PR_SPEC_STORE_BYPASS: Speculative Store Bypass
> +
> +  To control it the following prctl() invocations are used:
> +
> +    prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS);
> +
> +  and

s/and/and if PR_SPEC_PRCTL bit is set then:/
> +
> +    prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE);

s/GET/SET/
> +
> +    or
> +
> +    prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE);

s/GET/SET/
> +
> --- a/include/linux/nospec.h
> +++ b/include/linux/nospec.h
> @@ -55,4 +55,9 @@ static inline unsigned long array_index_
>  									\
>  	(typeof(_i)) (_i & _mask);					\
>  })
> +
> +/* Speculation control prctl */
> +int arch_prctl_set_spec_ctrl(unsigned long which, unsigned long ctrl);
> +int arch_prctl_get_spec_ctrl(unsigned long which);
> +
>  #endif /* _LINUX_NOSPEC_H */
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -207,4 +207,15 @@ struct prctl_mm_map {
>  # define PR_SVE_VL_LEN_MASK		0xffff
>  # define PR_SVE_VL_INHERIT		(1 << 17) /* inherit across exec */
>  
> +/* Per task speculation control */
> +#define PR_SET_SPECULATION_CTRL		52
> +#define PR_GET_SPECULATION_CTRL		53
> +/* Speculation control variants */
> +# define PR_SPEC_STORE_BYPASS		0
> +/* Return and control values for PR_SET/GET_SPECULATION_CTRL */
> +# define PR_SPEC_NOT_AFFECTED		0
> +# define PR_SPEC_PRCTL			(1UL << 0)
> +# define PR_SPEC_ENABLE			(1UL << 1)
> +# define PR_SPEC_DISABLE		(1UL << 2)
> +
>  #endif /* _LINUX_PRCTL_H */
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -61,6 +61,8 @@
>  #include <linux/uidgid.h>
>  #include <linux/cred.h>
>  
> +#include <linux/nospec.h>
> +
>  #include <linux/kmsg_dump.h>
>  /* Move somewhere else to avoid recompiling? */
>  #include <generated/utsrelease.h>
> @@ -2242,6 +2244,16 @@ static int propagate_has_child_subreaper
>  	return 1;
>  }
>  
> +int __weak arch_prctl_set_spec_ctrl(unsigned long which, unsigned long ctrl)
> +{
> +	return -EINVAL;
> +}
> +
> +int __weak arch_prctl_get_spec_ctrl(unsigned long which)
> +{
> +	return -EINVAL;
> +}
> +
>  SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>  		unsigned long, arg4, unsigned long, arg5)
>  {
> @@ -2450,6 +2462,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
>  	case PR_SVE_GET_VL:
>  		error = SVE_GET_VL();
>  		break;
> +	case PR_SET_SPECULATION_CTRL:
> +		error = arch_prctl_set_spec_ctrl(arg2, arg3);

Would it make sense to also test arg4, arg5? And return -EINVAL if somebody
provided any value in them?
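
I.e. something like this (untested sketch):

	case PR_SET_SPECULATION_CTRL:
		if (arg4 || arg5)
			return -EINVAL;
		error = arch_prctl_set_spec_ctrl(arg2, arg3);
		break;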

> +		break;
> +	case PR_GET_SPECULATION_CTRL:
> +		error = arch_prctl_get_spec_ctrl(arg2);

Ditto here (with obviously checking for arg3 as well)?

> +		break;
>  	default:
>  		error = -EINVAL;
>  		break;
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 14/15] SBB 14
  2018-04-29 19:30 ` [patch V7 14/15] SBB 14 Thomas Gleixner
@ 2018-04-30  2:14   ` Konrad Rzeszutek Wilk
  2018-04-30  5:57     ` Thomas Gleixner
  0 siblings, 1 reply; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30  2:14 UTC (permalink / raw)
  To: speck

..snip..
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -42,7 +42,8 @@
>  #define MSR_IA32_SPEC_CTRL		0x00000048 /* Speculation Control */
>  #define SPEC_CTRL_IBRS			(1 << 0)   /* Indirect Branch Restricted Speculation */
>  #define SPEC_CTRL_STIBP			(1 << 1)   /* Single Thread Indirect Branch Predictors */
> -#define SPEC_CTRL_RDS			(1 << 2)   /* Reduced Data Speculation */
> +#define SPEC_CTRL_RDS_SHIFT		2	   /* Reduced Data Speculation bit */
> +#define SPEC_CTRL_RDS			(1 << SPEC_CTRL_RDS_SHIFT)   /* Reduced Data Speculation */
>  
>  #define MSR_IA32_PRED_CMD		0x00000049 /* Prediction Command */
>  #define PRED_CMD_IBPB			(1 << 0)   /* Indirect Branch Prediction Barrier */
> --- a/arch/x86/include/asm/specctrl.h
> +++ b/arch/x86/include/asm/specctrl.h
> @@ -2,6 +2,7 @@
>  #ifndef _ASM_X86_SPECCTRL_H_
>  #define _ASM_X86_SPECCTRL_H_
>  
> +#include <linux/thread_info.h>
>  #include <asm/nospec-branch.h>
>  
>  /*
> @@ -18,4 +19,20 @@ extern void x86_restore_host_spec_ctrl(u
>  extern u64 x86_amd_ls_cfg_base;
>  extern u64 x86_amd_ls_cfg_rds_mask;
>  
> +/* The Intel SPEC CTRL MSR base value cache */
> +extern u64 x86_spec_ctrl_base;
> +
> +static inline u64 rds_tif_to_spec_ctrl(u64 tifn)
> +{
> +	BUILD_BUG_ON(TIF_RDS < SPEC_CTRL_RDS_SHIFT);
> +	return (tifn & _TIF_RDS) >> (TIF_RDS - SPEC_CTRL_RDS_SHIFT);

If my math is correct, the right side value is 3, not 2. That is
TIF_RDS (5) - SPEC_CTRL_RDS_SHIFT(2) = 3.

Then if _TIF_RDS is set we do:
	1 >> 3

which is zero. Hm, did you mean to shift it to left? That would
be 8 (also incorrect).

If _TIF_RDS is unset we do:
	0 >> 3

which is also zero.
>  void x86_set_guest_spec_ctrl(u64 guest_spec_ctrl)
>  {
> +	u64 host = x86_spec_ctrl_base;
> +
>  	if (!boot_cpu_has(X86_FEATURE_IBRS))
>  		return;
> -	if (x86_spec_ctrl_base != guest_spec_ctrl)
> +
> +	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> +		host |=  rds_tif_to_spec_ctrl(current_thread_info()->flags);

Here you have an extra space in front of the rds_tif_to_spec_ctrl.
> + 
> +	if (host != guest_spec_ctrl)
>  		wrmsrl(MSR_IA32_SPEC_CTRL, guest_spec_ctrl);
>  }
>  EXPORT_SYMBOL_GPL(x86_set_guest_spec_ctrl);
>  
>  void x86_restore_host_spec_ctrl(u64 guest_spec_ctrl)
>  {
> +	u64 host = x86_spec_ctrl_base;
> +
>  	if (!boot_cpu_has(X86_FEATURE_IBRS))
>  		return;
> -	if (x86_spec_ctrl_base != guest_spec_ctrl)
> -		wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
> +
> +	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> +		host |=  rds_tif_to_spec_ctrl(current_thread_info()->flags);

Ditto here - also an extra space in front of the rds_tif_to_spec_ctrl.

> +
> +	if (host != guest_spec_ctrl)
> +		wrmsrl(MSR_IA32_SPEC_CTRL, host);
>  }
>  EXPORT_SYMBOL_GPL(x86_restore_host_spec_ctrl);
>  
>  static void x86_amd_rds_enable(void)
>  {
> -	u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_bit);
> +	u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_mask;

Ah, here is the fix for that ')'
>  
>  	if (boot_cpu_has(X86_FEATURE_AMD_RDS))
>  		wrmsrl(MSR_AMD64_LS_CFG, msrval);
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -38,6 +38,7 @@
>  #include <asm/switch_to.h>
>  #include <asm/desc.h>
>  #include <asm/prctl.h>
> +#include <asm/specctrl.h>
>  
>  /*
>   * per-CPU TSS segments. Threads are completely 'soft' on Linux,
> @@ -278,6 +279,24 @@ static inline void switch_to_bitmap(stru
>  	}
>  }
>  
> +static __always_inline void __speculative_store_bypass_update(unsigned long tifn)
> +{
> +	u64 msr;
> +
> +	if (static_cpu_has(X86_FEATURE_AMD_RDS)) {
> +		msr = x86_amd_ls_cfg_base | rds_tif_to_amd_ls_cfg(tifn);
> +		wrmsrl(MSR_AMD64_LS_CFG, msr);
> +	} else {

Would it make sense to test for 'X86_FEATURE_RDS' here? Let me double-check
the next patch - you probably have some check there where the _TIF_RDS never
gets set if X86_FEATURE_RDS does not exist.
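
I.e. (untested sketch, assuming X86_FEATURE_RDS is the right gate):

	} else if (static_cpu_has(X86_FEATURE_RDS)) {
		msr = x86_spec_ctrl_base | rds_tif_to_spec_ctrl(tifn);
		wrmsrl(MSR_IA32_SPEC_CTRL, msr);
	}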

> +		msr = x86_spec_ctrl_base | rds_tif_to_spec_ctrl(tifn);
> +		wrmsrl(MSR_IA32_SPEC_CTRL, msr);
> +	}
> +}
> +
> +void speculative_store_bypass_update(void)
> +{
> +	__speculative_store_bypass_update(current_thread_info()->flags);
> +}
> +
>  void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p,
>  		      struct tss_struct *tss)
>  {
> @@ -309,6 +328,9 @@ void __switch_to_xtra(struct task_struct
>  
>  	if ((tifp ^ tifn) & _TIF_NOCPUID)
>  		set_cpuid_faulting(!!(tifn & _TIF_NOCPUID));
> +
> +	if ((tifp ^ tifn) & _TIF_RDS)
> +		__speculative_store_bypass_update(tifn);
>  }
>  
>  /*
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 13/15] SBB 13
  2018-04-29 19:30 ` [patch V7 13/15] SBB 13 Thomas Gleixner
  2018-04-30  1:48   ` [MODERATED] " Konrad Rzeszutek Wilk
@ 2018-04-30  2:20   ` Konrad Rzeszutek Wilk
  2018-04-30  2:36   ` Konrad Rzeszutek Wilk
  2018-04-30 17:28   ` Konrad Rzeszutek Wilk
  3 siblings, 0 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30  2:20 UTC (permalink / raw)
  To: speck

..snip..
> The common return values are:
> 
> -EINVAL: prctl is not implemented by the architecture
> -ENODEV: arg2 selects an unsupported speculation misfeature
> 
> PR_SET_SPECULATION_CTRL has these additional return values:
> 
> -ERANGE: arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
> -ENXIO:  prctl control of the selected speculation misfeature is disabled

s/disabled/not available (perhaps due to boot time overrides)/ ?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 15/15] SBB 15
  2018-04-29 19:31 ` [patch V7 15/15] SBB 15 Thomas Gleixner
@ 2018-04-30  2:32   ` Konrad Rzeszutek Wilk
  2018-04-30 15:56   ` Konrad Rzeszutek Wilk
  2018-04-30 19:30   ` [MODERATED] " Tim Chen
  2 siblings, 0 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30  2:32 UTC (permalink / raw)
  To: speck

> +static int sbb_prctl_set(unsigned long ctrl)
> +{
> +	bool rds = !!test_tsk_thread_flag(current, TIF_RDS);
> +
> +	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL)
> +		return -ENXIO;
> +
> +	if (ctrl == PR_SPEC_ENABLE)
> +		clear_tsk_thread_flag(current, TIF_RDS);
> +	else
> +		set_tsk_thread_flag(current, TIF_RDS);
> +
> +	if (rds != !!test_tsk_thread_flag(current, TIF_RDS))
> +		speculative_store_bypass_update();

I would have added an \n here, but you're the boss, so if you prefer them
this way that is cool too.

Either way:

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 13/15] SBB 13
  2018-04-29 19:30 ` [patch V7 13/15] SBB 13 Thomas Gleixner
  2018-04-30  1:48   ` [MODERATED] " Konrad Rzeszutek Wilk
  2018-04-30  2:20   ` [MODERATED] " Konrad Rzeszutek Wilk
@ 2018-04-30  2:36   ` Konrad Rzeszutek Wilk
  2018-04-30 17:28   ` Konrad Rzeszutek Wilk
  3 siblings, 0 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30  2:36 UTC (permalink / raw)
  To: speck

[-- Attachment #1: Type: text/plain, Size: 469 bytes --]

> PR_SET_SPECULATION_CTRL has these additional return values:
> 
> -ERANGE: arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE

s/not either/neither/

? Feel free to ignore it of course, but it reads more naturally to me that way.

And also attaching the testing program I spoke of. My previous testing (which
I alluded to in previous emails) missed the obvious case of 'r < 0' meaning
0xfffffff.. which would report that all bits are set (duh).




[-- Attachment #2: test-prctl.c --]
[-- Type: text/plain, Size: 2145 bytes --]

#include <stdio.h>
#include <sys/prctl.h>

#define SPEC_CTRL_RDS_SHIFT		2	/* Reduced Data Speculation bit */
/* Per task speculation control */
#define PR_SET_SPECULATION_CTRL		52
#define PR_GET_SPECULATION_CTRL		53
/* Speculation control variants */
# define PR_SPEC_STORE_BYPASS		0
/* Return and control values for PR_SET/GET_SPECULATION_CTRL */
# define PR_SPEC_NOT_AFFECTED		0
# define PR_SPEC_PRCTL			(1UL << 0)
# define PR_SPEC_ENABLE			(1UL << 1)
# define PR_SPEC_DISABLE		(1UL << 2)

/* Decode the PR_GET_SPECULATION_CTRL bitfield */
void print_bitfields(int r)
{
	if (r >= 0)
		fprintf(stderr, "\tr=%s %s %s\n",
			r & PR_SPEC_PRCTL ? "PR_SPEC_PRCTL" : "",
			r & PR_SPEC_ENABLE ? "PR_SPEC_ENABLE" : "",
			r & PR_SPEC_DISABLE ? "PR_SPEC_DISABLE" : "");
}

void get_v(void)
{
	int r;

	r = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS);
	fprintf(stderr, "   PR_GET_SPECULATION_CTRL PR_SPEC_STORE_BYPASS r=%d\n", r);
	print_bitfields(r);
}

int main(void)
{
	int r;

	fprintf(stderr, "Initial state\n");
	get_v();

	r = prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE);
	fprintf(stderr, "PR_SET_SPECULATION_CTRL PR_SPEC_DISABLE r=%d\n", r);
	print_bitfields(r);
	get_v();

	/* Two DISABLE in a row */
	r = prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE);
	fprintf(stderr, "PR_SET_SPECULATION_CTRL PR_SPEC_DISABLE r=%d\n", r);
	print_bitfields(r);
	get_v();

	/* Two ENABLE in a row */
	r = prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE);
	fprintf(stderr, "PR_SET_SPECULATION_CTRL PR_SPEC_ENABLE r=%d\n", r);
	print_bitfields(r);
	get_v();

	r = prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_ENABLE);
	fprintf(stderr, "PR_SET_SPECULATION_CTRL PR_SPEC_ENABLE r=%d\n", r);
	print_bitfields(r);
	get_v();

	fprintf(stderr, "\nReal test-case\n");
	/* The real normal test-case */
	r = prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, PR_SPEC_DISABLE);
	fprintf(stderr, "PR_SET_SPECULATION_CTRL PR_SPEC_DISABLE r=%d\n", r);
	print_bitfields(r);
	get_v();

	/* Spin so the resulting task state can be inspected from outside */
	do { } while (1);

	return 0;
}
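
(Builds standalone with e.g. "gcc -o test-prctl test-prctl.c". On a kernel
without this series both prctls should come back as r=-1 with errno set to
EINVAL, courtesy of the default case in sys_prctl().)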

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 13/15] SBB 13
  2018-04-30  1:48   ` [MODERATED] " Konrad Rzeszutek Wilk
@ 2018-04-30  2:39     ` Konrad Rzeszutek Wilk
  2018-04-30  3:17     ` Jon Masters
  1 sibling, 0 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30  2:39 UTC (permalink / raw)
  To: speck

On Sun, Apr 29, 2018 at 09:48:56PM -0400, speck for Konrad Rzeszutek Wilk wrote:
> On Sun, Apr 29, 2018 at 09:30:58PM +0200, speck for Thomas Gleixner wrote:
> > Subject: [patch V7 13/15] prctl: Add speculation control prctls
> > From: Thomas Gleixner <tglx@linutronix.de>
> > 
> > Add two new prctls to control aspects of speculation related vulnerabilites
> > and their mitigations.
> > 
> > PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
> > which is selected with arg2 of prctl(2). The return value uses bit 0-2 with
> > the following meaning:
> > 
> > PR_SPEC_PRCTL:	PRCTL per task control of the mitigation is enabled
> 
> This reads to me as "mitigation is engaged right now".
> 
> But in reality it means: "Hey, you can fiddle with the PR_SET_SPECULATION_CTRL if you
> would like"
> 
> Perhaps change this to say:
> 	"PRCTL per task control of the mitigation can be controlled by PR_SET_SPECULATION_CTRL"
> 
> but that is not true either (see the next patch, perhaps that is a bug?)

..it was a bug in my testing program. So that statement in quotes above is correct.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 03/15] SBB 3
  2018-04-29 23:31   ` [MODERATED] " Linus Torvalds
@ 2018-04-30  2:50     ` Konrad Rzeszutek Wilk
  2018-04-30  7:09     ` David Woodhouse
  1 sibling, 0 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30  2:50 UTC (permalink / raw)
  To: speck

..snip..
> The alternative asm is
> 
> 	"movl %[msr], %%ecx\n\t"       \
> 	"movl %[val], %%eax\n\t"       \
> 	"movl $0, %%edx\n\t"           \
> 	"wrmsr",                       \

My compiler (ancient gcc (GCC) 4.4.4 20100503 (Red Hat 4.4.4-2)) seems to have created:

ffffffff8249adf4:       b9 48 00 00 00          mov    $0x48,%ecx
ffffffff8249adf9:       8b 45 c8                mov    -0x38(%rbp),%eax
ffffffff8249adfc:       ba 00 00 00 00          mov    $0x0,%edx
ffffffff8249ae01:       0f 30                   wrmsr   
ffffffff8249ae03:       b9 48 00 00 00          mov    $0x48,%ecx
ffffffff8249ae08:       8b 45 c8                mov    -0x38(%rbp),%eax
ffffffff8249ae0b:       ba 00 00 00 00          mov    $0x0,%edx
ffffffff8249ae10:       0f 30                   wrmsr   
..

which obviously means there is a preamble to stick the value on
the stack, which means a call, which is exactly what you mean.

I originally did the simple change to minimize the amount of backporting churn.

That way, when the stable@vger.kernel.org tree emails start flooding
in, I can easily and correctly provide the right incantations.

Which reminds me: should all those patches have 'xCc: stable@vger.kernel.org'
or such on them? (The x is so git format-patch won't include the email in the CC.)

> 
> and those new constraints are quite dubious:
> 
> 	: : [msr] "i" (_msr), [val] "m" (_val)     \
> 	: "eax", "ecx", "edx", "memory")
> 
> and that *forces* gcc to spill "val" to memory (regardless of 
> alternative), only for the inline asm to then load it from memory.
> 
> It would make more sense to say "g" than "m", I think. We don't much care 
> where "val" comes from.

I tried that:

diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 7a1be0b..0e66b75 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -266,7 +266,7 @@ static inline void vmexit_fill_RSB(void)
                                 "movl $0, %%edx\n\t"           \
                                 "wrmsr",                       \
                                 _feature)                      \
-                    : : [msr] "i" (_msr), [val] "m" (_val)     \
+                    : : [msr] "i" (_msr), [val] "g" (_val)     \
                     : "eax", "ecx", "edx", "memory")
 
 static inline void indirect_branch_prediction_barrier(void)

and got:

/home/konrad/ssd/konrad/linux/arch/x86/platform/efi/efi_64.c: Assembler messages:
/home/konrad/ssd/konrad/linux/arch/x86/platform/efi/efi_64.c:869: Error: suffix or operands invalid for `mov'
/home/konrad/ssd/konrad/linux/arch/x86/platform/efi/efi_64.c:869: Error: suffix or operands invalid for `mov'

Sorry for not digging in this further, my brain is tired tonight.
> 
> But more importantly, why do we have those odd "mov" instructions at all?

It probably was there to make gas shut up when this was using 'i'?

> 
> Why does this code not just do
> 
>     { unsigned int ax, dx, cx;
>       asm volatile(ALTERNATIVE("","wrmsr",_feature)
> 	:"=a" (ax), "=c" (cx), "=d" (dx)
> 	:"0" (_val), "1" (_msr)
> 	:"memory");
>     }
> 
> which seems simpler and better for gcc to set up? Sure, it makes the 
> setup be unconditional, but it effectively already was (because of that 
> memory setup).
> 
> But I guess it doesn't much _matter_.

Right, as we only do those on BIOS/EFI calls. Perhaps a follow-up cleanup patch?

> 
>             Linus

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 13/15] SBB 13
  2018-04-30  1:48   ` [MODERATED] " Konrad Rzeszutek Wilk
  2018-04-30  2:39     ` Konrad Rzeszutek Wilk
@ 2018-04-30  3:17     ` Jon Masters
  2018-04-30  8:35       ` Thomas Gleixner
  1 sibling, 1 reply; 50+ messages in thread
From: Jon Masters @ 2018-04-30  3:17 UTC (permalink / raw)
  To: speck

On 04/29/2018 09:48 PM, speck for Konrad Rzeszutek Wilk wrote:
>> SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>>  		unsigned long, arg4, unsigned long, arg5)
>>  {
>> @@ -2450,6 +2462,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
>>  	case PR_SVE_GET_VL:
>>  		error = SVE_GET_VL();
>>  		break;
>> +	case PR_SET_SPECULATION_CTRL:
>> +		error = arch_prctl_set_spec_ctrl(arg2, arg3);
> Would it make sense to also test arg4, arg5? And if somebody provided any value
> in them to return -EINVAL?

Maybe, but the other prctls don't do this for args they don't use yet.

Jon.

-- 
Computer Architect | Sent from my Fedora powered laptop


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 14/15] SBB 14
  2018-04-30  2:14   ` [MODERATED] " Konrad Rzeszutek Wilk
@ 2018-04-30  5:57     ` Thomas Gleixner
  2018-04-30 15:49       ` [MODERATED] " Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-30  5:57 UTC (permalink / raw)
  To: speck

On Sun, 29 Apr 2018, speck for Konrad Rzeszutek Wilk wrote:
> > +extern u64 x86_spec_ctrl_base;
> > +
> > +static inline u64 rds_tif_to_spec_ctrl(u64 tifn)
> > +{
> > +	BUILD_BUG_ON(TIF_RDS < SPEC_CTRL_RDS_SHIFT);
> > +	return (tifn & _TIF_RDS) >> (TIF_RDS - SPEC_CTRL_RDS_SHIFT);
> 
> If my math is correct, the right side value is 3, not 2. That is
> TIF_RDS (5) - SPEC_CTRL_RDS_SHIFT(2) = 3.
> 
> Then if _TIF_RDS is set we do:
> 	1 >> 3

_TIF_RDS == (1 << TIF_RDS) == 0x20

Ergo if set:

	0x20 >> 3 = 0x04

if not set:

	0x00 >> 3 = 0x00

Right?
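
For reference, the same arithmetic as a tiny standalone check (TIF_RDS == 5
and SPEC_CTRL_RDS_SHIFT == 2 taken from the discussion above; assumed here):

	#include <assert.h>

	#define SPEC_CTRL_RDS_SHIFT    2               /* assumed */
	#define TIF_RDS                5               /* assumed */
	#define _TIF_RDS               (1UL << TIF_RDS)

	int main(void)
	{
		unsigned long tifn;

		tifn = _TIF_RDS;        /* flag set:   0x20 >> 3 == 0x04 */
		assert(((tifn & _TIF_RDS) >> (TIF_RDS - SPEC_CTRL_RDS_SHIFT)) == 0x04);

		tifn = 0;               /* flag clear: 0x00 >> 3 == 0x00 */
		assert(((tifn & _TIF_RDS) >> (TIF_RDS - SPEC_CTRL_RDS_SHIFT)) == 0x00);
		return 0;
	}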

> >  static void x86_amd_rds_enable(void)
> >  {
> > -	u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_bit);
> > +	u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_mask;
> 
> Ah, here is the fix for that ')'

Will fold it back.
  
> > +static __always_inline void __speculative_store_bypass_update(unsigned long tifn)
> > +{
> > +	u64 msr;
> > +
> > +	if (static_cpu_has(X86_FEATURE_AMD_RDS)) {
> > +		msr = x86_amd_ls_cfg_base | rds_tif_to_amd_ls_cfg(tifn);
> > +		wrmsrl(MSR_AMD64_LS_CFG, msr);
> > +	} else {
> 
> Would it make sense to test for 'X86_FEATURE_RDS' here? Let me double-check
> the next patch - you probably have some check there where the _TIF_RDS never
> gets set if X86_FEATURE_RDS does not exist.

Correct.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 03/15] SBB 3
  2018-04-29 23:31   ` [MODERATED] " Linus Torvalds
  2018-04-30  2:50     ` Konrad Rzeszutek Wilk
@ 2018-04-30  7:09     ` David Woodhouse
  1 sibling, 0 replies; 50+ messages in thread
From: David Woodhouse @ 2018-04-30  7:09 UTC (permalink / raw)
  To: speck



On Sun, 2018-04-29 at 16:31 -0700, speck for Linus Torvalds wrote:
> 
> But more importantly, why do we have those odd "mov" instructions at all?
>
> Why does this code not just do
> 
>     { unsigned int ax, dx, cx;
>       asm volatile(ALTERNATIVE("","wrmsr",_feature)
>         :"=a" (ax), "=c" (cx), "=d" (dx)
>         :"0" (_val), "1" (_msr)
>         :"memory");
>     }
> 
> which seems simpler and better for gcc to set up? Sure, it makes the 
> setup be unconditional, but it effectively already was (because of that 
> memory setup).
> 
> But I guess it doesn't much _matter_.

FWIW we have those mov instructions because there was much bikeshedding
of this the first time round, and people really wanted the ax/cx/dx
setup not to happen in the ALTERNATIVE case. It *wasn't* unconditional
when it was all immediates, and I was losing the will to live. As you
say, it doesn't much matter.

We could still make it all go away in the ALTERNATIVE case by passing
in *only* the immediate value to be OR'd with x86_spec_ctrl_base, and
letting the asm load x86_spec_ctrl_base for itself (the variable itself
would have to be exported).
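
Rough sketch of that idea (the exact asm and naming are assumed here,
not code from this series):

	/* Hypothetical: the asm loads x86_spec_ctrl_base by itself and
	 * ORs in the IBRS bit, so the C side sets up nothing at all
	 * when the alternative is patched out. */
	asm volatile(ALTERNATIVE("",
				 "movl $0x48, %%ecx\n\t"	/* MSR_IA32_SPEC_CTRL */
				 "movl x86_spec_ctrl_base(%%rip), %%eax\n\t"
				 "movl x86_spec_ctrl_base+4(%%rip), %%edx\n\t"
				 "orl  $1, %%eax\n\t"		/* SPEC_CTRL_IBRS */
				 "wrmsr",
				 _feature)
		     : : : "eax", "ecx", "edx", "memory");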

I'm inclined not to bother. It's not like it's even that much of a win
because we *clobber* those registers anyway.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 10/15] SBB 10
  2018-04-30  0:16   ` [MODERATED] " Konrad Rzeszutek Wilk
@ 2018-04-30  7:49     ` Thomas Gleixner
  0 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-30  7:49 UTC (permalink / raw)
  To: speck

On Sun, 29 Apr 2018, speck for Konrad Rzeszutek Wilk wrote:

> ..snip..
> > @@ -52,7 +59,8 @@ void __init check_bugs(void)
> >  
> >  	/*
> >  	 * Read the SPEC_CTRL MSR to account for reserved bits which may
> > -	 * have unknown values.
> > +	 * have unknown values. AMD64_LS_CFG msr is cached in the early AMD
> > +	 * init code as it is not enumerated and depends on the family.
> >  	 */
> >  	if (boot_cpu_has(X86_FEATURE_IBRS))
> >  		rdmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
> > @@ -154,6 +162,14 @@ void x86_restore_host_spec_ctrl(u64 gues
> >  }
> >  EXPORT_SYMBOL_GPL(x86_restore_host_spec_ctrl);
> >  
> > +static void x86_amd_rds_enable(void)
> > +{
> > +	u64 msrval = x86_amd_ls_cfg_base | x86_amd_ls_cfg_rds_bit);
>                                                                  ^- that
> ')' shouldn't be there.

Right. That was my first attempt to use the bit position so I could avoid
the conditional in switch_to() on AMD as well, but then I gave up on that
again...

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 13/15] SBB 13
  2018-04-30  3:17     ` Jon Masters
@ 2018-04-30  8:35       ` Thomas Gleixner
  0 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-30  8:35 UTC (permalink / raw)
  To: speck

On Sun, 29 Apr 2018, speck for Jon Masters wrote:
> On 04/29/2018 09:48 PM, speck for Konrad Rzeszutek Wilk wrote:
> >> SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
> >>  		unsigned long, arg4, unsigned long, arg5)
> >>  {
> >> @@ -2450,6 +2462,12 @@ SYSCALL_DEFINE5(prctl, int, option, unsi
> >>  	case PR_SVE_GET_VL:
> >>  		error = SVE_GET_VL();
> >>  		break;
> >> +	case PR_SET_SPECULATION_CTRL:
> >> +		error = arch_prctl_set_spec_ctrl(arg2, arg3);
> > Would it make sense to also test arg4, arg5? And if somebody provided any value
> > in them to return -EINVAL?
> 
> Maybe, but the other prctl's don't do this for args they don't use yet.

And that's a good reason not to do it either, right?

Hell no. Konrad is right, it's a good thing to test unused args for 0 in
case you need an extra arg later. And given the scope of the mess this is
about, it's not unreasonable to expect extra arguments.
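
Something like the below in the prctl switch would do (sketch only;
-EINVAL is assumed here, the actual error value is a separate choice):

	case PR_SET_SPECULATION_CTRL:
		if (arg4 || arg5)
			return -EINVAL;
		error = arch_prctl_set_spec_ctrl(arg2, arg3);
		break;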

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 14/15] SBB 14
  2018-04-30  5:57     ` Thomas Gleixner
@ 2018-04-30 15:49       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30 15:49 UTC (permalink / raw)
  To: speck

On Mon, Apr 30, 2018 at 07:57:19AM +0200, speck for Thomas Gleixner wrote:
> On Sun, 29 Apr 2018, speck for Konrad Rzeszutek Wilk wrote:
> > > +extern u64 x86_spec_ctrl_base;
> > > +
> > > +static inline u64 rds_tif_to_spec_ctrl(u64 tifn)
> > > +{
> > > +	BUILD_BUG_ON(TIF_RDS < SPEC_CTRL_RDS_SHIFT);
> > > +	return (tifn & _TIF_RDS) >> (TIF_RDS - SPEC_CTRL_RDS_SHIFT);
> > 
> > If my math is correct, the right side value is 3, not 2. That is
> > TIF_RDS (5) - SPEC_CTRL_RDS_SHIFT(2) = 3.
> > 
> > Then if _TIF_RDS is set we do:
> > 	1 >> 3
> 
> _TIF_RDS == (1 << TIF_RDS) == 0x20
> 
> Ergo if set:
> 
> 	0x20 >> 3 = 0x04
> 
> if not set:
> 
>        0x00 >> 3 = 0x00
> 
> Right?

<blushes> Yes. I missed the '_'.

With that explanation and the thing below, please add

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 15/15] SBB 15
  2018-04-29 19:31 ` [patch V7 15/15] SBB 15 Thomas Gleixner
  2018-04-30  2:32   ` [MODERATED] " Konrad Rzeszutek Wilk
@ 2018-04-30 15:56   ` Konrad Rzeszutek Wilk
  2018-04-30 16:07     ` Thomas Gleixner
  2018-04-30 19:30   ` [MODERATED] " Tim Chen
  2 siblings, 1 reply; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30 15:56 UTC (permalink / raw)
  To: speck

> +static int sbb_prctl_set(unsigned long ctrl)

Aha!

s/sbb/ssb/!

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 15/15] SBB 15
  2018-04-30 15:56   ` Konrad Rzeszutek Wilk
@ 2018-04-30 16:07     ` Thomas Gleixner
  0 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-30 16:07 UTC (permalink / raw)
  To: speck

On Mon, 30 Apr 2018, speck for Konrad Rzeszutek Wilk wrote:

> > +static int sbb_prctl_set(unsigned long ctrl)
> 
> Aha!
> 
> s/sbb/ssb/!

I hate that acronym. Or it hates me ...

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 13/15] SBB 13
  2018-04-29 19:30 ` [patch V7 13/15] SBB 13 Thomas Gleixner
                     ` (2 preceding siblings ...)
  2018-04-30  2:36   ` Konrad Rzeszutek Wilk
@ 2018-04-30 17:28   ` Konrad Rzeszutek Wilk
  3 siblings, 0 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30 17:28 UTC (permalink / raw)
  To: speck

> TODO: Tidy up spec_ctrl.rst and write a man prctl(2) patch.
> 

The .rst document looks good.

Below is a patch to the man pages.

From 5f3440b1025beb526b7a0c60e4e98d7b595b30b1 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Mon, 30 Apr 2018 13:25:20 -0400
Subject: [PATCH] SSB MANPAGE #1

prctl.2: Document PR_[SET|GET]_SPECULATION_CTRL.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v8: New patch
---
 man2/prctl.2 | 113 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 113 insertions(+)

diff --git a/man2/prctl.2 b/man2/prctl.2
index 54764d881..77ceeb03f 100644
--- a/man2/prctl.2
+++ b/man2/prctl.2
@@ -1008,6 +1008,82 @@ the "securebits" flags of the calling thread.
 See
 .BR capabilities (7).
 .TP
+.BR PR_GET_SPECULATION_CTRL
+Returns the state of the speculation misfeature which is selected with
+the value of
+.IR arg2 ,
+which must be
+.BR PR_SPEC_STORE_BYPASS .
+Otherwise the call fails with the error
+.BR ENODEV .
+The return value uses bits 0-2 with the following meaning:
+.RS
+.TP
+.BR PR_SPEC_PRCTL
+Mitigation can be controlled per task by
+.B PR_SET_SPECULATION_CTRL
+.TP
+.BR PR_SPEC_ENABLE
+The speculation feature is enabled, mitigation is disabled.
+.TP
+.BR PR_SPEC_DISABLE
+The speculation feature is disabled, mitigation is enabled.
+.RE
+.IP
+If all bits are
+.B 0
+then the CPU is not affected by the speculation misfeature.
+.IP
+If
+.B PR_SPEC_PRCTL
+is set, then the per task control of the mitigation is available. If not set,
+.B prctl()
+for the speculation misfeature will fail.
+In the above operation
+.IR arg3 ,
+.IR arg4 ,
+and
+.I arg5
+must be specified as 0,
+otherwise the call fails with the error
+.BR EUCLEAN .
+.TP
+.BR PR_SET_SPECULATION_CTRL
+Sets the state of the speculation misfeature which is selected with
+the value of
+.IR arg2 ,
+which must be
+.BR PR_SPEC_STORE_BYPASS .
+Otherwise the call fails with the error
+.BR ENODEV .
+This control is per task. The
+.IR arg3
+is used to hand in the control value, which can be either:
+.RS
+.TP
+.BR PR_SPEC_ENABLE
+The speculation feature is enabled, mitigation is disabled.
+.TP
+.BR PR_SPEC_DISABLE
+The speculation feature is disabled, mitigation is enabled.
+.RE
+.IP
+Any other value in
+.IR arg3
+will cause the call to fail with the error
+.BR ERANGE .
+.IP
+Furthermore, this speculation feature can also be controlled by the
+boot-time parameter
+.BR spec_store_bypass_disable= ,
+which can enforce a read-only policy;
+in that case the call fails
+with the error
+.BR ENXIO .
+Consult
+.B PR_GET_SPECULATION_CTRL
+for details on the possible enumerations.
+.TP
 .BR PR_SET_THP_DISABLE " (since Linux 3.15)"
 .\" commit a0715cc22601e8830ace98366c0c2bd8da52af52
 Set the state of the "THP disable" flag for the calling thread.
@@ -1501,6 +1577,12 @@ and
 .IR arg3
 does not specify a valid capability.
 .TP
+.B ENODEV
+.I option
+was
+.BR PR_SET_SPECULATION_CTRL
+and the kernel or CPU does not support the requested speculation misfeature.
+.TP
 .B ENXIO
 .I option
 was
@@ -1510,6 +1592,15 @@ or
 and the kernel or the CPU does not support MPX management.
 Check that the kernel and processor have MPX support.
 .TP
+.B ENXIO
+.I option
+was
+.BR PR_SET_SPECULATION_CTRL
+and control of the selected speculation misfeature is not possible.
+See
+.BR PR_GET_SPECULATION_CTRL
+for the bit fields to determine which option is available.
+.TP
 .B EOPNOTSUPP
 .I option
 is
@@ -1570,6 +1661,28 @@ is not present in the process's permitted and inheritable capability sets,
 or the
 .B PR_CAP_AMBIENT_LOWER
 securebit has been set.
+.TP
+.B ERANGE
+.I option
+was
+.BR PR_SET_SPECULATION_CTRL
+and
+.IR arg3
+is incorrect: it is neither
+.B PR_SPEC_ENABLE
+nor
+.BR PR_SPEC_DISABLE .
+.
+.TP
+.B EUCLEAN
+.I option
+was
+.BR PR_GET_SPECULATION_CTRL
+or
+.BR PR_SET_SPECULATION_CTRL
+and unused arguments to
+.B prctl()
+are not 0.
 .SH VERSIONS
 The
 .BR prctl ()
-- 
2.13.4


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 15/15] SBB 15
  2018-04-29 19:31 ` [patch V7 15/15] SBB 15 Thomas Gleixner
  2018-04-30  2:32   ` [MODERATED] " Konrad Rzeszutek Wilk
  2018-04-30 15:56   ` Konrad Rzeszutek Wilk
@ 2018-04-30 19:30   ` Tim Chen
  2018-04-30 19:36     ` Thomas Gleixner
  2018-04-30 20:09     ` [MODERATED] " Konrad Rzeszutek Wilk
  2 siblings, 2 replies; 50+ messages in thread
From: Tim Chen @ 2018-04-30 19:30 UTC (permalink / raw)
  To: speck


On 04/29/2018 12:31 PM, speck for Thomas Gleixner wrote:

>+static int sbb_prctl_set(unsigned long ctrl)
>+{
>+	bool rds = !!test_tsk_thread_flag(current, TIF_RDS);
>+
>+	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL)
>+		return -ENXIO;

Perhaps we should not use the prctl if we don't have
the SSB bug.  Something like:

	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL ||
	    !boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS))
		return -ENXIO; 

Thanks.

Tim
>+
>+	if (ctrl == PR_SPEC_ENABLE)
>+		clear_tsk_thread_flag(current, TIF_RDS);
>+	else
>+		set_tsk_thread_flag(current, TIF_RDS);
>+
>+	if (rds != !!test_tsk_thread_flag(current, TIF_RDS))
>+		speculative_store_bypass_update();
>+	return 0;
+}
+

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 15/15] SBB 15
  2018-04-30 19:30   ` [MODERATED] " Tim Chen
@ 2018-04-30 19:36     ` Thomas Gleixner
  2018-04-30 20:12       ` [MODERATED] " Tim Chen
  2018-04-30 20:09     ` [MODERATED] " Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-30 19:36 UTC (permalink / raw)
  To: speck

On Mon, 30 Apr 2018, speck for Tim Chen wrote:

> On 04/29/2018 12:31 PM, speck for Thomas Gleixner wrote:
> 
> >+static int sbb_prctl_set(unsigned long ctrl)
> >+{
> >+	bool rds = !!test_tsk_thread_flag(current, TIF_RDS);
> >+
> >+	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL)
> >+		return -ENXIO;
> 
> Perhaps we should not use the prctl if we don't have
> the SSB bug.  Something like:
> 
> 	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL ||
> 	    !boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS))
> 		return -ENXIO; 

What? If the CPU does not have the bug then the kernel does not enable
PRCTL mode at all unless

      1) RDS is available

AND

      2) The user enforced PRCTL mode on the kernel command line.

So what exactly are you trying to solve?

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 15/15] SBB 15
  2018-04-30 19:30   ` [MODERATED] " Tim Chen
  2018-04-30 19:36     ` Thomas Gleixner
@ 2018-04-30 20:09     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30 20:09 UTC (permalink / raw)
  To: speck

On Mon, Apr 30, 2018 at 12:30:18PM -0700, speck for Tim Chen wrote:
> On 04/29/2018 12:31 PM, speck for Thomas Gleixner wrote:
> 
> >+static int sbb_prctl_set(unsigned long ctrl)
> >+{
> >+	bool rds = !!test_tsk_thread_flag(current, TIF_RDS);
> >+
> >+	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL)
> >+		return -ENXIO;
> 
> Perhaps we should not use the prctl if we don't have
> the SSB bug.  Something like:

If you look in the previous patches you will see that you cannot
set SPEC_STORE_BYPASS_PRCTL at all if X86_BUG_SPEC_STORE_BYPASS
is not defined.

> 
> 	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL ||
> 	    !boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS))
> 		return -ENXIO; 
> 
> Thanks.
> 
> Tim
> >+
> >+	if (ctrl == PR_SPEC_ENABLE)
> >+		clear_tsk_thread_flag(current, TIF_RDS);
> >+	else
> >+		set_tsk_thread_flag(current, TIF_RDS);
> >+
> >+	if (rds != !!test_tsk_thread_flag(current, TIF_RDS))
> >+		speculative_store_bypass_update();
> >+	return 0;
> +}
> +

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 15/15] SBB 15
  2018-04-30 19:36     ` Thomas Gleixner
@ 2018-04-30 20:12       ` Tim Chen
  2018-04-30 20:20         ` Konrad Rzeszutek Wilk
  2018-04-30 20:28         ` Thomas Gleixner
  0 siblings, 2 replies; 50+ messages in thread
From: Tim Chen @ 2018-04-30 20:12 UTC (permalink / raw)
  To: speck


On 04/30/2018 12:36 PM, speck for Thomas Gleixner wrote:
> On Mon, 30 Apr 2018, speck for Tim Chen wrote:
> 
>> On 04/29/2018 12:31 PM, speck for Thomas Gleixner wrote:
>>
>>> +static int sbb_prctl_set(unsigned long ctrl)
>>> +{
>>> +	bool rds = !!test_tsk_thread_flag(current, TIF_RDS);
>>> +
>>> +	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL)
>>> +		return -ENXIO;
>>
>> Perhaps we should not use the prctl if we don't have
>> the SSB bug.  Something like:
>>
>> 	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL ||
>> 	    !boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS))
>> 		return -ENXIO; 
> 
> What? If the CPU does not have the bug then the kernel does not enable
> PRCTL mode at all unless
> 
>       1) RDS is available
> 
> AND
> 
>       2) The user enforced PRTCL mode on the kernel command line.
> 
> So what exactly are you trying to solve?

Right, we will only do the PRCTL for the case when the
admin explicitly asks for it on the kernel command line.

We'll let the admin have the choice to use the prctl if he
explicitly asks for it, even though it
is unnecessary for CPUs that don't have the SSB bug and
there will be some added cost in changing the MSR on context
switches.  I just want to be clear in my mind that this is the choice
we are making.

Tim

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 15/15] SBB 15
  2018-04-30 20:12       ` [MODERATED] " Tim Chen
@ 2018-04-30 20:20         ` Konrad Rzeszutek Wilk
  2018-04-30 20:44           ` Tim Chen
  2018-04-30 20:28         ` Thomas Gleixner
  1 sibling, 1 reply; 50+ messages in thread
From: Konrad Rzeszutek Wilk @ 2018-04-30 20:20 UTC (permalink / raw)
  To: speck

On Mon, Apr 30, 2018 at 01:12:28PM -0700, speck for Tim Chen wrote:
> On 04/30/2018 12:36 PM, speck for Thomas Gleixner wrote:
> > On Mon, 30 Apr 2018, speck for Tim Chen wrote:
> > 
> >> On 04/29/2018 12:31 PM, speck for Thomas Gleixner wrote:
> >>
> >>> +static int sbb_prctl_set(unsigned long ctrl)
> >>> +{
> >>> +	bool rds = !!test_tsk_thread_flag(current, TIF_RDS);
> >>> +
> >>> +	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL)
> >>> +		return -ENXIO;
> >>
> >> Perhaps we should not use the prctl if we don't have
> >> the SSB bug.  Something like:
> >>
> >> 	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL ||
> >> 	    !boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS))
> >> 		return -ENXIO; 
> > 
> > What? If the CPU does not have the bug then the kernel does not enable
> > PRCTL mode at all unless
> > 
> >       1) RDS is available
> > 
> > AND
> > 
> >       2) The user enforced PRCTL mode on the kernel command line.
> > 
> > So what exactly are you trying to solve?
> 
> Right, we will only do the PRCTL for the case when the
> admin explicitly asks for it on the kernel command line.

No. It will be automatically enabled. That is, '=auto'
sets prctl mode by default on Intel... You know what, why don't I just
paste the code for you:

        switch (cmd) {
        case SPEC_STORE_BYPASS_CMD_AUTO:
                /*
                 * AMD platforms by default don't need SSB mitigation.
                 */
                if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
                        break;
                /* Choose prctl as the default mode */
                mode = SPEC_STORE_BYPASS_PRCTL;
                break;


> 
> We'll let the admin have the choice to do prctl, if he
> explicitly ask for it. Even though it
> is unnecessary for cpus that don't have the SSB bug and
> there will be some added cost in changing the MSR on context
> switches.  Just want to clear in my mind that this is the choice
> we are making.
> 
> Tim

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [patch V7 15/15] SBB 15
  2018-04-30 20:12       ` [MODERATED] " Tim Chen
  2018-04-30 20:20         ` Konrad Rzeszutek Wilk
@ 2018-04-30 20:28         ` Thomas Gleixner
  1 sibling, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2018-04-30 20:28 UTC (permalink / raw)
  To: speck

On Mon, 30 Apr 2018, speck for Tim Chen wrote:
> On 04/30/2018 12:36 PM, speck for Thomas Gleixner wrote:
> > On Mon, 30 Apr 2018, speck for Tim Chen wrote:
> > 
> >> On 04/29/2018 12:31 PM, speck for Thomas Gleixner wrote:
> >>
> >>> +static int sbb_prctl_set(unsigned long ctrl)
> >>> +{
> >>> +	bool rds = !!test_tsk_thread_flag(current, TIF_RDS);
> >>> +
> >>> +	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL)
> >>> +		return -ENXIO;
> >>
> >> Perhaps we should not use the prctl if we don't have
> >> the SSB bug.  Something like:
> >>
> >> 	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL ||
> >> 	    !boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS))
> >> 		return -ENXIO; 
> > 
> > What? If the CPU does not have the bug then the kernel does not enable
> > PRCTL mode at all unless
> > 
> >       1) RDS is available
> > 
> > AND
> > 
> >       2) The user enforced PRCTL mode on the kernel command line.
> > 
> > So what exactly are you trying to solve?
> 
> Right, we will only do the PRCTL for the case when the
> admin explicitly asks for it on the kernel command line.
> 
> We'll let the admin have the choice to use the prctl if he
> explicitly asks for it, even though it
> is unnecessary for CPUs that don't have the SSB bug and
> there will be some added cost in changing the MSR on context
> switches.  I just want to be clear in my mind that this is the choice
> we are making.

You can also force KPTI on CPUs which are not affected. That's useful for
testing as well.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [MODERATED] Re: [patch V7 15/15] SBB 15
  2018-04-30 20:20         ` Konrad Rzeszutek Wilk
@ 2018-04-30 20:44           ` Tim Chen
  0 siblings, 0 replies; 50+ messages in thread
From: Tim Chen @ 2018-04-30 20:44 UTC (permalink / raw)
  To: speck


On 04/30/2018 01:20 PM, speck for Konrad Rzeszutek Wilk wrote:
> On Mon, Apr 30, 2018 at 01:12:28PM -0700, speck for Tim Chen wrote:
>> On 04/30/2018 12:36 PM, speck for Thomas Gleixner wrote:
>>> On Mon, 30 Apr 2018, speck for Tim Chen wrote:
>>>
>>>> On 04/29/2018 12:31 PM, speck for Thomas Gleixner wrote:
>>>>
>>>>> +static int sbb_prctl_set(unsigned long ctrl)
>>>>> +{
>>>>> +	bool rds = !!test_tsk_thread_flag(current, TIF_RDS);
>>>>> +
>>>>> +	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL)
>>>>> +		return -ENXIO;
>>>>
>>>> Perhaps we should not use the prctl if we don't have
>>>> the SSB bug.  Something like:
>>>>
>>>> 	if (ssb_mode != SPEC_STORE_BYPASS_PRCTL ||
>>>> 	    !boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS))
>>>> 		return -ENXIO; 
>>>
>>> What? If the CPU does not have the bug then the kernel does not enable
>>> PRCTL mode at all unless
>>>
>>>       1) RDS is available
>>>
>>> AND
>>>
>>>       2) The user enforced PRCTL mode on the kernel command line.
>>>
>>> So what exactly are you trying to solve?
>>
>> Right, we will only do the PRCTL for the case when the
>> admin explicitly asks for it on the kernel command line.
> 
> No. It will be automatically enabled. That is, '=auto'

If the CPU doesn't have the bug, the auto case will default
to SPEC_STORE_BYPASS_CMD_NONE.  So I'm not concerned about
that case.

        if (!boot_cpu_has_bug(X86_BUG_SPEC_STORE_BYPASS) &&
            (cmd == SPEC_STORE_BYPASS_CMD_NONE ||
             cmd == SPEC_STORE_BYPASS_CMD_AUTO))
                return mode;

The case that's of interest is prctl forced on the command line when the CPU
doesn't have the bug.  So if I understand Thomas' reply, we allow
setting the MSR in that case (though unnecessary), as the admin explicitly asked for it.

Thanks.

Tim

> sets prctl mode by default on Intel... You know what, why don't I just
> paste the code for you:
> 
>         switch (cmd) {
>         case SPEC_STORE_BYPASS_CMD_AUTO:
>                 /*
>                  * AMD platforms by default don't need SSB mitigation.
>                  */
>                 if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
>                         break;
>                 /* Choose prctl as the default mode */
>                 mode = SPEC_STORE_BYPASS_PRCTL;
>                 break;
> 
> 
>>
>> We'll let the admin have the choice to use the prctl if he
>> explicitly asks for it, even though it
>> is unnecessary for CPUs that don't have the SSB bug and
>> there will be some added cost in changing the MSR on context
>> switches.  I just want to be clear in my mind that this is the choice
>> we are making.
>>
>> Tim
> 
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2018-04-30 20:44 UTC | newest]

Thread overview: 50+ messages
2018-04-29 19:30 [patch V7 00/15] SBB 0 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 01/15] SBB 1 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 02/15] SBB 2 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 03/15] SBB 3 Thomas Gleixner
2018-04-29 23:31   ` [MODERATED] " Linus Torvalds
2018-04-30  2:50     ` Konrad Rzeszutek Wilk
2018-04-30  7:09     ` David Woodhouse
2018-04-29 19:30 ` [patch V7 04/15] SBB 4 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 05/15] SBB 5 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 06/15] SBB 6 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 07/15] SBB 7 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 08/15] SBB 8 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 09/15] SBB 9 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 10/15] SBB 10 Thomas Gleixner
2018-04-30  0:16   ` [MODERATED] " Konrad Rzeszutek Wilk
2018-04-30  7:49     ` Thomas Gleixner
2018-04-29 19:30 ` [patch V7 11/15] SBB 11 Thomas Gleixner
2018-04-29 19:30 ` [patch V7 12/15] SBB 12 Thomas Gleixner
2018-04-30  1:33   ` [MODERATED] " Konrad Rzeszutek Wilk
2018-04-29 19:30 ` [patch V7 13/15] SBB 13 Thomas Gleixner
2018-04-30  1:48   ` [MODERATED] " Konrad Rzeszutek Wilk
2018-04-30  2:39     ` Konrad Rzeszutek Wilk
2018-04-30  3:17     ` Jon Masters
2018-04-30  8:35       ` Thomas Gleixner
2018-04-30  2:20   ` [MODERATED] " Konrad Rzeszutek Wilk
2018-04-30  2:36   ` Konrad Rzeszutek Wilk
2018-04-30 17:28   ` Konrad Rzeszutek Wilk
2018-04-29 19:30 ` [patch V7 14/15] SBB 14 Thomas Gleixner
2018-04-30  2:14   ` [MODERATED] " Konrad Rzeszutek Wilk
2018-04-30  5:57     ` Thomas Gleixner
2018-04-30 15:49       ` [MODERATED] " Konrad Rzeszutek Wilk
2018-04-29 19:31 ` [patch V7 15/15] SBB 15 Thomas Gleixner
2018-04-30  2:32   ` [MODERATED] " Konrad Rzeszutek Wilk
2018-04-30 15:56   ` Konrad Rzeszutek Wilk
2018-04-30 16:07     ` Thomas Gleixner
2018-04-30 19:30   ` [MODERATED] " Tim Chen
2018-04-30 19:36     ` Thomas Gleixner
2018-04-30 20:12       ` [MODERATED] " Tim Chen
2018-04-30 20:20         ` Konrad Rzeszutek Wilk
2018-04-30 20:44           ` Tim Chen
2018-04-30 20:28         ` Thomas Gleixner
2018-04-30 20:09     ` [MODERATED] " Konrad Rzeszutek Wilk
2018-04-29 20:14 ` [patch V7 00/15] SBB 0 Thomas Gleixner
2018-04-29 20:35 ` [MODERATED] " Borislav Petkov
2018-04-29 20:46   ` Konrad Rzeszutek Wilk
2018-04-29 20:57     ` Thomas Gleixner
2018-04-29 21:40     ` [MODERATED] " Borislav Petkov
2018-04-29 20:55   ` Thomas Gleixner
2018-04-29 22:05     ` Thomas Gleixner
2018-04-30  0:06       ` [MODERATED] " Jon Masters
