linux-kernel.vger.kernel.org archive mirror
* [PATCH v5 0/9] x86/split_lock: Add feature split lock detection
@ 2020-03-15  5:05 Xiaoyao Li
  2020-03-15  5:05 ` [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of " Xiaoyao Li
                   ` (9 more replies)
  0 siblings, 10 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

This series aims to add the virtualization of split lock detection for
guests, and it also contains some fixes to the native kernel split lock
handling.

Note, this series is based on the kernel patch[1].

Patch 1 is a fix and enhancement for kernel split lock detection. It
ensures the X86_FEATURE_SPLIT_LOCK_DETECT flag is set only after verifying
that the feature is really supported, and it explicitly turns split lock
detection off for sld_off instead of assuming BIOS/firmware leaves it
cleared.

Patch 2 optimizes the runtime MSR accesses.

Patches 3-4 are preparation for enabling split lock detection
virtualization in KVM.

Patch 5 fixes an issue where a malicious guest may exploit the KVM
emulator to attack the host kernel.

Patch 6 handles the guest's split lock when the host turns split lock
detection on.

Patches 7-9 implement the virtualization of split lock detection in KVM.

[1]: https://lore.kernel.org/lkml/158031147976.396.8941798847364718785.tip-bot2@tip-bot2/ 
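
For background, a "split lock" is a LOCK-prefixed (atomic) operation whose
operand crosses a cache line boundary. As a purely illustrative user-space
sketch (not part of this series; the 64-byte cache line size and the chosen
offset are assumptions made only for the example), the following would
trigger the #AC once split lock detection is enabled in warn/fatal mode:

	#include <stdint.h>
	#include <stdlib.h>

	int main(void)
	{
		/* 128-byte buffer aligned to an assumed 64-byte cache line. */
		void *buf = aligned_alloc(64, 128);
		/* A 4-byte counter placed so it straddles the line boundary. */
		volatile uint32_t *p = (uint32_t *)((char *)buf + 62);

		/*
		 * LOCK-prefixed read-modify-write on data spanning two cache
		 * lines, i.e. a split lock.  With detection enabled the CPU
		 * raises #AC here instead of asserting the system-wide bus
		 * lock.
		 */
		__atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST);

		free(buf);
		return 0;
	}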

v5:
 - Use X86_FEATURE_SPLIT_LOCK_DETECT flag in kvm to ensure split lock
   detection is really supported.
 - Add and export SLD-related helper functions in the KVM patches that
   use them.

v4:
 - Add patch 1 to rework the initialization flow of split lock
   detection.
 - Drop percpu MSR_TEST_CTRL cache, just use a static variable to cache
   the reserved/unused bit of MSR_TEST_CTRL. [Sean]
 - Add new option for split_lock_detect kernel param.
 - Changelog refinement. [Sean]
 - Add a new patch to enable MSR_TEST_CTRL for intel guest. [Sean]

Xiaoyao Li (9):
  x86/split_lock: Rework the initialization flow of split lock detection
  x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR
  x86/split_lock: Re-define the kernel param option for
    split_lock_detect
  x86/split_lock: Export handle_user_split_lock()
  kvm: x86: Emulate split-lock access as a write
  kvm: vmx: Extend VMX's #AC interceptor to handle split lock #AC
    happens in guest
  kvm: x86: Emulate MSR IA32_CORE_CAPABILITIES
  kvm: vmx: Enable MSR_TEST_CTRL for intel guest
  x86: vmx: virtualize split lock detection

 .../admin-guide/kernel-parameters.txt         |   5 +-
 arch/x86/include/asm/cpu.h                    |  23 +++-
 arch/x86/include/asm/kvm_host.h               |   1 +
 arch/x86/kernel/cpu/intel.c                   | 119 +++++++++++++-----
 arch/x86/kernel/traps.c                       |   2 +-
 arch/x86/kvm/cpuid.c                          |   7 +-
 arch/x86/kvm/vmx/vmx.c                        |  75 ++++++++++-
 arch/x86/kvm/vmx/vmx.h                        |   1 +
 arch/x86/kvm/x86.c                            |  42 ++++++-
 9 files changed, 229 insertions(+), 46 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
@ 2020-03-15  5:05 ` Xiaoyao Li
  2020-03-21  0:41   ` Luck, Tony
  2020-03-23 17:02   ` Thomas Gleixner
  2020-03-15  5:05 ` [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR Xiaoyao Li
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

Current initialization flow of split lock detection has the following issues:
1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
   zero. However, it's possible that BIOS/firmware has set it.

2. The X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
   there is a virtualization flaw where FMS indicates the feature exists
   while it's actually not supported.

3. Because of #2, KVM cannot rely on the X86_FEATURE_SPLIT_LOCK_DETECT flag
   to verify whether the feature really exists, so it cannot expose it to
   the guest.

To solve these issues, introduce a new sld_state, "sld_not_exist", as
the default value. It will be switched to another value if CORE_CAPABILITIES
or FMS enumerates split lock detection.

Only when sld_state != sld_not_exist does it enter the initialization flow.

In the initialization flow, explicitly access MSR_TEST_CTRL and the
SPLIT_LOCK_DETECT bit to ensure there is no virtualization flaw, i.e.,
that split lock detection really is supported. In detail:
 1. sld_off,   verify SPLIT_LOCK_DETECT bit can be cleared, and clear it;
 2. sld_warn,  verify SPLIT_LOCK_DETECT bit can be cleared and set,
               and set it;
 3. sld_fatal, verify SPLIT_LOCK_DETECT bit can be set, and set it;

Only when no MSR access fails is X86_FEATURE_SPLIT_LOCK_DETECT set, so
KVM can use X86_FEATURE_SPLIT_LOCK_DETECT to check for the existence of
the feature.

Also, since the MSR and the bit are checked in split_lock_init(), there
is no need to use the safe versions of RDMSR/WRMSR at runtime.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/kernel/cpu/intel.c | 64 +++++++++++++++++++++++++------------
 1 file changed, 44 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index db3e745e5d47..064ba12defc8 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -34,17 +34,17 @@
 #endif
 
 enum split_lock_detect_state {
-	sld_off = 0,
+	sld_not_exist = 0,
+	sld_off,
 	sld_warn,
 	sld_fatal,
 };
 
 /*
- * Default to sld_off because most systems do not support split lock detection
  * split_lock_setup() will switch this to sld_warn on systems that support
  * split lock detect, unless there is a command line override.
  */
-static enum split_lock_detect_state sld_state = sld_off;
+static enum split_lock_detect_state sld_state = sld_not_exist;
 
 /*
  * Processors which have self-snooping capability can handle conflicting
@@ -585,7 +585,7 @@ static void init_intel_misc_features(struct cpuinfo_x86 *c)
 	wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
 }
 
-static void split_lock_init(void);
+static void split_lock_init(struct cpuinfo_x86 *c);
 
 static void init_intel(struct cpuinfo_x86 *c)
 {
@@ -702,7 +702,8 @@ static void init_intel(struct cpuinfo_x86 *c)
 	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
 		tsx_disable();
 
-	split_lock_init();
+	if (sld_state != sld_not_exist)
+		split_lock_init(c);
 }
 
 #ifdef CONFIG_X86_32
@@ -989,7 +990,6 @@ static void __init split_lock_setup(void)
 	char arg[20];
 	int i, ret;
 
-	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
 	sld_state = sld_warn;
 
 	ret = cmdline_find_option(boot_command_line, "split_lock_detect",
@@ -1015,6 +1015,8 @@ static void __init split_lock_setup(void)
 	case sld_fatal:
 		pr_info("sending SIGBUS on user-space split_locks\n");
 		break;
+	default:
+		break;
 	}
 }
 
@@ -1022,39 +1024,61 @@ static void __init split_lock_setup(void)
  * Locking is not required at the moment because only bit 29 of this
  * MSR is implemented and locking would not prevent that the operation
  * of one thread is immediately undone by the sibling thread.
- * Use the "safe" versions of rdmsr/wrmsr here because although code
- * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
- * exist, there may be glitches in virtualization that leave a guest
- * with an incorrect view of real h/w capabilities.
  */
-static bool __sld_msr_set(bool on)
+static void __sld_msr_set(bool on)
 {
 	u64 test_ctrl_val;
 
-	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
-		return false;
+	rdmsrl(MSR_TEST_CTRL, test_ctrl_val);
 
 	if (on)
 		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
 	else
 		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
 
-	return !wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val);
+	wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
 }
 
-static void split_lock_init(void)
+/*
+ * Use the "safe" versions of rdmsr/wrmsr here because although code
+ * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
+ * exist, there may be glitches in virtualization that leave a guest
+ * with an incorrect view of real h/w capabilities.
+ * If not msr_broken, then it needn't use "safe" version at runtime.
+ */
+static void split_lock_init(struct cpuinfo_x86 *c)
 {
-	if (sld_state == sld_off)
-		return;
+	u64 test_ctrl_val;
 
-	if (__sld_msr_set(true))
-		return;
+	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
+		goto msr_broken;
+
+	switch (sld_state) {
+	case sld_off:
+		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
+			goto msr_broken;
+		break;
+	case sld_warn:
+		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
+			goto msr_broken;
+		fallthrough;
+	case sld_fatal:
+		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val | MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
+			goto msr_broken;
+		break;
+	default:
+		break;
+	}
+
+	set_cpu_cap(c, X86_FEATURE_SPLIT_LOCK_DETECT);
+	return;
 
+msr_broken:
 	/*
 	 * If this is anything other than the boot-cpu, you've done
 	 * funny things and you get to keep whatever pieces.
 	 */
-	pr_warn("MSR fail -- disabled\n");
+	pr_warn_once("MSR fail -- disabled\n");
 	sld_state = sld_off;
 }
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
  2020-03-15  5:05 ` [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of " Xiaoyao Li
@ 2020-03-15  5:05 ` Xiaoyao Li
  2020-03-21  0:43   ` Luck, Tony
  2020-03-23 17:06   ` Thomas Gleixner
  2020-03-15  5:05 ` [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect Xiaoyao Li
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

In a context switch from a task that is detecting split locks
to one that is not (or vice versa) we need to update the TEST_CTRL
MSR. Currently this is done with the common sequence:
	read the MSR
	flip the bit
	write the MSR
in order to avoid changing the value of any reserved bits in the MSR.

Cache the value of the TEST_CTRL MSR when we read it during initialization
so we can avoid an expensive RDMSR instruction during context switch.
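
For reference, the runtime path that benefits is the context-switch hook
from the base series; a minimal sketch of it (assuming that hook's existing
signature, not code added by this patch) looks like:

	void switch_to_sld(unsigned long tifn)
	{
		/*
		 * With the soft copy of MSR_TEST_CTRL cached at init time,
		 * toggling the SPLIT_LOCK_DETECT bit at context switch is a
		 * single WRMSR; no RDMSR is needed to preserve the reserved
		 * bits.
		 */
		__sld_msr_set(!(tifn & _TIF_SLD));
	}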

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Originally-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/kernel/cpu/intel.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 064ba12defc8..4b3245035b5a 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1020,6 +1020,14 @@ static void __init split_lock_setup(void)
 	}
 }
 
+/*
+ * Soft copy of MSR_TEST_CTRL initialized when we first read the
+ * MSR. Used at runtime to avoid using rdmsr again just to collect
+ * the reserved bits in the MSR. We assume reserved bits are the
+ * same on all CPUs.
+ */
+static u64 test_ctrl_val;
+
 /*
  * Locking is not required at the moment because only bit 29 of this
  * MSR is implemented and locking would not prevent that the operation
@@ -1027,16 +1035,14 @@ static void __init split_lock_setup(void)
  */
 static void __sld_msr_set(bool on)
 {
-	u64 test_ctrl_val;
-
-	rdmsrl(MSR_TEST_CTRL, test_ctrl_val);
+	u64 val = test_ctrl_val;
 
 	if (on)
-		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+		val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
 	else
-		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+		val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
 
-	wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
+	wrmsrl(MSR_TEST_CTRL, val);
 }
 
 /*
@@ -1048,11 +1054,13 @@ static void __sld_msr_set(bool on)
  */
 static void split_lock_init(struct cpuinfo_x86 *c)
 {
-	u64 test_ctrl_val;
+	u64 val;
 
-	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
+	if (rdmsrl_safe(MSR_TEST_CTRL, &val))
 		goto msr_broken;
 
+	test_ctrl_val = val;
+
 	switch (sld_state) {
 	case sld_off:
 		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
  2020-03-15  5:05 ` [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of " Xiaoyao Li
  2020-03-15  5:05 ` [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR Xiaoyao Li
@ 2020-03-15  5:05 ` Xiaoyao Li
  2020-03-21  0:46   ` Luck, Tony
  2020-03-23 17:10   ` Thomas Gleixner
  2020-03-15  5:05 ` [PATCH v5 4/9] x86/split_lock: Export handle_user_split_lock() Xiaoyao Li
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

Change sld_off to sld_disable, which disables split lock detection so that
it can neither be used by the kernel nor be exposed to a guest by KVM. In
this case, X86_FEATURE_SPLIT_LOCK_DETECT is not set.

Add a new option, sld_kvm_only, which turns split lock detection off in
the kernel but still allows KVM to expose it to a guest.
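
A usage sketch, assuming this patch is applied: booting with

	split_lock_detect=kvm_only

leaves MSR_TEST_CTRL.SPLIT_LOCK_DETECT clear for host use, while KVM may
still expose split lock detection to guests (as enabled by the later
patches in this series) for debug/testing scenarios.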

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 .../admin-guide/kernel-parameters.txt         |  5 ++++-
 arch/x86/kernel/cpu/intel.c                   | 22 ++++++++++++++-----
 2 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1ee2d1e6d89a..2b922061ff08 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4666,7 +4666,10 @@
 			instructions that access data across cache line
 			boundaries will result in an alignment check exception.
 
-			off	- not enabled
+			disable	- disabled, neither kernel nor kvm can use it.
+
+			kvm_only - off in kernel but kvm can expose it to
+				   guest for debug/testing scenario.
 
 			warn	- the kernel will emit rate limited warnings
 				  about applications triggering the #AC
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 4b3245035b5a..3eeab717a0d0 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -35,7 +35,8 @@
 
 enum split_lock_detect_state {
 	sld_not_exist = 0,
-	sld_off,
+	sld_disable,
+	sld_kvm_only,
 	sld_warn,
 	sld_fatal,
 };
@@ -973,7 +974,8 @@ static const struct {
 	const char			*option;
 	enum split_lock_detect_state	state;
 } sld_options[] __initconst = {
-	{ "off",	sld_off   },
+	{ "disable",	sld_disable },
+	{ "kvm_only",	sld_kvm_only },
 	{ "warn",	sld_warn  },
 	{ "fatal",	sld_fatal },
 };
@@ -1004,10 +1006,14 @@ static void __init split_lock_setup(void)
 	}
 
 	switch (sld_state) {
-	case sld_off:
+	case sld_disable:
 		pr_info("disabled\n");
 		break;
 
+	case sld_kvm_only:
+		pr_info("off in kernel, but kvm can expose it to guest\n");
+		break;
+
 	case sld_warn:
 		pr_info("warning about user-space split_locks\n");
 		break;
@@ -1062,7 +1068,13 @@ static void split_lock_init(struct cpuinfo_x86 *c)
 	test_ctrl_val = val;
 
 	switch (sld_state) {
-	case sld_off:
+	case sld_disable:
+		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
+			goto msr_broken;
+		return;
+	case sld_kvm_only:
+		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val | MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
+			goto msr_broken;
 		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
 			goto msr_broken;
 		break;
@@ -1087,7 +1099,7 @@ static void split_lock_init(struct cpuinfo_x86 *c)
 	 * funny things and you get to keep whatever pieces.
 	 */
 	pr_warn_once("MSR fail -- disabled\n");
-	sld_state = sld_off;
+	sld_state = sld_disable;
 }
 
 bool handle_user_split_lock(struct pt_regs *regs, long error_code)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 4/9] x86/split_lock: Export handle_user_split_lock()
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
                   ` (2 preceding siblings ...)
  2020-03-15  5:05 ` [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect Xiaoyao Li
@ 2020-03-15  5:05 ` Xiaoyao Li
  2020-03-21  0:48   ` Luck, Tony
  2020-03-15  5:05 ` [PATCH v5 5/9] kvm: x86: Emulate split-lock access as a write Xiaoyao Li
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

In the future, KVM will use handle_user_split_lock() to handle #AC
caused by a split lock in the guest. Since KVM doesn't have a @regs
context and will pre-check EFLAGS.AC, move the EFLAGS.AC check to
do_alignment_check().

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/include/asm/cpu.h  | 4 ++--
 arch/x86/kernel/cpu/intel.c | 7 ++++---
 arch/x86/kernel/traps.c     | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index ff6f3ca649b3..ff567afa6ee1 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -43,11 +43,11 @@ unsigned int x86_stepping(unsigned int sig);
 #ifdef CONFIG_CPU_SUP_INTEL
 extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
-extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
+extern bool handle_user_split_lock(unsigned long ip);
 #else
 static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
 static inline void switch_to_sld(unsigned long tifn) {}
-static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code)
+static inline bool handle_user_split_lock(unsigned long ip)
 {
 	return false;
 }
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 3eeab717a0d0..c401d174c8db 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1102,13 +1102,13 @@ static void split_lock_init(struct cpuinfo_x86 *c)
 	sld_state = sld_disable;
 }
 
-bool handle_user_split_lock(struct pt_regs *regs, long error_code)
+bool handle_user_split_lock(unsigned long ip)
 {
-	if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
+	if (sld_state == sld_fatal)
 		return false;
 
 	pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n",
-			    current->comm, current->pid, regs->ip);
+			    current->comm, current->pid, ip);
 
 	/*
 	 * Disable the split lock detection for this task so it can make
@@ -1119,6 +1119,7 @@ bool handle_user_split_lock(struct pt_regs *regs, long error_code)
 	set_tsk_thread_flag(current, TIF_SLD);
 	return true;
 }
+EXPORT_SYMBOL_GPL(handle_user_split_lock);
 
 /*
  * This function is called only when switching between tasks with
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 0ef5befaed7d..407ff9be610f 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -304,7 +304,7 @@ dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code)
 
 	local_irq_enable();
 
-	if (handle_user_split_lock(regs, error_code))
+	if (!(regs->flags & X86_EFLAGS_AC) && handle_user_split_lock(regs->ip))
 		return;
 
 	do_trap(X86_TRAP_AC, SIGBUS, "alignment check", regs,
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 5/9] kvm: x86: Emulate split-lock access as a write
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
                   ` (3 preceding siblings ...)
  2020-03-15  5:05 ` [PATCH v5 4/9] x86/split_lock: Export handle_user_split_lock() Xiaoyao Li
@ 2020-03-15  5:05 ` Xiaoyao Li
  2020-03-15  5:05 ` [PATCH v5 6/9] kvm: vmx: Extend VMX's #AC interceptor to handle split lock #AC happens in guest Xiaoyao Li
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

If split lock detection is on (warn/fatal), the #AC handler calls die()
when a split lock happens in the kernel.

A malicious guest can exploit the KVM emulator to trigger a split lock
#AC in the kernel [1]. So just emulate the access as a write if it's a
split-lock access (the same as when an access spans a page) to prevent a
malicious guest from attacking the kernel.

More discussion can be found in [2] and [3].

[1] https://lore.kernel.org/lkml/8c5b11c9-58df-38e7-a514-dc12d687b198@redhat.com/
[2] https://lkml.kernel.org/r/20200131200134.GD18946@linux.intel.com
[3] https://lkml.kernel.org/r/20200227001117.GX9940@linux.intel.com
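
As an illustration of the new check (the 64-byte cache line size and the
gpa value are assumptions made only for the example): for a 4-byte guest
write at gpa = 0x103e with split lock detection on,

	page_line_mask           = ~(64 - 1)      = ~0x3f
	(gpa + bytes - 1) & mask = 0x1041 & ~0x3f = 0x1040
	gpa               & mask = 0x103e & ~0x3f = 0x1000

The two results differ, so the access crosses a cache line and the emulator
falls back to emul_write (plain write emulation) instead of performing an
atomic cmpxchg that could raise a split lock #AC in the host.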

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/include/asm/cpu.h  | 2 ++
 arch/x86/kernel/cpu/intel.c | 6 ++++++
 arch/x86/kvm/x86.c          | 7 ++++++-
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index ff567afa6ee1..d2071f6a35ac 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -44,6 +44,7 @@ unsigned int x86_stepping(unsigned int sig);
 extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
 extern bool handle_user_split_lock(unsigned long ip);
+extern bool split_lock_detect_on(void);
 #else
 static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
 static inline void switch_to_sld(unsigned long tifn) {}
@@ -51,5 +52,6 @@ static inline bool handle_user_split_lock(unsigned long ip)
 {
 	return false;
 }
+static inline bool split_lock_detect_on(void) { return false; }
 #endif
 #endif /* _ASM_X86_CPU_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index c401d174c8db..de94957a11a4 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1102,6 +1102,12 @@ static void split_lock_init(struct cpuinfo_x86 *c)
 	sld_state = sld_disable;
 }
 
+bool split_lock_detect_on(void)
+{
+	return sld_state == sld_warn || sld_state == sld_fatal;
+}
+EXPORT_SYMBOL_GPL(split_lock_detect_on);
+
 bool handle_user_split_lock(unsigned long ip)
 {
 	if (sld_state == sld_fatal)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5de200663f51..1a0e6c0b1b39 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5873,6 +5873,7 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
 {
 	struct kvm_host_map map;
 	struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
+	u64 page_line_mask = PAGE_MASK;
 	gpa_t gpa;
 	char *kaddr;
 	bool exchanged;
@@ -5887,7 +5888,11 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt,
 	    (gpa & PAGE_MASK) == APIC_DEFAULT_PHYS_BASE)
 		goto emul_write;
 
-	if (((gpa + bytes - 1) & PAGE_MASK) != (gpa & PAGE_MASK))
+	if (split_lock_detect_on())
+		page_line_mask = ~(cache_line_size() - 1);
+
+	/* when write spans page or spans cache when SLD enabled */
+	if (((gpa + bytes - 1) & page_line_mask) != (gpa & page_line_mask))
 		goto emul_write;
 
 	if (kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map))
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 6/9] kvm: vmx: Extend VMX's #AC interceptor to handle split lock #AC happens in guest
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
                   ` (4 preceding siblings ...)
  2020-03-15  5:05 ` [PATCH v5 5/9] kvm: x86: Emulate split-lock access as a write Xiaoyao Li
@ 2020-03-15  5:05 ` Xiaoyao Li
  2020-03-15  5:05 ` [PATCH v5 7/9] kvm: x86: Emulate MSR IA32_CORE_CAPABILITIES Xiaoyao Li
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

There are two types of #AC that can be generated on Intel CPUs:
 1. legacy alignment check #AC;
 2. split lock #AC.

A legacy alignment check #AC can be injected into the guest if the guest
has enabled alignment checking.

When the host enables split lock detection, i.e., sld_warn or sld_fatal,
there will be an unexpected #AC in the guest that is intercepted by KVM,
because KVM doesn't virtualize this feature for the guest and the hardware
value of the MSR_TEST_CTRL.SLD bit stays unchanged while the vcpu is
running.

To handle this unexpected #AC, treat the guest just like host user mode
calling handle_user_split_lock():
 - If the host is sld_warn, warn and set TIF_SLD so that __switch_to_xtra()
   does the MSR_TEST_CTRL.SLD bit switching when control transfers to/from
   this vcpu.
 - If the host is sld_fatal, forward the #AC to userspace, similar to
   sending SIGBUS.
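
In other words, the resulting AC_VECTOR policy can be sketched as below
(illustrative only; it mirrors the handle_exception_nmi() hunk in this
patch rather than adding behavior, and the helper name is made up for the
sketch):

	static int handle_guest_ac_sketch(struct kvm_vcpu *vcpu, u32 error_code)
	{
		/* Legacy #AC, or host has SLD off: reflect it to the guest. */
		if (!split_lock_detect_on() ||
		    guest_cpu_alignment_check_enabled(vcpu)) {
			kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
			return 1;
		}

		/* sld_warn: warn, set TIF_SLD, and resume the guest. */
		if (handle_user_split_lock(kvm_rip_read(vcpu)))
			return 1;

		/* sld_fatal: forward the #AC to userspace (like SIGBUS). */
		return 0;
	}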

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 40b1e6138cd5..3fb132ad489d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4609,6 +4609,12 @@ static int handle_machine_check(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static inline bool guest_cpu_alignment_check_enabled(struct kvm_vcpu *vcpu)
+{
+	return vmx_get_cpl(vcpu) == 3 && kvm_read_cr0_bits(vcpu, X86_CR0_AM) &&
+	       (kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
+}
+
 static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -4674,9 +4680,6 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 		return handle_rmode_exception(vcpu, ex_no, error_code);
 
 	switch (ex_no) {
-	case AC_VECTOR:
-		kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
-		return 1;
 	case DB_VECTOR:
 		dr6 = vmcs_readl(EXIT_QUALIFICATION);
 		if (!(vcpu->guest_debug &
@@ -4705,6 +4708,27 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 		kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
 		kvm_run->debug.arch.exception = ex_no;
 		break;
+	case AC_VECTOR:
+		/*
+		 * Reflect #AC to the guest if it's expecting the #AC, i.e. has
+		 * legacy alignment check enabled.  Pre-check host split lock
+		 * support to avoid the VMREADs needed to check legacy #AC,
+		 * i.e. reflect the #AC if the only possible source is legacy
+		 * alignment checks.
+		 */
+		if (!split_lock_detect_on() ||
+		    guest_cpu_alignment_check_enabled(vcpu)) {
+			kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
+			return 1;
+		}
+
+		/*
+		 * Forward the #AC to userspace if kernel policy does not allow
+		 * temporarily disabling split lock detection.
+		 */
+		if (handle_user_split_lock(kvm_rip_read(vcpu)))
+			return 1;
+		fallthrough;
 	default:
 		kvm_run->exit_reason = KVM_EXIT_EXCEPTION;
 		kvm_run->ex.exception = ex_no;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 7/9] kvm: x86: Emulate MSR IA32_CORE_CAPABILITIES
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
                   ` (5 preceding siblings ...)
  2020-03-15  5:05 ` [PATCH v5 6/9] kvm: vmx: Extend VMX's #AC interceptor to handle split lock #AC happens in guest Xiaoyao Li
@ 2020-03-15  5:05 ` Xiaoyao Li
  2020-03-15  5:05 ` [PATCH v5 8/9] kvm: vmx: Enable MSR_TEST_CTRL for intel guest Xiaoyao Li
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

Emulate MSR_IA32_CORE_CAPABILITIES in software and unconditionally
advertise its support to userspace. Like MSR_IA32_ARCH_CAPABILITIES, it
is a feature-enumerating MSR and can be fully emulated regardless of
hardware support. Existence of CORE_CAPABILITIES is enumerated via
CPUID.(EAX=7H,ECX=0):EDX[30].

Note, support for individual features enumerated via CORE_CAPABILITIES,
e.g., split lock detection, will be added in future patches.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            |  7 ++++---
 arch/x86/kvm/x86.c              | 22 ++++++++++++++++++++++
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 98959e8cd448..d538c9f51a09 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -593,6 +593,7 @@ struct kvm_vcpu_arch {
 	u64 ia32_xss;
 	u64 microcode_version;
 	u64 arch_capabilities;
+	u64 core_capabilities;
 
 	/*
 	 * Paging state of the vcpu
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b1c469446b07..344cd605ecaa 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -374,7 +374,7 @@ static inline void do_cpuid_7_mask(struct kvm_cpuid_entry2 *entry, int index)
 	const u32 kvm_cpuid_7_0_edx_x86_features =
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
 		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
-		F(MD_CLEAR);
+		F(MD_CLEAR) | F(CORE_CAPABILITIES);
 
 	/* cpuid 7.1.eax */
 	const u32 kvm_cpuid_7_1_eax_x86_features =
@@ -409,10 +409,11 @@ static inline void do_cpuid_7_mask(struct kvm_cpuid_entry2 *entry, int index)
 		    boot_cpu_has(X86_FEATURE_AMD_SSBD))
 			entry->edx |= F(SPEC_CTRL_SSBD);
 		/*
-		 * We emulate ARCH_CAPABILITIES in software even
-		 * if the host doesn't support it.
+		 * ARCH_CAPABILITIES and CORE_CAPABILITIES are emulated in
+		 * software regardless of host support.
 		 */
 		entry->edx |= F(ARCH_CAPABILITIES);
+		entry->edx |= F(CORE_CAPABILITIES);
 		break;
 	case 1:
 		entry->eax &= kvm_cpuid_7_1_eax_x86_features;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1a0e6c0b1b39..72d4bfea8864 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1221,6 +1221,7 @@ static const u32 emulated_msrs_all[] = {
 	MSR_IA32_TSC_ADJUST,
 	MSR_IA32_TSCDEADLINE,
 	MSR_IA32_ARCH_CAPABILITIES,
+	MSR_IA32_CORE_CAPS,
 	MSR_IA32_MISC_ENABLE,
 	MSR_IA32_MCG_STATUS,
 	MSR_IA32_MCG_CTL,
@@ -1287,6 +1288,7 @@ static const u32 msr_based_features_all[] = {
 	MSR_F10H_DECFG,
 	MSR_IA32_UCODE_REV,
 	MSR_IA32_ARCH_CAPABILITIES,
+	MSR_IA32_CORE_CAPS,
 };
 
 static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
@@ -1340,12 +1342,20 @@ static u64 kvm_get_arch_capabilities(void)
 	return data;
 }
 
+static u64 kvm_get_core_capabilities(void)
+{
+	return 0;
+}
+
 static int kvm_get_msr_feature(struct kvm_msr_entry *msr)
 {
 	switch (msr->index) {
 	case MSR_IA32_ARCH_CAPABILITIES:
 		msr->data = kvm_get_arch_capabilities();
 		break;
+	case MSR_IA32_CORE_CAPS:
+		msr->data = kvm_get_core_capabilities();
+		break;
 	case MSR_IA32_UCODE_REV:
 		rdmsrl_safe(msr->index, &msr->data);
 		break;
@@ -2718,6 +2728,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		vcpu->arch.arch_capabilities = data;
 		break;
+	case MSR_IA32_CORE_CAPS:
+		if (!msr_info->host_initiated)
+			return 1;
+		vcpu->arch.core_capabilities = data;
+		break;
 	case MSR_EFER:
 		return set_efer(vcpu, msr_info);
 	case MSR_K7_HWCR:
@@ -3046,6 +3061,12 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		msr_info->data = vcpu->arch.arch_capabilities;
 		break;
+	case MSR_IA32_CORE_CAPS:
+		if (!msr_info->host_initiated &&
+		    !guest_cpuid_has(vcpu, X86_FEATURE_CORE_CAPABILITIES))
+			return 1;
+		msr_info->data = vcpu->arch.core_capabilities;
+		break;
 	case MSR_IA32_POWER_CTL:
 		msr_info->data = vcpu->arch.msr_ia32_power_ctl;
 		break;
@@ -9348,6 +9369,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 		goto free_guest_fpu;
 
 	vcpu->arch.arch_capabilities = kvm_get_arch_capabilities();
+	vcpu->arch.core_capabilities = kvm_get_core_capabilities();
 	vcpu->arch.msr_platform_info = MSR_PLATFORM_INFO_CPUID_FAULT;
 	kvm_vcpu_mtrr_init(vcpu);
 	vcpu_load(vcpu);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 8/9] kvm: vmx: Enable MSR_TEST_CTRL for intel guest
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
                   ` (6 preceding siblings ...)
  2020-03-15  5:05 ` [PATCH v5 7/9] kvm: x86: Emulate MSR IA32_CORE_CAPABILITIES Xiaoyao Li
@ 2020-03-15  5:05 ` Xiaoyao Li
  2020-03-15  5:05 ` [PATCH v5 9/9] x86: vmx: virtualize split lock detection Xiaoyao Li
  2020-03-23  2:18 ` [PATCH v5 0/9] x86/split_lock: Add feature " Xiaoyao Li
  9 siblings, 0 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

Only enable reading MSR_TEST_CTRL and writing zero to it. This makes
MSR_TEST_CTRL always available for Intel guests, but the guest cannot
write any value to it except zero.

This matches the fact that most Intel CPUs support MSR_TEST_CTRL, and it
also reduces the effort needed to handle wrmsr/rdmsr when exposing split
lock detection to the guest in the following patch.
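
As a behavioral sketch with only this patch applied (the #GP detail is an
assumption based on how KVM surfaces failed WRMSR emulation to the guest):
a guest RDMSR of MSR_TEST_CTRL returns 0, a guest WRMSR of 0 succeeds, and
a guest WRMSR of any nonzero value is rejected by vmx_set_msr(), which the
guest typically observes as a #GP.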

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 10 ++++++++++
 arch/x86/kvm/vmx/vmx.h |  1 +
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3fb132ad489d..107c873b23c2 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1831,6 +1831,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	u32 index;
 
 	switch (msr_info->index) {
+	case MSR_TEST_CTRL:
+		msr_info->data = vmx->msr_test_ctrl;
+		break;
 #ifdef CONFIG_X86_64
 	case MSR_FS_BASE:
 		msr_info->data = vmcs_readl(GUEST_FS_BASE);
@@ -1984,6 +1987,12 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	u32 index;
 
 	switch (msr_index) {
+	case MSR_TEST_CTRL:
+		if (data)
+			return 1;
+
+		vmx->msr_test_ctrl = data;
+		break;
 	case MSR_EFER:
 		ret = kvm_set_msr_common(vcpu, msr_info);
 		break;
@@ -4283,6 +4292,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 
 	vmx->rmode.vm86_active = 0;
 	vmx->spec_ctrl = 0;
+	vmx->msr_test_ctrl = 0;
 
 	vmx->msr_ia32_umwait_control = 0;
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index e64da06c7009..f679453dcab8 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -225,6 +225,7 @@ struct vcpu_vmx {
 #endif
 
 	u64		      spec_ctrl;
+	u64		      msr_test_ctrl;
 	u32		      msr_ia32_umwait_control;
 
 	u32 secondary_exec_control;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v5 9/9] x86: vmx: virtualize split lock detection
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
                   ` (7 preceding siblings ...)
  2020-03-15  5:05 ` [PATCH v5 8/9] kvm: vmx: Enable MSR_TEST_CTRL for intel guest Xiaoyao Li
@ 2020-03-15  5:05 ` Xiaoyao Li
  2020-03-23  2:18 ` [PATCH v5 0/9] x86/split_lock: Add feature " Xiaoyao Li
  9 siblings, 0 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-15  5:05 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

Since MSR_TEST_CTRL has per-core scope, i.e., the sibling threads in the
same physical CPU core share the same MSR, only advertise split lock
detection to the guest when SMT is disabled or unsupported, for
simplicity.

Below is a summary of how the guest behaves under different host
configurations:

  sld_fatal - hardware MSR_TEST_CTRL.SLD is always on when the vcpu is
              running, even though the guest thinks it sets/clears the
              MSR_TEST_CTRL.SLD bit successfully, i.e., SLD is forced on
              for the guest.

  sld_warn - hardware MSR_TEST_CTRL.SLD is left on until an #AC is
             intercepted with MSR_TEST_CTRL.SLD=0 in the guest, at which
             point the normal sld_warn rules apply, i.e., clear the
             MSR_TEST_CTRL.SLD bit and set TIF_SLD.
             If a vCPU associated with the task does VM-Enter with
             virtual MSR_TEST_CTRL.SLD=1, TIF_SLD is cleared, hardware
             MSR_TEST_CTRL.SLD is set again, and the cycle begins anew.

  sld_kvm_only - hardware MSR_TEST_CTRL.SLD is set on VM-Entry and cleared
                 on VM-Exit if the guest enables SLD, i.e., the guest's
                 virtual MSR_TEST_CTRL.SLD bit is set.

  sld_disable - the guest cannot see the split lock detection feature.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
 arch/x86/include/asm/cpu.h  | 19 +++++++++++++++-
 arch/x86/kernel/cpu/intel.c | 30 ++++++++++++-------------
 arch/x86/kvm/vmx/vmx.c      | 45 ++++++++++++++++++++++++++++++++-----
 arch/x86/kvm/x86.c          | 17 +++++++++++---
 4 files changed, 87 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index d2071f6a35ac..d868dfb4ac10 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -41,10 +41,25 @@ unsigned int x86_family(unsigned int sig);
 unsigned int x86_model(unsigned int sig);
 unsigned int x86_stepping(unsigned int sig);
 #ifdef CONFIG_CPU_SUP_INTEL
+enum split_lock_detect_state {
+	sld_not_exist = 0,
+	sld_disable,
+	sld_kvm_only,
+	sld_warn,
+	sld_fatal,
+};
+extern enum split_lock_detect_state sld_state;
+
+static inline bool split_lock_detect_on(void)
+{
+	return (sld_state == sld_warn) || (sld_state == sld_fatal);
+}
+
 extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
 extern bool handle_user_split_lock(unsigned long ip);
-extern bool split_lock_detect_on(void);
+extern void sld_msr_set(bool on);
+extern void sld_turn_back_on(void);
 #else
 static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
 static inline void switch_to_sld(unsigned long tifn) {}
@@ -53,5 +68,7 @@ static inline bool handle_user_split_lock(unsigned long ip)
 	return false;
 }
 static inline bool split_lock_detect_on(void) { return false; }
+static inline void sld_msr_set(bool on) {}
+static inline void sld_turn_back_on(void) {}
 #endif
 #endif /* _ASM_X86_CPU_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index de94957a11a4..de46e1d3f1c7 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -33,19 +33,12 @@
 #include <asm/apic.h>
 #endif
 
-enum split_lock_detect_state {
-	sld_not_exist = 0,
-	sld_disable,
-	sld_kvm_only,
-	sld_warn,
-	sld_fatal,
-};
-
 /*
  * split_lock_setup() will switch this to sld_warn on systems that support
  * split lock detect, unless there is a command line override.
  */
-static enum split_lock_detect_state sld_state = sld_not_exist;
+enum split_lock_detect_state sld_state = sld_not_exist;
+EXPORT_SYMBOL_GPL(sld_state);
 
 /*
  * Processors which have self-snooping capability can handle conflicting
@@ -1102,12 +1095,6 @@ static void split_lock_init(struct cpuinfo_x86 *c)
 	sld_state = sld_disable;
 }
 
-bool split_lock_detect_on(void)
-{
-	return sld_state == sld_warn || sld_state == sld_fatal;
-}
-EXPORT_SYMBOL_GPL(split_lock_detect_on);
-
 bool handle_user_split_lock(unsigned long ip)
 {
 	if (sld_state == sld_fatal)
@@ -1127,6 +1114,19 @@ bool handle_user_split_lock(unsigned long ip)
 }
 EXPORT_SYMBOL_GPL(handle_user_split_lock);
 
+void sld_msr_set(bool on)
+{
+	__sld_msr_set(on);
+}
+EXPORT_SYMBOL_GPL(sld_msr_set);
+
+void sld_turn_back_on(void)
+{
+	__sld_msr_set(true);
+	clear_tsk_thread_flag(current, TIF_SLD);
+}
+EXPORT_SYMBOL_GPL(sld_turn_back_on);
+
 /*
  * This function is called only when switching between tasks with
  * different split-lock detection modes. It sets the MSR for the
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 107c873b23c2..ddd2c7e6c214 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1819,6 +1819,22 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
 	}
 }
 
+static inline u64 vmx_msr_test_ctrl_valid_bits(struct kvm_vcpu *vcpu)
+{
+	u64 valid_bits = 0;
+
+	/*
+	 * Note: for guest, feature split lock detection can only be enumerated
+	 * through MSR_IA32_CORE_CAPABILITIES bit.
+	 * The FMS enumeration is invalid.
+	 */
+	if (vcpu->arch.core_capabilities &
+	    MSR_IA32_CORE_CAPS_SPLIT_LOCK_DETECT)
+		valid_bits |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+
+	return valid_bits;
+}
+
 /*
  * Reads an msr value (of 'msr_index') into 'pdata'.
  * Returns 0 on success, non-0 otherwise.
@@ -1988,7 +2004,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 
 	switch (msr_index) {
 	case MSR_TEST_CTRL:
-		if (data)
+		if (data & ~vmx_msr_test_ctrl_valid_bits(vcpu))
 			return 1;
 
 		vmx->msr_test_ctrl = data;
@@ -4625,6 +4641,11 @@ static inline bool guest_cpu_alignment_check_enabled(struct kvm_vcpu *vcpu)
 	       (kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
 }
 
+static inline bool guest_cpu_split_lock_detect_on(struct vcpu_vmx *vmx)
+{
+	return vmx->msr_test_ctrl & MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+}
+
 static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -4721,12 +4742,13 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 	case AC_VECTOR:
 		/*
 		 * Reflect #AC to the guest if it's expecting the #AC, i.e. has
-		 * legacy alignment check enabled.  Pre-check host split lock
-		 * support to avoid the VMREADs needed to check legacy #AC,
-		 * i.e. reflect the #AC if the only possible source is legacy
-		 * alignment checks.
+		 * legacy alignment check enabled or split lock detect enabled.
+		 * Pre-check host split lock support to avoid further check of
+		 * guest, i.e. reflect the #AC if host doesn't enable split lock
+		 * detection.
 		 */
 		if (!split_lock_detect_on() ||
+		    guest_cpu_split_lock_detect_on(vmx) ||
 		    guest_cpu_alignment_check_enabled(vcpu)) {
 			kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
 			return 1;
@@ -6619,6 +6641,14 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	 */
 	x86_spec_ctrl_set_guest(vmx->spec_ctrl, 0);
 
+	if (static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
+	    guest_cpu_split_lock_detect_on(vmx)) {
+		if (test_thread_flag(TIF_SLD))
+			sld_turn_back_on();
+		else if (!split_lock_detect_on())
+			sld_msr_set(true);
+	}
+
 	/* L1D Flush includes CPU buffer clear to mitigate MDS */
 	if (static_branch_unlikely(&vmx_l1d_should_flush))
 		vmx_l1d_flush(vcpu);
@@ -6653,6 +6683,11 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	x86_spec_ctrl_restore_host(vmx->spec_ctrl, 0);
 
+	if (static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) &&
+	    guest_cpu_split_lock_detect_on(vmx) &&
+	    !split_lock_detect_on())
+		sld_msr_set(false);
+
 	/* All fields are clean at this point */
 	if (static_branch_unlikely(&enable_evmcs))
 		current_evmcs->hv_clean_fields |=
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 72d4bfea8864..c9ad6569d1b6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1162,7 +1162,7 @@ static const u32 msrs_to_save_all[] = {
 #endif
 	MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
 	MSR_IA32_FEAT_CTL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
-	MSR_IA32_SPEC_CTRL,
+	MSR_IA32_SPEC_CTRL, MSR_TEST_CTRL,
 	MSR_IA32_RTIT_CTL, MSR_IA32_RTIT_STATUS, MSR_IA32_RTIT_CR3_MATCH,
 	MSR_IA32_RTIT_OUTPUT_BASE, MSR_IA32_RTIT_OUTPUT_MASK,
 	MSR_IA32_RTIT_ADDR0_A, MSR_IA32_RTIT_ADDR0_B,
@@ -1344,7 +1344,12 @@ static u64 kvm_get_arch_capabilities(void)
 
 static u64 kvm_get_core_capabilities(void)
 {
-	return 0;
+	u64 data = 0;
+
+	if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) && !cpu_smt_possible())
+		data |= MSR_IA32_CORE_CAPS_SPLIT_LOCK_DETECT;
+
+	return data;
 }
 
 static int kvm_get_msr_feature(struct kvm_msr_entry *msr)
@@ -2729,7 +2734,8 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		vcpu->arch.arch_capabilities = data;
 		break;
 	case MSR_IA32_CORE_CAPS:
-		if (!msr_info->host_initiated)
+		if (!msr_info->host_initiated ||
+		    data & ~kvm_get_core_capabilities())
 			return 1;
 		vcpu->arch.core_capabilities = data;
 		break;
@@ -5276,6 +5282,11 @@ static void kvm_init_msr_list(void)
 		 * to the guests in some cases.
 		 */
 		switch (msrs_to_save_all[i]) {
+		case MSR_TEST_CTRL:
+			if (!(kvm_get_core_capabilities() &
+			      MSR_IA32_CORE_CAPS_SPLIT_LOCK_DETECT))
+				continue;
+			break;
 		case MSR_IA32_BNDCFGS:
 			if (!kvm_mpx_supported())
 				continue;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-15  5:05 ` [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of " Xiaoyao Li
@ 2020-03-21  0:41   ` Luck, Tony
  2020-03-23 17:02   ` Thomas Gleixner
  1 sibling, 0 replies; 33+ messages in thread
From: Luck, Tony @ 2020-03-21  0:41 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel,
	Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Vitaly Kuznetsov, Jim Mattson

On Sun, Mar 15, 2020 at 01:05:09PM +0800, Xiaoyao Li wrote:
> To solve these issues, introduce a new sld_state, "sld_not_exist", as
> the default value. It will be switched to another value if CORE_CAPABILITIES
> or FMS enumerates split lock detection.

Is a better name for this state "sld_uninitialized"?

Otherwise looks good.

Reviewed-by: Tony Luck <tony.luck@intel.com>

-Tony

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR
  2020-03-15  5:05 ` [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR Xiaoyao Li
@ 2020-03-21  0:43   ` Luck, Tony
  2020-03-23 17:06     ` Thomas Gleixner
  2020-03-23 17:06   ` Thomas Gleixner
  1 sibling, 1 reply; 33+ messages in thread
From: Luck, Tony @ 2020-03-21  0:43 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel,
	Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Vitaly Kuznetsov, Jim Mattson

On Sun, Mar 15, 2020 at 01:05:10PM +0800, Xiaoyao Li wrote:
> In a context switch from a task that is detecting split locks
> to one that is not (or vice versa) we need to update the TEST_CTRL
> MSR. Currently this is done with the common sequence:
> 	read the MSR
> 	flip the bit
> 	write the MSR
> in order to avoid changing the value of any reserved bits in the MSR.
> 
> Cache the value of the TEST_CTRL MSR when we read it during initialization
> so we can avoid an expensive RDMSR instruction during context switch.
> 
> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Originally-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>

Is it bad form to Ack/Review patches originally by oneself?

Whatever:

Acked-by: Tony Luck <tony.luck@intel.com>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect
  2020-03-15  5:05 ` [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect Xiaoyao Li
@ 2020-03-21  0:46   ` Luck, Tony
  2020-03-23 17:10   ` Thomas Gleixner
  1 sibling, 0 replies; 33+ messages in thread
From: Luck, Tony @ 2020-03-21  0:46 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel,
	Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Vitaly Kuznetsov, Jim Mattson

On Sun, Mar 15, 2020 at 01:05:11PM +0800, Xiaoyao Li wrote:
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4666,7 +4666,10 @@
>  			instructions that access data across cache line
>  			boundaries will result in an alignment check exception.
>  
> -			off	- not enabled
> +			disable	- disabled, neither kernel nor kvm can use it.

Are command line arguments "ABI"?  The "=off" variant isn't upstream yet,
but it is in TIP.  I'm ok with this change, but perhaps this patch (or at
least this part of this patch) needs to catch up with the older one within
the 5.7 merge window (or sometime before v5.7).

Reviewed-by: Tony Luck <tony.luck@intel.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 4/9] x86/split_lock: Export handle_user_split_lock()
  2020-03-15  5:05 ` [PATCH v5 4/9] x86/split_lock: Export handle_user_split_lock() Xiaoyao Li
@ 2020-03-21  0:48   ` Luck, Tony
  0 siblings, 0 replies; 33+ messages in thread
From: Luck, Tony @ 2020-03-21  0:48 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel,
	Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Vitaly Kuznetsov, Jim Mattson

On Sun, Mar 15, 2020 at 01:05:12PM +0800, Xiaoyao Li wrote:
> In the future, KVM will use handle_user_split_lock() to handle #AC
> caused by a split lock in the guest. Since KVM doesn't have a @regs
> context and will pre-check EFLAGS.AC, move the EFLAGS.AC check to
> do_alignment_check().
> 
> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> ---
>  arch/x86/include/asm/cpu.h  | 4 ++--
>  arch/x86/kernel/cpu/intel.c | 7 ++++---
>  arch/x86/kernel/traps.c     | 2 +-
>  3 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
> index ff6f3ca649b3..ff567afa6ee1 100644
> --- a/arch/x86/include/asm/cpu.h
> +++ b/arch/x86/include/asm/cpu.h
> @@ -43,11 +43,11 @@ unsigned int x86_stepping(unsigned int sig);
>  #ifdef CONFIG_CPU_SUP_INTEL
>  extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
>  extern void switch_to_sld(unsigned long tifn);
> -extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
> +extern bool handle_user_split_lock(unsigned long ip);
>  #else
>  static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
>  static inline void switch_to_sld(unsigned long tifn) {}
> -static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code)
> +static inline bool handle_user_split_lock(unsigned long ip)
>  {
>  	return false;
>  }
> diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
> index 3eeab717a0d0..c401d174c8db 100644
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -1102,13 +1102,13 @@ static void split_lock_init(struct cpuinfo_x86 *c)
>  	sld_state = sld_disable;
>  }
>  
> -bool handle_user_split_lock(struct pt_regs *regs, long error_code)
> +bool handle_user_split_lock(unsigned long ip)
>  {
> -	if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
> +	if (sld_state == sld_fatal)
>  		return false;
>  
>  	pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n",
> -			    current->comm, current->pid, regs->ip);
> +			    current->comm, current->pid, ip);
>  
>  	/*
>  	 * Disable the split lock detection for this task so it can make
> @@ -1119,6 +1119,7 @@ bool handle_user_split_lock(struct pt_regs *regs, long error_code)
>  	set_tsk_thread_flag(current, TIF_SLD);
>  	return true;
>  }
> +EXPORT_SYMBOL_GPL(handle_user_split_lock);
>  
>  /*
>   * This function is called only when switching between tasks with
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 0ef5befaed7d..407ff9be610f 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -304,7 +304,7 @@ dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code)
>  
>  	local_irq_enable();
>  
> -	if (handle_user_split_lock(regs, error_code))
> +	if (!(regs->flags & X86_EFLAGS_AC) && handle_user_split_lock(regs->ip))
>  		return;
>  
>  	do_trap(X86_TRAP_AC, SIGBUS, "alignment check", regs,
> -- 
> 2.20.1
> 

Looks good to me.

Reviewed-by: Tony Luck <tony.luck@intel.com>


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 0/9] x86/split_lock: Add feature split lock detection
  2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
                   ` (8 preceding siblings ...)
  2020-03-15  5:05 ` [PATCH v5 9/9] x86: vmx: virtualize split lock detection Xiaoyao Li
@ 2020-03-23  2:18 ` Xiaoyao Li
  9 siblings, 0 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-23  2:18 UTC (permalink / raw)
  To: Thomas Gleixner, Paolo Bonzini
  Cc: Ingo Molnar, Borislav Petkov, hpa, Sean Christopherson, kvm, x86,
	linux-kernel, Andy Lutomirski, Peter Zijlstra, Arvind Sankar,
	Fenghua Yu, Tony Luck, Vitaly Kuznetsov, Jim Mattson

Hi Thomas and Paolo,

On 3/15/2020 1:05 PM, Xiaoyao Li wrote:
> This series aims to add the virtualization of split lock detection for
> guests, and it also contains some fixes to the native kernel split lock
> handling.
> 
> Note, this series is based on the kernel patch[1].

Do you have any comments on this series?

If not, may I ask whether this can make it into 5.7?

Thanks,
-Xiaoyao


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-15  5:05 ` [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of " Xiaoyao Li
  2020-03-21  0:41   ` Luck, Tony
@ 2020-03-23 17:02   ` Thomas Gleixner
  2020-03-23 20:24     ` Thomas Gleixner
  2020-03-24 11:51     ` Xiaoyao Li
  1 sibling, 2 replies; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-23 17:02 UTC (permalink / raw)
  To: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

Xiaoyao Li <xiaoyao.li@intel.com> writes:

> Current initialization flow of split lock detection has the following issues:
> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>    zero. However, it's possible that BIOS/firmware has set it.

Ok.

> 2. The X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>    there is a virtualization flaw where FMS indicates the feature exists
>    while it's actually not supported.
>
> 3. Because of #2, KVM cannot rely on the X86_FEATURE_SPLIT_LOCK_DETECT flag
>    to verify whether the feature really exists, so it cannot expose it to
>    the guest.

Sorry this does not make any sense. KVM is the hypervisor, so it better
can rely on the detect flag. Unless you talk about nested virt and a
broken L1 hypervisor.

> To solve these issues, introducing a new sld_state, "sld_not_exist",
> as

The usual naming convention is sld_not_supported.

> the default value. It will be switched to other value if CORE_CAPABILITIES
> or FMS enumerate split lock detection.
>
> Only when sld_state != sld_not_exist, it goes to initialization flow.
>
> In initialization flow, it explicitly accesses MSR_TEST_CTRL and
> SPLIT_LOCK_DETECT bit to ensure there is no virtualization flaw, i.e.,
> feature split lock detection does supported. In detail,
>  1. sld_off,   verify SPLIT_LOCK_DETECT bit can be cleared, and clear it;

That's not what the patch does. It writes with the bit cleared and the
only thing it checks is whether the wrmsrl fails or not. Verification is
something completely different.
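
For reference, verification in this sense means writing the intended value and
then reading it back to confirm the bit actually took. A minimal sketch of the
idea (the helper name is illustrative; a concrete split_lock_verify_msr()
appears further down this thread):

static bool __init sld_msr_write_and_verify(bool on)
{
	u64 ctrl, readback;

	if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
		return false;

	if (on)
		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
	else
		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;

	if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
		return false;

	/* The read-back is what turns a blind write into verification. */
	rdmsrl(MSR_TEST_CTRL, readback);
	return readback == ctrl;
}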

>  2. sld_warn,  verify SPLIT_LOCK_DETECT bit can be cleared and set,
>                and set it;
>  3. sld_fatal, verify SPLIT_LOCK_DETECT bit can be set, and set it;
>
> Only when no MSR aceessing failure, can X86_FEATURE_SPLIT_LOCK_DETECT be
> set. So kvm can use X86_FEATURE_SPLIT_LOCK_DETECT to check the existence
> of feature.

Again, this has nothing to do with KVM. 

>   * Processors which have self-snooping capability can handle conflicting
> @@ -585,7 +585,7 @@ static void init_intel_misc_features(struct cpuinfo_x86 *c)
>  	wrmsrl(MSR_MISC_FEATURES_ENABLES, msr);
>  }
>  
> -static void split_lock_init(void);
> +static void split_lock_init(struct cpuinfo_x86 *c);
>  
>  static void init_intel(struct cpuinfo_x86 *c)
>  {
> @@ -702,7 +702,8 @@ static void init_intel(struct cpuinfo_x86 *c)
>  	if (tsx_ctrl_state == TSX_CTRL_DISABLE)
>  		tsx_disable();
>  
> -	split_lock_init();
> +	if (sld_state != sld_not_exist)
> +		split_lock_init(c);

That conditional wants to be in split_lock_init() where it used to be.

> +/*
> + * Use the "safe" versions of rdmsr/wrmsr here because although code
> + * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
> + * exist, there may be glitches in virtualization that leave a guest
> + * with an incorrect view of real h/w capabilities.
> + * If not msr_broken, then it needn't use "safe" version at runtime.
> + */
> +static void split_lock_init(struct cpuinfo_x86 *c)
>  {
> -	if (sld_state == sld_off)
> -		return;
> +	u64 test_ctrl_val;
>  
> -	if (__sld_msr_set(true))
> -		return;
> +	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
> +		goto msr_broken;
> +
> +	switch (sld_state) {
> +	case sld_off:
> +		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
> +			goto msr_broken;
> +		break;
> +	case sld_warn:
> +		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
> +			goto msr_broken;
> +		fallthrough;
> +	case sld_fatal:
> +		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val | MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
> +			goto msr_broken;
> +		break;

This does not make any sense either. Why do it any differently for warn
and fatal?
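
In other words, warn and fatal both want the bit set, so one write path can
cover them. A sketch of what the collapsed logic might look like (function
name illustrative, not the series' code):

static int split_lock_apply_state(u64 ctrl)
{
	/*
	 * sld_warn and sld_fatal both want SPLIT_LOCK_DETECT set; only
	 * sld_off clears it, so a single write per state is enough.
	 */
	if (sld_state == sld_off)
		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
	else
		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;

	return wrmsrl_safe(MSR_TEST_CTRL, ctrl);
}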

> +	default:
> +		break;

If there is ever a state added, then default will just fall through and
possibly nobody notices because the compiler does not complain.
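
Dropping the default arm is what lets the compiler complain. A small sketch of
the pattern (the function and strings are illustrative only):

static const char *sld_state_name(enum split_lock_detect_state s)
{
	/*
	 * No default arm: -Wswitch (enabled by -Wall) warns if a new enum
	 * value is added but not handled here, instead of silently falling
	 * through at run time.
	 */
	switch (s) {
	case sld_off:
		return "off";
	case sld_warn:
		return "warn";
	case sld_fatal:
		return "fatal";
	}
	return "unknown";
}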

> +	}
> +
> +	set_cpu_cap(c, X86_FEATURE_SPLIT_LOCK_DETECT);
> +	return;
>  
> +msr_broken:
>  	/*
>  	 * If this is anything other than the boot-cpu, you've done
>  	 * funny things and you get to keep whatever pieces.
>  	 */
> -	pr_warn("MSR fail -- disabled\n");
> +	pr_warn_once("MSR fail -- disabled\n");
>  	sld_state = sld_off;

So you run this on every CPU. What's the point? If the hypervisor is so
broken that the MSR works on CPU0 but not on CPU1 then this is probably
the least of your worries.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR
  2020-03-15  5:05 ` [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR Xiaoyao Li
  2020-03-21  0:43   ` Luck, Tony
@ 2020-03-23 17:06   ` Thomas Gleixner
  2020-03-24  1:16     ` Xiaoyao Li
  1 sibling, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-23 17:06 UTC (permalink / raw)
  To: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

Xiaoyao Li <xiaoyao.li@intel.com> writes:
> +/*
> + * Soft copy of MSR_TEST_CTRL initialized when we first read the
> + * MSR. Used at runtime to avoid using rdmsr again just to collect
> + * the reserved bits in the MSR. We assume reserved bits are the
> + * same on all CPUs.
> + */
> +static u64 test_ctrl_val;
> +
>  /*
>   * Locking is not required at the moment because only bit 29 of this
>   * MSR is implemented and locking would not prevent that the operation
> @@ -1027,16 +1035,14 @@ static void __init split_lock_setup(void)
>   */
>  static void __sld_msr_set(bool on)
>  {
> -	u64 test_ctrl_val;
> -
> -	rdmsrl(MSR_TEST_CTRL, test_ctrl_val);
> +	u64 val = test_ctrl_val;
>  
>  	if (on)
> -		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +		val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
>  	else
> -		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +		val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
>  
> -	wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
> +	wrmsrl(MSR_TEST_CTRL, val);
>  }
>  
>  /*
> @@ -1048,11 +1054,13 @@ static void __sld_msr_set(bool on)
>   */
>  static void split_lock_init(struct cpuinfo_x86 *c)
>  {
> -	u64 test_ctrl_val;
> +	u64 val;
>  
> -	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
> +	if (rdmsrl_safe(MSR_TEST_CTRL, &val))
>  		goto msr_broken;
>  
> +	test_ctrl_val = val;
> +
>  	switch (sld_state) {
>  	case sld_off:
>  		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))

That's just broken. Simply because

       case sld_warn:
       case sld_fatal:

set the split lock detect bit, but the cache variable has it cleared
unless it was set at boot time already.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR
  2020-03-21  0:43   ` Luck, Tony
@ 2020-03-23 17:06     ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-23 17:06 UTC (permalink / raw)
  To: Luck, Tony, Xiaoyao Li
  Cc: Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel, Andy Lutomirski,
	Peter Zijlstra, Arvind Sankar, Fenghua Yu, Vitaly Kuznetsov,
	Jim Mattson

"Luck, Tony" <tony.luck@intel.com> writes:

> On Sun, Mar 15, 2020 at 01:05:10PM +0800, Xiaoyao Li wrote:
>> In a context switch from a task that is detecting split locks
>> to one that is not (or vice versa) we need to update the TEST_CTRL
>> MSR. Currently this is done with the common sequence:
>> 	read the MSR
>> 	flip the bit
>> 	write the MSR
>> in order to avoid changing the value of any reserved bits in the MSR.
>> 
>> Cache the value of the TEST_CTRL MSR when we read it during initialization
>> so we can avoid an expensive RDMSR instruction during context switch.
>> 
>> Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
>> Originally-by: Tony Luck <tony.luck@intel.com>
>> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>
> Is it bad form to Ack/Review patches originally by oneself?

Only if they are broken ....

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect
  2020-03-15  5:05 ` [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect Xiaoyao Li
  2020-03-21  0:46   ` Luck, Tony
@ 2020-03-23 17:10   ` Thomas Gleixner
  2020-03-24  1:38     ` Xiaoyao Li
  1 sibling, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-23 17:10 UTC (permalink / raw)
  To: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

Xiaoyao Li <xiaoyao.li@intel.com> writes:

> Change sld_off to sld_disable, which means disabling feature split lock
> detection and it cannot be used in kernel nor can kvm expose it guest.
> Of course, the X86_FEATURE_SPLIT_LOCK_DETECT is not set.
>
> Add a new optioin sld_kvm_only, which means kernel turns split lock
> detection off, but kvm can expose it to guest.

What's the point of this? If the host is not clean, then you better fix
the host first before trying to expose it to guests.

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-23 17:02   ` Thomas Gleixner
@ 2020-03-23 20:24     ` Thomas Gleixner
  2020-03-24  1:10       ` Xiaoyao Li
  2020-03-24 11:51     ` Xiaoyao Li
  1 sibling, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-23 20:24 UTC (permalink / raw)
  To: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson, Xiaoyao Li

Thomas Gleixner <tglx@linutronix.de> writes:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>
>> Current initialization flow of split lock detection has following issues:
>> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>>    zero. However, it's possible that BIOS/firmware has set it.
>
> Ok.
>
>> 2. X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>>    there is a virtualization flaw that FMS indicates the existence while
>>    it's actually not supported.
>>
>> 3. Because of #2, KVM cannot rely on X86_FEATURE_SPLIT_LOCK_DETECT flag
>>    to check verify if feature does exist, so cannot expose it to
>>    guest.
>
> Sorry this does not make anny sense. KVM is the hypervisor, so it better
> can rely on the detect flag. Unless you talk about nested virt and a
> broken L1 hypervisor.
>
>> To solve these issues, introducing a new sld_state, "sld_not_exist",
>> as
>
> The usual naming convention is sld_not_supported.

But this extra state is not needed at all, it already exists:

    X86_FEATURE_SPLIT_LOCK_DETECT

You just need to make split_lock_setup() a bit smarter. Something like
the below. It just wants to be split into separate patches.

Thanks,

        tglx
---
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -45,6 +45,7 @@ enum split_lock_detect_state {
  * split lock detect, unless there is a command line override.
  */
 static enum split_lock_detect_state sld_state = sld_off;
+static DEFINE_PER_CPU(u64, msr_test_ctrl_cache);
 
 /*
  * Processors which have self-snooping capability can handle conflicting
@@ -984,11 +985,32 @@ static inline bool match_option(const ch
 	return len == arglen && !strncmp(arg, opt, len);
 }
 
+static bool __init split_lock_verify_msr(bool on)
+{
+	u64 ctrl, tmp;
+
+	if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
+		return false;
+	if (on)
+		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	else
+		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
+		return false;
+	rdmsrl(MSR_TEST_CTRL, tmp);
+	return ctrl == tmp;
+}
+
 static void __init split_lock_setup(void)
 {
 	char arg[20];
 	int i, ret;
 
+	if (!split_lock_verify_msr(true) || !split_lock_verify_msr(false)) {
+		pr_info("MSR access failed: Disabled\n");
+		return;
+	}
+
 	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
 	sld_state = sld_warn;
 
@@ -1007,7 +1029,6 @@ static void __init split_lock_setup(void
 	case sld_off:
 		pr_info("disabled\n");
 		break;
-
 	case sld_warn:
 		pr_info("warning about user-space split_locks\n");
 		break;
@@ -1018,44 +1039,40 @@ static void __init split_lock_setup(void
 	}
 }
 
-/*
- * Locking is not required at the moment because only bit 29 of this
- * MSR is implemented and locking would not prevent that the operation
- * of one thread is immediately undone by the sibling thread.
- * Use the "safe" versions of rdmsr/wrmsr here because although code
- * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
- * exist, there may be glitches in virtualization that leave a guest
- * with an incorrect view of real h/w capabilities.
- */
-static bool __sld_msr_set(bool on)
+static void split_lock_init(void)
 {
-	u64 test_ctrl_val;
+	u64 ctrl;
 
-	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
-		return false;
+	if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
+		return;
 
-	if (on)
-		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	rdmsrl(MSR_TEST_CTRL, ctrl);
+	if (sld_state == sld_off)
+		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
 	else
-		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
-
-	return !wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val);
+		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	wrmsrl(MSR_TEST_CTRL, ctrl);
+	this_cpu_write(msr_test_ctrl_cache, ctrl);
 }
 
-static void split_lock_init(void)
+/*
+ * MSR_TEST_CTRL is per core, but we treat it like a per CPU MSR. Locking
+ * is not implemented as one thread could undo the setting of the other
+ * thread immediately after dropping the lock anyway.
+ */
+static void msr_test_ctrl_update(bool on, u64 mask)
 {
-	if (sld_state == sld_off)
-		return;
+	u64 tmp, ctrl = this_cpu_read(msr_test_ctrl_cache);
 
-	if (__sld_msr_set(true))
-		return;
+	if (on)
+		tmp = ctrl | mask;
+	else
+		tmp = ctrl & ~mask;
 
-	/*
-	 * If this is anything other than the boot-cpu, you've done
-	 * funny things and you get to keep whatever pieces.
-	 */
-	pr_warn("MSR fail -- disabled\n");
-	sld_state = sld_off;
+	if (tmp != ctrl) {
+		wrmsrl(MSR_TEST_CTRL, ctrl);
+		this_cpu_write(msr_test_ctrl_cache, ctrl);
+	}
 }
 
 bool handle_user_split_lock(struct pt_regs *regs, long error_code)
@@ -1071,7 +1088,7 @@ bool handle_user_split_lock(struct pt_re
 	 * progress and set TIF_SLD so the detection is re-enabled via
 	 * switch_to_sld() when the task is scheduled out.
 	 */
-	__sld_msr_set(false);
+	msr_test_ctrl_update(false, MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
 	set_tsk_thread_flag(current, TIF_SLD);
 	return true;
 }
@@ -1085,7 +1102,7 @@ bool handle_user_split_lock(struct pt_re
  */
 void switch_to_sld(unsigned long tifn)
 {
-	__sld_msr_set(!(tifn & _TIF_SLD));
+	msr_test_ctrl_update(!(tifn & _TIF_SLD), MSR_TEST_CTRL_SPLIT_LOCK_DETECT);
 }
 
 #define SPLIT_LOCK_CPU(model) {X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY}



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-23 20:24     ` Thomas Gleixner
@ 2020-03-24  1:10       ` Xiaoyao Li
  2020-03-24 10:29         ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-24  1:10 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

On 3/24/2020 4:24 AM, Thomas Gleixner wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>
>>> Current initialization flow of split lock detection has following issues:
>>> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>>>     zero. However, it's possible that BIOS/firmware has set it.
>>
>> Ok.
>>
>>> 2. X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>>>     there is a virtualization flaw that FMS indicates the existence while
>>>     it's actually not supported.
>>>
>>> 3. Because of #2, KVM cannot rely on X86_FEATURE_SPLIT_LOCK_DETECT flag
>>>     to check verify if feature does exist, so cannot expose it to
>>>     guest.
>>
>> Sorry this does not make anny sense. KVM is the hypervisor, so it better
>> can rely on the detect flag. Unless you talk about nested virt and a
>> broken L1 hypervisor.
>>
>>> To solve these issues, introducing a new sld_state, "sld_not_exist",
>>> as
>>
>> The usual naming convention is sld_not_supported.
> 
> But this extra state is not needed at all, it already exists:
> 
>      X86_FEATURE_SPLIT_LOCK_DETECT
> 
> You just need to make split_lock_setup() a bit smarter. Soemthing like
> the below. It just wants to be split into separate patches.
> 
> Thanks,
> 
>          tglx
> ---
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -45,6 +45,7 @@ enum split_lock_detect_state {
>    * split lock detect, unless there is a command line override.
>    */
>   static enum split_lock_detect_state sld_state = sld_off;
> +static DEFINE_PER_CPU(u64, msr_test_ctrl_cache);

I used a percpu cache in v3, but people preferred Tony's cache for the
reserved bits [1].

If you prefer the percpu cache, I'll use it in the next version.

[1]: https://lore.kernel.org/lkml/20200303192242.GU1439@linux.intel.com/

>   /*
>    * Processors which have self-snooping capability can handle conflicting
> @@ -984,11 +985,32 @@ static inline bool match_option(const ch
>   	return len == arglen && !strncmp(arg, opt, len);
>   }
>   
> +static bool __init split_lock_verify_msr(bool on)
> +{
> +	u64 ctrl, tmp;
> +
> +	if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
> +		return false;
> +	if (on)
> +		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	else
> +		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
> +		return false;
> +	rdmsrl(MSR_TEST_CTRL, tmp);
> +	return ctrl == tmp;
> +}
> +
>   static void __init split_lock_setup(void)
>   {
>   	char arg[20];
>   	int i, ret;
>   
> +	if (!split_lock_verify_msr(true) || !split_lock_verify_msr(false)) {
> +		pr_info("MSR access failed: Disabled\n");
> +		return;
> +	}
> +

I did a similar thing in my v3; however, Sean raised the concern that
toggling the MSR bit before parsing the kernel param is bad behavior. [2]

[2]: https://lore.kernel.org/kvm/20200305162311.GG11500@linux.intel.com/


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR
  2020-03-23 17:06   ` Thomas Gleixner
@ 2020-03-24  1:16     ` Xiaoyao Li
  0 siblings, 0 replies; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-24  1:16 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

On 3/24/2020 1:06 AM, Thomas Gleixner wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>> +/*
>> + * Soft copy of MSR_TEST_CTRL initialized when we first read the
>> + * MSR. Used at runtime to avoid using rdmsr again just to collect
>> + * the reserved bits in the MSR. We assume reserved bits are the
>> + * same on all CPUs.
>> + */
>> +static u64 test_ctrl_val;
>> +
>>   /*
>>    * Locking is not required at the moment because only bit 29 of this
>>    * MSR is implemented and locking would not prevent that the operation
>> @@ -1027,16 +1035,14 @@ static void __init split_lock_setup(void)
>>    */
>>   static void __sld_msr_set(bool on)
>>   {
>> -	u64 test_ctrl_val;
>> -
>> -	rdmsrl(MSR_TEST_CTRL, test_ctrl_val);
>> +	u64 val = test_ctrl_val;
>>   
>>   	if (on)
>> -		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
>> +		val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
>>   	else
>> -		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
>> +		val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
>>   
>> -	wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
>> +	wrmsrl(MSR_TEST_CTRL, val);
>>   }
>>   
>>   /*
>> @@ -1048,11 +1054,13 @@ static void __sld_msr_set(bool on)
>>    */
>>   static void split_lock_init(struct cpuinfo_x86 *c)
>>   {
>> -	u64 test_ctrl_val;
>> +	u64 val;
>>   
>> -	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
>> +	if (rdmsrl_safe(MSR_TEST_CTRL, &val))
>>   		goto msr_broken;
>>   
>> +	test_ctrl_val = val;
>> +
>>   	switch (sld_state) {
>>   	case sld_off:
>>   		if (wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT))
> 
> That's just broken. Simply because
> 
>         case sld_warn:
>         case sld_fatal:
> 
> set the split lock detect bit, but the cache variable has it cleared
> unless it was set at boot time already.

test_ctrl_val is not meant to cache the value of MSR_TEST_CTRL, but to cache
the reserved/unused bits other than the MSR_TEST_CTRL_SPLIT_LOCK_DETECT bit.
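
That is, the cached value only ever supplies the bits outside
SPLIT_LOCK_DETECT; the SLD bit itself is recomputed from 'on' at every update.
A sketch of that intent (masking the SLD bit out of the cache just makes the
assumption explicit; this is not the exact patch code):

static void __sld_msr_set(bool on)
{
	/* Only the non-SLD bits come from the cache... */
	u64 val = test_ctrl_val & ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;

	/* ...the SLD bit is always derived from 'on', never from the cache. */
	if (on)
		val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;

	wrmsrl(MSR_TEST_CTRL, val);
}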

> Thanks,
> 
>          tglx
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect
  2020-03-23 17:10   ` Thomas Gleixner
@ 2020-03-24  1:38     ` Xiaoyao Li
  2020-03-24 10:40       ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-24  1:38 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

On 3/24/2020 1:10 AM, Thomas Gleixner wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
> 
>> Change sld_off to sld_disable, which means disabling feature split lock
>> detection and it cannot be used in kernel nor can kvm expose it guest.
>> Of course, the X86_FEATURE_SPLIT_LOCK_DETECT is not set.
>>
>> Add a new optioin sld_kvm_only, which means kernel turns split lock
>> detection off, but kvm can expose it to guest.
> 
> What's the point of this? If the host is not clean, then you better fix
> the host first before trying to expose it to guests.

It's not about whether or not the host is clean. It's for the cases where
users just don't want it enabled on the host, so as not to break applications
or drivers that do have split lock issues.

> Thanks,
> 
>          tglx
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-24  1:10       ` Xiaoyao Li
@ 2020-03-24 10:29         ` Thomas Gleixner
  2020-03-25  0:18           ` Xiaoyao Li
  0 siblings, 1 reply; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-24 10:29 UTC (permalink / raw)
  To: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

Xiaoyao Li <xiaoyao.li@intel.com> writes:
> On 3/24/2020 4:24 AM, Thomas Gleixner wrote:
>> --- a/arch/x86/kernel/cpu/intel.c
>> +++ b/arch/x86/kernel/cpu/intel.c
>> @@ -45,6 +45,7 @@ enum split_lock_detect_state {
>>    * split lock detect, unless there is a command line override.
>>    */
>>   static enum split_lock_detect_state sld_state = sld_off;
>> +static DEFINE_PER_CPU(u64, msr_test_ctrl_cache);
>
> I used percpu cache in v3, but people prefer Tony's cache for reserved 
> bits[1].
>
> If you prefer percpu cache, I'll use it in next version.

I'm fine with the single variable.

>>   static void __init split_lock_setup(void)
>>   {
>>   	char arg[20];
>>   	int i, ret;
>>   
>> +	if (!split_lock_verify_msr(true) || !split_lock_verify_msr(false)) {
>> +		pr_info("MSR access failed: Disabled\n");
>> +		return;
>> +	}
>> +
>
> I did similar thing like this in my v3, however Sean raised concern that 
> toggling MSR bit before parsing kernel param is bad behavior. [2]

That's trivial enough to fix.

Thanks,

        tglx

8<---------------
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -44,7 +44,8 @@ enum split_lock_detect_state {
  * split_lock_setup() will switch this to sld_warn on systems that support
  * split lock detect, unless there is a command line override.
  */
-static enum split_lock_detect_state sld_state = sld_off;
+static enum split_lock_detect_state sld_state __ro_after_init = sld_off;
+static u64 msr_test_ctrl_cache __ro_after_init;
 
 /*
  * Processors which have self-snooping capability can handle conflicting
@@ -984,78 +985,85 @@ static inline bool match_option(const ch
 	return len == arglen && !strncmp(arg, opt, len);
 }
 
+static bool __init split_lock_verify_msr(bool on)
+{
+	u64 ctrl, tmp;
+
+	if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
+		return false;
+	if (on)
+		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	else
+		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
+		return false;
+	rdmsrl(MSR_TEST_CTRL, tmp);
+	return ctrl == tmp;
+}
+
 static void __init split_lock_setup(void)
 {
+	enum split_lock_detect_state state = sld_warn;
 	char arg[20];
 	int i, ret;
 
-	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
-	sld_state = sld_warn;
+	if (!split_lock_verify_msr(false)) {
+		pr_info("MSR access failed: Disabled\n");
+		return;
+	}
 
 	ret = cmdline_find_option(boot_command_line, "split_lock_detect",
 				  arg, sizeof(arg));
 	if (ret >= 0) {
 		for (i = 0; i < ARRAY_SIZE(sld_options); i++) {
 			if (match_option(arg, ret, sld_options[i].option)) {
-				sld_state = sld_options[i].state;
+				state = sld_options[i].state;
 				break;
 			}
 		}
 	}
 
-	switch (sld_state) {
+	switch (state) {
 	case sld_off:
 		pr_info("disabled\n");
-		break;
-
+		return;
 	case sld_warn:
 		pr_info("warning about user-space split_locks\n");
 		break;
-
 	case sld_fatal:
 		pr_info("sending SIGBUS on user-space split_locks\n");
 		break;
 	}
+
+	rdmsrl(MSR_TEST_CTRL, msr_test_ctrl_cache);
+
+	if (!split_lock_verify_msr(true)) {
+		pr_info("MSR access failed: Disabled\n");
+		return;
+	}
+
+	sld_state = state;
+	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
 }
 
 /*
- * Locking is not required at the moment because only bit 29 of this
- * MSR is implemented and locking would not prevent that the operation
- * of one thread is immediately undone by the sibling thread.
- * Use the "safe" versions of rdmsr/wrmsr here because although code
- * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
- * exist, there may be glitches in virtualization that leave a guest
- * with an incorrect view of real h/w capabilities.
+ * MSR_TEST_CTRL is per core, but we treat it like a per CPU MSR. Locking
+ * is not implemented as one thread could undo the setting of the other
+ * thread immediately after dropping the lock anyway.
  */
-static bool __sld_msr_set(bool on)
+static void sld_update_msr(bool on)
 {
-	u64 test_ctrl_val;
-
-	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
-		return false;
+	u64 ctrl = msr_test_ctrl_cache;
 
 	if (on)
-		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
-	else
-		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
-
-	return !wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val);
+		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
+	wrmsrl(MSR_TEST_CTRL, ctrl);
 }
 
 static void split_lock_init(void)
 {
-	if (sld_state == sld_off)
-		return;
-
-	if (__sld_msr_set(true))
-		return;
-
-	/*
-	 * If this is anything other than the boot-cpu, you've done
-	 * funny things and you get to keep whatever pieces.
-	 */
-	pr_warn("MSR fail -- disabled\n");
-	sld_state = sld_off;
+	if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
+		sld_update_msr(sld_state != sld_off);
 }
 
 bool handle_user_split_lock(struct pt_regs *regs, long error_code)
@@ -1071,7 +1079,7 @@ bool handle_user_split_lock(struct pt_re
 	 * progress and set TIF_SLD so the detection is re-enabled via
 	 * switch_to_sld() when the task is scheduled out.
 	 */
-	__sld_msr_set(false);
+	sld_update_msr(false);
 	set_tsk_thread_flag(current, TIF_SLD);
 	return true;
 }
@@ -1085,7 +1093,7 @@ bool handle_user_split_lock(struct pt_re
  */
 void switch_to_sld(unsigned long tifn)
 {
-	__sld_msr_set(!(tifn & _TIF_SLD));
+	sld_update_msr(!(tifn & _TIF_SLD));
 }
 
 #define SPLIT_LOCK_CPU(model) {X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY}

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect
  2020-03-24  1:38     ` Xiaoyao Li
@ 2020-03-24 10:40       ` Thomas Gleixner
  2020-03-24 18:02         ` Sean Christopherson
  2020-03-25  0:43         ` Xiaoyao Li
  0 siblings, 2 replies; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-24 10:40 UTC (permalink / raw)
  To: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

Xiaoyao Li <xiaoyao.li@intel.com> writes:
> On 3/24/2020 1:10 AM, Thomas Gleixner wrote:
>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>> 
>>> Change sld_off to sld_disable, which means disabling feature split lock
>>> detection and it cannot be used in kernel nor can kvm expose it guest.
>>> Of course, the X86_FEATURE_SPLIT_LOCK_DETECT is not set.
>>>
>>> Add a new optioin sld_kvm_only, which means kernel turns split lock
>>> detection off, but kvm can expose it to guest.
>> 
>> What's the point of this? If the host is not clean, then you better fix
>> the host first before trying to expose it to guests.
>
> It's not about whether or not host is clean. It's for the cases that 
> users just don't want it enabled on host, to not break the applications 
> or drivers that do have split lock issue.

It's very much about whether the host is split lock clean.

If your host kernel is not, then this wants to be fixed first. If your
host application is broken, then either fix it or use "warn".

Stop proliferating crap.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-23 17:02   ` Thomas Gleixner
  2020-03-23 20:24     ` Thomas Gleixner
@ 2020-03-24 11:51     ` Xiaoyao Li
  2020-03-24 13:31       ` Thomas Gleixner
  1 sibling, 1 reply; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-24 11:51 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

On 3/24/2020 1:02 AM, Thomas Gleixner wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
> 
>> Current initialization flow of split lock detection has following issues:
>> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>>     zero. However, it's possible that BIOS/firmware has set it.
> 
> Ok.
> 
>> 2. X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>>     there is a virtualization flaw that FMS indicates the existence while
>>     it's actually not supported.
>>
>> 3. Because of #2, KVM cannot rely on X86_FEATURE_SPLIT_LOCK_DETECT flag
>>     to check verify if feature does exist, so cannot expose it to
>>     guest.
> 
> Sorry this does not make anny sense. KVM is the hypervisor, so it better
> can rely on the detect flag. Unless you talk about nested virt and a
> broken L1 hypervisor.
> 

Yeah. It is for the nested virt on a broken L1 hypervisor.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-24 11:51     ` Xiaoyao Li
@ 2020-03-24 13:31       ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-24 13:31 UTC (permalink / raw)
  To: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

Xiaoyao Li <xiaoyao.li@intel.com> writes:
> On 3/24/2020 1:02 AM, Thomas Gleixner wrote:
>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>> 
>>> Current initialization flow of split lock detection has following issues:
>>> 1. It assumes the initial value of MSR_TEST_CTRL.SPLIT_LOCK_DETECT to be
>>>     zero. However, it's possible that BIOS/firmware has set it.
>> 
>> Ok.
>> 
>>> 2. X86_FEATURE_SPLIT_LOCK_DETECT flag is unconditionally set even if
>>>     there is a virtualization flaw that FMS indicates the existence while
>>>     it's actually not supported.
>>>
>>> 3. Because of #2, KVM cannot rely on X86_FEATURE_SPLIT_LOCK_DETECT flag
>>>     to check verify if feature does exist, so cannot expose it to
>>>     guest.
>> 
>> Sorry this does not make anny sense. KVM is the hypervisor, so it better
>> can rely on the detect flag. Unless you talk about nested virt and a
>> broken L1 hypervisor.
>> 
>
> Yeah. It is for the nested virt on a broken L1 hypervisor.

Then please spell it out in the changelog. Changelogs which need crystal balls
to decode are pretty useless.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect
  2020-03-24 10:40       ` Thomas Gleixner
@ 2020-03-24 18:02         ` Sean Christopherson
  2020-03-24 18:42           ` Thomas Gleixner
  2020-03-25  0:43         ` Xiaoyao Li
  1 sibling, 1 reply; 33+ messages in thread
From: Sean Christopherson @ 2020-03-24 18:02 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	kvm, x86, linux-kernel, Andy Lutomirski, Peter Zijlstra,
	Arvind Sankar, Fenghua Yu, Tony Luck, Vitaly Kuznetsov,
	Jim Mattson

On Tue, Mar 24, 2020 at 11:40:18AM +0100, Thomas Gleixner wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
> > On 3/24/2020 1:10 AM, Thomas Gleixner wrote:
> >> Xiaoyao Li <xiaoyao.li@intel.com> writes:
> >> 
> >>> Change sld_off to sld_disable, which means disabling feature split lock
> >>> detection and it cannot be used in kernel nor can kvm expose it guest.
> >>> Of course, the X86_FEATURE_SPLIT_LOCK_DETECT is not set.
> >>>
> >>> Add a new optioin sld_kvm_only, which means kernel turns split lock
> >>> detection off, but kvm can expose it to guest.
> >> 
> >> What's the point of this? If the host is not clean, then you better fix
> >> the host first before trying to expose it to guests.
> >
> > It's not about whether or not host is clean. It's for the cases that 
> > users just don't want it enabled on host, to not break the applications 
> > or drivers that do have split lock issue.
> 
> It's very much about whether the host is split lock clean.
> 
> If your host kernel is not, then this wants to be fixed first. If your
> host application is broken, then either fix it or use "warn".

The "kvm only" option was my suggestion.  The thought was to provide a way
for users to leverage KVM to debug/test kernels without having to have a
known good kernel and/or to minimize the risk of crashing their physical
system.  E.g. debug a misbehaving driver by assigning its associated device
to a guest.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect
  2020-03-24 18:02         ` Sean Christopherson
@ 2020-03-24 18:42           ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-24 18:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	kvm, x86, linux-kernel, Andy Lutomirski, Peter Zijlstra,
	Arvind Sankar, Fenghua Yu, Tony Luck, Vitaly Kuznetsov,
	Jim Mattson

Sean Christopherson <sean.j.christopherson@intel.com> writes:
> On Tue, Mar 24, 2020 at 11:40:18AM +0100, Thomas Gleixner wrote:
>> 
>> It's very much about whether the host is split lock clean.
>> 
>> If your host kernel is not, then this wants to be fixed first. If your
>> host application is broken, then either fix it or use "warn".
>
> The "kvm only" option was my suggestion.  The thought was to provide a way
> for users to leverage KVM to debug/test kernels without having to have a
> known good kernel and/or to minimize the risk of crashing their physical
> system.  E.g. debug a misbehaving driver by assigning its associated device
> to a guest.

warn is giving you that, right? It won't crash the host because the #AC
triggers in guest context.
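
For illustration only, a rough sketch of that containment (helper names are
placeholders, not the actual KVM patch): the #AC VM-exit handler distinguishes
a legacy alignment check, which is reinjected, from a split-lock #AC, which
under warn is merely logged before resuming the guest:

static int handle_guest_ac(struct kvm_vcpu *vcpu)
{
	/* Legacy #AC (CPL 3 with EFLAGS.AC and CR0.AM) goes back to the guest. */
	if (guest_legacy_ac_enabled(vcpu)) {		/* placeholder helper */
		kvm_queue_exception_e(vcpu, AC_VECTOR, 0);
		return 1;
	}

	/*
	 * Otherwise the #AC came from host-side split lock detection.
	 * Under sld_warn the host only logs it and resumes the guest;
	 * nothing on the host side is taken down.
	 */
	pr_warn_ratelimited("kvm: split lock #AC in guest\n");
	return 1;
}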

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-24 10:29         ` Thomas Gleixner
@ 2020-03-25  0:18           ` Xiaoyao Li
  2020-03-25  0:52             ` Thomas Gleixner
  0 siblings, 1 reply; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-25  0:18 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

On 3/24/2020 6:29 PM, Thomas Gleixner wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>> On 3/24/2020 4:24 AM, Thomas Gleixner wrote:
>>> --- a/arch/x86/kernel/cpu/intel.c
>>> +++ b/arch/x86/kernel/cpu/intel.c
>>> @@ -45,6 +45,7 @@ enum split_lock_detect_state {
>>>     * split lock detect, unless there is a command line override.
>>>     */
>>>    static enum split_lock_detect_state sld_state = sld_off;
>>> +static DEFINE_PER_CPU(u64, msr_test_ctrl_cache);
>>
>> I used percpu cache in v3, but people prefer Tony's cache for reserved
>> bits[1].
>>
>> If you prefer percpu cache, I'll use it in next version.
> 
> I'm fine with the single variable.
> 
>>>    static void __init split_lock_setup(void)
>>>    {
>>>    	char arg[20];
>>>    	int i, ret;
>>>    
>>> +	if (!split_lock_verify_msr(true) || !split_lock_verify_msr(false)) {
>>> +		pr_info("MSR access failed: Disabled\n");
>>> +		return;
>>> +	}
>>> +
>>
>> I did similar thing like this in my v3, however Sean raised concern that
>> toggling MSR bit before parsing kernel param is bad behavior. [2]
> 
> That's trivial enough to fix.
> 
> Thanks,
> 
>          tglx
> 
> 8<---------------
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -44,7 +44,8 @@ enum split_lock_detect_state {
>    * split_lock_setup() will switch this to sld_warn on systems that support
>    * split lock detect, unless there is a command line override.
>    */
> -static enum split_lock_detect_state sld_state = sld_off;
> +static enum split_lock_detect_state sld_state __ro_after_init = sld_off;
> +static u64 msr_test_ctrl_cache __ro_after_init;
>   
>   /*
>    * Processors which have self-snooping capability can handle conflicting
> @@ -984,78 +985,85 @@ static inline bool match_option(const ch
>   	return len == arglen && !strncmp(arg, opt, len);
>   }
>   
> +static bool __init split_lock_verify_msr(bool on)
> +{
> +	u64 ctrl, tmp;
> +
> +	if (rdmsrl_safe(MSR_TEST_CTRL, &ctrl))
> +		return false;
> +	if (on)
> +		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	else
> +		ctrl &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	if (wrmsrl_safe(MSR_TEST_CTRL, ctrl))
> +		return false;
> +	rdmsrl(MSR_TEST_CTRL, tmp);
> +	return ctrl == tmp;
> +}
> +
>   static void __init split_lock_setup(void)
>   {
> +	enum split_lock_detect_state state = sld_warn;
>   	char arg[20];
>   	int i, ret;
>   
> -	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
> -	sld_state = sld_warn;
> +	if (!split_lock_verify_msr(false)) {
> +		pr_info("MSR access failed: Disabled\n");
> +		return;
> +	}
>   
>   	ret = cmdline_find_option(boot_command_line, "split_lock_detect",
>   				  arg, sizeof(arg));
>   	if (ret >= 0) {
>   		for (i = 0; i < ARRAY_SIZE(sld_options); i++) {
>   			if (match_option(arg, ret, sld_options[i].option)) {
> -				sld_state = sld_options[i].state;
> +				state = sld_options[i].state;
>   				break;
>   			}
>   		}
>   	}
>   
> -	switch (sld_state) {
> +	switch (state) {
>   	case sld_off:
>   		pr_info("disabled\n");
> -		break;
> -
> +		return;
Here, in the sld_off case, it just returns without calling
setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT).

So for APs, the SLD bit won't be cleared in split_lock_init().

And now I remember why I used sld_not_exist instead of relying on
setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT).

Yes, we can call setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT) for the
sld_off case as well, and in split_lock_init() explicitly call
sld_update_msr(false) to turn off SLD and clear_cpu_cap(c,
X86_FEATURE_SPLIT_LOCK_DETECT) to clear the cap. But due to
setup_force_cpu_cap(), split_lock_detect will still show up in
/proc/cpuinfo.

>   	case sld_warn:
>   		pr_info("warning about user-space split_locks\n");
>   		break;
> -
>   	case sld_fatal:
>   		pr_info("sending SIGBUS on user-space split_locks\n");
>   		break;
>   	}
> +
> +	rdmsrl(MSR_TEST_CTRL, msr_test_ctrl_cache);
> +
> +	if (!split_lock_verify_msr(true)) {
> +		pr_info("MSR access failed: Disabled\n");
> +		return;
> +	}
> +
> +	sld_state = state;
> +	setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT);
>   }
>   
>   /*
> - * Locking is not required at the moment because only bit 29 of this
> - * MSR is implemented and locking would not prevent that the operation
> - * of one thread is immediately undone by the sibling thread.
> - * Use the "safe" versions of rdmsr/wrmsr here because although code
> - * checks CPUID and MSR bits to make sure the TEST_CTRL MSR should
> - * exist, there may be glitches in virtualization that leave a guest
> - * with an incorrect view of real h/w capabilities.
> + * MSR_TEST_CTRL is per core, but we treat it like a per CPU MSR. Locking
> + * is not implemented as one thread could undo the setting of the other
> + * thread immediately after dropping the lock anyway.
>    */
> -static bool __sld_msr_set(bool on)
> +static void sld_update_msr(bool on)
>   {
> -	u64 test_ctrl_val;
> -
> -	if (rdmsrl_safe(MSR_TEST_CTRL, &test_ctrl_val))
> -		return false;
> +	u64 ctrl = msr_test_ctrl_cache;
>   
>   	if (on)
> -		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> -	else
> -		test_ctrl_val &= ~MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> -
> -	return !wrmsrl_safe(MSR_TEST_CTRL, test_ctrl_val);
> +		ctrl |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
> +	wrmsrl(MSR_TEST_CTRL, ctrl);
>   }
>   
>   static void split_lock_init(void)
>   {
> -	if (sld_state == sld_off)
> -		return;
> -
> -	if (__sld_msr_set(true))
> -		return;
> -
> -	/*
> -	 * If this is anything other than the boot-cpu, you've done
> -	 * funny things and you get to keep whatever pieces.
> -	 */
> -	pr_warn("MSR fail -- disabled\n");
> -	sld_state = sld_off;
> +	if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT))
> +		sld_update_msr(sld_state != sld_off);
>   }
>   
>   bool handle_user_split_lock(struct pt_regs *regs, long error_code)
> @@ -1071,7 +1079,7 @@ bool handle_user_split_lock(struct pt_re
>   	 * progress and set TIF_SLD so the detection is re-enabled via
>   	 * switch_to_sld() when the task is scheduled out.
>   	 */
> -	__sld_msr_set(false);
> +	sld_update_msr(false);
>   	set_tsk_thread_flag(current, TIF_SLD);
>   	return true;
>   }
> @@ -1085,7 +1093,7 @@ bool handle_user_split_lock(struct pt_re
>    */
>   void switch_to_sld(unsigned long tifn)
>   {
> -	__sld_msr_set(!(tifn & _TIF_SLD));
> +	sld_update_msr(!(tifn & _TIF_SLD));
>   }
>   
>   #define SPLIT_LOCK_CPU(model) {X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY}
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect
  2020-03-24 10:40       ` Thomas Gleixner
  2020-03-24 18:02         ` Sean Christopherson
@ 2020-03-25  0:43         ` Xiaoyao Li
  2020-03-25  1:03           ` Thomas Gleixner
  1 sibling, 1 reply; 33+ messages in thread
From: Xiaoyao Li @ 2020-03-25  0:43 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, hpa,
	Paolo Bonzini, Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

On 3/24/2020 6:40 PM, Thomas Gleixner wrote:
> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>> On 3/24/2020 1:10 AM, Thomas Gleixner wrote:
>>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>>
>>>> Change sld_off to sld_disable, which means disabling feature split lock
>>>> detection and it cannot be used in kernel nor can kvm expose it guest.
>>>> Of course, the X86_FEATURE_SPLIT_LOCK_DETECT is not set.
>>>>
>>>> Add a new optioin sld_kvm_only, which means kernel turns split lock
>>>> detection off, but kvm can expose it to guest.
>>>
>>> What's the point of this? If the host is not clean, then you better fix
>>> the host first before trying to expose it to guests.
>>
>> It's not about whether or not host is clean. It's for the cases that
>> users just don't want it enabled on host, to not break the applications
>> or drivers that do have split lock issue.
> 
> It's very much about whether the host is split lock clean.
> 
> If your host kernel is not, then this wants to be fixed first. If your
> host application is broken, then either fix it or use "warn".
> 

My thought is about CSPs: they might not turn on SLD in their production
environment, since any split lock in the kernel or drivers may break their
service for tenants.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of split lock detection
  2020-03-25  0:18           ` Xiaoyao Li
@ 2020-03-25  0:52             ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-25  0:52 UTC (permalink / raw)
  To: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

Xiaoyao Li <xiaoyao.li@intel.com> writes:
> On 3/24/2020 6:29 PM, Thomas Gleixner wrote:
>> -	switch (sld_state) {
>> +	switch (state) {
>>   	case sld_off:
>>   		pr_info("disabled\n");
>> -		break;
>> -
>> +		return;
> Here, when sld_off, it just returns without 
> setup_force_cpu_cap(X86_FEATURE_SPLIT_LOCK_DETECT).
>
> So for APs, it won't clear SLD bit in split_lock_init().

Trivial fix:

static void split_lock_init(void)
{
	split_lock_verify_msr(sld_state != sld_off);
}

You just need to remove the __init annotation from split_lock_verify_msr().

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect
  2020-03-25  0:43         ` Xiaoyao Li
@ 2020-03-25  1:03           ` Thomas Gleixner
  0 siblings, 0 replies; 33+ messages in thread
From: Thomas Gleixner @ 2020-03-25  1:03 UTC (permalink / raw)
  To: Xiaoyao Li, Ingo Molnar, Borislav Petkov, hpa, Paolo Bonzini,
	Sean Christopherson, kvm, x86, linux-kernel
  Cc: Andy Lutomirski, Peter Zijlstra, Arvind Sankar, Fenghua Yu,
	Tony Luck, Vitaly Kuznetsov, Jim Mattson

Xiaoyao Li <xiaoyao.li@intel.com> writes:

> On 3/24/2020 6:40 PM, Thomas Gleixner wrote:
>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>> On 3/24/2020 1:10 AM, Thomas Gleixner wrote:
>>>> Xiaoyao Li <xiaoyao.li@intel.com> writes:
>>>>
>>>>> Change sld_off to sld_disable, which means disabling feature split lock
>>>>> detection and it cannot be used in kernel nor can kvm expose it guest.
>>>>> Of course, the X86_FEATURE_SPLIT_LOCK_DETECT is not set.
>>>>>
>>>>> Add a new optioin sld_kvm_only, which means kernel turns split lock
>>>>> detection off, but kvm can expose it to guest.
>>>>
>>>> What's the point of this? If the host is not clean, then you better fix
>>>> the host first before trying to expose it to guests.
>>>
>>> It's not about whether or not host is clean. It's for the cases that
>>> users just don't want it enabled on host, to not break the applications
>>> or drivers that do have split lock issue.
>> 
>> It's very much about whether the host is split lock clean.
>> 
>> If your host kernel is not, then this wants to be fixed first. If your
>> host application is broken, then either fix it or use "warn".
>> 
>
> My thought is for CSPs that they might not turn on SLD on their product 
> environment. Any split lock in kernel or drivers may break their service 
> for tenants.

Again you are proliferating crap and making excuses for Common Sense
violating Purposes (CSP).

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2020-03-25  1:14 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-15  5:05 [PATCH v5 0/9] x86/split_lock: Add feature split lock detection Xiaoyao Li
2020-03-15  5:05 ` [PATCH v5 1/9] x86/split_lock: Rework the initialization flow of " Xiaoyao Li
2020-03-21  0:41   ` Luck, Tony
2020-03-23 17:02   ` Thomas Gleixner
2020-03-23 20:24     ` Thomas Gleixner
2020-03-24  1:10       ` Xiaoyao Li
2020-03-24 10:29         ` Thomas Gleixner
2020-03-25  0:18           ` Xiaoyao Li
2020-03-25  0:52             ` Thomas Gleixner
2020-03-24 11:51     ` Xiaoyao Li
2020-03-24 13:31       ` Thomas Gleixner
2020-03-15  5:05 ` [PATCH v5 2/9] x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR Xiaoyao Li
2020-03-21  0:43   ` Luck, Tony
2020-03-23 17:06     ` Thomas Gleixner
2020-03-23 17:06   ` Thomas Gleixner
2020-03-24  1:16     ` Xiaoyao Li
2020-03-15  5:05 ` [PATCH v5 3/9] x86/split_lock: Re-define the kernel param option for split_lock_detect Xiaoyao Li
2020-03-21  0:46   ` Luck, Tony
2020-03-23 17:10   ` Thomas Gleixner
2020-03-24  1:38     ` Xiaoyao Li
2020-03-24 10:40       ` Thomas Gleixner
2020-03-24 18:02         ` Sean Christopherson
2020-03-24 18:42           ` Thomas Gleixner
2020-03-25  0:43         ` Xiaoyao Li
2020-03-25  1:03           ` Thomas Gleixner
2020-03-15  5:05 ` [PATCH v5 4/9] x86/split_lock: Export handle_user_split_lock() Xiaoyao Li
2020-03-21  0:48   ` Luck, Tony
2020-03-15  5:05 ` [PATCH v5 5/9] kvm: x86: Emulate split-lock access as a write Xiaoyao Li
2020-03-15  5:05 ` [PATCH v5 6/9] kvm: vmx: Extend VMX's #AC interceptor to handle split lock #AC happens in guest Xiaoyao Li
2020-03-15  5:05 ` [PATCH v5 7/9] kvm: x86: Emulate MSR IA32_CORE_CAPABILITIES Xiaoyao Li
2020-03-15  5:05 ` [PATCH v5 8/9] kvm: vmx: Enable MSR_TEST_CTRL for intel guest Xiaoyao Li
2020-03-15  5:05 ` [PATCH v5 9/9] x86: vmx: virtualize split lock detection Xiaoyao Li
2020-03-23  2:18 ` [PATCH v5 0/9] x86/split_lock: Add feature " Xiaoyao Li
