* [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors
@ 2020-04-02 12:32 Thomas Gleixner
  2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner
  ` (2 more replies)
  0 siblings, 3 replies; 76+ messages in thread
From: Thomas Gleixner @ 2020-04-02 12:32 UTC (permalink / raw)
  To: LKML
  Cc: x86, Kenneth R. Crudup, Peter Zijlstra (Intel), Paolo Bonzini,
      Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom,
      Sean Christopherson, Tony Luck, Steven Rostedt

Kenneth reported that a VMware guest crashes in the VMware BIOS due to a
split-lock induced #AC which is injected by the VMware hypervisor into the
guest. While this is a good thing in principle, it's not really practical.

That means that Split-Lock-Detection has to be disabled when any unprepared
VMX hypervisor is loaded. As hypervisor modules are not really
identifiable, the only safe solution we came up with is to scan the module
text at load time for a VMLAUNCH instruction. If VMLAUNCH is found, then
Split-Lock-Detection is disabled on the host to prevent the above.

If the hypervisor has at least minimal handling code, the module can tell
the kernel by adding

    MODULE_INFO(sld_safe, "Y");

which disables the text scan.

For KVM it's simple enough to handle it at least at the basic level by
checking guest CR0.AM and EFLAGS.AC state, plus a trivial host side handler
which, depending on the SLD mode, either handles it gracefully or tells the
VMX handler to deliver the #AC to user space, which then can crash and burn
itself.

As Peter and I don't have access to an SLD-enabled machine, the KVM/VMX
part is untested. The module scan part works.

Alternatively we can obviously revert SLD, but that does not make the
problem vs. out-of-tree hypervisors go away magically. So we can just get
over it now.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 76+ messages in thread
* [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 12:32 [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors Thomas Gleixner
@ 2020-04-02 12:32 ` Thomas Gleixner
  2020-04-02 15:23   ` [patch v2 " Peter Zijlstra
  ` (5 more replies)
  2020-04-02 12:33 ` [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage Thomas Gleixner
  2020-04-02 13:43 ` [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors Kenneth R. Crudup
  2 siblings, 6 replies; 76+ messages in thread
From: Thomas Gleixner @ 2020-04-02 12:32 UTC (permalink / raw)
  To: LKML
  Cc: x86, Kenneth R. Crudup, Peter Zijlstra (Intel), Paolo Bonzini,
      Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom,
      Sean Christopherson, Tony Luck, Steven Rostedt

From: Peter Zijlstra <peterz@infradead.org>

It turns out that with Split-Lock-Detect enabled (default) any VMX
hypervisor needs at least a little modification in order to not blindly
inject the #AC into the guest without the guest being ready for it.

Since there is no telling which module implements a hypervisor, scan the
module text and look for the VMLAUNCH instruction. If found, the module is
assumed to be a hypervisor of some sort and SLD is disabled.

Hypervisors, which have been modified and are known to work correctly, can
add:

  MODULE_INFO(sld_safe, "Y");

to explicitly tell the module loader they're good.

NOTE: it is unfortunate that struct load_info is not available to the
      arch module code, this means CONFIG_CPU_SUP_INTEL gunk is needed
      in generic code.

NOTE: while we can 'trivially' fix KVM, we're still stuck with stuff
      like VMware and VirtualBox doing their own thing.

Reported-by: "Kenneth R. Crudup" <kenny@panix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
---
 arch/x86/include/asm/cpu.h  |    2 ++
 arch/x86/kernel/cpu/intel.c |   38 +++++++++++++++++++++++++++++++++++++-
 arch/x86/kernel/module.c    |    6 ++++++
 include/linux/module.h      |    4 ++++
 kernel/module.c             |    5 +++++
 5 files changed, 54 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -44,6 +44,7 @@ unsigned int x86_stepping(unsigned int s
 extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
 extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
+extern void split_lock_validate_module_text(struct module *me, void *text, void *text_end);
 #else
 static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
 static inline void switch_to_sld(unsigned long tifn) {}
@@ -51,5 +52,6 @@ static inline bool handle_user_split_loc
 {
 	return false;
 }
+static inline void split_lock_validate_module_text(struct module *me, void *text, void *text_end) {}
 #endif
 #endif /* _ASM_X86_CPU_H */
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -9,6 +9,7 @@
 #include <linux/thread_info.h>
 #include <linux/init.h>
 #include <linux/uaccess.h>
+#include <linux/module.h>

 #include <asm/cpufeature.h>
 #include <asm/pgtable.h>
@@ -21,6 +22,7 @@
 #include <asm/elf.h>
 #include <asm/cpu_device_id.h>
 #include <asm/cmdline.h>
+#include <asm/insn.h>

 #ifdef CONFIG_X86_64
 #include <linux/topology.h>
@@ -1055,12 +1057,46 @@ static void sld_update_msr(bool on)
 {
 	u64 test_ctrl_val = msr_test_ctrl_cache;

-	if (on)
+	if (on && (sld_state != sld_off))
 		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;

 	wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
 }

+static void sld_remote_kill(void *arg)
+{
+	sld_update_msr(false);
+}
+
+void split_lock_validate_module_text(struct module *me, void *text, void *text_end)
+{
+	u8 vmlaunch[] = { 0x0f, 0x01, 0xc2 };
+	struct insn insn;
+
+	if (sld_state == sld_off)
+		return;
+
+	while (text < text_end) {
+		kernel_insn_init(&insn, text, text_end - text);
+		insn_get_length(&insn);
+
+		if (WARN_ON_ONCE(!insn_complete(&insn)))
+			break;
+
+		if (insn.length == sizeof(vmlaunch) && !memcmp(text, vmlaunch, sizeof(vmlaunch)))
+			goto bad_module;
+
+		text += insn.length;
+	}
+
+	return;
+
+bad_module:
+	pr_warn("disabled due to VMLAUNCH in module: %s\n", me->name);
+	sld_state = sld_off;
+	on_each_cpu(sld_remote_kill, NULL, 1);
+}
+
 static void split_lock_init(void)
 {
 	split_lock_verify_msr(sld_state != sld_off);
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -24,6 +24,7 @@
 #include <asm/pgtable.h>
 #include <asm/setup.h>
 #include <asm/unwind.h>
+#include <asm/cpu.h>

 #if 0
 #define DEBUGP(fmt, ...)				\
@@ -253,6 +254,11 @@ int module_finalize(const Elf_Ehdr *hdr,
 			   tseg, tseg + text->sh_size);
 	}

+	if (text && !me->sld_safe) {
+		void *tseg = (void *)text->sh_addr;
+		split_lock_validate_module_text(me, tseg, tseg + text->sh_size);
+	}
+
 	if (para) {
 		void *pseg = (void *)para->sh_addr;
 		apply_paravirt(pseg, pseg + para->sh_size);
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -407,6 +407,10 @@ struct module {
 	bool sig_ok;
 #endif

+#ifdef CONFIG_CPU_SUP_INTEL
+	bool sld_safe;
+#endif
+
 	bool async_probe_requested;

 	/* symbols that will be GPL-only in the near future. */
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3096,6 +3096,11 @@ static int check_modinfo(struct module *
 			     "is unknown, you have been warned.\n", mod->name);
 	}

+#ifdef CONFIG_CPU_SUP_INTEL
+	if (get_modinfo(info, "sld_safe"))
+		mod->sld_safe = true;
+#endif
+
 	err = check_modinfo_livepatch(mod, info);
 	if (err)
 		return err;

^ permalink raw reply	[flat|nested] 76+ messages in thread
* [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner
@ 2020-04-02 15:23   ` Peter Zijlstra
  2020-04-02 16:20     ` Xiaoyao Li
  2020-04-03  8:09     ` David Laight
  2020-04-02 23:42   ` [patch " Rasmus Villemoes
  ` (4 subsequent siblings)
  5 siblings, 2 replies; 76+ messages in thread
From: Peter Zijlstra @ 2020-04-02 15:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu,
      Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom,
      Sean Christopherson, Tony Luck, Steven Rostedt

I picked VMXOFF (which also appears in vmmon.ko) instead of VMXON because
the latter takes an argument and is therefore more difficult to decode.

---
Subject: x86,module: Detect VMX modules and disable Split-Lock-Detect
From: Peter Zijlstra <peterz@infradead.org>
Date: Thu, 02 Apr 2020 14:32:59 +0200

It turns out that with Split-Lock-Detect enabled (default) any VMX
hypervisor needs at least a little modification in order to not blindly
inject the #AC into the guest without the guest being ready for it.

Since there is no telling which module implements a hypervisor, scan the
module text and look for the VMLAUNCH/VMXOFF instructions. If found, the
module is assumed to be a hypervisor of some sort and SLD is disabled.

Hypervisors, which have been modified and are known to work correctly, can
add:

  MODULE_INFO(sld_safe, "Y");

to explicitly tell the module loader they're good.

NOTE: it is unfortunate that struct load_info is not available to the
      arch module code, this means CONFIG_CPU_SUP_INTEL gunk is needed
      in generic code.

NOTE: while we can 'trivially' fix KVM, we're still stuck with stuff
      like VMware and VirtualBox doing their own thing.

Reported-by: "Kenneth R. Crudup" <kenny@panix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/cpu.h  |    2 ++
 arch/x86/kernel/cpu/intel.c |   41 ++++++++++++++++++++++++++++++++++++++++-
 arch/x86/kernel/module.c    |    6 ++++++
 include/linux/module.h      |    4 ++++
 kernel/module.c             |    5 +++++
 5 files changed, 57 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -44,6 +44,7 @@ unsigned int x86_stepping(unsigned int s
 extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
 extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
+extern void split_lock_validate_module_text(struct module *me, void *text, void *text_end);
 #else
 static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
 static inline void switch_to_sld(unsigned long tifn) {}
@@ -51,5 +52,6 @@ static inline bool handle_user_split_loc
 {
 	return false;
 }
+static inline void split_lock_validate_module_text(struct module *me, void *text, void *text_end) {}
 #endif
 #endif /* _ASM_X86_CPU_H */
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -9,6 +9,7 @@
 #include <linux/thread_info.h>
 #include <linux/init.h>
 #include <linux/uaccess.h>
+#include <linux/module.h>

 #include <asm/cpufeature.h>
 #include <asm/pgtable.h>
@@ -21,6 +22,7 @@
 #include <asm/elf.h>
 #include <asm/cpu_device_id.h>
 #include <asm/cmdline.h>
+#include <asm/insn.h>

 #ifdef CONFIG_X86_64
 #include <linux/topology.h>
@@ -1055,12 +1057,49 @@ static void sld_update_msr(bool on)
 {
 	u64 test_ctrl_val = msr_test_ctrl_cache;

-	if (on)
+	if (on && (sld_state != sld_off))
 		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;

 	wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
 }

+static void sld_remote_kill(void *arg)
+{
+	sld_update_msr(false);
+}
+
+void split_lock_validate_module_text(struct module *me, void *text, void *text_end)
+{
+	u8 vmxoff[] = { 0x0f, 0x01, 0xc4 };
+	u8 vmlaunch[] = { 0x0f, 0x01, 0xc2 };
+	struct insn insn;
+
+	if (sld_state == sld_off)
+		return;
+
+	while (text < text_end) {
+		kernel_insn_init(&insn, text, text_end - text);
+		insn_get_length(&insn);
+
+		if (WARN_ON_ONCE(!insn_complete(&insn)))
+			break;
+
+		if (insn.length == 3 &&
+		    (!memcmp(text, vmlaunch, sizeof(vmlaunch)) ||
+		     !memcmp(text, vmxoff, sizeof(vmxoff))))
+			goto bad_module;
+
+		text += insn.length;
+	}
+
+	return;
+
+bad_module:
+	pr_warn("disabled due to VMX in module: %s\n", me->name);
+	sld_state = sld_off;
+	on_each_cpu(sld_remote_kill, NULL, 1);
+}
+
 static void split_lock_init(void)
 {
 	split_lock_verify_msr(sld_state != sld_off);
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -24,6 +24,7 @@
 #include <asm/pgtable.h>
 #include <asm/setup.h>
 #include <asm/unwind.h>
+#include <asm/cpu.h>

 #if 0
 #define DEBUGP(fmt, ...)				\
@@ -253,6 +254,11 @@ int module_finalize(const Elf_Ehdr *hdr,
 			   tseg, tseg + text->sh_size);
 	}

+	if (text && !me->sld_safe) {
+		void *tseg = (void *)text->sh_addr;
+		split_lock_validate_module_text(me, tseg, tseg + text->sh_size);
+	}
+
 	if (para) {
 		void *pseg = (void *)para->sh_addr;
 		apply_paravirt(pseg, pseg + para->sh_size);
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -407,6 +407,10 @@ struct module {
 	bool sig_ok;
 #endif

+#ifdef CONFIG_CPU_SUP_INTEL
+	bool sld_safe;
+#endif
+
 	bool async_probe_requested;

 	/* symbols that will be GPL-only in the near future. */
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3096,6 +3096,11 @@ static int check_modinfo(struct module *
 			     "is unknown, you have been warned.\n", mod->name);
 	}

+#ifdef CONFIG_CPU_SUP_INTEL
+	if (get_modinfo(info, "sld_safe"))
+		mod->sld_safe = true;
+#endif
+
 	err = check_modinfo_livepatch(mod, info);
 	if (err)
 		return err;

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 15:23 ` [patch v2 " Peter Zijlstra
@ 2020-04-02 16:20   ` Xiaoyao Li
  2020-04-02 16:25     ` Peter Zijlstra
  2020-04-03  8:09     ` David Laight
  1 sibling, 1 reply; 76+ messages in thread
From: Xiaoyao Li @ 2020-04-02 16:20 UTC (permalink / raw)
  To: Peter Zijlstra, Thomas Gleixner
  Cc: LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu,
      Fenghua Yu, Nadav Amit, Thomas Hellstrom, Sean Christopherson,
      Tony Luck, Steven Rostedt

On 4/2/2020 11:23 PM, Peter Zijlstra wrote:
[...]
> +bad_module:
> +	pr_warn("disabled due to VMX in module: %s\n", me->name);
> +	sld_state = sld_off;

shouldn't we remove the __ro_after_init of sld_state?

And, shouldn't we clear X86_FEATURE_SPLIT_LOCK_DETECT flag?

> +	on_each_cpu(sld_remote_kill, NULL, 1);
> +}
[...]

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 16:20     ` Xiaoyao Li
@ 2020-04-02 16:25       ` Peter Zijlstra
  2020-04-02 16:39         ` Nadav Amit
  2020-04-02 16:41         ` Xiaoyao Li
  0 siblings, 2 replies; 76+ messages in thread
From: Peter Zijlstra @ 2020-04-02 16:25 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini,
      Jessica Yu, Fenghua Yu, Nadav Amit, Thomas Hellstrom,
      Sean Christopherson, Tony Luck, Steven Rostedt

Learn to trim your replies already!

On Fri, Apr 03, 2020 at 12:20:08AM +0800, Xiaoyao Li wrote:
> On 4/2/2020 11:23 PM, Peter Zijlstra wrote:

> > +bad_module:
> > +	pr_warn("disabled due to VMX in module: %s\n", me->name);
> > +	sld_state = sld_off;
>
> shouldn't we remove the __ro_after_init of sld_state?

Oh, that's probably a good idea. I can't actually test this due to no
hardware.

> And, shouldn't we clear X86_FEATURE_SPLIT_LOCK_DETECT flag?

Don't think you can do that this late. Also, the hardware has the MSR
and it works, it's just that we should not.

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 16:25       ` Peter Zijlstra
@ 2020-04-02 16:39         ` Nadav Amit
  2020-04-02 16:41         ` Xiaoyao Li
  1 sibling, 0 replies; 76+ messages in thread
From: Nadav Amit @ 2020-04-02 16:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Xiaoyao Li, Thomas Gleixner, LKML, x86, Kenneth R. Crudup,
      Paolo Bonzini, Jessica Yu, Fenghua Yu, Thomas Hellstrom,
      Sean Christopherson, Tony Luck, Steven Rostedt

> On Apr 2, 2020, at 9:25 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> Learn to trim your replies already!
>
> On Fri, Apr 03, 2020 at 12:20:08AM +0800, Xiaoyao Li wrote:
>> On 4/2/2020 11:23 PM, Peter Zijlstra wrote:
>
>>> +bad_module:
>>> +	pr_warn("disabled due to VMX in module: %s\n", me->name);
>>> +	sld_state = sld_off;
>>
>> shouldn't we remove the __ro_after_init of sld_state?
>
> Oh, that's probably a good idea. I can't actually test this due to no
> hardware.

Just wondering, since I lack hardware as well: can the performance counter
LOCK_CYCLES.SPLIT_LOCK_UC_LOCK_DURATION be used to detect split-locks
similarly to SLD (although it would be after the split-lock event)?

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 16:25       ` Peter Zijlstra
  2020-04-02 16:39         ` Nadav Amit
@ 2020-04-02 16:41         ` Xiaoyao Li
  2020-04-02 17:34           ` Thomas Gleixner
  1 sibling, 1 reply; 76+ messages in thread
From: Xiaoyao Li @ 2020-04-02 16:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini,
      Jessica Yu, Fenghua Yu, Nadav Amit, Thomas Hellstrom,
      Sean Christopherson, Tony Luck, Steven Rostedt

On 4/3/2020 12:25 AM, Peter Zijlstra wrote:
>
> Learn to trim your replies already!

Sorry.

> On Fri, Apr 03, 2020 at 12:20:08AM +0800, Xiaoyao Li wrote:
>> On 4/2/2020 11:23 PM, Peter Zijlstra wrote:
>
>>> +bad_module:
>>> +	pr_warn("disabled due to VMX in module: %s\n", me->name);
>>> +	sld_state = sld_off;
>>
>> shouldn't we remove the __ro_after_init of sld_state?
>
> Oh, that's probably a good idea. I can't actually test this due to no
> hardware.
>
>> And, shouldn't we clear X86_FEATURE_SPLIT_LOCK_DETECT flag?
>
> Don't think you can do that this late. Also, the hardware has the MSR
> and it works, it's just that we should not.

Actually, I agree to keep this flag.

But, during the previous patch review, tglx wants to make

    sld_off = no X86_FEATURE_SPLIT_LOCK_DETECT

I'm not sure whether he still insists on it now.

I really want to decouple sld_off and X86_FEATURE_SPLIT_LOCK_DETECT.
So if X86_FEATURE_SPLIT_LOCK_DETECT is set, we can virtualize and expose
it to the guest even when the host is sld_off.

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 16:41         ` Xiaoyao Li
@ 2020-04-02 17:34           ` Thomas Gleixner
  2020-04-02 17:51             ` Sean Christopherson
  0 siblings, 1 reply; 76+ messages in thread
From: Thomas Gleixner @ 2020-04-02 17:34 UTC (permalink / raw)
  To: Xiaoyao Li, Peter Zijlstra
  Cc: LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu,
      Fenghua Yu, Nadav Amit, Thomas Hellstrom, Sean Christopherson,
      Tony Luck, Steven Rostedt

Xiaoyao Li <xiaoyao.li@intel.com> writes:
> On 4/3/2020 12:25 AM, Peter Zijlstra wrote:
>> On Fri, Apr 03, 2020 at 12:20:08AM +0800, Xiaoyao Li wrote:
>>> And, shouldn't we clear X86_FEATURE_SPLIT_LOCK_DETECT flag?
>>
>> Don't think you can do that this late. Also, the hardware has the MSR
>> and it works, it's just that we should not.
>
> Actually, I agree to keep this flag.
>
> But, during the previous patch review, tglx wants to make
>
>     sld_off = no X86_FEATURE_SPLIT_LOCK_DETECT
>
> I'm not sure whether he still insists on it now.

Obviously I can't.

> I really want to decouple sld_off and X86_FEATURE_SPLIT_LOCK_DETECT.
> So if X86_FEATURE_SPLIT_LOCK_DETECT is set, we can virtualize and expose
> it to guest even when host is sld_off.

Can we first have a sane solution for the problem at hand?

Aside of that I'm still against the attempt of proliferating crap,
i.e. disabling it because the host is triggering it and then exposing it
to guests. The above does not change my mind in any way. This proposal
is still wrong.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 17:34           ` Thomas Gleixner
@ 2020-04-02 17:51             ` Sean Christopherson
  2020-04-02 18:51               ` Peter Zijlstra
  0 siblings, 1 reply; 76+ messages in thread
From: Sean Christopherson @ 2020-04-02 17:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Xiaoyao Li, Peter Zijlstra, LKML, x86, Kenneth R. Crudup,
      Paolo Bonzini, Jessica Yu, Fenghua Yu, Nadav Amit,
      Thomas Hellstrom, Tony Luck, Steven Rostedt

On Thu, Apr 02, 2020 at 07:34:35PM +0200, Thomas Gleixner wrote:
> Aside of that I'm still against the attempt of proliferating crap,
> i.e. disabling it because the host is triggering it and then exposing it
> to guests. The above does not change my mind in any way. This proposal
> is still wrong.

Eh, I still think the "off in host, on in guest" is a legit scenario for
debug/development/testing, but I agree that the added complexity doesn't
justify the minimal benefits versus sld_warn.

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 17:51             ` Sean Christopherson
@ 2020-04-02 18:51               ` Peter Zijlstra
  2020-04-02 20:23                 ` Sean Christopherson
  0 siblings, 1 reply; 76+ messages in thread
From: Peter Zijlstra @ 2020-04-02 18:51 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Thomas Gleixner, Xiaoyao Li, LKML, x86, Kenneth R. Crudup,
      Paolo Bonzini, Jessica Yu, Fenghua Yu, Nadav Amit,
      Thomas Hellstrom, Tony Luck, Steven Rostedt

On Thu, Apr 02, 2020 at 10:51:28AM -0700, Sean Christopherson wrote:
> On Thu, Apr 02, 2020 at 07:34:35PM +0200, Thomas Gleixner wrote:
> > Aside of that I'm still against the attempt of proliferating crap,
> > i.e. disabling it because the host is triggering it and then exposing it
> > to guests. The above does not change my mind in any way. This proposal
> > is still wrong.
>
> Eh, I still think the "off in host, on in guest" is a legit scenario for
> debug/development/testing, but I agree that the added complexity doesn't
> justify the minimal benefits versus sld_warn.

Off in host on in guest seems utterly insane to me. Why do you care
about that? That's like building a bridge with rotten timber.

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 18:51               ` Peter Zijlstra
@ 2020-04-02 20:23                 ` Sean Christopherson
  2020-04-02 21:04                   ` Thomas Gleixner
  0 siblings, 1 reply; 76+ messages in thread
From: Sean Christopherson @ 2020-04-02 20:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Xiaoyao Li, LKML, x86, Kenneth R. Crudup,
      Paolo Bonzini, Jessica Yu, Fenghua Yu, Nadav Amit,
      Thomas Hellstrom, Tony Luck, Steven Rostedt

On Thu, Apr 02, 2020 at 08:51:48PM +0200, Peter Zijlstra wrote:
> On Thu, Apr 02, 2020 at 10:51:28AM -0700, Sean Christopherson wrote:
> > On Thu, Apr 02, 2020 at 07:34:35PM +0200, Thomas Gleixner wrote:
> > > Aside of that I'm still against the attempt of proliferating crap,
> > > i.e. disabling it because the host is triggering it and then exposing it
> > > to guests. The above does not change my mind in any way. This proposal
> > > is still wrong.
> >
> > Eh, I still think the "off in host, on in guest" is a legit scenario for
> > debug/development/testing, but I agree that the added complexity doesn't
> > justify the minimal benefits versus sld_warn.
>
> Off in host on in guest seems utterly insane to me. Why do you care
> about that?

For development/debug/testing. Ignoring the core-scope stupidity of split
lock, the _functional_ behavior of the host kernel and guest kernel are
completely separate. The host can generate split locks all it wants, but
other than performance, its bad behavior has no impact on the guest.

For example, all of the debug that was done to eliminate split locks in the
kernel could have been done in a KVM guest, even though the host kernel
would not have yet been split-lock free.

It's somewhat of a moot point now that the kernel is split-lock free. But,
if I encountered a split lock panic on my system, the first thing I would
do (after rebooting) would be to fire up a VM to try and reproduce and
debug the issue.

Oftentimes it's significantly easier to "enable" a feature in KVM, i.e.
expose a feature to the guest, than it is to actually enable it in the
kernel. Enabling KVM first doesn't work if there are hard dependencies on
kernel enabling, e.g. most things that have an XSAVE component, but for a
lot of features it's a viable strategy to enable KVM first, and then do all
testing and debug inside a KVM guest.

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect
  2020-04-02 20:23                 ` Sean Christopherson
@ 2020-04-02 21:04                   ` Thomas Gleixner
  2020-04-02 21:16                     ` Sean Christopherson
  0 siblings, 1 reply; 76+ messages in thread
From: Thomas Gleixner @ 2020-04-02 21:04 UTC (permalink / raw)
  To: Sean Christopherson, Peter Zijlstra
  Cc: Xiaoyao Li, LKML, x86, Kenneth R. Crudup, Paolo Bonzini,
      Jessica Yu, Fenghua Yu, Nadav Amit, Thomas Hellstrom,
      Tony Luck, Steven Rostedt

Sean Christopherson <sean.j.christopherson@intel.com> writes:
> On Thu, Apr 02, 2020 at 08:51:48PM +0200, Peter Zijlstra wrote:
>> Off in host on in guest seems utterly insane to me. Why do you care
>> about that?
>
> For development/debug/testing. Ignoring the core-scope stupidity of split
> lock, the _functional_ behavior of the host kernel and guest kernel are
> completely separate. The host can generate split locks all it wants, but
> other than performance, its bad behavior has no impact on the guest.
[...]
> Oftentimes it's significantly easier to "enable" a feature in KVM, i.e.
> expose a feature to the guest, than it is to actually enable it in the
> kernel. Enabling KVM first doesn't work if there are hard dependencies on
> kernel enabling, e.g. most things that have an XSAVE component, but for a
> lot of features it's a viable strategy to enable KVM first, and then do all
> testing and debug inside a KVM guest.

I can see that aspect, but there were pretty clear messages in one of
the other threads:

 "It's not about whether or not host is clean. It's for the cases that
  users just don't want it enabled on host, to not break the
  applications or drivers that do have split lock issue."

 "My thought is for CSPs that they might not turn on SLD on their
  product environment. Any split lock in kernel or drivers may break
  their service for tenants."

which I back then called out as proliferating crap and ensuring that
this stuff never gets fixed. I still call it out as exactly that and you
know as well as I do that this is the reality.

For people like you who actually want to debug stuff in a guest, the
extra 10 lines of hack on top of the other 1000 lines of hacks you
already have are not really something which justifies to give hardware
and OS/application vendors the easy way out to avoid fixing their broken
crap.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-02 21:04 ` Thomas Gleixner @ 2020-04-02 21:16 ` Sean Christopherson 0 siblings, 0 replies; 76+ messages in thread From: Sean Christopherson @ 2020-04-02 21:16 UTC (permalink / raw) To: Thomas Gleixner Cc: Peter Zijlstra, Xiaoyao Li, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu, Fenghua Yu, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt On Thu, Apr 02, 2020 at 11:04:05PM +0200, Thomas Gleixner wrote: > Sean Christopherson <sean.j.christopherson@intel.com> writes: > > On Thu, Apr 02, 2020 at 08:51:48PM +0200, Peter Zijlstra wrote: > >> On Thu, Apr 02, 2020 at 10:51:28AM -0700, Sean Christopherson wrote: > >> > On Thu, Apr 02, 2020 at 07:34:35PM +0200, Thomas Gleixner wrote: > >> > > Aside of that I'm still against the attempt of proliferating crap, > >> > > i.e. disabling it because the host is triggering it and then exposing it > >> > > to guests. The above does not change my mind in any way. This proposal > >> > > is still wrong. > >> > > >> > Eh, I still think the "off in host, on in guest" is a legit scenario for > >> > debug/development/testing, but I agree that the added complexity doesn't > >> > justify the minimal benefits versus sld_warn. > >> > >> Off in host on in guest seems utterly insane to me. Why do you care > >> about that? > > > > For development/debug/testing. Ignoring the core-scope stupidity of split > > lock, the _functional_ behavior of the host kernel and guest kernel are > > completely separate. The host can generate split locks all it wants, but > > other than performance, its bad behavior has no impact on the guest. > > > > For example, all of the debug that was done to eliminate split locks in the > > kernel could have been done in a KVM guest, even though the host kernel > > would not have yet been split-lock free. > > > > It's somewhat of a moot point now that the kernel is split-lock free. 
But, > > if I encountered a split lock panic on my system, the first thing I would > > do (after rebooting) would be to fire up a VM to try and reproduce and > > debug the issue. > > > > Oftentimes it's significantly easier to "enable" a feature in KVM, i.e. > > expose a feature to the guest, than it is to actually enable it in the > > kernel. Enabling KVM first doesn't work if there are hard dependencies on > > kernel enabling, e.g. most things that have an XSAVE component, but for a > > lot of features it's a viable strategy to enable KVM first, and then do all > > testing and debug inside a KVM guest. > > I can see that aspect, but there were pretty clear messages in one of > the other threads: > > "It's not about whether or not host is clean. It's for the cases that > users just don't want it enabled on host, to not break the > applications or drivers that do have split lock issue." > > "My thought is for CSPs that they might not turn on SLD on their > product environment. Any split lock in kernel or drivers may break > their service for tenants." > > which I back then called out as proliferating crap and ensuring that > this stuff never gets fixed. Or more likely, gets fixed, just not in upstream :-) > I still call it out as exactly that and you know as well as I do that > this is the reality. > > For people like you who actually want to debug stuff in a guest, the > extra 10 lines of hack on top of the other 1000 lines of hacks you > already have are not really something which justifies to give hardware > and OS/application vendors the easy way out to avoid fixing their broken > crap. Ya, I see where you're coming from. As above, I agree that having a "KVM only" mode does more harm than good. ^ permalink raw reply [flat|nested] 76+ messages in thread
* RE: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-02 15:23 ` [patch v2 " Peter Zijlstra 2020-04-02 16:20 ` Xiaoyao Li @ 2020-04-03 8:09 ` David Laight 2020-04-03 14:33 ` Peter Zijlstra 1 sibling, 1 reply; 76+ messages in thread From: David Laight @ 2020-04-03 8:09 UTC (permalink / raw) To: 'Peter Zijlstra', Thomas Gleixner Cc: LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt From: Peter Zijlstra > Sent: 02 April 2020 16:24 > > I picked VMXOFF (which also appears in vmmon.ko) instead of VMXON > because the latter takes an argument and is therefore more difficult to > decode. ... > + while (text < text_end) { > + kernel_insn_init(&insn, text, text_end - text); > + insn_get_length(&insn); > + > + if (WARN_ON_ONCE(!insn_complete(&insn))) > + break; > + > + if (insn.length == 3 && > + (!memcmp(text, vmlaunch, sizeof(vmlaunch)) || > + !memcmp(text, vmxoff, sizeof(vmxoff)))) > + goto bad_module; > + > + text += insn.length; > + } How long is that going to take on a module with (say) 400k of text? David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch v2 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 8:09 ` David Laight @ 2020-04-03 14:33 ` Peter Zijlstra 0 siblings, 0 replies; 76+ messages in thread From: Peter Zijlstra @ 2020-04-03 14:33 UTC (permalink / raw) To: David Laight Cc: Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt On Fri, Apr 03, 2020 at 08:09:03AM +0000, David Laight wrote: > From: Peter Zijlstra > > Sent: 02 April 2020 16:24 > > > > I picked VMXOFF (which also appears in vmmon.ko) instead of VMXON > > because the latter takes an argument and is therefore more difficult to > > decode. > ... > > + while (text < text_end) { > > + kernel_insn_init(&insn, text, text_end - text); > > + insn_get_length(&insn); > > + > > + if (WARN_ON_ONCE(!insn_complete(&insn))) > > + break; > > + > > + if (insn.length == 3 && > > + (!memcmp(text, vmlaunch, sizeof(vmlaunch)) || > > + !memcmp(text, vmxoff, sizeof(vmxoff)))) > > + goto bad_module; > > + > > + text += insn.length; > > + } > > How long is that going to take on a module with (say) 400k of text? It's module load, why would you care? I suspect it's really fast, but even if it wasn't I couldn't be arsed. ^ permalink raw reply [flat|nested] 76+ messages in thread
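The scan loop under discussion can be modeled in a few lines of plain C. This is a hypothetical userspace sketch, not the patch itself: it matches the raw 3-byte encodings of VMLAUNCH (0f 01 c2) and VMXOFF (0f 01 c4) with a naive byte scan, whereas the real loop quoted above steps through instruction boundaries with the kernel's insn decoder (kernel_insn_init()/insn_get_length()), so the pattern cannot match bytes buried inside a longer instruction's operands or immediates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of the module-text scan: report whether
 * a buffer of machine code contains the VMLAUNCH or VMXOFF opcode.
 * Unlike the real patch, this does not decode instruction lengths, so
 * it can false-positive on opcode bytes inside operands of longer
 * instructions.
 */
static int text_has_vmx_insn(const unsigned char *text, size_t len)
{
	static const unsigned char vmlaunch[3] = { 0x0f, 0x01, 0xc2 };
	static const unsigned char vmxoff[3]   = { 0x0f, 0x01, 0xc4 };
	size_t i;

	for (i = 0; i + 3 <= len; i++) {
		if (!memcmp(text + i, vmlaunch, 3) ||
		    !memcmp(text + i, vmxoff, 3))
			return 1;
	}
	return 0;
}
```

Even this naive version is linear in the text size, which supports Peter's point that a one-time pass at module load is cheap; the decoder-driven version does the same single pass, just advancing by whole instructions.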
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner 2020-04-02 15:23 ` [patch v2 " Peter Zijlstra @ 2020-04-02 23:42 ` Rasmus Villemoes 2020-04-03 14:35 ` Jessica Yu 2020-04-03 11:29 ` kbuild test robot ` (3 subsequent siblings) 5 siblings, 1 reply; 76+ messages in thread From: Rasmus Villemoes @ 2020-04-02 23:42 UTC (permalink / raw) To: Thomas Gleixner, LKML Cc: x86, Kenneth R. Crudup, Peter Zijlstra (Intel), Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt On 02/04/2020 14.32, Thomas Gleixner wrote: > From: Peter Zijlstra <peterz@infradead.org> > > It turns out that with Split-Lock-Detect enabled (default) any VMX > hypervisor needs at least a little modification in order to not blindly > inject the #AC into the guest without the guest being ready for it. > > Since there is no telling which module implements a hypervisor, scan the > module text and look for the VMLAUNCH instruction. If found, the module is > assumed to be a hypervisor of some sort and SLD is disabled. How long does that scan take/add to module load time? Would it make sense to exempt in-tree modules? Rasmus ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-02 23:42 ` [patch " Rasmus Villemoes @ 2020-04-03 14:35 ` Jessica Yu 2020-04-03 15:21 ` Peter Zijlstra 0 siblings, 1 reply; 76+ messages in thread From: Jessica Yu @ 2020-04-03 14:35 UTC (permalink / raw) To: Rasmus Villemoes Cc: Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Peter Zijlstra (Intel), Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt +++ Rasmus Villemoes [03/04/20 01:42 +0200]: >On 02/04/2020 14.32, Thomas Gleixner wrote: >> From: Peter Zijlstra <peterz@infradead.org> >> >> It turns out that with Split-Lock-Detect enabled (default) any VMX >> hypervisor needs at least a little modification in order to not blindly >> inject the #AC into the guest without the guest being ready for it. >> >> Since there is no telling which module implements a hypervisor, scan the >> module text and look for the VMLAUNCH instruction. If found, the module is >> assumed to be a hypervisor of some sort and SLD is disabled. > >How long does that scan take/add to module load time? Would it make >sense to exempt in-tree modules? > >Rasmus I second Rasmus's question. It seems rather unfortunate that we have to do this text scan for every module load on x86, when it doesn't apply to the majority of them, and only to a handful of out-of-tree hypervisor modules (assuming kvm is taken care of already). I wonder if it would make sense then to limit the text scans to just out-of-tree modules (i.e., missing the intree modinfo flag)? Jessica ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 14:35 ` Jessica Yu @ 2020-04-03 15:21 ` Peter Zijlstra 2020-04-03 16:01 ` Sean Christopherson 2020-04-03 18:53 ` Thomas Gleixner 0 siblings, 2 replies; 76+ messages in thread From: Peter Zijlstra @ 2020-04-03 15:21 UTC (permalink / raw) To: Jessica Yu Cc: Rasmus Villemoes, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: > +++ Rasmus Villemoes [03/04/20 01:42 +0200]: > > On 02/04/2020 14.32, Thomas Gleixner wrote: > > > From: Peter Zijlstra <peterz@infradead.org> > > > > > > It turns out that with Split-Lock-Detect enabled (default) any VMX > > > hypervisor needs at least a little modification in order to not blindly > > > inject the #AC into the guest without the guest being ready for it. > > > > > > Since there is no telling which module implements a hypervisor, scan the > > > module text and look for the VMLAUNCH instruction. If found, the module is > > > assumed to be a hypervisor of some sort and SLD is disabled. > > > > How long does that scan take/add to module load time? Would it make > > sense to exempt in-tree modules? > > > > Rasmus > > I second Rasmus's question. It seems rather unfortunate that we have > to do this text scan for every module load on x86, when it doesn't > apply to the majority of them, and only to a handful of out-of-tree > hypervisor modules (assuming kvm is taken care of already). > > I wonder if it would make sense then to limit the text scans to just > out-of-tree modules (i.e., missing the intree modinfo flag)? It would; didn't know there was one. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 15:21 ` Peter Zijlstra @ 2020-04-03 16:01 ` Sean Christopherson 2020-04-03 16:12 ` Peter Zijlstra 2020-04-03 18:53 ` Thomas Gleixner 1 sibling, 1 reply; 76+ messages in thread From: Sean Christopherson @ 2020-04-03 16:01 UTC (permalink / raw) To: Peter Zijlstra Cc: Jessica Yu, Rasmus Villemoes, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook On Fri, Apr 03, 2020 at 05:21:58PM +0200, Peter Zijlstra wrote: > On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: > > +++ Rasmus Villemoes [03/04/20 01:42 +0200]: > > > On 02/04/2020 14.32, Thomas Gleixner wrote: > > > > From: Peter Zijlstra <peterz@infradead.org> > > > > > > > > It turns out that with Split-Lock-Detect enabled (default) any VMX > > > > hypervisor needs at least a little modification in order to not blindly > > > > inject the #AC into the guest without the guest being ready for it. > > > > > > > > Since there is no telling which module implements a hypervisor, scan the > > > > module text and look for the VMLAUNCH instruction. If found, the module is > > > > assumed to be a hypervisor of some sort and SLD is disabled. > > > > > > How long does that scan take/add to module load time? Would it make > > > sense to exempt in-tree modules? > > > > > > Rasmus > > > > I second Rasmus's question. It seems rather unfortunate that we have > > to do this text scan for every module load on x86, when it doesn't > > apply to the majority of them, and only to a handful of out-of-tree > > hypervisor modules (assuming kvm is taken care of already). > > > > I wonder if it would make sense then to limit the text scans to just > > out-of-tree modules (i.e., missing the intree modinfo flag)? > > It would; didn't know there was one. 
Rather than scanning modules at all, what about hooking native_write_cr4() to kill SLD if CR4.VMXE is toggled on and the caller didn't increment a "sld safe" counter? Partially tested patch incoming... ^ permalink raw reply [flat|nested] 76+ messages in thread
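Sean's proposal can be sketched as pure logic in userspace C. Everything here is illustrative, not the real native_write_cr4() plumbing: the counter name, the "fake" register, and the sld flag are all invented to show just the decision the hook would make.

```c
#include <assert.h>

#define X86_CR4_VMXE	(1UL << 13)

static unsigned long fake_cr4;		/* stands in for the real CR4    */
static int sld_safe_count;		/* bumped by SLD-aware hypervisors */
static int sld_enabled = 1;

static void sld_safe_get(void) { sld_safe_count++; }
static void sld_safe_put(void) { sld_safe_count--; }

/* Model of the hooked CR4 write: if VMXE goes 0 -> 1 and no caller has
 * declared itself SLD-safe, assume an unprepared VMX hypervisor is
 * starting up and turn Split-Lock-Detect off.
 */
static void hooked_write_cr4(unsigned long val)
{
	if ((val & X86_CR4_VMXE) && !(fake_cr4 & X86_CR4_VMXE) &&
	    !sld_safe_count)
		sld_enabled = 0;
	fake_cr4 = val;
}
```

The weakness Peter points out next applies directly: the hook only helps if the module actually goes through the kernel's CR4 accessor rather than writing the register with its own inline asm.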
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 16:01 ` Sean Christopherson @ 2020-04-03 16:12 ` Peter Zijlstra 2020-04-03 16:16 ` David Laight 2020-04-03 16:25 ` Sean Christopherson 0 siblings, 2 replies; 76+ messages in thread From: Peter Zijlstra @ 2020-04-03 16:12 UTC (permalink / raw) To: Sean Christopherson Cc: Jessica Yu, Rasmus Villemoes, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook On Fri, Apr 03, 2020 at 09:01:56AM -0700, Sean Christopherson wrote: > On Fri, Apr 03, 2020 at 05:21:58PM +0200, Peter Zijlstra wrote: > > On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: > > > I wonder if it would make sense then to limit the text scans to just > > > out-of-tree modules (i.e., missing the intree modinfo flag)? > > > > It would; didn't know there was one. > > Rather than scanning modules at all, what about hooking native_write_cr4() > to kill SLD if CR4.VMXE is toggled on and the caller didn't increment a > "sld safe" counter? And then you're hoping that the module uses that and not: asm volatile ("mov %0, cr4" :: "r" (val)); I think I feel safer with the scanning to be fair. Also with the intree hint on, we can extend the scanning for out-of-tree modules for more dodgy crap we really don't want modules to do, like for example the above. ^ permalink raw reply [flat|nested] 76+ messages in thread
* RE: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 16:12 ` Peter Zijlstra @ 2020-04-03 16:16 ` David Laight 2020-04-03 16:39 ` Peter Zijlstra 2020-04-03 16:25 ` Sean Christopherson 1 sibling, 1 reply; 76+ messages in thread From: David Laight @ 2020-04-03 16:16 UTC (permalink / raw) To: 'Peter Zijlstra', Sean Christopherson Cc: Jessica Yu, Rasmus Villemoes, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook From: Peter Zijlstra > Sent: 03 April 2020 17:12 > On Fri, Apr 03, 2020 at 09:01:56AM -0700, Sean Christopherson wrote: > > On Fri, Apr 03, 2020 at 05:21:58PM +0200, Peter Zijlstra wrote: > > > On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: > > > > > I wonder if it would make sense then to limit the text scans to just > > > > out-of-tree modules (i.e., missing the intree modinfo flag)? > > > > > > It would; didn't know there was one. > > > > Rather than scanning modules at all, what about hooking native_write_cr4() > > to kill SLD if CR4.VMXE is toggled on and the caller didn't increment a > > "sld safe" counter? > > And then you're hoping that the module uses that and not: > > asm volatile ("mov %0, cr4" :: "r" (val)); > > I think I feel safer with the scanning to be fair. Also with the intree > hint on, we can extend the scanning for out-of-tree modules for more > dodgy crap we really don't want modules to do, like for example the > above. Could you do the scanning in the last phase of the module build that has to be done against the target kernel headers and with the target kernel build infrastructure? David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales) ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 16:16 ` David Laight @ 2020-04-03 16:39 ` Peter Zijlstra 0 siblings, 0 replies; 76+ messages in thread From: Peter Zijlstra @ 2020-04-03 16:39 UTC (permalink / raw) To: David Laight Cc: Sean Christopherson, Jessica Yu, Rasmus Villemoes, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook On Fri, Apr 03, 2020 at 04:16:38PM +0000, David Laight wrote: > From: Peter Zijlstra > > Sent: 03 April 2020 17:12 > > On Fri, Apr 03, 2020 at 09:01:56AM -0700, Sean Christopherson wrote: > > > On Fri, Apr 03, 2020 at 05:21:58PM +0200, Peter Zijlstra wrote: > > > > On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: > > > > > > > I wonder if it would make sense then to limit the text scans to just > > > > > out-of-tree modules (i.e., missing the intree modinfo flag)? > > > > > > > > It would; didn't know there was one. > > > > > > Rather than scanning modules at all, what about hooking native_write_cr4() > > > to kill SLD if CR4.VMXE is toggled on and the caller didn't increment a > > > "sld safe" counter? > > > > And then you're hoping that the module uses that and not: > > > > asm volatile ("mov %0, cr4" :: "r" (val)); > > > > I think I feel safer with the scanning to be fair. Also with the intree > > hint on, we can extend the scanning for out-of-tree modules for more > > dodgy crap we really don't want modules to do, like for example the > > above. > > Could you do the scanning in the last phase of the module build > that has to be done against the target kernel headers and with the > target kernel build infrastructure? You think the binary module people care to respect our module build warnings? ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 16:12 ` Peter Zijlstra 2020-04-03 16:16 ` David Laight @ 2020-04-03 16:25 ` Sean Christopherson 2020-04-03 16:40 ` Peter Zijlstra 1 sibling, 1 reply; 76+ messages in thread From: Sean Christopherson @ 2020-04-03 16:25 UTC (permalink / raw) To: Peter Zijlstra Cc: Jessica Yu, Rasmus Villemoes, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook On Fri, Apr 03, 2020 at 06:12:05PM +0200, Peter Zijlstra wrote: > On Fri, Apr 03, 2020 at 09:01:56AM -0700, Sean Christopherson wrote: > > On Fri, Apr 03, 2020 at 05:21:58PM +0200, Peter Zijlstra wrote: > > > On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: > > > > > I wonder if it would make sense then to limit the text scans to just > > > > out-of-tree modules (i.e., missing the intree modinfo flag)? > > > > > > It would; didn't know there was one. > > > > Rather than scanning modules at all, what about hooking native_write_cr4() > > to kill SLD if CR4.VMXE is toggled on and the caller didn't increment a > > "sld safe" counter? > > And then you're hoping that the module uses that and not: > > asm volatile ("mov %0, cr4" :: "r" (val)); > > I think I feel safer with the scanning to be fair. Also with the intree > hint on, we can extend the scanning for out-of-tree modules for more > dodgy crap we really don't want modules to do, like for example the > above. Ya, that's the big unknown. But wouldn't they already be broken in the sense that they'd corrupt the CR4 shadow? E.g. setting VMXE without updating cpu_tlbstate.cr4 would result in future in-kernel writes to CR4 attempting to clear CR4.VMXE post-VMXON, which would #GP. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 16:25 ` Sean Christopherson @ 2020-04-03 16:40 ` Peter Zijlstra 2020-04-03 16:48 ` Nadav Amit 0 siblings, 1 reply; 76+ messages in thread From: Peter Zijlstra @ 2020-04-03 16:40 UTC (permalink / raw) To: Sean Christopherson Cc: Jessica Yu, Rasmus Villemoes, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook On Fri, Apr 03, 2020 at 09:25:55AM -0700, Sean Christopherson wrote: > On Fri, Apr 03, 2020 at 06:12:05PM +0200, Peter Zijlstra wrote: > > On Fri, Apr 03, 2020 at 09:01:56AM -0700, Sean Christopherson wrote: > > > On Fri, Apr 03, 2020 at 05:21:58PM +0200, Peter Zijlstra wrote: > > > > On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: > > > > > > > I wonder if it would make sense then to limit the text scans to just > > > > > out-of-tree modules (i.e., missing the intree modinfo flag)? > > > > > > > > It would; didn't know there was one. > > > > > > Rather than scanning modules at all, what about hooking native_write_cr4() > > > to kill SLD if CR4.VMXE is toggled on and the caller didn't increment a > > > "sld safe" counter? > > > > And then you're hoping that the module uses that and not: > > > > asm volatile ("mov %0, cr4" :: "r" (val)); > > > > I think I feel safer with the scanning to be fair. Also with the intree > > hint on, we can extend the scanning for out-of-tree modules for more > > dodgy crap we really don't want modules to do, like for example the > > above. > > Ya, that's the big uknown. But wouldn't they'd already be broken in the > sense that they'd corrupt the CR4 shadow? E.g. setting VMXE without > updating cpu_tlbstate.cr4 would result in future in-kernel writes to CR4 > attempting to clear CR4.VMXE post-VMXON, which would #GP. 
Sadly the CR4 shadow is exported, so they can actually fix that up :/ ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 16:40 ` Peter Zijlstra @ 2020-04-03 16:48 ` Nadav Amit 2020-04-03 17:21 ` Sean Christopherson 0 siblings, 1 reply; 76+ messages in thread From: Nadav Amit @ 2020-04-03 16:48 UTC (permalink / raw) To: Peter Zijlstra Cc: Sean Christopherson, Jessica Yu, Rasmus Villemoes, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook > On Apr 3, 2020, at 9:40 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > On Fri, Apr 03, 2020 at 09:25:55AM -0700, Sean Christopherson wrote: >> On Fri, Apr 03, 2020 at 06:12:05PM +0200, Peter Zijlstra wrote: >>> On Fri, Apr 03, 2020 at 09:01:56AM -0700, Sean Christopherson wrote: >>>> On Fri, Apr 03, 2020 at 05:21:58PM +0200, Peter Zijlstra wrote: >>>>> On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: >>> >>>>>> I wonder if it would make sense then to limit the text scans to just >>>>>> out-of-tree modules (i.e., missing the intree modinfo flag)? >>>>> >>>>> It would; didn't know there was one. >>>> >>>> Rather than scanning modules at all, what about hooking native_write_cr4() >>>> to kill SLD if CR4.VMXE is toggled on and the caller didn't increment a >>>> "sld safe" counter? >>> >>> And then you're hoping that the module uses that and not: >>> >>> asm volatile ("mov %0, cr4" :: "r" (val)); >>> >>> I think I feel safer with the scanning to be fair. Also with the intree >>> hint on, we can extend the scanning for out-of-tree modules for more >>> dodgy crap we really don't want modules to do, like for example the >>> above. >> >> Ya, that's the big uknown. But wouldn't they'd already be broken in the >> sense that they'd corrupt the CR4 shadow? E.g. setting VMXE without >> updating cpu_tlbstate.cr4 would result in future in-kernel writes to CR4 >> attempting to clear CR4.VMXE post-VMXON, which would #GP. 
> > Sadly the CR4 shadow is exported, so they can actually fix that up :/ I do not think that Sean’s idea would work for VMware. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 16:48 ` Nadav Amit @ 2020-04-03 17:21 ` Sean Christopherson 0 siblings, 0 replies; 76+ messages in thread From: Sean Christopherson @ 2020-04-03 17:21 UTC (permalink / raw) To: Nadav Amit Cc: Peter Zijlstra, Jessica Yu, Rasmus Villemoes, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook On Fri, Apr 03, 2020 at 04:48:35PM +0000, Nadav Amit wrote: > > On Apr 3, 2020, at 9:40 AM, Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Fri, Apr 03, 2020 at 09:25:55AM -0700, Sean Christopherson wrote: > >> On Fri, Apr 03, 2020 at 06:12:05PM +0200, Peter Zijlstra wrote: > >>> On Fri, Apr 03, 2020 at 09:01:56AM -0700, Sean Christopherson wrote: > >>>> On Fri, Apr 03, 2020 at 05:21:58PM +0200, Peter Zijlstra wrote: > >>>>> On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: > >>> > >>>>>> I wonder if it would make sense then to limit the text scans to just > >>>>>> out-of-tree modules (i.e., missing the intree modinfo flag)? > >>>>> > >>>>> It would; didn't know there was one. > >>>> > >>>> Rather than scanning modules at all, what about hooking native_write_cr4() > >>>> to kill SLD if CR4.VMXE is toggled on and the caller didn't increment a > >>>> "sld safe" counter? > >>> > >>> And then you're hoping that the module uses that and not: > >>> > >>> asm volatile ("mov %0, cr4" :: "r" (val)); > >>> > >>> I think I feel safer with the scanning to be fair. Also with the intree > >>> hint on, we can extend the scanning for out-of-tree modules for more > >>> dodgy crap we really don't want modules to do, like for example the > >>> above. > >> > >> Ya, that's the big uknown. But wouldn't they'd already be broken in the > >> sense that they'd corrupt the CR4 shadow? E.g. 
setting VMXE without > >> updating cpu_tlbstate.cr4 would result in future in-kernel writes to CR4 > >> attempting to clear CR4.VMXE post-VMXON, which would #GP. > > > > Sadly the CR4 shadow is exported, so they can actually fix that up :/ > > I do not think that Sean’s idea would work for VMware. Well phooey. ^ permalink raw reply [flat|nested] 76+ messages in thread
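The shadow-corruption scenario Sean describes above is easy to model. A hedged userspace sketch, with all names invented for illustration: the kernel keeps a cached copy of CR4 and writes "shadow | bits" on updates, so a module that sets CR4.VMXE behind the cache's back turns the next unrelated cr4_set_bits()-style update into a write with VMXE clear, which faults once VMXON has been executed.

```c
#include <assert.h>

#define X86_CR4_VMXE	(1UL << 13)
#define X86_CR4_PCE	(1UL << 8)

static unsigned long real_cr4;		/* what the "CPU" actually holds */
static unsigned long shadow_cr4;	/* kernel's cached copy          */
static int vmxon_active;
static int got_gp;

/* Per the SDM, clearing CR4.VMXE while in VMX operation raises #GP. */
static void cpu_write_cr4(unsigned long val)
{
	if (vmxon_active && !(val & X86_CR4_VMXE)) {
		got_gp = 1;
		return;
	}
	real_cr4 = val;
}

/* Kernel-style update: modify the shadow, then write the register. */
static void cr4_set_bits(unsigned long bits)
{
	shadow_cr4 |= bits;
	cpu_write_cr4(shadow_cr4);
}

/* A module that writes CR4 directly and never touches the shadow. */
static void rogue_module_vmxon(void)
{
	cpu_write_cr4(real_cr4 | X86_CR4_VMXE);
	vmxon_active = 1;
}
```

As Peter notes, since the shadow itself is exported, an out-of-tree module can patch the cached copy up too, so "they'd already be broken" doesn't reliably hold.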
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 15:21 ` Peter Zijlstra 2020-04-03 16:01 ` Sean Christopherson @ 2020-04-03 18:53 ` Thomas Gleixner 2020-04-03 20:58 ` Andy Lutomirski 1 sibling, 1 reply; 76+ messages in thread From: Thomas Gleixner @ 2020-04-03 18:53 UTC (permalink / raw) To: Peter Zijlstra, Jessica Yu Cc: Rasmus Villemoes, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook, vbox-dev Peter Zijlstra <peterz@infradead.org> writes: > On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: >> +++ Rasmus Villemoes [03/04/20 01:42 +0200]: >> > On 02/04/2020 14.32, Thomas Gleixner wrote: >> > > From: Peter Zijlstra <peterz@infradead.org> >> > > >> > > It turns out that with Split-Lock-Detect enabled (default) any VMX >> > > hypervisor needs at least a little modification in order to not blindly >> > > inject the #AC into the guest without the guest being ready for it. >> > > >> > > Since there is no telling which module implements a hypervisor, scan the >> > > module text and look for the VMLAUNCH instruction. If found, the module is >> > > assumed to be a hypervisor of some sort and SLD is disabled. >> > >> > How long does that scan take/add to module load time? Would it make >> > sense to exempt in-tree modules? >> > >> > Rasmus >> >> I second Rasmus's question. It seems rather unfortunate that we have >> to do this text scan for every module load on x86, when it doesn't >> apply to the majority of them, and only to a handful of out-of-tree >> hypervisor modules (assuming kvm is taken care of already). >> >> I wonder if it would make sense then to limit the text scans to just >> out-of-tree modules (i.e., missing the intree modinfo flag)? > > It would; didn't know there was one. But that still would not make it complete. 
I was staring at virtualbox today after Jann pointed out that this sucker does completely backwards things. The kernel driver does not contain any VM* instructions at all. The actual hypervisor code is built as a separate binary and somehow loaded into the kernel with their own magic fixup of relocations and function linking. This "design" probably comes from the original virtualbox implementation which circumvented GPL that way. TBH, I don't care if we wreck virtualbox simply because that thing is already a complete and utter trainwreck violating taste and common sense in any possible way. Just for illustration: - It installs preempt notifiers and the first thing in the callback function is to issue 'stac()'! - There is quite some other horrible code in there which fiddles in the guts of the kernel just because it can. - Conditionals in release code which check stuff like VBOX_WITH_TEXT_MODMEM_HACK, VBOX_WITH_EFLAGS_AC_SET_IN_VBOXDRV, VBOX_WITH_NON_PROD_HACK_FOR_PERF_STACKS along with the most absurd hacks ever. If you feel the need to look yourself, please use your eyecancer protection gear. Can someone at Oracle please make sure that this monstrosity gets shredded to pieces? Enough vented, but that still does not solve the SLD problem in any sensible way. Thanks, tglx ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 18:53 ` Thomas Gleixner @ 2020-04-03 20:58 ` Andy Lutomirski 2020-04-03 21:49 ` Thomas Gleixner 0 siblings, 1 reply; 76+ messages in thread From: Andy Lutomirski @ 2020-04-03 20:58 UTC (permalink / raw) To: Thomas Gleixner Cc: Peter Zijlstra, Jessica Yu, Rasmus Villemoes, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook, vbox-dev > On Apr 3, 2020, at 11:54 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > > Peter Zijlstra <peterz@infradead.org> writes: >>> On Fri, Apr 03, 2020 at 04:35:00PM +0200, Jessica Yu wrote: >>> +++ Rasmus Villemoes [03/04/20 01:42 +0200]: >>>> On 02/04/2020 14.32, Thomas Gleixner wrote: >>>>> From: Peter Zijlstra <peterz@infradead.org> >>>>> >>>>> It turns out that with Split-Lock-Detect enabled (default) any VMX >>>>> hypervisor needs at least a little modification in order to not blindly >>>>> inject the #AC into the guest without the guest being ready for it. >>>>> >>>>> Since there is no telling which module implements a hypervisor, scan the >>>>> module text and look for the VMLAUNCH instruction. If found, the module is >>>>> assumed to be a hypervisor of some sort and SLD is disabled. >>>> >>>> How long does that scan take/add to module load time? Would it make >>>> sense to exempt in-tree modules? >>>> >>>> Rasmus >>> >>> I second Rasmus's question. It seems rather unfortunate that we have >>> to do this text scan for every module load on x86, when it doesn't >>> apply to the majority of them, and only to a handful of out-of-tree >>> hypervisor modules (assuming kvm is taken care of already). >>> >>> I wonder if it would make sense then to limit the text scans to just >>> out-of-tree modules (i.e., missing the intree modinfo flag)? >> >> It would; didn't know there was one. 
> But that still would not make it complete.
>
> I was staring at virtualbox today after Jann pointed out that this
> sucker does completely backwards things.
>
> The kernel driver does not contain any VM* instructions at all.
>
> The actual hypervisor code is built as a separate binary and somehow
> loaded into the kernel with its own magic fixup of relocations and
> function linking. This "design" probably comes from the original
> virtualbox implementation, which circumvented the GPL that way.
>
> TBH, I don't care if we wreck virtualbox, simply because that thing is
> already a complete and utter trainwreck violating taste and common sense
> in every possible way. Just for illustration:
>
> - It installs preempt notifiers, and the first thing in the callback
>   function is to issue 'stac()'!
>
> - There is plenty of other horrible code in there which fiddles in
>   the guts of the kernel just because it can.
>
> - Conditionals in release code which check stuff like
>   VBOX_WITH_TEXT_MODMEM_HACK, VBOX_WITH_EFLAGS_AC_SET_IN_VBOXDRV,
>   VBOX_WITH_NON_PROD_HACK_FOR_PERF_STACKS, along with the most absurd
>   hacks ever.
>
> If you feel the need to look yourself, please use your eye-cancer
> protection gear.
>
> Can someone at Oracle please make sure that this monstrosity gets
> shredded to pieces?
>
> Enough vented, but that still does not solve the SLD problem in any
> sensible way.

Could we unexport set_memory_x perhaps? And maybe try to make virtualbox break in as many ways as possible?

> Thanks,
>
> tglx

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 20:58 ` Andy Lutomirski @ 2020-04-03 21:49 ` Thomas Gleixner 0 siblings, 0 replies; 76+ messages in thread From: Thomas Gleixner @ 2020-04-03 21:49 UTC (permalink / raw) To: Andy Lutomirski Cc: Peter Zijlstra, Jessica Yu, Rasmus Villemoes, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt, Greg Kroah-Hartman, jannh, keescook, vbox-dev

Andy Lutomirski <luto@amacapital.net> writes:
>> On Apr 3, 2020, at 11:54 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>> Peter Zijlstra <peterz@infradead.org> writes:
>> Enough vented, but that still does not solve the SLD problem in any
>> sensible way.
>
> Could we unexport set_memory_x perhaps? And maybe try to make
> virtualbox break in as many ways as possible?

set_memory_x() is not exported anymore, but they have found new ways to circumvent that; set_memory_x() is only used on really old kernels.

Thanks,

tglx

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner 2020-04-02 15:23 ` [patch v2 " Peter Zijlstra 2020-04-02 23:42 ` [patch " Rasmus Villemoes @ 2020-04-03 11:29 ` kbuild test robot 2020-04-03 14:43 ` kbuild test robot ` (2 subsequent siblings) 5 siblings, 0 replies; 76+ messages in thread From: kbuild test robot @ 2020-04-03 11:29 UTC (permalink / raw) To: Thomas Gleixner Cc: kbuild-all, LKML, x86@kernel.org, Kenneth R. Crudup , Peter Zijlstra (Intel), Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt [-- Attachment #1: Type: text/plain, Size: 1965 bytes --] Hi Thomas, I love your patch! Yet something to improve: [auto build test ERROR on tip/auto-latest] [also build test ERROR on next-20200403] [cannot apply to jeyu/modules-next tip/x86/core v5.6] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. 
BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Thomas-Gleixner/x86-Prevent-Split-Lock-Detection-wreckage-on-VMX-hypervisors/20200403-171430 base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git ee8bac724cc7767dcf9480afb656318994f22c3d config: um-x86_64_defconfig (attached as .config) compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0 reproduce: # save the attached .config to linux build tree make ARCH=um SUBARCH=x86_64 If you fix the issue, kindly add following tag Reported-by: kbuild test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): /usr/bin/ld: arch/x86/um/../kernel/module.o: in function `module_finalize': >> module.c:(.text+0x315): undefined reference to `split_lock_validate_module_text' >> collect2: error: ld returned 1 exit status -- In file included from arch/x86/um/../kernel/module.c:27:0: >> arch/x86/include/asm/cpu.h:38:31: warning: 'struct cpuinfo_x86' declared inside parameter list will not be visible outside of this definition or declaration int mwait_usable(const struct cpuinfo_x86 *); ^~~~~~~~~~~ arch/x86/include/asm/cpu.h:44:49: warning: 'struct cpuinfo_x86' declared inside parameter list will not be visible outside of this definition or declaration extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c); ^~~~~~~~~~~ --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 8505 bytes --] ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner ` (2 preceding siblings ...) 2020-04-03 11:29 ` kbuild test robot @ 2020-04-03 14:43 ` kbuild test robot 2020-04-03 16:36 ` Sean Christopherson 2020-04-06 12:23 ` Christoph Hellwig 5 siblings, 0 replies; 76+ messages in thread From: kbuild test robot @ 2020-04-03 14:43 UTC (permalink / raw) To: Thomas Gleixner Cc: kbuild-all, LKML, x86@kernel.org, Kenneth R. Crudup , Peter Zijlstra (Intel), Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt [-- Attachment #1: Type: text/plain, Size: 3326 bytes --] Hi Thomas, I love your patch! Yet something to improve: [auto build test ERROR on tip/auto-latest] [also build test ERROR on next-20200403] [cannot apply to jeyu/modules-next tip/x86/core v5.6] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. 
BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Thomas-Gleixner/x86-Prevent-Split-Lock-Detection-wreckage-on-VMX-hypervisors/20200403-171430 base: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git ee8bac724cc7767dcf9480afb656318994f22c3d config: x86_64-randconfig-s1-20200403 (attached as .config) compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 If you fix the issue, kindly add following tag Reported-by: kbuild test robot <lkp@intel.com> All errors (new ones prefixed by >>): arch/x86/kernel/module.c: In function 'module_finalize': >> arch/x86/kernel/module.c:257:17: error: 'struct module' has no member named 'sld_safe' if (text && !me->sld_safe) { ^~ vim +257 arch/x86/kernel/module.c 220 221 int module_finalize(const Elf_Ehdr *hdr, 222 const Elf_Shdr *sechdrs, 223 struct module *me) 224 { 225 const Elf_Shdr *s, *text = NULL, *alt = NULL, *locks = NULL, 226 *para = NULL, *orc = NULL, *orc_ip = NULL; 227 char *secstrings = (void *)hdr + sechdrs[hdr->e_shstrndx].sh_offset; 228 229 for (s = sechdrs; s < sechdrs + hdr->e_shnum; s++) { 230 if (!strcmp(".text", secstrings + s->sh_name)) 231 text = s; 232 if (!strcmp(".altinstructions", secstrings + s->sh_name)) 233 alt = s; 234 if (!strcmp(".smp_locks", secstrings + s->sh_name)) 235 locks = s; 236 if (!strcmp(".parainstructions", secstrings + s->sh_name)) 237 para = s; 238 if (!strcmp(".orc_unwind", secstrings + s->sh_name)) 239 orc = s; 240 if (!strcmp(".orc_unwind_ip", secstrings + s->sh_name)) 241 orc_ip = s; 242 } 243 244 if (alt) { 245 /* patch .altinstructions */ 246 void *aseg = (void *)alt->sh_addr; 247 apply_alternatives(aseg, aseg + alt->sh_size); 248 } 249 if (locks && text) { 250 void *lseg = (void *)locks->sh_addr; 251 void *tseg = (void *)text->sh_addr; 252 alternatives_smp_module_add(me, 
me->name, 253 lseg, lseg + locks->sh_size, 254 tseg, tseg + text->sh_size); 255 } 256 > 257 if (text && !me->sld_safe) { 258 void *tseg = (void *)text->sh_addr; 259 split_lock_validate_module_text(me, tseg, tseg + text->sh_size); 260 } 261 262 if (para) { 263 void *pseg = (void *)para->sh_addr; 264 apply_paravirt(pseg, pseg + para->sh_size); 265 } 266 267 /* make jump label nops */ 268 jump_label_apply_nops(me); 269 270 if (orc && orc_ip) 271 unwind_module_init(me, (void *)orc_ip->sh_addr, orc_ip->sh_size, 272 (void *)orc->sh_addr, orc->sh_size); 273 274 return 0; 275 } 276 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 36422 bytes --] ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner ` (3 preceding siblings ...) 2020-04-03 14:43 ` kbuild test robot @ 2020-04-03 16:36 ` Sean Christopherson 2020-04-03 16:41 ` Peter Zijlstra 2020-04-06 12:23 ` Christoph Hellwig 5 siblings, 1 reply; 76+ messages in thread From: Sean Christopherson @ 2020-04-03 16:36 UTC (permalink / raw) To: Thomas Gleixner Cc: LKML, x86, Kenneth R. Crudup, Peter Zijlstra (Intel), Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt

On Thu, Apr 02, 2020 at 02:32:59PM +0200, Thomas Gleixner wrote:
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> @@ -24,6 +24,7 @@
>  #include <asm/pgtable.h>
>  #include <asm/setup.h>
>  #include <asm/unwind.h>
> +#include <asm/cpu.h>
>
>  #if 0
>  #define DEBUGP(fmt, ...) \
> @@ -253,6 +254,11 @@ int module_finalize(const Elf_Ehdr *hdr,
>  			    tseg, tseg + text->sh_size);
>  	}
>
> +	if (text && !me->sld_safe) {

As also reported by the test bot, sld_safe only exists if CPU_SUP_INTEL=y.

This can also be conditioned on boot_cpu_has(X86_FEATURE_VMX), or the static variant. If CPU_SUP_INTEL=y, X86_FEATURE_VMX will be set if and only if VMX is fully enabled, i.e. supported by the CPU and enabled in MSR_IA32_FEATURE_CONTROL.

> +		void *tseg = (void *)text->sh_addr;
> +		split_lock_validate_module_text(me, tseg, tseg + text->sh_size);
> +	}
> +
>  	if (para) {
>  		void *pseg = (void *)para->sh_addr;
>  		apply_paravirt(pseg, pseg + para->sh_size);
> --- a/include/linux/module.h
> +++ b/include/linux/module.h
> @@ -407,6 +407,10 @@ struct module {
>  	bool sig_ok;
>  #endif
>
> +#ifdef CONFIG_CPU_SUP_INTEL
> +	bool sld_safe;
> +#endif
> +
>  	bool async_probe_requested;
>
>  	/* symbols that will be GPL-only in the near future. */
> --- a/kernel/module.c
> +++ b/kernel/module.c
> @@ -3096,6 +3096,11 @@ static int check_modinfo(struct module *
>  			     "is unknown, you have been warned.\n", mod->name);
>  	}
>
> +#ifdef CONFIG_CPU_SUP_INTEL
> +	if (get_modinfo(info, "sld_safe"))
> +		mod->sld_safe = true;
> +#endif
> +
>  	err = check_modinfo_livepatch(mod, info);
>  	if (err)
>  		return err;
>

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 16:36 ` Sean Christopherson @ 2020-04-03 16:41 ` Peter Zijlstra 2020-04-03 18:35 ` Jessica Yu 0 siblings, 1 reply; 76+ messages in thread From: Peter Zijlstra @ 2020-04-03 16:41 UTC (permalink / raw) To: Sean Christopherson Cc: Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt On Fri, Apr 03, 2020 at 09:36:05AM -0700, Sean Christopherson wrote: > On Thu, Apr 02, 2020 at 02:32:59PM +0200, Thomas Gleixner wrote: > > --- a/arch/x86/kernel/module.c > > +++ b/arch/x86/kernel/module.c > > @@ -24,6 +24,7 @@ > > #include <asm/pgtable.h> > > #include <asm/setup.h> > > #include <asm/unwind.h> > > +#include <asm/cpu.h> > > > > #if 0 > > #define DEBUGP(fmt, ...) \ > > @@ -253,6 +254,11 @@ int module_finalize(const Elf_Ehdr *hdr, > > tseg, tseg + text->sh_size); > > } > > > > + if (text && !me->sld_safe) { > > As also reported by the test bot, sld_safe only exist if CPU_SUP_INTEL=y. > > This can also be conditioned on boot_cpu_has(X86_FEATURE_VMX), or the > static variant. If CPU_SUP_INTEL=y, X86_FEATURE_VMX will be set if and > only if VMX is fully enabled, i.e. supported by the CPU and enabled in > MSR_IA32_FEATURE_CONTROl. > > > + void *tseg = (void *)text->sh_addr; > > + split_lock_validate_module_text(me, tseg, tseg + text->sh_size); > > + } Ideally we push it all into arch code, but load_info isn't exposed to arch module code :/. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-03 16:41 ` Peter Zijlstra @ 2020-04-03 18:35 ` Jessica Yu 0 siblings, 0 replies; 76+ messages in thread From: Jessica Yu @ 2020-04-03 18:35 UTC (permalink / raw) To: Peter Zijlstra Cc: Sean Christopherson, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt +++ Peter Zijlstra [03/04/20 18:41 +0200]: >On Fri, Apr 03, 2020 at 09:36:05AM -0700, Sean Christopherson wrote: >> On Thu, Apr 02, 2020 at 02:32:59PM +0200, Thomas Gleixner wrote: >> > --- a/arch/x86/kernel/module.c >> > +++ b/arch/x86/kernel/module.c >> > @@ -24,6 +24,7 @@ >> > #include <asm/pgtable.h> >> > #include <asm/setup.h> >> > #include <asm/unwind.h> >> > +#include <asm/cpu.h> >> > >> > #if 0 >> > #define DEBUGP(fmt, ...) \ >> > @@ -253,6 +254,11 @@ int module_finalize(const Elf_Ehdr *hdr, >> > tseg, tseg + text->sh_size); >> > } >> > >> > + if (text && !me->sld_safe) { >> >> As also reported by the test bot, sld_safe only exist if CPU_SUP_INTEL=y. >> >> This can also be conditioned on boot_cpu_has(X86_FEATURE_VMX), or the >> static variant. If CPU_SUP_INTEL=y, X86_FEATURE_VMX will be set if and >> only if VMX is fully enabled, i.e. supported by the CPU and enabled in >> MSR_IA32_FEATURE_CONTROl. >> >> > + void *tseg = (void *)text->sh_addr; >> > + split_lock_validate_module_text(me, tseg, tseg + text->sh_size); >> > + } > >Ideally we push it all into arch code, but load_info isn't exposed to >arch module code :/. Hm, I can look into exposing load_info to arch module code and will post a patch on Monday. Jessica ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner ` (4 preceding siblings ...) 2020-04-03 16:36 ` Sean Christopherson @ 2020-04-06 12:23 ` Christoph Hellwig 2020-04-06 14:40 ` Peter Zijlstra 5 siblings, 1 reply; 76+ messages in thread From: Christoph Hellwig @ 2020-04-06 12:23 UTC (permalink / raw) To: Thomas Gleixner Cc: LKML, x86, Kenneth R. Crudup, Peter Zijlstra (Intel), Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt

On Thu, Apr 02, 2020 at 02:32:59PM +0200, Thomas Gleixner wrote:
> From: Peter Zijlstra <peterz@infradead.org>
>
> It turns out that with Split-Lock-Detect enabled (default) any VMX
> hypervisor needs at least a little modification in order to not blindly
> inject the #AC into the guest without the guest being ready for it.
>
> Since there is no telling which module implements a hypervisor, scan the
> module text and look for the VMLAUNCH instruction. If found, the module is
> assumed to be a hypervisor of some sort and SLD is disabled.
>
> Hypervisors, which have been modified and are known to work correctly,
> can add:
>
>   MODULE_INFO(sld_safe, "Y");
>
> to explicitly tell the module loader they're good.
>
> NOTE: it is unfortunate that struct load_info is not available to the
>       arch module code, this means CONFIG_CPU_SUP_INTEL gunk is needed
>       in generic code.
>
> NOTE: while we can 'trivially' fix KVM, we're still stuck with stuff
>       like VMware and VirtualBox doing their own thing.

This is just crazy. We have never cared about any out-of-tree module, so why would we care here, where it creates real complexity? Just fix KVM and ignore anything else.

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-06 12:23 ` Christoph Hellwig @ 2020-04-06 14:40 ` Peter Zijlstra 2020-04-06 15:18 ` Christoph Hellwig 0 siblings, 1 reply; 76+ messages in thread From: Peter Zijlstra @ 2020-04-06 14:40 UTC (permalink / raw) To: Christoph Hellwig Cc: Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt On Mon, Apr 06, 2020 at 05:23:43AM -0700, Christoph Hellwig wrote: > On Thu, Apr 02, 2020 at 02:32:59PM +0200, Thomas Gleixner wrote: > > From: Peter Zijlstra <peterz@infradead.org> > > > > It turns out that with Split-Lock-Detect enabled (default) any VMX > > hypervisor needs at least a little modification in order to not blindly > > inject the #AC into the guest without the guest being ready for it. > > > > Since there is no telling which module implements a hypervisor, scan the > > module text and look for the VMLAUNCH instruction. If found, the module is > > assumed to be a hypervisor of some sort and SLD is disabled. > > > > Hypervisors, which have been modified and are known to work correctly, > > can add: > > > > MODULE_INFO(sld_safe, "Y"); > > > > to explicitly tell the module loader they're good. > > > > NOTE: it is unfortunate that struct load_info is not available to the > > arch module code, this means CONFIG_CPU_SUP_INTEL gunk is needed > > in generic code. > > > > NOTE: while we can 'trivially' fix KVM, we're still stuck with stuff > > like VMware and VirtualBox doing their own thing. > > This is just crazy. We have never cared about any out tree module, why > would we care here where it creates a real complexity. Just fix KVM > and ignore anything else. It is absolutely bonkers, but at the same time we can extend this infrastructure to scan for dubious code patterns we don't want to support. Like for instance direct manipulation of CR4. 
Look at it as another layer to enforce sanity on binary-only modules. Do we want to go that way, and do you know of other code patterns you'd want to fail module loading for? ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-06 14:40 ` Peter Zijlstra @ 2020-04-06 15:18 ` Christoph Hellwig 2020-04-06 15:22 ` Peter Zijlstra 0 siblings, 1 reply; 76+ messages in thread From: Christoph Hellwig @ 2020-04-06 15:18 UTC (permalink / raw) To: Peter Zijlstra Cc: Christoph Hellwig, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt On Mon, Apr 06, 2020 at 04:40:20PM +0200, Peter Zijlstra wrote: > It is absolutely bonkers, but at the same time we can extend this > infrastructure to scan for dubious code patterns we don't want to > support. Like for instance direct manipulation of CR4. But that is not what this code does - it disables split lock detection. If it failed to load the module the whole thing would make a little more sense. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-06 15:18 ` Christoph Hellwig @ 2020-04-06 15:22 ` Peter Zijlstra 2020-04-06 18:27 ` Steven Rostedt 0 siblings, 1 reply; 76+ messages in thread From: Peter Zijlstra @ 2020-04-06 15:22 UTC (permalink / raw) To: Christoph Hellwig Cc: Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt

On Mon, Apr 06, 2020 at 08:18:47AM -0700, Christoph Hellwig wrote:
> On Mon, Apr 06, 2020 at 04:40:20PM +0200, Peter Zijlstra wrote:
> > It is absolutely bonkers, but at the same time we can extend this
> > infrastructure to scan for dubious code patterns we don't want to
> > support. Like for instance direct manipulation of CR4.
>
> But that is not what this code does - it disables split lock detection.
> If it failed to load the module the whole thing would make a little
> more sense.

If this lives, it'll be just to fail module loading. IIRC the same was suggested elsewhere in the thread.

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect 2020-04-06 15:22 ` Peter Zijlstra @ 2020-04-06 18:27 ` Steven Rostedt 0 siblings, 0 replies; 76+ messages in thread From: Steven Rostedt @ 2020-04-06 18:27 UTC (permalink / raw) To: Peter Zijlstra Cc: Christoph Hellwig, Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck

On Mon, 6 Apr 2020 17:22:31 +0200 Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Apr 06, 2020 at 08:18:47AM -0700, Christoph Hellwig wrote:
> > On Mon, Apr 06, 2020 at 04:40:20PM +0200, Peter Zijlstra wrote:
> > > It is absolutely bonkers, but at the same time we can extend this
> > > infrastructure to scan for dubious code patterns we don't want to
> > > support. Like for instance direct manipulation of CR4.
> >
> > But that is not what this code does - it disables split lock detection.
> > If it failed to load the module the whole thing would make a little
> > more sense.
>
> If this lives, it'll be just to fail module loading. IIRC the same
> was suggested elsewhere in the thread.

I believe I may have been the one to suggest it. It's no different than breaking kABI if you ask me. If a module can't deal with a new feature, then it should not be able to load. And let whoever owns that module fix it for their users.

-- Steve

^ permalink raw reply [flat|nested] 76+ messages in thread
* [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage 2020-04-02 12:32 [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors Thomas Gleixner 2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner @ 2020-04-02 12:33 ` Thomas Gleixner 2020-04-02 15:30 ` Sean Christopherson 2020-04-02 15:55 ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Sean Christopherson 2020-04-02 13:43 ` [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors Kenneth R. Crudup 2 siblings, 2 replies; 76+ messages in thread From: Thomas Gleixner @ 2020-04-02 12:33 UTC (permalink / raw) To: LKML Cc: x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Peter Zijlstra (Intel), Jessica Yu, Steven Rostedt

Without at least minimal handling for split lock detection induced #AC, VMX will just run into the same problem as the VMWare hypervisor, which was reported by Kenneth. It will inject the #AC blindly into the guest whether the guest is prepared or not.

Add the minimal required handling for it:

- Check guest state whether CR0.AM is enabled and EFLAGS.AC is set. If so, then the #AC originated from CPL3 and the guest is prepared to handle it. In this case it does not matter whether the #AC is due to a split lock or a regular unaligned check.

- Invoke a minimal split lock detection handler. If the host SLD mode is sld_warn, then handle it in the same way as user space handling works: Emit a warning, disable SLD and mark the current task with TIF_SLD. With that resume the guest without injecting #AC.

  If the host mode is sld_fatal or sld_off, emit a warning and deliver the exception to user space which can crash and burn itself.

Mark the module with MOD_INFO(sld_safe, "Y") so the module loader does not force SLD off.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Kenneth R. Crudup" <kenny@panix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/cpu.h  |  1 +
 arch/x86/kernel/cpu/intel.c | 28 +++++++++++++++++++++++-----
 arch/x86/kvm/vmx/vmx.c      | 40 +++++++++++++++++++++++++++++++++++++---
 3 files changed, 61 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -44,6 +44,7 @@ unsigned int x86_stepping(unsigned int s
 extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
 extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
+extern bool handle_guest_split_lock(unsigned long ip);
 extern void split_lock_validate_module_text(struct module *me, void *text, void *text_end);
 #else
 static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -1102,13 +1102,10 @@ static void split_lock_init(void)
 	split_lock_verify_msr(sld_state != sld_off);
 }
 
-bool handle_user_split_lock(struct pt_regs *regs, long error_code)
+static void split_lock_warn(unsigned long ip)
 {
-	if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
-		return false;
-
 	pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n",
-			    current->comm, current->pid, regs->ip);
+			    current->comm, current->pid, ip);
 
 	/*
 	 * Disable the split lock detection for this task so it can make
@@ -1117,6 +1114,27 @@ bool handle_user_split_lock(struct pt_re
 	 */
 	sld_update_msr(false);
 	set_tsk_thread_flag(current, TIF_SLD);
+}
+
+bool handle_guest_split_lock(unsigned long ip)
+{
+	if (sld_state == sld_warn) {
+		split_lock_warn(ip);
+		return true;
+	}
+
+	pr_warn_once("#AC: %s/%d %s split_lock trap at address: 0x%lx\n",
+		     current->comm, current->pid,
+		     sld_state == sld_fatal ? "fatal" : "bogus", ip);
+	return false;
+}
+EXPORT_SYMBOL_GPL(handle_guest_split_lock);
+
+bool handle_user_split_lock(struct pt_regs *regs, long error_code)
+{
+	if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
+		return false;
+	split_lock_warn(regs->ip);
 	return true;
 }
 
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -65,6 +65,7 @@
 
 MODULE_AUTHOR("Qumranet");
 MODULE_LICENSE("GPL");
+MODULE_INFO(sld_safe, "Y");
 
 #ifdef MODULE
 static const struct x86_cpu_id vmx_cpu_id[] = {
@@ -4623,6 +4624,22 @@ static int handle_machine_check(struct k
 	return 1;
 }
 
+static bool guest_handles_ac(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * If guest has alignment checking enabled in CR0 and activated in
+	 * eflags, then the #AC originated from CPL3 and the guest is able
+	 * to handle it. It does not matter whether this is a regular or
+	 * a split lock operation induced #AC.
+	 */
+	if (vcpu->arch.cr0 & X86_CR0_AM &&
+	    vmx_get_rflags(vcpu) & X86_EFLAGS_AC)
+		return true;
+
+	/* Add guest SLD handling checks here once it's supported */
+	return false;
+}
+
 static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -4688,9 +4705,6 @@ static int handle_exception_nmi(struct k
 		return handle_rmode_exception(vcpu, ex_no, error_code);
 
 	switch (ex_no) {
-	case AC_VECTOR:
-		kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
-		return 1;
 	case DB_VECTOR:
 		dr6 = vmcs_readl(EXIT_QUALIFICATION);
 		if (!(vcpu->guest_debug &
@@ -4719,6 +4733,26 @@ static int handle_exception_nmi(struct k
 		kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
 		kvm_run->debug.arch.exception = ex_no;
 		break;
+	case AC_VECTOR:
+		if (guest_handles_ac(vcpu)) {
+			kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
+			return 1;
+		}
+		/*
+		 * Handle #AC caused by split lock detection. If the host
+		 * mode is sld_warn, then it warns, marks current with
+		 * TIF_SLD and disables split lock detection. So the guest
+		 * can just continue.
+		 *
+		 * If the host mode is fatal, the handling code warned. Let
+		 * qemu kill itself.
+		 *
+		 * If the host mode is off, then this #AC is bonkers and
+		 * something is badly wrong. Let it fail as well.
+		 */
+		if (handle_guest_split_lock(kvm_rip_read(vcpu)))
+			return 1;
+		/* fall through */
 	default:
 		kvm_run->exit_reason = KVM_EXIT_EXCEPTION;
 		kvm_run->ex.exception = ex_no;

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage 2020-04-02 12:33 ` [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage Thomas Gleixner @ 2020-04-02 15:30 ` Sean Christopherson 2020-04-02 15:44 ` Nadav Amit 2020-04-02 16:56 ` Thomas Gleixner 2020-04-02 15:55 ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Sean Christopherson 1 sibling, 2 replies; 76+ messages in thread From: Sean Christopherson @ 2020-04-02 15:30 UTC (permalink / raw) To: Thomas Gleixner Cc: LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra (Intel), Jessica Yu, Steven Rostedt On Thu, Apr 02, 2020 at 02:33:00PM +0200, Thomas Gleixner wrote: > Without at least minimal handling for split lock detection induced #AC, VMX > will just run into the same problem as the VMWare hypervisor, which was > reported by Kenneth. > > It will inject the #AC blindly into the guest whether the guest is prepared > or not. > > Add the minimal required handling for it: > > - Check guest state whether CR0.AM is enabled and EFLAGS.AC is set. If > so, then the #AC originated from CPL3 and the guest has is prepared to > handle it. In this case it does not matter whether the #AC is due to a > split lock or a regular unaligned check. > > - Invoke a minimal split lock detection handler. If the host SLD mode is > sld_warn, then handle it in the same way as user space handling works: > Emit a warning, disable SLD and mark the current task with TIF_SLD. > With that resume the guest without injecting #AC. > > If the host mode is sld_fatal or sld_off, emit a warning and deliver > the exception to user space which can crash and burn itself. > > Mark the module with MOD_INFO(sld_safe, "Y") so the module loader does not > force SLD off. Some comments below. But, any objection to taking Xiaoyao's patches that do effectively the same things, minus the MOD_INFO()? 
I'll repost them in reply to this thread. > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > Cc: "Kenneth R. Crudup" <kenny@panix.com> > Cc: Paolo Bonzini <pbonzini@redhat.com> > Cc: Fenghua Yu <fenghua.yu@intel.com> > Cc: Xiaoyao Li <xiaoyao.li@intel.com> > Cc: Nadav Amit <namit@vmware.com> > Cc: Thomas Hellstrom <thellstrom@vmware.com> > Cc: Sean Christopherson <sean.j.christopherson@intel.com> > Cc: Tony Luck <tony.luck@intel.com> > --- > arch/x86/include/asm/cpu.h | 1 + > arch/x86/kernel/cpu/intel.c | 28 +++++++++++++++++++++++----- > arch/x86/kvm/vmx/vmx.c | 40 +++++++++++++++++++++++++++++++++++++--- > 3 files changed, 61 insertions(+), 8 deletions(-) > > --- a/arch/x86/include/asm/cpu.h > +++ b/arch/x86/include/asm/cpu.h > @@ -44,6 +44,7 @@ unsigned int x86_stepping(unsigned int s > extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c); > extern void switch_to_sld(unsigned long tifn); > extern bool handle_user_split_lock(struct pt_regs *regs, long error_code); > +extern bool handle_guest_split_lock(unsigned long ip); > extern void split_lock_validate_module_text(struct module *me, void *text, void *text_end); > #else > static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {} > --- a/arch/x86/kernel/cpu/intel.c > +++ b/arch/x86/kernel/cpu/intel.c > @@ -1102,13 +1102,10 @@ static void split_lock_init(void) > split_lock_verify_msr(sld_state != sld_off); > } > > -bool handle_user_split_lock(struct pt_regs *regs, long error_code) > +static void split_lock_warn(unsigned long ip) > { > - if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal) > - return false; > - > pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n", > - current->comm, current->pid, regs->ip); > + current->comm, current->pid, ip); > > /* > * Disable the split lock detection for this task so it can make > @@ -1117,6 +1114,27 @@ bool handle_user_split_lock(struct pt_re > */ > sld_update_msr(false); > set_tsk_thread_flag(current, TIF_SLD); 
> +} > + > +bool handle_guest_split_lock(unsigned long ip) > +{ > + if (sld_state == sld_warn) { > + split_lock_warn(ip); > + return true; > + } > + > + pr_warn_once("#AC: %s/%d %s split_lock trap at address: 0x%lx\n", > + current->comm, current->pid, > + sld_state == sld_fatal ? "fatal" : "bogus", ip); > + return false; > +} > +EXPORT_SYMBOL_GPL(handle_guest_split_lock); > + > +bool handle_user_split_lock(struct pt_regs *regs, long error_code) > +{ > + if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal) > + return false; > + split_lock_warn(regs->ip); > return true; > } > > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -65,6 +65,7 @@ > > MODULE_AUTHOR("Qumranet"); > MODULE_LICENSE("GPL"); > +MODULE_INFO(sld_safe, "Y"); > > #ifdef MODULE > static const struct x86_cpu_id vmx_cpu_id[] = { > @@ -4623,6 +4624,22 @@ static int handle_machine_check(struct k > return 1; > } > > +static bool guest_handles_ac(struct kvm_vcpu *vcpu) > +{ > + /* > + * If guest has alignment checking enabled in CR0 and activated in > + * eflags, then the #AC originated from CPL3 and the guest is able > + * to handle it. It does not matter whether this is a regular or > + * a split lock operation induced #AC. > + */ > + if (vcpu->arch.cr0 & X86_CR0_AM && Technically not required since KVM doesn't let the guest toggle CR0.AM at will, but going through kvm_read_cr0{_bits}() is preferred. > + vmx_get_rflags(vcpu) & X86_EFLAGS_AC) I don't think this is correct. A guest could trigger a split-lock #AC at CPL0 with EFLAGS.AC=1 and CR0.AM=1, and then panic because it didn't expect #AC at CPL0. 
> + return true; > + > + /* Add guest SLD handling checks here once it's supported */ > + return false; > +} > + > static int handle_exception_nmi(struct kvm_vcpu *vcpu) > { > struct vcpu_vmx *vmx = to_vmx(vcpu); > @@ -4688,9 +4705,6 @@ static int handle_exception_nmi(struct k > return handle_rmode_exception(vcpu, ex_no, error_code); > > switch (ex_no) { > - case AC_VECTOR: > - kvm_queue_exception_e(vcpu, AC_VECTOR, error_code); > - return 1; > case DB_VECTOR: > dr6 = vmcs_readl(EXIT_QUALIFICATION); > if (!(vcpu->guest_debug & > @@ -4719,6 +4733,26 @@ static int handle_exception_nmi(struct k > kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip; > kvm_run->debug.arch.exception = ex_no; > break; > + case AC_VECTOR: > + if (guest_handles_ac(vcpu)) { > + kvm_queue_exception_e(vcpu, AC_VECTOR, error_code); > + return 1; > + } > + /* > + * Handle #AC caused by split lock detection. If the host > + * mode is sld_warn, then it warns, marks current with > + * TIF_SLD and disables split lock detection. So the guest > + * can just continue. > + * > + * If the host mode is fatal, the handling code warned. Let > + * qemu kill itself. > + * > + * If the host mode is off, then this #AC is bonkers and > + * something is badly wrong. Let it fail as well. > + */ > + if (handle_guest_split_lock(kvm_rip_read(vcpu))) > + return 1; > + /* fall through */ > default: > kvm_run->exit_reason = KVM_EXIT_EXCEPTION; > kvm_run->ex.exception = ex_no; > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage 2020-04-02 15:30 ` Sean Christopherson @ 2020-04-02 15:44 ` Nadav Amit 2020-04-02 16:04 ` Sean Christopherson 2020-04-02 16:56 ` Thomas Gleixner 1 sibling, 1 reply; 76+ messages in thread From: Nadav Amit @ 2020-04-02 15:44 UTC (permalink / raw) To: Sean Christopherson Cc: Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Peter Zijlstra (Intel), Jessica Yu, Steven Rostedt > On Apr 2, 2020, at 8:30 AM, Sean Christopherson <sean.j.christopherson@intel.com> wrote: > > On Thu, Apr 02, 2020 at 02:33:00PM +0200, Thomas Gleixner wrote: >> Without at least minimal handling for split lock detection induced #AC, VMX >> will just run into the same problem as the VMWare hypervisor, which was >> reported by Kenneth. >> >> It will inject the #AC blindly into the guest whether the guest is prepared >> or not. >> >> Add the minimal required handling for it: >> >> - Check guest state whether CR0.AM is enabled and EFLAGS.AC is set. If >> so, then the #AC originated from CPL3 and the guest is prepared to >> handle it. In this case it does not matter whether the #AC is due to a >> split lock or a regular unaligned check. >> >> - Invoke a minimal split lock detection handler. If the host SLD mode is >> sld_warn, then handle it in the same way as user space handling works: >> Emit a warning, disable SLD and mark the current task with TIF_SLD. >> With that resume the guest without injecting #AC. >> >> If the host mode is sld_fatal or sld_off, emit a warning and deliver >> the exception to user space which can crash and burn itself. >> >> Mark the module with MOD_INFO(sld_safe, "Y") so the module loader does not >> force SLD off. > > Some comments below. But, any objection to taking Xiaoyao's patches that > do effectively the same things, minus the MOD_INFO()? I'll repost them in > reply to this thread. 
IIUC they also deal with emulated split-lock accesses in the host, during instruction emulation [1]. This also seems to be required, although I am not sure the approach he took once emulation encounters a split lock is robust. [1] https://lore.kernel.org/lkml/20200324151859.31068-5-xiaoyao.li@intel.com/ ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage 2020-04-02 15:44 ` Nadav Amit @ 2020-04-02 16:04 ` Sean Christopherson 0 siblings, 0 replies; 76+ messages in thread From: Sean Christopherson @ 2020-04-02 16:04 UTC (permalink / raw) To: Nadav Amit Cc: Thomas Gleixner, LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Peter Zijlstra (Intel), Jessica Yu, Steven Rostedt On Thu, Apr 02, 2020 at 03:44:00PM +0000, Nadav Amit wrote: > > On Apr 2, 2020, at 8:30 AM, Sean Christopherson <sean.j.christopherson@intel.com> wrote: > > > > On Thu, Apr 02, 2020 at 02:33:00PM +0200, Thomas Gleixner wrote: > >> Without at least minimal handling for split lock detection induced #AC, VMX > >> will just run into the same problem as the VMWare hypervisor, which was > >> reported by Kenneth. > >> > >> It will inject the #AC blindly into the guest whether the guest is prepared > >> or not. > >> > >> Add the minimal required handling for it: > >> > >> - Check guest state whether CR0.AM is enabled and EFLAGS.AC is set. If > >> so, then the #AC originated from CPL3 and the guest is prepared to > >> handle it. In this case it does not matter whether the #AC is due to a > >> split lock or a regular unaligned check. > >> > >> - Invoke a minimal split lock detection handler. If the host SLD mode is > >> sld_warn, then handle it in the same way as user space handling works: > >> Emit a warning, disable SLD and mark the current task with TIF_SLD. > >> With that resume the guest without injecting #AC. > >> > >> If the host mode is sld_fatal or sld_off, emit a warning and deliver > >> the exception to user space which can crash and burn itself. > >> > >> Mark the module with MOD_INFO(sld_safe, "Y") so the module loader does not > >> force SLD off. > > > > Some comments below. But, any objection to taking Xiaoyao's patches that > > do effectively the same things, minus the MOD_INFO()? 
I'll repost them in > > reply to this thread. > > IIUC they also deal with emulated split-lock accesses in the host, during > instruction emulation [1]. This seems also to be required, although I am not > sure the approach that he took once emulation encounters a split-lock is > robust. Yep. It's "robust" in the sense that KVM won't panic the host. It's not robust from the perspective that it could possibly hose the guest. But, no sane, well-behaved guest should reach that particular emulator path on a split-lock enabled system. > [1] https://lore.kernel.org/lkml/20200324151859.31068-5-xiaoyao.li@intel.com/ ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage 2020-04-02 15:30 ` Sean Christopherson 2020-04-02 15:44 ` Nadav Amit @ 2020-04-02 16:56 ` Thomas Gleixner 1 sibling, 0 replies; 76+ messages in thread From: Thomas Gleixner @ 2020-04-02 16:56 UTC (permalink / raw) To: Sean Christopherson Cc: LKML, x86, Kenneth R. Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra (Intel), Jessica Yu, Steven Rostedt Sean Christopherson <sean.j.christopherson@intel.com> writes: > On Thu, Apr 02, 2020 at 02:33:00PM +0200, Thomas Gleixner wrote: >> Mark the module with MOD_INFO(sld_safe, "Y") so the module loader does not >> force SLD off. > > Some comments below. But, any objection to taking Xiaoyao's patches that > do effectively the same things, minus the MOD_INFO()? I'll repost them in > reply to this thread. If they are sane, I don't have a problem. But TBH, I really couldn't be bothered to actually scan my mails whether there surfaced something sane by now. Writing that up was just faster :) I'll have look. >> +static bool guest_handles_ac(struct kvm_vcpu *vcpu) >> +{ >> + /* >> + * If guest has alignment checking enabled in CR0 and activated in >> + * eflags, then the #AC originated from CPL3 and the guest is able >> + * to handle it. It does not matter whether this is a regular or >> + * a split lock operation induced #AC. >> + */ >> + if (vcpu->arch.cr0 & X86_CR0_AM && > > Technically not required since KVM doesn't let the gets toggle CR0.AM at > will, but going through kvm_read_cr0{_bits}() is preferred. You're the expert here. >> + vmx_get_rflags(vcpu) & X86_EFLAGS_AC) > > I don't think this is correct. A guest could trigger a split-lock #AC at > CPL0 with EFLAGS.AC=1 and CR0.AM=1, and then panic because it didn't expect > #AC at CPL0. Indeed. Thanks, tglx ^ permalink raw reply [flat|nested] 76+ messages in thread
* [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling 2020-04-02 12:33 ` [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage Thomas Gleixner 2020-04-02 15:30 ` Sean Christopherson @ 2020-04-02 15:55 ` Sean Christopherson 2020-04-02 15:55 ` [PATCH 1/3] KVM: x86: Emulate split-lock access as a write in emulator Sean Christopherson ` (3 more replies) 1 sibling, 4 replies; 76+ messages in thread From: Sean Christopherson @ 2020-04-02 15:55 UTC (permalink / raw) To: Thomas Gleixner Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel First three patches from Xiaoyao's series to add split-lock #AC support in KVM. Xiaoyao Li (3): KVM: x86: Emulate split-lock access as a write in emulator x86/split_lock: Refactor and export handle_user_split_lock() for KVM KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest arch/x86/include/asm/cpu.h | 4 ++-- arch/x86/kernel/cpu/intel.c | 7 ++++--- arch/x86/kernel/traps.c | 2 +- arch/x86/kvm/vmx/vmx.c | 30 +++++++++++++++++++++++++++--- arch/x86/kvm/x86.c | 12 +++++++++++- 5 files changed, 45 insertions(+), 10 deletions(-) -- 2.24.1 ^ permalink raw reply [flat|nested] 76+ messages in thread
* [PATCH 1/3] KVM: x86: Emulate split-lock access as a write in emulator 2020-04-02 15:55 ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Sean Christopherson @ 2020-04-02 15:55 ` Sean Christopherson 2020-04-02 15:55 ` [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM Sean Christopherson ` (2 subsequent siblings) 3 siblings, 0 replies; 76+ messages in thread From: Sean Christopherson @ 2020-04-02 15:55 UTC (permalink / raw) To: Thomas Gleixner Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel From: Xiaoyao Li <xiaoyao.li@intel.com> Emulate split-lock accesses as writes if split lock detection is on to avoid #AC during emulation, which will result in a panic(). This should never occur for a well behaved guest, but a malicious guest can manipulate the TLB to trigger emulation of a locked instruction[1]. More discussion can be found [2][3]. 
[1] https://lkml.kernel.org/r/8c5b11c9-58df-38e7-a514-dc12d687b198@redhat.com [2] https://lkml.kernel.org/r/20200131200134.GD18946@linux.intel.com [3] https://lkml.kernel.org/r/20200227001117.GX9940@linux.intel.com Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> --- arch/x86/kvm/x86.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index bf8564d73fc3..37ce0fc9a62d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5875,6 +5875,7 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt, { struct kvm_host_map map; struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt); + u64 page_line_mask; gpa_t gpa; char *kaddr; bool exchanged; @@ -5889,7 +5890,16 @@ static int emulator_cmpxchg_emulated(struct x86_emulate_ctxt *ctxt, (gpa & PAGE_MASK) == APIC_DEFAULT_PHYS_BASE) goto emul_write; - if (((gpa + bytes - 1) & PAGE_MASK) != (gpa & PAGE_MASK)) + /* + * Emulate the atomic as a straight write to avoid #AC if SLD is + * enabled in the host and the access splits a cache line. + */ + if (boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) + page_line_mask = ~(cache_line_size() - 1); + else + page_line_mask = PAGE_MASK; + + if (((gpa + bytes - 1) & page_line_mask) != (gpa & page_line_mask)) goto emul_write; if (kvm_vcpu_map(vcpu, gpa_to_gfn(gpa), &map)) -- 2.24.1 ^ permalink raw reply related [flat|nested] 76+ messages in thread
* [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM 2020-04-02 15:55 ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Sean Christopherson 2020-04-02 15:55 ` [PATCH 1/3] KVM: x86: Emulate split-lock access as a write in emulator Sean Christopherson @ 2020-04-02 15:55 ` Sean Christopherson 2020-04-02 17:01 ` Thomas Gleixner 2020-04-02 15:55 ` [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest Sean Christopherson 2020-04-10 10:23 ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Paolo Bonzini 3 siblings, 1 reply; 76+ messages in thread From: Sean Christopherson @ 2020-04-02 15:55 UTC (permalink / raw) To: Thomas Gleixner Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel From: Xiaoyao Li <xiaoyao.li@intel.com> In the future, KVM will use handle_user_split_lock() to handle #AC caused by split lock in guest. Due to the fact that KVM doesn't have a @regs context and will pre-check EFLAGS.AC, move the EFLAGS.AC check to do_alignment_check(). 
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> --- arch/x86/include/asm/cpu.h | 4 ++-- arch/x86/kernel/cpu/intel.c | 7 ++++--- arch/x86/kernel/traps.c | 2 +- 3 files changed, 7 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h index ff6f3ca649b3..ff567afa6ee1 100644 --- a/arch/x86/include/asm/cpu.h +++ b/arch/x86/include/asm/cpu.h @@ -43,11 +43,11 @@ unsigned int x86_stepping(unsigned int sig); #ifdef CONFIG_CPU_SUP_INTEL extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c); extern void switch_to_sld(unsigned long tifn); -extern bool handle_user_split_lock(struct pt_regs *regs, long error_code); +extern bool handle_user_split_lock(unsigned long ip); #else static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {} static inline void switch_to_sld(unsigned long tifn) {} -static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code) +static inline bool handle_user_split_lock(unsigned long ip) { return false; } diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 9a26e972cdea..7688f51aabdb 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -1066,13 +1066,13 @@ static void split_lock_init(void) split_lock_verify_msr(sld_state != sld_off); } -bool handle_user_split_lock(struct pt_regs *regs, long error_code) +bool handle_user_split_lock(unsigned long ip) { - if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal) + if (sld_state == sld_fatal) return false; pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n", - current->comm, current->pid, regs->ip); + current->comm, current->pid, ip); /* * Disable the split lock detection for this task so it can make @@ -1083,6 +1083,7 @@ bool handle_user_split_lock(struct pt_regs 
*regs, long error_code) set_tsk_thread_flag(current, TIF_SLD); return true; } +EXPORT_SYMBOL_GPL(handle_user_split_lock); /* * This function is called only when switching between tasks with diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index d54cffdc7cac..55902c5bf0aa 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -304,7 +304,7 @@ dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code) local_irq_enable(); - if (handle_user_split_lock(regs, error_code)) + if (!(regs->flags & X86_EFLAGS_AC) && handle_user_split_lock(regs->ip)) return; do_trap(X86_TRAP_AC, SIGBUS, "alignment check", regs, -- 2.24.1 ^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM 2020-04-02 15:55 ` [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM Sean Christopherson @ 2020-04-02 17:01 ` Thomas Gleixner 2020-04-02 17:19 ` Sean Christopherson 0 siblings, 1 reply; 76+ messages in thread From: Thomas Gleixner @ 2020-04-02 17:01 UTC (permalink / raw) To: Sean Christopherson Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel Sean Christopherson <sean.j.christopherson@intel.com> writes: > From: Xiaoyao Li <xiaoyao.li@intel.com> > > In the future, KVM will use handle_user_split_lock() to handle #AC > caused by split lock in guest. Due to the fact that KVM doesn't have > a @regs context and will pre-check EFLAGS.AC, move the EFLAGS.AC check > to do_alignment_check(). > > Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> > Reviewed-by: Tony Luck <tony.luck@intel.com> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > --- > arch/x86/include/asm/cpu.h | 4 ++-- > arch/x86/kernel/cpu/intel.c | 7 ++++--- > arch/x86/kernel/traps.c | 2 +- > 3 files changed, 7 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h > index ff6f3ca649b3..ff567afa6ee1 100644 > --- a/arch/x86/include/asm/cpu.h > +++ b/arch/x86/include/asm/cpu.h > @@ -43,11 +43,11 @@ unsigned int x86_stepping(unsigned int sig); > #ifdef CONFIG_CPU_SUP_INTEL > extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c); > extern void switch_to_sld(unsigned long tifn); > -extern bool handle_user_split_lock(struct pt_regs *regs, long error_code); > +extern bool handle_user_split_lock(unsigned long ip); > #else > static inline void __init 
cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {} > static inline void switch_to_sld(unsigned long tifn) {} > -static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code) > +static inline bool handle_user_split_lock(unsigned long ip) This is necessary because VMX can be compiled without CPU_SUP_INTEL? > { > return false; > } > diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c > index 9a26e972cdea..7688f51aabdb 100644 > --- a/arch/x86/kernel/cpu/intel.c > +++ b/arch/x86/kernel/cpu/intel.c > @@ -1066,13 +1066,13 @@ static void split_lock_init(void) > split_lock_verify_msr(sld_state != sld_off); > } > > -bool handle_user_split_lock(struct pt_regs *regs, long error_code) > +bool handle_user_split_lock(unsigned long ip) > { > - if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal) > + if (sld_state == sld_fatal) > return false; > > pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n", > - current->comm, current->pid, regs->ip); > + current->comm, current->pid, ip); So this returns true even in the case that sld_state == off. Should never happen, but I rather have an extra check and be both verbose and correct. See the variant I did. Thanks, tglx ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM 2020-04-02 17:01 ` Thomas Gleixner @ 2020-04-02 17:19 ` Sean Christopherson 2020-04-02 19:06 ` Thomas Gleixner 0 siblings, 1 reply; 76+ messages in thread From: Sean Christopherson @ 2020-04-02 17:19 UTC (permalink / raw) To: Thomas Gleixner Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel On Thu, Apr 02, 2020 at 07:01:56PM +0200, Thomas Gleixner wrote: > Sean Christopherson <sean.j.christopherson@intel.com> writes: > > From: Xiaoyao Li <xiaoyao.li@intel.com> > > > > In the future, KVM will use handle_user_split_lock() to handle #AC > > caused by split lock in guest. Due to the fact that KVM doesn't have > > a @regs context and will pre-check EFLAGS.AC, move the EFLAGS.AC check > > to do_alignment_check(). > > > > Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com> > > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> > > Reviewed-by: Tony Luck <tony.luck@intel.com> > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > > --- > > arch/x86/include/asm/cpu.h | 4 ++-- > > arch/x86/kernel/cpu/intel.c | 7 ++++--- > > arch/x86/kernel/traps.c | 2 +- > > 3 files changed, 7 insertions(+), 6 deletions(-) > > > > diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h > > index ff6f3ca649b3..ff567afa6ee1 100644 > > --- a/arch/x86/include/asm/cpu.h > > +++ b/arch/x86/include/asm/cpu.h > > @@ -43,11 +43,11 @@ unsigned int x86_stepping(unsigned int sig); > > #ifdef CONFIG_CPU_SUP_INTEL > > extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c); > > extern void switch_to_sld(unsigned long tifn); > > -extern bool handle_user_split_lock(struct pt_regs *regs, long error_code); > > +extern bool handle_user_split_lock(unsigned long ip); > > #else > > static inline void __init 
cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {} > > static inline void switch_to_sld(unsigned long tifn) {} > > -static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code) > > +static inline bool handle_user_split_lock(unsigned long ip) > > This is necessary because VMX can be compiled without CPU_SUP_INTEL? Ya, it came about when cleaning up the IA32_FEATURE_CONTROL MSR handling to consolidate duplicate code. config KVM_INTEL tristate "KVM for Intel (and compatible) processors support" depends on KVM && IA32_FEAT_CTL config IA32_FEAT_CTL def_bool y depends on CPU_SUP_INTEL || CPU_SUP_CENTAUR || CPU_SUP_ZHAOXIN ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM 2020-04-02 17:19 ` Sean Christopherson @ 2020-04-02 19:06 ` Thomas Gleixner 2020-04-10 4:39 ` Xiaoyao Li 0 siblings, 1 reply; 76+ messages in thread From: Thomas Gleixner @ 2020-04-02 19:06 UTC (permalink / raw) To: Sean Christopherson Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel Sean Christopherson <sean.j.christopherson@intel.com> writes: > On Thu, Apr 02, 2020 at 07:01:56PM +0200, Thomas Gleixner wrote: >> > static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {} >> > static inline void switch_to_sld(unsigned long tifn) {} >> > -static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code) >> > +static inline bool handle_user_split_lock(unsigned long ip) >> >> This is necessary because VMX can be compiled without CPU_SUP_INTEL? > > Ya, it came about when cleaning up the IA32_FEATURE_CONTROL MSR handling > to consolidate duplicate code. > > config KVM_INTEL > tristate "KVM for Intel (and compatible) processors support" > depends on KVM && IA32_FEAT_CTL > > config IA32_FEAT_CTL > def_bool y > depends on CPU_SUP_INTEL || CPU_SUP_CENTAUR || CPU_SUP_ZHAOXIN Ah, indeed. So something like the below would make sense. Hmm? Of course that can be mangled into Xiaoyao's patches, I'm not worried about my patch count :) Aside of that I really wish Intel HW folks had indicated the source of the #AC via the error code. It can only be 0 or 1 for the regular #AC so there would have been 31 bits to chose from. 
Thanks, tglx 8<---------------- --- a/arch/x86/include/asm/cpu.h +++ b/arch/x86/include/asm/cpu.h @@ -43,14 +43,14 @@ unsigned int x86_stepping(unsigned int s #ifdef CONFIG_CPU_SUP_INTEL extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c); extern void switch_to_sld(unsigned long tifn); -extern bool handle_user_split_lock(struct pt_regs *regs, long error_code); +extern int handle_ac_split_lock(unsigned long ip); extern void split_lock_validate_module_text(struct module *me, void *text, void *text_end); #else static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {} static inline void switch_to_sld(unsigned long tifn) {} -static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code) +static int handle_ac_split_lock(unsigned long ip) { - return false; + return -ENOSYS; } static inline void split_lock_validate_module_text(struct module *me, void *text, void *text_end) {} #endif --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -1102,13 +1102,20 @@ static void split_lock_init(void) split_lock_verify_msr(sld_state != sld_off); } -bool handle_user_split_lock(struct pt_regs *regs, long error_code) +int handle_ac_split_lock(unsigned long ip) { - if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal) - return false; + switch (sld_state) { + case sld_warn: + break; + case sld_off: + pr_warn_once("#AC: Spurious trap at address: 0x%lx\n", ip); + return -ENOSYS; + case sld_fatal: + return -EFAULT; + } pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n", - current->comm, current->pid, regs->ip); + current->comm, current->pid, ip); /* * Disable the split lock detection for this task so it can make @@ -1117,8 +1124,9 @@ bool handle_user_split_lock(struct pt_re */ sld_update_msr(false); set_tsk_thread_flag(current, TIF_SLD); - return true; + return 0; } +EXPORT_SYMBOL_GPL(handle_ac_split_lock); /* * This function is called only when switching between tasks with --- 
a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -304,7 +304,7 @@ dotraplinkage void do_alignment_check(st local_irq_enable(); - if (handle_user_split_lock(regs, error_code)) + if (!(regs->flags & X86_EFLAGS_AC) && !handle_ac_split_lock(regs->ip)) return; do_trap(X86_TRAP_AC, SIGBUS, "alignment check", regs, --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -65,6 +65,7 @@ MODULE_AUTHOR("Qumranet"); MODULE_LICENSE("GPL"); +MODULE_INFO(sld_safe, "Y"); #ifdef MODULE static const struct x86_cpu_id vmx_cpu_id[] = { @@ -4623,6 +4624,22 @@ static int handle_machine_check(struct k return 1; } +static bool guest_handles_ac(struct kvm_vcpu *vcpu) +{ + /* + * If guest has alignment checking enabled in CR0 and activated in + * eflags, then the #AC originated from CPL3 and the guest is able + * to handle it. It does not matter whether this is a regular or + * a split lock operation induced #AC. + */ + if (vmx_get_cpl(vcpu) == 3 && kvm_read_cr0_bits(vcpu, X86_CR0_AM) && + kvm_get_rflags(vcpu) & X86_EFLAGS_AC) + return true; + + /* Add guest SLD handling checks here once it's supported */ + return false; +} + static int handle_exception_nmi(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -4630,6 +4647,7 @@ static int handle_exception_nmi(struct k u32 intr_info, ex_no, error_code; unsigned long cr2, rip, dr6; u32 vect_info; + int err; vect_info = vmx->idt_vectoring_info; intr_info = vmx->exit_intr_info; @@ -4688,9 +4706,6 @@ static int handle_exception_nmi(struct k return handle_rmode_exception(vcpu, ex_no, error_code); switch (ex_no) { - case AC_VECTOR: - kvm_queue_exception_e(vcpu, AC_VECTOR, error_code); - return 1; case DB_VECTOR: dr6 = vmcs_readl(EXIT_QUALIFICATION); if (!(vcpu->guest_debug & @@ -4719,6 +4734,29 @@ static int handle_exception_nmi(struct k kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip; kvm_run->debug.arch.exception = ex_no; break; + case AC_VECTOR: + if (guest_handles_ac(vcpu)) { + kvm_queue_exception_e(vcpu, 
AC_VECTOR, error_code); + return 1; + } + /* + * Handle #AC caused by split lock detection. If the host + * mode is sld_warn, then it warns, marks current with + * TIF_SLD and disables split lock detection. So the guest + * can just continue. + * + * If the host mode is fatal, the handling code warned. Let + * qemu kill itself. + * + * If the host mode is off, then this #AC is bonkers and + * something is badly wrong. Let it fail as well. + */ + err = handle_ac_split_lock(kvm_rip_read(vcpu)); + if (!err) + return 1; + /* Propagate the error type to user space */ + error_code = err == -EFAULT ? 0x100 : 0x200; + fallthrough; default: kvm_run->exit_reason = KVM_EXIT_EXCEPTION; kvm_run->ex.exception = ex_no; ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM 2020-04-02 19:06 ` Thomas Gleixner @ 2020-04-10 4:39 ` Xiaoyao Li 2020-04-10 10:21 ` Paolo Bonzini 0 siblings, 1 reply; 76+ messages in thread From: Xiaoyao Li @ 2020-04-10 4:39 UTC (permalink / raw) To: Thomas Gleixner, Sean Christopherson, Paolo Bonzini Cc: x86, Kenneth R . Crudup, Fenghua Yu, Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel On 4/3/2020 3:06 AM, Thomas Gleixner wrote: > Sean Christopherson <sean.j.christopherson@intel.com> writes: >> On Thu, Apr 02, 2020 at 07:01:56PM +0200, Thomas Gleixner wrote: >>>> static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {} >>>> static inline void switch_to_sld(unsigned long tifn) {} >>>> -static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code) >>>> +static inline bool handle_user_split_lock(unsigned long ip) >>> >>> This is necessary because VMX can be compiled without CPU_SUP_INTEL? >> >> Ya, it came about when cleaning up the IA32_FEATURE_CONTROL MSR handling >> to consolidate duplicate code. >> >> config KVM_INTEL >> tristate "KVM for Intel (and compatible) processors support" >> depends on KVM && IA32_FEAT_CTL >> >> config IA32_FEAT_CTL >> def_bool y >> depends on CPU_SUP_INTEL || CPU_SUP_CENTAUR || CPU_SUP_ZHAOXIN > > Ah, indeed. So something like the below would make sense. Hmm? > > Of course that can be mangled into Xiaoyao's patches, I'm not worried > about my patch count :) > I don't mind using yours in my next version. Hi Paolo, Are you OK with the kvm part below? If no objection, I can spin the next version using tglx's. 
>
> 8<----------------
> --- a/arch/x86/include/asm/cpu.h
> +++ b/arch/x86/include/asm/cpu.h
> @@ -43,14 +43,14 @@ unsigned int x86_stepping(unsigned int s
>  #ifdef CONFIG_CPU_SUP_INTEL
>  extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
>  extern void switch_to_sld(unsigned long tifn);
> -extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
> +extern int handle_ac_split_lock(unsigned long ip);
>  extern void split_lock_validate_module_text(struct module *me, void *text, void *text_end);
>  #else
>  static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
>  static inline void switch_to_sld(unsigned long tifn) {}
> -static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code)
> +static int handle_ac_split_lock(unsigned long ip)
>  {
> -	return false;
> +	return -ENOSYS;
>  }
>  static inline void split_lock_validate_module_text(struct module *me, void *text, void *text_end) {}
>  #endif
>
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -1102,13 +1102,20 @@ static void split_lock_init(void)
>  	split_lock_verify_msr(sld_state != sld_off);
>  }
>
> -bool handle_user_split_lock(struct pt_regs *regs, long error_code)
> +int handle_ac_split_lock(unsigned long ip)
>  {
> -	if ((regs->flags & X86_EFLAGS_AC) || sld_state == sld_fatal)
> -		return false;
> +	switch (sld_state) {
> +	case sld_warn:
> +		break;
> +	case sld_off:
> +		pr_warn_once("#AC: Spurious trap at address: 0x%lx\n", ip);
> +		return -ENOSYS;
> +	case sld_fatal:
> +		return -EFAULT;
> +	}
>
>  	pr_warn_ratelimited("#AC: %s/%d took a split_lock trap at address: 0x%lx\n",
> -			    current->comm, current->pid, regs->ip);
> +			    current->comm, current->pid, ip);
>
>  	/*
>  	 * Disable the split lock detection for this task so it can make
> @@ -1117,8 +1124,9 @@ bool handle_user_split_lock(struct pt_re
>  	 */
>  	sld_update_msr(false);
>  	set_tsk_thread_flag(current, TIF_SLD);
> -	return true;
> +	return 0;
>  }
> +EXPORT_SYMBOL_GPL(handle_ac_split_lock);
>
>  /*
>   * This function is called only when switching between tasks with
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -304,7 +304,7 @@ dotraplinkage void do_alignment_check(st
>
>  	local_irq_enable();
>
> -	if (handle_user_split_lock(regs, error_code))
> +	if (!(regs->flags & X86_EFLAGS_AC) && !handle_ac_split_lock(regs->ip))
>  		return;
>
>  	do_trap(X86_TRAP_AC, SIGBUS, "alignment check", regs,
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -65,6 +65,7 @@
>
>  MODULE_AUTHOR("Qumranet");
>  MODULE_LICENSE("GPL");
> +MODULE_INFO(sld_safe, "Y");
>
>  #ifdef MODULE
>  static const struct x86_cpu_id vmx_cpu_id[] = {
> @@ -4623,6 +4624,22 @@ static int handle_machine_check(struct k
>  	return 1;
>  }
>
> +static bool guest_handles_ac(struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * If guest has alignment checking enabled in CR0 and activated in
> +	 * eflags, then the #AC originated from CPL3 and the guest is able
> +	 * to handle it. It does not matter whether this is a regular or
> +	 * a split lock operation induced #AC.
> +	 */
> +	if (vmx_get_cpl(vcpu) == 3 && kvm_read_cr0_bits(vcpu, X86_CR0_AM) &&
> +	    kvm_get_rflags(vcpu) & X86_EFLAGS_AC)
> +		return true;
> +
> +	/* Add guest SLD handling checks here once it's supported */
> +	return false;
> +}
> +
>  static int handle_exception_nmi(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -4630,6 +4647,7 @@ static int handle_exception_nmi(struct k
>  	u32 intr_info, ex_no, error_code;
>  	unsigned long cr2, rip, dr6;
>  	u32 vect_info;
> +	int err;
>
>  	vect_info = vmx->idt_vectoring_info;
>  	intr_info = vmx->exit_intr_info;
> @@ -4688,9 +4706,6 @@ static int handle_exception_nmi(struct k
>  		return handle_rmode_exception(vcpu, ex_no, error_code);
>
>  	switch (ex_no) {
> -	case AC_VECTOR:
> -		kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
> -		return 1;
>  	case DB_VECTOR:
>  		dr6 = vmcs_readl(EXIT_QUALIFICATION);
>  		if (!(vcpu->guest_debug &
> @@ -4719,6 +4734,29 @@ static int handle_exception_nmi(struct k
>  		kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
>  		kvm_run->debug.arch.exception = ex_no;
>  		break;
> +	case AC_VECTOR:
> +		if (guest_handles_ac(vcpu)) {
> +			kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
> +			return 1;
> +		}
> +		/*
> +		 * Handle #AC caused by split lock detection. If the host
> +		 * mode is sld_warn, then it warns, marks current with
> +		 * TIF_SLD and disables split lock detection. So the guest
> +		 * can just continue.
> +		 *
> +		 * If the host mode is fatal, the handling code warned. Let
> +		 * qemu kill itself.
> +		 *
> +		 * If the host mode is off, then this #AC is bonkers and
> +		 * something is badly wrong. Let it fail as well.
> +		 */
> +		err = handle_ac_split_lock(kvm_rip_read(vcpu));
> +		if (!err)
> +			return 1;
> +		/* Propagate the error type to user space */
> +		error_code = err == -EFAULT ? 0x100 : 0x200;
> +		fallthrough;
>  	default:
>  		kvm_run->exit_reason = KVM_EXIT_EXCEPTION;
>  		kvm_run->ex.exception = ex_no;

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM
  2020-04-10  4:39         ` Xiaoyao Li
@ 2020-04-10 10:21           ` Paolo Bonzini
  0 siblings, 0 replies; 76+ messages in thread
From: Paolo Bonzini @ 2020-04-10 10:21 UTC (permalink / raw)
  To: Xiaoyao Li, Thomas Gleixner, Sean Christopherson
  Cc: x86, Kenneth R . Crudup, Fenghua Yu, Nadav Amit, Thomas Hellstrom,
	Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel

On 10/04/20 06:39, Xiaoyao Li wrote:
>
> +static bool guest_handles_ac(struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * If guest has alignment checking enabled in CR0 and activated in
> +	 * eflags, then the #AC originated from CPL3 and the guest is able
> +	 * to handle it. It does not matter whether this is a regular or
> +	 * a split lock operation induced #AC.
> +	 */
> +	if (vmx_get_cpl(vcpu) == 3 && kvm_read_cr0_bits(vcpu, X86_CR0_AM) &&
> +	    kvm_get_rflags(vcpu) & X86_EFLAGS_AC)
> +		return true;
> +
> +	/* Add guest SLD handling checks here once it's supported */
> +	return false;
> +}
> +
>  static int handle_exception_nmi(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -4630,6 +4647,7 @@ static int handle_exception_nmi(struct k
>  	u32 intr_info, ex_no, error_code;
>  	unsigned long cr2, rip, dr6;
>  	u32 vect_info;
> +	int err;
>  	vect_info = vmx->idt_vectoring_info;
>  	intr_info = vmx->exit_intr_info;
> @@ -4688,9 +4706,6 @@ static int handle_exception_nmi(struct k
>  		return handle_rmode_exception(vcpu, ex_no, error_code);
>  	switch (ex_no) {
> -	case AC_VECTOR:
> -		kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
> -		return 1;
>  	case DB_VECTOR:
>  		dr6 = vmcs_readl(EXIT_QUALIFICATION);
>  		if (!(vcpu->guest_debug &
> @@ -4719,6 +4734,29 @@ static int handle_exception_nmi(struct k
>  		kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
>  		kvm_run->debug.arch.exception = ex_no;
>  		break;
> +	case AC_VECTOR:
> +		if (guest_handles_ac(vcpu)) {
> +			kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
> +			return 1;
> +		}
> +		/*
> +		 * Handle #AC caused by split lock detection. If the host
> +		 * mode is sld_warn, then it warns, marks current with
> +		 * TIF_SLD and disables split lock detection. So the guest
> +		 * can just continue.
> +		 *
> +		 * If the host mode is fatal, the handling code warned. Let
> +		 * qemu kill itself.
> +		 *
> +		 * If the host mode is off, then this #AC is bonkers and
> +		 * something is badly wrong. Let it fail as well.
> +		 */
> +		err = handle_ac_split_lock(kvm_rip_read(vcpu));
> +		if (!err)
> +			return 1;
> +		/* Propagate the error type to user space */
> +		error_code = err == -EFAULT ? 0x100 : 0x200;
> +		fallthrough;

Acked-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply [flat|nested] 76+ messages in thread
* [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest
  2020-04-02 15:55       ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Sean Christopherson
  2020-04-02 15:55         ` [PATCH 1/3] KVM: x86: Emulate split-lock access as a write in emulator Sean Christopherson
  2020-04-02 15:55         ` [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM Sean Christopherson
@ 2020-04-02 15:55         ` Sean Christopherson
  2020-04-02 17:19           ` Thomas Gleixner
  2020-04-10 10:23         ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Paolo Bonzini
  3 siblings, 1 reply; 76+ messages in thread
From: Sean Christopherson @ 2020-04-02 15:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li,
	Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck,
	Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, kvm, linux-kernel

From: Xiaoyao Li <xiaoyao.li@intel.com>

Two types of #AC can be generated in Intel CPUs:
 1. legacy alignment check #AC
 2. split lock #AC

Reflect #AC back into the guest if the guest has legacy alignment checks
enabled or if SLD is disabled. If SLD is enabled, treat the guest like a
host userspace application by calling handle_user_split_lock(). If the
#AC is handled (SLD disabled and TIF_SLD set), then simply resume the
guest. If the #AC isn't handled, i.e. the host is sld_fatal, then forward
the #AC to the userspace VMM, similar to sending SIGBUS.
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 458e684dfbdc..a96cfda0a5b9 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4623,6 +4623,12 @@ static int handle_machine_check(struct kvm_vcpu *vcpu)
 	return 1;
 }

+static inline bool guest_cpu_alignment_check_enabled(struct kvm_vcpu *vcpu)
+{
+	return vmx_get_cpl(vcpu) == 3 && kvm_read_cr0_bits(vcpu, X86_CR0_AM) &&
+	       (kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
+}
+
 static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -4688,9 +4694,6 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 		return handle_rmode_exception(vcpu, ex_no, error_code);

 	switch (ex_no) {
-	case AC_VECTOR:
-		kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
-		return 1;
 	case DB_VECTOR:
 		dr6 = vmcs_readl(EXIT_QUALIFICATION);
 		if (!(vcpu->guest_debug &
@@ -4719,6 +4722,27 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
 		kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
 		kvm_run->debug.arch.exception = ex_no;
 		break;
+	case AC_VECTOR:
+		/*
+		 * Reflect #AC to the guest if it's expecting the #AC, i.e. has
+		 * legacy alignment check enabled. Pre-check host split lock
+		 * turned on to avoid the VMREADs needed to check legacy #AC,
+		 * i.e. reflect the #AC if the only possible source is legacy
+		 * alignment checks.
+		 */
+		if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) ||
+		    guest_cpu_alignment_check_enabled(vcpu)) {
+			kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
+			return 1;
+		}
+
+		/*
+		 * Forward the #AC to userspace if kernel policy does not allow
+		 * temporarily disabling split lock detection.
+		 */
+		if (handle_user_split_lock(kvm_rip_read(vcpu)))
+			return 1;
+		fallthrough;
 	default:
 		kvm_run->exit_reason = KVM_EXIT_EXCEPTION;
 		kvm_run->ex.exception = ex_no;
-- 
2.24.1

^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest
  2020-04-02 15:55         ` [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest Sean Christopherson
@ 2020-04-02 17:19           ` Thomas Gleixner
  2020-04-02 17:40             ` Sean Christopherson
  0 siblings, 1 reply; 76+ messages in thread
From: Thomas Gleixner @ 2020-04-02 17:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li,
	Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck,
	Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, kvm, linux-kernel

Sean Christopherson <sean.j.christopherson@intel.com> writes:
> @@ -4623,6 +4623,12 @@ static int handle_machine_check(struct kvm_vcpu *vcpu)
>  	return 1;
>  }
>
> +static inline bool guest_cpu_alignment_check_enabled(struct kvm_vcpu *vcpu)

I used a different function name intentionally so the check for 'guest
wants split lock #AC' can go there as well once it's sorted.

> +{
> +	return vmx_get_cpl(vcpu) == 3 && kvm_read_cr0_bits(vcpu, X86_CR0_AM) &&
> +	       (kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
> +}
> +
>  static int handle_exception_nmi(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -4688,9 +4694,6 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
>  		return handle_rmode_exception(vcpu, ex_no, error_code);
>
>  	switch (ex_no) {
> -	case AC_VECTOR:
> -		kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
> -		return 1;
>  	case DB_VECTOR:
>  		dr6 = vmcs_readl(EXIT_QUALIFICATION);
>  		if (!(vcpu->guest_debug &
> @@ -4719,6 +4722,27 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
>  		kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
>  		kvm_run->debug.arch.exception = ex_no;
>  		break;
> +	case AC_VECTOR:
> +		/*
> +		 * Reflect #AC to the guest if it's expecting the #AC, i.e. has
> +		 * legacy alignment check enabled. Pre-check host split lock
> +		 * turned on to avoid the VMREADs needed to check legacy #AC,
> +		 * i.e. reflect the #AC if the only possible source is legacy
> +		 * alignment checks.
> +		 */
> +		if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) ||

I think the right thing to do here is to make this really independent of
that feature, i.e. inject the exception if

     (CPL==3 && CR0.AM && EFLAGS.AC) || (FUTURE && (GUEST_TEST_CTRL & SLD))

iow. when it's really clear that the guest asked for it. If there is an
actual #AC with SLD disabled and !(CPL==3 && CR0.AM && EFLAGS.AC) then
something is badly wrong and the thing should just die. That's why I
separated handle_guest_split_lock() and tell about that case.

Thanks,

        tglx

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest
  2020-04-02 17:19           ` Thomas Gleixner
@ 2020-04-02 17:40             ` Sean Christopherson
  2020-04-02 20:07               ` Thomas Gleixner
  0 siblings, 1 reply; 76+ messages in thread
From: Sean Christopherson @ 2020-04-02 17:40 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li,
	Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra,
	Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, kvm, linux-kernel

On Thu, Apr 02, 2020 at 07:19:44PM +0200, Thomas Gleixner wrote:
> Sean Christopherson <sean.j.christopherson@intel.com> writes:
> > @@ -4623,6 +4623,12 @@ static int handle_machine_check(struct kvm_vcpu *vcpu)
> >  	return 1;
> >  }
> >
> > +static inline bool guest_cpu_alignment_check_enabled(struct kvm_vcpu *vcpu)
>
> I used a different function name intentionally so the check for 'guest
> wants split lock #AC' can go there as well once it's sorted.

Heh, IIRC, I advised Xiaoyao to do the opposite so that the injection
logic in the #AC case statement was more or less complete without having
to dive into the helper, e.g. the resulting code looks like this once
split-lock is exposed to the guest:

	if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) ||
	    guest_cpu_alignment_check_enabled(vcpu) ||
	    guest_cpu_sld_on(vmx)) {
		kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
		return 1;
	}

> > +{
> > +	return vmx_get_cpl(vcpu) == 3 && kvm_read_cr0_bits(vcpu, X86_CR0_AM) &&
> > +	       (kvm_get_rflags(vcpu) & X86_EFLAGS_AC);
> > +}
> > +
> >  static int handle_exception_nmi(struct kvm_vcpu *vcpu)
> >  {
> >  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > @@ -4688,9 +4694,6 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
> >  		return handle_rmode_exception(vcpu, ex_no, error_code);
> >
> >  	switch (ex_no) {
> > -	case AC_VECTOR:
> > -		kvm_queue_exception_e(vcpu, AC_VECTOR, error_code);
> > -		return 1;
> >  	case DB_VECTOR:
> >  		dr6 = vmcs_readl(EXIT_QUALIFICATION);
> >  		if (!(vcpu->guest_debug &
> > @@ -4719,6 +4722,27 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
> >  		kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
> >  		kvm_run->debug.arch.exception = ex_no;
> >  		break;
> > +	case AC_VECTOR:
> > +		/*
> > +		 * Reflect #AC to the guest if it's expecting the #AC, i.e. has
> > +		 * legacy alignment check enabled. Pre-check host split lock
> > +		 * turned on to avoid the VMREADs needed to check legacy #AC,
> > +		 * i.e. reflect the #AC if the only possible source is legacy
> > +		 * alignment checks.
> > +		 */
> > +		if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) ||
>
> I think the right thing to do here is to make this really independent of
> that feature, i.e. inject the exception if
>
>      (CPL==3 && CR0.AM && EFLAGS.AC) || (FUTURE && (GUEST_TEST_CTRL & SLD))
>
> iow. when it's really clear that the guest asked for it. If there is an
> actual #AC with SLD disabled and !(CPL==3 && CR0.AM && EFLAGS.AC) then
> something is badly wrong and the thing should just die. That's why I
> separated handle_guest_split_lock() and tell about that case.

That puts KVM in a weird spot if/when intercepting #AC is no longer
necessary, e.g. "if" future CPUs happen to gain a feature that traps into
the hypervisor (KVM) if a potential near-infinite ucode loop is detected.

The only reason KVM intercepts #AC (before split-lock) is to prevent a
malicious guest from executing a DoS attack on the host by putting the #AC
handler in ring 3. Current CPUs will get stuck in ucode vectoring #AC
faults more or less indefinitely, e.g. long enough to trigger watchdogs in
the host.

Injecting #AC if and only if KVM is 100% certain the guest wants the #AC
would lead to divergent behavior if KVM chose to not intercept #AC, e.g.
some theoretical unknown #AC source would conditionally result in exits to
userspace depending on whether or not KVM wanted to intercept #AC for
other reasons. That's why we went with the approach of reflecting #AC
unless KVM detected that the #AC was host-induced.

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest
  2020-04-02 17:40             ` Sean Christopherson
@ 2020-04-02 20:07               ` Thomas Gleixner
  2020-04-02 20:36                 ` Andy Lutomirski
                                   ` (2 more replies)
  0 siblings, 3 replies; 76+ messages in thread
From: Thomas Gleixner @ 2020-04-02 20:07 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li,
	Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra,
	Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, kvm, linux-kernel

Sean,

Sean Christopherson <sean.j.christopherson@intel.com> writes:
> On Thu, Apr 02, 2020 at 07:19:44PM +0200, Thomas Gleixner wrote:
>> Sean Christopherson <sean.j.christopherson@intel.com> writes:
>> > +	case AC_VECTOR:
>> > +		/*
>> > +		 * Reflect #AC to the guest if it's expecting the #AC, i.e. has
>> > +		 * legacy alignment check enabled. Pre-check host split lock
>> > +		 * turned on to avoid the VMREADs needed to check legacy #AC,
>> > +		 * i.e. reflect the #AC if the only possible source is legacy
>> > +		 * alignment checks.
>> > +		 */
>> > +		if (!boot_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) ||
>>
>> I think the right thing to do here is to make this really independent of
>> that feature, i.e. inject the exception if
>>
>>      (CPL==3 && CR0.AM && EFLAGS.AC) || (FUTURE && (GUEST_TEST_CTRL & SLD))
>>
>> iow. when it's really clear that the guest asked for it. If there is an
>> actual #AC with SLD disabled and !(CPL==3 && CR0.AM && EFLAGS.AC) then
>> something is badly wrong and the thing should just die. That's why I
>> separated handle_guest_split_lock() and tell about that case.
>
> That puts KVM in a weird spot if/when intercepting #AC is no longer
> necessary, e.g. "if" future CPUs happen to gain a feature that traps into
> the hypervisor (KVM) if a potential near-infinite ucode loop is detected.
>
> The only reason KVM intercepts #AC (before split-lock) is to prevent a
> malicious guest from executing a DoS attack on the host by putting the #AC
> handler in ring 3. Current CPUs will get stuck in ucode vectoring #AC
> faults more or less indefinitely, e.g. long enough to trigger watchdogs in
> the host.

Which is thankfully well documented in the VMX code and the
corresponding chapter in the SDM.

> Injecting #AC if and only if KVM is 100% certain the guest wants the #AC
> would lead to divergent behavior if KVM chose to not intercept #AC, e.g.

AFAICT, #AC is not really something which is performance relevant, but I
might obviously be uninformed on that.

Assumed it is not, then there is neither a hard requirement nor a real
incentive to give up on intercepting #AC even when future CPUs have a
fix for the above wreckage.

> some theoretical unknown #AC source would conditionally result in exits to
> userspace depending on whether or not KVM wanted to intercept #AC for
> other reasons.

I'd rather like to know when there is an unknown #AC source instead of
letting the guest silently swallow it.

TBH, the more I learn about this, the more I tend to just give up on
this whole split lock stuff in its current form and wait until HW folks
provide something which is actually usable:

  - Per thread
  - Properly distinguishable from a regular #AC via error code

OTOH, that means I won't be able to use it before retirement. Oh well.

Thanks,

        tglx

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest
  2020-04-02 20:07               ` Thomas Gleixner
@ 2020-04-02 20:36                 ` Andy Lutomirski
  2020-04-02 20:48                 ` Peter Zijlstra
  2020-04-02 20:51                 ` Sean Christopherson
  2 siblings, 0 replies; 76+ messages in thread
From: Andy Lutomirski @ 2020-04-02 20:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Sean Christopherson, x86, Kenneth R . Crudup, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, kvm, linux-kernel

> On Apr 2, 2020, at 1:07 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> TBH, the more I learn about this, the more I tend to just give up on
> this whole split lock stuff in its current form and wait until HW folks
> provide something which is actually usable:
>
> - Per thread
> - Properly distinguishable from a regular #AC via error code

Why the latter? I would argue that #AC from CPL3 with EFLAGS.AC set is
almost by construction not a split lock. In particular, if you meet these
conditions, how exactly can you do a split lock without simultaneously
triggering an alignment check? (Maybe CMPXCHG16B?)

> OTOH, that means I won't be able to use it before retirement. Oh well.
>
> Thanks,
>
> tglx

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest
  2020-04-02 20:07               ` Thomas Gleixner
  2020-04-02 20:36                 ` Andy Lutomirski
@ 2020-04-02 20:48                 ` Peter Zijlstra
  2020-04-02 20:51                 ` Sean Christopherson
  2 siblings, 0 replies; 76+ messages in thread
From: Peter Zijlstra @ 2020-04-02 20:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Sean Christopherson, x86, Kenneth R . Crudup, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, kvm, linux-kernel

On Thu, Apr 02, 2020 at 10:07:07PM +0200, Thomas Gleixner wrote:

If we're doing wish lists, I have one more ;-)

> TBH, the more I learn about this, the more I tend to just give up on
> this whole split lock stuff in its current form and wait until HW folks
> provide something which is actually usable:
>
> - Per thread
> - Properly distinguishable from a regular #AC via error code

  - is an actual alignment check. That is, disallow all unaligned LOCK
    prefixed ops, not just those that happen to straddle a cacheline.

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest 2020-04-02 20:07 ` Thomas Gleixner 2020-04-02 20:36 ` Andy Lutomirski 2020-04-02 20:48 ` Peter Zijlstra @ 2020-04-02 20:51 ` Sean Christopherson 2020-04-02 22:27 ` Thomas Gleixner 2 siblings, 1 reply; 76+ messages in thread From: Sean Christopherson @ 2020-04-02 20:51 UTC (permalink / raw) To: Thomas Gleixner Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel On Thu, Apr 02, 2020 at 10:07:07PM +0200, Thomas Gleixner wrote: > Sean Christopherson <sean.j.christopherson@intel.com> writes: > > On Thu, Apr 02, 2020 at 07:19:44PM +0200, Thomas Gleixner wrote: > >> Sean Christopherson <sean.j.christopherson@intel.com> writes: > > That puts KVM in a weird spot if/when intercepting #AC is no longer > > necessary, e.g. "if" future CPUs happen to gain a feature that traps into > > the hypervisor (KVM) if a potential near-infinite ucode loop is detected. > > > > The only reason KVM intercepts #AC (before split-lock) is to prevent a > > malicious guest from executing a DoS attack on the host by putting the #AC > > handler in ring 3. Current CPUs will get stuck in ucode vectoring #AC > > faults more or less indefinitely, e.g. long enough to trigger watchdogs in > > the host. > > Which is thankfully well documented in the VMX code and the > corresponding chapter in the SDM. > > > Injecting #AC if and only if KVM is 100% certain the guest wants the #AC > > would lead to divergent behavior if KVM chose to not intercept #AC, e.g. > > AFAICT, #AC is not really something which is performance relevant, but I > might obviously be uninformed on that. > > Assumed it is not, then there is neither a hard requirement nor a real > incentive to give up on intercepting #AC even when future CPUs have a > fix for the above wreckage. 
Agreed that there's no hard requirement, but generally speaking, the less
KVM needs to poke into the guest the better.

> > some theoretical unknown #AC source would conditionally result in exits to
> > userspace depending on whether or not KVM wanted to intercept #AC for
> > other reasons.
>
> I'd rather like to know when there is an unknown #AC source instead of
> letting the guest silently swallow it.

Trying to prevent the guest from squashing a spurious fault is a fool's
errand. For example, with nested virtualization, the correct behavior
from an architectural perspective is to forward exceptions from L2 (the
nested VM) to L1 (the direct VM) that L1 wants to intercept. E.g. if L1
wants to intercept #AC faults that happen in L2, then KVM reflects all #AC
faults into L1 as VM-Exits without ever reaching this code.

Nested virt aside, detecting spurious #AC and a few other exceptions is
mostly feasible, but for many exceptions it's flat out impossible.

Anyways, this particular case isn't a sticking point, i.e. I'd be ok with
exiting to userspace on a spurious #AC, I just don't see the value in doing
so. Returning KVM_EXIT_EXCEPTION doesn't necessarily equate to throwing up
a red flag, e.g. from a kernel perspective you'd still be relying on the
userspace VMM to report the error in a sane manner. I think at one point
Xiaoyao had a WARN_ON for a spurious #AC, but it was removed because the
odds of a false positive due to some funky corner case seemed higher than
detecting a CPU bug.

> TBH, the more I learn about this, the more I tend to just give up on
> this whole split lock stuff in its current form and wait until HW folks
> provide something which is actually usable:
>
> - Per thread
> - Properly distinguishable from a regular #AC via error code
>
> OTOH, that means I won't be able to use it before retirement. Oh well.
>
> Thanks,
>
> tglx

^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest 2020-04-02 20:51 ` Sean Christopherson @ 2020-04-02 22:27 ` Thomas Gleixner 2020-04-02 22:40 ` Nadav Amit 0 siblings, 1 reply; 76+ messages in thread From: Thomas Gleixner @ 2020-04-02 22:27 UTC (permalink / raw) To: Sean Christopherson Cc: x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel Sean, Sean Christopherson <sean.j.christopherson@intel.com> writes: > On Thu, Apr 02, 2020 at 10:07:07PM +0200, Thomas Gleixner wrote: >> Sean Christopherson <sean.j.christopherson@intel.com> writes: >> AFAICT, #AC is not really something which is performance relevant, but I >> might obviously be uninformed on that. >> >> Assumed it is not, then there is neither a hard requirement nor a real >> incentive to give up on intercepting #AC even when future CPUs have a >> fix for the above wreckage. > > Agreed that there's no hard requirement, but general speaking, the less KVM > needs to poke into the guest the better. Fair enough. >> > some theoretical unknown #AC source would conditionally result in exits to >> > userspace depending on whether or not KVM wanted to intercept #AC for >> > other reasons. >> >> I'd rather like to know when there is an unknown #AC source instead of >> letting the guest silently swallow it. > > Trying to prevent the guest from squashing a spurious fault is a fools > errand. For example, with nested virtualization, the correct behavior > from an architectural perspective is to forward exceptions from L2 (the > nested VM) to L1 (the direct VM) that L1 wants to intercept. E.g. if L1 > wants to intercept #AC faults that happen in L2, then KVM reflects all #AC > faults into L1 as VM-Exits without ever reaching this code. 
Which means L1 should have some handling for that case, at least for
those L1 hypervisors which we can fix. If we want to go there.

> Anyways, this particular case isn't a sticking point, i.e. I'd be ok with
> exiting to userspace on a spurious #AC, I just don't see the value in doing
> so. Returning KVM_EXIT_EXCEPTION doesn't necessarily equate to throwing up
> a red flag, e.g. from a kernel perspective you'd still be relying on the
> userspace VMM to report the error in a sane manner. I think at one point
> Xiaoyao had a WARN_ON for a spurious #AC, but it was removed because the
> odds of a false positive due to some funky corner case seemed higher than
> detecting a CPU bug.

Agreed. Relying on the user space side to crash and burn the guest is
probably wishful thinking. So the right thing might be to just kill it
right at the spot.

But coming back to the above discussion:

	if (!cpu_has(SLD) || guest_wants_regular_ac()) {
		kvm_queue_exception_e();
		return 1;
	}

vs.

	if (guest_wants_regular_ac()) {
		kvm_queue_exception_e();
		return 1;
	}

The only situation where this makes a difference is when the CPU does
not support SLD. If it does, the thing became ambiguous today.

With my previous attempt to bring sanity into this by not setting the
feature flag when SLD is disabled at the command line, the check is
consistent. But the detection of unaware hypervisors with the module
scan brings us into a situation where we have sld_state == sld_off and
the feature flag being set because we can't undo it anymore.

So now you load VMWare which disables SLD, but the feature bit stays and
then when you unload it and load VMX afterwards then you have exactly
the same situation as with the feature check removed. Consistency gone.

So with that we might want to go back to the sld_state check instead of
the cpu feature check, but that does not make it more consistent:

As I just verified, it's possible to load the vmware module parallel
to the KVM/VMX one.
So either we deal with it in some way or just decide that SLD and HV modules which do not have the MOD_INFO(sld_safe) magic cannot be loaded when SLD is enabled on the host. I'm fine with the latter :) What a mess. Thanks, tglx ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest 2020-04-02 22:27 ` Thomas Gleixner @ 2020-04-02 22:40 ` Nadav Amit 2020-04-02 23:03 ` Thomas Gleixner 2020-04-02 23:08 ` Steven Rostedt 0 siblings, 2 replies; 76+ messages in thread From: Nadav Amit @ 2020-04-02 22:40 UTC (permalink / raw) To: Thomas Gleixner Cc: Sean Christopherson, x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, LKML, Doug Covelli > On Apr 2, 2020, at 3:27 PM, Thomas Gleixner <tglx@linutronix.de> wrote: > > As I just verified, it's possible to load the vmware module parallel > to the KVM/VMX one. > > So either we deal with it in some way or just decide that SLD and HV > modules which do not have the MOD_INFO(sld_safe) magic cannot be loaded > when SLD is enabled on the host. I'm fine with the latter :) > > What a mess. [ +Doug ] Just to communicate the information that was given to me: we do intend to fix the SLD issue in VMware and if needed to release a minor version that addresses it. Having said that, there are other hypervisors, such as virtualbox or jailhouse, which would have a similar issue. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest 2020-04-02 22:40 ` Nadav Amit @ 2020-04-02 23:03 ` Thomas Gleixner 2020-04-02 23:08 ` Steven Rostedt 1 sibling, 0 replies; 76+ messages in thread From: Thomas Gleixner @ 2020-04-02 23:03 UTC (permalink / raw) To: Nadav Amit Cc: Sean Christopherson, x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, LKML, Doug Covelli Nadav Amit <namit@vmware.com> writes: > Just to communicate the information that was given to me: we do intend to > fix the SLD issue in VMware and if needed to release a minor version that > addresses it. Having said that, there are other hypervisors, such as > virtualbox or jailhouse, which would have a similar issue. I'm well aware of that, and even if VMWare fixes this, it will still trip up users who fail to install updates for one reason or another and leave them puzzled. Maybe we just should not care at all. Despite that I might have mentioned it before: What a mess ... Thanks, tglx ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest 2020-04-02 22:40 ` Nadav Amit 2020-04-02 23:03 ` Thomas Gleixner @ 2020-04-02 23:08 ` Steven Rostedt 2020-04-02 23:16 ` Kenneth R. Crudup 1 sibling, 1 reply; 76+ messages in thread From: Steven Rostedt @ 2020-04-02 23:08 UTC (permalink / raw) To: Nadav Amit Cc: Thomas Gleixner, Sean Christopherson, x86, Kenneth R . Crudup, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, LKML, Doug Covelli On Thu, 2 Apr 2020 22:40:03 +0000 Nadav Amit <namit@vmware.com> wrote: > > On Apr 2, 2020, at 3:27 PM, Thomas Gleixner <tglx@linutronix.de> wrote: > > > > As I just verified, it's possible to load the vmware module parallel > > to the KVM/VMX one. > > > > So either we deal with it in some way or just decide that SLD and HV > > modules which do not have the MOD_INFO(sld_safe) magic cannot be loaded > > when SLD is enabled on the host. I'm fine with the latter :) > > > > What a mess. > > [ +Doug ] > > Just to communicate the information that was given to me: we do intend to > fix the SLD issue in VMware and if needed to release a minor version that > addresses it. Having said that, there are other hypervisors, such as > virtualbox or jailhouse, which would have a similar issue. If we go the approach of not letting VM modules load when they don't have the sld_safe flag set, how is this different from a VM module not loading due to kABI breakage? If we prevent it from loading (and keep from having to go into this inconsistent state that Thomas described), it would encourage people to get the latest modules, and give the maintainers of said modules motivation to update them. -- Steve ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest 2020-04-02 23:08 ` Steven Rostedt @ 2020-04-02 23:16 ` Kenneth R. Crudup 2020-04-02 23:18 ` Jim Mattson 0 siblings, 1 reply; 76+ messages in thread From: Kenneth R. Crudup @ 2020-04-02 23:16 UTC (permalink / raw) To: Steven Rostedt Cc: Nadav Amit, Thomas Gleixner, Sean Christopherson, x86, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, LKML, Doug Covelli On Thu, 2 Apr 2020, Steven Rostedt wrote: > If we go the approach of not letting VM modules load if it doesn't have the > sld_safe flag set, how is this different than a VM module not loading due > to kabi breakage? Why not a compromise: if such a module is attempted to be loaded, print up a message saying something akin to "turn the parameter 'split_lock_detect' off" as we reject loading it- and if we see that we've booted with it off just splat a WARN_ON() if someone tries to load such modules? -Kenny -- Kenneth R. Crudup Sr. SW Engineer, Scott County Consulting, Silicon Valley ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest 2020-04-02 23:16 ` Kenneth R. Crudup @ 2020-04-02 23:18 ` Jim Mattson 2020-04-03 12:16 ` Thomas Gleixner 0 siblings, 1 reply; 76+ messages in thread From: Jim Mattson @ 2020-04-02 23:18 UTC (permalink / raw) To: Kenneth R. Crudup Cc: Steven Rostedt, Nadav Amit, Thomas Gleixner, Sean Christopherson, x86, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Vitaly Kuznetsov, Wanpeng Li, kvm, LKML, Doug Covelli On Thu, Apr 2, 2020 at 4:16 PM Kenneth R. Crudup <kenny@panix.com> wrote: > > > On Thu, 2 Apr 2020, Steven Rostedt wrote: > > > If we go the approach of not letting VM modules load if it doesn't have the > > sld_safe flag set, how is this different than a VM module not loading due > > to kabi breakage? > > Why not a compromise: if such a module is attempted to be loaded, print up > a message saying something akin to "turn the parameter 'split_lock_detect' > off" as we reject loading it- and if we see that we've booted with it off > just splat a WARN_ON() if someone tries to load such modules? What modules are we talking about? I thought we were discussing L1 hypervisors, which are just binary blobs. The only modules at the L0 level are kvm and kvm_intel. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest 2020-04-02 23:18 ` Jim Mattson @ 2020-04-03 12:16 ` Thomas Gleixner 0 siblings, 0 replies; 76+ messages in thread From: Thomas Gleixner @ 2020-04-03 12:16 UTC (permalink / raw) To: Jim Mattson, Kenneth R. Crudup Cc: Steven Rostedt, Nadav Amit, Sean Christopherson, x86, Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Vitaly Kuznetsov, Wanpeng Li, kvm, LKML, Doug Covelli Jim, Jim Mattson <jmattson@google.com> writes: > On Thu, Apr 2, 2020 at 4:16 PM Kenneth R. Crudup <kenny@panix.com> wrote: >> On Thu, 2 Apr 2020, Steven Rostedt wrote: >> >> > If we go the approach of not letting VM modules load if it doesn't have the >> > sld_safe flag set, how is this different than a VM module not loading due >> > to kabi breakage? >> >> Why not a compromise: if such a module is attempted to be loaded, print up >> a message saying something akin to "turn the parameter 'split_lock_detect' >> off" as we reject loading it- and if we see that we've booted with it off >> just splat a WARN_ON() if someone tries to load such modules? > > What modules are we talking about? I thought we were discussing L1 > hypervisors, which are just binary blobs. The only modules at the L0 > level are kvm and kvm_intel. Maybe in your world, but VmWare (which got this started), VirtualBox, Jailhouse and who knows what else _are_ L0 hypervisors. Otherwise we wouldn't have that conversation at all. Thanks, tglx ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling 2020-04-02 15:55 ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Sean Christopherson ` (2 preceding siblings ...) 2020-04-02 15:55 ` [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest Sean Christopherson @ 2020-04-10 10:23 ` Paolo Bonzini 2020-04-10 11:14 ` Thomas Gleixner 3 siblings, 1 reply; 76+ messages in thread From: Paolo Bonzini @ 2020-04-10 10:23 UTC (permalink / raw) To: Sean Christopherson, Thomas Gleixner Cc: x86, Kenneth R . Crudup, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel On 02/04/20 17:55, Sean Christopherson wrote: > First three patches from Xiaoyao's series to add split-lock #AC support > in KVM. > > Xiaoyao Li (3): > KVM: x86: Emulate split-lock access as a write in emulator > x86/split_lock: Refactor and export handle_user_split_lock() for KVM > KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in > guest Sorry I was out of the loop on this (I'm working part time and it's a mess). Sean, can you send the patches as a top-level message? I'll queue them and get them to Linus over the weekend. Paolo ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling 2020-04-10 10:23 ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Paolo Bonzini @ 2020-04-10 11:14 ` Thomas Gleixner 0 siblings, 0 replies; 76+ messages in thread From: Thomas Gleixner @ 2020-04-10 11:14 UTC (permalink / raw) To: Paolo Bonzini, Sean Christopherson Cc: x86, Kenneth R . Crudup, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck, Peter Zijlstra, Jessica Yu, Steven Rostedt, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, kvm, linux-kernel Paolo Bonzini <pbonzini@redhat.com> writes: > On 02/04/20 17:55, Sean Christopherson wrote: >> First three patches from Xiaoyao's series to add split-lock #AC support >> in KVM. >> >> Xiaoyao Li (3): >> KVM: x86: Emulate split-lock access as a write in emulator >> x86/split_lock: Refactor and export handle_user_split_lock() for KVM >> KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in >> guest > > Sorry I was out of the loop on this (I'm working part time and it's a > mess). Sean, can you send the patches as a top-level message? I'll > queue them and get them to Linus over the weekend. I have a reworked version. I'll post it after lunch. Thanks, tglx ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors 2020-04-02 12:32 [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors Thomas Gleixner 2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner 2020-04-02 12:33 ` [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage Thomas Gleixner @ 2020-04-02 13:43 ` Kenneth R. Crudup 2020-04-02 14:32 ` Peter Zijlstra 2020-04-02 14:37 ` Thomas Gleixner 2 siblings, 2 replies; 76+ messages in thread From: Kenneth R. Crudup @ 2020-04-02 13:43 UTC (permalink / raw) To: Thomas Gleixner Cc: LKML, x86, Peter Zijlstra (Intel), Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt On Thu, 2 Apr 2020, Thomas Gleixner wrote: > As Peter and myself don't have access to a SLD enabled machine, the > KVM/VMX part is untested. The module scan part works. I just applied both of these patches to my (Linus' tip) tree, and unfortunately VMWare still hangs if split_lock_detect= is set to anything but "off". Was there anything else I'd needed to do? -Kenny -- Kenneth R. Crudup Sr. SW Engineer, Scott County Consulting, Silicon Valley ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors 2020-04-02 13:43 ` [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors Kenneth R. Crudup @ 2020-04-02 14:32 ` Peter Zijlstra 2020-04-02 14:41 ` Kenneth R. Crudup 2020-04-02 14:37 ` Thomas Gleixner 1 sibling, 1 reply; 76+ messages in thread From: Peter Zijlstra @ 2020-04-02 14:32 UTC (permalink / raw) To: Kenneth R. Crudup Cc: Thomas Gleixner, LKML, x86, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt On Thu, Apr 02, 2020 at 06:43:19AM -0700, Kenneth R. Crudup wrote: > > On Thu, 2 Apr 2020, Thomas Gleixner wrote: > > > As Peter and myself don't have access to a SLD enabled machine, the > > KVM/VMX part is untested. The module scan part works. > > I just applied both of these patches to my (Linus' tip) tree, and unfortunately > VMWare still hangs if split_lock_detect= is set to anything but "off". > > Was there anything else I'd needed to do? Hmm, you're not seeing this: + pr_warn("disabled due to VMLAUNCH in module: %s\n", me->name); fire when you load the vmware kernel module? Could you slip me a copy of that thing by private mail? ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors 2020-04-02 14:32 ` Peter Zijlstra @ 2020-04-02 14:41 ` Kenneth R. Crudup 2020-04-02 14:46 ` Peter Zijlstra 0 siblings, 1 reply; 76+ messages in thread From: Kenneth R. Crudup @ 2020-04-02 14:41 UTC (permalink / raw) To: Peter Zijlstra Cc: Thomas Gleixner, LKML, x86, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt On Thu, 2 Apr 2020, Peter Zijlstra wrote: > Hmm, you're not seeing this: > + pr_warn("disabled due to VMLAUNCH in module: %s\n", me->name); > fire when you load the vmware kernel module? I just looked back at the syslog's copy of the kernel messages at the time I'd tried it, and no, I don't see that message. > Could you slip me a copy of that thing by private mail? OK, gimme a couple of days though, I've gotta get a little work done. (Also, what "thing" exactly did you want?) -Kenny -- Kenneth R. Crudup Sr. SW Engineer, Scott County Consulting, Silicon Valley ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors 2020-04-02 14:41 ` Kenneth R. Crudup @ 2020-04-02 14:46 ` Peter Zijlstra 2020-04-02 14:53 ` Kenneth R. Crudup 0 siblings, 1 reply; 76+ messages in thread From: Peter Zijlstra @ 2020-04-02 14:46 UTC (permalink / raw) To: Kenneth R. Crudup Cc: Thomas Gleixner, LKML, x86, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt On Thu, Apr 02, 2020 at 07:41:41AM -0700, Kenneth R. Crudup wrote: > > On Thu, 2 Apr 2020, Peter Zijlstra wrote: > > > Hmm, you're not seeing this: > > + pr_warn("disabled due to VMLAUNCH in module: %s\n", me->name); > > fire when you load the vmware kernel module? > > I just looked back at the syslog's copy of the kernel messages at the time > I'd tried it, and no, I don't see that message. Dang! > > Could you slip me a copy of that thing by private mail? > > OK, gimme a couple of days though, I've gotta get a little work done. > (Also, what "thing" exactly did you want?) All the .ko files that come with vmware. I want to dig through them to see why the VMLAUNCH detection thing isn't working. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors 2020-04-02 14:46 ` Peter Zijlstra @ 2020-04-02 14:53 ` Kenneth R. Crudup 0 siblings, 0 replies; 76+ messages in thread From: Kenneth R. Crudup @ 2020-04-02 14:53 UTC (permalink / raw) To: Peter Zijlstra Cc: Thomas Gleixner, LKML, x86, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt > > (Also, what "thing" exactly did you want?) On Thu, 2 Apr 2020, Peter Zijlstra wrote: > All the .ko files that come with vmware. Ah, gotcha. One thing you/VMWare may want to look at is it turns out that "vmw_vmci", part of the kernel tree (CONFIG_VMWARE_VMCI) has to be compiled into the kernel as a module so udev can properly create the Misc device node (I'd tried making it a built-in and messing with udev rules and the other, compiled-in-later VMWare module loading system didn't seem to like that). Maybe some sort of mitigation for this can be done there, putting it back in-tree? But anyway, I'll send you a .tar.bz2 in a little bit. -Kenny -- Kenneth R. Crudup Sr. SW Engineer, Scott County Consulting, Silicon Valley ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors 2020-04-02 13:43 ` [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors Kenneth R. Crudup 2020-04-02 14:32 ` Peter Zijlstra @ 2020-04-02 14:37 ` Thomas Gleixner 2020-04-02 14:47 ` Nadav Amit 1 sibling, 1 reply; 76+ messages in thread From: Thomas Gleixner @ 2020-04-02 14:37 UTC (permalink / raw) To: Kenneth R. Crudup Cc: LKML, x86, Peter Zijlstra (Intel), Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt "Kenneth R. Crudup" <kenny@panix.com> writes: > On Thu, 2 Apr 2020, Thomas Gleixner wrote: > >> As Peter and myself don't have access to a SLD enabled machine, the >> KVM/VMX part is untested. The module scan part works. > > I just applied both of these patches to my (Linus' tip) tree, and unfortunately > VMWare still hangs if split_lock_detect= is set to anything but "off". > > Was there anything else I'd needed to do? Hmm. Not really. Does dmesg show the warning when the VMWare module loads? Something like: x86/split lock detection: disabled due to VMLAUNCH in module: .... Thanks, tglx ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors 2020-04-02 14:37 ` Thomas Gleixner @ 2020-04-02 14:47 ` Nadav Amit 2020-04-02 15:11 ` Peter Zijlstra 0 siblings, 1 reply; 76+ messages in thread From: Nadav Amit @ 2020-04-02 14:47 UTC (permalink / raw) To: Thomas Gleixner, Peter Zijlstra (Intel) Cc: Kenneth R. Crudup, LKML, x86, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt > On Apr 2, 2020, at 7:37 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > > "Kenneth R. Crudup" <kenny@panix.com> writes: > >> On Thu, 2 Apr 2020, Thomas Gleixner wrote: >> >>> As Peter and myself don't have access to a SLD enabled machine, the >>> KVM/VMX part is untested. The module scan part works. >> >> I just applied both of these patches to my (Linus' tip) tree, and unfortunately >> VMWare still hangs if split_lock_detect= is set to anything but "off". >> >> Was there anything else I'd needed to do? > > Hmm. Not really. Does dmesg show the warning when the VMWare module loads? > Something like: > > x86/split lock detection: disabled due to VMLAUNCH in module: …. I ran an objdump on VMware workstation and indeed I do not see a VMLAUNCH/VMRESUME. I do see however VMXON which should also be good for detecting hypervisors. I will try to understand why VMLAUNCH is not there. ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors 2020-04-02 14:47 ` Nadav Amit @ 2020-04-02 15:11 ` Peter Zijlstra 0 siblings, 0 replies; 76+ messages in thread From: Peter Zijlstra @ 2020-04-02 15:11 UTC (permalink / raw) To: Nadav Amit Cc: Thomas Gleixner, Kenneth R. Crudup, LKML, x86, Paolo Bonzini, Jessica Yu, Fenghua Yu, Xiaoyao Li, Thomas Hellstrom, Sean Christopherson, Tony Luck, Steven Rostedt On Thu, Apr 02, 2020 at 02:47:33PM +0000, Nadav Amit wrote: > > On Apr 2, 2020, at 7:37 AM, Thomas Gleixner <tglx@linutronix.de> wrote: > > > > "Kenneth R. Crudup" <kenny@panix.com> writes: > > > >> On Thu, 2 Apr 2020, Thomas Gleixner wrote: > >> > >>> As Peter and myself don't have access to a SLD enabled machine, the > >>> KVM/VMX part is untested. The module scan part works. > >> > >> I just applied both of these patches to my (Linus' tip) tree, and unfortunately > >> VMWare still hangs if split_lock_detect= is set to anything but "off". > >> > >> Was there anything else I'd needed to do? > > > > Hmm. Not really. Does dmesg show the warning when the VMWare module loads? > > Something like: > > > > x86/split lock detection: disabled due to VMLAUNCH in module: …. > > I ran an objdump on VMware workstation and indeed I do not see a > VMLAUNCH/VMRESUME. I do see however VMXON which should also be good for > detecting hypervisors. I will try to understand why VMLAUNCH is not there. Let me send a version with VMXON detection added in as well. ^ permalink raw reply [flat|nested] 76+ messages in thread
end of thread, other threads:[~2020-04-10 11:15 UTC | newest] Thread overview: 76+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-04-02 12:32 [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors Thomas Gleixner 2020-04-02 12:32 ` [patch 1/2] x86,module: Detect VMX modules and disable Split-Lock-Detect Thomas Gleixner 2020-04-02 15:23 ` [patch v2 " Peter Zijlstra 2020-04-02 16:20 ` Xiaoyao Li 2020-04-02 16:25 ` Peter Zijlstra 2020-04-02 16:39 ` Nadav Amit 2020-04-02 16:41 ` Xiaoyao Li 2020-04-02 17:34 ` Thomas Gleixner 2020-04-02 17:51 ` Sean Christopherson 2020-04-02 18:51 ` Peter Zijlstra 2020-04-02 20:23 ` Sean Christopherson 2020-04-02 21:04 ` Thomas Gleixner 2020-04-02 21:16 ` Sean Christopherson 2020-04-03 8:09 ` David Laight 2020-04-03 14:33 ` Peter Zijlstra 2020-04-02 23:42 ` [patch " Rasmus Villemoes 2020-04-03 14:35 ` Jessica Yu 2020-04-03 15:21 ` Peter Zijlstra 2020-04-03 16:01 ` Sean Christopherson 2020-04-03 16:12 ` Peter Zijlstra 2020-04-03 16:16 ` David Laight 2020-04-03 16:39 ` Peter Zijlstra 2020-04-03 16:25 ` Sean Christopherson 2020-04-03 16:40 ` Peter Zijlstra 2020-04-03 16:48 ` Nadav Amit 2020-04-03 17:21 ` Sean Christopherson 2020-04-03 18:53 ` Thomas Gleixner 2020-04-03 20:58 ` Andy Lutomirski 2020-04-03 21:49 ` Thomas Gleixner 2020-04-03 11:29 ` kbuild test robot 2020-04-03 14:43 ` kbuild test robot 2020-04-03 16:36 ` Sean Christopherson 2020-04-03 16:41 ` Peter Zijlstra 2020-04-03 18:35 ` Jessica Yu 2020-04-06 12:23 ` Christoph Hellwig 2020-04-06 14:40 ` Peter Zijlstra 2020-04-06 15:18 ` Christoph Hellwig 2020-04-06 15:22 ` Peter Zijlstra 2020-04-06 18:27 ` Steven Rostedt 2020-04-02 12:33 ` [patch 2/2] x86/kvm/vmx: Prevent split lock detection induced #AC wreckage Thomas Gleixner 2020-04-02 15:30 ` Sean Christopherson 2020-04-02 15:44 ` Nadav Amit 2020-04-02 16:04 ` Sean Christopherson 2020-04-02 16:56 ` Thomas Gleixner 2020-04-02 15:55 ` [PATCH 0/3] x86: KVM: 
VMX: Add basic split-lock #AC handling Sean Christopherson 2020-04-02 15:55 ` [PATCH 1/3] KVM: x86: Emulate split-lock access as a write in emulator Sean Christopherson 2020-04-02 15:55 ` [PATCH 2/3] x86/split_lock: Refactor and export handle_user_split_lock() for KVM Sean Christopherson 2020-04-02 17:01 ` Thomas Gleixner 2020-04-02 17:19 ` Sean Christopherson 2020-04-02 19:06 ` Thomas Gleixner 2020-04-10 4:39 ` Xiaoyao Li 2020-04-10 10:21 ` Paolo Bonzini 2020-04-02 15:55 ` [PATCH 3/3] KVM: VMX: Extend VMX's #AC interceptor to handle split lock #AC in guest Sean Christopherson 2020-04-02 17:19 ` Thomas Gleixner 2020-04-02 17:40 ` Sean Christopherson 2020-04-02 20:07 ` Thomas Gleixner 2020-04-02 20:36 ` Andy Lutomirski 2020-04-02 20:48 ` Peter Zijlstra 2020-04-02 20:51 ` Sean Christopherson 2020-04-02 22:27 ` Thomas Gleixner 2020-04-02 22:40 ` Nadav Amit 2020-04-02 23:03 ` Thomas Gleixner 2020-04-02 23:08 ` Steven Rostedt 2020-04-02 23:16 ` Kenneth R. Crudup 2020-04-02 23:18 ` Jim Mattson 2020-04-03 12:16 ` Thomas Gleixner 2020-04-10 10:23 ` [PATCH 0/3] x86: KVM: VMX: Add basic split-lock #AC handling Paolo Bonzini 2020-04-10 11:14 ` Thomas Gleixner 2020-04-02 13:43 ` [patch 0/2] x86: Prevent Split-Lock-Detection wreckage on VMX hypervisors Kenneth R. Crudup 2020-04-02 14:32 ` Peter Zijlstra 2020-04-02 14:41 ` Kenneth R. Crudup 2020-04-02 14:46 ` Peter Zijlstra 2020-04-02 14:53 ` Kenneth R. Crudup 2020-04-02 14:37 ` Thomas Gleixner 2020-04-02 14:47 ` Nadav Amit 2020-04-02 15:11 ` Peter Zijlstra