linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
@ 2020-04-03 16:30 Sean Christopherson
  2020-04-03 16:42 ` Peter Zijlstra
  2020-04-06 12:50 ` Christoph Hellwig
  0 siblings, 2 replies; 15+ messages in thread
From: Sean Christopherson @ 2020-04-03 16:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86
  Cc: H. Peter Anvin, linux-kernel, Kenneth R. Crudup, Peter Zijlstra,
	Jessica Yu, Rasmus Villemoes, Paolo Bonzini, Fenghua Yu,
	Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

Hook into native CR4 writes to disable split-lock detection if CR4.VMXE
is toggled on by an SDL-unaware entity, e.g. an out-of-tree hypervisor
module.  Most/all VMX-based hypervisors blindly reflect #AC exceptions
into the guest, or don't intercept #AC in the first place.  With SLD
enabled, this results in unexpected #AC faults in the guest, leading to
crashes in the guest and other undesirable behavior.

Reported-by: "Kenneth R. Crudup" <kenny@panix.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Cc: Kenneth R. Crudup <kenny@panix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Xiaoyao Li <xiaoyao.li@intel.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Doug Covelli <dcovelli@vmware.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
---

A bit ugly, but on the plus side the code is largely contained to intel.c.
I think forgoing the on_all_cpus() remote kill is safe?  The reverse is
definitely not safe as native_cr4_write() is called with IRQs disabled.
There's the weird discrepancy where SLD is still reported as supported and
enabled, but I don't think anything in the kernel will notice/care?

If we wanted, this could also make the "kill" temporary, e.g. undo it if
VMX is disabled.  But, that would require per-CPU tracking of VMXE, and
going that far out of our way to support out-of-tree hypervisors seems a
bit much.

The big unknown is whether or not existing out-of-tree modules play nice
and actually route through native_cr4_write().

 arch/x86/include/asm/cpu.h   | 18 ++++++++++++++++++
 arch/x86/kernel/cpu/common.c |  1 +
 arch/x86/kernel/cpu/intel.c  | 31 ++++++++++++++++++++++++++++++-
 3 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index ff6f3ca649b3..73c5f6478fb5 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -8,6 +8,8 @@
 #include <linux/nodemask.h>
 #include <linux/percpu.h>
 
+#include <asm/tlbflush.h>
+
 #ifdef CONFIG_SMP
 
 extern void prefill_possible_map(void);
@@ -44,6 +46,16 @@ unsigned int x86_stepping(unsigned int sig);
 extern void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c);
 extern void switch_to_sld(unsigned long tifn);
 extern bool handle_user_split_lock(struct pt_regs *regs, long error_code);
+extern void split_lock_kill(void);
+
+extern atomic_t cr4_vmxe_split_lock_safe;
+static inline void cr4_set_vmxe(void)
+{
+	atomic_inc(&cr4_vmxe_split_lock_safe);
+	cr4_set_bits(X86_CR4_VMXE);
+	atomic_dec(&cr4_vmxe_split_lock_safe);
+}
+extern void split_lock_cr4_write(unsigned long val);
 #else
 static inline void __init cpu_set_core_cap_bits(struct cpuinfo_x86 *c) {}
 static inline void switch_to_sld(unsigned long tifn) {}
@@ -51,5 +63,11 @@ static inline bool handle_user_split_lock(struct pt_regs *regs, long error_code)
 {
 	return false;
 }
+static inline void split_lock_kill(void) {}
+static inline void cr4_set_vmxe(void)
+{
+	cr4_set_bits(X86_CR4_VMXE);
+}
+static inline void split_lock_cr4_write(unsigned long val) {}
 #endif
 #endif /* _ASM_X86_CPU_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index bed0cb83fe24..a50dda8d4409 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -373,6 +373,7 @@ void native_write_cr4(unsigned long val)
 {
 	unsigned long bits_missing = 0;
 
+	split_lock_cr4_write(val);
 set_register:
 	asm volatile("mov %0,%%cr4": "+r" (val), "+m" (cr4_pinned_bits));
 
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 9a26e972cdea..9cfe29d68457 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -46,6 +46,11 @@ enum split_lock_detect_state {
  */
 static enum split_lock_detect_state sld_state __ro_after_init = sld_off;
 static u64 msr_test_ctrl_cache __ro_after_init;
+static bool sld_killed __read_mostly;
+
+/* Cookie that states CR4.VMXE can be enabled without killing SLD. */
+atomic_t cr4_vmxe_split_lock_safe = ATOMIC_INIT(0);
+EXPORT_SYMBOL_GPL(cr4_vmxe_split_lock_safe);
 
 /*
  * Processors which have self-snooping capability can handle conflicting
@@ -1055,7 +1060,7 @@ static void sld_update_msr(bool on)
 {
 	u64 test_ctrl_val = msr_test_ctrl_cache;
 
-	if (on)
+	if (on && !sld_killed)
 		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;
 
 	wrmsrl(MSR_TEST_CTRL, test_ctrl_val);
@@ -1096,6 +1101,30 @@ void switch_to_sld(unsigned long tifn)
 	sld_update_msr(!(tifn & _TIF_SLD));
 }
 
+void split_lock_cr4_write(unsigned long val)
+{
+	/*
+	 * Out-of-tree hypervisors that aren't aware of split-lock will blindly
+	 * reflect split-lock #AC into their guests.  Kill split-lock detection
+	 * if an unaware entity enables VMX.
+	 */
+	if (!static_cpu_has(X86_FEATURE_VMX) ||
+	    !static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) ||
+	    !(val & X86_CR4_VMXE) || atomic_read(&cr4_vmxe_split_lock_safe) ||
+	    (native_read_cr4() & X86_CR4_VMXE))
+		return;
+
+	WARN_ON_ONCE(1);
+
+	/*
+	 * No need to forcefully disable SLD on other CPUs, they'll come here
+	 * if/when they set CR4.VMXE.  Set the global kill bit to prevent
+	 * re-enabling SLD, e.g. via switch_to_sld().
+	 */
+	sld_killed = true;
+	sld_update_msr(false);
+}
+
 #define SPLIT_LOCK_CPU(model) {X86_VENDOR_INTEL, 6, model, X86_FEATURE_ANY}
 
 /*
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-03 16:30 [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX Sean Christopherson
@ 2020-04-03 16:42 ` Peter Zijlstra
  2020-04-03 17:20   ` Sean Christopherson
  2020-04-06 12:50 ` Christoph Hellwig
  1 sibling, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2020-04-03 16:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, linux-kernel, Kenneth R. Crudup, Jessica Yu,
	Rasmus Villemoes, Paolo Bonzini, Fenghua Yu, Xiaoyao Li,
	Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt,
	Greg Kroah-Hartman, Jann Horn, Kees Cook, David Laight,
	Doug Covelli

On Fri, Apr 03, 2020 at 09:30:07AM -0700, Sean Christopherson wrote:
> Hook into native CR4 writes to disable split-lock detection if CR4.VMXE
> is toggled on by an SDL-unaware entity, e.g. an out-of-tree hypervisor
> module.  Most/all VMX-based hypervisors blindly reflect #AC exceptions
> into the guest, or don't intercept #AC in the first place.  With SLD
> enabled, this results in unexpected #AC faults in the guest, leading to
> crashes in the guest and other undesirable behavior.
> 
> Reported-by: "Kenneth R. Crudup" <kenny@panix.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jessica Yu <jeyu@kernel.org>
> Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> Cc: Kenneth R. Crudup <kenny@panix.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Fenghua Yu <fenghua.yu@intel.com>
> Cc: Xiaoyao Li <xiaoyao.li@intel.com>
> Cc: Nadav Amit <namit@vmware.com>
> Cc: Thomas Hellstrom <thellstrom@vmware.com>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jann Horn <jannh@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: David Laight <David.Laight@ACULAB.COM>
> Cc: Doug Covelli <dcovelli@vmware.com>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> ---
> 
> A bit ugly, but on the plus side the code is largely contained to intel.c.
> I think forgoing the on_all_cpus() remote kill is safe? 

How would it be safe? You can't control where the module text will be
ran, or how quickly.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-03 16:42 ` Peter Zijlstra
@ 2020-04-03 17:20   ` Sean Christopherson
  0 siblings, 0 replies; 15+ messages in thread
From: Sean Christopherson @ 2020-04-03 17:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, linux-kernel, Kenneth R. Crudup, Jessica Yu,
	Rasmus Villemoes, Paolo Bonzini, Fenghua Yu, Xiaoyao Li,
	Nadav Amit, Thomas Hellstrom, Tony Luck, Steven Rostedt,
	Greg Kroah-Hartman, Jann Horn, Kees Cook, David Laight,
	Doug Covelli

On Fri, Apr 03, 2020 at 06:42:44PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 03, 2020 at 09:30:07AM -0700, Sean Christopherson wrote:
> > Hook into native CR4 writes to disable split-lock detection if CR4.VMXE
> > is toggled on by an SDL-unaware entity, e.g. an out-of-tree hypervisor
> > module.  Most/all VMX-based hypervisors blindly reflect #AC exceptions
> > into the guest, or don't intercept #AC in the first place.  With SLD
> > enabled, this results in unexpected #AC faults in the guest, leading to
> > crashes in the guest and other undesirable behavior.
> > 
> > Reported-by: "Kenneth R. Crudup" <kenny@panix.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Jessica Yu <jeyu@kernel.org>
> > Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> > Cc: Kenneth R. Crudup <kenny@panix.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Fenghua Yu <fenghua.yu@intel.com>
> > Cc: Xiaoyao Li <xiaoyao.li@intel.com>
> > Cc: Nadav Amit <namit@vmware.com>
> > Cc: Thomas Hellstrom <thellstrom@vmware.com>
> > Cc: Tony Luck <tony.luck@intel.com>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Jann Horn <jannh@google.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: David Laight <David.Laight@ACULAB.COM>
> > Cc: Doug Covelli <dcovelli@vmware.com>
> > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
> > ---
> > 
> > A bit ugly, but on the plus side the code is largely contained to intel.c.
> > I think forgoing the on_all_cpus() remote kill is safe? 
> 
> How would it be safe? You can't control where the module text will be
> ran, or how quickly.

Ugh, I forgot about the stupid core scope behavior.

CR4.VMXE needs to be set on every logical CPU before that CPU can do VMXON
and enter a guest, so every CPU will come through this code and locally
disable SLD.

But, a SMT sibling could race on the WRMSR and re-enable SLD on the CPU
that just killed SLD.  Waiting until other CPUs stop enabling SLD should
work.  Something like this?  Disclaimer, memory ordering isn't my forte.

static atomic_t enabling_sld = ATOMIC_INIT(0);

static void sld_update_msr(bool on)
{
	u64 test_ctrl_val = msr_test_ctrl_cache;

	if (on && !sld_killed)
		test_ctrl_val |= MSR_TEST_CTRL_SPLIT_LOCK_DETECT;

	if (test_ctrl_val & MSR_TEST_CTRL_SPLIT_LOCK_DETECT)
		atomic_inc(&enabling_sld);

	wrmsrl(MSR_TEST_CTRL, test_ctrl_val);

	if (test_ctrl_val & MSR_TEST_CTRL_SPLIT_LOCK_DETECT)
		atomic_dec(&enabling_sld);
}

void split_lock_cr4_write(unsigned long val)
{
	u64 ctrl;

	/*
	 * Out-of-tree hypervisors that aren't aware of split-lock will blindly
	 * reflect split-lock #AC into their guests.  Kill split-lock detection
	 * if an unaware entity enables VMX.
	 */
	if (!static_cpu_has(X86_FEATURE_VMX) ||
	    !static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT) ||
	    !(val & X86_CR4_VMXE) || atomic_read(&cr4_vmxe_split_lock_safe) ||
	    (native_read_cr4() & X86_CR4_VMXE))
		return;

	WARN_ON_ONCE(1);

	/*
	 * Set the global kill flag to prevent re-enabling SLD, e.g. via
	 * switch_to_sld().
	 */
	WRITE_ONCE(sld_killed, true);

	/*
	 * No need to forcefully disable SLD on other CPUs, they'll come here
	 * if/when they set CR4.VMXE.  But, wait until no other threads are
	 * enabling SLD, i.e. have seen sld_killed, as the MSR may be shared
	 * by SMT siblings.
	 */
	while (atomic_read(&enabling_sld));
	sld_update_msr(false);
}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-03 16:30 [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX Sean Christopherson
  2020-04-03 16:42 ` Peter Zijlstra
@ 2020-04-06 12:50 ` Christoph Hellwig
  2020-04-06 14:04   ` Peter Zijlstra
  2020-04-06 21:37   ` Thomas Gleixner
  1 sibling, 2 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-04-06 12:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	H. Peter Anvin, linux-kernel, Kenneth R. Crudup, Peter Zijlstra,
	Jessica Yu, Rasmus Villemoes, Paolo Bonzini, Fenghua Yu,
	Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Fri, Apr 03, 2020 at 09:30:07AM -0700, Sean Christopherson wrote:
> Hook into native CR4 writes to disable split-lock detection if CR4.VMXE
> is toggled on by an SDL-unaware entity, e.g. an out-of-tree hypervisor
> module.  Most/all VMX-based hypervisors blindly reflect #AC exceptions
> into the guest, or don't intercept #AC in the first place.  With SLD
> enabled, this results in unexpected #AC faults in the guest, leading to
> crashes in the guest and other undesirable behavior.

Out of tree modules do not matter, so we should not add code just to
work around broken third party code.  If you really feel strongly just
make sure something they rely on for their hacks stops being exported
and they are properly broken.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 12:50 ` Christoph Hellwig
@ 2020-04-06 14:04   ` Peter Zijlstra
  2020-04-06 14:34     ` Peter Zijlstra
  2020-04-06 15:24     ` Christoph Hellwig
  2020-04-06 21:37   ` Thomas Gleixner
  1 sibling, 2 replies; 15+ messages in thread
From: Peter Zijlstra @ 2020-04-06 14:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Mon, Apr 06, 2020 at 05:50:10AM -0700, Christoph Hellwig wrote:
> On Fri, Apr 03, 2020 at 09:30:07AM -0700, Sean Christopherson wrote:
> > Hook into native CR4 writes to disable split-lock detection if CR4.VMXE
> > is toggled on by an SDL-unaware entity, e.g. an out-of-tree hypervisor
> > module.  Most/all VMX-based hypervisors blindly reflect #AC exceptions
> > into the guest, or don't intercept #AC in the first place.  With SLD
> > enabled, this results in unexpected #AC faults in the guest, leading to
> > crashes in the guest and other undesirable behavior.
> 
> Out of tree modules do not matter, so we should not add code just to
> work around broken third party code.  If you really feel strongly just
> make sure something they rely on for their hacks stops being exported
> and they are properly broken.

Something like so then?

I couldn't find any in-tree users for unmap_kernel_range*(), and this
removes __get_vm_area() and with the ability to custom ranges. It also
removes map_vm_area() and replaces it with map_vm_area_nx() which kills
adding executable maps.

---
diff --git a/arch/powerpc/include/asm/pasemi_dma.h b/arch/powerpc/include/asm/pasemi_dma.h
index 712a0b32120f..305c54d56244 100644
--- a/arch/powerpc/include/asm/pasemi_dma.h
+++ b/arch/powerpc/include/asm/pasemi_dma.h
@@ -523,4 +523,6 @@ extern void pasemi_dma_free_fun(int fun);
 /* Initialize the library, must be called before any other functions */
 extern int pasemi_dma_init(void);
 
+extern struct vm_struct *get_phb_area(unsigned long size, unsigned long flags);
+
 #endif /* ASM_PASEMI_DMA_H */
diff --git a/arch/powerpc/platforms/pasemi/dma_lib.c b/arch/powerpc/platforms/pasemi/dma_lib.c
index 270fa3c0d372..0aae02df061d 100644
--- a/arch/powerpc/platforms/pasemi/dma_lib.c
+++ b/arch/powerpc/platforms/pasemi/dma_lib.c
@@ -617,3 +617,10 @@ int pasemi_dma_init(void)
 	return err;
 }
 EXPORT_SYMBOL(pasemi_dma_init);
+
+struct vm_struct *get_phb_area(unsigned long size, unsigned long flags)
+{
+	return __get_vm_area_node(size, 1, flags, PHB_IO_BASE, PHB_IO_END, NUMA_NO_NODE,
+				  GFP_KERNEL, __builtin_return_address(0));
+}
+EXPORT_SYMBOL_GPL(get_phb_area);
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 65c2ecd730c5..0bd5ea46f1d2 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -274,6 +274,11 @@ typedef struct pgprot { pgprotval_t pgprot; } pgprot_t;
 
 typedef struct { pgdval_t pgd; } pgd_t;
 
+static inline pgprot_t pgprot_nx(pgprot_t prot)
+{
+	return __pgprot(pgprot_val(prot) | _PAGE_NX);
+}
+
 #ifdef CONFIG_X86_PAE
 
 /*
diff --git a/drivers/pcmcia/electra_cf.c b/drivers/pcmcia/electra_cf.c
index f2741c04289d..42eb242b29f1 100644
--- a/drivers/pcmcia/electra_cf.c
+++ b/drivers/pcmcia/electra_cf.c
@@ -22,6 +22,8 @@
 #include <linux/of_platform.h>
 #include <linux/slab.h>
 
+#include <asm/pasemi_dma.h>
+
 #include <pcmcia/ss.h>
 
 static const char driver_name[] = "electra-cf";
@@ -204,7 +206,7 @@ static int electra_cf_probe(struct platform_device *ofdev)
 	cf->mem_base = ioremap(cf->mem_phys, cf->mem_size);
 	cf->io_size = PAGE_ALIGN(resource_size(&io));
 
-	area = __get_vm_area(cf->io_size, 0, PHB_IO_BASE, PHB_IO_END);
+	area = get_phb_area(cf->io_size, 0);
 	if (area == NULL) {
 		status = -ENOMEM;
 		goto fail1;
diff --git a/drivers/staging/media/ipu3/ipu3-dmamap.c b/drivers/staging/media/ipu3/ipu3-dmamap.c
index 7431322379f6..fc3fe85c340c 100644
--- a/drivers/staging/media/ipu3/ipu3-dmamap.c
+++ b/drivers/staging/media/ipu3/ipu3-dmamap.c
@@ -124,13 +124,13 @@ void *imgu_dmamap_alloc(struct imgu_device *imgu, struct imgu_css_map *map,
 	}
 
 	/* Now grab a virtual region */
-	map->vma = __get_vm_area(size, VM_USERMAP, VMALLOC_START, VMALLOC_END);
+	map->vma = get_vm_area(size, VM_USERMAP);
 	if (!map->vma)
 		goto out_unmap;
 
 	map->vma->pages = pages;
 	/* And map it in KVA */
-	if (map_vm_area(map->vma, PAGE_KERNEL, pages))
+	if (map_vm_area_nx(map->vma, PAGE_KERNEL, pages))
 		goto out_vunmap;
 
 	map->size = size;
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index e2e2bef07dd2..ee66c1f8b141 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -490,6 +490,10 @@ static inline int arch_unmap_one(struct mm_struct *mm,
 #define flush_tlb_fix_spurious_fault(vma, address) flush_tlb_page(vma, address)
 #endif
 
+#ifndef pgprot_nx
+#define pgprot_nx(prot)	(prot)
+#endif
+
 #ifndef pgprot_noncached
 #define pgprot_noncached(prot)	(prot)
 #endif
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6b8eeb0ecee5..a1c5faa6042e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2029,7 +2029,6 @@ void unmap_kernel_range_noflush(unsigned long addr, unsigned long size)
 {
 	vunmap_page_range(addr, addr + size);
 }
-EXPORT_SYMBOL_GPL(unmap_kernel_range_noflush);
 
 /**
  * unmap_kernel_range - unmap kernel VM area and flush cache and TLB
@@ -2047,7 +2046,6 @@ void unmap_kernel_range(unsigned long addr, unsigned long size)
 	vunmap_page_range(addr, end);
 	flush_tlb_kernel_range(addr, end);
 }
-EXPORT_SYMBOL_GPL(unmap_kernel_range);
 
 int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page **pages)
 {
@@ -2059,7 +2057,12 @@ int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page **pages)
 
 	return err > 0 ? 0 : err;
 }
-EXPORT_SYMBOL_GPL(map_vm_area);
+
+int map_vm_area_nx(struct vm_struct *area, pgprot prot, struct page **pages)
+{
+	return map_vm_area(area, pgprot_nx(prot), pages);
+}
+EXPORT_SYMBOL_GPL(map_vm_area_nx);
 
 static inline void setup_vmalloc_vm_locked(struct vm_struct *vm,
 	struct vmap_area *va, unsigned long flags, const void *caller)
@@ -2133,7 +2136,6 @@ struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
 	return __get_vm_area_node(size, 1, flags, start, end, NUMA_NO_NODE,
 				  GFP_KERNEL, __builtin_return_address(0));
 }
-EXPORT_SYMBOL_GPL(__get_vm_area);
 
 struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags,
 				       unsigned long start, unsigned long end,
@@ -2160,6 +2162,7 @@ struct vm_struct *get_vm_area(unsigned long size, unsigned long flags)
 				  NUMA_NO_NODE, GFP_KERNEL,
 				  __builtin_return_address(0));
 }
+EXPORT_SYMBOL_GPL(get_vm_area);
 
 struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags,
 				const void *caller)

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 14:04   ` Peter Zijlstra
@ 2020-04-06 14:34     ` Peter Zijlstra
  2020-04-06 15:24     ` Christoph Hellwig
  1 sibling, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2020-04-06 14:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Mon, Apr 06, 2020 at 04:04:04PM +0200, Peter Zijlstra wrote:
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index 65c2ecd730c5..0bd5ea46f1d2 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -274,6 +274,11 @@ typedef struct pgprot { pgprotval_t pgprot; } pgprot_t;
>  
>  typedef struct { pgdval_t pgd; } pgd_t;
>  

This lost:

#define pgprot_nx pgprot_nx

> +static inline pgprot_t pgprot_nx(pgprot_t prot)
> +{
> +	return __pgprot(pgprot_val(prot) | _PAGE_NX);
> +}
> +
>  #ifdef CONFIG_X86_PAE
>  
>  /*

> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index e2e2bef07dd2..ee66c1f8b141 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -490,6 +490,10 @@ static inline int arch_unmap_one(struct mm_struct *mm,
>  #define flush_tlb_fix_spurious_fault(vma, address) flush_tlb_page(vma, address)
>  #endif
>  
> +#ifndef pgprot_nx
> +#define pgprot_nx(prot)	(prot)
> +#endif
> +
>  #ifndef pgprot_noncached
>  #define pgprot_noncached(prot)	(prot)
>  #endif

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 14:04   ` Peter Zijlstra
  2020-04-06 14:34     ` Peter Zijlstra
@ 2020-04-06 15:24     ` Christoph Hellwig
  2020-04-06 15:39       ` Christoph Hellwig
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2020-04-06 15:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Hellwig, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Mon, Apr 06, 2020 at 04:04:03PM +0200, Peter Zijlstra wrote:
> On Mon, Apr 06, 2020 at 05:50:10AM -0700, Christoph Hellwig wrote:
> > On Fri, Apr 03, 2020 at 09:30:07AM -0700, Sean Christopherson wrote:
> > > Hook into native CR4 writes to disable split-lock detection if CR4.VMXE
> > > is toggled on by an SDL-unaware entity, e.g. an out-of-tree hypervisor
> > > module.  Most/all VMX-based hypervisors blindly reflect #AC exceptions
> > > into the guest, or don't intercept #AC in the first place.  With SLD
> > > enabled, this results in unexpected #AC faults in the guest, leading to
> > > crashes in the guest and other undesirable behavior.
> > 
> > Out of tree modules do not matter, so we should not add code just to
> > work around broken third party code.  If you really feel strongly just
> > make sure something they rely on for their hacks stops being exported
> > and they are properly broken.
> 
> Something like so then?
> 
> I couldn't find any in-tree users for unmap_kernel_range*(),

zsmalloc.c can be built modular.  Not that I think it should..

> and this
> removes __get_vm_area() and with the ability to custom ranges. It also
> removes map_vm_area() and replaces it with map_vm_area_nx() which kills
> adding executable maps.

The pbh/electra change looks sensible.  ipu3 as said really should use
vmap, with that map_vm_area can also become unexported.  In fact
with a vmap variant that allos passing the type of memory for
/proc/vmallocinfo we can just kill off map_vm_area enrirely and fold
it into the two remaining callers.

I'll prepare proper series for the vmalloc area cleanups if you don't
mind.


> 
> ---
> diff --git a/arch/powerpc/include/asm/pasemi_dma.h b/arch/powerpc/include/asm/pasemi_dma.h
> index 712a0b32120f..305c54d56244 100644
> --- a/arch/powerpc/include/asm/pasemi_dma.h
> +++ b/arch/powerpc/include/asm/pasemi_dma.h
> @@ -523,4 +523,6 @@ extern void pasemi_dma_free_fun(int fun);
>  /* Initialize the library, must be called before any other functions */
>  extern int pasemi_dma_init(void);
>  
> +extern struct vm_struct *get_phb_area(unsigned long size, unsigned long flags);
> +
>  #endif /* ASM_PASEMI_DMA_H */
> diff --git a/arch/powerpc/platforms/pasemi/dma_lib.c b/arch/powerpc/platforms/pasemi/dma_lib.c
> index 270fa3c0d372..0aae02df061d 100644
> --- a/arch/powerpc/platforms/pasemi/dma_lib.c
> +++ b/arch/powerpc/platforms/pasemi/dma_lib.c
> @@ -617,3 +617,10 @@ int pasemi_dma_init(void)
>  	return err;
>  }
>  EXPORT_SYMBOL(pasemi_dma_init);
> +
> +struct vm_struct *get_phb_area(unsigned long size, unsigned long flags)
> +{
> +	return __get_vm_area_node(size, 1, flags, PHB_IO_BASE, PHB_IO_END, NUMA_NO_NODE,
> +				  GFP_KERNEL, __builtin_return_address(0));
> +}
> +EXPORT_SYMBOL_GPL(get_phb_area);
> diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
> index 65c2ecd730c5..0bd5ea46f1d2 100644
> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -274,6 +274,11 @@ typedef struct pgprot { pgprotval_t pgprot; } pgprot_t;
>  
>  typedef struct { pgdval_t pgd; } pgd_t;
>  
> +static inline pgprot_t pgprot_nx(pgprot_t prot)
> +{
> +	return __pgprot(pgprot_val(prot) | _PAGE_NX);
> +}
> +
>  #ifdef CONFIG_X86_PAE
>  
>  /*
> diff --git a/drivers/pcmcia/electra_cf.c b/drivers/pcmcia/electra_cf.c
> index f2741c04289d..42eb242b29f1 100644
> --- a/drivers/pcmcia/electra_cf.c
> +++ b/drivers/pcmcia/electra_cf.c
> @@ -22,6 +22,8 @@
>  #include <linux/of_platform.h>
>  #include <linux/slab.h>
>  
> +#include <asm/pasemi_dma.h>
> +
>  #include <pcmcia/ss.h>
>  
>  static const char driver_name[] = "electra-cf";
> @@ -204,7 +206,7 @@ static int electra_cf_probe(struct platform_device *ofdev)
>  	cf->mem_base = ioremap(cf->mem_phys, cf->mem_size);
>  	cf->io_size = PAGE_ALIGN(resource_size(&io));
>  
> -	area = __get_vm_area(cf->io_size, 0, PHB_IO_BASE, PHB_IO_END);
> +	area = get_phb_area(cf->io_size, 0);
>  	if (area == NULL) {
>  		status = -ENOMEM;
>  		goto fail1;
> diff --git a/drivers/staging/media/ipu3/ipu3-dmamap.c b/drivers/staging/media/ipu3/ipu3-dmamap.c
> index 7431322379f6..fc3fe85c340c 100644
> --- a/drivers/staging/media/ipu3/ipu3-dmamap.c
> +++ b/drivers/staging/media/ipu3/ipu3-dmamap.c
> @@ -124,13 +124,13 @@ void *imgu_dmamap_alloc(struct imgu_device *imgu, struct imgu_css_map *map,
>  	}
>  
>  	/* Now grab a virtual region */
> -	map->vma = __get_vm_area(size, VM_USERMAP, VMALLOC_START, VMALLOC_END);
> +	map->vma = get_vm_area(size, VM_USERMAP);
>  	if (!map->vma)
>  		goto out_unmap;
>  
>  	map->vma->pages = pages;
>  	/* And map it in KVA */
> -	if (map_vm_area(map->vma, PAGE_KERNEL, pages))
> +	if (map_vm_area_nx(map->vma, PAGE_KERNEL, pages))
>  		goto out_vunmap;
>  
>  	map->size = size;
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index e2e2bef07dd2..ee66c1f8b141 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -490,6 +490,10 @@ static inline int arch_unmap_one(struct mm_struct *mm,
>  #define flush_tlb_fix_spurious_fault(vma, address) flush_tlb_page(vma, address)
>  #endif
>  
> +#ifndef pgprot_nx
> +#define pgprot_nx(prot)	(prot)
> +#endif
> +
>  #ifndef pgprot_noncached
>  #define pgprot_noncached(prot)	(prot)
>  #endif
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 6b8eeb0ecee5..a1c5faa6042e 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2029,7 +2029,6 @@ void unmap_kernel_range_noflush(unsigned long addr, unsigned long size)
>  {
>  	vunmap_page_range(addr, addr + size);
>  }
> -EXPORT_SYMBOL_GPL(unmap_kernel_range_noflush);
>  
>  /**
>   * unmap_kernel_range - unmap kernel VM area and flush cache and TLB
> @@ -2047,7 +2046,6 @@ void unmap_kernel_range(unsigned long addr, unsigned long size)
>  	vunmap_page_range(addr, end);
>  	flush_tlb_kernel_range(addr, end);
>  }
> -EXPORT_SYMBOL_GPL(unmap_kernel_range);
>  
>  int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page **pages)
>  {
> @@ -2059,7 +2057,12 @@ int map_vm_area(struct vm_struct *area, pgprot_t prot, struct page **pages)
>  
>  	return err > 0 ? 0 : err;
>  }
> -EXPORT_SYMBOL_GPL(map_vm_area);
> +
> +int map_vm_area_nx(struct vm_struct *area, pgprot prot, struct page **pages)
> +{
> +	return map_vm_area(area, pgprot_nx(prot), pages);
> +}
> +EXPORT_SYMBOL_GPL(map_vm_area_nx);
>  
>  static inline void setup_vmalloc_vm_locked(struct vm_struct *vm,
>  	struct vmap_area *va, unsigned long flags, const void *caller)
> @@ -2133,7 +2136,6 @@ struct vm_struct *__get_vm_area(unsigned long size, unsigned long flags,
>  	return __get_vm_area_node(size, 1, flags, start, end, NUMA_NO_NODE,
>  				  GFP_KERNEL, __builtin_return_address(0));
>  }
> -EXPORT_SYMBOL_GPL(__get_vm_area);
>  
>  struct vm_struct *__get_vm_area_caller(unsigned long size, unsigned long flags,
>  				       unsigned long start, unsigned long end,
> @@ -2160,6 +2162,7 @@ struct vm_struct *get_vm_area(unsigned long size, unsigned long flags)
>  				  NUMA_NO_NODE, GFP_KERNEL,
>  				  __builtin_return_address(0));
>  }
> +EXPORT_SYMBOL_GPL(get_vm_area);
>  
>  struct vm_struct *get_vm_area_caller(unsigned long size, unsigned long flags,
>  				const void *caller)
---end quoted text---

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 15:24     ` Christoph Hellwig
@ 2020-04-06 15:39       ` Christoph Hellwig
  2020-04-06 16:01         ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2020-04-06 15:39 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Hellwig, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Mon, Apr 06, 2020 at 08:24:11AM -0700, Christoph Hellwig wrote:
> > and this
> > removes __get_vm_area() and with the ability to custom ranges. It also
> > removes map_vm_area() and replaces it with map_vm_area_nx() which kills
> > adding executable maps.

Also there seems to be various other ways to create exectuable mappings,
pretty much everything in vmalloc.c that gets a pgprot_t..

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 15:39       ` Christoph Hellwig
@ 2020-04-06 16:01         ` Peter Zijlstra
  2020-04-06 17:10           ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2020-04-06 16:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Mon, Apr 06, 2020 at 08:39:02AM -0700, Christoph Hellwig wrote:
> On Mon, Apr 06, 2020 at 08:24:11AM -0700, Christoph Hellwig wrote:
> > > and this
> > > removes __get_vm_area() and with the ability to custom ranges. It also
> > > removes map_vm_area() and replaces it with map_vm_area_nx() which kills
> > > adding executable maps.
> 
> Also there seems to be various other ways to create exectuable mappings,
> pretty much everything in vmalloc.c that gets a pgprot_t..

Please feel free to use my pgprot_nx() and apply liberally on any
exported function.

But crucially, I don't think any of the still exported functions allows
getting memory in the text range, and if you want to run code outside of
the text range, things become _much_ harder. That said, modules
shouldn't be able to create executable code, full-stop (IMO).

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 16:01         ` Peter Zijlstra
@ 2020-04-06 17:10           ` Christoph Hellwig
  2020-04-06 18:39             ` Peter Zijlstra
                               ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-04-06 17:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Hellwig, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Mon, Apr 06, 2020 at 06:01:57PM +0200, Peter Zijlstra wrote:
> Please feel free to use my pgprot_nx() and apply liberally on any
> exported function.
> 
> But crucially, I don't think any of the still exported functions allows
> getting memory in the text range, and if you want to run code outside of
> the text range, things become _much_ harder. That said, modules
> shouldn't be able to create executable code, full-stop (IMO).

This is what i've got for now:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/sanitize-vmalloc-api

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 17:10           ` Christoph Hellwig
@ 2020-04-06 18:39             ` Peter Zijlstra
  2020-04-06 22:54             ` Andy Lutomirski
  2020-04-08  9:12             ` Peter Zijlstra
  2 siblings, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2020-04-06 18:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Mon, Apr 06, 2020 at 10:10:58AM -0700, Christoph Hellwig wrote:
> On Mon, Apr 06, 2020 at 06:01:57PM +0200, Peter Zijlstra wrote:
> > Please feel free to use my pgprot_nx() and apply liberally on any
> > exported function.
> > 
> > But crucially, I don't think any of the still exported functions allows
> > getting memory in the text range, and if you want to run code outside of
> > the text range, things become _much_ harder. That said, modules
> > shouldn't be able to create executable code, full-stop (IMO).
> 
> This is what i've got for now:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/sanitize-vmalloc-api

Looks excellent:

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 12:50 ` Christoph Hellwig
  2020-04-06 14:04   ` Peter Zijlstra
@ 2020-04-06 21:37   ` Thomas Gleixner
  1 sibling, 0 replies; 15+ messages in thread
From: Thomas Gleixner @ 2020-04-06 21:37 UTC (permalink / raw)
  To: Christoph Hellwig, Sean Christopherson
  Cc: Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Peter Zijlstra, Jessica Yu, Rasmus Villemoes,
	Paolo Bonzini, Fenghua Yu, Xiaoyao Li, Nadav Amit,
	Thomas Hellstrom, Tony Luck, Steven Rostedt, Greg Kroah-Hartman,
	Jann Horn, Kees Cook, David Laight, Doug Covelli

Christoph Hellwig <hch@infradead.org> writes:
> On Fri, Apr 03, 2020 at 09:30:07AM -0700, Sean Christopherson wrote:
>> Hook into native CR4 writes to disable split-lock detection if CR4.VMXE
>> is toggled on by an SDL-unaware entity, e.g. an out-of-tree hypervisor
>> module.  Most/all VMX-based hypervisors blindly reflect #AC exceptions
>> into the guest, or don't intercept #AC in the first place.  With SLD
>> enabled, this results in unexpected #AC faults in the guest, leading to
>> crashes in the guest and other undesirable behavior.
>
> Out of tree modules do not matter, so we should not add code just to
> work around broken third party code.  If you really feel strongly just
> make sure something they rely on for their hacks stops being exported
> and they are properly broken.

As we agreed on elsewhere in the thread already, we are not going to
disable SLD, we just reject the module to be loaded. That's way better
than silently failing.

Aside of that I think that we should extend this kind of analysis to
other nasty patterns of out of tree modules, like directly fiddling with
CR* and other circumventions of stuff we are trying to protect.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 17:10           ` Christoph Hellwig
  2020-04-06 18:39             ` Peter Zijlstra
@ 2020-04-06 22:54             ` Andy Lutomirski
  2020-04-08  9:12             ` Peter Zijlstra
  2 siblings, 0 replies; 15+ messages in thread
From: Andy Lutomirski @ 2020-04-06 22:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Peter Zijlstra, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, X86 ML, H. Peter Anvin, LKML,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Mon, Apr 6, 2020 at 10:11 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Mon, Apr 06, 2020 at 06:01:57PM +0200, Peter Zijlstra wrote:
> > Please feel free to use my pgprot_nx() and apply liberally on any
> > exported function.
> >
> > But crucially, I don't think any of the still exported functions allows
> > getting memory in the text range, and if you want to run code outside of
> > the text range, things become _much_ harder. That said, modules
> > shouldn't be able to create executable code, full-stop (IMO).
>
> This is what i've got for now:
>
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/sanitize-vmalloc-api

You have:

 mm: remove __get_vm_area

Switch the two remaining callers to use __get_vm_area instead.

The second line contains a typo :)

Otherwise this looks pretty good.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-06 17:10           ` Christoph Hellwig
  2020-04-06 18:39             ` Peter Zijlstra
  2020-04-06 22:54             ` Andy Lutomirski
@ 2020-04-08  9:12             ` Peter Zijlstra
  2020-04-08 11:02               ` Christoph Hellwig
  2 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2020-04-08  9:12 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sean Christopherson, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Mon, Apr 06, 2020 at 10:10:58AM -0700, Christoph Hellwig wrote:
> On Mon, Apr 06, 2020 at 06:01:57PM +0200, Peter Zijlstra wrote:
> > Please feel free to use my pgprot_nx() and apply liberally on any
> > exported function.
> > 
> > But crucially, I don't think any of the still exported functions allows
> > getting memory in the text range, and if you want to run code outside of
> > the text range, things become _much_ harder. That said, modules
> > shouldn't be able to create executable code, full-stop (IMO).
> 
> This is what i've got for now:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/sanitize-vmalloc-api

Should we not also apply pgprot_nx() to __vmalloc(), that's also
EXPORT_SYMBOL().

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX
  2020-04-08  9:12             ` Peter Zijlstra
@ 2020-04-08 11:02               ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-04-08 11:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Christoph Hellwig, Sean Christopherson, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, x86, H. Peter Anvin, linux-kernel,
	Kenneth R. Crudup, Jessica Yu, Rasmus Villemoes, Paolo Bonzini,
	Fenghua Yu, Xiaoyao Li, Nadav Amit, Thomas Hellstrom, Tony Luck,
	Steven Rostedt, Greg Kroah-Hartman, Jann Horn, Kees Cook,
	David Laight, Doug Covelli

On Wed, Apr 08, 2020 at 11:12:14AM +0200, Peter Zijlstra wrote:
> > http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/sanitize-vmalloc-api
> 
> Should we not also apply pgprot_nx() to __vmalloc(), that's also
> EXPORT_SYMBOL().

__vmalloc has lost the pgprot argument in the latest version.  And
based on the other thread it seems I need to take a look at ioremap
as well.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-04-08 11:02 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-03 16:30 [RFC PATCH] x86/split_lock: Disable SLD if an unaware (out-of-tree) module enables VMX Sean Christopherson
2020-04-03 16:42 ` Peter Zijlstra
2020-04-03 17:20   ` Sean Christopherson
2020-04-06 12:50 ` Christoph Hellwig
2020-04-06 14:04   ` Peter Zijlstra
2020-04-06 14:34     ` Peter Zijlstra
2020-04-06 15:24     ` Christoph Hellwig
2020-04-06 15:39       ` Christoph Hellwig
2020-04-06 16:01         ` Peter Zijlstra
2020-04-06 17:10           ` Christoph Hellwig
2020-04-06 18:39             ` Peter Zijlstra
2020-04-06 22:54             ` Andy Lutomirski
2020-04-08  9:12             ` Peter Zijlstra
2020-04-08 11:02               ` Christoph Hellwig
2020-04-06 21:37   ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).