xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option
@ 2015-07-28  5:29 Andy Lutomirski
  2015-07-28  5:29 ` [PATCH v5 1/4] x86/xen: Unmap aliases in xen_alloc_ldt and xen_free_ldt Andy Lutomirski
                   ` (8 more replies)
  0 siblings, 9 replies; 25+ messages in thread
From: Andy Lutomirski @ 2015-07-28  5:29 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt
  Cc: security, Andy Lutomirski, Andrew Cooper, X86 ML, linux-kernel,
	xen-devel, Borislav Petkov, Jan Beulich, Sasha Levin,
	Boris Ostrovsky

This is intended for x86/urgent.  Sorry for taking so long, but it
seemed nice to avoid breaking Xen.

This fixes the "dazed and confused" issue which was exposed by the
CVE-2015-5157 fix.  It's also probably a good general attack surface
reduction, and it replaces some scary code with IMO less scary code.

Also, servers and embedded systems should probably turn off modify_ldt.
This makes that possible.

Boris, could I get a Tested-by, assuming this works for you?

Willy and Kees: I left the config option alone.  The -tiny people will
like it, and we can always add a sysctl of some sort later.

Changes from v4:
 - Fix Xen even better (patch 1 is new).
 - Reorder the patches to make a little more sense.

Changes from v3:
 - Hopefully fixed Xen.
 - Fixed 32-bit test case on 32-bit native kernel.
 - Fix bogus vumnap for some LDT sizes.
 - Strengthen test case to check all LDT sizes (catches bogus vunmap).
 - Lots of cleanups, mostly from Borislav.
 - Simplify IPI code using on_each_cpu_mask.

Changes from v2:
 - Allocate ldt_struct and the LDT entries separately.  This should fix Xen.
 - Stop using write_ldt_entry, since I'm pretty sure it's unnecessary now
   that we no longer mutate an in-use LDT.  (Xen people, can you check?)

Changes from v1:
 - The config option is new.
 - The test case is new.
 - Fixed a missing allocation failure check.
 - Fixed a use-after-free on fork().

Andy Lutomirski (4):
  x86/xen: Unmap aliases in xen_alloc_ldt and xen_free_ldt
  x86/ldt: Make modify_ldt synchronous
  selftests/x86, x86/ldt: Add a selftest for modify_ldt
  x86/ldt: Make modify_ldt optional

 arch/x86/Kconfig                      |  17 ++
 arch/x86/include/asm/desc.h           |  15 -
 arch/x86/include/asm/mmu.h            |   5 +-
 arch/x86/include/asm/mmu_context.h    |  68 ++++-
 arch/x86/kernel/Makefile              |   3 +-
 arch/x86/kernel/cpu/common.c          |   4 +-
 arch/x86/kernel/cpu/perf_event.c      |  16 +-
 arch/x86/kernel/ldt.c                 | 262 +++++++++--------
 arch/x86/kernel/process_64.c          |   6 +-
 arch/x86/kernel/step.c                |   8 +-
 arch/x86/power/cpu.c                  |   3 +-
 arch/x86/xen/enlighten.c              |  12 +
 kernel/sys_ni.c                       |   1 +
 tools/testing/selftests/x86/Makefile  |   2 +-
 tools/testing/selftests/x86/ldt_gdt.c | 520 ++++++++++++++++++++++++++++++++++
 15 files changed, 787 insertions(+), 155 deletions(-)
 create mode 100644 tools/testing/selftests/x86/ldt_gdt.c

-- 
2.4.3

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v5 1/4] x86/xen: Unmap aliases in xen_alloc_ldt and xen_free_ldt
  2015-07-28  5:29 [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option Andy Lutomirski
@ 2015-07-28  5:29 ` Andy Lutomirski
  2015-07-28  5:29 ` [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous Andy Lutomirski
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Andy Lutomirski @ 2015-07-28  5:29 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt
  Cc: security, Andy Lutomirski, Andrew Cooper, X86 ML, linux-kernel,
	stable, xen-devel, Borislav Petkov, Jan Beulich, Sasha Levin,
	Boris Ostrovsky

I've been able to get an unmodified Xen guest to OOPS once after a
lot of test iterations without this patch.  I think this patch fixes
the problem.  I'm a bit surprised that we don't see much more severe
LDT problems on Xen without this fix.

Once the synchronous modify_ldt code causes modify_ldt to more
aggressively reallocate the LDT, the OOPSes become much more common
without this fix.

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/xen/enlighten.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 0b95c9b8283f..e417d08c56c4 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -32,6 +32,7 @@
 #include <linux/gfp.h>
 #include <linux/memblock.h>
 #include <linux/edd.h>
+#include <linux/vmalloc.h>
 
 #include <xen/xen.h>
 #include <xen/events.h>
@@ -512,6 +513,10 @@ static void xen_alloc_ldt(struct desc_struct *ldt, unsigned entries)
 
 	for(i = 0; i < entries; i += entries_per_page)
 		set_aliased_prot(ldt + i, PAGE_KERNEL_RO);
+
+	/* If there are stray aliases, the LDT won't work. */
+	if (is_vmalloc_addr(ldt))
+		vm_unmap_aliases();
 }
 
 static void xen_free_ldt(struct desc_struct *ldt, unsigned entries)
@@ -519,6 +524,13 @@ static void xen_free_ldt(struct desc_struct *ldt, unsigned entries)
 	const unsigned entries_per_page = PAGE_SIZE / LDT_ENTRY_SIZE;
 	int i;
 
+	/*
+	 * The set_aliased_prot call may OOPS due to a hypercall failure
+	 * if there are any lazy vmap aliases of the page.  We don't
+	 * need to call vm_unmap_aliases() here, though, because we
+	 * already eliminated any aliases in xen_alloc_ldt.
+	 */
+
 	for(i = 0; i < entries; i += entries_per_page)
 		set_aliased_prot(ldt + i, PAGE_KERNEL);
 }
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous
  2015-07-28  5:29 [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option Andy Lutomirski
  2015-07-28  5:29 ` [PATCH v5 1/4] x86/xen: Unmap aliases in xen_alloc_ldt and xen_free_ldt Andy Lutomirski
@ 2015-07-28  5:29 ` Andy Lutomirski
  2015-07-28  5:29 ` [PATCH v5 3/4] selftests/x86, x86/ldt: Add a selftest for modify_ldt Andy Lutomirski
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Andy Lutomirski @ 2015-07-28  5:29 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt
  Cc: security, Andy Lutomirski, Andrew Cooper, X86 ML, linux-kernel,
	stable, xen-devel, Borislav Petkov, Jan Beulich, Sasha Levin,
	Boris Ostrovsky

modify_ldt has questionable locking and does not synchronize
threads.  Improve it: redesign the locking and synchronize all
threads' LDTs using an IPI on all modifications.

This will dramatically slow down modify_ldt in multithreaded
programs, but there shouldn't be any multithreaded programs that
care about modify_ldt's performance in the first place.

Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/include/asm/desc.h        |  15 ---
 arch/x86/include/asm/mmu.h         |   3 +-
 arch/x86/include/asm/mmu_context.h |  53 +++++++-
 arch/x86/kernel/cpu/common.c       |   4 +-
 arch/x86/kernel/cpu/perf_event.c   |  12 +-
 arch/x86/kernel/ldt.c              | 262 ++++++++++++++++++++-----------------
 arch/x86/kernel/process_64.c       |   4 +-
 arch/x86/kernel/step.c             |   6 +-
 arch/x86/power/cpu.c               |   3 +-
 9 files changed, 209 insertions(+), 153 deletions(-)

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index a0bf89fd2647..4e10d73cf018 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -280,21 +280,6 @@ static inline void clear_LDT(void)
 	set_ldt(NULL, 0);
 }
 
-/*
- * load one particular LDT into the current CPU
- */
-static inline void load_LDT_nolock(mm_context_t *pc)
-{
-	set_ldt(pc->ldt, pc->size);
-}
-
-static inline void load_LDT(mm_context_t *pc)
-{
-	preempt_disable();
-	load_LDT_nolock(pc);
-	preempt_enable();
-}
-
 static inline unsigned long get_desc_base(const struct desc_struct *desc)
 {
 	return (unsigned)(desc->base0 | ((desc->base1) << 16) | ((desc->base2) << 24));
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 09b9620a73b4..364d27481a52 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -9,8 +9,7 @@
  * we put the segment information here.
  */
 typedef struct {
-	void *ldt;
-	int size;
+	struct ldt_struct *ldt;
 
 #ifdef CONFIG_X86_64
 	/* True if mm supports a task running in 32 bit compatibility mode. */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 804a3a6030ca..3fcff70c398e 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -34,6 +34,49 @@ static inline void load_mm_cr4(struct mm_struct *mm) {}
 #endif
 
 /*
+ * ldt_structs can be allocated, used, and freed, but they are never
+ * modified while live.
+ */
+struct ldt_struct {
+	/*
+	 * Xen requires page-aligned LDTs with special permissions.  This is
+	 * needed to prevent us from installing evil descriptors such as
+	 * call gates.  On native, we could merge the ldt_struct and LDT
+	 * allocations, but it's not worth trying to optimize.
+	 */
+	struct desc_struct *entries;
+	int size;
+};
+
+static inline void load_mm_ldt(struct mm_struct *mm)
+{
+	struct ldt_struct *ldt;
+	DEBUG_LOCKS_WARN_ON(!irqs_disabled());
+
+	/* lockless_dereference synchronizes with smp_store_release */
+	ldt = lockless_dereference(mm->context.ldt);
+
+	/*
+	 * Any change to mm->context.ldt is followed by an IPI to all
+	 * CPUs with the mm active.  The LDT will not be freed until
+	 * after the IPI is handled by all such CPUs.  This means that,
+	 * if the ldt_struct changes before we return, the values we see
+	 * will be safe, and the new values will be loaded before we run
+	 * any user code.
+	 *
+	 * NB: don't try to convert this to use RCU without extreme care.
+	 * We would still need IRQs off, because we don't want to change
+	 * the local LDT after an IPI loaded a newer value than the one
+	 * that we can see.
+	 */
+
+	if (unlikely(ldt))
+		set_ldt(ldt->entries, ldt->size);
+	else
+		clear_LDT();
+}
+
+/*
  * Used for LDT copy/destruction.
  */
 int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
@@ -78,12 +121,12 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 		 * was called and then modify_ldt changed
 		 * prev->context.ldt but suppressed an IPI to this CPU.
 		 * In this case, prev->context.ldt != NULL, because we
-		 * never free an LDT while the mm still exists.  That
-		 * means that next->context.ldt != prev->context.ldt,
-		 * because mms never share an LDT.
+		 * never set context.ldt to NULL while the mm still
+		 * exists.  That means that next->context.ldt !=
+		 * prev->context.ldt, because mms never share an LDT.
 		 */
 		if (unlikely(prev->context.ldt != next->context.ldt))
-			load_LDT_nolock(&next->context);
+			load_mm_ldt(next);
 	}
 #ifdef CONFIG_SMP
 	  else {
@@ -106,7 +149,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 			load_cr3(next->pgd);
 			trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
 			load_mm_cr4(next);
-			load_LDT_nolock(&next->context);
+			load_mm_ldt(next);
 		}
 	}
 #endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 922c5e0cea4c..cb9e5df42dd2 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1410,7 +1410,7 @@ void cpu_init(void)
 	load_sp0(t, &current->thread);
 	set_tss_desc(cpu, t);
 	load_TR_desc();
-	load_LDT(&init_mm.context);
+	load_mm_ldt(&init_mm);
 
 	clear_all_debug_regs();
 	dbg_restore_debug_regs();
@@ -1459,7 +1459,7 @@ void cpu_init(void)
 	load_sp0(t, thread);
 	set_tss_desc(cpu, t);
 	load_TR_desc();
-	load_LDT(&init_mm.context);
+	load_mm_ldt(&init_mm);
 
 	t->x86_tss.io_bitmap_base = offsetof(struct tss_struct, io_bitmap);
 
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 3658de47900f..9469dfa55607 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -2179,21 +2179,25 @@ static unsigned long get_segment_base(unsigned int segment)
 	int idx = segment >> 3;
 
 	if ((segment & SEGMENT_TI_MASK) == SEGMENT_LDT) {
+		struct ldt_struct *ldt;
+
 		if (idx > LDT_ENTRIES)
 			return 0;
 
-		if (idx > current->active_mm->context.size)
+		/* IRQs are off, so this synchronizes with smp_store_release */
+		ldt = lockless_dereference(current->active_mm->context.ldt);
+		if (!ldt || idx > ldt->size)
 			return 0;
 
-		desc = current->active_mm->context.ldt;
+		desc = &ldt->entries[idx];
 	} else {
 		if (idx > GDT_ENTRIES)
 			return 0;
 
-		desc = raw_cpu_ptr(gdt_page.gdt);
+		desc = raw_cpu_ptr(gdt_page.gdt) + idx;
 	}
 
-	return get_desc_base(desc + idx);
+	return get_desc_base(desc);
 }
 
 #ifdef CONFIG_COMPAT
diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index c37886d759cc..2bcc0525f1c1 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -12,6 +12,7 @@
 #include <linux/string.h>
 #include <linux/mm.h>
 #include <linux/smp.h>
+#include <linux/slab.h>
 #include <linux/vmalloc.h>
 #include <linux/uaccess.h>
 
@@ -20,82 +21,82 @@
 #include <asm/mmu_context.h>
 #include <asm/syscalls.h>
 
-#ifdef CONFIG_SMP
+/* context.lock is held for us, so we don't need any locking. */
 static void flush_ldt(void *current_mm)
 {
-	if (current->active_mm == current_mm)
-		load_LDT(&current->active_mm->context);
+	mm_context_t *pc;
+
+	if (current->active_mm != current_mm)
+		return;
+
+	pc = &current->active_mm->context;
+	set_ldt(pc->ldt->entries, pc->ldt->size);
 }
-#endif
 
-static int alloc_ldt(mm_context_t *pc, int mincount, int reload)
+/* The caller must call finalize_ldt_struct on the result. LDT starts zeroed. */
+static struct ldt_struct *alloc_ldt_struct(int size)
 {
-	void *oldldt, *newldt;
-	int oldsize;
-
-	if (mincount <= pc->size)
-		return 0;
-	oldsize = pc->size;
-	mincount = (mincount + (PAGE_SIZE / LDT_ENTRY_SIZE - 1)) &
-			(~(PAGE_SIZE / LDT_ENTRY_SIZE - 1));
-	if (mincount * LDT_ENTRY_SIZE > PAGE_SIZE)
-		newldt = vmalloc(mincount * LDT_ENTRY_SIZE);
+	struct ldt_struct *new_ldt;
+	int alloc_size;
+
+	if (size > LDT_ENTRIES)
+		return NULL;
+
+	new_ldt = kmalloc(sizeof(struct ldt_struct), GFP_KERNEL);
+	if (!new_ldt)
+		return NULL;
+
+	BUILD_BUG_ON(LDT_ENTRY_SIZE != sizeof(struct desc_struct));
+	alloc_size = size * LDT_ENTRY_SIZE;
+
+	/*
+	 * Xen is very picky: it requires a page-aligned LDT that has no
+	 * trailing nonzero bytes in any page that contains LDT descriptors.
+	 * Keep it simple: zero the whole allocation and never allocate less
+	 * than PAGE_SIZE.
+	 */
+	if (alloc_size > PAGE_SIZE)
+		new_ldt->entries = vzalloc(alloc_size);
 	else
-		newldt = (void *)__get_free_page(GFP_KERNEL);
-
-	if (!newldt)
-		return -ENOMEM;
+		new_ldt->entries = kzalloc(PAGE_SIZE, GFP_KERNEL);
 
-	if (oldsize)
-		memcpy(newldt, pc->ldt, oldsize * LDT_ENTRY_SIZE);
-	oldldt = pc->ldt;
-	memset(newldt + oldsize * LDT_ENTRY_SIZE, 0,
-	       (mincount - oldsize) * LDT_ENTRY_SIZE);
+	if (!new_ldt->entries) {
+		kfree(new_ldt);
+		return NULL;
+	}
 
-	paravirt_alloc_ldt(newldt, mincount);
+	new_ldt->size = size;
+	return new_ldt;
+}
 
-#ifdef CONFIG_X86_64
-	/* CHECKME: Do we really need this ? */
-	wmb();
-#endif
-	pc->ldt = newldt;
-	wmb();
-	pc->size = mincount;
-	wmb();
-
-	if (reload) {
-#ifdef CONFIG_SMP
-		preempt_disable();
-		load_LDT(pc);
-		if (!cpumask_equal(mm_cpumask(current->mm),
-				   cpumask_of(smp_processor_id())))
-			smp_call_function(flush_ldt, current->mm, 1);
-		preempt_enable();
-#else
-		load_LDT(pc);
-#endif
-	}
-	if (oldsize) {
-		paravirt_free_ldt(oldldt, oldsize);
-		if (oldsize * LDT_ENTRY_SIZE > PAGE_SIZE)
-			vfree(oldldt);
-		else
-			put_page(virt_to_page(oldldt));
-	}
-	return 0;
+/* After calling this, the LDT is immutable. */
+static void finalize_ldt_struct(struct ldt_struct *ldt)
+{
+	paravirt_alloc_ldt(ldt->entries, ldt->size);
 }
 
-static inline int copy_ldt(mm_context_t *new, mm_context_t *old)
+/* context.lock is held */
+static void install_ldt(struct mm_struct *current_mm,
+			struct ldt_struct *ldt)
 {
-	int err = alloc_ldt(new, old->size, 0);
-	int i;
+	/* Synchronizes with lockless_dereference in load_mm_ldt. */
+	smp_store_release(&current_mm->context.ldt, ldt);
+
+	/* Activate the LDT for all CPUs using current_mm. */
+	on_each_cpu_mask(mm_cpumask(current_mm), flush_ldt, current_mm, true);
+}
 
-	if (err < 0)
-		return err;
+static void free_ldt_struct(struct ldt_struct *ldt)
+{
+	if (likely(!ldt))
+		return;
 
-	for (i = 0; i < old->size; i++)
-		write_ldt_entry(new->ldt, i, old->ldt + i * LDT_ENTRY_SIZE);
-	return 0;
+	paravirt_free_ldt(ldt->entries, ldt->size);
+	if (ldt->size * LDT_ENTRY_SIZE > PAGE_SIZE)
+		vfree(ldt->entries);
+	else
+		kfree(ldt->entries);
+	kfree(ldt);
 }
 
 /*
@@ -104,17 +105,37 @@ static inline int copy_ldt(mm_context_t *new, mm_context_t *old)
  */
 int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 {
+	struct ldt_struct *new_ldt;
 	struct mm_struct *old_mm;
 	int retval = 0;
 
 	mutex_init(&mm->context.lock);
-	mm->context.size = 0;
 	old_mm = current->mm;
-	if (old_mm && old_mm->context.size > 0) {
-		mutex_lock(&old_mm->context.lock);
-		retval = copy_ldt(&mm->context, &old_mm->context);
-		mutex_unlock(&old_mm->context.lock);
+	if (!old_mm) {
+		mm->context.ldt = NULL;
+		return 0;
 	}
+
+	mutex_lock(&old_mm->context.lock);
+	if (!old_mm->context.ldt) {
+		mm->context.ldt = NULL;
+		goto out_unlock;
+	}
+
+	new_ldt = alloc_ldt_struct(old_mm->context.ldt->size);
+	if (!new_ldt) {
+		retval = -ENOMEM;
+		goto out_unlock;
+	}
+
+	memcpy(new_ldt->entries, old_mm->context.ldt->entries,
+	       new_ldt->size * LDT_ENTRY_SIZE);
+	finalize_ldt_struct(new_ldt);
+
+	mm->context.ldt = new_ldt;
+
+out_unlock:
+	mutex_unlock(&old_mm->context.lock);
 	return retval;
 }
 
@@ -125,53 +146,47 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
  */
 void destroy_context(struct mm_struct *mm)
 {
-	if (mm->context.size) {
-#ifdef CONFIG_X86_32
-		/* CHECKME: Can this ever happen ? */
-		if (mm == current->active_mm)
-			clear_LDT();
-#endif
-		paravirt_free_ldt(mm->context.ldt, mm->context.size);
-		if (mm->context.size * LDT_ENTRY_SIZE > PAGE_SIZE)
-			vfree(mm->context.ldt);
-		else
-			put_page(virt_to_page(mm->context.ldt));
-		mm->context.size = 0;
-	}
+	free_ldt_struct(mm->context.ldt);
+	mm->context.ldt = NULL;
 }
 
 static int read_ldt(void __user *ptr, unsigned long bytecount)
 {
-	int err;
+	int retval;
 	unsigned long size;
 	struct mm_struct *mm = current->mm;
 
-	if (!mm->context.size)
-		return 0;
+	mutex_lock(&mm->context.lock);
+
+	if (!mm->context.ldt) {
+		retval = 0;
+		goto out_unlock;
+	}
+
 	if (bytecount > LDT_ENTRY_SIZE * LDT_ENTRIES)
 		bytecount = LDT_ENTRY_SIZE * LDT_ENTRIES;
 
-	mutex_lock(&mm->context.lock);
-	size = mm->context.size * LDT_ENTRY_SIZE;
+	size = mm->context.ldt->size * LDT_ENTRY_SIZE;
 	if (size > bytecount)
 		size = bytecount;
 
-	err = 0;
-	if (copy_to_user(ptr, mm->context.ldt, size))
-		err = -EFAULT;
-	mutex_unlock(&mm->context.lock);
-	if (err < 0)
-		goto error_return;
+	if (copy_to_user(ptr, mm->context.ldt->entries, size)) {
+		retval = -EFAULT;
+		goto out_unlock;
+	}
+
 	if (size != bytecount) {
-		/* zero-fill the rest */
-		if (clear_user(ptr + size, bytecount - size) != 0) {
-			err = -EFAULT;
-			goto error_return;
+		/* Zero-fill the rest and pretend we read bytecount bytes. */
+		if (clear_user(ptr + size, bytecount - size)) {
+			retval = -EFAULT;
+			goto out_unlock;
 		}
 	}
-	return bytecount;
-error_return:
-	return err;
+	retval = bytecount;
+
+out_unlock:
+	mutex_unlock(&mm->context.lock);
+	return retval;
 }
 
 static int read_default_ldt(void __user *ptr, unsigned long bytecount)
@@ -195,6 +210,8 @@ static int write_ldt(void __user *ptr, unsigned long bytecount, int oldmode)
 	struct desc_struct ldt;
 	int error;
 	struct user_desc ldt_info;
+	int oldsize, newsize;
+	struct ldt_struct *new_ldt, *old_ldt;
 
 	error = -EINVAL;
 	if (bytecount != sizeof(ldt_info))
@@ -213,34 +230,39 @@ static int write_ldt(void __user *ptr, unsigned long bytecount, int oldmode)
 			goto out;
 	}
 
-	mutex_lock(&mm->context.lock);
-	if (ldt_info.entry_number >= mm->context.size) {
-		error = alloc_ldt(&current->mm->context,
-				  ldt_info.entry_number + 1, 1);
-		if (error < 0)
-			goto out_unlock;
-	}
-
-	/* Allow LDTs to be cleared by the user. */
-	if (ldt_info.base_addr == 0 && ldt_info.limit == 0) {
-		if (oldmode || LDT_empty(&ldt_info)) {
-			memset(&ldt, 0, sizeof(ldt));
-			goto install;
+	if ((oldmode && !ldt_info.base_addr && !ldt_info.limit) ||
+	    LDT_empty(&ldt_info)) {
+		/* The user wants to clear the entry. */
+		memset(&ldt, 0, sizeof(ldt));
+	} else {
+		if (!IS_ENABLED(CONFIG_X86_16BIT) && !ldt_info.seg_32bit) {
+			error = -EINVAL;
+			goto out;
 		}
+
+		fill_ldt(&ldt, &ldt_info);
+		if (oldmode)
+			ldt.avl = 0;
 	}
 
-	if (!IS_ENABLED(CONFIG_X86_16BIT) && !ldt_info.seg_32bit) {
-		error = -EINVAL;
+	mutex_lock(&mm->context.lock);
+
+	old_ldt = mm->context.ldt;
+	oldsize = old_ldt ? old_ldt->size : 0;
+	newsize = max((int)(ldt_info.entry_number + 1), oldsize);
+
+	error = -ENOMEM;
+	new_ldt = alloc_ldt_struct(newsize);
+	if (!new_ldt)
 		goto out_unlock;
-	}
 
-	fill_ldt(&ldt, &ldt_info);
-	if (oldmode)
-		ldt.avl = 0;
+	if (old_ldt)
+		memcpy(new_ldt->entries, old_ldt->entries, oldsize * LDT_ENTRY_SIZE);
+	new_ldt->entries[ldt_info.entry_number] = ldt;
+	finalize_ldt_struct(new_ldt);
 
-	/* Install the new entry ...  */
-install:
-	write_ldt_entry(mm->context.ldt, ldt_info.entry_number, &ldt);
+	install_ldt(mm, new_ldt);
+	free_ldt_struct(old_ldt);
 	error = 0;
 
 out_unlock:
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 71d7849a07f7..f6b916387590 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -121,11 +121,11 @@ void __show_regs(struct pt_regs *regs, int all)
 void release_thread(struct task_struct *dead_task)
 {
 	if (dead_task->mm) {
-		if (dead_task->mm->context.size) {
+		if (dead_task->mm->context.ldt) {
 			pr_warn("WARNING: dead process %s still has LDT? <%p/%d>\n",
 				dead_task->comm,
 				dead_task->mm->context.ldt,
-				dead_task->mm->context.size);
+				dead_task->mm->context.ldt->size);
 			BUG();
 		}
 	}
diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
index 9b4d51d0c0d0..6273324186ac 100644
--- a/arch/x86/kernel/step.c
+++ b/arch/x86/kernel/step.c
@@ -5,6 +5,7 @@
 #include <linux/mm.h>
 #include <linux/ptrace.h>
 #include <asm/desc.h>
+#include <asm/mmu_context.h>
 
 unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *regs)
 {
@@ -30,10 +31,11 @@ unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *re
 		seg &= ~7UL;
 
 		mutex_lock(&child->mm->context.lock);
-		if (unlikely((seg >> 3) >= child->mm->context.size))
+		if (unlikely(!child->mm->context.ldt ||
+			     (seg >> 3) >= child->mm->context.ldt->size))
 			addr = -1L; /* bogus selector, access would fault */
 		else {
-			desc = child->mm->context.ldt + seg;
+			desc = &child->mm->context.ldt->entries[seg];
 			base = get_desc_base(desc);
 
 			/* 16-bit code segment? */
diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
index 0d7dd1f5ac36..9ab52791fed5 100644
--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -22,6 +22,7 @@
 #include <asm/fpu/internal.h>
 #include <asm/debugreg.h>
 #include <asm/cpu.h>
+#include <asm/mmu_context.h>
 
 #ifdef CONFIG_X86_32
 __visible unsigned long saved_context_ebx;
@@ -153,7 +154,7 @@ static void fix_processor_context(void)
 	syscall_init();				/* This sets MSR_*STAR and related */
 #endif
 	load_TR_desc();				/* This does ltr */
-	load_LDT(&current->active_mm->context);	/* This does lldt */
+	load_mm_ldt(current->active_mm);	/* This does lldt */
 
 	fpu__resume_cpu();
 }
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v5 3/4] selftests/x86, x86/ldt: Add a selftest for modify_ldt
  2015-07-28  5:29 [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option Andy Lutomirski
  2015-07-28  5:29 ` [PATCH v5 1/4] x86/xen: Unmap aliases in xen_alloc_ldt and xen_free_ldt Andy Lutomirski
  2015-07-28  5:29 ` [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous Andy Lutomirski
@ 2015-07-28  5:29 ` Andy Lutomirski
  2015-07-28  5:29 ` [PATCH v5 4/4] x86/ldt: Make modify_ldt optional Andy Lutomirski
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Andy Lutomirski @ 2015-07-28  5:29 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt
  Cc: security, Andy Lutomirski, Andrew Cooper, X86 ML, linux-kernel,
	xen-devel, Borislav Petkov, Jan Beulich, Sasha Levin,
	Boris Ostrovsky

This tests general modify_ldt behavior (only writes, so far) as
well as synchronous updates via IPI.  It fails on old kernels.

I called this ldt_gdt because I'll add set_thread_area tests to
it at some point.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 tools/testing/selftests/x86/Makefile  |   2 +-
 tools/testing/selftests/x86/ldt_gdt.c | 520 ++++++++++++++++++++++++++++++++++
 2 files changed, 521 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/ldt_gdt.c

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index caa60d56d7d1..4138387b892c 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -4,7 +4,7 @@ include ../lib.mk
 
 .PHONY: all all_32 all_64 warn_32bit_failure clean
 
-TARGETS_C_BOTHBITS := sigreturn single_step_syscall sysret_ss_attrs
+TARGETS_C_BOTHBITS := sigreturn single_step_syscall sysret_ss_attrs ldt_gdt
 TARGETS_C_32BIT_ONLY := entry_from_vm86
 
 TARGETS_C_32BIT_ALL := $(TARGETS_C_BOTHBITS) $(TARGETS_C_32BIT_ONLY)
diff --git a/tools/testing/selftests/x86/ldt_gdt.c b/tools/testing/selftests/x86/ldt_gdt.c
new file mode 100644
index 000000000000..c27adfc9ae72
--- /dev/null
+++ b/tools/testing/selftests/x86/ldt_gdt.c
@@ -0,0 +1,520 @@
+/*
+ * ldt_gdt.c - Test cases for LDT and GDT access
+ * Copyright (c) 2015 Andrew Lutomirski
+ */
+
+#define _GNU_SOURCE
+#include <err.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <signal.h>
+#include <setjmp.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <asm/ldt.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <stdbool.h>
+#include <pthread.h>
+#include <sched.h>
+#include <linux/futex.h>
+
+#define AR_ACCESSED		(1<<8)
+
+#define AR_TYPE_RODATA		(0 * (1<<9))
+#define AR_TYPE_RWDATA		(1 * (1<<9))
+#define AR_TYPE_RODATA_EXPDOWN	(2 * (1<<9))
+#define AR_TYPE_RWDATA_EXPDOWN	(3 * (1<<9))
+#define AR_TYPE_XOCODE		(4 * (1<<9))
+#define AR_TYPE_XRCODE		(5 * (1<<9))
+#define AR_TYPE_XOCODE_CONF	(6 * (1<<9))
+#define AR_TYPE_XRCODE_CONF	(7 * (1<<9))
+
+#define AR_DPL3			(3 * (1<<13))
+
+#define AR_S			(1 << 12)
+#define AR_P			(1 << 15)
+#define AR_AVL			(1 << 20)
+#define AR_L			(1 << 21)
+#define AR_DB			(1 << 22)
+#define AR_G			(1 << 23)
+
+static int nerrs;
+
+static void check_invalid_segment(uint16_t index, int ldt)
+{
+	uint32_t has_limit = 0, has_ar = 0, limit, ar;
+	uint32_t selector = (index << 3) | (ldt << 2) | 3;
+
+	asm ("lsl %[selector], %[limit]\n\t"
+	     "jnz 1f\n\t"
+	     "movl $1, %[has_limit]\n\t"
+	     "1:"
+	     : [limit] "=r" (limit), [has_limit] "+rm" (has_limit)
+	     : [selector] "r" (selector));
+	asm ("larl %[selector], %[ar]\n\t"
+	     "jnz 1f\n\t"
+	     "movl $1, %[has_ar]\n\t"
+	     "1:"
+	     : [ar] "=r" (ar), [has_ar] "+rm" (has_ar)
+	     : [selector] "r" (selector));
+
+	if (has_limit || has_ar) {
+		printf("[FAIL]\t%s entry %hu is valid but should be invalid\n",
+		       (ldt ? "LDT" : "GDT"), index);
+		nerrs++;
+	} else {
+		printf("[OK]\t%s entry %hu is invalid\n",
+		       (ldt ? "LDT" : "GDT"), index);
+	}
+}
+
+static void check_valid_segment(uint16_t index, int ldt,
+				uint32_t expected_ar, uint32_t expected_limit,
+				bool verbose)
+{
+	uint32_t has_limit = 0, has_ar = 0, limit, ar;
+	uint32_t selector = (index << 3) | (ldt << 2) | 3;
+
+	asm ("lsl %[selector], %[limit]\n\t"
+	     "jnz 1f\n\t"
+	     "movl $1, %[has_limit]\n\t"
+	     "1:"
+	     : [limit] "=r" (limit), [has_limit] "+rm" (has_limit)
+	     : [selector] "r" (selector));
+	asm ("larl %[selector], %[ar]\n\t"
+	     "jnz 1f\n\t"
+	     "movl $1, %[has_ar]\n\t"
+	     "1:"
+	     : [ar] "=r" (ar), [has_ar] "+rm" (has_ar)
+	     : [selector] "r" (selector));
+
+	if (!has_limit || !has_ar) {
+		printf("[FAIL]\t%s entry %hu is invalid but should be valid\n",
+		       (ldt ? "LDT" : "GDT"), index);
+		nerrs++;
+		return;
+	}
+
+	if (ar != expected_ar) {
+		printf("[FAIL]\t%s entry %hu has AR 0x%08X but expected 0x%08X\n",
+		       (ldt ? "LDT" : "GDT"), index, ar, expected_ar);
+		nerrs++;
+	} else if (limit != expected_limit) {
+		printf("[FAIL]\t%s entry %hu has limit 0x%08X but expected 0x%08X\n",
+		       (ldt ? "LDT" : "GDT"), index, limit, expected_limit);
+		nerrs++;
+	} else if (verbose) {
+		printf("[OK]\t%s entry %hu has AR 0x%08X and limit 0x%08X\n",
+		       (ldt ? "LDT" : "GDT"), index, ar, limit);
+	}
+}
+
+static bool install_valid_mode(const struct user_desc *desc, uint32_t ar,
+			       bool oldmode)
+{
+	int ret = syscall(SYS_modify_ldt, oldmode ? 1 : 0x11,
+			  desc, sizeof(*desc));
+	if (ret < -1)
+		errno = -ret;
+	if (ret == 0) {
+		uint32_t limit = desc->limit;
+		if (desc->limit_in_pages)
+			limit = (limit << 12) + 4095;
+		check_valid_segment(desc->entry_number, 1, ar, limit, true);
+		return true;
+	} else if (errno == ENOSYS) {
+		printf("[OK]\tmodify_ldt returned -ENOSYS\n");
+		return false;
+	} else {
+		if (desc->seg_32bit) {
+			printf("[FAIL]\tUnexpected modify_ldt failure %d\n",
+			       errno);
+			nerrs++;
+			return false;
+		} else {
+			printf("[OK]\tmodify_ldt rejected 16 bit segment\n");
+			return false;
+		}
+	}
+}
+
+static bool install_valid(const struct user_desc *desc, uint32_t ar)
+{
+	return install_valid_mode(desc, ar, false);
+}
+
+static void install_invalid(const struct user_desc *desc, bool oldmode)
+{
+	int ret = syscall(SYS_modify_ldt, oldmode ? 1 : 0x11,
+			  desc, sizeof(*desc));
+	if (ret < -1)
+		errno = -ret;
+	if (ret == 0) {
+		check_invalid_segment(desc->entry_number, 1);
+	} else if (errno == ENOSYS) {
+		printf("[OK]\tmodify_ldt returned -ENOSYS\n");
+	} else {
+		if (desc->seg_32bit) {
+			printf("[FAIL]\tUnexpected modify_ldt failure %d\n",
+			       errno);
+			nerrs++;
+		} else {
+			printf("[OK]\tmodify_ldt rejected 16 bit segment\n");
+		}
+	}
+}
+
+static int safe_modify_ldt(int func, struct user_desc *ptr,
+			   unsigned long bytecount)
+{
+	int ret = syscall(SYS_modify_ldt, 0x11, ptr, bytecount);
+	if (ret < -1)
+		errno = -ret;
+	return ret;
+}
+
+static void fail_install(struct user_desc *desc)
+{
+	if (safe_modify_ldt(0x11, desc, sizeof(*desc)) == 0) {
+		printf("[FAIL]\tmodify_ldt accepted a bad descriptor\n");
+		nerrs++;
+	} else if (errno == ENOSYS) {
+		printf("[OK]\tmodify_ldt returned -ENOSYS\n");
+	} else {
+		printf("[OK]\tmodify_ldt failure %d\n", errno);
+	}
+}
+
+static void do_simple_tests(void)
+{
+	struct user_desc desc = {
+		.entry_number    = 0,
+		.base_addr       = 0,
+		.limit           = 10,
+		.seg_32bit       = 1,
+		.contents        = 2, /* Code, not conforming */
+		.read_exec_only  = 0,
+		.limit_in_pages  = 0,
+		.seg_not_present = 0,
+		.useable         = 0
+	};
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE | AR_S | AR_P | AR_DB);
+
+	desc.limit_in_pages = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
+		      AR_S | AR_P | AR_DB | AR_G);
+
+	check_invalid_segment(1, 1);
+
+	desc.entry_number = 2;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
+		      AR_S | AR_P | AR_DB | AR_G);
+
+	check_invalid_segment(1, 1);
+
+	desc.base_addr = 0xf0000000;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
+		      AR_S | AR_P | AR_DB | AR_G);
+
+	desc.useable = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
+		      AR_S | AR_P | AR_DB | AR_G | AR_AVL);
+
+	desc.seg_not_present = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
+		      AR_S | AR_DB | AR_G | AR_AVL);
+
+	desc.seg_32bit = 0;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
+		      AR_S | AR_G | AR_AVL);
+
+	desc.seg_32bit = 1;
+	desc.contents = 0;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA |
+		      AR_S | AR_DB | AR_G | AR_AVL);
+
+	desc.read_exec_only = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA |
+		      AR_S | AR_DB | AR_G | AR_AVL);
+
+	desc.contents = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA_EXPDOWN |
+		      AR_S | AR_DB | AR_G | AR_AVL);
+
+	desc.read_exec_only = 0;
+	desc.limit_in_pages = 0;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA_EXPDOWN |
+		      AR_S | AR_DB | AR_AVL);
+
+	desc.contents = 3;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE_CONF |
+		      AR_S | AR_DB | AR_AVL);
+
+	desc.read_exec_only = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XOCODE_CONF |
+		      AR_S | AR_DB | AR_AVL);
+
+	desc.read_exec_only = 0;
+	desc.contents = 2;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
+		      AR_S | AR_DB | AR_AVL);
+
+	desc.read_exec_only = 1;
+
+#ifdef __x86_64__
+	desc.lm = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_XOCODE |
+		      AR_S | AR_DB | AR_AVL);
+	desc.lm = 0;
+#endif
+
+	bool entry1_okay = install_valid(&desc, AR_DPL3 | AR_TYPE_XOCODE |
+					 AR_S | AR_DB | AR_AVL);
+
+	if (entry1_okay) {
+		printf("[RUN]\tTest fork\n");
+		pid_t child = fork();
+		if (child == 0) {
+			nerrs = 0;
+			check_valid_segment(desc.entry_number, 1,
+					    AR_DPL3 | AR_TYPE_XOCODE |
+					    AR_S | AR_DB | AR_AVL, desc.limit,
+					    true);
+			check_invalid_segment(1, 1);
+			exit(nerrs ? 1 : 0);
+		} else {
+			int status;
+			if (waitpid(child, &status, 0) != child ||
+			    !WIFEXITED(status)) {
+				printf("[FAIL]\tChild died\n");
+				nerrs++;
+			} else if (WEXITSTATUS(status) != 0) {
+				printf("[FAIL]\tChild failed\n");
+				nerrs++;
+			} else {
+				printf("[OK]\tChild succeeded\n");
+			}
+		}
+
+		printf("[RUN]\tTest size\n");
+		int i;
+		for (i = 0; i < 8192; i++) {
+			desc.entry_number = i;
+			desc.limit = i;
+			if (safe_modify_ldt(0x11, &desc, sizeof(desc)) != 0) {
+				printf("[FAIL]\tFailed to install entry %d\n", i);
+				nerrs++;
+				break;
+			}
+		}
+		for (int j = 0; j < i; j++) {
+			check_valid_segment(j, 1, AR_DPL3 | AR_TYPE_XOCODE |
+					    AR_S | AR_DB | AR_AVL, j, false);
+		}
+		printf("[DONE]\tSize test\n");
+	} else {
+		printf("[SKIP]\tSkipping fork and size tests because we have no LDT\n");
+	}
+
+	/* Test entry_number too high. */
+	desc.entry_number = 8192;
+	fail_install(&desc);
+
+	/* Test deletion and actions mistakeable for deletion. */
+	memset(&desc, 0, sizeof(desc));
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S | AR_P);
+
+	desc.seg_not_present = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S);
+
+	desc.seg_not_present = 0;
+	desc.read_exec_only = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA | AR_S | AR_P);
+
+	desc.read_exec_only = 0;
+	desc.seg_not_present = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S);
+
+	desc.read_exec_only = 1;
+	desc.limit = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA | AR_S);
+
+	desc.limit = 0;
+	desc.base_addr = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA | AR_S);
+
+	desc.base_addr = 0;
+	install_invalid(&desc, false);
+
+	desc.seg_not_present = 0;
+	desc.read_exec_only = 0;
+	desc.seg_32bit = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S | AR_P | AR_DB);
+	install_invalid(&desc, true);
+}
+
+/*
+ * 0: thread is idle
+ * 1: thread armed
+ * 2: thread should clear LDT entry 0
+ * 3: thread should exit
+ */
+static volatile unsigned int ftx;
+
+static void *threadproc(void *ctx)
+{
+	cpu_set_t cpuset;
+	CPU_ZERO(&cpuset);
+	CPU_SET(1, &cpuset);
+	if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0)
+		err(1, "sched_setaffinity to CPU 1");	/* should never fail */
+
+	while (1) {
+		syscall(SYS_futex, &ftx, FUTEX_WAIT, 0, NULL, NULL, 0);
+		while (ftx != 2) {
+			if (ftx >= 3)
+				return NULL;
+		}
+
+		/* clear LDT entry 0 */
+		const struct user_desc desc = {};
+		if (syscall(SYS_modify_ldt, 1, &desc, sizeof(desc)) != 0)
+			err(1, "modify_ldt");
+
+		/* If ftx == 2, set it to zero.  If ftx == 100, quit. */
+		unsigned int x = -2;
+		asm volatile ("lock xaddl %[x], %[ftx]" :
+			      [x] "+r" (x), [ftx] "+m" (ftx));
+		if (x != 2)
+			return NULL;
+	}
+}
+
+static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
+		       int flags)
+{
+	struct sigaction sa;
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_sigaction = handler;
+	sa.sa_flags = SA_SIGINFO | flags;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+
+}
+
+static jmp_buf jmpbuf;
+
+static void sigsegv(int sig, siginfo_t *info, void *ctx_void)
+{
+	siglongjmp(jmpbuf, 1);
+}
+
+static void do_multicpu_tests(void)
+{
+	cpu_set_t cpuset;
+	pthread_t thread;
+	int failures = 0, iters = 5, i;
+	unsigned short orig_ss;
+
+	CPU_ZERO(&cpuset);
+	CPU_SET(1, &cpuset);
+	if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0) {
+		printf("[SKIP]\tCannot set affinity to CPU 1\n");
+		return;
+	}
+
+	CPU_ZERO(&cpuset);
+	CPU_SET(0, &cpuset);
+	if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0) {
+		printf("[SKIP]\tCannot set affinity to CPU 0\n");
+		return;
+	}
+
+	sethandler(SIGSEGV, sigsegv, 0);
+#ifdef __i386__
+	/* True 32-bit kernels send SIGILL instead of SIGSEGV on IRET faults. */
+	sethandler(SIGILL, sigsegv, 0);
+#endif
+
+	printf("[RUN]\tCross-CPU LDT invalidation\n");
+
+	if (pthread_create(&thread, 0, threadproc, 0) != 0)
+		err(1, "pthread_create");
+
+	asm volatile ("mov %%ss, %0" : "=rm" (orig_ss));
+
+	for (i = 0; i < 5; i++) {
+		if (sigsetjmp(jmpbuf, 1) != 0)
+			continue;
+
+		/* Make sure the thread is ready after the last test. */
+		while (ftx != 0)
+			;
+
+		struct user_desc desc = {
+			.entry_number    = 0,
+			.base_addr       = 0,
+			.limit           = 0xfffff,
+			.seg_32bit       = 1,
+			.contents        = 0, /* Data */
+			.read_exec_only  = 0,
+			.limit_in_pages  = 1,
+			.seg_not_present = 0,
+			.useable         = 0
+		};
+
+		if (safe_modify_ldt(0x11, &desc, sizeof(desc)) != 0) {
+			if (errno != ENOSYS)
+				err(1, "modify_ldt");
+			printf("[SKIP]\tmodify_ldt unavailable\n");
+			break;
+		}
+
+		/* Arm the thread. */
+		ftx = 1;
+		syscall(SYS_futex, &ftx, FUTEX_WAKE, 0, NULL, NULL, 0);
+
+		asm volatile ("mov %0, %%ss" : : "r" (0x7));
+
+		/* Go! */
+		ftx = 2;
+
+		while (ftx != 0)
+			;
+
+		/*
+		 * On success, modify_ldt will segfault us synchronously,
+		 * and we'll escape via siglongjmp.
+		 */
+
+		failures++;
+		asm volatile ("mov %0, %%ss" : : "rm" (orig_ss));
+	};
+
+	ftx = 100;  /* Kill the thread. */
+	syscall(SYS_futex, &ftx, FUTEX_WAKE, 0, NULL, NULL, 0);
+
+	if (pthread_join(thread, NULL) != 0)
+		err(1, "pthread_join");
+
+	if (failures) {
+		printf("[FAIL]\t%d of %d iterations failed\n", failures, iters);
+		nerrs++;
+	} else {
+		printf("[OK]\tAll %d iterations succeeded\n", iters);
+	}
+}
+
+int main()
+{
+	do_simple_tests();
+
+	do_multicpu_tests();
+
+	return nerrs ? 1 : 0;
+}
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v5 4/4] x86/ldt: Make modify_ldt optional
  2015-07-28  5:29 [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option Andy Lutomirski
                   ` (2 preceding siblings ...)
  2015-07-28  5:29 ` [PATCH v5 3/4] selftests/x86, x86/ldt: Add a selftest for modify_ldt Andy Lutomirski
@ 2015-07-28  5:29 ` Andy Lutomirski
       [not found] ` <8d0567e8ab3933a7c91486e6d628c5f5c02f0753.1438061139.git.luto@kernel.org>
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 25+ messages in thread
From: Andy Lutomirski @ 2015-07-28  5:29 UTC (permalink / raw)
  To: Peter Zijlstra, Steven Rostedt
  Cc: security, Andy Lutomirski, Andrew Cooper, X86 ML, linux-kernel,
	xen-devel, Borislav Petkov, Jan Beulich, Sasha Levin,
	Boris Ostrovsky

The modify_ldt syscall exposes a large attack surface and is
unnecessary for modern userspace.  Make it optional.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/Kconfig                   | 17 +++++++++++++++++
 arch/x86/include/asm/mmu.h         |  2 ++
 arch/x86/include/asm/mmu_context.h | 31 +++++++++++++++++++++++--------
 arch/x86/kernel/Makefile           |  3 ++-
 arch/x86/kernel/cpu/perf_event.c   |  4 ++++
 arch/x86/kernel/process_64.c       |  2 ++
 arch/x86/kernel/step.c             |  2 ++
 kernel/sys_ni.c                    |  1 +
 8 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b3a1a5d77d92..ede52be845db 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1015,6 +1015,7 @@ config VM86
 config X86_16BIT
 	bool "Enable support for 16-bit segments" if EXPERT
 	default y
+	depends on MODIFY_LDT_SYSCALL
 	---help---
 	  This option is required by programs like Wine to run 16-bit
 	  protected mode legacy code on x86 processors.  Disabling
@@ -2053,6 +2054,22 @@ config CMDLINE_OVERRIDE
 	  This is used to work around broken boot loaders.  This should
 	  be set to 'N' under normal conditions.
 
+config MODIFY_LDT_SYSCALL
+       bool "Enable the LDT (local descriptor table)" if EXPERT
+       default y
+       ---help---
+         Linux can allow user programs to install a per-process x86
+	 Local Descriptor Table (LDT) using the modify_ldt(2) system
+	 call.  This is required to run 16-bit or segmented code such as
+	 DOSEMU or some Wine programs.  It is also used by some very old
+	 threading libraries.
+
+	 Enabling this feature adds a small amount of overhead to
+	 context switches and increases the low-level kernel attack
+	 surface.  Disabling it removes the modify_ldt(2) system call.
+
+	 Saying 'N' here may make sense for embedded or server kernels.
+
 source "kernel/livepatch/Kconfig"
 
 endmenu
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 364d27481a52..55234d5e7160 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -9,7 +9,9 @@
  * we put the segment information here.
  */
 typedef struct {
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
 	struct ldt_struct *ldt;
+#endif
 
 #ifdef CONFIG_X86_64
 	/* True if mm supports a task running in 32 bit compatibility mode. */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 3fcff70c398e..08094eded318 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -33,6 +33,7 @@ static inline void load_mm_cr4(struct mm_struct *mm)
 static inline void load_mm_cr4(struct mm_struct *mm) {}
 #endif
 
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
 /*
  * ldt_structs can be allocated, used, and freed, but they are never
  * modified while live.
@@ -48,10 +49,24 @@ struct ldt_struct {
 	int size;
 };
 
+/*
+ * Used for LDT copy/destruction.
+ */
+int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
+void destroy_context(struct mm_struct *mm);
+#else	/* CONFIG_MODIFY_LDT_SYSCALL */
+static inline int init_new_context(struct task_struct *tsk,
+				   struct mm_struct *mm)
+{
+	return 0;
+}
+static inline void destroy_context(struct mm_struct *mm) {}
+#endif
+
 static inline void load_mm_ldt(struct mm_struct *mm)
 {
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
 	struct ldt_struct *ldt;
-	DEBUG_LOCKS_WARN_ON(!irqs_disabled());
 
 	/* lockless_dereference synchronizes with smp_store_release */
 	ldt = lockless_dereference(mm->context.ldt);
@@ -74,14 +89,12 @@ static inline void load_mm_ldt(struct mm_struct *mm)
 		set_ldt(ldt->entries, ldt->size);
 	else
 		clear_LDT();
-}
-
-/*
- * Used for LDT copy/destruction.
- */
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
-void destroy_context(struct mm_struct *mm);
+#else
+	clear_LDT();
+#endif
 
+	DEBUG_LOCKS_WARN_ON(!irqs_disabled());
+}
 
 static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
 {
@@ -113,6 +126,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 		/* Load per-mm CR4 state */
 		load_mm_cr4(next);
 
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
 		/*
 		 * Load the LDT, if the LDT is different.
 		 *
@@ -127,6 +141,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 		 */
 		if (unlikely(prev->context.ldt != next->context.ldt))
 			load_mm_ldt(next);
+#endif
 	}
 #ifdef CONFIG_SMP
 	  else {
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 0f15af41bd80..2b507befcd3f 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -24,7 +24,8 @@ CFLAGS_irq.o := -I$(src)/../include/asm/trace
 
 obj-y			:= process_$(BITS).o signal.o
 obj-y			+= traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
-obj-y			+= time.o ioport.o ldt.o dumpstack.o nmi.o
+obj-y			+= time.o ioport.o dumpstack.o nmi.o
+obj-$(CONFIG_MODIFY_LDT_SYSCALL)	+= ldt.o
 obj-y			+= setup.o x86_init.o i8259.o irqinit.o jump_label.o
 obj-$(CONFIG_IRQ_WORK)  += irq_work.o
 obj-y			+= probe_roms.o
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 9469dfa55607..58b872ef2329 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -2179,6 +2179,7 @@ static unsigned long get_segment_base(unsigned int segment)
 	int idx = segment >> 3;
 
 	if ((segment & SEGMENT_TI_MASK) == SEGMENT_LDT) {
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
 		struct ldt_struct *ldt;
 
 		if (idx > LDT_ENTRIES)
@@ -2190,6 +2191,9 @@ static unsigned long get_segment_base(unsigned int segment)
 			return 0;
 
 		desc = &ldt->entries[idx];
+#else
+		return 0;
+#endif
 	} else {
 		if (idx > GDT_ENTRIES)
 			return 0;
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index f6b916387590..941295ddf802 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -121,6 +121,7 @@ void __show_regs(struct pt_regs *regs, int all)
 void release_thread(struct task_struct *dead_task)
 {
 	if (dead_task->mm) {
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
 		if (dead_task->mm->context.ldt) {
 			pr_warn("WARNING: dead process %s still has LDT? <%p/%d>\n",
 				dead_task->comm,
@@ -128,6 +129,7 @@ void release_thread(struct task_struct *dead_task)
 				dead_task->mm->context.ldt->size);
 			BUG();
 		}
+#endif
 	}
 }
 
diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
index 6273324186ac..fd88e152d584 100644
--- a/arch/x86/kernel/step.c
+++ b/arch/x86/kernel/step.c
@@ -18,6 +18,7 @@ unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *re
 		return addr;
 	}
 
+#ifdef CONFIG_MODIFY_LDT_SYSCALL
 	/*
 	 * We'll assume that the code segments in the GDT
 	 * are all zero-based. That is largely true: the
@@ -45,6 +46,7 @@ unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *re
 		}
 		mutex_unlock(&child->mm->context.lock);
 	}
+#endif
 
 	return addr;
 }
diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
index 7995ef5868d8..ca7d84f438f1 100644
--- a/kernel/sys_ni.c
+++ b/kernel/sys_ni.c
@@ -140,6 +140,7 @@ cond_syscall(sys_sgetmask);
 cond_syscall(sys_ssetmask);
 cond_syscall(sys_vm86old);
 cond_syscall(sys_vm86);
+cond_syscall(sys_modify_ldt);
 cond_syscall(sys_ipc);
 cond_syscall(compat_sys_ipc);
 cond_syscall(compat_sys_sysctl);
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 3/4] selftests/x86, x86/ldt: Add a selftest for modify_ldt
       [not found] ` <8d0567e8ab3933a7c91486e6d628c5f5c02f0753.1438061139.git.luto@kernel.org>
@ 2015-07-28 16:53   ` Kees Cook
  0 siblings, 0 replies; 25+ messages in thread
From: Kees Cook @ 2015-07-28 16:53 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: security, Peter Zijlstra, Andrew Cooper, X86 ML, LKML,
	Steven Rostedt, xen-devel, Borislav Petkov, Jan Beulich,
	Sasha Levin, Boris Ostrovsky

On Mon, Jul 27, 2015 at 10:29 PM, Andy Lutomirski <luto@kernel.org> wrote:
> This tests general modify_ldt behavior (only writes, so far) as
> well as synchronous updates via IPI.  It fails on old kernels.
>
> I called this ldt_gdt because I'll add set_thread_area tests to
> it at some point.
>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>

Looks great!

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
>  tools/testing/selftests/x86/Makefile  |   2 +-
>  tools/testing/selftests/x86/ldt_gdt.c | 520 ++++++++++++++++++++++++++++++++++
>  2 files changed, 521 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/x86/ldt_gdt.c
>
> diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
> index caa60d56d7d1..4138387b892c 100644
> --- a/tools/testing/selftests/x86/Makefile
> +++ b/tools/testing/selftests/x86/Makefile
> @@ -4,7 +4,7 @@ include ../lib.mk
>
>  .PHONY: all all_32 all_64 warn_32bit_failure clean
>
> -TARGETS_C_BOTHBITS := sigreturn single_step_syscall sysret_ss_attrs
> +TARGETS_C_BOTHBITS := sigreturn single_step_syscall sysret_ss_attrs ldt_gdt
>  TARGETS_C_32BIT_ONLY := entry_from_vm86
>
>  TARGETS_C_32BIT_ALL := $(TARGETS_C_BOTHBITS) $(TARGETS_C_32BIT_ONLY)
> diff --git a/tools/testing/selftests/x86/ldt_gdt.c b/tools/testing/selftests/x86/ldt_gdt.c
> new file mode 100644
> index 000000000000..c27adfc9ae72
> --- /dev/null
> +++ b/tools/testing/selftests/x86/ldt_gdt.c
> @@ -0,0 +1,520 @@
> +/*
> + * ldt_gdt.c - Test cases for LDT and GDT access
> + * Copyright (c) 2015 Andrew Lutomirski
> + */
> +
> +#define _GNU_SOURCE
> +#include <err.h>
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <signal.h>
> +#include <setjmp.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <errno.h>
> +#include <unistd.h>
> +#include <sys/syscall.h>
> +#include <asm/ldt.h>
> +#include <sys/types.h>
> +#include <sys/wait.h>
> +#include <stdbool.h>
> +#include <pthread.h>
> +#include <sched.h>
> +#include <linux/futex.h>
> +
> +#define AR_ACCESSED            (1<<8)
> +
> +#define AR_TYPE_RODATA         (0 * (1<<9))
> +#define AR_TYPE_RWDATA         (1 * (1<<9))
> +#define AR_TYPE_RODATA_EXPDOWN (2 * (1<<9))
> +#define AR_TYPE_RWDATA_EXPDOWN (3 * (1<<9))
> +#define AR_TYPE_XOCODE         (4 * (1<<9))
> +#define AR_TYPE_XRCODE         (5 * (1<<9))
> +#define AR_TYPE_XOCODE_CONF    (6 * (1<<9))
> +#define AR_TYPE_XRCODE_CONF    (7 * (1<<9))
> +
> +#define AR_DPL3                        (3 * (1<<13))
> +
> +#define AR_S                   (1 << 12)
> +#define AR_P                   (1 << 15)
> +#define AR_AVL                 (1 << 20)
> +#define AR_L                   (1 << 21)
> +#define AR_DB                  (1 << 22)
> +#define AR_G                   (1 << 23)
> +
> +static int nerrs;
> +
> +static void check_invalid_segment(uint16_t index, int ldt)
> +{
> +       uint32_t has_limit = 0, has_ar = 0, limit, ar;
> +       uint32_t selector = (index << 3) | (ldt << 2) | 3;
> +
> +       asm ("lsl %[selector], %[limit]\n\t"
> +            "jnz 1f\n\t"
> +            "movl $1, %[has_limit]\n\t"
> +            "1:"
> +            : [limit] "=r" (limit), [has_limit] "+rm" (has_limit)
> +            : [selector] "r" (selector));
> +       asm ("larl %[selector], %[ar]\n\t"
> +            "jnz 1f\n\t"
> +            "movl $1, %[has_ar]\n\t"
> +            "1:"
> +            : [ar] "=r" (ar), [has_ar] "+rm" (has_ar)
> +            : [selector] "r" (selector));
> +
> +       if (has_limit || has_ar) {
> +               printf("[FAIL]\t%s entry %hu is valid but should be invalid\n",
> +                      (ldt ? "LDT" : "GDT"), index);
> +               nerrs++;
> +       } else {
> +               printf("[OK]\t%s entry %hu is invalid\n",
> +                      (ldt ? "LDT" : "GDT"), index);
> +       }
> +}
> +
> +static void check_valid_segment(uint16_t index, int ldt,
> +                               uint32_t expected_ar, uint32_t expected_limit,
> +                               bool verbose)
> +{
> +       uint32_t has_limit = 0, has_ar = 0, limit, ar;
> +       uint32_t selector = (index << 3) | (ldt << 2) | 3;
> +
> +       asm ("lsl %[selector], %[limit]\n\t"
> +            "jnz 1f\n\t"
> +            "movl $1, %[has_limit]\n\t"
> +            "1:"
> +            : [limit] "=r" (limit), [has_limit] "+rm" (has_limit)
> +            : [selector] "r" (selector));
> +       asm ("larl %[selector], %[ar]\n\t"
> +            "jnz 1f\n\t"
> +            "movl $1, %[has_ar]\n\t"
> +            "1:"
> +            : [ar] "=r" (ar), [has_ar] "+rm" (has_ar)
> +            : [selector] "r" (selector));
> +
> +       if (!has_limit || !has_ar) {
> +               printf("[FAIL]\t%s entry %hu is invalid but should be valid\n",
> +                      (ldt ? "LDT" : "GDT"), index);
> +               nerrs++;
> +               return;
> +       }
> +
> +       if (ar != expected_ar) {
> +               printf("[FAIL]\t%s entry %hu has AR 0x%08X but expected 0x%08X\n",
> +                      (ldt ? "LDT" : "GDT"), index, ar, expected_ar);
> +               nerrs++;
> +       } else if (limit != expected_limit) {
> +               printf("[FAIL]\t%s entry %hu has limit 0x%08X but expected 0x%08X\n",
> +                      (ldt ? "LDT" : "GDT"), index, limit, expected_limit);
> +               nerrs++;
> +       } else if (verbose) {
> +               printf("[OK]\t%s entry %hu has AR 0x%08X and limit 0x%08X\n",
> +                      (ldt ? "LDT" : "GDT"), index, ar, limit);
> +       }
> +}
> +
> +static bool install_valid_mode(const struct user_desc *desc, uint32_t ar,
> +                              bool oldmode)
> +{
> +       int ret = syscall(SYS_modify_ldt, oldmode ? 1 : 0x11,
> +                         desc, sizeof(*desc));
> +       if (ret < -1)
> +               errno = -ret;
> +       if (ret == 0) {
> +               uint32_t limit = desc->limit;
> +               if (desc->limit_in_pages)
> +                       limit = (limit << 12) + 4095;
> +               check_valid_segment(desc->entry_number, 1, ar, limit, true);
> +               return true;
> +       } else if (errno == ENOSYS) {
> +               printf("[OK]\tmodify_ldt returned -ENOSYS\n");
> +               return false;
> +       } else {
> +               if (desc->seg_32bit) {
> +                       printf("[FAIL]\tUnexpected modify_ldt failure %d\n",
> +                              errno);
> +                       nerrs++;
> +                       return false;
> +               } else {
> +                       printf("[OK]\tmodify_ldt rejected 16 bit segment\n");
> +                       return false;
> +               }
> +       }
> +}
> +
> +static bool install_valid(const struct user_desc *desc, uint32_t ar)
> +{
> +       return install_valid_mode(desc, ar, false);
> +}
> +
> +static void install_invalid(const struct user_desc *desc, bool oldmode)
> +{
> +       int ret = syscall(SYS_modify_ldt, oldmode ? 1 : 0x11,
> +                         desc, sizeof(*desc));
> +       if (ret < -1)
> +               errno = -ret;
> +       if (ret == 0) {
> +               check_invalid_segment(desc->entry_number, 1);
> +       } else if (errno == ENOSYS) {
> +               printf("[OK]\tmodify_ldt returned -ENOSYS\n");
> +       } else {
> +               if (desc->seg_32bit) {
> +                       printf("[FAIL]\tUnexpected modify_ldt failure %d\n",
> +                              errno);
> +                       nerrs++;
> +               } else {
> +                       printf("[OK]\tmodify_ldt rejected 16 bit segment\n");
> +               }
> +       }
> +}
> +
> +static int safe_modify_ldt(int func, struct user_desc *ptr,
> +                          unsigned long bytecount)
> +{
> +       int ret = syscall(SYS_modify_ldt, 0x11, ptr, bytecount);
> +       if (ret < -1)
> +               errno = -ret;
> +       return ret;
> +}
> +
> +static void fail_install(struct user_desc *desc)
> +{
> +       if (safe_modify_ldt(0x11, desc, sizeof(*desc)) == 0) {
> +               printf("[FAIL]\tmodify_ldt accepted a bad descriptor\n");
> +               nerrs++;
> +       } else if (errno == ENOSYS) {
> +               printf("[OK]\tmodify_ldt returned -ENOSYS\n");
> +       } else {
> +               printf("[OK]\tmodify_ldt failure %d\n", errno);
> +       }
> +}
> +
> +static void do_simple_tests(void)
> +{
> +       struct user_desc desc = {
> +               .entry_number    = 0,
> +               .base_addr       = 0,
> +               .limit           = 10,
> +               .seg_32bit       = 1,
> +               .contents        = 2, /* Code, not conforming */
> +               .read_exec_only  = 0,
> +               .limit_in_pages  = 0,
> +               .seg_not_present = 0,
> +               .useable         = 0
> +       };
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE | AR_S | AR_P | AR_DB);
> +
> +       desc.limit_in_pages = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
> +                     AR_S | AR_P | AR_DB | AR_G);
> +
> +       check_invalid_segment(1, 1);
> +
> +       desc.entry_number = 2;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
> +                     AR_S | AR_P | AR_DB | AR_G);
> +
> +       check_invalid_segment(1, 1);
> +
> +       desc.base_addr = 0xf0000000;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
> +                     AR_S | AR_P | AR_DB | AR_G);
> +
> +       desc.useable = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
> +                     AR_S | AR_P | AR_DB | AR_G | AR_AVL);
> +
> +       desc.seg_not_present = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
> +                     AR_S | AR_DB | AR_G | AR_AVL);
> +
> +       desc.seg_32bit = 0;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
> +                     AR_S | AR_G | AR_AVL);
> +
> +       desc.seg_32bit = 1;
> +       desc.contents = 0;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA |
> +                     AR_S | AR_DB | AR_G | AR_AVL);
> +
> +       desc.read_exec_only = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA |
> +                     AR_S | AR_DB | AR_G | AR_AVL);
> +
> +       desc.contents = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA_EXPDOWN |
> +                     AR_S | AR_DB | AR_G | AR_AVL);
> +
> +       desc.read_exec_only = 0;
> +       desc.limit_in_pages = 0;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA_EXPDOWN |
> +                     AR_S | AR_DB | AR_AVL);
> +
> +       desc.contents = 3;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE_CONF |
> +                     AR_S | AR_DB | AR_AVL);
> +
> +       desc.read_exec_only = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XOCODE_CONF |
> +                     AR_S | AR_DB | AR_AVL);
> +
> +       desc.read_exec_only = 0;
> +       desc.contents = 2;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XRCODE |
> +                     AR_S | AR_DB | AR_AVL);
> +
> +       desc.read_exec_only = 1;
> +
> +#ifdef __x86_64__
> +       desc.lm = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_XOCODE |
> +                     AR_S | AR_DB | AR_AVL);
> +       desc.lm = 0;
> +#endif
> +
> +       bool entry1_okay = install_valid(&desc, AR_DPL3 | AR_TYPE_XOCODE |
> +                                        AR_S | AR_DB | AR_AVL);
> +
> +       if (entry1_okay) {
> +               printf("[RUN]\tTest fork\n");
> +               pid_t child = fork();
> +               if (child == 0) {
> +                       nerrs = 0;
> +                       check_valid_segment(desc.entry_number, 1,
> +                                           AR_DPL3 | AR_TYPE_XOCODE |
> +                                           AR_S | AR_DB | AR_AVL, desc.limit,
> +                                           true);
> +                       check_invalid_segment(1, 1);
> +                       exit(nerrs ? 1 : 0);
> +               } else {
> +                       int status;
> +                       if (waitpid(child, &status, 0) != child ||
> +                           !WIFEXITED(status)) {
> +                               printf("[FAIL]\tChild died\n");
> +                               nerrs++;
> +                       } else if (WEXITSTATUS(status) != 0) {
> +                               printf("[FAIL]\tChild failed\n");
> +                               nerrs++;
> +                       } else {
> +                               printf("[OK]\tChild succeeded\n");
> +                       }
> +               }
> +
> +               printf("[RUN]\tTest size\n");
> +               int i;
> +               for (i = 0; i < 8192; i++) {
> +                       desc.entry_number = i;
> +                       desc.limit = i;
> +                       if (safe_modify_ldt(0x11, &desc, sizeof(desc)) != 0) {
> +                               printf("[FAIL]\tFailed to install entry %d\n", i);
> +                               nerrs++;
> +                               break;
> +                       }
> +               }
> +               for (int j = 0; j < i; j++) {
> +                       check_valid_segment(j, 1, AR_DPL3 | AR_TYPE_XOCODE |
> +                                           AR_S | AR_DB | AR_AVL, j, false);
> +               }
> +               printf("[DONE]\tSize test\n");
> +       } else {
> +               printf("[SKIP]\tSkipping fork and size tests because we have no LDT\n");
> +       }
> +
> +       /* Test entry_number too high. */
> +       desc.entry_number = 8192;
> +       fail_install(&desc);
> +
> +       /* Test deletion and actions mistakeable for deletion. */
> +       memset(&desc, 0, sizeof(desc));
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S | AR_P);
> +
> +       desc.seg_not_present = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S);
> +
> +       desc.seg_not_present = 0;
> +       desc.read_exec_only = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA | AR_S | AR_P);
> +
> +       desc.read_exec_only = 0;
> +       desc.seg_not_present = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S);
> +
> +       desc.read_exec_only = 1;
> +       desc.limit = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA | AR_S);
> +
> +       desc.limit = 0;
> +       desc.base_addr = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA | AR_S);
> +
> +       desc.base_addr = 0;
> +       install_invalid(&desc, false);
> +
> +       desc.seg_not_present = 0;
> +       desc.read_exec_only = 0;
> +       desc.seg_32bit = 1;
> +       install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S | AR_P | AR_DB);
> +       install_invalid(&desc, true);
> +}
> +
> +/*
> + * 0: thread is idle
> + * 1: thread armed
> + * 2: thread should clear LDT entry 0
> + * 3: thread should exit
> + */
> +static volatile unsigned int ftx;
> +
> +static void *threadproc(void *ctx)
> +{
> +       cpu_set_t cpuset;
> +       CPU_ZERO(&cpuset);
> +       CPU_SET(1, &cpuset);
> +       if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0)
> +               err(1, "sched_setaffinity to CPU 1");   /* should never fail */
> +
> +       while (1) {
> +               syscall(SYS_futex, &ftx, FUTEX_WAIT, 0, NULL, NULL, 0);
> +               while (ftx != 2) {
> +                       if (ftx >= 3)
> +                               return NULL;
> +               }
> +
> +               /* clear LDT entry 0 */
> +               const struct user_desc desc = {};
> +               if (syscall(SYS_modify_ldt, 1, &desc, sizeof(desc)) != 0)
> +                       err(1, "modify_ldt");
> +
> +               /* If ftx == 2, set it to zero.  If ftx == 100, quit. */
> +               unsigned int x = -2;
> +               asm volatile ("lock xaddl %[x], %[ftx]" :
> +                             [x] "+r" (x), [ftx] "+m" (ftx));
> +               if (x != 2)
> +                       return NULL;
> +       }
> +}
> +
> +static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
> +                      int flags)
> +{
> +       struct sigaction sa;
> +       memset(&sa, 0, sizeof(sa));
> +       sa.sa_sigaction = handler;
> +       sa.sa_flags = SA_SIGINFO | flags;
> +       sigemptyset(&sa.sa_mask);
> +       if (sigaction(sig, &sa, 0))
> +               err(1, "sigaction");
> +
> +}
> +
> +static jmp_buf jmpbuf;
> +
> +static void sigsegv(int sig, siginfo_t *info, void *ctx_void)
> +{
> +       siglongjmp(jmpbuf, 1);
> +}
> +
> +static void do_multicpu_tests(void)
> +{
> +       cpu_set_t cpuset;
> +       pthread_t thread;
> +       int failures = 0, iters = 5, i;
> +       unsigned short orig_ss;
> +
> +       CPU_ZERO(&cpuset);
> +       CPU_SET(1, &cpuset);
> +       if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0) {
> +               printf("[SKIP]\tCannot set affinity to CPU 1\n");
> +               return;
> +       }
> +
> +       CPU_ZERO(&cpuset);
> +       CPU_SET(0, &cpuset);
> +       if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0) {
> +               printf("[SKIP]\tCannot set affinity to CPU 0\n");
> +               return;
> +       }
> +
> +       sethandler(SIGSEGV, sigsegv, 0);
> +#ifdef __i386__
> +       /* True 32-bit kernels send SIGILL instead of SIGSEGV on IRET faults. */
> +       sethandler(SIGILL, sigsegv, 0);
> +#endif
> +
> +       printf("[RUN]\tCross-CPU LDT invalidation\n");
> +
> +       if (pthread_create(&thread, 0, threadproc, 0) != 0)
> +               err(1, "pthread_create");
> +
> +       asm volatile ("mov %%ss, %0" : "=rm" (orig_ss));
> +
> +       for (i = 0; i < 5; i++) {
> +               if (sigsetjmp(jmpbuf, 1) != 0)
> +                       continue;
> +
> +               /* Make sure the thread is ready after the last test. */
> +               while (ftx != 0)
> +                       ;
> +
> +               struct user_desc desc = {
> +                       .entry_number    = 0,
> +                       .base_addr       = 0,
> +                       .limit           = 0xfffff,
> +                       .seg_32bit       = 1,
> +                       .contents        = 0, /* Data */
> +                       .read_exec_only  = 0,
> +                       .limit_in_pages  = 1,
> +                       .seg_not_present = 0,
> +                       .useable         = 0
> +               };
> +
> +               if (safe_modify_ldt(0x11, &desc, sizeof(desc)) != 0) {
> +                       if (errno != ENOSYS)
> +                               err(1, "modify_ldt");
> +                       printf("[SKIP]\tmodify_ldt unavailable\n");
> +                       break;
> +               }
> +
> +               /* Arm the thread. */
> +               ftx = 1;
> +               syscall(SYS_futex, &ftx, FUTEX_WAKE, 0, NULL, NULL, 0);
> +
> +               asm volatile ("mov %0, %%ss" : : "r" (0x7));
> +
> +               /* Go! */
> +               ftx = 2;
> +
> +               while (ftx != 0)
> +                       ;
> +
> +               /*
> +                * On success, modify_ldt will segfault us synchronously,
> +                * and we'll escape via siglongjmp.
> +                */
> +
> +               failures++;
> +               asm volatile ("mov %0, %%ss" : : "rm" (orig_ss));
> +       };
> +
> +       ftx = 100;  /* Kill the thread. */
> +       syscall(SYS_futex, &ftx, FUTEX_WAKE, 0, NULL, NULL, 0);
> +
> +       if (pthread_join(thread, NULL) != 0)
> +               err(1, "pthread_join");
> +
> +       if (failures) {
> +               printf("[FAIL]\t%d of %d iterations failed\n", failures, iters);
> +               nerrs++;
> +       } else {
> +               printf("[OK]\tAll %d iterations succeeded\n", iters);
> +       }
> +}
> +
> +int main()
> +{
> +       do_simple_tests();
> +
> +       do_multicpu_tests();
> +
> +       return nerrs ? 1 : 0;
> +}
> --
> 2.4.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 4/4] x86/ldt: Make modify_ldt optional
       [not found] ` <49b3792ac157c32e2479ac3092a915a1b8c99ce8.1438061139.git.luto@kernel.org>
@ 2015-07-28 16:56   ` Kees Cook
       [not found]   ` <CAGXu5jLGWEiA+RCf-eApAdx0fsuRFWguOQMpWX9nOKDwWLHWtg@mail.gmail.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Kees Cook @ 2015-07-28 16:56 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: security, Peter Zijlstra, Andrew Cooper, X86 ML, LKML,
	Steven Rostedt, xen-devel, Borislav Petkov, Jan Beulich,
	Sasha Levin, Boris Ostrovsky

On Mon, Jul 27, 2015 at 10:29 PM, Andy Lutomirski <luto@kernel.org> wrote:
> The modify_ldt syscall exposes a large attack surface and is
> unnecessary for modern userspace.  Make it optional.
>
> Signed-off-by: Andy Lutomirski <luto@kernel.org>

Reviewed-by: Kees Cook <keescook@chromium.org>

> ---
>  arch/x86/Kconfig                   | 17 +++++++++++++++++
>  arch/x86/include/asm/mmu.h         |  2 ++
>  arch/x86/include/asm/mmu_context.h | 31 +++++++++++++++++++++++--------
>  arch/x86/kernel/Makefile           |  3 ++-
>  arch/x86/kernel/cpu/perf_event.c   |  4 ++++
>  arch/x86/kernel/process_64.c       |  2 ++
>  arch/x86/kernel/step.c             |  2 ++
>  kernel/sys_ni.c                    |  1 +
>  8 files changed, 53 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index b3a1a5d77d92..ede52be845db 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1015,6 +1015,7 @@ config VM86
>  config X86_16BIT
>         bool "Enable support for 16-bit segments" if EXPERT
>         default y
> +       depends on MODIFY_LDT_SYSCALL
>         ---help---
>           This option is required by programs like Wine to run 16-bit
>           protected mode legacy code on x86 processors.  Disabling
> @@ -2053,6 +2054,22 @@ config CMDLINE_OVERRIDE
>           This is used to work around broken boot loaders.  This should
>           be set to 'N' under normal conditions.
>
> +config MODIFY_LDT_SYSCALL
> +       bool "Enable the LDT (local descriptor table)" if EXPERT
> +       default y
> +       ---help---
> +         Linux can allow user programs to install a per-process x86

nit: looks like a spaces vs tabs issue in the line above?

> +        Local Descriptor Table (LDT) using the modify_ldt(2) system
> +        call.  This is required to run 16-bit or segmented code such as
> +        DOSEMU or some Wine programs.  It is also used by some very old
> +        threading libraries.
> +
> +        Enabling this feature adds a small amount of overhead to
> +        context switches and increases the low-level kernel attack
> +        surface.  Disabling it removes the modify_ldt(2) system call.
> +
> +        Saying 'N' here may make sense for embedded or server kernels.
> +
>  source "kernel/livepatch/Kconfig"
>
>  endmenu
> diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
> index 364d27481a52..55234d5e7160 100644
> --- a/arch/x86/include/asm/mmu.h
> +++ b/arch/x86/include/asm/mmu.h
> @@ -9,7 +9,9 @@
>   * we put the segment information here.
>   */
>  typedef struct {
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
>         struct ldt_struct *ldt;
> +#endif
>
>  #ifdef CONFIG_X86_64
>         /* True if mm supports a task running in 32 bit compatibility mode. */
> diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
> index 3fcff70c398e..08094eded318 100644
> --- a/arch/x86/include/asm/mmu_context.h
> +++ b/arch/x86/include/asm/mmu_context.h
> @@ -33,6 +33,7 @@ static inline void load_mm_cr4(struct mm_struct *mm)
>  static inline void load_mm_cr4(struct mm_struct *mm) {}
>  #endif
>
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
>  /*
>   * ldt_structs can be allocated, used, and freed, but they are never
>   * modified while live.
> @@ -48,10 +49,24 @@ struct ldt_struct {
>         int size;
>  };
>
> +/*
> + * Used for LDT copy/destruction.
> + */
> +int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
> +void destroy_context(struct mm_struct *mm);
> +#else  /* CONFIG_MODIFY_LDT_SYSCALL */
> +static inline int init_new_context(struct task_struct *tsk,
> +                                  struct mm_struct *mm)
> +{
> +       return 0;
> +}
> +static inline void destroy_context(struct mm_struct *mm) {}
> +#endif
> +
>  static inline void load_mm_ldt(struct mm_struct *mm)
>  {
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
>         struct ldt_struct *ldt;
> -       DEBUG_LOCKS_WARN_ON(!irqs_disabled());
>
>         /* lockless_dereference synchronizes with smp_store_release */
>         ldt = lockless_dereference(mm->context.ldt);
> @@ -74,14 +89,12 @@ static inline void load_mm_ldt(struct mm_struct *mm)
>                 set_ldt(ldt->entries, ldt->size);
>         else
>                 clear_LDT();
> -}
> -
> -/*
> - * Used for LDT copy/destruction.
> - */
> -int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
> -void destroy_context(struct mm_struct *mm);
> +#else
> +       clear_LDT();
> +#endif
>
> +       DEBUG_LOCKS_WARN_ON(!irqs_disabled());
> +}
>
>  static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
>  {
> @@ -113,6 +126,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>                 /* Load per-mm CR4 state */
>                 load_mm_cr4(next);
>
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
>                 /*
>                  * Load the LDT, if the LDT is different.
>                  *
> @@ -127,6 +141,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>                  */
>                 if (unlikely(prev->context.ldt != next->context.ldt))
>                         load_mm_ldt(next);
> +#endif
>         }
>  #ifdef CONFIG_SMP
>           else {
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index 0f15af41bd80..2b507befcd3f 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -24,7 +24,8 @@ CFLAGS_irq.o := -I$(src)/../include/asm/trace
>
>  obj-y                  := process_$(BITS).o signal.o
>  obj-y                  += traps.o irq.o irq_$(BITS).o dumpstack_$(BITS).o
> -obj-y                  += time.o ioport.o ldt.o dumpstack.o nmi.o
> +obj-y                  += time.o ioport.o dumpstack.o nmi.o
> +obj-$(CONFIG_MODIFY_LDT_SYSCALL)       += ldt.o
>  obj-y                  += setup.o x86_init.o i8259.o irqinit.o jump_label.o
>  obj-$(CONFIG_IRQ_WORK)  += irq_work.o
>  obj-y                  += probe_roms.o
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 9469dfa55607..58b872ef2329 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -2179,6 +2179,7 @@ static unsigned long get_segment_base(unsigned int segment)
>         int idx = segment >> 3;
>
>         if ((segment & SEGMENT_TI_MASK) == SEGMENT_LDT) {
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
>                 struct ldt_struct *ldt;
>
>                 if (idx > LDT_ENTRIES)
> @@ -2190,6 +2191,9 @@ static unsigned long get_segment_base(unsigned int segment)
>                         return 0;
>
>                 desc = &ldt->entries[idx];
> +#else
> +               return 0;
> +#endif
>         } else {
>                 if (idx > GDT_ENTRIES)
>                         return 0;
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index f6b916387590..941295ddf802 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -121,6 +121,7 @@ void __show_regs(struct pt_regs *regs, int all)
>  void release_thread(struct task_struct *dead_task)
>  {
>         if (dead_task->mm) {
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
>                 if (dead_task->mm->context.ldt) {
>                         pr_warn("WARNING: dead process %s still has LDT? <%p/%d>\n",
>                                 dead_task->comm,
> @@ -128,6 +129,7 @@ void release_thread(struct task_struct *dead_task)
>                                 dead_task->mm->context.ldt->size);
>                         BUG();
>                 }
> +#endif
>         }
>  }
>
> diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c
> index 6273324186ac..fd88e152d584 100644
> --- a/arch/x86/kernel/step.c
> +++ b/arch/x86/kernel/step.c
> @@ -18,6 +18,7 @@ unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *re
>                 return addr;
>         }
>
> +#ifdef CONFIG_MODIFY_LDT_SYSCALL
>         /*
>          * We'll assume that the code segments in the GDT
>          * are all zero-based. That is largely true: the
> @@ -45,6 +46,7 @@ unsigned long convert_ip_to_linear(struct task_struct *child, struct pt_regs *re
>                 }
>                 mutex_unlock(&child->mm->context.lock);
>         }
> +#endif
>
>         return addr;
>  }
> diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c
> index 7995ef5868d8..ca7d84f438f1 100644
> --- a/kernel/sys_ni.c
> +++ b/kernel/sys_ni.c
> @@ -140,6 +140,7 @@ cond_syscall(sys_sgetmask);
>  cond_syscall(sys_ssetmask);
>  cond_syscall(sys_vm86old);
>  cond_syscall(sys_vm86);
> +cond_syscall(sys_modify_ldt);
>  cond_syscall(sys_ipc);
>  cond_syscall(compat_sys_ipc);
>  cond_syscall(compat_sys_sysctl);
> --
> 2.4.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

I look forward to the runtime disabling patch. :)

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 4/4] x86/ldt: Make modify_ldt optional
       [not found]   ` <CAGXu5jLGWEiA+RCf-eApAdx0fsuRFWguOQMpWX9nOKDwWLHWtg@mail.gmail.com>
@ 2015-07-28 20:03     ` Willy Tarreau
       [not found]     ` <20150728200351.GA20360@1wt.eu>
  1 sibling, 0 replies; 25+ messages in thread
From: Willy Tarreau @ 2015-07-28 20:03 UTC (permalink / raw)
  To: Kees Cook
  Cc: security, Jan Beulich, Peter Zijlstra, Andrew Cooper, X86 ML,
	LKML, Steven Rostedt, xen-devel, Borislav Petkov,
	Andy Lutomirski, Sasha Levin, Boris Ostrovsky

Hi Kees,

On Tue, Jul 28, 2015 at 09:56:12AM -0700, Kees Cook wrote:
> I look forward to the runtime disabling patch. :)

Did you get my response to your comments regarding the proposed patch ?

I can rebase it and update it if needed, I just want to make sure
everyone's on the same line regarding this.

Cheers,
Willy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 4/4] x86/ldt: Make modify_ldt optional
       [not found]     ` <20150728200351.GA20360@1wt.eu>
@ 2015-07-28 20:42       ` Kees Cook
       [not found]       ` <CAGXu5jKBmGmH1dbgtwmBeQeSisxrykoxYRcFKe4yNFHyDxTnKw@mail.gmail.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Kees Cook @ 2015-07-28 20:42 UTC (permalink / raw)
  To: Willy Tarreau
  Cc: security, Jan Beulich, Peter Zijlstra, Andrew Cooper, X86 ML,
	LKML, Steven Rostedt, xen-devel, Borislav Petkov,
	Andy Lutomirski, Sasha Levin, Boris Ostrovsky

On Tue, Jul 28, 2015 at 1:03 PM, Willy Tarreau <w@1wt.eu> wrote:
> Hi Kees,
>
> On Tue, Jul 28, 2015 at 09:56:12AM -0700, Kees Cook wrote:
>> I look forward to the runtime disabling patch. :)
>
> Did you get my response to your comments regarding the proposed patch ?
>
> I can rebase it and update it if needed, I just want to make sure
> everyone's on the same line regarding this.

Yeah, I'm fine with what you have. I'd still like a "off until next
reboot", but I'll live. :)

-Kees

-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 4/4] x86/ldt: Make modify_ldt optional
       [not found]       ` <CAGXu5jKBmGmH1dbgtwmBeQeSisxrykoxYRcFKe4yNFHyDxTnKw@mail.gmail.com>
@ 2015-07-28 20:51         ` Willy Tarreau
  0 siblings, 0 replies; 25+ messages in thread
From: Willy Tarreau @ 2015-07-28 20:51 UTC (permalink / raw)
  To: Kees Cook
  Cc: security, Jan Beulich, Peter Zijlstra, Andrew Cooper, X86 ML,
	LKML, Steven Rostedt, xen-devel, Borislav Petkov,
	Andy Lutomirski, Sasha Levin, Boris Ostrovsky

On Tue, Jul 28, 2015 at 01:42:20PM -0700, Kees Cook wrote:
> On Tue, Jul 28, 2015 at 1:03 PM, Willy Tarreau <w@1wt.eu> wrote:
> > Hi Kees,
> >
> > On Tue, Jul 28, 2015 at 09:56:12AM -0700, Kees Cook wrote:
> >> I look forward to the runtime disabling patch. :)
> >
> > Did you get my response to your comments regarding the proposed patch ?
> >
> > I can rebase it and update it if needed, I just want to make sure
> > everyone's on the same line regarding this.
> 
> Yeah, I'm fine with what you have. I'd still like a "off until next
> reboot", but I'll live. :)

That's an improvement we can bring later with special value "-1" which is
even less than zero.

Thanks,
Willy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous
       [not found] ` <1f1e8e0367408585694132a0e8693d157959ce30.1438061139.git.luto@kernel.org>
@ 2015-07-30  7:49   ` Borislav Petkov
  2015-07-30 17:56   ` Boris Ostrovsky
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 25+ messages in thread
From: Borislav Petkov @ 2015-07-30  7:49 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: security, Peter Zijlstra, Andrew Cooper, X86 ML, linux-kernel,
	Steven Rostedt, xen-devel, stable, Jan Beulich, Sasha Levin,
	Boris Ostrovsky

On Mon, Jul 27, 2015 at 10:29:39PM -0700, Andy Lutomirski wrote:
> modify_ldt has questionable locking and does not synchronize
> threads.  Improve it: redesign the locking and synchronize all
> threads' LDTs using an IPI on all modifications.
> 
> This will dramatically slow down modify_ldt in multithreaded
> programs, but there shouldn't be any multithreaded programs that
> care about modify_ldt's performance in the first place.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Andy Lutomirski <luto@kernel.org>

I've stared a lot at this one these days, I guess a

Reviewed-by: Borislav Petkov <bp@suse.de>

is in order.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option
  2015-07-28  5:29 [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option Andy Lutomirski
                   ` (5 preceding siblings ...)
       [not found] ` <49b3792ac157c32e2479ac3092a915a1b8c99ce8.1438061139.git.luto@kernel.org>
@ 2015-07-30 15:53 ` Boris Ostrovsky
       [not found] ` <55BA487E.9030708@oracle.com>
       [not found] ` <1f1e8e0367408585694132a0e8693d157959ce30.1438061139.git.luto@kernel.org>
  8 siblings, 0 replies; 25+ messages in thread
From: Boris Ostrovsky @ 2015-07-30 15:53 UTC (permalink / raw)
  To: Andy Lutomirski, Peter Zijlstra, Steven Rostedt
  Cc: security, Andrew Cooper, X86 ML, linux-kernel, xen-devel,
	Borislav Petkov, Jan Beulich, Sasha Levin

On 07/28/2015 01:29 AM, Andy Lutomirski wrote:
> This is intended for x86/urgent.  Sorry for taking so long, but it
> seemed nice to avoid breaking Xen.
>
> This fixes the "dazed and confused" issue which was exposed by the
> CVE-2015-5157 fix.  It's also probably a good general attack surface
> reduction, and it replaces some scary code with IMO less scary code.
>
> Also, servers and embedded systems should probably turn off modify_ldt.
> This makes that possible.
>
> Boris, could I get a Tested-by, assuming this works for you?

As far as Xen guests are concerned,

Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

But ldt_gdt_32 test segfaults on 64-bit kernels. Baremetal and virt. I 
thought it worked for me before but can't reproduce this with older 
patches. Does it work for you?


-boris


>
> Willy and Kees: I left the config option alone.  The -tiny people will
> like it, and we can always add a sysctl of some sort later.
>
> Changes from v4:
>   - Fix Xen even better (patch 1 is new).
>   - Reorder the patches to make a little more sense.
>
> Changes from v3:
>   - Hopefully fixed Xen.
>   - Fixed 32-bit test case on 32-bit native kernel.
>   - Fix bogus vumnap for some LDT sizes.
>   - Strengthen test case to check all LDT sizes (catches bogus vunmap).
>   - Lots of cleanups, mostly from Borislav.
>   - Simplify IPI code using on_each_cpu_mask.
>
> Changes from v2:
>   - Allocate ldt_struct and the LDT entries separately.  This should fix Xen.
>   - Stop using write_ldt_entry, since I'm pretty sure it's unnecessary now
>     that we no longer mutate an in-use LDT.  (Xen people, can you check?)
>
> Changes from v1:
>   - The config option is new.
>   - The test case is new.
>   - Fixed a missing allocation failure check.
>   - Fixed a use-after-free on fork().
>
> Andy Lutomirski (4):
>    x86/xen: Unmap aliases in xen_alloc_ldt and xen_free_ldt
>    x86/ldt: Make modify_ldt synchronous
>    selftests/x86, x86/ldt: Add a selftest for modify_ldt
>    x86/ldt: Make modify_ldt optional
>
>   arch/x86/Kconfig                      |  17 ++
>   arch/x86/include/asm/desc.h           |  15 -
>   arch/x86/include/asm/mmu.h            |   5 +-
>   arch/x86/include/asm/mmu_context.h    |  68 ++++-
>   arch/x86/kernel/Makefile              |   3 +-
>   arch/x86/kernel/cpu/common.c          |   4 +-
>   arch/x86/kernel/cpu/perf_event.c      |  16 +-
>   arch/x86/kernel/ldt.c                 | 262 +++++++++--------
>   arch/x86/kernel/process_64.c          |   6 +-
>   arch/x86/kernel/step.c                |   8 +-
>   arch/x86/power/cpu.c                  |   3 +-
>   arch/x86/xen/enlighten.c              |  12 +
>   kernel/sys_ni.c                       |   1 +
>   tools/testing/selftests/x86/Makefile  |   2 +-
>   tools/testing/selftests/x86/ldt_gdt.c | 520 ++++++++++++++++++++++++++++++++++
>   15 files changed, 787 insertions(+), 155 deletions(-)
>   create mode 100644 tools/testing/selftests/x86/ldt_gdt.c
>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option
       [not found] ` <55BA487E.9030708@oracle.com>
@ 2015-07-30 16:05   ` Borislav Petkov
       [not found]   ` <20150730160555.GB28617@nazgul.tnic>
  1 sibling, 0 replies; 25+ messages in thread
From: Borislav Petkov @ 2015-07-30 16:05 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: security, Jan Beulich, Peter Zijlstra, Andrew Cooper, X86 ML,
	linux-kernel, Steven Rostedt, xen-devel, Andy Lutomirski,
	Sasha Levin

On Thu, Jul 30, 2015 at 11:53:34AM -0400, Boris Ostrovsky wrote:
> As far as Xen guests are concerned,
> 
> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

Does that mean, this patch 1/4 fixes the 32bit issue you guys are still
debugging on the v4 thread? Or does that need more fixing?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option
       [not found]   ` <20150730160555.GB28617@nazgul.tnic>
@ 2015-07-30 16:12     ` Andrew Cooper
  2015-07-30 16:31       ` Boris Ostrovsky
       [not found]       ` <55BA5173.401@oracle.com>
  0 siblings, 2 replies; 25+ messages in thread
From: Andrew Cooper @ 2015-07-30 16:12 UTC (permalink / raw)
  To: Borislav Petkov, Boris Ostrovsky
  Cc: security, Andy Lutomirski, Peter Zijlstra, X86 ML, linux-kernel,
	Steven Rostedt, xen-devel, Jan Beulich, Sasha Levin

On 30/07/15 17:05, Borislav Petkov wrote:
> On Thu, Jul 30, 2015 at 11:53:34AM -0400, Boris Ostrovsky wrote:
>> As far as Xen guests are concerned,
>>
>> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Does that mean, this patch 1/4 fixes the 32bit issue you guys are still
> debugging on the v4 thread? Or does that need more fixing?
>

I was going to say... This v5 pre-dates figuring out what was wrong with
32bit Xen.  v5 1/4 is still susceptible.

Boris: does your Tested-by cover v5 + proposed fix?

~Andrew

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option
  2015-07-30 16:12     ` Andrew Cooper
@ 2015-07-30 16:31       ` Boris Ostrovsky
       [not found]       ` <55BA5173.401@oracle.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Boris Ostrovsky @ 2015-07-30 16:31 UTC (permalink / raw)
  To: Andrew Cooper, Borislav Petkov
  Cc: security, Andy Lutomirski, Peter Zijlstra, X86 ML, linux-kernel,
	Steven Rostedt, xen-devel, Jan Beulich, Sasha Levin

On 07/30/2015 12:12 PM, Andrew Cooper wrote:
> On 30/07/15 17:05, Borislav Petkov wrote:
>> On Thu, Jul 30, 2015 at 11:53:34AM -0400, Boris Ostrovsky wrote:
>>> As far as Xen guests are concerned,
>>>
>>> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Does that mean, this patch 1/4 fixes the 32bit issue you guys are still
>> debugging on the v4 thread? Or does that need more fixing?
>>
> I was going to say... This v5 pre-dates figuring out what was wrong with
> 32bit Xen.  v5 1/4 is still susceptible.
>
> Boris: does your Tested-by cover v5 + proposed fix?
>

Only V5, no extra changes.

And perhaps dropping aliases in xen_alloc_ldt() may be sufficient since 
with that done we will only have one mapping so a subsequent fault will 
have "correct" cr2 provided by the hypervisor (from your earlier email 
it sounded that hypervisor may have been providing incorrect cr2 if 
alias exists)

-boris

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option
       [not found]       ` <55BA5173.401@oracle.com>
@ 2015-07-30 17:06         ` Andrew Cooper
       [not found]         ` <55BA59B1.5000600@citrix.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Andrew Cooper @ 2015-07-30 17:06 UTC (permalink / raw)
  To: Boris Ostrovsky, Borislav Petkov
  Cc: security, Andy Lutomirski, Peter Zijlstra, X86 ML, linux-kernel,
	Steven Rostedt, xen-devel, Jan Beulich, Sasha Levin

On 30/07/15 17:31, Boris Ostrovsky wrote:
> On 07/30/2015 12:12 PM, Andrew Cooper wrote:
>> On 30/07/15 17:05, Borislav Petkov wrote:
>>> On Thu, Jul 30, 2015 at 11:53:34AM -0400, Boris Ostrovsky wrote:
>>>> As far as Xen guests are concerned,
>>>>
>>>> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>> Does that mean, this patch 1/4 fixes the 32bit issue you guys are still
>>> debugging on the v4 thread? Or does that need more fixing?
>>>
>> I was going to say... This v5 pre-dates figuring out what was wrong with
>> 32bit Xen.  v5 1/4 is still susceptible.
>>
>> Boris: does your Tested-by cover v5 + proposed fix?
>>
>
> Only V5, no extra changes.

Including running the ldt_gdt test?

>
> And perhaps dropping aliases in xen_alloc_ldt() may be sufficient
> since with that done we will only have one mapping so a subsequent
> fault will have "correct" cr2 provided by the hypervisor (from your
> earlier email it sounded that hypervisor may have been providing
> incorrect cr2 if alias exists)

They are sufficient to fix the first of the two bugs, but the free side
still has no protection against a missing l2, unless I am missing
something in the rest of the series?

~Andrew

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option
       [not found]         ` <55BA59B1.5000600@citrix.com>
@ 2015-07-30 17:18           ` Boris Ostrovsky
       [not found]           ` <55BA5C5C.2050904@oracle.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Boris Ostrovsky @ 2015-07-30 17:18 UTC (permalink / raw)
  To: Andrew Cooper, Borislav Petkov
  Cc: security, Andy Lutomirski, Peter Zijlstra, X86 ML, linux-kernel,
	Steven Rostedt, xen-devel, Jan Beulich, Sasha Levin

On 07/30/2015 01:06 PM, Andrew Cooper wrote:
> On 30/07/15 17:31, Boris Ostrovsky wrote:
>> On 07/30/2015 12:12 PM, Andrew Cooper wrote:
>>> On 30/07/15 17:05, Borislav Petkov wrote:
>>>> On Thu, Jul 30, 2015 at 11:53:34AM -0400, Boris Ostrovsky wrote:
>>>>> As far as Xen guests are concerned,
>>>>>
>>>>> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>>>> Does that mean, this patch 1/4 fixes the 32bit issue you guys are still
>>>> debugging on the v4 thread? Or does that need more fixing?
>>>>
>>> I was going to say... This v5 pre-dates figuring out what was wrong with
>>> 32bit Xen.  v5 1/4 is still susceptible.
>>>
>>> Boris: does your Tested-by cover v5 + proposed fix?
>>>
>> Only V5, no extra changes.
> Including running the ldt_gdt test?

Yes, except that 32-on-64 doesn't work, but that's not Xen-specific.

Still, user-visible behavior changes.

>
>> And perhaps dropping aliases in xen_alloc_ldt() may be sufficient
>> since with that done we will only have one mapping so a subsequent
>> fault will have "correct" cr2 provided by the hypervisor (from your
>> earlier email it sounded that hypervisor may have been providing
>> incorrect cr2 if alias exists)
> They are sufficient to fix the first of the two bugs, but the free side
> still has no protection against a missing l2, unless I am missing
> something in the rest of the series?

Without aliases a subsequent fault *will* fill correct l2, won't it?

-boris

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous
       [not found] ` <1f1e8e0367408585694132a0e8693d157959ce30.1438061139.git.luto@kernel.org>
  2015-07-30  7:49   ` [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous Borislav Petkov
@ 2015-07-30 17:56   ` Boris Ostrovsky
       [not found]   ` <55BA6534.2080407@oracle.com>
  2015-08-13 21:05   ` H. Peter Anvin
  3 siblings, 0 replies; 25+ messages in thread
From: Boris Ostrovsky @ 2015-07-30 17:56 UTC (permalink / raw)
  To: Andy Lutomirski, Peter Zijlstra, Steven Rostedt
  Cc: security, Andrew Cooper, X86 ML, linux-kernel, stable, xen-devel,
	Borislav Petkov, Jan Beulich, Sasha Levin

On 07/28/2015 01:29 AM, Andy Lutomirski wrote:

> +
> +static inline void load_mm_ldt(struct mm_struct *mm)
> +{
> +	struct ldt_struct *ldt;
> +	DEBUG_LOCKS_WARN_ON(!irqs_disabled());


I thought this was supposed to be checking preemptible()?

-boris

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous
       [not found]   ` <55BA6534.2080407@oracle.com>
@ 2015-07-30 18:14     ` Andy Lutomirski
       [not found]     ` <CALCETrXqT3zeJUX9uyjDGOdUi0dHUGwBn-n2NOaPaFpmB2pmDg@mail.gmail.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Andy Lutomirski @ 2015-07-30 18:14 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: security, Jan Beulich, Peter Zijlstra, Andrew Cooper, X86 ML,
	linux-kernel, Steven Rostedt, xen-devel, Borislav Petkov, stable,
	Andy Lutomirski, Sasha Levin

On Thu, Jul 30, 2015 at 10:56 AM, Boris Ostrovsky
<boris.ostrovsky@oracle.com> wrote:
> On 07/28/2015 01:29 AM, Andy Lutomirski wrote:
>
>> +
>> +static inline void load_mm_ldt(struct mm_struct *mm)
>> +{
>> +       struct ldt_struct *ldt;
>> +       DEBUG_LOCKS_WARN_ON(!irqs_disabled());
>
>
>
> I thought this was supposed to be checking preemptible()?

v6 fixes that.  Check your future inbox :)  I'm goint to rework the
Xen bit too based on the long discussion.

Is that the only failure you're seeing?  ldt_gdt_32 passes on 64-bit for me

>
> -boris



-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous
       [not found]     ` <CALCETrXqT3zeJUX9uyjDGOdUi0dHUGwBn-n2NOaPaFpmB2pmDg@mail.gmail.com>
@ 2015-07-30 18:35       ` Boris Ostrovsky
       [not found]       ` <55BA6E6B.7040102@oracle.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Boris Ostrovsky @ 2015-07-30 18:35 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: security, Jan Beulich, Peter Zijlstra, Andrew Cooper, X86 ML,
	linux-kernel, Steven Rostedt, xen-devel, Borislav Petkov, stable,
	Andy Lutomirski, Sasha Levin

On 07/30/2015 02:14 PM, Andy Lutomirski wrote:
> On Thu, Jul 30, 2015 at 10:56 AM, Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
>> On 07/28/2015 01:29 AM, Andy Lutomirski wrote:
>>
>>> +
>>> +static inline void load_mm_ldt(struct mm_struct *mm)
>>> +{
>>> +       struct ldt_struct *ldt;
>>> +       DEBUG_LOCKS_WARN_ON(!irqs_disabled());
>>
>>
>> I thought this was supposed to be checking preemptible()?
> v6 fixes that.  Check your future inbox :)  I'm goint to rework the
> Xen bit too based on the long discussion.
>
> Is that the only failure you're seeing?

Yes.

> ldt_gdt_32 passes on 64-bit for me

With your patch:

root@haswell> uname -a
Linux dhcp-burlington7-2nd-B-east-10-152-55-89.usdhcp.oraclecorp.com 
4.2.0-rc4 #107 SMP Thu Jul 30 11:05:19 EDT 2015 x86_64 x86_64 x86_64 
GNU/Linux
root@haswell> cd tmp/linux/tools/testing/selftests/x86/
root@haswell> ls -l ldt_gdt_32
-rwxr-xr-x 1 root root 25975 Jul 30 11:48 ldt_gdt_32
root@haswell> ./ldt_gdt_32
[OK]    LDT entry 0 has AR 0x0040FA00 and limit 0x0000000A
[OK]    LDT entry 0 has AR 0x00C0FA00 and limit 0x0000AFFF
[OK]    LDT entry 1 is invalid
[OK]    LDT entry 2 has AR 0x00C0FA00 and limit 0x0000AFFF
[OK]    LDT entry 1 is invalid
[OK]    LDT entry 2 has AR 0x00C0FA00 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D0FA00 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D07A00 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00907A00 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D07200 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D07000 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D07400 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00507600 and limit 0x0000000A
[OK]    LDT entry 2 has AR 0x00507E00 and limit 0x0000000A
[OK]    LDT entry 2 has AR 0x00507C00 and limit 0x0000000A
[OK]    LDT entry 2 has AR 0x00507A00 and limit 0x0000000A
[OK]    LDT entry 2 has AR 0x00507800 and limit 0x0000000A
[RUN]   Test fork
[OK]    LDT entry 2 has AR 0x00507800 and limit 0x0000000A
[OK]    LDT entry 1 is invalid
[OK]    Child succeeded
[RUN]   Test size
[DONE]  Size test
[OK]    modify_ldt failure 22
[OK]    LDT entry 0 has AR 0x0000F200 and limit 0x00000000
[OK]    LDT entry 0 has AR 0x00007200 and limit 0x00000000
[OK]    LDT entry 0 has AR 0x0000F000 and limit 0x00000000
[OK]    LDT entry 0 has AR 0x00007200 and limit 0x00000000
[OK]    LDT entry 0 has AR 0x00007000 and limit 0x00000001
[OK]    LDT entry 0 has AR 0x00007000 and limit 0x00000000
[OK]    LDT entry 0 is invalid
[OK]    LDT entry 0 has AR 0x0040F200 and limit 0x00000000
[OK]    LDT entry 0 is invalid
[RUN]   Cross-CPU LDT invalidation
Segmentation fault (core dumped)
root@haswell> dmesg | grep -i xen
[    2.953815] xenfs: not registering filesystem on non-xen platform
[   17.495141] IPv6: ADDRCONF(NETDEV_UP): xenbr0: link is not ready
[   20.913839] xenbr0: port 1(eth0) entered forwarding state
[   20.913907] xenbr0: port 1(eth0) entered forwarding state
[   20.914044] IPv6: ADDRCONF(NETDEV_CHANGE): xenbr0: link becomes ready


On a slightly older kernel:


root@haswell> uname -a
Linux dhcp-burlington7-2nd-B-east-10-152-55-89.usdhcp.oraclecorp.com 
4.1.0-rc2 #111 SMP Fri Jun 19 16:28:46 EDT 2015 x86_64 x86_64 x86_64 
GNU/Linux
root@haswell> cd tmp/linux/tools/testing/selftests/x86/
root@haswell> ls -l ldt_gdt_32
-rwxr-xr-x 1 root root 25975 Jul 30 11:48 ldt_gdt_32
root@haswell> ./ldt_gdt_32
[OK]    LDT entry 0 has AR 0x0040FA00 and limit 0x0000000A
[OK]    LDT entry 0 has AR 0x00C0FA00 and limit 0x0000AFFF

[OK]    LDT entry 1 is invalid
[OK]    LDT entry 2 has AR 0x00C0FA00 and limit 0x0000AFFF
[OK]    LDT entry 1 is invalid
[OK]    LDT entry 2 has AR 0x00C0FA00 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D0FA00 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D07A00 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00907A00 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D07200 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D07000 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00D07400 and limit 0x0000AFFF
[OK]    LDT entry 2 has AR 0x00507600 and limit 0x0000000A
[OK]    LDT entry 2 has AR 0x00507E00 and limit 0x0000000A
[OK]    LDT entry 2 has AR 0x00507C00 and limit 0x0000000A
[OK]    LDT entry 2 has AR 0x00507A00 and limit 0x0000000A
[OK]    LDT entry 2 has AR 0x00507800 and limit 0x0000000A
[RUN]   Test fork
[OK]    LDT entry 2 has AR 0x00507800 and limit 0x0000000A
[OK]    LDT entry 1 is invalid
[OK]    Child succeeded
[RUN]   Test size
[DONE]  Size test
[OK]    modify_ldt failure 22
[OK]    LDT entry 0 has AR 0x0000F200 and limit 0x00000000
[OK]    LDT entry 0 has AR 0x00007200 and limit 0x00000000
[OK]    LDT entry 0 has AR 0x0000F000 and limit 0x00000000
[OK]    LDT entry 0 has AR 0x00007200 and limit 0x00000000
[OK]    LDT entry 0 has AR 0x00007000 and limit 0x00000001
[OK]    LDT entry 0 has AR 0x00007000 and limit 0x00000000
[OK]    LDT entry 0 is invalid
[OK]    LDT entry 0 has AR 0x0040F200 and limit 0x00000000
[OK]    LDT entry 0 is invalid
[RUN]   Cross-CPU LDT invalidation
[FAIL]  5 of 5 iterations failed
root@haswell> dmesg | grep -i xen
[    2.971167] xenfs: not registering filesystem on non-xen platform
[   17.144879] IPv6: ADDRCONF(NETDEV_UP): xenbr0: link is not ready
[   20.588663] xenbr0: port 1(eth0) entered forwarding state
[   20.588706] xenbr0: port 1(eth0) entered forwarding state
[   20.588802] IPv6: ADDRCONF(NETDEV_CHANGE): xenbr0: link becomes ready

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous
       [not found]       ` <55BA6E6B.7040102@oracle.com>
@ 2015-07-30 19:25         ` Andy Lutomirski
       [not found]         ` <CALCETrX7ppxZ5_9hbGwFx7fxe_60t8xVamg62f1Xghi8a68urg@mail.gmail.com>
  1 sibling, 0 replies; 25+ messages in thread
From: Andy Lutomirski @ 2015-07-30 19:25 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: security, Jan Beulich, Peter Zijlstra, Andrew Cooper, X86 ML,
	linux-kernel, Steven Rostedt, xen-devel, Borislav Petkov, stable,
	Andy Lutomirski, Sasha Levin

On Thu, Jul 30, 2015 at 11:35 AM, Boris Ostrovsky
<boris.ostrovsky@oracle.com> wrote:
> On 07/30/2015 02:14 PM, Andy Lutomirski wrote:
>>
>> On Thu, Jul 30, 2015 at 10:56 AM, Boris Ostrovsky
>> <boris.ostrovsky@oracle.com> wrote:
>>>
>>> On 07/28/2015 01:29 AM, Andy Lutomirski wrote:
>>>
>>>> +
>>>> +static inline void load_mm_ldt(struct mm_struct *mm)
>>>> +{
>>>> +       struct ldt_struct *ldt;
>>>> +       DEBUG_LOCKS_WARN_ON(!irqs_disabled());
>>>
>>>
>>>
>>> I thought this was supposed to be checking preemptible()?
>>
>> v6 fixes that.  Check your future inbox :)  I'm goint to rework the
>> Xen bit too based on the long discussion.
>>
>> Is that the only failure you're seeing?
>
>
> Yes.
>
>> ldt_gdt_32 passes on 64-bit for me
>
>
> With your patch:
>
> root@haswell> uname -a
> Linux dhcp-burlington7-2nd-B-east-10-152-55-89.usdhcp.oraclecorp.com
> 4.2.0-rc4 #107 SMP Thu Jul 30 11:05:19 EDT 2015 x86_64 x86_64 x86_64
> GNU/Linux
> root@haswell> cd tmp/linux/tools/testing/selftests/x86/
> root@haswell> ls -l ldt_gdt_32
> -rwxr-xr-x 1 root root 25975 Jul 30 11:48 ldt_gdt_32
> root@haswell> ./ldt_gdt_32
> [OK]    LDT entry 0 has AR 0x0040FA00 and limit 0x0000000A
> [OK]    LDT entry 0 has AR 0x00C0FA00 and limit 0x0000AFFF
> [OK]    LDT entry 1 is invalid
> [OK]    LDT entry 2 has AR 0x00C0FA00 and limit 0x0000AFFF
> [OK]    LDT entry 1 is invalid
> [OK]    LDT entry 2 has AR 0x00C0FA00 and limit 0x0000AFFF
> [OK]    LDT entry 2 has AR 0x00D0FA00 and limit 0x0000AFFF
> [OK]    LDT entry 2 has AR 0x00D07A00 and limit 0x0000AFFF
> [OK]    LDT entry 2 has AR 0x00907A00 and limit 0x0000AFFF
> [OK]    LDT entry 2 has AR 0x00D07200 and limit 0x0000AFFF
> [OK]    LDT entry 2 has AR 0x00D07000 and limit 0x0000AFFF
> [OK]    LDT entry 2 has AR 0x00D07400 and limit 0x0000AFFF
> [OK]    LDT entry 2 has AR 0x00507600 and limit 0x0000000A
> [OK]    LDT entry 2 has AR 0x00507E00 and limit 0x0000000A
> [OK]    LDT entry 2 has AR 0x00507C00 and limit 0x0000000A
> [OK]    LDT entry 2 has AR 0x00507A00 and limit 0x0000000A
> [OK]    LDT entry 2 has AR 0x00507800 and limit 0x0000000A
> [RUN]   Test fork
> [OK]    LDT entry 2 has AR 0x00507800 and limit 0x0000000A
> [OK]    LDT entry 1 is invalid
> [OK]    Child succeeded
> [RUN]   Test size
> [DONE]  Size test
> [OK]    modify_ldt failure 22
> [OK]    LDT entry 0 has AR 0x0000F200 and limit 0x00000000
> [OK]    LDT entry 0 has AR 0x00007200 and limit 0x00000000
> [OK]    LDT entry 0 has AR 0x0000F000 and limit 0x00000000
> [OK]    LDT entry 0 has AR 0x00007200 and limit 0x00000000
> [OK]    LDT entry 0 has AR 0x00007000 and limit 0x00000001
> [OK]    LDT entry 0 has AR 0x00007000 and limit 0x00000000
> [OK]    LDT entry 0 is invalid
> [OK]    LDT entry 0 has AR 0x0040F200 and limit 0x00000000
> [OK]    LDT entry 0 is invalid
> [RUN]   Cross-CPU LDT invalidation
> Segmentation fault (core dumped)

That's not good.

Can you backtrace it?  (I.e. compite ldt_gdt_32 with -g and load the
core dumb in gdb?)  My best guesses are either a signal delivery
failure (although that shouldn't be a problem for 32-bit userspace on
any kernel) or an actual LDT access fault, and the latter would be
interesting.

I haven't been able to reproduce this.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option
       [not found]           ` <55BA5C5C.2050904@oracle.com>
@ 2015-07-31  8:43             ` Borislav Petkov
       [not found]             ` <20150731084348.GD2128@nazgul.tnic>
  1 sibling, 0 replies; 25+ messages in thread
From: Borislav Petkov @ 2015-07-31  8:43 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Juergen Gross, security, Andy Lutomirski, Peter Zijlstra,
	Andrew Cooper, X86 ML, linux-kernel, Steven Rostedt, xen-devel,
	Jan Beulich, Sasha Levin

Hey Boris,

On Thu, Jul 30, 2015 at 01:18:20PM -0400, Boris Ostrovsky wrote:
> >>Only V5, no extra changes.
> >Including running the ldt_gdt test?
> 
> Yes, except that 32-on-64 doesn't work, but that's not Xen-specific.

so which tests are you running exactly and where can I get them? Andy's
repo?

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option
       [not found]             ` <20150731084348.GD2128@nazgul.tnic>
@ 2015-07-31 13:42               ` Boris Ostrovsky
  0 siblings, 0 replies; 25+ messages in thread
From: Boris Ostrovsky @ 2015-07-31 13:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Juergen Gross, security, Andy Lutomirski, Peter Zijlstra,
	Andrew Cooper, X86 ML, linux-kernel, Steven Rostedt, xen-devel,
	Jan Beulich, Sasha Levin

On 07/31/2015 04:43 AM, Borislav Petkov wrote:
> Hey Boris,
>
> On Thu, Jul 30, 2015 at 01:18:20PM -0400, Boris Ostrovsky wrote:
>>>> Only V5, no extra changes.
>>> Including running the ldt_gdt test?
>> Yes, except that 32-on-64 doesn't work, but that's not Xen-specific.
> so which tests are you running exactly and where can I get them? Andy's
> repo?

tools/testing/selftests/x86/ldt_gdt.c, which is patch 3/4 in Andy's series.

-boris

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous
       [not found]         ` <CALCETrX7ppxZ5_9hbGwFx7fxe_60t8xVamg62f1Xghi8a68urg@mail.gmail.com>
@ 2015-07-31 16:51           ` Boris Ostrovsky
  0 siblings, 0 replies; 25+ messages in thread
From: Boris Ostrovsky @ 2015-07-31 16:51 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: security, Jan Beulich, Peter Zijlstra, Andrew Cooper, X86 ML,
	linux-kernel, Steven Rostedt, xen-devel, Borislav Petkov, stable,
	Andy Lutomirski, Sasha Levin

On 07/30/2015 03:25 PM, Andy Lutomirski wrote:
> On Thu, Jul 30, 2015 at 11:35 AM, Boris Ostrovsky
> <boris.ostrovsky@oracle.com> wrote:
>> [OK]    LDT entry 0 has AR 0x0040F200 and limit 0x00000000
>> [OK]    LDT entry 0 is invalid
>> [RUN]   Cross-CPU LDT invalidation
>> Segmentation fault (core dumped)
> That's not good.
>
> Can you backtrace it?  (I.e. compite ldt_gdt_32 with -g and load the
> core dumb in gdb?)  My best guesses are either a signal delivery
> failure (although that shouldn't be a problem for 32-bit userspace on
> any kernel) or an actual LDT access fault, and the latter would be
> interesting.
>
> I haven't been able to reproduce this.

This looks like a userspace bug. Breaks on F18, works fine in F22.

Possibly something about signal handling --- I noticed on F18 I'd get 
two SEGV's in a row whereas we should only get one.

Anyway, this is not an issue as far as this thread is concerned.

-boris

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous
       [not found] ` <1f1e8e0367408585694132a0e8693d157959ce30.1438061139.git.luto@kernel.org>
                     ` (2 preceding siblings ...)
       [not found]   ` <55BA6534.2080407@oracle.com>
@ 2015-08-13 21:05   ` H. Peter Anvin
  3 siblings, 0 replies; 25+ messages in thread
From: H. Peter Anvin @ 2015-08-13 21:05 UTC (permalink / raw)
  To: Andy Lutomirski, Peter Zijlstra, Steven Rostedt
  Cc: security, Andrew Cooper, X86 ML, linux-kernel, stable, xen-devel,
	Borislav Petkov, Jan Beulich, Sasha Levin, Boris Ostrovsky

On 07/27/2015 10:29 PM, Andy Lutomirski wrote:
> modify_ldt has questionable locking and does not synchronize
> threads.  Improve it: redesign the locking and synchronize all
> threads' LDTs using an IPI on all modifications.
> 
> This will dramatically slow down modify_ldt in multithreaded
> programs, but there shouldn't be any multithreaded programs that
> care about modify_ldt's performance in the first place.
> 

<nitpick>

... except 32-bit programs compiled with one specific version of glibc.
 Do we care?  I don't think so.

</nitpick>

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-08-13 21:05 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-28  5:29 [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option Andy Lutomirski
2015-07-28  5:29 ` [PATCH v5 1/4] x86/xen: Unmap aliases in xen_alloc_ldt and xen_free_ldt Andy Lutomirski
2015-07-28  5:29 ` [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous Andy Lutomirski
2015-07-28  5:29 ` [PATCH v5 3/4] selftests/x86, x86/ldt: Add a selftest for modify_ldt Andy Lutomirski
2015-07-28  5:29 ` [PATCH v5 4/4] x86/ldt: Make modify_ldt optional Andy Lutomirski
     [not found] ` <8d0567e8ab3933a7c91486e6d628c5f5c02f0753.1438061139.git.luto@kernel.org>
2015-07-28 16:53   ` [PATCH v5 3/4] selftests/x86, x86/ldt: Add a selftest for modify_ldt Kees Cook
     [not found] ` <49b3792ac157c32e2479ac3092a915a1b8c99ce8.1438061139.git.luto@kernel.org>
2015-07-28 16:56   ` [PATCH v5 4/4] x86/ldt: Make modify_ldt optional Kees Cook
     [not found]   ` <CAGXu5jLGWEiA+RCf-eApAdx0fsuRFWguOQMpWX9nOKDwWLHWtg@mail.gmail.com>
2015-07-28 20:03     ` Willy Tarreau
     [not found]     ` <20150728200351.GA20360@1wt.eu>
2015-07-28 20:42       ` Kees Cook
     [not found]       ` <CAGXu5jKBmGmH1dbgtwmBeQeSisxrykoxYRcFKe4yNFHyDxTnKw@mail.gmail.com>
2015-07-28 20:51         ` Willy Tarreau
2015-07-30 15:53 ` [PATCH v5 0/4] x86: modify_ldt improvement, test, and config option Boris Ostrovsky
     [not found] ` <55BA487E.9030708@oracle.com>
2015-07-30 16:05   ` Borislav Petkov
     [not found]   ` <20150730160555.GB28617@nazgul.tnic>
2015-07-30 16:12     ` Andrew Cooper
2015-07-30 16:31       ` Boris Ostrovsky
     [not found]       ` <55BA5173.401@oracle.com>
2015-07-30 17:06         ` Andrew Cooper
     [not found]         ` <55BA59B1.5000600@citrix.com>
2015-07-30 17:18           ` Boris Ostrovsky
     [not found]           ` <55BA5C5C.2050904@oracle.com>
2015-07-31  8:43             ` Borislav Petkov
     [not found]             ` <20150731084348.GD2128@nazgul.tnic>
2015-07-31 13:42               ` Boris Ostrovsky
     [not found] ` <1f1e8e0367408585694132a0e8693d157959ce30.1438061139.git.luto@kernel.org>
2015-07-30  7:49   ` [PATCH v5 2/4] x86/ldt: Make modify_ldt synchronous Borislav Petkov
2015-07-30 17:56   ` Boris Ostrovsky
     [not found]   ` <55BA6534.2080407@oracle.com>
2015-07-30 18:14     ` Andy Lutomirski
     [not found]     ` <CALCETrXqT3zeJUX9uyjDGOdUi0dHUGwBn-n2NOaPaFpmB2pmDg@mail.gmail.com>
2015-07-30 18:35       ` Boris Ostrovsky
     [not found]       ` <55BA6E6B.7040102@oracle.com>
2015-07-30 19:25         ` Andy Lutomirski
     [not found]         ` <CALCETrX7ppxZ5_9hbGwFx7fxe_60t8xVamg62f1Xghi8a68urg@mail.gmail.com>
2015-07-31 16:51           ` Boris Ostrovsky
2015-08-13 21:05   ` H. Peter Anvin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).