* [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
@ 2016-01-26  1:15 Laura Abbott
  2016-01-26  1:15 ` [RFC][PATCH 1/3] slub: Drop lock at the end of free_debug_processing Laura Abbott
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Laura Abbott @ 2016-01-26  1:15 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton
  Cc: Laura Abbott, linux-mm, linux-kernel, kernel-hardening, Kees Cook

Hi,

Based on the discussion from the series to add slab sanitization
(lkml.kernel.org/g/<1450755641-7856-1-git-send-email-laura@labbott.name>),
the existing SLAB_POISON mechanism already covers similar behavior.
The performance of SLAB_POISON isn't very good, though. With hackbench -g 20 -l 1000
on QEMU with one CPU:

slub_debug=-:  7.437
slub_debug=P: 15.366

Poisoning memory is certainly going to have a performance impact but there
are two major contributors to this slowdown: the fastpath is always disabled
when debugging features are enabled and there are lots of expensive
consistency checks happening. This series attempts to address both of them.

Debugging checks can now happen on the fast path. This does involve disabling
preemption and interrupts around the checks for consistency. This series also
introduces a new slab flag ('Q') that skips the consistency checks but still
lets poisoning, and possibly tracing, happen. After this series:

slub_debug=-:   7.932
slub_debug=PQ:  8.203
slub_debug=P:  10.707

I haven't run this series through a ton of stress tests yet as I was hoping
to get some feedback that this approach looks correct.

Since I expect this to be the trickiest part of SL*B sanitization, my plan
is to focus on getting the SLUB speedup merged first and then work on the
rest of SL*B sanitization.

As always, feedback is appreciated.

Thanks,
Laura

Laura Abbott (3):
  slub: Drop lock at the end of free_debug_processing
  slub: Don't limit debugging to slow paths
  slub: Add option to skip consistency checks

 include/linux/slab.h |   1 +
 init/Kconfig         |  12 +++
 mm/slub.c            | 214 ++++++++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 200 insertions(+), 27 deletions(-)

-- 
2.5.0

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC][PATCH 1/3] slub: Drop lock at the end of free_debug_processing
  2016-01-26  1:15 [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks Laura Abbott
@ 2016-01-26  1:15 ` Laura Abbott
  2016-01-26 16:19   ` Christoph Lameter
  2016-01-26  1:15 ` [RFC][PATCH 2/3] slub: Don't limit debugging to slow paths Laura Abbott
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Laura Abbott @ 2016-01-26  1:15 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton
  Cc: Laura Abbott, linux-mm, linux-kernel, kernel-hardening, Kees Cook


Currently, free_debug_processing has a comment "Keep node_lock to preserve
integrity until the object is actually freed". In actuality,
the lock is dropped immediately in __slab_free. Rather than wait until
__slab_free and potentially throw off the unlikely marking, just drop
the lock at the end of free_debug_processing. This also lets
free_debug_processing take its own copy of the spinlock flags rather than
trying to share the ones from __slab_free. Since there is no use for the
node afterwards, change the return type of free_debug_processing to return
an int like alloc_debug_processing does.

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
---
 mm/slub.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 574a085..6ddba32 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1071,16 +1071,17 @@ bad:
 }
 
 /* Supports checking bulk free of a constructed freelist */
-static noinline struct kmem_cache_node *free_debug_processing(
+static noinline int free_debug_processing(
 	struct kmem_cache *s, struct page *page,
 	void *head, void *tail, int bulk_cnt,
-	unsigned long addr, unsigned long *flags)
+	unsigned long addr)
 {
 	struct kmem_cache_node *n = get_node(s, page_to_nid(page));
 	void *object = head;
 	int cnt = 0;
+	unsigned long uninitialized_var(flags);
 
-	spin_lock_irqsave(&n->list_lock, *flags);
+	spin_lock_irqsave(&n->list_lock, flags);
 	slab_lock(page);
 
 	if (!check_slab(s, page))
@@ -1133,17 +1134,14 @@ out:
 			 bulk_cnt, cnt);
 
 	slab_unlock(page);
-	/*
-	 * Keep node_lock to preserve integrity
-	 * until the object is actually freed
-	 */
-	return n;
+	spin_unlock_irqrestore(&n->list_lock, flags);
+	return 1;
 
 fail:
 	slab_unlock(page);
-	spin_unlock_irqrestore(&n->list_lock, *flags);
+	spin_unlock_irqrestore(&n->list_lock, flags);
 	slab_fix(s, "Object at 0x%p not freed", object);
-	return NULL;
+	return 0;
 }
 
 static int __init setup_slub_debug(char *str)
@@ -1234,7 +1232,7 @@ static inline void setup_object_debug(struct kmem_cache *s,
 static inline int alloc_debug_processing(struct kmem_cache *s,
 	struct page *page, void *object, unsigned long addr) { return 0; }
 
-static inline struct kmem_cache_node *free_debug_processing(
+static inline int free_debug_processing(
 	struct kmem_cache *s, struct page *page,
 	void *head, void *tail, int bulk_cnt,
 	unsigned long addr, unsigned long *flags) { return NULL; }
@@ -2651,8 +2649,7 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 	stat(s, FREE_SLOWPATH);
 
 	if (kmem_cache_debug(s) &&
-	    !(n = free_debug_processing(s, page, head, tail, cnt,
-					addr, &flags)))
+	    !free_debug_processing(s, page, head, tail, cnt, addr))
 		return;
 
 	do {
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC][PATCH 2/3] slub: Don't limit debugging to slow paths
  2016-01-26  1:15 [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks Laura Abbott
  2016-01-26  1:15 ` [RFC][PATCH 1/3] slub: Drop lock at the end of free_debug_processing Laura Abbott
@ 2016-01-26  1:15 ` Laura Abbott
  2016-01-26  8:48   ` Paul Bolle
  2016-01-26  1:15 ` [PATCH 3/3] slub: Add option to skip consistency checks Laura Abbott
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Laura Abbott @ 2016-01-26  1:15 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton
  Cc: Laura Abbott, linux-mm, linux-kernel, kernel-hardening, Kees Cook


Currently, when slabs are marked with debug options, the allocation
path will skip using CPU slabs. This has a definite performance
impact. Add an option to allow debugging to happen on the fast path.

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
---
 init/Kconfig |  12 +++++
 mm/slub.c    | 164 +++++++++++++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 161 insertions(+), 15 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 2232080..6d807e7 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1674,6 +1674,18 @@ config SLUB_DEBUG
 	  SLUB sysfs support. /sys/slab will not exist and there will be
 	  no support for cache validation etc.
 
+config SLUB_DEBUG_FASTPATH
+	bool "Allow SLUB debugging to utilize the fastpath"
+	depends on SLUB_DEBUG
+	help
+	  SLUB_DEBUG forces all allocations to utilize the slow path which
+	  is a performance penalty. Turning on this option lets the debugging
+	  use the fast path. This helps the performance when debugging
+	  features are turned on. If you aren't planning on utilizing any
+	  of the SLUB_DEBUG features, you should say N here.
+
+	  If unsure, say N
+
 config COMPAT_BRK
 	bool "Disable heap randomization"
 	default y
diff --git a/mm/slub.c b/mm/slub.c
index 6ddba32..a47e615 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -898,9 +898,10 @@ static int check_slab(struct kmem_cache *s, struct page *page)
  * Determine if a certain object on a page is on the freelist. Must hold the
  * slab lock to guarantee that the chains are in a consistent state.
  */
-static int on_freelist(struct kmem_cache *s, struct page *page, void *search)
+static int on_freelist(struct kmem_cache *s, struct page *page, void *search,
+			void *cpu_freelist)
 {
-	int nr = 0;
+	int nr = 0, cpu_nr = 0;
 	void *fp;
 	void *object = NULL;
 	int max_objects;
@@ -928,6 +929,29 @@ static int on_freelist(struct kmem_cache *s, struct page *page, void *search)
 		nr++;
 	}
 
+	fp = cpu_freelist;
+	while (fp && cpu_nr <= page->objects) {
+		if (fp == search)
+			return 1;
+		if (!check_valid_pointer(s, page, fp)) {
+			if (object) {
+				object_err(s, page, object,
+					"Freechain corrupt");
+				set_freepointer(s, object, NULL);
+			} else {
+				slab_err(s, page, "Freepointer corrupt");
+				page->freelist = NULL;
+				page->inuse = page->objects;
+				slab_fix(s, "Freelist cleared");
+				return 0;
+			}
+			break;
+		}
+		object = fp;
+		fp = get_freepointer(s, object);
+		cpu_nr++;
+	}
+
 	max_objects = order_objects(compound_order(page), s->size, s->reserved);
 	if (max_objects > MAX_OBJS_PER_PAGE)
 		max_objects = MAX_OBJS_PER_PAGE;
@@ -1034,6 +1058,7 @@ static void setup_object_debug(struct kmem_cache *s, struct page *page,
 	init_tracking(s, object);
 }
 
 +/* Must not be called when migration can happen */
 static noinline int alloc_debug_processing(struct kmem_cache *s,
 					struct page *page,
 					void *object, unsigned long addr)
@@ -1070,10 +1095,51 @@ bad:
 	return 0;
 }
 
+#ifdef SLUB_DEBUG_FASTPATH
+static noinline int alloc_debug_processing_fastpath(struct kmem_cache *s,
+					struct kmem_cache_cpu *c,
+					struct page *page,
+					void *object, unsigned long tid,
+					unsigned long addr)
+{
+	unsigned long flags;
+	int ret = 0;
+
+	preempt_disable();
+	local_irq_save(flags);
+
+	/*
+	 * We've now disabled preemption and IRQs but we still need
+	 * to check that this is the right CPU
+	 */
+	if (!this_cpu_cmpxchg_double(s->cpu_slab->freelist, s->cpu_slab->tid,
+				c->freelist, tid,
+				c->freelist, tid))
+		goto out;
+
+	ret = alloc_debug_processing(s, page, object, addr);
+
+out:
+	local_irq_restore(flags);
+	preempt_enable();
+	return ret;
+}
+#else
+static noinline int alloc_debug_processing_fastpath(struct kmem_cache *s,
+					struct kmem_cache_cpu *c,
+					struct page *page,
+					void *object, unsigned long tid,
+					unsigned long addr)
+{
+	return 1;
+}
+#endif
+
 /* Supports checking bulk free of a constructed freelist */
 static noinline int free_debug_processing(
 	struct kmem_cache *s, struct page *page,
 	void *head, void *tail, int bulk_cnt,
+	void *cpu_freelist,
 	unsigned long addr)
 {
 	struct kmem_cache_node *n = get_node(s, page_to_nid(page));
@@ -1095,7 +1161,7 @@ next_object:
 		goto fail;
 	}
 
-	if (on_freelist(s, page, object)) {
+	if (on_freelist(s, page, object, cpu_freelist)) {
 		object_err(s, page, object, "Object already free");
 		goto fail;
 	}
@@ -1144,6 +1210,53 @@ fail:
 	return 0;
 }
 
+#ifdef CONFIG_SLUB_DEBUG_FASTPATH
+static noinline int free_debug_processing_fastpath(
+	struct kmem_cache *s,
+	struct kmem_cache_cpu *c,
+	struct page *page,
+	void *head, void *tail, int bulk_cnt,
+	unsigned long tid,
+	unsigned long addr)
+{
+	int ret = 0;
+	unsigned long flags;
+
+	preempt_disable();
+	local_irq_save(flags);
+
+	/*
+	 * We've now disabled preemption and IRQs but we still need
+	 * to check that this is the right CPU
+	 */
+	if (!this_cpu_cmpxchg_double(s->cpu_slab->freelist, s->cpu_slab->tid,
+				c->freelist, tid,
+				c->freelist, tid))
+		goto out;
+
+
+	ret = free_debug_processing(s, page, head, tail, bulk_cnt,
+				c->freelist, addr);
+
+out:
+	local_irq_restore(flags);
+	preempt_enable();
+	return ret;
+}
+#else
+static inline int free_debug_processing_fastpath(
+	struct kmem_cache *s,
+	struct kmem_cache_cpu *c,
+	struct page *page,
+	void *head, void *tail, int bulk_cnt,
+	unsigned long tid,
+	unsigned long addr)
+{
+	return 1;
+}
+#endif
+
+
 static int __init setup_slub_debug(char *str)
 {
 	slub_debug = DEBUG_DEFAULT_FLAGS;
@@ -1234,8 +1347,8 @@ static inline int alloc_debug_processing(struct kmem_cache *s,
 
 static inline int free_debug_processing(
 	struct kmem_cache *s, struct page *page,
-	void *head, void *tail, int bulk_cnt,
-	unsigned long addr, unsigned long *flags) { return NULL; }
+	void *head, void *tail, int bulk_cnt, void *cpu_freelist,
+	unsigned long addr, unsigned long *flags) { return 0; }
 
 static inline int slab_pad_check(struct kmem_cache *s, struct page *page)
 			{ return 1; }
@@ -2352,7 +2465,8 @@ static inline void *get_freelist(struct kmem_cache *s, struct page *page)
  * already disabled (which is the case for bulk allocation).
  */
 static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
-			  unsigned long addr, struct kmem_cache_cpu *c)
+			  unsigned long addr, struct kmem_cache_cpu *c,
+			  bool debug_fail)
 {
 	void *freelist;
 	struct page *page;
@@ -2382,7 +2496,7 @@ redo:
 	 * PFMEMALLOC but right now, we are losing the pfmemalloc
 	 * information when the page leaves the per-cpu allocator
 	 */
-	if (unlikely(!pfmemalloc_match(page, gfpflags))) {
+	if (unlikely(debug_fail || !pfmemalloc_match(page, gfpflags))) {
 		deactivate_slab(s, page, c->freelist);
 		c->page = NULL;
 		c->freelist = NULL;
@@ -2433,7 +2547,9 @@ new_slab:
 	}
 
 	page = c->page;
-	if (likely(!kmem_cache_debug(s) && pfmemalloc_match(page, gfpflags)))
+
+	if (!IS_ENABLED(CONFIG_SLUB_DEBUG_FASTPATH) &&
+	    likely(!kmem_cache_debug(s) && pfmemalloc_match(page, gfpflags)))
 		goto load_freelist;
 
 	/* Only entered in the debug case */
@@ -2441,6 +2557,10 @@ new_slab:
 			!alloc_debug_processing(s, page, freelist, addr))
 		goto new_slab;	/* Slab failed checks. Next slab needed */
 
+	if (IS_ENABLED(CONFIG_SLUB_DEBUG_FASTPATH) &&
+	    likely(pfmemalloc_match(page, gfpflags)))
+		goto load_freelist;
+
 	deactivate_slab(s, page, get_freepointer(s, freelist));
 	c->page = NULL;
 	c->freelist = NULL;
@@ -2452,7 +2572,8 @@ new_slab:
  * cpu changes by refetching the per cpu area pointer.
  */
 static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
-			  unsigned long addr, struct kmem_cache_cpu *c)
+			  unsigned long addr, struct kmem_cache_cpu *c,
+			  bool debug_fail)
 {
 	void *p;
 	unsigned long flags;
@@ -2467,7 +2588,7 @@ static void *__slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 	c = this_cpu_ptr(s->cpu_slab);
 #endif
 
-	p = ___slab_alloc(s, gfpflags, node, addr, c);
+	p = ___slab_alloc(s, gfpflags, node, addr, c, debug_fail);
 	local_irq_restore(flags);
 	return p;
 }
@@ -2489,6 +2610,7 @@ static __always_inline void *slab_alloc_node(struct kmem_cache *s,
 	struct kmem_cache_cpu *c;
 	struct page *page;
 	unsigned long tid;
+	bool debug_fail = false;
 
 	s = slab_pre_alloc_hook(s, gfpflags);
 	if (!s)
@@ -2529,12 +2651,18 @@ redo:
 
 	object = c->freelist;
 	page = c->page;
-	if (unlikely(!object || !node_match(page, node))) {
-		object = __slab_alloc(s, gfpflags, node, addr, c);
+	if (unlikely(debug_fail || !object || !node_match(page, node))) {
+		object = __slab_alloc(s, gfpflags, node, addr, c, debug_fail);
 		stat(s, ALLOC_SLOWPATH);
 	} else {
 		void *next_object = get_freepointer_safe(s, object);
 
+
+		if (kmem_cache_debug(s) && !alloc_debug_processing_fastpath(s, c, page, object, tid, addr)) {
+			debug_fail = true;
+			goto redo;
+		}
+
 		/*
 		 * The cmpxchg will only match if there was no additional
 		 * operation and if we are on the right processor.
@@ -2557,6 +2685,7 @@ redo:
 			note_cmpxchg_failure("slab_alloc", s, tid);
 			goto redo;
 		}
+
 		prefetch_freepointer(s, next_object);
 		stat(s, ALLOC_FASTPATH);
 	}
@@ -2649,9 +2778,10 @@ static void __slab_free(struct kmem_cache *s, struct page *page,
 	stat(s, FREE_SLOWPATH);
 
 	if (kmem_cache_debug(s) &&
-	    !free_debug_processing(s, page, head, tail, cnt, addr))
+	    !free_debug_processing(s, page, head, tail, cnt, NULL, addr))
 		return;
 
+
 	do {
 		if (unlikely(n)) {
 			spin_unlock_irqrestore(&n->list_lock, flags);
@@ -2790,6 +2920,10 @@ redo:
 	barrier();
 
 	if (likely(page == c->page)) {
+		if (kmem_cache_debug(s) &&
+		    !free_debug_processing_fastpath(s, c, page, head, tail_obj, cnt, tid, addr))
+			return;
+
 		set_freepointer(s, tail_obj, c->freelist);
 
 		if (unlikely(!this_cpu_cmpxchg_double(
@@ -2938,7 +3072,7 @@ int kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
 			 * of re-populating per CPU c->freelist
 			 */
 			p[i] = ___slab_alloc(s, flags, NUMA_NO_NODE,
-					    _RET_IP_, c);
+					    _RET_IP_, c, false);
 			if (unlikely(!p[i]))
 				goto error;
 
@@ -4094,7 +4228,7 @@ static int validate_slab(struct kmem_cache *s, struct page *page,
 	void *addr = page_address(page);
 
 	if (!check_slab(s, page) ||
-			!on_freelist(s, page, NULL))
+			!on_freelist(s, page, NULL, NULL))
 		return 0;
 
 	/* Now we know that a valid freelist exists */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] slub: Add option to skip consistency checks
  2016-01-26  1:15 [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks Laura Abbott
  2016-01-26  1:15 ` [RFC][PATCH 1/3] slub: Drop lock at the end of free_debug_processing Laura Abbott
  2016-01-26  1:15 ` [RFC][PATCH 2/3] slub: Don't limit debugging to slow paths Laura Abbott
@ 2016-01-26  1:15 ` Laura Abbott
  2016-01-26 15:00   ` Christoph Lameter
  2016-01-26  7:03 ` [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks Joonsoo Kim
  2016-01-26 14:57 ` Christoph Lameter
  4 siblings, 1 reply; 17+ messages in thread
From: Laura Abbott @ 2016-01-26  1:15 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton
  Cc: Laura Abbott, linux-mm, linux-kernel, kernel-hardening, Kees Cook


By default, SLUB debugging performs checks to ensure consistency.
These checks, while useful, are expensive and hurt allocation speed.
Features such as poisoning and tracing can stand alone without
any checks. Add a slab flag to skip these checks.

Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
---
 include/linux/slab.h |  1 +
 mm/slub.c            | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 3627d5c..789f6a3 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -23,6 +23,7 @@
 #define SLAB_DEBUG_FREE		0x00000100UL	/* DEBUG: Perform (expensive) checks on free */
 #define SLAB_RED_ZONE		0x00000400UL	/* DEBUG: Red zone objs in a cache */
 #define SLAB_POISON		0x00000800UL	/* DEBUG: Poison objects */
+#define SLAB_NO_CHECKS		0x00001000UL	/* DEBUG: Skip all consistency checks*/
 #define SLAB_HWCACHE_ALIGN	0x00002000UL	/* Align objs on cache lines */
 #define SLAB_CACHE_DMA		0x00004000UL	/* Use GFP_DMA memory */
 #define SLAB_STORE_USER		0x00010000UL	/* DEBUG: Store the last owner for bug hunting */
diff --git a/mm/slub.c b/mm/slub.c
index a47e615..078f088 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -230,6 +230,9 @@ static inline int check_valid_pointer(struct kmem_cache *s,
 {
 	void *base;
 
+	if (s->flags & SLAB_NO_CHECKS)
+		return 1;
+
 	if (!object)
 		return 1;
 
@@ -818,6 +821,9 @@ static int check_object(struct kmem_cache *s, struct page *page,
 	u8 *p = object;
 	u8 *endobject = object + s->object_size;
 
+	if (s->flags & SLAB_NO_CHECKS)
+		return 1;
+
 	if (s->flags & SLAB_RED_ZONE) {
 		if (!check_bytes_and_report(s, page, object, "Redzone",
 			endobject, val, s->inuse - s->object_size))
@@ -873,6 +879,9 @@ static int check_slab(struct kmem_cache *s, struct page *page)
 
 	VM_BUG_ON(!irqs_disabled());
 
+	if (s->flags & SLAB_NO_CHECKS)
+		return 1;
+
 	if (!PageSlab(page)) {
 		slab_err(s, page, "Not a valid slab page");
 		return 0;
@@ -906,6 +915,9 @@ static int on_freelist(struct kmem_cache *s, struct page *page, void *search,
 	void *object = NULL;
 	int max_objects;
 
+	if (s->flags & SLAB_NO_CHECKS)
+		return 0;
+
 	fp = page->freelist;
 	while (fp && nr <= page->objects) {
 		if (fp == search)
@@ -1303,6 +1315,8 @@ static int __init setup_slub_debug(char *str)
 		case 'a':
 			slub_debug |= SLAB_FAILSLAB;
 			break;
+		case 'q':
+			slub_debug |= SLAB_NO_CHECKS;
 		case 'o':
 			/*
 			 * Avoid enabling debugging on caches if its minimum
@@ -5032,6 +5046,20 @@ static ssize_t poison_store(struct kmem_cache *s,
 }
 SLAB_ATTR(poison);
 
+static ssize_t no_checks_show(struct kmem_cache *s, char *buf)
+{
+	return sprintf(buf, "%d\n", !!(s->flags & SLAB_NO_CHECKS));
+}
+
+static ssize_t no_checks_store(struct kmem_cache *s,
+				const char *buf, size_t length)
+{
+	return -EINVAL;
+}
+SLAB_ATTR(no_checks);
+
+
+
 static ssize_t store_user_show(struct kmem_cache *s, char *buf)
 {
 	return sprintf(buf, "%d\n", !!(s->flags & SLAB_STORE_USER));
@@ -5257,6 +5285,7 @@ static struct attribute *slab_attrs[] = {
 	&trace_attr.attr,
 	&red_zone_attr.attr,
 	&poison_attr.attr,
+	&no_checks_attr.attr,
 	&store_user_attr.attr,
 	&validate_attr.attr,
 	&alloc_calls_attr.attr,
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-01-26  1:15 [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks Laura Abbott
                   ` (2 preceding siblings ...)
  2016-01-26  1:15 ` [PATCH 3/3] slub: Add option to skip consistency checks Laura Abbott
@ 2016-01-26  7:03 ` Joonsoo Kim
  2016-01-26 15:01   ` Christoph Lameter
  2016-02-03 18:46   ` Laura Abbott
  2016-01-26 14:57 ` Christoph Lameter
  4 siblings, 2 replies; 17+ messages in thread
From: Joonsoo Kim @ 2016-01-26  7:03 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Andrew Morton,
	linux-mm, linux-kernel, kernel-hardening, Kees Cook

On Mon, Jan 25, 2016 at 05:15:10PM -0800, Laura Abbott wrote:
> Hi,
> 
> Based on the discussion from the series to add slab sanitization
> (lkml.kernel.org/g/<1450755641-7856-1-git-send-email-laura@labbott.name>)
> the existing SLAB_POISON mechanism already covers similar behavior.
> The performance of SLAB_POISON isn't very good. With hackbench -g 20 -l 1000
> on QEMU with one cpu:

I haven't followed that discussion, but I think that reusing
SLAB_POISON for slab sanitization needs more changes. I assume that
both completeness and performance matter for slab sanitization.

1) SLAB_POISON isn't applied to kmem_caches which have a
constructor or the SLAB_DESTROY_BY_RCU flag. For debugging it doesn't
need to be applied there, but for slab sanitization it is better to
apply it to all caches.

2) SLAB_POISON makes the object size bigger, so natural alignment is
broken. For example, the kmalloc(256) cache's object size is 256 in the
normal case but becomes 264 when SLAB_POISON is enabled. This causes
memory waste.
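
To make the size effect concrete, here is a rough sketch of the relevant
part of SLUB's size calculation (simplified from calculate_sizes(); treat
the condition and names as an illustration, not the exact upstream code):

static unsigned long poisoned_footprint(struct kmem_cache *s)
{
	unsigned long size = s->object_size;	/* e.g. 256 for kmalloc-256 */

	if ((s->flags & (SLAB_POISON | SLAB_DESTROY_BY_RCU)) || s->ctor) {
		/*
		 * The free pointer normally reuses the first word of a free
		 * object.  Poisoning (like a ctor or RCU) must keep the whole
		 * object intact, so the pointer is relocated past the object.
		 */
		size += sizeof(void *);		/* 256 -> 264 on 64-bit */
	}

	return ALIGN(size, s->align);		/* 264 already meets word alignment */
}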

In fact, I'd prefer not to reuse SLAB_POISON. It would make things
simpler. But it's up to Christoph.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 2/3] slub: Don't limit debugging to slow paths
  2016-01-26  1:15 ` [RFC][PATCH 2/3] slub: Don't limit debugging to slow paths Laura Abbott
@ 2016-01-26  8:48   ` Paul Bolle
  0 siblings, 0 replies; 17+ messages in thread
From: Paul Bolle @ 2016-01-26  8:48 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, linux-kernel, kernel-hardening,
	Kees Cook

On ma, 2016-01-25 at 17:15 -0800, Laura Abbott wrote:
> --- a/init/Kconfig
> +++ b/init/Kconfig
 
> +config SLUB_DEBUG_FASTPATH
> +	bool "Allow SLUB debugging to utilize the fastpath"
> +	depends on SLUB_DEBUG
> +	help
> +	  SLUB_DEBUG forces all allocations to utilize the slow path which
> +	  is a performance penalty. Turning on this option lets the debugging
> +	  use the fast path. This helps the performance when debugging
> +	  features are turned on. If you aren't planning on utilizing any
> +	  of the SLUB_DEBUG features, you should say N here.
> +
> +	  If unsure, say N

> --- a/mm/slub.c
> +++ b/mm/slub.c

> +#ifdef SLUB_DEBUG_FASTPATH

I have no clue what your patch does, but I could spot this should
probably be
	#ifdef CONFIG_SLUB_DEBUG_FASTPATH

> +static noinline int alloc_debug_processing_fastpath(struct kmem_cache
> *s,
> +					struct kmem_cache_cpu *c,
> +					struct page *page,
> +					void *object, unsigned long
> tid,
> +					unsigned long addr)
> +{
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	preempt_disable();
> +	local_irq_save(flags);
> +
> +	/*
> +	 * We've now disabled preemption and IRQs but we still need
> +	 * to check that this is the right CPU
> +	 */
> +	if (!this_cpu_cmpxchg_double(s->cpu_slab->freelist, s
> ->cpu_slab->tid,
> +				c->freelist, tid,
> +				c->freelist, tid))
> +		goto out;
> +
> +	ret = alloc_debug_processing(s, page, object, addr);
> +
> +out:
> +	local_irq_restore(flags);
> +	preempt_enable();
> +	return ret;
> +}
> +#else
> +static noinline int alloc_debug_processing_fastpath(struct kmem_cache
> *s,
> +					struct kmem_cache_cpu *c,
> +					struct page *page,
> +					void *object, unsigned long
> tid,
> +					unsigned long addr)
> +{
> +	return 1;
> +}
> +#endif

Thanks,


Paul Bolle

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-01-26  1:15 [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks Laura Abbott
                   ` (3 preceding siblings ...)
  2016-01-26  7:03 ` [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks Joonsoo Kim
@ 2016-01-26 14:57 ` Christoph Lameter
  4 siblings, 0 replies; 17+ messages in thread
From: Christoph Lameter @ 2016-01-26 14:57 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, linux-kernel, kernel-hardening, Kees Cook

On Mon, 25 Jan 2016, Laura Abbott wrote:

> slub_debug=-:  7.437
> slub_debug=-:   7.932

So that's an almost 10% performance regression when the feature is not used.
The reason that poisoning is on the slow path is that it impacts
performance. Please focus on optimizing the debug path without impacting the
fast path.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] slub: Add option to skip consistency checks
  2016-01-26  1:15 ` [PATCH 3/3] slub: Add option to skip consistency checks Laura Abbott
@ 2016-01-26 15:00   ` Christoph Lameter
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Lameter @ 2016-01-26 15:00 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, linux-kernel, kernel-hardening, Kees Cook

On Mon, 25 Jan 2016, Laura Abbott wrote:

> SLUB debugging by default does checks to ensure consistency.
> These checks, while useful, are expensive for allocation speed.
> Features such as poisoning and tracing can stand alone without
> any checks. Add a slab flag to skip these checks.

I would suggest renaming SLAB_DEBUG_FREE to SLAB_CONSISTENCY_CHECKS
instead. I think the flag is already used that way in a couple of places.

Flags generally enable stuff. Disabling what is enabled by others is
something we want to avoid for simplicity's sake.
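
Roughly what I have in mind (a sketch only; 0x00000100UL is SLAB_DEBUG_FREE's
current value and the helper name is just for illustration):

#define SLAB_CONSISTENCY_CHECKS	0x00000100UL	/* was: SLAB_DEBUG_FREE */

static inline bool consistency_checks_wanted(struct kmem_cache *s)
{
	return s->flags & SLAB_CONSISTENCY_CHECKS;
}

The early returns added in this patch would then test for the absence of
this flag rather than for the presence of a new SLAB_NO_CHECKS flag.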

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-01-26  7:03 ` [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks Joonsoo Kim
@ 2016-01-26 15:01   ` Christoph Lameter
  2016-01-26 15:21     ` Joonsoo Kim
  2016-02-03 18:46   ` Laura Abbott
  1 sibling, 1 reply; 17+ messages in thread
From: Christoph Lameter @ 2016-01-26 15:01 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Laura Abbott, Pekka Enberg, David Rientjes, Andrew Morton,
	linux-mm, linux-kernel, kernel-hardening, Kees Cook

On Tue, 26 Jan 2016, Joonsoo Kim wrote:

> I doesn't follow up that discussion, but, I think that reusing
> SLAB_POISON for slab sanitization needs more changes. I assume that
> completeness and performance is matter for slab sanitization.
>
> 1) SLAB_POISON isn't applied to specific kmem_cache which has
> constructor or SLAB_DESTROY_BY_RCU flag. For debug, it's not necessary
> to be applied, but, for slab sanitization, it is better to apply it to
> all caches.

Those slabs can legitimately be accessed after the objects have been freed.
You cannot sanitize or poison them.
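
For reference, the lookup pattern such caches rely on looks roughly like
this (a sketch following the SLAB_DESTROY_BY_RCU description in
include/linux/slab.h; lookup_in_hash(), put_obj(), struct my_obj and its
fields are made up for illustration):

static struct my_obj *rcu_lookup(unsigned long key)
{
	struct my_obj *obj;

	rcu_read_lock();
	obj = lookup_in_hash(key);		/* may race with free + realloc */
	if (obj && !atomic_inc_not_zero(&obj->refcount))
		obj = NULL;			/* touched the object after its free */
	if (obj && obj->key != key) {		/* object was reused for another key */
		put_obj(obj);
		obj = NULL;
	}
	rcu_read_unlock();
	return obj;
}

Only the backing pages are RCU-delayed; the object itself may be freed and
immediately reallocated while such a reader still holds a pointer to it, so
writing a poison pattern at free time would corrupt exactly these reads.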

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-01-26 15:01   ` Christoph Lameter
@ 2016-01-26 15:21     ` Joonsoo Kim
  0 siblings, 0 replies; 17+ messages in thread
From: Joonsoo Kim @ 2016-01-26 15:21 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Joonsoo Kim, Laura Abbott, Pekka Enberg, David Rientjes,
	Andrew Morton, Linux Memory Management List, LKML,
	kernel-hardening, Kees Cook

2016-01-27 0:01 GMT+09:00 Christoph Lameter <cl@linux.com>:
> On Tue, 26 Jan 2016, Joonsoo Kim wrote:
>
>> I doesn't follow up that discussion, but, I think that reusing
>> SLAB_POISON for slab sanitization needs more changes. I assume that
>> completeness and performance is matter for slab sanitization.
>>
>> 1) SLAB_POISON isn't applied to specific kmem_cache which has
>> constructor or SLAB_DESTROY_BY_RCU flag. For debug, it's not necessary
>> to be applied, but, for slab sanitization, it is better to apply it to
>> all caches.
>
> Those slabs can be legitimately accessed after the objects were freed. You
> cannot sanitize nor poison.

Oops... you are right. I misunderstood what SLAB_DESTROY_BY_RCU is.
Now it's clear to me.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 1/3] slub: Drop lock at the end of free_debug_processing
  2016-01-26  1:15 ` [RFC][PATCH 1/3] slub: Drop lock at the end of free_debug_processing Laura Abbott
@ 2016-01-26 16:19   ` Christoph Lameter
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Lameter @ 2016-01-26 16:19 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, linux-kernel, kernel-hardening, Kees Cook

On Mon, 25 Jan 2016, Laura Abbott wrote:

> Currently, free_debug_processing has a comment "Keep node_lock to preserve
> integrity until the object is actually freed". In actuallity,
> the lock is dropped immediately in __slab_free. Rather than wait until
> __slab_free and potentially throw off the unlikely marking, just drop
> the lock in __slab_free. This also lets free_debug_processing take
> its own copy of the spinlock flags rather than trying to share the ones
> from __slab_free. Since there is no use for the node afterwards, change
> the return type of free_debug_processing to return an int like
> alloc_debug_processing.

Acked-by: Christoph Lameter <cl@linux.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-01-26  7:03 ` [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks Joonsoo Kim
  2016-01-26 15:01   ` Christoph Lameter
@ 2016-02-03 18:46   ` Laura Abbott
  2016-02-03 21:06     ` Kees Cook
  1 sibling, 1 reply; 17+ messages in thread
From: Laura Abbott @ 2016-02-03 18:46 UTC (permalink / raw)
  To: Joonsoo Kim, Laura Abbott
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Andrew Morton,
	linux-mm, linux-kernel, kernel-hardening, Kees Cook

On 01/25/2016 11:03 PM, Joonsoo Kim wrote:
> On Mon, Jan 25, 2016 at 05:15:10PM -0800, Laura Abbott wrote:
>> Hi,
>>
>> Based on the discussion from the series to add slab sanitization
>> (lkml.kernel.org/g/<1450755641-7856-1-git-send-email-laura@labbott.name>)
>> the existing SLAB_POISON mechanism already covers similar behavior.
>> The performance of SLAB_POISON isn't very good. With hackbench -g 20 -l 1000
>> on QEMU with one cpu:
>
> I doesn't follow up that discussion, but, I think that reusing
> SLAB_POISON for slab sanitization needs more changes. I assume that
> completeness and performance is matter for slab sanitization.
>
> 1) SLAB_POISON isn't applied to specific kmem_cache which has
> constructor or SLAB_DESTROY_BY_RCU flag. For debug, it's not necessary
> to be applied, but, for slab sanitization, it is better to apply it to
> all caches.

The grsecurity patches get around this by calling the constructor again
after poisoning. It could be worth investigating doing that as well,
although my focus was on the cases without a constructor.
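
Roughly (a sketch of the idea only, not the actual grsecurity code;
POISON_FREE stands in for whatever poison value the free path uses):

static void sanitize_object(struct kmem_cache *s, void *object)
{
	memset(object, POISON_FREE, s->object_size);	/* wipe stale data */
	if (s->ctor)
		s->ctor(object);	/* ctor caches expect an initialized object */
}
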
>
> 2) SLAB_POISON makes object size bigger so natural alignment will be
> broken. For example, kmalloc(256) cache's size is 256 in normal
> case but it would be 264 when SLAB_POISON is enabled. This causes
> memory waste.

The grsecurity patches also bump the size up to put the free pointer
outside the object. For sanitization purposes it is cleaner to have
no pointers left in the object after free.

>
> In fact, I'd prefer not reusing SLAB_POISON. It would make thing
> simpler. But, it's up to Christoph.
>
> Thanks.
>

It basically looks like trying to poison on the fast path at all
will have a negative impact even when the feature is turned off.
Christoph has indicated this is not acceptable, so we are forced
to limit it to the slow path only if we want runtime enablement.
If we're limited to the slow path only, we might as well work
with SLAB_POISON and make it faster. We can reevaluate if it turns
out the poisoning isn't fast enough to be useful.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-02-03 18:46   ` Laura Abbott
@ 2016-02-03 21:06     ` Kees Cook
  2016-02-03 21:35       ` Laura Abbott
  0 siblings, 1 reply; 17+ messages in thread
From: Kees Cook @ 2016-02-03 21:06 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Joonsoo Kim, Laura Abbott, Christoph Lameter, Pekka Enberg,
	David Rientjes, Andrew Morton, Linux-MM, LKML, kernel-hardening

On Wed, Feb 3, 2016 at 10:46 AM, Laura Abbott <labbott@redhat.com> wrote:
> On 01/25/2016 11:03 PM, Joonsoo Kim wrote:
>>
>> On Mon, Jan 25, 2016 at 05:15:10PM -0800, Laura Abbott wrote:
>>>
>>> Hi,
>>>
>>> Based on the discussion from the series to add slab sanitization
>>> (lkml.kernel.org/g/<1450755641-7856-1-git-send-email-laura@labbott.name>)
>>> the existing SLAB_POISON mechanism already covers similar behavior.
>>> The performance of SLAB_POISON isn't very good. With hackbench -g 20 -l
>>> 1000
>>> on QEMU with one cpu:
>>
>>
>> I doesn't follow up that discussion, but, I think that reusing
>> SLAB_POISON for slab sanitization needs more changes. I assume that
>> completeness and performance is matter for slab sanitization.
>>
>> 1) SLAB_POISON isn't applied to specific kmem_cache which has
>> constructor or SLAB_DESTROY_BY_RCU flag. For debug, it's not necessary
>> to be applied, but, for slab sanitization, it is better to apply it to
>> all caches.
>
>
> The grsecurity patches get around this by calling the constructor again
> after poisoning. It could be worth investigating doing that as well
> although my focus was on the cases without the constructor.
>>
>>
>> 2) SLAB_POISON makes object size bigger so natural alignment will be
>> broken. For example, kmalloc(256) cache's size is 256 in normal
>> case but it would be 264 when SLAB_POISON is enabled. This causes
>> memory waste.
>
>
> The grsecurity patches also bump the size up to put the free pointer
> outside the object. For sanitization purposes it is cleaner to have
> no pointers in the object after free
>
>>
>> In fact, I'd prefer not reusing SLAB_POISON. It would make thing
>> simpler. But, it's up to Christoph.
>>
>> Thanks.
>>
>
> It basically looks like trying to poison on the fast path at all
> will have a negative impact even with the feature is turned off.
> Christoph has indicated this is not acceptable so we are forced
> to limit it to the slow path only if we want runtime enablement.

Is it possible to have both? I.e., fast path via a CONFIG option, and slow
path via runtime options?

> If we're limited to the slow path only, we might as well work
> with SLAB_POISON to make it faster. We can reevaluate if it turns
> out the poisoning isn't fast enough to be useful.

And since I'm new to this area: I only know of fast/slow paths in the
syscall sense. What happens in the allocation/free fast and slow paths that
makes them fast or slow?

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-02-03 21:06     ` Kees Cook
@ 2016-02-03 21:35       ` Laura Abbott
  2016-02-03 23:02         ` Christoph Lameter
  0 siblings, 1 reply; 17+ messages in thread
From: Laura Abbott @ 2016-02-03 21:35 UTC (permalink / raw)
  To: Kees Cook
  Cc: Joonsoo Kim, Laura Abbott, Christoph Lameter, Pekka Enberg,
	David Rientjes, Andrew Morton, Linux-MM, LKML, kernel-hardening

On 02/03/2016 01:06 PM, Kees Cook wrote:
> On Wed, Feb 3, 2016 at 10:46 AM, Laura Abbott <labbott@redhat.com> wrote:
>> On 01/25/2016 11:03 PM, Joonsoo Kim wrote:
>>>
>>> On Mon, Jan 25, 2016 at 05:15:10PM -0800, Laura Abbott wrote:
>>>>
>>>> Hi,
>>>>
>>>> Based on the discussion from the series to add slab sanitization
>>>> (lkml.kernel.org/g/<1450755641-7856-1-git-send-email-laura@labbott.name>)
>>>> the existing SLAB_POISON mechanism already covers similar behavior.
>>>> The performance of SLAB_POISON isn't very good. With hackbench -g 20 -l
>>>> 1000
>>>> on QEMU with one cpu:
>>>
>>>
>>> I doesn't follow up that discussion, but, I think that reusing
>>> SLAB_POISON for slab sanitization needs more changes. I assume that
>>> completeness and performance is matter for slab sanitization.
>>>
>>> 1) SLAB_POISON isn't applied to specific kmem_cache which has
>>> constructor or SLAB_DESTROY_BY_RCU flag. For debug, it's not necessary
>>> to be applied, but, for slab sanitization, it is better to apply it to
>>> all caches.
>>
>>
>> The grsecurity patches get around this by calling the constructor again
>> after poisoning. It could be worth investigating doing that as well
>> although my focus was on the cases without the constructor.
>>>
>>>
>>> 2) SLAB_POISON makes object size bigger so natural alignment will be
>>> broken. For example, kmalloc(256) cache's size is 256 in normal
>>> case but it would be 264 when SLAB_POISON is enabled. This causes
>>> memory waste.
>>
>>
>> The grsecurity patches also bump the size up to put the free pointer
>> outside the object. For sanitization purposes it is cleaner to have
>> no pointers in the object after free
>>
>>>
>>> In fact, I'd prefer not reusing SLAB_POISON. It would make thing
>>> simpler. But, it's up to Christoph.
>>>
>>> Thanks.
>>>
>>
>> It basically looks like trying to poison on the fast path at all
>> will have a negative impact even with the feature is turned off.
>> Christoph has indicated this is not acceptable so we are forced
>> to limit it to the slow path only if we want runtime enablement.
>
> Is it possible to have both? i.e fast path via CONFIG, and slow path
> via runtime options?
>

That's what this patch series has: a Kconfig option to turn fast path
debugging on and off. When the Kconfig option is off, it reverts to the
existing behavior and there is no fastpath penalty.
  
>> If we're limited to the slow path only, we might as well work
>> with SLAB_POISON to make it faster. We can reevaluate if it turns
>> out the poisoning isn't fast enough to be useful.
>
> And since I'm new to this area, I know of fast/slow path in the
> syscall sense. What happens in the allocation/free fast/slow path that
> makes it fast or slow?

The fast path uses the per-CPU caches. No locks are taken and there
is no IRQ disabling. For concurrency protection, this comment (from
slab_alloc_node()) explains it best:

/*
  * The cmpxchg will only match if there was no additional
  * operation and if we are on the right processor.
  *
  * The cmpxchg does the following atomically (without lock
  * semantics!)
  * 1. Relocate first pointer to the current per cpu area.
  * 2. Verify that tid and freelist have not been changed
  * 3. If they were not changed replace tid and freelist
  *
  * Since this is without lock semantics the protection is only
  * against code executing on this cpu *not* from access by
  * other cpus.
  */

In the slow path, at a minimum, IRQs have to be disabled and locks taken.
The debug options prevent the per-CPU caches from ever being loaded, so
allocation always falls back to the slow path.
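
Condensed from slab_alloc_node(), the allocation fast path looks roughly
like this (simplified sketch; node matching details, stats and error
handling are trimmed):

redo:
	do {	/* pin a consistent (cpu, tid) pair without disabling preemption */
		tid = this_cpu_read(s->cpu_slab->tid);
		c = raw_cpu_ptr(s->cpu_slab);
	} while (IS_ENABLED(CONFIG_PREEMPT) && unlikely(tid != READ_ONCE(c->tid)));

	object = c->freelist;
	if (unlikely(!object || !node_match(c->page, node)))
		return __slab_alloc(s, gfpflags, node, addr, c);	/* slow path */

	next_object = get_freepointer_safe(s, object);

	/*
	 * Lockless commit: only succeeds if freelist and tid still hold the
	 * values read above, i.e. no other alloc/free ran on this cpu (and we
	 * are still on that cpu).  No locks, no IRQ disabling.
	 */
	if (unlikely(!this_cpu_cmpxchg_double(s->cpu_slab->freelist, s->cpu_slab->tid,
					      object, tid,
					      next_object, next_tid(tid))))
		goto redo;

	return object;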

>
> -Kees
>

Thanks,
Laura

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-02-03 21:35       ` Laura Abbott
@ 2016-02-03 23:02         ` Christoph Lameter
  2016-02-04  0:46           ` Laura Abbott
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Lameter @ 2016-02-03 23:02 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Kees Cook, Joonsoo Kim, Laura Abbott, Pekka Enberg,
	David Rientjes, Andrew Morton, Linux-MM, LKML, kernel-hardening

> The fast path uses the per cpu caches. No locks are taken and there
> is no IRQ disabling. For concurrency protection this comment
> explains it best:
>
> /*
>  * The cmpxchg will only match if there was no additional
>  * operation and if we are on the right processor.
>  *
>  * The cmpxchg does the following atomically (without lock
>  * semantics!)
>  * 1. Relocate first pointer to the current per cpu area.
>  * 2. Verify that tid and freelist have not been changed
>  * 3. If they were not changed replace tid and freelist
>  *
>  * Since this is without lock semantics the protection is only
>  * against code executing on this cpu *not* from access by
>  * other cpus.
>  */
>
> in the slow path, IRQs and locks have to be taken at the minimum.
> The debug options disable ever loading the per CPU caches so it
> always falls back to the slow path.

You could add the use of per-cpu lists to the slow paths as well in order
to increase performance. Then weave in the debugging options.

But the performance of the fast path is critical to the overall
performance of the kernel as a whole since this is a heavily used code
path for many subsystems.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-02-03 23:02         ` Christoph Lameter
@ 2016-02-04  0:46           ` Laura Abbott
  2016-02-04  3:23             ` Christoph Lameter
  0 siblings, 1 reply; 17+ messages in thread
From: Laura Abbott @ 2016-02-04  0:46 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Kees Cook, Joonsoo Kim, Laura Abbott, Pekka Enberg,
	David Rientjes, Andrew Morton, Linux-MM, LKML, kernel-hardening

On 02/03/2016 03:02 PM, Christoph Lameter wrote:
>> The fast path uses the per cpu caches. No locks are taken and there
>> is no IRQ disabling. For concurrency protection this comment
>> explains it best:
>>
>> /*
>>   * The cmpxchg will only match if there was no additional
>>   * operation and if we are on the right processor.
>>   *
>>   * The cmpxchg does the following atomically (without lock
>>   * semantics!)
>>   * 1. Relocate first pointer to the current per cpu area.
>>   * 2. Verify that tid and freelist have not been changed
>>   * 3. If they were not changed replace tid and freelist
>>   *
>>   * Since this is without lock semantics the protection is only
>>   * against code executing on this cpu *not* from access by
>>   * other cpus.
>>   */
>>
>> in the slow path, IRQs and locks have to be taken at the minimum.
>> The debug options disable ever loading the per CPU caches so it
>> always falls back to the slow path.
>
> You could add the use of per cpu lists to the slow paths as well in
> order
> to increase performance. Then weave in the debugging options.
>

How would that work? The use of the per-CPU caches is what defines the
fast path, so I'm not sure how to add them to the slow path without
affecting the fast path.
  
> But the performance of the fast path is critical to the overall
> performance of the kernel as a whole since this is a heavily used code
> path for many subsystems.
>

I also notice that __CMPXCHG_DOUBLE is turned off when the debug
options are turned on. I don't see any details about why. What's
the reason for turning it off when the debug options are enabled?

Thanks,
Laura  

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC][PATCH 0/3] Speed up SLUB poisoning + disable checks
  2016-02-04  0:46           ` Laura Abbott
@ 2016-02-04  3:23             ` Christoph Lameter
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Lameter @ 2016-02-04  3:23 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Kees Cook, Joonsoo Kim, Laura Abbott, Pekka Enberg,
	David Rientjes, Andrew Morton, Linux-MM, LKML, kernel-hardening

On Wed, 3 Feb 2016, Laura Abbott wrote:

> I also notice that __CMPXCHG_DOUBLE is turned off when the debug
> options are turned on. I don't see any details about why. What's
> the reason for turning it off when the debug options are enabled?

Because operations on the object need to be locked out while the debug
code is running. Otherwise, concurrent operations from other processors
could lead to weird object states. The object needs to be stable for the
debug checks. Poisoning and the related checks need that; otherwise you
will get sporadic false positives.
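
The debug path gets that stability from the slow path locking; stripped
down, it is essentially what free_debug_processing() already does:

	spin_lock_irqsave(&n->list_lock, flags);  /* no partial-list manipulation */
	slab_lock(page);                          /* no concurrent alloc/free on this page */

	/* the object and the page freelist cannot change under these checks */
	ok = check_object(s, page, object, SLUB_RED_ACTIVE);

	slab_unlock(page);
	spin_unlock_irqrestore(&n->list_lock, flags);

A lockless cmpxchg_double update of page->freelist would bypass slab_lock
entirely, which is why __CMPXCHG_DOUBLE stays off for debug caches and the
page is updated under the lock instead.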

^ permalink raw reply	[flat|nested] 17+ messages in thread
