All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] debugobjects: Reduce global pool_lock contention
@ 2019-05-20 14:14 Waiman Long
  2019-05-20 14:14 ` [PATCH 1/5] debugobjects: Add percpu free pools Waiman Long
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Waiman Long @ 2019-05-20 14:14 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Yang Shi, Joel Fernandes (Google),
	Qian Cai, Zhong Jiang, Waiman Long

Many large workloads require the kernel to acquire a lot of objects
and then free them when the work is done. When the debug objects code
is configured, this will cause a lot of debug objects to be allocated and
then free later on. For instance, after a kernel boot up and 3 parallel
kernel builds, the partial output of the debug objects stats file was:

  pool_free     :3082
  pool_min_free :498
  pool_used     :108488
  pool_max_used :170127
  on_free_list  :0
  objs_allocated:34954917
  objs_freed    :34844371

It can be seen that a lot of debug objects were allocated and freed
during those operations. All these debug object allocation and free
operations require grabbing the global pool_lock. On systems with many
CPUs, the contention on this single global lock can become one of the
bottlenecks.

This patchset tries to reduce the level of contention on this global
pool_lock by the following means:
 1) Add a percpu free object pool to serve as a cache so that object
    allocation and freeing can be done without acquiring pool_lock when
    is not empty or full.
 2) Batching up multiple operations in a single lock/unlock critical
    section to reduce the number of times the pool_lock is to be
    acquired.
 3) Make the actual freeing of the debug objects via the workqueue less
    aggressive to minimize the actual number of slab allocation and
    freeing calls.

In addition, this patchset also tries to move the printk() call out
of the raw db->lock critical section to reduce the lock hold time as
much as possible.

With or without these changes, the times to do a parallel kernel build
on a 2-socket 36-core 72-thread Haswell system were:

   Kernel         Elapsed time      System time
   ------         ------------      -----------
   Pre-patch        4m51.01s         83m11.53s
   Post-patch       4m47.45s         80m25.78s

The post-patch partial debug objects stats file for the same operations
was:

  pool_free     :5901
  pool_pcp_free :3742
  pool_min_free :1022
  pool_used     :104805
  pool_max_used :168081
  on_free_list  :0
  objs_allocated:5796864
  objs_freed    :5687182

Waiman Long (5):
  debugobjects: Add percpu free pools
  debugobjects: Percpu pool lookahead freeing/allocation
  debugobjects: Reduce number of pool_lock acquisitions in fill_pool()
  debugobjects: Less aggressive freeing of excess debug objects
  debugobjects: Move printk out of db lock critical sections

 lib/debugobjects.c | 308 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 252 insertions(+), 56 deletions(-)

-- 
2.18.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/5] debugobjects: Add percpu free pools
  2019-05-20 14:14 [PATCH 0/5] debugobjects: Reduce global pool_lock contention Waiman Long
@ 2019-05-20 14:14 ` Waiman Long
  2019-06-14 12:58   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2019-05-20 14:14 ` [PATCH 2/5] debugobjects: Percpu pool lookahead freeing/allocation Waiman Long
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2019-05-20 14:14 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Yang Shi, Joel Fernandes (Google),
	Qian Cai, Zhong Jiang, Waiman Long

When a multi-threaded workload does a lot of small memory object
allocation and deallocation, it may cause the allocation and freeing of
many debug objects. This will make the global pool_lock a bottleneck
in the performance of the workload.  Since interrupt is disabled in
acquiring the pool_lock, it may even cause hard lockup to happen.

To reduce contention of the global pool_lock, this patch adds a percpu
debug object free pool that can be used to buffer some of the debug
object allocation and freeing requests without acquiring the pool_lock.
Each CPU will now have a percpu free pool that can hold up to a maximum
of 64 debug objects. Allocation and freeing requests will go to the
percpu free pool first. If that fails, the pool_lock will be taken and
the global free pool will be used.

The presence or absence of obj_cache is used as a marker to see if the
percpu cache should be used.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 115 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 91 insertions(+), 24 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 55437fd5128b..8a235c9412dc 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -25,6 +25,7 @@
 
 #define ODEBUG_POOL_SIZE	1024
 #define ODEBUG_POOL_MIN_LEVEL	256
+#define ODEBUG_POOL_PERCPU_SIZE	64
 
 #define ODEBUG_CHUNK_SHIFT	PAGE_SHIFT
 #define ODEBUG_CHUNK_SIZE	(1 << ODEBUG_CHUNK_SHIFT)
@@ -35,6 +36,17 @@ struct debug_bucket {
 	raw_spinlock_t		lock;
 };
 
+/*
+ * Debug object percpu free list
+ * Access is protected by disabling irq
+ */
+struct debug_percpu_free {
+	struct hlist_head	free_objs;
+	int			obj_free;
+};
+
+static DEFINE_PER_CPU(struct debug_percpu_free, percpu_obj_pool);
+
 static struct debug_bucket	obj_hash[ODEBUG_HASH_SIZE];
 
 static struct debug_obj		obj_static_pool[ODEBUG_POOL_SIZE] __initdata;
@@ -44,13 +56,19 @@ static DEFINE_RAW_SPINLOCK(pool_lock);
 static HLIST_HEAD(obj_pool);
 static HLIST_HEAD(obj_to_free);
 
+/*
+ * Because of the presence of percpu free pools, obj_pool_free will
+ * under-count those in the percpu free pools. Similarly, obj_pool_used
+ * will over-count those in the percpu free pools. Adjustments will be
+ * made at debug_stats_show(). Both obj_pool_min_free and obj_pool_max_used
+ * can be off.
+ */
 static int			obj_pool_min_free = ODEBUG_POOL_SIZE;
 static int			obj_pool_free = ODEBUG_POOL_SIZE;
 static int			obj_pool_used;
 static int			obj_pool_max_used;
 /* The number of objs on the global free list */
 static int			obj_nr_tofree;
-static struct kmem_cache	*obj_cache;
 
 static int			debug_objects_maxchain __read_mostly;
 static int __maybe_unused	debug_objects_maxchecked __read_mostly;
@@ -63,6 +81,7 @@ static int			debug_objects_pool_size __read_mostly
 static int			debug_objects_pool_min_level __read_mostly
 				= ODEBUG_POOL_MIN_LEVEL;
 static struct debug_obj_descr	*descr_test  __read_mostly;
+static struct kmem_cache	*obj_cache __read_mostly;
 
 /*
  * Track numbers of kmem_cache_alloc()/free() calls done.
@@ -162,6 +181,21 @@ static struct debug_obj *lookup_object(void *addr, struct debug_bucket *b)
 	return NULL;
 }
 
+/*
+ * Allocate a new object from the hlist
+ */
+static struct debug_obj *__alloc_object(struct hlist_head *list)
+{
+	struct debug_obj *obj = NULL;
+
+	if (list->first) {
+		obj = hlist_entry(list->first, typeof(*obj), node);
+		hlist_del(&obj->node);
+	}
+
+	return obj;
+}
+
 /*
  * Allocate a new object. If the pool is empty, switch off the debugger.
  * Must be called with interrupts disabled.
@@ -169,20 +203,21 @@ static struct debug_obj *lookup_object(void *addr, struct debug_bucket *b)
 static struct debug_obj *
 alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 {
-	struct debug_obj *obj = NULL;
-
-	raw_spin_lock(&pool_lock);
-	if (obj_pool.first) {
-		obj	    = hlist_entry(obj_pool.first, typeof(*obj), node);
-
-		obj->object = addr;
-		obj->descr  = descr;
-		obj->state  = ODEBUG_STATE_NONE;
-		obj->astate = 0;
-		hlist_del(&obj->node);
+	struct debug_percpu_free *percpu_pool;
+	struct debug_obj *obj;
 
-		hlist_add_head(&obj->node, &b->list);
+	if (likely(obj_cache)) {
+		percpu_pool = this_cpu_ptr(&percpu_obj_pool);
+		obj = __alloc_object(&percpu_pool->free_objs);
+		if (obj) {
+			percpu_pool->obj_free--;
+			goto init_obj;
+		}
+	}
 
+	raw_spin_lock(&pool_lock);
+	obj = __alloc_object(&obj_pool);
+	if (obj) {
 		obj_pool_used++;
 		if (obj_pool_used > obj_pool_max_used)
 			obj_pool_max_used = obj_pool_used;
@@ -193,6 +228,14 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 	}
 	raw_spin_unlock(&pool_lock);
 
+init_obj:
+	if (obj) {
+		obj->object = addr;
+		obj->descr  = descr;
+		obj->state  = ODEBUG_STATE_NONE;
+		obj->astate = 0;
+		hlist_add_head(&obj->node, &b->list);
+	}
 	return obj;
 }
 
@@ -247,8 +290,21 @@ static bool __free_object(struct debug_obj *obj)
 {
 	unsigned long flags;
 	bool work;
+	struct debug_percpu_free *percpu_pool;
 
-	raw_spin_lock_irqsave(&pool_lock, flags);
+	local_irq_save(flags);
+	/*
+	 * Try to free it into the percpu pool first.
+	 */
+	percpu_pool = this_cpu_ptr(&percpu_obj_pool);
+	if (obj_cache && percpu_pool->obj_free < ODEBUG_POOL_PERCPU_SIZE) {
+		hlist_add_head(&obj->node, &percpu_pool->free_objs);
+		percpu_pool->obj_free++;
+		local_irq_restore(flags);
+		return false;
+	}
+
+	raw_spin_lock(&pool_lock);
 	work = (obj_pool_free > debug_objects_pool_size) && obj_cache;
 	obj_pool_used--;
 
@@ -259,7 +315,8 @@ static bool __free_object(struct debug_obj *obj)
 		obj_pool_free++;
 		hlist_add_head(&obj->node, &obj_pool);
 	}
-	raw_spin_unlock_irqrestore(&pool_lock, flags);
+	raw_spin_unlock(&pool_lock);
+	local_irq_restore(flags);
 	return work;
 }
 
@@ -822,13 +879,19 @@ void debug_check_no_obj_freed(const void *address, unsigned long size)
 
 static int debug_stats_show(struct seq_file *m, void *v)
 {
+	int cpu, obj_percpu_free = 0;
+
+	for_each_possible_cpu(cpu)
+		obj_percpu_free += per_cpu(percpu_obj_pool.obj_free, cpu);
+
 	seq_printf(m, "max_chain     :%d\n", debug_objects_maxchain);
 	seq_printf(m, "max_checked   :%d\n", debug_objects_maxchecked);
 	seq_printf(m, "warnings      :%d\n", debug_objects_warnings);
 	seq_printf(m, "fixups        :%d\n", debug_objects_fixups);
-	seq_printf(m, "pool_free     :%d\n", obj_pool_free);
+	seq_printf(m, "pool_free     :%d\n", obj_pool_free + obj_percpu_free);
+	seq_printf(m, "pool_pcp_free :%d\n", obj_percpu_free);
 	seq_printf(m, "pool_min_free :%d\n", obj_pool_min_free);
-	seq_printf(m, "pool_used     :%d\n", obj_pool_used);
+	seq_printf(m, "pool_used     :%d\n", obj_pool_used - obj_percpu_free);
 	seq_printf(m, "pool_max_used :%d\n", obj_pool_max_used);
 	seq_printf(m, "on_free_list  :%d\n", obj_nr_tofree);
 	seq_printf(m, "objs_allocated:%d\n", debug_objects_allocated);
@@ -1175,9 +1238,20 @@ static int __init debug_objects_replace_static_objects(void)
  */
 void __init debug_objects_mem_init(void)
 {
+	int cpu;
+
 	if (!debug_objects_enabled)
 		return;
 
+	/*
+	 * Initialize the percpu object pools
+	 *
+	 * Initialization is not strictly necessary, but was done for
+	 * completeness.
+	 */
+	for_each_possible_cpu(cpu)
+		INIT_HLIST_HEAD(&per_cpu(percpu_obj_pool.free_objs, cpu));
+
 	obj_cache = kmem_cache_create("debug_objects_cache",
 				      sizeof (struct debug_obj), 0,
 				      SLAB_DEBUG_OBJECTS | SLAB_NOLEAKTRACE,
@@ -1189,11 +1263,4 @@ void __init debug_objects_mem_init(void)
 		pr_warn("out of memory.\n");
 	} else
 		debug_objects_selftest();
-
-	/*
-	 * Increase the thresholds for allocating and freeing objects
-	 * according to the number of possible CPUs available in the system.
-	 */
-	debug_objects_pool_size += num_possible_cpus() * 32;
-	debug_objects_pool_min_level += num_possible_cpus() * 4;
 }
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/5] debugobjects: Percpu pool lookahead freeing/allocation
  2019-05-20 14:14 [PATCH 0/5] debugobjects: Reduce global pool_lock contention Waiman Long
  2019-05-20 14:14 ` [PATCH 1/5] debugobjects: Add percpu free pools Waiman Long
@ 2019-05-20 14:14 ` Waiman Long
  2019-06-14 12:59   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2019-05-20 14:14 ` [PATCH 3/5] debugobjects: Reduce number of pool_lock acquisitions in fill_pool() Waiman Long
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2019-05-20 14:14 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Yang Shi, Joel Fernandes (Google),
	Qian Cai, Zhong Jiang, Waiman Long

Most workloads will allocate a bunch of memory objects, work on them
and then freeing all or most of them. So just having a percpu free pool
may not reduce the pool_lock contention significantly if large number
of objects are being used.

To help those situations, we are now doing lookahead allocation and
freeing of the debug objects into and out of the percpu free pool. This
will hopefully reduce the number of times the pool_lock needs to be
taken and hence its contention level.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 69 +++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 65 insertions(+), 4 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 8a235c9412dc..c30c9d308e9f 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -26,6 +26,7 @@
 #define ODEBUG_POOL_SIZE	1024
 #define ODEBUG_POOL_MIN_LEVEL	256
 #define ODEBUG_POOL_PERCPU_SIZE	64
+#define ODEBUG_BATCH_SIZE	16
 
 #define ODEBUG_CHUNK_SHIFT	PAGE_SHIFT
 #define ODEBUG_CHUNK_SIZE	(1 << ODEBUG_CHUNK_SHIFT)
@@ -206,8 +207,8 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 	struct debug_percpu_free *percpu_pool;
 	struct debug_obj *obj;
 
+	percpu_pool = this_cpu_ptr(&percpu_obj_pool);
 	if (likely(obj_cache)) {
-		percpu_pool = this_cpu_ptr(&percpu_obj_pool);
 		obj = __alloc_object(&percpu_pool->free_objs);
 		if (obj) {
 			percpu_pool->obj_free--;
@@ -219,10 +220,31 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 	obj = __alloc_object(&obj_pool);
 	if (obj) {
 		obj_pool_used++;
+		obj_pool_free--;
+
+		/*
+		 * Looking ahead, allocate one batch of debug objects and
+		 * put them into the percpu free pool.
+		 */
+		if (likely(obj_cache)) {
+			int i;
+			struct debug_obj *obj2;
+
+			for (i = 0; i < ODEBUG_BATCH_SIZE; i++) {
+				obj2 = __alloc_object(&obj_pool);
+				if (!obj2)
+					break;
+				hlist_add_head(&obj2->node,
+					       &percpu_pool->free_objs);
+				percpu_pool->obj_free++;
+				obj_pool_used++;
+				obj_pool_free--;
+			}
+		}
+
 		if (obj_pool_used > obj_pool_max_used)
 			obj_pool_max_used = obj_pool_used;
 
-		obj_pool_free--;
 		if (obj_pool_free < obj_pool_min_free)
 			obj_pool_min_free = obj_pool_free;
 	}
@@ -290,20 +312,37 @@ static bool __free_object(struct debug_obj *obj)
 {
 	unsigned long flags;
 	bool work;
+	int lookahead_count = 0;
+	struct debug_obj *objs[ODEBUG_BATCH_SIZE];
 	struct debug_percpu_free *percpu_pool;
 
 	local_irq_save(flags);
+	if (!obj_cache)
+		goto free_to_obj_pool;
+
 	/*
 	 * Try to free it into the percpu pool first.
 	 */
 	percpu_pool = this_cpu_ptr(&percpu_obj_pool);
-	if (obj_cache && percpu_pool->obj_free < ODEBUG_POOL_PERCPU_SIZE) {
+	if (percpu_pool->obj_free < ODEBUG_POOL_PERCPU_SIZE) {
 		hlist_add_head(&obj->node, &percpu_pool->free_objs);
 		percpu_pool->obj_free++;
 		local_irq_restore(flags);
 		return false;
 	}
 
+	/*
+	 * As the percpu pool is full, look ahead and pull out a batch
+	 * of objects from the percpu pool and free them as well.
+	 */
+	for (; lookahead_count < ODEBUG_BATCH_SIZE; lookahead_count++) {
+		objs[lookahead_count] = __alloc_object(&percpu_pool->free_objs);
+		if (!objs[lookahead_count])
+			break;
+		percpu_pool->obj_free--;
+	}
+
+free_to_obj_pool:
 	raw_spin_lock(&pool_lock);
 	work = (obj_pool_free > debug_objects_pool_size) && obj_cache;
 	obj_pool_used--;
@@ -311,9 +350,23 @@ static bool __free_object(struct debug_obj *obj)
 	if (work) {
 		obj_nr_tofree++;
 		hlist_add_head(&obj->node, &obj_to_free);
+		if (lookahead_count) {
+			obj_nr_tofree += lookahead_count;
+			obj_pool_used -= lookahead_count;
+			while (lookahead_count)
+				hlist_add_head(&objs[--lookahead_count]->node,
+					       &obj_to_free);
+		}
 	} else {
 		obj_pool_free++;
 		hlist_add_head(&obj->node, &obj_pool);
+		if (lookahead_count) {
+			obj_pool_free += lookahead_count;
+			obj_pool_used -= lookahead_count;
+			while (lookahead_count)
+				hlist_add_head(&objs[--lookahead_count]->node,
+					       &obj_pool);
+		}
 	}
 	raw_spin_unlock(&pool_lock);
 	local_irq_restore(flags);
@@ -1238,7 +1291,7 @@ static int __init debug_objects_replace_static_objects(void)
  */
 void __init debug_objects_mem_init(void)
 {
-	int cpu;
+	int cpu, extras;
 
 	if (!debug_objects_enabled)
 		return;
@@ -1263,4 +1316,12 @@ void __init debug_objects_mem_init(void)
 		pr_warn("out of memory.\n");
 	} else
 		debug_objects_selftest();
+
+	/*
+	 * Increase the thresholds for allocating and freeing objects
+	 * according to the number of possible CPUs available in the system.
+	 */
+	extras = num_possible_cpus() * ODEBUG_BATCH_SIZE;
+	debug_objects_pool_size += extras;
+	debug_objects_pool_min_level += extras;
 }
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/5] debugobjects: Reduce number of pool_lock acquisitions in fill_pool()
  2019-05-20 14:14 [PATCH 0/5] debugobjects: Reduce global pool_lock contention Waiman Long
  2019-05-20 14:14 ` [PATCH 1/5] debugobjects: Add percpu free pools Waiman Long
  2019-05-20 14:14 ` [PATCH 2/5] debugobjects: Percpu pool lookahead freeing/allocation Waiman Long
@ 2019-05-20 14:14 ` Waiman Long
  2019-06-14 13:00   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2019-05-20 14:14 ` [PATCH 4/5] debugobjects: Less aggressive freeing of excess debug objects Waiman Long
  2019-05-20 14:14 ` [PATCH 5/5] debugobjects: Move printk out of db lock critical sections Waiman Long
  4 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2019-05-20 14:14 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Yang Shi, Joel Fernandes (Google),
	Qian Cai, Zhong Jiang, Waiman Long

In fill_pool(), the pool_lock is acquired and then released once per
debug object. If many objects are to be filled, the constant lock and
unlock operations are extra overhead. To reduce these overhead, we
batch them up and do an allocation of 4 objects per lock/unlock sequence.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index c30c9d308e9f..7573c736da7d 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -120,7 +120,7 @@ static const char *obj_states[ODEBUG_STATE_MAX] = {
 static void fill_pool(void)
 {
 	gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
-	struct debug_obj *new, *obj;
+	struct debug_obj *obj;
 	unsigned long flags;
 
 	if (likely(obj_pool_free >= debug_objects_pool_min_level))
@@ -136,7 +136,7 @@ static void fill_pool(void)
 		 * Recheck with the lock held as the worker thread might have
 		 * won the race and freed the global free list already.
 		 */
-		if (obj_nr_tofree) {
+		while (obj_nr_tofree && (obj_pool_free < obj_pool_min_free)) {
 			obj = hlist_entry(obj_to_free.first, typeof(*obj), node);
 			hlist_del(&obj->node);
 			obj_nr_tofree--;
@@ -150,15 +150,23 @@ static void fill_pool(void)
 		return;
 
 	while (obj_pool_free < debug_objects_pool_min_level) {
+		struct debug_obj *new[ODEBUG_BATCH_SIZE];
+		int cnt;
 
-		new = kmem_cache_zalloc(obj_cache, gfp);
-		if (!new)
+		for (cnt = 0; cnt < ODEBUG_BATCH_SIZE; cnt++) {
+			new[cnt] = kmem_cache_zalloc(obj_cache, gfp);
+			if (!new[cnt])
+				break;
+		}
+		if (!cnt)
 			return;
 
 		raw_spin_lock_irqsave(&pool_lock, flags);
-		hlist_add_head(&new->node, &obj_pool);
-		debug_objects_allocated++;
-		obj_pool_free++;
+		while (cnt) {
+			hlist_add_head(&new[--cnt]->node, &obj_pool);
+			debug_objects_allocated++;
+			obj_pool_free++;
+		}
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
 	}
 }
@@ -280,7 +288,7 @@ static void free_obj_work(struct work_struct *work)
 	/*
 	 * The objs on the pool list might be allocated before the work is
 	 * run, so recheck if pool list it full or not, if not fill pool
-	 * list from the global free list
+	 * list from the global free list.
 	 */
 	while (obj_nr_tofree && obj_pool_free < debug_objects_pool_size) {
 		obj = hlist_entry(obj_to_free.first, typeof(*obj), node);
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/5] debugobjects: Less aggressive freeing of excess debug objects
  2019-05-20 14:14 [PATCH 0/5] debugobjects: Reduce global pool_lock contention Waiman Long
                   ` (2 preceding siblings ...)
  2019-05-20 14:14 ` [PATCH 3/5] debugobjects: Reduce number of pool_lock acquisitions in fill_pool() Waiman Long
@ 2019-05-20 14:14 ` Waiman Long
  2019-06-14 13:01   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2019-05-20 14:14 ` [PATCH 5/5] debugobjects: Move printk out of db lock critical sections Waiman Long
  4 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2019-05-20 14:14 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Yang Shi, Joel Fernandes (Google),
	Qian Cai, Zhong Jiang, Waiman Long

After a system bootup and 3 parallel kernel builds, a partial output
of the debug objects stats file was:

pool_free     :5101
pool_pcp_free :4181
pool_min_free :220
pool_used     :104172
pool_max_used :171920
on_free_list  :0
objs_allocated:39268280
objs_freed    :39160031

More than 39 millions debug objects had since been allocated and then
freed. The pool_max_used, however, was only about 172k. So this is a
lot of extra overhead in freeing and allocating objects from slabs. It
may also causes the slabs to be more fragmented and harder to reclaim.

This patch tries to make the freeing of excess debug objects less
aggressive by freeing it at a maximum frequency of 10Hz and about 1k
objects at each round of freeing.

With that change applied, the partial output of the debug objects stats
file after similar actions became:

pool_free     :5901
pool_pcp_free :3742
pool_min_free :1022
pool_used     :104805
pool_max_used :168081
on_free_list  :0
objs_allocated:5796864
objs_freed    :5687182

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 61 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 49 insertions(+), 12 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 7573c736da7d..065770254899 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -32,6 +32,14 @@
 #define ODEBUG_CHUNK_SIZE	(1 << ODEBUG_CHUNK_SHIFT)
 #define ODEBUG_CHUNK_MASK	(~(ODEBUG_CHUNK_SIZE - 1))
 
+/*
+ * We limit the freeing of debug objects via workqueue at a maximum
+ * frequency of 10Hz and about 1024 objects for each freeing operation.
+ * So it is freeing at most 10k debug objects per second.
+ */
+#define ODEBUG_FREE_WORK_MAX	1024
+#define ODEBUG_FREE_WORK_DELAY	DIV_ROUND_UP(HZ, 10)
+
 struct debug_bucket {
 	struct hlist_head	list;
 	raw_spinlock_t		lock;
@@ -68,6 +76,7 @@ static int			obj_pool_min_free = ODEBUG_POOL_SIZE;
 static int			obj_pool_free = ODEBUG_POOL_SIZE;
 static int			obj_pool_used;
 static int			obj_pool_max_used;
+static bool			obj_freeing;
 /* The number of objs on the global free list */
 static int			obj_nr_tofree;
 
@@ -91,7 +100,7 @@ static int			debug_objects_allocated;
 static int			debug_objects_freed;
 
 static void free_obj_work(struct work_struct *work);
-static DECLARE_WORK(debug_obj_work, free_obj_work);
+static DECLARE_DELAYED_WORK(debug_obj_work, free_obj_work);
 
 static int __init enable_object_debug(char *str)
 {
@@ -282,13 +291,19 @@ static void free_obj_work(struct work_struct *work)
 	unsigned long flags;
 	HLIST_HEAD(tofree);
 
+	WRITE_ONCE(obj_freeing, false);
 	if (!raw_spin_trylock_irqsave(&pool_lock, flags))
 		return;
 
+	if (obj_pool_free >= debug_objects_pool_size)
+		goto free_objs;
+
 	/*
 	 * The objs on the pool list might be allocated before the work is
 	 * run, so recheck if pool list it full or not, if not fill pool
-	 * list from the global free list.
+	 * list from the global free list. As it is likely that a workload
+	 * may be gearing up to use more and more objects, don't free any
+	 * of them until the next round.
 	 */
 	while (obj_nr_tofree && obj_pool_free < debug_objects_pool_size) {
 		obj = hlist_entry(obj_to_free.first, typeof(*obj), node);
@@ -297,7 +312,10 @@ static void free_obj_work(struct work_struct *work)
 		obj_pool_free++;
 		obj_nr_tofree--;
 	}
+	raw_spin_unlock_irqrestore(&pool_lock, flags);
+	return;
 
+free_objs:
 	/*
 	 * Pool list is already full and there are still objs on the free
 	 * list. Move remaining free objs to a temporary list to free the
@@ -316,7 +334,7 @@ static void free_obj_work(struct work_struct *work)
 	}
 }
 
-static bool __free_object(struct debug_obj *obj)
+static void __free_object(struct debug_obj *obj)
 {
 	unsigned long flags;
 	bool work;
@@ -336,7 +354,7 @@ static bool __free_object(struct debug_obj *obj)
 		hlist_add_head(&obj->node, &percpu_pool->free_objs);
 		percpu_pool->obj_free++;
 		local_irq_restore(flags);
-		return false;
+		return;
 	}
 
 	/*
@@ -352,7 +370,8 @@ static bool __free_object(struct debug_obj *obj)
 
 free_to_obj_pool:
 	raw_spin_lock(&pool_lock);
-	work = (obj_pool_free > debug_objects_pool_size) && obj_cache;
+	work = (obj_pool_free > debug_objects_pool_size) && obj_cache &&
+	       (obj_nr_tofree < ODEBUG_FREE_WORK_MAX);
 	obj_pool_used--;
 
 	if (work) {
@@ -365,6 +384,21 @@ static bool __free_object(struct debug_obj *obj)
 				hlist_add_head(&objs[--lookahead_count]->node,
 					       &obj_to_free);
 		}
+
+		if ((obj_pool_free > debug_objects_pool_size) &&
+		    (obj_nr_tofree < ODEBUG_FREE_WORK_MAX)) {
+			int i;
+
+			/*
+			 * Free one more batch of objects from obj_pool.
+			 */
+			for (i = 0; i < ODEBUG_BATCH_SIZE; i++) {
+				obj = __alloc_object(&obj_pool);
+				hlist_add_head(&obj->node, &obj_to_free);
+				obj_pool_free--;
+				obj_nr_tofree++;
+			}
+		}
 	} else {
 		obj_pool_free++;
 		hlist_add_head(&obj->node, &obj_pool);
@@ -378,7 +412,6 @@ static bool __free_object(struct debug_obj *obj)
 	}
 	raw_spin_unlock(&pool_lock);
 	local_irq_restore(flags);
-	return work;
 }
 
 /*
@@ -387,8 +420,11 @@ static bool __free_object(struct debug_obj *obj)
  */
 static void free_object(struct debug_obj *obj)
 {
-	if (__free_object(obj))
-		schedule_work(&debug_obj_work);
+	__free_object(obj);
+	if (!obj_freeing && obj_nr_tofree) {
+		WRITE_ONCE(obj_freeing, true);
+		schedule_delayed_work(&debug_obj_work, ODEBUG_FREE_WORK_DELAY);
+	}
 }
 
 /*
@@ -878,7 +914,6 @@ static void __debug_check_no_obj_freed(const void *address, unsigned long size)
 	struct hlist_node *tmp;
 	struct debug_obj *obj;
 	int cnt, objs_checked = 0;
-	bool work = false;
 
 	saddr = (unsigned long) address;
 	eaddr = saddr + size;
@@ -909,7 +944,7 @@ static void __debug_check_no_obj_freed(const void *address, unsigned long size)
 				goto repeat;
 			default:
 				hlist_del(&obj->node);
-				work |= __free_object(obj);
+				__free_object(obj);
 				break;
 			}
 		}
@@ -925,8 +960,10 @@ static void __debug_check_no_obj_freed(const void *address, unsigned long size)
 		debug_objects_maxchecked = objs_checked;
 
 	/* Schedule work to actually kmem_cache_free() objects */
-	if (work)
-		schedule_work(&debug_obj_work);
+	if (!obj_freeing && obj_nr_tofree) {
+		WRITE_ONCE(obj_freeing, true);
+		schedule_delayed_work(&debug_obj_work, ODEBUG_FREE_WORK_DELAY);
+	}
 }
 
 void debug_check_no_obj_freed(const void *address, unsigned long size)
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/5] debugobjects: Move printk out of db lock critical sections
  2019-05-20 14:14 [PATCH 0/5] debugobjects: Reduce global pool_lock contention Waiman Long
                   ` (3 preceding siblings ...)
  2019-05-20 14:14 ` [PATCH 4/5] debugobjects: Less aggressive freeing of excess debug objects Waiman Long
@ 2019-05-20 14:14 ` Waiman Long
  2019-06-14 13:01   ` [tip:core/debugobjects] debugobjects: Move printk out of db->lock " tip-bot for Waiman Long
  4 siblings, 1 reply; 11+ messages in thread
From: Waiman Long @ 2019-05-20 14:14 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Yang Shi, Joel Fernandes (Google),
	Qian Cai, Zhong Jiang, Waiman Long

The db->lock is a raw spinlock and so the lock hold time is supposed
to be short. This will not be the case when printk() is being involved
in some of the critical sections. In order to avoid the long hold time,
in case some messages need to be printed, the debug_object_is_on_stack()
and debug_print_object() calls are now moved out of those critical
sections.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 61 +++++++++++++++++++++++++++++++---------------
 1 file changed, 42 insertions(+), 19 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 065770254899..5b6941568da0 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -529,6 +529,8 @@ __debug_object_init(void *addr, struct debug_obj_descr *descr, int onstack)
 	struct debug_bucket *db;
 	struct debug_obj *obj;
 	unsigned long flags;
+	bool print_object = false;
+	bool check_stack = false;
 
 	fill_pool();
 
@@ -545,7 +547,7 @@ __debug_object_init(void *addr, struct debug_obj_descr *descr, int onstack)
 			debug_objects_oom();
 			return;
 		}
-		debug_object_is_on_stack(addr, onstack);
+		check_stack = true;
 	}
 
 	switch (obj->state) {
@@ -556,20 +558,25 @@ __debug_object_init(void *addr, struct debug_obj_descr *descr, int onstack)
 		break;
 
 	case ODEBUG_STATE_ACTIVE:
-		debug_print_object(obj, "init");
 		state = obj->state;
 		raw_spin_unlock_irqrestore(&db->lock, flags);
+		debug_print_object(obj, "init");
 		debug_object_fixup(descr->fixup_init, addr, state);
 		return;
 
 	case ODEBUG_STATE_DESTROYED:
-		debug_print_object(obj, "init");
+		print_object = true;
 		break;
 	default:
 		break;
 	}
 
 	raw_spin_unlock_irqrestore(&db->lock, flags);
+	if (check_stack)
+		debug_object_is_on_stack(addr, onstack);
+	if (print_object)
+		debug_print_object(obj, "init");
+
 }
 
 /**
@@ -627,6 +634,8 @@ int debug_object_activate(void *addr, struct debug_obj_descr *descr)
 
 	obj = lookup_object(addr, db);
 	if (obj) {
+		bool print_object = false;
+
 		switch (obj->state) {
 		case ODEBUG_STATE_INIT:
 		case ODEBUG_STATE_INACTIVE:
@@ -635,14 +644,14 @@ int debug_object_activate(void *addr, struct debug_obj_descr *descr)
 			break;
 
 		case ODEBUG_STATE_ACTIVE:
-			debug_print_object(obj, "activate");
 			state = obj->state;
 			raw_spin_unlock_irqrestore(&db->lock, flags);
+			debug_print_object(obj, "activate");
 			ret = debug_object_fixup(descr->fixup_activate, addr, state);
 			return ret ? 0 : -EINVAL;
 
 		case ODEBUG_STATE_DESTROYED:
-			debug_print_object(obj, "activate");
+			print_object = true;
 			ret = -EINVAL;
 			break;
 		default:
@@ -650,10 +659,13 @@ int debug_object_activate(void *addr, struct debug_obj_descr *descr)
 			break;
 		}
 		raw_spin_unlock_irqrestore(&db->lock, flags);
+		if (print_object)
+			debug_print_object(obj, "activate");
 		return ret;
 	}
 
 	raw_spin_unlock_irqrestore(&db->lock, flags);
+
 	/*
 	 * We are here when a static object is activated. We
 	 * let the type specific code confirm whether this is
@@ -685,6 +697,7 @@ void debug_object_deactivate(void *addr, struct debug_obj_descr *descr)
 	struct debug_bucket *db;
 	struct debug_obj *obj;
 	unsigned long flags;
+	bool print_object = false;
 
 	if (!debug_objects_enabled)
 		return;
@@ -702,24 +715,27 @@ void debug_object_deactivate(void *addr, struct debug_obj_descr *descr)
 			if (!obj->astate)
 				obj->state = ODEBUG_STATE_INACTIVE;
 			else
-				debug_print_object(obj, "deactivate");
+				print_object = true;
 			break;
 
 		case ODEBUG_STATE_DESTROYED:
-			debug_print_object(obj, "deactivate");
+			print_object = true;
 			break;
 		default:
 			break;
 		}
-	} else {
+	}
+
+	raw_spin_unlock_irqrestore(&db->lock, flags);
+	if (!obj) {
 		struct debug_obj o = { .object = addr,
 				       .state = ODEBUG_STATE_NOTAVAILABLE,
 				       .descr = descr };
 
 		debug_print_object(&o, "deactivate");
+	} else if (print_object) {
+		debug_print_object(obj, "deactivate");
 	}
-
-	raw_spin_unlock_irqrestore(&db->lock, flags);
 }
 EXPORT_SYMBOL_GPL(debug_object_deactivate);
 
@@ -734,6 +750,7 @@ void debug_object_destroy(void *addr, struct debug_obj_descr *descr)
 	struct debug_bucket *db;
 	struct debug_obj *obj;
 	unsigned long flags;
+	bool print_object = false;
 
 	if (!debug_objects_enabled)
 		return;
@@ -753,20 +770,22 @@ void debug_object_destroy(void *addr, struct debug_obj_descr *descr)
 		obj->state = ODEBUG_STATE_DESTROYED;
 		break;
 	case ODEBUG_STATE_ACTIVE:
-		debug_print_object(obj, "destroy");
 		state = obj->state;
 		raw_spin_unlock_irqrestore(&db->lock, flags);
+		debug_print_object(obj, "destroy");
 		debug_object_fixup(descr->fixup_destroy, addr, state);
 		return;
 
 	case ODEBUG_STATE_DESTROYED:
-		debug_print_object(obj, "destroy");
+		print_object = true;
 		break;
 	default:
 		break;
 	}
 out_unlock:
 	raw_spin_unlock_irqrestore(&db->lock, flags);
+	if (print_object)
+		debug_print_object(obj, "destroy");
 }
 EXPORT_SYMBOL_GPL(debug_object_destroy);
 
@@ -795,9 +814,9 @@ void debug_object_free(void *addr, struct debug_obj_descr *descr)
 
 	switch (obj->state) {
 	case ODEBUG_STATE_ACTIVE:
-		debug_print_object(obj, "free");
 		state = obj->state;
 		raw_spin_unlock_irqrestore(&db->lock, flags);
+		debug_print_object(obj, "free");
 		debug_object_fixup(descr->fixup_free, addr, state);
 		return;
 	default:
@@ -870,6 +889,7 @@ debug_object_active_state(void *addr, struct debug_obj_descr *descr,
 	struct debug_bucket *db;
 	struct debug_obj *obj;
 	unsigned long flags;
+	bool print_object = false;
 
 	if (!debug_objects_enabled)
 		return;
@@ -885,22 +905,25 @@ debug_object_active_state(void *addr, struct debug_obj_descr *descr,
 			if (obj->astate == expect)
 				obj->astate = next;
 			else
-				debug_print_object(obj, "active_state");
+				print_object = true;
 			break;
 
 		default:
-			debug_print_object(obj, "active_state");
+			print_object = true;
 			break;
 		}
-	} else {
+	}
+
+	raw_spin_unlock_irqrestore(&db->lock, flags);
+	if (!obj) {
 		struct debug_obj o = { .object = addr,
 				       .state = ODEBUG_STATE_NOTAVAILABLE,
 				       .descr = descr };
 
 		debug_print_object(&o, "active_state");
+	} else if (print_object) {
+		debug_print_object(obj, "active_state");
 	}
-
-	raw_spin_unlock_irqrestore(&db->lock, flags);
 }
 EXPORT_SYMBOL_GPL(debug_object_active_state);
 
@@ -935,10 +958,10 @@ static void __debug_check_no_obj_freed(const void *address, unsigned long size)
 
 			switch (obj->state) {
 			case ODEBUG_STATE_ACTIVE:
-				debug_print_object(obj, "free");
 				descr = obj->descr;
 				state = obj->state;
 				raw_spin_unlock_irqrestore(&db->lock, flags);
+				debug_print_object(obj, "free");
 				debug_object_fixup(descr->fixup_free,
 						   (void *) oaddr, state);
 				goto repeat;
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:core/debugobjects] debugobjects: Add percpu free pools
  2019-05-20 14:14 ` [PATCH 1/5] debugobjects: Add percpu free pools Waiman Long
@ 2019-06-14 12:58   ` tip-bot for Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Waiman Long @ 2019-06-14 12:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: akpm, tglx, linux-kernel, longman, zhongjiang, hpa, joel, cai,
	yang.shi, mingo

Commit-ID:  d86998b17a01050c0232231fa481e65ef8171ca6
Gitweb:     https://git.kernel.org/tip/d86998b17a01050c0232231fa481e65ef8171ca6
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Mon, 20 May 2019 10:14:46 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 14 Jun 2019 14:51:14 +0200

debugobjects: Add percpu free pools

When a multi-threaded workload does a lot of small memory object
allocations and deallocations, it may cause the allocation and freeing of
many debug objects. This will make the global pool_lock a bottleneck in the
performance of the workload.  Since interrupts are disabled when acquiring
the pool_lock, it may even cause hard lockups to happen.

To reduce contention of the global pool_lock, add a percpu debug object
free pool that can be used to buffer some of the debug object allocation
and freeing requests without acquiring the pool_lock.  Each CPU will now
have a percpu free pool that can hold up to a maximum of 64 debug
objects. Allocation and freeing requests will go to the percpu free pool
first. If that fails, the pool_lock will be taken and the global free pool
will be used.

The presence or absence of obj_cache is used as a marker to see if the
percpu cache should be used.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-2-longman@redhat.com

---
 lib/debugobjects.c | 115 ++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 91 insertions(+), 24 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 2ac42286cd08..38c23b528f6f 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -25,6 +25,7 @@
 
 #define ODEBUG_POOL_SIZE	1024
 #define ODEBUG_POOL_MIN_LEVEL	256
+#define ODEBUG_POOL_PERCPU_SIZE	64
 
 #define ODEBUG_CHUNK_SHIFT	PAGE_SHIFT
 #define ODEBUG_CHUNK_SIZE	(1 << ODEBUG_CHUNK_SHIFT)
@@ -35,6 +36,17 @@ struct debug_bucket {
 	raw_spinlock_t		lock;
 };
 
+/*
+ * Debug object percpu free list
+ * Access is protected by disabling irq
+ */
+struct debug_percpu_free {
+	struct hlist_head	free_objs;
+	int			obj_free;
+};
+
+static DEFINE_PER_CPU(struct debug_percpu_free, percpu_obj_pool);
+
 static struct debug_bucket	obj_hash[ODEBUG_HASH_SIZE];
 
 static struct debug_obj		obj_static_pool[ODEBUG_POOL_SIZE] __initdata;
@@ -44,13 +56,19 @@ static DEFINE_RAW_SPINLOCK(pool_lock);
 static HLIST_HEAD(obj_pool);
 static HLIST_HEAD(obj_to_free);
 
+/*
+ * Because of the presence of percpu free pools, obj_pool_free will
+ * under-count those in the percpu free pools. Similarly, obj_pool_used
+ * will over-count those in the percpu free pools. Adjustments will be
+ * made at debug_stats_show(). Both obj_pool_min_free and obj_pool_max_used
+ * can be off.
+ */
 static int			obj_pool_min_free = ODEBUG_POOL_SIZE;
 static int			obj_pool_free = ODEBUG_POOL_SIZE;
 static int			obj_pool_used;
 static int			obj_pool_max_used;
 /* The number of objs on the global free list */
 static int			obj_nr_tofree;
-static struct kmem_cache	*obj_cache;
 
 static int			debug_objects_maxchain __read_mostly;
 static int __maybe_unused	debug_objects_maxchecked __read_mostly;
@@ -63,6 +81,7 @@ static int			debug_objects_pool_size __read_mostly
 static int			debug_objects_pool_min_level __read_mostly
 				= ODEBUG_POOL_MIN_LEVEL;
 static struct debug_obj_descr	*descr_test  __read_mostly;
+static struct kmem_cache	*obj_cache __read_mostly;
 
 /*
  * Track numbers of kmem_cache_alloc()/free() calls done.
@@ -162,6 +181,21 @@ static struct debug_obj *lookup_object(void *addr, struct debug_bucket *b)
 	return NULL;
 }
 
+/*
+ * Allocate a new object from the hlist
+ */
+static struct debug_obj *__alloc_object(struct hlist_head *list)
+{
+	struct debug_obj *obj = NULL;
+
+	if (list->first) {
+		obj = hlist_entry(list->first, typeof(*obj), node);
+		hlist_del(&obj->node);
+	}
+
+	return obj;
+}
+
 /*
  * Allocate a new object. If the pool is empty, switch off the debugger.
  * Must be called with interrupts disabled.
@@ -169,20 +203,21 @@ static struct debug_obj *lookup_object(void *addr, struct debug_bucket *b)
 static struct debug_obj *
 alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 {
-	struct debug_obj *obj = NULL;
-
-	raw_spin_lock(&pool_lock);
-	if (obj_pool.first) {
-		obj	    = hlist_entry(obj_pool.first, typeof(*obj), node);
-
-		obj->object = addr;
-		obj->descr  = descr;
-		obj->state  = ODEBUG_STATE_NONE;
-		obj->astate = 0;
-		hlist_del(&obj->node);
+	struct debug_percpu_free *percpu_pool;
+	struct debug_obj *obj;
 
-		hlist_add_head(&obj->node, &b->list);
+	if (likely(obj_cache)) {
+		percpu_pool = this_cpu_ptr(&percpu_obj_pool);
+		obj = __alloc_object(&percpu_pool->free_objs);
+		if (obj) {
+			percpu_pool->obj_free--;
+			goto init_obj;
+		}
+	}
 
+	raw_spin_lock(&pool_lock);
+	obj = __alloc_object(&obj_pool);
+	if (obj) {
 		obj_pool_used++;
 		if (obj_pool_used > obj_pool_max_used)
 			obj_pool_max_used = obj_pool_used;
@@ -193,6 +228,14 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 	}
 	raw_spin_unlock(&pool_lock);
 
+init_obj:
+	if (obj) {
+		obj->object = addr;
+		obj->descr  = descr;
+		obj->state  = ODEBUG_STATE_NONE;
+		obj->astate = 0;
+		hlist_add_head(&obj->node, &b->list);
+	}
 	return obj;
 }
 
@@ -247,8 +290,21 @@ static bool __free_object(struct debug_obj *obj)
 {
 	unsigned long flags;
 	bool work;
+	struct debug_percpu_free *percpu_pool;
 
-	raw_spin_lock_irqsave(&pool_lock, flags);
+	local_irq_save(flags);
+	/*
+	 * Try to free it into the percpu pool first.
+	 */
+	percpu_pool = this_cpu_ptr(&percpu_obj_pool);
+	if (obj_cache && percpu_pool->obj_free < ODEBUG_POOL_PERCPU_SIZE) {
+		hlist_add_head(&obj->node, &percpu_pool->free_objs);
+		percpu_pool->obj_free++;
+		local_irq_restore(flags);
+		return false;
+	}
+
+	raw_spin_lock(&pool_lock);
 	work = (obj_pool_free > debug_objects_pool_size) && obj_cache;
 	obj_pool_used--;
 
@@ -259,7 +315,8 @@ static bool __free_object(struct debug_obj *obj)
 		obj_pool_free++;
 		hlist_add_head(&obj->node, &obj_pool);
 	}
-	raw_spin_unlock_irqrestore(&pool_lock, flags);
+	raw_spin_unlock(&pool_lock);
+	local_irq_restore(flags);
 	return work;
 }
 
@@ -822,13 +879,19 @@ void debug_check_no_obj_freed(const void *address, unsigned long size)
 
 static int debug_stats_show(struct seq_file *m, void *v)
 {
+	int cpu, obj_percpu_free = 0;
+
+	for_each_possible_cpu(cpu)
+		obj_percpu_free += per_cpu(percpu_obj_pool.obj_free, cpu);
+
 	seq_printf(m, "max_chain     :%d\n", debug_objects_maxchain);
 	seq_printf(m, "max_checked   :%d\n", debug_objects_maxchecked);
 	seq_printf(m, "warnings      :%d\n", debug_objects_warnings);
 	seq_printf(m, "fixups        :%d\n", debug_objects_fixups);
-	seq_printf(m, "pool_free     :%d\n", obj_pool_free);
+	seq_printf(m, "pool_free     :%d\n", obj_pool_free + obj_percpu_free);
+	seq_printf(m, "pool_pcp_free :%d\n", obj_percpu_free);
 	seq_printf(m, "pool_min_free :%d\n", obj_pool_min_free);
-	seq_printf(m, "pool_used     :%d\n", obj_pool_used);
+	seq_printf(m, "pool_used     :%d\n", obj_pool_used - obj_percpu_free);
 	seq_printf(m, "pool_max_used :%d\n", obj_pool_max_used);
 	seq_printf(m, "on_free_list  :%d\n", obj_nr_tofree);
 	seq_printf(m, "objs_allocated:%d\n", debug_objects_allocated);
@@ -1165,9 +1228,20 @@ free:
  */
 void __init debug_objects_mem_init(void)
 {
+	int cpu;
+
 	if (!debug_objects_enabled)
 		return;
 
+	/*
+	 * Initialize the percpu object pools
+	 *
+	 * Initialization is not strictly necessary, but was done for
+	 * completeness.
+	 */
+	for_each_possible_cpu(cpu)
+		INIT_HLIST_HEAD(&per_cpu(percpu_obj_pool.free_objs, cpu));
+
 	obj_cache = kmem_cache_create("debug_objects_cache",
 				      sizeof (struct debug_obj), 0,
 				      SLAB_DEBUG_OBJECTS | SLAB_NOLEAKTRACE,
@@ -1179,11 +1253,4 @@ void __init debug_objects_mem_init(void)
 		pr_warn("out of memory.\n");
 	} else
 		debug_objects_selftest();
-
-	/*
-	 * Increase the thresholds for allocating and freeing objects
-	 * according to the number of possible CPUs available in the system.
-	 */
-	debug_objects_pool_size += num_possible_cpus() * 32;
-	debug_objects_pool_min_level += num_possible_cpus() * 4;
 }

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:core/debugobjects] debugobjects: Percpu pool lookahead freeing/allocation
  2019-05-20 14:14 ` [PATCH 2/5] debugobjects: Percpu pool lookahead freeing/allocation Waiman Long
@ 2019-06-14 12:59   ` tip-bot for Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Waiman Long @ 2019-06-14 12:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: yang.shi, tglx, mingo, hpa, linux-kernel, joel, cai, akpm,
	longman, zhongjiang

Commit-ID:  634d61f45d6f668fe7e468b62d00ae469a583ca2
Gitweb:     https://git.kernel.org/tip/634d61f45d6f668fe7e468b62d00ae469a583ca2
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Mon, 20 May 2019 10:14:47 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 14 Jun 2019 14:51:14 +0200

debugobjects: Percpu pool lookahead freeing/allocation

Most workloads will allocate a bunch of memory objects, work on them
and then freeing all or most of them. So just having a percpu free pool
may not reduce the pool_lock contention significantly if large number
of objects are being used.

To help those situations, we are now doing lookahead allocation and
freeing of the debug objects into and out of the percpu free pool. This
will hopefully reduce the number of times the pool_lock needs to be
taken and hence its contention level.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-3-longman@redhat.com

---
 lib/debugobjects.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 69 insertions(+), 6 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 38c23b528f6f..714459a8dc10 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -26,6 +26,7 @@
 #define ODEBUG_POOL_SIZE	1024
 #define ODEBUG_POOL_MIN_LEVEL	256
 #define ODEBUG_POOL_PERCPU_SIZE	64
+#define ODEBUG_BATCH_SIZE	16
 
 #define ODEBUG_CHUNK_SHIFT	PAGE_SHIFT
 #define ODEBUG_CHUNK_SIZE	(1 << ODEBUG_CHUNK_SHIFT)
@@ -203,11 +204,10 @@ static struct debug_obj *__alloc_object(struct hlist_head *list)
 static struct debug_obj *
 alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 {
-	struct debug_percpu_free *percpu_pool;
+	struct debug_percpu_free *percpu_pool = this_cpu_ptr(&percpu_obj_pool);
 	struct debug_obj *obj;
 
 	if (likely(obj_cache)) {
-		percpu_pool = this_cpu_ptr(&percpu_obj_pool);
 		obj = __alloc_object(&percpu_pool->free_objs);
 		if (obj) {
 			percpu_pool->obj_free--;
@@ -219,10 +219,32 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 	obj = __alloc_object(&obj_pool);
 	if (obj) {
 		obj_pool_used++;
+		obj_pool_free--;
+
+		/*
+		 * Looking ahead, allocate one batch of debug objects and
+		 * put them into the percpu free pool.
+		 */
+		if (likely(obj_cache)) {
+			int i;
+
+			for (i = 0; i < ODEBUG_BATCH_SIZE; i++) {
+				struct debug_obj *obj2;
+
+				obj2 = __alloc_object(&obj_pool);
+				if (!obj2)
+					break;
+				hlist_add_head(&obj2->node,
+					       &percpu_pool->free_objs);
+				percpu_pool->obj_free++;
+				obj_pool_used++;
+				obj_pool_free--;
+			}
+		}
+
 		if (obj_pool_used > obj_pool_max_used)
 			obj_pool_max_used = obj_pool_used;
 
-		obj_pool_free--;
 		if (obj_pool_free < obj_pool_min_free)
 			obj_pool_min_free = obj_pool_free;
 	}
@@ -288,22 +310,39 @@ static void free_obj_work(struct work_struct *work)
 
 static bool __free_object(struct debug_obj *obj)
 {
+	struct debug_obj *objs[ODEBUG_BATCH_SIZE];
+	struct debug_percpu_free *percpu_pool;
+	int lookahead_count = 0;
 	unsigned long flags;
 	bool work;
-	struct debug_percpu_free *percpu_pool;
 
 	local_irq_save(flags);
+	if (!obj_cache)
+		goto free_to_obj_pool;
+
 	/*
 	 * Try to free it into the percpu pool first.
 	 */
 	percpu_pool = this_cpu_ptr(&percpu_obj_pool);
-	if (obj_cache && percpu_pool->obj_free < ODEBUG_POOL_PERCPU_SIZE) {
+	if (percpu_pool->obj_free < ODEBUG_POOL_PERCPU_SIZE) {
 		hlist_add_head(&obj->node, &percpu_pool->free_objs);
 		percpu_pool->obj_free++;
 		local_irq_restore(flags);
 		return false;
 	}
 
+	/*
+	 * As the percpu pool is full, look ahead and pull out a batch
+	 * of objects from the percpu pool and free them as well.
+	 */
+	for (; lookahead_count < ODEBUG_BATCH_SIZE; lookahead_count++) {
+		objs[lookahead_count] = __alloc_object(&percpu_pool->free_objs);
+		if (!objs[lookahead_count])
+			break;
+		percpu_pool->obj_free--;
+	}
+
+free_to_obj_pool:
 	raw_spin_lock(&pool_lock);
 	work = (obj_pool_free > debug_objects_pool_size) && obj_cache;
 	obj_pool_used--;
@@ -311,9 +350,25 @@ static bool __free_object(struct debug_obj *obj)
 	if (work) {
 		obj_nr_tofree++;
 		hlist_add_head(&obj->node, &obj_to_free);
+		if (lookahead_count) {
+			obj_nr_tofree += lookahead_count;
+			obj_pool_used -= lookahead_count;
+			while (lookahead_count) {
+				hlist_add_head(&objs[--lookahead_count]->node,
+					       &obj_to_free);
+			}
+		}
 	} else {
 		obj_pool_free++;
 		hlist_add_head(&obj->node, &obj_pool);
+		if (lookahead_count) {
+			obj_pool_free += lookahead_count;
+			obj_pool_used -= lookahead_count;
+			while (lookahead_count) {
+				hlist_add_head(&objs[--lookahead_count]->node,
+					       &obj_pool);
+			}
+		}
 	}
 	raw_spin_unlock(&pool_lock);
 	local_irq_restore(flags);
@@ -1228,7 +1283,7 @@ free:
  */
 void __init debug_objects_mem_init(void)
 {
-	int cpu;
+	int cpu, extras;
 
 	if (!debug_objects_enabled)
 		return;
@@ -1253,4 +1308,12 @@ void __init debug_objects_mem_init(void)
 		pr_warn("out of memory.\n");
 	} else
 		debug_objects_selftest();
+
+	/*
+	 * Increase the thresholds for allocating and freeing objects
+	 * according to the number of possible CPUs available in the system.
+	 */
+	extras = num_possible_cpus() * ODEBUG_BATCH_SIZE;
+	debug_objects_pool_size += extras;
+	debug_objects_pool_min_level += extras;
 }

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:core/debugobjects] debugobjects: Reduce number of pool_lock acquisitions in fill_pool()
  2019-05-20 14:14 ` [PATCH 3/5] debugobjects: Reduce number of pool_lock acquisitions in fill_pool() Waiman Long
@ 2019-06-14 13:00   ` tip-bot for Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Waiman Long @ 2019-06-14 13:00 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: akpm, linux-kernel, hpa, tglx, cai, longman, zhongjiang, joel,
	mingo, yang.shi

Commit-ID:  d26bf5056fc087d845bfbb8b651b4be2933ab7a6
Gitweb:     https://git.kernel.org/tip/d26bf5056fc087d845bfbb8b651b4be2933ab7a6
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Mon, 20 May 2019 10:14:48 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 14 Jun 2019 14:51:15 +0200

debugobjects: Reduce number of pool_lock acquisitions in fill_pool()

In fill_pool(), the pool_lock is acquired and then released once per debug
object. If many objects are to be filled, the constant lock and unlock
operations are extra overhead.

To reduce the overhead, batch them up and do an allocation of 4 objects per
lock/unlock sequence.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-4-longman@redhat.com

---
 lib/debugobjects.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 714459a8dc10..7ea19fa63561 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -120,7 +120,7 @@ static const char *obj_states[ODEBUG_STATE_MAX] = {
 static void fill_pool(void)
 {
 	gfp_t gfp = GFP_ATOMIC | __GFP_NORETRY | __GFP_NOWARN;
-	struct debug_obj *new, *obj;
+	struct debug_obj *obj;
 	unsigned long flags;
 
 	if (likely(obj_pool_free >= debug_objects_pool_min_level))
@@ -136,7 +136,7 @@ static void fill_pool(void)
 		 * Recheck with the lock held as the worker thread might have
 		 * won the race and freed the global free list already.
 		 */
-		if (obj_nr_tofree) {
+		while (obj_nr_tofree && (obj_pool_free < obj_pool_min_free)) {
 			obj = hlist_entry(obj_to_free.first, typeof(*obj), node);
 			hlist_del(&obj->node);
 			obj_nr_tofree--;
@@ -150,15 +150,23 @@ static void fill_pool(void)
 		return;
 
 	while (obj_pool_free < debug_objects_pool_min_level) {
+		struct debug_obj *new[ODEBUG_BATCH_SIZE];
+		int cnt;
 
-		new = kmem_cache_zalloc(obj_cache, gfp);
-		if (!new)
+		for (cnt = 0; cnt < ODEBUG_BATCH_SIZE; cnt++) {
+			new[cnt] = kmem_cache_zalloc(obj_cache, gfp);
+			if (!new[cnt])
+				break;
+		}
+		if (!cnt)
 			return;
 
 		raw_spin_lock_irqsave(&pool_lock, flags);
-		hlist_add_head(&new->node, &obj_pool);
-		debug_objects_allocated++;
-		obj_pool_free++;
+		while (cnt) {
+			hlist_add_head(&new[--cnt]->node, &obj_pool);
+			debug_objects_allocated++;
+			obj_pool_free++;
+		}
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
 	}
 }
@@ -280,7 +288,7 @@ static void free_obj_work(struct work_struct *work)
 	/*
 	 * The objs on the pool list might be allocated before the work is
 	 * run, so recheck if pool list it full or not, if not fill pool
-	 * list from the global free list
+	 * list from the global free list.
 	 */
 	while (obj_nr_tofree && obj_pool_free < debug_objects_pool_size) {
 		obj = hlist_entry(obj_to_free.first, typeof(*obj), node);

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:core/debugobjects] debugobjects: Less aggressive freeing of excess debug objects
  2019-05-20 14:14 ` [PATCH 4/5] debugobjects: Less aggressive freeing of excess debug objects Waiman Long
@ 2019-06-14 13:01   ` tip-bot for Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Waiman Long @ 2019-06-14 13:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: yang.shi, mingo, joel, tglx, hpa, cai, zhongjiang, linux-kernel,
	akpm, longman

Commit-ID:  a7344a68a79ab91bc38af4b9d24284b479aa780a
Gitweb:     https://git.kernel.org/tip/a7344a68a79ab91bc38af4b9d24284b479aa780a
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Mon, 20 May 2019 10:14:49 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 14 Jun 2019 14:51:15 +0200

debugobjects: Less aggressive freeing of excess debug objects

After a system bootup and 3 parallel kernel builds, a partial output
of the debug objects stats file was:

pool_free     :5101
pool_pcp_free :4181
pool_min_free :220
pool_used     :104172
pool_max_used :171920
on_free_list  :0
objs_allocated:39268280
objs_freed    :39160031

More than 39 millions debug objects had since been allocated and then
freed. The pool_max_used, however, was only about 172k. So this is a
lot of extra overhead in freeing and allocating objects from slabs. It
may also causes the slabs to be more fragmented and harder to reclaim.

Make the freeing of excess debug objects less aggressive by freeing them at
a maximum frequency of 10Hz and about 1k objects at each round of freeing.

With that change applied, the partial output of the debug objects stats
file after similar actions became:

pool_free     :5901
pool_pcp_free :3742
pool_min_free :1022
pool_used     :104805
pool_max_used :168081
on_free_list  :0
objs_allocated:5796864
objs_freed    :5687182

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-5-longman@redhat.com

---
 lib/debugobjects.c | 61 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 49 insertions(+), 12 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 7ea19fa63561..ede96c659552 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -32,6 +32,14 @@
 #define ODEBUG_CHUNK_SIZE	(1 << ODEBUG_CHUNK_SHIFT)
 #define ODEBUG_CHUNK_MASK	(~(ODEBUG_CHUNK_SIZE - 1))
 
+/*
+ * We limit the freeing of debug objects via workqueue at a maximum
+ * frequency of 10Hz and about 1024 objects for each freeing operation.
+ * So it is freeing at most 10k debug objects per second.
+ */
+#define ODEBUG_FREE_WORK_MAX	1024
+#define ODEBUG_FREE_WORK_DELAY	DIV_ROUND_UP(HZ, 10)
+
 struct debug_bucket {
 	struct hlist_head	list;
 	raw_spinlock_t		lock;
@@ -68,6 +76,7 @@ static int			obj_pool_min_free = ODEBUG_POOL_SIZE;
 static int			obj_pool_free = ODEBUG_POOL_SIZE;
 static int			obj_pool_used;
 static int			obj_pool_max_used;
+static bool			obj_freeing;
 /* The number of objs on the global free list */
 static int			obj_nr_tofree;
 
@@ -91,7 +100,7 @@ static int			debug_objects_allocated;
 static int			debug_objects_freed;
 
 static void free_obj_work(struct work_struct *work);
-static DECLARE_WORK(debug_obj_work, free_obj_work);
+static DECLARE_DELAYED_WORK(debug_obj_work, free_obj_work);
 
 static int __init enable_object_debug(char *str)
 {
@@ -282,13 +291,19 @@ static void free_obj_work(struct work_struct *work)
 	unsigned long flags;
 	HLIST_HEAD(tofree);
 
+	WRITE_ONCE(obj_freeing, false);
 	if (!raw_spin_trylock_irqsave(&pool_lock, flags))
 		return;
 
+	if (obj_pool_free >= debug_objects_pool_size)
+		goto free_objs;
+
 	/*
 	 * The objs on the pool list might be allocated before the work is
 	 * run, so recheck if pool list it full or not, if not fill pool
-	 * list from the global free list.
+	 * list from the global free list. As it is likely that a workload
+	 * may be gearing up to use more and more objects, don't free any
+	 * of them until the next round.
 	 */
 	while (obj_nr_tofree && obj_pool_free < debug_objects_pool_size) {
 		obj = hlist_entry(obj_to_free.first, typeof(*obj), node);
@@ -297,7 +312,10 @@ static void free_obj_work(struct work_struct *work)
 		obj_pool_free++;
 		obj_nr_tofree--;
 	}
+	raw_spin_unlock_irqrestore(&pool_lock, flags);
+	return;
 
+free_objs:
 	/*
 	 * Pool list is already full and there are still objs on the free
 	 * list. Move remaining free objs to a temporary list to free the
@@ -316,7 +334,7 @@ static void free_obj_work(struct work_struct *work)
 	}
 }
 
-static bool __free_object(struct debug_obj *obj)
+static void __free_object(struct debug_obj *obj)
 {
 	struct debug_obj *objs[ODEBUG_BATCH_SIZE];
 	struct debug_percpu_free *percpu_pool;
@@ -336,7 +354,7 @@ static bool __free_object(struct debug_obj *obj)
 		hlist_add_head(&obj->node, &percpu_pool->free_objs);
 		percpu_pool->obj_free++;
 		local_irq_restore(flags);
-		return false;
+		return;
 	}
 
 	/*
@@ -352,7 +370,8 @@ static bool __free_object(struct debug_obj *obj)
 
 free_to_obj_pool:
 	raw_spin_lock(&pool_lock);
-	work = (obj_pool_free > debug_objects_pool_size) && obj_cache;
+	work = (obj_pool_free > debug_objects_pool_size) && obj_cache &&
+	       (obj_nr_tofree < ODEBUG_FREE_WORK_MAX);
 	obj_pool_used--;
 
 	if (work) {
@@ -366,6 +385,21 @@ free_to_obj_pool:
 					       &obj_to_free);
 			}
 		}
+
+		if ((obj_pool_free > debug_objects_pool_size) &&
+		    (obj_nr_tofree < ODEBUG_FREE_WORK_MAX)) {
+			int i;
+
+			/*
+			 * Free one more batch of objects from obj_pool.
+			 */
+			for (i = 0; i < ODEBUG_BATCH_SIZE; i++) {
+				obj = __alloc_object(&obj_pool);
+				hlist_add_head(&obj->node, &obj_to_free);
+				obj_pool_free--;
+				obj_nr_tofree++;
+			}
+		}
 	} else {
 		obj_pool_free++;
 		hlist_add_head(&obj->node, &obj_pool);
@@ -380,7 +414,6 @@ free_to_obj_pool:
 	}
 	raw_spin_unlock(&pool_lock);
 	local_irq_restore(flags);
-	return work;
 }
 
 /*
@@ -389,8 +422,11 @@ free_to_obj_pool:
  */
 static void free_object(struct debug_obj *obj)
 {
-	if (__free_object(obj))
-		schedule_work(&debug_obj_work);
+	__free_object(obj);
+	if (!obj_freeing && obj_nr_tofree) {
+		WRITE_ONCE(obj_freeing, true);
+		schedule_delayed_work(&debug_obj_work, ODEBUG_FREE_WORK_DELAY);
+	}
 }
 
 /*
@@ -880,7 +916,6 @@ static void __debug_check_no_obj_freed(const void *address, unsigned long size)
 	struct hlist_node *tmp;
 	struct debug_obj *obj;
 	int cnt, objs_checked = 0;
-	bool work = false;
 
 	saddr = (unsigned long) address;
 	eaddr = saddr + size;
@@ -911,7 +946,7 @@ repeat:
 				goto repeat;
 			default:
 				hlist_del(&obj->node);
-				work |= __free_object(obj);
+				__free_object(obj);
 				break;
 			}
 		}
@@ -927,8 +962,10 @@ repeat:
 		debug_objects_maxchecked = objs_checked;
 
 	/* Schedule work to actually kmem_cache_free() objects */
-	if (work)
-		schedule_work(&debug_obj_work);
+	if (!obj_freeing && obj_nr_tofree) {
+		WRITE_ONCE(obj_freeing, true);
+		schedule_delayed_work(&debug_obj_work, ODEBUG_FREE_WORK_DELAY);
+	}
 }
 
 void debug_check_no_obj_freed(const void *address, unsigned long size)

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:core/debugobjects] debugobjects: Move printk out of db->lock critical sections
  2019-05-20 14:14 ` [PATCH 5/5] debugobjects: Move printk out of db lock critical sections Waiman Long
@ 2019-06-14 13:01   ` tip-bot for Waiman Long
  0 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Waiman Long @ 2019-06-14 13:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: joel, mingo, zhongjiang, linux-kernel, cai, hpa, tglx, longman,
	yang.shi, akpm

Commit-ID:  d5f34153e526903abe71869dbbc898bfc0f69373
Gitweb:     https://git.kernel.org/tip/d5f34153e526903abe71869dbbc898bfc0f69373
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Mon, 20 May 2019 10:14:50 -0400
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 14 Jun 2019 14:51:16 +0200

debugobjects: Move printk out of db->lock critical sections

The db->lock is a raw spinlock and so the lock hold time is supposed
to be short. This will not be the case when printk() is being involved
in some of the critical sections. In order to avoid the long hold time,
in case some messages need to be printed, the debug_object_is_on_stack()
and debug_print_object() calls are now moved out of those critical
sections.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: Qian Cai <cai@gmx.us>
Cc: Zhong Jiang <zhongjiang@huawei.com>
Link: https://lkml.kernel.org/r/20190520141450.7575-6-longman@redhat.com

---
 lib/debugobjects.c | 58 ++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index ede96c659552..61261195f5b6 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -528,6 +528,7 @@ static void
 __debug_object_init(void *addr, struct debug_obj_descr *descr, int onstack)
 {
 	enum debug_obj_state state;
+	bool check_stack = false;
 	struct debug_bucket *db;
 	struct debug_obj *obj;
 	unsigned long flags;
@@ -547,7 +548,7 @@ __debug_object_init(void *addr, struct debug_obj_descr *descr, int onstack)
 			debug_objects_oom();
 			return;
 		}
-		debug_object_is_on_stack(addr, onstack);
+		check_stack = true;
 	}
 
 	switch (obj->state) {
@@ -558,20 +559,23 @@ __debug_object_init(void *addr, struct debug_obj_descr *descr, int onstack)
 		break;
 
 	case ODEBUG_STATE_ACTIVE:
-		debug_print_object(obj, "init");
 		state = obj->state;
 		raw_spin_unlock_irqrestore(&db->lock, flags);
+		debug_print_object(obj, "init");
 		debug_object_fixup(descr->fixup_init, addr, state);
 		return;
 
 	case ODEBUG_STATE_DESTROYED:
+		raw_spin_unlock_irqrestore(&db->lock, flags);
 		debug_print_object(obj, "init");
-		break;
+		return;
 	default:
 		break;
 	}
 
 	raw_spin_unlock_irqrestore(&db->lock, flags);
+	if (check_stack)
+		debug_object_is_on_stack(addr, onstack);
 }
 
 /**
@@ -629,6 +633,8 @@ int debug_object_activate(void *addr, struct debug_obj_descr *descr)
 
 	obj = lookup_object(addr, db);
 	if (obj) {
+		bool print_object = false;
+
 		switch (obj->state) {
 		case ODEBUG_STATE_INIT:
 		case ODEBUG_STATE_INACTIVE:
@@ -637,14 +643,14 @@ int debug_object_activate(void *addr, struct debug_obj_descr *descr)
 			break;
 
 		case ODEBUG_STATE_ACTIVE:
-			debug_print_object(obj, "activate");
 			state = obj->state;
 			raw_spin_unlock_irqrestore(&db->lock, flags);
+			debug_print_object(obj, "activate");
 			ret = debug_object_fixup(descr->fixup_activate, addr, state);
 			return ret ? 0 : -EINVAL;
 
 		case ODEBUG_STATE_DESTROYED:
-			debug_print_object(obj, "activate");
+			print_object = true;
 			ret = -EINVAL;
 			break;
 		default:
@@ -652,10 +658,13 @@ int debug_object_activate(void *addr, struct debug_obj_descr *descr)
 			break;
 		}
 		raw_spin_unlock_irqrestore(&db->lock, flags);
+		if (print_object)
+			debug_print_object(obj, "activate");
 		return ret;
 	}
 
 	raw_spin_unlock_irqrestore(&db->lock, flags);
+
 	/*
 	 * We are here when a static object is activated. We
 	 * let the type specific code confirm whether this is
@@ -687,6 +696,7 @@ void debug_object_deactivate(void *addr, struct debug_obj_descr *descr)
 	struct debug_bucket *db;
 	struct debug_obj *obj;
 	unsigned long flags;
+	bool print_object = false;
 
 	if (!debug_objects_enabled)
 		return;
@@ -704,24 +714,27 @@ void debug_object_deactivate(void *addr, struct debug_obj_descr *descr)
 			if (!obj->astate)
 				obj->state = ODEBUG_STATE_INACTIVE;
 			else
-				debug_print_object(obj, "deactivate");
+				print_object = true;
 			break;
 
 		case ODEBUG_STATE_DESTROYED:
-			debug_print_object(obj, "deactivate");
+			print_object = true;
 			break;
 		default:
 			break;
 		}
-	} else {
+	}
+
+	raw_spin_unlock_irqrestore(&db->lock, flags);
+	if (!obj) {
 		struct debug_obj o = { .object = addr,
 				       .state = ODEBUG_STATE_NOTAVAILABLE,
 				       .descr = descr };
 
 		debug_print_object(&o, "deactivate");
+	} else if (print_object) {
+		debug_print_object(obj, "deactivate");
 	}
-
-	raw_spin_unlock_irqrestore(&db->lock, flags);
 }
 EXPORT_SYMBOL_GPL(debug_object_deactivate);
 
@@ -736,6 +749,7 @@ void debug_object_destroy(void *addr, struct debug_obj_descr *descr)
 	struct debug_bucket *db;
 	struct debug_obj *obj;
 	unsigned long flags;
+	bool print_object = false;
 
 	if (!debug_objects_enabled)
 		return;
@@ -755,20 +769,22 @@ void debug_object_destroy(void *addr, struct debug_obj_descr *descr)
 		obj->state = ODEBUG_STATE_DESTROYED;
 		break;
 	case ODEBUG_STATE_ACTIVE:
-		debug_print_object(obj, "destroy");
 		state = obj->state;
 		raw_spin_unlock_irqrestore(&db->lock, flags);
+		debug_print_object(obj, "destroy");
 		debug_object_fixup(descr->fixup_destroy, addr, state);
 		return;
 
 	case ODEBUG_STATE_DESTROYED:
-		debug_print_object(obj, "destroy");
+		print_object = true;
 		break;
 	default:
 		break;
 	}
 out_unlock:
 	raw_spin_unlock_irqrestore(&db->lock, flags);
+	if (print_object)
+		debug_print_object(obj, "destroy");
 }
 EXPORT_SYMBOL_GPL(debug_object_destroy);
 
@@ -797,9 +813,9 @@ void debug_object_free(void *addr, struct debug_obj_descr *descr)
 
 	switch (obj->state) {
 	case ODEBUG_STATE_ACTIVE:
-		debug_print_object(obj, "free");
 		state = obj->state;
 		raw_spin_unlock_irqrestore(&db->lock, flags);
+		debug_print_object(obj, "free");
 		debug_object_fixup(descr->fixup_free, addr, state);
 		return;
 	default:
@@ -872,6 +888,7 @@ debug_object_active_state(void *addr, struct debug_obj_descr *descr,
 	struct debug_bucket *db;
 	struct debug_obj *obj;
 	unsigned long flags;
+	bool print_object = false;
 
 	if (!debug_objects_enabled)
 		return;
@@ -887,22 +904,25 @@ debug_object_active_state(void *addr, struct debug_obj_descr *descr,
 			if (obj->astate == expect)
 				obj->astate = next;
 			else
-				debug_print_object(obj, "active_state");
+				print_object = true;
 			break;
 
 		default:
-			debug_print_object(obj, "active_state");
+			print_object = true;
 			break;
 		}
-	} else {
+	}
+
+	raw_spin_unlock_irqrestore(&db->lock, flags);
+	if (!obj) {
 		struct debug_obj o = { .object = addr,
 				       .state = ODEBUG_STATE_NOTAVAILABLE,
 				       .descr = descr };
 
 		debug_print_object(&o, "active_state");
+	} else if (print_object) {
+		debug_print_object(obj, "active_state");
 	}
-
-	raw_spin_unlock_irqrestore(&db->lock, flags);
 }
 EXPORT_SYMBOL_GPL(debug_object_active_state);
 
@@ -937,10 +957,10 @@ repeat:
 
 			switch (obj->state) {
 			case ODEBUG_STATE_ACTIVE:
-				debug_print_object(obj, "free");
 				descr = obj->descr;
 				state = obj->state;
 				raw_spin_unlock_irqrestore(&db->lock, flags);
+				debug_print_object(obj, "free");
 				debug_object_fixup(descr->fixup_free,
 						   (void *) oaddr, state);
 				goto repeat;

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-06-14 13:02 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-20 14:14 [PATCH 0/5] debugobjects: Reduce global pool_lock contention Waiman Long
2019-05-20 14:14 ` [PATCH 1/5] debugobjects: Add percpu free pools Waiman Long
2019-06-14 12:58   ` [tip:core/debugobjects] " tip-bot for Waiman Long
2019-05-20 14:14 ` [PATCH 2/5] debugobjects: Percpu pool lookahead freeing/allocation Waiman Long
2019-06-14 12:59   ` [tip:core/debugobjects] " tip-bot for Waiman Long
2019-05-20 14:14 ` [PATCH 3/5] debugobjects: Reduce number of pool_lock acquisitions in fill_pool() Waiman Long
2019-06-14 13:00   ` [tip:core/debugobjects] " tip-bot for Waiman Long
2019-05-20 14:14 ` [PATCH 4/5] debugobjects: Less aggressive freeing of excess debug objects Waiman Long
2019-06-14 13:01   ` [tip:core/debugobjects] " tip-bot for Waiman Long
2019-05-20 14:14 ` [PATCH 5/5] debugobjects: Move printk out of db lock critical sections Waiman Long
2019-06-14 13:01   ` [tip:core/debugobjects] debugobjects: Move printk out of db->lock " tip-bot for Waiman Long

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.