* [RESEND PATCH v2 0/3] debugobjects: Reduce global pool_lock contention
@ 2017-01-05 20:17 Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done Waiman Long
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Waiman Long @ 2017-01-05 20:17 UTC
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Du Changbin, Christian Borntraeger, Jan Stancek,
	Waiman Long

 v1->v2:
  - Move patch 2 in front of patch 1.
  - Fix merge conflict with linux-next.

This patchset aims to reduce contention on the global pool_lock
while improving performance at the same time. It resolves the
following soft lockup problem seen with a debug kernel on some
large SMP systems:

 NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
 ...
 RIP: 0010:[<ffffffff817c216b>]  [<ffffffff817c216b>]
	_raw_spin_unlock_irqrestore+0x3b/0x60
 ...
 Call Trace:
  [<ffffffff813f40d1>] free_object+0x81/0xb0
  [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff81284996>] ? file_free_rcu+0x36/0x60
  [<ffffffff81251712>] kmem_cache_free+0xd2/0x380
  [<ffffffff81284960>] ? fput+0x90/0x90
  [<ffffffff81284996>] file_free_rcu+0x36/0x60
  [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
  [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
  [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
  [<ffffffff810c59d1>] kthread+0x101/0x120
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff817c2d32>] ret_from_fork+0x22/0x50

On an 8-socket IvyBridge-EX system (120 cores, 240 threads), the
elapsed time of a 4.9-rc7 kernel parallel build (make -j 240) was
reduced from 7m57s to 7m19s (about 8%) with a patched 4.9-rc7 kernel.
There was also about a 10X reduction in the number of debug objects
being allocated from or freed back to the kmem cache during the
kernel build.

Waiman Long (3):
  debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
  debugobjects: Scale thresholds with # of CPUs
  debugobjects: Reduce contention on the global pool_lock

 lib/debugobjects.c | 57 ++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 45 insertions(+), 12 deletions(-)

-- 
1.8.3.1


* [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
  2017-01-05 20:17 [RESEND PATCH v2 0/3] debugobjects: Reduce global pool_lock contention Waiman Long
@ 2017-01-05 20:17 ` Waiman Long
  2017-02-04 16:22   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 2/3] debugobjects: Scale thresholds with # of CPUs Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock Waiman Long
  2 siblings, 1 reply; 12+ messages in thread
From: Waiman Long @ 2017-01-05 20:17 UTC
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Du Changbin, Christian Borntraeger, Jan Stancek,
	Waiman Long

New debugfs stat counters are added to track the numbers of
kmem_cache_alloc() and kmem_cache_free() function calls to get a
sense of how the internal debug objects cache management is performing.
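
For illustration only (not part of the patch), the two new fields
would appear at the end of the existing debugfs stats output,
nominally /sys/kernel/debug/debug_objects/stats; the values below
are made up:

 pool_min_free :821
 pool_used     :189
 pool_max_used :325
 objects_alloc :12416
 objects_freed :10923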

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 04c1ef7..d78673e 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -55,6 +55,12 @@ struct debug_bucket {
 
 static struct debug_obj_descr	*descr_test  __read_mostly;
 
+/*
+ * Track numbers of kmem_cache_alloc and kmem_cache_free done.
+ */
+static int			debug_objects_alloc;
+static int			debug_objects_freed;
+
 static void free_obj_work(struct work_struct *work);
 static DECLARE_WORK(debug_obj_work, free_obj_work);
 
@@ -102,6 +108,7 @@ static void fill_pool(void)
 
 		raw_spin_lock_irqsave(&pool_lock, flags);
 		hlist_add_head(&new->node, &obj_pool);
+		debug_objects_alloc++;
 		obj_pool_free++;
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
 	}
@@ -173,6 +180,7 @@ static void free_obj_work(struct work_struct *work)
 		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
 		hlist_del(&obj->node);
 		obj_pool_free--;
+		debug_objects_freed++;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
@@ -758,6 +766,8 @@ static int debug_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "pool_min_free :%d\n", obj_pool_min_free);
 	seq_printf(m, "pool_used     :%d\n", obj_pool_used);
 	seq_printf(m, "pool_max_used :%d\n", obj_pool_max_used);
+	seq_printf(m, "objects_alloc :%d\n", debug_objects_alloc);
+	seq_printf(m, "objects_freed :%d\n", debug_objects_freed);
 	return 0;
 }
 
-- 
1.8.3.1


* [RESEND PATCH v2 2/3] debugobjects: Scale thresholds with # of CPUs
  2017-01-05 20:17 [RESEND PATCH v2 0/3] debugobjects: Reduce global pool_lock contention Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done Waiman Long
@ 2017-01-05 20:17 ` Waiman Long
  2017-02-04 16:23   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock Waiman Long
  2 siblings, 1 reply; 12+ messages in thread
From: Waiman Long @ 2017-01-05 20:17 UTC
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Du Changbin, Christian Borntraeger, Jan Stancek,
	Waiman Long

On large SMP systems with hundreds of CPUs, the current thresholds
for allocating and freeing debug objects (256 and 1024 respectively)
may not work well. This can cause a lot of needless calls to
kmem_cache_alloc() and kmem_cache_free() on those systems.

To alleviate this thrashing problem, the object freeing threshold
is now increased to "1024 + # of CPUs * 32", while the object
allocation threshold is increased to "256 + # of CPUs * 4". That
should make the debug objects subsystem scale better with the number
of CPUs available in the system.
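
As a rough worked example (assuming num_possible_cpus() == 240, as
it would be on the 8-socket test system from the cover letter):

 debug_objects_pool_size      = 1024 + 240 * 32 = 8704
 debug_objects_pool_min_level =  256 + 240 *  4 = 1216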

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index d78673e..dc78217 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -52,7 +52,10 @@ struct debug_bucket {
 static int			debug_objects_warnings __read_mostly;
 static int			debug_objects_enabled __read_mostly
 				= CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
-
+static int			debug_objects_pool_size __read_mostly
+				= ODEBUG_POOL_SIZE;
+static int			debug_objects_pool_min_level __read_mostly
+				= ODEBUG_POOL_MIN_LEVEL;
 static struct debug_obj_descr	*descr_test  __read_mostly;
 
 /*
@@ -94,13 +97,13 @@ static void fill_pool(void)
 	struct debug_obj *new;
 	unsigned long flags;
 
-	if (likely(obj_pool_free >= ODEBUG_POOL_MIN_LEVEL))
+	if (likely(obj_pool_free >= debug_objects_pool_min_level))
 		return;
 
 	if (unlikely(!obj_cache))
 		return;
 
-	while (obj_pool_free < ODEBUG_POOL_MIN_LEVEL) {
+	while (obj_pool_free < debug_objects_pool_min_level) {
 
 		new = kmem_cache_zalloc(obj_cache, gfp);
 		if (!new)
@@ -176,7 +179,7 @@ static void free_obj_work(struct work_struct *work)
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > ODEBUG_POOL_SIZE) {
+	while (obj_pool_free > debug_objects_pool_size) {
 		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
 		hlist_del(&obj->node);
 		obj_pool_free--;
@@ -206,7 +209,7 @@ static void free_object(struct debug_obj *obj)
 	 * schedule work when the pool is filled and the cache is
 	 * initialized:
 	 */
-	if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
+	if (obj_pool_free > debug_objects_pool_size && obj_cache)
 		sched = 1;
 	hlist_add_head(&obj->node, &obj_pool);
 	obj_pool_free++;
@@ -1126,4 +1129,11 @@ void __init debug_objects_mem_init(void)
 		pr_warn("out of memory.\n");
 	} else
 		debug_objects_selftest();
+
+	/*
+	 * Increase the thresholds for allocating and freeing objects
+	 * according to the number of possible CPUs available in the system.
+	 */
+	debug_objects_pool_size += num_possible_cpus() * 32;
+	debug_objects_pool_min_level += num_possible_cpus() * 4;
 }
-- 
1.8.3.1


* [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock
  2017-01-05 20:17 [RESEND PATCH v2 0/3] debugobjects: Reduce global pool_lock contention Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 2/3] debugobjects: Scale thresholds with # of CPUs Waiman Long
@ 2017-01-05 20:17 ` Waiman Long
  2017-02-04 16:23   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2017-02-05 16:13   ` tip-bot for Waiman Long
  2 siblings, 2 replies; 12+ messages in thread
From: Waiman Long @ 2017-01-05 20:17 UTC
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Du Changbin, Christian Borntraeger, Jan Stancek,
	Waiman Long

On a large SMP system with many CPUs, the global pool_lock may become
a performance bottleneck as all the CPUs that need to allocate or
free debug objects have to take the lock. That can sometimes cause
soft lockups like:

 NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
 ...
 RIP: 0010:[<ffffffff817c216b>]  [<ffffffff817c216b>]
	_raw_spin_unlock_irqrestore+0x3b/0x60
 ...
 Call Trace:
  [<ffffffff813f40d1>] free_object+0x81/0xb0
  [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff81284996>] ? file_free_rcu+0x36/0x60
  [<ffffffff81251712>] kmem_cache_free+0xd2/0x380
  [<ffffffff81284960>] ? fput+0x90/0x90
  [<ffffffff81284996>] file_free_rcu+0x36/0x60
  [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
  [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
  [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
  [<ffffffff810c59d1>] kthread+0x101/0x120
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff817c2d32>] ret_from_fork+0x22/0x50

To reduce the amount of contention on the pool_lock, the actual
kmem_cache_free() of the debug objects will be delayed if the pool_lock
is busy. This will temporarily increase the number of free objects
available in the free pool when the system is busy. As a result,
the number of kmem_cache allocations and frees should be reduced. If
the lock cannot be taken, the worker simply returns; the surplus
objects stay in the pool until a later free schedules the work again.

This patch also groups the freeing of debug objects into batches of
four to reduce the total number of lock/unlock cycles, as illustrated
below.
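
As a back-of-the-envelope illustration (assuming every trylock
attempt succeeds), draining K surplus objects now takes about

 K / ODEBUG_FREE_BATCH = K / 4

lock/unlock cycles instead of K, e.g. roughly 250 instead of 1000
for K = 1000.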

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index dc78217..5476bbe 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -172,25 +172,38 @@ static struct debug_obj *lookup_object(void *addr, struct debug_bucket *b)
 
 /*
  * workqueue function to free objects.
+ *
+ * To reduce contention on the global pool_lock, the actual freeing of
+ * debug objects will be delayed if the pool_lock is busy. We also free
+ * the objects in a batch of 4 for each lock/unlock cycle.
  */
+#define ODEBUG_FREE_BATCH	4
 static void free_obj_work(struct work_struct *work)
 {
-	struct debug_obj *obj;
+	struct debug_obj *objs[ODEBUG_FREE_BATCH];
 	unsigned long flags;
+	int i;
 
-	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > debug_objects_pool_size) {
-		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
-		hlist_del(&obj->node);
-		obj_pool_free--;
-		debug_objects_freed++;
+	if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+		return;
+	while (obj_pool_free >= debug_objects_pool_size + ODEBUG_FREE_BATCH) {
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++) {
+			objs[i] = hlist_entry(obj_pool.first,
+					      typeof(*objs[0]), node);
+			hlist_del(&objs[i]->node);
+		}
+
+		obj_pool_free -= ODEBUG_FREE_BATCH;
+		debug_objects_freed += ODEBUG_FREE_BATCH;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
 		 */
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
-		kmem_cache_free(obj_cache, obj);
-		raw_spin_lock_irqsave(&pool_lock, flags);
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++)
+			kmem_cache_free(obj_cache, objs[i]);
+		if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+			return;
 	}
 	raw_spin_unlock_irqrestore(&pool_lock, flags);
 }
-- 
1.8.3.1


* [tip:core/debugobjects] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
  2017-01-05 20:17 ` [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done Waiman Long
@ 2017-02-04 16:22   ` tip-bot for Waiman Long
  2017-02-05 10:08     ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: tip-bot for Waiman Long @ 2017-02-04 16:22 UTC
  To: linux-tip-commits
  Cc: linux-kernel, longman, mingo, akpm, jstancek, changbin.du, hpa,
	borntraeger, tglx

Commit-ID:  c4b73aabd0989d93b82894417ae501690bd1db5e
Gitweb:     http://git.kernel.org/tip/c4b73aabd0989d93b82894417ae501690bd1db5e
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:17:03 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 4 Feb 2017 09:01:54 +0100

debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done

New debugfs stat counters are added to track the numbers of
kmem_cache_alloc() and kmem_cache_free() function calls to get a
sense of how the internal debug objects cache management is performing.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-2-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 lib/debugobjects.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 04c1ef7..d78673e 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -55,6 +55,12 @@ static int			debug_objects_enabled __read_mostly
 
 static struct debug_obj_descr	*descr_test  __read_mostly;
 
+/*
+ * Track numbers of kmem_cache_alloc and kmem_cache_free done.
+ */
+static int			debug_objects_alloc;
+static int			debug_objects_freed;
+
 static void free_obj_work(struct work_struct *work);
 static DECLARE_WORK(debug_obj_work, free_obj_work);
 
@@ -102,6 +108,7 @@ static void fill_pool(void)
 
 		raw_spin_lock_irqsave(&pool_lock, flags);
 		hlist_add_head(&new->node, &obj_pool);
+		debug_objects_alloc++;
 		obj_pool_free++;
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
 	}
@@ -173,6 +180,7 @@ static void free_obj_work(struct work_struct *work)
 		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
 		hlist_del(&obj->node);
 		obj_pool_free--;
+		debug_objects_freed++;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
@@ -758,6 +766,8 @@ static int debug_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "pool_min_free :%d\n", obj_pool_min_free);
 	seq_printf(m, "pool_used     :%d\n", obj_pool_used);
 	seq_printf(m, "pool_max_used :%d\n", obj_pool_max_used);
+	seq_printf(m, "objects_alloc :%d\n", debug_objects_alloc);
+	seq_printf(m, "objects_freed :%d\n", debug_objects_freed);
 	return 0;
 }
 


* [tip:core/debugobjects] debugobjects: Scale thresholds with # of CPUs
  2017-01-05 20:17 ` [RESEND PATCH v2 2/3] debugobjects: Scale thresholds with # of CPUs Waiman Long
@ 2017-02-04 16:23   ` tip-bot for Waiman Long
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Waiman Long @ 2017-02-04 16:23 UTC
  To: linux-tip-commits
  Cc: hpa, jstancek, tglx, changbin.du, akpm, longman, borntraeger,
	linux-kernel, mingo

Commit-ID:  97dd552eb23c83dbf626a6e84666c7e281375d47
Gitweb:     http://git.kernel.org/tip/97dd552eb23c83dbf626a6e84666c7e281375d47
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:17:04 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 4 Feb 2017 09:01:55 +0100

debugobjects: Scale thresholds with # of CPUs

On large SMP systems with hundreds of CPUs, the current thresholds
for allocating and freeing debug objects (256 and 1024 respectively)
may not work well. This can cause a lot of needless calls to
kmem_cache_alloc() and kmem_cache_free() on those systems.

To alleviate this thrashing problem, the object freeing threshold
is now increased to "1024 + # of CPUs * 32", while the object
allocation threshold is increased to "256 + # of CPUs * 4". That
should make the debug objects subsystem scale better with the number
of CPUs available in the system.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-3-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 lib/debugobjects.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index d78673e..dc78217 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -52,7 +52,10 @@ static int			debug_objects_fixups __read_mostly;
 static int			debug_objects_warnings __read_mostly;
 static int			debug_objects_enabled __read_mostly
 				= CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
-
+static int			debug_objects_pool_size __read_mostly
+				= ODEBUG_POOL_SIZE;
+static int			debug_objects_pool_min_level __read_mostly
+				= ODEBUG_POOL_MIN_LEVEL;
 static struct debug_obj_descr	*descr_test  __read_mostly;
 
 /*
@@ -94,13 +97,13 @@ static void fill_pool(void)
 	struct debug_obj *new;
 	unsigned long flags;
 
-	if (likely(obj_pool_free >= ODEBUG_POOL_MIN_LEVEL))
+	if (likely(obj_pool_free >= debug_objects_pool_min_level))
 		return;
 
 	if (unlikely(!obj_cache))
 		return;
 
-	while (obj_pool_free < ODEBUG_POOL_MIN_LEVEL) {
+	while (obj_pool_free < debug_objects_pool_min_level) {
 
 		new = kmem_cache_zalloc(obj_cache, gfp);
 		if (!new)
@@ -176,7 +179,7 @@ static void free_obj_work(struct work_struct *work)
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > ODEBUG_POOL_SIZE) {
+	while (obj_pool_free > debug_objects_pool_size) {
 		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
 		hlist_del(&obj->node);
 		obj_pool_free--;
@@ -206,7 +209,7 @@ static void free_object(struct debug_obj *obj)
 	 * schedule work when the pool is filled and the cache is
 	 * initialized:
 	 */
-	if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
+	if (obj_pool_free > debug_objects_pool_size && obj_cache)
 		sched = 1;
 	hlist_add_head(&obj->node, &obj_pool);
 	obj_pool_free++;
@@ -1126,4 +1129,11 @@ void __init debug_objects_mem_init(void)
 		pr_warn("out of memory.\n");
 	} else
 		debug_objects_selftest();
+
+	/*
+	 * Increase the thresholds for allocating and freeing objects
+	 * according to the number of possible CPUs available in the system.
+	 */
+	debug_objects_pool_size += num_possible_cpus() * 32;
+	debug_objects_pool_min_level += num_possible_cpus() * 4;
 }


* [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-01-05 20:17 ` [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock Waiman Long
@ 2017-02-04 16:23   ` tip-bot for Waiman Long
  2017-02-05 10:03     ` Ingo Molnar
  2017-02-05 16:13   ` tip-bot for Waiman Long
  1 sibling, 1 reply; 12+ messages in thread
From: tip-bot for Waiman Long @ 2017-02-04 16:23 UTC
  To: linux-tip-commits
  Cc: hpa, tglx, changbin.du, linux-kernel, mingo, longman, jstancek,
	akpm, borntraeger

Commit-ID:  6d2fea9837a584e706edad9b4b52833e31396736
Gitweb:     http://git.kernel.org/tip/6d2fea9837a584e706edad9b4b52833e31396736
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:17:05 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 4 Feb 2017 09:01:55 +0100

debugobjects: Reduce contention on the global pool_lock

On a large SMP system with many CPUs, the global pool_lock may become
a performance bottleneck as all the CPUs that need to allocate or
free debug objects have to take the lock. That can sometimes cause
soft lockups like:

 NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
 ...
 RIP: 0010:[<ffffffff817c216b>]  [<ffffffff817c216b>]
	_raw_spin_unlock_irqrestore+0x3b/0x60
 ...
 Call Trace:
  [<ffffffff813f40d1>] free_object+0x81/0xb0
  [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff81284996>] ? file_free_rcu+0x36/0x60
  [<ffffffff81251712>] kmem_cache_free+0xd2/0x380
  [<ffffffff81284960>] ? fput+0x90/0x90
  [<ffffffff81284996>] file_free_rcu+0x36/0x60
  [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
  [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
  [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
  [<ffffffff810c59d1>] kthread+0x101/0x120
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff817c2d32>] ret_from_fork+0x22/0x50

To reduce the amount of contention on the pool_lock, the actual
kmem_cache_free() of the debug objects will be delayed if the pool_lock
is busy. This will temporarily increase the number of free objects
available in the free pool when the system is busy. As a result,
the number of kmem_cache allocations and frees is reduced.

To further reduce the lock operations, debug objects are freed in
batches of four.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-4-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 lib/debugobjects.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index dc78217..5476bbe 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -172,25 +172,38 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 
 /*
  * workqueue function to free objects.
+ *
+ * To reduce contention on the global pool_lock, the actual freeing of
+ * debug objects will be delayed if the pool_lock is busy. We also free
+ * the objects in a batch of 4 for each lock/unlock cycle.
  */
+#define ODEBUG_FREE_BATCH	4
 static void free_obj_work(struct work_struct *work)
 {
-	struct debug_obj *obj;
+	struct debug_obj *objs[ODEBUG_FREE_BATCH];
 	unsigned long flags;
+	int i;
 
-	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > debug_objects_pool_size) {
-		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
-		hlist_del(&obj->node);
-		obj_pool_free--;
-		debug_objects_freed++;
+	if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+		return;
+	while (obj_pool_free >= debug_objects_pool_size + ODEBUG_FREE_BATCH) {
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++) {
+			objs[i] = hlist_entry(obj_pool.first,
+					      typeof(*objs[0]), node);
+			hlist_del(&objs[i]->node);
+		}
+
+		obj_pool_free -= ODEBUG_FREE_BATCH;
+		debug_objects_freed += ODEBUG_FREE_BATCH;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
 		 */
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
-		kmem_cache_free(obj_cache, obj);
-		raw_spin_lock_irqsave(&pool_lock, flags);
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++)
+			kmem_cache_free(obj_cache, objs[i]);
+		if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+			return;
 	}
 	raw_spin_unlock_irqrestore(&pool_lock, flags);
 }


* Re: [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-02-04 16:23   ` [tip:core/debugobjects] " tip-bot for Waiman Long
@ 2017-02-05 10:03     ` Ingo Molnar
  2017-02-06 22:09       ` Waiman Long
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2017-02-05 10:03 UTC
  To: hpa, tglx, linux-kernel, changbin.du, longman, jstancek, akpm,
	borntraeger
  Cc: linux-tip-commits


* tip-bot for Waiman Long <tipbot@zytor.com> wrote:

> ---
>  lib/debugobjects.c | 31 ++++++++++++++++++++++---------
>  1 file changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index dc78217..5476bbe 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -172,25 +172,38 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
>  
>  /*
>   * workqueue function to free objects.
> + *
> + * To reduce contention on the global pool_lock, the actual freeing of
> + * debug objects will be delayed if the pool_lock is busy. We also free
> + * the objects in a batch of 4 for each lock/unlock cycle.
>   */
> +#define ODEBUG_FREE_BATCH	4
>  static void free_obj_work(struct work_struct *work)
>  {

Please put an extra newline before function definitions.

Looks good otherwise!

Thanks,

	Ingo


* Re: [tip:core/debugobjects] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
  2017-02-04 16:22   ` [tip:core/debugobjects] " tip-bot for Waiman Long
@ 2017-02-05 10:08     ` Ingo Molnar
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2017-02-05 10:08 UTC
  To: borntraeger, tglx, hpa, linux-kernel, longman, changbin.du,
	jstancek, akpm
  Cc: linux-tip-commits


* tip-bot for Waiman Long <tipbot@zytor.com> wrote:

> Commit-ID:  c4b73aabd0989d93b82894417ae501690bd1db5e
> Gitweb:     http://git.kernel.org/tip/c4b73aabd0989d93b82894417ae501690bd1db5e
> Author:     Waiman Long <longman@redhat.com>
> AuthorDate: Thu, 5 Jan 2017 15:17:03 -0500
> Committer:  Thomas Gleixner <tglx@linutronix.de>
> CommitDate: Sat, 4 Feb 2017 09:01:54 +0100
> 
> debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
> 
> New debugfs stat counters are added to track the numbers of
> kmem_cache_alloc() and kmem_cache_free() function calls to get a
> sense of how the internal debug objects cache management is performing.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: "Du Changbin" <changbin.du@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jan Stancek <jstancek@redhat.com>
> Link: http://lkml.kernel.org/r/1483647425-4135-2-git-send-email-longman@redhat.com
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> 
> ---
>  lib/debugobjects.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 04c1ef7..d78673e 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -55,6 +55,12 @@ static int			debug_objects_enabled __read_mostly
>  
>  static struct debug_obj_descr	*descr_test  __read_mostly;
>  
> +/*
> + * Track numbers of kmem_cache_alloc and kmem_cache_free done.

Nit:

   /*
    * Track the number of kmem_cache_alloc()/free() calls done.
    */

Another nit:

> + */
> +static int			debug_objects_alloc;
> +static int			debug_objects_freed;

Yeah, so we want to either use past tense consistently:

   static int			debug_objects_allocated;
   static int			debug_objects_freed;

Or we want to use present tense consistently:

   static int			debug_objects_alloc;
   static int			debug_objects_free;

... but we don't want to mix the two when naming related counters!

( Btw., I'm for the _allocated/_freed pattern, that's the usual
  nomenclature for statistics counters. )

> +	seq_printf(m, "objects_alloc :%d\n", debug_objects_alloc);
> +	seq_printf(m, "objects_freed :%d\n", debug_objects_freed);

Ditto.

Thanks,

	Ingo  


* [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-01-05 20:17 ` [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock Waiman Long
  2017-02-04 16:23   ` [tip:core/debugobjects] " tip-bot for Waiman Long
@ 2017-02-05 16:13   ` tip-bot for Waiman Long
  1 sibling, 0 replies; 12+ messages in thread
From: tip-bot for Waiman Long @ 2017-02-05 16:13 UTC
  To: linux-tip-commits
  Cc: linux-kernel, changbin.du, jstancek, mingo, longman, hpa, tglx,
	borntraeger, akpm

Commit-ID:  858274b6a13b4db0e6fb451eea7f8817c42426a7
Gitweb:     http://git.kernel.org/tip/858274b6a13b4db0e6fb451eea7f8817c42426a7
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:17:05 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sun, 5 Feb 2017 17:09:32 +0100

debugobjects: Reduce contention on the global pool_lock

On a large SMP system with many CPUs, the global pool_lock may become
a performance bottleneck as all the CPUs that need to allocate or
free debug objects have to take the lock. That can sometimes cause
soft lockups like:

 NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
 ...
 RIP: 0010:[<ffffffff817c216b>]  [<ffffffff817c216b>]
	_raw_spin_unlock_irqrestore+0x3b/0x60
 ...
 Call Trace:
  [<ffffffff813f40d1>] free_object+0x81/0xb0
  [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff81284996>] ? file_free_rcu+0x36/0x60
  [<ffffffff81251712>] kmem_cache_free+0xd2/0x380
  [<ffffffff81284960>] ? fput+0x90/0x90
  [<ffffffff81284996>] file_free_rcu+0x36/0x60
  [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
  [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
  [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
  [<ffffffff810c59d1>] kthread+0x101/0x120
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff817c2d32>] ret_from_fork+0x22/0x50

To reduce the amount of contention on the pool_lock, the actual
kmem_cache_free() of the debug objects will be delayed if the pool_lock
is busy. This will temporarily increase the number of free objects
available in the free pool when the system is busy. As a result,
the number of kmem_cache allocations and frees is reduced.

To further reduce the lock operations, debug objects are freed in
batches of four.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-4-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 lib/debugobjects.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index dc78217..5e1bf2f 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -172,25 +172,39 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 
 /*
  * workqueue function to free objects.
+ *
+ * To reduce contention on the global pool_lock, the actual freeing of
+ * debug objects will be delayed if the pool_lock is busy. We also free
+ * the objects in a batch of 4 for each lock/unlock cycle.
  */
+#define ODEBUG_FREE_BATCH	4
+
 static void free_obj_work(struct work_struct *work)
 {
-	struct debug_obj *obj;
+	struct debug_obj *objs[ODEBUG_FREE_BATCH];
 	unsigned long flags;
+	int i;
 
-	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > debug_objects_pool_size) {
-		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
-		hlist_del(&obj->node);
-		obj_pool_free--;
-		debug_objects_freed++;
+	if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+		return;
+	while (obj_pool_free >= debug_objects_pool_size + ODEBUG_FREE_BATCH) {
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++) {
+			objs[i] = hlist_entry(obj_pool.first,
+					      typeof(*objs[0]), node);
+			hlist_del(&objs[i]->node);
+		}
+
+		obj_pool_free -= ODEBUG_FREE_BATCH;
+		debug_objects_freed += ODEBUG_FREE_BATCH;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
 		 */
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
-		kmem_cache_free(obj_cache, obj);
-		raw_spin_lock_irqsave(&pool_lock, flags);
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++)
+			kmem_cache_free(obj_cache, objs[i]);
+		if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+			return;
 	}
 	raw_spin_unlock_irqrestore(&pool_lock, flags);
 }


* Re: [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-02-05 10:03     ` Ingo Molnar
@ 2017-02-06 22:09       ` Waiman Long
  2017-02-06 22:50         ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Waiman Long @ 2017-02-06 22:09 UTC
  To: Ingo Molnar, hpa, tglx, linux-kernel, changbin.du, jstancek,
	akpm, borntraeger
  Cc: linux-tip-commits

On 02/05/2017 05:03 AM, Ingo Molnar wrote:
> * tip-bot for Waiman Long <tipbot@zytor.com> wrote:
>
>> ---
>>  lib/debugobjects.c | 31 ++++++++++++++++++++++---------
>>  1 file changed, 22 insertions(+), 9 deletions(-)
>>
>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>> index dc78217..5476bbe 100644
>> --- a/lib/debugobjects.c
>> +++ b/lib/debugobjects.c
>> @@ -172,25 +172,38 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
>>  
>>  /*
>>   * workqueue function to free objects.
>> + *
>> + * To reduce contention on the global pool_lock, the actual freeing of
>> + * debug objects will be delayed if the pool_lock is busy. We also free
>> + * the objects in a batch of 4 for each lock/unlock cycle.
>>   */
>> +#define ODEBUG_FREE_BATCH	4
>>  static void free_obj_work(struct work_struct *work)
>>  {
> Please put an extra newline before function definitions.
>
> Looks good otherwise!
>
> Thanks,
>
> 	Ingo


Sure, I can do that. BTW the debugobjects patch was also pulled into
the -mm tree a little while ago. Will that be a problem?

-Longman


* Re: [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-02-06 22:09       ` Waiman Long
@ 2017-02-06 22:50         ` Ingo Molnar
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2017-02-06 22:50 UTC
  To: Waiman Long, Andrew Morton
  Cc: hpa, tglx, linux-kernel, changbin.du, jstancek, akpm,
	borntraeger, linux-tip-commits


* Waiman Long <longman@redhat.com> wrote:

> On 02/05/2017 05:03 AM, Ingo Molnar wrote:
> > * tip-bot for Waiman Long <tipbot@zytor.com> wrote:
> >
> >> ---
> >>  lib/debugobjects.c | 31 ++++++++++++++++++++++---------
> >>  1 file changed, 22 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> >> index dc78217..5476bbe 100644
> >> --- a/lib/debugobjects.c
> >> +++ b/lib/debugobjects.c
> >> @@ -172,25 +172,38 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
> >>  
> >>  /*
> >>   * workqueue function to free objects.
> >> + *
> >> + * To reduce contention on the global pool_lock, the actual freeing of
> >> + * debug objects will be delayed if the pool_lock is busy. We also free
> >> + * the objects in a batch of 4 for each lock/unlock cycle.
> >>   */
> >> +#define ODEBUG_FREE_BATCH	4
> >>  static void free_obj_work(struct work_struct *work)
> >>  {
> > Please put an extra newline before function definitions.
> >
> > Looks good otherwise!
> >
> > Thanks,
> >
> > 	Ingo
> 
> 
> Sure, I can do that. BTW the debugobjects patch was also pull into the
> -mm tree a little while ago. Will that be a problem?

Should not be a problem usually; Andrew typically drops such patches if they show
up in a maintainer tree via linux-next.

And once accepted we don't silently drop patches from -tip hosted maintainer 
trees, so this is a reliable workflow.

Thanks,

	Ingo

