* [RESEND PATCH v2 0/3] debugobjects: Reduce global pool_lock contention
@ 2017-01-05 20:17 Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done Waiman Long
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Waiman Long @ 2017-01-05 20:17 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Du Changbin, Christian Borntraeger, Jan Stancek,
	Waiman Long

 v1->v2:
  - Move patch 2 in front of patch 1.
  - Fix merge conflict with linux-next.

This patchset aims to reduce contention on the global pool_lock
while improving performance at the same time. It resolves the
following soft lockup seen with a debug kernel on some large SMP
systems:

 NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
 ...
 RIP: 0010:[<ffffffff817c216b>]  [<ffffffff817c216b>]
	_raw_spin_unlock_irqrestore+0x3b/0x60
 ...
 Call Trace:
  [<ffffffff813f40d1>] free_object+0x81/0xb0
  [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff81284996>] ? file_free_rcu+0x36/0x60
  [<ffffffff81251712>] kmem_cache_free+0xd2/0x380
  [<ffffffff81284960>] ? fput+0x90/0x90
  [<ffffffff81284996>] file_free_rcu+0x36/0x60
  [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
  [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
  [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
  [<ffffffff810c59d1>] kthread+0x101/0x120
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff817c2d32>] ret_from_fork+0x22/0x50

On an 8-socket IvyBridge-EX system (120 cores, 240 threads), the
elapsed time of a 4.9-rc7 kernel parallel build (make -j 240) was
reduced from 7m57s to 7m19s (about an 8% improvement) with a patched
4.9-rc7 kernel. There was also about a 10X reduction in the number
of debug objects being allocated from or freed to the kmem_cache
during the kernel build.

Waiman Long (3):
  debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
  debugobjects: Scale thresholds with # of CPUs
  debugobjects: Reduce contention on the global pool_lock

 lib/debugobjects.c | 57 ++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 45 insertions(+), 12 deletions(-)

-- 
1.8.3.1

* [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
  2017-01-05 20:17 [RESEND PATCH v2 0/3] debugobjects: Reduce global pool_lock contention Waiman Long
@ 2017-01-05 20:17 ` Waiman Long
  2017-02-04 16:22   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 2/3] debugobjects: Scale thresholds with # of CPUs Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock Waiman Long
  2 siblings, 1 reply; 12+ messages in thread
From: Waiman Long @ 2017-01-05 20:17 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Du Changbin, Christian Borntraeger, Jan Stancek,
	Waiman Long

New debugfs stat counters are added to track the numbers of
kmem_cache_alloc() and kmem_cache_free() function calls to get a
sense of how the internal debug objects cache management is performing.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 04c1ef7..d78673e 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -55,6 +55,12 @@ struct debug_bucket {
 
 static struct debug_obj_descr	*descr_test  __read_mostly;
 
+/*
+ * Track numbers of kmem_cache_alloc and kmem_cache_free done.
+ */
+static int			debug_objects_alloc;
+static int			debug_objects_freed;
+
 static void free_obj_work(struct work_struct *work);
 static DECLARE_WORK(debug_obj_work, free_obj_work);
 
@@ -102,6 +108,7 @@ static void fill_pool(void)
 
 		raw_spin_lock_irqsave(&pool_lock, flags);
 		hlist_add_head(&new->node, &obj_pool);
+		debug_objects_alloc++;
 		obj_pool_free++;
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
 	}
@@ -173,6 +180,7 @@ static void free_obj_work(struct work_struct *work)
 		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
 		hlist_del(&obj->node);
 		obj_pool_free--;
+		debug_objects_freed++;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
@@ -758,6 +766,8 @@ static int debug_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "pool_min_free :%d\n", obj_pool_min_free);
 	seq_printf(m, "pool_used     :%d\n", obj_pool_used);
 	seq_printf(m, "pool_max_used :%d\n", obj_pool_max_used);
+	seq_printf(m, "objects_alloc :%d\n", debug_objects_alloc);
+	seq_printf(m, "objects_freed :%d\n", debug_objects_freed);
 	return 0;
 }
 
-- 
1.8.3.1
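
With this patch applied, the two new counters appear at the end of the
debugfs stats file, which debugobjects exposes at
/sys/kernel/debug/debug_objects (with debugfs mounted in the usual
place). A hypothetical read might look like the following; all values
are illustrative and only the fields visible in the diff above are
shown:

 # cat /sys/kernel/debug/debug_objects
 ...
 pool_min_free :1016
 pool_used     :229
 pool_max_used :1343
 objects_alloc :12078
 objects_freed :10974

Sampling objects_alloc and objects_freed before and after a workload is
one way to quantify the roughly 10X reduction in kmem_cache traffic
quoted in the cover letter.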

* [RESEND PATCH v2 2/3] debugobjects: Scale thresholds with # of CPUs
  2017-01-05 20:17 [RESEND PATCH v2 0/3] debugobjects: Reduce global pool_lock contention Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done Waiman Long
@ 2017-01-05 20:17 ` Waiman Long
  2017-02-04 16:23   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock Waiman Long
  2 siblings, 1 reply; 12+ messages in thread
From: Waiman Long @ 2017-01-05 20:17 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Du Changbin, Christian Borntraeger, Jan Stancek,
	Waiman Long

On large SMP systems with hundreds of CPUs, the current thresholds
for allocating and freeing debug objects (256 and 1024 respectively)
may not work well. This can cause a lot of needless calls to
kmem_cache_alloc() and kmem_cache_free() on those systems.

To alleviate this thrashing problem, the object freeing threshold
is now increased to "1024 + # of CPUs * 32", while the object
allocation threshold is increased to "256 + # of CPUs * 4". That
should make the debug objects subsystem scale better with the number
of CPUs available in the system.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index d78673e..dc78217 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -52,7 +52,10 @@ struct debug_bucket {
 static int			debug_objects_warnings __read_mostly;
 static int			debug_objects_enabled __read_mostly
 				= CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
-
+static int			debug_objects_pool_size __read_mostly
+				= ODEBUG_POOL_SIZE;
+static int			debug_objects_pool_min_level __read_mostly
+				= ODEBUG_POOL_MIN_LEVEL;
 static struct debug_obj_descr	*descr_test  __read_mostly;
 
 /*
@@ -94,13 +97,13 @@ static void fill_pool(void)
 	struct debug_obj *new;
 	unsigned long flags;
 
-	if (likely(obj_pool_free >= ODEBUG_POOL_MIN_LEVEL))
+	if (likely(obj_pool_free >= debug_objects_pool_min_level))
 		return;
 
 	if (unlikely(!obj_cache))
 		return;
 
-	while (obj_pool_free < ODEBUG_POOL_MIN_LEVEL) {
+	while (obj_pool_free < debug_objects_pool_min_level) {
 
 		new = kmem_cache_zalloc(obj_cache, gfp);
 		if (!new)
@@ -176,7 +179,7 @@ static void free_obj_work(struct work_struct *work)
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > ODEBUG_POOL_SIZE) {
+	while (obj_pool_free > debug_objects_pool_size) {
 		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
 		hlist_del(&obj->node);
 		obj_pool_free--;
@@ -206,7 +209,7 @@ static void free_object(struct debug_obj *obj)
 	 * schedule work when the pool is filled and the cache is
 	 * initialized:
 	 */
-	if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
+	if (obj_pool_free > debug_objects_pool_size && obj_cache)
 		sched = 1;
 	hlist_add_head(&obj->node, &obj_pool);
 	obj_pool_free++;
@@ -1126,4 +1129,11 @@ void __init debug_objects_mem_init(void)
 		pr_warn("out of memory.\n");
 	} else
 		debug_objects_selftest();
+
+	/*
+	 * Increase the thresholds for allocating and freeing objects
+	 * according to the number of possible CPUs available in the system.
+	 */
+	debug_objects_pool_size += num_possible_cpus() * 32;
+	debug_objects_pool_min_level += num_possible_cpus() * 4;
 }
-- 
1.8.3.1
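
As a worked example of the new thresholds, assume num_possible_cpus()
returns 240, matching the 8-socket system from the cover letter (note
that the possible-CPU count reported by firmware can exceed the number
of online CPUs):

 debug_objects_pool_size      = 1024 + 240 * 32 = 8704
 debug_objects_pool_min_level =  256 + 240 *  4 = 1216

On a small 4-CPU machine the same formulas give 1152 and 272
respectively, so systems that do not need the extra headroom stay close
to the old static limits.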

* [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock
  2017-01-05 20:17 [RESEND PATCH v2 0/3] debugobjects: Reduce global pool_lock contention Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done Waiman Long
  2017-01-05 20:17 ` [RESEND PATCH v2 2/3] debugobjects: Scale thresholds with # of CPUs Waiman Long
@ 2017-01-05 20:17 ` Waiman Long
  2017-02-04 16:23   ` [tip:core/debugobjects] " tip-bot for Waiman Long
  2017-02-05 16:13   ` tip-bot for Waiman Long
  2 siblings, 2 replies; 12+ messages in thread
From: Waiman Long @ 2017-01-05 20:17 UTC (permalink / raw)
  To: Thomas Gleixner, Andrew Morton
  Cc: linux-kernel, Du Changbin, Christian Borntraeger, Jan Stancek,
	Waiman Long

On a large SMP system with many CPUs, the global pool_lock may become
a performance bottleneck as all the CPUs that need to allocate or
free debug objects have to take the lock. That can sometimes cause
soft lockups like:

 NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
 ...
 RIP: 0010:[<ffffffff817c216b>]  [<ffffffff817c216b>]
	_raw_spin_unlock_irqrestore+0x3b/0x60
 ...
 Call Trace:
  [<ffffffff813f40d1>] free_object+0x81/0xb0
  [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff81284996>] ? file_free_rcu+0x36/0x60
  [<ffffffff81251712>] kmem_cache_free+0xd2/0x380
  [<ffffffff81284960>] ? fput+0x90/0x90
  [<ffffffff81284996>] file_free_rcu+0x36/0x60
  [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
  [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
  [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
  [<ffffffff810c59d1>] kthread+0x101/0x120
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff817c2d32>] ret_from_fork+0x22/0x50

To reduce the amount of contention on the pool_lock, the actual
kmem_cache_free() of the debug objects will be delayed if the pool_lock
is busy. This will temporarily increase the number of free objects
available in the free pool when the system is busy. As a result,
the number of kmem_cache allocations and frees should be reduced.

This patch also groups the freeing of debug objects into batches of
four to reduce the total number of lock/unlock cycles.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/debugobjects.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index dc78217..5476bbe 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -172,25 +172,38 @@ static struct debug_obj *lookup_object(void *addr, struct debug_bucket *b)
 
 /*
  * workqueue function to free objects.
+ *
+ * To reduce contention on the global pool_lock, the actual freeing of
+ * debug objects will be delayed if the pool_lock is busy. We also free
+ * the objects in a batch of 4 for each lock/unlock cycle.
  */
+#define ODEBUG_FREE_BATCH	4
 static void free_obj_work(struct work_struct *work)
 {
-	struct debug_obj *obj;
+	struct debug_obj *objs[ODEBUG_FREE_BATCH];
 	unsigned long flags;
+	int i;
 
-	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > debug_objects_pool_size) {
-		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
-		hlist_del(&obj->node);
-		obj_pool_free--;
-		debug_objects_freed++;
+	if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+		return;
+	while (obj_pool_free >= debug_objects_pool_size + ODEBUG_FREE_BATCH) {
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++) {
+			objs[i] = hlist_entry(obj_pool.first,
+					      typeof(*objs[0]), node);
+			hlist_del(&objs[i]->node);
+		}
+
+		obj_pool_free -= ODEBUG_FREE_BATCH;
+		debug_objects_freed += ODEBUG_FREE_BATCH;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
 		 */
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
-		kmem_cache_free(obj_cache, obj);
-		raw_spin_lock_irqsave(&pool_lock, flags);
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++)
+			kmem_cache_free(obj_cache, objs[i]);
+		if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+			return;
 	}
 	raw_spin_unlock_irqrestore(&pool_lock, flags);
 }
-- 
1.8.3.1
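
The same trylock-and-batch pattern can be illustrated outside the
kernel. The sketch below is a hypothetical userspace analogue using
POSIX threads, not the patch itself: pthread_mutex_trylock() stands in
for raw_spin_trylock_irqsave(), free() stands in for kmem_cache_free(),
and all names (pool, pool_free, pool_target, reclaim_work) are made up
for illustration.

 #include <pthread.h>
 #include <stdlib.h>

 #define FREE_BATCH 4	/* objects freed per lock/unlock cycle */

 static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
 static void *pool[8192];	/* cached free objects, guarded by pool_lock */
 static int pool_free;		/* current depth of the cache */
 static int pool_target = 256;	/* keep at most this many cached */

 /* Opportunistic reclaim: give up instead of waiting when the lock is
  * contended, and amortize each lock/unlock over FREE_BATCH frees. */
 static void reclaim_work(void)
 {
 	void *batch[FREE_BATCH];
 	int i;

 	if (pthread_mutex_trylock(&pool_lock))
 		return;			/* lock busy: retry later */
 	while (pool_free >= pool_target + FREE_BATCH) {
 		for (i = 0; i < FREE_BATCH; i++)
 			batch[i] = pool[--pool_free];
 		/* Drop the lock across the actual frees, just as the
 		 * patch drops pool_lock across kmem_cache_free(). */
 		pthread_mutex_unlock(&pool_lock);
 		for (i = 0; i < FREE_BATCH; i++)
 			free(batch[i]);
 		if (pthread_mutex_trylock(&pool_lock))
 			return;		/* busy again: defer the rest */
 	}
 	pthread_mutex_unlock(&pool_lock);
 }

If the trylock fails, reclaim simply runs again the next time the work
item is scheduled, which is the trade-off the patch makes: a transiently
larger pool in exchange for never spinning on a hot lock.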

* [tip:core/debugobjects] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
  2017-01-05 20:17 ` [RESEND PATCH v2 1/3] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done Waiman Long
@ 2017-02-04 16:22   ` tip-bot for Waiman Long
  2017-02-05 10:08     ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: tip-bot for Waiman Long @ 2017-02-04 16:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, longman, mingo, akpm, jstancek, changbin.du, hpa,
	borntraeger, tglx

Commit-ID:  c4b73aabd0989d93b82894417ae501690bd1db5e
Gitweb:     http://git.kernel.org/tip/c4b73aabd0989d93b82894417ae501690bd1db5e
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:17:03 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 4 Feb 2017 09:01:54 +0100

debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done

New debugfs stat counters are added to track the numbers of
kmem_cache_alloc() and kmem_cache_free() function calls to get a
sense of how the internal debug objects cache management is performing.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-2-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 lib/debugobjects.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 04c1ef7..d78673e 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -55,6 +55,12 @@ static int			debug_objects_enabled __read_mostly
 
 static struct debug_obj_descr	*descr_test  __read_mostly;
 
+/*
+ * Track numbers of kmem_cache_alloc and kmem_cache_free done.
+ */
+static int			debug_objects_alloc;
+static int			debug_objects_freed;
+
 static void free_obj_work(struct work_struct *work);
 static DECLARE_WORK(debug_obj_work, free_obj_work);
 
@@ -102,6 +108,7 @@ static void fill_pool(void)
 
 		raw_spin_lock_irqsave(&pool_lock, flags);
 		hlist_add_head(&new->node, &obj_pool);
+		debug_objects_alloc++;
 		obj_pool_free++;
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
 	}
@@ -173,6 +180,7 @@ static void free_obj_work(struct work_struct *work)
 		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
 		hlist_del(&obj->node);
 		obj_pool_free--;
+		debug_objects_freed++;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
@@ -758,6 +766,8 @@ static int debug_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, "pool_min_free :%d\n", obj_pool_min_free);
 	seq_printf(m, "pool_used     :%d\n", obj_pool_used);
 	seq_printf(m, "pool_max_used :%d\n", obj_pool_max_used);
+	seq_printf(m, "objects_alloc :%d\n", debug_objects_alloc);
+	seq_printf(m, "objects_freed :%d\n", debug_objects_freed);
 	return 0;
 }
 

* [tip:core/debugobjects] debugobjects: Scale thresholds with # of CPUs
  2017-01-05 20:17 ` [RESEND PATCH v2 2/3] debugobjects: Scale thresholds with # of CPUs Waiman Long
@ 2017-02-04 16:23   ` tip-bot for Waiman Long
  0 siblings, 0 replies; 12+ messages in thread
From: tip-bot for Waiman Long @ 2017-02-04 16:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, jstancek, tglx, changbin.du, akpm, longman, borntraeger,
	linux-kernel, mingo

Commit-ID:  97dd552eb23c83dbf626a6e84666c7e281375d47
Gitweb:     http://git.kernel.org/tip/97dd552eb23c83dbf626a6e84666c7e281375d47
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:17:04 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 4 Feb 2017 09:01:55 +0100

debugobjects: Scale thresholds with # of CPUs

On large SMP systems with hundreds of CPUs, the current thresholds
for allocating and freeing debug objects (256 and 1024 respectively)
may not work well. This can cause a lot of needless calls to
kmem_cache_alloc() and kmem_cache_free() on those systems.

To alleviate this thrashing problem, the object freeing threshold
is now increased to "1024 + # of CPUs * 32", while the object
allocation threshold is increased to "256 + # of CPUs * 4". That
should make the debug objects subsystem scale better with the number
of CPUs available in the system.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-3-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 lib/debugobjects.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index d78673e..dc78217 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -52,7 +52,10 @@ static int			debug_objects_fixups __read_mostly;
 static int			debug_objects_warnings __read_mostly;
 static int			debug_objects_enabled __read_mostly
 				= CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT;
-
+static int			debug_objects_pool_size __read_mostly
+				= ODEBUG_POOL_SIZE;
+static int			debug_objects_pool_min_level __read_mostly
+				= ODEBUG_POOL_MIN_LEVEL;
 static struct debug_obj_descr	*descr_test  __read_mostly;
 
 /*
@@ -94,13 +97,13 @@ static void fill_pool(void)
 	struct debug_obj *new;
 	unsigned long flags;
 
-	if (likely(obj_pool_free >= ODEBUG_POOL_MIN_LEVEL))
+	if (likely(obj_pool_free >= debug_objects_pool_min_level))
 		return;
 
 	if (unlikely(!obj_cache))
 		return;
 
-	while (obj_pool_free < ODEBUG_POOL_MIN_LEVEL) {
+	while (obj_pool_free < debug_objects_pool_min_level) {
 
 		new = kmem_cache_zalloc(obj_cache, gfp);
 		if (!new)
@@ -176,7 +179,7 @@ static void free_obj_work(struct work_struct *work)
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > ODEBUG_POOL_SIZE) {
+	while (obj_pool_free > debug_objects_pool_size) {
 		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
 		hlist_del(&obj->node);
 		obj_pool_free--;
@@ -206,7 +209,7 @@ static void free_object(struct debug_obj *obj)
 	 * schedule work when the pool is filled and the cache is
 	 * initialized:
 	 */
-	if (obj_pool_free > ODEBUG_POOL_SIZE && obj_cache)
+	if (obj_pool_free > debug_objects_pool_size && obj_cache)
 		sched = 1;
 	hlist_add_head(&obj->node, &obj_pool);
 	obj_pool_free++;
@@ -1126,4 +1129,11 @@ void __init debug_objects_mem_init(void)
 		pr_warn("out of memory.\n");
 	} else
 		debug_objects_selftest();
+
+	/*
+	 * Increase the thresholds for allocating and freeing objects
+	 * according to the number of possible CPUs available in the system.
+	 */
+	debug_objects_pool_size += num_possible_cpus() * 32;
+	debug_objects_pool_min_level += num_possible_cpus() * 4;
 }

* [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-01-05 20:17 ` [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock Waiman Long
@ 2017-02-04 16:23   ` tip-bot for Waiman Long
  2017-02-05 10:03     ` Ingo Molnar
  2017-02-05 16:13   ` tip-bot for Waiman Long
  1 sibling, 1 reply; 12+ messages in thread
From: tip-bot for Waiman Long @ 2017-02-04 16:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, tglx, changbin.du, linux-kernel, mingo, longman, jstancek,
	akpm, borntraeger

Commit-ID:  6d2fea9837a584e706edad9b4b52833e31396736
Gitweb:     http://git.kernel.org/tip/6d2fea9837a584e706edad9b4b52833e31396736
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:17:05 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 4 Feb 2017 09:01:55 +0100

debugobjects: Reduce contention on the global pool_lock

On a large SMP system with many CPUs, the global pool_lock may become
a performance bottleneck as all the CPUs that need to allocate or
free debug objects have to take the lock. That can sometimes cause
soft lockups like:

 NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
 ...
 RIP: 0010:[<ffffffff817c216b>]  [<ffffffff817c216b>]
	_raw_spin_unlock_irqrestore+0x3b/0x60
 ...
 Call Trace:
  [<ffffffff813f40d1>] free_object+0x81/0xb0
  [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff81284996>] ? file_free_rcu+0x36/0x60
  [<ffffffff81251712>] kmem_cache_free+0xd2/0x380
  [<ffffffff81284960>] ? fput+0x90/0x90
  [<ffffffff81284996>] file_free_rcu+0x36/0x60
  [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
  [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
  [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
  [<ffffffff810c59d1>] kthread+0x101/0x120
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff817c2d32>] ret_from_fork+0x22/0x50

To reduce the amount of contention on the pool_lock, the actual
kmem_cache_free() of the debug objects will be delayed if the pool_lock
is busy. This will temporarily increase the number of free objects
available in the free pool when the system is busy. As a result,
the number of kmem_cache allocations and frees is reduced.

To further reduce the lock operations, free debug objects in batches
of four.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-4-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 lib/debugobjects.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index dc78217..5476bbe 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -172,25 +172,38 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 
 /*
  * workqueue function to free objects.
+ *
+ * To reduce contention on the global pool_lock, the actual freeing of
+ * debug objects will be delayed if the pool_lock is busy. We also free
+ * the objects in a batch of 4 for each lock/unlock cycle.
  */
+#define ODEBUG_FREE_BATCH	4
 static void free_obj_work(struct work_struct *work)
 {
-	struct debug_obj *obj;
+	struct debug_obj *objs[ODEBUG_FREE_BATCH];
 	unsigned long flags;
+	int i;
 
-	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > debug_objects_pool_size) {
-		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
-		hlist_del(&obj->node);
-		obj_pool_free--;
-		debug_objects_freed++;
+	if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+		return;
+	while (obj_pool_free >= debug_objects_pool_size + ODEBUG_FREE_BATCH) {
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++) {
+			objs[i] = hlist_entry(obj_pool.first,
+					      typeof(*objs[0]), node);
+			hlist_del(&objs[i]->node);
+		}
+
+		obj_pool_free -= ODEBUG_FREE_BATCH;
+		debug_objects_freed += ODEBUG_FREE_BATCH;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
 		 */
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
-		kmem_cache_free(obj_cache, obj);
-		raw_spin_lock_irqsave(&pool_lock, flags);
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++)
+			kmem_cache_free(obj_cache, objs[i]);
+		if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+			return;
 	}
 	raw_spin_unlock_irqrestore(&pool_lock, flags);
 }

* Re: [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-02-04 16:23   ` [tip:core/debugobjects] " tip-bot for Waiman Long
@ 2017-02-05 10:03     ` Ingo Molnar
  2017-02-06 22:09       ` Waiman Long
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2017-02-05 10:03 UTC (permalink / raw)
  To: hpa, tglx, linux-kernel, changbin.du, longman, jstancek, akpm,
	borntraeger
  Cc: linux-tip-commits


* tip-bot for Waiman Long <tipbot@zytor.com> wrote:

> ---
>  lib/debugobjects.c | 31 ++++++++++++++++++++++---------
>  1 file changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index dc78217..5476bbe 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -172,25 +172,38 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
>  
>  /*
>   * workqueue function to free objects.
> + *
> + * To reduce contention on the global pool_lock, the actual freeing of
> + * debug objects will be delayed if the pool_lock is busy. We also free
> + * the objects in a batch of 4 for each lock/unlock cycle.
>   */
> +#define ODEBUG_FREE_BATCH	4
>  static void free_obj_work(struct work_struct *work)
>  {

Please put an extra newline before function definitions.

Looks good otherwise!

Thanks,

	Ingo

* Re: [tip:core/debugobjects] debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
  2017-02-04 16:22   ` [tip:core/debugobjects] " tip-bot for Waiman Long
@ 2017-02-05 10:08     ` Ingo Molnar
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2017-02-05 10:08 UTC (permalink / raw)
  To: borntraeger, tglx, hpa, linux-kernel, longman, changbin.du,
	jstancek, akpm
  Cc: linux-tip-commits


* tip-bot for Waiman Long <tipbot@zytor.com> wrote:

> Commit-ID:  c4b73aabd0989d93b82894417ae501690bd1db5e
> Gitweb:     http://git.kernel.org/tip/c4b73aabd0989d93b82894417ae501690bd1db5e
> Author:     Waiman Long <longman@redhat.com>
> AuthorDate: Thu, 5 Jan 2017 15:17:03 -0500
> Committer:  Thomas Gleixner <tglx@linutronix.de>
> CommitDate: Sat, 4 Feb 2017 09:01:54 +0100
> 
> debugobjects: Track number of kmem_cache_alloc/kmem_cache_free done
> 
> New debugfs stat counters are added to track the numbers of
> kmem_cache_alloc() and kmem_cache_free() function calls to get a
> sense of how the internal debug objects cache management is performing.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Cc: "Du Changbin" <changbin.du@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Jan Stancek <jstancek@redhat.com>
> Link: http://lkml.kernel.org/r/1483647425-4135-2-git-send-email-longman@redhat.com
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> 
> ---
>  lib/debugobjects.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 04c1ef7..d78673e 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -55,6 +55,12 @@ static int			debug_objects_enabled __read_mostly
>  
>  static struct debug_obj_descr	*descr_test  __read_mostly;
>  
> +/*
> + * Track numbers of kmem_cache_alloc and kmem_cache_free done.

Nit:

   /*
    * Track the number of kmem_cache_alloc()/free() calls done.
    */

Another nit:

> + */
> +static int			debug_objects_alloc;
> +static int			debug_objects_freed;

Yeah, so we want to either use past tense consistently:

   static int			debug_objects_allocated;
   static int			debug_objects_freed;

Or we want to use present tense consistently:

   static int			debug_objects_alloc;
   static int			debug_objects_free;

... but we don't want to mix the two when naming related counters!

( Btw., I'm for the _allocated/_freed pattern; that's the usual
  nomenclature for statistics counters. )

> +	seq_printf(m, "objects_alloc :%d\n", debug_objects_alloc);
> +	seq_printf(m, "objects_freed :%d\n", debug_objects_freed);

Ditto.

Thanks,

	Ingo  

* [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-01-05 20:17 ` [RESEND PATCH v2 3/3] debugobjects: Reduce contention on the global pool_lock Waiman Long
  2017-02-04 16:23   ` [tip:core/debugobjects] " tip-bot for Waiman Long
@ 2017-02-05 16:13   ` tip-bot for Waiman Long
  1 sibling, 0 replies; 12+ messages in thread
From: tip-bot for Waiman Long @ 2017-02-05 16:13 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, changbin.du, jstancek, mingo, longman, hpa, tglx,
	borntraeger, akpm

Commit-ID:  858274b6a13b4db0e6fb451eea7f8817c42426a7
Gitweb:     http://git.kernel.org/tip/858274b6a13b4db0e6fb451eea7f8817c42426a7
Author:     Waiman Long <longman@redhat.com>
AuthorDate: Thu, 5 Jan 2017 15:17:05 -0500
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sun, 5 Feb 2017 17:09:32 +0100

debugobjects: Reduce contention on the global pool_lock

On a large SMP system with many CPUs, the global pool_lock may become
a performance bottleneck as all the CPUs that need to allocate or
free debug objects have to take the lock. That can sometimes cause
soft lockups like:

 NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
 ...
 RIP: 0010:[<ffffffff817c216b>]  [<ffffffff817c216b>]
	_raw_spin_unlock_irqrestore+0x3b/0x60
 ...
 Call Trace:
  [<ffffffff813f40d1>] free_object+0x81/0xb0
  [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff81284996>] ? file_free_rcu+0x36/0x60
  [<ffffffff81251712>] kmem_cache_free+0xd2/0x380
  [<ffffffff81284960>] ? fput+0x90/0x90
  [<ffffffff81284996>] file_free_rcu+0x36/0x60
  [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550
  [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550
  [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50
  [<ffffffff810c59d1>] kthread+0x101/0x120
  [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0
  [<ffffffff817c2d32>] ret_from_fork+0x22/0x50

To reduce the amount of contention on the pool_lock, the actual
kmem_cache_free() of the debug objects will be delayed if the pool_lock
is busy. This will temporarily increase the number of free objects
available in the free pool when the system is busy. As a result,
the number of kmem_cache allocations and frees is reduced.

To further reduce the lock operations, free debug objects in batches
of four.

Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: "Du Changbin" <changbin.du@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jan Stancek <jstancek@redhat.com>
Link: http://lkml.kernel.org/r/1483647425-4135-4-git-send-email-longman@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 lib/debugobjects.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index dc78217..5e1bf2f 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -172,25 +172,39 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
 
 /*
  * workqueue function to free objects.
+ *
+ * To reduce contention on the global pool_lock, the actual freeing of
+ * debug objects will be delayed if the pool_lock is busy. We also free
+ * the objects in a batch of 4 for each lock/unlock cycle.
  */
+#define ODEBUG_FREE_BATCH	4
+
 static void free_obj_work(struct work_struct *work)
 {
-	struct debug_obj *obj;
+	struct debug_obj *objs[ODEBUG_FREE_BATCH];
 	unsigned long flags;
+	int i;
 
-	raw_spin_lock_irqsave(&pool_lock, flags);
-	while (obj_pool_free > debug_objects_pool_size) {
-		obj = hlist_entry(obj_pool.first, typeof(*obj), node);
-		hlist_del(&obj->node);
-		obj_pool_free--;
-		debug_objects_freed++;
+	if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+		return;
+	while (obj_pool_free >= debug_objects_pool_size + ODEBUG_FREE_BATCH) {
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++) {
+			objs[i] = hlist_entry(obj_pool.first,
+					      typeof(*objs[0]), node);
+			hlist_del(&objs[i]->node);
+		}
+
+		obj_pool_free -= ODEBUG_FREE_BATCH;
+		debug_objects_freed += ODEBUG_FREE_BATCH;
 		/*
 		 * We release pool_lock across kmem_cache_free() to
 		 * avoid contention on pool_lock.
 		 */
 		raw_spin_unlock_irqrestore(&pool_lock, flags);
-		kmem_cache_free(obj_cache, obj);
-		raw_spin_lock_irqsave(&pool_lock, flags);
+		for (i = 0; i < ODEBUG_FREE_BATCH; i++)
+			kmem_cache_free(obj_cache, objs[i]);
+		if (!raw_spin_trylock_irqsave(&pool_lock, flags))
+			return;
 	}
 	raw_spin_unlock_irqrestore(&pool_lock, flags);
 }

* Re: [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-02-05 10:03     ` Ingo Molnar
@ 2017-02-06 22:09       ` Waiman Long
  2017-02-06 22:50         ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Waiman Long @ 2017-02-06 22:09 UTC (permalink / raw)
  To: Ingo Molnar, hpa, tglx, linux-kernel, changbin.du, jstancek,
	akpm, borntraeger
  Cc: linux-tip-commits

On 02/05/2017 05:03 AM, Ingo Molnar wrote:
> * tip-bot for Waiman Long <tipbot@zytor.com> wrote:
>
>> ---
>>  lib/debugobjects.c | 31 ++++++++++++++++++++++---------
>>  1 file changed, 22 insertions(+), 9 deletions(-)
>>
>> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
>> index dc78217..5476bbe 100644
>> --- a/lib/debugobjects.c
>> +++ b/lib/debugobjects.c
>> @@ -172,25 +172,38 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
>>  
>>  /*
>>   * workqueue function to free objects.
>> + *
>> + * To reduce contention on the global pool_lock, the actual freeing of
>> + * debug objects will be delayed if the pool_lock is busy. We also free
>> + * the objects in a batch of 4 for each lock/unlock cycle.
>>   */
>> +#define ODEBUG_FREE_BATCH	4
>>  static void free_obj_work(struct work_struct *work)
>>  {
> Please put an extra newline before function definitions.
>
> Looks good otherwise!
>
> Thanks,
>
> 	Ingo


Sure, I can do that. BTW the debugobjects patch was also pulled into
the -mm tree a little while ago. Will that be a problem?

-Longman

* Re: [tip:core/debugobjects] debugobjects: Reduce contention on the global pool_lock
  2017-02-06 22:09       ` Waiman Long
@ 2017-02-06 22:50         ` Ingo Molnar
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2017-02-06 22:50 UTC (permalink / raw)
  To: Waiman Long, Andrew Morton
  Cc: hpa, tglx, linux-kernel, changbin.du, jstancek, akpm,
	borntraeger, linux-tip-commits


* Waiman Long <longman@redhat.com> wrote:

> On 02/05/2017 05:03 AM, Ingo Molnar wrote:
> > * tip-bot for Waiman Long <tipbot@zytor.com> wrote:
> >
> >> ---
> >>  lib/debugobjects.c | 31 ++++++++++++++++++++++---------
> >>  1 file changed, 22 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> >> index dc78217..5476bbe 100644
> >> --- a/lib/debugobjects.c
> >> +++ b/lib/debugobjects.c
> >> @@ -172,25 +172,38 @@ alloc_object(void *addr, struct debug_bucket *b, struct debug_obj_descr *descr)
> >>  
> >>  /*
> >>   * workqueue function to free objects.
> >> + *
> >> + * To reduce contention on the global pool_lock, the actual freeing of
> >> + * debug objects will be delayed if the pool_lock is busy. We also free
> >> + * the objects in a batch of 4 for each lock/unlock cycle.
> >>   */
> >> +#define ODEBUG_FREE_BATCH	4
> >>  static void free_obj_work(struct work_struct *work)
> >>  {
> > Please put an extra newline before function definitions.
> >
> > Looks good otherwise!
> >
> > Thanks,
> >
> > 	Ingo
> 
> 
> Sure, I can do that. BTW the debugobjects patch was also pull into the
> -mm tree a little while ago. Will that be a problem?

Should not be a problem usually; Andrew typically drops such patches
if they show up in a maintainer tree via linux-next.

And once accepted we don't silently drop patches from -tip hosted maintainer 
trees, so this is a reliable workflow.

Thanks,

	Ingo
