* [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries
@ 2020-01-15 21:43 Waiman Long
  2020-01-15 21:43 ` [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain Waiman Long
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-15 21:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

 v3:
  - Move the bug fix patches to the beginning of the series.
  - Include a number of changes as suggested by PeterZ.
  - Increase MAX_CHAIN_BUCKETS from 8 to 10 to reduce the chance of using
    the unsized list.
  - Add patch 7 to add a lockdep_early_init() call.
  - Add patch 8 to allocate chain hlocks by splitting large chain block
    as a last resort.

 v2:
  - Revamp the chain_hlocks reuse patch to store the freed chain_hlocks
    information in the chain_hlocks entries themselves, avoiding the
    need for a separate set of tracking structures. This, however,
    requires a minimum allocation size of at least 2. Thanks to PeterZ
    for his review and inspiring this change.
  - Remove the leakage counter as it is no longer applicable.
  - Add patch 6 to make the output of /proc/lockdep_chains more readable.

It was found that when running a workload that kept on adding lock
classes and then zapping them repetitively, the system would eventually
run out of chain_hlocks[] entries even though there were still plenty
of other lockdep data buffers available.

  [ 4318.443670] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  [ 4318.444809] turning off the locking correctness validator.

In order to fix this problem, we have to make chain_hlocks[] entries
reusable just like the other lockdep arrays. Besides that, the patchset
also adds counters for zapped classes and chain_hlocks entries to
/proc/lockdep_stats. It also fixes leakage in the irq context counters
and makes the output of /proc/lockdep_chains more readable.

Waiman Long (8):
  locking/lockdep: Decrement irq context counters when removing lock
    chain
  locking/lockdep: Display irq_context names in /proc/lockdep_chains
  locking/lockdep: Track number of zapped classes
  locking/lockdep: Throw away all lock chains with zapped class
  locking/lockdep: Track number of zapped lock chains
  locking/lockdep: Reuse freed chain_hlocks entries
  locking/lockdep: Add lockdep_early_init() before any lock is taken
  locking/lockdep: Enable chain block splitting as last resort

 include/linux/lockdep.h            |   2 +
 init/main.c                        |   1 +
 kernel/locking/lockdep.c           | 373 ++++++++++++++++++++++++-----
 kernel/locking/lockdep_internals.h |  15 +-
 kernel/locking/lockdep_proc.c      |  25 +-
 5 files changed, 351 insertions(+), 65 deletions(-)

-- 
2.18.1


* [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain
  2020-01-15 21:43 [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
@ 2020-01-15 21:43 ` Waiman Long
  2020-01-16 15:32   ` Peter Zijlstra
  2020-01-17 18:14   ` kbuild test robot
  2020-01-15 21:43 ` [PATCH v3 2/8] locking/lockdep: Display irq_context names in /proc/lockdep_chains Waiman Long
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-15 21:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

There are currently three counters to track the irq context of a lock
chain - nr_hardirq_chains, nr_softirq_chains and nr_process_chains.
They are incremented when a new lock chain is added, but they are
not decremented when a lock chain is removed. That causes some of the
statistics reported by /proc/lockdep_stats to be incorrect.

Fix that by decrementing the right counter when a lock chain is removed.

Fixes: a0b0fd53e1e6 ("locking/lockdep: Free lock classes that are no longer in use")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 36 +++++++++++++++++++++---------
 kernel/locking/lockdep_internals.h |  6 +++++
 2 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 32282e7112d3..b20fa6236b2a 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2299,16 +2299,24 @@ static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
 	return 0;
 }
 
-static void inc_chains(void)
+static void inc_chains(int irq_context)
 {
-	if (current->hardirq_context)
+	if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
 		nr_hardirq_chains++;
-	else {
-		if (current->softirq_context)
-			nr_softirq_chains++;
-		else
-			nr_process_chains++;
-	}
+	else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
+		nr_softirq_chains++;
+	else
+		nr_process_chains++;
+}
+
+static void dec_chains(int irq_context)
+{
+	if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
+		nr_hardirq_chains--;
+	else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
+		nr_softirq_chains--;
+	else
+		nr_process_chains--;
 }
 
 #else
@@ -2324,6 +2332,10 @@ static inline void inc_chains(void)
 	nr_process_chains++;
 }
 
+static void dec_chains(int irq_context)
+{
+	nr_process_chains--;
+}
 #endif /* CONFIG_TRACE_IRQFLAGS */
 
 static void
@@ -2844,7 +2856,7 @@ static inline int add_chain_cache(struct task_struct *curr,
 
 	hlist_add_head_rcu(&chain->entry, hash_head);
 	debug_atomic_inc(chain_lookup_misses);
-	inc_chains();
+	inc_chains(chain->irq_context);
 
 	return 1;
 }
@@ -3597,7 +3609,8 @@ mark_usage(struct task_struct *curr, struct held_lock *hlock, int check)
 
 static inline unsigned int task_irq_context(struct task_struct *task)
 {
-	return 2 * !!task->hardirq_context + !!task->softirq_context;
+	return LOCK_CHAIN_HARDIRQ_CONTEXT * !!task->hardirq_context +
+	       LOCK_CHAIN_SOFTIRQ_CONTEXT * !!task->softirq_context;
 }
 
 static int separate_irq_context(struct task_struct *curr,
@@ -4799,6 +4812,8 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 		return;
 	/* Overwrite the chain key for concurrent RCU readers. */
 	WRITE_ONCE(chain->chain_key, chain_key);
+	dec_chains(chain->irq_context);
+
 	/*
 	 * Note: calling hlist_del_rcu() from inside a
 	 * hlist_for_each_entry_rcu() loop is safe.
@@ -4820,6 +4835,7 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 	}
 	*new_chain = *chain;
 	hlist_add_head_rcu(&new_chain->entry, chainhashentry(chain_key));
+	inc_chains(new_chain->irq_context);
 #endif
 }
 
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index 18d85aebbb57..53500a1dac58 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -98,6 +98,12 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 
 #define MAX_LOCKDEP_CHAINS_BITS	16
 
+/*
+ * Bit definitions for lock_chain.irq_context
+ */
+#define LOCK_CHAIN_SOFTIRQ_CONTEXT	(1 << 0)
+#define LOCK_CHAIN_HARDIRQ_CONTEXT	(1 << 1)
+
 /*
  * Stack-trace: tightly packed array of stack backtrace
  * addresses. Protected by the hash_lock.
-- 
2.18.1


* [PATCH v3 2/8] locking/lockdep: Display irq_context names in /proc/lockdep_chains
  2020-01-15 21:43 [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
  2020-01-15 21:43 ` [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain Waiman Long
@ 2020-01-15 21:43 ` Waiman Long
  2020-01-17 22:12   ` kbuild test robot
  2020-01-15 21:43 ` [PATCH v3 3/8] locking/lockdep: Track number of zapped classes Waiman Long
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Waiman Long @ 2020-01-15 21:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

Currently, the irq_context field of the lock chains displayed in
/proc/lockdep_chains is just a number. It is likely that many people
may not know what a non-zero number means. To make the information more
useful, print the actual irq names ("softirq" and "hardirq") instead.
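
For example, a chain acquired in hardirq context that previously showed
up as (hypothetical output, for illustration only):

  irq_context: 2

will now be displayed as:

  irq_context: hardirq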

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep_proc.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index dadb7b7fba37..5e64a6ec09c2 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -128,6 +128,13 @@ static int lc_show(struct seq_file *m, void *v)
 	struct lock_chain *chain = v;
 	struct lock_class *class;
 	int i;
+	static const char * const irq_strs[] = {
+		[0]			     = "0",
+		[LOCK_CHAIN_HARDIRQ_CONTEXT] = "hardirq",
+		[LOCK_CHAIN_SOFTIRQ_CONTEXT] = "softirq",
+		[LOCK_CHAIN_SOFTIRQ_CONTEXT|
+		 LOCK_CHAIN_HARDIRQ_CONTEXT] = "hardirq|softirq",
+	};
 
 	if (v == SEQ_START_TOKEN) {
 		if (nr_chain_hlocks > MAX_LOCKDEP_CHAIN_HLOCKS)
@@ -136,7 +143,7 @@ static int lc_show(struct seq_file *m, void *v)
 		return 0;
 	}
 
-	seq_printf(m, "irq_context: %d\n", chain->irq_context);
+	seq_printf(m, "irq_context: %s\n", irq_strs[chain->irq_context]);
 
 	for (i = 0; i < chain->depth; i++) {
 		class = lock_chain_get_class(chain, i);
-- 
2.18.1


* [PATCH v3 3/8] locking/lockdep: Track number of zapped classes
  2020-01-15 21:43 [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
  2020-01-15 21:43 ` [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain Waiman Long
  2020-01-15 21:43 ` [PATCH v3 2/8] locking/lockdep: Display irq_context names in /proc/lockdep_chains Waiman Long
@ 2020-01-15 21:43 ` Waiman Long
  2020-01-15 21:43 ` [PATCH v3 4/8] locking/lockdep: Throw away all lock chains with zapped class Waiman Long
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-15 21:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

The whole point of the lockdep dynamic key patch is to allow unused
locks to be removed from the lockdep data buffers so that existing
buffer space can be reused. However, there is no way to find out how
many unused lock classes have been zapped, so we don't know whether the
zapping process is working properly.

Add a new nr_zapped_classes counter to track that and show it in
/proc/lockdep_stats.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 2 ++
 kernel/locking/lockdep_internals.h | 1 +
 kernel/locking/lockdep_proc.c      | 6 ++++++
 3 files changed, 9 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index b20fa6236b2a..8bea42136da1 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -147,6 +147,7 @@ static DECLARE_BITMAP(list_entries_in_use, MAX_LOCKDEP_ENTRIES);
 #define KEYHASH_SIZE		(1UL << KEYHASH_BITS)
 static struct hlist_head lock_keys_hash[KEYHASH_SIZE];
 unsigned long nr_lock_classes;
+unsigned long nr_zapped_classes;
 #ifndef CONFIG_DEBUG_LOCKDEP
 static
 #endif
@@ -4891,6 +4892,7 @@ static void zap_class(struct pending_free *pf, struct lock_class *class)
 	}
 
 	remove_class_from_lock_chains(pf, class);
+	nr_zapped_classes++;
 }
 
 static void reinit_class(struct lock_class *class)
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index 53500a1dac58..fe794cbdf34c 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -130,6 +130,7 @@ extern const char *__get_key_name(const struct lockdep_subclass_key *key,
 struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i);
 
 extern unsigned long nr_lock_classes;
+extern unsigned long nr_zapped_classes;
 extern unsigned long nr_list_entries;
 long lockdep_next_lockchain(long i);
 unsigned long lock_chain_count(void);
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index 5e64a6ec09c2..bd853848832b 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -343,6 +343,12 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, " debug_locks:                   %11u\n",
 			debug_locks);
 
+	/*
+	 * Zapped classes and lockdep data buffer reuse statistics.
+	 */
+	seq_puts(m, "\n");
+	seq_printf(m, " zapped classes:                %11lu\n",
+			nr_zapped_classes);
 	return 0;
 }
 
-- 
2.18.1


* [PATCH v3 4/8] locking/lockdep: Throw away all lock chains with zapped class
  2020-01-15 21:43 [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
                   ` (2 preceding siblings ...)
  2020-01-15 21:43 ` [PATCH v3 3/8] locking/lockdep: Track number of zapped classes Waiman Long
@ 2020-01-15 21:43 ` Waiman Long
  2020-01-15 21:43 ` [PATCH v3 5/8] locking/lockdep: Track number of zapped lock chains Waiman Long
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-15 21:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

If a lock chain contains a class that is zapped, the whole lock chain is
likely to be invalid. If the zapped class is at the end of the chain,
the partial chain without the zapped class should have been stored
already as the current code will store all its predecessor chains. If
the zapped class is somewhere in the middle, there is no guarantee that
the partial chain will actually happen. It may just clutter up the hash
and make searching slower. I would prefer storing the chain only when
it actually happens.

So just dump the corresponding chain_hlocks entries for now. A later
patch will try to reuse the freed chain_hlocks entries.

This patch also changes the type of nr_chain_hlocks to unsigned integer
to be consistent with the other counters.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 37 ++++--------------------------
 kernel/locking/lockdep_internals.h |  4 ++--
 kernel/locking/lockdep_proc.c      |  2 +-
 3 files changed, 7 insertions(+), 36 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 8bea42136da1..436fcaa4d5d5 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2636,8 +2636,8 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 
 struct lock_chain lock_chains[MAX_LOCKDEP_CHAINS];
 static DECLARE_BITMAP(lock_chains_in_use, MAX_LOCKDEP_CHAINS);
-int nr_chain_hlocks;
 static u16 chain_hlocks[MAX_LOCKDEP_CHAIN_HLOCKS];
+unsigned int nr_chain_hlocks;
 
 struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
 {
@@ -4783,36 +4783,23 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 					 struct lock_class *class)
 {
 #ifdef CONFIG_PROVE_LOCKING
-	struct lock_chain *new_chain;
-	u64 chain_key;
 	int i;
 
 	for (i = chain->base; i < chain->base + chain->depth; i++) {
 		if (chain_hlocks[i] != class - lock_classes)
 			continue;
-		/* The code below leaks one chain_hlock[] entry. */
-		if (--chain->depth > 0) {
-			memmove(&chain_hlocks[i], &chain_hlocks[i + 1],
-				(chain->base + chain->depth - i) *
-				sizeof(chain_hlocks[0]));
-		}
 		/*
 		 * Each lock class occurs at most once in a lock chain so once
 		 * we found a match we can break out of this loop.
 		 */
-		goto recalc;
+		goto free_lock_chain;
 	}
 	/* Since the chain has not been modified, return. */
 	return;
 
-recalc:
-	chain_key = INITIAL_CHAIN_KEY;
-	for (i = chain->base; i < chain->base + chain->depth; i++)
-		chain_key = iterate_chain_key(chain_key, chain_hlocks[i]);
-	if (chain->depth && chain->chain_key == chain_key)
-		return;
+free_lock_chain:
 	/* Overwrite the chain key for concurrent RCU readers. */
-	WRITE_ONCE(chain->chain_key, chain_key);
+	WRITE_ONCE(chain->chain_key, INITIAL_CHAIN_KEY);
 	dec_chains(chain->irq_context);
 
 	/*
@@ -4821,22 +4808,6 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 	 */
 	hlist_del_rcu(&chain->entry);
 	__set_bit(chain - lock_chains, pf->lock_chains_being_freed);
-	if (chain->depth == 0)
-		return;
-	/*
-	 * If the modified lock chain matches an existing lock chain, drop
-	 * the modified lock chain.
-	 */
-	if (lookup_chain_cache(chain_key))
-		return;
-	new_chain = alloc_lock_chain();
-	if (WARN_ON_ONCE(!new_chain)) {
-		debug_locks_off();
-		return;
-	}
-	*new_chain = *chain;
-	hlist_add_head_rcu(&new_chain->entry, chainhashentry(chain_key));
-	inc_chains(new_chain->irq_context);
 #endif
 }
 
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index fe794cbdf34c..42a1e55f49a7 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -134,14 +134,14 @@ extern unsigned long nr_zapped_classes;
 extern unsigned long nr_list_entries;
 long lockdep_next_lockchain(long i);
 unsigned long lock_chain_count(void);
-extern int nr_chain_hlocks;
 extern unsigned long nr_stack_trace_entries;
 
 extern unsigned int nr_hardirq_chains;
 extern unsigned int nr_softirq_chains;
 extern unsigned int nr_process_chains;
-extern unsigned int max_lockdep_depth;
+extern unsigned int nr_chain_hlocks;
 
+extern unsigned int max_lockdep_depth;
 extern unsigned int max_bfs_queue_depth;
 
 #ifdef CONFIG_PROVE_LOCKING
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index bd853848832b..2457bc97924c 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -278,7 +278,7 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 #ifdef CONFIG_PROVE_LOCKING
 	seq_printf(m, " dependency chains:             %11lu [max: %lu]\n",
 			lock_chain_count(), MAX_LOCKDEP_CHAINS);
-	seq_printf(m, " dependency chain hlocks:       %11d [max: %lu]\n",
+	seq_printf(m, " dependency chain hlocks:       %11u [max: %lu]\n",
 			nr_chain_hlocks, MAX_LOCKDEP_CHAIN_HLOCKS);
 #endif
 
-- 
2.18.1


* [PATCH v3 5/8] locking/lockdep: Track number of zapped lock chains
  2020-01-15 21:43 [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
                   ` (3 preceding siblings ...)
  2020-01-15 21:43 ` [PATCH v3 4/8] locking/lockdep: Throw away all lock chains with zapped class Waiman Long
@ 2020-01-15 21:43 ` Waiman Long
  2020-01-15 21:43 ` [PATCH v3 6/8] locking/lockdep: Reuse freed chain_hlocks entries Waiman Long
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-15 21:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

Add a new counter nr_zapped_lock_chains to track the number of lock
chains that have been removed.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 2 ++
 kernel/locking/lockdep_internals.h | 1 +
 kernel/locking/lockdep_proc.c      | 2 ++
 3 files changed, 5 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 436fcaa4d5d5..10471167b5f7 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2637,6 +2637,7 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 struct lock_chain lock_chains[MAX_LOCKDEP_CHAINS];
 static DECLARE_BITMAP(lock_chains_in_use, MAX_LOCKDEP_CHAINS);
 static u16 chain_hlocks[MAX_LOCKDEP_CHAIN_HLOCKS];
+unsigned long nr_zapped_lock_chains;
 unsigned int nr_chain_hlocks;
 
 struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
@@ -4808,6 +4809,7 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 	 */
 	hlist_del_rcu(&chain->entry);
 	__set_bit(chain - lock_chains, pf->lock_chains_being_freed);
+	nr_zapped_lock_chains++;
 #endif
 }
 
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index 42a1e55f49a7..b013250f2ca1 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -131,6 +131,7 @@ struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i);
 
 extern unsigned long nr_lock_classes;
 extern unsigned long nr_zapped_classes;
+extern unsigned long nr_zapped_lock_chains;
 extern unsigned long nr_list_entries;
 long lockdep_next_lockchain(long i);
 unsigned long lock_chain_count(void);
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index 2457bc97924c..df1b97912871 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -349,6 +349,8 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 	seq_puts(m, "\n");
 	seq_printf(m, " zapped classes:                %11lu\n",
 			nr_zapped_classes);
+	seq_printf(m, " zapped lock chains:            %11lu\n",
+			nr_zapped_lock_chains);
 	return 0;
 }
 
-- 
2.18.1


* [PATCH v3 6/8] locking/lockdep: Reuse freed chain_hlocks entries
  2020-01-15 21:43 [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
                   ` (4 preceding siblings ...)
  2020-01-15 21:43 ` [PATCH v3 5/8] locking/lockdep: Track number of zapped lock chains Waiman Long
@ 2020-01-15 21:43 ` Waiman Long
  2020-01-16 21:13   ` Peter Zijlstra
  2020-01-17  5:51   ` kbuild test robot
  2020-01-15 21:43 ` [PATCH v3 7/8] locking/lockdep: Add lockdep_early_init() before any lock is taken Waiman Long
  2020-01-15 21:43 ` [PATCH v3 8/8] locking/lockdep: Enable chain block splitting as last resort Waiman Long
  7 siblings, 2 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-15 21:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

Once a lock class is zapped, all the lock chains that include the zapped
class are essentially useless. The lock_chain structure itself can be
reused, but not the corresponding chain_hlocks[] entries. Over time,
we will run out of chain_hlocks entries while there are still plenty
of other lockdep array entries available.

To fix this imbalance, we have to make chain_hlocks entries reusable
just like the others. As the freed chain_hlocks entries come in blocks
of various lengths, a simple bitmap like the one used in the other
reusable lockdep arrays isn't applicable. Instead, the freed
chain_hlocks entries are formed into a chain block and put into a free
bucketed list according to the block size. On allocation, the
corresponding chain block bucket is searched first before allocating
directly from the chain_hlocks array.

By reusing the chain_hlocks entries, we are able to handle workloads
that add and zap a lot of lock classes without the risk of overflowing
the chain_hlocks array, as long as the total number of outstanding lock
classes at any time remains within a reasonable limit.

Two new tracking counters, nr_free_chain_hlocks & nr_large_chain_blocks,
are added to track the total number of chain_hlocks entries in the
free bucketed lists and the number of large chain blocks in buckets[0]
respectively.

The nr_free_chain_hlocks counter shows how efficiently the current
workload is able to reuse the freed chain_hlocks entries in the
bucketed lists. The nr_large_chain_blocks counter shows whether we
should increase the number of buckets (MAX_CHAIN_BUCKETS) available,
as a linear search is used to find a matching large chain block, which
is much slower than the O(1) lookup for a chain block with size not
bigger than MAX_CHAIN_BUCKETS.

An internal nfsd test that ran for more than an hour and kept on
loading and unloading kernel modules could cause the following message
to be displayed.

  [ 4318.443670] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!

The patched kernel was able to complete the test with a lot of
chain_hlocks entries to spare:

  # cat /proc/lockdep_stats
     :
   dependency chains:                   24861 [max: 65536]
   dependency chain hlocks:            102232 [max: 327680]
     :
   zapped classes:                       1538
   zapped lock chains:                  57077
   free chain hlocks:                     335
   large chain blocks:                      0

By modifying MAX_CHAIN_BUCKETS to 3 in order to exercise the large
chain block allocation and freeing code paths, the stats were:

  # cat /proc/lockdep_stats
     :
   dependency chains:                   24561 [max: 65536]
   dependency chain hlocks:            103098 [max: 327680]
     :
   zapped classes:                       1544
   zapped lock chains:                  57067
   free chain hlocks:                     383
   large chain blocks:                     64

Monitoring the output of lockdep_stats showed that the number of free
chain hlocks and large chain blocks grew and shrank (sometimes to 0)
repeatedly as the test progressed. There was no movement in the
large chain blocks count with a MAX_CHAIN_BUCKETS value of 10.
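
For a quick feel of the freelist encoding described in the comment
block of the patch below, here is a minimal stand-alone sketch
(user-space C; the encode_next()/decode_next() helper names are
hypothetical, the kernel code operates directly on chain_hlocks[]):

  #include <stdio.h>
  #include <stdint.h>

  #define CHAIN_BLK_FLAG          (1U << 15)
  #define CHAIN_BLK_LIST_END      0xFFFFU

  static uint16_t blk[2];   /* first two entries of a free chain block */

  /* Pack a next-block index; bit 15 of entry[0] marks it as metadata. */
  static void encode_next(int next)
  {
          blk[0] = (next >> 16) | CHAIN_BLK_FLAG;
          blk[1] = (uint16_t)next;
  }

  /* Unpack the index; all 1 bits in entry[0] means end of the list. */
  static int decode_next(void)
  {
          if (blk[0] == CHAIN_BLK_LIST_END)
                  return -1;
          return ((blk[0] & ~CHAIN_BLK_FLAG) << 16) | blk[1];
  }

  int main(void)
  {
          encode_next(0x12345);
          printf("next = 0x%x\n", (unsigned)decode_next()); /* 0x12345 */
          encode_next(-1);                       /* end-of-list marker */
          printf("next = %d\n", decode_next());  /* -1 */
          return 0;
  }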

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 231 ++++++++++++++++++++++++++---
 kernel/locking/lockdep_internals.h |   2 +
 kernel/locking/lockdep_proc.c      |   4 +
 3 files changed, 218 insertions(+), 19 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 10471167b5f7..a1d839e522a9 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2639,6 +2639,207 @@ static DECLARE_BITMAP(lock_chains_in_use, MAX_LOCKDEP_CHAINS);
 static u16 chain_hlocks[MAX_LOCKDEP_CHAIN_HLOCKS];
 unsigned long nr_zapped_lock_chains;
 unsigned int nr_chain_hlocks;
+unsigned int nr_free_chain_hlocks;	/* Free chain_hlocks in buckets */
+unsigned int nr_large_chain_blocks;	/* size > MAX_CHAIN_BUCKETS */
+
+#ifdef CONFIG_PROVE_LOCKING
+/*
+ * As chain_hlocks entries are allocated and freed, the freed entries are
+ * grouped into chain blocks that can be reused again. These chain blocks
+ * are put into a bucketed list according to their block size.
+ *
+ * Two APIs are provided for chain block allocation and freeing:
+ *  - int alloc_chain_hlocks(int size)
+ *  - void free_chain_hlocks(int base, int size)
+ *
+ * A minimum allocation size of 2 entries is enforced. When a lock chain
+ * is freed, the corresponding chain_hlocks[] entries are put into one
+ * of the bucketed lists:
+ *   2 <= size <= MAX_CHAIN_BUCKETS => buckets[size-1]
+ *   > MAX_CHAIN_BUCKETS	    => buckets[0]
+ *
+ * When an allocation request is made, alloc_chain_hlocks() will first
+ * look up the matching bucketed list to find a free chain block. If not
+ * found, allocation will be made directly from the chain_hlocks array
+ * from the current nr_chain_hlocks value.
+ *
+ * Note that a chain_hlocks entry contains the class index, which is
+ * currently limited to 13 bits.
+ *
+ * The first 2 chain_hlocks entries in the chain block in the bucket
+ * list contains the following meta data:
+ * entry[0]:
+ *   Bit    15 - always set to 1 (it is not a class index)
+ *   Bits 0-14 - upper 15 bits of the next block index
+ * entry[1]    - lower 16 bits of next block index
+ *
+ * A next block index of all 1 bits means it is the end of the list.
+ *
+ * On the unsized bucket (buckets[0]), the 3rd and 4th entries contain
+ * the chain block size.
+ *
+ * entry[2] - upper 16 bits of the chain block size
+ * entry[3] - lower 16 bits of the chain block size
+ */
+#define MAX_CHAIN_BUCKETS	10
+#define CHAIN_BLK_FLAG		(1U << 15)
+#define CHAIN_BLK_LIST_END	((u16)-1)
+
+/*
+ * An empty bucket has a value of -1. Otherwise, it represents an offset
+ * to the chain_hlocks array.
+ */
+static int chain_block_buckets[MAX_CHAIN_BUCKETS];
+
+/*
+ * Iterate all the chain blocks in a bucket.
+ * The loop body has to set the next parameter.
+ */
+#define for_each_chain_block(bucket, prev, curr, next)		\
+	for ((prev) = -1, (curr) = chain_block_buckets[bucket];	\
+	     (curr) >= 0;					\
+	     (prev) = (curr), (curr) = (next))
+
+static inline void init_chain_block_buckets(void)
+{
+	int i;
+
+	for (i = 0; i < MAX_CHAIN_BUCKETS; i++)
+		chain_block_buckets[i] = -1;
+}
+
+/*
+ * Return offset of next chain block or -1 if end of list.
+ */
+static inline int next_chain_block(int offset)
+{
+	if (chain_hlocks[offset] == CHAIN_BLK_LIST_END)
+		return -1;
+	return ((chain_hlocks[offset] & ~CHAIN_BLK_FLAG) << 16) |
+		 chain_hlocks[offset + 1];
+}
+
+static inline void set_chain_block(int offset, int size, int next)
+{
+	if (unlikely(offset < 0)) {
+		chain_block_buckets[0] = next;
+		return;
+	}
+	chain_hlocks[offset] = (next >> 16) | CHAIN_BLK_FLAG;
+	chain_hlocks[offset + 1] = (u16)next;
+	if (size > MAX_CHAIN_BUCKETS) {
+		chain_hlocks[offset + 2] = size >> 16;
+		chain_hlocks[offset + 3] = (u16)size;
+	}
+}
+
+/*
+ * This function should only be called for chain block in buckets[0].
+ */
+static inline int chain_block_size(int offset)
+{
+	return (chain_hlocks[offset + 2] << 16) | chain_hlocks[offset + 3];
+}
+
+/*
+ * Return offset of a chain block of the right size or -1 if not found.
+ */
+static inline int alloc_chain_hlocks_from_buckets(int size)
+{
+	int prev, curr, next;
+
+	if (!nr_free_chain_hlocks)
+		return -1;
+
+	if (size <= MAX_CHAIN_BUCKETS) {
+		curr = chain_block_buckets[size - 1];
+		if (curr < 0)
+			return -1;
+
+		chain_block_buckets[size - 1] = next_chain_block(curr);
+		nr_free_chain_hlocks -= size;
+		return curr;
+	}
+
+	/*
+	 * Look for a free chain block of the given size
+	 *
+	 * It is rare to have a lock chain with depth > MAX_CHAIN_BUCKETS.
+	 * It is also more expensive as we may iterate the whole list
+	 * without finding one.
+	 */
+	for_each_chain_block(0, prev, curr, next) {
+		next = next_chain_block(curr);
+		if (chain_block_size(curr) == size) {
+			set_chain_block(prev, 0, next);
+			nr_free_chain_hlocks -= size;
+			nr_large_chain_blocks--;
+			return curr;
+		}
+	}
+	return -1;
+}
+
+static inline void free_chain_hlocks(int base, int size)
+{
+	if (size < 2)
+		size = 2;
+
+	nr_free_chain_hlocks += size;
+	if (size <= MAX_CHAIN_BUCKETS) {
+		set_chain_block(base, 0, chain_block_buckets[size - 1]);
+		chain_block_buckets[size - 1] = base;
+		return;
+	}
+	set_chain_block(base, size, chain_block_buckets[0]);
+	chain_block_buckets[0] = base;
+	nr_large_chain_blocks++;
+}
+#else
+static inline void init_chain_block_buckets(void)	{ }
+static inline int  alloc_chain_hlocks_from_buckets(int size)
+{
+	return -1;
+}
+#endif /* CONFIG_PROVE_LOCKING */
+
+/*
+ * The graph lock must be held before calling this function.
+ *
+ * Return: an offset to chain_hlocks if successful, or
+ *	   -1 with graph lock released
+ */
+static int alloc_chain_hlocks(int size)
+{
+	int curr;
+
+	if (size < 2)
+		size = 2;
+
+	curr = alloc_chain_hlocks_from_buckets(size);
+	if (curr >= 0)
+		return curr;
+
+	BUILD_BUG_ON((1UL << 24) <= ARRAY_SIZE(chain_hlocks));
+	BUILD_BUG_ON((1UL << 6)  <= ARRAY_SIZE(current->held_locks));
+	BUILD_BUG_ON((1UL << 8*sizeof(chain_hlocks[0])) <=
+		     ARRAY_SIZE(lock_classes));
+
+	/*
+	 * Allocate directly from chain_hlocks.
+	 */
+	if (likely(nr_chain_hlocks + size <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
+		curr = nr_chain_hlocks;
+		nr_chain_hlocks += size;
+		return curr;
+	}
+	if (!debug_locks_off_graph_unlock())
+		return -1;
+
+	print_lockdep_off("BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!");
+	dump_stack();
+	return -1;
+}
 
 struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
 {
@@ -2834,28 +3035,17 @@ static inline int add_chain_cache(struct task_struct *curr,
 	chain->irq_context = hlock->irq_context;
 	i = get_first_held_lock(curr, hlock);
 	chain->depth = curr->lockdep_depth + 1 - i;
+	j = alloc_chain_hlocks(chain->depth);
+	if (j < 0)
+		return 0;
 
-	BUILD_BUG_ON((1UL << 24) <= ARRAY_SIZE(chain_hlocks));
-	BUILD_BUG_ON((1UL << 6)  <= ARRAY_SIZE(curr->held_locks));
-	BUILD_BUG_ON((1UL << 8*sizeof(chain_hlocks[0])) <= ARRAY_SIZE(lock_classes));
-
-	if (likely(nr_chain_hlocks + chain->depth <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
-		chain->base = nr_chain_hlocks;
-		for (j = 0; j < chain->depth - 1; j++, i++) {
-			int lock_id = curr->held_locks[i].class_idx;
-			chain_hlocks[chain->base + j] = lock_id;
-		}
-		chain_hlocks[chain->base + j] = class - lock_classes;
-		nr_chain_hlocks += chain->depth;
-	} else {
-		if (!debug_locks_off_graph_unlock())
-			return 0;
+	chain->base = j;
+	for (j = 0; j < chain->depth - 1; j++, i++) {
+		int lock_id = curr->held_locks[i].class_idx;
 
-		print_lockdep_off("BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!");
-		dump_stack();
-		return 0;
+		chain_hlocks[chain->base + j] = lock_id;
 	}
-
+	chain_hlocks[chain->base + j] = class - lock_classes;
 	hlist_add_head_rcu(&chain->entry, hash_head);
 	debug_atomic_inc(chain_lookup_misses);
 	inc_chains(chain->irq_context);
@@ -4799,6 +4989,7 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 	return;
 
 free_lock_chain:
+	free_chain_hlocks(chain->base, chain->depth);
 	/* Overwrite the chain key for concurrent RCU readers. */
 	WRITE_ONCE(chain->chain_key, INITIAL_CHAIN_KEY);
 	dec_chains(chain->irq_context);
@@ -5193,6 +5384,8 @@ EXPORT_SYMBOL_GPL(lockdep_unregister_key);
 
 void __init lockdep_init(void)
 {
+	init_chain_block_buckets();
+
 	printk("Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar\n");
 
 	printk("... MAX_LOCKDEP_SUBCLASSES:  %lu\n", MAX_LOCKDEP_SUBCLASSES);
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index b013250f2ca1..9425155b9053 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -141,6 +141,8 @@ extern unsigned int nr_hardirq_chains;
 extern unsigned int nr_softirq_chains;
 extern unsigned int nr_process_chains;
 extern unsigned int nr_chain_hlocks;
+extern unsigned int nr_free_chain_hlocks;
+extern unsigned int nr_large_chain_blocks;
 
 extern unsigned int max_lockdep_depth;
 extern unsigned int max_bfs_queue_depth;
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index df1b97912871..9b3d2976bd5a 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -351,6 +351,10 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 			nr_zapped_classes);
 	seq_printf(m, " zapped lock chains:            %11lu\n",
 			nr_zapped_lock_chains);
+	seq_printf(m, " free chain hlocks:             %11u\n",
+			nr_free_chain_hlocks);
+	seq_printf(m, " large chain blocks:            %11u\n",
+			nr_large_chain_blocks);
 	return 0;
 }
 
-- 
2.18.1


* [PATCH v3 7/8] locking/lockdep: Add lockdep_early_init() before any lock is taken
  2020-01-15 21:43 [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
                   ` (5 preceding siblings ...)
  2020-01-15 21:43 ` [PATCH v3 6/8] locking/lockdep: Reuse freed chain_hlocks entries Waiman Long
@ 2020-01-15 21:43 ` Waiman Long
  2020-01-15 21:43 ` [PATCH v3 8/8] locking/lockdep: Enable chain block splitting as last resort Waiman Long
  7 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-15 21:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

The lockdep_init() function is actually called after __lock_acquire()
has been called, so it is not a good place to do the chain bucket
initialization. Fortunately, the check for nr_free_chain_hlocks prevents
those chain buckets from being used prematurely.

Add a lockdep_early_init() function that will be called very early in
the boot process and move the chain bucket initialization there to
ensure that the buckets are initialized before __lock_acquire() is
called.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/lockdep.h  | 2 ++
 init/main.c              | 1 +
 kernel/locking/lockdep.c | 5 ++++-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index c50d01ef1414..efb1ec28a2cd 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -279,6 +279,7 @@ struct held_lock {
  * Initialization, self-test and debugging-output methods:
  */
 extern void lockdep_init(void);
+extern void lockdep_early_init(void);
 extern void lockdep_reset(void);
 extern void lockdep_reset_lock(struct lockdep_map *lock);
 extern void lockdep_free_key_range(void *start, unsigned long size);
@@ -432,6 +433,7 @@ static inline void lockdep_set_selftest_task(struct task_struct *task)
 # define lock_set_class(l, n, k, s, i)		do { } while (0)
 # define lock_set_subclass(l, s, i)		do { } while (0)
 # define lockdep_init()				do { } while (0)
+# define lockdep_early_init()			do { } while (0)
 # define lockdep_init_map(lock, name, key, sub) \
 		do { (void)(name); (void)(key); } while (0)
 # define lockdep_set_class(lock, key)		do { (void)(key); } while (0)
diff --git a/init/main.c b/init/main.c
index 2cd736059416..571b83981276 100644
--- a/init/main.c
+++ b/init/main.c
@@ -580,6 +580,7 @@ asmlinkage __visible void __init start_kernel(void)
 	set_task_stack_end_magic(&init_task);
 	smp_setup_processor_id();
 	debug_objects_early_init();
+	lockdep_early_init();
 
 	cgroup_init_early();
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index a1d839e522a9..165e2361b25e 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5382,10 +5382,13 @@ void lockdep_unregister_key(struct lock_class_key *key)
 }
 EXPORT_SYMBOL_GPL(lockdep_unregister_key);
 
-void __init lockdep_init(void)
+void __init lockdep_early_init(void)
 {
 	init_chain_block_buckets();
+}
 
+void __init lockdep_init(void)
+{
 	printk("Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar\n");
 
 	printk("... MAX_LOCKDEP_SUBCLASSES:  %lu\n", MAX_LOCKDEP_SUBCLASSES);
-- 
2.18.1


* [PATCH v3 8/8] locking/lockdep: Enable chain block splitting as last resort
  2020-01-15 21:43 [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
                   ` (6 preceding siblings ...)
  2020-01-15 21:43 ` [PATCH v3 7/8] locking/lockdep: Add lockdep_early_init() before any lock is taken Waiman Long
@ 2020-01-15 21:43 ` Waiman Long
  7 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-15 21:43 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

When the main chain_hlocks[] buffer is running out and a chain block of
a matching size isn't available, it is possible that there are larger
free chain blocks available that can accommodate an allocation request.
So instead of disabling lockdep, we will try to find a larger chain
block and split it.

Chain block splitting is used only as a last resort, as doing too much
splitting will generate a lot of small chain blocks that cannot satisfy
larger allocation requests, potentially wasting more space.
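
As a back-of-the-envelope sketch of the split rule (stand-alone C with
a hypothetical helper name, not the kernel code): because the minimum
block size is 2, a request of size "size" can only split a block that
is at least size + 2 entries, so that the leftover tail can still hold
the 2-entry freelist metadata.

  /* Hypothetical illustration of the minimum-split rule. */
  static int split_tail(int block_size, int size)
  {
          int rem = block_size - size;

          if (rem < 2)       /* tail cannot hold freelist metadata */
                  return -1; /* block not eligible for splitting */
          return rem;        /* size of the tail put back on a free list */
  }

For example, splitting a size-9 block for a size-4 request frees a
size-5 tail, while a size-5 block is skipped because its size-1 tail
could not be linked back into a bucket.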

To test the chain block splitting code, the call to
alloc_chain_hlocks_by_splitting_block() was moved up before direct
array allocation. An nfsd workload was run before and after the change.
Before it, the lockdep_stats output was:

 zapped classes:                       1538
 zapped direct dependencies:           4612
 zapped lock chains:                  57409
 free chain hlocks:                       0
 large chain blocks:                      0
 split chain blocks:                      0

After the change, the output became:

 zapped classes:                       1538
 zapped direct dependencies:           4612
 zapped lock chains:                  57002
 free chain hlocks:                     944
 large chain blocks:                      0
 split chain blocks:                    784

Chain block merging is not being considered for now, as it can be
time-consuming while holding the graph lock, which is not good.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 66 +++++++++++++++++++++++++++++-
 kernel/locking/lockdep_internals.h |  1 +
 kernel/locking/lockdep_proc.c      |  2 +
 3 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 165e2361b25e..9a75f9582b6b 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2641,6 +2641,7 @@ unsigned long nr_zapped_lock_chains;
 unsigned int nr_chain_hlocks;
 unsigned int nr_free_chain_hlocks;	/* Free chain_hlocks in buckets */
 unsigned int nr_large_chain_blocks;	/* size > MAX_CHAIN_BUCKETS */
+unsigned int nr_split_chain_blocks;	/* Chain block split count */
 
 #ifdef CONFIG_PROVE_LOCKING
 /*
@@ -2661,7 +2662,8 @@ unsigned int nr_large_chain_blocks;	/* size > MAX_CHAIN_BUCKETS */
  * When an allocation request is made, alloc_chain_hlocks() will first
  * look up the matching bucketed list to find a free chain block. If not
  * found, allocation will be made directly from the chain_hlocks array
- * from the current nr_chain_hlocks value.
+ * from the current nr_chain_hlocks value. As a last resort, larger chain
+ * block will be split, if available, to satisfy the request.
  *
 * Note that a chain_hlocks entry contains the class index, which is
 * currently limited to 13 bits.
@@ -2795,12 +2797,66 @@ static inline void free_chain_hlocks(int base, int size)
 	chain_block_buckets[0] = base;
 	nr_large_chain_blocks++;
 }
+
+/*
+ * Allocate chain block by splitting larger block in buckets.
+ * This is the last resort before disabling lockdep.
+ */
+static inline int alloc_chain_hlocks_by_splitting_block(int size)
+{
+	int newsize;
+	int prev, curr, next;
+
+	if (!nr_free_chain_hlocks)
+		return -1;
+
+	/*
+	 * As the minimum block size is 2, we have to look for a larger
+	 * block that is at least 2 greater than the desired size.
+	 */
+	for (newsize = size + 2; newsize <= MAX_CHAIN_BUCKETS; newsize++) {
+		curr = chain_block_buckets[newsize - 1];
+		if (curr < 0)
+			continue;
+		chain_block_buckets[newsize - 1] = next_chain_block(curr);
+		nr_free_chain_hlocks -= newsize;
+		goto found;
+	}
+
+	/*
+	 * Look for a block in chain_block_buckets[0] with a size of
+	 * (size + 2) or larger.
+	 */
+	for_each_chain_block(0, prev, curr, next) {
+		next = next_chain_block(curr);
+		newsize = chain_block_size(curr);
+		if (newsize >= size + 2) {
+			set_chain_block(prev, 0, next);
+			nr_free_chain_hlocks -= newsize;
+			nr_large_chain_blocks--;
+			goto found;
+		}
+	}
+	return -1;
+
+found:
+	/*
+	 * Free the unneeded chain_hlocks entries at the end of the block.
+	 */
+	free_chain_hlocks(curr + size, newsize - size);
+	nr_split_chain_blocks++;
+	return curr;
+}
 #else
 static inline void init_chain_block_buckets(void)	{ }
 static inline int  alloc_chain_hlocks_from_buckets(int size)
 {
 	return -1;
 }
+static inline int alloc_chain_hlocks_by_splitting_block(int size)
+{
+	return -1;
+}
 #endif /* CONFIG_PROVE_LOCKING */
 
 /*
@@ -2833,6 +2889,14 @@ static int alloc_chain_hlocks(int size)
 		nr_chain_hlocks += size;
 		return curr;
 	}
+
+	/*
+	 * Try to allocate it by splitting large chain block.
+	 */
+	curr = alloc_chain_hlocks_by_splitting_block(size);
+	if (curr >= 0)
+		return curr;
+
 	if (!debug_locks_off_graph_unlock())
 		return -1;
 
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index 9425155b9053..7b6ec1e7168a 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -143,6 +143,7 @@ extern unsigned int nr_process_chains;
 extern unsigned int nr_chain_hlocks;
 extern unsigned int nr_free_chain_hlocks;
 extern unsigned int nr_large_chain_blocks;
+extern unsigned int nr_split_chain_blocks;
 
 extern unsigned int max_lockdep_depth;
 extern unsigned int max_bfs_queue_depth;
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index 9b3d2976bd5a..97fc50facd98 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -355,6 +355,8 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 			nr_free_chain_hlocks);
 	seq_printf(m, " large chain blocks:            %11u\n",
 			nr_large_chain_blocks);
+	seq_printf(m, " split chain blocks:            %11u\n",
+			nr_split_chain_blocks);
 	return 0;
 }
 
-- 
2.18.1


* Re: [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain
  2020-01-15 21:43 ` [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain Waiman Long
@ 2020-01-16 15:32   ` Peter Zijlstra
  2020-01-16 15:50     ` Waiman Long
  2020-01-17 18:14   ` kbuild test robot
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2020-01-16 15:32 UTC (permalink / raw)
  To: Waiman Long; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On Wed, Jan 15, 2020 at 04:43:06PM -0500, Waiman Long wrote:
> There are currently three counters to track the irq context of a lock
> chain - nr_hardirq_chains, nr_softirq_chains and nr_process_chains.
> They are incremented when a new lock chain is added, but they are
> not decremented when a lock chain is removed. That causes some of the
> statistics reported by /proc/lockdep_stats to be incorrect.
> 
> Fix that by decrementing the right counter when a lock chain is removed.
> 
> Fixes: a0b0fd53e1e6 ("locking/lockdep: Free lock classes that are no longer in use")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  kernel/locking/lockdep.c           | 36 +++++++++++++++++++++---------
>  kernel/locking/lockdep_internals.h |  6 +++++
>  2 files changed, 32 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 32282e7112d3..b20fa6236b2a 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -2299,16 +2299,24 @@ static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
>  	return 0;
>  }
>  
> -static void inc_chains(void)
> +static void inc_chains(int irq_context)
>  {
> -	if (current->hardirq_context)
> +	if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
>  		nr_hardirq_chains++;
> -	else {
> -		if (current->softirq_context)
> -			nr_softirq_chains++;
> -		else
> -			nr_process_chains++;
> -	}
> +	else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
> +		nr_softirq_chains++;
> +	else
> +		nr_process_chains++;
> +}
> +
> +static void dec_chains(int irq_context)
> +{
> +	if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
> +		nr_hardirq_chains--;
> +	else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
> +		nr_softirq_chains--;
> +	else
> +		nr_process_chains--;
>  }
>  
>  #else
> @@ -2324,6 +2332,10 @@ static inline void inc_chains(void)
>  	nr_process_chains++;
>  }
>  
> +static void dec_chains(int irq_context)
> +{
> +	nr_process_chains--;
> +}
>  #endif /* CONFIG_TRACE_IRQFLAGS */
>  

Is there really a need for two versions of those functions? Would the
@irq_context argument not always be 0 in the CONFIG_TRACE_IRQFLAGS=n
case?

* Re: [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain
  2020-01-16 15:32   ` Peter Zijlstra
@ 2020-01-16 15:50     ` Waiman Long
  0 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-16 15:50 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On 1/16/20 10:32 AM, Peter Zijlstra wrote:
> On Wed, Jan 15, 2020 at 04:43:06PM -0500, Waiman Long wrote:
>> There are currently three counters to track the irq context of a lock
>> chain - nr_hardirq_chains, nr_softirq_chains and nr_process_chains.
>> They are incremented when a new lock chain is added, but they are
>> not decremented when a lock chain is removed. That causes some of the
>> statistics reported by /proc/lockdep_stats to be incorrect.
>>
>> Fix that by decrementing the right counter when a lock chain is removed.
>>
>> Fixes: a0b0fd53e1e6 ("locking/lockdep: Free lock classes that are no longer in use")
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>  kernel/locking/lockdep.c           | 36 +++++++++++++++++++++---------
>>  kernel/locking/lockdep_internals.h |  6 +++++
>>  2 files changed, 32 insertions(+), 10 deletions(-)
>>
>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>> index 32282e7112d3..b20fa6236b2a 100644
>> --- a/kernel/locking/lockdep.c
>> +++ b/kernel/locking/lockdep.c
>> @@ -2299,16 +2299,24 @@ static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
>>  	return 0;
>>  }
>>  
>> -static void inc_chains(void)
>> +static void inc_chains(int irq_context)
>>  {
>> -	if (current->hardirq_context)
>> +	if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
>>  		nr_hardirq_chains++;
>> -	else {
>> -		if (current->softirq_context)
>> -			nr_softirq_chains++;
>> -		else
>> -			nr_process_chains++;
>> -	}
>> +	else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
>> +		nr_softirq_chains++;
>> +	else
>> +		nr_process_chains++;
>> +}
>> +
>> +static void dec_chains(int irq_context)
>> +{
>> +	if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
>> +		nr_hardirq_chains--;
>> +	else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
>> +		nr_softirq_chains--;
>> +	else
>> +		nr_process_chains--;
>>  }
>>  
>>  #else
>> @@ -2324,6 +2332,10 @@ static inline void inc_chains(void)
>>  	nr_process_chains++;
>>  }
>>  
>> +static void dec_chains(int irq_context)
>> +{
>> +	nr_process_chains--;
>> +}
>>  #endif /* CONFIG_TRACE_IRQFLAGS */
>>  
> Is there really a need for two versions of those functions? Would the
> @irq_context argument not always be 0 in the CONFIG_TRACE_IRQFLAGS=n
> case?
>
You are right. I changed the code so that inc_chains() won't look at
{hard|soft}irq_context directly, so I can take it out of
CONFIG_TRACE_IRQFLAGS and keep a single version.
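
Presumably the single version would look just like the dec_chains()
above, moved outside the #ifdef (a sketch of the idea, not the actual
next-version code):

	/* irq_context is always 0 when CONFIG_TRACE_IRQFLAGS=n */
	static void dec_chains(int irq_context)
	{
		if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
			nr_hardirq_chains--;
		else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
			nr_softirq_chains--;
		else
			nr_process_chains--;
	}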

I will make the change in the next version.

Cheers,
Longman


* Re: [PATCH v3 6/8] locking/lockdep: Reuse freed chain_hlocks entries
  2020-01-15 21:43 ` [PATCH v3 6/8] locking/lockdep: Reuse freed chain_hlocks entries Waiman Long
@ 2020-01-16 21:13   ` Peter Zijlstra
  2020-01-20  4:22     ` Waiman Long
  2020-01-17  5:51   ` kbuild test robot
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2020-01-16 21:13 UTC (permalink / raw)
  To: Waiman Long; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On Wed, Jan 15, 2020 at 04:43:11PM -0500, Waiman Long wrote:
> +static inline int alloc_chain_hlocks_from_buckets(int size)
> +{
> +	int prev, curr, next;
> +
> +	if (!nr_free_chain_hlocks)
> +		return -1;
> +
> +	if (size <= MAX_CHAIN_BUCKETS) {
> +		curr = chain_block_buckets[size - 1];
> +		if (curr < 0)
> +			return -1;
> +
> +		chain_block_buckets[size - 1] = next_chain_block(curr);
> +		nr_free_chain_hlocks -= size;
> +		return curr;
> +	}
> +
> +	/*
> +	 * Look for a free chain block of the given size
> +	 *
> +	 * It is rare to have a lock chain with depth > MAX_CHAIN_BUCKETS.
> +	 * It is also more expensive as we may iterate the whole list
> +	 * without finding one.
> +	 */
> +	for_each_chain_block(0, prev, curr, next) {
> +		next = next_chain_block(curr);
> +		if (chain_block_size(curr) == size) {
> +			set_chain_block(prev, 0, next);
> +			nr_free_chain_hlocks -= size;
> +			nr_large_chain_blocks--;
> +			return curr;
> +		}
> +	}
> +	return -1;
> +}

> +static int alloc_chain_hlocks(int size)
> +{
> +	int curr;
> +
> +	if (size < 2)
> +		size = 2;
> +
> +	curr = alloc_chain_hlocks_from_buckets(size);
> +	if (curr >= 0)
> +		return curr;
> +
> +	BUILD_BUG_ON((1UL << 24) <= ARRAY_SIZE(chain_hlocks));
> +	BUILD_BUG_ON((1UL << 6)  <= ARRAY_SIZE(current->held_locks));
> +	BUILD_BUG_ON((1UL << 8*sizeof(chain_hlocks[0])) <=
> +		     ARRAY_SIZE(lock_classes));
> +
> +	/*
> +	 * Allocate directly from chain_hlocks.
> +	 */
> +	if (likely(nr_chain_hlocks + size <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
> +		curr = nr_chain_hlocks;
> +		nr_chain_hlocks += size;
> +		return curr;
> +	}
> +	if (!debug_locks_off_graph_unlock())
> +		return -1;
> +
> +	print_lockdep_off("BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!");
> +	dump_stack();
> +	return -1;
> +}

Argh, that's still _two_ half allocators.

Here, please try this one, it seems to boot. It compiles with some
noise, but that is because GCC is stupid and I'm too tired.

---

--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1071,15 +1071,22 @@ static inline void check_data_structures
 
 #endif /* CONFIG_DEBUG_LOCKDEP */
 
+static void init_chain_block_buckets(void);
+
 /*
  * Initialize the lock_classes[] array elements, the free_lock_classes list
  * and also the delayed_free structure.
  */
 static void init_data_structures_once(void)
 {
-	static bool ds_initialized, rcu_head_initialized;
+	static bool ds_initialized, rcu_head_initialized, chain_block_initialized;
 	int i;
 
+	if (!chain_block_initialized) {
+		chain_block_initialized = true;
+		init_chain_block_buckets();
+	}
+
 	if (likely(rcu_head_initialized))
 		return;
 
@@ -2637,7 +2644,217 @@ struct lock_chain lock_chains[MAX_LOCKDE
 static DECLARE_BITMAP(lock_chains_in_use, MAX_LOCKDEP_CHAINS);
 static u16 chain_hlocks[MAX_LOCKDEP_CHAIN_HLOCKS];
 unsigned long nr_zapped_lock_chains;
-unsigned int nr_chain_hlocks;
+unsigned int nr_free_chain_hlocks;	/* Free chain_hlocks in buckets */
+unsigned int nr_large_chain_blocks;	/* size > MAX_CHAIN_BUCKETS */
+
+/*
+ * The first 2 chain_hlocks entries in the chain block in the bucket
+ * list contains the following meta data:
+ *
+ *   entry[0]:
+ *     Bit    15 - always set to 1 (it is not a class index)
+ *     Bits 0-14 - upper 15 bits of the next block index
+ *   entry[1]    - lower 16 bits of next block index
+ *
+ * A next block index of all 1 bits means it is the end of the list.
+ *
+ * On the unsized bucket (bucket-0), the 3rd and 4th entries contain
+ * the chain block size:
+ *
+ *   entry[2] - upper 16 bits of the chain block size
+ *   entry[3] - lower 16 bits of the chain block size
+ */
+#define MAX_CHAIN_BUCKETS	10
+#define CHAIN_BLK_FLAG		(1U << 15)
+#define CHAIN_BLK_LIST_END	0xFFFFU
+
+static int chain_block_buckets[MAX_CHAIN_BUCKETS];
+
+static inline int size_to_bucket(int size)
+{
+	if (size > MAX_CHAIN_BUCKETS)
+		return 0;
+
+	return size - 1;
+}
+
+/*
+ * Iterate all the chain blocks in a bucket.
+ * The loop body has to set the next parameter.
+ */
+#define for_each_chain_block(bucket, prev, curr, next)		\
+	for ((prev) = bucket - MAX_CHAIN_BUCKETS,		\
+	     (curr) = chain_block_buckets[bucket];		\
+	     (curr) >= 0;					\
+	     (prev) = (curr), (curr) = (next))
+
+/*
+ * next block or -1
+ */
+static inline int chain_block_next(int offset)
+{
+	int next = chain_hlocks[offset];
+
+	WARN_ON_ONCE(!(next & CHAIN_BLK_FLAG));
+
+	if (next == CHAIN_BLK_LIST_END)
+		return -1;
+
+	next &= ~CHAIN_BLK_FLAG;
+	next <<= 16;
+	next |= chain_hlocks[offset + 1];
+
+	return next;
+}
+
+/*
+ * bucket-0 only
+ */
+static inline int chain_block_size(int offset)
+{
+	return (chain_hlocks[offset + 2] << 16) | chain_hlocks[offset + 3];
+}
+
+static inline void init_chain_block(int offset, int next, int bucket, int size)
+{
+	chain_hlocks[offset] = (next >> 16) | CHAIN_BLK_FLAG;
+	chain_hlocks[offset + 1] = (u16)next;
+
+	if (bucket == 0) {
+		chain_hlocks[offset + 2] = size >> 16;
+		chain_hlocks[offset + 3] = (u16)size;
+	}
+}
+
+static inline void add_chain_block(int offset, int size)
+{
+	int bucket = size_to_bucket(size);
+	int next = chain_block_buckets[bucket];
+
+	if (size < 2)
+		return; // XXX leak!
+
+	init_chain_block(offset, next, bucket, size);
+	chain_block_buckets[bucket] = offset;
+	nr_free_chain_hlocks += size;
+
+	if (bucket == 0)
+		nr_large_chain_blocks++;
+}
+
+static inline void del_chain_block(int offset, int size, int next)
+{
+	int bucket = size_to_bucket(size);
+
+	nr_free_chain_hlocks -= size;
+	if (offset < 0)
+		chain_block_buckets[offset + MAX_CHAIN_BUCKETS] = next;
+	else
+		init_chain_block(offset, next, bucket, size);
+
+	if (bucket == 0)
+		nr_large_chain_blocks--;
+}
+
+static void init_chain_block_buckets(void)
+{
+	int i;
+
+	for (i = 0; i < MAX_CHAIN_BUCKETS; i++)
+		chain_block_buckets[i] = -1;
+
+	add_chain_block(0, ARRAY_SIZE(chain_hlocks));
+}
+
+/*
+ * Return offset of a chain block of the right size or -1 if not found.
+ *
+ * Fairly simple worst-fit allocator with the addition of a number of size
+ * specific free lists.
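+ *
+ * Allocation order:
+ *  1. the exact-size bucket, when the request fits one;
+ *  2. an exact match from bucket-0 or, failing that, its largest
+ *     (worst-fit) block, splitting off the remainder;
+ *  3. a split from the next larger fixed-size bucket.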
+ */
+static int alloc_chain_hlocks(int req)
+{
+	int max_prev, max_curr, max_next, max_size = INT_MIN;
+	int prev, curr, next, size;
+	int bucket;
+
+	/*
+	 * We rely on the MSB to act as an escape bit to denote freelist
+	 * pointers. Make sure this bit isn't set in 'normal' class_idx usage.
+	 */
+	BUILD_BUG_ON((MAX_LOCKDEP_KEYS-1) & CHAIN_BLK_FLAG);
+
+	init_data_structures_once();
+
+	if (!nr_free_chain_hlocks)
+		return -1;
+
+	/*
+	 * We require a minimum of 2 (u16) entries to encode a freelist
+	 * 'pointer'.
+	 */
+	req = max(req, 2);
+	bucket = size_to_bucket(req);
+
+	if (bucket != 0) {
+		curr = chain_block_buckets[bucket];
+		if (curr < 0)
+			goto search;
+
+		del_chain_block(bucket - MAX_CHAIN_BUCKETS, req,
+				chain_block_next(curr));
+		return curr;
+	}
+
+search:
+	/*
+	 * linear search in the variable-size 'dump' bucket (bucket-0);
+	 * look for an exact match or else track the largest block.
+	 */
+	for_each_chain_block(0, prev, curr, next) {
+		size = chain_block_size(curr);
+		next = chain_block_next(curr);
+
+		if (size == req) {
+			del_chain_block(prev, req, next);
+			return curr;
+		}
+
+		if (size > max_size) {
+			max_prev = prev;
+			max_curr = curr;
+			max_next = next;
+			max_size = size;
+		}
+	}
+
+	if (max_size > req) {
+		del_chain_block(max_prev, max_size, max_next);
+		add_chain_block(max_curr + req, max_size - req);
+		return max_curr;
+	}
+
+	for (size = MAX_CHAIN_BUCKETS; size > req; size--) {
+		bucket = size_to_bucket(size);
+		curr = chain_block_buckets[bucket];
+		if (curr < 0)
+			continue;
+
+		del_chain_block(bucket - MAX_CHAIN_BUCKETS, size,
+				chain_block_next(curr));
+		add_chain_block(curr + req, size - req);
+		return curr;
+	}
+
+	return -1;
+}
+
+#ifdef CONFIG_PROVE_LOCKING
+static inline void free_chain_hlocks(int base, int size)
+{
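+	/* blocks below the 2-entry freelist minimum are rounded up */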
+	add_chain_block(base, max(size, 2));
+}
+#endif
 
 struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
 {
@@ -2838,15 +3055,8 @@ static inline int add_chain_cache(struct
 	BUILD_BUG_ON((1UL << 6)  <= ARRAY_SIZE(curr->held_locks));
 	BUILD_BUG_ON((1UL << 8*sizeof(chain_hlocks[0])) <= ARRAY_SIZE(lock_classes));
 
-	if (likely(nr_chain_hlocks + chain->depth <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
-		chain->base = nr_chain_hlocks;
-		for (j = 0; j < chain->depth - 1; j++, i++) {
-			int lock_id = curr->held_locks[i].class_idx;
-			chain_hlocks[chain->base + j] = lock_id;
-		}
-		chain_hlocks[chain->base + j] = class - lock_classes;
-		nr_chain_hlocks += chain->depth;
-	} else {
+	j = alloc_chain_hlocks(chain->depth);
+	if (j < 0) {
 		if (!debug_locks_off_graph_unlock())
 			return 0;
 
@@ -2855,6 +3065,13 @@ static inline int add_chain_cache(struct
 		return 0;
 	}
 
+	chain->base = j;
+	for (j = 0; j < chain->depth - 1; j++, i++) {
+		int lock_id = curr->held_locks[i].class_idx;
+
+		chain_hlocks[chain->base + j] = lock_id;
+	}
+	chain_hlocks[chain->base + j] = class - lock_classes;
 	hlist_add_head_rcu(&chain->entry, hash_head);
 	debug_atomic_inc(chain_lookup_misses);
 	inc_chains(chain->irq_context);
@@ -4798,6 +5015,7 @@ static void remove_class_from_lock_chain
 	return;
 
 free_lock_chain:
+	free_chain_hlocks(chain->base, chain->depth);
 	/* Overwrite the chain key for concurrent RCU readers. */
 	WRITE_ONCE(chain->chain_key, INITIAL_CHAIN_KEY);
 	dec_chains(chain->irq_context);
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -140,7 +140,8 @@ extern unsigned long nr_stack_trace_entr
 extern unsigned int nr_hardirq_chains;
 extern unsigned int nr_softirq_chains;
 extern unsigned int nr_process_chains;
-extern unsigned int nr_chain_hlocks;
+extern unsigned int nr_free_chain_hlocks;
+extern unsigned int nr_large_chain_blocks;
 
 extern unsigned int max_lockdep_depth;
 extern unsigned int max_bfs_queue_depth;
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -137,7 +137,7 @@ static int lc_show(struct seq_file *m, v
 	};
 
 	if (v == SEQ_START_TOKEN) {
-		if (nr_chain_hlocks > MAX_LOCKDEP_CHAIN_HLOCKS)
+		if (!nr_free_chain_hlocks)
 			seq_printf(m, "(buggered) ");
 		seq_printf(m, "all lock chains:\n");
 		return 0;
@@ -278,8 +278,8 @@ static int lockdep_stats_show(struct seq
 #ifdef CONFIG_PROVE_LOCKING
 	seq_printf(m, " dependency chains:             %11lu [max: %lu]\n",
 			lock_chain_count(), MAX_LOCKDEP_CHAINS);
-	seq_printf(m, " dependency chain hlocks:       %11u [max: %lu]\n",
-			nr_chain_hlocks, MAX_LOCKDEP_CHAIN_HLOCKS);
+	seq_printf(m, " dependency chain hlocks:       %11lu [max: %lu]\n",
+			MAX_LOCKDEP_CHAIN_HLOCKS - nr_free_chain_hlocks, MAX_LOCKDEP_CHAIN_HLOCKS);
 #endif
 
 #ifdef CONFIG_TRACE_IRQFLAGS
@@ -351,6 +351,10 @@ static int lockdep_stats_show(struct seq
 			nr_zapped_classes);
 	seq_printf(m, " zapped lock chains:            %11lu\n",
 			nr_zapped_lock_chains);
+	seq_printf(m, " free chain hlocks:             %11u\n",
+			nr_free_chain_hlocks);
+	seq_printf(m, " large chain blocks:            %11u\n",
+			nr_large_chain_blocks);
 	return 0;
 }
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 6/8] locking/lockdep: Reuse freed chain_hlocks entries
  2020-01-15 21:43 ` [PATCH v3 6/8] locking/lockdep: Reuse freed chain_hlocks entries Waiman Long
  2020-01-16 21:13   ` Peter Zijlstra
@ 2020-01-17  5:51   ` kbuild test robot
  1 sibling, 0 replies; 16+ messages in thread
From: kbuild test robot @ 2020-01-17  5:51 UTC (permalink / raw)
  To: Waiman Long
  Cc: kbuild-all, Peter Zijlstra, Ingo Molnar, Will Deacon,
	linux-kernel, Bart Van Assche, Waiman Long

[-- Attachment #1: Type: text/plain, Size: 2793 bytes --]

Hi Waiman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/locking/core]
[also build test ERROR on tip/auto-latest linus/master arm-perf/for-next/perf v5.5-rc6 next-20200116]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest using the '--base' option to specify the
base tree in git format-patch; please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Waiman-Long/locking-lockdep-Reuse-zapped-chain_hlocks-entries/20200117-093009
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 1a365e822372ba24c9da0822bc583894f6f3d821
config: i386-randconfig-d002-20200117 (attached as .config)
compiler: gcc-7 (Debian 7.5.0-3) 7.5.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add the following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   kernel/locking/lockdep.c: In function 'lockdep_init':
>> kernel/locking/lockdep.c:5387:2: error: implicit declaration of function 'init_chain_block_buckets' [-Werror=implicit-function-declaration]
     init_chain_block_buckets();
     ^~~~~~~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/init_chain_block_buckets +5387 kernel/locking/lockdep.c

  5384	
  5385	void __init lockdep_init(void)
  5386	{
> 5387		init_chain_block_buckets();
  5388	
  5389		printk("Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar\n");
  5390	
  5391		printk("... MAX_LOCKDEP_SUBCLASSES:  %lu\n", MAX_LOCKDEP_SUBCLASSES);
  5392		printk("... MAX_LOCK_DEPTH:          %lu\n", MAX_LOCK_DEPTH);
  5393		printk("... MAX_LOCKDEP_KEYS:        %lu\n", MAX_LOCKDEP_KEYS);
  5394		printk("... CLASSHASH_SIZE:          %lu\n", CLASSHASH_SIZE);
  5395		printk("... MAX_LOCKDEP_ENTRIES:     %lu\n", MAX_LOCKDEP_ENTRIES);
  5396		printk("... MAX_LOCKDEP_CHAINS:      %lu\n", MAX_LOCKDEP_CHAINS);
  5397		printk("... CHAINHASH_SIZE:          %lu\n", CHAINHASH_SIZE);
  5398	
  5399		printk(" memory used by lock dependency info: %zu kB\n",
  5400		       (sizeof(lock_classes) +
  5401			sizeof(lock_classes_in_use) +
  5402			sizeof(classhash_table) +
  5403			sizeof(list_entries) +
  5404			sizeof(list_entries_in_use) +
  5405			sizeof(chainhash_table) +
  5406			sizeof(delayed_free)
  5407	#ifdef CONFIG_PROVE_LOCKING
  5408			+ sizeof(lock_cq)
  5409			+ sizeof(lock_chains)
  5410			+ sizeof(lock_chains_in_use)
  5411			+ sizeof(chain_hlocks)
  5412	#endif
  5413			) / 1024
  5414			);
  5415	

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 31814 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain
  2020-01-15 21:43 ` [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain Waiman Long
  2020-01-16 15:32   ` Peter Zijlstra
@ 2020-01-17 18:14   ` kbuild test robot
  1 sibling, 0 replies; 16+ messages in thread
From: kbuild test robot @ 2020-01-17 18:14 UTC (permalink / raw)
  To: Waiman Long
  Cc: kbuild-all, Peter Zijlstra, Ingo Molnar, Will Deacon,
	linux-kernel, Bart Van Assche, Waiman Long

[-- Attachment #1: Type: text/plain, Size: 3940 bytes --]

Hi Waiman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/locking/core]
[also build test ERROR on tip/auto-latest linus/master arm-perf/for-next/perf v5.5-rc6 next-20200117]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest using the '--base' option to specify the
base tree in git format-patch; please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Waiman-Long/locking-lockdep-Reuse-zapped-chain_hlocks-entries/20200117-093009
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 1a365e822372ba24c9da0822bc583894f6f3d821
config: sparc64-allmodconfig (attached as .config)
compiler: sparc64-linux-gcc (GCC) 7.5.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.5.0 make.cross ARCH=sparc64 

If you fix the issue, kindly add the following tag
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   kernel/locking/lockdep.c: In function 'inc_chains':
>> kernel/locking/lockdep.c:2304:20: error: 'LOCK_CHAIN_HARDIRQ_CONTEXT' undeclared (first use in this function); did you mean 'LOCK_USED_IN_HARDIRQ_READ'?
     if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~
                       LOCK_USED_IN_HARDIRQ_READ
   kernel/locking/lockdep.c:2304:20: note: each undeclared identifier is reported only once for each function it appears in
>> kernel/locking/lockdep.c:2306:25: error: 'LOCK_CHAIN_SOFTIRQ_CONTEXT' undeclared (first use in this function); did you mean 'LOCK_CHAIN_HARDIRQ_CONTEXT'?
     else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~
                            LOCK_CHAIN_HARDIRQ_CONTEXT
   kernel/locking/lockdep.c: In function 'dec_chains':
   kernel/locking/lockdep.c:2314:20: error: 'LOCK_CHAIN_HARDIRQ_CONTEXT' undeclared (first use in this function); did you mean 'LOCK_USED_IN_HARDIRQ_READ'?
     if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~
                       LOCK_USED_IN_HARDIRQ_READ
   kernel/locking/lockdep.c:2316:25: error: 'LOCK_CHAIN_SOFTIRQ_CONTEXT' undeclared (first use in this function); did you mean 'LOCK_CHAIN_HARDIRQ_CONTEXT'?
     else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
                            ^~~~~~~~~~~~~~~~~~~~~~~~~~
                            LOCK_CHAIN_HARDIRQ_CONTEXT
   kernel/locking/lockdep.c: In function 'task_irq_context':
   kernel/locking/lockdep.c:3612:9: error: 'LOCK_CHAIN_HARDIRQ_CONTEXT' undeclared (first use in this function); did you mean 'LOCK_USED_IN_HARDIRQ_READ'?
     return LOCK_CHAIN_HARDIRQ_CONTEXT * !!task->hardirq_context +
            ^~~~~~~~~~~~~~~~~~~~~~~~~~
            LOCK_USED_IN_HARDIRQ_READ
   kernel/locking/lockdep.c:3613:9: error: 'LOCK_CHAIN_SOFTIRQ_CONTEXT' undeclared (first use in this function); did you mean 'LOCK_CHAIN_HARDIRQ_CONTEXT'?
            LOCK_CHAIN_SOFTIRQ_CONTEXT * !!task->softirq_context;
            ^~~~~~~~~~~~~~~~~~~~~~~~~~
            LOCK_CHAIN_HARDIRQ_CONTEXT
>> kernel/locking/lockdep.c:3614:1: warning: control reaches end of non-void function [-Wreturn-type]
    }
    ^

vim +2304 kernel/locking/lockdep.c

  2301	
  2302	static void inc_chains(int irq_context)
  2303	{
> 2304		if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
  2305			nr_hardirq_chains++;
> 2306		else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
  2307			nr_softirq_chains++;
  2308		else
  2309			nr_process_chains++;
  2310	}
  2311	

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 59412 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 2/8] locking/lockdep: Display irq_context names in /proc/lockdep_chains
  2020-01-15 21:43 ` [PATCH v3 2/8] locking/lockdep: Display irq_context names in /proc/lockdep_chains Waiman Long
@ 2020-01-17 22:12   ` kbuild test robot
  0 siblings, 0 replies; 16+ messages in thread
From: kbuild test robot @ 2020-01-17 22:12 UTC (permalink / raw)
  To: Waiman Long
  Cc: kbuild-all, Peter Zijlstra, Ingo Molnar, Will Deacon,
	linux-kernel, Bart Van Assche, Waiman Long

[-- Attachment #1: Type: text/plain, Size: 3790 bytes --]

Hi Waiman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/locking/core]
[also build test ERROR on tip/auto-latest linus/master arm-perf/for-next/perf v5.5-rc6 next-20200117]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest using the '--base' option to specify the
base tree in git format-patch; please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Waiman-Long/locking-lockdep-Reuse-zapped-chain_hlocks-entries/20200117-093009
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 1a365e822372ba24c9da0822bc583894f6f3d821
config: sparc64-allmodconfig (attached as .config)
compiler: sparc64-linux-gcc (GCC) 7.5.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.5.0 make.cross ARCH=sparc64 

If you fix the issue, kindly add the following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   kernel/locking/lockdep_proc.c: In function 'lc_show':
>> kernel/locking/lockdep_proc.c:133:4: error: 'LOCK_CHAIN_HARDIRQ_CONTEXT' undeclared (first use in this function); did you mean 'LOCK_USED_IN_HARDIRQ_READ'?
      [LOCK_CHAIN_HARDIRQ_CONTEXT] = "hardirq",
       ^~~~~~~~~~~~~~~~~~~~~~~~~~
       LOCK_USED_IN_HARDIRQ_READ
   kernel/locking/lockdep_proc.c:133:4: note: each undeclared identifier is reported only once for each function it appears in
>> kernel/locking/lockdep_proc.c:133:4: error: array index in initializer not of integer type
   kernel/locking/lockdep_proc.c:133:4: note: (near initialization for 'irq_strs')
>> kernel/locking/lockdep_proc.c:134:4: error: 'LOCK_CHAIN_SOFTIRQ_CONTEXT' undeclared (first use in this function); did you mean 'LOCK_CHAIN_HARDIRQ_CONTEXT'?
      [LOCK_CHAIN_SOFTIRQ_CONTEXT] = "softirq",
       ^~~~~~~~~~~~~~~~~~~~~~~~~~
       LOCK_CHAIN_HARDIRQ_CONTEXT
   kernel/locking/lockdep_proc.c:134:4: error: array index in initializer not of integer type
   kernel/locking/lockdep_proc.c:134:4: note: (near initialization for 'irq_strs')
   kernel/locking/lockdep_proc.c:135:4: error: array index in initializer not of integer type
      [LOCK_CHAIN_SOFTIRQ_CONTEXT|
       ^~~~~~~~~~~~~~~~~~~~~~~~~~
   kernel/locking/lockdep_proc.c:135:4: note: (near initialization for 'irq_strs')

vim +133 kernel/locking/lockdep_proc.c

   125	
   126	static int lc_show(struct seq_file *m, void *v)
   127	{
   128		struct lock_chain *chain = v;
   129		struct lock_class *class;
   130		int i;
   131		static const char * const irq_strs[] = {
   132			[0]			     = "0",
 > 133			[LOCK_CHAIN_HARDIRQ_CONTEXT] = "hardirq",
 > 134			[LOCK_CHAIN_SOFTIRQ_CONTEXT] = "softirq",
   135			[LOCK_CHAIN_SOFTIRQ_CONTEXT|
   136			 LOCK_CHAIN_HARDIRQ_CONTEXT] = "hardirq|softirq",
   137		};
   138	
   139		if (v == SEQ_START_TOKEN) {
   140			if (nr_chain_hlocks > MAX_LOCKDEP_CHAIN_HLOCKS)
   141				seq_printf(m, "(buggered) ");
   142			seq_printf(m, "all lock chains:\n");
   143			return 0;
   144		}
   145	
   146		seq_printf(m, "irq_context: %s\n", irq_strs[chain->irq_context]);
   147	
   148		for (i = 0; i < chain->depth; i++) {
   149			class = lock_chain_get_class(chain, i);
   150			if (!class->key)
   151				continue;
   152	
   153			seq_printf(m, "[%p] ", class->key);
   154			print_name(m, class);
   155			seq_puts(m, "\n");
   156		}
   157		seq_puts(m, "\n");
   158	
   159		return 0;
   160	}
   161	

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 59412 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 6/8] locking/lockdep: Reuse freed chain_hlocks entries
  2020-01-16 21:13   ` Peter Zijlstra
@ 2020-01-20  4:22     ` Waiman Long
  0 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2020-01-20  4:22 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On 1/16/20 4:13 PM, Peter Zijlstra wrote:
> On Wed, Jan 15, 2020 at 04:43:11PM -0500, Waiman Long wrote:
>> +static inline int alloc_chain_hlocks_from_buckets(int size)
>> +{
>> +	int prev, curr, next;
>> +
>> +	if (!nr_free_chain_hlocks)
>> +		return -1;
>> +
>> +	if (size <= MAX_CHAIN_BUCKETS) {
>> +		curr = chain_block_buckets[size - 1];
>> +		if (curr < 0)
>> +			return -1;
>> +
>> +		chain_block_buckets[size - 1] = next_chain_block(curr);
>> +		nr_free_chain_hlocks -= size;
>> +		return curr;
>> +	}
>> +
>> +	/*
>> +	 * Look for a free chain block of the given size
>> +	 *
>> +	 * It is rare to have a lock chain with depth > MAX_CHAIN_BUCKETS.
>> +	 * It is also more expensive as we may iterate the whole list
>> +	 * without finding one.
>> +	 */
>> +	for_each_chain_block(0, prev, curr, next) {
>> +		next = next_chain_block(curr);
>> +		if (chain_block_size(curr) == size) {
>> +			set_chain_block(prev, 0, next);
>> +			nr_free_chain_hlocks -= size;
>> +			nr_large_chain_blocks--;
>> +			return curr;
>> +		}
>> +	}
>> +	return -1;
>> +}
>> +static int alloc_chain_hlocks(int size)
>> +{
>> +	int curr;
>> +
>> +	if (size < 2)
>> +		size = 2;
>> +
>> +	curr = alloc_chain_hlocks_from_buckets(size);
>> +	if (curr >= 0)
>> +		return curr;
>> +
>> +	BUILD_BUG_ON((1UL << 24) <= ARRAY_SIZE(chain_hlocks));
>> +	BUILD_BUG_ON((1UL << 6)  <= ARRAY_SIZE(current->held_locks));
>> +	BUILD_BUG_ON((1UL << 8*sizeof(chain_hlocks[0])) <=
>> +		     ARRAY_SIZE(lock_classes));
>> +
>> +	/*
>> +	 * Allocate directly from chain_hlocks.
>> +	 */
>> +	if (likely(nr_chain_hlocks + size <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
>> +		curr = nr_chain_hlocks;
>> +		nr_chain_hlocks += size;
>> +		return curr;
>> +	}
>> +	if (!debug_locks_off_graph_unlock())
>> +		return -1;
>> +
>> +	print_lockdep_off("BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!");
>> +	dump_stack();
>> +	return -1;
>> +}
> Argh, that's still _two_ half allocators.
>
> Here, please try this one, it seems to boot. It compiles with some
> noise, but that is because GCC is stupid and I'm too tired.
>
> ---
>
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1071,15 +1071,22 @@ static inline void check_data_structures
>  
>  #endif /* CONFIG_DEBUG_LOCKDEP */
>  
> +static void init_chain_block_buckets(void);
> +
>  /*
>   * Initialize the lock_classes[] array elements, the free_lock_classes list
>   * and also the delayed_free structure.
>   */
>  static void init_data_structures_once(void)
>  {
> -	static bool ds_initialized, rcu_head_initialized;
> +	static bool ds_initialized, rcu_head_initialized, chain_block_initialized;
>  	int i;
>  
> +	if (!chain_block_initialized) {
> +		chain_block_initialized = true;
> +		init_chain_block_buckets();
> +	}
> +
>  	if (likely(rcu_head_initialized))
>  		return;

Oh, I was not aware that there is such an init_data_structures_once()
function. I don't think we need a separate chain_block_initialized flag;
ds_initialized should be enough, and the init_chain_block_buckets() call
can be moved to the end of the function.
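
I.e., something like this minimal sketch (middle elided, not yet tested):

static void init_data_structures_once(void)
{
	static bool ds_initialized, rcu_head_initialized;

	if (likely(rcu_head_initialized))
		return;

	if (system_state >= SYSTEM_SCHEDULING) {
		init_rcu_head(&delayed_free.rcu_head);
		rcu_head_initialized = true;
	}

	if (ds_initialized)
		return;

	ds_initialized = true;

	/* ... existing delayed_free / lock_classes initialization ... */

	init_chain_block_buckets();	/* runs once, guarded by ds_initialized */
}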

Other than that, the rest looks OK to me so far. I will try it out tomorrow.

Thanks,
Longman


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-01-20  4:22 UTC | newest]

Thread overview: 16+ messages
2020-01-15 21:43 [PATCH v3 0/8] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
2020-01-15 21:43 ` [PATCH v3 1/8] locking/lockdep: Decrement irq context counters when removing lock chain Waiman Long
2020-01-16 15:32   ` Peter Zijlstra
2020-01-16 15:50     ` Waiman Long
2020-01-17 18:14   ` kbuild test robot
2020-01-15 21:43 ` [PATCH v3 2/8] locking/lockdep: Display irq_context names in /proc/lockdep_chains Waiman Long
2020-01-17 22:12   ` kbuild test robot
2020-01-15 21:43 ` [PATCH v3 3/8] locking/lockdep: Track number of zapped classes Waiman Long
2020-01-15 21:43 ` [PATCH v3 4/8] locking/lockdep: Throw away all lock chains with zapped class Waiman Long
2020-01-15 21:43 ` [PATCH v3 5/8] locking/lockdep: Track number of zapped lock chains Waiman Long
2020-01-15 21:43 ` [PATCH v3 6/8] locking/lockdep: Reuse freed chain_hlocks entries Waiman Long
2020-01-16 21:13   ` Peter Zijlstra
2020-01-20  4:22     ` Waiman Long
2020-01-17  5:51   ` kbuild test robot
2020-01-15 21:43 ` [PATCH v3 7/8] locking/lockdep: Add lockdep_early_init() before any lock is taken Waiman Long
2020-01-15 21:43 ` [PATCH v3 8/8] locking/lockdep: Enable chain block splitting as last resort Waiman Long
