* [PATCH 0/5] locking/lockdep: Reuse zapped chain_hlocks entries
@ 2019-12-12 22:35 Waiman Long
  2019-12-12 22:35 ` [PATCH 1/5] locking/lockdep: Track number of zapped classes Waiman Long
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Waiman Long @ 2019-12-12 22:35 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

It was found that when running a workload that kept on adding lock
classes and then zapping them repeatedly, the system would eventually
run out of chain_hlocks[] entries even though plenty of the other
lockdep data buffers were still available.

  [ 4318.443670] BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!
  [ 4318.444809] turning off the locking correctness validator.
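
Such a workload boils down to a loop that keeps registering a dynamic
lockdep key, taking a lock classified under it and then unregistering
the key again. An illustrative sketch only, not the actual test case
(function and variable names are made up):

#include <linux/lockdep.h>
#include <linux/mutex.h>
#include <linux/slab.h>

static struct mutex churn_mutex;

static void churn_lock_classes(unsigned long iters)
{
	struct lock_class_key *key;
	unsigned long i;

	for (i = 0; i < iters; i++) {
		/* lockdep_register_key() expects a non-static key object. */
		key = kzalloc(sizeof(*key), GFP_KERNEL);
		if (!key)
			return;

		lockdep_register_key(key);
		mutex_init(&churn_mutex);
		lockdep_set_class(&churn_mutex, key);

		mutex_lock(&churn_mutex);
		mutex_unlock(&churn_mutex);

		/* Zaps the class, but its chain_hlocks[] entries are not reclaimed. */
		lockdep_unregister_key(key);
		kfree(key);
	}
}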

In order to fix this problem, we have to make chain_hlocks[] entries
reusable just like the other lockdep arrays. Besides that, the patchset
adds counters for zapped classes, zapped lock chains and leaked/free
chain_hlocks entries to /proc/lockdep_stats. It also fixes the irq
context counters, which were not being decremented when a lock chain
was removed.

Waiman Long (5):
  locking/lockdep: Track number of zapped classes
  locking/lockdep: Track leaked chain_hlocks entries
  locking/lockdep: Track number of zapped lock chains
  locking/lockdep: Reuse free chain_hlocks entries
  locking/lockdep: Decrement irq context counters when removing lock
    chain

 kernel/locking/lockdep.c           | 201 ++++++++++++++++++++++-------
 kernel/locking/lockdep_internals.h |  23 +++-
 kernel/locking/lockdep_proc.c      |  17 ++-
 3 files changed, 189 insertions(+), 52 deletions(-)

-- 
2.18.1



* [PATCH 1/5] locking/lockdep: Track number of zapped classes
  2019-12-12 22:35 [PATCH 0/5] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
@ 2019-12-12 22:35 ` Waiman Long
  2019-12-12 22:35 ` [PATCH 2/5] locking/lockdep: Track leaked chain_hlocks entries Waiman Long
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2019-12-12 22:35 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

The whole point of the lockdep dynamic key patch is to allow unused
locks to be removed from the lockdep data buffers so that existing
buffer space can be reused. However, there is currently no way to find
out how many unused lock classes have been zapped, so we don't know
whether the zapping process is working properly.

Add a new nr_zapped_classes variable to track that and show it in
/proc/lockdep_stats if it is non-zero.
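
With some classes zapped, /proc/lockdep_stats then grows a line like
the following (the value shown is purely illustrative):

 zapped classes:                       1154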

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 2 ++
 kernel/locking/lockdep_internals.h | 1 +
 kernel/locking/lockdep_proc.c      | 9 +++++++++
 3 files changed, 12 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 32282e7112d3..c8d0374101b0 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -147,6 +147,7 @@ static DECLARE_BITMAP(list_entries_in_use, MAX_LOCKDEP_ENTRIES);
 #define KEYHASH_SIZE		(1UL << KEYHASH_BITS)
 static struct hlist_head lock_keys_hash[KEYHASH_SIZE];
 unsigned long nr_lock_classes;
+unsigned long nr_zapped_classes;
 #ifndef CONFIG_DEBUG_LOCKDEP
 static
 #endif
@@ -4875,6 +4876,7 @@ static void zap_class(struct pending_free *pf, struct lock_class *class)
 	}
 
 	remove_class_from_lock_chains(pf, class);
+	nr_zapped_classes++;
 }
 
 static void reinit_class(struct lock_class *class)
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index 18d85aebbb57..a31d16fd3c43 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -124,6 +124,7 @@ extern const char *__get_key_name(const struct lockdep_subclass_key *key,
 struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i);
 
 extern unsigned long nr_lock_classes;
+extern unsigned long nr_zapped_classes;
 extern unsigned long nr_list_entries;
 long lockdep_next_lockchain(long i);
 unsigned long lock_chain_count(void);
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index 9bb6d2497b04..82579fd7469f 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -336,6 +336,15 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 	seq_printf(m, " debug_locks:                   %11u\n",
 			debug_locks);
 
+	/*
+	 * Zapped classes and lockdep data buffers reuse statistics.
+	 */
+	if (!nr_zapped_classes)
+		return 0;
+
+	seq_puts(m, "\n");
+	seq_printf(m, " zapped classes:                %11lu\n",
+			nr_zapped_classes);
 	return 0;
 }
 
-- 
2.18.1



* [PATCH 2/5] locking/lockdep: Track leaked chain_hlocks entries
  2019-12-12 22:35 [PATCH 0/5] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
  2019-12-12 22:35 ` [PATCH 1/5] locking/lockdep: Track number of zapped classes Waiman Long
@ 2019-12-12 22:35 ` Waiman Long
  2019-12-12 22:35 ` [PATCH 3/5] locking/lockdep: Track number of zapped lock chains Waiman Long
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2019-12-12 22:35 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

If a lock chain contains a class that is zapped, the whole lock chain
becomes invalid. There is no point in keeping a partial chain around in
the hope that it can be reused. So just discard the corresponding
chain_hlocks entries and account for them in a new
nr_leaked_chain_hlocks stat counter which is shown as part of
/proc/lockdep_stats.

This patch also changes the type of nr_chain_hlocks to unsigned integer
to be consistent with the other counters.

A later patch will reuse the freed chain_hlocks entries.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 40 ++++++------------------------
 kernel/locking/lockdep_internals.h |  5 ++--
 kernel/locking/lockdep_proc.c      |  4 ++-
 3 files changed, 14 insertions(+), 35 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c8d0374101b0..d03503b348f1 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2624,8 +2624,9 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 
 struct lock_chain lock_chains[MAX_LOCKDEP_CHAINS];
 static DECLARE_BITMAP(lock_chains_in_use, MAX_LOCKDEP_CHAINS);
-int nr_chain_hlocks;
 static u16 chain_hlocks[MAX_LOCKDEP_CHAIN_HLOCKS];
+unsigned int nr_chain_hlocks;
+unsigned int nr_leaked_chain_hlocks;
 
 struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
 {
@@ -4770,57 +4771,32 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 					 struct lock_class *class)
 {
 #ifdef CONFIG_PROVE_LOCKING
-	struct lock_chain *new_chain;
-	u64 chain_key;
 	int i;
 
 	for (i = chain->base; i < chain->base + chain->depth; i++) {
 		if (chain_hlocks[i] != class - lock_classes)
 			continue;
-		/* The code below leaks one chain_hlock[] entry. */
-		if (--chain->depth > 0) {
-			memmove(&chain_hlocks[i], &chain_hlocks[i + 1],
-				(chain->base + chain->depth - i) *
-				sizeof(chain_hlocks[0]));
-		}
+		/* Account for leaked chain_hlock[] entries. */
+		nr_leaked_chain_hlocks += chain->depth;
+
 		/*
 		 * Each lock class occurs at most once in a lock chain so once
 		 * we found a match we can break out of this loop.
 		 */
-		goto recalc;
+		goto free_lock_chain;
 	}
 	/* Since the chain has not been modified, return. */
 	return;
 
-recalc:
-	chain_key = INITIAL_CHAIN_KEY;
-	for (i = chain->base; i < chain->base + chain->depth; i++)
-		chain_key = iterate_chain_key(chain_key, chain_hlocks[i]);
-	if (chain->depth && chain->chain_key == chain_key)
-		return;
+free_lock_chain:
 	/* Overwrite the chain key for concurrent RCU readers. */
-	WRITE_ONCE(chain->chain_key, chain_key);
+	WRITE_ONCE(chain->chain_key, INITIAL_CHAIN_KEY);
 	/*
 	 * Note: calling hlist_del_rcu() from inside a
 	 * hlist_for_each_entry_rcu() loop is safe.
 	 */
 	hlist_del_rcu(&chain->entry);
 	__set_bit(chain - lock_chains, pf->lock_chains_being_freed);
-	if (chain->depth == 0)
-		return;
-	/*
-	 * If the modified lock chain matches an existing lock chain, drop
-	 * the modified lock chain.
-	 */
-	if (lookup_chain_cache(chain_key))
-		return;
-	new_chain = alloc_lock_chain();
-	if (WARN_ON_ONCE(!new_chain)) {
-		debug_locks_off();
-		return;
-	}
-	*new_chain = *chain;
-	hlist_add_head_rcu(&new_chain->entry, chainhashentry(chain_key));
 #endif
 }
 
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index a31d16fd3c43..b6ec9595ae88 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -128,14 +128,15 @@ extern unsigned long nr_zapped_classes;
 extern unsigned long nr_list_entries;
 long lockdep_next_lockchain(long i);
 unsigned long lock_chain_count(void);
-extern int nr_chain_hlocks;
 extern unsigned long nr_stack_trace_entries;
 
 extern unsigned int nr_hardirq_chains;
 extern unsigned int nr_softirq_chains;
 extern unsigned int nr_process_chains;
-extern unsigned int max_lockdep_depth;
+extern unsigned int nr_chain_hlocks;
+extern unsigned int nr_leaked_chain_hlocks;
 
+extern unsigned int max_lockdep_depth;
 extern unsigned int max_bfs_queue_depth;
 
 #ifdef CONFIG_PROVE_LOCKING
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index 82579fd7469f..aab524530115 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -271,7 +271,7 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 #ifdef CONFIG_PROVE_LOCKING
 	seq_printf(m, " dependency chains:             %11lu [max: %lu]\n",
 			lock_chain_count(), MAX_LOCKDEP_CHAINS);
-	seq_printf(m, " dependency chain hlocks:       %11d [max: %lu]\n",
+	seq_printf(m, " dependency chain hlocks:       %11u [max: %lu]\n",
 			nr_chain_hlocks, MAX_LOCKDEP_CHAIN_HLOCKS);
 #endif
 
@@ -345,6 +345,8 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 	seq_puts(m, "\n");
 	seq_printf(m, " zapped classes:                %11lu\n",
 			nr_zapped_classes);
+	seq_printf(m, " leaked chain hlocks:           %11u\n",
+			nr_leaked_chain_hlocks);
 	return 0;
 }
 
-- 
2.18.1



* [PATCH 3/5] locking/lockdep: Track number of zapped lock chains
  2019-12-12 22:35 [PATCH 0/5] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
  2019-12-12 22:35 ` [PATCH 1/5] locking/lockdep: Track number of zapped classes Waiman Long
  2019-12-12 22:35 ` [PATCH 2/5] locking/lockdep: Track leaked chain_hlocks entries Waiman Long
@ 2019-12-12 22:35 ` Waiman Long
  2019-12-12 22:35 ` [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries Waiman Long
  2019-12-12 22:35 ` [PATCH 5/5] locking/lockdep: Decrement irq context counters when removing lock chain Waiman Long
  4 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2019-12-12 22:35 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

Add a new counter, nr_zapped_lock_chains, to track the number of lock
chains that have been removed.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 2 ++
 kernel/locking/lockdep_internals.h | 1 +
 kernel/locking/lockdep_proc.c      | 2 ++
 3 files changed, 5 insertions(+)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index d03503b348f1..b543bcc5ff50 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2625,6 +2625,7 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 struct lock_chain lock_chains[MAX_LOCKDEP_CHAINS];
 static DECLARE_BITMAP(lock_chains_in_use, MAX_LOCKDEP_CHAINS);
 static u16 chain_hlocks[MAX_LOCKDEP_CHAIN_HLOCKS];
+unsigned long nr_zapped_lock_chains;
 unsigned int nr_chain_hlocks;
 unsigned int nr_leaked_chain_hlocks;
 
@@ -4797,6 +4798,7 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 	 */
 	hlist_del_rcu(&chain->entry);
 	__set_bit(chain - lock_chains, pf->lock_chains_being_freed);
+	nr_zapped_lock_chains++;
 #endif
 }
 
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index b6ec9595ae88..4c23ab7d27c2 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -125,6 +125,7 @@ struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i);
 
 extern unsigned long nr_lock_classes;
 extern unsigned long nr_zapped_classes;
+extern unsigned long nr_zapped_lock_chains;
 extern unsigned long nr_list_entries;
 long lockdep_next_lockchain(long i);
 unsigned long lock_chain_count(void);
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index aab524530115..65a1b96238ef 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -345,6 +345,8 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 	seq_puts(m, "\n");
 	seq_printf(m, " zapped classes:                %11lu\n",
 			nr_zapped_classes);
+	seq_printf(m, " zapped lock chains:            %11lu\n",
+			nr_zapped_lock_chains);
 	seq_printf(m, " leaked chain hlocks:           %11u\n",
 			nr_leaked_chain_hlocks);
 	return 0;
-- 
2.18.1



* [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-12 22:35 [PATCH 0/5] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
                   ` (2 preceding siblings ...)
  2019-12-12 22:35 ` [PATCH 3/5] locking/lockdep: Track number of zapped lock chains Waiman Long
@ 2019-12-12 22:35 ` Waiman Long
  2019-12-13 10:25   ` Peter Zijlstra
  2019-12-12 22:35 ` [PATCH 5/5] locking/lockdep: Decrement irq context counters when removing lock chain Waiman Long
  4 siblings, 1 reply; 16+ messages in thread
From: Waiman Long @ 2019-12-12 22:35 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

Once a lock class is zapped, all the lock chains that include the zapped
class are essentially useless. The lock_chain structure itself can be
reused, but not the corresponding chain_hlocks[] entries. Over time,
we will run out of chain_hlocks entries while there are still plenty
of other lockdep array entries available.

To fix this imbalance, we have to make chain_hlocks entries reusable
just like the others. Since the freed chain_hlocks entries come in
blocks of various lengths, a simple bitmap like the one used for the
other reusable lockdep arrays isn't applicable. Instead, a new
chain_hlocks_block structure is used to record the location and the
length of each free block. These chain_hlocks_block structures are then
put into an array of hashed lists for easier lookup by block size.

By reusing the chain_hlocks entries, we are able to handle workloads
that add and zap a lot of lock classes, as long as the total number of
outstanding lock classes at any time remains within a reasonable limit.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 130 ++++++++++++++++++++++++++---
 kernel/locking/lockdep_internals.h |  10 ++-
 kernel/locking/lockdep_proc.c      |   2 +
 3 files changed, 131 insertions(+), 11 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index b543bcc5ff50..97c17ba85d29 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2628,6 +2628,88 @@ static u16 chain_hlocks[MAX_LOCKDEP_CHAIN_HLOCKS];
 unsigned long nr_zapped_lock_chains;
 unsigned int nr_chain_hlocks;
 unsigned int nr_leaked_chain_hlocks;
+unsigned int nr_free_chain_hlocks;
+
+#ifdef CONFIG_PROVE_LOCKING
+static struct chain_hlocks_block chain_hlocks_blocks[MAX_CHAIN_HLOCKS_BLOCKS];
+static struct chain_hlocks_block *chain_hlocks_block_hash[CHAIN_HLOCKS_HASH];
+static struct chain_hlocks_block *free_chain_hlocks_blocks;
+
+/*
+ * Graph lock must be held before calling the chain_hlocks_block functions.
+ * Chain hlocks of depth 1-(CHAIN_HLOCKS_HASH-1) is mapped directly to
+ * chain_hlocks_block_hash[1-(CHAIN_HLOCKS_HASH-1)]. All other sizes
+ * are mapped to chain_hlocks_block_hash[0].
+ */
+static inline struct chain_hlocks_block *alloc_chain_hlocks_block(void)
+{
+	struct chain_hlocks_block *block = free_chain_hlocks_blocks;
+
+	WARN_ONCE(!debug_locks_silent && !block,
+		  "Running out of chain_hlocks_block\n");
+	free_chain_hlocks_blocks = block ? block->next : NULL;
+	return block;
+}
+
+static inline void free_chain_hlocks_block(struct chain_hlocks_block *block)
+{
+	block->next = free_chain_hlocks_blocks;
+	free_chain_hlocks_blocks = block;
+}
+
+static inline void push_chain_hlocks_block(struct chain_hlocks_block *block)
+{
+	int hash, depth = block->depth;
+
+	hash = (depth >= CHAIN_HLOCKS_HASH) ? 0 : depth;
+	block->next = chain_hlocks_block_hash[hash];
+	chain_hlocks_block_hash[hash] = block;
+	nr_free_chain_hlocks += depth;
+}
+
+static inline struct chain_hlocks_block *pop_chain_hlocks_block(int depth)
+{
+	struct chain_hlocks_block *curr, **pprev;
+
+	if (!nr_free_chain_hlocks)
+		return NULL;
+
+	if (depth < CHAIN_HLOCKS_HASH) {
+		curr = chain_hlocks_block_hash[depth];
+		if (curr) {
+			chain_hlocks_block_hash[depth] = curr->next;
+			nr_free_chain_hlocks -= depth;
+		}
+		return curr;
+	}
+
+	/*
+	 * For depth >= CHAIN_HLOCKS_HASH, it is not really a pop operation.
+	 * Instead, the first entry with the right size is returned.
+	 */
+	curr  = chain_hlocks_block_hash[0];
+	pprev = chain_hlocks_block_hash;
+
+	while (curr) {
+		if (curr->depth == depth)
+			break;
+		pprev = &(curr->next);
+		curr = curr->next;
+	}
+
+	if (curr) {
+		*pprev = curr->next;
+		nr_free_chain_hlocks -= depth;
+	}
+	return curr;
+}
+#else
+static inline void free_chain_hlocks_block(struct chain_hlocks_block *block) { }
+static inline struct chain_hlocks_block *pop_chain_hlocks_block(int depth)
+{
+	return NULL;
+}
+#endif /* CONFIG_PROVE_LOCKING */
 
 struct lock_class *lock_chain_get_class(struct lock_chain *chain, int i)
 {
@@ -2800,7 +2882,8 @@ static inline int add_chain_cache(struct task_struct *curr,
 	struct lock_class *class = hlock_class(hlock);
 	struct hlist_head *hash_head = chainhashentry(chain_key);
 	struct lock_chain *chain;
-	int i, j;
+	struct chain_hlocks_block *chblock;
+	int i, j, lock_id;
 
 	/*
 	 * The caller must hold the graph lock, ensure we've got IRQs
@@ -2827,14 +2910,12 @@ static inline int add_chain_cache(struct task_struct *curr,
 	BUILD_BUG_ON((1UL << 24) <= ARRAY_SIZE(chain_hlocks));
 	BUILD_BUG_ON((1UL << 6)  <= ARRAY_SIZE(curr->held_locks));
 	BUILD_BUG_ON((1UL << 8*sizeof(chain_hlocks[0])) <= ARRAY_SIZE(lock_classes));
-
-	if (likely(nr_chain_hlocks + chain->depth <= MAX_LOCKDEP_CHAIN_HLOCKS)) {
+	chblock = pop_chain_hlocks_block(chain->depth);
+	if (chblock) {
+		chain->base = chblock->base;
+		free_chain_hlocks_block(chblock);
+	} else if (nr_chain_hlocks + chain->depth <= MAX_LOCKDEP_CHAIN_HLOCKS) {
 		chain->base = nr_chain_hlocks;
-		for (j = 0; j < chain->depth - 1; j++, i++) {
-			int lock_id = curr->held_locks[i].class_idx;
-			chain_hlocks[chain->base + j] = lock_id;
-		}
-		chain_hlocks[chain->base + j] = class - lock_classes;
 		nr_chain_hlocks += chain->depth;
 	} else {
 		if (!debug_locks_off_graph_unlock())
@@ -2845,6 +2926,11 @@ static inline int add_chain_cache(struct task_struct *curr,
 		return 0;
 	}
 
+	for (j = 0; j < chain->depth - 1; j++, i++) {
+		lock_id = curr->held_locks[i].class_idx;
+		chain_hlocks[chain->base + j] = lock_id;
+	}
+	chain_hlocks[chain->base + j] = class - lock_classes;
 	hlist_add_head_rcu(&chain->entry, hash_head);
 	debug_atomic_inc(chain_lookup_misses);
 	inc_chains();
@@ -4773,12 +4859,24 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 {
 #ifdef CONFIG_PROVE_LOCKING
 	int i;
+	struct chain_hlocks_block *chblock;
 
 	for (i = chain->base; i < chain->base + chain->depth; i++) {
 		if (chain_hlocks[i] != class - lock_classes)
 			continue;
-		/* Account for leaked chain_hlock[] entries. */
-		nr_leaked_chain_hlocks += chain->depth;
+		/*
+		 * Try to save the free chain_hlock[] entries in a
+		 * chain_hlocks_block. If no free chain_hlocks_block is
+		 * available, count the entries in nr_leaked_chain_hlocks.
+		 */
+		chblock = alloc_chain_hlocks_block();
+		if (chblock) {
+			chblock->base  = chain->base;
+			chblock->depth = chain->depth;
+			push_chain_hlocks_block(chblock);
+		} else {
+			nr_leaked_chain_hlocks += chain->depth;
+		}
 
 		/*
 		 * Each lock class occurs at most once in a lock chain so once
@@ -5217,6 +5315,18 @@ void __init lockdep_init(void)
 
 	printk(" per task-struct memory footprint: %zu bytes\n",
 	       sizeof(((struct task_struct *)NULL)->held_locks));
+
+#ifdef CONFIG_PROVE_LOCKING
+	{
+		int i;
+
+		/* Put all chain hlock blocks into the free list */
+		free_chain_hlocks_blocks = chain_hlocks_blocks;
+		for (i = 0; i < MAX_CHAIN_HLOCKS_BLOCKS - 1; i++)
+			chain_hlocks_blocks[i].next = &chain_hlocks_blocks[i+1];
+		chain_hlocks_blocks[i].next = NULL;
+	}
+#endif
 }
 
 static void
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index 4c23ab7d27c2..999cd714e0d1 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -107,8 +107,15 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 #endif
 
 #define MAX_LOCKDEP_CHAINS	(1UL << MAX_LOCKDEP_CHAINS_BITS)
-
 #define MAX_LOCKDEP_CHAIN_HLOCKS (MAX_LOCKDEP_CHAINS*5)
+#define MAX_CHAIN_HLOCKS_BLOCKS	1024
+#define CHAIN_HLOCKS_HASH	8
+
+struct chain_hlocks_block {
+	unsigned int		   depth: 8,
+				   base :24;
+	struct chain_hlocks_block *next;
+};
 
 extern struct list_head all_lock_classes;
 extern struct lock_chain lock_chains[];
@@ -136,6 +143,7 @@ extern unsigned int nr_softirq_chains;
 extern unsigned int nr_process_chains;
 extern unsigned int nr_chain_hlocks;
 extern unsigned int nr_leaked_chain_hlocks;
+extern unsigned int nr_free_chain_hlocks;
 
 extern unsigned int max_lockdep_depth;
 extern unsigned int max_bfs_queue_depth;
diff --git a/kernel/locking/lockdep_proc.c b/kernel/locking/lockdep_proc.c
index 65a1b96238ef..126e6869412f 100644
--- a/kernel/locking/lockdep_proc.c
+++ b/kernel/locking/lockdep_proc.c
@@ -349,6 +349,8 @@ static int lockdep_stats_show(struct seq_file *m, void *v)
 			nr_zapped_lock_chains);
 	seq_printf(m, " leaked chain hlocks:           %11u\n",
 			nr_leaked_chain_hlocks);
+	seq_printf(m, " free chain hlocks:             %11u\n",
+			nr_free_chain_hlocks);
 	return 0;
 }
 
-- 
2.18.1



* [PATCH 5/5] locking/lockdep: Decrement irq context counters when removing lock chain
  2019-12-12 22:35 [PATCH 0/5] locking/lockdep: Reuse zapped chain_hlocks entries Waiman Long
                   ` (3 preceding siblings ...)
  2019-12-12 22:35 ` [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries Waiman Long
@ 2019-12-12 22:35 ` Waiman Long
  4 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2019-12-12 22:35 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

There are currently three counters to track the irq context of a lock
chain - nr_hardirq_chains, nr_softirq_chains and nr_process_chains.
They are incremented when a new lock chain is added, but they are not
decremented when a lock chain is removed. That causes some of the
statistics reported by /proc/lockdep_stats to be incorrect.

Fix that by decrementing the right counter when a lock chain is removed.

Fixes: a0b0fd53e1e6 ("locking/lockdep: Free lock classes that are no longer in use")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c           | 35 +++++++++++++++++++++---------
 kernel/locking/lockdep_internals.h |  6 +++++
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 97c17ba85d29..1d8f2fcd4bb4 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2300,16 +2300,24 @@ static int check_irq_usage(struct task_struct *curr, struct held_lock *prev,
 	return 0;
 }
 
-static void inc_chains(void)
+static void inc_chains(int irq_context)
 {
-	if (current->hardirq_context)
+	if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
 		nr_hardirq_chains++;
-	else {
-		if (current->softirq_context)
-			nr_softirq_chains++;
-		else
-			nr_process_chains++;
-	}
+	else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
+		nr_softirq_chains++;
+	else
+		nr_process_chains++;
+}
+
+static void dec_chains(int irq_context)
+{
+	if (irq_context & LOCK_CHAIN_HARDIRQ_CONTEXT)
+		nr_hardirq_chains--;
+	else if (irq_context & LOCK_CHAIN_SOFTIRQ_CONTEXT)
+		nr_softirq_chains--;
+	else
+		nr_process_chains--;
 }
 
 #else
@@ -2325,6 +2333,10 @@ static inline void inc_chains(void)
 	nr_process_chains++;
 }
 
+static void dec_chains(int irq_context)
+{
+	nr_process_chains--;
+}
 #endif /* CONFIG_TRACE_IRQFLAGS */
 
 static void
@@ -2933,7 +2945,7 @@ static inline int add_chain_cache(struct task_struct *curr,
 	chain_hlocks[chain->base + j] = class - lock_classes;
 	hlist_add_head_rcu(&chain->entry, hash_head);
 	debug_atomic_inc(chain_lookup_misses);
-	inc_chains();
+	inc_chains(chain->irq_context);
 
 	return 1;
 }
@@ -3686,7 +3698,8 @@ mark_usage(struct task_struct *curr, struct held_lock *hlock, int check)
 
 static inline unsigned int task_irq_context(struct task_struct *task)
 {
-	return 2 * !!task->hardirq_context + !!task->softirq_context;
+	return LOCK_CHAIN_HARDIRQ_CONTEXT * !!task->hardirq_context +
+	       LOCK_CHAIN_SOFTIRQ_CONTEXT * !!task->softirq_context;
 }
 
 static int separate_irq_context(struct task_struct *curr,
@@ -4890,6 +4903,8 @@ static void remove_class_from_lock_chain(struct pending_free *pf,
 free_lock_chain:
 	/* Overwrite the chain key for concurrent RCU readers. */
 	WRITE_ONCE(chain->chain_key, INITIAL_CHAIN_KEY);
+	dec_chains(chain->irq_context);
+
 	/*
 	 * Note: calling hlist_del_rcu() from inside a
 	 * hlist_for_each_entry_rcu() loop is safe.
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index 999cd714e0d1..26e387d3155a 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -98,6 +98,12 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 
 #define MAX_LOCKDEP_CHAINS_BITS	16
 
+/*
+ * Bit definitions for lock_chain.irq_context
+ */
+#define LOCK_CHAIN_SOFTIRQ_CONTEXT	(1 << 0)
+#define LOCK_CHAIN_HARDIRQ_CONTEXT	(1 << 1)
+
 /*
  * Stack-trace: tightly packed array of stack backtrace
  * addresses. Protected by the hash_lock.
-- 
2.18.1



* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-12 22:35 ` [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries Waiman Long
@ 2019-12-13 10:25   ` Peter Zijlstra
  2019-12-13 10:50     ` Peter Zijlstra
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Peter Zijlstra @ 2019-12-13 10:25 UTC (permalink / raw)
  To: Waiman Long; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On Thu, Dec 12, 2019 at 05:35:24PM -0500, Waiman Long wrote:

> diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
> index 4c23ab7d27c2..999cd714e0d1 100644
> --- a/kernel/locking/lockdep_internals.h
> +++ b/kernel/locking/lockdep_internals.h
> @@ -107,8 +107,15 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
>  #endif
>  
>  #define MAX_LOCKDEP_CHAINS	(1UL << MAX_LOCKDEP_CHAINS_BITS)
> -
>  #define MAX_LOCKDEP_CHAIN_HLOCKS (MAX_LOCKDEP_CHAINS*5)
> +#define MAX_CHAIN_HLOCKS_BLOCKS	1024
> +#define CHAIN_HLOCKS_HASH	8
> +
> +struct chain_hlocks_block {
> +	unsigned int		   depth: 8,
> +				   base :24;
> +	struct chain_hlocks_block *next;
> +};
>  
>  extern struct list_head all_lock_classes;
>  extern struct lock_chain lock_chains[];

This doesn't need to be in the header, there are no users outside of the
stuff you wrote below.

> +#ifdef CONFIG_PROVE_LOCKING
> +static struct chain_hlocks_block chain_hlocks_blocks[MAX_CHAIN_HLOCKS_BLOCKS];
> +static struct chain_hlocks_block *chain_hlocks_block_hash[CHAIN_HLOCKS_HASH];
> +static struct chain_hlocks_block *free_chain_hlocks_blocks;
> +
> +/*
> + * Graph lock must be held before calling the chain_hlocks_block functions.
> + * Chain hlocks of depth 1-(CHAIN_HLOCKS_HASH-1) is mapped directly to
> + * chain_hlocks_block_hash[1-(CHAIN_HLOCKS_HASH-1)]. All other sizes
> + * are mapped to chain_hlocks_block_hash[0].
> + */
> +static inline struct chain_hlocks_block *alloc_chain_hlocks_block(void)
> +{
> +	struct chain_hlocks_block *block = free_chain_hlocks_blocks;
> +
> +	WARN_ONCE(!debug_locks_silent && !block,
> +		  "Running out of chain_hlocks_block\n");
> +	free_chain_hlocks_blocks = block ? block->next : NULL;
> +	return block;
> +}
> +
> +static inline void free_chain_hlocks_block(struct chain_hlocks_block *block)
> +{
> +	block->next = free_chain_hlocks_blocks;
> +	free_chain_hlocks_blocks = block;
> +}
> +
> +static inline void push_chain_hlocks_block(struct chain_hlocks_block *block)
> +{
> +	int hash, depth = block->depth;
> +
> +	hash = (depth >= CHAIN_HLOCKS_HASH) ? 0 : depth;
> +	block->next = chain_hlocks_block_hash[hash];
> +	chain_hlocks_block_hash[hash] = block;
> +	nr_free_chain_hlocks += depth;
> +}

I would argue this is not a hash, these are buckets. You're doing
regular size buckets.

> +static inline struct chain_hlocks_block *pop_chain_hlocks_block(int depth)
> +{
> +	struct chain_hlocks_block *curr, **pprev;
> +
> +	if (!nr_free_chain_hlocks)
> +		return NULL;
> +
> +	if (depth < CHAIN_HLOCKS_HASH) {
> +		curr = chain_hlocks_block_hash[depth];
> +		if (curr) {
> +			chain_hlocks_block_hash[depth] = curr->next;
> +			nr_free_chain_hlocks -= depth;
> +		}
> +		return curr;
> +	}
> +
> +	/*
> +	 * For depth >= CHAIN_HLOCKS_HASH, it is not really a pop operation.
> +	 * Instead, the first entry with the right size is returned.
> +	 */
> +	curr  = chain_hlocks_block_hash[0];
> +	pprev = chain_hlocks_block_hash;
> +
> +	while (curr) {
> +		if (curr->depth == depth)
> +			break;
> +		pprev = &(curr->next);
> +		curr = curr->next;
> +	}
> +
> +	if (curr) {
> +		*pprev = curr->next;
> +		nr_free_chain_hlocks -= depth;
> +	}
> +	return curr;
> +}
> +#else
> +static inline void free_chain_hlocks_block(struct chain_hlocks_block *block) { }
> +static inline struct chain_hlocks_block *pop_chain_hlocks_block(int depth)
> +{
> +	return NULL;
> +}
> +#endif /* CONFIG_PROVE_LOCKING */

You've made a bit of a mess of things though. Why is that push/pop crud
exposed? Why didn't you build a self contained allocator?

All you really want is:

u16 *alloc_chain(int size);
void free_chain(int size, u16 *chain);


An implementation would be something along these lines (completely
untested, fresh from the keyboard):

struct chain_block {
	int size;
	int base;
	struct chain_block *next;
};

struct chain_block chain_blocks[MAX_BLOCKS];
struct chain_block *free_blocks;


struct chain_block init_block = {
	.size = MAX_LOCKDEP_CHAIN_HLOCKS,
	.base = 0,
	.next = NULL,
};

struct chain_block *free_list[MAX_BUCKETS] = {
	&init_block, /* 0 */
};


void free_block(struct chain_block *b)
{
	b->next = free_blocks;
	free_blocks = b;
}

struct chain_block *alloc_block(void)
{
	struct chain_block *block = free_blocks;
	free_blocks = block->next;
	return block;
}

struct chain_block *pop_block(struct chain_block **bp)
{
	struct chain_block *b = *bp;
	if (!b)
		return NULL;
	*bp = b->next;
}

void push_block(struct chain_block **bp, struct chain_block *b)
{
	b->next = *bp;
	*bp = b;
}

/* could contemplate ilog2() buckets */
int size2bucket(int size)
{
	return size >= MAX_BUCKET ? 0 : size;
}

/* bucket[0] is mixed size */
struct chain_block *pop_block_0(struct chain_block **bp, int size)
{
	struct chain_block **p = bp, *b = *bp;
	if (!b)
		return NULL;

	p = bp;
	while (b && b->size < size) {
		p = &b->next;
		b = b->next;
	}
	if (!b)
		return NULL;

	*p = b->next;
	return b;
}

u16 *alloc_chain(int size)
{
	int i, bucket = size2bucket(size);
	struct chain_block *b;
	u16 *chain;

	if (!bucket) {
		b = pop_block_0(&free_list[0], size);
		if (b)
			goto got_block;
	} else {
		b = pop_block(&free_list[bucket]);
		if (b)
			goto got_block;

		/* pop a large block, hope the fragment is still useful */
		b = pop_block_0(&free_list[0], size);
		if (b)
			goto got_block;

		for (i = MAX_BUCKETS-1; i > bucket; i--)
			b = pop_block(&free_list[bucket]);
			if (b)
				goto got_block;
		}
	}
	return NULL;

got_block:

	chain = chain_hlocks + b->base;
	b->base += size;
	b->size -= size;

	if (b->size) {
		/* return the fragment */
		bucket = size2bucket(b->size);
		push_block(&free_list[bucket], b);
	} else {
		free_block(b);
	}

	return chain;
}

void free_chain(int size, u16 *chain)
{
	struct chain_block *b = alloc_block();
	int bucket = size2bucket(size);

	if (!b) {
		// leak stuff;
		return;
	}

	b->size = size;
	b->base = chain - chain_hlocks;

	push_bucket(&free_list[bucket], b);
}

void init_blocks(void)
{
	int i;

	for (i = 0; i < MAX_BLOCKS; i++)
		free_block(chain_blocks[i]);
}
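
Roughly, the call sites touched by your patch would then reduce to
something like this (also untested, just to show the shape; the error
handling mirrors the existing add_chain_cache() path):

/* in add_chain_cache() */
u16 *entries = alloc_chain(chain->depth);

if (!entries) {
	if (!debug_locks_off_graph_unlock())
		return 0;
	print_lockdep_off("BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!");
	dump_stack();
	return 0;
}
chain->base = entries - chain_hlocks;

/* in remove_class_from_lock_chain() */
free_chain(chain->depth, chain_hlocks + chain->base);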


* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-13 10:25   ` Peter Zijlstra
@ 2019-12-13 10:50     ` Peter Zijlstra
  2019-12-13 16:02       ` Waiman Long
  2019-12-13 15:02     ` Waiman Long
  2019-12-13 18:43     ` Peter Zijlstra
  2 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2019-12-13 10:50 UTC (permalink / raw)
  To: Waiman Long; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On Fri, Dec 13, 2019 at 11:25:25AM +0100, Peter Zijlstra wrote:

> void push_block(struct chain_block **bp, struct chain_block *b)
> {
> 	b->next = *bp;
> 	*bp = b;
> }
> 
> /* could contemplate ilog2() buckets */
> int size2bucket(int size)
> {
> 	return size >= MAX_BUCKET ? 0 : size;
> }

If you make the allocation granularity 4 entries, then you can have
push_block() store the chain_block * at the start of every free entry,
which would enable merging adjacent blocks.

That is, if I'm not mistaken these u16 chain entries encode the class
index (per lock_chain_get_class()). And since the lock_classes array is
MAX_LOCKDEP_KEYS, or 13 bits, big, we have the MSB of each entry spare.

union {
	struct {
		u16 hlock[4];
	}
	u64 addr;
} ponies;

So if we make the rule that:

	!(idx % 4) && (((union ponies *)chain_hlock[idx])->addr & BIT_ULL(63))

encodes a free block at said addr, then we should be able to detect if:

	chain_hlock[b->base + b->size]

is also free and merge the blocks.

(we just have to fix up the MSB of the address, not all arches have
negative addresses for kernel space)


* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-13 10:25   ` Peter Zijlstra
  2019-12-13 10:50     ` Peter Zijlstra
@ 2019-12-13 15:02     ` Waiman Long
  2019-12-13 18:43     ` Peter Zijlstra
  2 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2019-12-13 15:02 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On 12/13/19 5:25 AM, Peter Zijlstra wrote:
> On Thu, Dec 12, 2019 at 05:35:24PM -0500, Waiman Long wrote:
>
>> diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
>> index 4c23ab7d27c2..999cd714e0d1 100644
>> --- a/kernel/locking/lockdep_internals.h
>> +++ b/kernel/locking/lockdep_internals.h
>> @@ -107,8 +107,15 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
>>  #endif
>>  
>>  #define MAX_LOCKDEP_CHAINS	(1UL << MAX_LOCKDEP_CHAINS_BITS)
>> -
>>  #define MAX_LOCKDEP_CHAIN_HLOCKS (MAX_LOCKDEP_CHAINS*5)
>> +#define MAX_CHAIN_HLOCKS_BLOCKS	1024
>> +#define CHAIN_HLOCKS_HASH	8
>> +
>> +struct chain_hlocks_block {
>> +	unsigned int		   depth: 8,
>> +				   base :24;
>> +	struct chain_hlocks_block *next;
>> +};
>>  
>>  extern struct list_head all_lock_classes;
>>  extern struct lock_chain lock_chains[];
> This doesn't need to be in the header, there are no users outside of the
> stuff you wrote below.
>
That is true.
>> +#ifdef CONFIG_PROVE_LOCKING
>> +static struct chain_hlocks_block chain_hlocks_blocks[MAX_CHAIN_HLOCKS_BLOCKS];
>> +static struct chain_hlocks_block *chain_hlocks_block_hash[CHAIN_HLOCKS_HASH];
>> +static struct chain_hlocks_block *free_chain_hlocks_blocks;
>> +
>> +/*
>> + * Graph lock must be held before calling the chain_hlocks_block functions.
>> + * Chain hlocks of depth 1-(CHAIN_HLOCKS_HASH-1) is mapped directly to
>> + * chain_hlocks_block_hash[1-(CHAIN_HLOCKS_HASH-1)]. All other sizes
>> + * are mapped to chain_hlocks_block_hash[0].
>> + */
>> +static inline struct chain_hlocks_block *alloc_chain_hlocks_block(void)
>> +{
>> +	struct chain_hlocks_block *block = free_chain_hlocks_blocks;
>> +
>> +	WARN_ONCE(!debug_locks_silent && !block,
>> +		  "Running out of chain_hlocks_block\n");
>> +	free_chain_hlocks_blocks = block ? block->next : NULL;
>> +	return block;
>> +}
>> +
>> +static inline void free_chain_hlocks_block(struct chain_hlocks_block *block)
>> +{
>> +	block->next = free_chain_hlocks_blocks;
>> +	free_chain_hlocks_blocks = block;
>> +}
>> +
>> +static inline void push_chain_hlocks_block(struct chain_hlocks_block *block)
>> +{
>> +	int hash, depth = block->depth;
>> +
>> +	hash = (depth >= CHAIN_HLOCKS_HASH) ? 0 : depth;
>> +	block->next = chain_hlocks_block_hash[hash];
>> +	chain_hlocks_block_hash[hash] = block;
>> +	nr_free_chain_hlocks += depth;
>> +}
> I would argue this is not a hash, these are buckets. You're doing
> regular size buckets.

I think I used the wrong terminology here. Bucket is a better description.


>
>> +static inline struct chain_hlocks_block *pop_chain_hlocks_block(int depth)
>> +{
>> +	struct chain_hlocks_block *curr, **pprev;
>> +
>> +	if (!nr_free_chain_hlocks)
>> +		return NULL;
>> +
>> +	if (depth < CHAIN_HLOCKS_HASH) {
>> +		curr = chain_hlocks_block_hash[depth];
>> +		if (curr) {
>> +			chain_hlocks_block_hash[depth] = curr->next;
>> +			nr_free_chain_hlocks -= depth;
>> +		}
>> +		return curr;
>> +	}
>> +
>> +	/*
>> +	 * For depth >= CHAIN_HLOCKS_HASH, it is not really a pop operation.
>> +	 * Instead, the first entry with the right size is returned.
>> +	 */
>> +	curr  = chain_hlocks_block_hash[0];
>> +	pprev = chain_hlocks_block_hash;
>> +
>> +	while (curr) {
>> +		if (curr->depth == depth)
>> +			break;
>> +		pprev = &(curr->next);
>> +		curr = curr->next;
>> +	}
>> +
>> +	if (curr) {
>> +		*pprev = curr->next;
>> +		nr_free_chain_hlocks -= depth;
>> +	}
>> +	return curr;
>> +}
>> +#else
>> +static inline void free_chain_hlocks_block(struct chain_hlocks_block *block) { }
>> +static inline struct chain_hlocks_block *pop_chain_hlocks_block(int depth)
>> +{
>> +	return NULL;
>> +}
>> +#endif /* CONFIG_PROVE_LOCKING */
> You've made a bit of a mess of things though. Why is that push/pop crud
> exposed? Why didn't you build a self contained allocator?
>
> All you really want is:
>
> u16 *alloc_chain(int size);
> void free_chain(int size, u16 *chain);
>
>
> An implementation would be something along these lines (completely
> untested, fresh from the keyboard):
>
> struct chain_block {
> 	int size;
> 	int base;
> 	struct chain_block *next;
> };
>
> struct chain_block chain_blocks[MAX_BLOCKS];
> struct chain_block *free_blocks;
>
>
> struct chain_block init_block = {
> 	.size = MAX_LOCKDEP_CHAIN_HLOCKS,
> 	.base = 0,
> 	.next = NULL,
> };
>
> struct chain_block *free_list[MAX_BUCKETS] = {
> 	&init_block, /* 0 */
> };
>
>
> void free_block(struct chain_block *b)
> {
> 	b->next = free_blocks;
> 	free_blocks = b;
> }
>
> struct chain_block *alloc_block(void)
> {
> 	struct chain_block *block = free_blocks;
> 	free_blocks = block->next;
> 	return block;
> }
>
> struct chain_block *pop_block(struct chain_block **bp)
> {
> 	struct chain_block *b = *bp;
> 	if (!b)
> 		return NULL;
> 	*bp = b->next;
> }
>
> void push_block(struct chain_block **bp, struct chain_block *b)
> {
> 	b->next = *bp;
> 	*bp = b;
> }
>
> /* could contemplate ilog2() buckets */
> int size2bucket(int size)
> {
> 	return size >= MAX_BUCKET ? 0 : size;
> }
>
> /* bucket[0] is mixed size */
> struct chain_block *pop_block_0(struct chain_block **bp, int size)
> {
> 	struct chain_block **p = bp, *b = *bp;
> 	if (!b)
> 		return NULL;
>
> 	p = bp;
> 	while (b && b->size < size) {
> 		p = &b->next;
> 		b = b->next;
> 	}
> 	if (!b)
> 		return NULL;
>
> 	*p = b->next;
> 	return b;
> }
>
> u16 *alloc_chain(int size)
> {
> 	int i, bucket = size2bucket(size);
> 	struct chain_block *b;
> 	u16 *chain;
>
> 	if (!bucket) {
> 		b = pop_block_0(&free_list[0], size);
> 		if (b)
> 			goto got_block;
> 	} else {
> 		b = pop_block(&free_list[bucket]);
> 		if (b)
> 			goto got_block;
>
> 		/* pop a large block, hope the fragment is still useful */
> 		b = pop_block_0(&free_list[0], size);
> 		if (b)
> 			goto got_block;
>
> 		for (i = MAX_BUCKETS-1; i > bucket; i--)
> 			b = pop_block(&free_list[bucket]);
> 			if (b)
> 				goto got_block;
> 		}
> 	}
> 	return NULL;
>
> got_block:
>
> 	chain = chain_hlocks + b->base;
> 	b->base += size;
> 	b->size -= size;
>
> 	if (b->size) {
> 		/* return the fragment */
> 		bucket = size2bucket(b->size);
> 		push_block(&free_list[bucket], b);
> 	} else {
> 		free_block(b);
> 	}
>
> 	return chain;
> }
>
> void free_chain(int size, u16 *chain)
> {
> 	struct chain_block *b = alloc_block();
> 	int bucket = size2bucket(size);
>
> 	if (!b) {
> 		// leak stuff;
> 		return;
> 	}
>
> 	b->size = size;
> 	b->base = chain - chain_hlocks;
>
> 	push_bucket(&free_list[bucket], b);
> }
>
> void init_blocks(void)
> {
> 	int i;
>
> 	for (i = 0; i < MAX_BLOCKS; i++)
> 		free_block(chain_blocks[i]);
> }
>
Thanks for the suggestion. I will incorporate your idea in the next
version of the patch.

Cheers,
Longman



* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-13 10:50     ` Peter Zijlstra
@ 2019-12-13 16:02       ` Waiman Long
  2019-12-13 16:05         ` Waiman Long
  2019-12-13 18:12         ` Peter Zijlstra
  0 siblings, 2 replies; 16+ messages in thread
From: Waiman Long @ 2019-12-13 16:02 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On 12/13/19 5:50 AM, Peter Zijlstra wrote:
> On Fri, Dec 13, 2019 at 11:25:25AM +0100, Peter Zijlstra wrote:
>
>> void push_block(struct chain_block **bp, struct chain_block *b)
>> {
>> 	b->next = *bp;
>> 	*bp = b;
>> }
>>
>> /* could contemplate ilog2() buckets */
>> int size2bucket(int size)
>> {
>> 	return size >= MAX_BUCKET ? 0 : size;
>> }
> If you make the allocation granularity 4 entries, then you can have
> push_block() store the chain_block * at the start of every free entry,
> which would enable merging adjacent blocks.
>
> That is, if I'm not mistaken these u16 chain entries encode the class
> index (per lock_chain_get_class()). And since the lock_classes array is
> MAX_LOCKDEP_KEYS, or 13 bits, big, we have the MSB of each entry spare.
>
> union {
> 	struct {
> 		u16 hlock[4];
> 	}
> 	u64 addr;
> } ponies;
>
> So if we make the rule that:
>
> 	!(idx % 4) && (((union ponies *)chain_hlock[idx])->addr & BIT_ULL(63))
>
> encodes a free block at said addr, then we should be able to detect if:
>
> 	chain_hlock[b->base + b->size]
>
> is also free and merge the blocks.
>
> (we just have to fix up the MSB of the address, not all arches have
> negative addresses for kernel space)
>
That is an interesting idea. It will eliminate the need of a separate
array to track the free chain_hlocks. However, if there are n chains
available, it will waste about 3n bytes of storage, on average.

I have a slightly different idea. I will enforce a minimum allocation
size of 2. For a free block, the first 2 hlocks for each allocation
block will store a 32-bit integer (hlock[0] << 16)|hlock[1]:

Bit 31: always 1
Bits 24-30: block size
Bits 0-23: index to the next free block.

In this way, the wasted space will be k bytes where k is the number of
1-entry chains. I don't think merging adjacent blocks will be that
useful at this point. We can always add this capability later on if it
is found to be useful.
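
Concretely, packing and unpacking that marker in the first two hlocks
of a free block would look something like this (macro and helper names
are just illustrative):

#define CHAIN_BLK_FLAG		(1U << 31)
#define CHAIN_BLK_SIZE(m)	(((m) >> 24) & 0x7f)
#define CHAIN_BLK_NEXT(m)	((m) & 0xffffff)

/* Store the free-block marker in hlock[0]/hlock[1]. */
static inline void chain_block_set(u16 *hlocks, unsigned int size,
				   unsigned int next)
{
	u32 marker = CHAIN_BLK_FLAG | (size << 24) | next;

	hlocks[0] = marker >> 16;
	hlocks[1] = marker & 0xffff;
}

/* Reconstruct (hlock[0] << 16) | hlock[1]. */
static inline u32 chain_block_get(const u16 *hlocks)
{
	return ((u32)hlocks[0] << 16) | hlocks[1];
}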

Cheers,
Longman




* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-13 16:02       ` Waiman Long
@ 2019-12-13 16:05         ` Waiman Long
  2019-12-13 18:12         ` Peter Zijlstra
  1 sibling, 0 replies; 16+ messages in thread
From: Waiman Long @ 2019-12-13 16:05 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On 12/13/19 11:02 AM, Waiman Long wrote:
> That is an interesting idea. It will eliminate the need of a separate
> array to track the free chain_hlocks. However, if there are n chains
> available, it will waste about 3n bytes of storage, on average.
>
> I have a slightly different idea. I will enforce a minimum allocation
> size of 2. For a free block, the first 2 hlocks for each allocation
> block will store a 32-bit integer (hlock[0] << 16)|hlock[1]:
>
> Bit 31: always 1
> Bits 24-30: block size
> Bits 0-23: index to the next free block.
>
> In this way, the wasted space will be k bytes where k is the number of
The wasted space should be 2k bytes. My mistake.
> 1-entry chains. I don't think merging adjacent blocks will be that
> useful at this point. We can always add this capability later on if it
> is found to be useful.

Cheers,
Longman



* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-13 16:02       ` Waiman Long
  2019-12-13 16:05         ` Waiman Long
@ 2019-12-13 18:12         ` Peter Zijlstra
       [not found]           ` <7ca26a9a-003f-6f24-08e4-f01b80e3e962@redhat.com>
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2019-12-13 18:12 UTC (permalink / raw)
  To: Waiman Long; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On Fri, Dec 13, 2019 at 11:02:46AM -0500, Waiman Long wrote:

> That is an interesting idea. It will eliminate the need of a separate
> array to track the free chain_hlocks. However, if there are n chains
> available, it will waste about 3n bytes of storage, on average.
> 
> I have a slightly different idea. I will enforce a minimum allocation
> size of 2. For a free block, the first 2 hlocks for each allocation
> block will store a 32-bit integer (hlock[0] << 16)|hlock[1]:
> 
> Bit 31: always 1
> Bits 24-30: block size
> Bits 0-23: index to the next free block.

If you look closely at the proposed allocator, my blocks can be much
larger than 7 bits. In fact, it starts with a single block of
MAX_LOCKDEP_CHAIN_HLOCKS entries.

That said, I don't think you need to encode the size at all. All we need
to do is encode the chain_blocks[] index (and stick init_block in that
array). That should maybe even fit in a single u16.

Also, if we store that in the first and last 'word' of the free range,
we can detect both before and after freespace.

> In this way, the wasted space will be k bytes where k is the number of
> 1-entry chains. I don't think merging adjacent blocks will be that
> useful at this point. We can always add this capability later on if it
> is found to be useful.

I'm thinking 1 entry isn't much of a chain. My brain is completely fried
atm, but are we really storing single entry 'chains' ? It seems to me we
could skip that.


* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-13 10:25   ` Peter Zijlstra
  2019-12-13 10:50     ` Peter Zijlstra
  2019-12-13 15:02     ` Waiman Long
@ 2019-12-13 18:43     ` Peter Zijlstra
  2 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2019-12-13 18:43 UTC (permalink / raw)
  To: Waiman Long; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche


modified version that assumes MAX_BLOCKS fits in 15 bits; if it is
more, you need the 2-entry version of it, but I suppose you get the
idea.

On Fri, Dec 13, 2019 at 11:25:25AM +0100, Peter Zijlstra wrote:

> An implementation would be something along these lines (completely
> untested, fresh from the keyboard):
> 
> struct chain_block {
> 	int size;
> 	int base;
> 	struct chain_block *next;
> };
> 
> struct chain_block chain_blocks[MAX_BLOCKS];
> struct chain_block *free_blocks;

- struct chain_block init_block = {
- 	.size = MAX_LOCKDEP_CHAIN_HLOCKS,
- 	.base = 0,
- 	.next = NULL,
- };

- struct chain_block *free_list[MAX_BUCKETS] = {
- 	&init_block, /* 0 */
- };

+ struct chain_block *free_list;

> void free_block(struct chain_block *b)
> {
> 	b->next = free_blocks;
> 	free_blocks = b;
> }
> 
> struct chain_block *alloc_block(void)
> {
> 	struct chain_block *block = free_blocks;
> 	free_blocks = block->next;
> 	return block;
> }
> 
> struct chain_block *pop_block(struct chain_block **bp)
> {
> 	struct chain_block *b = *bp;
> 	if (!b)
> 		return NULL;
> 	*bp = b->next;
> }
> 
> void push_block(struct chain_block **bp, struct chain_block *b)
> {
	u16 idx = (b - chain_blocks) | BIT(15);

+	chain_hlock[b->base] = idx;
+	chain_block[b->base + b->size - 1] = idx;

> 	b->next = *bp;
> 	*bp = b;
> }
> 
> /* could contemplate ilog2() buckets */
> int size2bucket(int size)
> {
> 	return size >= MAX_BUCKET ? 0 : size;
> }
> 
> /* bucket[0] is mixed size */
> struct chain_block *pop_block_0(struct chain_block **bp, int size)
> {
> 	struct chain_block **p = bp, *b = *bp;
> 	if (!b)
> 		return NULL;
> 
> 	p = bp;
> 	while (b && b->size < size) {
> 		p = &b->next;
> 		b = b->next;
> 	}
> 	if (!b)
> 		return NULL;
> 
> 	*p = b->next;
> 	return b;
> }
> 
> u16 *alloc_chain(int size)
> {
> 	int i, bucket = size2bucket(size);
> 	struct chain_block *b;
> 	u16 *chain;
> 
> 	if (!bucket) {
> 		b = pop_block_0(&free_list[0], size);
> 		if (b)
> 			goto got_block;
> 	} else {
> 		b = pop_block(&free_list[bucket]);
> 		if (b)
> 			goto got_block;
> 
> 		/* pop a large block, hope the fragment is still useful */
> 		b = pop_block_0(&free_list[0], size);
> 		if (b)
> 			goto got_block;
> 
> 		for (i = MAX_BUCKETS-1; i > bucket; i--)
> 			b = pop_block(&free_list[bucket]);
> 			if (b)
> 				goto got_block;
> 		}
> 	}
> 	return NULL;
> 
> got_block:
> 
> 	chain = chain_hlocks + b->base;
> 	b->base += size;
> 	b->size -= size;
> 
> 	if (b->size) {
> 		/* return the fragment */
> 		bucket = size2bucket(b->size);
> 		push_block(&free_list[bucket], b);
> 	} else {
> 		free_block(b);
> 	}
> 
> 	return chain;
> }
> 
> void free_chain(int size, u16 *chain)
> {
- 	struct chain_block *b = alloc_block();
- 	int bucket = size2bucket(size);

+	struct chain_block *b = NULL, *p = NULL, *n = NULL;
+	int base = chain - chain_hlocks;
+	u16 p_idx, n_idx;
+	int bucket;

	p_idx = chain_hlocks[base - 1];
	n_idx = chain_hlocks[base + size];

	if (p_idx & BIT(15)) {
		p = chain_blocks[p_idx & ~BIT(15)];
		bucket = size2bucket(p->size);

		// find and remove @p from @bucket

		p->size += size; // merge with prev
		b = p;
	}

	if (n_idx & BIT(15)) {
		n = chain_blocks[n_idx & ~BIT(15)];
		bucket = size2bucket(n->size);

		// find and remove @n from @bucket

		if (p) {
			p->size += n->size;
			free_block(n);
			n = NULL;
		} else {
			n->base -= size;
			b = n;
		}
	}
		
	if (!b) {
		b = alloc_bucket();
		if (!b) {
			// leak stuff
			return;
		}

		b->size = size;
		b->base = base;
	}
	
- 	if (!b) {
- 		// leak stuff;
- 		return;
- 	}
- 
- 	b->size = size;
- 	b->base = chain - chain_hlocks;

	bucket = size2bucket(b->size);

> 	push_bucket(&free_list[bucket], b);
> }
> 
> void init_blocks(void)
> {
> 	int i;

	chain_blocks[0].size = MAX_LOCKDEP_CHAIN_HLOCKS;
	chain_blocks[0].base = 0;
	chain_blocks[0].next = NULL;

	free_list[0] = &chain_blocks[0];

> 	for (i = 1; i < MAX_BLOCKS; i++)
> 		free_block(chain_blocks[i]);
> }


* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
       [not found]           ` <7ca26a9a-003f-6f24-08e4-f01b80e3e962@redhat.com>
@ 2019-12-13 18:47             ` Peter Zijlstra
  2019-12-13 20:08               ` Waiman Long
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Zijlstra @ 2019-12-13 18:47 UTC (permalink / raw)
  To: Waiman Long; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On Fri, Dec 13, 2019 at 01:35:05PM -0500, Waiman Long wrote:
> On 12/13/19 1:12 PM, Peter Zijlstra wrote:
> >> In this way, the wasted space will be k bytes where k is the number of
> >> 1-entry chains. I don't think merging adjacent blocks will be that
> >> useful at this point. We can always add this capability later on if it
> >> is found to be useful.
> > I'm thinking 1 entry isn't much of a chain. My brain is completely fried
> > atm, but are we really storing single entry 'chains' ? It seems to me we
> > could skip that.
> >
> Indeed, the current code can produce a 1-entry chain. I also thought
> that a chain had to be at least 2 entries. I got tripped up assuming
> that. It could be a bug somewhere that allow a 1-entry chain to happen,
> but I am not focusing on that right now.

If we need the minimum 2 entry granularity, it might make sense to spend
a little time on that. If we can get away with single entry markers,
then maybe write a comment so we'll not forget about it.


* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-13 18:47             ` Peter Zijlstra
@ 2019-12-13 20:08               ` Waiman Long
  2019-12-15 17:06                 ` Waiman Long
  0 siblings, 1 reply; 16+ messages in thread
From: Waiman Long @ 2019-12-13 20:08 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On 12/13/19 1:47 PM, Peter Zijlstra wrote:
> On Fri, Dec 13, 2019 at 01:35:05PM -0500, Waiman Long wrote:
>> On 12/13/19 1:12 PM, Peter Zijlstra wrote:
>>>> In this way, the wasted space will be k bytes where k is the number of
>>>> 1-entry chains. I don't think merging adjacent blocks will be that
>>>> useful at this point. We can always add this capability later on if it
>>>> is found to be useful.
>>> I'm thinking 1 entry isn't much of a chain. My brain is completely fried
>>> atm, but are we really storing single entry 'chains' ? It seems to me we
>>> could skip that.
>>>
>> Indeed, the current code can produce a 1-entry chain. I also thought
>> that a chain had to be at least 2 entries. I got tripped up assuming
>> that. It could be a bug somewhere that allow a 1-entry chain to happen,
>> but I am not focusing on that right now.
> If we need the minimum 2 entry granularity, it might make sense to spend
> a little time on that. If we can get away with single entry markers,
> then maybe write a comment so we'll not forget about it.
>
I will take a look at why a 1-entry chain happens and see if it is a
bug that needs to be fixed.

Cheers,
Longman



* Re: [PATCH 4/5] locking/lockdep: Reuse free chain_hlocks entries
  2019-12-13 20:08               ` Waiman Long
@ 2019-12-15 17:06                 ` Waiman Long
  0 siblings, 0 replies; 16+ messages in thread
From: Waiman Long @ 2019-12-15 17:06 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche

On 12/13/19 3:08 PM, Waiman Long wrote:
> On 12/13/19 1:47 PM, Peter Zijlstra wrote:
>> On Fri, Dec 13, 2019 at 01:35:05PM -0500, Waiman Long wrote:
>>> On 12/13/19 1:12 PM, Peter Zijlstra wrote:
>>>>> In this way, the wasted space will be k bytes where k is the number of
>>>>> 1-entry chains. I don't think merging adjacent blocks will be that
>>>>> useful at this point. We can always add this capability later on if it
>>>>> is found to be useful.
>>>> I'm thinking 1 entry isn't much of a chain. My brain is completely fried
>>>> atm, but are we really storing single entry 'chains' ? It seems to me we
>>>> could skip that.
>>>>
>>> Indeed, the current code can produce a 1-entry chain. I also thought
>>> that a chain had to be at least 2 entries. I got tripped up assuming
>>> that. It could be a bug somewhere that allow a 1-entry chain to happen,
>>> but I am not focusing on that right now.
>> If we need the minimum 2 entry granularity, it might make sense to spend
>> a little time on that. If we can get away with single entry markers,
>> then maybe write a comment so we'll not forget about it.
>>
> I will take a look at why a 1-entry chain happens and see if it is a
> bug that needs to be fixed.

New lock chains are stored as part of the validate_chain() call from
__lock_acquire(). So for an n-entry lock chain, all of its prefix
chains of length n-1, n-2, ..., 1 are stored as well. That may not be
the most
efficient way to store the information, but it is simple. When booting
up a 2-socket x86-64 system, I saw about 800 1-entry lock chains being
stored.

Since I am planning to enforce a minimum allocation size of 2
chain_hlocks entries, we can theoretically allow a 1-entry chain to
share the same storage with a 2-entry chain that has the same starting
lock. That will add a bit more code to the allocation and freeing
paths. I am not planning to
do that for this patchset, but may consider it as a follow up patch.

Cheers,
Longman


