* [RFC][PATCH] lockdep: Introduce wait-type checks
@ 2014-01-09 11:15 Peter Zijlstra
  2014-01-09 11:49 ` Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-09 11:15 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, Steven Rostedt, Oleg Nesterov,
	Paul McKenney, Linus Torvalds

Subject: lockdep: Introduce wait-type checks
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue, 19 Nov 2013 21:45:48 +0100

This patch extends lockdep to validate lock wait-type context.

The current wait-types are:

	LD_WAIT_FREE,		/* wait free, rcu etc.. */
	LD_WAIT_SPIN,		/* spin loops, raw_spinlock_t etc.. */
	LD_WAIT_CONFIG,		/* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
	LD_WAIT_SLEEP,		/* sleeping locks, mutex_t etc.. */

Where lockdep validates that the current lock (the one being acquired)
fits in the current wait-context (as generated by the held stack).

This ensures that we do not try to acquire mutexes while holding
spinlocks, do not attempt to acquire spinlocks while holding
raw_spinlocks, and so on. In other words, it's a fancier
might_sleep().

Obviously RCU made the entire ordeal more complex than a simple
single-value test, because we can acquire RCU in (pretty much) any
context, and while it presents a context to nested locks, that context
is not the same as the one it got acquired in.

Therefore we needed to split the wait_type into two values, one
representing the acquire (outer) and one representing the nested
context (inner). For most 'normal' locks these two are the same.

[ To make static initialization easier we have the rule that:
  .outer == INV means .outer == .inner; because INV == 0. ]
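
[ In code the effective outer type then reads like:

	short outer = class->wait_type_outer ?: class->wait_type_inner;

  which is exactly what the report printing below does. ]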

It further means that we need to find the minimal .inner of the held
stack to compare against the .outer of the new lock; because while
'normal' RCU presents a CONFIG type to nested locks, if it is taken
while already holding a SPIN type it obviously doesn't relax the
rules.
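
[ Roughly, the check boils down to:

	curr_inner = min(.inner of all held locks that have one);
	if (next.outer > curr_inner)
		complain();

  which is what check_context() below implements. ]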

Below is an example output, generated by the trivial example:

	raw_spin_lock(&foo);
	spin_lock(&bar);
	spin_unlock(&bar);
	raw_spin_unlock(&foo);

The way to read it is to look at the new -{n:m} part in the lock
description; -{3:3} for our attempted lock, and try to match that up
to the held locks, which in this case is the one: -{2:2}.

This tells us that the acquiring lock requires a more relaxed
environment than the one presented by the lock stack.
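
[ With the enum values introduced below this reads as -{outer:inner};
  for the example above:

	raw_spin_lock(&foo);	/* foo: -{2:2}, LD_WAIT_SPIN   */
	spin_lock(&bar);	/* bar: -{3:3}, LD_WAIT_CONFIG  */

  and since 3 (CONFIG) > 2 (SPIN) the nested acquire is flagged. ]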

Currently only the normal locks and RCU are converted; the rest of the
lockdep users default to .inner = INV, which is ignored. More
conversions can be done when desired.
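
[ Converting another lockdep user mostly amounts to passing the wait
  type at init time, e.g. (sketch):

	lockdep_init_map_wait(&lock->dep_map, name, key, 0, LD_WAIT_SPIN);

  or to adding .wait_type_inner to its static dep_map initializer, as
  done for mutexes, rwsems, rwlocks and spinlocks below. ]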

 [ ] =============================
 [ ] [ BUG: Invalid wait context ]
 [ ] 3.13.0-rc7-01825-g4443577b1c38-dirty #718 Not tainted
 [ ] -----------------------------
 [ ] swapper/0/1 is trying to lock:
 [ ]  (bar){......}-{3:3}, at: [<ffffffff81d0acea>] sched_init_smp+0x423/0x45e
 [ ]
 [ ] stack backtrace:
 [ ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc7-01825-g4443577b1c38-dirty #718
 [ ] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010
 [ ]  00000000000000a2 ffff880236847cf8 ffffffff8165d96a 0000000000000001
 [ ]  ffff880236847df0 ffffffff810df09f ffffffff81c3e698 ffffffff820b0290
 [ ]  ffff880200000000 ffffea0000000000 ffff880400000000 0000000000004140
 [ ] Call Trace:
 [ ]  [<ffffffff8165d96a>] dump_stack+0x4e/0x7a
 [ ]  [<ffffffff810df09f>] __lock_acquire+0x44f/0x2100
 [ ]  [<ffffffff810c0c8a>] ? task_rq_lock+0x5a/0xa0
 [ ]  [<ffffffff816657ad>] ? _raw_spin_unlock_irqrestore+0x6d/0x80
 [ ]  [<ffffffff810e1317>] lock_acquire+0x87/0x120
 [ ]  [<ffffffff81d0acea>] ? sched_init_smp+0x423/0x45e
 [ ]  [<ffffffff81664e5b>] _raw_spin_lock+0x3b/0x50
 [ ]  [<ffffffff81d0acea>] ? sched_init_smp+0x423/0x45e
 [ ]  [<ffffffff81d0acea>] sched_init_smp+0x423/0x45e
 [ ]  [<ffffffff81cedf03>] kernel_init_freeable+0x91/0x197
 [ ]  [<ffffffff8164f600>] ? rest_init+0xd0/0xd0
 [ ]  [<ffffffff8164f60e>] kernel_init+0xe/0x130
 [ ]  [<ffffffff8166d36c>] ret_from_fork+0x7c/0xb0
 [ ]  [<ffffffff8164f600>] ? rest_init+0xd0/0xd0
 [ ]
 [ ] other info that might help us debug this:
 [ ] 1 lock held by swapper/0/1:
 [ ]  #0:  (foo){+.+...}-{2:2}, at: [<ffffffff81d0acde>] sched_init_smp+0x417/0x45e
 [ ]
 [ ] stack backtrace:
 [ ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc7-01825-g4443577b1c38-dirty #718
 [ ] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010
 [ ]  00000000000000a2 ffff880236847cf8 ffffffff8165d96a 0000000000000001
 [ ]  ffff880236847df0 ffffffff810df0cf ffffffff81c3e698 ffffffff820b0290
 [ ]  ffff880200000000 ffffea0000000000 ffff880400000000 0000000000004140
 [ ] Call Trace:
 [ ]  [<ffffffff8165d96a>] dump_stack+0x4e/0x7a
 [ ]  [<ffffffff810df0cf>] __lock_acquire+0x47f/0x2100
 [ ]  [<ffffffff810c0c8a>] ? task_rq_lock+0x5a/0xa0
 [ ]  [<ffffffff816657ad>] ? _raw_spin_unlock_irqrestore+0x6d/0x80
 [ ]  [<ffffffff810e1317>] lock_acquire+0x87/0x120
 [ ]  [<ffffffff81d0acea>] ? sched_init_smp+0x423/0x45e
 [ ]  [<ffffffff81664e5b>] _raw_spin_lock+0x3b/0x50
 [ ]  [<ffffffff81d0acea>] ? sched_init_smp+0x423/0x45e
 [ ]  [<ffffffff81d0acea>] sched_init_smp+0x423/0x45e
 [ ]  [<ffffffff81cedf03>] kernel_init_freeable+0x91/0x197
 [ ]  [<ffffffff8164f600>] ? rest_init+0xd0/0xd0
 [ ]  [<ffffffff8164f60e>] kernel_init+0xe/0x130
 [ ]  [<ffffffff8166d36c>] ret_from_fork+0x7c/0xb0
 [ ]  [<ffffffff8164f600>] ? rest_init+0xd0/0xd0

Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
---
 include/linux/lockdep.h         |   27 +++++++++-
 include/linux/mutex.h           |    7 +-
 include/linux/rwlock_types.h    |    6 +-
 include/linux/rwsem.h           |    6 +-
 include/linux/spinlock.h        |   36 ++++++++++---
 include/linux/spinlock_types.h  |   24 +++++++--
 kernel/locking/lockdep.c        |  103 +++++++++++++++++++++++++++++++++++++---
 kernel/locking/mutex-debug.c    |    2 
 kernel/locking/rwsem-spinlock.c |    2 
 kernel/locking/rwsem-xadd.c     |    2 
 kernel/locking/spinlock_debug.c |    6 +-
 kernel/rcu/update.c             |   24 ++++++---
 12 files changed, 207 insertions(+), 38 deletions(-)

--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -104,6 +104,9 @@ struct lock_class {
 	const char			*name;
 	int				name_version;
 
+	short				wait_type_inner;
+	short				wait_type_outer;
+
 #ifdef CONFIG_LOCK_STAT
 	unsigned long			contention_point[LOCKSTAT_POINTS];
 	unsigned long			contending_point[LOCKSTAT_POINTS];
@@ -143,6 +146,17 @@ struct lock_class_stats lock_stats(struc
 void clear_lock_stats(struct lock_class *class);
 #endif
 
+enum lockdep_wait_type {
+	LD_WAIT_INV = 0,	/* not checked, catch all */
+
+	LD_WAIT_FREE,		/* wait free, rcu etc.. */
+	LD_WAIT_SPIN,		/* spin loops, raw_spinlock_t etc.. */
+	LD_WAIT_CONFIG,		/* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
+	LD_WAIT_SLEEP,		/* sleeping locks, mutex_t etc.. */
+
+	LD_WAIT_MAX,		/* must be last */
+};
+
 /*
  * Map the lock object (the lock instance) to the lock-class object.
  * This is embedded into specific lock instances:
@@ -151,6 +165,8 @@ struct lockdep_map {
 	struct lock_class_key		*key;
 	struct lock_class		*class_cache[NR_LOCKDEP_CACHING_CLASSES];
 	const char			*name;
+	short				wait_type_outer; /* can be taken in this context */
+	short				wait_type_inner; /* presents this context */
 #ifdef CONFIG_LOCK_STAT
 	int				cpu;
 	unsigned long			ip;
@@ -276,8 +292,14 @@ extern void lockdep_on(void);
  * to lockdep:
  */
 
-extern void lockdep_init_map(struct lockdep_map *lock, const char *name,
-			     struct lock_class_key *key, int subclass);
+extern void lockdep_init_map_wait(struct lockdep_map *lock, const char *name,
+		struct lock_class_key *key, int subclass, short inner);
+
+static inline void lockdep_init_map(struct lockdep_map *lock, const char *name,
+			     struct lock_class_key *key, int subclass)
+{
+	lockdep_init_map_wait(lock, name, key, subclass, LD_WAIT_INV);
+}
 
 /*
  * To initialize a lockdep_map statically use this macro.
@@ -304,6 +326,7 @@ extern void lockdep_init_map(struct lock
 
 #define lockdep_set_novalidate_class(lock) \
 	lockdep_set_class(lock, &__lockdep_no_validate__)
+
 /*
  * Compare locking classes
  */
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -100,8 +100,11 @@ static inline void mutex_destroy(struct
 #endif
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define __DEP_MAP_MUTEX_INITIALIZER(lockname) \
-		, .dep_map = { .name = #lockname }
+# define __DEP_MAP_MUTEX_INITIALIZER(lockname)			\
+		, .dep_map = {					\
+			.name = #lockname,			\
+			.wait_type_inner = LD_WAIT_SLEEP,	\
+		}
 #else
 # define __DEP_MAP_MUTEX_INITIALIZER(lockname)
 #endif
--- a/include/linux/rwlock_types.h
+++ b/include/linux/rwlock_types.h
@@ -25,7 +25,11 @@ typedef struct {
 #define RWLOCK_MAGIC		0xdeaf1eed
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define RW_DEP_MAP_INIT(lockname)	.dep_map = { .name = #lockname }
+# define RW_DEP_MAP_INIT(lockname)					\
+	.dep_map = {							\
+		.name = #lockname,					\
+		.wait_type_inner = LD_WAIT_CONFIG,			\
+	}
 #else
 # define RW_DEP_MAP_INIT(lockname)
 #endif
--- a/include/linux/rwsem.h
+++ b/include/linux/rwsem.h
@@ -50,7 +50,11 @@ static inline int rwsem_is_locked(struct
 /* Common initializer macros and functions */
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define __RWSEM_DEP_MAP_INIT(lockname) , .dep_map = { .name = #lockname }
+# define __RWSEM_DEP_MAP_INIT(lockname)			\
+	, .dep_map = {					\
+		.name = #lockname,			\
+		.wait_type_inner = LD_WAIT_SLEEP,	\
+	}
 #else
 # define __RWSEM_DEP_MAP_INIT(lockname)
 #endif
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -91,12 +91,13 @@
 
 #ifdef CONFIG_DEBUG_SPINLOCK
   extern void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
-				   struct lock_class_key *key);
-# define raw_spin_lock_init(lock)				\
-do {								\
-	static struct lock_class_key __key;			\
-								\
-	__raw_spin_lock_init((lock), #lock, &__key);		\
+				   struct lock_class_key *key, short inner);
+
+# define raw_spin_lock_init(lock)					\
+do {									\
+	static struct lock_class_key __key;				\
+									\
+	__raw_spin_lock_init((lock), #lock, &__key, LD_WAIT_SPIN);	\
 } while (0)
 
 #else
@@ -292,12 +293,27 @@ static inline raw_spinlock_t *spinlock_c
 	return &lock->rlock;
 }
 
-#define spin_lock_init(_lock)				\
-do {							\
-	spinlock_check(_lock);				\
-	raw_spin_lock_init(&(_lock)->rlock);		\
+#ifdef CONFIG_DEBUG_SPINLOCK
+
+# define spin_lock_init(lock)					\
+do {								\
+	static struct lock_class_key __key;			\
+								\
+	__raw_spin_lock_init(spinlock_check(lock),		\
+			     #lock, &__key, LD_WAIT_CONFIG);	\
+} while (0)
+
+#else
+
+# define spin_lock_init(_lock)			\
+do {						\
+	spinlock_check(_lock);			\
+	*(_lock) = __SPIN_LOCK_UNLOCKED(_lock);	\
 } while (0)
 
+#endif
+
+
 static inline void spin_lock(spinlock_t *lock)
 {
 	raw_spin_lock(&lock->rlock);
--- a/include/linux/spinlock_types.h
+++ b/include/linux/spinlock_types.h
@@ -36,8 +36,18 @@ typedef struct raw_spinlock {
 #define SPINLOCK_OWNER_INIT	((void *)-1L)
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define SPIN_DEP_MAP_INIT(lockname)	.dep_map = { .name = #lockname }
+# define RAW_SPIN_DEP_MAP_INIT(lockname)		\
+	.dep_map = {					\
+		.name = #lockname,			\
+		.wait_type_inner = LD_WAIT_SPIN,	\
+	}
+# define SPIN_DEP_MAP_INIT(lockname)			\
+	.dep_map = {					\
+		.name = #lockname,			\
+		.wait_type_inner = LD_WAIT_CONFIG,	\
+	}
 #else
+# define RAW_SPIN_DEP_MAP_INIT(lockname)
 # define SPIN_DEP_MAP_INIT(lockname)
 #endif
 
@@ -54,7 +64,7 @@ typedef struct raw_spinlock {
 	{					\
 	.raw_lock = __ARCH_SPIN_LOCK_UNLOCKED,	\
 	SPIN_DEBUG_INIT(lockname)		\
-	SPIN_DEP_MAP_INIT(lockname) }
+	RAW_SPIN_DEP_MAP_INIT(lockname) }
 
 #define __RAW_SPIN_LOCK_UNLOCKED(lockname)	\
 	(raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname)
@@ -75,11 +85,17 @@ typedef struct spinlock {
 	};
 } spinlock_t;
 
+#define ___SPIN_LOCK_INITIALIZER(lockname)	\
+	{					\
+	.raw_lock = __ARCH_SPIN_LOCK_UNLOCKED,	\
+	SPIN_DEBUG_INIT(lockname)		\
+	SPIN_DEP_MAP_INIT(lockname) }
+
 #define __SPIN_LOCK_INITIALIZER(lockname) \
-	{ { .rlock = __RAW_SPIN_LOCK_INITIALIZER(lockname) } }
+	{ { .rlock = ___SPIN_LOCK_INITIALIZER(lockname) } }
 
 #define __SPIN_LOCK_UNLOCKED(lockname) \
-	(spinlock_t ) __SPIN_LOCK_INITIALIZER(lockname)
+	(spinlock_t) __SPIN_LOCK_INITIALIZER(lockname)
 
 #define DEFINE_SPINLOCK(x)	spinlock_t x = __SPIN_LOCK_UNLOCKED(x)
 
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -532,7 +532,9 @@ static void print_lock_name(struct lock_
 
 	printk(" (");
 	__print_lock_name(class);
-	printk("){%s}", usage);
+	printk("){%s}-{%hd:%hd}", usage,
+			class->wait_type_outer ?: class->wait_type_inner,
+			class->wait_type_inner);
 }
 
 static void print_lockdep_cache(struct lockdep_map *lock)
@@ -757,9 +759,11 @@ register_lock_class(struct lockdep_map *
 	 * We have to do the hash-walk again, to avoid races
 	 * with another CPU:
 	 */
-	list_for_each_entry(class, hash_head, hash_entry)
+	list_for_each_entry(class, hash_head, hash_entry) {
 		if (class->key == key)
 			goto out_unlock_set;
+	}
+
 	/*
 	 * Allocate a new key from the static array, and add it to
 	 * the hash:
@@ -784,6 +788,8 @@ register_lock_class(struct lockdep_map *
 	INIT_LIST_HEAD(&class->locks_before);
 	INIT_LIST_HEAD(&class->locks_after);
 	class->name_version = count_matching_names(class);
+	class->wait_type_inner = lock->wait_type_inner;
+	class->wait_type_outer = lock->wait_type_outer;
 	/*
 	 * We use RCU's safe list-add method to make
 	 * parallel walking of the hash-list safe:
@@ -2949,8 +2955,8 @@ static int mark_lock(struct task_struct
 /*
  * Initialize a lock instance's lock-class mapping info:
  */
-void lockdep_init_map(struct lockdep_map *lock, const char *name,
-		      struct lock_class_key *key, int subclass)
+void lockdep_init_map_wait(struct lockdep_map *lock, const char *name,
+		struct lock_class_key *key, int subclass, short inner)
 {
 	int i;
 
@@ -2973,6 +2979,9 @@ void lockdep_init_map(struct lockdep_map
 
 	lock->name = name;
 
+	lock->wait_type_outer = LD_WAIT_INV; /* INV outer matches inner. */
+	lock->wait_type_inner = inner;
+
 	/*
 	 * No key, no joy, we need to hash something.
 	 */
@@ -2997,7 +3006,7 @@ void lockdep_init_map(struct lockdep_map
 	if (subclass)
 		register_lock_class(lock, subclass, 1);
 }
-EXPORT_SYMBOL_GPL(lockdep_init_map);
+EXPORT_SYMBOL_GPL(lockdep_init_map_wait);
 
 struct lock_class_key __lockdep_no_validate__;
 EXPORT_SYMBOL_GPL(__lockdep_no_validate__);
@@ -3036,6 +3045,85 @@ print_lock_nested_lock_not_held(struct t
 	return 0;
 }
 
+static int
+print_lock_invalid_wait_context(struct task_struct *curr,
+				struct held_lock *hlock)
+{
+	if (!debug_locks_off())
+		return 0;
+	if (debug_locks_silent)
+		return 0;
+
+	printk("\n");
+	printk("=============================\n");
+	printk("[ BUG: Invalid wait context ]\n");
+	print_kernel_ident();
+	printk("-----------------------------\n");
+
+	printk("%s/%d is trying to lock:\n", curr->comm, task_pid_nr(curr));
+	print_lock(hlock);
+
+	/* XXX */
+
+	printk("\nstack backtrace:\n");
+	dump_stack();
+
+	printk("\nother info that might help us debug this:\n");
+	lockdep_print_held_locks(curr);
+
+	printk("\nstack backtrace:\n");
+	dump_stack();
+
+	return 0;
+}
+
+/*
+ * Verify the wait_type context.
+ *
+ * This check validates we take locks in the right wait-type order; that is it
+ * ensures that we do not take mutexes inside spinlocks and do not attempt to
+ * acquire spinlocks inside raw_spinlocks and the like.
+ *
+ * The entire thing is slightly more complex because of RCU; RCU is a lock that
+ * can be taken from (pretty much) any context but also has constraints.
+ * However, when taken in a stricter environment the RCU lock does not loosen
+ * the constraints.
+ *
+ * Therefore we must look for the strictest environment in the lock stack and
+ * compare that to the lock we're trying to acquire.
+ */
+static int check_context(struct task_struct *curr, struct held_lock *next)
+{
+	short next_inner = hlock_class(next)->wait_type_inner;
+	short next_outer = hlock_class(next)->wait_type_outer;
+	short curr_inner = LD_WAIT_MAX;
+	int depth;
+
+	if (!curr->lockdep_depth || !next_inner)
+		return 0;
+
+	if (!next_outer)
+		next_outer = next_inner;
+
+	for (depth = 0; depth < curr->lockdep_depth; depth++) {
+		struct held_lock *prev = curr->held_locks + depth;
+		short prev_inner = hlock_class(prev)->wait_type_inner;
+
+		if (prev_inner) {
+			/*
+			 * we can have a bigger inner than a previous one
+			 * when outer is smaller than inner, as with RCU.
+			 */
+			curr_inner = min(curr_inner, prev_inner);
+		}
+	}
+
+	if (next_outer > curr_inner)
+		return print_lock_invalid_wait_context(curr, next);
+
+	return 0;
+}
+
 static int __lock_is_held(struct lockdep_map *lock);
 
 /*
@@ -3105,7 +3193,7 @@ static int __lock_acquire(struct lockdep
 
 	class_idx = class - lock_classes + 1;
 
-	if (depth) {
+	if (depth) { /* we're holding locks */
 		hlock = curr->held_locks + depth - 1;
 		if (hlock->class_idx == class_idx && nest_lock) {
 			if (hlock->references)
@@ -3138,6 +3226,9 @@ static int __lock_acquire(struct lockdep
 	hlock->holdtime_stamp = lockstat_clock();
 #endif
 
+	if (check_context(curr, hlock))
+		return 0;
+
 	if (check == 2 && !mark_irqflags(curr, hlock))
 		return 0;
 
--- a/kernel/locking/mutex-debug.c
+++ b/kernel/locking/mutex-debug.c
@@ -93,7 +93,7 @@ void debug_mutex_init(struct mutex *lock
 	 * Make sure we are not reinitializing a held lock:
 	 */
 	debug_check_no_locks_freed((void *)lock, sizeof(*lock));
-	lockdep_init_map(&lock->dep_map, name, key, 0);
+	lockdep_init_map_wait(&lock->dep_map, name, key, 0, LD_WAIT_SLEEP);
 #endif
 	lock->magic = lock;
 }
--- a/kernel/locking/rwsem-spinlock.c
+++ b/kernel/locking/rwsem-spinlock.c
@@ -44,7 +44,7 @@ void __init_rwsem(struct rw_semaphore *s
 	 * Make sure we are not reinitializing a held semaphore:
 	 */
 	debug_check_no_locks_freed((void *)sem, sizeof(*sem));
-	lockdep_init_map(&sem->dep_map, name, key, 0);
+	lockdep_init_map_wait(&sem->dep_map, name, key, 0, LD_WAIT_SLEEP);
 #endif
 	sem->activity = 0;
 	raw_spin_lock_init(&sem->wait_lock);
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -22,7 +22,7 @@ void __init_rwsem(struct rw_semaphore *s
 	 * Make sure we are not reinitializing a held semaphore:
 	 */
 	debug_check_no_locks_freed((void *)sem, sizeof(*sem));
-	lockdep_init_map(&sem->dep_map, name, key, 0);
+	lockdep_init_map_wait(&sem->dep_map, name, key, 0, LD_WAIT_SLEEP);
 #endif
 	sem->count = RWSEM_UNLOCKED_VALUE;
 	raw_spin_lock_init(&sem->wait_lock);
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -14,14 +14,14 @@
 #include <linux/export.h>
 
 void __raw_spin_lock_init(raw_spinlock_t *lock, const char *name,
-			  struct lock_class_key *key)
+			  struct lock_class_key *key, short inner)
 {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	/*
 	 * Make sure we are not reinitializing a held lock:
 	 */
 	debug_check_no_locks_freed((void *)lock, sizeof(*lock));
-	lockdep_init_map(&lock->dep_map, name, key, 0);
+	lockdep_init_map_wait(&lock->dep_map, name, key, 0, inner);
 #endif
 	lock->raw_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 	lock->magic = SPINLOCK_MAGIC;
@@ -39,7 +39,7 @@ void __rwlock_init(rwlock_t *lock, const
 	 * Make sure we are not reinitializing a held lock:
 	 */
 	debug_check_no_locks_freed((void *)lock, sizeof(*lock));
-	lockdep_init_map(&lock->dep_map, name, key, 0);
+	lockdep_init_map_wait(&lock->dep_map, name, key, 0, LD_WAIT_CONFIG);
 #endif
 	lock->raw_lock = (arch_rwlock_t) __ARCH_RW_LOCK_UNLOCKED;
 	lock->magic = RWLOCK_MAGIC;
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -114,18 +114,30 @@ EXPORT_SYMBOL_GPL(__rcu_read_unlock);
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 static struct lock_class_key rcu_lock_key;
-struct lockdep_map rcu_lock_map =
-	STATIC_LOCKDEP_MAP_INIT("rcu_read_lock", &rcu_lock_key);
+struct lockdep_map rcu_lock_map = {
+	.name = "rcu_read_lock",
+	.key = &rcu_lock_key,
+	.wait_type_outer = LD_WAIT_FREE,
+	.wait_type_inner = LD_WAIT_CONFIG, /* XXX PREEMPT_RCU ? */
+};
 EXPORT_SYMBOL_GPL(rcu_lock_map);
 
 static struct lock_class_key rcu_bh_lock_key;
-struct lockdep_map rcu_bh_lock_map =
-	STATIC_LOCKDEP_MAP_INIT("rcu_read_lock_bh", &rcu_bh_lock_key);
+struct lockdep_map rcu_bh_lock_map = {
+	.name = "rcu_read_lock_bh",
+	.key = &rcu_bh_lock_key,
+	.wait_type_outer = LD_WAIT_FREE,
+	.wait_type_inner = LD_WAIT_CONFIG, /* PREEMPT_LOCK also makes BH preemptible */
+};
 EXPORT_SYMBOL_GPL(rcu_bh_lock_map);
 
 static struct lock_class_key rcu_sched_lock_key;
-struct lockdep_map rcu_sched_lock_map =
-	STATIC_LOCKDEP_MAP_INIT("rcu_read_lock_sched", &rcu_sched_lock_key);
+struct lockdep_map rcu_sched_lock_map = {
+	.name = "rcu_read_lock_sched",
+	.key = &rcu_sched_lock_key,
+	.wait_type_outer = LD_WAIT_FREE,
+	.wait_type_inner = LD_WAIT_SPIN,
+};
 EXPORT_SYMBOL_GPL(rcu_sched_lock_map);
 
 static struct lock_class_key rcu_callback_key;


* Re: [RFC][PATCH] lockdep: Introduce wait-type checks
  2014-01-09 11:15 [RFC][PATCH] lockdep: Introduce wait-type checks Peter Zijlstra
@ 2014-01-09 11:49 ` Peter Zijlstra
  2014-01-09 16:31 ` Oleg Nesterov
  2014-01-09 17:33 ` [RFC][PATCH] lockdep: Introduce wait-type checks Dave Jones
  2 siblings, 0 replies; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-09 11:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Thomas Gleixner, Steven Rostedt, Oleg Nesterov,
	Paul McKenney, Linus Torvalds

On Thu, Jan 09, 2014 at 12:15:16PM +0100, Peter Zijlstra wrote:
> @@ -276,8 +292,14 @@ extern void lockdep_on(void);
>   * to lockdep:
>   */
>  
> -extern void lockdep_init_map(struct lockdep_map *lock, const char *name,
> -			     struct lock_class_key *key, int subclass);
> +extern void lockdep_init_map_wait(struct lockdep_map *lock, const char *name,
> +		struct lock_class_key *key, int subclass, short inner);
> +
> +static inline void lockdep_init_map(struct lockdep_map *lock, const char *name,
> +			     struct lock_class_key *key, int subclass)
> +{
> +	lockdep_init_map_wait(lock, name, key, subclass, LD_WAIT_INV);
> +}
>  
>  /*
>   * To initialize a lockdep_map statically use this macro.

Realized that the above destroys existing wait types for
lockdep_set_*(), so update those to preserve the wait_types.

At which point a trylock triggered a wait_type violation which is of
course a validation mistake, so ignore trylocks.

--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -314,15 +314,21 @@ static inline void lockdep_init_map(stru
  * of dependencies wrong: they are either too broad (they need a class-split)
  * or they are too narrow (they suffer from a false class-split):
  */
-#define lockdep_set_class(lock, key) \
-		lockdep_init_map(&(lock)->dep_map, #key, key, 0)
-#define lockdep_set_class_and_name(lock, key, name) \
-		lockdep_init_map(&(lock)->dep_map, name, key, 0)
-#define lockdep_set_class_and_subclass(lock, key, sub) \
-		lockdep_init_map(&(lock)->dep_map, #key, key, sub)
-#define lockdep_set_subclass(lock, sub)	\
-		lockdep_init_map(&(lock)->dep_map, #lock, \
-				 (lock)->dep_map.key, sub)
+#define lockdep_set_class(lock, key)				\
+	lockdep_init_map_wait(&(lock)->dep_map, #key, key, 0,	\
+			      (lock)->dep_map.wait_type_inner)
+
+#define lockdep_set_class_and_name(lock, key, name)		\
+	lockdep_init_map_wait(&(lock)->dep_map, name, key, 0,	\
+			      (lock)->dep_map.wait_type_inner)
+
+#define lockdep_set_class_and_subclass(lock, key, sub)		\
+	lockdep_init_map_wait(&(lock)->dep_map, #key, key, sub,	\
+			      (lock)->dep_map.wait_type_inner)
+
+#define lockdep_set_subclass(lock, sub)						\
+	lockdep_init_map_wait(&(lock)->dep_map, #lock, (lock)->dep_map.key, sub,\
+			      (lock)->dep_map.wait_type_inner)
 
 #define lockdep_set_novalidate_class(lock) \
 	lockdep_set_class(lock, &__lockdep_no_validate__)
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3092,14 +3092,14 @@ print_lock_invalid_wait_context(struct t
  * Therefore we must look for the strictest environment in the lock stack and
  * compare that to the lock we're trying to acquire.
  */
-static int check_context(struct task_struct *curr, struct held_lock *next)
+static int check_wait_context(struct task_struct *curr, struct held_lock *next)
 {
 	short next_inner = hlock_class(next)->wait_type_inner;
 	short next_outer = hlock_class(next)->wait_type_outer;
 	short curr_inner = LD_WAIT_MAX;
 	int depth;
 
-	if (!curr->lockdep_depth || !next_inner)
+	if (!curr->lockdep_depth || !next_inner || next->trylock)
 		return 0;
 
 	if (!next_outer)
@@ -3111,8 +3111,10 @@ static int check_context(struct task_str
 
 		if (prev_inner) {
 			/*
-			 * we can have a bigger inner than a previous one
+			 * We can have a bigger inner than a previous one
 			 * when outer is smaller than inner, as with RCU.
+			 *
+			 * Also due to trylocks.
 			 */
 			curr_inner = min(curr_inner, prev_inner);
 		}
@@ -3226,7 +3228,7 @@ static int __lock_acquire(struct lockdep
 	hlock->holdtime_stamp = lockstat_clock();
 #endif
 
-	if (check_context(curr, hlock))
+	if (check_wait_context(curr, hlock))
 		return 0;
 
 	if (check == 2 && !mark_irqflags(curr, hlock))


* Re: [RFC][PATCH] lockdep: Introduce wait-type checks
  2014-01-09 11:15 [RFC][PATCH] lockdep: Introduce wait-type checks Peter Zijlstra
  2014-01-09 11:49 ` Peter Zijlstra
@ 2014-01-09 16:31 ` Oleg Nesterov
  2014-01-09 17:08   ` Peter Zijlstra
  2014-01-09 17:33 ` [RFC][PATCH] lockdep: Introduce wait-type checks Dave Jones
  2 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-09 16:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On 01/09, Peter Zijlstra wrote:
>
> +static int check_context(struct task_struct *curr, struct held_lock *next)
> +{
> +	short next_inner = hlock_class(next)->wait_type_inner;
> +	short next_outer = hlock_class(next)->wait_type_outer;
> +	short curr_inner = LD_WAIT_MAX;
> +	int depth;
> +
> +	if (!curr->lockdep_depth || !next_inner)
> +		return 0;
> +
> +	if (!next_outer)
> +		next_outer = next_inner;
> +
> +	for (depth = 0; depth < curr->lockdep_depth; depth++) {
> +		struct held_lock *prev = curr->held_locks + depth;
> +		short prev_inner = hlock_class(prev)->wait_type_inner;
> +
> +		if (prev_inner) {
> +			/*
> +			 * we can have a bigger inner than a previous one
> +			 * when outer is smaller than inner, as with RCU.
> +			 */
> +			curr_inner = min(curr_inner, prev_inner);
> +		}
> +	}
> +
> +	if (next_outer > curr_inner)
> +		return print_lock_invalid_wait_context(curr, next);
> +
> +	return 0;
> +}

This is really minor, but it seems you can simplify it a little bit.
We do not really need curr_inner, the main loop can do

	for (...) {
		...

		if (prev_inner && prev_inner < next_outer)
			return print_lock_invalid_wait_context(...);
	}

	return 0;
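
For completeness, the whole function without curr_inner would then look
something like this (untested sketch):

	static int check_context(struct task_struct *curr, struct held_lock *next)
	{
		short next_inner = hlock_class(next)->wait_type_inner;
		short next_outer = hlock_class(next)->wait_type_outer;
		int depth;

		if (!curr->lockdep_depth || !next_inner)
			return 0;

		if (!next_outer)
			next_outer = next_inner;

		for (depth = 0; depth < curr->lockdep_depth; depth++) {
			struct held_lock *prev = curr->held_locks + depth;
			short prev_inner = hlock_class(prev)->wait_type_inner;

			/* any held lock with a stricter .inner fails next */
			if (prev_inner && prev_inner < next_outer)
				return print_lock_invalid_wait_context(curr, next);
		}

		return 0;
	}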


Off-topic question... I can't understand the "int check" argument of
lock_acquire(). First of all, __lock_acquire() does

	if (!prove_locking)
		check = 1;

Doesn't this mean lock_acquire_*() do not depend on CONFIG_PROVE_LOCKING?
IOW, can't we do

	--- x/include/linux/lockdep.h
	+++ x/include/linux/lockdep.h
	@@ -479,15 +479,9 @@ static inline void print_irqtrace_events
	  * on the per lock-class debug mode:
	  */
	 
	-#ifdef CONFIG_PROVE_LOCKING
	- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 2, n, i)
	- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 2, n, i)
	- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 2, n, i)
	-#else
	- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 1, n, i)
	- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 1, n, i)
	- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 1, n, i)
	-#endif
	+#define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 2, n, i)
	+#define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 2, n, i)
	+#define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 2, n, i)
	 
	 #define spin_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
	 #define spin_acquire_nest(l, s, t, n, i)	lock_acquire_exclusive(l, s, t, n, i)


But what I really can't understand is what "check == 0" means? It
seems that in fact it can be 1 or 2? Or, iow, "check == 0" is actually
equivalent to "check == 1" ?

Oleg.



* Re: [RFC][PATCH] lockdep: Introduce wait-type checks
  2014-01-09 16:31 ` Oleg Nesterov
@ 2014-01-09 17:08   ` Peter Zijlstra
  2014-01-09 17:54     ` check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks) Oleg Nesterov
  2014-01-12  9:40     ` [RFC][PATCH] lockdep: Introduce wait-type checks Ingo Molnar
  0 siblings, 2 replies; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-09 17:08 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On Thu, Jan 09, 2014 at 05:31:20PM +0100, Oleg Nesterov wrote:
> This is really minor, but it seems you can simplify it a little bit.
> We do not really need curr_inner, the main loop can do
> 
> 	for (...) {
> 		...
> 
> 		if (prev_inner && prev_inner < next_outer)
> 			return print_lock_invalid_wait_context(...);
> 	}
> 
> 	return 0;

Indeed, that might work. My brain still finds it easier to understand
the min() version though.

> Off-topic question... I can't understand the "int check" argument of
> lock_acquire(). First of all, __lock_acquire() does
> 
> 	if (!prove_locking)
> 		check = 1;
> 
> Doesn't this mean lock_acquire_*() do not depend on CONFIG_PROVE_LOCKING?
> IOW, can't we do
> 
> 	--- x/include/linux/lockdep.h
> 	+++ x/include/linux/lockdep.h
> 	@@ -479,15 +479,9 @@ static inline void print_irqtrace_events
> 	  * on the per lock-class debug mode:
> 	  */
> 	 
> 	-#ifdef CONFIG_PROVE_LOCKING
> 	- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 2, n, i)
> 	- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 2, n, i)
> 	- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 2, n, i)
> 	-#else
> 	- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 1, n, i)
> 	- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 1, n, i)
> 	- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 1, n, i)
> 	-#endif
> 	+#define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 2, n, i)
> 	+#define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 2, n, i)
> 	+#define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 2, n, i)

I suppose we could; note however that the if (!prove_locking) logic was
added later.

> But what I really can't understand is what "check == 0" means? It
> seems that in fact it can be 1 or 2? Or, iow, "check == 0" is actually
> equivalent to "check == 1" ?

Hmm indeed, the comment in lockdep.h says 0 means no checks at all, but
the code doesn't actually appear to work like that. I'm not sure it ever
did or not, I'd have to go dig through history.

That said, given the current state it certainly looks like we can remove
the check argument.

Ingo?


* Re: [RFC][PATCH] lockdep: Introduce wait-type checks
  2014-01-09 11:15 [RFC][PATCH] lockdep: Introduce wait-type checks Peter Zijlstra
  2014-01-09 11:49 ` Peter Zijlstra
  2014-01-09 16:31 ` Oleg Nesterov
@ 2014-01-09 17:33 ` Dave Jones
  2014-01-09 22:12   ` Peter Zijlstra
  2014-01-10 12:11   ` Peter Zijlstra
  2 siblings, 2 replies; 50+ messages in thread
From: Dave Jones @ 2014-01-09 17:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Steven Rostedt,
	Oleg Nesterov, Paul McKenney, Linus Torvalds

On Thu, Jan 09, 2014 at 12:15:16PM +0100, Peter Zijlstra wrote:
 > Subject: lockdep: Introduce wait-type checks
 > From: Peter Zijlstra <peterz@infradead.org>
 > Date: Tue, 19 Nov 2013 21:45:48 +0100
 > 
 > This patch extends lockdep to validate lock wait-type context.

ooh, a new toy.

*boom*

[    0.298629] =============================
[    0.298732] [ BUG: Invalid wait context ]
[    0.298834] 3.13.0-rc7+ #15 Not tainted
[    0.298935] -----------------------------
[    0.299038] swapper/0/1 is trying to lock:
[    0.299135]  (&n->list_lock){......}-{3:3}, at: [<ffffffff816dea54>] get_partial_node.isra.49+0x4d/0x228
[    0.299453] 
stack backtrace:
[    0.299608] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc7+ #15 
[    0.299983]  0000000000000001 ffff880243f37a00 ffffffff816dfe5b 0000000000000014
[    0.300302]  ffff880243f37a78 ffffffff8109f1f7 0000000000000000 ffff880243f37a78
[    0.300611]  0000000000000046 ffffffff81189ae3 ffffffff00000000 0000000000000046
[    0.300927] Call Trace:
[    0.301028]  [<ffffffff816dfe5b>] dump_stack+0x4e/0x7a
[    0.301128]  [<ffffffff8109f1f7>] __lock_acquire.isra.28+0x3d7/0xd80
[    0.301238]  [<ffffffff81189ae3>] ? deactivate_slab+0x3c3/0x740
[    0.301345]  [<ffffffff810a030d>] lock_acquire+0x8d/0x120
[    0.302971]  [<ffffffff816dea54>] ? get_partial_node.isra.49+0x4d/0x228
[    0.303077]  [<ffffffff816e9e3b>] _raw_spin_lock+0x3b/0x50
[    0.303183]  [<ffffffff816dea54>] ? get_partial_node.isra.49+0x4d/0x228
[    0.303290]  [<ffffffff816dea54>] get_partial_node.isra.49+0x4d/0x228
[    0.303397]  [<ffffffff810cd482>] ? __module_text_address+0x12/0x60
[    0.303502]  [<ffffffff810d35ff>] ? is_module_text_address+0x2f/0x50
[    0.303610]  [<ffffffff81074548>] ? __kernel_text_address+0x58/0x80
[    0.303717]  [<ffffffff816dedfc>] __slab_alloc+0x1cd/0x562
[    0.303821]  [<ffffffff812e86ff>] ? alloc_cpumask_var_node+0x1f/0x90
[    0.303929]  [<ffffffff8118ab6a>] kmem_cache_alloc_node_trace+0xda/0x290
[    0.304037]  [<ffffffff812e86ff>] ? alloc_cpumask_var_node+0x1f/0x90
[    0.304145]  [<ffffffff812e86ff>] alloc_cpumask_var_node+0x1f/0x90
[    0.304250]  [<ffffffff812e879e>] alloc_cpumask_var+0xe/0x10
[    0.304357]  [<ffffffff81030990>] __assign_irq_vector+0x40/0x340
[    0.304462]  [<ffffffff810324a1>] __create_irqs+0x151/0x210
[    0.304567]  [<ffffffff810325a2>] create_irq+0x22/0x30
[    0.304674]  [<ffffffff815ab25d>] dmar_set_interrupt+0x2d/0xd0
[    0.304784]  [<ffffffff81d6a532>] enable_drhd_fault_handling+0x24/0x66
[    0.304890]  [<ffffffff81d6bc7f>] irq_remap_enable_fault_handling+0x26/0x30
[    0.304999]  [<ffffffff81d2f3e0>] bsp_end_local_APIC_setup+0x18/0x1a
[    0.305106]  [<ffffffff81d2d44a>] native_smp_prepare_cpus+0x35c/0x3d3
[    0.305215]  [<ffffffff81d20f93>] kernel_init_freeable+0x124/0x26c
[    0.305321]  [<ffffffff816d6d4e>] ? kernel_init+0xe/0x130
[    0.305427]  [<ffffffff816d6d40>] ? rest_init+0xd0/0xd0
[    0.305529]  [<ffffffff816d6d4e>] kernel_init+0xe/0x130
[    0.305627]  [<ffffffff816f23ac>] ret_from_fork+0x7c/0xb0
[    0.305731]  [<ffffffff816d6d40>] ? rest_init+0xd0/0xd0
[    0.305836] 
other info that might help us debug this:
[    0.305993] 1 lock held by swapper/0/1:
[    0.306093]  #0:  (vector_lock){......}-{2:2}, at: [<ffffffff8103245c>] __create_irqs+0x10c/0x210
[    0.306444] 
stack backtrace:
[    0.306599] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc7+ #15 
[    0.306972]  0000000000000001 ffff880243f37a00 ffffffff816dfe5b 0000000000000014
[    0.307282]  ffff880243f37a78 ffffffff8109f220 0000000000000000 ffff880243f37a78
[    0.307594]  0000000000000046 ffffffff81189ae3 ffffffff00000000 0000000000000046
[    0.307903] Call Trace:
[    0.307999]  [<ffffffff816dfe5b>] dump_stack+0x4e/0x7a
[    0.308103]  [<ffffffff8109f220>] __lock_acquire.isra.28+0x400/0xd80
[    0.308209]  [<ffffffff81189ae3>] ? deactivate_slab+0x3c3/0x740
[    0.308316]  [<ffffffff810a030d>] lock_acquire+0x8d/0x120
[    0.308422]  [<ffffffff816dea54>] ? get_partial_node.isra.49+0x4d/0x228
[    0.308529]  [<ffffffff816e9e3b>] _raw_spin_lock+0x3b/0x50
[    0.308631]  [<ffffffff816dea54>] ? get_partial_node.isra.49+0x4d/0x228
[    0.308735]  [<ffffffff816dea54>] get_partial_node.isra.49+0x4d/0x228
[    0.308843]  [<ffffffff810cd482>] ? __module_text_address+0x12/0x60
[    0.308949]  [<ffffffff810d35ff>] ? is_module_text_address+0x2f/0x50
[    0.309055]  [<ffffffff81074548>] ? __kernel_text_address+0x58/0x80
[    0.309161]  [<ffffffff816dedfc>] __slab_alloc+0x1cd/0x562
[    0.309265]  [<ffffffff812e86ff>] ? alloc_cpumask_var_node+0x1f/0x90
[    0.309372]  [<ffffffff8118ab6a>] kmem_cache_alloc_node_trace+0xda/0x290
[    0.309479]  [<ffffffff812e86ff>] ? alloc_cpumask_var_node+0x1f/0x90
[    0.309586]  [<ffffffff812e86ff>] alloc_cpumask_var_node+0x1f/0x90
[    0.309692]  [<ffffffff812e879e>] alloc_cpumask_var+0xe/0x10
[    0.309799]  [<ffffffff81030990>] __assign_irq_vector+0x40/0x340
[    0.309906]  [<ffffffff810324a1>] __create_irqs+0x151/0x210
[    0.310005]  [<ffffffff810325a2>] create_irq+0x22/0x30
[    0.310109]  [<ffffffff815ab25d>] dmar_set_interrupt+0x2d/0xd0
[    0.310215]  [<ffffffff81d6a532>] enable_drhd_fault_handling+0x24/0x66
[    0.310321]  [<ffffffff81d6bc7f>] irq_remap_enable_fault_handling+0x26/0x30
[    0.310429]  [<ffffffff81d2f3e0>] bsp_end_local_APIC_setup+0x18/0x1a
[    0.310536]  [<ffffffff81d2d44a>] native_smp_prepare_cpus+0x35c/0x3d3
[    0.310644]  [<ffffffff81d20f93>] kernel_init_freeable+0x124/0x26c
[    0.310746]  [<ffffffff816d6d4e>] ? kernel_init+0xe/0x130
[    0.310852]  [<ffffffff816d6d40>] ? rest_init+0xd0/0xd0
[    0.310955]  [<ffffffff816d6d4e>] kernel_init+0xe/0x130
[    0.311058]  [<ffffffff816f23ac>] ret_from_fork+0x7c/0xb0
[    0.311164]  [<ffffffff816d6d40>] ? rest_init+0xd0/0xd0


nitpick: Why is the backtrace printed twice ?

	Dave



* check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks)
  2014-01-09 17:08   ` Peter Zijlstra
@ 2014-01-09 17:54     ` Oleg Nesterov
  2014-01-12 20:58       ` Peter Zijlstra
  2014-01-12  9:40     ` [RFC][PATCH] lockdep: Introduce wait-type checks Ingo Molnar
  1 sibling, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-09 17:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

I changed the subject to avoid the confusion.

On 01/09, Peter Zijlstra wrote:
>
> On Thu, Jan 09, 2014 at 05:31:20PM +0100, Oleg Nesterov wrote:
>
> > 	-#ifdef CONFIG_PROVE_LOCKING
> > 	- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 2, n, i)
> > 	- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 2, n, i)
> > 	- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 2, n, i)
> > 	-#else
> > 	- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 1, n, i)
> > 	- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 1, n, i)
> > 	- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 1, n, i)
> > 	-#endif
> > 	+#define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 2, n, i)
> > 	+#define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 2, n, i)
> > 	+#define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 2, n, i)
>
> I suppose we could; note however that the if (!prove_locking) logic was
> added later.

OK, thanks...

> > But what I really can't understans is what "check == 0" means? It
> > seems that in fact it can be 1 or 2? Or, iow, "check == 0" is actually
> > equivalent to "check == 1" ?
>
> Hmm indeed, the comment in lockdep.h says 0 means no checks at all, but
> the code doesn't actually appear to work like that. I'm not sure it ever
> did or not, I'd have to go dig through history.
>
> That said, given the current state it certainly looks like we can remove
> the check argument.

Or yes, we can probably simply remove it. Unlikely we will need
lock_acquire(check => 0).

But this connects to lockdep_no_validate. Not sure I understand what
this class should actually do, but consider this code:

	DEFINE_MUTEX(m1);
	DEFINE_MUTEX(m2);
	DEFINE_MUTEX(mx);

	void lockdep_should_complain(void)
	{
		lockdep_set_novalidate_class(&mx);

		// m1 -> mx -> m2
		mutex_lock(&m1);
		mutex_lock(&mx);
		mutex_lock(&m2);
		mutex_unlock(&m2);
		mutex_unlock(&mx);
		mutex_unlock(&m1);


		// m2 -> m1 ; should trigger the warning
		mutex_lock(&m2);
		mutex_lock(&m1);
		mutex_unlock(&m1);
		mutex_unlock(&m2);
	}

lockdep does not detect the trivial possible deadlock.

The patch below seems to work but most probably it is not right, and
I forgot everything (not too much) I knew about lockdep internals.

Oleg.

--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1939,7 +1939,8 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 		 * Only non-recursive-read entries get new dependencies
 		 * added:
 		 */
-		if (hlock->read != 2) {
+		if (hlock->read != 2 &&
+		    hlock->instance->key != &__lockdep_no_validate__) {
 			if (!check_prev_add(curr, hlock, next,
 						distance, trylock_loop))
 				return 0;



* Re: [RFC][PATCH] lockdep: Introduce wait-type checks
  2014-01-09 17:33 ` [RFC][PATCH] lockdep: Introduce wait-type checks Dave Jones
@ 2014-01-09 22:12   ` Peter Zijlstra
  2014-01-10 12:11   ` Peter Zijlstra
  1 sibling, 0 replies; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-09 22:12 UTC (permalink / raw)
  To: Dave Jones, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Steven Rostedt, Oleg Nesterov, Paul McKenney, Linus Torvalds

On Thu, Jan 09, 2014 at 12:33:26PM -0500, Dave Jones wrote:
> On Thu, Jan 09, 2014 at 12:15:16PM +0100, Peter Zijlstra wrote:
>  > Subject: lockdep: Introduce wait-type checks
>  > From: Peter Zijlstra <peterz@infradead.org>
>  > Date: Tue, 19 Nov 2013 21:45:48 +0100
>  > 
>  > This patch extends lockdep to validate lock wait-type context.
> 
> ooh, a new toy.
> 
> *boom*
> 

Oh cute, I'll have to stare at slub. ISTR there's a current slub-rt
patch floating about.

I'd hate to just blindly convert slub to raw_spinlocks (although that'll
get rid of the warning) without trying to limit the lock hold time of
raw_spinlocks.

> nitpick: Why is the backtrace printed twice ?

Because I'm an idiot, lemme fix that ;-)


* Re: [RFC][PATCH] lockdep: Introduce wait-type checks
  2014-01-09 17:33 ` [RFC][PATCH] lockdep: Introduce wait-type checks Dave Jones
  2014-01-09 22:12   ` Peter Zijlstra
@ 2014-01-10 12:11   ` Peter Zijlstra
  1 sibling, 0 replies; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-10 12:11 UTC (permalink / raw)
  To: Dave Jones, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Steven Rostedt, Oleg Nesterov, Paul McKenney, Linus Torvalds

On Thu, Jan 09, 2014 at 12:33:26PM -0500, Dave Jones wrote:
> On Thu, Jan 09, 2014 at 12:15:16PM +0100, Peter Zijlstra wrote:
>  > Subject: lockdep: Introduce wait-type checks
>  > From: Peter Zijlstra <peterz@infradead.org>
>  > Date: Tue, 19 Nov 2013 21:45:48 +0100
>  > 
>  > This patch extends lockdep to validate lock wait-type context.
> 
> ooh, a new toy.
> 
> *boom*
> 
> [    0.298629] =============================
> [    0.298732] [ BUG: Invalid wait context ]
> [    0.298834] 3.13.0-rc7+ #15 Not tainted
> [    0.298935] -----------------------------
> [    0.299038] swapper/0/1 is trying to lock:
> [    0.299135]  (&n->list_lock){......}-{3:3}, at: [<ffffffff816dea54>] get_partial_node.isra.49+0x4d/0x228
> [    0.299453] 
> stack backtrace:
> [    0.299608] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc7+ #15 
> [    0.299983]  0000000000000001 ffff880243f37a00 ffffffff816dfe5b 0000000000000014
> [    0.300302]  ffff880243f37a78 ffffffff8109f1f7 0000000000000000 ffff880243f37a78
> [    0.300611]  0000000000000046 ffffffff81189ae3 ffffffff00000000 0000000000000046
> [    0.300927] Call Trace:
> [    0.301028]  [<ffffffff816dfe5b>] dump_stack+0x4e/0x7a
> [    0.301128]  [<ffffffff8109f1f7>] __lock_acquire.isra.28+0x3d7/0xd80
> [    0.301238]  [<ffffffff81189ae3>] ? deactivate_slab+0x3c3/0x740
> [    0.301345]  [<ffffffff810a030d>] lock_acquire+0x8d/0x120
> [    0.302971]  [<ffffffff816dea54>] ? get_partial_node.isra.49+0x4d/0x228
> [    0.303077]  [<ffffffff816e9e3b>] _raw_spin_lock+0x3b/0x50
> [    0.303183]  [<ffffffff816dea54>] ? get_partial_node.isra.49+0x4d/0x228
> [    0.303290]  [<ffffffff816dea54>] get_partial_node.isra.49+0x4d/0x228
> [    0.303397]  [<ffffffff810cd482>] ? __module_text_address+0x12/0x60
> [    0.303502]  [<ffffffff810d35ff>] ? is_module_text_address+0x2f/0x50
> [    0.303610]  [<ffffffff81074548>] ? __kernel_text_address+0x58/0x80
> [    0.303717]  [<ffffffff816dedfc>] __slab_alloc+0x1cd/0x562
> [    0.303821]  [<ffffffff812e86ff>] ? alloc_cpumask_var_node+0x1f/0x90
> [    0.303929]  [<ffffffff8118ab6a>] kmem_cache_alloc_node_trace+0xda/0x290
> [    0.304037]  [<ffffffff812e86ff>] ? alloc_cpumask_var_node+0x1f/0x90
> [    0.304145]  [<ffffffff812e86ff>] alloc_cpumask_var_node+0x1f/0x90
> [    0.304250]  [<ffffffff812e879e>] alloc_cpumask_var+0xe/0x10
> [    0.304357]  [<ffffffff81030990>] __assign_irq_vector+0x40/0x340
> [    0.304462]  [<ffffffff810324a1>] __create_irqs+0x151/0x210
> [    0.304567]  [<ffffffff810325a2>] create_irq+0x22/0x30
> [    0.304674]  [<ffffffff815ab25d>] dmar_set_interrupt+0x2d/0xd0
> [    0.304784]  [<ffffffff81d6a532>] enable_drhd_fault_handling+0x24/0x66
> [    0.304890]  [<ffffffff81d6bc7f>] irq_remap_enable_fault_handling+0x26/0x30
> [    0.304999]  [<ffffffff81d2f3e0>] bsp_end_local_APIC_setup+0x18/0x1a
> [    0.305106]  [<ffffffff81d2d44a>] native_smp_prepare_cpus+0x35c/0x3d3
> [    0.305215]  [<ffffffff81d20f93>] kernel_init_freeable+0x124/0x26c
> [    0.305321]  [<ffffffff816d6d4e>] ? kernel_init+0xe/0x130
> [    0.305427]  [<ffffffff816d6d40>] ? rest_init+0xd0/0xd0
> [    0.305529]  [<ffffffff816d6d4e>] kernel_init+0xe/0x130
> [    0.305627]  [<ffffffff816f23ac>] ret_from_fork+0x7c/0xb0
> [    0.305731]  [<ffffffff816d6d40>] ? rest_init+0xd0/0xd0
> [    0.305836] 
> other info that might help us debug this:
> [    0.305993] 1 lock held by swapper/0/1:
> [    0.306093]  #0:  (vector_lock){......}-{2:2}, at: [<ffffffff8103245c>] __create_irqs+0x10c/0x210
> [    0.306444] 

Ok, so whatever way I turn this thing, we simply cannot allocate memory
while holding a raw_spinlock, since all the allocator locks up to and
including zone->lock are regular spinlocks and we very much want
preemptible allocators, so changing that is not an option.

While -rt does appear to turn list_lock into a raw_spinlock, that is at
most a band-aid afaict, because all those list iterations that are done
while holding it aren't in any way bounded.

But even converting list_lock doesn't help, because SLUB (and any of the
others) will eventually call into alloc_page and friends which will
touch zone->lock, which is very much a spinlock, even on -rt.

So we must change this __create_irqs() site to not do this allocation
while holding the lock. /me greps the -rt patches to see if anybody
touches that.

Ah, no, -rt simply forbids CPUMASK_OFFSTACK and side-steps the issue
here.

Bugger.

Thomas, any clue? __assign_irq_vector() is called rather deep down in
the whole IRQ story and appears to be rather stupidly expensive; a
sideways reading of it makes it appear to be O(nr_cpus^2), surely
a complete fail even for !rt kernels.


* Re: [RFC][PATCH] lockdep: Introduce wait-type checks
  2014-01-09 17:08   ` Peter Zijlstra
  2014-01-09 17:54     ` check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks) Oleg Nesterov
@ 2014-01-12  9:40     ` Ingo Molnar
  2014-01-12 17:45       ` [PATCH 0/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire() Oleg Nesterov
  1 sibling, 1 reply; 50+ messages in thread
From: Ingo Molnar @ 2014-01-12  9:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, linux-kernel, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds


* Peter Zijlstra <peterz@infradead.org> wrote:

> > But what I really can't understand is what "check == 0" means? It
> > seems that in fact it can be 1 or 2? Or, iow, "check == 0" is
> > actually equivalent to "check == 1" ?
> 
> Hmm indeed, the comment in lockdep.h says 0 means no checks at all, 
> but the code doesn't actually appear to work like that. I'm not sure 
> it ever did or not, I'd have to go dig through history.
> 
> That said, given the current state it certainly looks like we can 
> remove the check argument.
> 
> Ingo?

Agreed.

Thanks,

	Ingo


* [PATCH 0/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-12  9:40     ` [RFC][PATCH] lockdep: Introduce wait-type checks Ingo Molnar
@ 2014-01-12 17:45       ` Oleg Nesterov
  2014-01-12 17:45         ` [PATCH 1/1] " Oleg Nesterov
  2014-01-12 20:00         ` [PATCH 0/1] " Peter Zijlstra
  0 siblings, 2 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-12 17:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On 01/12, Ingo Molnar wrote:
>
> * Peter Zijlstra <peterz@infradead.org> wrote:
>
> > > But what I really can't understand is what "check == 0" means? It
> > > seems that in fact it can be 1 or 2? Or, iow, "check == 0" is
> > > actually equivalent to "check == 1" ?
> >
> > Hmm indeed, the comment in lockdep.h says 0 means no checks at all,
> > but the code doesn't actually appear to work like that. I'm not sure
> > it ever did or not, I'd have to go dig through history.
> >
> > That said, given the current state it certainly looks like we can
> > remove the check argument.
> >
> > Ingo?
>
> Agreed.

OK, could you and Peter review the patch?

If it passes the review I'll send another one which changes the callers
of lock_acquire(). And trace_lock_acquire() should be trivially updated
too.

But could someone please explain to me what lockdep_no_validate should
actually do? 1704f47b5 "lockdep: Add novalidate class for dev->mutex
conversion" doesn't say which kind of warnings it tries to avoid,
and it looks buggy (see another email from me).

Oleg.



* [PATCH 1/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-12 17:45       ` [PATCH 0/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire() Oleg Nesterov
@ 2014-01-12 17:45         ` Oleg Nesterov
  2014-01-13  0:28           ` Dave Jones
  2014-01-13 17:06           ` Oleg Nesterov
  2014-01-12 20:00         ` [PATCH 0/1] " Peter Zijlstra
  1 sibling, 2 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-12 17:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

The "int check" argument of lock_acquire() and held_lock->check
are misleading and unneeded. This is only used as a boolean: 2
denotes "true", everything else is "false". And this boolean is
always equal to prove_locking.

The only exception is __lockdep_no_validate__ which should make
this condition "false" in validate_chain().

This patch kills this member and removes the argument from
__lock_acquire(), the next patch will remove the now unused arg
from lock_acquire() and update its callers.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/lockdep.h  |    1 -
 kernel/locking/lockdep.c |   22 ++++++++--------------
 2 files changed, 8 insertions(+), 15 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 92b1bfc..13bd13d 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -252,7 +252,6 @@ struct held_lock {
 	unsigned int trylock:1;						/* 16 bits */
 
 	unsigned int read:2;        /* see lock_acquire() comment */
-	unsigned int check:2;       /* see lock_acquire() comment */
 	unsigned int hardirqs_off:1;
 	unsigned int references:11;					/* 32 bits */
 };
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 576ba75..32ba948 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2096,7 +2096,8 @@ static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
 	 * (If lookup_chain_cache() returns with 1 it acquires
 	 * graph_lock for us)
 	 */
-	if (!hlock->trylock && (hlock->check == 2) &&
+	if (!hlock->trylock && prove_locking &&
+	    hlock_class(hlock)->key != __lockdep_no_validate__.subkeys &&
 	    lookup_chain_cache(curr, hlock, chain_key)) {
 		/*
 		 * Check whether last held lock:
@@ -3041,7 +3042,7 @@ static int __lock_is_held(struct lockdep_map *lock);
  * We maintain the dependency maps and validate the locking attempt:
  */
 static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
-			  int trylock, int read, int check, int hardirqs_off,
+			  int trylock, int read, int hardirqs_off,
 			  struct lockdep_map *nest_lock, unsigned long ip,
 			  int references)
 {
@@ -3053,9 +3054,6 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	int class_idx;
 	u64 chain_key;
 
-	if (!prove_locking)
-		check = 1;
-
 	if (unlikely(!debug_locks))
 		return 0;
 
@@ -3067,9 +3065,6 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
 		return 0;
 
-	if (lock->key == &__lockdep_no_validate__)
-		check = 1;
-
 	if (subclass < NR_LOCKDEP_CACHING_CLASSES)
 		class = lock->class_cache[subclass];
 	/*
@@ -3128,7 +3123,6 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	hlock->nest_lock = nest_lock;
 	hlock->trylock = trylock;
 	hlock->read = read;
-	hlock->check = check;
 	hlock->hardirqs_off = !!hardirqs_off;
 	hlock->references = references;
 #ifdef CONFIG_LOCK_STAT
@@ -3136,7 +3130,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	hlock->holdtime_stamp = lockstat_clock();
 #endif
 
-	if (check == 2 && !mark_irqflags(curr, hlock))
+	if (prove_locking && !mark_irqflags(curr, hlock))
 		return 0;
 
 	/* mark it as used: */
@@ -3338,7 +3332,7 @@ found_it:
 		hlock = curr->held_locks + i;
 		if (!__lock_acquire(hlock->instance,
 			hlock_class(hlock)->subclass, hlock->trylock,
-				hlock->read, hlock->check, hlock->hardirqs_off,
+				hlock->read, hlock->hardirqs_off,
 				hlock->nest_lock, hlock->acquire_ip,
 				hlock->references))
 			return 0;
@@ -3422,7 +3416,7 @@ found_it:
 		hlock = curr->held_locks + i;
 		if (!__lock_acquire(hlock->instance,
 			hlock_class(hlock)->subclass, hlock->trylock,
-				hlock->read, hlock->check, hlock->hardirqs_off,
+				hlock->read, hlock->hardirqs_off,
 				hlock->nest_lock, hlock->acquire_ip,
 				hlock->references))
 			return 0;
@@ -3598,8 +3592,8 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 
 	current->lockdep_recursion = 1;
 	trace_lock_acquire(lock, subclass, trylock, read, check, nest_lock, ip);
-	__lock_acquire(lock, subclass, trylock, read, check,
-		       irqs_disabled_flags(flags), nest_lock, ip, 0);
+	__lock_acquire(lock, subclass, trylock, read,
+			irqs_disabled_flags(flags), nest_lock, ip, 0);
 	current->lockdep_recursion = 0;
 	raw_local_irq_restore(flags);
 }
-- 
1.5.5.1




* Re: [PATCH 0/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-12 17:45       ` [PATCH 0/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire() Oleg Nesterov
  2014-01-12 17:45         ` [PATCH 1/1] " Oleg Nesterov
@ 2014-01-12 20:00         ` Peter Zijlstra
  2014-01-13 18:35           ` Oleg Nesterov
  1 sibling, 1 reply; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-12 20:00 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On Sun, Jan 12, 2014 at 06:45:32PM +0100, Oleg Nesterov wrote:
> But could someone please explain to me what lockdep_no_validate should
> actually do? 1704f47b5 "lockdep: Add novalidate class for dev->mutex
> conversion" doesn't say which kind of warnings it tries to avoid,
> and it looks buggy (see another email from me).

It should disable lock-order checks for the locks that set that as
class. So we can still do the simple checks, like detecting if we free
memory with a held lock and making sure we do not return to userspace
with a held lock; but the class should not participate in the lock
graph.
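
A minimal sketch of that usage (the mutex name here is made up; today
dev->mutex is the real user):

	#include <linux/mutex.h>
	#include <linux/lockdep.h>

	static DEFINE_MUTEX(example_mutex);	/* hypothetical lock */

	static void example_init(void)
	{
		/*
		 * Opt out of lock-order (graph) validation only; the
		 * lock is still tracked, so e.g. freeing memory that
		 * contains it while held still warns.
		 */
		lockdep_set_novalidate_class(&example_mutex);
	}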

I'll have a look at your earlier email again; I did note the question
but it slipped my mind :/


* Re: check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks)
  2014-01-09 17:54     ` check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks) Oleg Nesterov
@ 2014-01-12 20:58       ` Peter Zijlstra
  2014-01-13 16:07         ` Oleg Nesterov
  0 siblings, 1 reply; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-12 20:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On Thu, Jan 09, 2014 at 06:54:48PM +0100, Oleg Nesterov wrote:
> But this connects to lockdep_no_validate. Not sure I understand what
> this class should actually do, but consider this code:
> 
> 	DEFINE_MUTEX(m1);
> 	DEFINE_MUTEX(m2);
> 	DEFINE_MUTEX(mx);
> 
> 	void lockdep_should_complain(void)
> 	{
> 		lockdep_set_novalidate_class(&mx);
> 
> 		// m1 -> mx -> m2
> 		mutex_lock(&m1);
> 		mutex_lock(&mx);
> 		mutex_lock(&m2);
> 		mutex_unlock(&m2);
> 		mutex_unlock(&mx);
> 		mutex_unlock(&m1);
> 
> 
> 		// m2 -> m1 ; should trigger the warning
> 		mutex_lock(&m2);
> 		mutex_lock(&m1);
> 		mutex_unlock(&m1);
> 		mutex_unlock(&m2);
> 	}
> 
> lockdep doesn't detect the trivial possible deadlock.
> 
> The patch below seems to work but most probably it is not right, and
> I forgot everything (not too much) I knew about lockdep internals.
> 
> Oleg.
> 
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1939,7 +1939,8 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
>  		 * Only non-recursive-read entries get new dependencies
>  		 * added:
>  		 */
> -		if (hlock->read != 2) {
> +		if (hlock->read != 2 &&
> +		    hlock->instance->key != &__lockdep_no_validate__) {
>  			if (!check_prev_add(curr, hlock, next,
>  						distance, trylock_loop))
>  				return 0;
> 

Hmm, you are quite right indeed; although I would write it like:

  if (hlock->read != 2 && hlock->check == 2)

because the __lockdep_no_validate__ thing forces the ->check value to 1.
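
In context that would be, roughly (untested sketch of the same hunk in
check_prevs_add()):

	if (hlock->read != 2 && hlock->check == 2) {
		if (!check_prev_add(curr, hlock, next,
					distance, trylock_loop))
			return 0;
	}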


* Re: [PATCH 1/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-12 17:45         ` [PATCH 1/1] " Oleg Nesterov
@ 2014-01-13  0:28           ` Dave Jones
  2014-01-13 16:20             ` Oleg Nesterov
  2014-01-13 17:06           ` Oleg Nesterov
  1 sibling, 1 reply; 50+ messages in thread
From: Dave Jones @ 2014-01-13  0:28 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, Thomas Gleixner,
	Steven Rostedt, Paul McKenney, Linus Torvalds

On Sun, Jan 12, 2014 at 06:45:54PM +0100, Oleg Nesterov wrote:
 
 > diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
 > index 92b1bfc..13bd13d 100644
 > --- a/include/linux/lockdep.h
 > +++ b/include/linux/lockdep.h
 > @@ -252,7 +252,6 @@ struct held_lock {
 >  	unsigned int trylock:1;						/* 16 bits */
 >  
 >  	unsigned int read:2;        /* see lock_acquire() comment */
 > -	unsigned int check:2;       /* see lock_acquire() comment */
 >  	unsigned int hardirqs_off:1;
 >  	unsigned int references:11;					/* 32 bits */

The 'bits' comments seem to be crap, even before your patch.

	Dave



* Re: check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks)
  2014-01-12 20:58       ` Peter Zijlstra
@ 2014-01-13 16:07         ` Oleg Nesterov
  2014-01-16 17:43           ` Oleg Nesterov
  0 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-13 16:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On 01/12, Peter Zijlstra wrote:
>
> On Thu, Jan 09, 2014 at 06:54:48PM +0100, Oleg Nesterov wrote:
> >
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -1939,7 +1939,8 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
> >  		 * Only non-recursive-read entries get new dependencies
> >  		 * added:
> >  		 */
> > -		if (hlock->read != 2) {
> > +		if (hlock->read != 2 &&
> > +		    hlock->instance->key != &__lockdep_no_validate__) {
> >  			if (!check_prev_add(curr, hlock, next,
> >  						distance, trylock_loop))
> >  				return 0;
> >
>
> Hmm, you are quite right indeed;

Thanks!

> although I would write it like:
>
>   if (hlock->read != 2 && hlock->check == 2)
>
> because the __lockdep_no_validate__ thing forces the ->check value to 1.

Agreed, hlock->check == 2 looks better. But this connects to another
patch I sent which removes hlock->check...

OK, I'll wait for review on that patch, then resend this one with
->check or __lockdep_no_validate__ depending on the result.

Oleg.



* Re: [PATCH 1/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-13  0:28           ` Dave Jones
@ 2014-01-13 16:20             ` Oleg Nesterov
  0 siblings, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-13 16:20 UTC (permalink / raw)
  To: Dave Jones, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Thomas Gleixner, Steven Rostedt, Paul McKenney, Linus Torvalds

On 01/12, Dave Jones wrote:
>
> On Sun, Jan 12, 2014 at 06:45:54PM +0100, Oleg Nesterov wrote:
>
>  > diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
>  > index 92b1bfc..13bd13d 100644
>  > --- a/include/linux/lockdep.h
>  > +++ b/include/linux/lockdep.h
>  > @@ -252,7 +252,6 @@ struct held_lock {
>  >  	unsigned int trylock:1;						/* 16 bits */
>  >
>  >  	unsigned int read:2;        /* see lock_acquire() comment */
>  > -	unsigned int check:2;       /* see lock_acquire() comment */
>  >  	unsigned int hardirqs_off:1;
>  >  	unsigned int references:11;					/* 32 bits */
>
> The 'bits' comments seem to be crap, even before your patch.

Hmm, I didn't even notice them.

I think the comments were correct; note that the previous field is uint:13.

But after (with) this patch the width of "references" should be updated.
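
For reference, the accounting behind those comments looks like this
(the field names before ->trylock are from memory, not from the hunk
above):

	class_idx:13 + irq_context:2 + trylock:1		/* 16 bits */
	read:2 + check:2 + hardirqs_off:1 + references:11	/* 32 bits */

so removing (or shrinking) check:2 frees bits that "references" or
explicit padding has to absorb.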

Thanks,

Oleg.



* Re: [PATCH 1/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-12 17:45         ` [PATCH 1/1] " Oleg Nesterov
  2014-01-13  0:28           ` Dave Jones
@ 2014-01-13 17:06           ` Oleg Nesterov
  2014-01-13 17:28             ` Peter Zijlstra
  1 sibling, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-13 17:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On 01/12, Oleg Nesterov wrote:
>
> The "int check" argument of lock_acquire() and held_lock->check
> are misleading and unneeded. This is only used as a boolean, 2
> denotes "true", everything else is "false". And this boolean is
> always equal to prove_locking.
>
> The only exception is __lockdep_no_validate__ which should make
> this condition "false" in validate_chain().

And I missed mark_irqflags(),

> @@ -3136,7 +3130,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>  	hlock->holdtime_stamp = lockstat_clock();
>  #endif
>
> -	if (check == 2 && !mark_irqflags(curr, hlock))
> +	if (prove_locking && !mark_irqflags(curr, hlock))
>  		return 0;

This change is not right, at least it is not equivalent.

And I just realized that rcu_lock_acquire() does lock_acquire(check => 1).
Probably we can mark rcu_lock_map's as __lockdep_no_validate__.

Anything else I missed?

Oleg.



* Re: [PATCH 1/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-13 17:06           ` Oleg Nesterov
@ 2014-01-13 17:28             ` Peter Zijlstra
  2014-01-13 18:52               ` Oleg Nesterov
  2014-01-13 22:34               ` Paul E. McKenney
  0 siblings, 2 replies; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-13 17:28 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On Mon, Jan 13, 2014 at 06:06:09PM +0100, Oleg Nesterov wrote:
> On 01/12, Oleg Nesterov wrote:
> >
> > The "int check" argument of lock_acquire() and held_lock->check
> > are misleading and unneeded. This is only used as a boolean, 2
> > denotes "true", everything else is "false". And this boolean is
> > always equal to prove_locking.
> >
> > The only exception is __lockdep_no_validate__ which should make
> > this condition "false" in validate_chain().
> 
> And I missed mark_irqflags(),
> 
> > @@ -3136,7 +3130,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> >  	hlock->holdtime_stamp = lockstat_clock();
> >  #endif
> >
> > -	if (check == 2 && !mark_irqflags(curr, hlock))
> > +	if (prove_locking && !mark_irqflags(curr, hlock))
> >  		return 0;
> 
> This change is not right, at least it is not equivalent.
> 
> And I just realized that rcu_lock_acquire() does lock_acquire(check => 1).
> Probably we can mark rcu_lock_map's as __lockdep_no_validate__.

Can't, RCU needs its own classes. Otherwise it cannot tell which version
of the RCU read lock it's holding at just that moment.

> Anything else I missed?

Nothing springs to mind, but then, I totally missed the RCU thing too.

At the very least we can reduce check to a single bit.


* Re: [PATCH 0/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-12 20:00         ` [PATCH 0/1] " Peter Zijlstra
@ 2014-01-13 18:35           ` Oleg Nesterov
  0 siblings, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-13 18:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On 01/12, Peter Zijlstra wrote:
>
> On Sun, Jan 12, 2014 at 06:45:32PM +0100, Oleg Nesterov wrote:
> > But could someone please explain to me what lockdep_no_validate should
> > actually do? 1704f47b5 "lockdep: Add novalidate class for dev->mutex
> > conversion" doesn't tell which kind of warnings it tries to avoid,
> > and it looks buggy (see another email from me).
>
> It should disable lock-order checks for the locks that set that as
> class. So we can still do the simple checks, like detecting if we free
> memory with a held lock and making sure we do not return to userspace
> with a held lock; but the class should not participate in the lock
> graph.

Yes, this is what I more or less understood, but I was confused anyway.
Somehow I thought that this patch tries to suppress the warnings caused
by rw_semaphore -> mutex conversion, and I was completely puzzled. But
3142788b79 "drivers/base: Convert dev->sem to mutex" actually did the
semaphore -> mutex change!

Everything is clear now, thanks!

Oleg.



* Re: [PATCH 1/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-13 17:28             ` Peter Zijlstra
@ 2014-01-13 18:52               ` Oleg Nesterov
  2014-01-13 22:34               ` Paul E. McKenney
  1 sibling, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-13 18:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, linux-kernel, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On 01/13, Peter Zijlstra wrote:
>
> On Mon, Jan 13, 2014 at 06:06:09PM +0100, Oleg Nesterov wrote:
> >
> > And I just realized that rcu_lock_acquire() does lock_acquire(check => 1).
> > Probably we can mark rcu_lock_map's as __lockdep_no_validate__.
>
> Can't, RCU needs its own classes. Otherwise it cannot tell which version
> of the RCU read lock it's holding at just that moment.

Ah, indeed. Thanks.

Can't it do lock_acquire(trylock => 1, read => 2)? This still means
mark_irqflags(), but perhaps this won't hurt too much.
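
Concretely, the helper is currently

	static inline void rcu_lock_acquire(struct lockdep_map *map)
	{
		lock_acquire(map, 0, 0, 2, 1, NULL, _THIS_IP_);
	}

(arguments: subclass, trylock, read, check), and the idea above would
turn it into something like the sketch below, keeping the other
arguments as they are today:

	static inline void rcu_lock_acquire(struct lockdep_map *map)
	{
		/* trylock=1, read=2: no dependencies recorded either way */
		lock_acquire(map, 0, 1, 2, 1, NULL, _THIS_IP_);
	}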

> At the very least we can reduce check to a single bit.

Or this, yes.

Oleg.



* Re: [PATCH 1/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire()
  2014-01-13 17:28             ` Peter Zijlstra
  2014-01-13 18:52               ` Oleg Nesterov
@ 2014-01-13 22:34               ` Paul E. McKenney
  1 sibling, 0 replies; 50+ messages in thread
From: Paul E. McKenney @ 2014-01-13 22:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Ingo Molnar, linux-kernel, Thomas Gleixner,
	Steven Rostedt, Linus Torvalds

On Mon, Jan 13, 2014 at 06:28:33PM +0100, Peter Zijlstra wrote:
> On Mon, Jan 13, 2014 at 06:06:09PM +0100, Oleg Nesterov wrote:
> > On 01/12, Oleg Nesterov wrote:
> > >
> > > The "int check" argument of lock_acquire() and held_lock->check
> > > are misleading and unneeded. This is only used as a boolean, 2
> > > denotes "true", everything else is "false". And this boolean is
> > > always equal to prove_locking.
> > >
> > > The only exception is __lockdep_no_validate__ which should make
> > > this condition "false" in validate_chain().
> > 
> > And I missed mark_irqflags(),
> > 
> > > @@ -3136,7 +3130,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> > >  	hlock->holdtime_stamp = lockstat_clock();
> > >  #endif
> > >
> > > -	if (check == 2 && !mark_irqflags(curr, hlock))
> > > +	if (prove_locking && !mark_irqflags(curr, hlock))
> > >  		return 0;
> > 
> > This change is not right, at least it is not equivalent.
> > 
> > And I just realized that rcu_lock_acquire() does lock_acquire(check => 1).
> > Probably we can mark rcu_lock_map's as __lockdep_no_validate__.
> 
> Can't, RCU needs its own classes. Otherwise it cannot tell which version
> > of the RCU read lock it's holding at just that moment.

Just confirming this.  RCU uses this to detect mismatches between
the rcu_read_lock() group and the rcu_dereference() group.
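
(Each flavour has its own map, and the rcu_dereference() checks ask
lockdep about the matching one; roughly, glossing over the
debug_lockdep_rcu_enabled() and related tests:)

	extern struct lockdep_map rcu_lock_map;		/* rcu_read_lock() */
	extern struct lockdep_map rcu_bh_lock_map;	/* rcu_read_lock_bh() */
	extern struct lockdep_map rcu_sched_lock_map;	/* rcu_read_lock_sched() */

	int rcu_read_lock_held(void)
	{
		return lock_is_held(&rcu_lock_map);	/* this flavour only */
	}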

							Thanx, Paul

> > Anything else I missed?
> 
> Nothing springs to mind, but then, I totally missed the RCU thing too.
> 
> At the very least we can reduce check to a single bit.
> 



* Re: check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks)
  2014-01-13 16:07         ` Oleg Nesterov
@ 2014-01-16 17:43           ` Oleg Nesterov
  2014-01-16 18:09             ` Peter Zijlstra
  0 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-16 17:43 UTC (permalink / raw)
  To: Peter Zijlstra, Greg Kroah-Hartman
  Cc: linux-kernel, Ingo Molnar, Thomas Gleixner, Steven Rostedt,
	Paul McKenney, Linus Torvalds

On 01/13, Oleg Nesterov wrote:
>
> On 01/12, Peter Zijlstra wrote:
> >
> > On Thu, Jan 09, 2014 at 06:54:48PM +0100, Oleg Nesterov wrote:
> > >
> > > --- a/kernel/locking/lockdep.c
> > > +++ b/kernel/locking/lockdep.c
> > > @@ -1939,7 +1939,8 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
> > >  		 * Only non-recursive-read entries get new dependencies
> > >  		 * added:
> > >  		 */
> > > -		if (hlock->read != 2) {
> > > +		if (hlock->read != 2 &&
> > > +		    hlock->instance->key != &__lockdep_no_validate__) {
> > >  			if (!check_prev_add(curr, hlock, next,
> > >  						distance, trylock_loop))
> > >  				return 0;
> > >
> >
> > Hmm, you are quite right indeed;
>
> Thanks!
>
> > although I would write it like:
> >
> >   if (hlock->read != 2 && hlock->check == 2)
> >
> > because the __lockdep_no_validate__ thing forces the ->check value to 1.
>
> Agreed, hlock->check == 2 looks better. But this connects to another
> patch I sent which removes hlock->check...
>
> OK, I'll wait for review on that patch, then resend this one with
> ->check or __lockdep_no_validate__ depending on the result.

And I still think that we should try to remove hlock->check, but
please forget about this for the moment. I'll try to send some
patches later in any case.


OK, let's suppose we do the change above using ->key or ->check (it
doesn't matter). This will fix the already discussed pattern:

		static DEFINE_MUTEX(m1);
		static DEFINE_MUTEX(m2);
		static DEFINE_MUTEX(mx);

		lockdep_set_novalidate_class(&mx);

		// m1 -> mx -> m2
		mutex_lock(&m1);
		mutex_lock(&mx);
		mutex_lock(&m2);
		mutex_unlock(&m2);
		mutex_unlock(&mx);
		mutex_unlock(&m1);

		// m2 -> m1 ; should trigger the warning
		mutex_lock(&m2);
		mutex_lock(&m1);
		mutex_unlock(&m1);
		mutex_unlock(&m2);

before this change lockdep can't detect the trivial deadlock.

But with or without this change the following code

		static DEFINE_MUTEX(m1);
		static DEFINE_MUTEX(mx);

		lockdep_set_novalidate_class(&mx);

		// m1 -> mx
		mutex_lock(&m1);
		mutex_lock(&mx);
		mutex_unlock(&mx);
		mutex_unlock(&m1);

		// mx -> m1 ; should trigger the warning ???
		mutex_lock(&mx);
		mutex_lock(&m1);
		mutex_unlock(&m1);
		mutex_unlock(&mx);

doesn't trigger the warning either. This is correct because
lockdep_set_novalidate_class() means, well, no-validate.
The question is: do we really want to avoid all validations?

Why was lockdep_set_novalidate_class() added? Unless I missed
something, the problem is that (say) __driver_attach() can take
the "same" lock twice, and drivers/base/ lacks annotations.

Perhaps we should change the meaning of lockdep_set_novalidate_class?
(perhaps with rename). What do you think about the patch below?

With this patch __lockdep_no_validate__ means "automatically nested",
although I have to remind you that I can hardly understand the code I am
trying to change ;)

Oleg.

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 576ba75..844d25d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2515,9 +2515,6 @@ mark_held_locks(struct task_struct *curr, enum mark_type mark)
 
 		BUG_ON(usage_bit >= LOCK_USAGE_STATES);
 
-		if (hlock_class(hlock)->key == __lockdep_no_validate__.subkeys)
-			continue;
-
 		if (!mark_lock(curr, hlock, usage_bit))
 			return 0;
 	}
@@ -3067,8 +3064,15 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
 		return 0;
 
-	if (lock->key == &__lockdep_no_validate__)
-		check = 1;
+	if (lock->key == &__lockdep_no_validate__) {
+		int i;
+
+		for (i = curr->lockdep_depth; --i >= 0; ) {
+			hlock = curr->held_locks + i;
+			if (hlock->instance->key == lock->key)
+				goto nested;
+		}
+	}
 
 	if (subclass < NR_LOCKDEP_CACHING_CLASSES)
 		class = lock->class_cache[subclass];
@@ -3106,6 +3110,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	if (depth) {
 		hlock = curr->held_locks + depth - 1;
 		if (hlock->class_idx == class_idx && nest_lock) {
+nested:
 			if (hlock->references)
 				hlock->references++;
 			else
@@ -3282,8 +3287,9 @@ static int match_held_lock(struct held_lock *hlock, struct lockdep_map *lock)
 		 * References, but not a lock we're actually ref-counting?
 		 * State got messed up, follow the sites that change ->references
 		 * and try to make sense of it.
+		 * FIXME: s/0/novalidate/ ?
 		 */
-		if (DEBUG_LOCKS_WARN_ON(!hlock->nest_lock))
+		if (DEBUG_LOCKS_WARN_ON(0 && !hlock->nest_lock))
 			return 0;
 
 		if (hlock->class_idx == class - lock_classes + 1)



* Re: check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks)
  2014-01-16 17:43           ` Oleg Nesterov
@ 2014-01-16 18:09             ` Peter Zijlstra
  2014-01-16 20:26               ` Alan Stern
  0 siblings, 1 reply; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-16 18:09 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Greg Kroah-Hartman, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Steven Rostedt, Paul McKenney, Linus Torvalds, Alan Stern

On Thu, Jan 16, 2014 at 06:43:48PM +0100, Oleg Nesterov wrote:
> But with or without this change the following code
> 
> 		static DEFINE_MUTEX(m1);
> 		static DEFINE_MUTEX(mx);
> 
> 		lockdep_set_novalidate_class(&mx);
> 
> 		// m1 -> mx
> 		mutex_lock(&m1);
> 		mutex_lock(&mx);
> 		mutex_unlock(&mx);
> 		mutex_unlock(&m1);
> 
> 		// mx -> m1 ; should trigger the warning ???
> 		mutex_lock(&mx);
> 		mutex_lock(&m1);
> 		mutex_unlock(&m1);
> 		mutex_unlock(&mx);
> 
> doesn't trigger the warning either. This is correct because
> lockdep_set_novalidate_class() means, well, no-validate.
> The question is: do we really want to avoid all validations?

Good question.

> Why was lockdep_set_novalidate_class() added? Unless I missed
> something, the problem is that (say) __driver_attach() can take
> the "same" lock twice, and drivers/base/ lacks annotations.

Indeed, the driver model locking always slips my mind but yes it's
creative. Alan Stern seems to have a good grasp on it though.

> Perhaps we should change the meaning of lockdep_set_novalidate_class?
> (perhaps with rename). What do you think about the patch below?
> 
> With this patch __lockdep_no_validate__ means "automatically nested",

Yes, I suppose that might work, it would allow some validation.

> although I have to remind you that I can hardly understand the code I am
> trying to change ;)

You don't seem to be doing too badly ;-)


* Re: check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks)
  2014-01-16 18:09             ` Peter Zijlstra
@ 2014-01-16 20:26               ` Alan Stern
  2014-01-17 16:31                 ` Oleg Nesterov
  0 siblings, 1 reply; 50+ messages in thread
From: Alan Stern @ 2014-01-16 20:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Greg Kroah-Hartman, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Steven Rostedt, Paul McKenney, Linus Torvalds

On Thu, 16 Jan 2014, Peter Zijlstra wrote:

> On Thu, Jan 16, 2014 at 06:43:48PM +0100, Oleg Nesterov wrote:
> > But with or without this change the following code
> > 
> > 		static DEFINE_MUTEX(m1);
> > 		static DEFINE_MUTEX(mx);
> > 
> > 		lockdep_set_novalidate_class(&mx);
> > 
> > 		// m1 -> mx
> > 		mutex_lock(&m1);
> > 		mutex_lock(&mx);
> > 		mutex_unlock(&mx);
> > 		mutex_unlock(&m1);
> > 
> > 		// mx -> m1 ; should trigger the warning ???
> > 		mutex_lock(&mx);
> > 		mutex_lock(&m1);
> > 		mutex_unlock(&m1);
> > 		mutex_unlock(&mx);
> > 
> > doesn't trigger the warning either. This is correct because
> > lockdep_set_novalidate_class() means, well, no-validate.
> > The question is: do we really want to avoid all validations?
> 
> Good question.
> 
> > Why was lockdep_set_novalidate_class() added? Unless I missed
> > something, the problem is that (say) __driver_attach() can take
> > the "same" lock twice, and drivers/base/ lacks annotations.
> 
> Indeed, the driver model locking always slips my mind but yes it's
> creative. Alan Stern seems to have a good grasp on it though.

Mostly my "grasp" is just a firm belief that trying to manage locking
throughout the entire driver tree is hopeless.  Individual sub-portions
of it are usually well behaved, but the thing as a whole is a mess.

> > Perhaps we should change the meaning of lockdep_set_novalidate_class?
> > (perhaps with rename). What do you think about the patch below?
> > 
> > With this patch __lockdep_no_validate__ means "automatically nested",
> 
> Yes, I suppose that might work, it would allow some validation.

I haven't seen the patch, but I'm not so sure it will work.  Suppose we
have two devices, D1 and D2, and some other mutex, M.  Then the locking
pattern:

	lock(D1);
	lock(M);
	unlock(M);
	unlock(D1);

generally should not conflict with:

	lock(M);
	lock(D2);
	unlock(D2);
	unlock(M);

even though D1's and D2's locks belong to the same class.  For example,
M might be a mutex embedded in the private data associated with D1, and
D2 might be a child of D1.
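
In driver-core terms the two paths would be something like (sketch;
device_lock() is just mutex_lock(&dev->mutex)):

	static void path_one(struct device *d1, struct mutex *m)
	{
		device_lock(d1);	/* D1 */
		mutex_lock(m);		/* M, e.g. in D1's private data */
		mutex_unlock(m);
		device_unlock(d1);
	}

	static void path_two(struct device *d2, struct mutex *m)
	{
		mutex_lock(m);		/* M */
		device_lock(d2);	/* D2, e.g. a child of D1 */
		device_unlock(d2);
		mutex_unlock(m);
	}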

Alan Stern



* Re: check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks)
  2014-01-16 20:26               ` Alan Stern
@ 2014-01-17 16:31                 ` Oleg Nesterov
  2014-01-17 18:01                   ` Alan Stern
  0 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-17 16:31 UTC (permalink / raw)
  To: Alan Stern
  Cc: Peter Zijlstra, Greg Kroah-Hartman, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Steven Rostedt, Paul McKenney, Linus Torvalds

On 01/16, Alan Stern wrote:
>
> On Thu, 16 Jan 2014, Peter Zijlstra wrote:
>
> > On Thu, Jan 16, 2014 at 06:43:48PM +0100, Oleg Nesterov wrote:
>
> > > Perhaps we should change the meaning of lockdep_set_novalidate_class?
> > > (perhaps with rename). What do you think about the patch below?
> > >
> > > With this patch __lockdep_no_validate__ means "automatically nested",
> >
> > Yes, I suppose that might work, it would allow some validation.
>
> I haven't seen the patch, but I'm not so sure it will work.  Suppose we
> have two devices, D1 and D2, and some other mutex, M.  Then the locking
> pattern:
>
> 	lock(D1);
> 	lock(M);
> 	unlock(M);
> 	unlock(D1);
>
> generally should not conflict with:
>
> 	lock(M);
> 	lock(D2);
> 	unlock(D2);
> 	unlock(M);

Yes, sure. This change assumes that the only problem in drivers/base is
dev->parent->mutex / dev->mutex dependency. If the locking is even more
"broken" (wrt lockdep), we can't replace lockdep_set_novalidate_class()
with lockdep_set_auto_nested().

And, otoh, with this change lockdep can miss the real problems too, for
example:

	func1(dev)
	{
		device_lock(dev->parent);
		mutex_lock(MUTEX);
		device_lock(dev);
		...
	}

	func2(dev)
	{
		device_lock(dev);
		mutex_lock(MUTEX);
		...
	}

lockdep will only notice dev -> MUTEX dependency.

I booted the kernel (under kvm) with this change and there is nothing
in dmesg, but of course this is not real testing.

So do you think that dev->mutex should not be validated at all?


Just in case... Of course, if we actually add auto_nested we should not
use a single class unless dev->mutex is the only user.

Oleg.



* Re: check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks)
  2014-01-17 16:31                 ` Oleg Nesterov
@ 2014-01-17 18:01                   ` Alan Stern
  2014-01-20 18:19                     ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Oleg Nesterov
  0 siblings, 1 reply; 50+ messages in thread
From: Alan Stern @ 2014-01-17 18:01 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Greg Kroah-Hartman, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Steven Rostedt, Paul McKenney, Linus Torvalds

On Fri, 17 Jan 2014, Oleg Nesterov wrote:

> On 01/16, Alan Stern wrote:
> >
> > On Thu, 16 Jan 2014, Peter Zijlstra wrote:
> >
> > > On Thu, Jan 16, 2014 at 06:43:48PM +0100, Oleg Nesterov wrote:
> >
> > > > Perhaps we should change the meaning of lockdep_set_novalidate_class?
> > > > (perhaps with rename). What do you think about the patch below?
> > > >
> > > > With this patch __lockdep_no_validate__ means "automatically nested",
> > >
> > > Yes, I suppose that might work, it would allow some validation.
> >
> > I haven't seen the patch, but I'm not so sure it will work.  Suppose we
> > have two devices, D1 and D2, and some other mutex, M.  Then the locking
> > pattern:
> >
> > 	lock(D1);
> > 	lock(M);
> > 	unlock(M);
> > 	unlock(D1);
> >
> > generally should not conflict with:
> >
> > 	lock(M);
> > 	lock(D2);
> > 	unlock(D2);
> > 	unlock(M);
> 
> Yes, sure. This change assumes that the only problem in drivers/base is
> dev->parent->mutex / dev->mutex dependency. If the locking is even more
> "broken" (wrt lockdep), we can't replace lockdep_set_novalidate_class()
> with lockdep_set_auto_nested().

I suspect it is even more "broken".  But I can't point to specific 
examples.

> And, otoh, with this change lockdep can miss the real problems too, for
> example:
> 
> 	func1(dev)
> 	{
> 		device_lock(dev->parent);
> 		mutex_lock(MUTEX);
> 		device_lock(dev);
> 		...
> 	}
> 
> 	func2(dev)
> 	{
> 		device_lock(dev);
> 		mutex_lock(MUTEX);
> 		...
> 	}
> 
> lockdep will only notice dev -> MUTEX dependency.
> 
> I booted the kernel (under kvm) with this change and there is nothing
> in dmesg, but of course this is not real testing.
> 
> So do you think that dev->mutex should not be validated at all?

My guess is that if your change is deployed widely, there will be 
reports of violations.  That's only a guess.

Still, you could go ahead and try it, just to see what happens.

> Just in case... Of course, if we actually add auto_nested we should not
> use a single class unless dev->mutex is the only user.

That's a good point.  I don't know of any other classes using 
LOCKDEP_NO_VALIDATE, but there may be one or two.

Also, take a look at commit 356c05d58af0.  It's a similar situation 
(not exactly the same).

Alan Stern



* [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate)
  2014-01-17 18:01                   ` Alan Stern
@ 2014-01-20 18:19                     ` Oleg Nesterov
  2014-01-20 18:20                       ` [PATCH 1/5] lockdep: make held_lock->check and "int check" argument bool Oleg Nesterov
                                         ` (5 more replies)
  0 siblings, 6 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-20 18:19 UTC (permalink / raw)
  To: Alan Stern, Peter Zijlstra
  Cc: Greg Kroah-Hartman, linux-kernel, Ingo Molnar, Thomas Gleixner,
	Steven Rostedt, Paul McKenney, Linus Torvalds, Dave Jones

On 01/17, Alan Stern wrote:
>
> On Fri, 17 Jan 2014, Oleg Nesterov wrote:
>
> > Yes, sure. This change assumes that the only problem in drivers/base is
> > dev->parent->mutex / dev->mutex dependency. If the locking is even more
> > "broken" (wrt lockdep), we can't replace lockdep_set_novalidate_class()
> > with lockdep_set_auto_nested().
>
> I suspect it is even more "broken".  But I can't point to specific
> examples.
>
> ...
>
> My guess is that if your change is deployed widely, there will be
> reports of violations.  That's only a guess.

OK, let's (try to) do this later. Let me send the changes which I hope
should be fine in any case.

> Still, you could go ahead and try it, just to see what happens.

Yes, perhaps it makes sense at least to test this change and see what
happens... We will see.

> Also, take a look at commit 356c05d58af0.  It's a similar situation
> (not exactly the same).

At first glance, can't __ATTR_IGNORE_LOCKDEP() use no_validate too?
(ignoring the fact that checkpatch.pl won't be happy). This can simplify
the code, it seems.

Oleg.



* [PATCH 1/5] lockdep: make held_lock->check and "int check" argument bool
  2014-01-20 18:19                     ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Oleg Nesterov
@ 2014-01-20 18:20                       ` Oleg Nesterov
  2014-02-10 13:32                         ` [tip:core/locking] lockdep: Make " tip-bot for Oleg Nesterov
  2014-01-20 18:20                       ` [PATCH 2/5] lockdep: don't create the wrong dependency on hlock->check == 0 Oleg Nesterov
                                         ` (4 subsequent siblings)
  5 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-20 18:20 UTC (permalink / raw)
  To: Alan Stern, Peter Zijlstra
  Cc: Dave Jones, Greg Kroah-Hartman, Ingo Molnar, Linus Torvalds,
	Paul McKenney, Steven Rostedt, Thomas Gleixner, linux-kernel

The "int check" argument of lock_acquire() and held_lock->check are
misleading. This is actually a boolean: 2 means "true", everything
else is "false".

And there is no need to pass 1 or 0 to lock_acquire() depending on
CONFIG_PROVE_LOCKING; __lock_acquire() checks prove_locking at the
start and clears "check" if !CONFIG_PROVE_LOCKING.

Note: probably we can simply kill this member/arg. The only explicit
user of check => 0 is rcu_lock_acquire(); perhaps we can change it to
use lock_acquire(trylock => 1, read => 2). __lockdep_no_validate__ means
check => 0 implicitly, but we can change validate_chain() to check
hlock->instance->key instead. Not to mention it would be nice to get
rid of lockdep_set_novalidate_class().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 drivers/tty/tty_ldsem.c  |   15 ++++-----------
 include/linux/lockdep.h  |   25 +++++++++----------------
 include/linux/rcupdate.h |    2 +-
 kernel/locking/lockdep.c |   11 ++++-------
 4 files changed, 18 insertions(+), 35 deletions(-)

diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
index 22fad8a..9e9b0a9 100644
--- a/drivers/tty/tty_ldsem.c
+++ b/drivers/tty/tty_ldsem.c
@@ -39,17 +39,10 @@
 				lock_acquire(&(l)->dep_map, s, t, r, c, n, i)
 # define __rel(l, n, i)				\
 				lock_release(&(l)->dep_map, n, i)
-# ifdef CONFIG_PROVE_LOCKING
-#  define lockdep_acquire(l, s, t, i)		__acq(l, s, t, 0, 2, NULL, i)
-#  define lockdep_acquire_nest(l, s, t, n, i)	__acq(l, s, t, 0, 2, n, i)
-#  define lockdep_acquire_read(l, s, t, i)	__acq(l, s, t, 1, 2, NULL, i)
-#  define lockdep_release(l, n, i)		__rel(l, n, i)
-# else
-#  define lockdep_acquire(l, s, t, i)		__acq(l, s, t, 0, 1, NULL, i)
-#  define lockdep_acquire_nest(l, s, t, n, i)	__acq(l, s, t, 0, 1, n, i)
-#  define lockdep_acquire_read(l, s, t, i)	__acq(l, s, t, 1, 1, NULL, i)
-#  define lockdep_release(l, n, i)		__rel(l, n, i)
-# endif
+#define lockdep_acquire(l, s, t, i)		__acq(l, s, t, 0, 1, NULL, i)
+#define lockdep_acquire_nest(l, s, t, n, i)	__acq(l, s, t, 0, 1, n, i)
+#define lockdep_acquire_read(l, s, t, i)	__acq(l, s, t, 1, 1, NULL, i)
+#define lockdep_release(l, n, i)		__rel(l, n, i)
 #else
 # define lockdep_acquire(l, s, t, i)		do { } while (0)
 # define lockdep_acquire_nest(l, s, t, n, i)	do { } while (0)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 92b1bfc..1626047 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -252,9 +252,9 @@ struct held_lock {
 	unsigned int trylock:1;						/* 16 bits */
 
 	unsigned int read:2;        /* see lock_acquire() comment */
-	unsigned int check:2;       /* see lock_acquire() comment */
+	unsigned int check:1;       /* see lock_acquire() comment */
 	unsigned int hardirqs_off:1;
-	unsigned int references:11;					/* 32 bits */
+	unsigned int references:12;					/* 32 bits */
 };
 
 /*
@@ -326,9 +326,8 @@ static inline int lockdep_match_key(struct lockdep_map *lock,
  *
  * Values for check:
  *
- *   0: disabled
- *   1: simple checks (freeing, held-at-exit-time, etc.)
- *   2: full validation
+ *   0: simple checks (freeing, held-at-exit-time, etc.)
+ *   1: full validation
  */
 extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 			 int trylock, int read, int check,
@@ -479,15 +478,9 @@ static inline void print_irqtrace_events(struct task_struct *curr)
  * on the per lock-class debug mode:
  */
 
-#ifdef CONFIG_PROVE_LOCKING
- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 2, n, i)
- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 2, n, i)
- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 2, n, i)
-#else
- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 1, n, i)
- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 1, n, i)
- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 1, n, i)
-#endif
+#define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 1, n, i)
+#define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 1, n, i)
+#define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 1, n, i)
 
 #define spin_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
 #define spin_acquire_nest(l, s, t, n, i)	lock_acquire_exclusive(l, s, t, n, i)
@@ -518,13 +511,13 @@ static inline void print_irqtrace_events(struct task_struct *curr)
 # define might_lock(lock) 						\
 do {									\
 	typecheck(struct lockdep_map *, &(lock)->dep_map);		\
-	lock_acquire(&(lock)->dep_map, 0, 0, 0, 2, NULL, _THIS_IP_);	\
+	lock_acquire(&(lock)->dep_map, 0, 0, 0, 1, NULL, _THIS_IP_);	\
 	lock_release(&(lock)->dep_map, 0, _THIS_IP_);			\
 } while (0)
 # define might_lock_read(lock) 						\
 do {									\
 	typecheck(struct lockdep_map *, &(lock)->dep_map);		\
-	lock_acquire(&(lock)->dep_map, 0, 0, 1, 2, NULL, _THIS_IP_);	\
+	lock_acquire(&(lock)->dep_map, 0, 0, 1, 1, NULL, _THIS_IP_);	\
 	lock_release(&(lock)->dep_map, 0, _THIS_IP_);			\
 } while (0)
 #else
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index a2482cf..2eef290 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -314,7 +314,7 @@ static inline bool rcu_lockdep_current_cpu_online(void)
 
 static inline void rcu_lock_acquire(struct lockdep_map *map)
 {
-	lock_acquire(map, 0, 0, 2, 1, NULL, _THIS_IP_);
+	lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_);
 }
 
 static inline void rcu_lock_release(struct lockdep_map *map)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 576ba75..c6a7d9d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2096,7 +2096,7 @@ static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
 	 * (If lookup_chain_cache() returns with 1 it acquires
 	 * graph_lock for us)
 	 */
-	if (!hlock->trylock && (hlock->check == 2) &&
+	if (!hlock->trylock && hlock->check &&
 	    lookup_chain_cache(curr, hlock, chain_key)) {
 		/*
 		 * Check whether last held lock:
@@ -3053,9 +3053,6 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	int class_idx;
 	u64 chain_key;
 
-	if (!prove_locking)
-		check = 1;
-
 	if (unlikely(!debug_locks))
 		return 0;
 
@@ -3067,8 +3064,8 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
 		return 0;
 
-	if (lock->key == &__lockdep_no_validate__)
-		check = 1;
+	if (!prove_locking || lock->key == &__lockdep_no_validate__)
+		check = 0;
 
 	if (subclass < NR_LOCKDEP_CACHING_CLASSES)
 		class = lock->class_cache[subclass];
@@ -3136,7 +3133,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	hlock->holdtime_stamp = lockstat_clock();
 #endif
 
-	if (check == 2 && !mark_irqflags(curr, hlock))
+	if (check && !mark_irqflags(curr, hlock))
 		return 0;
 
 	/* mark it as used: */
-- 
1.5.5.1



* [PATCH 2/5] lockdep: don't create the wrong dependency on hlock->check == 0
  2014-01-20 18:19                     ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Oleg Nesterov
  2014-01-20 18:20                       ` [PATCH 1/5] lockdep: make held_lock->check and "int check" argument bool Oleg Nesterov
@ 2014-01-20 18:20                       ` Oleg Nesterov
  2014-02-10 13:33                         ` [tip:core/locking] lockdep: Don' t " tip-bot for Oleg Nesterov
  2014-01-20 18:20                       ` [PATCH 3/5] lockdep: change mark_held_locks() to check hlock->check instead of lockdep_no_validate Oleg Nesterov
                                         ` (3 subsequent siblings)
  5 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-20 18:20 UTC (permalink / raw)
  To: Alan Stern, Peter Zijlstra
  Cc: Dave Jones, Greg Kroah-Hartman, Ingo Molnar, Linus Torvalds,
	Paul McKenney, Steven Rostedt, Thomas Gleixner, linux-kernel

Test-case:

	DEFINE_MUTEX(m1);
	DEFINE_MUTEX(m2);
	DEFINE_MUTEX(mx);

	void lockdep_should_complain(void)
	{
		lockdep_set_novalidate_class(&mx);

		// m1 -> mx -> m2
		mutex_lock(&m1);
		mutex_lock(&mx);
		mutex_lock(&m2);
		mutex_unlock(&m2);
		mutex_unlock(&mx);
		mutex_unlock(&m1);

		// m2 -> m1 ; should trigger the warning
		mutex_lock(&m2);
		mutex_lock(&m1);
		mutex_unlock(&m1);
		mutex_unlock(&m2);
	}

this doesn't trigger any warning; lockdep can't detect the trivial
deadlock.

This is because lock(&mx) correctly avoids the m1 -> mx dependency (it
skips validate_chain() due to mx->check == 0), but lock(&m2) wrongly
adds mx -> m2 and thus m1 -> m2 is not created.

rcu_lock_acquire()->lock_acquire(check => 0) is fine due to read == 2,
so currently only __lockdep_no_validate__ can trigger this problem.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/locking/lockdep.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c6a7d9d..543e120 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1934,12 +1934,12 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 
 	for (;;) {
 		int distance = curr->lockdep_depth - depth + 1;
-		hlock = curr->held_locks + depth-1;
+		hlock = curr->held_locks + depth - 1;
 		/*
 		 * Only non-recursive-read entries get new dependencies
 		 * added:
 		 */
-		if (hlock->read != 2) {
+		if (hlock->read != 2 && hlock->check) {
 			if (!check_prev_add(curr, hlock, next,
 						distance, trylock_loop))
 				return 0;
-- 
1.5.5.1



* [PATCH 3/5] lockdep: change mark_held_locks() to check hlock->check instead of lockdep_no_validate
  2014-01-20 18:19                     ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Oleg Nesterov
  2014-01-20 18:20                       ` [PATCH 1/5] lockdep: make held_lock->check and "int check" argument bool Oleg Nesterov
  2014-01-20 18:20                       ` [PATCH 2/5] lockdep: don't create the wrong dependency on hlock->check == 0 Oleg Nesterov
@ 2014-01-20 18:20                       ` Oleg Nesterov
  2014-02-10 13:33                         ` [tip:core/locking] lockdep: Change " tip-bot for Oleg Nesterov
  2014-01-20 18:20                       ` [PATCH 4/5] lockdep: change lockdep_set_novalidate_class() to use _and_name Oleg Nesterov
                                         ` (2 subsequent siblings)
  5 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-20 18:20 UTC (permalink / raw)
  To: Alan Stern, Peter Zijlstra
  Cc: Dave Jones, Greg Kroah-Hartman, Ingo Molnar, Linus Torvalds,
	Paul McKenney, Steven Rostedt, Thomas Gleixner, linux-kernel

The __lockdep_no_validate check in mark_held_locks() adds the subtle
and (afaics) unnecessary difference between no-validate and check==0.
And this looks even more inconsistent because __lock_acquire() skips
mark_irqflags()->mark_lock() if !check.

Change mark_held_locks() to check hlock->check instead.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 kernel/locking/lockdep.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 543e120..28e41c0 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2515,7 +2515,7 @@ mark_held_locks(struct task_struct *curr, enum mark_type mark)
 
 		BUG_ON(usage_bit >= LOCK_USAGE_STATES);
 
-		if (hlock_class(hlock)->key == __lockdep_no_validate__.subkeys)
+		if (!hlock->check)
 			continue;
 
 		if (!mark_lock(curr, hlock, usage_bit))
-- 
1.5.5.1



* [PATCH 4/5] lockdep: change lockdep_set_novalidate_class() to use _and_name
  2014-01-20 18:19                     ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Oleg Nesterov
                                         ` (2 preceding siblings ...)
  2014-01-20 18:20                       ` [PATCH 3/5] lockdep: change mark_held_locks() to check hlock->check instead of lockdep_no_validate Oleg Nesterov
@ 2014-01-20 18:20                       ` Oleg Nesterov
  2014-02-10 13:33                         ` [tip:core/locking] lockdep: Change " tip-bot for Oleg Nesterov
  2014-01-20 18:20                       ` [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument Oleg Nesterov
  2014-01-20 18:37                       ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Alan Stern
  5 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-20 18:20 UTC (permalink / raw)
  To: Alan Stern, Peter Zijlstra
  Cc: Dave Jones, Greg Kroah-Hartman, Ingo Molnar, Linus Torvalds,
	Paul McKenney, Steven Rostedt, Thomas Gleixner, linux-kernel

Cosmetic. This doesn't really matter because a) device->mutex is
the only user of __lockdep_no_validate__ and b) this class should
never be reported as the source of a problem, but if something goes
wrong "&dev->mutex" looks better than "&__lockdep_no_validate__"
as the name of the lock.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/lockdep.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 1626047..060e513 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -303,7 +303,7 @@ extern void lockdep_init_map(struct lockdep_map *lock, const char *name,
 				 (lock)->dep_map.key, sub)
 
 #define lockdep_set_novalidate_class(lock) \
-	lockdep_set_class(lock, &__lockdep_no_validate__)
+	lockdep_set_class_and_name(lock, &__lockdep_no_validate__, #lock)
 /*
  * Compare locking classes
  */
-- 
1.5.5.1



* [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument
  2014-01-20 18:19                     ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Oleg Nesterov
                                         ` (3 preceding siblings ...)
  2014-01-20 18:20                       ` [PATCH 4/5] lockdep: change lockdep_set_novalidate_class() to use _and_name Oleg Nesterov
@ 2014-01-20 18:20                       ` Oleg Nesterov
  2014-01-21 14:10                         ` Peter Zijlstra
  2014-01-20 18:37                       ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Alan Stern
  5 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-20 18:20 UTC (permalink / raw)
  To: Alan Stern, Peter Zijlstra
  Cc: Dave Jones, Greg Kroah-Hartman, Ingo Molnar, Linus Torvalds,
	Paul McKenney, Steven Rostedt, Thomas Gleixner, linux-kernel

lock_acquire() takes 7 arguments; we can pack 4 "short" ones into
a single "unsigned long acqf" to help the callers, which mostly
use constant values for subclass/trylock/read/check:

		$ size vmlinux
		   text	   data	    bss	    dec	    hex	filename
	-	5262832	2970280	10125312	18358424	1182098	vmlinux
	+	5255131	2974376	10125312	18354819	1181283	vmlinux

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/lockdep.h  |   29 ++++++++++++++++++++++++++---
 kernel/locking/lockdep.c |   17 +++++++++++++----
 2 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 060e513..23a8af3 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -329,9 +329,32 @@ static inline int lockdep_match_key(struct lockdep_map *lock,
  *   0: simple checks (freeing, held-at-exit-time, etc.)
  *   1: full validation
  */
-extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
-			 int trylock, int read, int check,
-			 struct lockdep_map *nest_lock, unsigned long ip);
+#define ACQF_SUBCLASS_SHIFT	0
+#define ACQF_SUBCLASS_BITS	3
+
+#define ACQF_TRYLOCK_SHIFT	(ACQF_SUBCLASS_SHIFT + ACQF_SUBCLASS_BITS)
+#define ACQF_TRYLOCK_BITS	1
+
+#define ACQF_READ_SHIFT		(ACQF_TRYLOCK_SHIFT + ACQF_TRYLOCK_BITS)
+#define ACQF_READ_BITS		2
+
+#define ACQF_CHECK_SHIFT	(ACQF_READ_SHIFT + ACQF_READ_BITS)
+#define ACQF_CHECK_BITS		1
+
+extern void do_lock_acquire(struct lockdep_map *lock, unsigned long acqf,
+			struct lockdep_map *nest_lock, unsigned long ip);
+
+static inline void lock_acquire(struct lockdep_map *lock,
+		unsigned int subclass, int trylock, int read, int check,
+		struct lockdep_map *nest_lock, unsigned long ip)
+{
+	unsigned long acqf =	(subclass	<< ACQF_SUBCLASS_SHIFT) |
+				(trylock	<< ACQF_TRYLOCK_SHIFT) |
+				(read		<< ACQF_READ_SHIFT) |
+				(check		<< ACQF_CHECK_SHIFT);
+
+	do_lock_acquire(lock, acqf, nest_lock, ip);
+}
 
 extern void lock_release(struct lockdep_map *lock, int nested,
 			 unsigned long ip);
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 28e41c0..8826557 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3581,10 +3581,19 @@ EXPORT_SYMBOL_GPL(lock_set_class);
  * We are not always called with irqs disabled - do that here,
  * and also avoid lockdep recursion:
  */
-void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
-			  int trylock, int read, int check,
-			  struct lockdep_map *nest_lock, unsigned long ip)
+void do_lock_acquire(struct lockdep_map *lock, unsigned long acqf,
+			struct lockdep_map *nest_lock, unsigned long ip)
 {
+	#define ACQF(bits, what)	\
+		((bits >> ACQF_ ## what ## _SHIFT) &	\
+		 ((1 << ACQF_ ## what ## _BITS) - 1))
+
+	unsigned int subclass = ACQF(acqf, SUBCLASS);
+	unsigned int trylock = ACQF(acqf, TRYLOCK);
+	unsigned int read = ACQF(acqf, READ);
+	unsigned int check = ACQF(acqf, CHECK);
+	#undef ACQF
+
 	unsigned long flags;
 
 	if (unlikely(current->lockdep_recursion))
@@ -3600,7 +3609,7 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	current->lockdep_recursion = 0;
 	raw_local_irq_restore(flags);
 }
-EXPORT_SYMBOL_GPL(lock_acquire);
+EXPORT_SYMBOL_GPL(do_lock_acquire);
 
 void lock_release(struct lockdep_map *lock, int nested,
 			  unsigned long ip)
-- 
1.5.5.1



* Re: [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate)
  2014-01-20 18:19                     ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Oleg Nesterov
                                         ` (4 preceding siblings ...)
  2014-01-20 18:20                       ` [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument Oleg Nesterov
@ 2014-01-20 18:37                       ` Alan Stern
  2014-01-20 18:54                         ` Oleg Nesterov
  5 siblings, 1 reply; 50+ messages in thread
From: Alan Stern @ 2014-01-20 18:37 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Greg Kroah-Hartman, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Steven Rostedt, Paul McKenney, Linus Torvalds,
	Dave Jones

On Mon, 20 Jan 2014, Oleg Nesterov wrote:

> On 01/17, Alan Stern wrote:
> >
> > On Fri, 17 Jan 2014, Oleg Nesterov wrote:
> >
> > > Yes, sure. This change assumes that the only problem in drivers/base is
> > > dev->parent->mutex / dev->mutex dependency. If the locking is even more
> > > "broken" (wrt lockdep), we can't replace lockdep_set_novalidate_class()
> > > with lockdep_set_auto_nested().
> >
> > I suspect it is even more "broken".  But I can't point to specific
> > examples.
> >
> > ...
> >
> > My guess is that if your change is deployed widely, there will be
> > reports of violations.  That's only a guess.
> 
> OK, let's (try to) do this later. Let me send the changes which I hope
> should be fine in any case.
> 
> > Still, you could go ahead and try it, just to see what happens.
> 
> Yes, perhaps it makes sense at least to test this change and see what
> happens... We will see.
> 
> > Also, take a look at commit 356c05d58af0.  It's a similar situation
> > (not exactly the same).
> 
> At first glance, can't __ATTR_IGNORE_LOCKDEP() use no_validate too?
> (ignoring the fact that checkpatch.pl won't be happy). This can simplify
> the code, it seems.

Well, the macro itself doesn't specify the lockdep class.  That happens 
implicitly in sysfs_get_active(), in the call to rwsem_acquire_read().
However, it ought to be possible to change the code so that when 
ignore_lockdep(sd) returns nonzero, we end up using no_validate.
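
The relevant part is something like (from memory, details may be off):

	struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd)
	{
		if (unlikely(!sd))
			return NULL;

		if (!atomic_inc_unless_negative(&sd->s_active))
			return NULL;

		/* this is where the s_active "class" is implied */
		if (likely(!sysfs_ignore_lockdep(sd)))
			rwsem_acquire_read(&sd->dep_map, 0, 1, _RET_IP_);
		return sd;
	}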

Alan Stern



* Re: [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate)
  2014-01-20 18:37                       ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Alan Stern
@ 2014-01-20 18:54                         ` Oleg Nesterov
  2014-01-20 21:42                           ` Alan Stern
  0 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-20 18:54 UTC (permalink / raw)
  To: Alan Stern
  Cc: Peter Zijlstra, Greg Kroah-Hartman, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Steven Rostedt, Paul McKenney, Linus Torvalds,
	Dave Jones

On 01/20, Alan Stern wrote:
>
> On Mon, 20 Jan 2014, Oleg Nesterov wrote:
>
> > On 01/17, Alan Stern wrote:
> > >
> > > Also, take a look at commit 356c05d58af0.  It's a similar situation
> > > (not exactly the same).
> >
> > At first glance, can't __ATTR_IGNORE_LOCKDEP() use no_validate too?
> > (ignoring the fact that checkpatch.pl won't be happy). This can simplify
> > the code, it seems.
>
> Well, the macro itself doesn't specify the lockdep class.  That happens
> implicitly in sysfs_get_active(), in the call to rwsem_acquire_read().
> However, it ought to be possible to change the code so that when
> ignore_lockdep(sd) returns nonzero, we end up using no_validate.

sysfs_dirent_init_lockdep() can check ->ignore_lockdep and do
lockdep_set_novalidate_class(). This way sysfs_ignore_lockdep() can
go away.

I guess we could even change __ATTR_IGNORE_LOCKDEP() to initialize
->key = __lockdep_no_validate__ and kill ->ignore_lockdep.
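
Something like this for the first variant, as a rough sketch (exact
field and macro names as in the current fs/sysfs code, from memory):

	#define sysfs_dirent_init_lockdep(sd)				\
	do {								\
		struct attribute *attr = (sd)->s_attr.attr;		\
		struct lock_class_key *key = attr->key ?: &attr->skey;	\
									\
		if (attr->ignore_lockdep)				\
			key = &__lockdep_no_validate__;			\
									\
		lockdep_init_map(&(sd)->dep_map, "s_active", key, 0);	\
	} while (0)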

Oleg.



* Re: [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate)
  2014-01-20 18:54                         ` Oleg Nesterov
@ 2014-01-20 21:42                           ` Alan Stern
  0 siblings, 0 replies; 50+ messages in thread
From: Alan Stern @ 2014-01-20 21:42 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Peter Zijlstra, Greg Kroah-Hartman, linux-kernel, Ingo Molnar,
	Thomas Gleixner, Steven Rostedt, Paul McKenney, Linus Torvalds,
	Dave Jones

On Mon, 20 Jan 2014, Oleg Nesterov wrote:

> > > At first glance, can't __ATTR_IGNORE_LOCKDEP() use no_validate too?
> > > (ignoring the fact that checkpatch.pl won't be happy). This can simplify
> > > the code, it seems.
> >
> > Well, the macro itself doesn't specify the lockdep class.  That happens
> > implicitly in sysfs_get_active(), in the call to rwsem_acquire_read().
> > However, it ought to be possible to change the code so that when
> > ignore_lockdep(sd) returns nonzero, we end up using no_validate.
> 
> sysfs_dirent_init_lockdep() can check ->ignore_lockdep and do
> lockdep_set_novalidate_class(). This way sysfs_ignore_lockdep() can
> go away.
> 
> I guess we could even change __ATTR_IGNORE_LOCKDEP() to initialize
> ->key = __lockdep_no_validate__ and kill ->ignore_lockdep.

It's clear that you have a more thorough understanding of how sysfs and
lockdep work than I do.  :-)

This suggestion sounds quite reasonable.

Alan Stern



* Re: [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument
  2014-01-20 18:20                       ` [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument Oleg Nesterov
@ 2014-01-21 14:10                         ` Peter Zijlstra
  2014-01-21 17:27                           ` Oleg Nesterov
  2014-01-21 17:35                           ` Dave Jones
  0 siblings, 2 replies; 50+ messages in thread
From: Peter Zijlstra @ 2014-01-21 14:10 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Stern, Dave Jones, Greg Kroah-Hartman, Ingo Molnar,
	Linus Torvalds, Paul McKenney, Steven Rostedt, Thomas Gleixner,
	linux-kernel


I tried the below but failed to see my vmlinux shrink; maybe I'm just not
building the right kernel, or otherwise GCC is stupid.

---
 include/linux/lockdep.h  |   33 ++++++++++++++++++++++++++++++---
 kernel/locking/lockdep.c |   11 +++++++----
 2 files changed, 32 insertions(+), 7 deletions(-)

--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -329,9 +329,31 @@ static inline int lockdep_match_key(stru
  *   0: simple checks (freeing, held-at-exit-time, etc.)
  *   1: full validation
  */
-extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
-			 int trylock, int read, int check,
-			 struct lockdep_map *nest_lock, unsigned long ip);
+struct lockdep_acquire_flags {
+	unsigned long subclass : 3;
+	unsigned long trylock  : 1;
+	unsigned long read     : 2;
+	unsigned long check    : 1;
+};
+
+extern void do_lock_acquire(struct lockdep_map *lock,
+			    struct lockdep_acquire_flags acqf,
+			    struct lockdep_map *nest_lock,
+			    unsigned long ip);
+
+static inline void lock_acquire(struct lockdep_map *lock,
+		unsigned int subclass, int trylock, int read, int check,
+		struct lockdep_map *nest_lock, unsigned long ip)
+{
+	struct lockdep_acquire_flags acqf = {
+		.subclass = subclass,
+		.trylock = trylock,
+		.read = read,
+		.check = check,
+	};
+
+	do_lock_acquire(lock, acqf, nest_lock, ip);
+}
 
 extern void lock_release(struct lockdep_map *lock, int nested,
 			 unsigned long ip);
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3583,10 +3583,13 @@ EXPORT_SYMBOL_GPL(lock_set_class);
  * We are not always called with irqs disabled - do that here,
  * and also avoid lockdep recursion:
  */
-void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
-			  int trylock, int read, int check,
-			  struct lockdep_map *nest_lock, unsigned long ip)
+void do_lock_acquire(struct lockdep_map *lock, struct lockdep_acquire_flags acqf,
+			struct lockdep_map *nest_lock, unsigned long ip)
 {
+	unsigned int subclass = acqf.subclass;
+	unsigned int trylock = acqf.trylock;
+	unsigned int read = acqf.read;
+	unsigned int check = acqf.check;
 	unsigned long flags;
 
 	if (unlikely(current->lockdep_recursion))
@@ -3602,7 +3605,7 @@ void lock_acquire(struct lockdep_map *lo
 	current->lockdep_recursion = 0;
 	raw_local_irq_restore(flags);
 }
-EXPORT_SYMBOL_GPL(lock_acquire);
+EXPORT_SYMBOL_GPL(do_lock_acquire);
 
 void lock_release(struct lockdep_map *lock, int nested,
 			  unsigned long ip)

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument
  2014-01-21 14:10                         ` Peter Zijlstra
@ 2014-01-21 17:27                           ` Oleg Nesterov
  2014-01-21 17:35                           ` Dave Jones
  1 sibling, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-21 17:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alan Stern, Dave Jones, Greg Kroah-Hartman, Ingo Molnar,
	Linus Torvalds, Paul McKenney, Steven Rostedt, Thomas Gleixner,
	linux-kernel

On 01/21, Peter Zijlstra wrote:
>
> I tried the below but failed to see my vmlinux shrink, maybe I'm just not
> building the right kernel, or otherwise GCC is stupid.

...

> +struct lockdep_acquire_flags {
> +	unsigned long subclass : 3;
> +	unsigned long trylock  : 1;
> +	unsigned long read     : 2;
> +	unsigned long check    : 1;
> +};

Yes, I tried this too. I do see the difference, but the result is worse:

	- 5262832
	+ 5256987

Looks like gcc is not smart enough...

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument
  2014-01-21 14:10                         ` Peter Zijlstra
  2014-01-21 17:27                           ` Oleg Nesterov
@ 2014-01-21 17:35                           ` Dave Jones
  2014-01-21 18:43                             ` Oleg Nesterov
  1 sibling, 1 reply; 50+ messages in thread
From: Dave Jones @ 2014-01-21 17:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Alan Stern, Greg Kroah-Hartman, Ingo Molnar,
	Linus Torvalds, Paul McKenney, Steven Rostedt, Thomas Gleixner,
	linux-kernel

On Tue, Jan 21, 2014 at 03:10:22PM +0100, Peter Zijlstra wrote:
 > 
 > I tried the below but failed to see my vmlinux shrink, maybe I'm just not
 > building the right kernel, or otherwise GCC is stupid.
 > 
 > -extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 > -			 int trylock, int read, int check,
 > -			 struct lockdep_map *nest_lock, unsigned long ip);
 > +struct lockdep_acquire_flags {
 > +	unsigned long subclass : 3;
 > +	unsigned long trylock  : 1;
 > +	unsigned long read     : 2;
 > +	unsigned long check    : 1;
 > +};
 
As it's only 7 bits here, could we pack them into a char ?

	Dave


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument
  2014-01-21 17:35                           ` Dave Jones
@ 2014-01-21 18:43                             ` Oleg Nesterov
  2014-01-21 18:53                               ` Steven Rostedt
  2014-01-21 19:39                               ` uninline rcu_lock_acquire/etc ? Oleg Nesterov
  0 siblings, 2 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-21 18:43 UTC (permalink / raw)
  To: Dave Jones, Peter Zijlstra, Alan Stern, Greg Kroah-Hartman,
	Ingo Molnar, Linus Torvalds, Paul McKenney, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On 01/21, Dave Jones wrote:
>
> On Tue, Jan 21, 2014 at 03:10:22PM +0100, Peter Zijlstra wrote:
>  >
>  > I tried the below but failed to see my vmlinux shrink, maybe I'm just not
>  > building the right kernel, or otherwise GCC is stupid.
>  >
>  > -extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>  > -			 int trylock, int read, int check,
>  > -			 struct lockdep_map *nest_lock, unsigned long ip);
>  > +struct lockdep_acquire_flags {
>  > +	unsigned long subclass : 3;
>  > +	unsigned long trylock  : 1;
>  > +	unsigned long read     : 2;
>  > +	unsigned long check    : 1;
>  > +};
>
> As it's only 7 bits here, could we pack them into a char ?

Do you mean __attribute__((packed)) ?
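
If yes, then what I tried was (more or less) just the struct above with
the attribute added:

	struct lockdep_acquire_flags {
		unsigned long subclass : 3;
		unsigned long trylock  : 1;
		unsigned long read     : 2;
		unsigned long check    : 1;
	} __attribute__((packed));	/* 7 bits, so this should fit in a single byte */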

Surprisingly it helps a bit. Still, "size vmlinux" is worse than with
bitmasks.

But I agreed that the code looks simpler with bitfields, so perhaps
this patch is better.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument
  2014-01-21 18:43                             ` Oleg Nesterov
@ 2014-01-21 18:53                               ` Steven Rostedt
  2014-01-21 20:06                                 ` Oleg Nesterov
  2014-01-21 19:39                               ` uninline rcu_lock_acquire/etc ? Oleg Nesterov
  1 sibling, 1 reply; 50+ messages in thread
From: Steven Rostedt @ 2014-01-21 18:53 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Dave Jones, Peter Zijlstra, Alan Stern, Greg Kroah-Hartman,
	Ingo Molnar, Linus Torvalds, Paul McKenney, Thomas Gleixner,
	linux-kernel

On Tue, 21 Jan 2014 19:43:10 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> On 01/21, Dave Jones wrote:
> >
> > On Tue, Jan 21, 2014 at 03:10:22PM +0100, Peter Zijlstra wrote:
> >  >
> >  > I tried the below but failed to see my vmlinux shrink, maybe I'm just not
> >  > building the right kernel, or otherwise GCC is stupid.
> >  >
> >  > -extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> >  > -			 int trylock, int read, int check,
> >  > -			 struct lockdep_map *nest_lock, unsigned long ip);
> >  > +struct lockdep_acquire_flags {
> >  > +	unsigned long subclass : 3;
> >  > +	unsigned long trylock  : 1;
> >  > +	unsigned long read     : 2;
> >  > +	unsigned long check    : 1;
> >  > +};
> >
> > As it's only 7 bits here, could we pack them into a char ?
> 
> Do you mean __attribute__((packed)) ?
> 

I think Dave means:

	unsigned char subclass	: 3;
	unsigned char trylock	: 1;
	unsigned char read	: 2;
	unsigned char check	: 1;

-- Steve

^ permalink raw reply	[flat|nested] 50+ messages in thread

* uninline rcu_lock_acquire/etc ?
  2014-01-21 18:43                             ` Oleg Nesterov
  2014-01-21 18:53                               ` Steven Rostedt
@ 2014-01-21 19:39                               ` Oleg Nesterov
  2014-01-22  3:54                                 ` Paul E. McKenney
  1 sibling, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-21 19:39 UTC (permalink / raw)
  To: Dave Jones, Peter Zijlstra, Alan Stern, Greg Kroah-Hartman,
	Ingo Molnar, Linus Torvalds, Paul McKenney, Steven Rostedt,
	Thomas Gleixner, linux-kernel

On 01/21, Oleg Nesterov wrote:
>
> But I agreed that the code looks simpler with bitfields, so perhaps
> this patch is better.

Besides, I guess the major offender is rcu...

Paul, can't we do something like below? Saves 19.5 kilobytes,

	-       5255131 2974376 10125312        18354819        1181283 vmlinux
	+	5235227 2970344 10125312        18330883        117b503 vmlinux

probably we can also uninline rcu_lockdep_assert()...

Oleg.
---

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 2eef290..58f7a97 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -310,18 +310,34 @@ static inline bool rcu_lockdep_current_cpu_online(void)
 }
 #endif /* #else #if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU) */
 
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-
-static inline void rcu_lock_acquire(struct lockdep_map *map)
+static inline void __rcu_lock_acquire(struct lockdep_map *map, unsigned long ip)
 {
-	lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_);
+	lock_acquire(map, 0, 0, 2, 0, NULL, ip);
 }
 
-static inline void rcu_lock_release(struct lockdep_map *map)
+static inline void __rcu_lock_release(struct lockdep_map *map, unsigned long ip)
 {
 	lock_release(map, 1, _THIS_IP_);
 }
 
+#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PROVE_RCU)
+extern void rcu_lock_acquire(void);
+extern void rcu_lock_release(void);
+extern void rcu_lock_acquire_bh(void);
+extern void rcu_lock_release_bh(void);
+extern void rcu_lock_acquire_sched(void);
+extern void rcu_lock_release_sched(void);
+#else
+#define rcu_lock_acquire()		do { } while (0)
+#define rcu_lock_release()		do { } while (0)
+#define rcu_lock_acquire_bh()		do { } while (0)
+#define rcu_lock_release_bh()		do { } while (0)
+#define rcu_lock_acquire_sched()	do { } while (0)
+#define rcu_lock_release_sched()	do { } while (0)
+#endif
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+
 extern struct lockdep_map rcu_lock_map;
 extern struct lockdep_map rcu_bh_lock_map;
 extern struct lockdep_map rcu_sched_lock_map;
@@ -419,9 +435,6 @@ static inline int rcu_read_lock_sched_held(void)
 
 #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
-# define rcu_lock_acquire(a)		do { } while (0)
-# define rcu_lock_release(a)		do { } while (0)
-
 static inline int rcu_read_lock_held(void)
 {
 	return 1;
@@ -766,11 +779,9 @@ static inline void rcu_preempt_sleep_check(void)
  */
 static inline void rcu_read_lock(void)
 {
-	__rcu_read_lock();
 	__acquire(RCU);
-	rcu_lock_acquire(&rcu_lock_map);
-	rcu_lockdep_assert(rcu_is_watching(),
-			   "rcu_read_lock() used illegally while idle");
+	__rcu_read_lock();
+	rcu_lock_acquire();
 }
 
 /*
@@ -790,11 +801,9 @@ static inline void rcu_read_lock(void)
  */
 static inline void rcu_read_unlock(void)
 {
-	rcu_lockdep_assert(rcu_is_watching(),
-			   "rcu_read_unlock() used illegally while idle");
-	rcu_lock_release(&rcu_lock_map);
-	__release(RCU);
+	rcu_lock_release();
 	__rcu_read_unlock();
+	__release(RCU);
 }
 
 /**
@@ -816,11 +825,9 @@ static inline void rcu_read_unlock(void)
  */
 static inline void rcu_read_lock_bh(void)
 {
-	local_bh_disable();
 	__acquire(RCU_BH);
-	rcu_lock_acquire(&rcu_bh_lock_map);
-	rcu_lockdep_assert(rcu_is_watching(),
-			   "rcu_read_lock_bh() used illegally while idle");
+	local_bh_disable();
+	rcu_lock_acquire_bh();
 }
 
 /*
@@ -830,11 +837,9 @@ static inline void rcu_read_lock_bh(void)
  */
 static inline void rcu_read_unlock_bh(void)
 {
-	rcu_lockdep_assert(rcu_is_watching(),
-			   "rcu_read_unlock_bh() used illegally while idle");
-	rcu_lock_release(&rcu_bh_lock_map);
-	__release(RCU_BH);
+	rcu_lock_release_bh();
 	local_bh_enable();
+	__release(RCU_BH);
 }
 
 /**
@@ -852,9 +857,9 @@ static inline void rcu_read_unlock_bh(void)
  */
 static inline void rcu_read_lock_sched(void)
 {
-	preempt_disable();
 	__acquire(RCU_SCHED);
-	rcu_lock_acquire(&rcu_sched_lock_map);
+	preempt_disable();
+	rcu_lock_acquire_sched();
 	rcu_lockdep_assert(rcu_is_watching(),
 			   "rcu_read_lock_sched() used illegally while idle");
 }
@@ -862,8 +867,8 @@ static inline void rcu_read_lock_sched(void)
 /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
 static inline notrace void rcu_read_lock_sched_notrace(void)
 {
-	preempt_disable_notrace();
 	__acquire(RCU_SCHED);
+	preempt_disable_notrace();
 }
 
 /*
@@ -873,18 +878,16 @@ static inline notrace void rcu_read_lock_sched_notrace(void)
  */
 static inline void rcu_read_unlock_sched(void)
 {
-	rcu_lockdep_assert(rcu_is_watching(),
-			   "rcu_read_unlock_sched() used illegally while idle");
-	rcu_lock_release(&rcu_sched_lock_map);
-	__release(RCU_SCHED);
+	rcu_lock_release_sched();
 	preempt_enable();
+	__release(RCU_SCHED);
 }
 
 /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
 static inline notrace void rcu_read_unlock_sched_notrace(void)
 {
-	__release(RCU_SCHED);
 	preempt_enable_notrace();
+	__release(RCU_SCHED);
 }
 
 /**
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 9b058ee..9b0f568 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -219,7 +219,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 {
 	int retval = __srcu_read_lock(sp);
 
-	rcu_lock_acquire(&(sp)->dep_map);
+	__rcu_lock_acquire(&(sp)->dep_map, _THIS_IP_);
 	return retval;
 }
 
@@ -233,7 +233,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
 static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
 	__releases(sp)
 {
-	rcu_lock_release(&(sp)->dep_map);
+	__rcu_lock_release(&(sp)->dep_map, _THIS_IP_);
 	__srcu_read_unlock(sp, idx);
 }
 
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index a3596c8..19ff915 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -333,4 +333,47 @@ static int __init check_cpu_stall_init(void)
 }
 early_initcall(check_cpu_stall_init);
 
+#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PROVE_RCU)
+
+static void ck_rcu_is_watching(const char *message)
+{
+	rcu_lockdep_assert(rcu_is_watching(), message);
+}
+
+void rcu_lock_acquire(void)
+{
+	__rcu_lock_acquire(&rcu_lock_map, _RET_IP_);
+	ck_rcu_is_watching("rcu_read_lock() used illegally while idle");
+}
+
+void rcu_lock_release(void)
+{
+	ck_rcu_is_watching("rcu_read_unlock() used illegally while idle");
+	__rcu_lock_release(&rcu_lock_map, _RET_IP_);
+}
+
+void rcu_lock_acquire_bh(void)
+{
+	__rcu_lock_acquire(&rcu_bh_lock_map, _RET_IP_);
+	ck_rcu_is_watching("rcu_read_lock_bh() used illegally while idle");
+}
+
+void rcu_lock_release_bh(void)
+{
+	ck_rcu_is_watching("rcu_read_unlock_bh() used illegally while idle");
+	__rcu_lock_release(&rcu_bh_lock_map, _RET_IP_);
+}
+void rcu_lock_acquire_sched(void)
+{
+	__rcu_lock_acquire(&rcu_sched_lock_map, _RET_IP_);
+	ck_rcu_is_watching("rcu_read_lock_sched() used illegally while idle");
+}
+
+void rcu_lock_release_sched(void)
+{
+	ck_rcu_is_watching("rcu_read_unlock_sched() used illegally while idle");
+	__rcu_lock_release(&rcu_sched_lock_map, _RET_IP_);
+}
+#endif
+
 #endif /* #ifdef CONFIG_RCU_STALL_COMMON */


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument
  2014-01-21 18:53                               ` Steven Rostedt
@ 2014-01-21 20:06                                 ` Oleg Nesterov
  0 siblings, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-21 20:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dave Jones, Peter Zijlstra, Alan Stern, Greg Kroah-Hartman,
	Ingo Molnar, Linus Torvalds, Paul McKenney, Thomas Gleixner,
	linux-kernel

On 01/21, Steven Rostedt wrote:
>
> On Tue, 21 Jan 2014 19:43:10 +0100
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > Do you mean __attribute__((packed)) ?
>
> I think Dave means:
>
> 	unsigned char subclass	: 3;
> 	unsigned char trylock	: 1;
> 	unsigned char read	: 2;
> 	unsigned char check	: 1;

Ah, yes. But I don't think this will change the code generation.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: uninline rcu_lock_acquire/etc ?
  2014-01-21 19:39                               ` uninline rcu_lock_acquire/etc ? Oleg Nesterov
@ 2014-01-22  3:54                                 ` Paul E. McKenney
  2014-01-22 18:31                                   ` Oleg Nesterov
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2014-01-22  3:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Dave Jones, Peter Zijlstra, Alan Stern, Greg Kroah-Hartman,
	Ingo Molnar, Linus Torvalds, Steven Rostedt, Thomas Gleixner,
	linux-kernel

On Tue, Jan 21, 2014 at 08:39:09PM +0100, Oleg Nesterov wrote:
> On 01/21, Oleg Nesterov wrote:
> >
> > But I agreed that the code looks simpler with bitfields, so perhaps
> > this patch is better.
> 
> Besides, I guess the major offender is rcu...
> 
> Paul, can't we do something like below? Saves 19.5 kilobytes,
> 
> 	-       5255131 2974376 10125312        18354819        1181283 vmlinux
> 	+	5235227 2970344 10125312        18330883        117b503 vmlinux
> 
> probably we can also uninline rcu_lockdep_assert()...

Looks mostly plausible, some questions inline below.

							Thanx, Paul

> Oleg.
> ---
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 2eef290..58f7a97 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -310,18 +310,34 @@ static inline bool rcu_lockdep_current_cpu_online(void)
>  }
>  #endif /* #else #if defined(CONFIG_HOTPLUG_CPU) && defined(CONFIG_PROVE_RCU) */
> 
> -#ifdef CONFIG_DEBUG_LOCK_ALLOC
> -
> -static inline void rcu_lock_acquire(struct lockdep_map *map)
> +static inline void __rcu_lock_acquire(struct lockdep_map *map, unsigned long ip)
>  {
> -	lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_);
> +	lock_acquire(map, 0, 0, 2, 0, NULL, ip);
>  }
> 
> -static inline void rcu_lock_release(struct lockdep_map *map)
> +static inline void __rcu_lock_release(struct lockdep_map *map, unsigned long ip)
>  {
>  	lock_release(map, 1, _THIS_IP_);
>  }
> 
> +#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PROVE_RCU)
> +extern void rcu_lock_acquire(void);
> +extern void rcu_lock_release(void);
> +extern void rcu_lock_acquire_bh(void);
> +extern void rcu_lock_release_bh(void);
> +extern void rcu_lock_acquire_sched(void);
> +extern void rcu_lock_release_sched(void);
> +#else
> +#define rcu_lock_acquire()		do { } while (0)
> +#define rcu_lock_release()		do { } while (0)
> +#define rcu_lock_acquire_bh()		do { } while (0)
> +#define rcu_lock_release_bh()		do { } while (0)
> +#define rcu_lock_acquire_sched()	do { } while (0)
> +#define rcu_lock_release_sched()	do { } while (0)
> +#endif
> +
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> +
>  extern struct lockdep_map rcu_lock_map;
>  extern struct lockdep_map rcu_bh_lock_map;
>  extern struct lockdep_map rcu_sched_lock_map;
> @@ -419,9 +435,6 @@ static inline int rcu_read_lock_sched_held(void)
> 
>  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
> 
> -# define rcu_lock_acquire(a)		do { } while (0)
> -# define rcu_lock_release(a)		do { } while (0)
> -
>  static inline int rcu_read_lock_held(void)
>  {
>  	return 1;
> @@ -766,11 +779,9 @@ static inline void rcu_preempt_sleep_check(void)
>   */
>  static inline void rcu_read_lock(void)
>  {
> -	__rcu_read_lock();
>  	__acquire(RCU);
> -	rcu_lock_acquire(&rcu_lock_map);
> -	rcu_lockdep_assert(rcu_is_watching(),
> -			   "rcu_read_lock() used illegally while idle");
> +	__rcu_read_lock();
> +	rcu_lock_acquire();

Not sure why __rcu_read_lock() needs to be in any particular order
with respect to the sparse __acquire(RCU), but should work either way.
Same question about the other reorderings of similar statements.

>  }
> 
>  /*
> @@ -790,11 +801,9 @@ static inline void rcu_read_lock(void)
>   */
>  static inline void rcu_read_unlock(void)
>  {
> -	rcu_lockdep_assert(rcu_is_watching(),
> -			   "rcu_read_unlock() used illegally while idle");
> -	rcu_lock_release(&rcu_lock_map);
> -	__release(RCU);
> +	rcu_lock_release();
>  	__rcu_read_unlock();
> +	__release(RCU);
>  }
> 
>  /**
> @@ -816,11 +825,9 @@ static inline void rcu_read_unlock(void)
>   */
>  static inline void rcu_read_lock_bh(void)
>  {
> -	local_bh_disable();
>  	__acquire(RCU_BH);
> -	rcu_lock_acquire(&rcu_bh_lock_map);
> -	rcu_lockdep_assert(rcu_is_watching(),
> -			   "rcu_read_lock_bh() used illegally while idle");
> +	local_bh_disable();
> +	rcu_lock_acquire_bh();
>  }
> 
>  /*
> @@ -830,11 +837,9 @@ static inline void rcu_read_lock_bh(void)
>   */
>  static inline void rcu_read_unlock_bh(void)
>  {
> -	rcu_lockdep_assert(rcu_is_watching(),
> -			   "rcu_read_unlock_bh() used illegally while idle");
> -	rcu_lock_release(&rcu_bh_lock_map);
> -	__release(RCU_BH);
> +	rcu_lock_release_bh();
>  	local_bh_enable();
> +	__release(RCU_BH);
>  }
> 
>  /**
> @@ -852,9 +857,9 @@ static inline void rcu_read_unlock_bh(void)
>   */
>  static inline void rcu_read_lock_sched(void)
>  {
> -	preempt_disable();
>  	__acquire(RCU_SCHED);
> -	rcu_lock_acquire(&rcu_sched_lock_map);
> +	preempt_disable();
> +	rcu_lock_acquire_sched();
>  	rcu_lockdep_assert(rcu_is_watching(),
>  			   "rcu_read_lock_sched() used illegally while idle");

The above pair of lines (rcu_lockdep_assert()) should also be removed,
correct?

>  }
> @@ -862,8 +867,8 @@ static inline void rcu_read_lock_sched(void)
>  /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
>  static inline notrace void rcu_read_lock_sched_notrace(void)
>  {
> -	preempt_disable_notrace();
>  	__acquire(RCU_SCHED);
> +	preempt_disable_notrace();

I cannot help repeating myself on this one...  ;-)

Why the change in order?

>  }
> 
>  /*
> @@ -873,18 +878,16 @@ static inline notrace void rcu_read_lock_sched_notrace(void)
>   */
>  static inline void rcu_read_unlock_sched(void)
>  {
> -	rcu_lockdep_assert(rcu_is_watching(),
> -			   "rcu_read_unlock_sched() used illegally while idle");
> -	rcu_lock_release(&rcu_sched_lock_map);
> -	__release(RCU_SCHED);
> +	rcu_lock_release_sched();
>  	preempt_enable();
> +	__release(RCU_SCHED);
>  }
> 
>  /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
>  static inline notrace void rcu_read_unlock_sched_notrace(void)
>  {
> -	__release(RCU_SCHED);
>  	preempt_enable_notrace();
> +	__release(RCU_SCHED);
>  }
> 
>  /**
> diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> index 9b058ee..9b0f568 100644
> --- a/include/linux/srcu.h
> +++ b/include/linux/srcu.h
> @@ -219,7 +219,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
>  {
>  	int retval = __srcu_read_lock(sp);
> 
> -	rcu_lock_acquire(&(sp)->dep_map);
> +	__rcu_lock_acquire(&(sp)->dep_map, _THIS_IP_);

Good, we do not want srcu_read_lock() complaining about offline or idle
CPUs.

>  	return retval;
>  }
> 
> @@ -233,7 +233,7 @@ static inline int srcu_read_lock(struct srcu_struct *sp) __acquires(sp)
>  static inline void srcu_read_unlock(struct srcu_struct *sp, int idx)
>  	__releases(sp)
>  {
> -	rcu_lock_release(&(sp)->dep_map);
> +	__rcu_lock_release(&(sp)->dep_map, _THIS_IP_);
>  	__srcu_read_unlock(sp, idx);
>  }
> 
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index a3596c8..19ff915 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -333,4 +333,47 @@ static int __init check_cpu_stall_init(void)
>  }
>  early_initcall(check_cpu_stall_init);
> 
> +#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PROVE_RCU)
> +
> +static void ck_rcu_is_watching(const char *message)
> +{
> +	rcu_lockdep_assert(rcu_is_watching(), message);
> +}
> +
> +void rcu_lock_acquire(void)
> +{
> +	__rcu_lock_acquire(&rcu_lock_map, _RET_IP_);
> +	ck_rcu_is_watching("rcu_read_lock() used illegally while idle");
> +}
> +
> +void rcu_lock_release(void)
> +{
> +	ck_rcu_is_watching("rcu_read_unlock() used illegally while idle");
> +	__rcu_lock_release(&rcu_lock_map, _RET_IP_);
> +}
> +
> +void rcu_lock_acquire_bh(void)
> +{
> +	__rcu_lock_acquire(&rcu_bh_lock_map, _RET_IP_);
> +	ck_rcu_is_watching("rcu_read_lock_bh() used illegally while idle");
> +}
> +
> +void rcu_lock_release_bh(void)
> +{
> +	ck_rcu_is_watching("rcu_read_unlock_bh() used illegally while idle");
> +	__rcu_lock_release(&rcu_bh_lock_map, _RET_IP_);
> +}
> +void rcu_lock_acquire_sched(void)
> +{
> +	__rcu_lock_acquire(&rcu_sched_lock_map, _RET_IP_);
> +	ck_rcu_is_watching("rcu_read_lock_sched() used illegally while idle");
> +}
> +
> +void rcu_lock_release_sched(void)
> +{
> +	ck_rcu_is_watching("rcu_read_unlock_sched() used illegally while idle");
> +	__rcu_lock_release(&rcu_sched_lock_map, _RET_IP_);
> +}
> +#endif
> +
>  #endif /* #ifdef CONFIG_RCU_STALL_COMMON */
> 


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: uninline rcu_lock_acquire/etc ?
  2014-01-22  3:54                                 ` Paul E. McKenney
@ 2014-01-22 18:31                                   ` Oleg Nesterov
  2014-01-22 19:34                                     ` Paul E. McKenney
  0 siblings, 1 reply; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-22 18:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dave Jones, Peter Zijlstra, Alan Stern, Greg Kroah-Hartman,
	Ingo Molnar, Linus Torvalds, Steven Rostedt, Thomas Gleixner,
	linux-kernel

On 01/21, Paul E. McKenney wrote:
>
> On Tue, Jan 21, 2014 at 08:39:09PM +0100, Oleg Nesterov wrote:
> > On 01/21, Oleg Nesterov wrote:
> > >
> > > But I agreed that the code looks simpler with bitfields, so perhaps
> > > this patch is better.
> >
> > Besides, I guess the major offender is rcu...
> >
> > Paul, can't we do something like below? Saves 19.5 kilobytes,
> >
> > 	-       5255131 2974376 10125312        18354819        1181283 vmlinux
> > 	+	5235227 2970344 10125312        18330883        117b503 vmlinux
> >
> > probably we can also uninline rcu_lockdep_assert()...
>
> Looks mostly plausible, some questions inline below.

Thanks!

> >  static inline void rcu_read_lock(void)
> >  {
> > -	__rcu_read_lock();
> >  	__acquire(RCU);
> > -	rcu_lock_acquire(&rcu_lock_map);
> > -	rcu_lockdep_assert(rcu_is_watching(),
> > -			   "rcu_read_lock() used illegally while idle");
> > +	__rcu_read_lock();
> > +	rcu_lock_acquire();
>
> Not sure why __rcu_read_lock() needs to be in any particular order
> with respect to the sparse __acquire(RCU), but should work either way.
> Same question about the other reorderings of similar statements.

I did this unconsciously and for no reason, will revert this accidental
change.

> >  static inline void rcu_read_lock_sched(void)
> >  {
> > -	preempt_disable();
> >  	__acquire(RCU_SCHED);
> > -	rcu_lock_acquire(&rcu_sched_lock_map);
> > +	preempt_disable();
> > +	rcu_lock_acquire_sched();
> >  	rcu_lockdep_assert(rcu_is_watching(),
> >  			   "rcu_read_lock_sched() used illegally while idle");
>
> The above pair of lines (rcu_lockdep_assert()) should also be removed,
> correct?

yes, sure, thanks,

> > @@ -862,8 +867,8 @@ static inline void rcu_read_lock_sched(void)
> >  /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
> >  static inline notrace void rcu_read_lock_sched_notrace(void)
> >  {
> > -	preempt_disable_notrace();
> >  	__acquire(RCU_SCHED);
> > +	preempt_disable_notrace();
>
> I cannot help repeating myself on this one...  ;-)
>
> Why the change in order?

see above ;)

> > --- a/kernel/rcu/update.c
> > +++ b/kernel/rcu/update.c
> > @@ -333,4 +333,47 @@ static int __init check_cpu_stall_init(void)
> >  }
> >  early_initcall(check_cpu_stall_init);
> >
> > +#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PROVE_RCU)
> > +
> > +static void ck_rcu_is_watching(const char *message)
> > +{
> > +	rcu_lockdep_assert(rcu_is_watching(), message);
> > +}
> > +
> > +void rcu_lock_acquire(void)
> > +{
> > +	__rcu_lock_acquire(&rcu_lock_map, _RET_IP_);
> > +	ck_rcu_is_watching("rcu_read_lock() used illegally while idle");
> > +}
> > +
> > +void rcu_lock_release(void)
> > +{
> > +	ck_rcu_is_watching("rcu_read_unlock() used illegally while idle");
> > +	__rcu_lock_release(&rcu_lock_map, _RET_IP_);
> > +}
> > ...

Also, this all should be exported. And I think cleaned up somehow.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: uninline rcu_lock_acquire/etc ?
  2014-01-22 18:31                                   ` Oleg Nesterov
@ 2014-01-22 19:34                                     ` Paul E. McKenney
  2014-01-22 19:39                                       ` Oleg Nesterov
  0 siblings, 1 reply; 50+ messages in thread
From: Paul E. McKenney @ 2014-01-22 19:34 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Dave Jones, Peter Zijlstra, Alan Stern, Greg Kroah-Hartman,
	Ingo Molnar, Linus Torvalds, Steven Rostedt, Thomas Gleixner,
	linux-kernel

On Wed, Jan 22, 2014 at 07:31:25PM +0100, Oleg Nesterov wrote:
> On 01/21, Paul E. McKenney wrote:
> >
> > On Tue, Jan 21, 2014 at 08:39:09PM +0100, Oleg Nesterov wrote:
> > > On 01/21, Oleg Nesterov wrote:
> > > >
> > > > But I agreed that the code looks simpler with bitfields, so perhaps
> > > > this patch is better.
> > >
> > > Besides, I guess the major offender is rcu...
> > >
> > > Paul, can't we do something like below? Saves 19.5 kilobytes,
> > >
> > > 	-       5255131 2974376 10125312        18354819        1181283 vmlinux
> > > 	+	5235227 2970344 10125312        18330883        117b503 vmlinux
> > >
> > > probably we can also uninline rcu_lockdep_assert()...
> >
> > Looks mostly plausible, some questions inline below.
> 
> Thanks!
> 
> > >  static inline void rcu_read_lock(void)
> > >  {
> > > -	__rcu_read_lock();
> > >  	__acquire(RCU);
> > > -	rcu_lock_acquire(&rcu_lock_map);
> > > -	rcu_lockdep_assert(rcu_is_watching(),
> > > -			   "rcu_read_lock() used illegally while idle");
> > > +	__rcu_read_lock();
> > > +	rcu_lock_acquire();
> >
> > Not sure why __rcu_read_lock() needs to be in any particular order
> > with respect to the sparse __acquire(RCU), but should work either way.
> > Same question about the other reorderings of similar statements.
> 
> I did this unconsciously and for no reason, will revert this accidental
> change.
> 
> > >  static inline void rcu_read_lock_sched(void)
> > >  {
> > > -	preempt_disable();
> > >  	__acquire(RCU_SCHED);
> > > -	rcu_lock_acquire(&rcu_sched_lock_map);
> > > +	preempt_disable();
> > > +	rcu_lock_acquire_sched();
> > >  	rcu_lockdep_assert(rcu_is_watching(),
> > >  			   "rcu_read_lock_sched() used illegally while idle");
> >
> > The above pair of lines (rcu_lockdep_assert()) should also be removed,
> > correct?
> 
> yes, sure, thanks,
> 
> > > @@ -862,8 +867,8 @@ static inline void rcu_read_lock_sched(void)
> > >  /* Used by lockdep and tracing: cannot be traced, cannot call lockdep. */
> > >  static inline notrace void rcu_read_lock_sched_notrace(void)
> > >  {
> > > -	preempt_disable_notrace();
> > >  	__acquire(RCU_SCHED);
> > > +	preempt_disable_notrace();
> >
> > I cannot help repeating myself on this one...  ;-)
> >
> > Why the change in order?
> 
> see above ;)
> 
> > > --- a/kernel/rcu/update.c
> > > +++ b/kernel/rcu/update.c
> > > @@ -333,4 +333,47 @@ static int __init check_cpu_stall_init(void)
> > >  }
> > >  early_initcall(check_cpu_stall_init);
> > >
> > > +#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_PROVE_RCU)
> > > +
> > > +static void ck_rcu_is_watching(const char *message)
> > > +{
> > > +	rcu_lockdep_assert(rcu_is_watching(), message);
> > > +}
> > > +
> > > +void rcu_lock_acquire(void)
> > > +{
> > > +	__rcu_lock_acquire(&rcu_lock_map, _RET_IP_);
> > > +	ck_rcu_is_watching("rcu_read_lock() used illegally while idle");
> > > +}
> > > +
> > > +void rcu_lock_release(void)
> > > +{
> > > +	ck_rcu_is_watching("rcu_read_unlock() used illegally while idle");
> > > +	__rcu_lock_release(&rcu_lock_map, _RET_IP_);
> > > +}
> > > ...
> 
> Also, this all should be exported. And I think cleaned up somehow.

I would be happy to take a patch with the above issues fixed.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: uninline rcu_lock_acquire/etc ?
  2014-01-22 19:34                                     ` Paul E. McKenney
@ 2014-01-22 19:39                                       ` Oleg Nesterov
  0 siblings, 0 replies; 50+ messages in thread
From: Oleg Nesterov @ 2014-01-22 19:39 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dave Jones, Peter Zijlstra, Alan Stern, Greg Kroah-Hartman,
	Ingo Molnar, Linus Torvalds, Steven Rostedt, Thomas Gleixner,
	linux-kernel

On 01/22, Paul E. McKenney wrote:
>
> On Wed, Jan 22, 2014 at 07:31:25PM +0100, Oleg Nesterov wrote:
> >
> > Also, this all should be exported. And I think cleanuped somehow.
>
> I would be happy to take a patch with the above issues fixed.

Yes, yes, thanks.

Sorry for the noise; I forgot to say that of course I'll try to make v2,
but I will be offline for a couple of days.

Oleg.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [tip:core/locking] lockdep: Make held_lock->check and "int check" argument bool
  2014-01-20 18:20                       ` [PATCH 1/5] lockdep: make held_lock->check and "int check" argument bool Oleg Nesterov
@ 2014-02-10 13:32                         ` tip-bot for Oleg Nesterov
  0 siblings, 0 replies; 50+ messages in thread
From: tip-bot for Oleg Nesterov @ 2014-02-10 13:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, sasha.levin, hpa, mingo, stern, torvalds, peterz,
	davej, gregkh, paulmck, rostedt, tglx, oleg

Commit-ID:  fb9edbe98493fcd9df66de926ae9157cbe0e4dcd
Gitweb:     http://git.kernel.org/tip/fb9edbe98493fcd9df66de926ae9157cbe0e4dcd
Author:     Oleg Nesterov <oleg@redhat.com>
AuthorDate: Mon, 20 Jan 2014 19:20:06 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 9 Feb 2014 21:18:54 +0100

lockdep: Make held_lock->check and "int check" argument bool

The "int check" argument of lock_acquire() and held_lock->check are
misleading. This is actually a boolean: 2 means "true", everything
else is "false".

And there is no need to pass 1 or 0 to lock_acquire() depending on
CONFIG_PROVE_LOCKING: __lock_acquire() checks prove_locking at the
start and clears "check" if !CONFIG_PROVE_LOCKING.

Note: probably we can simply kill this member/arg. The only explicit
user of check => 0 is rcu_lock_acquire(), perhaps we can change it to
use lock_acquire(trylock =>, read => 2). __lockdep_no_validate means
check => 0 implicitly, but we can change validate_chain() to check
hlock->instance->key instead. Not to mention it would be nice to get
rid of lockdep_set_novalidate_class().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140120182006.GA26495@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 drivers/tty/tty_ldsem.c  | 15 ++++-----------
 include/linux/lockdep.h  | 25 +++++++++----------------
 include/linux/rcupdate.h |  2 +-
 kernel/locking/lockdep.c | 11 ++++-------
 4 files changed, 18 insertions(+), 35 deletions(-)

diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
index d8a55e8..0ffb0cb 100644
--- a/drivers/tty/tty_ldsem.c
+++ b/drivers/tty/tty_ldsem.c
@@ -39,17 +39,10 @@
 				lock_acquire(&(l)->dep_map, s, t, r, c, n, i)
 # define __rel(l, n, i)				\
 				lock_release(&(l)->dep_map, n, i)
-# ifdef CONFIG_PROVE_LOCKING
-#  define lockdep_acquire(l, s, t, i)		__acq(l, s, t, 0, 2, NULL, i)
-#  define lockdep_acquire_nest(l, s, t, n, i)	__acq(l, s, t, 0, 2, n, i)
-#  define lockdep_acquire_read(l, s, t, i)	__acq(l, s, t, 1, 2, NULL, i)
-#  define lockdep_release(l, n, i)		__rel(l, n, i)
-# else
-#  define lockdep_acquire(l, s, t, i)		__acq(l, s, t, 0, 1, NULL, i)
-#  define lockdep_acquire_nest(l, s, t, n, i)	__acq(l, s, t, 0, 1, n, i)
-#  define lockdep_acquire_read(l, s, t, i)	__acq(l, s, t, 1, 1, NULL, i)
-#  define lockdep_release(l, n, i)		__rel(l, n, i)
-# endif
+#define lockdep_acquire(l, s, t, i)		__acq(l, s, t, 0, 1, NULL, i)
+#define lockdep_acquire_nest(l, s, t, n, i)	__acq(l, s, t, 0, 1, n, i)
+#define lockdep_acquire_read(l, s, t, i)	__acq(l, s, t, 1, 1, NULL, i)
+#define lockdep_release(l, n, i)		__rel(l, n, i)
 #else
 # define lockdep_acquire(l, s, t, i)		do { } while (0)
 # define lockdep_acquire_nest(l, s, t, n, i)	do { } while (0)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 92b1bfc..1626047 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -252,9 +252,9 @@ struct held_lock {
 	unsigned int trylock:1;						/* 16 bits */
 
 	unsigned int read:2;        /* see lock_acquire() comment */
-	unsigned int check:2;       /* see lock_acquire() comment */
+	unsigned int check:1;       /* see lock_acquire() comment */
 	unsigned int hardirqs_off:1;
-	unsigned int references:11;					/* 32 bits */
+	unsigned int references:12;					/* 32 bits */
 };
 
 /*
@@ -326,9 +326,8 @@ static inline int lockdep_match_key(struct lockdep_map *lock,
  *
  * Values for check:
  *
- *   0: disabled
- *   1: simple checks (freeing, held-at-exit-time, etc.)
- *   2: full validation
+ *   0: simple checks (freeing, held-at-exit-time, etc.)
+ *   1: full validation
  */
 extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 			 int trylock, int read, int check,
@@ -479,15 +478,9 @@ static inline void print_irqtrace_events(struct task_struct *curr)
  * on the per lock-class debug mode:
  */
 
-#ifdef CONFIG_PROVE_LOCKING
- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 2, n, i)
- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 2, n, i)
- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 2, n, i)
-#else
- #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 1, n, i)
- #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 1, n, i)
- #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 1, n, i)
-#endif
+#define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 1, n, i)
+#define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 1, n, i)
+#define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 1, n, i)
 
 #define spin_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
 #define spin_acquire_nest(l, s, t, n, i)	lock_acquire_exclusive(l, s, t, n, i)
@@ -518,13 +511,13 @@ static inline void print_irqtrace_events(struct task_struct *curr)
 # define might_lock(lock) 						\
 do {									\
 	typecheck(struct lockdep_map *, &(lock)->dep_map);		\
-	lock_acquire(&(lock)->dep_map, 0, 0, 0, 2, NULL, _THIS_IP_);	\
+	lock_acquire(&(lock)->dep_map, 0, 0, 0, 1, NULL, _THIS_IP_);	\
 	lock_release(&(lock)->dep_map, 0, _THIS_IP_);			\
 } while (0)
 # define might_lock_read(lock) 						\
 do {									\
 	typecheck(struct lockdep_map *, &(lock)->dep_map);		\
-	lock_acquire(&(lock)->dep_map, 0, 0, 1, 2, NULL, _THIS_IP_);	\
+	lock_acquire(&(lock)->dep_map, 0, 0, 1, 1, NULL, _THIS_IP_);	\
 	lock_release(&(lock)->dep_map, 0, _THIS_IP_);			\
 } while (0)
 #else
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 72bf3a0..adff3c9 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -314,7 +314,7 @@ static inline bool rcu_lockdep_current_cpu_online(void)
 
 static inline void rcu_lock_acquire(struct lockdep_map *map)
 {
-	lock_acquire(map, 0, 0, 2, 1, NULL, _THIS_IP_);
+	lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_);
 }
 
 static inline void rcu_lock_release(struct lockdep_map *map)
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index eb8a547..8c85a0d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2098,7 +2098,7 @@ static int validate_chain(struct task_struct *curr, struct lockdep_map *lock,
 	 * (If lookup_chain_cache() returns with 1 it acquires
 	 * graph_lock for us)
 	 */
-	if (!hlock->trylock && (hlock->check == 2) &&
+	if (!hlock->trylock && hlock->check &&
 	    lookup_chain_cache(curr, hlock, chain_key)) {
 		/*
 		 * Check whether last held lock:
@@ -3055,9 +3055,6 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	int class_idx;
 	u64 chain_key;
 
-	if (!prove_locking)
-		check = 1;
-
 	if (unlikely(!debug_locks))
 		return 0;
 
@@ -3069,8 +3066,8 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
 		return 0;
 
-	if (lock->key == &__lockdep_no_validate__)
-		check = 1;
+	if (!prove_locking || lock->key == &__lockdep_no_validate__)
+		check = 0;
 
 	if (subclass < NR_LOCKDEP_CACHING_CLASSES)
 		class = lock->class_cache[subclass];
@@ -3138,7 +3135,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	hlock->holdtime_stamp = lockstat_clock();
 #endif
 
-	if (check == 2 && !mark_irqflags(curr, hlock))
+	if (check && !mark_irqflags(curr, hlock))
 		return 0;
 
 	/* mark it as used: */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [tip:core/locking] lockdep: Don' t create the wrong dependency on hlock->check == 0
  2014-01-20 18:20                       ` [PATCH 2/5] lockdep: don't create the wrong dependency on hlock->check == 0 Oleg Nesterov
@ 2014-02-10 13:33                         ` tip-bot for Oleg Nesterov
  0 siblings, 0 replies; 50+ messages in thread
From: tip-bot for Oleg Nesterov @ 2014-02-10 13:33 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, sasha.levin, hpa, mingo, stern, torvalds, peterz,
	davej, gregkh, paulmck, rostedt, tglx, oleg

Commit-ID:  1b5ff816cab708ba44c7d7b56b613516269eb577
Gitweb:     http://git.kernel.org/tip/1b5ff816cab708ba44c7d7b56b613516269eb577
Author:     Oleg Nesterov <oleg@redhat.com>
AuthorDate: Mon, 20 Jan 2014 19:20:10 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 9 Feb 2014 21:18:57 +0100

lockdep: Don't create the wrong dependency on hlock->check == 0

Test-case:

	DEFINE_MUTEX(m1);
	DEFINE_MUTEX(m2);
	DEFINE_MUTEX(mx);

	void lockdep_should_complain(void)
	{
		lockdep_set_novalidate_class(&mx);

		// m1 -> mx -> m2
		mutex_lock(&m1);
		mutex_lock(&mx);
		mutex_lock(&m2);
		mutex_unlock(&m2);
		mutex_unlock(&mx);
		mutex_unlock(&m1);

		// m2 -> m1 ; should trigger the warning
		mutex_lock(&m2);
		mutex_lock(&m1);
		mutex_unlock(&m1);
		mutex_unlock(&m2);
	}

this doesn't trigger any warning; lockdep can't detect the trivial
deadlock.

This is because lock(&mx) correctly avoids the m1 -> mx dependency (it
skips validate_chain() due to mx->check == 0), but lock(&m2) wrongly
adds mx -> m2 and thus m1 -> m2 is not created.

rcu_lock_acquire()->lock_acquire(check => 0) is fine due to read == 2,
so currently only __lockdep_no_validate__ can trigger this problem.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140120182010.GA26498@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/locking/lockdep.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 8c85a0d..f7eba92 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1936,12 +1936,12 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
 
 	for (;;) {
 		int distance = curr->lockdep_depth - depth + 1;
-		hlock = curr->held_locks + depth-1;
+		hlock = curr->held_locks + depth - 1;
 		/*
 		 * Only non-recursive-read entries get new dependencies
 		 * added:
 		 */
-		if (hlock->read != 2) {
+		if (hlock->read != 2 && hlock->check) {
 			if (!check_prev_add(curr, hlock, next,
 						distance, trylock_loop))
 				return 0;

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [tip:core/locking] lockdep: Change mark_held_locks() to check hlock->check instead of lockdep_no_validate
  2014-01-20 18:20                       ` [PATCH 3/5] lockdep: change mark_held_locks() to check hlock->check instead of lockdep_no_validate Oleg Nesterov
@ 2014-02-10 13:33                         ` tip-bot for Oleg Nesterov
  0 siblings, 0 replies; 50+ messages in thread
From: tip-bot for Oleg Nesterov @ 2014-02-10 13:33 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, sasha.levin, hpa, mingo, stern, torvalds, peterz,
	davej, gregkh, paulmck, rostedt, tglx, oleg

Commit-ID:  34d0ed5ea7a72d5961552fb1758a94f0d3f8f3dc
Gitweb:     http://git.kernel.org/tip/34d0ed5ea7a72d5961552fb1758a94f0d3f8f3dc
Author:     Oleg Nesterov <oleg@redhat.com>
AuthorDate: Mon, 20 Jan 2014 19:20:13 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 9 Feb 2014 21:18:59 +0100

lockdep: Change mark_held_locks() to check hlock->check instead of lockdep_no_validate

The __lockdep_no_validate check in mark_held_locks() adds the subtle
and (afaics) unnecessary difference between no-validate and check==0.
And this looks even more inconsistent because __lock_acquire() skips
mark_irqflags()->mark_lock() if !check.

Change mark_held_locks() to check hlock->check instead.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140120182013.GA26505@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/locking/lockdep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index f7eba92..bf0c6b0 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2517,7 +2517,7 @@ mark_held_locks(struct task_struct *curr, enum mark_type mark)
 
 		BUG_ON(usage_bit >= LOCK_USAGE_STATES);
 
-		if (hlock_class(hlock)->key == __lockdep_no_validate__.subkeys)
+		if (!hlock->check)
 			continue;
 
 		if (!mark_lock(curr, hlock, usage_bit))

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [tip:core/locking] lockdep: Change lockdep_set_novalidate_class() to use _and_name
  2014-01-20 18:20                       ` [PATCH 4/5] lockdep: change lockdep_set_novalidate_class() to use _and_name Oleg Nesterov
@ 2014-02-10 13:33                         ` tip-bot for Oleg Nesterov
  0 siblings, 0 replies; 50+ messages in thread
From: tip-bot for Oleg Nesterov @ 2014-02-10 13:33 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, sasha.levin, hpa, mingo, stern, torvalds, peterz,
	davej, gregkh, paulmck, rostedt, tglx, oleg

Commit-ID:  47be1c1a0e188232b5e5962917b21750053cd3f8
Gitweb:     http://git.kernel.org/tip/47be1c1a0e188232b5e5962917b21750053cd3f8
Author:     Oleg Nesterov <oleg@redhat.com>
AuthorDate: Mon, 20 Jan 2014 19:20:16 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 9 Feb 2014 21:19:01 +0100

lockdep: Change lockdep_set_novalidate_class() to use _and_name

Cosmetic. This doesn't really matter because a) device->mutex is
the only user of __lockdep_no_validate__ and b) this class should
never be reported as the source of a problem, but if something goes
wrong "&dev->mutex" looks better than "&__lockdep_no_validate__"
as the name of the lock.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140120182016.GA26512@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/lockdep.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 1626047..060e513 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -303,7 +303,7 @@ extern void lockdep_init_map(struct lockdep_map *lock, const char *name,
 				 (lock)->dep_map.key, sub)
 
 #define lockdep_set_novalidate_class(lock) \
-	lockdep_set_class(lock, &__lockdep_no_validate__)
+	lockdep_set_class_and_name(lock, &__lockdep_no_validate__, #lock)
 /*
  * Compare locking classes
  */

^ permalink raw reply related	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2014-02-10 13:35 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-09 11:15 [RFC][PATCH] lockdep: Introduce wait-type checks Peter Zijlstra
2014-01-09 11:49 ` Peter Zijlstra
2014-01-09 16:31 ` Oleg Nesterov
2014-01-09 17:08   ` Peter Zijlstra
2014-01-09 17:54     ` check && lockdep_no_validate (Was: lockdep: Introduce wait-type checks) Oleg Nesterov
2014-01-12 20:58       ` Peter Zijlstra
2014-01-13 16:07         ` Oleg Nesterov
2014-01-16 17:43           ` Oleg Nesterov
2014-01-16 18:09             ` Peter Zijlstra
2014-01-16 20:26               ` Alan Stern
2014-01-17 16:31                 ` Oleg Nesterov
2014-01-17 18:01                   ` Alan Stern
2014-01-20 18:19                     ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Oleg Nesterov
2014-01-20 18:20                       ` [PATCH 1/5] lockdep: make held_lock->check and "int check" argument bool Oleg Nesterov
2014-02-10 13:32                         ` [tip:core/locking] lockdep: Make " tip-bot for Oleg Nesterov
2014-01-20 18:20                       ` [PATCH 2/5] lockdep: don't create the wrong dependency on hlock->check == 0 Oleg Nesterov
2014-02-10 13:33                         ` [tip:core/locking] lockdep: Don' t " tip-bot for Oleg Nesterov
2014-01-20 18:20                       ` [PATCH 3/5] lockdep: change mark_held_locks() to check hlock->check instead of lockdep_no_validate Oleg Nesterov
2014-02-10 13:33                         ` [tip:core/locking] lockdep: Change " tip-bot for Oleg Nesterov
2014-01-20 18:20                       ` [PATCH 4/5] lockdep: change lockdep_set_novalidate_class() to use _and_name Oleg Nesterov
2014-02-10 13:33                         ` [tip:core/locking] lockdep: Change " tip-bot for Oleg Nesterov
2014-01-20 18:20                       ` [PATCH 5/5] lockdep: pack subclass/trylock/read/check into a single argument Oleg Nesterov
2014-01-21 14:10                         ` Peter Zijlstra
2014-01-21 17:27                           ` Oleg Nesterov
2014-01-21 17:35                           ` Dave Jones
2014-01-21 18:43                             ` Oleg Nesterov
2014-01-21 18:53                               ` Steven Rostedt
2014-01-21 20:06                                 ` Oleg Nesterov
2014-01-21 19:39                               ` uninline rcu_lock_acquire/etc ? Oleg Nesterov
2014-01-22  3:54                                 ` Paul E. McKenney
2014-01-22 18:31                                   ` Oleg Nesterov
2014-01-22 19:34                                     ` Paul E. McKenney
2014-01-22 19:39                                       ` Oleg Nesterov
2014-01-20 18:37                       ` [PATCH 0/5] lockdep: (Was: check && lockdep_no_validate) Alan Stern
2014-01-20 18:54                         ` Oleg Nesterov
2014-01-20 21:42                           ` Alan Stern
2014-01-12  9:40     ` [RFC][PATCH] lockdep: Introduce wait-type checks Ingo Molnar
2014-01-12 17:45       ` [PATCH 0/1] lockdep: Kill held_lock->check and "int check" arg of __lock_acquire() Oleg Nesterov
2014-01-12 17:45         ` [PATCH 1/1] " Oleg Nesterov
2014-01-13  0:28           ` Dave Jones
2014-01-13 16:20             ` Oleg Nesterov
2014-01-13 17:06           ` Oleg Nesterov
2014-01-13 17:28             ` Peter Zijlstra
2014-01-13 18:52               ` Oleg Nesterov
2014-01-13 22:34               ` Paul E. McKenney
2014-01-12 20:00         ` [PATCH 0/1] " Peter Zijlstra
2014-01-13 18:35           ` Oleg Nesterov
2014-01-09 17:33 ` [RFC][PATCH] lockdep: Introduce wait-type checks Dave Jones
2014-01-09 22:12   ` Peter Zijlstra
2014-01-10 12:11   ` Peter Zijlstra
