linux-kernel.vger.kernel.org archive mirror
* [patch 1/2] rtmutex: Handle deadlock detection smarter
  2014-06-05 15:28 [patch 0/2] rtmutex: Sanitize deadlock detection and chain walk Thomas Gleixner
@ 2014-06-05 15:28 ` Thomas Gleixner
  2014-06-06  1:14   ` Steven Rostedt
                     ` (2 more replies)
  2014-06-05 15:28 ` [patch 2/2] rtmutex: Detect changes in the pi lock chain Thomas Gleixner
  1 sibling, 3 replies; 14+ messages in thread
From: Thomas Gleixner @ 2014-06-05 15:28 UTC (permalink / raw)
  To: LKML; +Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Brad Mouring

[-- Attachment #1: rtmutex-handle-deadlock-detection-gracefully.patch --]
[-- Type: text/plain, Size: 3458 bytes --]

Even in the case when deadlock detection is not requested by the
caller, we can detect deadlocks. Right now the code stops the lock
chain walk and keeps the waiter enqueued, even on itself. Silly not to
yell when such a scenario is detected and to keep the waiter enqueued.

Return -EDEADLK unconditionally and handle it at the call sites.

The futex calls return -EDEADLK. The non-futex ones dequeue the
waiter, throw a warning and put the task into a schedule loop.
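
A minimal userspace sketch of the futex case (the scaffolding below is
illustrative, not part of this patch): futex(2) documents that
FUTEX_LOCK_PI fails with EDEADLK when the futex word is already locked
by the caller, which is exactly the owner == task check in the patch
below.

#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>

static uint32_t futex_word;	/* 0 == unlocked */

static long lock_pi(uint32_t *uaddr)
{
	/* val and timeout are unused for an untimed FUTEX_LOCK_PI */
	return syscall(SYS_futex, uaddr, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
}

int main(void)
{
	lock_pi(&futex_word);		/* uncontended: kernel stores our TID */
	if (lock_pi(&futex_word))	/* relock: owner == task in the kernel */
		perror("FUTEX_LOCK_PI");	/* expected: EDEADLK */
	return 0;
}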

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex.c |   33 ++++++++++++++++++++++++++++-----
 kernel/locking/rtmutex.h |    6 +++++-
 2 files changed, 33 insertions(+), 6 deletions(-)

Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -314,7 +314,7 @@ static int rt_mutex_adjust_prio_chain(st
 		}
 		put_task_struct(task);
 
-		return deadlock_detect ? -EDEADLK : 0;
+		return -EDEADLK;
 	}
  retry:
 	/*
@@ -377,7 +377,7 @@ static int rt_mutex_adjust_prio_chain(st
 	if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
 		debug_rt_mutex_deadlock(deadlock_detect, orig_waiter, lock);
 		raw_spin_unlock(&lock->wait_lock);
-		ret = deadlock_detect ? -EDEADLK : 0;
+		ret = -EDEADLK;
 		goto out_unlock_pi;
 	}
 
@@ -548,7 +548,7 @@ static int task_blocks_on_rt_mutex(struc
 	 * which is wrong, as the other waiter is not in a deadlock
 	 * situation.
 	 */
-	if (detect_deadlock && owner == task)
+	if (owner == task)
 		return -EDEADLK;
 
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
@@ -763,6 +763,26 @@ __rt_mutex_slowlock(struct rt_mutex *loc
 	return ret;
 }
 
+static void rt_mutex_handle_deadlock(int res, int detect_deadlock,
+				     struct rtmutex_waiter *w)
+{
+	/*
+	 * If the result is not -EDEADLOCK or the caller requested
+	 * deadlock detection, nothing to do here.
+	 */
+	if (res != -EDEADLOCK || detect_deadlock)
+		return;
+
+	/*
+	 * Yell loudly and stop the task right here.
+	 */
+	debug_rt_mutex_print_deadlock(w);
+	while (1) {
+		set_current_state(TASK_INTERRUPTIBLE);
+		schedule();
+	}
+}
+
 /*
  * Slow path lock function:
  */
@@ -802,8 +822,10 @@ rt_mutex_slowlock(struct rt_mutex *lock,
 
 	set_current_state(TASK_RUNNING);
 
-	if (unlikely(ret))
+	if (unlikely(ret)) {
 		remove_waiter(lock, &waiter);
+		rt_mutex_handle_deadlock(ret, detect_deadlock, &waiter);
+	}
 
 	/*
 	 * try_to_take_rt_mutex() sets the waiter bit
@@ -1112,7 +1134,8 @@ int rt_mutex_start_proxy_lock(struct rt_
 		return 1;
 	}
 
-	ret = task_blocks_on_rt_mutex(lock, waiter, task, detect_deadlock);
+	/* We enforce deadlock detection for futexes */
+	ret = task_blocks_on_rt_mutex(lock, waiter, task, 1);
 
 	if (ret && !rt_mutex_owner(lock)) {
 		/*
Index: tip/kernel/locking/rtmutex.h
===================================================================
--- tip.orig/kernel/locking/rtmutex.h
+++ tip/kernel/locking/rtmutex.h
@@ -21,6 +21,10 @@
 #define debug_rt_mutex_unlock(l)			do { } while (0)
 #define debug_rt_mutex_init(m, n)			do { } while (0)
 #define debug_rt_mutex_deadlock(d, a ,l)		do { } while (0)
-#define debug_rt_mutex_print_deadlock(w)		do { } while (0)
 #define debug_rt_mutex_detect_deadlock(w,d)		(d)
 #define debug_rt_mutex_reset_waiter(w)			do { } while (0)
+
+static inline void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *w)
+{
+	WARN(1, "rtmutex deadlock detected\n");
+}




* [patch 0/2] rtmutex: Sanitize deadlock detection and chain walk
@ 2014-06-05 15:28 Thomas Gleixner
  2014-06-05 15:28 ` [patch 1/2] rtmutex: Handle deadlock detection smarter Thomas Gleixner
  2014-06-05 15:28 ` [patch 2/2] rtmutex: Detect changes in the pi lock chain Thomas Gleixner
  0 siblings, 2 replies; 14+ messages in thread
From: Thomas Gleixner @ 2014-06-05 15:28 UTC (permalink / raw)
  To: LKML; +Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Brad Mouring

Brad found a flaw in the deadlock detection logic when the lock chain
changes during the chain walk and tried to fix up the deadlock detector
itself.

The following patches address this issue in a different way. Instead
of guessing about the chain change, simply detect it.

Thanks,

	tglx



* [patch 2/2] rtmutex: Detect changes in the pi lock chain
  2014-06-05 15:28 [patch 0/2] rtmutex: Sanitize deadlock detection and chain walk Thomas Gleixner
  2014-06-05 15:28 ` [patch 1/2] rtmutex: Handle deadlock detection smarter Thomas Gleixner
@ 2014-06-05 15:28 ` Thomas Gleixner
  2014-06-06  9:48   ` Peter Zijlstra
                     ` (3 more replies)
  1 sibling, 4 replies; 14+ messages in thread
From: Thomas Gleixner @ 2014-06-05 15:28 UTC (permalink / raw)
  To: LKML; +Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Brad Mouring

[-- Attachment #1: rtmutex-detect-lock-chain-changes.patch --]
[-- Type: text/plain, Size: 8552 bytes --]

When we walk the lock chain, we drop all locks after each step. So the
lock chain can change under us before we reacquire the locks. That's
harmless in principle as we just follow the wrong lock path. But it
can lead to a false positive in the deadlock detection logic:

T0 holds L0
T0 blocks on L1 held by T1
T1 blocks on L2 held by T2
T2 blocks on L3 held by T3
T3 blocks on L4 held by T4

Now we walk the chain

lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> 
     lock T2 ->  adjust T2 ->  drop locks

T2 times out and blocks on L0

Now we continue:

lock T2 -> lock L0 -> deadlock detected, but it's not a deadlock at all.

Brad tried to work around that in the deadlock detection logic itself,
but the more I looked at it the less I liked it, because it's crystal
ball magic after the fact.

We can actually detect a chain change very simply:

lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->

     next_lock = T2->pi_blocked_on->lock;

drop locks

T2 times out and blocks on L0

Now we continue:

lock T2 -> 

     if (next_lock != T2->pi_blocked_on->lock)
     	   return;

So if we detect that T2 is now blocked on a different lock we stop the
chain walk. That's also correct in the following scenario:

lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->

     next_lock = T2->pi_blocked_on->lock;

drop locks

T3 times out and drops L3
T2 acquires L3 and blocks on L4 now

Now we continue:

lock T2 -> 

     if (next_lock != T2->pi_blocked_on->lock)
     	   return;

We don't have to follow up the chain at that point, because T2
propagated our priority up to T4 already.
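
Condensed (a sketch of the relevant lines only, not the full walk), the
pattern the patch implements is: snapshot the lock which the next task
in the chain is blocked on while still holding its pi_lock, then
compare that snapshot after the locks have been reacquired. @next_lock
is only ever compared, never dereferenced, so a stale pointer is
harmless.

	/* walk step N: task->pi_lock held, remember where @task is blocked */
	next_lock = task->pi_blocked_on ? task->pi_blocked_on->lock : NULL;
	raw_spin_unlock_irqrestore(&task->pi_lock, flags);

	/* all locks dropped: the chain can change under us here */

	/* walk step N+1: task->pi_lock reacquired */
	if (next_lock != waiter->lock)
		goto out_unlock_pi;	/* chain changed, stop the walk */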

Reported-by: Brad Mouring <bmouring@ni.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex.c |   84 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 61 insertions(+), 23 deletions(-)

Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -264,23 +264,27 @@ int max_lock_depth = 1024;
  * Adjust the priority chain. Also used for deadlock detection.
  * Decreases task's usage by one - may thus free the task.
  *
- * @task: the task owning the mutex (owner) for which a chain walk is probably
- *	  needed
+ * @task:	the task owning the mutex (owner) for which a chain walk is
+ *		probably needed
  * @deadlock_detect: do we have to carry out deadlock detection?
- * @orig_lock: the mutex (can be NULL if we are walking the chain to recheck
- * 	       things for a task that has just got its priority adjusted, and
- *	       is waiting on a mutex)
+ * @orig_lock:	the mutex (can be NULL if we are walking the chain to recheck
+ *		things for a task that has just got its priority adjusted, and
+ *		is waiting on a mutex)
+ * @next_lock:	the mutex on which the owner of @orig_lock was blocked before
+ *		we dropped its pi_lock. Is never dereferenced, only used for
+ *		comparison to detect lock chain changes.
  * @orig_waiter: rt_mutex_waiter struct for the task that has just donated
- *		 its priority to the mutex owner (can be NULL in the case
- *		 depicted above or if the top waiter is gone away and we are
- *		 actually deboosting the owner)
- * @top_task: the current top waiter
+ *		its priority to the mutex owner (can be NULL in the case
+ *		depicted above or if the top waiter is gone away and we are
+ *		actually deboosting the owner)
+ * @top_task:	the current top waiter
  *
  * Returns 0 or -EDEADLK.
  */
 static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 				      int deadlock_detect,
 				      struct rt_mutex *orig_lock,
+				      struct rt_mutex *next_lock,
 				      struct rt_mutex_waiter *orig_waiter,
 				      struct task_struct *top_task)
 {
@@ -339,6 +343,18 @@ static int rt_mutex_adjust_prio_chain(st
 		goto out_unlock_pi;
 
 	/*
+	 * We dropped all locks after taking a refcount on @task, so
+	 * the task might have moved on in the lock chain or even left
+	 * the chain completely and blocks now on an unrelated lock or
+	 * on @orig_lock.
+	 *
+	 * We stored the lock on which @task was blocked in @next_lock,
+	 * so we can detect the chain change.
+	 */
+	if (next_lock != waiter->lock)
+		goto out_unlock_pi;
+
+	/*
 	 * Drop out, when the task has no waiters. Note,
 	 * top_waiter can be NULL, when we are in the deboosting
 	 * mode!
@@ -422,11 +438,28 @@ static int rt_mutex_adjust_prio_chain(st
 		__rt_mutex_adjust_prio(task);
 	}
 
+	/*
+	 * Check whether the task which owns the current lock is pi
+	 * blocked itself. If yes we store a pointer to the lock for
+	 * the lock chain change detection above.
+	 */
+	if (task->pi_blocked_on)
+		next_lock = task->pi_blocked_on->lock;
+	else
+		next_lock = NULL;
+
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 
 	top_waiter = rt_mutex_top_waiter(lock);
 	raw_spin_unlock(&lock->wait_lock);
 
+	/*
+	 * We reached the end of the lock chain. Stop right here. No
+	 * point to go back just to figure that out.
+	 */
+	if (!next_lock)
+		goto out_put_task;
+
 	if (!detect_deadlock && waiter != top_waiter)
 		goto out_put_task;
 
@@ -536,8 +569,9 @@ static int task_blocks_on_rt_mutex(struc
 {
 	struct task_struct *owner = rt_mutex_owner(lock);
 	struct rt_mutex_waiter *top_waiter = waiter;
+	struct rt_mutex *next_lock = NULL;
 	unsigned long flags;
-	int chain_walk = 0, res;
+	int chain_walk, res;
 
 	/*
 	 * Early deadlock detection. We really don't want the task to
@@ -569,19 +603,21 @@ static int task_blocks_on_rt_mutex(struc
 	if (!owner)
 		return 0;
 
+	raw_spin_lock_irqsave(&owner->pi_lock, flags);
 	if (waiter == rt_mutex_top_waiter(lock)) {
-		raw_spin_lock_irqsave(&owner->pi_lock, flags);
 		rt_mutex_dequeue_pi(owner, top_waiter);
 		rt_mutex_enqueue_pi(owner, waiter);
 
 		__rt_mutex_adjust_prio(owner);
 		if (owner->pi_blocked_on)
 			chain_walk = 1;
-		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
-	}
-	else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock))
+	} else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
 		chain_walk = 1;
+	}
+	if (owner->pi_blocked_on)
+		next_lock = owner->pi_blocked_on->lock;
 
+	raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
 	if (!chain_walk)
 		return 0;
 
@@ -594,8 +630,8 @@ static int task_blocks_on_rt_mutex(struc
 
 	raw_spin_unlock(&lock->wait_lock);
 
-	res = rt_mutex_adjust_prio_chain(owner, detect_deadlock, lock, waiter,
-					 task);
+	res = rt_mutex_adjust_prio_chain(owner, detect_deadlock, lock,
+					 next_lock, waiter, task);
 
 	raw_spin_lock(&lock->wait_lock);
 
@@ -644,8 +680,8 @@ static void remove_waiter(struct rt_mute
 {
 	int first = (waiter == rt_mutex_top_waiter(lock));
 	struct task_struct *owner = rt_mutex_owner(lock);
+	struct rt_mutex *next_lock = NULL;
 	unsigned long flags;
-	int chain_walk = 0;
 
 	raw_spin_lock_irqsave(&current->pi_lock, flags);
 	rt_mutex_dequeue(lock, waiter);
@@ -670,12 +706,12 @@ static void remove_waiter(struct rt_mute
 		__rt_mutex_adjust_prio(owner);
 
 		if (owner->pi_blocked_on)
-			chain_walk = 1;
+			next_lock = owner->pi_blocked_on->lock;
 
 		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
 	}
 
-	if (!chain_walk)
+	if (!next_lock)
 		return;
 
 	/* gets dropped in rt_mutex_adjust_prio_chain()! */
@@ -683,7 +719,7 @@ static void remove_waiter(struct rt_mute
 
 	raw_spin_unlock(&lock->wait_lock);
 
-	rt_mutex_adjust_prio_chain(owner, 0, lock, NULL, current);
+	rt_mutex_adjust_prio_chain(owner, 0, lock, next_lock, NULL, current);
 
 	raw_spin_lock(&lock->wait_lock);
 }
@@ -696,6 +732,7 @@ static void remove_waiter(struct rt_mute
 void rt_mutex_adjust_pi(struct task_struct *task)
 {
 	struct rt_mutex_waiter *waiter;
+	struct rt_mutex *next_lock;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
@@ -706,12 +743,13 @@ void rt_mutex_adjust_pi(struct task_stru
 		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 		return;
 	}
-
+	next_lock = waiter->lock;
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 
 	/* gets dropped in rt_mutex_adjust_prio_chain()! */
 	get_task_struct(task);
-	rt_mutex_adjust_prio_chain(task, 0, NULL, NULL, task);
+
+	rt_mutex_adjust_prio_chain(task, 0, NULL, next_lock, NULL, task);
 }
 
 /**
@@ -764,7 +802,7 @@ __rt_mutex_slowlock(struct rt_mutex *loc
 }
 
 static void rt_mutex_handle_deadlock(int res, int detect_deadlock,
-				     struct rtmutex_waiter *w)
+				     struct rt_mutex_waiter *w)
 {
 	/*
 	 * If the result is not -EDEADLOCK or the caller requested




* Re: [patch 1/2] rtmutex: Handle deadlock detection smarter
  2014-06-05 15:28 ` [patch 1/2] rtmutex: Handle deadlock detection smarter Thomas Gleixner
@ 2014-06-06  1:14   ` Steven Rostedt
  2014-06-06  5:41     ` Thomas Gleixner
  2014-06-06  2:59   ` Steven Rostedt
  2014-06-09 16:49   ` [tip:locking/urgent] " tip-bot for Thomas Gleixner
  2 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2014-06-06  1:14 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Peter Zijlstra, Ingo Molnar, Brad Mouring

On Thu, 05 Jun 2014 15:28:32 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:


> @@ -1112,7 +1134,8 @@ int rt_mutex_start_proxy_lock(struct rt_
>  		return 1;
>  	}
>  
> -	ret = task_blocks_on_rt_mutex(lock, waiter, task, detect_deadlock);
> +	/* We enforce deadlock detection for futexes */
> +	ret = task_blocks_on_rt_mutex(lock, waiter, task, 1);

Why bother with passing in detect_deadlock then?

Same goes for rt_mutex_finish_proxy_lock().

-- Steve




* Re: [patch 1/2] rtmutex: Handle deadlock detection smarter
  2014-06-05 15:28 ` [patch 1/2] rtmutex: Handle deadlock detection smarter Thomas Gleixner
  2014-06-06  1:14   ` Steven Rostedt
@ 2014-06-06  2:59   ` Steven Rostedt
  2014-06-06  5:40     ` Thomas Gleixner
  2014-06-09 16:49   ` [tip:locking/urgent] " tip-bot for Thomas Gleixner
  2 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2014-06-06  2:59 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Peter Zijlstra, Ingo Molnar, Brad Mouring

On Thu, 05 Jun 2014 15:28:32 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:

> Index: tip/kernel/locking/rtmutex.h
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.h
> +++ tip/kernel/locking/rtmutex.h
> @@ -21,6 +21,10 @@
>  #define debug_rt_mutex_unlock(l)			do { } while (0)
>  #define debug_rt_mutex_init(m, n)			do { } while (0)
>  #define debug_rt_mutex_deadlock(d, a ,l)		do { } while (0)
> -#define debug_rt_mutex_print_deadlock(w)		do { } while (0)
>  #define debug_rt_mutex_detect_deadlock(w,d)		(d)
>  #define debug_rt_mutex_reset_waiter(w)			do { } while (0)
> +
> +static inline void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *w)
> +{
> +	WARN(1, "rtmutex deadlock detected\n");
> +}
> 

The above is called directly in rt_mutex_start_proxy_lock(), and as it
doesn't have a conditional where the DEBUG_RT_MUTEXES version does, I
get a ton of these:

[   27.192428] WARNING: CPU: 1 PID: 1013 at /home/rostedt/work/git/linux-rt.git/kernel/locking/rtmutex.h:29 rt_mutex_start_proxy_lock+0x89/0xc0()
[   27.205212] rtmutex deadlock detected
[   27.208901] Modules linked in: lockd bnep bluetooth nf_conntrack_ipv4 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv4 nf_defrag_ipv6 xt_state ip6table_filter nf_conntrack ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm x86_pkg_temp_thermal hp_wmi tpm_infineon snd_timer rfkill i2c_i801 lpc_ich coretemp sparse_keymap wmi mfd_core tpm_tis tpm snd soundcore serio_raw microcode pcspkr uinput i915 i2c_algo_bit e1000e drm_kms_helper ptp crc32c_intel drm pps_core i2c_core video sunrpc
[   27.262062] CPU: 1 PID: 1013 Comm: threaded-ml Tainted: G        W     3.15.0-rc8-test+ #292
[   27.270531] Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
[   27.279504]  000000000000001d ffff8800c7bc3c48 ffffffff816884fd ffff88011ea4ef68
[   27.287008]  ffff8800c7bc3c98 ffff8800c7bc3c88 ffffffff8106f28c ffff8800c7bc3c78
[   27.294519]  ffff8800c795a8b0 0000000000000000 ffff8800c890bd40 ffffc9000476e7f0
[   27.302018] Call Trace:
[   27.304491]  [<ffffffff816884fd>] dump_stack+0x4f/0x7c
[   27.309643]  [<ffffffff8106f28c>] warn_slowpath_common+0x8c/0xc0
[   27.315678]  [<ffffffff8106f376>] warn_slowpath_fmt+0x46/0x50
[   27.321461]  [<ffffffff810b9259>] rt_mutex_start_proxy_lock+0x89/0xc0
[   27.327921]  [<ffffffff810e0ad7>] futex_requeue+0x507/0x840
[   27.333504]  [<ffffffff810e1576>] do_futex+0x2e6/0xb80
[   27.338660]  [<ffffffff8118317a>] ? unmap_region+0xea/0x110
[   27.344257]  [<ffffffff8109f0f1>] ? get_parent_ip+0x11/0x50
[   27.349876]  [<ffffffff8169374d>] ? preempt_count_add+0x5d/0xb0
[   27.355816]  [<ffffffff8109f0f1>] ? get_parent_ip+0x11/0x50
[   27.361404]  [<ffffffff81338633>] ? __this_cpu_preempt_check+0x13/0x20
[   27.367947]  [<ffffffff81344093>] ? __percpu_counter_add+0x93/0xd0
[   27.374161]  [<ffffffff810e1f52>] SyS_futex+0x142/0x1a0
[   27.379403]  [<ffffffff81101fa0>] ? __audit_syscall_exit+0x200/0x280
[   27.385776]  [<ffffffff81697cd2>] system_call_fastpath+0x16/0x1b
[   27.391797] ---[ end trace 438aec68793ded61 ]---
[   27.396483] ------------[ cut here ]------------


-- Steve


* Re: [patch 1/2] rtmutex: Handle deadlock detection smarter
  2014-06-06  2:59   ` Steven Rostedt
@ 2014-06-06  5:40     ` Thomas Gleixner
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2014-06-06  5:40 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, Peter Zijlstra, Ingo Molnar, Brad Mouring

On Thu, 5 Jun 2014, Steven Rostedt wrote:

> On Thu, 05 Jun 2014 15:28:32 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > Index: tip/kernel/locking/rtmutex.h
> > ===================================================================
> > --- tip.orig/kernel/locking/rtmutex.h
> > +++ tip/kernel/locking/rtmutex.h
> > @@ -21,6 +21,10 @@
> >  #define debug_rt_mutex_unlock(l)			do { } while (0)
> >  #define debug_rt_mutex_init(m, n)			do { } while (0)
> >  #define debug_rt_mutex_deadlock(d, a ,l)		do { } while (0)
> > -#define debug_rt_mutex_print_deadlock(w)		do { } while (0)
> >  #define debug_rt_mutex_detect_deadlock(w,d)		(d)
> >  #define debug_rt_mutex_reset_waiter(w)			do { } while (0)
> > +
> > +static inline void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *w)
> > +{
> > +	WARN(1, "rtmutex deadlock detected\n");
> > +}
> > 
> 
> The above is called directly in rt_mutex_start_proxy_lock(), and as it
> doesn't have a conditional where the DEBUG_RT_MUTEXES version does, I
> get a ton of these:

Crap, yes. Of course I had debug enabled :)
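
For reference, the CONFIG_DEBUG_RT_MUTEXES version of
debug_rt_mutex_print_deadlock() in kernel/locking/rtmutex-debug.c bails
out unless a deadlock was actually recorded in the waiter, roughly
(paraphrased, report body elided):

void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter)
{
	/* Only report when debug_rt_mutex_deadlock() recorded a deadlock */
	if (!waiter->deadlock_lock || !debug_locks)
		return;
	/* ... print the recorded deadlock ... */
}

The new !DEBUG inline has no such guard, so the unconditional call at
the end of rt_mutex_start_proxy_lock() warns on every requeue. The
version that landed (see the tip-bot commit below) keeps
debug_rt_mutex_print_deadlock() as a no-op in the !DEBUG case and moves
the WARN into a separate rt_mutex_print_deadlock() helper called only
from rt_mutex_handle_deadlock().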



* Re: [patch 1/2] rtmutex: Handle deadlock detection smarter
  2014-06-06  1:14   ` Steven Rostedt
@ 2014-06-06  5:41     ` Thomas Gleixner
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2014-06-06  5:41 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, Peter Zijlstra, Ingo Molnar, Brad Mouring

On Thu, 5 Jun 2014, Steven Rostedt wrote:

> On Thu, 05 Jun 2014 15:28:32 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> 
> > @@ -1112,7 +1134,8 @@ int rt_mutex_start_proxy_lock(struct rt_
> >  		return 1;
> >  	}
> >  
> > -	ret = task_blocks_on_rt_mutex(lock, waiter, task, detect_deadlock);
> > +	/* We enforce deadlock detection for futexes */
> > +	ret = task_blocks_on_rt_mutex(lock, waiter, task, 1);
> 
> Why bother with passing in detect_deadlock then?
> 
> Same goes for rt_mutex_finish_proxy_lock().

Because that's part of the cleanup series to remove it and I did not
want to mix stuff here.



* Re: [patch 2/2] rtmutex: Detect changes in the pi lock chain
  2014-06-05 15:28 ` [patch 2/2] rtmutex: Detect changes in the pi lock chain Thomas Gleixner
@ 2014-06-06  9:48   ` Peter Zijlstra
  2014-06-06  9:58     ` Steven Rostedt
  2014-06-06 14:25   ` Steven Rostedt
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Peter Zijlstra @ 2014-06-06  9:48 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Steven Rostedt, Ingo Molnar, Brad Mouring

[-- Attachment #1: Type: text/plain, Size: 1835 bytes --]



hehe, I can almost follow this code :-)

How about something like this on top?

---
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -260,6 +260,11 @@ static void rt_mutex_adjust_prio(struct
  */
 int max_lock_depth = 1024;
 
+static inline struct rt_mutex *task_blocked_on(struct task_struct *p)
+{
+	return p->pi_blocked_on ? p->pi_blocked_on->lock : NULL;
+}
+
 /*
  * Adjust the priority chain. Also used for deadlock detection.
  * Decreases task's usage by one - may thus free the task.
@@ -443,10 +448,7 @@ static int rt_mutex_adjust_prio_chain(st
 	 * blocked itself. If yes we store a pointer to the lock for
 	 * the lock chain change detection above.
 	 */
-	if (task->pi_blocked_on)
-		next_lock = task->pi_blocked_on->lock;
-	else
-		next_lock = NULL;
+	next_lock = task_blocked_on(task);
 
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 
@@ -569,7 +571,7 @@ static int task_blocks_on_rt_mutex(struc
 {
 	struct task_struct *owner = rt_mutex_owner(lock);
 	struct rt_mutex_waiter *top_waiter = waiter;
-	struct rt_mutex *next_lock = NULL;
+	struct rt_mutex *next_lock;
 	unsigned long flags;
 	int chain_walk, res;
 
@@ -614,8 +616,8 @@ static int task_blocks_on_rt_mutex(struc
 	} else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
 		chain_walk = 1;
 	}
-	if (owner->pi_blocked_on)
-		next_lock = owner->pi_blocked_on->lock;
+
+	next_lock = task_blocked_on(owner);
 
 	raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
 	if (!chain_walk)
@@ -705,8 +707,7 @@ static void remove_waiter(struct rt_mute
 		}
 		__rt_mutex_adjust_prio(owner);
 
-		if (owner->pi_blocked_on)
-			next_lock = owner->pi_blocked_on->lock;
+		next_lock = task_blocked_on(owner);
 
 		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
 	}



* Re: [patch 2/2] rtmutex: Detect changes in the pi lock chain
  2014-06-06  9:48   ` Peter Zijlstra
@ 2014-06-06  9:58     ` Steven Rostedt
  0 siblings, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2014-06-06  9:58 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Thomas Gleixner, LKML, Ingo Molnar, Brad Mouring

On Fri, 6 Jun 2014 11:48:59 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> 
> 
> hehe, I can almost follow this code :-)
> 
> How about something like this on top?

Acked-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

> 
> ---
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -260,6 +260,11 @@ static void rt_mutex_adjust_prio(struct
>   */
>  int max_lock_depth = 1024;
>  
> +static inline struct rt_mutex *task_blocked_on(struct task_struct *p)
> +{
> +	return p->pi_blocked_on ? p->pi_blocked_on->lock : NULL;
> +}
> +
>  /*
>   * Adjust the priority chain. Also used for deadlock detection.
>   * Decreases task's usage by one - may thus free the task.
> @@ -443,10 +448,7 @@ static int rt_mutex_adjust_prio_chain(st
>  	 * blocked itself. If yes we store a pointer to the lock for
>  	 * the lock chain change detection above.
>  	 */
> -	if (task->pi_blocked_on)
> -		next_lock = task->pi_blocked_on->lock;
> -	else
> -		next_lock = NULL;
> +	next_lock = task_blocked_on(task);
>  
>  	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
>  
> @@ -569,7 +571,7 @@ static int task_blocks_on_rt_mutex(struc
>  {
>  	struct task_struct *owner = rt_mutex_owner(lock);
>  	struct rt_mutex_waiter *top_waiter = waiter;
> -	struct rt_mutex *next_lock = NULL;
> +	struct rt_mutex *next_lock;
>  	unsigned long flags;
>  	int chain_walk, res;
>  
> @@ -614,8 +616,8 @@ static int task_blocks_on_rt_mutex(struc
>  	} else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
>  		chain_walk = 1;
>  	}
> -	if (owner->pi_blocked_on)
> -		next_lock = owner->pi_blocked_on->lock;
> +
> +	next_lock = task_blocked_on(owner);
>  
>  	raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
>  	if (!chain_walk)
> @@ -705,8 +707,7 @@ static void remove_waiter(struct rt_mute
>  		}
>  		__rt_mutex_adjust_prio(owner);
>  
> -		if (owner->pi_blocked_on)
> -			next_lock = owner->pi_blocked_on->lock;
> +		next_lock = task_blocked_on(owner);
>  
>  		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
>  	}



* Re: [patch 2/2] rtmutex: Detect changes in the pi lock chain
  2014-06-05 15:28 ` [patch 2/2] rtmutex: Detect changes in the pi lock chain Thomas Gleixner
  2014-06-06  9:48   ` Peter Zijlstra
@ 2014-06-06 14:25   ` Steven Rostedt
  2014-06-06 14:49   ` Steven Rostedt
  2014-06-09 16:49   ` [tip:locking/urgent] " tip-bot for Thomas Gleixner
  3 siblings, 0 replies; 14+ messages in thread
From: Steven Rostedt @ 2014-06-06 14:25 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Peter Zijlstra, Ingo Molnar, Brad Mouring

On Thu, 05 Jun 2014 15:28:33 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:
 
> @@ -536,8 +569,9 @@ static int task_blocks_on_rt_mutex(struc
>  {
>  	struct task_struct *owner = rt_mutex_owner(lock);
>  	struct rt_mutex_waiter *top_waiter = waiter;
> +	struct rt_mutex *next_lock = NULL;
>  	unsigned long flags;
> -	int chain_walk = 0, res;
> +	int chain_walk, res;
>  
>  	/*
>  	 * Early deadlock detection. We really don't want the task to
> @@ -569,19 +603,21 @@ static int task_blocks_on_rt_mutex(struc
>  	if (!owner)
>  		return 0;
>  
> +	raw_spin_lock_irqsave(&owner->pi_lock, flags);
>  	if (waiter == rt_mutex_top_waiter(lock)) {
> -		raw_spin_lock_irqsave(&owner->pi_lock, flags);
>  		rt_mutex_dequeue_pi(owner, top_waiter);
>  		rt_mutex_enqueue_pi(owner, waiter);
>  
>  		__rt_mutex_adjust_prio(owner);
>  		if (owner->pi_blocked_on)
>  			chain_walk = 1;
> -		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
> -	}
> -	else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock))
> +	} else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
>  		chain_walk = 1;
> +	}
> +	if (owner->pi_blocked_on)
> +		next_lock = owner->pi_blocked_on->lock;
>  
> +	raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
>  	if (!chain_walk)
>  		return 0;

Here's another optimization:

	/* If the owner is not blocked, no need to walk the chain */
	if (!chain_walk || !next_lock)
		return 0;



-- Steve


* Re: [patch 2/2] rtmutex: Detect changes in the pi lock chain
  2014-06-05 15:28 ` [patch 2/2] rtmutex: Detect changes in the pi lock chain Thomas Gleixner
  2014-06-06  9:48   ` Peter Zijlstra
  2014-06-06 14:25   ` Steven Rostedt
@ 2014-06-06 14:49   ` Steven Rostedt
  2014-06-07  7:31     ` Thomas Gleixner
  2014-06-09 16:49   ` [tip:locking/urgent] " tip-bot for Thomas Gleixner
  3 siblings, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2014-06-06 14:49 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML, Peter Zijlstra, Ingo Molnar, Brad Mouring

On Thu, 05 Jun 2014 15:28:33 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:
 
>  /**
> @@ -764,7 +802,7 @@ __rt_mutex_slowlock(struct rt_mutex *loc
>  }
>  
>  static void rt_mutex_handle_deadlock(int res, int detect_deadlock,
> -				     struct rtmutex_waiter *w)
> +				     struct rt_mutex_waiter *w)
>  {
>  	/*
>  	 * If the result is not -EDEADLOCK or the caller requested
> 

This needs to be folded in patch 1.

-- Steve



* Re: [patch 2/2] rtmutex: Detect changes in the pi lock chain
  2014-06-06 14:49   ` Steven Rostedt
@ 2014-06-07  7:31     ` Thomas Gleixner
  0 siblings, 0 replies; 14+ messages in thread
From: Thomas Gleixner @ 2014-06-07  7:31 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: LKML, Peter Zijlstra, Ingo Molnar, Brad Mouring

On Fri, 6 Jun 2014, Steven Rostedt wrote:
> On Thu, 05 Jun 2014 15:28:33 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
>  
> >  /**
> > @@ -764,7 +802,7 @@ __rt_mutex_slowlock(struct rt_mutex *loc
> >  }
> >  
> >  static void rt_mutex_handle_deadlock(int res, int detect_deadlock,
> > -				     struct rtmutex_waiter *w)
> > +				     struct rt_mutex_waiter *w)
> >  {
> >  	/*
> >  	 * If the result is not -EDEADLOCK or the caller requested
> > 
> 
> This needs to be folded in patch 1.

I know :) quilt refresh folded it automagically :)


* [tip:locking/urgent] rtmutex: Handle deadlock detection smarter
  2014-06-05 15:28 ` [patch 1/2] rtmutex: Handle deadlock detection smarter Thomas Gleixner
  2014-06-06  1:14   ` Steven Rostedt
  2014-06-06  2:59   ` Steven Rostedt
@ 2014-06-09 16:49   ` tip-bot for Thomas Gleixner
  2 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Thomas Gleixner @ 2014-06-09 16:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, bmouring, rostedt, peterz, tglx

Commit-ID:  3d5c9340d1949733eb37616abd15db36aef9a57c
Gitweb:     http://git.kernel.org/tip/3d5c9340d1949733eb37616abd15db36aef9a57c
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 5 Jun 2014 12:34:23 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 7 Jun 2014 14:55:40 +0200

rtmutex: Handle deadlock detection smarter

Even in the case when deadlock detection is not requested by the
caller, we can detect deadlocks. Right now the code stops the lock
chain walk and keeps the waiter enqueued, even on itself. Silly not to
yell when such a scenario is detected and to keep the waiter enqueued.

Return -EDEADLK unconditionally and handle it at the call sites.

The futex calls return -EDEADLK. The non-futex ones dequeue the
waiter, throw a warning and put the task into a schedule loop.

Tagged for stable as it makes the code more robust.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Brad Mouring <bmouring@ni.com>
Link: http://lkml.kernel.org/r/20140605152801.836501969@linutronix.de
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex-debug.h |  5 +++++
 kernel/locking/rtmutex.c       | 33 ++++++++++++++++++++++++++++-----
 kernel/locking/rtmutex.h       |  5 +++++
 3 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/kernel/locking/rtmutex-debug.h b/kernel/locking/rtmutex-debug.h
index 14193d5..ab29b6a 100644
--- a/kernel/locking/rtmutex-debug.h
+++ b/kernel/locking/rtmutex-debug.h
@@ -31,3 +31,8 @@ static inline int debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter,
 {
 	return (waiter != NULL);
 }
+
+static inline void rt_mutex_print_deadlock(struct rt_mutex_waiter *w)
+{
+	debug_rt_mutex_print_deadlock(w);
+}
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index a620d4d..eb7a463 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -314,7 +314,7 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 		}
 		put_task_struct(task);
 
-		return deadlock_detect ? -EDEADLK : 0;
+		return -EDEADLK;
 	}
  retry:
 	/*
@@ -377,7 +377,7 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 	if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
 		debug_rt_mutex_deadlock(deadlock_detect, orig_waiter, lock);
 		raw_spin_unlock(&lock->wait_lock);
-		ret = deadlock_detect ? -EDEADLK : 0;
+		ret = -EDEADLK;
 		goto out_unlock_pi;
 	}
 
@@ -548,7 +548,7 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
 	 * which is wrong, as the other waiter is not in a deadlock
 	 * situation.
 	 */
-	if (detect_deadlock && owner == task)
+	if (owner == task)
 		return -EDEADLK;
 
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
@@ -763,6 +763,26 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state,
 	return ret;
 }
 
+static void rt_mutex_handle_deadlock(int res, int detect_deadlock,
+				     struct rt_mutex_waiter *w)
+{
+	/*
+	 * If the result is not -EDEADLOCK or the caller requested
+	 * deadlock detection, nothing to do here.
+	 */
+	if (res != -EDEADLOCK || detect_deadlock)
+		return;
+
+	/*
+	 * Yell loudly and stop the task right here.
+	 */
+	rt_mutex_print_deadlock(w);
+	while (1) {
+		set_current_state(TASK_INTERRUPTIBLE);
+		schedule();
+	}
+}
+
 /*
  * Slow path lock function:
  */
@@ -802,8 +822,10 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 
 	set_current_state(TASK_RUNNING);
 
-	if (unlikely(ret))
+	if (unlikely(ret)) {
 		remove_waiter(lock, &waiter);
+		rt_mutex_handle_deadlock(ret, detect_deadlock, &waiter);
+	}
 
 	/*
 	 * try_to_take_rt_mutex() sets the waiter bit
@@ -1112,7 +1134,8 @@ int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
 		return 1;
 	}
 
-	ret = task_blocks_on_rt_mutex(lock, waiter, task, detect_deadlock);
+	/* We enforce deadlock detection for futexes */
+	ret = task_blocks_on_rt_mutex(lock, waiter, task, 1);
 
 	if (ret && !rt_mutex_owner(lock)) {
 		/*
diff --git a/kernel/locking/rtmutex.h b/kernel/locking/rtmutex.h
index a1a1dd0..f6a1f3c 100644
--- a/kernel/locking/rtmutex.h
+++ b/kernel/locking/rtmutex.h
@@ -24,3 +24,8 @@
 #define debug_rt_mutex_print_deadlock(w)		do { } while (0)
 #define debug_rt_mutex_detect_deadlock(w,d)		(d)
 #define debug_rt_mutex_reset_waiter(w)			do { } while (0)
+
+static inline void rt_mutex_print_deadlock(struct rt_mutex_waiter *w)
+{
+	WARN(1, "rtmutex deadlock detected\n");
+}


* [tip:locking/urgent] rtmutex: Detect changes in the pi lock chain
  2014-06-05 15:28 ` [patch 2/2] rtmutex: Detect changes in the pi lock chain Thomas Gleixner
                     ` (2 preceding siblings ...)
  2014-06-06 14:49   ` Steven Rostedt
@ 2014-06-09 16:49   ` tip-bot for Thomas Gleixner
  3 siblings, 0 replies; 14+ messages in thread
From: tip-bot for Thomas Gleixner @ 2014-06-09 16:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, rostedt, bmouring, peterz, tglx

Commit-ID:  82084984383babe728e6e3c9a8e5c46278091315
Gitweb:     http://git.kernel.org/tip/82084984383babe728e6e3c9a8e5c46278091315
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Thu, 5 Jun 2014 11:16:12 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 7 Jun 2014 14:55:40 +0200

rtmutex: Detect changes in the pi lock chain

When we walk the lock chain, we drop all locks after each step. So the
lock chain can change under us before we reacquire the locks. That's
harmless in principle as we just follow the wrong lock path. But it
can lead to a false positive in the deadlock detection logic:

T0 holds L0
T0 blocks on L1 held by T1
T1 blocks on L2 held by T2
T2 blocks on L3 held by T3
T3 blocks on L4 held by T4

Now we walk the chain

lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> 
     lock T2 ->  adjust T2 ->  drop locks

T2 times out and blocks on L0

Now we continue:

lock T2 -> lock L0 -> deadlock detected, but it's not a deadlock at all.

Brad tried to work around that in the deadlock detection logic itself,
but the more I looked at it the less I liked it, because it's crystal
ball magic after the fact.

We can actually detect a chain change very simply:

lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->

     next_lock = T2->pi_blocked_on->lock;

drop locks

T2 times out and blocks on L0

Now we continue:

lock T2 -> 

     if (next_lock != T2->pi_blocked_on->lock)
     	   return;

So if we detect that T2 is now blocked on a different lock we stop the
chain walk. That's also correct in the following scenario:

lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->

     next_lock = T2->pi_blocked_on->lock;

drop locks

T3 times out and drops L3
T2 acquires L3 and blocks on L4 now

Now we continue:

lock T2 -> 

     if (next_lock != T2->pi_blocked_on->lock)
     	   return;

We don't have to follow up the chain at that point, because T2
propagated our priority up to T4 already.

[ Folded a cleanup patch from peterz ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Brad Mouring <bmouring@ni.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140605152801.930031935@linutronix.de
Cc: stable@vger.kernel.org
---
 kernel/locking/rtmutex.c | 95 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 71 insertions(+), 24 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index eb7a463..a8a83a2 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -260,27 +260,36 @@ static void rt_mutex_adjust_prio(struct task_struct *task)
  */
 int max_lock_depth = 1024;
 
+static inline struct rt_mutex *task_blocked_on_lock(struct task_struct *p)
+{
+	return p->pi_blocked_on ? p->pi_blocked_on->lock : NULL;
+}
+
 /*
  * Adjust the priority chain. Also used for deadlock detection.
  * Decreases task's usage by one - may thus free the task.
  *
- * @task: the task owning the mutex (owner) for which a chain walk is probably
- *	  needed
+ * @task:	the task owning the mutex (owner) for which a chain walk is
+ *		probably needed
  * @deadlock_detect: do we have to carry out deadlock detection?
- * @orig_lock: the mutex (can be NULL if we are walking the chain to recheck
- * 	       things for a task that has just got its priority adjusted, and
- *	       is waiting on a mutex)
+ * @orig_lock:	the mutex (can be NULL if we are walking the chain to recheck
+ *		things for a task that has just got its priority adjusted, and
+ *		is waiting on a mutex)
+ * @next_lock:	the mutex on which the owner of @orig_lock was blocked before
+ *		we dropped its pi_lock. Is never dereferenced, only used for
+ *		comparison to detect lock chain changes.
  * @orig_waiter: rt_mutex_waiter struct for the task that has just donated
- *		 its priority to the mutex owner (can be NULL in the case
- *		 depicted above or if the top waiter is gone away and we are
- *		 actually deboosting the owner)
- * @top_task: the current top waiter
+ *		its priority to the mutex owner (can be NULL in the case
+ *		depicted above or if the top waiter is gone away and we are
+ *		actually deboosting the owner)
+ * @top_task:	the current top waiter
  *
  * Returns 0 or -EDEADLK.
  */
 static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 				      int deadlock_detect,
 				      struct rt_mutex *orig_lock,
+				      struct rt_mutex *next_lock,
 				      struct rt_mutex_waiter *orig_waiter,
 				      struct task_struct *top_task)
 {
@@ -339,6 +348,18 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 		goto out_unlock_pi;
 
 	/*
+	 * We dropped all locks after taking a refcount on @task, so
+	 * the task might have moved on in the lock chain or even left
+	 * the chain completely and blocks now on an unrelated lock or
+	 * on @orig_lock.
+	 *
+	 * We stored the lock on which @task was blocked in @next_lock,
+	 * so we can detect the chain change.
+	 */
+	if (next_lock != waiter->lock)
+		goto out_unlock_pi;
+
+	/*
 	 * Drop out, when the task has no waiters. Note,
 	 * top_waiter can be NULL, when we are in the deboosting
 	 * mode!
@@ -422,11 +443,26 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 		__rt_mutex_adjust_prio(task);
 	}
 
+	/*
+	 * Check whether the task which owns the current lock is pi
+	 * blocked itself. If yes we store a pointer to the lock for
+	 * the lock chain change detection above. After we dropped
+	 * task->pi_lock next_lock cannot be dereferenced anymore.
+	 */
+	next_lock = task_blocked_on_lock(task);
+
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 
 	top_waiter = rt_mutex_top_waiter(lock);
 	raw_spin_unlock(&lock->wait_lock);
 
+	/*
+	 * We reached the end of the lock chain. Stop right here. No
+	 * point to go back just to figure that out.
+	 */
+	if (!next_lock)
+		goto out_put_task;
+
 	if (!detect_deadlock && waiter != top_waiter)
 		goto out_put_task;
 
@@ -536,8 +572,9 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
 {
 	struct task_struct *owner = rt_mutex_owner(lock);
 	struct rt_mutex_waiter *top_waiter = waiter;
-	unsigned long flags;
+	struct rt_mutex *next_lock;
 	int chain_walk = 0, res;
+	unsigned long flags;
 
 	/*
 	 * Early deadlock detection. We really don't want the task to
@@ -569,20 +606,28 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
 	if (!owner)
 		return 0;
 
+	raw_spin_lock_irqsave(&owner->pi_lock, flags);
 	if (waiter == rt_mutex_top_waiter(lock)) {
-		raw_spin_lock_irqsave(&owner->pi_lock, flags);
 		rt_mutex_dequeue_pi(owner, top_waiter);
 		rt_mutex_enqueue_pi(owner, waiter);
 
 		__rt_mutex_adjust_prio(owner);
 		if (owner->pi_blocked_on)
 			chain_walk = 1;
-		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
-	}
-	else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock))
+	} else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
 		chain_walk = 1;
+	}
 
-	if (!chain_walk)
+	/* Store the lock on which owner is blocked or NULL */
+	next_lock = task_blocked_on_lock(owner);
+
+	raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
+	/*
+	 * Even if full deadlock detection is on, if the owner is not
+	 * blocked itself, we can avoid finding this out in the chain
+	 * walk.
+	 */
+	if (!chain_walk || !next_lock)
 		return 0;
 
 	/*
@@ -594,8 +639,8 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
 
 	raw_spin_unlock(&lock->wait_lock);
 
-	res = rt_mutex_adjust_prio_chain(owner, detect_deadlock, lock, waiter,
-					 task);
+	res = rt_mutex_adjust_prio_chain(owner, detect_deadlock, lock,
+					 next_lock, waiter, task);
 
 	raw_spin_lock(&lock->wait_lock);
 
@@ -644,8 +689,8 @@ static void remove_waiter(struct rt_mutex *lock,
 {
 	int first = (waiter == rt_mutex_top_waiter(lock));
 	struct task_struct *owner = rt_mutex_owner(lock);
+	struct rt_mutex *next_lock = NULL;
 	unsigned long flags;
-	int chain_walk = 0;
 
 	raw_spin_lock_irqsave(&current->pi_lock, flags);
 	rt_mutex_dequeue(lock, waiter);
@@ -669,13 +714,13 @@ static void remove_waiter(struct rt_mutex *lock,
 		}
 		__rt_mutex_adjust_prio(owner);
 
-		if (owner->pi_blocked_on)
-			chain_walk = 1;
+		/* Store the lock on which owner is blocked or NULL */
+		next_lock = task_blocked_on_lock(owner);
 
 		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
 	}
 
-	if (!chain_walk)
+	if (!next_lock)
 		return;
 
 	/* gets dropped in rt_mutex_adjust_prio_chain()! */
@@ -683,7 +728,7 @@ static void remove_waiter(struct rt_mutex *lock,
 
 	raw_spin_unlock(&lock->wait_lock);
 
-	rt_mutex_adjust_prio_chain(owner, 0, lock, NULL, current);
+	rt_mutex_adjust_prio_chain(owner, 0, lock, next_lock, NULL, current);
 
 	raw_spin_lock(&lock->wait_lock);
 }
@@ -696,6 +741,7 @@ static void remove_waiter(struct rt_mutex *lock,
 void rt_mutex_adjust_pi(struct task_struct *task)
 {
 	struct rt_mutex_waiter *waiter;
+	struct rt_mutex *next_lock;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
@@ -706,12 +752,13 @@ void rt_mutex_adjust_pi(struct task_struct *task)
 		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 		return;
 	}
-
+	next_lock = waiter->lock;
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 
 	/* gets dropped in rt_mutex_adjust_prio_chain()! */
 	get_task_struct(task);
-	rt_mutex_adjust_prio_chain(task, 0, NULL, NULL, task);
+
+	rt_mutex_adjust_prio_chain(task, 0, NULL, next_lock, NULL, task);
 }
 
 /**

