* [patch V3 0/7] rtmutex: Code clarification and optimization
@ 2014-06-09 20:28 Thomas Gleixner
  2014-06-09 20:28 ` [patch V3 1/7] rtmutex: Deobfuscate chain walk Thomas Gleixner
                   ` (7 more replies)
  0 siblings, 8 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-09 20:28 UTC (permalink / raw)
  To: LKML
  Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

This is another round of clarification and optimization patches for
the rtmutex code.

When I looked into the deadlock detector last week I spent some time
figuring out how the various arguments and local variables are
protected and what the scope of the various protection mechanisms is,
so I added documentation for this as well.

I also changed the last patch, which avoids the requeueing in case of
a deadlock-detection-only chain walk. Instead of adding tons of
conditionals to the boost/deboost code path, I simply added a separate
code path for it.

Thanks,

	tglx



* [patch V3 1/7] rtmutex: Deobfuscate chain walk
  2014-06-09 20:28 [patch V3 0/7] rtmutex: Code clarification and optimization Thomas Gleixner
@ 2014-06-09 20:28 ` Thomas Gleixner
  2014-06-09 20:59   ` Steven Rostedt
                     ` (2 more replies)
  2014-06-09 20:28 ` [patch V3 2/7] rtmutex: Clarify the boost/deboost part Thomas Gleixner
                   ` (6 subsequent siblings)
  7 siblings, 3 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-09 20:28 UTC (permalink / raw)
  To: LKML
  Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

[-- Attachment #1: rtmutex-deobfuscate-chain-walk.patch --]
[-- Type: text/plain, Size: 1092 bytes --]

There is no point in keeping the task ref across the check for the lock
owner. Drop the ref before that, so the protection context is clear.

Found while documenting the chain walk.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -411,6 +411,8 @@ static int rt_mutex_adjust_prio_chain(st
 
 	/* Release the task */
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+	put_task_struct(task);
+
 	if (!rt_mutex_owner(lock)) {
 		/*
 		 * If the requeue above changed the top waiter, then we need
@@ -420,9 +422,8 @@ static int rt_mutex_adjust_prio_chain(st
 		if (top_waiter != rt_mutex_top_waiter(lock))
 			wake_up_process(rt_mutex_top_waiter(lock)->task);
 		raw_spin_unlock(&lock->wait_lock);
-		goto out_put_task;
+		return 0;
 	}
-	put_task_struct(task);
 
 	/* Grab the next task */
 	task = rt_mutex_owner(lock);
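
A minimal standalone sketch of the rule the patch enforces, as a
userspace model (task_get()/task_put(), struct task and walk_step() are
stand-ins, not the kernel API): drop the reference immediately after the
last access that needs it, so everything below visibly runs without it.

	#include <stdatomic.h>
	#include <stdlib.h>

	struct task {
		atomic_int refcount;
		int prio;
	};

	static void task_get(struct task *t)
	{
		atomic_fetch_add(&t->refcount, 1);
	}

	static void task_put(struct task *t)
	{
		/* Free on the last reference. */
		if (atomic_fetch_sub(&t->refcount, 1) == 1)
			free(t);
	}

	static int walk_step(struct task *t)
	{
		int prio;

		task_get(t);
		prio = t->prio;	/* last access that needs the ref */
		task_put(t);	/* drop it here, not after unrelated checks */

		/* Nothing below this point may dereference t. */
		return prio;
	}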




* [patch V3 2/7] rtmutex: Clarify the boost/deboost part
  2014-06-09 20:28 [patch V3 0/7] rtmutex: Code clarification and optimization Thomas Gleixner
  2014-06-09 20:28 ` [patch V3 1/7] rtmutex: Deobfuscate chain walk Thomas Gleixner
@ 2014-06-09 20:28 ` Thomas Gleixner
  2014-06-10  0:37   ` Steven Rostedt
                     ` (2 more replies)
  2014-06-09 20:28 ` [patch V3 3/7] rtmutex: Document pi chain walk Thomas Gleixner
                   ` (5 subsequent siblings)
  7 siblings, 3 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-09 20:28 UTC (permalink / raw)
  To: LKML
  Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

[-- Attachment #1: rtmutex-clarify-the-lock-chain-walk.patch --]
[-- Type: text/plain, Size: 4249 bytes --]

Add a separate local variable for the boost/deboost logic to make the
code more readable. Add comments where appropriate.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex.c |   58 ++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 48 insertions(+), 10 deletions(-)

Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -293,9 +293,10 @@ static int rt_mutex_adjust_prio_chain(st
 				      struct rt_mutex_waiter *orig_waiter,
 				      struct task_struct *top_task)
 {
-	struct rt_mutex *lock;
 	struct rt_mutex_waiter *waiter, *top_waiter = orig_waiter;
+	struct rt_mutex_waiter *prerequeue_top_waiter;
 	int detect_deadlock, ret = 0, depth = 0;
+	struct rt_mutex *lock;
 	unsigned long flags;
 
 	detect_deadlock = debug_rt_mutex_detect_deadlock(orig_waiter,
@@ -402,9 +403,14 @@ static int rt_mutex_adjust_prio_chain(st
 		goto out_unlock_pi;
 	}
 
-	top_waiter = rt_mutex_top_waiter(lock);
+	/*
+	 * Store the current top waiter before doing the requeue
+	 * operation on @lock. We need it for the boost/deboost
+	 * decision below.
+	 */
+	prerequeue_top_waiter = rt_mutex_top_waiter(lock);
 
-	/* Requeue the waiter */
+	/* Requeue the waiter in the lock waiter list. */
 	rt_mutex_dequeue(lock, waiter);
 	waiter->prio = task->prio;
 	rt_mutex_enqueue(lock, waiter);
@@ -413,35 +419,58 @@ static int rt_mutex_adjust_prio_chain(st
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 	put_task_struct(task);
 
+	/*
+	 * We must abort the chain walk if there is no lock owner even
+	 * in the deadlock detection case, as we have nothing to
+	 * follow here. This is the end of the chain we are walking.
+	 */
 	if (!rt_mutex_owner(lock)) {
 		/*
 		 * If the requeue above changed the top waiter, then we need
 		 * to wake the new top waiter up to try to get the lock.
 		 */
-
-		if (top_waiter != rt_mutex_top_waiter(lock))
+		if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
 			wake_up_process(rt_mutex_top_waiter(lock)->task);
 		raw_spin_unlock(&lock->wait_lock);
 		return 0;
 	}
 
-	/* Grab the next task */
+	/* Grab the next task, i.e. the owner of @lock */
 	task = rt_mutex_owner(lock);
 	get_task_struct(task);
 	raw_spin_lock_irqsave(&task->pi_lock, flags);
 
 	if (waiter == rt_mutex_top_waiter(lock)) {
-		/* Boost the owner */
-		rt_mutex_dequeue_pi(task, top_waiter);
+		/*
+		 * The waiter became the new top (highest priority)
+		 * waiter on the lock. Replace the previous top waiter
+		 * in the owner task's pi waiters list with this waiter
+		 * and adjust the priority of the owner.
+		 */
+		rt_mutex_dequeue_pi(task, prerequeue_top_waiter);
 		rt_mutex_enqueue_pi(task, waiter);
 		__rt_mutex_adjust_prio(task);
 
-	} else if (top_waiter == waiter) {
-		/* Deboost the owner */
+	} else if (prerequeue_top_waiter == waiter) {
+		/*
+		 * The waiter was the top waiter on the lock, but is
+		 * no longer the top priority waiter. Replace waiter in
+		 * the owner task's pi waiters list with the new top
+		 * (highest priority) waiter and adjust the priority
+		 * of the owner.
+		 * The new top waiter is stored in @waiter so that
+		 * @waiter == @top_waiter evaluates to true below and
+		 * we continue to deboost the rest of the chain.
+		 */
 		rt_mutex_dequeue_pi(task, waiter);
 		waiter = rt_mutex_top_waiter(lock);
 		rt_mutex_enqueue_pi(task, waiter);
 		__rt_mutex_adjust_prio(task);
+	} else {
+		/*
+		 * Nothing changed. No need to do any priority
+		 * adjustment.
+		 */
 	}
 
 	/*
@@ -454,6 +483,10 @@ static int rt_mutex_adjust_prio_chain(st
 
 	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
 
+	/*
+	 * Store the top waiter of @lock for the end of chain walk
+	 * decision below.
+	 */
 	top_waiter = rt_mutex_top_waiter(lock);
 	raw_spin_unlock(&lock->wait_lock);
 
@@ -464,6 +497,11 @@ static int rt_mutex_adjust_prio_chain(st
 	if (!next_lock)
 		goto out_put_task;
 
+	/*
+	 * If the current waiter is not the top waiter on the lock,
+	 * then we can stop the chain walk here if we are not in full
+	 * deadlock detection mode.
+	 */
 	if (!detect_deadlock && waiter != top_waiter)
 		goto out_put_task;
 




* [patch V3 4/7] rtmutex: Simplify remove_waiter()
  2014-06-09 20:28 [patch V3 0/7] rtmutex: Code clarification and optimization Thomas Gleixner
                   ` (2 preceding siblings ...)
  2014-06-09 20:28 ` [patch V3 3/7] rtmutex: Document pi chain walk Thomas Gleixner
@ 2014-06-09 20:28 ` Thomas Gleixner
  2014-06-10  0:53   ` Steven Rostedt
  2014-06-10 14:10   ` Brad Mouring
  2014-06-09 20:28 ` [patch V3 5/7] rtmutex: Confine deadlock logic to futex Thomas Gleixner
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-09 20:28 UTC (permalink / raw)
  To: LKML
  Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

[-- Attachment #1: rtmutex-simplify-deboost.patch --]
[-- Type: text/plain, Size: 1843 bytes --]

Exit right away when the removed waiter was not the top priority
waiter on the lock. Get rid of the extra indent level.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex.c |   26 ++++++++++----------------
 1 file changed, 10 insertions(+), 16 deletions(-)

Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -780,7 +780,7 @@ static void remove_waiter(struct rt_mute
 {
 	int first = (waiter == rt_mutex_top_waiter(lock));
 	struct task_struct *owner = rt_mutex_owner(lock);
-	struct rt_mutex *next_lock = NULL;
+	struct rt_mutex *next_lock;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&current->pi_lock, flags);
@@ -788,28 +788,22 @@ static void remove_waiter(struct rt_mute
 	current->pi_blocked_on = NULL;
 	raw_spin_unlock_irqrestore(&current->pi_lock, flags);
 
-	if (!owner)
+	if (!owner || !first)
 		return;
 
-	if (first) {
+	raw_spin_lock_irqsave(&owner->pi_lock, flags);
 
-		raw_spin_lock_irqsave(&owner->pi_lock, flags);
+	rt_mutex_dequeue_pi(owner, waiter);
 
-		rt_mutex_dequeue_pi(owner, waiter);
+	if (rt_mutex_has_waiters(lock))
+		rt_mutex_enqueue_pi(owner, rt_mutex_top_waiter(lock));
 
-		if (rt_mutex_has_waiters(lock)) {
-			struct rt_mutex_waiter *next;
+	__rt_mutex_adjust_prio(owner);
 
-			next = rt_mutex_top_waiter(lock);
-			rt_mutex_enqueue_pi(owner, next);
-		}
-		__rt_mutex_adjust_prio(owner);
+	/* Store the lock on which owner is blocked or NULL */
+	next_lock = task_blocked_on_lock(owner);
 
-		/* Store the lock on which owner is blocked or NULL */
-		next_lock = task_blocked_on_lock(owner);
-
-		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
-	}
+	raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
 
 	if (!next_lock)
 		return;
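
The shape of the change, as an abbreviated before/after sketch (the
dequeue/requeue details are elided, see the diff above):

	/* before: the real work nested under "if (first)" */
	if (!owner)
		return;
	if (first) {
		/* ... dequeue waiter, requeue top waiter, adjust prio ... */
	}

	/* after: bail out early, one indent level less */
	if (!owner || !first)
		return;
	/* ... dequeue waiter, requeue top waiter, adjust prio ... */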




* [patch V3 3/7] rtmutex: Document pi chain walk
  2014-06-09 20:28 [patch V3 0/7] rtmutex: Code clarification and optimization Thomas Gleixner
  2014-06-09 20:28 ` [patch V3 1/7] rtmutex: Deobfuscate chain walk Thomas Gleixner
  2014-06-09 20:28 ` [patch V3 2/7] rtmutex: Clarify the boost/deboost part Thomas Gleixner
@ 2014-06-09 20:28 ` Thomas Gleixner
  2014-06-10  0:45   ` Steven Rostedt
  2014-06-10 14:21   ` Brad Mouring
  2014-06-09 20:28 ` [patch V3 4/7] rtmutex: Simplify remove_waiter() Thomas Gleixner
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-09 20:28 UTC (permalink / raw)
  To: LKML
  Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

[-- Attachment #1: rtmutex-document-chain-walk.patch --]
[-- Type: text/plain, Size: 2756 bytes --]

Add commentary to document the chain walk and the protection mechanisms
and their scope.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex.c |   52 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -285,6 +285,47 @@ static inline struct rt_mutex *task_bloc
  * @top_task:	the current top waiter
  *
  * Returns 0 or -EDEADLK.
+ *
+ * Chain walk basics and protection scope
+ *
+ * [A] refcount on task
+ * [B] task->pi_lock held
+ * [C] rtmutex->lock held
+ *
+ * call()					Protected by
+ *	@task					[A]
+ *	@orig_lock if != NULL			@top_task is blocked on it
+ *	@next_lock				Unprotected. Cannot be
+ *						dereferenced. Only used for
+ *						comparison.
+ *	@orig_waiter if != NULL			@top_task is blocked on it
+ *	@top_task				current, or in case of proxy
+ *						locking protected by calling
+ *						code
+ * again:
+ *	loop_sanity_check();
+ * retry:
+ *	lock(task->pi_lock);			[A] acquire [B]
+ *	waiter = task->pi_blocked_on;		[B]
+ *	check_exit_conditions();		[B]
+ *	lock = waiter->lock;			[B]
+ *	if (!try_lock(lock->wait_lock)) {	[B] try to acquire [C]
+ *		unlock(task->pi_lock);		drop [B]
+ *		goto retry;
+ *	}
+ *	check_exit_conditions();		[B] + [C]
+ *	requeue_lock_waiter(lock, waiter);	[B] + [C]
+ *	unlock(task->pi_lock);			drop [B]
+ *	drop_task_ref(task);			drop [A]
+ *	check_exit_conditions();		[C]
+ *	task = owner(lock);			[C]
+ *	get_task_ref(task);			[C] acquire [A]
+ *	lock(task->pi_lock);			[C] acquire [B]
+ *	requeue_pi_waiter(task, waiters(lock));	[B] + [C]
+ *	check_exit_conditions();		[B] + [C]
+ *	unlock(task->pi_lock);			drop [B]
+ *	unlock(lock->wait_lock);		drop [C]
+ *	goto again;
  */
 static int rt_mutex_adjust_prio_chain(struct task_struct *task,
 				      int deadlock_detect,
@@ -326,6 +367,12 @@ static int rt_mutex_adjust_prio_chain(st
 
 		return -EDEADLK;
 	}
+
+	/*
+	 * We are fully preemptible here and only hold the refcount on
+	 * @task. So everything can have changed under us since the
+	 * caller or our own code below (goto retry) dropped all locks.
+	 */
  retry:
 	/*
 	 * Task can not go away as we did a get_task() before !
@@ -383,6 +430,11 @@ static int rt_mutex_adjust_prio_chain(st
 	if (!detect_deadlock && waiter->prio == task->prio)
 		goto out_unlock_pi;
 
+	/*
+	 * We need to trylock here as we are holding task->pi_lock,
+	 * which is the reverse lock order versus the other rtmutex
+	 * operations.
+	 */
 	lock = waiter->lock;
 	if (!raw_spin_trylock(&lock->wait_lock)) {
 		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
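
The trylock-and-retry pattern described in the comment above, as a
self-contained userspace sketch (pthread mutexes and chain_step() stand
in for the raw spinlocks and the walk step; elsewhere the order is
wait_lock -> pi_lock, so the reversed order here must never block):

	#include <pthread.h>

	static pthread_mutex_t pi_lock   = PTHREAD_MUTEX_INITIALIZER;
	static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;

	static void chain_step(void)
	{
	retry:
		pthread_mutex_lock(&pi_lock);
		/* Reverse lock order: only trylock, never block, to avoid
		 * an ABBA deadlock against the wait_lock -> pi_lock path. */
		if (pthread_mutex_trylock(&wait_lock)) {
			pthread_mutex_unlock(&pi_lock);	/* drop and retry */
			goto retry;
		}
		/* ... both locks held: requeue the waiter ... */
		pthread_mutex_unlock(&wait_lock);
		pthread_mutex_unlock(&pi_lock);
	}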




* [patch V3 5/7] rtmutex: Confine deadlock logic to futex
  2014-06-09 20:28 [patch V3 0/7] rtmutex: Code clarification and optimization Thomas Gleixner
                   ` (3 preceding siblings ...)
  2014-06-09 20:28 ` [patch V3 4/7] rtmutex: Simplify remove_waiter() Thomas Gleixner
@ 2014-06-09 20:28 ` Thomas Gleixner
  2014-06-10  0:59   ` Steven Rostedt
  2014-06-09 20:28 ` [patch V3 6/7] rtmutex: Cleanup deadlock detector debug logic Thomas Gleixner
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-09 20:28 UTC (permalink / raw)
  To: LKML
  Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

[-- Attachment #1: rtmutex-confine-deadlock-logic-to-futex.patch --]
[-- Type: text/plain, Size: 7885 bytes --]

The deadlock logic is only required for futexes.

Remove the extra arguments from the public functions and also from the
futex specific ones, which are always called with deadlock detection
enabled.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/rtmutex.h         |    6 +---
 kernel/futex.c                  |   10 +++---
 kernel/locking/rtmutex.c        |   59 ++++++++++++++++++++--------------------
 kernel/locking/rtmutex_common.h |    7 ++--
 4 files changed, 40 insertions(+), 42 deletions(-)

Index: tip/include/linux/rtmutex.h
===================================================================
--- tip.orig/include/linux/rtmutex.h
+++ tip/include/linux/rtmutex.h
@@ -90,11 +90,9 @@ extern void __rt_mutex_init(struct rt_mu
 extern void rt_mutex_destroy(struct rt_mutex *lock);
 
 extern void rt_mutex_lock(struct rt_mutex *lock);
-extern int rt_mutex_lock_interruptible(struct rt_mutex *lock,
-						int detect_deadlock);
+extern int rt_mutex_lock_interruptible(struct rt_mutex *lock);
 extern int rt_mutex_timed_lock(struct rt_mutex *lock,
-					struct hrtimer_sleeper *timeout,
-					int detect_deadlock);
+			       struct hrtimer_sleeper *timeout);
 
 extern int rt_mutex_trylock(struct rt_mutex *lock);
 
Index: tip/kernel/futex.c
===================================================================
--- tip.orig/kernel/futex.c
+++ tip/kernel/futex.c
@@ -1718,7 +1718,7 @@ retry_private:
 			this->pi_state = pi_state;
 			ret = rt_mutex_start_proxy_lock(&pi_state->pi_mutex,
 							this->rt_waiter,
-							this->task, 1);
+							this->task);
 			if (ret == 1) {
 				/* We got the lock. */
 				requeue_pi_wake_futex(this, &key2, hb2);
@@ -2337,9 +2337,9 @@ retry_private:
 	/*
 	 * Block on the PI mutex:
 	 */
-	if (!trylock)
-		ret = rt_mutex_timed_lock(&q.pi_state->pi_mutex, to, 1);
-	else {
+	if (!trylock) {
+		ret = __rt_mutex_timed_lock(&q.pi_state->pi_mutex, to);
+	} else {
 		ret = rt_mutex_trylock(&q.pi_state->pi_mutex);
 		/* Fixup the trylock return value: */
 		ret = ret ? 0 : -EWOULDBLOCK;
@@ -2669,7 +2669,7 @@ static int futex_wait_requeue_pi(u32 __u
 		 */
 		WARN_ON(!q.pi_state);
 		pi_mutex = &q.pi_state->pi_mutex;
-		ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter, 1);
+		ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter);
 		debug_rt_mutex_free_waiter(&rt_waiter);
 
 		spin_lock(q.lock_ptr);
Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -1035,16 +1035,15 @@ rt_mutex_slowunlock(struct rt_mutex *loc
  */
 static inline int
 rt_mutex_fastlock(struct rt_mutex *lock, int state,
-		  int detect_deadlock,
 		  int (*slowfn)(struct rt_mutex *lock, int state,
 				struct hrtimer_sleeper *timeout,
 				int detect_deadlock))
 {
-	if (!detect_deadlock && likely(rt_mutex_cmpxchg(lock, NULL, current))) {
+	if (likely(rt_mutex_cmpxchg(lock, NULL, current))) {
 		rt_mutex_deadlock_account_lock(lock, current);
 		return 0;
 	} else
-		return slowfn(lock, state, NULL, detect_deadlock);
+		return slowfn(lock, state, NULL, 0);
 }
 
 static inline int
@@ -1091,54 +1090,59 @@ void __sched rt_mutex_lock(struct rt_mut
 {
 	might_sleep();
 
-	rt_mutex_fastlock(lock, TASK_UNINTERRUPTIBLE, 0, rt_mutex_slowlock);
+	rt_mutex_fastlock(lock, TASK_UNINTERRUPTIBLE, rt_mutex_slowlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_lock);
 
 /**
  * rt_mutex_lock_interruptible - lock a rt_mutex interruptible
  *
- * @lock: 		the rt_mutex to be locked
- * @detect_deadlock:	deadlock detection on/off
+ * @lock:		the rt_mutex to be locked
  *
  * Returns:
- *  0 		on success
- * -EINTR 	when interrupted by a signal
- * -EDEADLK	when the lock would deadlock (when deadlock detection is on)
+ *  0		on success
+ * -EINTR	when interrupted by a signal
  */
-int __sched rt_mutex_lock_interruptible(struct rt_mutex *lock,
-						 int detect_deadlock)
+int __sched rt_mutex_lock_interruptible(struct rt_mutex *lock)
 {
 	might_sleep();
 
-	return rt_mutex_fastlock(lock, TASK_INTERRUPTIBLE,
-				 detect_deadlock, rt_mutex_slowlock);
+	return rt_mutex_fastlock(lock, TASK_INTERRUPTIBLE, rt_mutex_slowlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_lock_interruptible);
 
+/*
+ * Futex variant to allow full deadlock detection.
+ */
+int __rt_mutex_timed_lock(struct rt_mutex *lock,
+			  struct hrtimer_sleeper *timeout)
+{
+	might_sleep();
+
+	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
+				       rt_mutex_slowlock);
+}
+
 /**
  * rt_mutex_timed_lock - lock a rt_mutex interruptible
  *			the timeout structure is provided
  *			by the caller
  *
- * @lock: 		the rt_mutex to be locked
+ * @lock:		the rt_mutex to be locked
  * @timeout:		timeout structure or NULL (no timeout)
- * @detect_deadlock:	deadlock detection on/off
  *
  * Returns:
- *  0 		on success
- * -EINTR 	when interrupted by a signal
+ *  0		on success
+ * -EINTR	when interrupted by a signal
  * -ETIMEDOUT	when the timeout expired
- * -EDEADLK	when the lock would deadlock (when deadlock detection is on)
  */
 int
-rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout,
-		    int detect_deadlock)
+rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
 {
 	might_sleep();
 
-	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
-				       detect_deadlock, rt_mutex_slowlock);
+	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
+				       rt_mutex_slowlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
 
@@ -1244,7 +1248,6 @@ void rt_mutex_proxy_unlock(struct rt_mut
  * @lock:		the rt_mutex to take
  * @waiter:		the pre-initialized rt_mutex_waiter
  * @task:		the task to prepare
- * @detect_deadlock:	perform deadlock detection (1) or not (0)
  *
  * Returns:
  *  0 - task blocked on lock
@@ -1255,7 +1258,7 @@ void rt_mutex_proxy_unlock(struct rt_mut
  */
 int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
 			      struct rt_mutex_waiter *waiter,
-			      struct task_struct *task, int detect_deadlock)
+			      struct task_struct *task)
 {
 	int ret;
 
@@ -1313,22 +1316,20 @@ struct task_struct *rt_mutex_next_owner(
  * rt_mutex_finish_proxy_lock() - Complete lock acquisition
  * @lock:		the rt_mutex we were woken on
  * @to:			the timeout, null if none. hrtimer should already have
- * 			been started.
+ *			been started.
  * @waiter:		the pre-initialized rt_mutex_waiter
- * @detect_deadlock:	perform deadlock detection (1) or not (0)
  *
 * Complete the lock acquisition started on our behalf by another thread.
  *
  * Returns:
  *  0 - success
- * <0 - error, one of -EINTR, -ETIMEDOUT, or -EDEADLK
+ * <0 - error, one of -EINTR, -ETIMEDOUT
  *
  * Special API call for PI-futex requeue support
  */
 int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
 			       struct hrtimer_sleeper *to,
-			       struct rt_mutex_waiter *waiter,
-			       int detect_deadlock)
+			       struct rt_mutex_waiter *waiter)
 {
 	int ret;
 
Index: tip/kernel/locking/rtmutex_common.h
===================================================================
--- tip.orig/kernel/locking/rtmutex_common.h
+++ tip/kernel/locking/rtmutex_common.h
@@ -111,12 +111,11 @@ extern void rt_mutex_proxy_unlock(struct
 				  struct task_struct *proxy_owner);
 extern int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
 				     struct rt_mutex_waiter *waiter,
-				     struct task_struct *task,
-				     int detect_deadlock);
+				     struct task_struct *task);
 extern int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
 				      struct hrtimer_sleeper *to,
-				      struct rt_mutex_waiter *waiter,
-				      int detect_deadlock);
+				      struct rt_mutex_waiter *waiter);
+extern int __rt_mutex_timed_lock(struct rt_mutex *l, struct hrtimer_sleeper *to);
 
 #ifdef CONFIG_DEBUG_RT_MUTEXES
 # include "rtmutex-debug.h"
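
For callers the visible effect is the simpler signatures. A sketch of
call sites before and after this patch (lock, timeout and to are
assumed to be set up elsewhere; fragments, not a complete program):

	/* before */
	ret = rt_mutex_lock_interruptible(lock, 0);
	ret = rt_mutex_timed_lock(lock, timeout, 0);

	/* after: deadlock detection is no longer a caller choice */
	ret = rt_mutex_lock_interruptible(lock);
	ret = rt_mutex_timed_lock(lock, timeout);

	/* the futex code uses the full-detection variant internally */
	ret = __rt_mutex_timed_lock(&pi_state->pi_mutex, to);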




* [patch V3 6/7] rtmutex: Cleanup deadlock detector debug logic
  2014-06-09 20:28 [patch V3 0/7] rtmutex: Code clarification and optimization Thomas Gleixner
                   ` (4 preceding siblings ...)
  2014-06-09 20:28 ` [patch V3 5/7] rtmutex: Confine deadlock logic to futex Thomas Gleixner
@ 2014-06-09 20:28 ` Thomas Gleixner
  2014-06-10  1:04   ` Steven Rostedt
  2014-06-10 15:09   ` Brad Mouring
  2014-06-09 20:28 ` [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk Thomas Gleixner
  2014-06-10  0:27 ` [patch V3 0/7] rtmutex: Code clarification and optimization Steven Rostedt
  7 siblings, 2 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-09 20:28 UTC (permalink / raw)
  To: LKML
  Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

[-- Attachment #1: rtmutex-cleanup-deadlock-detector-debug-logic.patch --]
[-- Type: text/plain, Size: 10852 bytes --]

The conditions under which deadlock detection is conducted are unclear
and undocumented.

Add constants instead of using 0/1 and provide a selection function
which hides the additional debug dependency from the calling code.

Add comments where needed.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex-debug.c  |    5 +-
 kernel/locking/rtmutex-debug.h  |    7 ++--
 kernel/locking/rtmutex.c        |   69 +++++++++++++++++++++++++++-------------
 kernel/locking/rtmutex.h        |    7 +++-
 kernel/locking/rtmutex_common.h |   15 ++++++++
 5 files changed, 75 insertions(+), 28 deletions(-)

Index: tip/kernel/locking/rtmutex-debug.c
===================================================================
--- tip.orig/kernel/locking/rtmutex-debug.c
+++ tip/kernel/locking/rtmutex-debug.c
@@ -66,12 +66,13 @@ void rt_mutex_debug_task_free(struct tas
  * the deadlock. We print when we return. act_waiter can be NULL in
  * case of a remove waiter operation.
  */
-void debug_rt_mutex_deadlock(int detect, struct rt_mutex_waiter *act_waiter,
+void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
+			     struct rt_mutex_waiter *act_waiter,
 			     struct rt_mutex *lock)
 {
 	struct task_struct *task;
 
-	if (!debug_locks || detect || !act_waiter)
+	if (!debug_locks || chwalk || !act_waiter)
 		return;
 
 	task = rt_mutex_owner(act_waiter->lock);
Index: tip/kernel/locking/rtmutex-debug.h
===================================================================
--- tip.orig/kernel/locking/rtmutex-debug.h
+++ tip/kernel/locking/rtmutex-debug.h
@@ -20,14 +20,15 @@ extern void debug_rt_mutex_unlock(struct
 extern void debug_rt_mutex_proxy_lock(struct rt_mutex *lock,
 				      struct task_struct *powner);
 extern void debug_rt_mutex_proxy_unlock(struct rt_mutex *lock);
-extern void debug_rt_mutex_deadlock(int detect, struct rt_mutex_waiter *waiter,
+extern void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
+				    struct rt_mutex_waiter *waiter,
 				    struct rt_mutex *lock);
 extern void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter);
 # define debug_rt_mutex_reset_waiter(w)			\
 	do { (w)->deadlock_lock = NULL; } while (0)
 
-static inline int debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter,
-						 int detect)
+static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter,
+						  enum rtmutex_chainwalk walk)
 {
 	return (waiter != NULL);
 }
Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -256,6 +256,25 @@ static void rt_mutex_adjust_prio(struct
 }
 
 /*
+ * Deadlock detection is conditional:
+ *
+ * If CONFIG_DEBUG_RT_MUTEXES=n, deadlock detection is only conducted
+ * if the detect argument is == RT_MUTEX_FULL_CHAINWALK.
+ *
+ * If CONFIG_DEBUG_RT_MUTEXES=y, deadlock detection is always
+ * conducted independent of the detect argument.
+ *
+ * If the waiter argument is NULL this indicates the deboost path and
+ * deadlock detection is disabled independent of the detect argument
+ * and the config settings.
+ */
+static bool rt_mutex_cond_detect_deadlock(struct rt_mutex_waiter *waiter,
+					  enum rtmutex_chainwalk chwalk)
+{
+	return debug_rt_mutex_detect_deadlock(waiter, chwalk);
+}
+
+/*
  * Max number of times we'll walk the boosting chain:
  */
 int max_lock_depth = 1024;
@@ -328,7 +347,7 @@ static inline struct rt_mutex *task_bloc
  *	goto again;
  */
 static int rt_mutex_adjust_prio_chain(struct task_struct *task,
-				      int deadlock_detect,
+				      enum rtmutex_chainwalk chwalk,
 				      struct rt_mutex *orig_lock,
 				      struct rt_mutex *next_lock,
 				      struct rt_mutex_waiter *orig_waiter,
@@ -336,12 +355,12 @@ static int rt_mutex_adjust_prio_chain(st
 {
 	struct rt_mutex_waiter *waiter, *top_waiter = orig_waiter;
 	struct rt_mutex_waiter *prerequeue_top_waiter;
-	int detect_deadlock, ret = 0, depth = 0;
+	int ret = 0, depth = 0;
 	struct rt_mutex *lock;
+	bool detect_deadlock;
 	unsigned long flags;
 
-	detect_deadlock = debug_rt_mutex_detect_deadlock(orig_waiter,
-							 deadlock_detect);
+	detect_deadlock = rt_mutex_cond_detect_deadlock(orig_waiter, chwalk);
 
 	/*
 	 * The (de)boosting is a step by step approach with a lot of
@@ -449,7 +468,7 @@ static int rt_mutex_adjust_prio_chain(st
 	 * walk, we detected a deadlock.
 	 */
 	if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
-		debug_rt_mutex_deadlock(deadlock_detect, orig_waiter, lock);
+		debug_rt_mutex_deadlock(chwalk, orig_waiter, lock);
 		raw_spin_unlock(&lock->wait_lock);
 		ret = -EDEADLK;
 		goto out_unlock_pi;
@@ -659,7 +678,7 @@ static int try_to_take_rt_mutex(struct r
 static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
 				   struct rt_mutex_waiter *waiter,
 				   struct task_struct *task,
-				   int detect_deadlock)
+				   enum rtmutex_chainwalk chwalk)
 {
 	struct task_struct *owner = rt_mutex_owner(lock);
 	struct rt_mutex_waiter *top_waiter = waiter;
@@ -705,7 +724,7 @@ static int task_blocks_on_rt_mutex(struc
 		__rt_mutex_adjust_prio(owner);
 		if (owner->pi_blocked_on)
 			chain_walk = 1;
-	} else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
+	} else if (rt_mutex_cond_detect_deadlock(waiter, chwalk)) {
 		chain_walk = 1;
 	}
 
@@ -730,7 +749,7 @@ static int task_blocks_on_rt_mutex(struc
 
 	raw_spin_unlock(&lock->wait_lock);
 
-	res = rt_mutex_adjust_prio_chain(owner, detect_deadlock, lock,
+	res = rt_mutex_adjust_prio_chain(owner, chwalk, lock,
 					 next_lock, waiter, task);
 
 	raw_spin_lock(&lock->wait_lock);
@@ -813,7 +832,8 @@ static void remove_waiter(struct rt_mute
 
 	raw_spin_unlock(&lock->wait_lock);
 
-	rt_mutex_adjust_prio_chain(owner, 0, lock, next_lock, NULL, current);
+	rt_mutex_adjust_prio_chain(owner, RT_MUTEX_MIN_CHAINWALK, lock,
+				   next_lock, NULL, current);
 
 	raw_spin_lock(&lock->wait_lock);
 }
@@ -843,7 +863,8 @@ void rt_mutex_adjust_pi(struct task_stru
 	/* gets dropped in rt_mutex_adjust_prio_chain()! */
 	get_task_struct(task);
 
-	rt_mutex_adjust_prio_chain(task, 0, NULL, next_lock, NULL, task);
+	rt_mutex_adjust_prio_chain(task, RT_MUTEX_MIN_CHAINWALK, NULL,
+				   next_lock, NULL, task);
 }
 
 /**
@@ -921,7 +942,7 @@ static void rt_mutex_handle_deadlock(int
 static int __sched
 rt_mutex_slowlock(struct rt_mutex *lock, int state,
 		  struct hrtimer_sleeper *timeout,
-		  int detect_deadlock)
+		  enum rtmutex_chainwalk chwalk)
 {
 	struct rt_mutex_waiter waiter;
 	int ret = 0;
@@ -947,7 +968,7 @@ rt_mutex_slowlock(struct rt_mutex *lock,
 			timeout->task = NULL;
 	}
 
-	ret = task_blocks_on_rt_mutex(lock, &waiter, current, detect_deadlock);
+	ret = task_blocks_on_rt_mutex(lock, &waiter, current, chwalk);
 
 	if (likely(!ret))
 		ret = __rt_mutex_slowlock(lock, state, timeout, &waiter);
@@ -956,7 +977,7 @@ rt_mutex_slowlock(struct rt_mutex *lock,
 
 	if (unlikely(ret)) {
 		remove_waiter(lock, &waiter);
-		rt_mutex_handle_deadlock(ret, detect_deadlock, &waiter);
+		rt_mutex_handle_deadlock(ret, chwalk, &waiter);
 	}
 
 	/*
@@ -1037,27 +1058,28 @@ static inline int
 rt_mutex_fastlock(struct rt_mutex *lock, int state,
 		  int (*slowfn)(struct rt_mutex *lock, int state,
 				struct hrtimer_sleeper *timeout,
-				int detect_deadlock))
+				enum rtmutex_chainwalk chwalk))
 {
 	if (likely(rt_mutex_cmpxchg(lock, NULL, current))) {
 		rt_mutex_deadlock_account_lock(lock, current);
 		return 0;
 	} else
-		return slowfn(lock, state, NULL, 0);
+		return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK);
 }
 
 static inline int
 rt_mutex_timed_fastlock(struct rt_mutex *lock, int state,
-			struct hrtimer_sleeper *timeout, int detect_deadlock,
+			struct hrtimer_sleeper *timeout,
+			enum rtmutex_chainwalk chwalk,
 			int (*slowfn)(struct rt_mutex *lock, int state,
 				      struct hrtimer_sleeper *timeout,
-				      int detect_deadlock))
+				      enum rtmutex_chainwalk chwalk))
 {
-	if (!detect_deadlock && likely(rt_mutex_cmpxchg(lock, NULL, current))) {
+	if (!chwalk && likely(rt_mutex_cmpxchg(lock, NULL, current))) {
 		rt_mutex_deadlock_account_lock(lock, current);
 		return 0;
 	} else
-		return slowfn(lock, state, timeout, detect_deadlock);
+		return slowfn(lock, state, timeout, chwalk);
 }
 
 static inline int
@@ -1119,7 +1141,8 @@ int __rt_mutex_timed_lock(struct rt_mute
 {
 	might_sleep();
 
-	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
+	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
+				       RT_MUTEX_FULL_CHAINWALK,
 				       rt_mutex_slowlock);
 }
 
@@ -1141,7 +1164,8 @@ rt_mutex_timed_lock(struct rt_mutex *loc
 {
 	might_sleep();
 
-	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
+	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
+				       RT_MUTEX_MIN_CHAINWALK,
 				       rt_mutex_slowlock);
 }
 EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
@@ -1270,7 +1294,8 @@ int rt_mutex_start_proxy_lock(struct rt_
 	}
 
 	/* We enforce deadlock detection for futexes */
-	ret = task_blocks_on_rt_mutex(lock, waiter, task, 1);
+	ret = task_blocks_on_rt_mutex(lock, waiter, task,
+				      RT_MUTEX_FULL_CHAINWALK);
 
 	if (ret && !rt_mutex_owner(lock)) {
 		/*
Index: tip/kernel/locking/rtmutex.h
===================================================================
--- tip.orig/kernel/locking/rtmutex.h
+++ tip/kernel/locking/rtmutex.h
@@ -22,10 +22,15 @@
 #define debug_rt_mutex_init(m, n)			do { } while (0)
 #define debug_rt_mutex_deadlock(d, a ,l)		do { } while (0)
 #define debug_rt_mutex_print_deadlock(w)		do { } while (0)
-#define debug_rt_mutex_detect_deadlock(w,d)		(d)
 #define debug_rt_mutex_reset_waiter(w)			do { } while (0)
 
 static inline void rt_mutex_print_deadlock(struct rt_mutex_waiter *w)
 {
 	WARN(1, "rtmutex deadlock detected\n");
 }
+
+static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *w,
+						  enum rtmutex_chainwalk walk)
+{
+	return walk == RT_MUTEX_FULL_CHAINWALK;
+}
Index: tip/kernel/locking/rtmutex_common.h
===================================================================
--- tip.orig/kernel/locking/rtmutex_common.h
+++ tip/kernel/locking/rtmutex_common.h
@@ -102,6 +102,21 @@ static inline struct task_struct *rt_mut
 }
 
 /*
+ * Constants for rt mutex functions which have a selectable deadlock
+ * detection.
+ *
+ * RT_MUTEX_MIN_CHAINWALK:	Stops the lock chain walk when there are
+ *				no further PI adjustments to be made.
+ *
+ * RT_MUTEX_FULL_CHAINWALK:	Invoke deadlock detection with a full
+ *				walk of the lock chain.
+ */
+enum rtmutex_chainwalk {
+	RT_MUTEX_MIN_CHAINWALK,
+	RT_MUTEX_FULL_CHAINWALK,
+};
+
+/*
  * PI-futex support (proxy locking functions, etc.):
  */
 extern struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock);
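
Across both configurations the selection function resolves as in this
self-contained model (the real code picks one of the two
debug_rt_mutex_detect_deadlock() variants at build time; the bool
parameter here merely stands in for CONFIG_DEBUG_RT_MUTEXES):

	#include <stdbool.h>
	#include <stddef.h>

	enum rtmutex_chainwalk {
		RT_MUTEX_MIN_CHAINWALK,
		RT_MUTEX_FULL_CHAINWALK,
	};

	static bool cond_detect_deadlock(bool debug_rt_mutexes,
					 const void *waiter,
					 enum rtmutex_chainwalk chwalk)
	{
		/* Debug builds always walk the full chain, except on the
		 * deboost path which passes a NULL waiter. */
		if (debug_rt_mutexes)
			return waiter != NULL;
		return chwalk == RT_MUTEX_FULL_CHAINWALK;
	}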




* [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-09 20:28 [patch V3 0/7] rtmutex: Code clarification and optimization Thomas Gleixner
                   ` (5 preceding siblings ...)
  2014-06-09 20:28 ` [patch V3 6/7] rtmutex: Cleanup deadlock detector debug logic Thomas Gleixner
@ 2014-06-09 20:28 ` Thomas Gleixner
  2014-06-10  1:20   ` Steven Rostedt
  2014-06-10 14:57   ` Brad Mouring
  2014-06-10  0:27 ` [patch V3 0/7] rtmutex: Code clarification and optimization Steven Rostedt
  7 siblings, 2 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-09 20:28 UTC (permalink / raw)
  To: LKML
  Cc: Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

[-- Attachment #1: rtmutex-avoid-pointless-requeueing-in-the-deadlock-detection-chain-walk.patch --]
[-- Type: text/plain, Size: 3459 bytes --]

In case the deadlock detector is enabled we follow the lock chain to
the end in rt_mutex_adjust_prio_chain, even if we could stop earlier
due to the priority/waiter constellation.

But once we are no longer the top priority waiter in a certain step,
or the task holding the lock already has the same priority, then there
is no point in dequeueing and enqueueing along the lock chain as there
is no change at all.

So stop the requeueing at this point.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/locking/rtmutex.c |   61 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 54 insertions(+), 7 deletions(-)

Index: tip/kernel/locking/rtmutex.c
===================================================================
--- tip.orig/kernel/locking/rtmutex.c
+++ tip/kernel/locking/rtmutex.c
@@ -359,6 +359,7 @@ static int rt_mutex_adjust_prio_chain(st
 	struct rt_mutex *lock;
 	bool detect_deadlock;
 	unsigned long flags;
+	bool requeue = true;
 
 	detect_deadlock = rt_mutex_cond_detect_deadlock(orig_waiter, chwalk);
 
@@ -436,18 +437,31 @@ static int rt_mutex_adjust_prio_chain(st
 			goto out_unlock_pi;
 		/*
 		 * If deadlock detection is off, we stop here if we
-		 * are not the top pi waiter of the task.
+		 * are not the top pi waiter of the task. If deadlock
+		 * detection is enabled we continue, but stop the
+		 * requeueing in the chain walk.
 		 */
-		if (!detect_deadlock && top_waiter != task_top_pi_waiter(task))
-			goto out_unlock_pi;
+		if (top_waiter != task_top_pi_waiter(task)) {
+			if (!detect_deadlock)
+				goto out_unlock_pi;
+			else
+				requeue = false;
+		}
 	}
 
 	/*
-	 * When deadlock detection is off then we check, if further
-	 * priority adjustment is necessary.
+	 * If the waiter priority is the same as the task priority
+	 * then there is no further priority adjustment necessary.  If
+	 * deadlock detection is off, we stop the chain walk. If it's
+	 * enabled we continue, but stop the requeueing in the chain
+	 * walk.
 	 */
-	if (!detect_deadlock && waiter->prio == task->prio)
-		goto out_unlock_pi;
+	if (waiter->prio == task->prio) {
+		if (!detect_deadlock)
+			goto out_unlock_pi;
+		else
+			requeue = false;
+	}
 
 	/*
 	 * We need to trylock here as we are holding task->pi_lock,
@@ -475,6 +489,39 @@ static int rt_mutex_adjust_prio_chain(st
 	}
 
 	/*
+	 * If we just follow the lock chain for deadlock detection, no
+	 * need to do all the requeue operations. We avoid a truckload
+	 * of conditionals around the various places below and just do
+	 * the minimum chain walk checks here.
+	 */
+	if (!requeue) {
+		/* Release the task */
+		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+		put_task_struct(task);
+
+		/* If there is no owner of the lock, end of chain. */
+		if (!rt_mutex_owner(lock)) {
+			raw_spin_unlock(&lock->wait_lock);
+			return 0;
+		}
+
+		/* Grab the next task, i.e. owner of @lock */
+		task = rt_mutex_owner(lock);
+		get_task_struct(task);
+		raw_spin_lock_irqsave(&task->pi_lock, flags);
+
+		/* Store whether owner is blocked itself and drop locks */
+		next_lock = task_blocked_on_lock(task);
+		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
+		raw_spin_unlock(&lock->wait_lock);
+
+		/* If owner is not blocked, end of chain. */
+		if (!next_lock)
+			goto out_put_task;
+		goto again;
+	}
+
+	/*
 	 * Store the current top waiter before doing the requeue
 	 * operation on @lock. We need it for the boost/deboost
 	 * decision below.
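
The resulting loop structure, condensed into a self-contained model
(top_pi_waiter()/prio_changed() are stubs; only the control flow around
the requeue flag mirrors the patch):

	#include <stdbool.h>

	static bool top_pi_waiter(int step) { return step & 1; }	/* stub */
	static bool prio_changed(int step)  { return step & 2; }	/* stub */

	static int chain_walk(bool detect_deadlock, int depth)
	{
		bool requeue = true;

		for (int step = 0; step < depth; step++) {
			if (!top_pi_waiter(step) || !prio_changed(step)) {
				if (!detect_deadlock)
					return 0;	/* nothing left to adjust */
				requeue = false;	/* walk on, stop requeueing */
			}
			if (!requeue) {
				/* Deadlock-detection-only mode: just advance
				 * to the owner of the next lock, without any
				 * dequeue/enqueue operations. */
				continue;
			}
			/* ... full requeue + boost/deboost step ... */
		}
		return 0;
	}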




* Re: [patch V3 1/7] rtmutex: Deobfuscate chain walk
  2014-06-09 20:28 ` [patch V3 1/7] rtmutex: Deobfuscate chain walk Thomas Gleixner
@ 2014-06-09 20:59   ` Steven Rostedt
  2014-06-10  3:52     ` Lai Jiangshan
  2014-06-10  3:21   ` Jason Low
  2014-06-10 13:57   ` Brad Mouring
  2 siblings, 1 reply; 40+ messages in thread
From: Steven Rostedt @ 2014-06-09 20:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 09 Jun 2014 20:28:06 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:

> There is no point in keeping the task ref across the check for the lock
> owner. Drop the ref before that, so the protection context is clear.
> 
> Found while documenting the chain walk.

This looks fine, I just hate the subject. I don't see how it is
"deobfuscating" the chain walk. How about:

rtmutex: No need to keep task ref when checking lock ownership

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve

> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/locking/rtmutex.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> Index: tip/kernel/locking/rtmutex.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.c
> +++ tip/kernel/locking/rtmutex.c
> @@ -411,6 +411,8 @@ static int rt_mutex_adjust_prio_chain(st
>  
>  	/* Release the task */
>  	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> +	put_task_struct(task);
> +
>  	if (!rt_mutex_owner(lock)) {
>  		/*
>  		 * If the requeue above changed the top waiter, then we need
> @@ -420,9 +422,8 @@ static int rt_mutex_adjust_prio_chain(st
>  		if (top_waiter != rt_mutex_top_waiter(lock))
>  			wake_up_process(rt_mutex_top_waiter(lock)->task);
>  		raw_spin_unlock(&lock->wait_lock);
> -		goto out_put_task;
> +		return 0;
>  	}
> -	put_task_struct(task);
>  
>  	/* Grab the next task */
>  	task = rt_mutex_owner(lock);
> 



* Re: [patch V3 0/7] rtmutex: Code clarification and optimization
  2014-06-09 20:28 [patch V3 0/7] rtmutex: Code clarification and optimization Thomas Gleixner
                   ` (6 preceding siblings ...)
  2014-06-09 20:28 ` [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk Thomas Gleixner
@ 2014-06-10  0:27 ` Steven Rostedt
  2014-06-10  0:35   ` Steven Rostedt
  7 siblings, 1 reply; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10  0:27 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 09 Jun 2014 20:28:06 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:

> This is another round of clarification and optimization patches for
> the rtmutex code.
> 
> When I looked into the deadlock detector last week I spent some time
> figuring out how the various arguments and local variables are
> protected and what the scope of the various protection mechanisms is,
> so I added documentation for this as well.
> 
> I also changed the last patch, which avoids the requeueing in case of
> a deadlock-detection-only chain walk. Instead of adding tons of
> conditionals to the boost/deboost code path, I simply added a separate
> code path for it.

So what branch do these apply to? I tried latest Linus master and it
didn't apply.

-- Steve


* Re: [patch V3 0/7] rtmutex: Code clarification and optimization
  2014-06-10  0:27 ` [patch V3 0/7] rtmutex: Code clarification and optimization Steven Rostedt
@ 2014-06-10  0:35   ` Steven Rostedt
  2014-06-10  3:00     ` Lai Jiangshan
  0 siblings, 1 reply; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10  0:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar,
	Lai Jiangshan, Jason Low, Brad Mouring

On Mon, 9 Jun 2014 20:27:13 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:


> So what branch do these apply to? I tried latest Linus master and it
> didn't apply.

OK, they seem to be on tip/locking/urgent. Not yet in mainline.

-- Steve


* Re: [patch V3 2/7] rtmutex: Clarify the boost/deboost part
  2014-06-09 20:28 ` [patch V3 2/7] rtmutex: Clarify the boost/deboost part Thomas Gleixner
@ 2014-06-10  0:37   ` Steven Rostedt
  2014-06-10  3:22   ` Jason Low
  2014-06-10 14:04   ` Brad Mouring
  2 siblings, 0 replies; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10  0:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 09 Jun 2014 20:28:07 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:

> Add a separate local variable for the boost/deboost logic to make the
> code more readable. Add comments where appropriate.
> 

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve



* Re: [patch V3 3/7] rtmutex: Document pi chain walk
  2014-06-09 20:28 ` [patch V3 3/7] rtmutex: Document pi chain walk Thomas Gleixner
@ 2014-06-10  0:45   ` Steven Rostedt
  2014-06-10  3:51     ` Lai Jiangshan
  2014-06-10 14:21   ` Brad Mouring
  1 sibling, 1 reply; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10  0:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 09 Jun 2014 20:28:08 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:

> Add commentary to document the chain walk and the protection mechanisms
> and their scope.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/locking/rtmutex.c |   52 +++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
> 
> Index: tip/kernel/locking/rtmutex.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.c
> +++ tip/kernel/locking/rtmutex.c
> @@ -285,6 +285,47 @@ static inline struct rt_mutex *task_bloc
>   * @top_task:	the current top waiter
>   *
>   * Returns 0 or -EDEADLK.
> + *
> + * Chain walk basics and protection scope
> + *
> + * [A] refcount on task
> + * [B] task->pi_lock held
> + * [C] rtmutex->lock held

A, B, C is rather meaningless, and requires constantly looking back up at
the key. Perhaps [R], [P] and [L]

 [R] refcount on task (get_task_struct)
 [P] task->pi_lock held
 [L] rtmutex->lock held


That way we can associate R being refcount, P being pi_lock and L being
lock. Easier to remember.


> + *
> + * call()					Protected by

"call()"?

> + *	@task					[A]
> + *	@orig_lock if != NULL			@top_task is blocked on it
> + *	@next_lock				Unprotected. Cannot be
> + *						dereferenced. Only used for
> + *						comparison.
> + *	@orig_waiter if != NULL			@top_task is blocked on it
> + *	@top_task				current, or in case of proxy
> + *						locking protected by calling
> + *						code
> + * again:
> + *	loop_sanity_check();
> + * retry:
> + *	lock(task->pi_lock);			[A] acquire [B]
> + *	waiter = task->pi_blocked_on;		[B]
> + *	check_exit_conditions();		[B]
> + *	lock = waiter->lock;			[B]
> + *	if (!try_lock(lock->wait_lock)) {	[B] try to acquire [C]
> + *		unlock(task->pi_lock);		drop [B]
> + *		goto retry;
> + *	}
> + *	check_exit_conditions();		[B] + [C]
> + *	requeue_lock_waiter(lock, waiter);	[B] + [C]
> + *	unlock(task->pi_lock);			drop [B]
> + *	drop_task_ref(task);			drop [A]

Maybe just state "put_task_struct()", fewer abstractions.

> + *	check_exit_conditions();		[C]
> + *	task = owner(lock);			[C]
> + *	get_task_ref(task);			[C] acquire [A]

get_task_struct()

-- Steve

> + *	lock(task->pi_lock);			[C] acquire [B]
> + *	requeue_pi_waiter(task, waiters(lock));	[B] + [C]
> + *	check_exit_conditions();		[B] + [C]
> + *	unlock(task->pi_lock);			drop [B]
> + *	unlock(lock->wait_lock);		drop [C]
> + *	goto again;
>   */
>  static int rt_mutex_adjust_prio_chain(struct task_struct *task,
>  				      int deadlock_detect,
> @@ -326,6 +367,12 @@ static int rt_mutex_adjust_prio_chain(st
>  
>  		return -EDEADLK;
>  	}
> +
> +	/*
> +	 * We are fully preemptible here and only hold the refcount on
> +	 * @task. So everything can have changed under us since the
> +	 * caller or our own code below (goto retry) dropped all locks.
> +	 */
>   retry:
>  	/*
>  	 * Task can not go away as we did a get_task() before !
> @@ -383,6 +430,11 @@ static int rt_mutex_adjust_prio_chain(st
>  	if (!detect_deadlock && waiter->prio == task->prio)
>  		goto out_unlock_pi;
>  
> +	/*
> +	 * We need to trylock here as we are holding task->pi_lock,
> +	 * which is the reverse lock order versus the other rtmutex
> +	 * operations.
> +	 */
>  	lock = waiter->lock;
>  	if (!raw_spin_trylock(&lock->wait_lock)) {
>  		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> 



* Re: [patch V3 4/7] rtmutex: Simplify remove_waiter()
  2014-06-09 20:28 ` [patch V3 4/7] rtmutex: Simplify remove_waiter() Thomas Gleixner
@ 2014-06-10  0:53   ` Steven Rostedt
  2014-06-10  3:35     ` Lai Jiangshan
  2014-06-10  3:44     ` Jason Low
  2014-06-10 14:10   ` Brad Mouring
  1 sibling, 2 replies; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10  0:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 09 Jun 2014 20:28:08 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:

> Exit right away when the removed waiter was not the top priority
> waiter on the lock. Get rid of the extra indent level.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/locking/rtmutex.c |   26 ++++++++++----------------
>  1 file changed, 10 insertions(+), 16 deletions(-)
> 
> Index: tip/kernel/locking/rtmutex.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.c
> +++ tip/kernel/locking/rtmutex.c
> @@ -780,7 +780,7 @@ static void remove_waiter(struct rt_mute
>  {
>  	int first = (waiter == rt_mutex_top_waiter(lock));
>  	struct task_struct *owner = rt_mutex_owner(lock);
> -	struct rt_mutex *next_lock = NULL;
> +	struct rt_mutex *next_lock;
>  	unsigned long flags;
>  
>  	raw_spin_lock_irqsave(&current->pi_lock, flags);
> @@ -788,28 +788,22 @@ static void remove_waiter(struct rt_mute
>  	current->pi_blocked_on = NULL;
>  	raw_spin_unlock_irqrestore(&current->pi_lock, flags);
>  

Add comment here, something like...

	/*
	 * Only update priority if this task was the highest priority
	 * task waiting on the lock, and there is an owner to update.
	 */

Rest looks good.

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

-- Steve


> -	if (!owner)
> +	if (!owner || !first)
>  		return;
>  
> -	if (first) {
> +	raw_spin_lock_irqsave(&owner->pi_lock, flags);
>  
> -		raw_spin_lock_irqsave(&owner->pi_lock, flags);
> +	rt_mutex_dequeue_pi(owner, waiter);
>  
> -		rt_mutex_dequeue_pi(owner, waiter);
> +	if (rt_mutex_has_waiters(lock))
> +		rt_mutex_enqueue_pi(owner, rt_mutex_top_waiter(lock));
>  
> -		if (rt_mutex_has_waiters(lock)) {
> -			struct rt_mutex_waiter *next;
> +	__rt_mutex_adjust_prio(owner);
>  
> -			next = rt_mutex_top_waiter(lock);
> -			rt_mutex_enqueue_pi(owner, next);
> -		}
> -		__rt_mutex_adjust_prio(owner);
> +	/* Store the lock on which owner is blocked or NULL */
> +	next_lock = task_blocked_on_lock(owner);
>  
> -		/* Store the lock on which owner is blocked or NULL */
> -		next_lock = task_blocked_on_lock(owner);
> -
> -		raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
> -	}
> +	raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
>  
>  	if (!next_lock)
>  		return;
> 



* Re: [patch V3 5/7] rtmutex: Confine deadlock logic to futex
  2014-06-09 20:28 ` [patch V3 5/7] rtmutex: Confine deadlock logic to futex Thomas Gleixner
@ 2014-06-10  0:59   ` Steven Rostedt
  2014-06-10  4:03     ` Lai Jiangshan
  2014-06-10 17:39     ` Thomas Gleixner
  0 siblings, 2 replies; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10  0:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 09 Jun 2014 20:28:09 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:
  
> +/*
> + * Futex variant to allow full deadlock detection.
> + */
> +int __rt_mutex_timed_lock(struct rt_mutex *lock,
> +			  struct hrtimer_sleeper *timeout)
> +{
> +	might_sleep();
> +
> +	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
> +				       rt_mutex_slowlock);
> +}
> +
>  /**
>   * rt_mutex_timed_lock - lock a rt_mutex interruptible
>   *			the timeout structure is provided
>   *			by the caller
>   *
> - * @lock: 		the rt_mutex to be locked
> + * @lock:		the rt_mutex to be locked
>   * @timeout:		timeout structure or NULL (no timeout)
> - * @detect_deadlock:	deadlock detection on/off
>   *
>   * Returns:
> - *  0 		on success
> - * -EINTR 	when interrupted by a signal
> + *  0		on success
> + * -EINTR	when interrupted by a signal
>   * -ETIMEDOUT	when the timeout expired
> - * -EDEADLK	when the lock would deadlock (when deadlock detection is on)
>   */
>  int
> -rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout,
> -		    int detect_deadlock)
> +rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
>  {
>  	might_sleep();
>  
> -	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
> -				       detect_deadlock, rt_mutex_slowlock);
> +	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
> +				       rt_mutex_slowlock);
>  }
>  EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
>  

I must be missing something. What's the difference between the above
and the futex variant one? They both do the exact same thing:


int foo(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
{
	might_sleep();

	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
				       0, rt_mutex_slowlock);
}


??

-- Steve


* Re: [patch V3 6/7] rtmutex: Cleanup deadlock detector debug logic
  2014-06-09 20:28 ` [patch V3 6/7] rtmutex: Cleanup deadlock detector debug logic Thomas Gleixner
@ 2014-06-10  1:04   ` Steven Rostedt
  2014-06-10 15:09   ` Brad Mouring
  1 sibling, 0 replies; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10  1:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 09 Jun 2014 20:28:10 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:

> The conditions under which deadlock detection is conducted are unclear
> and undocumented.
> 
> Add constants instead of using 0/1 and provide a selection function
> which hides the additional debug dependency from the calling code.
> 
> Add comments where needed.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/locking/rtmutex-debug.c  |    5 +-
>  kernel/locking/rtmutex-debug.h  |    7 ++--
>  kernel/locking/rtmutex.c        |   69 +++++++++++++++++++++++++++-------------
>  kernel/locking/rtmutex.h        |    7 +++-
>  kernel/locking/rtmutex_common.h |   15 ++++++++
>  5 files changed, 75 insertions(+), 28 deletions(-)
> 
> Index: tip/kernel/locking/rtmutex-debug.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex-debug.c
> +++ tip/kernel/locking/rtmutex-debug.c
> @@ -66,12 +66,13 @@ void rt_mutex_debug_task_free(struct tas
>   * the deadlock. We print when we return. act_waiter can be NULL in
>   * case of a remove waiter operation.
>   */
> -void debug_rt_mutex_deadlock(int detect, struct rt_mutex_waiter *act_waiter,
> +void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
> +			     struct rt_mutex_waiter *act_waiter,
>  			     struct rt_mutex *lock)
>  {
>  	struct task_struct *task;
>  
> -	if (!debug_locks || detect || !act_waiter)
> +	if (!debug_locks || chwalk || !act_waiter)

I know this will probably get a little verbose, but chwalk isn't very
descriptive. Perhaps change this to:

	if (!debug_locks || chwalk == RT_MUTEX_FULL_CHAINWALK ||
	    !act_waiter)

To cut down on the verbosity, we could add helper macros:

#define chwalk_is_full(chwalk)	((chwalk) == RT_MUTEX_FULL_CHAINWALK)
#define chwalk_is_min(chwalk)   ((chwalk) == RT_MUTEX_MIN_CHAINWALK)

And then the above would simply be:

	if (!debug_locks || chwalk_is_full(chwalk) || !act_waiter)

And put this throughout.

-- Steve



* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-09 20:28 ` [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk Thomas Gleixner
@ 2014-06-10  1:20   ` Steven Rostedt
  2014-06-10  3:48     ` Jason Low
  2014-06-10 17:41     ` Thomas Gleixner
  2014-06-10 14:57   ` Brad Mouring
  1 sibling, 2 replies; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10  1:20 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 09 Jun 2014 20:28:10 -0000
Thomas Gleixner <tglx@linutronix.de> wrote:

> In case the dead lock detector is enabled we follow the lock chain to
> the end in rt_mutex_adjust_prio_chain, even if we could stop earlier
> due to the priority/waiter constellation.
> 
> But once we are no longer the top priority waiter in a certain step
> or the task holding the lock has already the same priority then there
> is no point in dequeing and enqueing along the lock chain as there is
> no change at all.
> 
> So stop the requeueing at this point.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/locking/rtmutex.c |   61 +++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 54 insertions(+), 7 deletions(-)
> 
> Index: tip/kernel/locking/rtmutex.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.c
> +++ tip/kernel/locking/rtmutex.c
> @@ -359,6 +359,7 @@ static int rt_mutex_adjust_prio_chain(st
>  	struct rt_mutex *lock;
>  	bool detect_deadlock;
>  	unsigned long flags;
> +	bool requeue = true;
>  
>  	detect_deadlock = rt_mutex_cond_detect_deadlock(orig_waiter, chwalk);
>  
> @@ -436,18 +437,31 @@ static int rt_mutex_adjust_prio_chain(st
>  			goto out_unlock_pi;
>  		/*
>  		 * If deadlock detection is off, we stop here if we
> -		 * are not the top pi waiter of the task.
> +		 * are not the top pi waiter of the task. If deadlock
> +		 * detection is enabled we continue, but stop the
> +		 * requeueing in the chain walk.
>  		 */
> -		if (!detect_deadlock && top_waiter != task_top_pi_waiter(task))
> -			goto out_unlock_pi;
> +		if (top_waiter != task_top_pi_waiter(task)) {
> +			if (!detect_deadlock)
> +				goto out_unlock_pi;
> +			else
> +				requeue = false;
> +		}
>  	}
>  
>  	/*
> -	 * When deadlock detection is off then we check, if further
> -	 * priority adjustment is necessary.
> +	 * If the waiter priority is the same as the task priority
> +	 * then there is no further priority adjustment necessary.  If
> +	 * deadlock detection is off, we stop the chain walk. If its
> +	 * enabled we continue, but stop the requeueing in the chain
> +	 * walk.
>  	 */
> -	if (!detect_deadlock && waiter->prio == task->prio)
> -		goto out_unlock_pi;
> +	if (waiter->prio == task->prio) {
> +		if (!detect_deadlock)
> +			goto out_unlock_pi;
> +		else
> +			requeue = false;
> +	}
>  
>  	/*
>  	 * We need to trylock here as we are holding task->pi_lock,
> @@ -475,6 +489,39 @@ static int rt_mutex_adjust_prio_chain(st
>  	}
>  
>  	/*
> +	 * If we just follow the lock chain for deadlock detection, no
> +	 * need to do all the requeue operations. We avoid a truckload

s/We/To/


> +	 * of conditinals around the various places below and just do

s/ and/, /

> +	 * the minimum chain walk checks here.
> +	 */
> +	if (!requeue) {
> +		/* Release the task */
> +		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> +		put_task_struct(task);
> +
> +		/* If there is no owner of the lock, end of chain. */
> +		if (!rt_mutex_owner(lock)) {
> +			raw_spin_unlock(&lock->wait_lock);
> +			return 0;
> +		}
> +
> +		/* Grab the next task, i.e. owner of @lock */
> +		task = rt_mutex_owner(lock);
> +		get_task_struct(task);
> +		raw_spin_lock_irqsave(&task->pi_lock, flags);
> +
> +		/* Store whether owner is blocked itself and drop locks */
> +		next_lock = task_blocked_on(task);
> +		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> +		raw_spin_unlock(&lock->wait_lock);
> +
> +		/* If owner is not blocked, end of chain. */
> +		if (!next_lock)
> +			goto out_put_task;

On the loop back around, have something like:

	if (top_waiter) {
		if (!task_has_pi_waiters(task))
			goto out_unlock_pi;

		if (!requeue &&
		    top_waiter != task_top_pi_waiter(task)) {
			if (!detect_deadlock)
				goto out_unlock_pi;
			else
				requeue = false;
		}
	}

??


> +		goto again;
> +	}
> +
> +	/*
>  	 * Store the current top waiter before doing the requeue
>  	 * operation on @lock. We need it for the boost/deboost
>  	 * decision below.
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 0/7] rtmutex: Code clarification and optimization
  2014-06-10  0:35   ` Steven Rostedt
@ 2014-06-10  3:00     ` Lai Jiangshan
  0 siblings, 0 replies; 40+ messages in thread
From: Lai Jiangshan @ 2014-06-10  3:00 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar, Jason Low,
	Brad Mouring

On 06/10/2014 08:35 AM, Steven Rostedt wrote:
> On Mon, 9 Jun 2014 20:27:13 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> 
>> So what branch do these apply to? I tried latest Linus master and it
>> didn't apply.
> 
> OK, they seem to be on tip/locking/urgent. Not yet in mainline.


Thank you for sharing it; I'm trying to review the patches.

Lai

> 
> -- Steve
> .
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 1/7] rtmutex: Deobfuscate chain walk
  2014-06-09 20:28 ` [patch V3 1/7] rtmutex: Deobfuscate chain walk Thomas Gleixner
  2014-06-09 20:59   ` Steven Rostedt
@ 2014-06-10  3:21   ` Jason Low
  2014-06-10 13:57   ` Brad Mouring
  2 siblings, 0 replies; 40+ messages in thread
From: Jason Low @ 2014-06-10  3:21 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

On Mon, Jun 9, 2014 at 1:28 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> There is no point to keep the task ref across the check for lock
> owner. Drop the ref before that, so the protection context is clear.
>
> Found while documenting the chain walk.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Reviewed-by: Jason Low <jason.low2@hp.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 2/7] rtmutex: Clarify the boost/deboost part
  2014-06-09 20:28 ` [patch V3 2/7] rtmutex: Clarify the boost/deboost part Thomas Gleixner
  2014-06-10  0:37   ` Steven Rostedt
@ 2014-06-10  3:22   ` Jason Low
  2014-06-10 14:04   ` Brad Mouring
  2 siblings, 0 replies; 40+ messages in thread
From: Jason Low @ 2014-06-10  3:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

On Mon, Jun 9, 2014 at 1:28 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> Add a separate local variable for the boost/deboost logic to make the
> code more readable. Add comments where appropriate.
>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Reviewed-by: Jason Low <jason.low2@hp.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 4/7] rtmutex: Siplify remove_waiter()
  2014-06-10  0:53   ` Steven Rostedt
@ 2014-06-10  3:35     ` Lai Jiangshan
  2014-06-10  3:44     ` Jason Low
  1 sibling, 0 replies; 40+ messages in thread
From: Lai Jiangshan @ 2014-06-10  3:35 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar, Jason Low,
	Brad Mouring, Lai Jiangshan

On 06/10/2014 08:53 AM, Steven Rostedt wrote:
> On Mon, 09 Jun 2014 20:28:08 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
>> Exit right away, when the removed waiter was not the top priority
>> waiter on the lock. Get rid of the extra indent level.
>>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>  kernel/locking/rtmutex.c |   26 ++++++++++----------------
>>  1 file changed, 10 insertions(+), 16 deletions(-)
>>
>> Index: tip/kernel/locking/rtmutex.c
>> ===================================================================
>> --- tip.orig/kernel/locking/rtmutex.c
>> +++ tip/kernel/locking/rtmutex.c
>> @@ -780,7 +780,7 @@ static void remove_waiter(struct rt_mute
>>  {
>>  	int first = (waiter == rt_mutex_top_waiter(lock));
>>  	struct task_struct *owner = rt_mutex_owner(lock);
>> -	struct rt_mutex *next_lock = NULL;
>> +	struct rt_mutex *next_lock;
>>  	unsigned long flags;
>>  
>>  	raw_spin_lock_irqsave(&current->pi_lock, flags);
>> @@ -788,28 +788,22 @@ static void remove_waiter(struct rt_mute
>>  	current->pi_blocked_on = NULL;
>>  	raw_spin_unlock_irqrestore(&current->pi_lock, flags);
>>  
> 
> Add comment here, something like...
> 
> 	/*
> 	 * Only update priority if this task was the highest priority
> 	 * task waiting on the lock, and there is an owner to update.
> 	 */
> 
> Rest looks good.
> 
> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>

Thanks,
Lai

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 4/7] rtmutex: Siplify remove_waiter()
  2014-06-10  0:53   ` Steven Rostedt
  2014-06-10  3:35     ` Lai Jiangshan
@ 2014-06-10  3:44     ` Jason Low
  1 sibling, 0 replies; 40+ messages in thread
From: Jason Low @ 2014-06-10  3:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar,
	Lai Jiangshan, Jason Low, Brad Mouring

On Mon, Jun 9, 2014 at 5:53 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Mon, 09 Jun 2014 20:28:08 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
>
>> Exit right away, when the removed waiter was not the top priority
>> waiter on the lock. Get rid of the extra indent level.
>>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>  kernel/locking/rtmutex.c |   26 ++++++++++----------------
>>  1 file changed, 10 insertions(+), 16 deletions(-)
>>
>> Index: tip/kernel/locking/rtmutex.c
>> ===================================================================
>> --- tip.orig/kernel/locking/rtmutex.c
>> +++ tip/kernel/locking/rtmutex.c
>> @@ -780,7 +780,7 @@ static void remove_waiter(struct rt_mute
>>  {
>>       int first = (waiter == rt_mutex_top_waiter(lock));
>>       struct task_struct *owner = rt_mutex_owner(lock);
>> -     struct rt_mutex *next_lock = NULL;
>> +     struct rt_mutex *next_lock;
>>       unsigned long flags;
>>
>>       raw_spin_lock_irqsave(&current->pi_lock, flags);
>> @@ -788,28 +788,22 @@ static void remove_waiter(struct rt_mute
>>       current->pi_blocked_on = NULL;
>>       raw_spin_unlock_irqrestore(&current->pi_lock, flags);
>>
>
> Add comment here, something like...
>
>         /*
>          * Only update priority if this task was the highest priority
>          * task waiting on the lock, and there is an owner to update.
>          */

Would it also make it clearer if we were to change "first" to something
such as "bool is_top_waiter"?
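
Sketched against the hunk above (illustration only, not the posted patch):

	bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
	...
	if (!owner || !is_top_waiter)
		return;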

> Rest looks good.
>
> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

Reviewed-by: Jason Low <jason.low2@hp.com>

> -- Steve
>
>
>> -     if (!owner)
>> +     if (!owner || !first)
>>               return;
>>
>> -     if (first) {
>> +     raw_spin_lock_irqsave(&owner->pi_lock, flags);
>>
>> -             raw_spin_lock_irqsave(&owner->pi_lock, flags);
>> +     rt_mutex_dequeue_pi(owner, waiter);
>>
>> -             rt_mutex_dequeue_pi(owner, waiter);
>> +     if (rt_mutex_has_waiters(lock))
>> +             rt_mutex_enqueue_pi(owner, rt_mutex_top_waiter(lock));
>>
>> -             if (rt_mutex_has_waiters(lock)) {
>> -                     struct rt_mutex_waiter *next;
>> +     __rt_mutex_adjust_prio(owner);
>>
>> -                     next = rt_mutex_top_waiter(lock);
>> -                     rt_mutex_enqueue_pi(owner, next);
>> -             }
>> -             __rt_mutex_adjust_prio(owner);
>> +     /* Store the lock on which owner is blocked or NULL */
>> +     next_lock = task_blocked_on_lock(owner);
>>
>> -             /* Store the lock on which owner is blocked or NULL */
>> -             next_lock = task_blocked_on_lock(owner);
>> -
>> -             raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
>> -     }
>> +     raw_spin_unlock_irqrestore(&owner->pi_lock, flags);
>>
>>       if (!next_lock)
>>               return;

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-10  1:20   ` Steven Rostedt
@ 2014-06-10  3:48     ` Jason Low
  2014-06-10 17:41     ` Thomas Gleixner
  1 sibling, 0 replies; 40+ messages in thread
From: Jason Low @ 2014-06-10  3:48 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar,
	Lai Jiangshan, Jason Low, Brad Mouring

On Mon, Jun 9, 2014 at 6:20 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Mon, 09 Jun 2014 20:28:10 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
>
>> In case the dead lock detector is enabled we follow the lock chain to
>> the end in rt_mutex_adjust_prio_chain, even if we could stop earlier
>> due to the priority/waiter constellation.
>>
>> But once we are no longer the top priority waiter in a certain step
>> or the task holding the lock has already the same priority then there
>> is no point in dequeing and enqueing along the lock chain as there is
>> no change at all.
>>
>> So stop the requeueing at this point.
>>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>  kernel/locking/rtmutex.c |   61 +++++++++++++++++++++++++++++++++++++++++------
>>  1 file changed, 54 insertions(+), 7 deletions(-)
>>
>> Index: tip/kernel/locking/rtmutex.c
>> ===================================================================
>> --- tip.orig/kernel/locking/rtmutex.c
>> +++ tip/kernel/locking/rtmutex.c
>> @@ -359,6 +359,7 @@ static int rt_mutex_adjust_prio_chain(st
>>       struct rt_mutex *lock;
>>       bool detect_deadlock;
>>       unsigned long flags;
>> +     bool requeue = true;
>>
>>       detect_deadlock = rt_mutex_cond_detect_deadlock(orig_waiter, chwalk);
>>
>> @@ -436,18 +437,31 @@ static int rt_mutex_adjust_prio_chain(st
>>                       goto out_unlock_pi;
>>               /*
>>                * If deadlock detection is off, we stop here if we
>> -              * are not the top pi waiter of the task.
>> +              * are not the top pi waiter of the task. If deadlock
>> +              * detection is enabled we continue, but stop the
>> +              * requeueing in the chain walk.
>>                */
>> -             if (!detect_deadlock && top_waiter != task_top_pi_waiter(task))
>> -                     goto out_unlock_pi;
>> +             if (top_waiter != task_top_pi_waiter(task)) {
>> +                     if (!detect_deadlock)
>> +                             goto out_unlock_pi;
>> +                     else
>> +                             requeue = false;
>> +             }
>>       }
>>
>>       /*
>> -      * When deadlock detection is off then we check, if further
>> -      * priority adjustment is necessary.
>> +      * If the waiter priority is the same as the task priority
>> +      * then there is no further priority adjustment necessary.  If
>> +      * deadlock detection is off, we stop the chain walk. If its
>> +      * enabled we continue, but stop the requeueing in the chain
>> +      * walk.
>>        */
>> -     if (!detect_deadlock && waiter->prio == task->prio)
>> -             goto out_unlock_pi;
>> +     if (waiter->prio == task->prio) {
>> +             if (!detect_deadlock)
>> +                     goto out_unlock_pi;
>> +             else
>> +                     requeue = false;
>> +     }
>>
>>       /*
>>        * We need to trylock here as we are holding task->pi_lock,
>> @@ -475,6 +489,39 @@ static int rt_mutex_adjust_prio_chain(st
>>       }
>>
>>       /*
>> +      * If we just follow the lock chain for deadlock detection, no
>> +      * need to do all the requeue operations. We avoid a truckload
>
> s/We/To/
>
>
>> +      * of conditinals around the various places below and just do
>
> s/ and/, /

And s/conditinals/conditionals/

>> +      * the minimum chain walk checks here.
>> +      */
>> +     if (!requeue) {
>> +             /* Release the task */
>> +             raw_spin_unlock_irqrestore(&task->pi_lock, flags);
>> +             put_task_struct(task);
>> +
>> +             /* If there is no owner of the lock, end of chain. */
>> +             if (!rt_mutex_owner(lock)) {
>> +                     raw_spin_unlock(&lock->wait_lock);
>> +                     return 0;
>> +             }
>> +
>> +             /* Grab the next task, i.e. owner of @lock */
>> +             task = rt_mutex_owner(lock);
>> +             get_task_struct(task);
>> +             raw_spin_lock_irqsave(&task->pi_lock, flags);
>> +
>> +             /* Store whether owner is blocked itself and drop locks */
>> +             next_lock = task_blocked_on(task);
>> +             raw_spin_unlock_irqrestore(&task->pi_lock, flags);
>> +             raw_spin_unlock(&lock->wait_lock);
>> +
>> +             /* If owner is not blocked, end of chain. */
>> +             if (!next_lock)
>> +                     goto out_put_task;
>
> On the loop back around, have something like:
>
>         if (top_waiter) {
>                 if (!task_has_pi_waiters(task))
>                         goto out_unlock_pi;
>
>                 if (!requeue &&
>                     top_waiter != task_top_pi_waiter(task)) {
>                         if (!detect_deadlock)
>                                 goto out_unlock_pi;
>                         else
>                                 requeue = false;
>                 }
>         }
>
> ??
>
>
>> +             goto again;
>> +     }
>> +
>> +     /*
>>        * Store the current top waiter before doing the requeue
>>        * operation on @lock. We need it for the boost/deboost
>>        * decision below.
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 3/7] rtmutex: Document pi chain walk
  2014-06-10  0:45   ` Steven Rostedt
@ 2014-06-10  3:51     ` Lai Jiangshan
  0 siblings, 0 replies; 40+ messages in thread
From: Lai Jiangshan @ 2014-06-10  3:51 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar, Jason Low,
	Brad Mouring

On 06/10/2014 08:45 AM, Steven Rostedt wrote:
> On Mon, 09 Jun 2014 20:28:08 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
>> Add commentry to document the chain walk and the protection mechanisms
>> and their scope.
>>
>> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> ---
>>  kernel/locking/rtmutex.c |   52 +++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 52 insertions(+)
>>
>> Index: tip/kernel/locking/rtmutex.c
>> ===================================================================
>> --- tip.orig/kernel/locking/rtmutex.c
>> +++ tip/kernel/locking/rtmutex.c
>> @@ -285,6 +285,47 @@ static inline struct rt_mutex *task_bloc
>>   * @top_task:	the current top waiter
>>   *
>>   * Returns 0 or -EDEADLK.
>> + *
>> + * Chain walk basics and protection scope
>> + *
>> + * [A] refcount on task
>> + * [B] task->pi_lock held
>> + * [C] rtmutex->lock held
> 
> A,B, C is rather meaningless, and requires constant looking back up at
> the key. Perhaps [R],[P] and [L]
> 
>  [R] refcount on task (get_task_struct)
>  [P] task->pi_lock held
>  [L] rtmutex->lock held
> 
> 
> That way we can associate R being refcount, P being pi_lock and L being
> lock. Easier to remember.
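
(Re-keyed that way, the start of the walk excerpt would read, as a sketch:

	lock(task->pi_lock);			[R] acquire [P]
	waiter = task->pi_blocked_on;		[P]
	check_exit_conditions();		[P]
	lock = waiter->lock;			[P]
	if (!try_lock(lock->wait_lock)) {	[P] try to acquire [L]
		unlock(task->pi_lock);		drop [P]
		goto retry;
	}
)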
> 
> 
>> + *
>> + * call()					Protected by
> 
> "call()"?
> 
>> + *	@task					[A]
>> + *	@orig_lock if != NULL			@top_task is blocked on it
>> + *	@next_lock				Unprotected. Cannot be
>> + *						dereferenced. Only used for
>> + *						comparison.
>> + *	@orig_waiter if != NULL			@top_task is blocked on it
>> + *	@top_task				current, or in case of proxy
>> + *						locking protected by calling
>> + *						code
>> + * again:
>> + *	loop_sanity_check();
>> + * retry:
>> + *	lock(task->pi_lock);			[A] acquire [B]
>> + *	waiter = task->pi_blocked_on;		[B]
>> + *	check_exit_conditions();		[B]
>> + *	lock = waiter->lock;			[B]
>> + *	if (!try_lock(lock->wait_lock)) {	[B] try to acquire [C]
>> + *		unlock(task->pi_lock);		drop [B]
>> + *		goto retry;
>> + *	}
>> + *	check_exit_conditions();		[B] + [C]
>> + *	requeue_lock_waiter(lock, waiter);	[B] + [C]
>> + *	unlock(task->pi_lock);			drop [B]
>> + *	drop_task_ref(task);			drop [A]
> 
> Maybe just state "put_task_struct()", less abstractions.
> 
>> + *	check_exit_conditions();		[C]
>> + *	task = owner(lock);			[C]
>> + *	get_task_ref(task);			[C] acquire [A]
> 
> get_task_struct()
> 
> -- Steve
> 
>> + *	lock(task->pi_lock);			[C] acquire [B]
>> + *	requeue_pi_waiter(task, waiters(lock));	[B] + [C]
>> + *	check_exit_conditions();		[B] + [C]
>> + *	unlock(task->pi_lock);			drop [B]
>> + *	unlock(lock->wait_lock);		drop [C]
>> + *	goto again;
>>   */

There are four check_exit_conditions() calls with the same name but with different locking.

I don't think it is a good idea to copy the code into the comment of
the function description; we would need to keep the two in sync forever.

I would prefer to comment them in the function body, or to comment them
at a higher level of abstraction.
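
One way to keep header and body in sync (a sketch of the idea, not the
posted patch): give each check a distinct name in the header,

	check_exit_conditions_1();		[B]
	...
	check_exit_conditions_2();		[B] + [C]

and repeat only that name at the matching spot in the body:

	/* check_exit_conditions_2: original lock reached => deadlock */
	if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {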

>>  static int rt_mutex_adjust_prio_chain(struct task_struct *task,
>>  				      int deadlock_detect,
>> @@ -326,6 +367,12 @@ static int rt_mutex_adjust_prio_chain(st
>>  
>>  		return -EDEADLK;
>>  	}
>> +
>> +	/*
>> +	 * We are fully preemptible here and only hold the refcount on
>> +	 * @task. So everything can have changed under us since the
>> +	 * caller or our own code below (goto retry) dropped all locks.
>> +	 */
>>   retry:
>>  	/*
>>  	 * Task can not go away as we did a get_task() before !
>> @@ -383,6 +430,11 @@ static int rt_mutex_adjust_prio_chain(st
>>  	if (!detect_deadlock && waiter->prio == task->prio)
>>  		goto out_unlock_pi;
>>  
>> +	/*
>> +	 * We need to trylock here as we are holding task->pi_lock,
>> +	 * which is the reverse lock order versus the other rtmutex
>> +	 * operations.
>> +	 */
>>  	lock = waiter->lock;
>>  	if (!raw_spin_trylock(&lock->wait_lock)) {
>>  		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 1/7] rtmutex: Deobfuscate chain walk
  2014-06-09 20:59   ` Steven Rostedt
@ 2014-06-10  3:52     ` Lai Jiangshan
  0 siblings, 0 replies; 40+ messages in thread
From: Lai Jiangshan @ 2014-06-10  3:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar, Jason Low,
	Brad Mouring

On 06/10/2014 04:59 AM, Steven Rostedt wrote:
> On Mon, 09 Jun 2014 20:28:06 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
>> There is no point to keep the task ref across the check for lock
>> owner. Drop the ref before that, so the protection context is clear.
>>
>> Found while documenting the chain walk.
> 
> This looks fine, I just hate the subject. I don't see how it is
> 'deobfuscating" the chain walk. How about:
> 
> rtmutex: No need to keep task ref when checking lock ownership
> 
> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>

Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>

Thanks,
Lai

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 5/7] rtmutex: Confine deadlock logic to futex
  2014-06-10  0:59   ` Steven Rostedt
@ 2014-06-10  4:03     ` Lai Jiangshan
  2014-06-10 17:39     ` Thomas Gleixner
  1 sibling, 0 replies; 40+ messages in thread
From: Lai Jiangshan @ 2014-06-10  4:03 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar, Jason Low,
	Brad Mouring

On 06/10/2014 08:59 AM, Steven Rostedt wrote:
> On Mon, 09 Jun 2014 20:28:09 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
>   
>> +/*
>> + * Futex variant to allow full deadlock detection.
>> + */
>> +int __rt_mutex_timed_lock(struct rt_mutex *lock,
>> +			  struct hrtimer_sleeper *timeout)
>> +{
>> +	might_sleep();
>> +
>> +	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
>> +				       rt_mutex_slowlock);
>> +}

This futex variant is not used until the next patch; it should not be
introduced in this one.

>> +
>>  /**
>>   * rt_mutex_timed_lock - lock a rt_mutex interruptible
>>   *			the timeout structure is provided
>>   *			by the caller
>>   *
>> - * @lock: 		the rt_mutex to be locked
>> + * @lock:		the rt_mutex to be locked
>>   * @timeout:		timeout structure or NULL (no timeout)
>> - * @detect_deadlock:	deadlock detection on/off
>>   *
>>   * Returns:
>> - *  0 		on success
>> - * -EINTR 	when interrupted by a signal
>> + *  0		on success
>> + * -EINTR	when interrupted by a signal
>>   * -ETIMEDOUT	when the timeout expired
>> - * -EDEADLK	when the lock would deadlock (when deadlock detection is on)
>>   */
>>  int
>> -rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout,
>> -		    int detect_deadlock)
>> +rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
>>  {
>>  	might_sleep();
>>  
>> -	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
>> -				       detect_deadlock, rt_mutex_slowlock);
>> +	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
>> +				       rt_mutex_slowlock);
>>  }
>>  EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
>>  
> 
> I must be missing something. What's the difference between the above
> and the futex variant one? They both do the exact same thing:
> 

The futex variant will use RT_MUTEX_*FULL*_CHAINWALK, while
rt_mutex_timed_lock() will use RT_MUTEX_*MIN*_CHAINWALK.
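
Condensed from patch 6/7, that difference ends up as (sketch):

/* Futex variant: always do the full deadlock detection chain walk */
int __rt_mutex_timed_lock(struct rt_mutex *lock,
			  struct hrtimer_sleeper *timeout)
{
	might_sleep();

	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
				       RT_MUTEX_FULL_CHAINWALK,
				       rt_mutex_slowlock);
}

/* Exported variant: stop the walk once no PI adjustment is needed */
int rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
{
	might_sleep();

	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
				       RT_MUTEX_MIN_CHAINWALK,
				       rt_mutex_slowlock);
}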

> 
> int foo(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
> {
> 	might_sleep();
> 
> 	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
> 				       0, rt_mutex_slowlock);
> }
> 
> 
> ??
> 
> -- Steve
> .
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 1/7] rtmutex: Deobfuscate chain walk
  2014-06-09 20:28 ` [patch V3 1/7] rtmutex: Deobfuscate chain walk Thomas Gleixner
  2014-06-09 20:59   ` Steven Rostedt
  2014-06-10  3:21   ` Jason Low
@ 2014-06-10 13:57   ` Brad Mouring
  2 siblings, 0 replies; 40+ messages in thread
From: Brad Mouring @ 2014-06-10 13:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

On Mon, Jun 09, 2014 at 08:28:06PM -0000, Thomas Gleixner wrote:
> There is no point to keep the task ref across the check for lock
> owner. Drop the ref before that, so the protection context is clear.
> 
> Found while documenting the chain walk.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

I also am not a huge fan of the verbiage of the commit message here.
Steven's makes more sense to me, and you basically say the same thing
in the actual commit message.

Reviewed-by: Brad Mouring <brad.mouring@ni.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 2/7] rtmutex: Clarify the boost/deboost part
  2014-06-09 20:28 ` [patch V3 2/7] rtmutex: Clarify the boost/deboost part Thomas Gleixner
  2014-06-10  0:37   ` Steven Rostedt
  2014-06-10  3:22   ` Jason Low
@ 2014-06-10 14:04   ` Brad Mouring
  2 siblings, 0 replies; 40+ messages in thread
From: Brad Mouring @ 2014-06-10 14:04 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

On Mon, Jun 09, 2014 at 08:28:07PM -0000, Thomas Gleixner wrote:
> Add a separate local variable for the boost/deboost logic to make the
> code more readable. Add comments where appropriate.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

This would have made my life much easier these past few weeks.

Reviewed-by: Brad Mouring <brad.mouring@ni.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 4/7] rtmutex: Siplify remove_waiter()
  2014-06-09 20:28 ` [patch V3 4/7] rtmutex: Siplify remove_waiter() Thomas Gleixner
  2014-06-10  0:53   ` Steven Rostedt
@ 2014-06-10 14:10   ` Brad Mouring
  1 sibling, 0 replies; 40+ messages in thread
From: Brad Mouring @ 2014-06-10 14:10 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

On Mon, Jun 09, 2014 at 08:28:08PM -0000, Thomas Gleixner wrote:
> Exit right away, when the removed waiter was not the top priority
> waiter on the lock. Get rid of the extra indent level.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/locking/rtmutex.c |   26 ++++++++++----------------
>  1 file changed, 10 insertions(+), 16 deletions(-)
> 
> Index: tip/kernel/locking/rtmutex.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.c
> +++ tip/kernel/locking/rtmutex.c
> @@ -780,7 +780,7 @@ static void remove_waiter(struct rt_mute
>  {
>  	int first = (waiter == rt_mutex_top_waiter(lock));

I agree that @first is ambiguous and would prefer something like
@is_top_waiter

Reviewed-by: Brad Mouring <brad.mouring@ni.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 3/7] rtmutex: Document pi chain walk
  2014-06-09 20:28 ` [patch V3 3/7] rtmutex: Document pi chain walk Thomas Gleixner
  2014-06-10  0:45   ` Steven Rostedt
@ 2014-06-10 14:21   ` Brad Mouring
  1 sibling, 0 replies; 40+ messages in thread
From: Brad Mouring @ 2014-06-10 14:21 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

On Mon, Jun 09, 2014 at 08:28:08PM -0000, Thomas Gleixner wrote:
> Add commentry to document the chain walk and the protection mechanisms
> and their scope.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Reviewed-by: Brad Mouring <brad.mouring@ni.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-09 20:28 ` [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk Thomas Gleixner
  2014-06-10  1:20   ` Steven Rostedt
@ 2014-06-10 14:57   ` Brad Mouring
  2014-06-10 15:19     ` Steven Rostedt
  1 sibling, 1 reply; 40+ messages in thread
From: Brad Mouring @ 2014-06-10 14:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

On Mon, Jun 09, 2014 at 08:28:10PM -0000, Thomas Gleixner wrote:
> In case the dead lock detector is enabled we follow the lock chain to
> the end in rt_mutex_adjust_prio_chain, even if we could stop earlier
> due to the priority/waiter constellation.
> 
> But once we are no longer the top priority waiter in a certain step
> or the task holding the lock has already the same priority then there
> is no point in dequeing and enqueing along the lock chain as there is
> no change at all.
> 
> So stop the requeueing at this point.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/locking/rtmutex.c |   61 +++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 54 insertions(+), 7 deletions(-)
> 
> Index: tip/kernel/locking/rtmutex.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.c
> +++ tip/kernel/locking/rtmutex.c
> @@ -359,6 +359,7 @@ static int rt_mutex_adjust_prio_chain(st
>  	struct rt_mutex *lock;
>  	bool detect_deadlock;
>  	unsigned long flags;
> +	bool requeue = true;
>  
>  	detect_deadlock = rt_mutex_cond_detect_deadlock(orig_waiter, chwalk);
>  
> @@ -436,18 +437,31 @@ static int rt_mutex_adjust_prio_chain(st
>  			goto out_unlock_pi;
>  		/*
>  		 * If deadlock detection is off, we stop here if we
> -		 * are not the top pi waiter of the task.
> +		 * are not the top pi waiter of the task. If deadlock
> +		 * detection is enabled we continue, but stop the
> +		 * requeueing in the chain walk.
>  		 */
> -		if (!detect_deadlock && top_waiter != task_top_pi_waiter(task))
> -			goto out_unlock_pi;
> +		if (top_waiter != task_top_pi_waiter(task)) {
> +			if (!detect_deadlock)
> +				goto out_unlock_pi;
> +			else
> +				requeue = false;
> +		}
>  	}
>  
>  	/*
> -	 * When deadlock detection is off then we check, if further
> -	 * priority adjustment is necessary.
> +	 * If the waiter priority is the same as the task priority
> +	 * then there is no further priority adjustment necessary.  If
> +	 * deadlock detection is off, we stop the chain walk. If its
s/its/it's/
> +	 * enabled we continue, but stop the requeueing in the chain
> +	 * walk.
>  	 */
> -	if (!detect_deadlock && waiter->prio == task->prio)
> -		goto out_unlock_pi;
> +	if (waiter->prio == task->prio) {
> +		if (!detect_deadlock)
> +			goto out_unlock_pi;
> +		else
> +			requeue = false;
> +	}
>  
>  	/*
>  	 * We need to trylock here as we are holding task->pi_lock,
> @@ -475,6 +489,39 @@ static int rt_mutex_adjust_prio_chain(st
>  	}
>  
>  	/*
> +	 * If we just follow the lock chain for deadlock detection, no
> +	 * need to do all the requeue operations. We avoid a truckload
> +	 * of conditinals around the various places below and just do
> +	 * the minimum chain walk checks here.
> +	 */
> +	if (!requeue) {
> +		/* Release the task */
> +		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> +		put_task_struct(task);
> +
> +		/* If there is no owner of the lock, end of chain. */
> +		if (!rt_mutex_owner(lock)) {
> +			raw_spin_unlock(&lock->wait_lock);
> +			return 0;
> +		}
> +
> +		/* Grab the next task, i.e. owner of @lock */
> +		task = rt_mutex_owner(lock);
> +		get_task_struct(task);
> +		raw_spin_lock_irqsave(&task->pi_lock, flags);
> +
> +		/* Store whether owner is blocked itself and drop locks */
> +		next_lock = task_blocked_on(task);
task_blocked_on(task) is not clear to me, the recipient of the
return is the only clue that hints at what the function does.
> +		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> +		raw_spin_unlock(&lock->wait_lock);
> +
> +		/* If owner is not blocked, end of chain. */
> +		if (!next_lock)
> +			goto out_put_task;
> +		goto again;
> +	}
> +
> +	/*
>  	 * Store the current top waiter before doing the requeue
>  	 * operation on @lock. We need it for the boost/deboost
>  	 * decision below.
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 6/7] rtmutex: Cleanup deadlock detector debug logic
  2014-06-09 20:28 ` [patch V3 6/7] rtmutex: Cleanup deadlock detector debug logic Thomas Gleixner
  2014-06-10  1:04   ` Steven Rostedt
@ 2014-06-10 15:09   ` Brad Mouring
  1 sibling, 0 replies; 40+ messages in thread
From: Brad Mouring @ 2014-06-10 15:09 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Steven Rostedt, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low, Brad Mouring

On Mon, Jun 09, 2014 at 08:28:10PM -0000, Thomas Gleixner wrote:
> The conditions under which deadlock detection is conducted are unclear
> and undocumented.
> 
> Add constants instead of using 0/1 and provide a selection function
> which hides the additional debug dependency from the calling code.
> 
> Add comments where needed.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/locking/rtmutex-debug.c  |    5 +-
>  kernel/locking/rtmutex-debug.h  |    7 ++--
>  kernel/locking/rtmutex.c        |   69 +++++++++++++++++++++++++++-------------
>  kernel/locking/rtmutex.h        |    7 +++-
>  kernel/locking/rtmutex_common.h |   15 ++++++++
>  5 files changed, 75 insertions(+), 28 deletions(-)
> 
> Index: tip/kernel/locking/rtmutex-debug.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex-debug.c
> +++ tip/kernel/locking/rtmutex-debug.c
> @@ -66,12 +66,13 @@ void rt_mutex_debug_task_free(struct tas
>   * the deadlock. We print when we return. act_waiter can be NULL in
>   * case of a remove waiter operation.
>   */
> -void debug_rt_mutex_deadlock(int detect, struct rt_mutex_waiter *act_waiter,
> +void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
> +			     struct rt_mutex_waiter *act_waiter,
>  			     struct rt_mutex *lock)
>  {
>  	struct task_struct *task;
>  
> -	if (!debug_locks || detect || !act_waiter)
> +	if (!debug_locks || chwalk || !act_waiter)
>  		return;
>  
>  	task = rt_mutex_owner(act_waiter->lock);
> Index: tip/kernel/locking/rtmutex-debug.h
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex-debug.h
> +++ tip/kernel/locking/rtmutex-debug.h
> @@ -20,14 +20,15 @@ extern void debug_rt_mutex_unlock(struct
>  extern void debug_rt_mutex_proxy_lock(struct rt_mutex *lock,
>  				      struct task_struct *powner);
>  extern void debug_rt_mutex_proxy_unlock(struct rt_mutex *lock);
> -extern void debug_rt_mutex_deadlock(int detect, struct rt_mutex_waiter *waiter,
> +extern void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
> +				    struct rt_mutex_waiter *waiter,
>  				    struct rt_mutex *lock);
>  extern void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter);
>  # define debug_rt_mutex_reset_waiter(w)			\
>  	do { (w)->deadlock_lock = NULL; } while (0)
>  
> -static inline int debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter,
> -						 int detect)
> +static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter,
> +						  enum rtmutex_chainwalk walk)
>  {
>  	return (waiter != NULL);
>  }
> Index: tip/kernel/locking/rtmutex.c
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.c
> +++ tip/kernel/locking/rtmutex.c
> @@ -256,6 +256,25 @@ static void rt_mutex_adjust_prio(struct
>  }
>  
>  /*
> + * Deadlock detection is conditional:
> + *
> + * If CONFIG_DEBUG_RT_MUTEXES=n, deadlock detection is only conducted
> + * if the detect argument is == RT_MUTEX_FULL_CHAINWALK.
> + *
> + * If CONFIG_DEBUG_RT_MUTEXES=y, deadlock detection is always
> + * conducted independent of the detect argument.
> + *
> + * If the waiter argument is NULL this indicates the deboost path and
> + * deadlock detection is disabled independent of the detect argument
> + * and the config settings.
> + */
> +static bool rt_mutex_cond_detect_deadlock(struct rt_mutex_waiter *waiter,
> +					  enum rtmutex_chainwalk chwalk)
> +{
> +	return debug_rt_mutex_detect_deadlock(waiter, chwalk);
> +}
> +
> +/*
>   * Max number of times we'll walk the boosting chain:
>   */
This comment is misleading; max_lock_depth conveys the number of
steps we'll take in any one chain walk. I know it's not in your
change, but it bothered me when I was looking at this code and,
considering a goal of these patches is to clarify/document, I
figured I would comment.
>  int max_lock_depth = 1024;
> @@ -328,7 +347,7 @@ static inline struct rt_mutex *task_bloc
>   *	goto again;
>   */
>  static int rt_mutex_adjust_prio_chain(struct task_struct *task,
> -				      int deadlock_detect,
> +				      enum rtmutex_chainwalk chwalk,
>  				      struct rt_mutex *orig_lock,
>  				      struct rt_mutex *next_lock,
>  				      struct rt_mutex_waiter *orig_waiter,
> @@ -336,12 +355,12 @@ static int rt_mutex_adjust_prio_chain(st
>  {
>  	struct rt_mutex_waiter *waiter, *top_waiter = orig_waiter;
>  	struct rt_mutex_waiter *prerequeue_top_waiter;
> -	int detect_deadlock, ret = 0, depth = 0;
> +	int ret = 0, depth = 0;
>  	struct rt_mutex *lock;
> +	bool detect_deadlock;
>  	unsigned long flags;
>  
> -	detect_deadlock = debug_rt_mutex_detect_deadlock(orig_waiter,
> -							 deadlock_detect);
> +	detect_deadlock = rt_mutex_cond_detect_deadlock(orig_waiter, chwalk);
>  
>  	/*
>  	 * The (de)boosting is a step by step approach with a lot of
> @@ -449,7 +468,7 @@ static int rt_mutex_adjust_prio_chain(st
>  	 * walk, we detected a deadlock.
>  	 */
>  	if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
> -		debug_rt_mutex_deadlock(deadlock_detect, orig_waiter, lock);
> +		debug_rt_mutex_deadlock(chwalk, orig_waiter, lock);
>  		raw_spin_unlock(&lock->wait_lock);
>  		ret = -EDEADLK;
>  		goto out_unlock_pi;
> @@ -659,7 +678,7 @@ static int try_to_take_rt_mutex(struct r
>  static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
>  				   struct rt_mutex_waiter *waiter,
>  				   struct task_struct *task,
> -				   int detect_deadlock)
> +				   enum rtmutex_chainwalk chwalk)
>  {
>  	struct task_struct *owner = rt_mutex_owner(lock);
>  	struct rt_mutex_waiter *top_waiter = waiter;
> @@ -705,7 +724,7 @@ static int task_blocks_on_rt_mutex(struc
>  		__rt_mutex_adjust_prio(owner);
>  		if (owner->pi_blocked_on)
>  			chain_walk = 1;
> -	} else if (debug_rt_mutex_detect_deadlock(waiter, detect_deadlock)) {
> +	} else if (rt_mutex_cond_detect_deadlock(waiter, chwalk)) {
>  		chain_walk = 1;
>  	}
>  
> @@ -730,7 +749,7 @@ static int task_blocks_on_rt_mutex(struc
>  
>  	raw_spin_unlock(&lock->wait_lock);
>  
> -	res = rt_mutex_adjust_prio_chain(owner, detect_deadlock, lock,
> +	res = rt_mutex_adjust_prio_chain(owner, chwalk, lock,
>  					 next_lock, waiter, task);
>  
>  	raw_spin_lock(&lock->wait_lock);
> @@ -813,7 +832,8 @@ static void remove_waiter(struct rt_mute
>  
>  	raw_spin_unlock(&lock->wait_lock);
>  
> -	rt_mutex_adjust_prio_chain(owner, 0, lock, next_lock, NULL, current);
> +	rt_mutex_adjust_prio_chain(owner, RT_MUTEX_MIN_CHAINWALK, lock,
> +				   next_lock, NULL, current);
>  
>  	raw_spin_lock(&lock->wait_lock);
>  }
> @@ -843,7 +863,8 @@ void rt_mutex_adjust_pi(struct task_stru
>  	/* gets dropped in rt_mutex_adjust_prio_chain()! */
>  	get_task_struct(task);
>  
> -	rt_mutex_adjust_prio_chain(task, 0, NULL, next_lock, NULL, task);
> +	rt_mutex_adjust_prio_chain(task, RT_MUTEX_MIN_CHAINWALK, NULL, NULL,
> +				   NULL, task);
>  }
>  
>  /**
> @@ -921,7 +942,7 @@ static void rt_mutex_handle_deadlock(int
>  static int __sched
>  rt_mutex_slowlock(struct rt_mutex *lock, int state,
>  		  struct hrtimer_sleeper *timeout,
> -		  int detect_deadlock)
> +		  enum rtmutex_chainwalk chwalk)
>  {
>  	struct rt_mutex_waiter waiter;
>  	int ret = 0;
> @@ -947,7 +968,7 @@ rt_mutex_slowlock(struct rt_mutex *lock,
>  			timeout->task = NULL;
>  	}
>  
> -	ret = task_blocks_on_rt_mutex(lock, &waiter, current, detect_deadlock);
> +	ret = task_blocks_on_rt_mutex(lock, &waiter, current, chwalk);
>  
>  	if (likely(!ret))
>  		ret = __rt_mutex_slowlock(lock, state, timeout, &waiter);
> @@ -956,7 +977,7 @@ rt_mutex_slowlock(struct rt_mutex *lock,
>  
>  	if (unlikely(ret)) {
>  		remove_waiter(lock, &waiter);
> -		rt_mutex_handle_deadlock(ret, detect_deadlock, &waiter);
> +		rt_mutex_handle_deadlock(ret, chwalk, &waiter);
>  	}
>  
>  	/*
> @@ -1037,27 +1058,28 @@ static inline int
>  rt_mutex_fastlock(struct rt_mutex *lock, int state,
>  		  int (*slowfn)(struct rt_mutex *lock, int state,
>  				struct hrtimer_sleeper *timeout,
> -				int detect_deadlock))
> +				enum rtmutex_chainwalk chwalk))
>  {
>  	if (likely(rt_mutex_cmpxchg(lock, NULL, current))) {
>  		rt_mutex_deadlock_account_lock(lock, current);
>  		return 0;
>  	} else
> -		return slowfn(lock, state, NULL, 0);
> +		return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK);
>  }
>  
>  static inline int
>  rt_mutex_timed_fastlock(struct rt_mutex *lock, int state,
> -			struct hrtimer_sleeper *timeout, int detect_deadlock,
> +			struct hrtimer_sleeper *timeout,
> +			enum rtmutex_chainwalk chwalk,
>  			int (*slowfn)(struct rt_mutex *lock, int state,
>  				      struct hrtimer_sleeper *timeout,
> -				      int detect_deadlock))
> +				      enum rtmutex_chainwalk chwalk))
>  {
> -	if (!detect_deadlock && likely(rt_mutex_cmpxchg(lock, NULL, current))) {
> +	if (!chwalk && likely(rt_mutex_cmpxchg(lock, NULL, current))) {
>  		rt_mutex_deadlock_account_lock(lock, current);
>  		return 0;
>  	} else
> -		return slowfn(lock, state, timeout, detect_deadlock);
> +		return slowfn(lock, state, timeout, chwalk);
>  }
>  
>  static inline int
> @@ -1119,7 +1141,8 @@ int __rt_mutex_timed_lock(struct rt_mute
>  {
>  	might_sleep();
>  
> -	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
> +	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
> +				       RT_MUTEX_FULL_CHAINWALK,
>  				       rt_mutex_slowlock);
>  }
>  
> @@ -1141,7 +1164,8 @@ rt_mutex_timed_lock(struct rt_mutex *loc
>  {
>  	might_sleep();
>  
> -	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
> +	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
> +				       RT_MUTEX_MIN_CHAINWALK,
>  				       rt_mutex_slowlock);
>  }
>  EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
> @@ -1270,7 +1294,8 @@ int rt_mutex_start_proxy_lock(struct rt_
>  	}
>  
>  	/* We enforce deadlock detection for futexes */
> -	ret = task_blocks_on_rt_mutex(lock, waiter, task, 1);
> +	ret = task_blocks_on_rt_mutex(lock, waiter, task,
> +				      RT_MUTEX_FULL_CHAINWALK);
>  
>  	if (ret && !rt_mutex_owner(lock)) {
>  		/*
> Index: tip/kernel/locking/rtmutex.h
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex.h
> +++ tip/kernel/locking/rtmutex.h
> @@ -22,10 +22,15 @@
>  #define debug_rt_mutex_init(m, n)			do { } while (0)
>  #define debug_rt_mutex_deadlock(d, a ,l)		do { } while (0)
>  #define debug_rt_mutex_print_deadlock(w)		do { } while (0)
> -#define debug_rt_mutex_detect_deadlock(w,d)		(d)
>  #define debug_rt_mutex_reset_waiter(w)			do { } while (0)
>  
>  static inline void rt_mutex_print_deadlock(struct rt_mutex_waiter *w)
>  {
>  	WARN(1, "rtmutex deadlock detected\n");
>  }
> +
> +static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *w,
> +						  enum rtmutex_chainwalk walk)
> +{
> +	return walk == RT_MUTEX_FULL_CHAINWALK;
> +}
> Index: tip/kernel/locking/rtmutex_common.h
> ===================================================================
> --- tip.orig/kernel/locking/rtmutex_common.h
> +++ tip/kernel/locking/rtmutex_common.h
> @@ -102,6 +102,21 @@ static inline struct task_struct *rt_mut
>  }
>  
>  /*
> + * Constants for rt mutex functions which have a selectable deadlock
> + * detection.
> + *
> + * RT_MUTEX_MIN_CHAINWALK:	Stops the lock chain walk when there are
> + *				no further PI adjustments to be made.
> + *
> + * RT_MUTEX_FULL_CHAINWALK:	Invoke deadlock detection with a full
> + *				walk of the lock chain.
> + */
> +enum rtmutex_chainwalk {
> +	RT_MUTEX_MIN_CHAINWALK,
> +	RT_MUTEX_FULL_CHAINWALK,
> +};
> +
> +/*
>   * PI-futex support (proxy locking functions, etc.):
>   */
>  extern struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock);
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-10 14:57   ` Brad Mouring
@ 2014-06-10 15:19     ` Steven Rostedt
  2014-06-10 17:43       ` Thomas Gleixner
  0 siblings, 1 reply; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10 15:19 UTC (permalink / raw)
  To: Brad Mouring
  Cc: Thomas Gleixner, LKML, Peter Zijlstra, Ingo Molnar,
	Lai Jiangshan, Jason Low

On Tue, 10 Jun 2014 09:57:25 -0500
"Brad Mouring" <bmouring@ni.com> wrote:


> > +		/* Store whether owner is blocked itself and drop locks */
> > +		next_lock = task_blocked_on(task);
> task_blocked_on(task) is not clear to me, the recipient of the
> return is the only clue that hints at what the function does.

Well, this is more than confusing, it's the only user, all other users
are task_blocked_on_lock(), and this causes the code not to compile.

-- Steve


> > +		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> > +		raw_spin_unlock(&lock->wait_lock);
> > +
> > +		/* If owner is not blocked, end of chain. */
> > +		if (!next_lock)
> > +			goto out_put_task;
> > +		goto again;
> > +	}
> > +
> > +	/*
> >  	 * Store the current top waiter before doing the requeue
> >  	 * operation on @lock. We need it for the boost/deboost
> >  	 * decision below.
> > 
> > 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 5/7] rtmutex: Confine deadlock logic to futex
  2014-06-10  0:59   ` Steven Rostedt
  2014-06-10  4:03     ` Lai Jiangshan
@ 2014-06-10 17:39     ` Thomas Gleixner
  1 sibling, 0 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-10 17:39 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 9 Jun 2014, Steven Rostedt wrote:
> On Mon, 09 Jun 2014 20:28:09 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
>   
> > +/*
> > + * Futex variant to allow full deadlock detection.
> > + */
> > +int __rt_mutex_timed_lock(struct rt_mutex *lock,
> > +			  struct hrtimer_sleeper *timeout)
> > +{
> > +	might_sleep();
> > +
> > +	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
> > +				       rt_mutex_slowlock);
> > +}
> > +
> >  /**
> >   * rt_mutex_timed_lock - lock a rt_mutex interruptible
> >   *			the timeout structure is provided
> >   *			by the caller
> >   *
> > - * @lock: 		the rt_mutex to be locked
> > + * @lock:		the rt_mutex to be locked
> >   * @timeout:		timeout structure or NULL (no timeout)
> > - * @detect_deadlock:	deadlock detection on/off
> >   *
> >   * Returns:
> > - *  0 		on success
> > - * -EINTR 	when interrupted by a signal
> > + *  0		on success
> > + * -EINTR	when interrupted by a signal
> >   * -ETIMEDOUT	when the timeout expired
> > - * -EDEADLK	when the lock would deadlock (when deadlock detection is on)
> >   */
> >  int
> > -rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout,
> > -		    int detect_deadlock)
> > +rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
> >  {
> >  	might_sleep();
> >  
> > -	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
> > -				       detect_deadlock, rt_mutex_slowlock);
> > +	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 0,
> > +				       rt_mutex_slowlock);
> >  }
> >  EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
> >  
> 
> I must be missing something. What's the difference between the above
> and the futex variant one? They both do the exact same thing:
> 
> 
> int foo(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
> {
> 	might_sleep();
> 
> 	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
> 				       0, rt_mutex_slowlock);
> }

Right. The one for futex must have s/0/1/. Fixed.
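
i.e., with this patch's 0/1 convention, where 1 selects full deadlock
detection (sketch of the fix):

int __rt_mutex_timed_lock(struct rt_mutex *lock,
			  struct hrtimer_sleeper *timeout)
{
	might_sleep();

	return rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout, 1,
				       rt_mutex_slowlock);
}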

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-10  1:20   ` Steven Rostedt
  2014-06-10  3:48     ` Jason Low
@ 2014-06-10 17:41     ` Thomas Gleixner
  2014-06-10 17:47       ` Steven Rostedt
  1 sibling, 1 reply; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-10 17:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Mon, 9 Jun 2014, Steven Rostedt wrote:
> On Mon, 09 Jun 2014 20:28:10 -0000
> Thomas Gleixner <tglx@linutronix.de> wrote:
> >  	/*
> > +	 * If we just follow the lock chain for deadlock detection, no
> > +	 * need to do all the requeue operations. We avoid a truckload
> 
> s/We/To/
> 
> 
> > +	 * of conditinals around the various places below and just do
> 
> s/ and/, /

Ok.
 
> > +	 * the minimum chain walk checks here.
> > +	 */
> > +	if (!requeue) {
> > +		/* Release the task */
> > +		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> > +		put_task_struct(task);
> > +
> > +		/* If there is no owner of the lock, end of chain. */
> > +		if (!rt_mutex_owner(lock)) {
> > +			raw_spin_unlock(&lock->wait_lock);
> > +			return 0;
> > +		}
> > +
> > +		/* Grab the next task, i.e. owner of @lock */
> > +		task = rt_mutex_owner(lock);
> > +		get_task_struct(task);
> > +		raw_spin_lock_irqsave(&task->pi_lock, flags);
> > +
> > +		/* Store whether owner is blocked itself and drop locks */
> > +		next_lock = task_blocked_on(task);
> > +		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
> > +		raw_spin_unlock(&lock->wait_lock);
> > +
> > +		/* If owner is not blocked, end of chain. */
> > +		if (!next_lock)
> > +			goto out_put_task;
> 
> On the loop back around, have something like:
> 
> 	if (top_waiter) {
> 		if (!task_has_pi_waiters(task))
> 			goto out_unlock_pi;

	The task has at least one pi waiter.
 
> 		if (!requeue &&
> 		    top_waiter != task_top_pi_waiter(task)) {
> 			if (!detect_deadlock)
> 				goto out_unlock_pi;
> 			else
> 				requeue = false;
> 		}

	Errm? if requeue is off we are in deadlock detection chainwalk
	mode. So all we care about is whether task is blocked on
	next_lock or not.

What you actually missed is that we need to read out top_waiter for
the current lock. Fixed already.
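
A sketch of that readout in the !requeue branch (assumed from the
description above; it also uses the task_blocked_on_lock() name from the
rest of the file, not the task_blocked_on() that failed to compile):

	/* Store whether owner is blocked itself and drop locks */
	next_lock = task_blocked_on_lock(task);
	/* Read out the top waiter of @lock for the next iteration */
	top_waiter = rt_mutex_top_waiter(lock);
	raw_spin_unlock_irqrestore(&task->pi_lock, flags);
	raw_spin_unlock(&lock->wait_lock);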
    

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-10 15:19     ` Steven Rostedt
@ 2014-06-10 17:43       ` Thomas Gleixner
  2014-06-10 17:51         ` Steven Rostedt
  0 siblings, 1 reply; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-10 17:43 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Brad Mouring, LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low

On Tue, 10 Jun 2014, Steven Rostedt wrote:

> On Tue, 10 Jun 2014 09:57:25 -0500
> "Brad Mouring" <bmouring@ni.com> wrote:
> 
> 
> > > +		/* Store whether owner is blocked itself and drop locks */
> > > +		next_lock = task_blocked_on(task);
> > task_blocked_on(task) is not clear to me, the recipient of the
> > return is the only clue that hints at what the function does.
> 
> Well, this is more than confusing, it's the only user, all other users
> are task_blocked_on_lock(), and this causes the code not to compile.

Grr, yes.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-10 17:41     ` Thomas Gleixner
@ 2014-06-10 17:47       ` Steven Rostedt
  2014-06-10 20:45         ` Thomas Gleixner
  0 siblings, 1 reply; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10 17:47 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Tue, 10 Jun 2014 19:41:39 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

 
> > On the loop back around, have something like:
> > 
> > 	if (top_waiter) {
> > 		if (!task_has_pi_waiters(task))
> > 			goto out_unlock_pi;
> 
> 	The task has at least one pi waiter.
>  
> > 		if (!requeue &&
> > 		    top_waiter != task_top_pi_waiter(task)) {
> > 			if (!detect_deadlock)
> > 				goto out_unlock_pi;
> > 			else
> > 				requeue = false;
> > 		}
> 
> 	Errm? if requeue is off we are in deadlock detection chainwalk
> 	mode. So all we care about is whether task is blocked on
> 	next_lock or not.

Actually that was a typo on my part. That should have been:

	if (requeue &&
		...

As we don't need to read the task_top_pi_waiter() again.
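
i.e., roughly (sketch):

	if (requeue && top_waiter != task_top_pi_waiter(task)) {
		if (!detect_deadlock)
			goto out_unlock_pi;
		else
			requeue = false;
	}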

-- Steve


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-10 17:43       ` Thomas Gleixner
@ 2014-06-10 17:51         ` Steven Rostedt
  2014-06-10 20:46           ` Thomas Gleixner
  0 siblings, 1 reply; 40+ messages in thread
From: Steven Rostedt @ 2014-06-10 17:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Brad Mouring, LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low

On Tue, 10 Jun 2014 19:43:16 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Tue, 10 Jun 2014, Steven Rostedt wrote:
> 
> > On Tue, 10 Jun 2014 09:57:25 -0500
> > "Brad Mouring" <bmouring@ni.com> wrote:
> > 
> > 
> > > > +		/* Store whether owner is blocked itself and drop locks */
> > > > +		next_lock = task_blocked_on(task);
> > > task_blocked_on(task) is not clear to me, the recipient of the
> > > return is the only clue that hints at what the function does.
> > 
> > Well, this is more than confusing, it's the only user, all other users
> > are task_blocked_on_lock(), and this causes the code not to compile.
> 
> Grr, yes.

Luckily you are not posting this to that grumpy IRQ maintainer. He'd
shoot some frozen sharks your way if you sent him patches like this ;-)

-- Steve

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-10 17:47       ` Steven Rostedt
@ 2014-06-10 20:45         ` Thomas Gleixner
  0 siblings, 0 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-10 20:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan, Jason Low,
	Brad Mouring

On Tue, 10 Jun 2014, Steven Rostedt wrote:

> On Tue, 10 Jun 2014 19:41:39 +0200 (CEST)
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
>  
> > > On the loop back around, have something like:
> > > 
> > > 	if (top_waiter) {
> > > 		if (!task_has_pi_waiters(task))
> > > 			goto out_unlock_pi;
> > 
> > 	The task has at least one pi waiter.
> >  
> > > 		if (!requeue &&
> > > 		    top_waiter != task_top_pi_waiter(task)) {
> > > 			if (!detect_deadlock)
> > > 				goto out_unlock_pi;
> > > 			else
> > > 				requeue = false;
> > > 		}
> > 
> > 	Errm? If requeue is off, we are in deadlock-detection chain
> > 	walk mode. So all we care about is whether task is blocked on
> > 	next_lock or not.
> 
> Actually, that was a typo on my part. It should have been:
> 
> 	if (requeue &&
> 		...
> 
> We don't need to read task_top_pi_waiter() again.

     if (requeue ...

is completely pointless, as the code you were talking about is in the

   if (!requeue) {

branch. So what?
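
For context: rather than sprinkling conditionals through the requeue
path, the V3 patch gives the deadlock-detection-only walk its own
branch. A condensed sketch of that branch (the actual patch carries
more commentary and step numbering):

	if (!requeue) {
		/* Deadlock detection only. Release @task */
		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
		put_task_struct(task);

		/* If there is no owner of the lock, end of chain */
		if (!rt_mutex_owner(lock)) {
			raw_spin_unlock(&lock->wait_lock);
			return 0;
		}

		/* Grab the next task, i.e. the owner of @lock */
		task = rt_mutex_owner(lock);
		get_task_struct(task);
		raw_spin_lock_irqsave(&task->pi_lock, flags);

		/* Store whether the owner is blocked itself */
		next_lock = task_blocked_on_lock(task);
		top_waiter = rt_mutex_top_waiter(lock);

		/* Drop locks and check: end of chain or next round? */
		raw_spin_unlock_irqrestore(&task->pi_lock, flags);
		raw_spin_unlock(&lock->wait_lock);

		if (!next_lock)
			goto out_put_task;
		goto again;
	}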

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk
  2014-06-10 17:51         ` Steven Rostedt
@ 2014-06-10 20:46           ` Thomas Gleixner
  0 siblings, 0 replies; 40+ messages in thread
From: Thomas Gleixner @ 2014-06-10 20:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Brad Mouring, LKML, Peter Zijlstra, Ingo Molnar, Lai Jiangshan,
	Jason Low

On Tue, 10 Jun 2014, Steven Rostedt wrote:
> On Tue, 10 Jun 2014 19:43:16 +0200 (CEST)
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > On Tue, 10 Jun 2014, Steven Rostedt wrote:
> > 
> > > On Tue, 10 Jun 2014 09:57:25 -0500
> > > "Brad Mouring" <bmouring@ni.com> wrote:
> > > 
> > > 
> > > > > +		/* Store whether owner is blocked itself and drop locks */
> > > > > +		next_lock = task_blocked_on(task);
> > > > task_blocked_on(task) is not clear to me; the recipient of the
> > > > return value is the only clue that hints at what the function does.
> > > 
> > > Well, this is more than confusing; it's the only user of that name,
> > > all other call sites use task_blocked_on_lock(), and this causes the
> > > code not to compile.
> > 
> > Grr, yes.
> 
> Luckily you are not posting this to that grumpy IRQ maintainer. He'd
> shoot some frozen sharks your way if you sent him patches like this ;-)

Rightfully so. :)

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread

Thread overview: 40+ messages
2014-06-09 20:28 [patch V3 0/7] rtmutex: Code clarification and optimization Thomas Gleixner
2014-06-09 20:28 ` [patch V3 1/7] rtmutex: Deobfuscate chain walk Thomas Gleixner
2014-06-09 20:59   ` Steven Rostedt
2014-06-10  3:52     ` Lai Jiangshan
2014-06-10  3:21   ` Jason Low
2014-06-10 13:57   ` Brad Mouring
2014-06-09 20:28 ` [patch V3 2/7] rtmutex: Clarify the boost/deboost part Thomas Gleixner
2014-06-10  0:37   ` Steven Rostedt
2014-06-10  3:22   ` Jason Low
2014-06-10 14:04   ` Brad Mouring
2014-06-09 20:28 ` [patch V3 3/7] rtmutex: Document pi chain walk Thomas Gleixner
2014-06-10  0:45   ` Steven Rostedt
2014-06-10  3:51     ` Lai Jiangshan
2014-06-10 14:21   ` Brad Mouring
2014-06-09 20:28 ` [patch V3 4/7] rtmutex: Simplify remove_waiter() Thomas Gleixner
2014-06-10  0:53   ` Steven Rostedt
2014-06-10  3:35     ` Lai Jiangshan
2014-06-10  3:44     ` Jason Low
2014-06-10 14:10   ` Brad Mouring
2014-06-09 20:28 ` [patch V3 5/7] rtmutex: Confine deadlock logic to futex Thomas Gleixner
2014-06-10  0:59   ` Steven Rostedt
2014-06-10  4:03     ` Lai Jiangshan
2014-06-10 17:39     ` Thomas Gleixner
2014-06-09 20:28 ` [patch V3 6/7] rtmutex: Cleanup deadlock detector debug logic Thomas Gleixner
2014-06-10  1:04   ` Steven Rostedt
2014-06-10 15:09   ` Brad Mouring
2014-06-09 20:28 ` [patch V3 7/7] rtmutex: Avoid pointless requeueing in the deadlock detection chain walk Thomas Gleixner
2014-06-10  1:20   ` Steven Rostedt
2014-06-10  3:48     ` Jason Low
2014-06-10 17:41     ` Thomas Gleixner
2014-06-10 17:47       ` Steven Rostedt
2014-06-10 20:45         ` Thomas Gleixner
2014-06-10 14:57   ` Brad Mouring
2014-06-10 15:19     ` Steven Rostedt
2014-06-10 17:43       ` Thomas Gleixner
2014-06-10 17:51         ` Steven Rostedt
2014-06-10 20:46           ` Thomas Gleixner
2014-06-10  0:27 ` [patch V3 0/7] rtmutex: Code clarification and optimization Steven Rostedt
2014-06-10  0:35   ` Steven Rostedt
2014-06-10  3:00     ` Lai Jiangshan
