* [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
@ 2016-11-23 11:25 Nicolai Hähnle
From: Nicolai Hähnle @ 2016-11-23 11:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Nicolai Hähnle, Peter Zijlstra, Ingo Molnar, Chris Wilson,
	Maarten Lankhorst, dri-devel, stable, Nicolai Hähnle

From: Nicolai Hähnle <Nicolai.Haehnle@amd.com>

Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in
the following example. Acquire context stamps are ordered like the thread
numbers, i.e. thread #1 should back off when it encounters a mutex locked
by thread #0 etc.

Thread #0    Thread #1    Thread #2    Thread #3
---------    ---------    ---------    ---------
                                       lock(ww)
                                       success
             lock(ww')
             success
                          lock(ww)
             lock(ww)        .
                .            .         unlock(ww) part 1
lock(ww)        .            .            .
success         .            .            .
                .            .         unlock(ww) part 2
                .         back off
lock(ww')       .
   .            .
(stuck)      (stuck)

Here, unlock(ww) part 1 is the part that sets lock->base.count to 1
(without being protected by lock->base.wait_lock), meaning that thread #0
can acquire ww in the fast path or, much more likely, the medium path
in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then
won't wake up any of the waiters in ww_mutex_set_context_fastpath.

Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is
thread #2, since waiters are added at the tail. Thread #2 wakes up and
backs off since it sees ww owned by a context with a lower stamp.

Meanwhile, thread #1 is never woken up, and so it won't back off its lock
on ww'. So thread #0 gets stuck waiting for ww' to be released.
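
To make the two parts concrete, here is a heavily condensed sketch of the
unlock slow path as it behaved before this patch (condensed from
kernel/locking/mutex.c of this era; wake_first_waiter() is a hypothetical
stand-in for the wait-list handling):

        /* Sketch only, not the actual kernel code. */
        void unlock_slowpath_sketch(struct mutex *lock)
        {
                /* "part 1": lock becomes available before wait_lock is held */
                atomic_set(&lock->count, 1);

                /*
                 * Race window: thread #0 can acquire the mutex right here via
                 * the fast path or mutex_optimistic_spin, setting count back
                 * to 0, so ww_mutex_set_context_fastpath sees count == 0 and
                 * skips all waiter wakeups.
                 */

                /* "part 2": wake exactly one waiter under wait_lock */
                spin_lock(&lock->wait_lock);
                if (!list_empty(&lock->wait_list))
                        wake_first_waiter(lock);        /* hypothetical helper */
                spin_unlock(&lock->wait_lock);
        }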

This patch fixes the deadlock by waking up all waiters in the slow path
of ww_mutex_unlock.

We have an internal test case for amdgpu which continuously submits
command streams from tens of threads, where all command streams reference
hundreds of GPU buffer objects with a lot of overlap in the buffer lists
between command streams. This test reliably caused a deadlock, and while I
haven't completely confirmed that it is exactly the scenario outlined
above, this patch does fix the test case.
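
For context, the "back off" above is the standard ww_mutex acquire pattern
that such a test exercises from many threads at once. A minimal sketch,
assuming hypothetical buffer objects, a buffer_ww_class and an
unlock_all_locked() helper (only the ww_* calls are the real API):

        struct ww_acquire_ctx ctx;
        struct buffer *bo, *contended = NULL;
        int ret;

        ww_acquire_init(&ctx, &buffer_ww_class);
retry:
        list_for_each_entry(bo, &cs->buffers, head) {
                if (bo == contended) {          /* already taken via _slow */
                        contended = NULL;
                        continue;
                }
                ret = ww_mutex_lock(&bo->lock, &ctx);
                if (ret == -EDEADLK) {
                        /* Our stamp is younger: drop everything held... */
                        unlock_all_locked(cs, bo);      /* hypothetical */
                        contended = bo;
                        /* ...then sleep on the contended lock first. */
                        ww_mutex_lock_slow(&bo->lock, &ctx);
                        goto retry;
                }
        }
        ww_acquire_done(&ctx);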

v2:
- use wake_q_add
- add additional explanations

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: dri-devel@lists.freedesktop.org
Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
---
 kernel/locking/mutex.c | 33 +++++++++++++++++++++++++++++----
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index a70b90d..7fbf9b4 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -409,6 +409,9 @@ static bool mutex_optimistic_spin(struct mutex *lock,
 __visible __used noinline
 void __sched __mutex_unlock_slowpath(atomic_t *lock_count);
 
+static __used noinline
+void __sched __mutex_unlock_slowpath_wakeall(atomic_t *lock_count);
+
 /**
  * mutex_unlock - release the mutex
  * @lock: the mutex to be released
@@ -473,7 +476,14 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
 	 */
 	mutex_clear_owner(&lock->base);
 #endif
-	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
+	/*
+	 * A previously _not_ waiting task may acquire the lock via the fast
+	 * path during our unlock. In that case, already waiting tasks may have
+	 * to back off to avoid a deadlock. Wake up all waiters so that they
+	 * can check their acquire context stamp against the new owner.
+	 */
+	__mutex_fastpath_unlock(&lock->base.count,
+				__mutex_unlock_slowpath_wakeall);
 }
 EXPORT_SYMBOL(ww_mutex_unlock);
 
@@ -716,7 +726,7 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
  * Release the lock, slowpath:
  */
 static inline void
-__mutex_unlock_common_slowpath(struct mutex *lock, int nested)
+__mutex_unlock_common_slowpath(struct mutex *lock, int nested, int wake_all)
 {
 	unsigned long flags;
 	WAKE_Q(wake_q);
@@ -740,7 +750,14 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
 	mutex_release(&lock->dep_map, nested, _RET_IP_);
 	debug_mutex_unlock(lock);
 
-	if (!list_empty(&lock->wait_list)) {
+	if (wake_all) {
+		struct mutex_waiter *waiter;
+
+		list_for_each_entry(waiter, &lock->wait_list, list) {
+			debug_mutex_wake_waiter(lock, waiter);
+			wake_q_add(&wake_q, waiter->task);
+		}
+	} else if (!list_empty(&lock->wait_list)) {
 		/* get the first entry from the wait-list: */
 		struct mutex_waiter *waiter =
 				list_entry(lock->wait_list.next,
@@ -762,7 +779,15 @@ __mutex_unlock_slowpath(atomic_t *lock_count)
 {
 	struct mutex *lock = container_of(lock_count, struct mutex, count);
 
-	__mutex_unlock_common_slowpath(lock, 1);
+	__mutex_unlock_common_slowpath(lock, 1, 0);
+}
+
+static void
+__mutex_unlock_slowpath_wakeall(atomic_t *lock_count)
+{
+	struct mutex *lock = container_of(lock_count, struct mutex, count);
+
+	__mutex_unlock_common_slowpath(lock, 1, 1);
 }
 
 #ifndef CONFIG_DEBUG_LOCK_ALLOC
-- 
2.7.4


* [PATCH 2/4] locking/ww_mutex: Remove redundant wakeups in ww_mutex_set_context_slowpath
  2016-11-23 11:25 [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes Nicolai Hähnle
@ 2016-11-23 11:25   ` Nicolai Hähnle
From: Nicolai Hähnle @ 2016-11-23 11:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Nicolai Hähnle, Peter Zijlstra, Ingo Molnar, Chris Wilson,
	Maarten Lankhorst, dri-devel, Nicolai Hähnle

From: Nicolai Hähnle <Nicolai.Haehnle@amd.com>

When ww_mutex_set_context_slowpath runs, we are in one of two situations:

1. The current task was woken up by ww_mutex_unlock.

2. The current task is racing with ww_mutex_unlock: We entered the slow
   path while lock->base.count <= 0, but skipped the wait in
   __mutex_lock_common.

In both cases, all tasks that are currently contending for the lock are
either racing as in point 2 and blocked on lock->wait_lock, or they have
been or will be woken up by ww_mutex_unlock.

Either way, they will check their stamp against ours without having to be
woken up again.

This optimization is only possible because the previous patch
("locking/ww_mutex: Fix a deadlock affecting ww_mutexes") changed
ww_mutex_unlock to wake all waiters in its slow path.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
---
 kernel/locking/mutex.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 7fbf9b4..7c09376 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -192,8 +192,10 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock,
 }
 
 /*
- * After acquiring lock in the slowpath set ctx and wake up any
- * waiters so they can recheck.
+ * After acquiring lock in the slowpath set ctx.
+ *
+ * Unlike the fast path, other waiters are already woken up by ww_mutex_unlock,
+ * so we don't have to do it again here.
  *
  * Callers must hold the mutex wait_lock.
  */
@@ -201,19 +203,8 @@ static __always_inline void
 ww_mutex_set_context_slowpath(struct ww_mutex *lock,
 			      struct ww_acquire_ctx *ctx)
 {
-	struct mutex_waiter *cur;
-
 	ww_mutex_lock_acquired(lock, ctx);
 	lock->ctx = ctx;
-
-	/*
-	 * Give any possible sleeping processes the chance to wake up,
-	 * so they can recheck if they have to back off.
-	 */
-	list_for_each_entry(cur, &lock->base.wait_list, list) {
-		debug_mutex_wake_waiter(&lock->base, cur);
-		wake_up_process(cur->task);
-	}
 }
 
 #ifdef CONFIG_MUTEX_SPIN_ON_OWNER
-- 
2.7.4


* [PATCH 3/4] locking/Documentation: fix a typo
  2016-11-23 11:25 [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes Nicolai Hähnle
@ 2016-11-23 11:25   ` Nicolai Hähnle
From: Nicolai Hähnle @ 2016-11-23 11:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Nicolai Hähnle, Peter Zijlstra, Ingo Molnar, dri-devel,
	Nicolai Hähnle

From: Nicolai Hähnle <Nicolai.Haehnle@amd.com>

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
---
 Documentation/locking/00-INDEX | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/locking/00-INDEX b/Documentation/locking/00-INDEX
index c256c9b..adc1d289 100644
--- a/Documentation/locking/00-INDEX
+++ b/Documentation/locking/00-INDEX
@@ -13,4 +13,4 @@ rt-mutex.txt
 spinlocks.txt
 	- info on using spinlocks to provide exclusive access in kernel.
 ww-mutex-design.txt
-	- Intro to Mutex wait/would deadlock handling.s
+	- Intro to Mutex wait/wound deadlock handling.
-- 
2.7.4


* [PATCH 4/4] locking/ww_mutex: Fix a comment typo
  2016-11-23 11:25 [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes Nicolai Hähnle
@ 2016-11-23 11:25   ` Nicolai Hähnle
From: Nicolai Hähnle @ 2016-11-23 11:25 UTC (permalink / raw)
  To: linux-kernel
  Cc: Nicolai Hähnle, Peter Zijlstra, Ingo Molnar, dri-devel,
	Nicolai Hähnle

From: Nicolai Hähnle <Nicolai.Haehnle@amd.com>

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
---
 include/linux/ww_mutex.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
index 760399a..fd93cd3 100644
--- a/include/linux/ww_mutex.h
+++ b/include/linux/ww_mutex.h
@@ -97,7 +97,7 @@ static inline void ww_mutex_init(struct ww_mutex *lock,
  * @ctx: w/w acquire context to initialize
  * @ww_class: w/w class of the context
  *
- * Initializes an context to acquire multiple mutexes of the given w/w class.
+ * Initializes a context to acquire multiple mutexes of the given w/w class.
  *
  * Context-based w/w mutex acquiring can be done in any order whatsoever within
  * a given lock class. Deadlocks will be detected and handled with the
-- 
2.7.4


* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-23 11:25 [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes Nicolai Hähnle
@ 2016-11-23 12:50   ` Daniel Vetter
From: Daniel Vetter @ 2016-11-23 12:50 UTC (permalink / raw)
  To: Nicolai Hähnle
  Cc: linux-kernel, Nicolai Hähnle, Peter Zijlstra, dri-devel,
	Ingo Molnar, stable, Maarten Lankhorst

On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
> From: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
> 
> Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in
> the following example. Acquire context stamps are ordered like the thread
> numbers, i.e. thread #1 should back off when it encounters a mutex locked
> by thread #0 etc.
> 
> Thread #0    Thread #1    Thread #2    Thread #3
> ---------    ---------    ---------    ---------
>                                        lock(ww)
>                                        success
>              lock(ww')
>              success
>                           lock(ww)
>              lock(ww)        .
>                 .            .         unlock(ww) part 1
> lock(ww)        .            .            .
> success         .            .            .
>                 .            .         unlock(ww) part 2
>                 .         back off
> lock(ww')       .
>    .            .
> (stuck)      (stuck)
> 
> Here, unlock(ww) part 1 is the part that sets lock->base.count to 1
> (without being protected by lock->base.wait_lock), meaning that thread #0
> can acquire ww in the fast path or, much more likely, the medium path
> in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then
> won't wake up any of the waiters in ww_mutex_set_context_fastpath.
> 
> Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is
> thread #2, since waiters are added at the tail. Thread #2 wakes up and
> backs off since it sees ww owned by a context with a lower stamp.
> 
> Meanwhile, thread #1 is never woken up, and so it won't back off its lock
> on ww'. So thread #0 gets stuck waiting for ww' to be released.
> 
> This patch fixes the deadlock by waking up all waiters in the slow path
> of ww_mutex_unlock.
> 
> We have an internal test case for amdgpu which continuously submits
> command streams from tens of threads, where all command streams reference
> hundreds of GPU buffer objects with a lot of overlap in the buffer lists
> between command streams. This test reliably caused a deadlock, and while I
> haven't completely confirmed that it is exactly the scenario outlined
> above, this patch does fix the test case.
> 
> v2:
> - use wake_q_add
> - add additional explanations
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: stable@vger.kernel.org
> Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Yeah, when the owning ctx changes we need to wake up all waiters, to make
sure we catch all (new) deadlock scenarios. And I tried poking at your
example, and I think it's solid and can't be minimized any further. I
don't have much of a clue about the mutex.c code itself, but the changes
seem reasonable. With that caveat:

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Cheers, Daniel

> ---
>  kernel/locking/mutex.c | 33 +++++++++++++++++++++++++++++----
>  1 file changed, 29 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index a70b90d..7fbf9b4 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -409,6 +409,9 @@ static bool mutex_optimistic_spin(struct mutex *lock,
>  __visible __used noinline
>  void __sched __mutex_unlock_slowpath(atomic_t *lock_count);
>  
> +static __used noinline
> +void __sched __mutex_unlock_slowpath_wakeall(atomic_t *lock_count);
> +
>  /**
>   * mutex_unlock - release the mutex
>   * @lock: the mutex to be released
> @@ -473,7 +476,14 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
>  	 */
>  	mutex_clear_owner(&lock->base);
>  #endif
> -	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
> +	/*
> +	 * A previously _not_ waiting task may acquire the lock via the fast
> +	 * path during our unlock. In that case, already waiting tasks may have
> +	 * to back off to avoid a deadlock. Wake up all waiters so that they
> +	 * can check their acquire context stamp against the new owner.
> +	 */
> +	__mutex_fastpath_unlock(&lock->base.count,
> +				__mutex_unlock_slowpath_wakeall);
>  }
>  EXPORT_SYMBOL(ww_mutex_unlock);
>  
> @@ -716,7 +726,7 @@ EXPORT_SYMBOL_GPL(__ww_mutex_lock_interruptible);
>   * Release the lock, slowpath:
>   */
>  static inline void
> -__mutex_unlock_common_slowpath(struct mutex *lock, int nested)
> +__mutex_unlock_common_slowpath(struct mutex *lock, int nested, int wake_all)
>  {
>  	unsigned long flags;
>  	WAKE_Q(wake_q);
> @@ -740,7 +750,14 @@ __mutex_unlock_common_slowpath(struct mutex *lock, int nested)
>  	mutex_release(&lock->dep_map, nested, _RET_IP_);
>  	debug_mutex_unlock(lock);
>  
> -	if (!list_empty(&lock->wait_list)) {
> +	if (wake_all) {
> +		struct mutex_waiter *waiter;
> +
> +		list_for_each_entry(waiter, &lock->wait_list, list) {
> +			debug_mutex_wake_waiter(lock, waiter);
> +			wake_q_add(&wake_q, waiter->task);
> +		}
> +	} else if (!list_empty(&lock->wait_list)) {
>  		/* get the first entry from the wait-list: */
>  		struct mutex_waiter *waiter =
>  				list_entry(lock->wait_list.next,
> @@ -762,7 +779,15 @@ __mutex_unlock_slowpath(atomic_t *lock_count)
>  {
>  	struct mutex *lock = container_of(lock_count, struct mutex, count);
>  
> -	__mutex_unlock_common_slowpath(lock, 1);
> +	__mutex_unlock_common_slowpath(lock, 1, 0);
> +}
> +
> +static void
> +__mutex_unlock_slowpath_wakeall(atomic_t *lock_count)
> +{
> +	struct mutex *lock = container_of(lock_count, struct mutex, count);
> +
> +	__mutex_unlock_common_slowpath(lock, 1, 1);
>  }
>  
>  #ifndef CONFIG_DEBUG_LOCK_ALLOC
> -- 
> 2.7.4

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-23 11:25 [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes Nicolai Hähnle
@ 2016-11-23 13:00   ` Peter Zijlstra
From: Peter Zijlstra @ 2016-11-23 13:00 UTC (permalink / raw)
  To: Nicolai Hähnle
  Cc: linux-kernel, Nicolai Hähnle, Ingo Molnar, Chris Wilson,
	Maarten Lankhorst, dri-devel, stable

On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
> From: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
> 
> Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in
> the following example. Acquire context stamps are ordered like the thread
> numbers, i.e. thread #1 should back off when it encounters a mutex locked
> by thread #0 etc.
> 
> Thread #0    Thread #1    Thread #2    Thread #3
> ---------    ---------    ---------    ---------
>                                        lock(ww)
>                                        success
>              lock(ww')
>              success
>                           lock(ww)
>              lock(ww)        .
>                 .            .         unlock(ww) part 1
> lock(ww)        .            .            .
> success         .            .            .
>                 .            .         unlock(ww) part 2
>                 .         back off
> lock(ww')       .
>    .            .
> (stuck)      (stuck)
> 
> Here, unlock(ww) part 1 is the part that sets lock->base.count to 1
> (without being protected by lock->base.wait_lock), meaning that thread #0
> can acquire ww in the fast path or, much more likely, the medium path
> in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then
> won't wake up any of the waiters in ww_mutex_set_context_fastpath.
> 
> Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is
> thread #2, since waiters are added at the tail. Thread #2 wakes up and
> backs off since it sees ww owned by a context with a lower stamp.
> 
> Meanwhile, thread #1 is never woken up, and so it won't back off its lock
> on ww'. So thread #0 gets stuck waiting for ww' to be released.
> 
> This patch fixes the deadlock by waking up all waiters in the slow path
> of ww_mutex_unlock.
> 
> We have an internal test case for amdgpu which continuously submits
> command streams from tens of threads, where all command streams reference
> hundreds of GPU buffer objects with a lot of overlap in the buffer lists
> between command streams. This test reliably caused a deadlock, and while I
> haven't completely confirmed that it is exactly the scenario outlined
> above, this patch does fix the test case.
> 
> v2:
> - use wake_q_add
> - add additional explanations
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> Cc: dri-devel@lists.freedesktop.org
> Cc: stable@vger.kernel.org
> Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

Completely and utterly fails to apply; I think this patch is based on
code prior to the mutex rewrite.

Please rebase on tip/locking/core.

Also, is this a regression, or has this been a 'feature' of the ww_mutex
code from early on?


* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-23 13:00   ` Peter Zijlstra
@ 2016-11-23 13:08     ` Daniel Vetter
From: Daniel Vetter @ 2016-11-23 13:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nicolai Hähnle, Nicolai Hähnle, linux-kernel,
	dri-devel, Ingo Molnar, stable, Maarten Lankhorst

On Wed, Nov 23, 2016 at 02:00:46PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
> > From: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
> > 
> > Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in
> > the following example. Acquire context stamps are ordered like the thread
> > numbers, i.e. thread #1 should back off when it encounters a mutex locked
> > by thread #0 etc.
> > 
> > Thread #0    Thread #1    Thread #2    Thread #3
> > ---------    ---------    ---------    ---------
> >                                        lock(ww)
> >                                        success
> >              lock(ww')
> >              success
> >                           lock(ww)
> >              lock(ww)        .
> >                 .            .         unlock(ww) part 1
> > lock(ww)        .            .            .
> > success         .            .            .
> >                 .            .         unlock(ww) part 2
> >                 .         back off
> > lock(ww')       .
> >    .            .
> > (stuck)      (stuck)
> > 
> > Here, unlock(ww) part 1 is the part that sets lock->base.count to 1
> > (without being protected by lock->base.wait_lock), meaning that thread #0
> > can acquire ww in the fast path or, much more likely, the medium path
> > in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then
> > won't wake up any of the waiters in ww_mutex_set_context_fastpath.
> > 
> > Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is
> > thread #2, since waiters are added at the tail. Thread #2 wakes up and
> > backs off since it sees ww owned by a context with a lower stamp.
> > 
> > Meanwhile, thread #1 is never woken up, and so it won't back off its lock
> > on ww'. So thread #0 gets stuck waiting for ww' to be released.
> > 
> > This patch fixes the deadlock by waking up all waiters in the slow path
> > of ww_mutex_unlock.
> > 
> > We have an internal test case for amdgpu which continuously submits
> > command streams from tens of threads, where all command streams reference
> > hundreds of GPU buffer objects with a lot of overlap in the buffer lists
> > between command streams. This test reliably caused a deadlock, and while I
> > haven't completely confirmed that it is exactly the scenario outlined
> > above, this patch does fix the test case.
> > 
> > v2:
> > - use wake_q_add
> > - add additional explanations
> > 
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> > Cc: dri-devel@lists.freedesktop.org
> > Cc: stable@vger.kernel.org
> > Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
> > Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
> 
> Completely and utterly fails to apply; I think this patch is based on
> code prior to the mutex rewrite.
> 
> Please rebase on tip/locking/core.
> 
> Also, is this a regression, or has this been a 'feature' of the ww_mutex
> code from early on?

Sorry, forgot to mention that, but I checked. It seems to have been broken
since day 1; at least judging by the original code, the wake-single-waiter
behaviour is as old as the mutex code added in 2006.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-23 13:08     ` Daniel Vetter
@ 2016-11-23 13:11       ` Daniel Vetter
From: Daniel Vetter @ 2016-11-23 13:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nicolai Hähnle, Nicolai Hähnle, linux-kernel,
	dri-devel, Ingo Molnar, stable, Maarten Lankhorst

On Wed, Nov 23, 2016 at 02:08:48PM +0100, Daniel Vetter wrote:
> On Wed, Nov 23, 2016 at 02:00:46PM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
> > > From: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
> > > 
> > > Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in
> > > the following example. Acquire context stamps are ordered like the thread
> > > numbers, i.e. thread #1 should back off when it encounters a mutex locked
> > > by thread #0 etc.
> > > 
> > > Thread #0    Thread #1    Thread #2    Thread #3
> > > ---------    ---------    ---------    ---------
> > >                                        lock(ww)
> > >                                        success
> > >              lock(ww')
> > >              success
> > >                           lock(ww)
> > >              lock(ww)        .
> > >                 .            .         unlock(ww) part 1
> > > lock(ww)        .            .            .
> > > success         .            .            .
> > >                 .            .         unlock(ww) part 2
> > >                 .         back off
> > > lock(ww')       .
> > >    .            .
> > > (stuck)      (stuck)
> > > 
> > > Here, unlock(ww) part 1 is the part that sets lock->base.count to 1
> > > (without being protected by lock->base.wait_lock), meaning that thread #0
> > > can acquire ww in the fast path or, much more likely, the medium path
> > > in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then
> > > won't wake up any of the waiters in ww_mutex_set_context_fastpath.
> > > 
> > > Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is
> > > thread #2, since waiters are added at the tail. Thread #2 wakes up and
> > > backs off since it sees ww owned by a context with a lower stamp.
> > > 
> > > Meanwhile, thread #1 is never woken up, and so it won't back off its lock
> > > on ww'. So thread #0 gets stuck waiting for ww' to be released.
> > > 
> > > This patch fixes the deadlock by waking up all waiters in the slow path
> > > of ww_mutex_unlock.
> > > 
> > > We have an internal test case for amdgpu which continuously submits
> > > command streams from tens of threads, where all command streams reference
> > > hundreds of GPU buffer objects with a lot of overlap in the buffer lists
> > > between command streams. This test reliably caused a deadlock, and while I
> > > haven't completely confirmed that it is exactly the scenario outlined
> > > above, this patch does fix the test case.
> > > 
> > > v2:
> > > - use wake_q_add
> > > - add additional explanations
> > > 
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> > > Cc: dri-devel@lists.freedesktop.org
> > > Cc: stable@vger.kernel.org
> > > Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
> > > Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
> > 
> > Completely and utterly fails to apply; I think this patch is based on
> > code prior to the mutex rewrite.
> > 
> > Please rebase on tip/locking/core.
> > 
> > Also, is this a regression, or has this been a 'feature' of the ww_mutex
> > code from early on?
> 
> Sorry forgot to mention that, but I checked. Seems to have been broken
> since day 1, at least looking at the original code the wake-single-waiter
> stuff is as old as the mutex code added in 2006.

More details: for GPU drivers this was originally working, since the
ww_mutex implementation in ttm used wake_up_all. So we need to add a

Fixes: 5e338405119a ("drm/ttm: convert to the reservation api")
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
@ 2016-11-23 13:11       ` Daniel Vetter
  0 siblings, 0 replies; 37+ messages in thread
From: Daniel Vetter @ 2016-11-23 13:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nicolai Hähnle, Nicolai Hähnle, linux-kernel,
	dri-devel, Ingo Molnar, stable, Maarten Lankhorst

On Wed, Nov 23, 2016 at 02:08:48PM +0100, Daniel Vetter wrote:
> On Wed, Nov 23, 2016 at 02:00:46PM +0100, Peter Zijlstra wrote:
> > On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai H�hnle wrote:
> > > From: Nicolai H�hnle <Nicolai.Haehnle@amd.com>
> > > 
> > > Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in
> > > the following example. Acquire context stamps are ordered like the thread
> > > numbers, i.e. thread #1 should back off when it encounters a mutex locked
> > > by thread #0 etc.
> > > 
> > > Thread #0    Thread #1    Thread #2    Thread #3
> > > ---------    ---------    ---------    ---------
> > >                                        lock(ww)
> > >                                        success
> > >              lock(ww')
> > >              success
> > >                           lock(ww)
> > >              lock(ww)        .
> > >                 .            .         unlock(ww) part 1
> > > lock(ww)        .            .            .
> > > success         .            .            .
> > >                 .            .         unlock(ww) part 2
> > >                 .         back off
> > > lock(ww')       .
> > >    .            .
> > > (stuck)      (stuck)
> > > 
> > > Here, unlock(ww) part 1 is the part that sets lock->base.count to 1
> > > (without being protected by lock->base.wait_lock), meaning that thread #0
> > > can acquire ww in the fast path or, much more likely, the medium path
> > > in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then
> > > won't wake up any of the waiters in ww_mutex_set_context_fastpath.
> > > 
> > > Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is
> > > thread #2, since waiters are added at the tail. Thread #2 wakes up and
> > > backs off since it sees ww owned by a context with a lower stamp.
> > > 
> > > Meanwhile, thread #1 is never woken up, and so it won't back off its lock
> > > on ww'. So thread #0 gets stuck waiting for ww' to be released.
> > > 
> > > This patch fixes the deadlock by waking up all waiters in the slow path
> > > of ww_mutex_unlock.
> > > 
> > > We have an internal test case for amdgpu which continuously submits
> > > command streams from tens of threads, where all command streams reference
> > > hundreds of GPU buffer objects with a lot of overlap in the buffer lists
> > > between command streams. This test reliably caused a deadlock, and while I
> > > haven't completely confirmed that it is exactly the scenario outlined
> > > above, this patch does fix the test case.
> > > 
> > > v2:
> > > - use wake_q_add
> > > - add additional explanations
> > > 
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> > > Cc: dri-devel@lists.freedesktop.org
> > > Cc: stable@vger.kernel.org
> > > Reviewed-by: Christian K�nig <christian.koenig@amd.com> (v1)
> > > Signed-off-by: Nicolai H�hnle <nicolai.haehnle@amd.com>
> > 
> > Completely and utterly fails to apply; I think this patch is based on
> > code prior to the mutex rewrite.
> > 
> > Please rebase on tip/locking/core.
> > 
> > Also, is this a regression, or has this been a 'feature' of the ww_mutex
> > code from early on?
> 
> Sorry forgot to mention that, but I checked. Seems to have been broken
> since day 1, at least looking at the original code the wake-single-waiter
> stuff is as old as the mutex code added in 2006.

More details: For gpu drivers this was originally working, since the
ww_mutex implementation in ttm did use wake_up_all. So need to add a

Fixes: 5e338405119a ("drm/ttm: convert to the reservation api")
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-23 13:11       ` Daniel Vetter
@ 2016-11-23 13:33         ` Maarten Lankhorst
  -1 siblings, 0 replies; 37+ messages in thread
From: Maarten Lankhorst @ 2016-11-23 13:33 UTC (permalink / raw)
  To: Peter Zijlstra, Nicolai Hähnle, Nicolai Hähnle,
	linux-kernel, dri-devel, Ingo Molnar, stable, Maarten Lankhorst

On 23-11-16 at 14:11, Daniel Vetter wrote:
> On Wed, Nov 23, 2016 at 02:08:48PM +0100, Daniel Vetter wrote:
>> On Wed, Nov 23, 2016 at 02:00:46PM +0100, Peter Zijlstra wrote:
>>> On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
>>>> From: Nicolai Hähnle <Nicolai.Haehnle@amd.com>
>>>>
>>>> Fix a race condition involving 4 threads and 2 ww_mutexes as indicated in
>>>> the following example. Acquire context stamps are ordered like the thread
>>>> numbers, i.e. thread #1 should back off when it encounters a mutex locked
>>>> by thread #0 etc.
>>>>
>>>> Thread #0    Thread #1    Thread #2    Thread #3
>>>> ---------    ---------    ---------    ---------
>>>>                                        lock(ww)
>>>>                                        success
>>>>              lock(ww')
>>>>              success
>>>>                           lock(ww)
>>>>              lock(ww)        .
>>>>                 .            .         unlock(ww) part 1
>>>> lock(ww)        .            .            .
>>>> success         .            .            .
>>>>                 .            .         unlock(ww) part 2
>>>>                 .         back off
>>>> lock(ww')       .
>>>>    .            .
>>>> (stuck)      (stuck)
>>>>
>>>> Here, unlock(ww) part 1 is the part that sets lock->base.count to 1
>>>> (without being protected by lock->base.wait_lock), meaning that thread #0
>>>> can acquire ww in the fast path or, much more likely, the medium path
>>>> in mutex_optimistic_spin. Since lock->base.count == 0, thread #0 then
>>>> won't wake up any of the waiters in ww_mutex_set_context_fastpath.
>>>>
>>>> Then, unlock(ww) part 2 wakes up _only_the_first_ waiter of ww. This is
>>>> thread #2, since waiters are added at the tail. Thread #2 wakes up and
>>>> backs off since it sees ww owned by a context with a lower stamp.
>>>>
>>>> Meanwhile, thread #1 is never woken up, and so it won't back off its lock
>>>> on ww'. So thread #0 gets stuck waiting for ww' to be released.
>>>>
>>>> This patch fixes the deadlock by waking up all waiters in the slow path
>>>> of ww_mutex_unlock.
>>>>
>>>> We have an internal test case for amdgpu which continuously submits
>>>> command streams from tens of threads, where all command streams reference
>>>> hundreds of GPU buffer objects with a lot of overlap in the buffer lists
>>>> between command streams. This test reliably caused a deadlock, and while I
>>>> haven't completely confirmed that it is exactly the scenario outlined
>>>> above, this patch does fix the test case.
>>>>
>>>> v2:
>>>> - use wake_q_add
>>>> - add additional explanations
>>>>
>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>>> Cc: dri-devel@lists.freedesktop.org
>>>> Cc: stable@vger.kernel.org
>>>> Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
>>>> Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
>>> Completely and utterly fails to apply; I think this patch is based on
>>> code prior to the mutex rewrite.
>>>
>>> Please rebase on tip/locking/core.
>>>
>>> Also, is this a regression, or has this been a 'feature' of the ww_mutex
>>> code from early on?
>> Sorry, forgot to mention that, but I checked. Seems to have been broken
>> since day 1; at least looking at the original code, the wake-single-waiter
>> stuff is as old as the mutex code added in 2006.
> More details: For gpu drivers this was originally working, since the
> ww_mutex implementation in ttm did use wake_up_all(). So we need to add a
>
> Fixes: 5e338405119a ("drm/ttm: convert to the reservation api")

But it's broken in the original kernel ww_mutex implementation. I guess it doesn't matter much, since ttm was the first in-kernel user anyway. :)

~Maarten

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-23 11:25 [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes Nicolai Hähnle
  2016-11-23 11:25   ` Nicolai Hähnle
@ 2016-11-23 14:03   ` Peter Zijlstra
  2016-11-23 11:25   ` Nicolai Hähnle
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 37+ messages in thread
From: Peter Zijlstra @ 2016-11-23 14:03 UTC (permalink / raw)
  To: Nicolai Hähnle
  Cc: linux-kernel, Nicolai Hähnle, Ingo Molnar, Chris Wilson,
	Maarten Lankhorst, dri-devel, stable

On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
> @@ -473,7 +476,14 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
>  	 */
>  	mutex_clear_owner(&lock->base);
>  #endif
> -	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
> +	/*
> +	 * A previously _not_ waiting task may acquire the lock via the fast
> +	 * path during our unlock. In that case, already waiting tasks may have
> +	 * to back off to avoid a deadlock. Wake up all waiters so that they
> +	 * can check their acquire context stamp against the new owner.
> +	 */
> +	__mutex_fastpath_unlock(&lock->base.count,
> +				__mutex_unlock_slowpath_wakeall);
>  }

So doing a wake-all has obvious issues with thundering herd etc. Also,
with the new mutex, you'd not be able to do hand-off, which would
introduce starvation cases.

Ideally we'd iterate the blocked list and pick the waiter with the
earliest stamp, or we'd maintain the list in stamp order instead of
FIFO, for ww_mutex.
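
For context, a minimal sketch of the wake-all slowpath being debated,
assuming the pre-4.9 (count-based) mutex internals; this is an
illustration, not the exact patch, and the lockdep/debug hooks are
omitted:

static __used noinline
void __sched __mutex_unlock_slowpath_wakeall(atomic_t *lock_count)
{
        struct mutex *lock = container_of(lock_count, struct mutex, count);
        struct mutex_waiter *waiter;
        unsigned long flags;
        WAKE_Q(wake_q);

        /* Release the lock word first, as the regular slowpath does. */
        if (__mutex_slowpath_needs_to_unlock())
                atomic_set(&lock->count, 1);

        spin_lock_mutex(&lock->wait_lock, flags);
        /* The one functional change: queue every waiter, not just the head. */
        list_for_each_entry(waiter, &lock->wait_list, list)
                wake_q_add(&wake_q, waiter->task);
        spin_unlock_mutex(&lock->wait_lock, flags);

        wake_up_q(&wake_q);
}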

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-23 14:03   ` Peter Zijlstra
  (?)
@ 2016-11-23 14:25     ` Daniel Vetter
  -1 siblings, 0 replies; 37+ messages in thread
From: Daniel Vetter @ 2016-11-23 14:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nicolai Hähnle, Nicolai Hähnle, linux-kernel,
	dri-devel, Ingo Molnar, stable, Maarten Lankhorst

On Wed, Nov 23, 2016 at 03:03:36PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
> > @@ -473,7 +476,14 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
> >  	 */
> >  	mutex_clear_owner(&lock->base);
> >  #endif
> > -	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
> > +	/*
> > +	 * A previously _not_ waiting task may acquire the lock via the fast
> > +	 * path during our unlock. In that case, already waiting tasks may have
> > +	 * to back off to avoid a deadlock. Wake up all waiters so that they
> > +	 * can check their acquire context stamp against the new owner.
> > +	 */
> > +	__mutex_fastpath_unlock(&lock->base.count,
> > +				__mutex_unlock_slowpath_wakeall);
> >  }
> 
> So doing a wake-all has obvious issues with thundering herd etc.. Also,
> with the new mutex, you'd not be able to do hand-off, which would
> introduce starvation cases.
> 
> Ideally we'd iterate the blocked list and pick the waiter with the
> earliest stamp, or we'd maintain the list in stamp order instead of
> FIFO, for ww_mutex.

Not sure we'll win that much; at least I think we still need to wake up
everyone with an earlier stamp than that of the task that just released
the lock. Otherwise there are deadlocks. So it just cuts the wakeups in
half, on average.

What we could do is a handoff hint with the timestamp of the earliest
task we believe should get the lock. Everyone with a later timestamp who
gets woken then knows that they definitely have a deadlock situation and
needs to back off (thread #2 in the example).

Thread #1 would get woken and would be able to take the lock, except when
thread #0 successfully raced it and stole the lock. And anyone else racing
in with later timestamps would also immediately back off, ensuring
fairness.
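
A hedged sketch of the waiter-side check this implies; the handoff_stamp
value published by the unlocker is hypothetical (no such field exists),
and stamps come from atomic_long_inc_return(), so the comparison should
be wraparound-safe:

/*
 * Editorial sketch: would the woken waiter lose against the hinted
 * winner no matter what? Then it can back off immediately.
 */
static bool ww_waiter_must_back_off(unsigned long my_stamp,
                                    unsigned long handoff_stamp)
{
        /* a 'later' stamp is larger, modulo wraparound */
        return (long)(my_stamp - handoff_stamp) > 0;
}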

Without thinking it through in detail, this is a PI issue, except that we
replace boosting with wakeup&back-off. Could we perhaps steal something
from rt mutexes to make it fair&efficient?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-23 14:25     ` Daniel Vetter
@ 2016-11-23 14:32       ` Peter Zijlstra
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Zijlstra @ 2016-11-23 14:32 UTC (permalink / raw)
  To: Nicolai Hähnle, Nicolai Hähnle, linux-kernel,
	dri-devel, Ingo Molnar, stable, Maarten Lankhorst

On Wed, Nov 23, 2016 at 03:25:25PM +0100, Daniel Vetter wrote:

> Without thinking it through in detail this is a PI issue, except that we
> replace boosting with wakeup&back-off. Could we perhaps steal something
> from rt mutexes to make it fair&efficient?

rt_mutexes order the waiters by 'priority' and wake the topmost one.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-23 14:25     ` Daniel Vetter
@ 2016-11-24 11:26       ` Nicolai Hähnle
  -1 siblings, 0 replies; 37+ messages in thread
From: Nicolai Hähnle @ 2016-11-24 11:26 UTC (permalink / raw)
  To: Peter Zijlstra, Nicolai Hähnle, linux-kernel, dri-devel,
	Ingo Molnar, stable, Maarten Lankhorst

On 23.11.2016 15:25, Daniel Vetter wrote:
> On Wed, Nov 23, 2016 at 03:03:36PM +0100, Peter Zijlstra wrote:
>> On Wed, Nov 23, 2016 at 12:25:22PM +0100, Nicolai Hähnle wrote:
>>> @@ -473,7 +476,14 @@ void __sched ww_mutex_unlock(struct ww_mutex *lock)
>>>  	 */
>>>  	mutex_clear_owner(&lock->base);
>>>  #endif
>>> -	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
>>> +	/*
>>> +	 * A previously _not_ waiting task may acquire the lock via the fast
>>> +	 * path during our unlock. In that case, already waiting tasks may have
>>> +	 * to back off to avoid a deadlock. Wake up all waiters so that they
>>> +	 * can check their acquire context stamp against the new owner.
>>> +	 */
>>> +	__mutex_fastpath_unlock(&lock->base.count,
>>> +				__mutex_unlock_slowpath_wakeall);
>>>  }
>>
>> So doing a wake-all has obvious issues with thundering herd etc.. Also,
>> with the new mutex, you'd not be able to do hand-off, which would
>> introduce starvation cases.
>>
>> Ideally we'd iterate the blocked list and pick the waiter with the
>> earliest stamp, or we'd maintain the list in stamp order instead of
>> FIFO, for ww_mutex.
>
> Not sure we'll win that much, at least I think we still need to wake up
> everyone with earlier stamp than the one of the task that just released
> the lock. Otherwise there's deadlocks. So just cuts the wakeups in half,
> on average.
>
> What we could do is do a handoff-hint with the timestamp of earliest task
> we believe should get the lock. Everyone with a later timestamp that gets
> woken then knows that they definitely have a deadlock situation and need
> to back off (thread 2 in the example).
>
> thread 1 would get woken, and would be able to take the lock, except when
> thread 0 successfully raced it and stole the lock. And anyone else racing
> in with later timestamps would also immediately back off, ensuring
> fairness.

I did consider maintaining stamp order in the waiter list and originally 
decided against it because I just wanted a simple and conservative fix 
to avoid adding new regressions.

Now that a different approach is needed for >= 4.9 anyway, mostly due to 
the hand-off logic, I'm reconsidering this.

I do believe we can win a bit by keeping the wait list sorted, if we 
also make sure that waiters don't add themselves in the first place if 
they see that a deadlock situation cannot be avoided.

I will probably want to extend struct mutex_waiter with 
ww_mutex-specific fields to facilitate this (i.e. ctx pointer, perhaps 
stamp as well to reduce pointer-chasing). That should be fine since it 
lives on the stack.
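
A sketch of both ideas together, assuming 4.8-era types; the ww_ctx
member and the helper below are the proposed additions and do not exist
yet, and !ww waiters (NULL ctx) are simply left in FIFO position:

struct mutex_waiter {
        struct list_head        list;
        struct task_struct      *task;
#ifdef CONFIG_DEBUG_MUTEXES
        void                    *magic;
#endif
        struct ww_acquire_ctx   *ww_ctx;        /* proposed; NULL for !ww */
};

/* Insert 'waiter' so that ww waiters stay sorted by context stamp. */
static void __ww_mutex_add_waiter(struct mutex_waiter *waiter,
                                  struct mutex *lock,
                                  struct ww_acquire_ctx *ww_ctx)
{
        struct mutex_waiter *cur;

        waiter->ww_ctx = ww_ctx;
        if (!ww_ctx) {
                list_add_tail(&waiter->list, &lock->wait_list);
                return;
        }

        /* Walk from the tail; a new waiter usually has the newest stamp. */
        list_for_each_entry_reverse(cur, &lock->wait_list, list) {
                if (!cur->ww_ctx)
                        continue;       /* don't reorder !ww waiters */
                if ((long)(ww_ctx->stamp - cur->ww_ctx->stamp) > 0) {
                        /* cur's stamp is earlier: queue behind it. */
                        list_add(&waiter->list, &cur->list);
                        return;
                }
        }
        list_add(&waiter->list, &lock->wait_list);
}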

In the meantime, I'd appreciate it if patch #1 could be accepted as-is 
for stable updates to <= 4.8. It fixes a real (if rare) bug, and the 
stampede inefficiency isn't a problem in practice, at least for GPU 
applications.

Thanks,
Nicolai

>
> Without thinking it through in detail this is a PI issue, except that we
> replace boosting with wakeup&back-off. Could we perhaps steal something
> from rt mutexes to make it fair&efficient?
> -Daniel
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-24 11:26       ` Nicolai Hähnle
  (?)
@ 2016-11-24 11:40         ` Peter Zijlstra
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Zijlstra @ 2016-11-24 11:40 UTC (permalink / raw)
  To: Nicolai Hähnle
  Cc: Nicolai Hähnle, linux-kernel, dri-devel, Ingo Molnar,
	stable, Maarten Lankhorst

On Thu, Nov 24, 2016 at 12:26:57PM +0100, Nicolai Hähnle wrote:

> I do believe we can win a bit by keeping the wait list sorted, if we also
> make sure that waiters don't add themselves in the first place if they see
> that a deadlock situation cannot be avoided.
> 
> I will probably want to extend struct mutex_waiter with ww_mutex-specific
> fields to facilitate this (i.e. ctx pointer, perhaps stamp as well to reduce
> pointer-chasing). That should be fine since it lives on the stack.

Right, shouldn't be a problem I think.

The only 'problem' I can see with using that is that it's possible to mix
ww and !ww waiters through ww_mutex_lock(.ctx = NULL). This makes the
list order somewhat tricky.

Ideally we'd remove that feature, although I see it's actually used quite
a bit :/

> In the meantime, I'd appreciate it if patch #1 could be accepted as-is for
> stable updates to <= 4.8. It fixes a real (if rare) bug, and the stampede
> inefficiency isn't a problem in practice at least for GPU applications.

Sorry can't do. We don't do stable patches that don't have anything
upstream.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-24 11:40         ` Peter Zijlstra
@ 2016-11-24 11:52           ` Daniel Vetter
  -1 siblings, 0 replies; 37+ messages in thread
From: Daniel Vetter @ 2016-11-24 11:52 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nicolai Hähnle, Nicolai Hähnle,
	Linux Kernel Mailing List, stable, Ingo Molnar, dri-devel,
	Maarten Lankhorst

On Thu, Nov 24, 2016 at 12:40 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>
>> I do believe we can win a bit by keeping the wait list sorted, if we also
>> make sure that waiters don't add themselves in the first place if they see
>> that a deadlock situation cannot be avoided.
>>
>> I will probably want to extend struct mutex_waiter with ww_mutex-specific
>> fields to facilitate this (i.e. ctx pointer, perhaps stamp as well to reduce
>> pointer-chasing). That should be fine since it lives on the stack.
>
> Right, shouldn't be a problem I think.
>
> The only 'problem' I can see with using that is that its possible to mix
> ww and !ww waiters through ww_mutex_lock(.ctx = NULL). This makes the
> list order somewhat tricky.
>
> Ideally we'd remove that feature, although I see its actually used quite
> a bit :/

I guess we could create a small fake acquire_ctx for single-lock
paths. That way callers still don't need to deal with having an
explicit ctx, but we can assume the timestamp (for ensuring fairness)
is available for all cases. Otherwise there's indeed a problem with
correctly (well, fairly) interleaving ctx and non-ctx lockers, I think.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-24 11:52           ` Daniel Vetter
@ 2016-11-24 11:56             ` Peter Zijlstra
  -1 siblings, 0 replies; 37+ messages in thread
From: Peter Zijlstra @ 2016-11-24 11:56 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Nicolai Hähnle, Nicolai Hähnle,
	Linux Kernel Mailing List, stable, Ingo Molnar, dri-devel,
	Maarten Lankhorst

On Thu, Nov 24, 2016 at 12:52:25PM +0100, Daniel Vetter wrote:
> On Thu, Nov 24, 2016 at 12:40 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> >> I do believe we can win a bit by keeping the wait list sorted, if we also
> >> make sure that waiters don't add themselves in the first place if they see
> >> that a deadlock situation cannot be avoided.
> >>
> >> I will probably want to extend struct mutex_waiter with ww_mutex-specific
> >> fields to facilitate this (i.e. ctx pointer, perhaps stamp as well to reduce
> >> pointer-chasing). That should be fine since it lives on the stack.
> >
> > Right, shouldn't be a problem I think.
> >
> > The only 'problem' I can see with using that is that it's possible to mix
> > ww and !ww waiters through ww_mutex_lock(.ctx = NULL). This makes the
> > list order somewhat tricky.
> >
> > Ideally we'd remove that feature, although I see it's actually used quite
> > a bit :/
> 
> I guess we could create a small fake acquire_ctx for single-lock
> paths. That way callers still don't need to deal with having an
> explicit ctx, but we can assume the timestamp (for ensuring fairness)
> is available for all cases. Otherwise there's indeed a problem with
> correctly (well, fairly) interleaving ctx and non-ctx lockers, I think.

Actually tried that, but we need a ww_class to get a stamp from, and
ww_mutex_lock() doesn't have one of those.
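
For reference, this is roughly where the stamp comes from in the 4.8-era
API (debug-only fields omitted); without a class there is nothing to
stamp a context with:

static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
                                   struct ww_class *ww_class)
{
        ctx->task = current;
        ctx->stamp = atomic_long_inc_return(&ww_class->stamp);
        ctx->acquired = 0;
        /* lockdep and deadlock-injection fields omitted */
}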

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes
  2016-11-24 11:56             ` Peter Zijlstra
  (?)
@ 2016-11-24 12:05             ` Nicolai Hähnle
  -1 siblings, 0 replies; 37+ messages in thread
From: Nicolai Hähnle @ 2016-11-24 12:05 UTC (permalink / raw)
  To: Peter Zijlstra, Daniel Vetter
  Cc: Nicolai Hähnle, Linux Kernel Mailing List, stable,
	Ingo Molnar, dri-devel, Maarten Lankhorst

On 24.11.2016 12:56, Peter Zijlstra wrote:
> On Thu, Nov 24, 2016 at 12:52:25PM +0100, Daniel Vetter wrote:
>> On Thu, Nov 24, 2016 at 12:40 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>>>
>>>> I do believe we can win a bit by keeping the wait list sorted, if we also
>>>> make sure that waiters don't add themselves in the first place if they see
>>>> that a deadlock situation cannot be avoided.
>>>>
>>>> I will probably want to extend struct mutex_waiter with ww_mutex-specific
>>>> fields to facilitate this (i.e. ctx pointer, perhaps stamp as well to reduce
>>>> pointer-chasing). That should be fine since it lives on the stack.
>>>
>>> Right, shouldn't be a problem I think.
>>>
>>> The only 'problem' I can see with using that is that it's possible to mix
>>> ww and !ww waiters through ww_mutex_lock(.ctx = NULL). This makes the
>>> list order somewhat tricky.
>>>
>>> Ideally we'd remove that feature, although I see it's actually used quite
>>> a bit :/
>>
>> I guess we could create a small fake acquire_ctx for single-lock
>> paths. That way callers still don't need to deal with having an
>> explicit ctx, but we can assume the timestamp (for ensuring fairness)
>> is available for all cases. Otherwise there's indeed a problem with
>> correctly (well, fairly) interleaving ctx and non-ctx lockers, I think.
>
> Actually tried that, but we need a ww_class to get a stamp from, and
> ww_mutex_lock() doesn't have one of those.

The acquire context needs to be live until the unlock anyway, so this is 
something that requires modifying the callers of ww_mutex_lock. Those 
should all have a ww_class available, or something is very wrong :)
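
A minimal sketch of such a caller with an explicit context; the class and
function names here are made up for illustration:

static DEFINE_WW_CLASS(demo_ww_class);

static void demo_single_lock(struct ww_mutex *lock)
{
        struct ww_acquire_ctx ctx;
        int ret;

        ww_acquire_init(&ctx, &demo_ww_class);
        ret = ww_mutex_lock(lock, &ctx);  /* 0 expected: one lock, fresh ctx */
        if (!ret) {
                /* critical section */
                ww_mutex_unlock(lock);
        }
        ww_acquire_fini(&ctx);  /* the ctx outlives the unlock */
}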

Nicolai

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2016-11-24 12:19 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-23 11:25 [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes Nicolai Hähnle
2016-11-23 11:25 ` [PATCH 2/4] locking/ww_mutex: Remove redundant wakeups in ww_mutex_set_context_slowpath Nicolai Hähnle
2016-11-23 11:25   ` Nicolai Hähnle
2016-11-23 11:25 ` [PATCH 3/4] locking/Documentation: fix a typo Nicolai Hähnle
2016-11-23 11:25   ` Nicolai Hähnle
2016-11-23 11:25 ` [PATCH 4/4] locking/ww_mutex: Fix a comment typo Nicolai Hähnle
2016-11-23 11:25   ` Nicolai Hähnle
2016-11-23 12:50 ` [PATCH 1/4] locking/ww_mutex: Fix a deadlock affecting ww_mutexes Daniel Vetter
2016-11-23 12:50   ` Daniel Vetter
2016-11-23 13:00 ` Peter Zijlstra
2016-11-23 13:00   ` Peter Zijlstra
2016-11-23 13:00   ` Peter Zijlstra
2016-11-23 13:08   ` Daniel Vetter
2016-11-23 13:08     ` Daniel Vetter
2016-11-23 13:11     ` Daniel Vetter
2016-11-23 13:11       ` Daniel Vetter
2016-11-23 13:11       ` Daniel Vetter
2016-11-23 13:33       ` Maarten Lankhorst
2016-11-23 13:33         ` Maarten Lankhorst
2016-11-23 14:03 ` Peter Zijlstra
2016-11-23 14:03   ` Peter Zijlstra
2016-11-23 14:03   ` Peter Zijlstra
2016-11-23 14:25   ` Daniel Vetter
2016-11-23 14:25     ` Daniel Vetter
2016-11-23 14:25     ` Daniel Vetter
2016-11-23 14:32     ` Peter Zijlstra
2016-11-23 14:32       ` Peter Zijlstra
2016-11-24 11:26     ` Nicolai Hähnle
2016-11-24 11:26       ` Nicolai Hähnle
2016-11-24 11:40       ` Peter Zijlstra
2016-11-24 11:40         ` Peter Zijlstra
2016-11-24 11:40         ` Peter Zijlstra
2016-11-24 11:52         ` Daniel Vetter
2016-11-24 11:52           ` Daniel Vetter
2016-11-24 11:56           ` Peter Zijlstra
2016-11-24 11:56             ` Peter Zijlstra
2016-11-24 12:05             ` Nicolai Hähnle
