[tip:,locking/urgent] locking/ww_mutex: Treat ww_mutex_lock() like a trylock
diff mbox series

Message ID 161598470197.398.8903908266426306140.tip-bot2@tip-bot2
State In Next
Commit b058f2e4d0a70c060e21ed122b264e9649cad57f
Headers show
Series
  • [tip:,locking/urgent] locking/ww_mutex: Treat ww_mutex_lock() like a trylock
Related show

Commit Message

tip-bot2 for Peter Zijlstra March 17, 2021, 12:38 p.m. UTC
The following commit has been merged into the locking/urgent branch of tip:

Commit-ID:     b058f2e4d0a70c060e21ed122b264e9649cad57f
Gitweb:        https://git.kernel.org/tip/b058f2e4d0a70c060e21ed122b264e9649cad57f
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Tue, 16 Mar 2021 11:31:18 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 17 Mar 2021 09:56:46 +01:00

locking/ww_mutex: Treat ww_mutex_lock() like a trylock

It was found that running the ww_mutex_lock-torture test produced the
following lockdep splat almost immediately:

[  103.892638] ======================================================
[  103.892639] WARNING: possible circular locking dependency detected
[  103.892641] 5.12.0-rc3-debug+ #2 Tainted: G S      W
[  103.892643] ------------------------------------------------------
[  103.892643] lock_torture_wr/3234 is trying to acquire lock:
[  103.892646] ffffffffc0b35b10 (torture_ww_mutex_2.base){+.+.}-{3:3}, at: torture_ww_mutex_lock+0x316/0x720 [locktorture]
[  103.892660]
[  103.892660] but task is already holding lock:
[  103.892661] ffffffffc0b35cd0 (torture_ww_mutex_0.base){+.+.}-{3:3}, at: torture_ww_mutex_lock+0x3e2/0x720 [locktorture]
[  103.892669]
[  103.892669] which lock already depends on the new lock.
[  103.892669]
[  103.892670]
[  103.892670] the existing dependency chain (in reverse order) is:
[  103.892671]
[  103.892671] -> #2 (torture_ww_mutex_0.base){+.+.}-{3:3}:
[  103.892675]        lock_acquire+0x1c5/0x830
[  103.892682]        __ww_mutex_lock.constprop.15+0x1d1/0x2e50
[  103.892687]        ww_mutex_lock+0x4b/0x180
[  103.892690]        torture_ww_mutex_lock+0x316/0x720 [locktorture]
[  103.892694]        lock_torture_writer+0x142/0x3a0 [locktorture]
[  103.892698]        kthread+0x35f/0x430
[  103.892701]        ret_from_fork+0x1f/0x30
[  103.892706]
[  103.892706] -> #1 (torture_ww_mutex_1.base){+.+.}-{3:3}:
[  103.892709]        lock_acquire+0x1c5/0x830
[  103.892712]        __ww_mutex_lock.constprop.15+0x1d1/0x2e50
[  103.892715]        ww_mutex_lock+0x4b/0x180
[  103.892717]        torture_ww_mutex_lock+0x316/0x720 [locktorture]
[  103.892721]        lock_torture_writer+0x142/0x3a0 [locktorture]
[  103.892725]        kthread+0x35f/0x430
[  103.892727]        ret_from_fork+0x1f/0x30
[  103.892730]
[  103.892730] -> #0 (torture_ww_mutex_2.base){+.+.}-{3:3}:
[  103.892733]        check_prevs_add+0x3fd/0x2470
[  103.892736]        __lock_acquire+0x2602/0x3100
[  103.892738]        lock_acquire+0x1c5/0x830
[  103.892740]        __ww_mutex_lock.constprop.15+0x1d1/0x2e50
[  103.892743]        ww_mutex_lock+0x4b/0x180
[  103.892746]        torture_ww_mutex_lock+0x316/0x720 [locktorture]
[  103.892749]        lock_torture_writer+0x142/0x3a0 [locktorture]
[  103.892753]        kthread+0x35f/0x430
[  103.892755]        ret_from_fork+0x1f/0x30
[  103.892757]
[  103.892757] other info that might help us debug this:
[  103.892757]
[  103.892758] Chain exists of:
[  103.892758]   torture_ww_mutex_2.base --> torture_ww_mutex_1.base --> torture_ww_mutex_0.base
[  103.892758]
[  103.892763]  Possible unsafe locking scenario:
[  103.892763]
[  103.892764]        CPU0                    CPU1
[  103.892765]        ----                    ----
[  103.892765]   lock(torture_ww_mutex_0.base);
[  103.892767] 				      lock(torture_ww_mutex_1.base);
[  103.892770] 				      lock(torture_ww_mutex_0.base);
[  103.892772]   lock(torture_ww_mutex_2.base);
[  103.892774]
[  103.892774]  *** DEADLOCK ***

Since ww_mutex is supposed to be deadlock-proof if used properly, such
deadlock scenario should not happen. To avoid this false positive splat,
treat ww_mutex_lock() like a trylock().

After applying this patch, the locktorture test can run for a long time
without triggering the circular locking dependency splat.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by Davidlohr Bueso <dbueso@suse.de>
Link: https://lore.kernel.org/r/20210316153119.13802-4-longman@redhat.com
---
 kernel/locking/mutex.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Peter Zijlstra March 17, 2021, 1:12 p.m. UTC | #1
On Wed, Mar 17, 2021 at 12:38:21PM -0000, tip-bot2 for Waiman Long wrote:
> The following commit has been merged into the locking/urgent branch of tip:
> 
> Commit-ID:     b058f2e4d0a70c060e21ed122b264e9649cad57f
> Gitweb:        https://git.kernel.org/tip/b058f2e4d0a70c060e21ed122b264e9649cad57f
> Author:        Waiman Long <longman@redhat.com>
> AuthorDate:    Tue, 16 Mar 2021 11:31:18 -04:00
> Committer:     Ingo Molnar <mingo@kernel.org>
> CommitterDate: Wed, 17 Mar 2021 09:56:46 +01:00
> 
> locking/ww_mutex: Treat ww_mutex_lock() like a trylock
> 
> It was found that running the ww_mutex_lock-torture test produced the
> following lockdep splat almost immediately:
> 
> [  103.892638] ======================================================
> [  103.892639] WARNING: possible circular locking dependency detected
> [  103.892641] 5.12.0-rc3-debug+ #2 Tainted: G S      W
> [  103.892643] ------------------------------------------------------
> [  103.892643] lock_torture_wr/3234 is trying to acquire lock:
> [  103.892646] ffffffffc0b35b10 (torture_ww_mutex_2.base){+.+.}-{3:3}, at: torture_ww_mutex_lock+0x316/0x720 [locktorture]
> [  103.892660]
> [  103.892660] but task is already holding lock:
> [  103.892661] ffffffffc0b35cd0 (torture_ww_mutex_0.base){+.+.}-{3:3}, at: torture_ww_mutex_lock+0x3e2/0x720 [locktorture]
> [  103.892669]
> [  103.892669] which lock already depends on the new lock.
> [  103.892669]
> [  103.892670]
> [  103.892670] the existing dependency chain (in reverse order) is:
> [  103.892671]
> [  103.892671] -> #2 (torture_ww_mutex_0.base){+.+.}-{3:3}:
> [  103.892675]        lock_acquire+0x1c5/0x830
> [  103.892682]        __ww_mutex_lock.constprop.15+0x1d1/0x2e50
> [  103.892687]        ww_mutex_lock+0x4b/0x180
> [  103.892690]        torture_ww_mutex_lock+0x316/0x720 [locktorture]
> [  103.892694]        lock_torture_writer+0x142/0x3a0 [locktorture]
> [  103.892698]        kthread+0x35f/0x430
> [  103.892701]        ret_from_fork+0x1f/0x30
> [  103.892706]
> [  103.892706] -> #1 (torture_ww_mutex_1.base){+.+.}-{3:3}:
> [  103.892709]        lock_acquire+0x1c5/0x830
> [  103.892712]        __ww_mutex_lock.constprop.15+0x1d1/0x2e50
> [  103.892715]        ww_mutex_lock+0x4b/0x180
> [  103.892717]        torture_ww_mutex_lock+0x316/0x720 [locktorture]
> [  103.892721]        lock_torture_writer+0x142/0x3a0 [locktorture]
> [  103.892725]        kthread+0x35f/0x430
> [  103.892727]        ret_from_fork+0x1f/0x30
> [  103.892730]
> [  103.892730] -> #0 (torture_ww_mutex_2.base){+.+.}-{3:3}:
> [  103.892733]        check_prevs_add+0x3fd/0x2470
> [  103.892736]        __lock_acquire+0x2602/0x3100
> [  103.892738]        lock_acquire+0x1c5/0x830
> [  103.892740]        __ww_mutex_lock.constprop.15+0x1d1/0x2e50
> [  103.892743]        ww_mutex_lock+0x4b/0x180
> [  103.892746]        torture_ww_mutex_lock+0x316/0x720 [locktorture]
> [  103.892749]        lock_torture_writer+0x142/0x3a0 [locktorture]
> [  103.892753]        kthread+0x35f/0x430
> [  103.892755]        ret_from_fork+0x1f/0x30
> [  103.892757]
> [  103.892757] other info that might help us debug this:
> [  103.892757]
> [  103.892758] Chain exists of:
> [  103.892758]   torture_ww_mutex_2.base --> torture_ww_mutex_1.base --> torture_ww_mutex_0.base
> [  103.892758]
> [  103.892763]  Possible unsafe locking scenario:
> [  103.892763]
> [  103.892764]        CPU0                    CPU1
> [  103.892765]        ----                    ----
> [  103.892765]   lock(torture_ww_mutex_0.base);
> [  103.892767] 				      lock(torture_ww_mutex_1.base);
> [  103.892770] 				      lock(torture_ww_mutex_0.base);
> [  103.892772]   lock(torture_ww_mutex_2.base);
> [  103.892774]
> [  103.892774]  *** DEADLOCK ***
> 
> Since ww_mutex is supposed to be deadlock-proof if used properly, such
> deadlock scenario should not happen. To avoid this false positive splat,
> treat ww_mutex_lock() like a trylock().
> 
> After applying this patch, the locktorture test can run for a long time
> without triggering the circular locking dependency splat.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Acked-by Davidlohr Bueso <dbueso@suse.de>
> Link: https://lore.kernel.org/r/20210316153119.13802-4-longman@redhat.com
> ---
>  kernel/locking/mutex.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index 622ebdf..bb89393 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -946,7 +946,10 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  	}
>  
>  	preempt_disable();
> -	mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
> +	/*
> +	 * Treat as trylock for ww_mutex.
> +	 */
> +	mutex_acquire_nest(&lock->dep_map, subclass, !!ww_ctx, nest_lock, ip);

I'm confused... why isn't nest_lock working here?

For ww_mutex, we're supposed to have ctx->dep_map as a nest_lock, and
all lock acquisitions under a nest lock should be fine. Afaict the above
is just plain wrong.
Peter Zijlstra March 17, 2021, 1:31 p.m. UTC | #2
On Wed, Mar 17, 2021 at 02:12:41PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 17, 2021 at 12:38:21PM -0000, tip-bot2 for Waiman Long wrote:
> > +	/*
> > +	 * Treat as trylock for ww_mutex.
> > +	 */
> > +	mutex_acquire_nest(&lock->dep_map, subclass, !!ww_ctx, nest_lock, ip);
> 
> I'm confused... why isn't nest_lock working here?
> 
> For ww_mutex, we're supposed to have ctx->dep_map as a nest_lock, and
> all lock acquisitions under a nest lock should be fine. Afaict the above
> is just plain wrong.

To clarify:

	mutex_lock(&A);			ww_mutex_lock(&B, ctx);
	ww_mutex_lock(&B, ctx);		mutex_lock(&A);

should still very much be a deadlock, but your 'fix' makes it not report
that.

Only order within the ww_ctx can be ignored, and that's exactly what
nest_lock should be doing.
Waiman Long March 17, 2021, 2:03 p.m. UTC | #3
On 3/17/21 9:31 AM, Peter Zijlstra wrote:
> On Wed, Mar 17, 2021 at 02:12:41PM +0100, Peter Zijlstra wrote:
>> On Wed, Mar 17, 2021 at 12:38:21PM -0000, tip-bot2 for Waiman Long wrote:
>>> +	/*
>>> +	 * Treat as trylock for ww_mutex.
>>> +	 */
>>> +	mutex_acquire_nest(&lock->dep_map, subclass, !!ww_ctx, nest_lock, ip);
>> I'm confused... why isn't nest_lock working here?
>>
>> For ww_mutex, we're supposed to have ctx->dep_map as a nest_lock, and
>> all lock acquisitions under a nest lock should be fine. Afaict the above
>> is just plain wrong.
> To clarify:
>
> 	mutex_lock(&A);			ww_mutex_lock(&B, ctx);
> 	ww_mutex_lock(&B, ctx);		mutex_lock(&A);
>
> should still very much be a deadlock, but your 'fix' makes it not report
> that.
>
> Only order within the ww_ctx can be ignored, and that's exactly what
> nest_lock should be doing.
>
I will take a deeper look into why that is the case.

Cheers,
Longman
Waiman Long March 17, 2021, 3:35 p.m. UTC | #4
On 3/17/21 10:03 AM, Waiman Long wrote:
> On 3/17/21 9:31 AM, Peter Zijlstra wrote:
>> On Wed, Mar 17, 2021 at 02:12:41PM +0100, Peter Zijlstra wrote:
>>> On Wed, Mar 17, 2021 at 12:38:21PM -0000, tip-bot2 for Waiman Long 
>>> wrote:
>>>> +    /*
>>>> +     * Treat as trylock for ww_mutex.
>>>> +     */
>>>> +    mutex_acquire_nest(&lock->dep_map, subclass, !!ww_ctx, 
>>>> nest_lock, ip);
>>> I'm confused... why isn't nest_lock working here?
>>>
>>> For ww_mutex, we're supposed to have ctx->dep_map as a nest_lock, and
>>> all lock acquisitions under a nest lock should be fine. Afaict the 
>>> above
>>> is just plain wrong.
>> To clarify:
>>
>>     mutex_lock(&A);            ww_mutex_lock(&B, ctx);
>>     ww_mutex_lock(&B, ctx);        mutex_lock(&A);
>>
>> should still very much be a deadlock, but your 'fix' makes it not report
>> that.
>>
>> Only order within the ww_ctx can be ignored, and that's exactly what
>> nest_lock should be doing.
>>
> I will take a deeper look into why that is the case. 

 From reading the source code, nest_lock check is done in 
check_deadlock() so that it won't complain. However, nest_lock isn't 
considered in check_noncircular() which causes the splat to come out. 
Maybe we should add a check for nest_lock there. I will fiddle with the 
code to see if it can address the issue.

Cheers,
Longman
Peter Zijlstra March 17, 2021, 4:48 p.m. UTC | #5
On Wed, Mar 17, 2021 at 11:35:12AM -0400, Waiman Long wrote:

> From reading the source code, nest_lock check is done in check_deadlock() so
> that it won't complain. However, nest_lock isn't considered in
> check_noncircular() which causes the splat to come out. Maybe we should add
> a check for nest_lock there. I will fiddle with the code to see if it can
> address the issue.

Nah, that's not how it's supposed to work. I think the problem is that
DEFINE_WW_MUTEX is buggered, there's not actually any other user of it
in-tree.

Everybody else (including locking-selftests) seem to be using
ww_mutex_init().

So all locks in a ww_class should be having the same lock class, and
then nest_lock will fold them all into a single entry with ->references
incremented. See __lock_acquire().

But from the report:

> [  103.892671] -> #2 (torture_ww_mutex_0.base){+.+.}-{3:3}:
> [  103.892706] -> #1 (torture_ww_mutex_1.base){+.+.}-{3:3}:
> [  103.892730] -> #0 (torture_ww_mutex_2.base){+.+.}-{3:3}:

that went sideways, they're *not* the same class.

I think you'll find that if you use ww_mutex_init() it'll all work. Let
me go and zap this patch, and then I'll try and figure out why
DEFINE_WW_MUTEX() is buggered.
Peter Zijlstra March 17, 2021, 5:40 p.m. UTC | #6
On Wed, Mar 17, 2021 at 05:48:48PM +0100, Peter Zijlstra wrote:

> I think you'll find that if you use ww_mutex_init() it'll all work. Let
> me go and zap this patch, and then I'll try and figure out why
> DEFINE_WW_MUTEX() is buggered.

Moo, I can't get the compiler to do as I want :/

The below is so close but doesn't actually compile.. Maybe we should
just give up on DEFINE_WW_MUTEX and simply remove it.

---
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 0cd631a19727..85f50538f26a 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -129,12 +129,15 @@ do {									\
 # define __DEP_MAP_MUTEX_INITIALIZER(lockname)
 #endif
 
-#define __MUTEX_INITIALIZER(lockname) \
+#define ___MUTEX_INITIALIZER(lockname, depmap) \
 		{ .owner = ATOMIC_LONG_INIT(0) \
 		, .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
 		, .wait_list = LIST_HEAD_INIT(lockname.wait_list) \
 		__DEBUG_MUTEX_INITIALIZER(lockname) \
-		__DEP_MAP_MUTEX_INITIALIZER(lockname) }
+		depmap }
+
+#define __MUTEX_INITIALIZER(lockname) \
+		___MUTEX_INITIALIZER(lockname, __DEP_MAP_MUTEX_INITIALIZER(lockname))
 
 #define DEFINE_MUTEX(mutexname) \
 	struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
index 6ecf2a0220db..c62a030652b4 100644
--- a/include/linux/ww_mutex.h
+++ b/include/linux/ww_mutex.h
@@ -50,9 +50,17 @@ struct ww_acquire_ctx {
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 # define __WW_CLASS_MUTEX_INITIALIZER(lockname, class) \
-		, .ww_class = class
+		, .ww_class = &(class)
+
+# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class) \
+		, .dep_map = { \
+			.key = &(class).mutex_key, \
+			.name = (class).mutex_name, \
+			.wait_type_inner = LD_WAIT_SLEEP, \
+		}
 #else
 # define __WW_CLASS_MUTEX_INITIALIZER(lockname, class)
+# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class)
 #endif
 
 #define __WW_CLASS_INITIALIZER(ww_class, _is_wait_die)	    \
@@ -62,7 +70,8 @@ struct ww_acquire_ctx {
 		, .is_wait_die = _is_wait_die }
 
 #define __WW_MUTEX_INITIALIZER(lockname, class) \
-		{ .base =  __MUTEX_INITIALIZER(lockname.base) \
+		{ .base =  ___MUTEX_INITIALIZER(lockname.base, \
+			__DEP_MAP_WW_MUTEX_INITIALIZER(lockname.base, class)) \
 		__WW_CLASS_MUTEX_INITIALIZER(lockname, class) }
 
 #define DEFINE_WD_CLASS(classname) \
diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index 0ab94e1f1276..e8305233eb0b 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -358,9 +358,9 @@ static struct lock_torture_ops mutex_lock_ops = {
 
 #include <linux/ww_mutex.h>
 static DEFINE_WD_CLASS(torture_ww_class);
-static DEFINE_WW_MUTEX(torture_ww_mutex_0, &torture_ww_class);
-static DEFINE_WW_MUTEX(torture_ww_mutex_1, &torture_ww_class);
-static DEFINE_WW_MUTEX(torture_ww_mutex_2, &torture_ww_class);
+static DEFINE_WW_MUTEX(torture_ww_mutex_0, torture_ww_class);
+static DEFINE_WW_MUTEX(torture_ww_mutex_1, torture_ww_class);
+static DEFINE_WW_MUTEX(torture_ww_mutex_2, torture_ww_class);
 
 static int torture_ww_mutex_lock(void)
 __acquires(torture_ww_mutex_0)
Peter Zijlstra March 17, 2021, 5:45 p.m. UTC | #7
On Wed, Mar 17, 2021 at 06:40:27PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 17, 2021 at 05:48:48PM +0100, Peter Zijlstra wrote:
> 
> > I think you'll find that if you use ww_mutex_init() it'll all work. Let
> > me go and zap this patch, and then I'll try and figure out why
> > DEFINE_WW_MUTEX() is buggered.
> 
> Moo, I can't get the compiler to do as I want :/
> 
> The below is so close but doesn't actually compile.. Maybe we should
> just give up on DEFINE_WW_MUTEX and simply remove it.
> 
> ---
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index 0cd631a19727..85f50538f26a 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -129,12 +129,15 @@ do {									\
>  # define __DEP_MAP_MUTEX_INITIALIZER(lockname)
>  #endif
>  
> -#define __MUTEX_INITIALIZER(lockname) \
> +#define ___MUTEX_INITIALIZER(lockname, depmap) \
>  		{ .owner = ATOMIC_LONG_INIT(0) \
>  		, .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
>  		, .wait_list = LIST_HEAD_INIT(lockname.wait_list) \
>  		__DEBUG_MUTEX_INITIALIZER(lockname) \
> -		__DEP_MAP_MUTEX_INITIALIZER(lockname) }
> +		depmap }
> +
> +#define __MUTEX_INITIALIZER(lockname) \
> +		___MUTEX_INITIALIZER(lockname, __DEP_MAP_MUTEX_INITIALIZER(lockname))
>  
>  #define DEFINE_MUTEX(mutexname) \
>  	struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
> index 6ecf2a0220db..c62a030652b4 100644
> --- a/include/linux/ww_mutex.h
> +++ b/include/linux/ww_mutex.h
> @@ -50,9 +50,17 @@ struct ww_acquire_ctx {
>  
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  # define __WW_CLASS_MUTEX_INITIALIZER(lockname, class) \
> -		, .ww_class = class
> +		, .ww_class = &(class)
> +
> +# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class) \
> +		, .dep_map = { \
> +			.key = &(class).mutex_key, \
> +			.name = (class).mutex_name, \

			,name = #class "_mutex", \

and it 'works', but shees!

> +			.wait_type_inner = LD_WAIT_SLEEP, \
> +		}
>  #else
>  # define __WW_CLASS_MUTEX_INITIALIZER(lockname, class)
> +# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class)
>  #endif
>  
>  #define __WW_CLASS_INITIALIZER(ww_class, _is_wait_die)	    \
> @@ -62,7 +70,8 @@ struct ww_acquire_ctx {
>  		, .is_wait_die = _is_wait_die }
>  
>  #define __WW_MUTEX_INITIALIZER(lockname, class) \
> -		{ .base =  __MUTEX_INITIALIZER(lockname.base) \
> +		{ .base =  ___MUTEX_INITIALIZER(lockname.base, \
> +			__DEP_MAP_WW_MUTEX_INITIALIZER(lockname.base, class)) \
>  		__WW_CLASS_MUTEX_INITIALIZER(lockname, class) }
>  
>  #define DEFINE_WD_CLASS(classname) \
> diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
> index 0ab94e1f1276..e8305233eb0b 100644
> --- a/kernel/locking/locktorture.c
> +++ b/kernel/locking/locktorture.c
> @@ -358,9 +358,9 @@ static struct lock_torture_ops mutex_lock_ops = {
>  
>  #include <linux/ww_mutex.h>
>  static DEFINE_WD_CLASS(torture_ww_class);
> -static DEFINE_WW_MUTEX(torture_ww_mutex_0, &torture_ww_class);
> -static DEFINE_WW_MUTEX(torture_ww_mutex_1, &torture_ww_class);
> -static DEFINE_WW_MUTEX(torture_ww_mutex_2, &torture_ww_class);
> +static DEFINE_WW_MUTEX(torture_ww_mutex_0, torture_ww_class);
> +static DEFINE_WW_MUTEX(torture_ww_mutex_1, torture_ww_class);
> +static DEFINE_WW_MUTEX(torture_ww_mutex_2, torture_ww_class);
>  
>  static int torture_ww_mutex_lock(void)
>  __acquires(torture_ww_mutex_0)
Waiman Long March 17, 2021, 6:14 p.m. UTC | #8
On 3/17/21 1:40 PM, Peter Zijlstra wrote:
> On Wed, Mar 17, 2021 at 05:48:48PM +0100, Peter Zijlstra wrote:
>
>> I think you'll find that if you use ww_mutex_init() it'll all work. Let
>> me go and zap this patch, and then I'll try and figure out why
>> DEFINE_WW_MUTEX() is buggered.
> Moo, I can't get the compiler to do as I want :/
>
> The below is so close but doesn't actually compile.. Maybe we should
> just give up on DEFINE_WW_MUTEX and simply remove it.
>
> ---
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index 0cd631a19727..85f50538f26a 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -129,12 +129,15 @@ do {									\
>   # define __DEP_MAP_MUTEX_INITIALIZER(lockname)
>   #endif
>   
> -#define __MUTEX_INITIALIZER(lockname) \
> +#define ___MUTEX_INITIALIZER(lockname, depmap) \
>   		{ .owner = ATOMIC_LONG_INIT(0) \
>   		, .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
>   		, .wait_list = LIST_HEAD_INIT(lockname.wait_list) \
>   		__DEBUG_MUTEX_INITIALIZER(lockname) \
> -		__DEP_MAP_MUTEX_INITIALIZER(lockname) }
> +		depmap }
> +
> +#define __MUTEX_INITIALIZER(lockname) \
> +		___MUTEX_INITIALIZER(lockname, __DEP_MAP_MUTEX_INITIALIZER(lockname))
>   
>   #define DEFINE_MUTEX(mutexname) \
>   	struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
> index 6ecf2a0220db..c62a030652b4 100644
> --- a/include/linux/ww_mutex.h
> +++ b/include/linux/ww_mutex.h
> @@ -50,9 +50,17 @@ struct ww_acquire_ctx {
>   
>   #ifdef CONFIG_DEBUG_LOCK_ALLOC
>   # define __WW_CLASS_MUTEX_INITIALIZER(lockname, class) \
> -		, .ww_class = class
> +		, .ww_class = &(class)
> +
> +# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class) \
> +		, .dep_map = { \
> +			.key = &(class).mutex_key, \
> +			.name = (class).mutex_name, \
> +			.wait_type_inner = LD_WAIT_SLEEP, \
> +		}
>   #else
>   # define __WW_CLASS_MUTEX_INITIALIZER(lockname, class)
> +# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class)
>   #endif
>   
>   #define __WW_CLASS_INITIALIZER(ww_class, _is_wait_die)	    \
> @@ -62,7 +70,8 @@ struct ww_acquire_ctx {
>   		, .is_wait_die = _is_wait_die }
>   
>   #define __WW_MUTEX_INITIALIZER(lockname, class) \
> -		{ .base =  __MUTEX_INITIALIZER(lockname.base) \
> +		{ .base =  ___MUTEX_INITIALIZER(lockname.base, \
> +			__DEP_MAP_WW_MUTEX_INITIALIZER(lockname.base, class)) \
>   		__WW_CLASS_MUTEX_INITIALIZER(lockname, class) }
>   
>   #define DEFINE_WD_CLASS(classname) \
> diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
> index 0ab94e1f1276..e8305233eb0b 100644
> --- a/kernel/locking/locktorture.c
> +++ b/kernel/locking/locktorture.c
> @@ -358,9 +358,9 @@ static struct lock_torture_ops mutex_lock_ops = {
>   
>   #include <linux/ww_mutex.h>
>   static DEFINE_WD_CLASS(torture_ww_class);
> -static DEFINE_WW_MUTEX(torture_ww_mutex_0, &torture_ww_class);
> -static DEFINE_WW_MUTEX(torture_ww_mutex_1, &torture_ww_class);
> -static DEFINE_WW_MUTEX(torture_ww_mutex_2, &torture_ww_class);
> +static DEFINE_WW_MUTEX(torture_ww_mutex_0, torture_ww_class);
> +static DEFINE_WW_MUTEX(torture_ww_mutex_1, torture_ww_class);
> +static DEFINE_WW_MUTEX(torture_ww_mutex_2, torture_ww_class);
>   
>   static int torture_ww_mutex_lock(void)
>   __acquires(torture_ww_mutex_0)
>
I changed locktorture to use ww_mutex_init() and the lockdep splat was 
indeed gone. So zapping WW_MUTEX_INIT() and use ww_mutex_init() is a 
possible solution. I will send out a patch to do that.

Thanks,
Longman
Waiman Long March 17, 2021, 6:32 p.m. UTC | #9
On 3/17/21 1:45 PM, Peter Zijlstra wrote:
> On Wed, Mar 17, 2021 at 06:40:27PM +0100, Peter Zijlstra wrote:
>> On Wed, Mar 17, 2021 at 05:48:48PM +0100, Peter Zijlstra wrote:
>>
>>> I think you'll find that if you use ww_mutex_init() it'll all work. Let
>>> me go and zap this patch, and then I'll try and figure out why
>>> DEFINE_WW_MUTEX() is buggered.
>> Moo, I can't get the compiler to do as I want :/
>>
>> The below is so close but doesn't actually compile.. Maybe we should
>> just give up on DEFINE_WW_MUTEX and simply remove it.
>>
>> ---
>> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
>> index 0cd631a19727..85f50538f26a 100644
>> --- a/include/linux/mutex.h
>> +++ b/include/linux/mutex.h
>> @@ -129,12 +129,15 @@ do {									\
>>   # define __DEP_MAP_MUTEX_INITIALIZER(lockname)
>>   #endif
>>   
>> -#define __MUTEX_INITIALIZER(lockname) \
>> +#define ___MUTEX_INITIALIZER(lockname, depmap) \
>>   		{ .owner = ATOMIC_LONG_INIT(0) \
>>   		, .wait_lock = __SPIN_LOCK_UNLOCKED(lockname.wait_lock) \
>>   		, .wait_list = LIST_HEAD_INIT(lockname.wait_list) \
>>   		__DEBUG_MUTEX_INITIALIZER(lockname) \
>> -		__DEP_MAP_MUTEX_INITIALIZER(lockname) }
>> +		depmap }
>> +
>> +#define __MUTEX_INITIALIZER(lockname) \
>> +		___MUTEX_INITIALIZER(lockname, __DEP_MAP_MUTEX_INITIALIZER(lockname))
>>   
>>   #define DEFINE_MUTEX(mutexname) \
>>   	struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
>> diff --git a/include/linux/ww_mutex.h b/include/linux/ww_mutex.h
>> index 6ecf2a0220db..c62a030652b4 100644
>> --- a/include/linux/ww_mutex.h
>> +++ b/include/linux/ww_mutex.h
>> @@ -50,9 +50,17 @@ struct ww_acquire_ctx {
>>   
>>   #ifdef CONFIG_DEBUG_LOCK_ALLOC
>>   # define __WW_CLASS_MUTEX_INITIALIZER(lockname, class) \
>> -		, .ww_class = class
>> +		, .ww_class = &(class)
>> +
>> +# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class) \
>> +		, .dep_map = { \
>> +			.key = &(class).mutex_key, \
>> +			.name = (class).mutex_name, \
> 			,name = #class "_mutex", \
>
> and it 'works', but shees!

The name string itself may be duplicated for multiple instances of 
DEFINE_WW_MUTEX(). Do you want to keep DEFINE_WW_MUTEX() or just use 
ww_mutex_init() for all?

I notice that the problem with DEFINE_WW_MUTEX is that the ww_mutex 
themselves has null key instead of the same key from the ww_class as 
with ww_mutex_init().

Cheers,
Longman
Peter Zijlstra March 17, 2021, 7:58 p.m. UTC | #10
On Wed, Mar 17, 2021 at 02:32:27PM -0400, Waiman Long wrote:
> On 3/17/21 1:45 PM, Peter Zijlstra wrote:

> > > +# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class) \
> > > +		, .dep_map = { \
> > > +			.key = &(class).mutex_key, \
> > > +			.name = (class).mutex_name, \
> > 			,name = #class "_mutex", \
> > 
> > and it 'works', but shees!
> 
> The name string itself may be duplicated for multiple instances of
> DEFINE_WW_MUTEX(). Do you want to keep DEFINE_WW_MUTEX() or just use
> ww_mutex_init() for all?

So linkers can merge literals, but no guarantee. But yeah, lets just
kill the thing, less tricky macro crud to worry about.

> I notice that the problem with DEFINE_WW_MUTEX is that the ww_mutex
> themselves has null key instead of the same key from the ww_class as with
> ww_mutex_init().

Correct.
Waiman Long March 17, 2021, 8:20 p.m. UTC | #11
On 3/17/21 3:58 PM, Peter Zijlstra wrote:
> On Wed, Mar 17, 2021 at 02:32:27PM -0400, Waiman Long wrote:
>> On 3/17/21 1:45 PM, Peter Zijlstra wrote:
>>>> +# define __DEP_MAP_WW_MUTEX_INITIALIZER(lockname, class) \
>>>> +		, .dep_map = { \
>>>> +			.key = &(class).mutex_key, \
>>>> +			.name = (class).mutex_name, \
>>> 			,name = #class "_mutex", \
>>>
>>> and it 'works', but shees!
>> The name string itself may be duplicated for multiple instances of
>> DEFINE_WW_MUTEX(). Do you want to keep DEFINE_WW_MUTEX() or just use
>> ww_mutex_init() for all?
> So linkers can merge literals, but no guarantee. But yeah, lets just
> kill the thing, less tricky macro crud to worry about.
>
Good, just to confirm the right way to move forward.

Cheers,
Longman

Patch
diff mbox series

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 622ebdf..bb89393 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -946,7 +946,10 @@  __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	}
 
 	preempt_disable();
-	mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
+	/*
+	 * Treat as trylock for ww_mutex.
+	 */
+	mutex_acquire_nest(&lock->dep_map, subclass, !!ww_ctx, nest_lock, ip);
 
 	if (__mutex_trylock(lock) ||
 	    mutex_optimistic_spin(lock, ww_ctx, NULL)) {