linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] locking/lockdep: Use local_irq_save() with call_rcu()
@ 2020-12-22 22:55 Waiman Long
  2021-01-04 15:38 ` Peter Zijlstra
  0 siblings, 1 reply; 3+ messages in thread
From: Waiman Long @ 2020-12-22 22:55 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Will Deacon
  Cc: linux-kernel, Bart Van Assche, Waiman Long

The following lockdep splat was hit:

 [  560.638354] WARNING: CPU: 79 PID: 27458 at kernel/rcu/tree_plugin.h:1749 call_rcu+0x6dc/0xf00
    :
 [  560.647761] RIP: 0010:call_rcu+0x6dc/0xf00
 [  560.647763] Code: 0f 8f 29 04 00 00 e8 93 da 1c 00 48 8b 3c 24 57 9d 0f 1f 44 00 00 e9 19 fa ff ff 65 8b 05 38 83 c4 49 85 c0 0f 84 cd fb ff ff <0f> 0b e9 c6 fb ff ff e8 b8 45 51 00 4c 89 f2 48 b8 00 00 00 00 00
 [  560.647764] RSP: 0018:ff11001050097b58 EFLAGS: 00010002
 [  560.647766] RAX: 0000000000000001 RBX: ffffffffbb1f3360 RCX: 0000000000000001
 [  560.647766] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb99bac9c
 [  560.647767] RBP: 1fe220020a012f73 R08: 000000010004005c R09: dffffc0000000000
 [  560.647768] R10: dffffc0000000000 R11: 0000000000000003 R12: ff1100105b7f70e1
 [  560.647769] R13: ffffffffb635d8a0 R14: ff1100105b7f72d8 R15: ff1100105b7f7040
 [  560.647770] FS:  00007fd9b3437080(0000) GS:ff1100105b600000(0000) knlGS:0000000000000000
 [  560.647771] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [  560.647772] CR2: 00007fd9b30112bc CR3: 000000105e898006 CR4: 0000000000761ee0
 [  560.647773] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [  560.647773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [  560.647774] PKRU: 55555554
 [  560.647774] Call Trace:
 [  560.647778]  ? invoke_rcu_core+0x180/0x180
 [  560.647782]  ? __is_module_percpu_address+0xed/0x440
 [  560.647787]  lockdep_unregister_key+0x2ab/0x5b0
 [  560.647791]  destroy_workqueue+0x40b/0x610
 [  560.647862]  xlog_dealloc_log+0x216/0x2b0 [xfs]
    :

This splat is caused by the fact that lockdep_unregister_key() uses
raw_local_irq_save() which doesn't update the hardirqs_enabled
percpu flag.  The call_rcu() function, however, will call
lockdep_assert_irqs_disabled() to check the hardirqs_enabled flag which
remained set in this case.

Fix this problem by using local_irq_save()/local_irq_restore() pairs
whenever call_rcu() is being called.

I think raw_local_irq_save() function can be used if no external
function is being called except maybe printk() as it means another
lockdep problem exists.

Fixes: a0b0fd53e1e67 ("locking/lockdep: Free lock classes that are no longer in use")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/lockdep.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index c1418b47f625..2a37af77ede6 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5884,7 +5884,7 @@ static void free_zapped_rcu(struct rcu_head *ch)
 	if (WARN_ON_ONCE(ch != &delayed_free.rcu_head))
 		return;
 
-	raw_local_irq_save(flags);
+	local_irq_save(flags);
 	lockdep_lock();
 
 	/* closed head */
@@ -5898,7 +5898,7 @@ static void free_zapped_rcu(struct rcu_head *ch)
 	call_rcu_zapped(delayed_free.pf + delayed_free.index);
 
 	lockdep_unlock();
-	raw_local_irq_restore(flags);
+	local_irq_restore(flags);
 }
 
 /*
@@ -5941,13 +5941,13 @@ static void lockdep_free_key_range_reg(void *start, unsigned long size)
 
 	init_data_structures_once();
 
-	raw_local_irq_save(flags);
+	local_irq_save(flags);
 	lockdep_lock();
 	pf = get_pending_free();
 	__lockdep_free_key_range(pf, start, size);
 	call_rcu_zapped(pf);
 	lockdep_unlock();
-	raw_local_irq_restore(flags);
+	local_irq_restore(flags);
 
 	/*
 	 * Wait for any possible iterators from look_up_lock_class() to pass
@@ -6043,7 +6043,7 @@ static void lockdep_reset_lock_reg(struct lockdep_map *lock)
 	unsigned long flags;
 	int locked;
 
-	raw_local_irq_save(flags);
+	local_irq_save(flags);
 	locked = graph_lock();
 	if (!locked)
 		goto out_irq;
@@ -6054,7 +6054,7 @@ static void lockdep_reset_lock_reg(struct lockdep_map *lock)
 
 	graph_unlock();
 out_irq:
-	raw_local_irq_restore(flags);
+	local_irq_restore(flags);
 }
 
 /*
@@ -6098,7 +6098,11 @@ void lockdep_unregister_key(struct lock_class_key *key)
 	if (WARN_ON_ONCE(static_obj(key)))
 		return;
 
-	raw_local_irq_save(flags);
+	/*
+	 * local_irq_save() should be used as call_rcu() will check
+	 * hardirqs_enabled state.
+	 */
+	local_irq_save(flags);
 	if (!graph_lock())
 		goto out_irq;
 
@@ -6115,7 +6119,7 @@ void lockdep_unregister_key(struct lock_class_key *key)
 	call_rcu_zapped(pf);
 	graph_unlock();
 out_irq:
-	raw_local_irq_restore(flags);
+	local_irq_restore(flags);
 
 	/* Wait until is_dynamic_key() has finished accessing k->hash_entry. */
 	synchronize_rcu();
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] locking/lockdep: Use local_irq_save() with call_rcu()
  2020-12-22 22:55 [PATCH] locking/lockdep: Use local_irq_save() with call_rcu() Waiman Long
@ 2021-01-04 15:38 ` Peter Zijlstra
  2021-01-04 19:22   ` Waiman Long
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Zijlstra @ 2021-01-04 15:38 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche,
	Paul McKenney, Boqun Feng

On Tue, Dec 22, 2020 at 05:55:53PM -0500, Waiman Long wrote:
> The following lockdep splat was hit:
> 
>  [  560.638354] WARNING: CPU: 79 PID: 27458 at kernel/rcu/tree_plugin.h:1749 call_rcu+0x6dc/0xf00
>     :
>  [  560.647761] RIP: 0010:call_rcu+0x6dc/0xf00
>  [  560.647763] Code: 0f 8f 29 04 00 00 e8 93 da 1c 00 48 8b 3c 24 57 9d 0f 1f 44 00 00 e9 19 fa ff ff 65 8b 05 38 83 c4 49 85 c0 0f 84 cd fb ff ff <0f> 0b e9 c6 fb ff ff e8 b8 45 51 00 4c 89 f2 48 b8 00 00 00 00 00
>  [  560.647764] RSP: 0018:ff11001050097b58 EFLAGS: 00010002
>  [  560.647766] RAX: 0000000000000001 RBX: ffffffffbb1f3360 RCX: 0000000000000001
>  [  560.647766] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb99bac9c
>  [  560.647767] RBP: 1fe220020a012f73 R08: 000000010004005c R09: dffffc0000000000
>  [  560.647768] R10: dffffc0000000000 R11: 0000000000000003 R12: ff1100105b7f70e1
>  [  560.647769] R13: ffffffffb635d8a0 R14: ff1100105b7f72d8 R15: ff1100105b7f7040
>  [  560.647770] FS:  00007fd9b3437080(0000) GS:ff1100105b600000(0000) knlGS:0000000000000000
>  [  560.647771] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  [  560.647772] CR2: 00007fd9b30112bc CR3: 000000105e898006 CR4: 0000000000761ee0
>  [  560.647773] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  [  560.647773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  [  560.647774] PKRU: 55555554
>  [  560.647774] Call Trace:
>  [  560.647778]  ? invoke_rcu_core+0x180/0x180
>  [  560.647782]  ? __is_module_percpu_address+0xed/0x440
>  [  560.647787]  lockdep_unregister_key+0x2ab/0x5b0
>  [  560.647791]  destroy_workqueue+0x40b/0x610
>  [  560.647862]  xlog_dealloc_log+0x216/0x2b0 [xfs]
>     :
> 
> This splat is caused by the fact that lockdep_unregister_key() uses
> raw_local_irq_save() which doesn't update the hardirqs_enabled
> percpu flag.  The call_rcu() function, however, will call
> lockdep_assert_irqs_disabled() to check the hardirqs_enabled flag which
> remained set in this case.
> 
> Fix this problem by using local_irq_save()/local_irq_restore() pairs
> whenever call_rcu() is being called.

I'm not sure I much like all this,.. :/

> I think raw_local_irq_save() function can be used if no external
> function is being called except maybe printk() as it means another
> lockdep problem exists.

The reason lockdep is using raw_local_irq_save() is to avoid calling
into itself again, notably local_irq_restore() will end up in
mark_held_locks().

> Fixes: a0b0fd53e1e67 ("locking/lockdep: Free lock classes that are no longer in use")

Seems dubious, as the lockdep_assert_irqs_disabled() that triggered was
added after that patch.

I'm thinking another solution would be to increment the lockdep
recursion count before calling RCU, because then we'll fail
__lockdep_enabled and the assertion gets killed. Hmm?

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] locking/lockdep: Use local_irq_save() with call_rcu()
  2021-01-04 15:38 ` Peter Zijlstra
@ 2021-01-04 19:22   ` Waiman Long
  0 siblings, 0 replies; 3+ messages in thread
From: Waiman Long @ 2021-01-04 19:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Will Deacon, linux-kernel, Bart Van Assche,
	Paul McKenney, Boqun Feng

On 1/4/21 10:38 AM, Peter Zijlstra wrote:
> On Tue, Dec 22, 2020 at 05:55:53PM -0500, Waiman Long wrote:
>> The following lockdep splat was hit:
>>
>>   [  560.638354] WARNING: CPU: 79 PID: 27458 at kernel/rcu/tree_plugin.h:1749 call_rcu+0x6dc/0xf00
>>      :
>>   [  560.647761] RIP: 0010:call_rcu+0x6dc/0xf00
>>   [  560.647763] Code: 0f 8f 29 04 00 00 e8 93 da 1c 00 48 8b 3c 24 57 9d 0f 1f 44 00 00 e9 19 fa ff ff 65 8b 05 38 83 c4 49 85 c0 0f 84 cd fb ff ff <0f> 0b e9 c6 fb ff ff e8 b8 45 51 00 4c 89 f2 48 b8 00 00 00 00 00
>>   [  560.647764] RSP: 0018:ff11001050097b58 EFLAGS: 00010002
>>   [  560.647766] RAX: 0000000000000001 RBX: ffffffffbb1f3360 RCX: 0000000000000001
>>   [  560.647766] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb99bac9c
>>   [  560.647767] RBP: 1fe220020a012f73 R08: 000000010004005c R09: dffffc0000000000
>>   [  560.647768] R10: dffffc0000000000 R11: 0000000000000003 R12: ff1100105b7f70e1
>>   [  560.647769] R13: ffffffffb635d8a0 R14: ff1100105b7f72d8 R15: ff1100105b7f7040
>>   [  560.647770] FS:  00007fd9b3437080(0000) GS:ff1100105b600000(0000) knlGS:0000000000000000
>>   [  560.647771] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>   [  560.647772] CR2: 00007fd9b30112bc CR3: 000000105e898006 CR4: 0000000000761ee0
>>   [  560.647773] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>   [  560.647773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>   [  560.647774] PKRU: 55555554
>>   [  560.647774] Call Trace:
>>   [  560.647778]  ? invoke_rcu_core+0x180/0x180
>>   [  560.647782]  ? __is_module_percpu_address+0xed/0x440
>>   [  560.647787]  lockdep_unregister_key+0x2ab/0x5b0
>>   [  560.647791]  destroy_workqueue+0x40b/0x610
>>   [  560.647862]  xlog_dealloc_log+0x216/0x2b0 [xfs]
>>      :
>>
>> This splat is caused by the fact that lockdep_unregister_key() uses
>> raw_local_irq_save() which doesn't update the hardirqs_enabled
>> percpu flag.  The call_rcu() function, however, will call
>> lockdep_assert_irqs_disabled() to check the hardirqs_enabled flag which
>> remained set in this case.
>>
>> Fix this problem by using local_irq_save()/local_irq_restore() pairs
>> whenever call_rcu() is being called.
> I'm not sure I much like all this,.. :/
>
>> I think raw_local_irq_save() function can be used if no external
>> function is being called except maybe printk() as it means another
>> lockdep problem exists.
> The reason lockdep is using raw_local_irq_save() is to avoid calling
> into itself again, notably local_irq_restore() will end up in
> mark_held_locks().
>
>> Fixes: a0b0fd53e1e67 ("locking/lockdep: Free lock classes that are no longer in use")
> Seems dubious, as the lockdep_assert_irqs_disabled() that triggered was
> added after that patch.
>
> I'm thinking another solution would be to increment the lockdep
> recursion count before calling RCU, because then we'll fail
> __lockdep_enabled and the assertion gets killed. Hmm?
>
Yes, you are right. Incrementing the lockdep recursion count should fix 
the issue. I was missing the commit 43be4388e94b ("lockdep: Put graph 
lock/unlock under lock_recursion protection"). This commit will properly 
increment the percpu lockdep recursion count and disable the call_rcu 
warning. So please ignore this patch.

Thanks,
Longman


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-01-04 19:24 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-22 22:55 [PATCH] locking/lockdep: Use local_irq_save() with call_rcu() Waiman Long
2021-01-04 15:38 ` Peter Zijlstra
2021-01-04 19:22   ` Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).