All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH net] bonding: Work around lockdep_is_held false positives
@ 2021-03-22 12:38 Maxim Mikityanskiy
  2021-03-22 14:09 ` Leon Romanovsky
  0 siblings, 1 reply; 5+ messages in thread
From: Maxim Mikityanskiy @ 2021-03-22 12:38 UTC (permalink / raw)
  To: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, Mahesh Bandewar,
	Nikolay Aleksandrov
  Cc: David S. Miller, Jakub Kicinski, Tariq Toukan, netdev,
	Maxim Mikityanskiy

After lockdep gets triggered for the first time, it gets disabled, and
lockdep_enabled() will return false. It will affect lockdep_is_held(),
which will start returning true all the time. Normally, it just disables
checks that expect a lock to be held. However, the bonding code checks
that a lock is NOT held, which triggers a false positive in WARN_ON.

This commit addresses the issue by replacing lockdep_is_held with
spin_is_locked, which should have the same effect, but without suffering
from disabling lockdep.

Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
---
While this patch works around the issue, I would like to discuss better
options. Another straightforward approach is to extend lockdep API with
lockdep_is_not_held(), which will be basically !lockdep_is_held() when
lockdep is enabled, but will return true when !lockdep_enabled().

However, there is no reliable way to check that some lock is not held
without taking it ourselves (because the lock may be taken by another
thread after the check). Could someone explain why this code tries to
make this check? Maybe we could figure out some better way to achieve
the original goal.

 drivers/net/bonding/bond_main.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 74cbbb22470b..b2fe4e93cb8e 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4391,9 +4391,7 @@ int bond_update_slave_arr(struct bonding *bond, struct slave *skipslave)
 	int agg_id = 0;
 	int ret = 0;
 
-#ifdef CONFIG_LOCKDEP
-	WARN_ON(lockdep_is_held(&bond->mode_lock));
-#endif
+	WARN_ON(spin_is_locked(&bond->mode_lock));
 
 	usable_slaves = kzalloc(struct_size(usable_slaves, arr,
 					    bond->slave_cnt), GFP_KERNEL);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH net] bonding: Work around lockdep_is_held false positives
  2021-03-22 12:38 [RFC PATCH net] bonding: Work around lockdep_is_held false positives Maxim Mikityanskiy
@ 2021-03-22 14:09 ` Leon Romanovsky
  2021-03-23 17:34   ` Maxim Mikityanskiy
  0 siblings, 1 reply; 5+ messages in thread
From: Leon Romanovsky @ 2021-03-22 14:09 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, Mahesh Bandewar,
	Nikolay Aleksandrov, David S. Miller, Jakub Kicinski,
	Tariq Toukan, netdev

On Mon, Mar 22, 2021 at 02:38:46PM +0200, Maxim Mikityanskiy wrote:
> After lockdep gets triggered for the first time, it gets disabled, and
> lockdep_enabled() will return false. It will affect lockdep_is_held(),
> which will start returning true all the time. Normally, it just disables
> checks that expect a lock to be held. However, the bonding code checks
> that a lock is NOT held, which triggers a false positive in WARN_ON.
> 
> This commit addresses the issue by replacing lockdep_is_held with
> spin_is_locked, which should have the same effect, but without suffering
> from disabling lockdep.
> 
> Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash")
> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
> ---
> While this patch works around the issue, I would like to discuss better
> options. Another straightforward approach is to extend lockdep API with
> lockdep_is_not_held(), which will be basically !lockdep_is_held() when
> lockdep is enabled, but will return true when !lockdep_enabled().

lockdep_assert_not_held() was added in this cycle to tip: locking/core
https://yhbt.net/lore/all/161475935945.20312.2870945278690244669.tip-bot2@tip-bot2/
https://yhbt.net/lore/all/878s779s9f.fsf@codeaurora.org/

Thanks

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH net] bonding: Work around lockdep_is_held false positives
  2021-03-22 14:09 ` Leon Romanovsky
@ 2021-03-23 17:34   ` Maxim Mikityanskiy
  2021-03-23 17:56     ` Jay Vosburgh
  0 siblings, 1 reply; 5+ messages in thread
From: Maxim Mikityanskiy @ 2021-03-23 17:34 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, Mahesh Bandewar,
	Nikolay Aleksandrov, David S. Miller, Jakub Kicinski,
	Tariq Toukan, netdev

On 2021-03-22 16:09, Leon Romanovsky wrote:
> On Mon, Mar 22, 2021 at 02:38:46PM +0200, Maxim Mikityanskiy wrote:
>> After lockdep gets triggered for the first time, it gets disabled, and
>> lockdep_enabled() will return false. It will affect lockdep_is_held(),
>> which will start returning true all the time. Normally, it just disables
>> checks that expect a lock to be held. However, the bonding code checks
>> that a lock is NOT held, which triggers a false positive in WARN_ON.
>>
>> This commit addresses the issue by replacing lockdep_is_held with
>> spin_is_locked, which should have the same effect, but without suffering
>> from disabling lockdep.
>>
>> Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash")
>> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
>> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
>> ---
>> While this patch works around the issue, I would like to discuss better
>> options. Another straightforward approach is to extend lockdep API with
>> lockdep_is_not_held(), which will be basically !lockdep_is_held() when
>> lockdep is enabled, but will return true when !lockdep_enabled().
> 
> lockdep_assert_not_held() was added in this cycle to tip: locking/core
> https://yhbt.net/lore/all/161475935945.20312.2870945278690244669.tip-bot2@tip-bot2/
> https://yhbt.net/lore/all/878s779s9f.fsf@codeaurora.org/

Thanks for this suggestion - I wasn't aware that this macro was recently 
added and I could use it instead of spin_is_locked.

Still, I would like to figure out why the bonding code does this test at 
all. This lock is not taken by bond_update_slave_arr() itself, so why is 
that a problem in this code?

> Thanks
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH net] bonding: Work around lockdep_is_held false positives
  2021-03-23 17:34   ` Maxim Mikityanskiy
@ 2021-03-23 17:56     ` Jay Vosburgh
  2021-03-23 19:02       ` Maxim Mikityanskiy
  0 siblings, 1 reply; 5+ messages in thread
From: Jay Vosburgh @ 2021-03-23 17:56 UTC (permalink / raw)
  To: Maxim Mikityanskiy
  Cc: Leon Romanovsky, Veaceslav Falico, Andy Gospodarek,
	Mahesh Bandewar, Nikolay Aleksandrov, David S. Miller,
	Jakub Kicinski, Tariq Toukan, netdev

Maxim Mikityanskiy <maximmi@nvidia.com> wrote:

>On 2021-03-22 16:09, Leon Romanovsky wrote:
>> On Mon, Mar 22, 2021 at 02:38:46PM +0200, Maxim Mikityanskiy wrote:
>>> After lockdep gets triggered for the first time, it gets disabled, and
>>> lockdep_enabled() will return false. It will affect lockdep_is_held(),
>>> which will start returning true all the time. Normally, it just disables
>>> checks that expect a lock to be held. However, the bonding code checks
>>> that a lock is NOT held, which triggers a false positive in WARN_ON.
>>>
>>> This commit addresses the issue by replacing lockdep_is_held with
>>> spin_is_locked, which should have the same effect, but without suffering
>>> from disabling lockdep.
>>>
>>> Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash")
>>> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
>>> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
>>> ---
>>> While this patch works around the issue, I would like to discuss better
>>> options. Another straightforward approach is to extend lockdep API with
>>> lockdep_is_not_held(), which will be basically !lockdep_is_held() when
>>> lockdep is enabled, but will return true when !lockdep_enabled().
>>
>> lockdep_assert_not_held() was added in this cycle to tip: locking/core
>> https://yhbt.net/lore/all/161475935945.20312.2870945278690244669.tip-bot2@tip-bot2/
>> https://yhbt.net/lore/all/878s779s9f.fsf@codeaurora.org/
>
>Thanks for this suggestion - I wasn't aware that this macro was recently
>added and I could use it instead of spin_is_locked.
>
>Still, I would like to figure out why the bonding code does this test at
>all. This lock is not taken by bond_update_slave_arr() itself, so why is
>that a problem in this code?

	The goal, I believe, is to insure that the mode_lock is not held
by the caller when entering bond_update_slave_arr.  I suspect this is
because bond_update_slave_arr may sleep.  One calling context notes this
in a comment:

void bond_3ad_handle_link_change(struct slave *slave, char link)
{
[...]
	/* RTNL is held and mode_lock is released so it's safe
	 * to update slave_array here.
	 */
	bond_update_slave_arr(slave->bond, NULL);

	However, as far as I can tell, lockdep_is_held() does not test
for "lock held by this particular context" but instead is "lock held by
any context at all."  As such, I think the test is not valid, and should
be removed.

	The code in question was added by:

commit ee6377147409a00c071b2da853059a7d59979fbc
Author: Mahesh Bandewar <maheshb@google.com>
Date:   Sat Oct 4 17:45:01 2014 -0700

    bonding: Simplify the xmit function for modes that use xmit_hash

	Mahesh, Nikolay, any thoughts?

	-J

---
	-Jay Vosburgh, jay.vosburgh@canonical.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH net] bonding: Work around lockdep_is_held false positives
  2021-03-23 17:56     ` Jay Vosburgh
@ 2021-03-23 19:02       ` Maxim Mikityanskiy
  0 siblings, 0 replies; 5+ messages in thread
From: Maxim Mikityanskiy @ 2021-03-23 19:02 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: Leon Romanovsky, Veaceslav Falico, Andy Gospodarek,
	Mahesh Bandewar, Nikolay Aleksandrov, David S. Miller,
	Jakub Kicinski, Tariq Toukan, netdev

On 2021-03-23 19:56, Jay Vosburgh wrote:
> Maxim Mikityanskiy <maximmi@nvidia.com> wrote:
> 
>> On 2021-03-22 16:09, Leon Romanovsky wrote:
>>> On Mon, Mar 22, 2021 at 02:38:46PM +0200, Maxim Mikityanskiy wrote:
>>>> After lockdep gets triggered for the first time, it gets disabled, and
>>>> lockdep_enabled() will return false. It will affect lockdep_is_held(),
>>>> which will start returning true all the time. Normally, it just disables
>>>> checks that expect a lock to be held. However, the bonding code checks
>>>> that a lock is NOT held, which triggers a false positive in WARN_ON.
>>>>
>>>> This commit addresses the issue by replacing lockdep_is_held with
>>>> spin_is_locked, which should have the same effect, but without suffering
>>>> from disabling lockdep.
>>>>
>>>> Fixes: ee6377147409 ("bonding: Simplify the xmit function for modes that use xmit_hash")
>>>> Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
>>>> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
>>>> ---
>>>> While this patch works around the issue, I would like to discuss better
>>>> options. Another straightforward approach is to extend lockdep API with
>>>> lockdep_is_not_held(), which will be basically !lockdep_is_held() when
>>>> lockdep is enabled, but will return true when !lockdep_enabled().
>>>
>>> lockdep_assert_not_held() was added in this cycle to tip: locking/core
>>> https://yhbt.net/lore/all/161475935945.20312.2870945278690244669.tip-bot2@tip-bot2/
>>> https://yhbt.net/lore/all/878s779s9f.fsf@codeaurora.org/
>>
>> Thanks for this suggestion - I wasn't aware that this macro was recently
>> added and I could use it instead of spin_is_locked.
>>
>> Still, I would like to figure out why the bonding code does this test at
>> all. This lock is not taken by bond_update_slave_arr() itself, so why is
>> that a problem in this code?
> 
> 	The goal, I believe, is to insure that the mode_lock is not held
> by the caller when entering bond_update_slave_arr.  I suspect this is
> because bond_update_slave_arr may sleep.

If that's the case, this check should be replaced with might_sleep(). 
There is at least kzalloc that may sleep, so you may be right, and if 
it's the only reason for this check, it's indeed invalid, as you explain 
below. However, let's see what the authors of the code say - maybe they 
meant that during this function call no context must hold this lock - in 
that case I would like to hear the motivation.

>  One calling context notes this
> in a comment:
> 
> void bond_3ad_handle_link_change(struct slave *slave, char link)
> {
> [...]
> 	/* RTNL is held and mode_lock is released so it's safe
> 	 * to update slave_array here.
> 	 */
> 	bond_update_slave_arr(slave->bond, NULL);
> 
> 	However, as far as I can tell, lockdep_is_held() does not test
> for "lock held by this particular context" but instead is "lock held by
> any context at all."  As such, I think the test is not valid, and should
> be removed.
> 
> 	The code in question was added by:
> 
> commit ee6377147409a00c071b2da853059a7d59979fbc
> Author: Mahesh Bandewar <maheshb@google.com>
> Date:   Sat Oct 4 17:45:01 2014 -0700
> 
>      bonding: Simplify the xmit function for modes that use xmit_hash
> 
> 	Mahesh, Nikolay, any thoughts?
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com
> 


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-03-23 19:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-22 12:38 [RFC PATCH net] bonding: Work around lockdep_is_held false positives Maxim Mikityanskiy
2021-03-22 14:09 ` Leon Romanovsky
2021-03-23 17:34   ` Maxim Mikityanskiy
2021-03-23 17:56     ` Jay Vosburgh
2021-03-23 19:02       ` Maxim Mikityanskiy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.