From: Ben Greear <greearb@candelatech.com>
To: "Manoharan, Rajkumar" <rmanohar@qti.qualcomm.com>,
	"ath10k@lists.infradead.org" <ath10k@lists.infradead.org>
Cc: "linux-wireless@vger.kernel.org" <linux-wireless@vger.kernel.org>
Subject: Re: [PATCH 2/3] ath10k:  Grab rcu_read_lock before the txqs spinlock.
Date: Thu, 18 Aug 2016 20:28:24 -0700
Message-ID: <57B67CD8.5040009@candelatech.com>
In-Reply-To: <1471575674214.65791@qti.qualcomm.com>



On 08/18/2016 08:01 PM, Manoharan, Rajkumar wrote:
>> diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
>> index 916119c..d96c06e 100644
>> --- a/drivers/net/wireless/ath/ath10k/mac.c
>> +++ b/drivers/net/wireless/ath/ath10k/mac.c
>> @@ -4307,8 +4307,8 @@ void ath10k_mac_tx_push_pending(struct ath10k *ar)
>>          int max;
>>          int loop_max = 2000;
>>
>> -       spin_lock_bh(&ar->txqs_lock);
>>          rcu_read_lock();
>> +       spin_lock_bh(&ar->txqs_lock);
>>
> Ben,
>
> It is quite possible that push_pending can be preempted after acquiring rcu_lock which
> may lead to deadlock. no? I assume to prevent that spin_lock is taken first.
>
> Could you please explain how this reordering is fixing dead lock?

It did not obviously fix the spin lock issue, but the issue went away.  Maybe it
was because I fixed the stale memory access issues at around the same time.

But, I don't think you can deadlock by taking rcu lock first and then the spinlock.

I checked all places where the spinlock is held, and I do not see any way for any of
them to block forever (in my kernel, I cap the loop in the push-pending method at 2000
iterations, which helps make sure we don't spin forever).

http://dmz2.candelatech.com/?p=linux-4.7.dev.y/.git;a=commitdiff;h=5d192657269d8e20fce733f894bb1b68df71db00
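
To illustrate the iteration cap mentioned above, here is a minimal userspace sketch, not the ath10k code. It assumes a pthread mutex standing in for ar->txqs_lock and a plain counter standing in for the queued frames; struct txq_model, push_one() and push_pending_bounded() are hypothetical names made up for this example.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names come from ath10k. */
struct txq_model {
	pthread_mutex_t lock;	/* plays the role of ar->txqs_lock */
	int pending;		/* frames still waiting to be pushed */
};

/* Pretend to push one frame; returns false when nothing is left. */
static bool push_one(struct txq_model *q)
{
	if (q->pending == 0)
		return false;
	q->pending--;
	return true;
}

/*
 * Drain the queue, but give up after loop_max iterations so the lock
 * is never held for an unbounded time. Returns the frames pushed.
 */
static int push_pending_bounded(struct txq_model *q, int loop_max)
{
	int pushed = 0;

	pthread_mutex_lock(&q->lock);
	while (loop_max-- > 0 && push_one(q))
		pushed++;
	pthread_mutex_unlock(&q->lock);

	return pushed;
}
```

With 5000 frames pending and loop_max set to 2000, a single call would push 2000 frames and leave the rest for a later call, so the time spent holding the lock stays bounded.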

I was also worried that some calls from mac80211 might be holding rcu_read_lock while calling into
ath10k code that would grab the spinlock.  If that is the case (and I didn't verify it was), then
you could have a lock inversion by taking spinlock before rcu read lock in the push-pending method.
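
The ordering concern above can be modelled with a tiny order checker. This is only a sketch: enum lock_id, seen[][] and record_order() are hypothetical names, and the model treats both primitives as ordinary locks. That is a simplification, since in the kernel rcu_read_lock() marks a read-side critical section and does not itself wait for another holder.

```c
#include <stdbool.h>

/* Hypothetical lock classes for the model, not kernel identifiers. */
enum lock_id { LOCK_RCU, LOCK_TXQS, LOCK_COUNT };

/* seen[a][b] is true once some path took b while already holding a. */
static bool seen[LOCK_COUNT][LOCK_COUNT];

/*
 * Record "outer is held, then inner is taken".
 * Returns false if the opposite order was recorded earlier,
 * which is the inversion pattern described in the text.
 */
static bool record_order(enum lock_id outer, enum lock_id inner)
{
	if (seen[inner][outer])
		return false;
	seen[outer][inner] = true;
	return true;
}
```

If every path records (LOCK_RCU, LOCK_TXQS), the checker never complains; a single path recording (LOCK_TXQS, LOCK_RCU) afterwards would be flagged.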

Anyway, someone who understands locking subtleties better might have more of a clue about this code
than I do.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


