* [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
@ 2014-05-07 7:22 David Herrmann
2014-05-07 19:54 ` John W. Linville
2014-05-08 18:18 ` Rajkumar Manoharan
0 siblings, 2 replies; 14+ messages in thread
From: David Herrmann @ 2014-05-07 7:22 UTC (permalink / raw)
To: linux-wireless
Cc: Luis R. Rodriguez, Jouni Malinen, Vasanthakumar Thiagarajan,
Senthil Balasubramanian, John W. Linville, ath9k-devel,
Oleksij Rempel, David Herrmann
ah->caldata may be NULL if no channel is selected. Check for that before
accessing it.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
---
Hi
This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
is freed while we run in hw_per_calibration(). However, this patch fixes serious
kernel panics with wifi-P2P on my machine.
I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
correct fix would be to synchronously stop any running hw-calibration before
setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
this small workaround.
Thanks
David
drivers/net/wireless/ath/ath9k/ar9002_calib.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/ar9002_calib.c b/drivers/net/wireless/ath/ath9k/ar9002_calib.c
index cdc7400..4667ab9 100644
--- a/drivers/net/wireless/ath/ath9k/ar9002_calib.c
+++ b/drivers/net/wireless/ath/ath9k/ar9002_calib.c
@@ -99,14 +99,16 @@ static bool ar9002_hw_per_calibration(struct ath_hw *ah,
}
currCal->calData->calPostProc(ah, numChains);
- caldata->CalValid |= currCal->calData->calType;
+ if (caldata)
+ caldata->CalValid |= currCal->calData->calType;
+
currCal->calState = CAL_DONE;
iscaldone = true;
} else {
ar9002_hw_setup_calibration(ah, currCal);
}
}
- } else if (!(caldata->CalValid & currCal->calData->calType)) {
+ } else if (caldata && !(caldata->CalValid & currCal->calData->calType)) {
ath9k_hw_reset_calibration(ah, currCal);
}
--
1.9.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-07 7:22 [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002 David Herrmann
@ 2014-05-07 19:54 ` John W. Linville
2014-05-07 20:03 ` [ath9k-devel] " Luis R. Rodriguez
2014-05-07 20:15 ` Felix Fietkau
2014-05-08 18:18 ` Rajkumar Manoharan
1 sibling, 2 replies; 14+ messages in thread
From: John W. Linville @ 2014-05-07 19:54 UTC (permalink / raw)
To: David Herrmann
Cc: linux-wireless, Luis R. Rodriguez, Jouni Malinen,
Vasanthakumar Thiagarajan, Senthil Balasubramanian, ath9k-devel,
Oleksij Rempel
On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
> ah->caldata may be NULL if no channel is selected. Check for that before
> accessing it.
>
> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> ---
> Hi
>
> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
> is freed while we run in hw_per_calibration(). However, this patch fixes serious
> kernel panics with wifi-P2P on my machine.
>
> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
> correct fix would be to synchronously stop any running hw-calibration before
> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
> this small workaround.
>
> Thanks
> David
Is there any hope for getting a more complete fix from the ath9k guys
in short order?
John
>
> drivers/net/wireless/ath/ath9k/ar9002_calib.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/ar9002_calib.c b/drivers/net/wireless/ath/ath9k/ar9002_calib.c
> index cdc7400..4667ab9 100644
> --- a/drivers/net/wireless/ath/ath9k/ar9002_calib.c
> +++ b/drivers/net/wireless/ath/ath9k/ar9002_calib.c
> @@ -99,14 +99,16 @@ static bool ar9002_hw_per_calibration(struct ath_hw *ah,
> }
>
> currCal->calData->calPostProc(ah, numChains);
> - caldata->CalValid |= currCal->calData->calType;
> + if (caldata)
> + caldata->CalValid |= currCal->calData->calType;
> +
> currCal->calState = CAL_DONE;
> iscaldone = true;
> } else {
> ar9002_hw_setup_calibration(ah, currCal);
> }
> }
> - } else if (!(caldata->CalValid & currCal->calData->calType)) {
> + } else if (caldata && !(caldata->CalValid & currCal->calData->calType)) {
> ath9k_hw_reset_calibration(ah, currCal);
> }
>
> --
> 1.9.2
>
>
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-07 19:54 ` John W. Linville
@ 2014-05-07 20:03 ` Luis R. Rodriguez
2014-05-07 20:38 ` John W. Linville
2014-05-07 20:15 ` Felix Fietkau
1 sibling, 1 reply; 14+ messages in thread
From: Luis R. Rodriguez @ 2014-05-07 20:03 UTC (permalink / raw)
To: John W. Linville
Cc: David Herrmann, Vasanthakumar Thiagarajan, ath9k mailing list,
linux-wireless, Jouni Malinen, Senthil Balasubramanian
On Wed, May 7, 2014 at 12:54 PM, John W. Linville
<linville@tuxdriver.com> wrote:
> Is there any hope for getting a more complete fix from the ath9k guys
> in short order?
Wait, who are the ath9k folks now exactly ?
Luis
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-07 19:54 ` John W. Linville
2014-05-07 20:03 ` [ath9k-devel] " Luis R. Rodriguez
@ 2014-05-07 20:15 ` Felix Fietkau
2014-05-12 17:49 ` John W. Linville
1 sibling, 1 reply; 14+ messages in thread
From: Felix Fietkau @ 2014-05-07 20:15 UTC (permalink / raw)
To: John W. Linville, David Herrmann
Cc: Vasanthakumar Thiagarajan, ath9k-devel, linux-wireless,
Jouni Malinen, Luis R. Rodriguez, Senthil Balasubramanian
On 2014-05-07 21:54, John W. Linville wrote:
> On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
>> ah->caldata may be NULL if no channel is selected. Check for that before
>> accessing it.
>>
>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> ---
>> Hi
>>
>> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
>> is freed while we run in hw_per_calibration(). However, this patch fixes serious
>> kernel panics with wifi-P2P on my machine.
>>
>> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
>> correct fix would be to synchronously stop any running hw-calibration before
>> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
>> this small workaround.
>>
>> Thanks
>> David
>
> Is there any hope for getting a more complete fix from the ath9k guys
> in short order?
This looks easy to fix. I'll send a patch soon.
- Felix
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-07 20:03 ` [ath9k-devel] " Luis R. Rodriguez
@ 2014-05-07 20:38 ` John W. Linville
0 siblings, 0 replies; 14+ messages in thread
From: John W. Linville @ 2014-05-07 20:38 UTC (permalink / raw)
To: Luis R. Rodriguez
Cc: David Herrmann, Vasanthakumar Thiagarajan, ath9k mailing list,
linux-wireless, Jouni Malinen, Senthil Balasubramanian
On Wed, May 07, 2014 at 01:03:00PM -0700, Luis R. Rodriguez wrote:
> On Wed, May 7, 2014 at 12:54 PM, John W. Linville
> <linville@tuxdriver.com> wrote:
> > Is there any hope for getting a more complete fix from the ath9k guys
> > in short order?
>
> Wait, who are the ath9k folks now exactly ?
A better question may be, who are the qualcomm folks now exactly? ;-)
Anyway, I hope that the ath9k guys know who they are and will respond.
If there aren't any, then I guess we'll have to start slapping bubble
gum and popsicle sticks onto ath9k...
John
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-07 7:22 [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002 David Herrmann
2014-05-07 19:54 ` John W. Linville
@ 2014-05-08 18:18 ` Rajkumar Manoharan
2014-05-08 20:16 ` David Herrmann
1 sibling, 1 reply; 14+ messages in thread
From: Rajkumar Manoharan @ 2014-05-08 18:18 UTC (permalink / raw)
To: David Herrmann
Cc: linux-wireless, Luis R. Rodriguez, Jouni Malinen,
Vasanthakumar Thiagarajan, Senthil Balasubramanian,
John W. Linville, ath9k-devel, Oleksij Rempel
On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
> ah->caldata may be NULL if no channel is selected. Check for that before
> accessing it.
>
> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> ---
> Hi
>
> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
> is freed while we run in hw_per_calibration(). However, this patch fixes serious
> kernel panics with wifi-P2P on my machine.
>
> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
> correct fix would be to synchronously stop any running hw-calibration before
> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
> this small workaround.
>
David,
Whenever the DUT is moving to off-channel, ah->caldata is set to NULL in
hw_reset. As you mentioned, before doing hw_reset, the on-going calibration is stopped
synchronously. I using ar9280 for p2p (GO & CLI) validation. Somehow i do not observe
the panics. Is there a easiest way to reproduce the problem. Are you
using wireless-testing tree? Thanks for reporting the problem. Will try
to fix asap.
-Rajkumar
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-08 18:18 ` Rajkumar Manoharan
@ 2014-05-08 20:16 ` David Herrmann
0 siblings, 0 replies; 14+ messages in thread
From: David Herrmann @ 2014-05-08 20:16 UTC (permalink / raw)
To: Rajkumar Manoharan
Cc: linux-wireless, Luis R. Rodriguez, Jouni Malinen,
Vasanthakumar Thiagarajan, Senthil Balasubramanian,
John W. Linville, ath9k-devel, Oleksij Rempel
Hi
On Thu, May 8, 2014 at 8:18 PM, Rajkumar Manoharan
<rmanohar@qti.qualcomm.com> wrote:
> On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
>> ah->caldata may be NULL if no channel is selected. Check for that before
>> accessing it.
>>
>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> ---
>> Hi
>>
>> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
>> is freed while we run in hw_per_calibration(). However, this patch fixes serious
>> kernel panics with wifi-P2P on my machine.
>>
>> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
>> correct fix would be to synchronously stop any running hw-calibration before
>> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
>> this small workaround.
>>
> David,
>
> Whenever the DUT is moving to off-channel, ah->caldata is set to NULL in
> hw_reset. As you mentioned, before doing hw_reset, the on-going calibration is stopped
> synchronously. I using ar9280 for p2p (GO & CLI) validation. Somehow i do not observe
> the panics. Is there a easiest way to reproduce the problem. Are you
> using wireless-testing tree? Thanks for reporting the problem. Will try
> to fix asap.
Reproducing it is actually quite easy on my machine. Whenever I start
a P2P-connect from my Android-phone to my linux-host and _immediately_
accept it (via p2p_connect on wpas), I get the kernel-panic. Adding
the NULL-protection fixes this.
However, if I delay accepting the connection (ie, issuing p2p_connect
by hand instead of automatically), I cannot see the bug. Furthermore,
on my slower Intel Core 2 Duo, the bug happens much less likely. On my
ARM machine I never saw this happening. Given that my main machine is
an Intel hsw quad-core, I guess it's a simple race-condition.
I also added a printk() whenever caldata is NULL and noticed that it
fires only during the first 2 or 3 runs. After that, it never happened
again.
The bug happens on all linux kernels I tested (starting with 3.9ish up
to linux-next). However, if I apply my fix, anything after 3.13-stable
fails to transmit DHCP data. I can connect properly but DHCP always
times out. I'm not sure why that happens and I'm still debugging this,
but it's quite likely a separate issue. (if I find some time, I will
bisect this)
I now looked at the ath9k code and I couldn't see any locking around
the hw_reset at all. I don't know whether the wifi-core / nl80211
locks this, but what happens if two hw_resets race each other? Just a
guess.. I will try to look into it tomorrow.
Thanks
David
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-07 20:15 ` Felix Fietkau
@ 2014-05-12 17:49 ` John W. Linville
2014-05-12 18:43 ` Felix Fietkau
0 siblings, 1 reply; 14+ messages in thread
From: John W. Linville @ 2014-05-12 17:49 UTC (permalink / raw)
To: Felix Fietkau
Cc: David Herrmann, Vasanthakumar Thiagarajan, ath9k-devel,
linux-wireless, Jouni Malinen, Luis R. Rodriguez,
Senthil Balasubramanian
On Wed, May 07, 2014 at 10:15:53PM +0200, Felix Fietkau wrote:
> On 2014-05-07 21:54, John W. Linville wrote:
> > On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
> >> ah->caldata may be NULL if no channel is selected. Check for that before
> >> accessing it.
> >>
> >> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> >> ---
> >> Hi
> >>
> >> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
> >> is freed while we run in hw_per_calibration(). However, this patch fixes serious
> >> kernel panics with wifi-P2P on my machine.
> >>
> >> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
> >> correct fix would be to synchronously stop any running hw-calibration before
> >> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
> >> this small workaround.
> >>
> >> Thanks
> >> David
> >
> > Is there any hope for getting a more complete fix from the ath9k guys
> > in short order?
> This looks easy to fix. I'll send a patch soon.
Ping?
--
John W. Linville Someday the world will need a hero, and you
linville@tuxdriver.com might be all we have. Be ready.
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-12 17:49 ` John W. Linville
@ 2014-05-12 18:43 ` Felix Fietkau
2014-05-13 6:50 ` David Herrmann
0 siblings, 1 reply; 14+ messages in thread
From: Felix Fietkau @ 2014-05-12 18:43 UTC (permalink / raw)
To: John W. Linville
Cc: David Herrmann, Vasanthakumar Thiagarajan, ath9k-devel,
linux-wireless, Jouni Malinen, Luis R. Rodriguez,
Senthil Balasubramanian
On 2014-05-12 19:49, John W. Linville wrote:
> On Wed, May 07, 2014 at 10:15:53PM +0200, Felix Fietkau wrote:
>> On 2014-05-07 21:54, John W. Linville wrote:
>> > On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
>> >> ah->caldata may be NULL if no channel is selected. Check for that before
>> >> accessing it.
>> >>
>> >> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> >> ---
>> >> Hi
>> >>
>> >> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
>> >> is freed while we run in hw_per_calibration(). However, this patch fixes serious
>> >> kernel panics with wifi-P2P on my machine.
>> >>
>> >> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
>> >> correct fix would be to synchronously stop any running hw-calibration before
>> >> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
>> >> this small workaround.
>> >>
>> >> Thanks
>> >> David
>> >
>> > Is there any hope for getting a more complete fix from the ath9k guys
>> > in short order?
>> This looks easy to fix. I'll send a patch soon.
>
> Ping?
I looked into it again, the scenario where I assumed that this problem
could occur didn't turn out to be true. I have no idea how this crash
can occur.
- Felix
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-12 18:43 ` Felix Fietkau
@ 2014-05-13 6:50 ` David Herrmann
2014-05-13 9:00 ` Rajkumar Manoharan
0 siblings, 1 reply; 14+ messages in thread
From: David Herrmann @ 2014-05-13 6:50 UTC (permalink / raw)
To: Felix Fietkau
Cc: John W. Linville, Vasanthakumar Thiagarajan, ath9k-devel,
linux-wireless, Jouni Malinen, Luis R. Rodriguez,
Senthil Balasubramanian
Hi
On Mon, May 12, 2014 at 8:43 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> I looked into it again, the scenario where I assumed that this problem
> could occur didn't turn out to be true. I have no idea how this crash
> can occur.
The only path that can set ah->caldata to NULL is through:
ieee80211_hw_config()
ath9k_htc_config()
ath9k_htc_set_channel()
ath9k_hw_reset()
This happens whenever IEEE80211_CONF_OFFCHANNEL is set.
Now mac80211 is way to big for me to review right now and
ieee80211_hw_config() is used quite often. Given that the described
call-path does no synchronization against ath9k_htc_ani_work(), all
the callers of mac80211_hw_config(OFFCHANNEL) must guarantee that no
ani-work is running. Is that intentional? I cannot see any of those
functions calling into ath9k_htc_stop_ani(). This might of course be
implicit.
One call-path I see is:
ieee80211_scan_cancel()
cancel_delayed_work()
We cannot use cancel_delayed_work_sync() here due to locking issues.
However, this obviously races against any following
set_channel(OFFCHANNEL) request.
If there's anything I can do to debug this, let me know. I tried
adding some printk()'s into the hot-path and it turns out to no longer
fail then. So this really seems to be a quite small race (given that a
bunch of simple printk()s is slow enough to work around it).
Thanks
David
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-13 6:50 ` David Herrmann
@ 2014-05-13 9:00 ` Rajkumar Manoharan
2014-05-13 9:09 ` David Herrmann
0 siblings, 1 reply; 14+ messages in thread
From: Rajkumar Manoharan @ 2014-05-13 9:00 UTC (permalink / raw)
To: David Herrmann
Cc: Felix Fietkau, John W. Linville, Vasanthakumar Thiagarajan,
ath9k-devel, linux-wireless, Jouni Malinen, Luis R. Rodriguez,
Senthil Balasubramanian
On Tue, May 13, 2014 at 08:50:00AM +0200, David Herrmann wrote:
> Hi
>
> On Mon, May 12, 2014 at 8:43 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> > I looked into it again, the scenario where I assumed that this problem
> > could occur didn't turn out to be true. I have no idea how this crash
> > can occur.
>
> The only path that can set ah->caldata to NULL is through:
>
> ieee80211_hw_config()
> ath9k_htc_config()
> ath9k_htc_set_channel()
> ath9k_hw_reset()
>
> This happens whenever IEEE80211_CONF_OFFCHANNEL is set.
>
> Now mac80211 is way to big for me to review right now and
> ieee80211_hw_config() is used quite often. Given that the described
> call-path does no synchronization against ath9k_htc_ani_work(), all
> the callers of mac80211_hw_config(OFFCHANNEL) must guarantee that no
> ani-work is running. Is that intentional? I cannot see any of those
> functions calling into ath9k_htc_stop_ani(). This might of course be
> implicit.
>
> One call-path I see is:
> ieee80211_scan_cancel()
> cancel_delayed_work()
>
> We cannot use cancel_delayed_work_sync() here due to locking issues.
> However, this obviously races against any following
> set_channel(OFFCHANNEL) request.
>
> If there's anything I can do to debug this, let me know. I tried
> adding some printk()'s into the hot-path and it turns out to no longer
> fail then. So this really seems to be a quite small race (given that a
> bunch of simple printk()s is slow enough to work around it).
David,
Are you using USB devices? What is the PID/VID? I have not looked at
HTC driver perspective.
-Rajkumar
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-13 9:00 ` Rajkumar Manoharan
@ 2014-05-13 9:09 ` David Herrmann
2014-05-13 10:41 ` Rajkumar Manoharan
0 siblings, 1 reply; 14+ messages in thread
From: David Herrmann @ 2014-05-13 9:09 UTC (permalink / raw)
To: Rajkumar Manoharan
Cc: Felix Fietkau, John W. Linville, Vasanthakumar Thiagarajan,
ath9k-devel, linux-wireless, Jouni Malinen, Luis R. Rodriguez,
Senthil Balasubramanian
Hi
On Tue, May 13, 2014 at 11:00 AM, Rajkumar Manoharan
<rmanohar@qti.qualcomm.com> wrote:
> On Tue, May 13, 2014 at 08:50:00AM +0200, David Herrmann wrote:
>> Hi
>>
>> On Mon, May 12, 2014 at 8:43 PM, Felix Fietkau <nbd@openwrt.org> wrote:
>> > I looked into it again, the scenario where I assumed that this problem
>> > could occur didn't turn out to be true. I have no idea how this crash
>> > can occur.
>>
>> The only path that can set ah->caldata to NULL is through:
>>
>> ieee80211_hw_config()
>> ath9k_htc_config()
>> ath9k_htc_set_channel()
>> ath9k_hw_reset()
>>
>> This happens whenever IEEE80211_CONF_OFFCHANNEL is set.
>>
>> Now mac80211 is way to big for me to review right now and
>> ieee80211_hw_config() is used quite often. Given that the described
>> call-path does no synchronization against ath9k_htc_ani_work(), all
>> the callers of mac80211_hw_config(OFFCHANNEL) must guarantee that no
>> ani-work is running. Is that intentional? I cannot see any of those
>> functions calling into ath9k_htc_stop_ani(). This might of course be
>> implicit.
>>
>> One call-path I see is:
>> ieee80211_scan_cancel()
>> cancel_delayed_work()
>>
>> We cannot use cancel_delayed_work_sync() here due to locking issues.
>> However, this obviously races against any following
>> set_channel(OFFCHANNEL) request.
>>
>> If there's anything I can do to debug this, let me know. I tried
>> adding some printk()'s into the hot-path and it turns out to no longer
>> fail then. So this really seems to be a quite small race (given that a
>> bunch of simple printk()s is slow enough to work around it).
>
> David,
>
> Are you using USB devices? What is the PID/VID? I have not looked at
> HTC driver perspective.
Yes, I'm using the htc driver. I thought that was clear by pointing at
ar9002, but I was wrong, sorry. lsusb information is:
'Bus 003 Device 056: ID 0411:017f BUFFALO INC. (formerly MelCo.,
Inc.) Sony UWA-BR100 802.11abgn Wireless Adapter [Atheros
AR7010+AR9280]'
Thanks
David
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-13 9:09 ` David Herrmann
@ 2014-05-13 10:41 ` Rajkumar Manoharan
2014-05-13 18:21 ` David Herrmann
0 siblings, 1 reply; 14+ messages in thread
From: Rajkumar Manoharan @ 2014-05-13 10:41 UTC (permalink / raw)
To: David Herrmann
Cc: Felix Fietkau, John W. Linville, ath9k-devel, linux-wireless
[-- Attachment #1: Type: text/plain, Size: 2472 bytes --]
On Tue, May 13, 2014 at 11:09:48AM +0200, David Herrmann wrote:
> Hi
>
> On Tue, May 13, 2014 at 11:00 AM, Rajkumar Manoharan
> <rmanohar@qti.qualcomm.com> wrote:
> > On Tue, May 13, 2014 at 08:50:00AM +0200, David Herrmann wrote:
> >> Hi
> >>
> >> On Mon, May 12, 2014 at 8:43 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> >> > I looked into it again, the scenario where I assumed that this problem
> >> > could occur didn't turn out to be true. I have no idea how this crash
> >> > can occur.
> >>
> >> The only path that can set ah->caldata to NULL is through:
> >>
> >> ieee80211_hw_config()
> >> ath9k_htc_config()
> >> ath9k_htc_set_channel()
> >> ath9k_hw_reset()
> >>
> >> This happens whenever IEEE80211_CONF_OFFCHANNEL is set.
> >>
> >> Now mac80211 is way to big for me to review right now and
> >> ieee80211_hw_config() is used quite often. Given that the described
> >> call-path does no synchronization against ath9k_htc_ani_work(), all
> >> the callers of mac80211_hw_config(OFFCHANNEL) must guarantee that no
> >> ani-work is running. Is that intentional? I cannot see any of those
> >> functions calling into ath9k_htc_stop_ani(). This might of course be
> >> implicit.
> >>
> >> One call-path I see is:
> >> ieee80211_scan_cancel()
> >> cancel_delayed_work()
> >>
> >> We cannot use cancel_delayed_work_sync() here due to locking issues.
> >> However, this obviously races against any following
> >> set_channel(OFFCHANNEL) request.
> >>
> >> If there's anything I can do to debug this, let me know. I tried
> >> adding some printk()'s into the hot-path and it turns out to no longer
> >> fail then. So this really seems to be a quite small race (given that a
> >> bunch of simple printk()s is slow enough to work around it).
> >
> > David,
> >
> > Are you using USB devices? What is the PID/VID? I have not looked at
> > HTC driver perspective.
>
> Yes, I'm using the htc driver. I thought that was clear by pointing at
> ar9002, but I was wrong, sorry. lsusb information is:
> 'Bus 003 Device 056: ID 0411:017f BUFFALO INC. (formerly MelCo.,
> Inc.) Sony UWA-BR100 802.11abgn Wireless Adapter [Atheros
> AR7010+AR9280]'
>
Unlike ath9k driver, the ani work is stopped in sw_scan_start callback
for htc driver. So during scan process, ani work is stopped well before
calling set_channel. But in case of p2p_listen operation, set_channel can be
called by sw_roc without stopping ani.
Can you please test with attached change?
-Rajkumar
[-- Attachment #2: ani.patch --]
[-- Type: text/x-diff, Size: 830 bytes --]
diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_main.c b/drivers/net/wireless/ath/ath9k/htc_drv_main.c
index f46cd02..5627917 100644
--- a/drivers/net/wireless/ath/ath9k/htc_drv_main.c
+++ b/drivers/net/wireless/ath/ath9k/htc_drv_main.c
@@ -95,8 +95,10 @@ static void ath9k_htc_vif_iter(void *data, u8 *mac, struct ieee80211_vif *vif)
if ((vif->type == NL80211_IFTYPE_AP ||
vif->type == NL80211_IFTYPE_MESH_POINT) &&
- bss_conf->enable_beacon)
+ bss_conf->enable_beacon) {
priv->reconfig_beacon = true;
+ priv->rearm_ani = true;
+ }
if (bss_conf->assoc) {
priv->rearm_ani = true;
@@ -257,6 +259,7 @@ static int ath9k_htc_set_channel(struct ath9k_htc_priv *priv,
ath9k_htc_ps_wakeup(priv);
+ ath9k_htc_stop_ani(priv);
del_timer_sync(&priv->tx.cleanup_timer);
ath9k_htc_tx_drain(priv);
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
2014-05-13 10:41 ` Rajkumar Manoharan
@ 2014-05-13 18:21 ` David Herrmann
0 siblings, 0 replies; 14+ messages in thread
From: David Herrmann @ 2014-05-13 18:21 UTC (permalink / raw)
To: Rajkumar Manoharan
Cc: Felix Fietkau, John W. Linville, ath9k-devel, linux-wireless
Hi
On Tue, May 13, 2014 at 12:41 PM, Rajkumar Manoharan
<rmanohar@qti.qualcomm.com> wrote:
> On Tue, May 13, 2014 at 11:09:48AM +0200, David Herrmann wrote:
> Unlike ath9k driver, the ani work is stopped in sw_scan_start callback
> for htc driver. So during scan process, ani work is stopped well before
> calling set_channel. But in case of p2p_listen operation, set_channel can be
> called by sw_roc without stopping ani.
>
> Can you please test with attached change?
I cannot reproduce it anymore and my traces show that the
recalibration is called a _lot_ less often. In fact, during 10
successful p2p-connect attempts the per_calib helper is only called
once.
I will keep looking, but looks good so far:
Tested-by: David Herrmann <dh.herrmann@gmail.com>
Thanks
David
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2014-05-13 18:21 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-07 7:22 [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002 David Herrmann
2014-05-07 19:54 ` John W. Linville
2014-05-07 20:03 ` [ath9k-devel] " Luis R. Rodriguez
2014-05-07 20:38 ` John W. Linville
2014-05-07 20:15 ` Felix Fietkau
2014-05-12 17:49 ` John W. Linville
2014-05-12 18:43 ` Felix Fietkau
2014-05-13 6:50 ` David Herrmann
2014-05-13 9:00 ` Rajkumar Manoharan
2014-05-13 9:09 ` David Herrmann
2014-05-13 10:41 ` Rajkumar Manoharan
2014-05-13 18:21 ` David Herrmann
2014-05-08 18:18 ` Rajkumar Manoharan
2014-05-08 20:16 ` David Herrmann
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).