linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
@ 2014-05-07  7:22 David Herrmann
  2014-05-07 19:54 ` John W. Linville
  2014-05-08 18:18 ` Rajkumar Manoharan
  0 siblings, 2 replies; 14+ messages in thread
From: David Herrmann @ 2014-05-07  7:22 UTC (permalink / raw)
  To: linux-wireless
  Cc: Luis R. Rodriguez, Jouni Malinen, Vasanthakumar Thiagarajan,
	Senthil Balasubramanian, John W. Linville, ath9k-devel,
	Oleksij Rempel, David Herrmann

ah->caldata may be NULL if no channel is selected. Check for that before
accessing it.

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
---
Hi

This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
is freed while we run in hw_per_calibration(). However, this patch fixes serious
kernel panics with wifi-P2P on my machine.

I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
correct fix would be to synchronously stop any running hw-calibration before
setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
this small workaround.

Thanks
David

 drivers/net/wireless/ath/ath9k/ar9002_calib.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/ar9002_calib.c b/drivers/net/wireless/ath/ath9k/ar9002_calib.c
index cdc7400..4667ab9 100644
--- a/drivers/net/wireless/ath/ath9k/ar9002_calib.c
+++ b/drivers/net/wireless/ath/ath9k/ar9002_calib.c
@@ -99,14 +99,16 @@ static bool ar9002_hw_per_calibration(struct ath_hw *ah,
 				}
 
 				currCal->calData->calPostProc(ah, numChains);
-				caldata->CalValid |= currCal->calData->calType;
+				if (caldata)
+					caldata->CalValid |= currCal->calData->calType;
+
 				currCal->calState = CAL_DONE;
 				iscaldone = true;
 			} else {
 				ar9002_hw_setup_calibration(ah, currCal);
 			}
 		}
-	} else if (!(caldata->CalValid & currCal->calData->calType)) {
+	} else if (caldata && !(caldata->CalValid & currCal->calData->calType)) {
 		ath9k_hw_reset_calibration(ah, currCal);
 	}
 
-- 
1.9.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-07  7:22 [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002 David Herrmann
@ 2014-05-07 19:54 ` John W. Linville
  2014-05-07 20:03   ` [ath9k-devel] " Luis R. Rodriguez
  2014-05-07 20:15   ` Felix Fietkau
  2014-05-08 18:18 ` Rajkumar Manoharan
  1 sibling, 2 replies; 14+ messages in thread
From: John W. Linville @ 2014-05-07 19:54 UTC (permalink / raw)
  To: David Herrmann
  Cc: linux-wireless, Luis R. Rodriguez, Jouni Malinen,
	Vasanthakumar Thiagarajan, Senthil Balasubramanian, ath9k-devel,
	Oleksij Rempel

On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
> ah->caldata may be NULL if no channel is selected. Check for that before
> accessing it.
> 
> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> ---
> Hi
> 
> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
> is freed while we run in hw_per_calibration(). However, this patch fixes serious
> kernel panics with wifi-P2P on my machine.
> 
> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
> correct fix would be to synchronously stop any running hw-calibration before
> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
> this small workaround.
> 
> Thanks
> David

Is there any hope for getting a more complete fix from the ath9k guys
in short order?

John

> 
>  drivers/net/wireless/ath/ath9k/ar9002_calib.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/wireless/ath/ath9k/ar9002_calib.c b/drivers/net/wireless/ath/ath9k/ar9002_calib.c
> index cdc7400..4667ab9 100644
> --- a/drivers/net/wireless/ath/ath9k/ar9002_calib.c
> +++ b/drivers/net/wireless/ath/ath9k/ar9002_calib.c
> @@ -99,14 +99,16 @@ static bool ar9002_hw_per_calibration(struct ath_hw *ah,
>  				}
>  
>  				currCal->calData->calPostProc(ah, numChains);
> -				caldata->CalValid |= currCal->calData->calType;
> +				if (caldata)
> +					caldata->CalValid |= currCal->calData->calType;
> +
>  				currCal->calState = CAL_DONE;
>  				iscaldone = true;
>  			} else {
>  				ar9002_hw_setup_calibration(ah, currCal);
>  			}
>  		}
> -	} else if (!(caldata->CalValid & currCal->calData->calType)) {
> +	} else if (caldata && !(caldata->CalValid & currCal->calData->calType)) {
>  		ath9k_hw_reset_calibration(ah, currCal);
>  	}
>  
> -- 
> 1.9.2
> 
> 

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-07 19:54 ` John W. Linville
@ 2014-05-07 20:03   ` Luis R. Rodriguez
  2014-05-07 20:38     ` John W. Linville
  2014-05-07 20:15   ` Felix Fietkau
  1 sibling, 1 reply; 14+ messages in thread
From: Luis R. Rodriguez @ 2014-05-07 20:03 UTC (permalink / raw)
  To: John W. Linville
  Cc: David Herrmann, Vasanthakumar Thiagarajan, ath9k mailing list,
	linux-wireless, Jouni Malinen, Senthil Balasubramanian

On Wed, May 7, 2014 at 12:54 PM, John W. Linville
<linville@tuxdriver.com> wrote:
> Is there any hope for getting a more complete fix from the ath9k guys
> in short order?

Wait, who are the ath9k folks now exactly ?

  Luis

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-07 19:54 ` John W. Linville
  2014-05-07 20:03   ` [ath9k-devel] " Luis R. Rodriguez
@ 2014-05-07 20:15   ` Felix Fietkau
  2014-05-12 17:49     ` John W. Linville
  1 sibling, 1 reply; 14+ messages in thread
From: Felix Fietkau @ 2014-05-07 20:15 UTC (permalink / raw)
  To: John W. Linville, David Herrmann
  Cc: Vasanthakumar Thiagarajan, ath9k-devel, linux-wireless,
	Jouni Malinen, Luis R. Rodriguez, Senthil Balasubramanian

On 2014-05-07 21:54, John W. Linville wrote:
> On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
>> ah->caldata may be NULL if no channel is selected. Check for that before
>> accessing it.
>> 
>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> ---
>> Hi
>> 
>> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
>> is freed while we run in hw_per_calibration(). However, this patch fixes serious
>> kernel panics with wifi-P2P on my machine.
>> 
>> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
>> correct fix would be to synchronously stop any running hw-calibration before
>> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
>> this small workaround.
>> 
>> Thanks
>> David
> 
> Is there any hope for getting a more complete fix from the ath9k guys
> in short order?
This looks easy to fix. I'll send a patch soon.

- Felix


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-07 20:03   ` [ath9k-devel] " Luis R. Rodriguez
@ 2014-05-07 20:38     ` John W. Linville
  0 siblings, 0 replies; 14+ messages in thread
From: John W. Linville @ 2014-05-07 20:38 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: David Herrmann, Vasanthakumar Thiagarajan, ath9k mailing list,
	linux-wireless, Jouni Malinen, Senthil Balasubramanian

On Wed, May 07, 2014 at 01:03:00PM -0700, Luis R. Rodriguez wrote:
> On Wed, May 7, 2014 at 12:54 PM, John W. Linville
> <linville@tuxdriver.com> wrote:
> > Is there any hope for getting a more complete fix from the ath9k guys
> > in short order?
> 
> Wait, who are the ath9k folks now exactly ?

A better question may be, who are the qualcomm folks now exactly? ;-)

Anyway, I hope that the ath9k guys know who they are and will respond.
If there aren't any, then I guess we'll have to start slapping bubble
gum and popsicle sticks onto ath9k...

John
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-07  7:22 [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002 David Herrmann
  2014-05-07 19:54 ` John W. Linville
@ 2014-05-08 18:18 ` Rajkumar Manoharan
  2014-05-08 20:16   ` David Herrmann
  1 sibling, 1 reply; 14+ messages in thread
From: Rajkumar Manoharan @ 2014-05-08 18:18 UTC (permalink / raw)
  To: David Herrmann
  Cc: linux-wireless, Luis R. Rodriguez, Jouni Malinen,
	Vasanthakumar Thiagarajan, Senthil Balasubramanian,
	John W. Linville, ath9k-devel, Oleksij Rempel

On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
> ah->caldata may be NULL if no channel is selected. Check for that before
> accessing it.
> 
> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> ---
> Hi
> 
> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
> is freed while we run in hw_per_calibration(). However, this patch fixes serious
> kernel panics with wifi-P2P on my machine.
> 
> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
> correct fix would be to synchronously stop any running hw-calibration before
> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
> this small workaround.
>
David,

Whenever the DUT is moving to off-channel, ah->caldata is set to NULL in
hw_reset. As you mentioned, before doing hw_reset, the on-going calibration is stopped
synchronously. I using ar9280 for p2p (GO & CLI) validation. Somehow i do not observe
the panics. Is there a easiest way to reproduce the problem. Are you
using wireless-testing tree? Thanks for reporting the problem. Will try
to fix asap.

-Rajkumar

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-08 18:18 ` Rajkumar Manoharan
@ 2014-05-08 20:16   ` David Herrmann
  0 siblings, 0 replies; 14+ messages in thread
From: David Herrmann @ 2014-05-08 20:16 UTC (permalink / raw)
  To: Rajkumar Manoharan
  Cc: linux-wireless, Luis R. Rodriguez, Jouni Malinen,
	Vasanthakumar Thiagarajan, Senthil Balasubramanian,
	John W. Linville, ath9k-devel, Oleksij Rempel

Hi

On Thu, May 8, 2014 at 8:18 PM, Rajkumar Manoharan
<rmanohar@qti.qualcomm.com> wrote:
> On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
>> ah->caldata may be NULL if no channel is selected. Check for that before
>> accessing it.
>>
>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> ---
>> Hi
>>
>> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
>> is freed while we run in hw_per_calibration(). However, this patch fixes serious
>> kernel panics with wifi-P2P on my machine.
>>
>> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
>> correct fix would be to synchronously stop any running hw-calibration before
>> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
>> this small workaround.
>>
> David,
>
> Whenever the DUT is moving to off-channel, ah->caldata is set to NULL in
> hw_reset. As you mentioned, before doing hw_reset, the on-going calibration is stopped
> synchronously. I using ar9280 for p2p (GO & CLI) validation. Somehow i do not observe
> the panics. Is there a easiest way to reproduce the problem. Are you
> using wireless-testing tree? Thanks for reporting the problem. Will try
> to fix asap.

Reproducing it is actually quite easy on my machine. Whenever I start
a P2P-connect from my Android-phone to my linux-host and _immediately_
accept it (via p2p_connect on wpas), I get the kernel-panic. Adding
the NULL-protection fixes this.

However, if I delay accepting the connection (ie, issuing p2p_connect
by hand instead of automatically), I cannot see the bug. Furthermore,
on my slower Intel Core 2 Duo, the bug happens much less likely. On my
ARM machine I never saw this happening. Given that my main machine is
an Intel hsw quad-core, I guess it's a simple race-condition.

I also added a printk() whenever caldata is NULL and noticed that it
fires only during the first 2 or 3 runs. After that, it never happened
again.

The bug happens on all linux kernels I tested (starting with 3.9ish up
to linux-next). However, if I apply my fix, anything after 3.13-stable
fails to transmit DHCP data. I can connect properly but DHCP always
times out. I'm not sure why that happens and I'm still debugging this,
but it's quite likely a separate issue. (if I find some time, I will
bisect this)

I now looked at the ath9k code and I couldn't see any locking around
the hw_reset at all. I don't know whether the wifi-core / nl80211
locks this, but what happens if two hw_resets race each other? Just a
guess.. I will try to look into it tomorrow.

Thanks
David

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-07 20:15   ` Felix Fietkau
@ 2014-05-12 17:49     ` John W. Linville
  2014-05-12 18:43       ` Felix Fietkau
  0 siblings, 1 reply; 14+ messages in thread
From: John W. Linville @ 2014-05-12 17:49 UTC (permalink / raw)
  To: Felix Fietkau
  Cc: David Herrmann, Vasanthakumar Thiagarajan, ath9k-devel,
	linux-wireless, Jouni Malinen, Luis R. Rodriguez,
	Senthil Balasubramanian

On Wed, May 07, 2014 at 10:15:53PM +0200, Felix Fietkau wrote:
> On 2014-05-07 21:54, John W. Linville wrote:
> > On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
> >> ah->caldata may be NULL if no channel is selected. Check for that before
> >> accessing it.
> >> 
> >> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> >> ---
> >> Hi
> >> 
> >> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
> >> is freed while we run in hw_per_calibration(). However, this patch fixes serious
> >> kernel panics with wifi-P2P on my machine.
> >> 
> >> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
> >> correct fix would be to synchronously stop any running hw-calibration before
> >> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
> >> this small workaround.
> >> 
> >> Thanks
> >> David
> > 
> > Is there any hope for getting a more complete fix from the ath9k guys
> > in short order?
> This looks easy to fix. I'll send a patch soon.

Ping?

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-12 17:49     ` John W. Linville
@ 2014-05-12 18:43       ` Felix Fietkau
  2014-05-13  6:50         ` David Herrmann
  0 siblings, 1 reply; 14+ messages in thread
From: Felix Fietkau @ 2014-05-12 18:43 UTC (permalink / raw)
  To: John W. Linville
  Cc: David Herrmann, Vasanthakumar Thiagarajan, ath9k-devel,
	linux-wireless, Jouni Malinen, Luis R. Rodriguez,
	Senthil Balasubramanian

On 2014-05-12 19:49, John W. Linville wrote:
> On Wed, May 07, 2014 at 10:15:53PM +0200, Felix Fietkau wrote:
>> On 2014-05-07 21:54, John W. Linville wrote:
>> > On Wed, May 07, 2014 at 09:22:58AM +0200, David Herrmann wrote:
>> >> ah->caldata may be NULL if no channel is selected. Check for that before
>> >> accessing it.
>> >> 
>> >> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> >> ---
>> >> Hi
>> >> 
>> >> This is _definitely_ only a workaround, given that no-one guarantees ah->caldata
>> >> is freed while we run in hw_per_calibration(). However, this patch fixes serious
>> >> kernel panics with wifi-P2P on my machine.
>> >> 
>> >> I'm not sure why ah->caldata can be NULL, but it definitely is. I think the
>> >> correct fix would be to synchronously stop any running hw-calibration before
>> >> setting ah->caldata to NULL. I don't know whether/where that is done, so I wrote
>> >> this small workaround.
>> >> 
>> >> Thanks
>> >> David
>> > 
>> > Is there any hope for getting a more complete fix from the ath9k guys
>> > in short order?
>> This looks easy to fix. I'll send a patch soon.
> 
> Ping?
I looked into it again, the scenario where I assumed that this problem
could occur didn't turn out to be true. I have no idea how this crash
can occur.

- Felix

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-12 18:43       ` Felix Fietkau
@ 2014-05-13  6:50         ` David Herrmann
  2014-05-13  9:00           ` Rajkumar Manoharan
  0 siblings, 1 reply; 14+ messages in thread
From: David Herrmann @ 2014-05-13  6:50 UTC (permalink / raw)
  To: Felix Fietkau
  Cc: John W. Linville, Vasanthakumar Thiagarajan, ath9k-devel,
	linux-wireless, Jouni Malinen, Luis R. Rodriguez,
	Senthil Balasubramanian

Hi

On Mon, May 12, 2014 at 8:43 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> I looked into it again, the scenario where I assumed that this problem
> could occur didn't turn out to be true. I have no idea how this crash
> can occur.

The only path that can set ah->caldata to NULL is through:

ieee80211_hw_config()
  ath9k_htc_config()
    ath9k_htc_set_channel()
      ath9k_hw_reset()

This happens whenever IEEE80211_CONF_OFFCHANNEL is set.

Now mac80211 is way to big for me to review right now and
ieee80211_hw_config() is used quite often. Given that the described
call-path does no synchronization against ath9k_htc_ani_work(), all
the callers of mac80211_hw_config(OFFCHANNEL) must guarantee that no
ani-work is running. Is that intentional? I cannot see any of those
functions calling into ath9k_htc_stop_ani(). This might of course be
implicit.

One call-path I see is:
ieee80211_scan_cancel()
  cancel_delayed_work()

We cannot use cancel_delayed_work_sync() here due to locking issues.
However, this obviously races against any following
set_channel(OFFCHANNEL) request.

If there's anything I can do to debug this, let me know. I tried
adding some printk()'s into the hot-path and it turns out to no longer
fail then. So this really seems to be a quite small race (given that a
bunch of simple printk()s is slow enough to work around it).

Thanks
David

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-13  6:50         ` David Herrmann
@ 2014-05-13  9:00           ` Rajkumar Manoharan
  2014-05-13  9:09             ` David Herrmann
  0 siblings, 1 reply; 14+ messages in thread
From: Rajkumar Manoharan @ 2014-05-13  9:00 UTC (permalink / raw)
  To: David Herrmann
  Cc: Felix Fietkau, John W. Linville, Vasanthakumar Thiagarajan,
	ath9k-devel, linux-wireless, Jouni Malinen, Luis R. Rodriguez,
	Senthil Balasubramanian

On Tue, May 13, 2014 at 08:50:00AM +0200, David Herrmann wrote:
> Hi
> 
> On Mon, May 12, 2014 at 8:43 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> > I looked into it again, the scenario where I assumed that this problem
> > could occur didn't turn out to be true. I have no idea how this crash
> > can occur.
> 
> The only path that can set ah->caldata to NULL is through:
> 
> ieee80211_hw_config()
>   ath9k_htc_config()
>     ath9k_htc_set_channel()
>       ath9k_hw_reset()
> 
> This happens whenever IEEE80211_CONF_OFFCHANNEL is set.
> 
> Now mac80211 is way to big for me to review right now and
> ieee80211_hw_config() is used quite often. Given that the described
> call-path does no synchronization against ath9k_htc_ani_work(), all
> the callers of mac80211_hw_config(OFFCHANNEL) must guarantee that no
> ani-work is running. Is that intentional? I cannot see any of those
> functions calling into ath9k_htc_stop_ani(). This might of course be
> implicit.
> 
> One call-path I see is:
> ieee80211_scan_cancel()
>   cancel_delayed_work()
> 
> We cannot use cancel_delayed_work_sync() here due to locking issues.
> However, this obviously races against any following
> set_channel(OFFCHANNEL) request.
> 
> If there's anything I can do to debug this, let me know. I tried
> adding some printk()'s into the hot-path and it turns out to no longer
> fail then. So this really seems to be a quite small race (given that a
> bunch of simple printk()s is slow enough to work around it).

David,

Are you using USB devices? What is the PID/VID? I have not looked at
HTC driver perspective.

-Rajkumar

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-13  9:00           ` Rajkumar Manoharan
@ 2014-05-13  9:09             ` David Herrmann
  2014-05-13 10:41               ` Rajkumar Manoharan
  0 siblings, 1 reply; 14+ messages in thread
From: David Herrmann @ 2014-05-13  9:09 UTC (permalink / raw)
  To: Rajkumar Manoharan
  Cc: Felix Fietkau, John W. Linville, Vasanthakumar Thiagarajan,
	ath9k-devel, linux-wireless, Jouni Malinen, Luis R. Rodriguez,
	Senthil Balasubramanian

Hi

On Tue, May 13, 2014 at 11:00 AM, Rajkumar Manoharan
<rmanohar@qti.qualcomm.com> wrote:
> On Tue, May 13, 2014 at 08:50:00AM +0200, David Herrmann wrote:
>> Hi
>>
>> On Mon, May 12, 2014 at 8:43 PM, Felix Fietkau <nbd@openwrt.org> wrote:
>> > I looked into it again, the scenario where I assumed that this problem
>> > could occur didn't turn out to be true. I have no idea how this crash
>> > can occur.
>>
>> The only path that can set ah->caldata to NULL is through:
>>
>> ieee80211_hw_config()
>>   ath9k_htc_config()
>>     ath9k_htc_set_channel()
>>       ath9k_hw_reset()
>>
>> This happens whenever IEEE80211_CONF_OFFCHANNEL is set.
>>
>> Now mac80211 is way to big for me to review right now and
>> ieee80211_hw_config() is used quite often. Given that the described
>> call-path does no synchronization against ath9k_htc_ani_work(), all
>> the callers of mac80211_hw_config(OFFCHANNEL) must guarantee that no
>> ani-work is running. Is that intentional? I cannot see any of those
>> functions calling into ath9k_htc_stop_ani(). This might of course be
>> implicit.
>>
>> One call-path I see is:
>> ieee80211_scan_cancel()
>>   cancel_delayed_work()
>>
>> We cannot use cancel_delayed_work_sync() here due to locking issues.
>> However, this obviously races against any following
>> set_channel(OFFCHANNEL) request.
>>
>> If there's anything I can do to debug this, let me know. I tried
>> adding some printk()'s into the hot-path and it turns out to no longer
>> fail then. So this really seems to be a quite small race (given that a
>> bunch of simple printk()s is slow enough to work around it).
>
> David,
>
> Are you using USB devices? What is the PID/VID? I have not looked at
> HTC driver perspective.

Yes, I'm using the htc driver. I thought that was clear by pointing at
ar9002, but I was wrong, sorry. lsusb information is:
  'Bus 003 Device 056: ID 0411:017f BUFFALO INC. (formerly MelCo.,
Inc.) Sony UWA-BR100 802.11abgn Wireless Adapter [Atheros
AR7010+AR9280]'

Thanks
David

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-13  9:09             ` David Herrmann
@ 2014-05-13 10:41               ` Rajkumar Manoharan
  2014-05-13 18:21                 ` David Herrmann
  0 siblings, 1 reply; 14+ messages in thread
From: Rajkumar Manoharan @ 2014-05-13 10:41 UTC (permalink / raw)
  To: David Herrmann
  Cc: Felix Fietkau, John W. Linville, ath9k-devel, linux-wireless

[-- Attachment #1: Type: text/plain, Size: 2472 bytes --]

On Tue, May 13, 2014 at 11:09:48AM +0200, David Herrmann wrote:
> Hi
> 
> On Tue, May 13, 2014 at 11:00 AM, Rajkumar Manoharan
> <rmanohar@qti.qualcomm.com> wrote:
> > On Tue, May 13, 2014 at 08:50:00AM +0200, David Herrmann wrote:
> >> Hi
> >>
> >> On Mon, May 12, 2014 at 8:43 PM, Felix Fietkau <nbd@openwrt.org> wrote:
> >> > I looked into it again, the scenario where I assumed that this problem
> >> > could occur didn't turn out to be true. I have no idea how this crash
> >> > can occur.
> >>
> >> The only path that can set ah->caldata to NULL is through:
> >>
> >> ieee80211_hw_config()
> >>   ath9k_htc_config()
> >>     ath9k_htc_set_channel()
> >>       ath9k_hw_reset()
> >>
> >> This happens whenever IEEE80211_CONF_OFFCHANNEL is set.
> >>
> >> Now mac80211 is way to big for me to review right now and
> >> ieee80211_hw_config() is used quite often. Given that the described
> >> call-path does no synchronization against ath9k_htc_ani_work(), all
> >> the callers of mac80211_hw_config(OFFCHANNEL) must guarantee that no
> >> ani-work is running. Is that intentional? I cannot see any of those
> >> functions calling into ath9k_htc_stop_ani(). This might of course be
> >> implicit.
> >>
> >> One call-path I see is:
> >> ieee80211_scan_cancel()
> >>   cancel_delayed_work()
> >>
> >> We cannot use cancel_delayed_work_sync() here due to locking issues.
> >> However, this obviously races against any following
> >> set_channel(OFFCHANNEL) request.
> >>
> >> If there's anything I can do to debug this, let me know. I tried
> >> adding some printk()'s into the hot-path and it turns out to no longer
> >> fail then. So this really seems to be a quite small race (given that a
> >> bunch of simple printk()s is slow enough to work around it).
> >
> > David,
> >
> > Are you using USB devices? What is the PID/VID? I have not looked at
> > HTC driver perspective.
> 
> Yes, I'm using the htc driver. I thought that was clear by pointing at
> ar9002, but I was wrong, sorry. lsusb information is:
>   'Bus 003 Device 056: ID 0411:017f BUFFALO INC. (formerly MelCo.,
> Inc.) Sony UWA-BR100 802.11abgn Wireless Adapter [Atheros
> AR7010+AR9280]'
>
Unlike ath9k driver, the ani work is stopped in sw_scan_start callback
for htc driver. So during scan process, ani work is stopped well before
calling set_channel. But in case of p2p_listen operation, set_channel can be
called by sw_roc without stopping ani.

Can you please test with attached change?

-Rajkumar

[-- Attachment #2: ani.patch --]
[-- Type: text/x-diff, Size: 830 bytes --]

diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_main.c b/drivers/net/wireless/ath/ath9k/htc_drv_main.c
index f46cd02..5627917 100644
--- a/drivers/net/wireless/ath/ath9k/htc_drv_main.c
+++ b/drivers/net/wireless/ath/ath9k/htc_drv_main.c
@@ -95,8 +95,10 @@ static void ath9k_htc_vif_iter(void *data, u8 *mac, struct ieee80211_vif *vif)
 
 	if ((vif->type == NL80211_IFTYPE_AP ||
 	     vif->type == NL80211_IFTYPE_MESH_POINT) &&
-	    bss_conf->enable_beacon)
+	    bss_conf->enable_beacon) {
 		priv->reconfig_beacon = true;
+		priv->rearm_ani = true;
+	}
 
 	if (bss_conf->assoc) {
 		priv->rearm_ani = true;
@@ -257,6 +259,7 @@ static int ath9k_htc_set_channel(struct ath9k_htc_priv *priv,
 
 	ath9k_htc_ps_wakeup(priv);
 
+	ath9k_htc_stop_ani(priv);
 	del_timer_sync(&priv->tx.cleanup_timer);
 	ath9k_htc_tx_drain(priv);
 

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [ath9k-devel] [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002
  2014-05-13 10:41               ` Rajkumar Manoharan
@ 2014-05-13 18:21                 ` David Herrmann
  0 siblings, 0 replies; 14+ messages in thread
From: David Herrmann @ 2014-05-13 18:21 UTC (permalink / raw)
  To: Rajkumar Manoharan
  Cc: Felix Fietkau, John W. Linville, ath9k-devel, linux-wireless

Hi

On Tue, May 13, 2014 at 12:41 PM, Rajkumar Manoharan
<rmanohar@qti.qualcomm.com> wrote:
> On Tue, May 13, 2014 at 11:09:48AM +0200, David Herrmann wrote:
> Unlike ath9k driver, the ani work is stopped in sw_scan_start callback
> for htc driver. So during scan process, ani work is stopped well before
> calling set_channel. But in case of p2p_listen operation, set_channel can be
> called by sw_roc without stopping ani.
>
> Can you please test with attached change?

I cannot reproduce it anymore and my traces show that the
recalibration is called a _lot_ less often. In fact, during 10
successful p2p-connect attempts the per_calib helper is only called
once.

I will keep looking, but looks good so far:
Tested-by: David Herrmann <dh.herrmann@gmail.com>

Thanks
David

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-05-13 18:21 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-07  7:22 [PATCH] ath9k: fix NULL-deref in hw_per_calibration() for ar9002 David Herrmann
2014-05-07 19:54 ` John W. Linville
2014-05-07 20:03   ` [ath9k-devel] " Luis R. Rodriguez
2014-05-07 20:38     ` John W. Linville
2014-05-07 20:15   ` Felix Fietkau
2014-05-12 17:49     ` John W. Linville
2014-05-12 18:43       ` Felix Fietkau
2014-05-13  6:50         ` David Herrmann
2014-05-13  9:00           ` Rajkumar Manoharan
2014-05-13  9:09             ` David Herrmann
2014-05-13 10:41               ` Rajkumar Manoharan
2014-05-13 18:21                 ` David Herrmann
2014-05-08 18:18 ` Rajkumar Manoharan
2014-05-08 20:16   ` David Herrmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).