* [ath9k-devel] ath9k: race conditions in dma
@ 2010-11-01 15:43 ` Ben Gamari
0 siblings, 0 replies; 24+ messages in thread
From: Ben Gamari @ 2010-11-01 15:43 UTC (permalink / raw)
To: ath9k-devel
On Mon, 1 Nov 2010 16:17:23 +0100, Björn Smedman <bjorn.smedman@venatech.se> wrote:
> Hi all,
>
> I have an application that creates and destroys a lot of ap vifs and
> does a lot of monitor frame injection. The recent ath9k rx locking
> fixes have helped with stability in this use-case but there still
> seems to be some tx/beacon related race condition(s). These manifests
> themselves as follows on an AR913x based router running
> compat-wireless-2010-10-19 (with locking fixes etc from openwrt):
>
> 1. TX DMA hangs under simultaneous high RX and TX load
> 2. TX is completely hung but chip is never reset
I have also observed both of these behaviors with just a standard
hostapd single VIF configuration. Quite annoying. It seems to be better
with recent wireless-testing trees.
- Ben
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-01 15:43 ` Ben Gamari
@ 2010-11-01 15:50 ` Björn Smedman
-1 siblings, 0 replies; 24+ messages in thread
From: Björn Smedman @ 2010-11-01 15:50 UTC (permalink / raw)
To: Ben Gamari; +Cc: linux-wireless, ath9k-devel
On Mon, Nov 1, 2010 at 4:43 PM, Ben Gamari <bgamari@gmail.com> wrote:
> On Mon, 1 Nov 2010 16:17:23 +0100, Björn Smedman <bjorn.smedman@venatech.se> wrote:
>> Hi all,
>>
>> I have an application that creates and destroys a lot of ap vifs and
>> does a lot of monitor frame injection. The recent ath9k rx locking
>> fixes have helped with stability in this use-case but there still
>> seems to be some tx/beacon related race condition(s). These manifests
>> themselves as follows on an AR913x based router running
>> compat-wireless-2010-10-19 (with locking fixes etc from openwrt):
>>
>> 1. TX DMA hangs under simultaneous high RX and TX load
>> 2. TX is completely hung but chip is never reset
>
> I have also observed both of these behaviors with just a standard
> hostapd single VIF configuration. Quite annoying. It seems to be better
> with recent wireless-testing trees.
>
> - Ben
Thanx Ben, it's a relief to know I'm not the only one suffering from this.
Unfortunately I can't run wireless-testing (built-in system with
out-of-tree arch). Could this be fixed in later compat-wireless
snapshots? Can you recommend a specific snapshot?
/Björn
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-01 15:50 ` Björn Smedman
@ 2010-11-01 23:12 ` Peter Stuge
-1 siblings, 0 replies; 24+ messages in thread
From: Peter Stuge @ 2010-11-01 23:12 UTC (permalink / raw)
To: Björn Smedman; +Cc: Ben Gamari, ath9k-devel, linux-wireless
Björn Smedman wrote:
> >> 1. TX DMA hangs under simultaneous high RX and TX load
> >> 2. TX is completely hung but chip is never reset
> >
> > I have also observed both of these behaviors with just a standard
> > hostapd single VIF configuration.
>
> Thanx Ben, it's a relief to know I'm not the only one suffering
> from this.
Just a note to confirm that I have also seen many different failures
related to this. The lasting impression is that it's a big mess.
I bought my first ath9k hardware roughly a year ago. That was totally
useless as STA up until kernels a few months ago. I am now using an
AR9280 card and for the very first time ath9k hardware and driver is
actually working at all for me, but there are still issues as I noted
in the other email.
Unfortunately they're the kind of issues which can't be debugged much
lacking strong knowledge of device internals.
//Peter
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-01 15:43 ` Ben Gamari
@ 2010-11-01 16:20 ` Björn Smedman
-1 siblings, 0 replies; 24+ messages in thread
From: Björn Smedman @ 2010-11-01 16:20 UTC (permalink / raw)
To: Ben Gamari; +Cc: linux-wireless, ath9k-devel
On Mon, Nov 1, 2010 at 4:43 PM, Ben Gamari <bgamari@gmail.com> wrote:
> On Mon, 1 Nov 2010 16:17:23 +0100, Björn Smedman <bjorn.smedman@venatech.se> wrote:
>> Hi all,
>>
>> I have an application that creates and destroys a lot of ap vifs and
>> does a lot of monitor frame injection. The recent ath9k rx locking
>> fixes have helped with stability in this use-case but there still
>> seems to be some tx/beacon related race condition(s). These manifests
>> themselves as follows on an AR913x based router running
>> compat-wireless-2010-10-19 (with locking fixes etc from openwrt):
>>
>> 1. TX DMA hangs under simultaneous high RX and TX load
>> 2. TX is completely hung but chip is never reset
>
> I have also observed both of these behaviors with just a standard
> hostapd single VIF configuration. Quite annoying. It seems to be better
> with recent wireless-testing trees.
>
> - Ben
Looking at the code here is the first passage that triggers a bad
fuzzy feeling for me (beacon.c):
	skb = ieee80211_get_buffered_bc(hw, vif);

	/*
	 * if the CABQ traffic from previous DTIM is pending and the current
	 * beacon is also a DTIM.
	 *  1) if there is only one vif let the cab traffic continue.
	 *  2) if there are more than one vif and we are using staggered
	 *     beacons, then drain the cabq by dropping all the frames in
	 *     the cabq so that the current vifs cab traffic can be scheduled.
	 */
	spin_lock_bh(&cabq->axq_lock);
	cabq_depth = cabq->axq_depth;
	spin_unlock_bh(&cabq->axq_lock);

	if (skb && cabq_depth) {
		if (sc->nvifs > 1) {
			ath_print(common, ATH_DBG_BEACON,
				  "Flushing previous cabq traffic\n");
			ath_draintxq(sc, cabq, false);
		}
	}

	ath_beacon_setup(sc, avp, bf, info->control.rates[0].idx);

	while (skb) {
		ath_tx_cabq(hw, skb);
		skb = ieee80211_get_buffered_bc(hw, vif);
	}
From what I can tell there is no guarantee that CABQ TX DMA is stopped
when ath_draintxq() is called. From ath_draintxq() point of view that
looks like a bad idea (race between CPU and DMA).
Also, that locking around "cabq_depth = cabq->axq_depth;" looks very
peculiar. I believe it's correct (because nobody else puts anything
into this queue and we don't care if it's shorter later on when we
drain it) but I think it would be nice with a comment.
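
The ordering concern here (draining a queue while the hardware may still be
walking it) can be illustrated with a small userspace model. This is only a
sketch under stated assumptions: struct model_txq and the model_* functions are
hypothetical names invented for illustration and are not ath9k code. The model
simply refuses to drain until DMA has been marked stopped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a hardware TX queue; not the ath9k data structure. */
struct model_txq {
	int depth;        /* frames currently queued */
	bool dma_running; /* true while the "hardware" may still touch the queue */
};

void model_txq_init(struct model_txq *q, int depth)
{
	assert(q != NULL);
	q->depth = depth;
	q->dma_running = depth > 0;
}

/* Stand-in for whatever stops TX DMA on the queue. */
void model_stop_dma(struct model_txq *q)
{
	q->dma_running = false;
}

/*
 * Drop every queued frame, but only once DMA is known to be stopped.
 * Returns the number of frames dropped, or -1 if DMA is still running.
 */
int model_drain(struct model_txq *q)
{
	int dropped;

	if (q->dma_running)
		return -1; /* CPU and DMA would race on the descriptors */

	dropped = q->depth;
	q->depth = 0;
	return dropped;
}
```

In this model, calling model_drain() on a queue initialised with three frames
fails until model_stop_dma() has been called, after which it reports three
dropped frames.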
Any thoughts? I can whip up and test a patch if there are no objections.
/Björn
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-01 15:43 ` Ben Gamari
@ 2010-11-01 16:39 ` Björn Smedman
-1 siblings, 0 replies; 24+ messages in thread
From: Björn Smedman @ 2010-11-01 16:39 UTC (permalink / raw)
To: Ben Gamari; +Cc: linux-wireless, ath9k-devel
On Mon, Nov 1, 2010 at 4:43 PM, Ben Gamari <bgamari@gmail.com> wrote:
> On Mon, 1 Nov 2010 16:17:23 +0100, Björn Smedman <bjorn.smedman@venatech.se> wrote:
>> Hi all,
>>
>> I have an application that creates and destroys a lot of ap vifs and
>> does a lot of monitor frame injection. The recent ath9k rx locking
>> fixes have helped with stability in this use-case but there still
>> seems to be some tx/beacon related race condition(s). These manifests
>> themselves as follows on an AR913x based router running
>> compat-wireless-2010-10-19 (with locking fixes etc from openwrt):
>>
>> 1. TX DMA hangs under simultaneous high RX and TX load
>> 2. TX is completely hung but chip is never reset
>
> I have also observed both of these behaviors with just a standard
> hostapd single VIF configuration. Quite annoying. It seems to be better
> with recent wireless-testing trees.
>
> - Ben
The next thing that looks racy to me is ath_beacon_alloc() vs
ath_beacon_tasklet() in beacon.c. Beacon queue TX DMA is always
stopped in main.c before calling ath_beacon_alloc() but
ath_beacon_tasklet() is scheduled when we get an SWBA interrupt. My
guess is that these keep coming even if we stop TX DMA on the beacon
queue, no?
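
One way to picture the suspected race: if the interrupt-driven path keeps
running while the beacon is being replaced, both sides need to agree on who owns
the buffer. The following is a userspace sketch only, using a C11 atomic_flag as
a try-lock; struct model_beacon_slot and the model_* functions are hypothetical
and do not correspond to anything in beacon.c or main.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-vif beacon buffer. */
struct model_beacon_slot {
	atomic_flag busy;  /* set while one side owns the slot */
	const char *frame; /* stands in for the beacon frame */
};

void model_slot_init(struct model_beacon_slot *s)
{
	assert(s != NULL);
	atomic_flag_clear(&s->busy);
	s->frame = NULL;
}

/* Update path: take ownership before touching the frame. */
bool model_slot_begin_update(struct model_beacon_slot *s)
{
	return !atomic_flag_test_and_set(&s->busy);
}

/* Update path: publish the new frame and release ownership. */
void model_slot_end_update(struct model_beacon_slot *s, const char *frame)
{
	s->frame = frame;
	atomic_flag_clear(&s->busy);
}

/*
 * Interrupt-driven path: returns the frame it would transmit, or NULL if
 * the slot is being updated (or is still empty) and this beacon is skipped.
 */
const char *model_swba_tasklet(struct model_beacon_slot *s)
{
	const char *frame;

	if (atomic_flag_test_and_set(&s->busy))
		return NULL;

	frame = s->frame;
	atomic_flag_clear(&s->busy);
	return frame;
}
```

While an update is in progress the modelled tasklet skips its turn instead of
reading a half-replaced frame. Whether skipping a beacon is acceptable in the
real driver is a separate question this sketch does not answer.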
/Björn
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-01 16:39 ` Björn Smedman
@ 2010-11-01 16:44 ` Luis R. Rodriguez
-1 siblings, 0 replies; 24+ messages in thread
From: Luis R. Rodriguez @ 2010-11-01 16:44 UTC (permalink / raw)
To: Björn Smedman; +Cc: Ben Gamari, ath9k-devel, linux-wireless
2010/11/1 Björn Smedman <bjorn.smedman@venatech.se>:
> On Mon, Nov 1, 2010 at 4:43 PM, Ben Gamari <bgamari@gmail.com> wrote:
>> On Mon, 1 Nov 2010 16:17:23 +0100, Björn Smedman <bjorn.smedman@venatech.se> wrote:
>>> Hi all,
>>>
>>> I have an application that creates and destroys a lot of ap vifs and
>>> does a lot of monitor frame injection. The recent ath9k rx locking
>>> fixes have helped with stability in this use-case but there still
>>> seems to be some tx/beacon related race condition(s). These manifests
>>> themselves as follows on an AR913x based router running
>>> compat-wireless-2010-10-19 (with locking fixes etc from openwrt):
>>>
>>> 1. TX DMA hangs under simultaneous high RX and TX load
>>> 2. TX is completely hung but chip is never reset
>>
>> I have also observed both of these behaviors with just a standard
>> hostapd single VIF configuration. Quite annoying. It seems to be better
>> with recent wireless-testing trees.
>>
>> - Ben
>
> The next thing that looks racy to me is ath_beacon_alloc() vs
> ath_beacon_tasklet() in beacon.c. Beacon queue TX DMA is always
> stopped in main.c before calling ath_beacon_alloc() but
> ath_beacon_tasklet() is scheduled when we get an SWBA interrupt. My
> guess is that these keep coming even if we stop TX DMA on the beacon
> queue, no?
My TX PCU patches for ath9k are not merged yet, try those or wait
until John merges them.
Luis
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-01 16:44 ` Luis R. Rodriguez
@ 2010-11-01 16:52 ` Felix Fietkau
-1 siblings, 0 replies; 24+ messages in thread
From: Felix Fietkau @ 2010-11-01 16:52 UTC (permalink / raw)
To: Luis R. Rodriguez, Björn Smedman
Cc: Ben Gamari, ath9k-devel, linux-wireless
On 2010-11-01 5:44 PM, Luis R. Rodriguez wrote:
> 2010/11/1 Björn Smedman <bjorn.smedman@venatech.se>:
>> On Mon, Nov 1, 2010 at 4:43 PM, Ben Gamari <bgamari@gmail.com> wrote:
>>> On Mon, 1 Nov 2010 16:17:23 +0100, Björn Smedman <bjorn.smedman@venatech.se> wrote:
>>>> Hi all,
>>>>
>>>> I have an application that creates and destroys a lot of ap vifs and
>>>> does a lot of monitor frame injection. The recent ath9k rx locking
>>>> fixes have helped with stability in this use-case but there still
>>>> seems to be some tx/beacon related race condition(s). These manifests
>>>> themselves as follows on an AR913x based router running
>>>> compat-wireless-2010-10-19 (with locking fixes etc from openwrt):
>>>>
>>>> 1. TX DMA hangs under simultaneous high RX and TX load
>>>> 2. TX is completely hung but chip is never reset
>>>
>>> I have also observed both of these behaviors with just a standard
>>> hostapd single VIF configuration. Quite annoying. It seems to be better
>>> with recent wireless-testing trees.
>>>
>>> - Ben
>>
>> The next thing that looks racy to me is ath_beacon_alloc() vs
>> ath_beacon_tasklet() in beacon.c. Beacon queue TX DMA is always
>> stopped in main.c before calling ath_beacon_alloc() but
>> ath_beacon_tasklet() is scheduled when we get an SWBA interrupt. My
>> guess is that these keep coming even if we stop TX DMA on the beacon
>> queue, no?
>
> My TX PCU patches for ath9k are not merged yet, try those or wait
> until John merges them.
They are merged in OpenWrt. Björn, which OpenWrt revision did you use in
your tests?
- Felix
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-01 16:52 ` Felix Fietkau
@ 2010-11-01 17:12 ` Björn Smedman
-1 siblings, 0 replies; 24+ messages in thread
From: Björn Smedman @ 2010-11-01 17:12 UTC (permalink / raw)
To: Felix Fietkau; +Cc: Luis R. Rodriguez, Ben Gamari, ath9k-devel, linux-wireless
2010/11/1 Felix Fietkau <nbd@openwrt.org>
> > My TX PCU patches for ath9k are not merged yet, try those or wait
> > until John merges them.
> They are merged in OpenWrt. Björn, which OpenWrt revision did you use in
> your tests?
>
> - Felix
I'm based on openwrt/trunk@23720 when I run code. But when I read code
I'm looking at the latest wireless-testing (and trying to keep track
of pending patches on linux-wireless). I will apply the TX PCU patch
and see if that changes my bad fuzzy feeling.
/Björn
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-01 15:43 ` Ben Gamari
@ 2010-11-02 16:55 ` Björn Smedman
-1 siblings, 0 replies; 24+ messages in thread
From: Björn Smedman @ 2010-11-02 16:55 UTC (permalink / raw)
To: Ben Gamari; +Cc: linux-wireless, ath9k-devel
On Mon, Nov 1, 2010 at 4:43 PM, Ben Gamari <bgamari@gmail.com> wrote:
> On Mon, 1 Nov 2010 16:17:23 +0100, Björn Smedman <bjorn.smedman@venatech.se> wrote:
>> Hi all,
>>
>> I have an application that creates and destroys a lot of ap vifs and
>> does a lot of monitor frame injection. The recent ath9k rx locking
>> fixes have helped with stability in this use-case but there still
>> seems to be some tx/beacon related race condition(s). These manifests
>> themselves as follows on an AR913x based router running
>> compat-wireless-2010-10-19 (with locking fixes etc from openwrt):
>>
>> 1. TX DMA hangs under simultaneous high RX and TX load
>> 2. TX is completely hung but chip is never reset
>
> I have also observed both of these behaviors with just a standard
> hostapd single VIF configuration. Quite annoying. It seems to be better
> with recent wireless-testing trees.
>
> - Ben
I just posted "[RFC] ath9k: fix tx queue selection" with a patch that
fixes (or at least reduces) these two for me. I'm not sure it is the
whole story but at least in theory 1 could be caused by locking one tx
queue and actually transmitting on another. 2 is probably caused by
stopping one mac80211 queue and then starting another.
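
The "locking one tx queue and actually transmitting on another" theory can be
sketched in a few lines. This is a hypothetical userspace model, not the RFC
patch: the model_* names and the four-queue layout are assumptions made for
illustration. The point is only that the queue is resolved once and the same
pointer is used for both locking and enqueueing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NUM_QUEUES 4 /* assumed queue count, for illustration only */

struct model_txq {
	bool locked;
	int depth;
};

struct model_dev {
	struct model_txq q[MODEL_NUM_QUEUES];
};

/* Single place that maps a traffic class to a queue. */
struct model_txq *model_select_txq(struct model_dev *dev, unsigned int ac)
{
	assert(dev != NULL);
	return &dev->q[ac % MODEL_NUM_QUEUES];
}

/* Enqueueing is only legal on a queue the caller has locked. */
bool model_enqueue(struct model_txq *txq)
{
	if (!txq->locked)
		return false;
	txq->depth++;
	return true;
}

/* Returns the index of the queue used, or -1 if the enqueue was refused. */
int model_tx(struct model_dev *dev, unsigned int ac)
{
	struct model_txq *txq = model_select_txq(dev, ac); /* resolve once */
	bool ok;

	txq->locked = true;
	ok = model_enqueue(txq); /* same pointer that was locked */
	txq->locked = false;

	return ok ? (int)(txq - dev->q) : -1;
}
```

If the lock were taken on one queue and model_enqueue() called on another, the
enqueue would be refused in this model; in a real driver nothing refuses it, and
the unlocked queue would be modified concurrently.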
Ben, if you can easily trigger these problems on wireless-testing,
could you test with my patch and see if it helps? I'm especially
interested to see if it really fixes problem 1.
/Björn
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-02 16:55 ` Björn Smedman
@ 2010-11-03 16:41 ` Björn Smedman
-1 siblings, 0 replies; 24+ messages in thread
From: Björn Smedman @ 2010-11-03 16:41 UTC (permalink / raw)
To: linux-wireless, ath9k-devel
2010/11/2 Björn Smedman <bjorn.smedman@venatech.se>:
> On Mon, Nov 1, 2010 at 4:43 PM, Ben Gamari <bgamari@gmail.com> wrote:
>> On Mon, 1 Nov 2010 16:17:23 +0100, Björn Smedman <bjorn.smedman@venatech.se> wrote:
>>> Hi all,
>>>
>>> I have an application that creates and destroys a lot of ap vifs and
>>> does a lot of monitor frame injection. The recent ath9k rx locking
>>> fixes have helped with stability in this use-case but there still
>>> seems to be some tx/beacon related race condition(s). These manifests
>>> themselves as follows on an AR913x based router running
>>> compat-wireless-2010-10-19 (with locking fixes etc from openwrt):
>>>
>>> 1. TX DMA hangs under simultaneous high RX and TX load
>>> 2. TX is completely hung but chip is never reset
>>
>> I have also observed both of these behaviors with just a standard
>> hostapd single VIF configuration. Quite annoying. It seems to be better
>> with recent wireless-testing trees.
>>
>> - Ben
>
> I just posted "[RFC] ath9k: fix tx queue selection" with a patch that
> fixes (or at least reduces) these two for me. I'm not sure it is the
> whole story but at least in theory 1 could be caused by locking one tx
> queue and actually transmitting on another. 2 is probably caused by
> stopping one mac80211 queue and then starting another.
Problem 1 is still there. After 5-15 hours of varying rx/tx frame
injection load something like this happens and the chip goes
deaf/mute:
Jan 1 00:18:33 user.debug kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x40000020
Jan 1 00:18:33 user.debug kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
Jan 1 00:18:33 user.debug kernel: ath: ah->misc_mode 0xc
Jan 1 00:18:33 user.debug kernel: ath: Setting CFG 0x10a
Jan 1 00:18:43 user.debug kernel: ath: Timeout while waiting for nf to load: AR_PHY_AGC_CONTROL=0x40d22
Jan 1 00:18:44 user.debug kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x40000020
Jan 1 00:18:44 user.debug kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
Jan 1 00:18:44 user.debug kernel: ath: ah->misc_mode 0xc
Jan 1 00:18:44 user.debug kernel: ath: Setting CFG 0x10a
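
For anyone comparing these register dumps by eye, a small helper that lists the
set bits may be handy. This is a generic sketch: set_bits() and print_bits() are
hypothetical helper names, and no meaning is assigned to individual bits here;
mapping bit positions to names requires the driver's register definitions.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Store the positions of the set bits of v in out[], lowest first.
 * Returns how many bits are set.
 */
int set_bits(uint32_t v, int out[32])
{
	int n = 0;

	assert(out != NULL);
	for (int bit = 0; bit < 32; bit++) {
		if (v & (UINT32_C(1) << bit))
			out[n++] = bit;
	}
	return n;
}

/* Print a register value followed by the positions of its set bits. */
void print_bits(const char *name, uint32_t v)
{
	int bits[32];
	int n = set_bits(v, bits);

	printf("%s=0x%08" PRIx32 " bits:", name, v);
	for (int i = 0; i < n; i++)
		printf(" %d", bits[i]);
	printf("\n");
}
```

Applied to the values in the log, AR_CR=0x00000024 has bits 2 and 5 set, and the
two AR_DIAG_SW readings (0x40000020 and 0x42000020) differ only in bit 25.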
Problem 2 seems gone though.
/Björn
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [ath9k-devel] ath9k: race conditions in dma
2010-11-02 16:55 ` Björn Smedman
@ 2010-11-03 17:47 ` Ben Gamari
-1 siblings, 0 replies; 24+ messages in thread
From: Ben Gamari @ 2010-11-03 17:47 UTC (permalink / raw)
To: Björn Smedman; +Cc: ath9k-devel, linux-wireless
On Tue, 2 Nov 2010 17:55:22 +0100, Björn Smedman <bjorn.smedman@venatech.se> wrote:
> Ben, if you can easily trigger these problems on wireless-testing,
> could you test with my patch and see if it helps? I'm especially
> interested to see if it really fixes problem 1.
>
The only time I've been able to reproduce the issue with
wireless-testing is when using my work laptop. I'll bring it home
tonight and see if your patch makes any difference. Thanks,
- Ben
^ permalink raw reply [flat|nested] 24+ messages in thread