* ath9k: race conditions in dma
From: Björn Smedman @ 2010-11-01 15:17 UTC
  To: linux-wireless, ath9k-devel

Hi all,

I have an application that creates and destroys a lot of AP vifs and
does a lot of monitor frame injection. The recent ath9k RX locking
fixes have helped with stability in this use case, but there still
seem to be one or more TX/beacon-related race conditions. These
manifest themselves as follows on an AR913x-based router running
compat-wireless-2010-10-19 (with locking fixes etc. from OpenWrt):

1. TX DMA hangs under simultaneous high RX and TX load

This can happen within minutes but only seems to happen if there is
high load on both RX and TX. These hangs take several seconds to fully
recover from and seem more severe than the usual ones we used to see
before the pcu locking fixes. Log output looks like this:

Jan  1 00:08:47 user.debug kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
Jan  1 00:08:47 user.debug kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
Jan  1 00:08:47 user.debug kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
Jan  1 00:08:47 user.debug kernel: ath: Failed to stop TX DMA in 100 msec after killing last frame
Jan  1 00:08:47 user.debug kernel: ath: Failed to stop TX DMA. Resetting hardware!
Jan  1 00:08:47 user.debug kernel: ath: DMA failed to stop in 10 ms AR_CR=0x00000024 AR_DIAG_SW=0x42000020
Jan  1 00:08:47 user.debug kernel: ath: ah->misc_mode 0xc
Jan  1 00:08:47 user.debug kernel: ath: Setting CFG 0x10a
Jan  1 00:08:47 user.debug kernel: ath: ah->misc_mode 0xc
Jan  1 00:08:47 user.debug kernel: ath: Setting CFG 0x10a

Also note that in my use case more processing is done in
ieee80211_rx() and ieee80211_tx_status() than is perhaps normal.
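
To compare how often these hangs occur across test runs, the log can
be tallied with a small script. The following is only a minimal
sketch in Python, assuming syslog lines in the format shown above.
The names summarize_dma_hangs and EVENT_PATTERNS are made up for this
example and are not part of ath9k or any existing tool. Messages are
matched by their leading text only, so it should not matter whether
the mail client wrapped the log lines.

```python
import re
from collections import Counter

# Leading text of the three messages seen in the log excerpt above.
# The event names on the left are hypothetical labels for this sketch.
EVENT_PATTERNS = {
    "tx_stop_timeout": re.compile(r"ath: Failed to stop TX DMA in \d+"),
    "hw_reset": re.compile(r"ath: Failed to stop TX DMA\."),
    "dma_stop_timeout": re.compile(r"ath: DMA failed to stop in \d+"),
}


def summarize_dma_hangs(lines):
    """Count how many log lines match each known DMA-hang message."""
    counts = Counter()
    for line in lines:
        for name, pattern in EVENT_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Example use (the path is an assumption; adjust to where syslog is stored):
#   with open("/var/log/messages") as f:
#       print(summarize_dma_hangs(f))
```

A rising hw_reset count relative to tx_stop_timeout would indicate
that more of the hangs end in a full chip reset.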

2. TX is completely hung but chip is never reset

Another, similar failure mode under the same conditions as above (when
TX and RX load is high) is that the TX pipeline is somehow hung
(nothing comes out on the radio) but there is no log output to suggest
that anything is seriously wrong. My guess here is that the TX queue
might be stopped, but I have not been able to verify that.

3. Interrupts completely stop coming

The last failure mode happens when the driver is not under RX/TX load
but is instead left running for a long period of time (about 12 hours
is enough in most cases, but 48 hours basically always does the trick).
The system is fine but it seems ath9k is not receiving any interrupts
("cat /sys/kernel/debug/ath9k/phy0/interrupts" produces the same
result over and over again). If full debug is enabled ("echo 0xffff >
/sys/kernel/debug/ath9k/phy0/debug") only shortcal and longcal related
prints appear (driven by a timer). Bringing the interface down and
then up again with ifconfig does not bring it back to life, but
restarting hostapd does.
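
Since this failure takes hours to appear, it may help to detect it
automatically rather than re-running cat by hand. Below is a minimal
sketch in Python, assuming that the debugfs file mentioned above keeps
changing while the driver is healthy and that identical content over
several reads means interrupts have stopped. The function names, the
sample count and the interval are all assumptions chosen for
illustration. The file content is compared as an opaque string, so
nothing is assumed about its layout.

```python
import time


def read_ath9k_interrupts(path="/sys/kernel/debug/ath9k/phy0/interrupts"):
    """Return the raw content of the interrupt counter file as a string."""
    with open(path) as f:
        return f.read()


def counters_stalled(read_counters, samples=3, interval=10.0, sleep=time.sleep):
    """Return True if `samples` consecutive reads return identical content.

    `read_counters` is any zero-argument callable returning the current
    counters; `sleep` is injectable so the check can be tested without waiting.
    """
    first = read_counters()
    for _ in range(samples - 1):
        sleep(interval)
        if read_counters() != first:
            return False  # something changed, so interrupts still arrive
    return True


# Example use on the router (requires debugfs to be mounted):
#   if counters_stalled(read_ath9k_interrupts):
#       print("ath9k interrupt counters have not changed")
```

Such a check could be run periodically to timestamp the moment the
interrupts stop, which might narrow down what happened just before.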

Help in tracking these down would be much appreciated. I will follow
up below with some thoughts on contributing factors.

/Björn

