[ath9k-devel] ath9k queue hang

* [ath9k-devel] ath9k queue hang
@ 2014-04-15  2:16 Dave Taht
  2014-04-15  3:49 ` Ben Greear
  0 siblings, 1 reply; 2+ messages in thread
From: Dave Taht @ 2014-04-15  2:16 UTC
  To: ath9k-devel

For several months now we have been trying to reproduce a bug in which wifi
connections hang in strange ways after large amounts of data are transferred.

The symptoms varied: anything from multicast failing, to background or best
effort traffic failing, to local access working while remote access did
not.

Last week, we finally caught it with enough debugging enabled to see
something that matches those symptoms: one of the wifi queues hangs and
leaves the overlying qdisc full of packets that never drain.

Nothing short of a reboot clears it.

More details are at:

http://www.bufferbloat.net/issues/442#note-11

You can easily tell whether you are in that state by running:

cat /sys/kernel/debug/ieee80211/phy*/ath9k/queues

(VO): qnum: 0 qdepth: 0 ampdu-depth: 0 pending: 0 stopped: 0
(VI): qnum: 1 qdepth: 0 ampdu-depth: 0 pending: 0 stopped: 0
(BE): qnum: 2 qdepth: 0 ampdu-depth: 0 pending: 0 stopped: 0
(BK): qnum: 3 qdepth: 0 ampdu-depth: 0 pending: 151 stopped: 1
(CAB): qnum: 8 qdepth: 0 ampdu-depth: 0 pending: 0 stopped: 0

It's OK to have pending frames, and even for a queue to be stopped. It's not
OK for the pending count to stay stuck or keep increasing.
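
For anyone who wants to automate that check, here is a minimal sketch in
Python. It assumes the queues file keeps the line format shown above, which
may differ between driver versions. The names parse_queues and stuck_queues
are made up for this example and are not part of ath9k or any tool. It only
compares two samples, so a busy but healthy queue can be flagged too; sample
several times before trusting it.

```python
import re

# Matches lines such as:
# (BK): qnum: 3 qdepth: 0 ampdu-depth: 0 pending: 151 stopped: 1
QUEUE_LINE = re.compile(r"^\((?P<name>\w+)\):\s*(?P<fields>.*)$")
FIELD = re.compile(r"([\w-]+):\s*(\d+)")


def parse_queues(text):
    """Return {queue_name: {field_name: int}} for every queue line in text."""
    queues = {}
    for line in text.splitlines():
        match = QUEUE_LINE.match(line.strip())
        if match:
            fields = FIELD.findall(match.group("fields"))
            queues[match.group("name")] = {key: int(val) for key, val in fields}
    return queues


def stuck_queues(previous, current):
    """Names of queues whose pending count is nonzero and did not decrease
    between two samples (assumption: that is a reasonable proxy for 'stuck')."""
    stuck = []
    for name, now in current.items():
        before = previous.get(name)
        if before is None:
            continue
        pending_now = now.get("pending", 0)
        if pending_now > 0 and pending_now >= before.get("pending", 0):
            stuck.append(name)
    return stuck
```

To use it, read each /sys/kernel/debug/ieee80211/phy*/ath9k/queues file
twice, some seconds apart, run parse_queues on both reads and pass the two
results to stuck_queues.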

If you are running with a smarter qdisc enabled, you can also see it wedged
there. In this case it is the BK queue (1:4):

root@cerowrt:/mnt/disk1# tc -s qdisc show dev sw00
qdisc mq 1: root
Sent 3926131082 bytes 2998293 pkt (dropped 91657, overlimits 0 requeues 70095)
backlog 77608b 1000p requeues 70095
qdisc fq_codel 10: parent 1:1 limit 800p flows 1024 quantum 500 target 10.0ms interval 100.0ms
Sent 110555 bytes 771 pkt (dropped 0, overlimits 0 requeues 5)
backlog 0b 0p requeues 5
maxpacket 256 drop_overlimit 0 new_flow_count 2 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 20: parent 1:2 limit 800p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
Sent 2526448 bytes 17982 pkt (dropped 1, overlimits 0 requeues 31)
backlog 0b 0p requeues 31
maxpacket 929 drop_overlimit 0 new_flow_count 71 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 30: parent 1:3 limit 1000p flows 1024 quantum 300 target 5.0ms interval 100.0ms ecn
Sent 15145657 bytes 106290 pkt (dropped 0, overlimits 0 requeues 179)
backlog 0b 0p requeues 179
maxpacket 256 drop_overlimit 0 new_flow_count 0 ecn_mark 0
new_flows_len 0 old_flows_len 0
qdisc fq_codel 40: parent 1:4 limit 1000p flows 1024 quantum 300 target 5.0ms interval 100.0ms
Sent 3908348422 bytes 2873250 pkt (dropped 91656, overlimits 0 requeues 69880)
backlog 77608b 1000p requeues 69880
^^^!!!!!
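
The same check can be scripted against the tc output. The sketch below is
only an illustration: full_qdiscs is a made-up name, and it assumes the
"limit Np" and "backlog Xb Yp" fields appear as in the paste above. It
reports every qdisc whose backlog, in packets, has reached its limit. A full
backlog in a single sample is not proof of a hang. It is a hang when it
stays full and the Sent counters stop moving.

```python
import re

# Each qdisc section starts with a line like "qdisc fq_codel 40: parent 1:4 ..."
QDISC_START = re.compile(r"^qdisc\s+(\S+)\s+(\S+):", re.MULTILINE)
LIMIT = re.compile(r"\blimit\s+(\d+)p\b")
BACKLOG = re.compile(r"\bbacklog\s+\S+\s+(\d+)p\b")


def full_qdiscs(tc_output):
    """Return (kind, handle, backlog_pkts, limit_pkts) for every qdisc
    whose packet backlog is at or above its configured packet limit."""
    starts = list(QDISC_START.finditer(tc_output))
    full = []
    for i, start in enumerate(starts):
        # A section runs until the next "qdisc" line or the end of the text.
        end = starts[i + 1].start() if i + 1 < len(starts) else len(tc_output)
        block = tc_output[start.start():end]
        limit = LIMIT.search(block)
        backlog = BACKLOG.search(block)
        if limit is None or backlog is None:
            continue  # e.g. the mq root, which reports no packet limit here
        backlog_pkts = int(backlog.group(1))
        limit_pkts = int(limit.group(1))
        if backlog_pkts >= limit_pkts:
            full.append((start.group(1), start.group(2) + ":", backlog_pkts, limit_pkts))
    return full
```

Feed it the text printed by "tc -s qdisc show dev sw00" and it returns a list
of tuples, one per full qdisc.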

We have been experiencing this problem for months, on the linux-backports to
3.10.x in OpenWrt.

-- 
Dave Täht
