From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nikolay Martynov Date: Sun, 6 Feb 2011 21:39:32 -0500 Subject: [ath9k-devel] Weird error messages in logs In-Reply-To: References: <4D48FFF7.1050908@candelatech.com> <4D49692D.5090005@openwrt.org> <4D497C69.8070006@openwrt.org> <4D49F0A4.7080507@openwrt.org> <4D4A0F2C.8070605@openwrt.org> Message-ID: List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: ath9k-devel@lists.ath9k.org Hi, Is there anything I can do to help with this problem? Any additional information/testing/etc? Are the logs I've sent before of any use? Thanks. 2011/2/3 Nikolay Martynov : > Hi, > > ?So I did manage to 'stuck' it again. > > ?Ping looks like this: > 64 bytes from 192.168.1.1: icmp_req=128 ttl=64 time=12408 ms > 64 bytes from 192.168.1.1: icmp_req=129 ttl=64 time=11408 ms > 64 bytes from 192.168.1.1: icmp_req=130 ttl=64 time=10408 ms > 64 bytes from 192.168.1.1: icmp_req=132 ttl=64 time=8942 ms > 64 bytes from 192.168.1.1: icmp_req=133 ttl=64 time=7942 ms > 64 bytes from 192.168.1.1: icmp_req=134 ttl=64 time=6942 ms > 64 bytes from 192.168.1.1: icmp_req=135 ttl=64 time=5942 ms > 64 bytes from 192.168.1.1: icmp_req=136 ttl=64 time=4979 ms > 64 bytes from 192.168.1.1: icmp_req=137 ttl=64 time=3981 ms > 64 bytes from 192.168.1.1: icmp_req=138 ttl=64 time=2981 ms > 64 bytes from 192.168.1.1: icmp_req=140 ttl=64 time=10829 ms > 64 bytes from 192.168.1.1: icmp_req=141 ttl=64 time=9901 ms > 64 bytes from 192.168.1.1: icmp_req=142 ttl=64 time=8925 ms > 64 bytes from 192.168.1.1: icmp_req=145 ttl=64 time=6588 ms > 64 bytes from 192.168.1.1: icmp_req=146 ttl=64 time=6139 ms > 64 bytes from 192.168.1.1: icmp_req=147 ttl=64 time=5133 ms > 64 bytes from 192.168.1.1: icmp_req=148 ttl=64 time=4125 ms > 64 bytes from 192.168.1.1: icmp_req=149 ttl=64 time=3117 ms > 64 bytes from 192.168.1.1: icmp_req=150 ttl=64 time=2109 ms > > There were no "Aggregation session blocked" messages in logs. > Here what I've captured: > > root at router:~# cat /sys/kernel/debug/ieee80211/phy0/ath9k/xmit > Num-Tx-Queues: 10 ?tx-queues-setup: 0x10f poll-work-seen: 8164 > ? ? ? ? ? ? ? ? ? ? ? ? ? ?BE ? ? ? ? BK ? ? ? ?VI ? ? ? ?VO > > MPDUs Queued: ? ? ? ? ?1129641 ? ? ? ? ?0 ? ? ? ? 0 ? ? ?2884 > MPDUs Completed: ? ? ? 1129641 ? ? ? ? ?0 ? ? ? ? 0 ? ? ?2884 > Aggregates: ? ? ? ? ? ? 626250 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > AMPDUs Queued HW: ? ? ? 656532 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > AMPDUs Queued SW: ? ? ?4095017 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > AMPDUs Completed: ? ? ?4749880 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > AMPDUs Retried: ? ? ? ? 197274 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > AMPDUs XRetried: ? ? ? ? ?1594 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > FIFO Underrun: ? ? ? ? ? ? ? 0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > TXOP Exceeded: ? ? ? ? ? ? ? 0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > TXTIMER Expiry: ? ? ? ? ? ? ?0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > DESC CFG Error: ? ? ? ? ? ? ?0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > DATA Underrun: ? ? ? ? ? ? ? 0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > DELIM Underrun: ? ? ? ? ? ? ?0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > TX-Pkts-All: ? ? ? ? ? 5881115 ? ? ? ? ?0 ? ? ? ? 0 ? ? ?2884 > TX-Bytes-All: ? ? ? 2056395884 ? ? ? ? ?0 ? ? ? ? 0 ? ?227872 > hw-put-tx-buf: ? ? ? ? ? ? ?25 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ?25 > hw-tx-start: ? ? ? ? ? 2709929 ? ? ? ? ?0 ? ? ? ? 0 ? ? ?2884 > hw-tx-proc-desc: ? ? ? 2709441 ? ? ? ? ?0 ? ? ? ? 0 ? ? ?2883 > txq-memory-address: ? 80d1db74 ? 80d1db78 ?80d1db70 ?80d1db6c > axq-qnum: ? ? ? ? ? ? ? ? ? ?2 ? ? ? ? ?3 ? ? ? ? 1 ? ? ? ? 0 > axq-depth: ? ? ? ? ? ? ? ? ? 2 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > axq-ampdu_depth: ? ? ? ? ? ? 2 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > axq-stopped ? ? ? ? ? ? ? ? ?0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > tx-in-progress ? ? ? ? ? ? ? 0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > pending-frames ? ? ? ? ? ? ?60 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > txq_headidx: ? ? ? ? ? ? ? ? 0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > txq_tailidx: ? ? ? ? ? ? ? ? 0 ? ? ? ? ?0 ? ? ? ? 0 ? ? ? ? 0 > axq_q empty: ? ? ? ? ? ? ? ? ? 0 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 0 > axq_acq empty: ? ? ? ? ? ? ? ? 0 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq_fifo_pending: ? ? ? ? ? ? ?1 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq_fifo[0] empty: ? ? ? ? ? ? 1 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq_fifo[1] empty: ? ? ? ? ? ? 1 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq_fifo[2] empty: ? ? ? ? ? ? 1 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq_fifo[3] empty: ? ? ? ? ? ? 1 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq_fifo[4] empty: ? ? ? ? ? ? 1 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq_fifo[5] empty: ? ? ? ? ? ? 1 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq_fifo[6] empty: ? ? ? ? ? ? 1 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq_fifo[7] empty: ? ? ? ? ? ? 1 ? ? ? ? ?1 ? ? ? ? 1 ? ? ? ? 1 > txq[2] first-ac: 80b05f18 sched: 1 > ?first-tid: 80b05a28 sched: 1 paused: 0 > root at router:~# cat /sys/kernel/debug/ieee80211/phy0/ath9k/stations > Stations: > ?tid: addr sched paused buf_q-empty an ac > ?ac: addr sched tid_q-empty txq > 00:21:6a:70:18:56 > ?tid: 80b04a28 sched running 0 80b04a1c 80b04f18 > ?tid: 80b04a74 idle running 1 80b04a1c 80b04f30 > ?tid: 80b04ac0 idle running 1 80b04a1c 80b04f30 > ?tid: 80b04b0c idle running 1 80b04a1c 80b04f18 > ?tid: 80b04b58 idle running 1 80b04a1c 80b04f00 > ?tid: 80b04ba4 idle running 1 80b04a1c 80b04f00 > ?tid: 80b04bf0 idle running 1 80b04a1c 80b04ee8 > ?tid: 80b04c3c idle running 1 80b04a1c 80b04ee8 > ?tid: 80b04c88 idle running 1 80b04a1c 80b04ee8 > ?tid: 80b04cd4 idle running 1 80b04a1c 80b04ee8 > ?tid: 80b04d20 idle running 1 80b04a1c 80b04ee8 > ?tid: 80b04d6c idle running 1 80b04a1c 80b04ee8 > ?tid: 80b04db8 idle running 1 80b04a1c 80b04ee8 > ?tid: 80b04e04 idle running 1 80b04a1c 80b04ee8 > ?tid: 80b04e50 idle running 1 80b04a1c 80b04ee8 > ?tid: 80b04e9c idle running 1 80b04a1c 80b04ee8 > ?ac: 80b04ee8 idle 1 80d1d6ac > ?ac: 80b04f00 idle 1 80d1d724 > ?ac: 80b04f18 sched 0 80d1d79c > ?ac: 80b04f30 idle 1 80d1d814 > f0:b4:79:22:71:85 > ?tid: 80b05a28 idle running 1 80b05a1c 80b05f18 > ?tid: 80b05a74 idle running 1 80b05a1c 80b05f30 > ?tid: 80b05ac0 idle running 1 80b05a1c 80b05f30 > ?tid: 80b05b0c idle running 1 80b05a1c 80b05f18 > ?tid: 80b05b58 idle running 1 80b05a1c 80b05f00 > ?tid: 80b05ba4 idle running 1 80b05a1c 80b05f00 > ?tid: 80b05bf0 idle running 1 80b05a1c 80b05ee8 > ?tid: 80b05c3c idle running 1 80b05a1c 80b05ee8 > ?tid: 80b05c88 idle running 1 80b05a1c 80b05ee8 > ?tid: 80b05cd4 idle running 1 80b05a1c 80b05ee8 > ?tid: 80b05d20 idle running 1 80b05a1c 80b05ee8 > ?tid: 80b05d6c idle running 1 80b05a1c 80b05ee8 > ?tid: 80b05db8 idle running 1 80b05a1c 80b05ee8 > ?tid: 80b05e04 idle running 1 80b05a1c 80b05ee8 > ?tid: 80b05e50 idle running 1 80b05a1c 80b05ee8 > ?tid: 80b05e9c idle running 1 80b05a1c 80b05ee8 > ?ac: 80b05ee8 idle 1 80d1d6ac > ?ac: 80b05f00 idle 1 80d1d724 > ?ac: 80b05f18 idle 1 80d1d79c > ?ac: 80b05f30 idle 1 80d1d814 > root at router:~# logread | tail > Feb ?2 23:56:36 router user.err kernel: ath: Failed to stop TX DMA! > Feb ?3 00:01:33 router user.info hostapd: wlan0: STA 00:21:6a:70:18:56 > WPA: group key handshake completed (RSN) > Feb ?3 00:01:33 router user.info hostapd: wlan0: STA f0:b4:79:22:71:85 > WPA: group key handshake completed (RSN) > Feb ?3 00:04:26 router user.info kernel: device mon.wlan0 left promiscuous mode > Feb ?3 00:10:36 router user.info hostapd: wlan0: STA 00:21:6a:70:18:56 > IEEE 802.11: authenticated > Feb ?3 00:10:36 router user.info hostapd: wlan0: STA 00:21:6a:70:18:56 > IEEE 802.11: associated (aid 1) > Feb ?3 00:10:36 router user.info hostapd: wlan0: STA 00:21:6a:70:18:56 > WPA: pairwise key handshake completed (RSN) > Feb ?3 00:11:33 router user.info hostapd: wlan0: STA 00:21:6a:70:18:56 > WPA: group key handshake completed (RSN) > Feb ?3 00:11:33 router user.info hostapd: wlan0: STA f0:b4:79:22:71:85 > WPA: group key handshake completed (RSN) > Feb ?3 00:11:33 router user.info hostapd: wlan0: STA f0:b4:79:22:71:85 > WPA: received EAPOL-Key 2/2 Group with unexpected replay counter > > Please let me know if this was useful and what kind if other > information you may need. > Thanks. > > > > 2011/2/2 Nikolay Martynov : >> Hi. >> >> ?One more thing I've noticed. >> ?When I do tcpdump on intel mon interface I get quite a lot of >> traffic. When I do same thing on router - I hardly see any traffic. Is >> this expected? >> ?This actually regards problem when client cannot reconnect to router >> - in this state even though client send association and authentication >> requests it doesn't get response. >> >> ?Can these two things be related? Is there any chance ath9k can loose >> some received control packets or 'forget' to send them? >> >> ?Thanks. >> >> 2011/2/2 Nikolay Martynov : >>> Hi. >>> >>> ?Installed new patch and do not have intel stuck yet. Although there >>> is still 10-15% ping loss. But not new error messages in dmesg/log. >>> I'll keep testing. >>> >>> ?Two updates: >>> >>> ?1) last problem I mentioned in previous letter seems to be caused by >>> swcrypt=1 on intel side. I just turned it on yesterday and didn't >>> realize it can cause such issues. But now I turned it off and it seems >>> to be fine. >>> ?2) in router logs there are often messages about inability of driver >>> to stop rx or tx dma, I'm not sure whether this is relevant or not. >>> >>> Thanks. >>> Nikolay. >>> >>> 2011/2/2 Felix Fietkau : >>>> Please copy the patch from http://nbd.name/560-ath9k_xmit_debug.patch >>>> into package/mac80211/patches/ in your OpenWrt build tree. >>>> >>>> Then recompile and update the ath9k package on your AP and see if it >>>> prints the "Aggregation session blocked..." messages when tx to the >>>> Intel client stops working. >>>> >>>> Thanks, >>>> >>>> - Felix >>>> >>>> On 2011-02-03 2:46 AM, Nikolay Martynov wrote: >>>>> Hi. >>>>> >>>>> ? I basically have three types of issues: >>>>> >>>>> ?1) Almost no usable 'n' mode. It is somewhat usable if there is one >>>>> client on AP (and I guess relative silence in the air). If there are >>>>> two clients intel card becomes unusable - it seems to receive almost >>>>> no packets. Mac clients seems to work fine with same AP (didn't try >>>>> too long though). >>>>> >>>>> ? 2) I often get disconnected from AP with message like: wlan1: >>>>> deauthenticating from xx:xx:xx:xx:xx:xx by local choice (reason=3). >>>>> This happens for no apparent reason, in 'g' mode (and probably 'n' >>>>> too, I just do not run 'n' long enough). And this happens with macbook >>>>> too. I'm not sure that message is always the same - often there is no >>>>> disconnection message. When client tries to connect back there are >>>>> different types of timeouts: >>>>> ? ?a) on the client (intel): >>>>> [78513.691609] wlan1: direct probe to xx:xx:xx:xx:xx:xx (try 1/3) >>>>> [78513.888104] wlan1: direct probe to xx:xx:xx:xx:xx:xx (try 2/3) >>>>> [78514.088111] wlan1: direct probe to xx:xx:xx:xx:xx:xx (try 3/3) >>>>> [78514.288065] wlan1: direct probe to xx:xx:xx:xx:xx:xx timed out >>>>> ? ?b) on the client (intel): >>>>> [78471.324221] wlan1: associate with xx:xx:xx:xx:xx:xx (try 1) >>>>> [78471.524074] wlan1: associate with xx:xx:xx:xx:xx:xx (try 2) >>>>> [78471.724064] wlan1: associate with xx:xx:xx:xx:xx:xx (try 3) >>>>> [78471.924055] wlan1: association with xx:xx:xx:xx:xx:xx timed out >>>>> ? a) and b) happen on intel. I often see 'timeout' in similar mac UI, >>>>> but I do not know how to get any diagnostics from mac. >>>>> ? At the same time on AP: >>>>> ? ?b) hostapd: wlan0: STA xx:xx:xx:xx:xx:xx WPA: EAPOL-Key timeout >>>>> ? ?c) hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: did not >>>>> acknowledge association response >>>>> ? ?b) and c) happen for both intel and mac clients. Moreover - one >>>>> client may keep trying to connect while other stays connected and >>>>> works fine. But if other disconnects it has problems connecting too. >>>>> Problem usually can be fixed by restarting 'wifi' on router. I do not >>>>> really know whether this is a ath9k problem, or some hostapd issue. >>>>> This may happen several times an hour, but sometimes doesn't happen >>>>> for hours. 'Disconnection' part happens more often on intel client. >>>>> >>>>> ? 3) Intel seems to loose heavy loaded tcp connection sometimes. I.e. >>>>> I upload something from laptop to machine connected by wire on the >>>>> same router and at some point upload stops and connection timeouts. >>>>> Or, if a type a lot of stuff in SSH session - ssh session locks. Looks >>>>> like bunch of packets get lost and tcp cannot recover. This happens in >>>>> 'g' mode. Often enough so I cannot upload 600Mb file without loosing >>>>> connection. Mac doesn't seem to have this problem. >>>>> >>>>> ? I have TEW-326BRP with trunk openwrt (kernel 2.6.37). >>>>> ? On the client - dell laptop with intel5300, to which I have only two >>>>> antennas connected - I replaced card from non-'n' and laptop has only >>>>> two antennas. OS - ubuntu, 2.6.38 kernel from it's repository + latest >>>>> compat wireless + exp ucode + patch from >>>>> http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2214. >>>>> ? Another client - macbook pro. >>>>> ? I think I'm about 30 meters from AP - AP is no second floor and I'm >>>>> in the basement (basement has concrete walls - this might affect >>>>> signal quality I guess). >>>>> >>>>> ? It'd be really great to have reliable wifi, so I can devote time and >>>>> do whatever testing required. >>>>> ? Please let me know how I can help. >>>>> ? Thanks! >>>> >>> >>> >>> >>> -- >>> Truthfully yours, >>> Martynov Nikolay. >>> Email: mar.kolya at gmail.com >>> Phone: +16478220537 >>> >> >> >> >> -- >> Truthfully yours, >> Martynov Nikolay. >> Email: mar.kolya at gmail.com >> Phone: +16478220537 >> > > > > -- > Truthfully yours, > Martynov Nikolay. > Email: mar.kolya at gmail.com > Phone: +16478220537 > -- Truthfully yours, Martynov Nikolay. Email: mar.kolya at gmail.com Phone: +16478220537