* ath9k: ap tsf seems random and only uses lower 24 bits or so
@ 2010-06-28 22:31 Björn Smedman
  2010-06-28 22:55 ` Felix Fietkau
  0 siblings, 1 reply; 14+ messages in thread
From: Björn Smedman @ 2010-06-28 22:31 UTC (permalink / raw)
  To: linux-wireless, ath9k-devel

Hi all,

I'm getting weird values from the debugfs file ieee80211/phy0/tsf: the
value goes up and down rather randomly and only the lower 24 bits or
so seem to ever be used (see below for details).

The only thing running on phy0 is a single ap interface (and the
monitor companion that hostapd sets up). I was expecting tsf to
increase monotonically until all 64 bits had been used.

root@OpenWrt:/debug/ieee80211/phy0# dmesg | grep phy0
phy0: Selected rate control algorithm 'minstrel_ht'
phy0: Atheros AR9100 MAC/BB Rev:0 AR2133 RF Rev:a2 mem=0xb80c0000, irq=2
root@OpenWrt:/debug/ieee80211/phy0# while sleep 1; do cat tsf; done
0x0000000000059d9b
0x0000000000038fa2
0x00000000000ee67e
0x000000000008121e
0x000000000017752d
0x000000000006438b
0x000000000015a771
0x00000000000d5ea0
0x0000000000085b43
0x0000000000037806
0x000000000001f562
0x0000000000115b03
0x000000000020be55
^C

If you poll tsf really fast it looks a bit like it's running correctly
but being reset very often:

root@OpenWrt:/debug/ieee80211/phy0# while true; do cat tsf; done
...
0x00000000004b2319
0x00000000004b33f2
0x00000000004b46dc
0x00000000004b578c
0x00000000004b6a89
0x0000000000014435
0x00000000000154a0
0x00000000000167ce
0x000000000001785d
...
^C
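
One way to check the "running correctly but being reset" reading is to count how often a sample is lower than the one before it. A minimal sketch in plain userspace C, assuming the values printed by the loop above have already been parsed into an array; `count_tsf_resets` is a made-up helper for illustration, not something from mac80211 or ath9k:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: count how often a sampled TSF value is lower
 * than the previous sample. A free-running 64-bit microsecond counter
 * should not go backwards, so every such step suggests a reset.
 */
static size_t count_tsf_resets(const uint64_t *samples, size_t n)
{
	size_t resets = 0;

	assert(samples != NULL || n == 0);

	for (size_t i = 1; i < n; i++) {
		if (samples[i] < samples[i - 1])
			resets++;
	}
	return resets;
}
```

Fed the nine fast-poll samples above, this counts one backward step (from 0x4b6a89 to 0x014435); every other sample is larger than its predecessor.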

For a moment I thought it might be the kernel snprintf (on mips)
playing a trick on me so I tried the following patch. But the result
is the same.

diff -urN a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
--- a/net/mac80211/debugfs.c
+++ b/net/mac80211/debugfs.c
@@ -63,7 +63,8 @@

 	tsf = drv_get_tsf(local);

-	snprintf(buf, sizeof(buf), "0x%016llx\n", (unsigned long long) tsf);
+	snprintf(buf, sizeof(buf), "0x%08lx%08lx\n",
+		 (unsigned long)(tsf >> 32), (unsigned long)tsf);

 	return simple_read_from_buffer(user_buf, count, ppos, buf, 19);
 }
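
For anyone who wants to try the same idea outside the kernel, here is a rough userspace analogue of the patch. It is only a sketch: `format_tsf_split` is a hypothetical name, and it uses the C99 `PRIx32` macros instead of the kernel's `%lx` so that it behaves the same on 32-bit and 64-bit hosts.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical userspace analogue of the debugfs change: format a
 * 64-bit TSF as two zero-padded 32-bit halves so that the result does
 * not depend on the 64-bit printf path.
 */
static int format_tsf_split(char *buf, size_t len, uint64_t tsf)
{
	uint32_t hi = (uint32_t)(tsf >> 32);
	uint32_t lo = (uint32_t)tsf;

	assert(buf != NULL && len > 0);

	/* "0x" + 8 hex digits + 8 hex digits + newline = 19 characters */
	return snprintf(buf, len, "0x%08" PRIx32 "%08" PRIx32 "\n", hi, lo);
}
```

Both format strings produce 19 characters, which matches the length passed to simple_read_from_buffer() in the patch context.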


What could be causing this? Because it's not supposed to do this,
right? Any insight much appreciated!

/Björn

* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-28 22:31 ath9k: ap tsf seems random and only uses lower 24 bits or so Björn Smedman
@ 2010-06-28 22:55 ` Felix Fietkau
  2010-06-29  6:08   ` [ath9k-devel] " Benoit Papillault
  2010-06-29 15:20   ` Björn Smedman
  0 siblings, 2 replies; 14+ messages in thread
From: Felix Fietkau @ 2010-06-28 22:55 UTC (permalink / raw)
  To: Björn Smedman; +Cc: linux-wireless, ath9k-devel

On 2010-06-29 12:31 AM, Björn Smedman wrote:
> Hi all,
> 
> I'm getting weird values from the debugfs file ieee80211/phy0/tsf: the
> value goes up and down rather randomly and only the lower 24 bits or
> so seem to ever be used (see below for details).
> 
> The only thing running on phy0 is a single ap interface (and the
> monitor companion that hostapd sets up). I was expecting tsf to
> increase monotonically until all 64 bits had been used.
> 
> For a moment I thought it might be the kernel snprintf (on mips)
> playing a trick on me so I tried the following patch. But the result
> is the same.
IMHO the most likely problem source is stuck beacons. Please compile the
driver with the debug option enabled and load it with
insmod ath9k debug=0x00000100

- Felix

* Re: [ath9k-devel] ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-28 22:55 ` Felix Fietkau
@ 2010-06-29  6:08   ` Benoit Papillault
  2010-06-29 11:45     ` Felix Fietkau
  2010-06-29 15:20   ` Björn Smedman
  1 sibling, 1 reply; 14+ messages in thread
From: Benoit Papillault @ 2010-06-29  6:08 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: Björn Smedman, ath9k-devel, linux-wireless

On 29/06/2010 00:55, Felix Fietkau wrote:
> On 2010-06-29 12:31 AM, Björn Smedman wrote:
>> Hi all,
>>
>> I'm getting weird values from the debugfs file ieee80211/phy0/tsf: the
>> value goes up and down rather randomly and only the lower 24 bits or
>> so seem to ever be used (see below for details).
>>
>> The only thing running on phy0 is a single ap interface (and the
>> monitor companion that hostapd sets up). I was expecting tsf to
>> increase monotonically until all 64 bits had been used.
>>
>> For a moment I thought it might be the kernel snprintf (on mips)
>> playing a trick on me so I tried the following patch. But the result
>> is the same.
> IMHO the most likely problem source is stuck beacons. Please compile the
> driver with the debug option enabled and load it with
> insmod ath9k debug=0x00000100
>
> - Felix

Hmm... I observed similar behavior a while ago because only the lower 15
bits of rstamp were used when it was extended (rstamp is in fact 32
bits). If that is the cause, Felix fixed it in the following commit:

commit a6d2055b02dde1067075795274672720baadd3ca
Author: Felix Fietkau <nbd@openwrt.org>
Date:   Sat Jun 12 00:33:54 2010 -0400

     ath9k: fix extending the rx timestamp with the hardware TSF

Regards,
Benoit

* Re: [ath9k-devel] ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29  6:08   ` [ath9k-devel] " Benoit Papillault
@ 2010-06-29 11:45     ` Felix Fietkau
  2010-06-29 21:23       ` Benoit Papillault
  0 siblings, 1 reply; 14+ messages in thread
From: Felix Fietkau @ 2010-06-29 11:45 UTC (permalink / raw)
  To: Benoit Papillault; +Cc: Björn Smedman, ath9k-devel, linux-wireless

On 2010-06-29 8:08 AM, Benoit Papillault wrote:
> On 29/06/2010 00:55, Felix Fietkau wrote:
>> On 2010-06-29 12:31 AM, Björn Smedman wrote:
>>> Hi all,
>>>
>>> I'm getting weird values from the debugfs file ieee80211/phy0/tsf: the
>>> value goes up and down rather randomly and only the lower 24 bits or
>>> so seem to ever be used (see below for details).
>>>
>>> The only thing running on phy0 is a single ap interface (and the
>>> monitor companion that hostapd sets up). I was expecting tsf to
>>> increase monotonically until all 64 bits had been used.
>>>
>>> For a moment I thought it might be the kernel snprintf (on mips)
>>> playing a trick on me so I tried the following patch. But the result
>>> is the same.
>> IMHO the most likely problem source is stuck beacons. Please compile the
>> driver with the debug option enabled and load it with
>> insmod ath9k debug=0x00000100
>>
>> - Felix
> 
> Hmm... I observed similar behavior a while ago because only the lower 15
> bits of rstamp were used when it was extended (rstamp is in fact 32
> bits). If that is the cause, Felix fixed it in the following commit:
Nope, different issue. The TSF extending applies only to rx timestamps,
however Björn has been observing weird TSF values from the hw register.
The Rx TSF timestamp is pretty much irrelevant in AP mode.

- Felix

* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-28 22:55 ` Felix Fietkau
  2010-06-29  6:08   ` [ath9k-devel] " Benoit Papillault
@ 2010-06-29 15:20   ` Björn Smedman
  2010-06-29 15:55     ` Felix Fietkau
  1 sibling, 1 reply; 14+ messages in thread
From: Björn Smedman @ 2010-06-29 15:20 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless, ath9k-devel

2010/6/29 Felix Fietkau <nbd@openwrt.org>:
> IMHO the most likely problem source is stuck beacons. Please compile the
> driver with the debug option enabled and load it with
> insmod ath9k debug=0x00000100

It looks like it could be:

...
Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 2 [tsf 1986567
tsftu 1940 intval 100] vif (null)
Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 1 [tsf 2012168
tsftu 1965 intval 100] vif (null)
Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 0 [tsf 2037767
tsftu 1990 intval 100] vif 80945e70
Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 0 [tsf 79033
tsftu 77 intval 100] vif 80945e70
Jan  1 00:06:21 OpenWrt user.debug kernel: ath: missed 1 consecutive beacons
Jan  1 00:06:21 OpenWrt user.debug kernel: ath: resume beacon xmit
after 1 misses
Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 3 [tsf 117790
tsftu 115 intval 100] vif (null)
Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 2 [tsf 143368
tsftu 140 intval 100] vif (null)
Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 1 [tsf 168967
tsftu 165 intval 100] vif (null)
...
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 1 [tsf 14197768
tsftu 13865 intval 100] vif (null)
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 0 [tsf 14223368
tsftu 13890 intval 100] vif 80945e70
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 3 [tsf 14248967
tsftu 13915 intval 100] vif (null)
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 0 [tsf 79180
tsftu 77 intval 100] vif 80945e70
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: missed 1 consecutive beacons
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: resume beacon xmit
after 1 misses
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 3 [tsf 117791
tsftu 115 intval 100] vif (null)
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 2 [tsf 143366
tsftu 140 intval 100] vif (null)
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 1 [tsf 168967
tsftu 165 intval 100] vif (null)
Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 0 [tsf 194567
tsftu 190 intval 100] vif 80945e70
...
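
As a reading aid, the tsftu and slot columns in these lines can be reproduced from tsf alone. The sketch below is plain userspace C. It assumes that one TU is 1024 microseconds and that there are four beacon slots handed out in reverse order; the `BEACON_SLOTS` constant and both function names are made up for illustration, and the arithmetic is inferred from the logged values rather than copied from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define BEACON_SLOTS 4u /* assumption: four beacon slots, inferred from the log */

/* One time unit (TU) is 1024 microseconds, so TSF -> TU is a shift by 10. */
static uint32_t tsf_to_tu(uint64_t tsf)
{
	return (uint32_t)(tsf >> 10);
}

/* Map a TU count to a beacon slot, counting slots down within one interval. */
static uint32_t beacon_slot(uint32_t tsftu, uint32_t intval)
{
	uint32_t slot;

	assert(intval > 0);

	slot = ((tsftu % intval) * BEACON_SLOTS) / intval;
	return BEACON_SLOTS - slot - 1;
}
```

With intval 100, tsf 1986567 gives tsftu 1940 and slot 2, and tsf 79033 gives tsftu 77 and slot 0, which agrees with the lines above. It also makes the jump easy to see: slot 0 fires at about 2.04 s of TSF time and then again at about 0.08 s.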

What can cause a missed beacon? Are they just a "fact of life"?

In any case I can't find any code that resets the tsf in this (single
missed beacon) case. Will the hardware reset the tsf automatically
whenever a single beacon is missed? Isn't that a bit overkill? Will it
not cause problems for clients?

> - Felix

/Björn

* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29 15:20   ` Björn Smedman
@ 2010-06-29 15:55     ` Felix Fietkau
  2010-06-29 16:36       ` Björn Smedman
  0 siblings, 1 reply; 14+ messages in thread
From: Felix Fietkau @ 2010-06-29 15:55 UTC (permalink / raw)
  To: Björn Smedman; +Cc: linux-wireless, ath9k-devel

On 2010-06-29 5:20 PM, Björn Smedman wrote:
> 2010/6/29 Felix Fietkau <nbd@openwrt.org>:
>> IMHO the most likely problem source is stuck beacons. Please compile the
>> driver with the debug option enabled and load it with
>> insmod ath9k debug=0x00000100
> 
> It looks like it could be:
> 
> ...
> Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 2 [tsf 1986567
> tsftu 1940 intval 100] vif (null)
> Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 1 [tsf 2012168
> tsftu 1965 intval 100] vif (null)
> Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 0 [tsf 2037767
> tsftu 1990 intval 100] vif 80945e70
> Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 0 [tsf 79033
> tsftu 77 intval 100] vif 80945e70
> Jan  1 00:06:21 OpenWrt user.debug kernel: ath: missed 1 consecutive beacons
> Jan  1 00:06:21 OpenWrt user.debug kernel: ath: resume beacon xmit
> after 1 misses
> Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 3 [tsf 117790
> tsftu 115 intval 100] vif (null)
> Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 2 [tsf 143368
> tsftu 140 intval 100] vif (null)
> Jan  1 00:06:21 OpenWrt user.debug kernel: ath: slot 1 [tsf 168967
> tsftu 165 intval 100] vif (null)
> ...
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 1 [tsf 14197768
> tsftu 13865 intval 100] vif (null)
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 0 [tsf 14223368
> tsftu 13890 intval 100] vif 80945e70
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 3 [tsf 14248967
> tsftu 13915 intval 100] vif (null)
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 0 [tsf 79180
> tsftu 77 intval 100] vif 80945e70
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: missed 1 consecutive beacons
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: resume beacon xmit
> after 1 misses
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 3 [tsf 117791
> tsftu 115 intval 100] vif (null)
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 2 [tsf 143366
> tsftu 140 intval 100] vif (null)
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 1 [tsf 168967
> tsftu 165 intval 100] vif (null)
> Jan  1 00:09:08 OpenWrt user.debug kernel: ath: slot 0 [tsf 194567
> tsftu 190 intval 100] vif 80945e70
> ...
> 
> What can cause a missed beacon? Are they just a "fact of life"?
> 
> In any case I can't find any code that resets the tsf in this (single
> missed beacon) case. Will the hardware reset the tsf automatically
> whenever a single beacon is missed? Isn't that a bit overkill? Will it
> not cause problems for clients?
One beacon miss should never cause a TSF reset. Only a lot of
consecutive beacon misses trigger a hardware reset, which then resets
the TSF. Looking at your log, it appears that the beacon miss is a
symptom rather than a cause of the TSF jumps.
Can you add a debug statement to the hw reset function to see if it's
called before the TSF jumps?

- Felix

* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29 15:55     ` Felix Fietkau
@ 2010-06-29 16:36       ` Björn Smedman
  2010-06-29 16:52         ` Felix Fietkau
  0 siblings, 1 reply; 14+ messages in thread
From: Björn Smedman @ 2010-06-29 16:36 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless, ath9k-devel

2010/6/29 Felix Fietkau <nbd@openwrt.org>:
> One beacon miss should never cause a TSF reset. Only a lot of
> consecutive beacon misses trigger a hardware reset, which then resets
> the TSF. Looking at your log, it appears that the beacon miss is a
> symptom rather than a cause of the TSF jumps.
> Can you add a debug statement to the hw reset function to see if it's
> called before the TSF jumps?

Yup, seems to be a hardware reset. Added an ath_print ("Reset HW!") at
the beginning of ath9k_hw_reset() and used debug mask 0x101:

...
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 3 [tsf 14863367
tsftu 14515 intval 100] vif (null)
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 2 [tsf 14888967
tsftu 14540 intval 100] vif (null)
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 1 [tsf 14914568
tsftu 14565 intval 100] vif (null)
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: Reset HW!
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: ah->misc_mode 0xc
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: Setting CFG 0x10a
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 0 [tsf 80123
tsftu 78 intval 100] vif 80945e70
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: missed 1 consecutive beacons
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: Reset HW!
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: ah->misc_mode 0xc
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: Setting CFG 0x10a
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 0 [tsf 80989
tsftu 79 intval 100] vif 80945e70
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: missed 1 consecutive beacons
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: resume beacon xmit
after 1 misses
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 3 [tsf 117792
tsftu 115 intval 100] vif (null)
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 2 [tsf 143368
tsftu 140 intval 100] vif (null)
Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 1 [tsf 168967
tsftu 165 intval 100] vif (null)
...

> - Felix
>

/Björn

* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29 16:36       ` Björn Smedman
@ 2010-06-29 16:52         ` Felix Fietkau
  2010-06-29 17:32           ` Björn Smedman
  0 siblings, 1 reply; 14+ messages in thread
From: Felix Fietkau @ 2010-06-29 16:52 UTC (permalink / raw)
  To: Björn Smedman; +Cc: linux-wireless, ath9k-devel

On 2010-06-29 6:36 PM, Björn Smedman wrote:
> 2010/6/29 Felix Fietkau <nbd@openwrt.org>:
>> One beacon miss should never cause a TSF reset. Only a lot of
>> consecutive beacon misses trigger a hardware reset, which then resets
>> the TSF. Looking at your log, it appears that the beacon miss is a
>> symptom rather than a cause of the TSF jumps.
>> Can you add a debug statement to the hw reset function to see if it's
>> called before the TSF jumps?
> 
> Yup, seems to be a hardware reset. Added an ath_print ("Reset HW!") at
> the beginning of ath9k_hw_reset() and used debug mask 0x101:
> 
> ...
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 3 [tsf 14863367
> tsftu 14515 intval 100] vif (null)
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 2 [tsf 14888967
> tsftu 14540 intval 100] vif (null)
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 1 [tsf 14914568
> tsftu 14565 intval 100] vif (null)
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: Reset HW!
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: ah->misc_mode 0xc
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: Setting CFG 0x10a
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 0 [tsf 80123
> tsftu 78 intval 100] vif 80945e70
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: missed 1 consecutive beacons
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: Reset HW!
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: ah->misc_mode 0xc
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: Setting CFG 0x10a
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 0 [tsf 80989
> tsftu 79 intval 100] vif 80945e70
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: missed 1 consecutive beacons
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: resume beacon xmit
> after 1 misses
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 3 [tsf 117792
> tsftu 115 intval 100] vif (null)
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 2 [tsf 143368
> tsftu 140 intval 100] vif (null)
> Jan  1 00:01:59 OpenWrt user.debug kernel: ath: slot 1 [tsf 168967
> tsftu 165 intval 100] vif (null)
> ...
Please add another print to the end of ath9k_hw_check_alive() before the
'return false'. Make sure it prints the value of the 'reg' variable.
If you see it in the log, then it's probably the baseband getting stuck.

- Felix

* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29 16:52         ` Felix Fietkau
@ 2010-06-29 17:32           ` Björn Smedman
  2010-06-29 21:40             ` Björn Smedman
  0 siblings, 1 reply; 14+ messages in thread
From: Björn Smedman @ 2010-06-29 17:32 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless, ath9k-devel

2010/6/29 Felix Fietkau <nbd@openwrt.org>:
> Please add another print to the end of ath9k_hw_check_alive() before the
> 'return false'. Make sure it prints the value of the 'reg' variable.
> If you see it in the log, then it's probably the baseband getting stuck.

Yes, hw reset is due to reg = 0x01702400 every 4 - 40 seconds or so:
...
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: slot 0 [tsf 4495367
tsftu 4390 intval 100] vif 80944e70
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: slot 3 [tsf 4520967
tsftu 4415 intval 100] vif (null)
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: slot 2 [tsf 4546567
tsftu 4440 intval 100] vif (null)
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: ath9k_hw_check_alive:
reg = 0x01702400
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: Reset due to hw dead
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: Reset HW!
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: ah->misc_mode 0xc
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: Setting CFG 0x10a
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: slot 0 [tsf 79211
tsftu 77 intval 100] vif 80944e70
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: missed 1 consecutive beacons
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: resume beacon xmit
after 1 misses
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: slot 3 [tsf 117796
tsftu 115 intval 100] vif (null)
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: slot 2 [tsf 143366
tsftu 140 intval 100] vif (null)
Jan  1 00:03:21 OpenWrt user.debug kernel: ath: slot 1 [tsf 168967
tsftu 165 intval 100] vif (null)
...

> - Felix

/Björn

* Re: [ath9k-devel] ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29 11:45     ` Felix Fietkau
@ 2010-06-29 21:23       ` Benoit Papillault
  0 siblings, 0 replies; 14+ messages in thread
From: Benoit Papillault @ 2010-06-29 21:23 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: Björn Smedman, ath9k-devel, linux-wireless

On 29/06/2010 13:45, Felix Fietkau wrote:
> On 2010-06-29 8:08 AM, Benoit Papillault wrote:
>> On 29/06/2010 00:55, Felix Fietkau wrote:
>>> On 2010-06-29 12:31 AM, Björn Smedman wrote:
>>>> Hi all,
>>>>
>>>> I'm getting weird values from the debugfs file ieee80211/phy0/tsf: the
>>>> value goes up and down rather randomly and only the lower 24 bits or
>>>> so seem to ever be used (see below for details).
>>>>
>>>> The only thing running on phy0 is a single ap interface (and the
>>>> monitor companion that hostapd sets up). I was expecting tsf to
>>>> increase monotonically until all 64 bits had been used.
>>>>
>>>> For a moment I thought it might be the kernel snprintf (on mips)
>>>> playing a trick on me so I tried the following patch. But the result
>>>> is the same.
>>> IMHO the most likely problem source is stuck beacons. Please compile the
>>> driver with the debug option enabled and load it with
>>> insmod ath9k debug=0x00000100
>>>
>>> - Felix
>>
>> Hmm... I observed similar behavior a while ago because only the lower 15
>> bits of rstamp were used when it was extended (rstamp is in fact 32
>> bits). If that is the cause, Felix fixed it in the following commit:
> Nope, different issue. The TSF extending applies only to rx timestamps,
> however Björn has been observing weird TSF values from the hw register.
> The Rx TSF timestamp is pretty much irrelevant in AP mode.
>
> - Felix
>

Ah, right and.. right! :-)
Sorry for the noise.

Regards,
Benoit


* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29 17:32           ` Björn Smedman
@ 2010-06-29 21:40             ` Björn Smedman
  2010-06-29 21:54               ` Felix Fietkau
  0 siblings, 1 reply; 14+ messages in thread
From: Björn Smedman @ 2010-06-29 21:40 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless, ath9k-devel

2010/6/29 Björn Smedman <bjorn.smedman@venatech.se>:
> Yes, hw reset is due to reg = 0x01702400 every 4 - 40 seconds or so:
> ...

When the chip is really stuck, does 'reg' (at 'return false') change
over time? If I add a second requirement that ath9k_hw_check_alive()
must fail three times in a row (in different invocations of
ath9k_tasklet()) before we reset the chip the ap seems fine. I
sometimes get several of these reg = 0x01702400 every second but only
one or at most two in a row.
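
To make the "three times in a row" rule concrete, here is a small standalone sketch of the counting logic only. The structure, the threshold constant and the function name are all hypothetical, and the `alive` argument stands in for whatever the check function returned in that tasklet run; this is not the change made to the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define HANG_THRESHOLD 3 /* assumption: three failed checks in a row */

/* Hypothetical debounce state; not an ath9k structure. */
struct hang_debounce {
	unsigned int consecutive_failures;
};

/*
 * Feed in the result of one liveness check. Returns true only when the
 * check has failed HANG_THRESHOLD times in a row; any passing check
 * clears the streak.
 */
static bool hang_debounce_update(struct hang_debounce *d, bool alive)
{
	assert(d != NULL);

	if (alive) {
		d->consecutive_failures = 0;
		return false;
	}
	if (++d->consecutive_failures < HANG_THRESHOLD)
		return false;

	d->consecutive_failures = 0; /* start over after requesting a reset */
	return true;
}
```

With this rule, one or two failures followed by a passing check never request a reset, which is the case described above.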

Under load I sometimes see some reg = 0x00f02400 as well. I also see
an occasional reset now and then (about once a minute) that must be
caused by something else.

Any insight into what these reg values mean? Do you think they can
safely be ignored as per above?

/Björn

* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29 21:40             ` Björn Smedman
@ 2010-06-29 21:54               ` Felix Fietkau
  2010-06-29 22:50                 ` Björn Smedman
  0 siblings, 1 reply; 14+ messages in thread
From: Felix Fietkau @ 2010-06-29 21:54 UTC (permalink / raw)
  To: Björn Smedman; +Cc: linux-wireless, ath9k-devel

On 2010-06-29 11:40 PM, Björn Smedman wrote:
> 2010/6/29 Björn Smedman <bjorn.smedman@venatech.se>:
>> Yes, hw reset is due to reg = 0x01702400 every 4 - 40 seconds or so:
>> ...
> 
> When the chip is really stuck, does 'reg' (at 'return false') change
> over time? If I add a second requirement that ath9k_hw_check_alive()
> must fail three times in a row (in different invocations of
> ath9k_tasklet()) before we reset the chip the ap seems fine. I
> sometimes get several of these reg = 0x01702400 every second but only
> one or at most two in a row.
> 
> Under load I sometimes see some reg = 0x00f02400 as well. I also see
> an occasional reset now and then (about once a minute) that must be
> caused by something else.
> 
> Any insight into what these reg values mean? Do you think they can
> safely be ignored as per above?
I had a similar thought about the multiple invocations thing. I think
that's a good approach in general, but we need to ensure that we make it
safe.
The main point of this function is to detect baseband hangs. If we
experience such a hang, I'm not sure we will always get enough
interrupts to do multiple consecutive tests.
One way to make it safe would be to reschedule the tasklet each time we
ignore the result of the ath9k_hw_check_alive(), that way we keep the
detection time low as well. Maybe we could also use a timer to leave
10 ms between attempts.

Another thing that I'm working on right now is to ensure that the TSF
gets preserved across resets. For some AR9280-based cards the code
already preserves the TSF in software across the chip reset; I could
simply extend that to cover SoC as well.
But before I post such a patch, I'll run a test on AR9160 to see if it
would be better to make TSF preservation unconditional.
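
The general shape of preserving the TSF in software can be shown with a toy model. This is only a sketch under stated assumptions: `struct fake_hw` and its functions are invented stand-ins for the chip, and the `reset_us` compensation for the time spent in the reset is an assumption of the example, not a description of what the driver does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the chip; a real driver reads and writes registers. */
struct fake_hw {
	uint64_t tsf; /* microsecond timer, cleared by a chip reset */
};

/* Model of a chip reset that loses the timer value. */
static void fake_hw_reset(struct fake_hw *hw)
{
	assert(hw != NULL);
	hw->tsf = 0;
}

/*
 * Save the TSF before the reset and write it back afterwards, adding
 * the time the reset is assumed to have taken (reset_us).
 */
static void fake_hw_reset_keep_tsf(struct fake_hw *hw, uint64_t reset_us)
{
	uint64_t saved;

	assert(hw != NULL);

	saved = hw->tsf;
	fake_hw_reset(hw);
	hw->tsf = saved + reset_us;
}
```

In this model the timer keeps increasing across the reset, so a reader polling it would no longer see the value drop back towards zero.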

- Felix

* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29 21:54               ` Felix Fietkau
@ 2010-06-29 22:50                 ` Björn Smedman
  2010-06-29 23:56                   ` Felix Fietkau
  0 siblings, 1 reply; 14+ messages in thread
From: Björn Smedman @ 2010-06-29 22:50 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless, ath9k-devel

2010/6/29 Felix Fietkau <nbd@openwrt.org>:
> I had a similar thought about the multiple invocations thing. I think
> that's a good approach in general, but we need to ensure that we make it
> safe.
> The main point of this function is to detect baseband hangs. If we
> experience such a hang, I'm not sure we will always get enough
> interrupts to do multiple consecutive tests.
> One way to make it safe would be to reschedule the tasklet each time we
> ignore the result of the ath9k_hw_check_alive(), that way we keep the
> detection time low as well. Maybe we could also use a timer to leave
> 10 ms between attempts.

The xmit logic has sc->tx_complete_work that periodically checks if
the tx is hung and resets the chip if so. Maybe we could refactor that
into a common periodic health checkup task in main.c that could call
into both xmit.c and recv.c? Or does ath9k_hw_check_alive() have to
run in the interrupt context in some way?

> - Felix

/Björn

* Re: ath9k: ap tsf seems random and only uses lower 24 bits or so
  2010-06-29 22:50                 ` Björn Smedman
@ 2010-06-29 23:56                   ` Felix Fietkau
  0 siblings, 0 replies; 14+ messages in thread
From: Felix Fietkau @ 2010-06-29 23:56 UTC (permalink / raw)
  To: Björn Smedman; +Cc: linux-wireless, ath9k-devel

On 2010-06-30 12:50 AM, Björn Smedman wrote:
> 2010/6/29 Felix Fietkau <nbd@openwrt.org>:
>> I had a similar thought about the multiple invocations thing. I think
>> that's a good approach in general, but we need to ensure that we make it
>> safe.
>> The main point of this function is to detect baseband hangs. If we
>> experience such a hang, I'm not sure we will always get enough
>> interrupts to do multiple consecutive tests.
>> One way to make it safe would be to reschedule the tasklet each time we
>> ignore the result of the ath9k_hw_check_alive(), that way we keep the
>> detection time low as well. Maybe we could also use a timer to leave
>> 10 ms between attempts.
> 
> The xmit logic has sc->tx_complete_work that periodically checks if
> the tx is hung and resets the chip if so. Maybe we could refactor that
> into a common periodic health checkup task in main.c that could call
> into both xmit.c and recv.c? Or does ath9k_hw_check_alive() have to
> run in the interrupt context in some way?
I'd like to keep them separate. I think a tx queue hang is very rare,
and the check only runs every second or so, whereas the baseband hang
check needs to run very frequently, as in some situations, the hangs can
probably occur frequently as well.

- Felix
