* Fwd: [Bug 989269] Connecting to WLAN causes kernel panic
       [not found] <bug-989269-351310-C5h8dsuOxB@bugzilla.redhat.com>
@ 2013-07-31  8:39 ` Arend van Spriel
  2013-07-31  9:09   ` Felix Fietkau
  0 siblings, 1 reply; 5+ messages in thread
From: Arend van Spriel @ 2013-07-31  8:39 UTC (permalink / raw)
  To: Felix Fietkau, linux-wireless, John W. Linville, John Greene

Hi Felix,

How are things in OpenWRT? I wanted to ask you something regarding a
defect I am looking at. Since kernel 3.9, several reports have been made
about a kernel panic in brcmsmac, i.e. a divide-by-zero error.

Debugging the issue shows we end up with a rate with MCS index 110, 
which is, well, impossible. As brcmsmac gets the rate info from 
minstrel_ht, I was wondering if we have an integration issue here. Around
April I saw patches about a new API, which may have been in the 3.9 time
frame, and something subtly changed things for brcmsmac.

Regards,
Arend

-------- Original Message --------
Subject: [Bug 989269] Connecting to WLAN causes kernel panic
Date: Wed, 31 Jul 2013 08:11:41 +0000
From: <bugzilla@redhat.com>
To: <fedora-kernel-wireless-brcm80211@fedoraproject.org>

https://bugzilla.redhat.com/show_bug.cgi?id=989269



--- Comment #13 from Arend van Spriel <arend@broadcom.com> ---
(In reply to Chris from comment #12)
> Created attachment 780839 [details]
> dmesg | grep brcms when connecting to WLAN after patch 2
>
> During gathering this data I connected to the internet, was sitting for a
> while and then walked through a corridor in my university, so that the
> computer was connecting to different routers. Sat down there for
> significantly longer time. At the end I reconnected and disconnected.
> It seems to work stable, without any problems, but I haven't tried to use
> the connection for something heavier.

Thanks for the data. I observed two values that are invalid. ratespec value 0
is invalid and the driver selects 1Mbps rate to do the calculation. The other
value 134217838 is what triggers the divide-by-zero. The ratespec value is:
ratespec: 0x800006E
   RATE           110      (rate value [unit: 500Kbps or MCS index])
   MIMORATE       1        (RATE field represents MIMO MCS index)

This does not make sense, because MCS index can only go up to 32. I suspect
this should not be a MIMO rate, but 54Mbps. I am looking further into how we
end up in this situation.
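
The decode above can be reproduced with a few lines of C. This is only a
sketch: the mask values RSPEC_RATE_MASK and RSPEC_MIMORATE are assumptions
inferred from the decoded fields shown above (0x800006E splits into a flag
at 0x08000000 and a rate field of 0x6E), and rspec_is_plausible() is a
hypothetical helper, not a function in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout, inferred from the decode above; not copied from the driver. */
#define RSPEC_RATE_MASK 0x0000007Fu /* rate in 500Kbps units, or MCS index */
#define RSPEC_MIMORATE  0x08000000u /* set when the rate field is an MCS index */
#define MAX_MCS_INDEX   32u         /* upper bound stated in the comment above */

/* Extract the RATE field (500Kbps units or MCS index). */
static unsigned int rspec_rate(uint32_t rspec)
{
	return rspec & RSPEC_RATE_MASK;
}

/* True when the RATE field should be read as a MIMO MCS index. */
static bool rspec_is_mimo(uint32_t rspec)
{
	return (rspec & RSPEC_MIMORATE) != 0;
}

/* Hypothetical sanity check covering the two invalid values seen in the log. */
static bool rspec_is_plausible(uint32_t rspec)
{
	if (rspec == 0)
		return false; /* ratespec 0 is invalid */
	if (rspec_is_mimo(rspec))
		return rspec_rate(rspec) <= MAX_MCS_INDEX;
	return true;
}
```

With these assumptions, 134217838 decodes to rate field 110 with the MIMO
flag set, which the check rejects because 110 is above the MCS limit.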







* Re: Fwd: [Bug 989269] Connecting to WLAN causes kernel panic
  2013-07-31  8:39 ` Fwd: [Bug 989269] Connecting to WLAN causes kernel panic Arend van Spriel
@ 2013-07-31  9:09   ` Felix Fietkau
  2013-07-31  9:45     ` Arend van Spriel
                       ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Felix Fietkau @ 2013-07-31  9:09 UTC (permalink / raw)
  To: Arend van Spriel; +Cc: linux-wireless, John W. Linville, John Greene

On 2013-07-31 10:39 AM, Arend van Spriel wrote:
> Hi Felix,
> 
> How are things in OpenWRT? I wanted to ask you something regarding a
> defect I am looking at. Since kernel 3.9, several reports have been made
> about a kernel panic in brcmsmac, i.e. a divide-by-zero error.
3.9 was the first kernel to support CCK rates in minstrel_ht as
fallback (in case the link gets very bad). Not sure if that triggers
anything weird in brcmsmac.

> Debugging the issue shows we end up with a rate with MCS index 110, 
> which is, well, impossible.
Did you verify that it comes directly from minstrel_ht, or does it show
up somewhere further down the chain in brcmsmac?

> As brcmsmac gets the rate info from 
> minstrel_ht, I was wondering if we have an integration issue here. Around
> April I saw patches about a new API, which may have been in the 3.9 time
> frame, and something subtly changed things for brcmsmac.
The new rate API was added in 3.10, not 3.9. It did add a bug that caused
bogus MCS rates. I've sent a patch for this a while back (shortly
before 3.10 was released), but it was too late to make it into the
release. I guess we have to wait for it to be applied through stable -
no idea why that hasn't happened yet. 

Here is the fix:

commit 1cd158573951f737fbc878a35cb5eb47bf9af3d5
Author: Felix Fietkau <nbd@openwrt.org>
Date:   Fri Jun 28 21:04:35 2013 +0200

    mac80211/minstrel_ht: fix cck rate sampling
    
    The CCK group needs special treatment to set the right flags and rate
    index. Add this missing check to prevent setting broken rates for tx
    packets.
    
    Cc: stable@vger.kernel.org # 3.10
    Signed-off-by: Felix Fietkau <nbd@openwrt.org>
    Signed-off-by: Johannes Berg <johannes@sipsolutions.net>

diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c
index 5b2d301..f5aed96 100644
--- a/net/mac80211/rc80211_minstrel_ht.c
+++ b/net/mac80211/rc80211_minstrel_ht.c
@@ -804,10 +804,18 @@ minstrel_ht_get_rate(void *priv, struct ieee80211_sta *sta, void *priv_sta,
 
 	sample_group = &minstrel_mcs_groups[sample_idx / MCS_GROUP_RATES];
 	info->flags |= IEEE80211_TX_CTL_RATE_CTRL_PROBE;
+	rate->count = 1;
+
+	if (sample_idx / MCS_GROUP_RATES == MINSTREL_CCK_GROUP) {
+		int idx = sample_idx % ARRAY_SIZE(mp->cck_rates);
+		rate->idx = mp->cck_rates[idx];
+		rate->flags = 0;
+		return;
+	}
+
 	rate->idx = sample_idx % MCS_GROUP_RATES +
 		    (sample_group->streams - 1) * MCS_GROUP_RATES;
 	rate->flags = IEEE80211_TX_RC_MCS | sample_group->flags;
-	rate->count = 1;
 }
 
 static void
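
The effect of the added branch can be shown outside the kernel with a small
stand-alone sketch. Everything here is a simplified stand-in: struct tx_rate,
TX_RC_MCS, CCK_GROUP and the contents of cck_rates are hypothetical values
chosen for illustration, and MCS_GROUP_RATES = 8 is an assumption. Only the
control flow mirrors the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define MCS_GROUP_RATES 8    /* assumed number of rates per group */
#define CCK_GROUP       12   /* hypothetical group number of the CCK group */
#define TX_RC_MCS       0x08 /* stand-in for IEEE80211_TX_RC_MCS */

/* Simplified stand-in for the tx rate entry filled in by rate control. */
struct tx_rate {
	int8_t idx;
	uint8_t count;
	uint16_t flags;
};

/* Hypothetical mapping from CCK sample slot to legacy rate table index. */
static const uint8_t cck_rates[4] = { 0, 1, 2, 3 };

/*
 * Pick the sampling rate. CCK samples get a legacy rate index and no MCS
 * flag; all other groups get an MCS index derived from the stream count.
 */
static void pick_sample_rate(unsigned int sample_idx, unsigned int streams,
			     uint16_t group_flags, struct tx_rate *rate)
{
	rate->count = 1;

	if (sample_idx / MCS_GROUP_RATES == CCK_GROUP) {
		size_t n = sizeof(cck_rates) / sizeof(cck_rates[0]);

		rate->idx = (int8_t)cck_rates[sample_idx % n];
		rate->flags = 0;
		return;
	}

	rate->idx = (int8_t)(sample_idx % MCS_GROUP_RATES +
			     (streams - 1) * MCS_GROUP_RATES);
	rate->flags = TX_RC_MCS | group_flags;
}
```

Without the early return, a sample index in the CCK group would fall through
to the MCS computation and be handed to the driver flagged as an MCS rate.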



* Re: Fwd: [Bug 989269] Connecting to WLAN causes kernel panic
  2013-07-31  9:09   ` Felix Fietkau
@ 2013-07-31  9:45     ` Arend van Spriel
  2013-07-31  9:46     ` Sedat Dilek
  2013-08-16 20:47     ` Arend van Spriel
  2 siblings, 0 replies; 5+ messages in thread
From: Arend van Spriel @ 2013-07-31  9:45 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless, John W. Linville, John Greene

On 07/31/2013 11:09 AM, Felix Fietkau wrote:
> On 2013-07-31 10:39 AM, Arend van Spriel wrote:
>> Hi Felix,
>>
>> How are things in OpenWRT? I wanted to ask you something regarding a
>> defect I am looking at. Since kernel 3.9, several reports have been made
>> about a kernel panic in brcmsmac, i.e. a divide-by-zero error.
> 3.9 was the first kernel to support CCK rates in minstrel_ht as
> fallback (in case the link gets very bad). Not sure if that triggers
> anything weird in brcmsmac.

It just might, reading this in brcmsmac:

	/*
	 * Currently only support same setting for primary and
	 * fallback rates. Unify flags for each rate into a
	 * single value for the frame
	 */
	use_rts |= txrate[k]->flags & IEEE80211_TX_RC_USE_RTS_CTS
		? true : false;
	use_cts |= txrate[k]->flags & IEEE80211_TX_RC_USE_CTS_PROTECT
		? true : false;
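
As a stand-alone illustration of what "unify flags" means here, the sketch
below ORs the protection flags of the primary and fallback rates into one
per-frame setting. The struct names, the flag values and
unify_protection() are hypothetical stand-ins, not the driver's own code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RC_USE_RTS_CTS     (1u << 0) /* stand-in for IEEE80211_TX_RC_USE_RTS_CTS */
#define RC_USE_CTS_PROTECT (1u << 1) /* stand-in for IEEE80211_TX_RC_USE_CTS_PROTECT */

/* Simplified stand-in for one entry of the tx rate list. */
struct tx_rate {
	int8_t idx;
	uint16_t flags;
};

/* One protection setting for the whole frame. */
struct frame_protection {
	bool use_rts;
	bool use_cts;
};

/* If any rate in the list asks for RTS/CTS or CTS protection, the frame gets it. */
static struct frame_protection unify_protection(const struct tx_rate *txrate,
						size_t n)
{
	struct frame_protection p = { false, false };

	for (size_t k = 0; k < n; k++) {
		p.use_rts |= (txrate[k].flags & RC_USE_RTS_CTS) != 0;
		p.use_cts |= (txrate[k].flags & RC_USE_CTS_PROTECT) != 0;
	}
	return p;
}
```

The consequence is that a flag set only on a fallback rate also applies to
the primary rate of that frame.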

Although this is not directly

>> Debugging the issue shows we end up with a rate with MCS index 110,
>> which is, well, impossible.
> Did you verify that it comes directly from minstrel_ht, or does it show
> up somewhere further down the chain in brcmsmac?

I am pretty sure it is not minstrel_ht. brcmsmac converts the
information from minstrel_ht into a so-called ratespec format. The
strange MCS is what I see in the ratespec leading up to the
divide-by-zero. The next thing to look at is the conversion step. As said
above, the CCK fallback might be the culprit, or rather how brcmsmac
deals with it.

>> As brcmsmac gets the rate info from
>> minstrel_ht, I was wondering if we have an integration issue here. Around
>> April I saw patches about a new API, which may have been in the 3.9 time
>> frame, and something subtly changed things for brcmsmac.
> The new rate API was added in 3.10, not 3.9. It did add a bug that caused
> bogus MCS rates. I've sent a patch for this a while back (shortly
> before 3.10 was released), but it was too late to make it into the
> release. I guess we have to wait for it to be applied through stable -
> no idea why that hasn't happened yet.

Ping Greg? I will give it a try.

Thanks,
Arend

> Here is the fix:
>
> commit 1cd158573951f737fbc878a35cb5eb47bf9af3d5
> Author: Felix Fietkau <nbd@openwrt.org>
> Date:   Fri Jun 28 21:04:35 2013 +0200
>
>      mac80211/minstrel_ht: fix cck rate sampling
>
>      The CCK group needs special treatment to set the right flags and rate
>      index. Add this missing check to prevent setting broken rates for tx
>      packets.
>
>      Cc: stable@vger.kernel.org # 3.10
>      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
>      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
>
> diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c
> index 5b2d301..f5aed96 100644
> --- a/net/mac80211/rc80211_minstrel_ht.c
> +++ b/net/mac80211/rc80211_minstrel_ht.c
> @@ -804,10 +804,18 @@ minstrel_ht_get_rate(void *priv, struct ieee80211_sta *sta, void *priv_sta,
>
>   	sample_group = &minstrel_mcs_groups[sample_idx / MCS_GROUP_RATES];
>   	info->flags |= IEEE80211_TX_CTL_RATE_CTRL_PROBE;
> +	rate->count = 1;
> +
> +	if (sample_idx / MCS_GROUP_RATES == MINSTREL_CCK_GROUP) {
> +		int idx = sample_idx % ARRAY_SIZE(mp->cck_rates);
> +		rate->idx = mp->cck_rates[idx];
> +		rate->flags = 0;
> +		return;
> +	}
> +
>   	rate->idx = sample_idx % MCS_GROUP_RATES +
>   		    (sample_group->streams - 1) * MCS_GROUP_RATES;
>   	rate->flags = IEEE80211_TX_RC_MCS | sample_group->flags;
> -	rate->count = 1;
>   }
>
>   static void
>
>




* Re: Fwd: [Bug 989269] Connecting to WLAN causes kernel panic
  2013-07-31  9:09   ` Felix Fietkau
  2013-07-31  9:45     ` Arend van Spriel
@ 2013-07-31  9:46     ` Sedat Dilek
  2013-08-16 20:47     ` Arend van Spriel
  2 siblings, 0 replies; 5+ messages in thread
From: Sedat Dilek @ 2013-07-31  9:46 UTC (permalink / raw)
  To: Felix Fietkau
  Cc: Arend van Spriel, linux-wireless, John W. Linville, John Greene

On Wed, Jul 31, 2013 at 11:09 AM, Felix Fietkau <nbd@openwrt.org> wrote:
> On 2013-07-31 10:39 AM, Arend van Spriel wrote:
>> Hi Felix,
>>
>> How are things in OpenWRT? I wanted to ask you something regarding a
>> defect I am looking at. Since kernel 3.9, several reports have been made
>> about a kernel panic in brcmsmac, i.e. a divide-by-zero error.
> 3.9 was the first kernel to support CCK rates in minstrel_ht as
> fallback (in case the link gets very bad). Not sure if that triggers
> anything weird in brcmsmac.
>
>> Debugging the issue shows we end up with a rate with MCS index 110,
>> which is, well, impossible.
> Did you verify that it comes directly from minstrel_ht, or does it show
> up somewhere further down the chain in brcmsmac?
>
>> As brcmsmac gets the rate info from
>> minstrel_ht, I was wondering if we have an integration issue here. Around
>> April I saw patches about a new API, which may have been in the 3.9 time
>> frame, and something subtly changed things for brcmsmac.
> The new rate API was added in 3.10, not 3.9. It did add a bug that caused
> bogus MCS rates. I've sent a patch for this a while back (shortly
> before 3.10 was released), but it was too late to make it into the
> release. I guess we have to wait for it to be applied through stable -
> no idea why that hasn't happened yet.
>
> Here is the fix:
>
> commit 1cd158573951f737fbc878a35cb5eb47bf9af3d5
> Author: Felix Fietkau <nbd@openwrt.org>
> Date:   Fri Jun 28 21:04:35 2013 +0200
>
>     mac80211/minstrel_ht: fix cck rate sampling
>

That patch is not in Linus' tree yet, so it won't get into stable.

- Sedat -

>     The CCK group needs special treatment to set the right flags and rate
>     index. Add this missing check to prevent setting broken rates for tx
>     packets.
>
>     Cc: stable@vger.kernel.org # 3.10
>     Signed-off-by: Felix Fietkau <nbd@openwrt.org>
>     Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
>
> diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c
> index 5b2d301..f5aed96 100644
> --- a/net/mac80211/rc80211_minstrel_ht.c
> +++ b/net/mac80211/rc80211_minstrel_ht.c
> @@ -804,10 +804,18 @@ minstrel_ht_get_rate(void *priv, struct ieee80211_sta *sta, void *priv_sta,
>
>         sample_group = &minstrel_mcs_groups[sample_idx / MCS_GROUP_RATES];
>         info->flags |= IEEE80211_TX_CTL_RATE_CTRL_PROBE;
> +       rate->count = 1;
> +
> +       if (sample_idx / MCS_GROUP_RATES == MINSTREL_CCK_GROUP) {
> +               int idx = sample_idx % ARRAY_SIZE(mp->cck_rates);
> +               rate->idx = mp->cck_rates[idx];
> +               rate->flags = 0;
> +               return;
> +       }
> +
>         rate->idx = sample_idx % MCS_GROUP_RATES +
>                     (sample_group->streams - 1) * MCS_GROUP_RATES;
>         rate->flags = IEEE80211_TX_RC_MCS | sample_group->flags;
> -       rate->count = 1;
>  }
>
>  static void
>


* Re: Fwd: [Bug 989269] Connecting to WLAN causes kernel panic
  2013-07-31  9:09   ` Felix Fietkau
  2013-07-31  9:45     ` Arend van Spriel
  2013-07-31  9:46     ` Sedat Dilek
@ 2013-08-16 20:47     ` Arend van Spriel
  2 siblings, 0 replies; 5+ messages in thread
From: Arend van Spriel @ 2013-08-16 20:47 UTC (permalink / raw)
  To: Felix Fietkau; +Cc: linux-wireless, John W. Linville, John Greene

On 07/31/2013 11:09 AM, Felix Fietkau wrote:
> On 2013-07-31 10:39 AM, Arend van Spriel wrote:
>> Hi Felix,
>>
>> How are things in OpenWRT? I wanted to ask you something regarding a
>> defect I am looking at. Since kernel 3.9, several reports have been made
>> about a kernel panic in brcmsmac, i.e. a divide-by-zero error.
> 3.9 was the first kernel to support CCK rates in minstrel_ht as
> fallback (in case the link gets very bad). Not sure if that triggers
> anything weird in brcmsmac.
>
>> Debugging the issue shows we end up with a rate with MCS index 110,
>> which is, well, impossible.
> Did you verify that it comes directly from minstrel_ht, or does it show
> up somewhere further down the chain in brcmsmac?
>
>> As brcmsmac gets the rate info from
>> minstrel_ht, I was wondering if we have an integration issue here. Around
>> April I saw patches about a new API, which may have been in the 3.9 time
>> frame, and something subtly changed things for brcmsmac.
> The new rate API was added in 3.10, not 3.9. It did add a bug that caused
> bogus MCS rates. I've sent a patch for this a while back (shortly
> before 3.10 was released), but it was too late to make it into the
> release. I guess we have to wait for it to be applied through stable -
> no idea why that hasn't happened yet.

Reportedly the problem still exists in 3.10.6 and 3.11-rc4, so I started
digging some more. Can you have a look at the rate table below that
we set up in the wiphy structure:

static struct ieee80211_rate legacy_ratetable[] = {
	RATE(10, 0),
	RATE(20, IEEE80211_RATE_SHORT_PREAMBLE),
	RATE(55, IEEE80211_RATE_SHORT_PREAMBLE),
	RATE(110, IEEE80211_RATE_SHORT_PREAMBLE),
	RATE(60, 0),
	RATE(90, 0),
	RATE(120, 0),
	RATE(180, 0),
	RATE(240, 0),
	RATE(360, 0),
	RATE(480, 0),
	RATE(540, 0),
};

where RATE() is defined as:

#define RATE(rate100m, _flags) { \
	.bitrate = (rate100m), \
	.flags = (_flags), \
	.hw_value = (rate100m / 5), \
}

Do you see anything obviously wrong here from minstrel_ht perspective?
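
To make the units concrete, here is a stand-alone sketch of what the table
expands to. It assumes .bitrate is in units of 100Kbps and that the division
by 5 puts .hw_value in units of 500Kbps. struct rate_entry,
RATE_SHORT_PREAMBLE and find_by_hw_value() are simplified, hypothetical
stand-ins and not the real cfg80211 definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define RATE_SHORT_PREAMBLE (1u << 0) /* stand-in for IEEE80211_RATE_SHORT_PREAMBLE */

/* Simplified stand-in holding only the fields that RATE() initializes. */
struct rate_entry {
	uint32_t flags;
	uint16_t bitrate;  /* assumed unit: 100Kbps */
	uint16_t hw_value; /* assumed unit: 500Kbps */
};

#define RATE(rate100k, _flags) { \
	.bitrate = (rate100k), \
	.flags = (_flags), \
	.hw_value = (rate100k) / 5, \
}

static const struct rate_entry legacy_ratetable[] = {
	RATE(10, 0),
	RATE(20, RATE_SHORT_PREAMBLE),
	RATE(55, RATE_SHORT_PREAMBLE),
	RATE(110, RATE_SHORT_PREAMBLE),
	RATE(60, 0),
	RATE(90, 0),
	RATE(120, 0),
	RATE(180, 0),
	RATE(240, 0),
	RATE(360, 0),
	RATE(480, 0),
	RATE(540, 0),
};

#define N_RATES (sizeof(legacy_ratetable) / sizeof(legacy_ratetable[0]))

/* Return the table index whose hw_value matches, or -1 if none does. */
static int find_by_hw_value(uint16_t hw_value)
{
	for (size_t i = 0; i < N_RATES; i++) {
		if (legacy_ratetable[i].hw_value == hw_value)
			return (int)i;
	}
	return -1;
}
```

Under these assumptions the 11Mbps entry has bitrate 110 and hw_value 22,
the 54Mbps entry has hw_value 108, and no entry has hw_value 110.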

Regards,
Arend

> Here is the fix:
>
> commit 1cd158573951f737fbc878a35cb5eb47bf9af3d5
> Author: Felix Fietkau <nbd@openwrt.org>
> Date:   Fri Jun 28 21:04:35 2013 +0200
>
>      mac80211/minstrel_ht: fix cck rate sampling
>
>      The CCK group needs special treatment to set the right flags and rate
>      index. Add this missing check to prevent setting broken rates for tx
>      packets.
>
>      Cc: stable@vger.kernel.org # 3.10
>      Signed-off-by: Felix Fietkau <nbd@openwrt.org>
>      Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
>
> diff --git a/net/mac80211/rc80211_minstrel_ht.c b/net/mac80211/rc80211_minstrel_ht.c
> index 5b2d301..f5aed96 100644
> --- a/net/mac80211/rc80211_minstrel_ht.c
> +++ b/net/mac80211/rc80211_minstrel_ht.c
> @@ -804,10 +804,18 @@ minstrel_ht_get_rate(void *priv, struct ieee80211_sta *sta, void *priv_sta,
>
>   	sample_group = &minstrel_mcs_groups[sample_idx / MCS_GROUP_RATES];
>   	info->flags |= IEEE80211_TX_CTL_RATE_CTRL_PROBE;
> +	rate->count = 1;
> +
> +	if (sample_idx / MCS_GROUP_RATES == MINSTREL_CCK_GROUP) {
> +		int idx = sample_idx % ARRAY_SIZE(mp->cck_rates);
> +		rate->idx = mp->cck_rates[idx];
> +		rate->flags = 0;
> +		return;
> +	}
> +
>   	rate->idx = sample_idx % MCS_GROUP_RATES +
>   		    (sample_group->streams - 1) * MCS_GROUP_RATES;
>   	rate->flags = IEEE80211_TX_RC_MCS | sample_group->flags;
> -	rate->count = 1;
>   }
>
>   static void
>
>



