* Crash in mlme.c, wireless-testing 2.6.39-wl + hacks
@ 2011-06-30 21:22 Ben Greear
  2011-06-30 21:30 ` Johannes Berg
  0 siblings, 1 reply; 5+ messages in thread
From: Ben Greear @ 2011-06-30 21:22 UTC (permalink / raw)
  To: linux-wireless

We see occasional crashes in mlme.c when testing the following
sequence:  create 30 stations configured for in-kernel authentication,
re-configure them to use the supplicant, let them associate, then
delete one of them.

I added a BUG_ON in __cfg80211_mlme_deauth to check for null
bssid and it hit.

Please note this is hacked code, so it's possible it's something
I am doing.  I'm going to add some extra checks in this method to
keep from crashing, but it may be a while until I can test against
clean upstream kernels for this particular config.


kernel BUG at /home/greearb/git/linux.wireless-testing-ct/net/wireless/mlme.c:606!
invalid opcode: 0000 [#1] PREEMPT
last sysfs file: /sys/devices/pci0000:00/0000:00:0c.0/net/sta0/flags
Modules linked in: padlock_aes aes_i586 aes_generic xt_TPROXY nf_tproxy_core xt_socket ip]

Pid: 28023, comm: ip Tainted: P            2.6.39-wlc3+ #44    /CN700-8237R+
EIP: 0060:[<f889e2d8>] EFLAGS: 00010202 CPU: 0
EIP is at __cfg80211_mlme_deauth+0x5a/0xfe [cfg80211]
EAX: 00000001 EBX: f69aac00 ECX: 00000000 EDX: efdf3408
ESI: f6bdc000 EDI: f5c19a04 EBP: f5c19a10 ESP: f5c199e0
  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process ip (pid: 28023, ti=f5c18000 task=f12b5100 task.ti=f5c18000)
Stack:
  c08d6ee4 efdf3000 f6bdc000 efdf3408 00000000 00000000 00000000 00000000
  00000003 efdf3408 f6bdc000 efdf3000 f5c19a48 f88a1230 00000000 00000000
  00000003 00000000 efdf3434 00000009 00000003 0174586e 00000000 efdf3408
Call Trace:
  [<f88a1230>] __cfg80211_disconnect+0xf4/0x17a [cfg80211]
  [<f888f322>] cfg80211_netdev_notifier_call+0x275/0x4a4 [cfg80211]
  [<c07462c7>] ? _raw_spin_unlock_irqrestore+0x25/0x28
  [<c072a68e>] ? packet_notifier+0x14f/0x158
  [<c0748618>] notifier_call_chain+0x26/0x48
  [<c043ccd1>] raw_notifier_call_chain+0x1a/0x1c
  [<c06bba81>] call_netdevice_notifiers+0x44/0x4b
  [<c06bbadd>] __dev_close_many+0x55/0xb2
  [<c042a706>] ? _local_bh_enable_ip+0x74/0x76
  [<c042a710>] ? local_bh_enable_ip+0x8/0xa
  [<c06bbb59>] __dev_close+0x1f/0x2c
  [<c06b9b82>] __dev_change_flags+0xa6/0x11b
  [<c06bc2d3>] dev_change_flags+0x13/0x3f
  [<c06c627b>] do_setlink+0x256/0x653
  [<c06c6970>] rtnl_newlink+0x24f/0x48f
  [<c06c67c6>] ? rtnl_newlink+0xa5/0x48f
  [<c0746900>] ? page_fault+0x10/0x10
  [<c056d775>] ? might_fault+0x14/0x16
  [<c06c6721>] ? rtnl_setlink+0xa9/0xa9
  [<c06c5d58>] rtnetlink_rcv_msg+0x188/0x19e
  [<c06c5bd0>] ? rtnetlink_rcv+0x22/0x22
  [<c06d3636>] netlink_rcv_skb+0x30/0x76
  [<c06c5bc9>] rtnetlink_rcv+0x1b/0x22
  [<c06d3457>] netlink_unicast+0xc1/0x11d
  [<c06b55a8>] ? copy_from_user+0x8/0xa
  [<c06d3b32>] netlink_sendmsg+0x212/0x229
  [<c06ad2bb>] __sock_sendmsg+0x54/0x5b
  [<c06ad744>] sock_sendmsg+0x94/0xab
  [<c056d775>] ? might_fault+0x14/0x16
  [<c056d8ce>] ? _copy_from_user+0x31/0x115
  [<c06b55a8>] ? copy_from_user+0x8/0xa
  [<c06b58d7>] ? verify_iovec+0x3e/0x77
  [<c06adf89>] sys_sendmsg+0x14d/0x19a
  [<c0484be9>] ? __do_fault+0x2b2/0x2de
  [<c048559d>] ? handle_pte_fault+0x264/0x5bc
  [<c0485984>] ? handle_mm_fault+0x8f/0x9e
  [<c06ade33>] ? sys_recvmsg+0x44/0x4d
  [<c06af1a4>] sys_socketcall+0x227/0x289
  [<c0488a15>] ? sys_brk+0xd0/0xd8
  [<c0749c50>] sysenter_do_call+0x12/0x22
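
For anyone decoding the trace above: each frame has the form
[<address>] symbol+offset/size [module], and a leading "?" marks an
address that was found on the stack but could not be confirmed as part
of the call chain. Below is a minimal sketch of a parser for that
layout. The struct and function names are hypothetical and are not part
of any kernel tool.

```c
#include <stdio.h>
#include <string.h>

/* One decoded "Call Trace" line. All names here are illustrative. */
struct trace_frame {
    unsigned long addr;    /* address printed inside [<...>] */
    int reliable;          /* 0 if the frame was printed with a leading '?' */
    char symbol[64];       /* function name */
    unsigned long offset;  /* bytes into the function */
    unsigned long size;    /* total size of the function in bytes */
    char module[32];       /* module name, empty for built-in code */
};

/* Returns 0 on success, -1 if the line is not a trace frame. */
static int parse_trace_frame(const char *line, struct trace_frame *f)
{
    int pos = 0;

    memset(f, 0, sizeof(*f));

    /* %n is only reached if the whole "[<addr>]" prefix matched. */
    if (sscanf(line, " [<%lx>] %n", &f->addr, &pos) != 1 || pos == 0)
        return -1;
    line += pos;

    f->reliable = 1;
    if (line[0] == '?') {
        f->reliable = 0;
        line++;
    }

    /* symbol+offset/size, optionally followed by " [module]". */
    if (sscanf(line, " %63[^+]+%lx/%lx [%31[^]]]",
               f->symbol, &f->offset, &f->size, f->module) < 3)
        return -1;

    return 0;
}
```

Applied to the first frame of the trace, this yields symbol
__cfg80211_disconnect, offset 0xf4, size 0x17a, and module cfg80211.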

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Crash in mlme.c, wireless-testing 2.6.39-wl + hacks
  2011-06-30 21:22 Crash in mlme.c, wireless-testing 2.6.39-wl + hacks Ben Greear
@ 2011-06-30 21:30 ` Johannes Berg
  2011-06-30 21:38   ` Ben Greear
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Berg @ 2011-06-30 21:30 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-wireless

On Thu, 2011-06-30 at 14:22 -0700, Ben Greear wrote:
> We see occasional crashes in mlme.c when testing a certain
> configuration:  30 stations, configured for in-kernel authentication,
> re-configure them for supplicant, let them associate, delete one of
> them.
> 
> I added a BUG_ON in __cfg80211_mlme_deauth to check for null
> bssid and it hit.
> 
> Please note this is hacked code, so it's possible it's something
> I am doing.  I'm going to add some extra checks in this method to
> keep from crashing, but it may be a while until I can test against
> clean upstream kernels for this particular config.

It'd help if you at least said what you changed. Since you say you
changed things in this area but don't say what, I don't think I'll
bother looking at this.

johannes



* Re: Crash in mlme.c, wireless-testing 2.6.39-wl + hacks
  2011-06-30 21:30 ` Johannes Berg
@ 2011-06-30 21:38   ` Ben Greear
  2011-07-01  8:10     ` Johannes Berg
  0 siblings, 1 reply; 5+ messages in thread
From: Ben Greear @ 2011-06-30 21:38 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless

On 06/30/2011 02:30 PM, Johannes Berg wrote:
> On Thu, 2011-06-30 at 14:22 -0700, Ben Greear wrote:
>> We see occasional crashes in mlme.c when testing a certain
>> configuration:  30 stations, configured for in-kernel authentication,
>> re-configure them for supplicant, let them associate, delete one of
>> them.
>>
>> I added a BUG_ON in __cfg80211_mlme_deauth to check for null
>> bssid and it hit.
>>
>> Please note this is hacked code, so it's possible it's something
>> I am doing.  I'm going to add some extra checks in this method to
>> keep from crashing, but it may be a while until I can test against
>> clean upstream kernels for this particular config.
>
> It'd help if you at least said what you changed, since you say you
> changed things in this area but don't say what I don't think I'll bother
> looking at this.

Very few significant changes in this area, but I have an unrelated
proprietary module loaded, and patches to various other parts of the
networking code.

The full tree is here if you want to take a look, or I can send
you a full unified diff:

http://dmz2.candelatech.com/git/gitweb.cgi?p=linux.wireless-testing-ct.ct/.git;a=summary

This seems to be a tricky timing-related bug.  Possibly we're only hitting
it because we're testing on an older C3 processor system that is
significantly slower than our normal test systems.

Anyway, no worries if you don't care to look at it.  Looks like we're
the only ones hitting it, and I think I have a proper enough work-around.

If and when I get a chance, I will try an untainted kernel, and will
re-post if we can reproduce the bug.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: Crash in mlme.c, wireless-testing 2.6.39-wl + hacks
  2011-06-30 21:38   ` Ben Greear
@ 2011-07-01  8:10     ` Johannes Berg
  2011-07-01 13:00       ` Ben Greear
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Berg @ 2011-07-01  8:10 UTC (permalink / raw)
  To: Ben Greear; +Cc: linux-wireless


> Very little significant changes in this area, but I've a non-related
> proprietary module loaded, and patches to various other parts of the
> networking code.
> 
> The full tree is here if you want to take a look, or I can send
> you a full unified diff:
> 
> http://dmz2.candelatech.com/git/gitweb.cgi?p=linux.wireless-testing-ct.ct/.git;a=summary

Fair enough, I don't see anything there that would impact this bug.

> Seems a tricky timing related bug, possibly we're only hitting it because
> we're testing on an older C3 processor system that is significantly slower
> than our normal test systems.

Hm. That seems odd. I didn't see anything that lacked locking either.

I think the detail that we need to investigate is what you said before:

> configured for in-kernel authentication,
> re-configure them for supplicant, let them associate, delete one of
> them.

but I don't see anything there either right now.

johannes



* Re: Crash in mlme.c, wireless-testing 2.6.39-wl + hacks
  2011-07-01  8:10     ` Johannes Berg
@ 2011-07-01 13:00       ` Ben Greear
  0 siblings, 0 replies; 5+ messages in thread
From: Ben Greear @ 2011-07-01 13:00 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless

On 07/01/2011 01:10 AM, Johannes Berg wrote:
>
>> Very little significant changes in this area, but I've a non-related
>> proprietary module loaded, and patches to various other parts of the
>> networking code.
>>
>> The full tree is here if you want to take a look, or I can send
>> you a full unified diff:
>>
>> http://dmz2.candelatech.com/git/gitweb.cgi?p=linux.wireless-testing-ct.ct/.git;a=summary
>
> Fair enough, I don't see anything there that would impact this bug.
>
>> Seems a tricky timing related bug, possibly we're only hitting it because
>> we're testing on an older C3 processor system that is significantly slower
>> than our normal test systems.
>
> Hm. That seems odd. I didn't see anything that lacked locking either.
>
> I think the detail that we need to investigate is what you said before:
>
>> configured for in-kernel authentication,
>> re-configure them for supplicant, let them associate, delete one of
>> them.
>
> but I don't see anything there either right now.

Well, would you accept a patch that checks for a null bssid, does a WARN_ON,
and bails out if one is found?  It seems to do little harm, and we verified
that the system otherwise remained stable with such a patch added...
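
The kind of check described above could look roughly like the following
userspace sketch. This is not the actual patch: WARN_ON here is a
simplified stand-in for the kernel macro, deauth_sketch() and its
arguments are hypothetical, and the -EINVAL return value is an
assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Counts how many warnings fired, so callers can observe them. */
static int warn_count;

/* Report a failed expectation without stopping the program and
 * return the truth value of the condition. */
static int warn_on_impl(int cond, const char *expr, const char *file, int line)
{
    if (cond) {
        warn_count++;
        fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
    }
    return cond;
}

/* Userspace stand-in for the kernel's WARN_ON(); like the kernel macro,
 * it evaluates to the condition so it can be used inside an if (). */
#define WARN_ON(cond) warn_on_impl(!!(cond), #cond, __FILE__, __LINE__)

/* Hypothetical, heavily simplified deauth entry point. */
static int deauth_sketch(const unsigned char *bssid, unsigned short reason)
{
    /* Warn and bail out instead of crashing on a NULL bssid. */
    if (WARN_ON(!bssid))
        return -EINVAL; /* assumed error code */

    (void)reason;
    /* ... the real function would build and send the deauth frame here ... */
    return 0;
}
```

Compared with BUG_ON, this keeps the system running and still leaves a
warning in the log that can be used to track down the caller that passed
the null pointer.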

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
