* unloading WiFi modules is usually triggering kernel crash
From: Pedro Francisco @ 2012-07-31 12:54 UTC
To: ML linux-wireless

I've noticed in the past few days a pattern: sometimes nm-applet
starts showing empty bars for the signal strength.

Running the script:
sudo ifconfig wlan0 down; sleep 1
sudo rmmod hp_wmi; sudo rmmod iwl3945; sudo rmmod iwlegacy; sudo rmmod mac80211; sudo rmmod cfg80211
sleep 2; sudo rmmod rfkill; sync
sudo modprobe rfkill; sudo modprobe cfg80211; sudo modprobe mac80211; sudo modprobe iwlegacy
sudo modprobe iwl3945; sudo modprobe hp_wmi; sleep 1; sudo ifconfig wlan0 up

usually triggers a kernel crash. This has happened twice so far. I
tried it now for the third time but it didn't crash.

Logs (running with slub_debug ):
https://dl.dropbox.com/u/1332655/WiFi-issues/notTainted-cfg80211_mlme_disassoc-WARNING.log
https://dl.dropbox.com/u/1332655/WiFi-issues/alreadyTainted-debug_print_object-WARNING.log
(debug_print_object-WARNING was caused by running the above script
rmmoding things)
https://dl.dropbox.com/u/1332655/WiFi-issues/iw_dev_scan.log
https://dl.dropbox.com/u/1332655/WiFi-issues/gshell-wifiBars_empty.png

Any ideas on what is going on? Looking at other mails around here it
seems not to be driver specific, at least the cfg80211_mlme_disassoc
part.

Thanks in Advance,
--
Pedro

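The script above unloads the modules from the most dependent one down to rfkill and then loads them back in the opposite order. A minimal dry-run sketch of that ordering, assuming a POSIX shell: `reload_plan` is a hypothetical helper name, it only prints the commands instead of running them, and it leaves out the `sleep` and `sync` steps of the original script.

```shell
# Print the unload/reload command sequence for modules listed in unload
# order (dependents first). Nothing is executed here (dry run only).
# reload_plan is a hypothetical helper, not part of any existing tool.
reload_plan() {
    iface=$1
    shift
    echo "ifconfig $iface down"
    reverse=""
    for m in "$@"; do
        echo "rmmod $m"
        # Build the reversed list so the reload happens in opposite order.
        reverse="$m $reverse"
    done
    for m in $reverse; do
        echo "modprobe $m"
    done
    echo "ifconfig $iface up"
}

# Same module order as in the report.
reload_plan wlan0 hp_wmi iwl3945 iwlegacy mac80211 cfg80211 rfkill
```

Printing rather than executing keeps the sketch safe to try on a machine where the real unload is known to panic.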
* Re: unloading WiFi modules is usually triggering kernel crash
From: John W. Linville @ 2012-07-31 13:13 UTC
To: Pedro Francisco; +Cc: ML linux-wireless, johannes

On Tue, Jul 31, 2012 at 01:54:52PM +0100, Pedro Francisco wrote:
> I've noticed in the past few days a pattern: sometimes nm-applet
> starts showing empty bars for the signal strength.
>
> Running the script:
> sudo ifconfig wlan0 down; sleep 1
> sudo rmmod hp_wmi; sudo rmmod iwl3945; sudo rmmod iwlegacy; sudo rmmod
> mac80211; sudo rmmod cfg80211
> sleep 2; sudo rmmod rfkill; sync
> sudo modprobe rfkill; sudo modprobe cfg80211; sudo modprobe mac80211;
> sudo modprobe iwlegacy
> sudo modprobe iwl3945; sudo modprobe hp_wmi; sleep 1; sudo ifconfig wlan0 up
>
> usually triggers a kernel crash. This has happened twice so far. I
> tried it now for the third time but it didn't crash.
>
> Logs (running with slub_debug ):
> https://dl.dropbox.com/u/1332655/WiFi-issues/notTainted-cfg80211_mlme_disassoc-WARNING.log
> https://dl.dropbox.com/u/1332655/WiFi-issues/alreadyTainted-debug_print_object-WARNING.log
> (debug_print_object-WARNING was caused by running the above script
> rmmoding things)
> https://dl.dropbox.com/u/1332655/WiFi-issues/iw_dev_scan.log
> https://dl.dropbox.com/u/1332655/WiFi-issues/gshell-wifiBars_empty.png
>
> Any ideas on what is going on? Looking at other mails around here it
> seems not to be driver specific, at least the cfg80211_mlme_disassoc
> part.

Looks the same as this one, FWIW...

https://bugzilla.redhat.com/show_bug.cgi?id=834158

John
--
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com		might be all we have. Be ready.

* Re: unloading WiFi modules is usually triggering kernel crash
From: Stanislaw Gruszka @ 2012-08-07 10:22 UTC
To: Pedro Francisco; +Cc: ML linux-wireless

On Tue, Jul 31, 2012 at 01:54:52PM +0100, Pedro Francisco wrote:
> I've noticed in the past few days a pattern: sometimes nm-applet
> starts showing empty bars for the signal strength.

RSSI reporting problem or maybe NM issue. When you change kernel to
older or newer does this problem go away ?

> Running the script:
> sudo ifconfig wlan0 down; sleep 1
> sudo rmmod hp_wmi; sudo rmmod iwl3945; sudo rmmod iwlegacy; sudo rmmod
> mac80211; sudo rmmod cfg80211
> sleep 2; sudo rmmod rfkill; sync
> sudo modprobe rfkill; sudo modprobe cfg80211; sudo modprobe mac80211;
> sudo modprobe iwlegacy
> sudo modprobe iwl3945; sudo modprobe hp_wmi; sleep 1; sudo ifconfig wlan0 up

I run a bit modified script (I do not have hp_wmi.ko and rfkill.ko) for few
hours, and did not get any WARNING/crash. I used 3.5, can you check if that
problem is also fixed on your system on 3.5 or newer.

Stanislaw

* Re: unloading WiFi modules is usually triggering kernel crash
From: Pedro Francisco @ 2012-08-30 15:58 UTC
To: Stanislaw Gruszka; +Cc: ML linux-wireless

On Tue, Aug 7, 2012 at 11:22 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
> On Tue, Jul 31, 2012 at 01:54:52PM +0100, Pedro Francisco wrote:
>> I've noticed in the past few days a pattern: sometimes nm-applet
>> starts showing empty bars for the signal strength.
>
> RSSI reporting problem or maybe NM issue. When you change kernel to
> older or newer does this problem go away ?
>
>> Running the script:
>> sudo ifconfig wlan0 down; sleep 1
>> sudo rmmod hp_wmi; sudo rmmod iwl3945; sudo rmmod iwlegacy; sudo rmmod
>> mac80211; sudo rmmod cfg80211
>> sleep 2; sudo rmmod rfkill; sync
>> sudo modprobe rfkill; sudo modprobe cfg80211; sudo modprobe mac80211;
>> sudo modprobe iwlegacy
>> sudo modprobe iwl3945; sudo modprobe hp_wmi; sleep 1; sudo ifconfig wlan0 up
>
> I run a bit modified script (I do not have hp_wmi.ko and rfkill.ko) for few
> hours, and did not get any WARNING/crash. I used 3.5, can you check if that
> problem is also fixed on your system on 3.5 or newer.

On 3.5.2-3.fc17.i686.PAE everything seems stable. The problem I had
described hasn't happened recently.
I guess it got fixed in the meantime.

Thank you for your time,
--
Pedro

* Re: unloading WiFi modules is usually triggering kernel crash
From: Pedro Francisco @ 2012-09-26 12:47 UTC
To: Stanislaw Gruszka; +Cc: ML linux-wireless

On Thu, Aug 30, 2012 at 4:58 PM, Pedro Francisco
<pedrogfrancisco@gmail.com> wrote:
> On Tue, Aug 7, 2012 at 11:22 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
>> On Tue, Jul 31, 2012 at 01:54:52PM +0100, Pedro Francisco wrote:
>>> I've noticed in the past few days a pattern: sometimes nm-applet
>>> starts showing empty bars for the signal strength.
>>
>> RSSI reporting problem or maybe NM issue. When you change kernel to
>> older or newer does this problem go away ?
>>
>>> Running the script:
>>> sudo ifconfig wlan0 down; sleep 1
>>> sudo rmmod hp_wmi; sudo rmmod iwl3945; sudo rmmod iwlegacy; sudo rmmod
>>> mac80211; sudo rmmod cfg80211
>>> sleep 2; sudo rmmod rfkill; sync
>>> sudo modprobe rfkill; sudo modprobe cfg80211; sudo modprobe mac80211;
>>> sudo modprobe iwlegacy
>>> sudo modprobe iwl3945; sudo modprobe hp_wmi; sleep 1; sudo ifconfig wlan0 up
>>
>> I run a bit modified script (I do not have hp_wmi.ko and rfkill.ko) for few
>> hours, and did not get any WARNING/crash. I used 3.5, can you check if that
>> problem is also fixed on your system on 3.5 or newer.
>
> On 3.5.2-3.fc17.i686.PAE everything seems stable. The problem I had
> described hasn't happened recently.
> I guess it got fixed in the meantime.

I was wrong, got it again.

So, to recap: once the network applet shows no signal, but only then,
removing the wireless modules triggers an unrecoverable kernel panic.
I still haven't compiled a relocatable x86 kernel to get a proper
backtrace using kexec/kdump, sorry.

I found something else as well.
Notice this output of "iwconfig" when everything is _normal_:
$ iwconfig wlan0
wlan0     IEEE 802.11abg  ESSID:"eduroam"
          Mode:Managed  Frequency:2.437 GHz  Access Point: B8:62:1F:XX:XX:XX
          Bit Rate=54 Mb/s   Tx-Power=15 dBm
          Retry long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=58/70  Signal level=-52 dBm
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

When I have the "empty signal bars" issue:
$ iwconfig wlan0
wlan0     IEEE 802.11abg  ESSID:off/any
          Mode:Managed  Access Point: Not-Associated   Tx-Power=15 dBm
          Retry long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off

In case you're wondering, it is connected and streaming stuff :)

I can sometimes trigger it on purpose: I just have to roam to a 5GHz
AP of the same ESS, cycle around 2GHz and back to 5GHz (using wpa_cli
roam XX:XX:XX:XX:XX ). If I get "SME: Authentication request to the
driver failed", then disabling NetworkManager (not wireless) and
reenabling will _probably_ get the "empty signal bars" (I was just
able to trigger the "empty signal bars" now after a clean boot).
So I'm guessing something gets corrupted, which is why reloading the
modules will crash.

I'm aware due to a patch to _iwlwifi_ (not iwl3945/iwlegacy) [1] that
2->5GHz roaming is not working very well on newer Intel wireless cards
so it is worth considering it is happening here as well.

Also, note some info, collected two days ago, relative to "Invalid
misc:" is getting 10 "invalid misc" packets in 10 seconds normal?
Several 'VAL=`date`; VAL="$VAL $(iwconfig wlan0 |grep "Invalid misc")"; echo $VAL' follow:
Seg Set 24 15:06:36 WEST 2012 Tx excessive retries:5 Invalid misc:133 Missed beacon:0
Seg Set 24 15:06:46 WEST 2012 Tx excessive retries:5 Invalid misc:143 Missed beacon:0
Seg Set 24 15:07:00 WEST 2012 Tx excessive retries:5 Invalid misc:148 Missed beacon:0
Seg Set 24 15:21:46 WEST 2012 Tx excessive retries:22 Invalid misc:495 Missed beacon:0
Seg Set 24 15:24:41 WEST 2012 Tx excessive retries:24 Invalid misc:593 Missed beacon:0

So, something is getting corrupted here. Do you want the full logs?

[1] http://thread.gmane.org/gmane.linux.kernel.wireless.general/89361/focus=89445

--
Pedro

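To put a number on the "10 in 10 seconds" question, the counter can be turned into a rate. A minimal sketch, assuming a POSIX shell: `invalid_misc` and `misc_rate_per_min` are hypothetical helper names, and the two readings are the 15:06:36 and 15:06:46 samples from the log above.

```shell
# Extract the "Invalid misc" counter from one line of iwconfig output.
# Prints nothing if the field is absent.
invalid_misc() {
    printf '%s\n' "$1" | sed -n 's/.*Invalid misc:\([0-9][0-9]*\).*/\1/p'
}

# Events per minute between two counter readings ($1, $2) taken $3 seconds
# apart. Integer arithmetic, rounded down.
misc_rate_per_min() {
    echo $(( ($2 - $1) * 60 / $3 ))
}

first=$(invalid_misc "Tx excessive retries:5 Invalid misc:133 Missed beacon:0")
second=$(invalid_misc "Tx excessive retries:5 Invalid misc:143 Missed beacon:0")
misc_rate_per_min "$first" "$second" 10   # prints 60
```

A per-minute figure makes samples taken at uneven intervals, like the later ones in the log, easier to compare.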
* Re: unloading WiFi modules is usually triggering kernel crash
From: Stanislaw Gruszka @ 2012-10-03 14:30 UTC
To: Pedro Francisco; +Cc: ML linux-wireless, Johannes Berg

On Wed, Sep 26, 2012 at 01:47:18PM +0100, Pedro Francisco wrote:
> On Thu, Aug 30, 2012 at 4:58 PM, Pedro Francisco
> <pedrogfrancisco@gmail.com> wrote:
> > On Tue, Aug 7, 2012 at 11:22 AM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
> >> On Tue, Jul 31, 2012 at 01:54:52PM +0100, Pedro Francisco wrote:
> >>> I've noticed in the past few days a pattern: sometimes nm-applet
> >>> starts showing empty bars for the signal strength.
> >>
> >> RSSI reporting problem or maybe NM issue. When you change kernel to
> >> older or newer does this problem go away ?
> >>
> >>> Running the script:
> >>> sudo ifconfig wlan0 down; sleep 1
> >>> sudo rmmod hp_wmi; sudo rmmod iwl3945; sudo rmmod iwlegacy; sudo rmmod
> >>> mac80211; sudo rmmod cfg80211
> >>> sleep 2; sudo rmmod rfkill; sync
> >>> sudo modprobe rfkill; sudo modprobe cfg80211; sudo modprobe mac80211;
> >>> sudo modprobe iwlegacy
> >>> sudo modprobe iwl3945; sudo modprobe hp_wmi; sleep 1; sudo ifconfig wlan0 up
> >>
> >> I run a bit modified script (I do not have hp_wmi.ko and rfkill.ko) for few
> >> hours, and did not get any WARNING/crash. I used 3.5, can you check if that
> >> problem is also fixed on your system on 3.5 or newer.
> >
> > On 3.5.2-3.fc17.i686.PAE everything seems stable. The problem I had
> > described hasn't happened recently.
> > I guess it got fixed in the meantime.
>
> I was wrong, got it again.
>
> So, to recap: once the network applet shows no signal, but only then,
> removing the wireless modules triggers an unrecoverable kernel panic.
> I still haven't compiled a relocatable x86 kernel to get a proper
> backtrace using kexec/kdump, sorry.
>
> I found something else as well. Notice this output of "iwconfig" when
> everything is _normal_:
> $ iwconfig wlan0
> wlan0 IEEE 802.11abg ESSID:"eduroam"
> Mode:Managed Frequency:2.437 GHz Access Point: B8:62:1F:XX:XX:XX
> Bit Rate=54 Mb/s Tx-Power=15 dBm
> Retry long limit:7 RTS thr:off Fragment thr:off
> Power Management:off
> Link Quality=58/70 Signal level=-52 dBm
> Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
> Tx excessive retries:0 Invalid misc:0 Missed beacon:0
>
> When I have the "empty signal bars" issue:
> $ iwconfig wlan0
> wlan0 IEEE 802.11abg ESSID:off/any
> Mode:Managed Access Point: Not-Associated Tx-Power=15 dBm
> Retry long limit:7 RTS thr:off Fragment thr:off
> Power Management:off
>
> In case you're wondering, it is connected and streaming stuff :)
>
> I can sometimes trigger it on purpose: I just have to roam to a 5GHz
> AP of the same ESS, cycle around 2GHz and back to 5GHz (using wpa_cli
> roam XX:XX:XX:XX:XX ). If I get "SME: Authentication request to the
> driver failed", then disabling NetworkManager (not wireless) and
> reenabling will _probably_ get the "empty signal bars" (I was just
> able to trigger the "empty signal bars" now after a clean boot).
> So I'm guessing something gets corrupted, which is why reloading the
> modules will crash.

We do not stop mac80211 timers on module unload. I reproduced below
warnings with iwlwifi on 3.5 kernel with DEBUG_OBJECTS enabled.
I forced roaming many times, and then do "modprobe -r iwlwifi".
Unfortunately those steps do not trigger warnings anytime, they
happened just once.

iwlwifi 0000:02:00.0: ACTIVATE a non DRIVER active station id 0 addr 6c:50:4d:3f:79:73
------------[ cut here ]------------
WARNING: at lib/debugobjects.c:261 debug_print_object+0x8e/0xb0()
Hardware name: SandyBridge Platform
ODEBUG: free active (active state 0) object type: timer_list hint: ieee80211_sta_conn_mon_timer+0x0/0x40 [mac80211]
Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq
freq_table mperf ipv6 uinput arc4 sg iwlwifi(-) mac80211 cfg80211 rfkill
coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode pcspkr
lpc_ich mfd_core i2c_i801 e1000e ext4 mbcache jbd2 sd_mod crc_t10dif
sr_mod cdrom aesni_intel cryptd aes_x86_64 aes_generic ahci libahci i915
drm_kms_helper drm i2c_algo_bit i2c_core video dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 3064, comm: modprobe Not tainted 3.5.0 #1
Call Trace:
 [<ffffffff810535af>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff810536a6>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff812901be>] debug_print_object+0x8e/0xb0
 [<ffffffffa03a09b0>] ? ieee80211_chswitch_timer+0x40/0x40 [mac80211]
 [<ffffffff81290a0d>] __debug_check_no_obj_freed+0x10d/0x200
 [<ffffffff81290b1d>] debug_check_no_obj_freed+0x1d/0x30
 [<ffffffff8117a2b0>] kfree+0xc0/0x330
 [<ffffffff810b9083>] ? __lock_release+0x133/0x1a0
 [<ffffffff815555f0>] ? _raw_spin_unlock_irqrestore+0x40/0x80
 [<ffffffff814957c4>] netdev_release+0x44/0x60
 [<ffffffff813704b7>] device_release+0x27/0xa0
 [<ffffffff8127da42>] kobject_cleanup+0x82/0x1b0
 [<ffffffff8127db7d>] kobject_release+0xd/0x10
 [<ffffffff8127d8cc>] kobject_put+0x2c/0x60
 [<ffffffff8147e371>] netdev_run_todo+0x101/0x180
 [<ffffffff8148f5ae>] rtnl_unlock+0xe/0x10
 [<ffffffffa0366178>] ieee80211_unregister_hw+0x58/0x120 [mac80211]
 [<ffffffffa040912b>] iwlagn_mac_unregister+0x2b/0x40 [iwlwifi]
 [<ffffffffa03fdf59>] iwl_op_mode_dvm_stop+0x49/0xf0 [iwlwifi]
 [<ffffffffa041f730>] iwl_drv_stop+0x40/0x60 [iwlwifi]
 [<ffffffffa0430a39>] iwl_pci_remove+0x25/0x3c [iwlwifi]
 [<ffffffff812aafc2>] pci_device_remove+0x52/0x120
 [<ffffffff813741cc>] __device_release_driver+0x7c/0xe0
 [<ffffffff81374308>] driver_detach+0xd8/0xe0
 [<ffffffff81372f61>] bus_remove_driver+0x91/0x110
 [<ffffffff81374fd2>] driver_unregister+0x62/0xa0
 [<ffffffff812ab2b4>] pci_unregister_driver+0x44/0xa0
 [<ffffffffa041f3d5>] iwl_pci_unregister_driver+0x15/0x20 [iwlwifi]
 [<ffffffffa0430a01>] iwl_exit+0x9/0x1c [iwlwifi]
 [<ffffffff810c50f1>] sys_delete_module+0x1d1/0x2c0
 [<ffffffff81555855>] ? retint_swapgs+0x13/0x1b
 [<ffffffff810e169c>] ? __audit_syscall_entry+0xcc/0x210
 [<ffffffff812896ce>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8155de69>] system_call_fastpath+0x16/0x1b
---[ end trace 8070f580fc119b8b ]---
------------[ cut here ]------------
WARNING: at lib/debugobjects.c:261 debug_print_object+0x8e/0xb0()
Hardware name: SandyBridge Platform
ODEBUG: free active (active state 0) object type: timer_list hint: ieee80211_sta_bcn_mon_timer+0x0/0x40 [mac80211]
Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq
freq_table mperf ipv6 uinput arc4 sg iwlwifi(-) mac80211 cfg80211 rfkill
coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode pcspkr
lpc_ich mfd_core i2c_i801 e1000e ext4 mbcache jbd2 sd_mod crc_t10dif
sr_mod cdrom aesni_intel cryptd aes_x86_64 aes_generic ahci libahci i915
drm_kms_helper drm i2c_algo_bit i2c_core video dm_mirror dm_region_hash
dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 3064, comm: modprobe Tainted: G W 3.5.0 #1
Call Trace:
 [<ffffffff810535af>] warn_slowpath_common+0x7f/0xc0
 [<ffffffff810536a6>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff812901be>] debug_print_object+0x8e/0xb0
 [<ffffffffa03a09f0>] ? ieee80211_sta_conn_mon_timer+0x40/0x40 [mac80211]
 [<ffffffff81290a0d>] __debug_check_no_obj_freed+0x10d/0x200
 [<ffffffff81290b1d>] debug_check_no_obj_freed+0x1d/0x30
 [<ffffffff8117a2b0>] kfree+0xc0/0x330
 [<ffffffff810b9083>] ? __lock_release+0x133/0x1a0
 [<ffffffff815555f0>] ? _raw_spin_unlock_irqrestore+0x40/0x80
 [<ffffffff814957c4>] netdev_release+0x44/0x60
 [<ffffffff813704b7>] device_release+0x27/0xa0
 [<ffffffff8127da42>] kobject_cleanup+0x82/0x1b0
 [<ffffffff8127db7d>] kobject_release+0xd/0x10
 [<ffffffff8127d8cc>] kobject_put+0x2c/0x60
 [<ffffffff8147e371>] netdev_run_todo+0x101/0x180
 [<ffffffff8148f5ae>] rtnl_unlock+0xe/0x10
 [<ffffffffa0366178>] ieee80211_unregister_hw+0x58/0x120 [mac80211]
 [<ffffffffa040912b>] iwlagn_mac_unregister+0x2b/0x40 [iwlwifi]
 [<ffffffffa03fdf59>] iwl_op_mode_dvm_stop+0x49/0xf0 [iwlwifi]
 [<ffffffffa041f730>] iwl_drv_stop+0x40/0x60 [iwlwifi]
 [<ffffffffa0430a39>] iwl_pci_remove+0x25/0x3c [iwlwifi]
 [<ffffffff812aafc2>] pci_device_remove+0x52/0x120
 [<ffffffff813741cc>] __device_release_driver+0x7c/0xe0
 [<ffffffff81374308>] driver_detach+0xd8/0xe0
 [<ffffffff81372f61>] bus_remove_driver+0x91/0x110
 [<ffffffff81374fd2>] driver_unregister+0x62/0xa0
 [<ffffffff812ab2b4>] pci_unregister_driver+0x44/0xa0
 [<ffffffffa041f3d5>] iwl_pci_unregister_driver+0x15/0x20 [iwlwifi]
 [<ffffffffa0430a01>] iwl_exit+0x9/0x1c [iwlwifi]
 [<ffffffff810c50f1>] sys_delete_module+0x1d1/0x2c0
 [<ffffffff81555855>] ? retint_swapgs+0x13/0x1b
 [<ffffffff810e169c>] ? __audit_syscall_entry+0xcc/0x210
 [<ffffffff812896ce>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8155de69>] system_call_fastpath+0x16/0x1b
---[ end trace 8070f580fc119b8c ]---
Bridge firewalling registered

> Also, note some info, collected two days ago, relative to "Invalid
> misc:" is getting 10 "invalid misc" packets in 10 seconds normal?
> Several 'VAL=`date`; VAL="$VAL $(iwconfig wlan0 |grep "Invalid
> misc")"; echo $VAL' follow:
> Seg Set 24 15:06:36 WEST 2012 Tx excessive retries:5 Invalid misc:133
> Missed beacon:0
> Seg Set 24 15:06:46 WEST 2012 Tx excessive retries:5 Invalid misc:143
> Missed beacon:0
> Seg Set 24 15:07:00 WEST 2012 Tx excessive retries:5 Invalid misc:148
> Missed beacon:0
> Seg Set 24 15:21:46 WEST 2012 Tx excessive retries:22 Invalid misc:495
> Missed beacon:0
> Seg Set 24 15:24:41 WEST 2012 Tx excessive retries:24 Invalid misc:593
> Missed beacon:0

I see a lot of that. It can be caused by a noisy radio environment, but
it can also be a firmware/driver bug. Unfortunately, those kinds of bugs
are not easy to fix.

Stanislaw

* Re: unloading WiFi modules is usually triggering kernel crash
From: Pedro Francisco @ 2012-10-09 9:14 UTC
To: Stanislaw Gruszka; +Cc: ML linux-wireless, Johannes Berg

On Wed, Oct 3, 2012 at 3:30 PM, Stanislaw Gruszka <sgruszka@redhat.com> wrote:
> On Wed, Sep 26, 2012 at 01:47:18PM +0100, Pedro Francisco wrote:
>> So I'm guessing something gets corrupted, which is why reloading the
>> modules will crash.
>
> We do not stop mac80211 timers on module unload. I reproduced below
> warnings with iwlwifi on 3.5 kernel with DEBUG_OBJECTS enabled.
> I forced roaming many times, and then do "modprobe -r iwlwifi".
> Unfortunately those steps do not trigger warnings anytime, they
> happened just once.
>
> [...]

Hi!
I was finally able to compile a relocatable kernel: here's what I got,
after a crash on iwlegacy module removal:

# crash vmcore /usr/lib/debug/lib/modules/`uname -r`/vmlinux

crash 6.0.8-1.fc18 [note, I'm on FC17 but had to install FC18's crash
to workaround a log structure change in 3.5 kernel]
(...)
This GDB was configured as "i686-pc-linux-gnu"...

============ CRASH 1 ============
      KERNEL: /usr/lib/debug/lib/modules/3.5.4-2.pedro.fc17.i686.PAE/vmlinux
    DUMPFILE: vmcore  [PARTIAL DUMP]
        CPUS: 2
        DATE: Tue Oct 9 09:17:35 2012
      UPTIME: 00:12:47
LOAD AVERAGE: 0.50, 0.62, 0.60
       TASKS: 322
    NODENAME: s2
     RELEASE: 3.5.4-2.pedro.fc17.i686.PAE
     VERSION: #1 SMP Mon Oct 8 23:15:44 WEST 2012
     MACHINE: i686  (1496 Mhz)
      MEMORY: 2 GB
       PANIC: "Oops: 0000 [#1] SMP " (check log for details)
         PID: 0
     COMMAND: "swapper/1"
        TASK: f4104240  (1 of 2)  [THREAD_INFO: f4146000]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 0  TASK: f4104240  CPU: 1  COMMAND: "swapper/1"
 #0 [f4147db4] crash_kexec at c04a7d59
 #1 [f4147e04] timerqueue_add at c0675503
 #2 [f4147e14] ktime_get at c04921ee
 #3 [f4147e30] bad_area_nosemaphore at c0958328
 #4 [f4147e3c] do_page_fault at c0964c25
 #5 [f4147eb8] error_code (via page_fault) at c0961eb1
    EAX: 6b6b6b6b  EBX: 00072420  ECX: 00000001  EDX: f4178930  EBP: f4147f30
    DS:  007b      ESI: 00000024  ES:  007b      EDI: 00072420  GS:  00e0
    CS:  0060      EIP: c0457993  ERR: ffffffff  EFLAGS: 00010003
 #6 [f4147eec] get_next_timer_interrupt at c0457993
 #7 [f4147f34] tick_nohz_stop_sched_tick.isra.11 at c049a19a
 #8 [f4147f78] tick_nohz_idle_enter at c049a661
 #9 [f4147f80] cpu_idle at c04195da

============ CRASH 2 ============
      KERNEL: /usr/lib/debug/lib/modules/3.5.4-2.pedro.fc17.i686.PAE/vmlinux
    DUMPFILE: vmcore
        CPUS: 2
        DATE: Tue Oct 9 09:29:35 2012
      UPTIME: 00:10:22
LOAD AVERAGE: 0.30, 0.78, 0.67
       TASKS: 323
    NODENAME: s2
     RELEASE: 3.5.4-2.pedro.fc17.i686.PAE
     VERSION: #1 SMP Mon Oct 8 23:15:44 WEST 2012
     MACHINE: i686  (1496 Mhz)
      MEMORY: 2 GB
       PANIC: "kernel BUG at kernel/timer.c:1091!"
         PID: 6563
     COMMAND: "rpm"   <-- ?
        TASK: eaa5dcc0  [THREAD_INFO: f414a000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 6563  TASK: eaa5dcc0  CPU: 0  COMMAND: "rpm"
bt: cannot resolve stack trace:
 #0 [f414bd58] __schedule at c095fba6
 #1 [f414bdd4] sched_clock_local at c047a56d
bt: text symbols on stack:
    [f414bd5c] kmap_atomic_prot at c0441244
    [f414bd70] __kunmap_atomic at c04410dd
    [f414bd80] get_page_from_freelist at c0504e40
    [f414bdcc] sched_clock at c0417a28
    [f414bdd4] sched_clock_local at c047a572
    [f414be28] update_curr at c047cdb2
    [f414be60] clear_nohz_tick_stopped.part.37 at c0958f63
    [f414be6c] trigger_load_balance at c047ff73
    [f414be88] scheduler_tick at c04774e5
    [f414beac] timerqueue_add at c0675508
    [f414bec0] ktime_get at c04921f0
    [f414bed4] lapic_next_event at c042f75b
    [f414bedc] clockevents_program_event at c049877d
    [f414bef4] tick_program_event at c0499a79
    [f414bf04] hrtimer_interrupt at c046bcc8
    [f414bf54] irq_exit at c045001d
    [f414bf5c] smp_apic_timer_interrupt at c042fdbe
    [f414bf74] apic_timer_interrupt at c0961c85
    [f414bfa8] sysenter_past_esp at c0968322
bt: possible exception frame:
USER-MODE EXCEPTION FRAME AT f414bfb4:
    EAX: 0000002d  EBX: 09fba000  ECX: 45a4bff4  EDX: 09fba000
    DS:  007b      ESI: 09f99000  ES:  007b      EDI: 09fba000
    SS:  007b      ESP: bff95804  EBP: bff95804  GS:  0033
    CS:  0073      EIP: b7720424  ERR: 0000002d  EFLAGS: 00000202

So, I'm guessing this means it is related to what you found on iwlwifi
(even if I'm on iwlegacy)?

The crash kernel crashed again but I can try to add a script to try to
recover dmesg -- I believe slub_debug caught something as well...

--
Pedro

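One detail in CRASH 1 may be worth pointing out: EAX holds 6b6b6b6b at the faulting address in get_next_timer_interrupt. 0x6b is the byte that the kernel's slab poisoning writes over freed objects (POISON_FREE), so the value would be consistent with the timer code following a pointer read out of already-freed memory. That is an inference from the dump, assuming poisoning (e.g. slub_debug) was active, not something the dump proves. A small sketch for spotting such values in a register dump, assuming a grep that supports `-o` (GNU and BSD grep do); `poisoned_regs` is a hypothetical helper name:

```shell
# Read a register dump on stdin and print every register whose value is a
# slab poison pattern: 0x6b = POISON_FREE (freed object),
# 0x5a = POISON_INUSE (allocated but never initialized).
# poisoned_regs is a hypothetical helper, not part of the crash utility.
poisoned_regs() {
    grep -oiE '[A-Z]{2,6}: (6b6b6b6b|5a5a5a5a)' | sed 's/: / = 0x/'
}

printf '%s\n' 'EAX: 6b6b6b6b EBX: 00072420 ECX: 00000001 EDX: f4178930' | poisoned_regs
# prints: EAX = 0x6b6b6b6b
```

The same filter can be fed the saved output of `bt` from the crash session.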
* Re: unloading WiFi modules is usually triggering kernel crash
  2012-10-09  9:14 ` Pedro Francisco
@ 2012-10-12 12:13 ` Stanislaw Gruszka
  2012-10-15 11:03   ` Johannes Berg
  2012-10-15 15:48   ` Pedro Francisco
  0 siblings, 2 replies; 10+ messages in thread
From: Stanislaw Gruszka @ 2012-10-12 12:13 UTC (permalink / raw)
  To: Pedro Francisco; +Cc: ML linux-wireless, Johannes Berg

On Tue, Oct 09, 2012 at 10:14:40AM +0100, Pedro Francisco wrote:
> So, I'm guessing this means it is related to what you found on iwlwifi
> (even if I'm on iwlegacy)?

Yes, this seems to be a cfg80211 problem. I think the crash happens because cfg80211 is in the disassociated state (i.e. wdev->current_bss is NULL) while mac80211 erroneously stays in the associated state. So while unloading the module, in cfg80211_mlme_down(), we do not call ieee80211_deauth().

I think this state mismatch is caused by wrong behaviour in __cfg80211_mlme_deauth(). The patch below tries to correct that. Can you check whether it prevents the crash? In my environment I cannot reproduce this problem reliably.

Thanks
Stanislaw

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index ab78b53..9b99b60 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -1218,6 +1218,7 @@ struct cfg80211_deauth_request {
 	const u8 *ie;
 	size_t ie_len;
 	u16 reason_code;
+	bool local_state_change;
 };
 
 /**
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index e714ed8..e510a33 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -3549,6 +3549,7 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata,
 {
 	struct ieee80211_if_managed *ifmgd = &sdata->u.mgd;
 	u8 frame_buf[IEEE80211_DEAUTH_FRAME_LEN];
+	bool tx = !req->local_state_change;
 
 	mutex_lock(&ifmgd->mtx);
 
@@ -3565,12 +3566,12 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata,
 	if (ifmgd->associated &&
 	    ether_addr_equal(ifmgd->associated->bssid, req->bssid)) {
 		ieee80211_set_disassoc(sdata, IEEE80211_STYPE_DEAUTH,
-				       req->reason_code, true, frame_buf);
+				       req->reason_code, tx, frame_buf);
 	} else {
 		drv_mgd_prepare_tx(sdata->local, sdata);
 		ieee80211_send_deauth_disassoc(sdata, req->bssid,
 					       IEEE80211_STYPE_DEAUTH,
-					       req->reason_code, true,
+					       req->reason_code, tx,
 					       frame_buf);
 	}
 
diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c
index 3df195a..4954010 100644
--- a/net/wireless/mlme.c
+++ b/net/wireless/mlme.c
@@ -457,21 +457,11 @@ int __cfg80211_mlme_deauth(struct cfg80211_registered_device *rdev,
 		.reason_code = reason,
 		.ie = ie,
 		.ie_len = ie_len,
+		.local_state_change = local_state_change,
 	};
 
 	ASSERT_WDEV_LOCK(wdev);
 
-	if (local_state_change) {
-		if (wdev->current_bss &&
-		    ether_addr_equal(wdev->current_bss->pub.bssid, bssid)) {
-			cfg80211_unhold_bss(wdev->current_bss);
-			cfg80211_put_bss(&wdev->current_bss->pub);
-			wdev->current_bss = NULL;
-		}
-
-		return 0;
-	}
-
 	return rdev->ops->deauth(&rdev->wiphy, dev, &req);
 }
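To make the state mismatch described in this message easier to follow, here is a deliberately simplified, user-space C model of it. Everything in it is hypothetical (the struct names, the fields, the `timers_armed` flag) and it is not the kernel code; in particular, the real cfg80211 state is cleared through callbacks rather than directly. It only illustrates why handling a local-only deauth entirely in the upper layer can leave the lower layer associated, and how forwarding the request with a "do not transmit" flag avoids that.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cfg80211 side of the state. */
struct upper_layer {
	const char *current_bss;	/* NULL means "not associated" */
};

/* Hypothetical stand-in for the mac80211 side of the state. */
struct lower_layer {
	bool associated;
	bool timers_armed;	/* timers that must be stopped before unload */
	int frames_sent;	/* deauth frames actually transmitted */
};

/* Lower-layer deauth: always tears down local state, and transmits a
 * frame only when tx is true. */
void lower_deauth(struct lower_layer *lo, bool tx)
{
	if (tx)
		lo->frames_sent++;
	lo->associated = false;
	lo->timers_armed = false;
}

/* Pre-patch shape: a local-only deauth never reaches the lower layer. */
void upper_deauth_old(struct upper_layer *up, struct lower_layer *lo,
		      bool local_state_change)
{
	if (local_state_change) {
		up->current_bss = NULL;	/* lower layer is left untouched */
		return;
	}
	lower_deauth(lo, true);
	up->current_bss = NULL;
}

/* Post-patch shape: always forward, and let the flag decide whether a
 * frame goes out. */
void upper_deauth_new(struct upper_layer *up, struct lower_layer *lo,
		      bool local_state_change)
{
	lower_deauth(lo, !local_state_change);
	up->current_bss = NULL;
}
```

With `upper_deauth_old()` and `local_state_change` set, the model ends with `current_bss == NULL` but `associated` and `timers_armed` still true, which mirrors the situation described above where a later teardown sees nothing to deauthenticate. With `upper_deauth_new()` both layers end up disassociated and no frame is sent.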
* Re: unloading WiFi modules is usually triggering kernel crash 2012-10-12 12:13 ` Stanislaw Gruszka @ 2012-10-15 11:03 ` Johannes Berg 2012-10-15 15:48 ` Pedro Francisco 1 sibling, 0 replies; 10+ messages in thread From: Johannes Berg @ 2012-10-15 11:03 UTC (permalink / raw) To: Stanislaw Gruszka; +Cc: Pedro Francisco, ML linux-wireless On Fri, 2012-10-12 at 14:13 +0200, Stanislaw Gruszka wrote: > On Tue, Oct 09, 2012 at 10:14:40AM +0100, Pedro Francisco wrote: > > So, I'm guessing this means it is related to what you found on iwlwifi > > (even if I'm on iwlegacy)? > > Yes, this seems to be cfg80211 problem. I think crash happen because > cfg80211 is in disassociate state (i.e. has wdev->current_bss NULL) and > erroneously mac80211 stays in associate state. So while we unload > module cfg80211_mlme_down() we do not call ieee80211_deauth(). > > I think this state mishmash happens because wrong behaviour on > __cfg80211_mlme_deauth(). Below patch try to correct that. > Can you check if it prevent a crash? On my environment I can > not reproduce this problem reliably. Ugh, yeah, what was I thinking with the code below ... ?? 
> diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h > index ab78b53..9b99b60 100644 > --- a/include/net/cfg80211.h > +++ b/include/net/cfg80211.h > @@ -1218,6 +1218,7 @@ struct cfg80211_deauth_request { > const u8 *ie; > size_t ie_len; > u16 reason_code; > + bool local_state_change; > }; > > /** > diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c > index e714ed8..e510a33 100644 > --- a/net/mac80211/mlme.c > +++ b/net/mac80211/mlme.c > @@ -3549,6 +3549,7 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata, > { > struct ieee80211_if_managed *ifmgd = &sdata->u.mgd; > u8 frame_buf[IEEE80211_DEAUTH_FRAME_LEN]; > + bool tx = !req->local_state_change; > > mutex_lock(&ifmgd->mtx); > > @@ -3565,12 +3566,12 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata, > if (ifmgd->associated && > ether_addr_equal(ifmgd->associated->bssid, req->bssid)) { > ieee80211_set_disassoc(sdata, IEEE80211_STYPE_DEAUTH, > - req->reason_code, true, frame_buf); > + req->reason_code, tx, frame_buf); > } else { > drv_mgd_prepare_tx(sdata->local, sdata); > ieee80211_send_deauth_disassoc(sdata, req->bssid, > IEEE80211_STYPE_DEAUTH, > - req->reason_code, true, > + req->reason_code, tx, > frame_buf); > } > > diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c > index 3df195a..4954010 100644 > --- a/net/wireless/mlme.c > +++ b/net/wireless/mlme.c > @@ -457,21 +457,11 @@ int __cfg80211_mlme_deauth(struct cfg80211_registered_device *rdev, > .reason_code = reason, > .ie = ie, > .ie_len = ie_len, > + .local_state_change = local_state_change, > }; > > ASSERT_WDEV_LOCK(wdev); > > - if (local_state_change) { > - if (wdev->current_bss && > - ether_addr_equal(wdev->current_bss->pub.bssid, bssid)) { > - cfg80211_unhold_bss(wdev->current_bss); > - cfg80211_put_bss(&wdev->current_bss->pub); > - wdev->current_bss = NULL; > - } > - > - return 0; > - } > - This looks fine to me. Probably needs Cc: stable? 
Then again, maybe if the deauth request is for a BSS that *isn't* the current BSS we should "swallow" it in cfg80211? IOW, something like

	if (local_state_change &&
	    (!wdev->current_bss || !ether_addr_equal(...)))
		return 0;

since neither mac80211 nor cfg80211 tracks authentication... Doesn't matter much though.

johannes
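The condition suggested in this reply can be written out as a small stand-alone predicate. This is only a sketch under assumptions: the function name and `MAC_LEN` are made up for illustration, BSSIDs are modelled as plain 6-byte arrays, and `memcmp` stands in for the kernel's `ether_addr_equal()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6	/* length of an Ethernet/BSSID address in bytes */

/* Hypothetical predicate: should a deauth request be dropped in the upper
 * layer instead of being forwarded? Only local-only requests that do not
 * target the current BSS qualify. current_bssid is NULL when there is no
 * current BSS. */
bool swallow_local_deauth(bool local_state_change,
			  const unsigned char *current_bssid,
			  const unsigned char *req_bssid)
{
	return local_state_change &&
	       (current_bssid == NULL ||
		memcmp(current_bssid, req_bssid, MAC_LEN) != 0);
}
```

A local-only request for the BSS we are currently associated with is still forwarded, so the lower layer gets the chance to tear down its own state; anything that is not local-only is always forwarded.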
* Re: unloading WiFi modules is usually triggering kernel crash 2012-10-12 12:13 ` Stanislaw Gruszka 2012-10-15 11:03 ` Johannes Berg @ 2012-10-15 15:48 ` Pedro Francisco 1 sibling, 0 replies; 10+ messages in thread From: Pedro Francisco @ 2012-10-15 15:48 UTC (permalink / raw) To: Stanislaw Gruszka; +Cc: ML linux-wireless, Johannes Berg [-- Attachment #1: Type: text/plain, Size: 3578 bytes --] On Fri, Oct 12, 2012 at 1:13 PM, Stanislaw Gruszka <sgruszka@redhat.com> wrote: > On Tue, Oct 09, 2012 at 10:14:40AM +0100, Pedro Francisco wrote: >> So, I'm guessing this means it is related to what you found on iwlwifi >> (even if I'm on iwlegacy)? > > Yes, this seems to be cfg80211 problem. I think crash happen because > cfg80211 is in disassociate state (i.e. has wdev->current_bss NULL) and > erroneously mac80211 stays in associate state. So while we unload > module cfg80211_mlme_down() we do not call ieee80211_deauth(). > > I think this state mishmash happens because wrong behaviour on > __cfg80211_mlme_deauth(). Below patch try to correct that. > Can you check if it prevent a crash? On my environment I can > not reproduce this problem reliably. 
> > Thanks > Stanislaw > > diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h > index ab78b53..9b99b60 100644 > --- a/include/net/cfg80211.h > +++ b/include/net/cfg80211.h > @@ -1218,6 +1218,7 @@ struct cfg80211_deauth_request { > const u8 *ie; > size_t ie_len; > u16 reason_code; > + bool local_state_change; > }; > > /** > diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c > index e714ed8..e510a33 100644 > --- a/net/mac80211/mlme.c > +++ b/net/mac80211/mlme.c > @@ -3549,6 +3549,7 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata, > { > struct ieee80211_if_managed *ifmgd = &sdata->u.mgd; > u8 frame_buf[IEEE80211_DEAUTH_FRAME_LEN]; > + bool tx = !req->local_state_change; > > mutex_lock(&ifmgd->mtx); > > @@ -3565,12 +3566,12 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata, > if (ifmgd->associated && > ether_addr_equal(ifmgd->associated->bssid, req->bssid)) { > ieee80211_set_disassoc(sdata, IEEE80211_STYPE_DEAUTH, > - req->reason_code, true, frame_buf); > + req->reason_code, tx, frame_buf); > } else { > drv_mgd_prepare_tx(sdata->local, sdata); > ieee80211_send_deauth_disassoc(sdata, req->bssid, > IEEE80211_STYPE_DEAUTH, > - req->reason_code, true, > + req->reason_code, tx, > frame_buf); > } > > diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c > index 3df195a..4954010 100644 > --- a/net/wireless/mlme.c > +++ b/net/wireless/mlme.c > @@ -457,21 +457,11 @@ int __cfg80211_mlme_deauth(struct cfg80211_registered_device *rdev, > .reason_code = reason, > .ie = ie, > .ie_len = ie_len, > + .local_state_change = local_state_change, > }; > > ASSERT_WDEV_LOCK(wdev); > > - if (local_state_change) { > - if (wdev->current_bss && > - ether_addr_equal(wdev->current_bss->pub.bssid, bssid)) { > - cfg80211_unhold_bss(wdev->current_bss); > - cfg80211_put_bss(&wdev->current_bss->pub); > - wdev->current_bss = NULL; > - } > - > - return 0; > - } > - > return rdev->ops->deauth(&rdev->wiphy, dev, &req); > } > I've been testing the patch since this 
morning (GMT), I can't reproduce any of the issues I referred to in this thread (had to adapt the patch slightly, though). Seems to be fixed!

Thank you for your help!
-- 
Pedro Francisco

[-- Attachment #2: mlme-timers-fedora-3.6.1-fc17-kernel.patch --]
[-- Type: application/octet-stream, Size: 1921 bytes --]

diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index 3d254e1..f10553c 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -1217,6 +1217,7 @@ struct cfg80211_deauth_request {
 	const u8 *ie;
 	size_t ie_len;
 	u16 reason_code;
+	bool local_state_change;
 };
 
 /**
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index f76b833..da3f5e4 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -3457,6 +3457,7 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata,
 {
 	struct ieee80211_if_managed *ifmgd = &sdata->u.mgd;
 	u8 frame_buf[DEAUTH_DISASSOC_LEN];
+	bool tx = !req->local_state_change;
 
 	mutex_lock(&ifmgd->mtx);
 
@@ -3473,11 +3474,11 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata,
 	if (ifmgd->associated &&
 	    ether_addr_equal(ifmgd->associated->bssid, req->bssid))
 		ieee80211_set_disassoc(sdata, IEEE80211_STYPE_DEAUTH,
-				       req->reason_code, true, frame_buf);
+				       req->reason_code, tx, frame_buf);
 	else
 		ieee80211_send_deauth_disassoc(sdata, req->bssid,
 					       IEEE80211_STYPE_DEAUTH,
-					       req->reason_code, true,
+					       req->reason_code, tx,
 					       frame_buf);
 
 	mutex_unlock(&ifmgd->mtx);
diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c
index 1cdb1d5..0877efb 100644
--- a/net/wireless/mlme.c
+++ b/net/wireless/mlme.c
@@ -457,21 +457,11 @@ int __cfg80211_mlme_deauth(struct cfg80211_registered_device *rdev,
 		.reason_code = reason,
 		.ie = ie,
 		.ie_len = ie_len,
+		.local_state_change = local_state_change,
 	};
 
 	ASSERT_WDEV_LOCK(wdev);
 
-	if (local_state_change) {
-		if (wdev->current_bss &&
-		    ether_addr_equal(wdev->current_bss->pub.bssid, bssid)) {
-			cfg80211_unhold_bss(wdev->current_bss);
-			cfg80211_put_bss(&wdev->current_bss->pub);
-			wdev->current_bss = NULL;
-		}
-
-		return 0;
-	}
-
 	return rdev->ops->deauth(&rdev->wiphy, dev, &req);
 }