* Suspicious RCU usage in mac80211 @ 2012-04-10 3:19 Larry Finger 2012-04-12 3:31 ` Johannes Berg 0 siblings, 1 reply; 25+ messages in thread From: Larry Finger @ 2012-04-10 3:19 UTC (permalink / raw) To: Johannes Berg; +Cc: wireless When running kernel 3.4-rc2 from wireless testing, I got the following logged: [ 2299.344437] =============================== [ 2299.344443] [ INFO: suspicious RCU usage. ] [ 2299.344452] 3.4.0-rc2-wl+ #222 Not tainted [ 2299.344458] ------------------------------- [ 2299.344466] net/mac80211/sta_info.h:449 suspicious rcu_dereference_protected() usage! [ 2299.344472] [ 2299.344474] other info that might help us debug this: [ 2299.344477] [ 2299.344484] [ 2299.344486] rcu_scheduler_active = 1, debug_locks = 0 [ 2299.344493] 1 lock held by swapper/1/0: [ 2299.344499] #0: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104850a>] run_timer_softirq+0xfa/0x6e0 [ 2299.344533] [ 2299.344535] stack backtrace: [ 2299.344544] Pid: 0, comm: swapper/1 Not tainted 3.4.0-rc2-wl+ #222 [ 2299.344551] Call Trace: [ 2299.344557] <IRQ> [<ffffffff81092fdd>] lockdep_rcu_suspicious+0xfd/0x130 [ 2299.344651] [<ffffffffa05db673>] sta_tx_agg_session_timer_expired+0xe3/0x100 [mac80211] [ 2299.344664] [<ffffffff81048595>] run_timer_softirq+0x185/0x6e0 [ 2299.344676] [<ffffffff8104850a>] ? run_timer_softirq+0xfa/0x6e0 [ 2299.344714] [<ffffffffa05db590>] ? ieee80211_stop_tx_ba_session+0x240/0x240 [mac80211] [ 2299.344732] [<ffffffff81040156>] __do_softirq+0xc6/0x3f0 [ 2299.344748] [<ffffffff8139c934>] ? _raw_spin_lock+0x44/0x50 [ 2299.344758] [<ffffffff8139d0f6>] ? _raw_spin_unlock+0x26/0x40 [ 2299.344771] [<ffffffff8139f24c>] call_softirq+0x1c/0x30 [ 2299.344785] [<ffffffff81004785>] do_softirq+0x85/0xc0 [ 2299.344795] [<ffffffff810407b6>] irq_exit+0xa6/0xe0 [ 2299.344806] [<ffffffff8100424e>] do_IRQ+0x5e/0xd0 [ 2299.344816] [<ffffffff8139d56c>] common_interrupt+0x6c/0x6c [ 2299.344823] <EOI> [<ffffffff8100c042>] ? default_idle+0x62/0x4f0 [ 2299.344844] [<ffffffff8100c040>] ? default_idle+0x60/0x4f0 [ 2299.344854] [<ffffffff8100c524>] amd_e400_idle+0x54/0x100 [ 2299.344864] [<ffffffff8100ce96>] cpu_idle+0x96/0xe0 [ 2299.344875] [<ffffffff8138fbc6>] start_secondary+0x1b7/0x1bb The wireless driver in use was rtl8192ce. Larry ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-04-10 3:19 Suspicious RCU usage in mac80211 Larry Finger @ 2012-04-12 3:31 ` Johannes Berg 2012-04-12 3:51 ` Larry Finger 0 siblings, 1 reply; 25+ messages in thread From: Johannes Berg @ 2012-04-12 3:31 UTC (permalink / raw) To: Larry Finger; +Cc: wireless On Mon, 2012-04-09 at 22:19 -0500, Larry Finger wrote: > When running kernel 3.4-rc2 from wireless testing, I got the following logged: > > [ 2299.344437] =============================== > [ 2299.344443] [ INFO: suspicious RCU usage. ] > [ 2299.344452] 3.4.0-rc2-wl+ #222 Not tainted > [ 2299.344458] ------------------------------- > [ 2299.344466] net/mac80211/sta_info.h:449 suspicious rcu_dereference_protected() usage! > [ 2299.344472] > [ 2299.344474] other info that might help us debug this: > [ 2299.344477] > [ 2299.344484] > [ 2299.344486] rcu_scheduler_active = 1, debug_locks = 0 > [ 2299.344493] 1 lock held by swapper/1/0: > [ 2299.344499] #0: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104850a>] > run_timer_softirq+0xfa/0x6e0 > [ 2299.344533] > [ 2299.344535] stack backtrace: > [ 2299.344544] Pid: 0, comm: swapper/1 Not tainted 3.4.0-rc2-wl+ #222 > [ 2299.344551] Call Trace: > [ 2299.344557] <IRQ> [<ffffffff81092fdd>] lockdep_rcu_suspicious+0xfd/0x130 > [ 2299.344651] [<ffffffffa05db673>] sta_tx_agg_session_timer_expired+0xe3/0x100 > [mac80211] That's odd: static inline struct tid_ampdu_tx * rcu_dereference_protected_tid_tx(struct sta_info *sta, int tid) { return rcu_dereference_protected(sta->ampdu_mlme.tid_tx[tid], lockdep_is_held(&sta->lock) || lockdep_is_held(&sta->ampdu_mlme.mtx)); } sta_tx_agg_session_timer_expired calls ieee80211_stop_tx_ba_session which does: spin_lock_bh(&sta->lock); tid_tx = rcu_dereference_protected_tid_tx(sta, tid); So why would the message happen? Strange. johannes ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-04-12 3:31 ` Johannes Berg @ 2012-04-12 3:51 ` Larry Finger 2012-04-12 3:54 ` Johannes Berg 0 siblings, 1 reply; 25+ messages in thread From: Larry Finger @ 2012-04-12 3:51 UTC (permalink / raw) To: Johannes Berg; +Cc: wireless On 04/11/2012 10:31 PM, Johannes Berg wrote: > On Mon, 2012-04-09 at 22:19 -0500, Larry Finger wrote: >> When running kernel 3.4-rc2 from wireless testing, I got the following logged: >> >> [ 2299.344437] =============================== >> [ 2299.344443] [ INFO: suspicious RCU usage. ] >> [ 2299.344452] 3.4.0-rc2-wl+ #222 Not tainted >> [ 2299.344458] ------------------------------- >> [ 2299.344466] net/mac80211/sta_info.h:449 suspicious rcu_dereference_protected() usage! >> [ 2299.344472] >> [ 2299.344474] other info that might help us debug this: >> [ 2299.344477] >> [ 2299.344484] >> [ 2299.344486] rcu_scheduler_active = 1, debug_locks = 0 >> [ 2299.344493] 1 lock held by swapper/1/0: >> [ 2299.344499] #0: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104850a>] >> run_timer_softirq+0xfa/0x6e0 >> [ 2299.344533] >> [ 2299.344535] stack backtrace: >> [ 2299.344544] Pid: 0, comm: swapper/1 Not tainted 3.4.0-rc2-wl+ #222 >> [ 2299.344551] Call Trace: >> [ 2299.344557]<IRQ> [<ffffffff81092fdd>] lockdep_rcu_suspicious+0xfd/0x130 >> [ 2299.344651] [<ffffffffa05db673>] sta_tx_agg_session_timer_expired+0xe3/0x100 >> [mac80211] > > That's odd: > > static inline struct tid_ampdu_tx * > rcu_dereference_protected_tid_tx(struct sta_info *sta, int tid) > { > return rcu_dereference_protected(sta->ampdu_mlme.tid_tx[tid], > lockdep_is_held(&sta->lock) || > lockdep_is_held(&sta->ampdu_mlme.mtx)); > } > > sta_tx_agg_session_timer_expired calls ieee80211_stop_tx_ba_session > which does: > > spin_lock_bh(&sta->lock); > tid_tx = rcu_dereference_protected_tid_tx(sta, tid); > > > So why would the message happen? Strange. When I first saw the thread at http://lkml.indiana.edu/hypermail/linux/kernel/1204.1/01446.html, I thought it might be about the same problem, but that appears to be restricted to sparc64. The problem does not happen every time - even more strangeness. Larry ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-04-12 3:51 ` Larry Finger @ 2012-04-12 3:54 ` Johannes Berg 2012-04-12 15:51 ` Larry Finger 0 siblings, 1 reply; 25+ messages in thread From: Johannes Berg @ 2012-04-12 3:54 UTC (permalink / raw) To: Larry Finger; +Cc: wireless On Wed, 2012-04-11 at 22:51 -0500, Larry Finger wrote: > On 04/11/2012 10:31 PM, Johannes Berg wrote: > > On Mon, 2012-04-09 at 22:19 -0500, Larry Finger wrote: > >> When running kernel 3.4-rc2 from wireless testing, I got the following logged: > >> > >> [ 2299.344437] =============================== > >> [ 2299.344443] [ INFO: suspicious RCU usage. ] > >> [ 2299.344452] 3.4.0-rc2-wl+ #222 Not tainted > >> [ 2299.344458] ------------------------------- > >> [ 2299.344466] net/mac80211/sta_info.h:449 suspicious rcu_dereference_protected() usage! > >> [ 2299.344472] > >> [ 2299.344474] other info that might help us debug this: > >> [ 2299.344477] > >> [ 2299.344484] > >> [ 2299.344486] rcu_scheduler_active = 1, debug_locks = 0 > >> [ 2299.344493] 1 lock held by swapper/1/0: > >> [ 2299.344499] #0: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104850a>] > >> run_timer_softirq+0xfa/0x6e0 > >> [ 2299.344533] > >> [ 2299.344535] stack backtrace: > >> [ 2299.344544] Pid: 0, comm: swapper/1 Not tainted 3.4.0-rc2-wl+ #222 > >> [ 2299.344551] Call Trace: > >> [ 2299.344557]<IRQ> [<ffffffff81092fdd>] lockdep_rcu_suspicious+0xfd/0x130 > >> [ 2299.344651] [<ffffffffa05db673>] sta_tx_agg_session_timer_expired+0xe3/0x100 > >> [mac80211] > > > > That's odd: > > > > static inline struct tid_ampdu_tx * > > rcu_dereference_protected_tid_tx(struct sta_info *sta, int tid) > > { > > return rcu_dereference_protected(sta->ampdu_mlme.tid_tx[tid], > > lockdep_is_held(&sta->lock) || > > lockdep_is_held(&sta->ampdu_mlme.mtx)); > > } > > > > sta_tx_agg_session_timer_expired calls ieee80211_stop_tx_ba_session > > which does: > > > > spin_lock_bh(&sta->lock); > > tid_tx = rcu_dereference_protected_tid_tx(sta, tid); > > > > > > So why would the message happen? Strange. > > When I first saw the thread at > http://lkml.indiana.edu/hypermail/linux/kernel/1204.1/01446.html, I thought it > might be about the same problem, but that appears to be restricted to sparc64. > > The problem does not happen every time - even more strangeness. What do you mean by "every time"? I wouldn't expect the timer to fire every time you set up aggregation etc., the timer firing is a special case to start with. johannes ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-04-12 3:54 ` Johannes Berg @ 2012-04-12 15:51 ` Larry Finger 2012-04-12 15:55 ` Johannes Berg 0 siblings, 1 reply; 25+ messages in thread From: Larry Finger @ 2012-04-12 15:51 UTC (permalink / raw) To: Johannes Berg; +Cc: wireless On 04/11/2012 10:54 PM, Johannes Berg wrote: > > What do you mean by "every time"? I wouldn't expect the timer to fire > every time you set up aggregation etc., the timer firing is a special > case to start with. I meant it is intermittent, or it required some special setup that I have not duplicated, as I have seen the condition only once. Larry ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-04-12 15:51 ` Larry Finger @ 2012-04-12 15:55 ` Johannes Berg 2012-05-01 14:25 ` Mohammed Shafi 0 siblings, 1 reply; 25+ messages in thread From: Johannes Berg @ 2012-04-12 15:55 UTC (permalink / raw) To: Larry Finger; +Cc: wireless On Thu, 2012-04-12 at 10:51 -0500, Larry Finger wrote: > On 04/11/2012 10:54 PM, Johannes Berg wrote: > > > > What do you mean by "every time"? I wouldn't expect the timer to fire > > every time you set up aggregation etc., the timer firing is a special > > case to start with. > > I meant it is intermittent, or it required some special setup that I have not > duplicated, as I have seen the condition only once. Ok. I suspect that's explained by the fact that the timer is an error case anyway, so you've probably never seen that error case again. It's still puzzling though since the code looks fine to me. johannes ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-04-12 15:55 ` Johannes Berg @ 2012-05-01 14:25 ` Mohammed Shafi 2012-05-01 19:18 ` Larry Finger 0 siblings, 1 reply; 25+ messages in thread From: Mohammed Shafi @ 2012-05-01 14:25 UTC (permalink / raw) To: Johannes Berg; +Cc: Larry Finger, wireless hi Johannes and Larry, On Thu, Apr 12, 2012 at 9:25 PM, Johannes Berg <johannes@sipsolutions.net> wrote: > On Thu, 2012-04-12 at 10:51 -0500, Larry Finger wrote: >> On 04/11/2012 10:54 PM, Johannes Berg wrote: >> > >> > What do you mean by "every time"? I wouldn't expect the timer to fire >> > every time you set up aggregation etc., the timer firing is a special >> > case to start with. >> >> I meant it is intermittent, or it required some special setup that I have not >> duplicated, as I have seen the condition only once. > > Ok. I suspect that's explained by the fact that the timer is an error > case anyway, so you've probably never seen that error case again. It's > still puzzling though since the code looks fine to me. > > johannes > > -- > To unsubscribe from this list: send the line "unsubscribe linux-wireless" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html May 1 14:34:51 shafi-laptop kernel: [ 83.216284] =============================== May 1 14:34:51 shafi-laptop kernel: [ 83.216286] [ INFO: suspicious RCU usage. ] May 1 14:34:51 shafi-laptop kernel: [ 83.216289] 3.4.0-rc3-wl #16 Not tainted May 1 14:34:51 shafi-laptop kernel: [ 83.216291] ------------------------------- May 1 14:34:51 shafi-laptop kernel: [ 83.216293] net/mac80211/sta_info.h:456 suspicious rcu_dereference_protected() usage! May 1 14:34:51 shafi-laptop kernel: [ 83.216295] May 1 14:34:51 shafi-laptop kernel: [ 83.216296] other info that might help us debug this: May 1 14:34:51 shafi-laptop kernel: [ 83.216297] May 1 14:34:51 shafi-laptop kernel: [ 83.216299] May 1 14:34:51 shafi-laptop kernel: [ 83.216299] rcu_scheduler_active = 1, debug_locks = 1 May 1 14:34:51 shafi-laptop kernel: [ 83.216302] 2 locks held by kmemleak/52: May 1 14:34:51 shafi-laptop kernel: [ 83.216303] #0: (scan_mutex){+.+...}, at: [<c024d343>] kmemleak_scan_thread+0x53/0xc0 May 1 14:34:51 shafi-laptop kernel: [ 83.216313] #1: (&tid_tx->session_timer){+.-...}, at: [<c0144fdf>] run_timer_softirq+0xcf/0x400 May 1 14:34:51 shafi-laptop kernel: [ 83.216321] May 1 14:34:51 shafi-laptop kernel: [ 83.216322] stack backtrace: May 1 14:34:51 shafi-laptop kernel: [ 83.216325] Pid: 52, comm: kmemleak Not tainted 3.4.0-rc3-wl #16 May 1 14:34:51 shafi-laptop kernel: [ 83.216327] Call Trace: May 1 14:34:51 shafi-laptop kernel: [ 83.216332] [<c0191726>] lockdep_rcu_suspicious+0xc6/0x100 May 1 14:34:51 shafi-laptop kernel: [ 83.216351] [<f835b14d>] sta_tx_agg_session_timer_expired+0xcd/0x100 [mac80211] May 1 14:34:51 shafi-laptop kernel: [ 83.216355] [<c014506d>] run_timer_softirq+0x15d/0x400 May 1 14:34:51 shafi-laptop kernel: [ 83.216358] [<c0144fdf>] ? run_timer_softirq+0xcf/0x400 May 1 14:34:51 shafi-laptop kernel: [ 83.216373] [<f835b080>] ? sta_addba_resp_timer_expired+0x240/0x240 [mac80211] May 1 14:34:51 shafi-laptop kernel: [ 83.216380] [<c013d56f>] __do_softirq+0xaf/0x310 May 1 14:34:51 shafi-laptop kernel: [ 83.216384] [<c013d4c0>] ? __hrtimer_tasklet_trampoline+0x20/0x20 May 1 14:34:51 shafi-laptop kernel: [ 83.216386] <IRQ> [<c013cc65>] ? irq_exit+0xb5/0xd0 May 1 14:34:51 shafi-laptop kernel: [ 83.216393] [<c06b74a9>] ? smp_apic_timer_interrupt+0x59/0x88 May 1 14:34:51 shafi-laptop kernel: [ 83.216398] [<c03d8edc>] ? trace_hardirqs_off_thunk+0xc/0x10 May 1 14:34:51 shafi-laptop kernel: [ 83.216402] [<c06afc36>] ? apic_timer_interrupt+0x36/0x40 May 1 14:34:51 shafi-laptop kernel: [ 83.216405] [<c019007b>] ? rcu_read_lock_sched_held+0x1b/0x70 May 1 14:34:51 shafi-laptop kernel: [ 83.216409] [<c06af58e>] ? _raw_spin_unlock_irqrestore+0x3e/0x70 May 1 14:34:51 shafi-laptop kernel: [ 83.216412] [<c024bf4b>] ? scan_gray_list+0x12b/0x180 May 1 14:34:51 shafi-laptop kernel: [ 83.216415] [<c024c815>] ? kmemleak_scan+0x235/0x8c0 May 1 14:34:51 shafi-laptop kernel: [ 83.216418] [<c024c5e0>] ? kmemleak_seq_next+0x130/0x130 May 1 14:34:51 shafi-laptop kernel: [ 83.216422] [<c024d2f0>] ? kmemleak_write+0x450/0x450 May 1 14:34:51 shafi-laptop kernel: [ 83.216425] [<c024d2f0>] ? kmemleak_write+0x450/0x450 May 1 14:34:51 shafi-laptop kernel: [ 83.216428] [<c024d348>] ? kmemleak_scan_thread+0x58/0xc0 May 1 14:34:51 shafi-laptop kernel: [ 83.216431] [<c01590a4>] ? kthread+0x84/0x90 May 1 14:34:51 shafi-laptop kernel: [ 83.216435] [<c0159020>] ? __init_kthread_worker+0x60/0x60 May 1 14:34:51 shafi-laptop kernel: [ 83.216438] [<c06b7306>] ? kernel_thread_helper+0x6/0x10 May 1 14:34:54 shafi-laptop kernel: [ 86.692061] wlan0: no IPv6 routers present May 1 14:34:58 shafi-laptop kernel: [ 90.691571] cfg80211: Found new beacon on frequency: 2467 MHz (Ch 12) on phy0 i just looked into the code and compared it with sta_addba_resp_timer_expired and thought whether we should do this.. diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c index 5b7053c..e4da258 100644 --- a/net/mac80211/agg-tx.c +++ b/net/mac80211/agg-tx.c @@ -421,21 +421,25 @@ static void sta_tx_agg_session_timer_expired(unsigned long data) struct tid_ampdu_tx *tid_tx; unsigned long timeout; + rcu_read_lock(); tid_tx = rcu_dereference_protected_tid_tx(sta, *ptid); - if (!tid_tx) + if (!tid_tx) { + rcu_read_unlock(); return; + } timeout = tid_tx->last_tx + TU_TO_JIFFIES(tid_tx->timeout); if (time_is_after_jiffies(timeout)) { mod_timer(&tid_tx->session_timer, timeout); + rcu_read_unlock(); return; } #ifdef CONFIG_MAC80211_HT_DEBUG printk(KERN_DEBUG "tx session timer expired on tid %d\n", (u16)*ptid); #endif - ieee80211_stop_tx_ba_session(&sta->sta, *ptid); + rcu_read_unlock(); } int i -- thanks, shafi ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-01 14:25 ` Mohammed Shafi @ 2012-05-01 19:18 ` Larry Finger 2012-05-02 5:02 ` Mohammed Shafi 2012-05-02 10:00 ` Catalin Marinas 0 siblings, 2 replies; 25+ messages in thread From: Larry Finger @ 2012-05-01 19:18 UTC (permalink / raw) To: Mohammed Shafi; +Cc: Johannes Berg, wireless, Catalin Marinas On 05/01/2012 09:25 AM, Mohammed Shafi wrote: > > i just looked into the code and compared it with > sta_addba_resp_timer_expired and thought whether we should do > this.. Your patch does not help. I still get the following dump in the log: =============================== [ INFO: suspicious RCU usage. ] 3.4.0-rc5-wl+ #287 Not tainted ------------------------------- net/mac80211/sta_info.h:456 suspicious rcu_dereference_protected() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 3 locks held by kmemleak/622: #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] run_timer_softirq+0xfa/0x6e0 #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa0449ff0>] sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] stack backtrace: Pid: 622, comm: kmemleak Not tainted 3.4.0-rc5-wl+ #287 Call Trace: <IRQ> [<ffffffff8109309d>] lockdep_rcu_suspicious+0xfd/0x130 [<ffffffffa044a1cf>] sta_tx_agg_session_timer_expired+0x1df/0x2a0 [mac80211] [<ffffffffa0449ff0>] ? ieee80211_start_tx_ba_session+0x450/0x450 [mac80211] [<ffffffff810485c5>] run_timer_softirq+0x185/0x6e0 As kmemleak seems to be involved, I have added Catalin Marinas to the Cc list. Larry ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-01 19:18 ` Larry Finger @ 2012-05-02 5:02 ` Mohammed Shafi 2012-05-02 10:00 ` Catalin Marinas 1 sibling, 0 replies; 25+ messages in thread From: Mohammed Shafi @ 2012-05-02 5:02 UTC (permalink / raw) To: Larry Finger; +Cc: Johannes Berg, wireless, Catalin Marinas On Wed, May 2, 2012 at 12:48 AM, Larry Finger <Larry.Finger@lwfinger.net> wrote: > On 05/01/2012 09:25 AM, Mohammed Shafi wrote: >> >> >> i just looked into the code and compared it with >> sta_addba_resp_timer_expired and thought whether we should do >> this.. > > > Your patch does not help. I still get the following dump in the log: Larry, thanks for trying it out. then there seems to be something else, which requires further debugging/understanding. > > =============================== > [ INFO: suspicious RCU usage. ] > 3.4.0-rc5-wl+ #287 Not tainted > ------------------------------- > > net/mac80211/sta_info.h:456 suspicious rcu_dereference_protected() usage! > > other info that might help us debug this: > > rcu_scheduler_active = 1, debug_locks = 0 > 3 locks held by kmemleak/622: > #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] > kmemleak_scan_thread+0x56/0xd0 > #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] > run_timer_softirq+0xfa/0x6e0 > #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa0449ff0>] > sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] > > stack backtrace: > Pid: 622, comm: kmemleak Not tainted 3.4.0-rc5-wl+ #287 > Call Trace: > <IRQ> [<ffffffff8109309d>] lockdep_rcu_suspicious+0xfd/0x130 > [<ffffffffa044a1cf>] sta_tx_agg_session_timer_expired+0x1df/0x2a0 > [mac80211] > [<ffffffffa0449ff0>] ? ieee80211_start_tx_ba_session+0x450/0x450 [mac80211] > [<ffffffff810485c5>] run_timer_softirq+0x185/0x6e0 > > As kmemleak seems to be involved, I have added Catalin Marinas to the Cc > list. > > Larry > > -- thanks, shafi ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-01 19:18 ` Larry Finger 2012-05-02 5:02 ` Mohammed Shafi @ 2012-05-02 10:00 ` Catalin Marinas 2012-05-02 17:07 ` Johannes Berg 1 sibling, 1 reply; 25+ messages in thread From: Catalin Marinas @ 2012-05-02 10:00 UTC (permalink / raw) To: Larry Finger; +Cc: Mohammed Shafi, Johannes Berg, wireless On Tue, May 01, 2012 at 08:18:54PM +0100, Larry Finger wrote: > On 05/01/2012 09:25 AM, Mohammed Shafi wrote: > > i just looked into the code and compared it with > > sta_addba_resp_timer_expired and thought whether we should do > > this.. > > Your patch does not help. I still get the following dump in the log: > > =============================== > [ INFO: suspicious RCU usage. ] > 3.4.0-rc5-wl+ #287 Not tainted > ------------------------------- > net/mac80211/sta_info.h:456 suspicious rcu_dereference_protected() usage! > > other info that might help us debug this: > > rcu_scheduler_active = 1, debug_locks = 0 > 3 locks held by kmemleak/622: > #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 > #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] > run_timer_softirq+0xfa/0x6e0 > #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa0449ff0>] > sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] > > stack backtrace: > Pid: 622, comm: kmemleak Not tainted 3.4.0-rc5-wl+ #287 > Call Trace: > <IRQ> [<ffffffff8109309d>] lockdep_rcu_suspicious+0xfd/0x130 > [<ffffffffa044a1cf>] sta_tx_agg_session_timer_expired+0x1df/0x2a0 [mac80211] > [<ffffffffa0449ff0>] ? ieee80211_start_tx_ba_session+0x450/0x450 [mac80211] > [<ffffffff810485c5>] run_timer_softirq+0x185/0x6e0 > > As kmemleak seems to be involved, I have added Catalin Marinas to the Cc list. It doesn't look related to kmemleak. It just happens that sta_tx_agg_session_timer_expired() was called from a timer when the kmemleak scanning thread was running (pretty likely as this thread can take minutes to complete). Looking at the code and the logs, ieee80211_start_tx_ba_session() calls rcu_dereference_protected_tid_tx() which calls rcu_dereference_protected() with the (lockdep_is_held(&sta->lock) || lockdep_is_held(&sta->ampdu_mlme.mtx)) condition which is false. As the kernel log says, none of these locks are held, hence the warning. Either the rcu_dereference annotation is wrong (introduced by commit 40b275 "mac80211: sparse RCU annotations") or there is something wrong in the code and the RCU pointer is dereferenced without any of those locks held. -- Catalin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-02 10:00 ` Catalin Marinas @ 2012-05-02 17:07 ` Johannes Berg 2012-05-02 20:09 ` Paul E. McKenney 2012-05-03 3:02 ` Larry Finger 0 siblings, 2 replies; 25+ messages in thread From: Johannes Berg @ 2012-05-02 17:07 UTC (permalink / raw) To: Catalin Marinas; +Cc: Larry Finger, Mohammed Shafi, wireless, Paul E. McKenney On Wed, 2012-05-02 at 11:00 +0100, Catalin Marinas wrote: > > Your patch does not help. I still get the following dump in the log: The patch is also wrong, we hold the mutex there and can't hold RCU read lock. > > #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 > > #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] > > run_timer_softirq+0xfa/0x6e0 > > #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa0449ff0>] > > sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] > > > > stack backtrace: > > Pid: 622, comm: kmemleak Not tainted 3.4.0-rc5-wl+ #287 > > Call Trace: > > <IRQ> [<ffffffff8109309d>] lockdep_rcu_suspicious+0xfd/0x130 > > [<ffffffffa044a1cf>] sta_tx_agg_session_timer_expired+0x1df/0x2a0 [mac80211] > > [<ffffffffa0449ff0>] ? ieee80211_start_tx_ba_session+0x450/0x450 [mac80211] > > [<ffffffff810485c5>] run_timer_softirq+0x185/0x6e0 > > > > As kmemleak seems to be involved, I have added Catalin Marinas to the Cc list. > Looking at the code and the logs, ieee80211_start_tx_ba_session() calls I'm almost certain that ieee80211_start_tx_ba_session() is a bogus calltrace entry, since we're in a timer and that's not called from a timer. > rcu_dereference_protected_tid_tx() which calls > rcu_dereference_protected() with the (lockdep_is_held(&sta->lock) || > lockdep_is_held(&sta->ampdu_mlme.mtx)) condition which is false. As the > kernel log says, none of these locks are held, hence the warning. So does that just mean we need to add rcu_read_lock_held() to the conditions? I thought that wasn't necessary? +Paul. johannes ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-02 17:07 ` Johannes Berg @ 2012-05-02 20:09 ` Paul E. McKenney 2012-05-03 18:38 ` Johannes Berg 2012-05-03 3:02 ` Larry Finger 1 sibling, 1 reply; 25+ messages in thread From: Paul E. McKenney @ 2012-05-02 20:09 UTC (permalink / raw) To: Johannes Berg; +Cc: Catalin Marinas, Larry Finger, Mohammed Shafi, wireless On Wed, May 02, 2012 at 07:07:51PM +0200, Johannes Berg wrote: > On Wed, 2012-05-02 at 11:00 +0100, Catalin Marinas wrote: > > > > Your patch does not help. I still get the following dump in the log: > > The patch is also wrong, we hold the mutex there and can't hold RCU read > lock. > > > > #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 > > > #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] > > > run_timer_softirq+0xfa/0x6e0 > > > #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa0449ff0>] > > > sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] > > > > > > stack backtrace: > > > Pid: 622, comm: kmemleak Not tainted 3.4.0-rc5-wl+ #287 > > > Call Trace: > > > <IRQ> [<ffffffff8109309d>] lockdep_rcu_suspicious+0xfd/0x130 > > > [<ffffffffa044a1cf>] sta_tx_agg_session_timer_expired+0x1df/0x2a0 [mac80211] > > > [<ffffffffa0449ff0>] ? ieee80211_start_tx_ba_session+0x450/0x450 [mac80211] > > > [<ffffffff810485c5>] run_timer_softirq+0x185/0x6e0 > > > > > > As kmemleak seems to be involved, I have added Catalin Marinas to the Cc list. > > > Looking at the code and the logs, ieee80211_start_tx_ba_session() calls > > I'm almost certain that ieee80211_start_tx_ba_session() is a bogus > calltrace entry, since we're in a timer and that's not called from a > timer. > > > rcu_dereference_protected_tid_tx() which calls > > rcu_dereference_protected() with the (lockdep_is_held(&sta->lock) || > > lockdep_is_held(&sta->ampdu_mlme.mtx)) condition which is false. As the > > kernel log says, none of these locks are held, hence the warning. > > So does that just mean we need to add rcu_read_lock_held() to the > conditions? I thought that wasn't necessary? +Paul. If you are using rcu_dereference_protected(), you really would need to add rcu_read_lock_held(). Except that it is not legal to use rcu_dereference_protected() within an RCU read-side critical section because rcu_dereference_protected() does nothing to protect against misordering mischief from the compiler and the CPU. Actually, that sounds like a useful coccinelle check, now that you mention it. So you should instead use rcu_dereference_check(). This may be used in an RCU read-side critical section, but may also be passed things like (lockdep_is_held(&sta->lock) || lockdep_is_held(&sta->ampdu_mlme.mtx)). Thanx, Paul ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-02 20:09 ` Paul E. McKenney @ 2012-05-03 18:38 ` Johannes Berg 2012-05-04 6:17 ` Larry Finger 0 siblings, 1 reply; 25+ messages in thread From: Johannes Berg @ 2012-05-03 18:38 UTC (permalink / raw) To: paulmck; +Cc: Catalin Marinas, Larry Finger, Mohammed Shafi, wireless On Wed, 2012-05-02 at 13:09 -0700, Paul E. McKenney wrote: > On Wed, May 02, 2012 at 07:07:51PM +0200, Johannes Berg wrote: > > On Wed, 2012-05-02 at 11:00 +0100, Catalin Marinas wrote: > > > > > > Your patch does not help. I still get the following dump in the log: > > > > The patch is also wrong, we hold the mutex there and can't hold RCU read > > lock. > > > > > > #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 > > > > #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] > > > > run_timer_softirq+0xfa/0x6e0 > > > > #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa0449ff0>] > > > > sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] > > > > > > > > stack backtrace: > > > > Pid: 622, comm: kmemleak Not tainted 3.4.0-rc5-wl+ #287 > > > > Call Trace: > > > > <IRQ> [<ffffffff8109309d>] lockdep_rcu_suspicious+0xfd/0x130 > > > > [<ffffffffa044a1cf>] sta_tx_agg_session_timer_expired+0x1df/0x2a0 [mac80211] > > > > [<ffffffffa0449ff0>] ? ieee80211_start_tx_ba_session+0x450/0x450 [mac80211] > > > > [<ffffffff810485c5>] run_timer_softirq+0x185/0x6e0 > > > > > > > > As kmemleak seems to be involved, I have added Catalin Marinas to the Cc list. > > > > > Looking at the code and the logs, ieee80211_start_tx_ba_session() calls > > > > I'm almost certain that ieee80211_start_tx_ba_session() is a bogus > > calltrace entry, since we're in a timer and that's not called from a > > timer. > > > > > rcu_dereference_protected_tid_tx() which calls > > > rcu_dereference_protected() with the (lockdep_is_held(&sta->lock) || > > > lockdep_is_held(&sta->ampdu_mlme.mtx)) condition which is false. As the > > > kernel log says, none of these locks are held, hence the warning. > > > > So does that just mean we need to add rcu_read_lock_held() to the > > conditions? I thought that wasn't necessary? +Paul. > > If you are using rcu_dereference_protected(), you really would need > to add rcu_read_lock_held(). Except that it is not legal to use > rcu_dereference_protected() within an RCU read-side critical section > because rcu_dereference_protected() does nothing to protect against > misordering mischief from the compiler and the CPU. Actually, that > sounds like a useful coccinelle check, now that you mention it. > > So you should instead use rcu_dereference_check(). This may be used > in an RCU read-side critical section, but may also be passed things like > (lockdep_is_held(&sta->lock) || lockdep_is_held(&sta->ampdu_mlme.mtx)). Oops, of course, I should have seen that, thanks for pointing it out! Now that I look at it, the problem actually seems to be something else entirely though: the rcu_dereference (whichever type) in sta_tx_agg_session_timer_expired() itself isn't protected at all, something like the patch below is needed. Larry, can you try that? johannes diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c index 5b7053c..40d3ff4 100644 --- a/net/mac80211/agg-tx.c +++ b/net/mac80211/agg-tx.c @@ -421,16 +421,22 @@ static void sta_tx_agg_session_timer_expired(unsigned long data) struct tid_ampdu_tx *tid_tx; unsigned long timeout; - tid_tx = rcu_dereference_protected_tid_tx(sta, *ptid); - if (!tid_tx) + rcu_read_lock(); + tid_tx = rcu_dereference(sta->ampdu_mlme.tid_tx[*ptid]); + if (!tid_tx) { + rcu_read_unlock(); return; + } timeout = tid_tx->last_tx + TU_TO_JIFFIES(tid_tx->timeout); if (time_is_after_jiffies(timeout)) { mod_timer(&tid_tx->session_timer, timeout); + rcu_read_unlock(); return; } + rcu_read_unlock(); + #ifdef CONFIG_MAC80211_HT_DEBUG printk(KERN_DEBUG "tx session timer expired on tid %d\n", (u16)*ptid); #endif ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-03 18:38 ` Johannes Berg @ 2012-05-04 6:17 ` Larry Finger 2012-05-04 6:40 ` Mohammed Shafi 0 siblings, 1 reply; 25+ messages in thread From: Larry Finger @ 2012-05-04 6:17 UTC (permalink / raw) To: Johannes Berg; +Cc: Catalin Marinas, Mohammed Shafi, wireless On 05/03/2012 01:38 PM, Johannes Berg wrote: > diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c > index 5b7053c..40d3ff4 100644 > --- a/net/mac80211/agg-tx.c > +++ b/net/mac80211/agg-tx.c > @@ -421,16 +421,22 @@ static void sta_tx_agg_session_timer_expired(unsigned long data) > struct tid_ampdu_tx *tid_tx; > unsigned long timeout; > > - tid_tx = rcu_dereference_protected_tid_tx(sta, *ptid); > - if (!tid_tx) > + rcu_read_lock(); > + tid_tx = rcu_dereference(sta->ampdu_mlme.tid_tx[*ptid]); > + if (!tid_tx) { > + rcu_read_unlock(); > return; > + } > > timeout = tid_tx->last_tx + TU_TO_JIFFIES(tid_tx->timeout); > if (time_is_after_jiffies(timeout)) { > mod_timer(&tid_tx->session_timer, timeout); > + rcu_read_unlock(); > return; > } > > + rcu_read_unlock(); > + > #ifdef CONFIG_MAC80211_HT_DEBUG > printk(KERN_DEBUG "tx session timer expired on tid %d\n", (u16)*ptid); > #endif This patch is the same as was proposed by Mohammed on May 1. I must have messed up the testing as I thought it failed; however, it is now working. Mohammed - I plan on submitting this to John with a "From:" from you along with your s-o-b. Is that OK. It will be submitted for inclusion in 3.5. Larry ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-04 6:17 ` Larry Finger @ 2012-05-04 6:40 ` Mohammed Shafi 2012-05-04 6:48 ` Mohammed Shafi 0 siblings, 1 reply; 25+ messages in thread From: Mohammed Shafi @ 2012-05-04 6:40 UTC (permalink / raw) To: Larry Finger; +Cc: Johannes Berg, Catalin Marinas, wireless Hi Larry, On Fri, May 4, 2012 at 11:47 AM, Larry Finger <Larry.Finger@lwfinger.net> wrote: > On 05/03/2012 01:38 PM, Johannes Berg wrote: >> >> diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c >> index 5b7053c..40d3ff4 100644 >> --- a/net/mac80211/agg-tx.c >> +++ b/net/mac80211/agg-tx.c >> @@ -421,16 +421,22 @@ static void >> sta_tx_agg_session_timer_expired(unsigned long data) >> struct tid_ampdu_tx *tid_tx; >> unsigned long timeout; >> >> - tid_tx = rcu_dereference_protected_tid_tx(sta, *ptid); >> - if (!tid_tx) >> + rcu_read_lock(); >> + tid_tx = rcu_dereference(sta->ampdu_mlme.tid_tx[*ptid]); >> + if (!tid_tx) { >> + rcu_read_unlock(); >> return; >> + } >> >> timeout = tid_tx->last_tx + TU_TO_JIFFIES(tid_tx->timeout); >> if (time_is_after_jiffies(timeout)) { >> mod_timer(&tid_tx->session_timer, timeout); >> + rcu_read_unlock(); >> return; >> } >> >> + rcu_read_unlock(); >> + >> #ifdef CONFIG_MAC80211_HT_DEBUG >> printk(KERN_DEBUG "tx session timer expired on tid %d\n", >> (u16)*ptid); >> #endif > > > This patch is the same as was proposed by Mohammed on May 1. I must have > messed up the testing as I thought it failed; however, it is now working. honestly i am not sure about the technical details, but Johannes patch seems to follow almost similar what's is done in 'sta_addba_resp_timer_expired'. while thought that is being also with mutex under hold but the difference is the lock is unhold before ieee80211_stop_tx_ba_session is called. > > Mohammed - I plan on submitting this to John with a "From:" from you along > with your s-o-b. Is that OK. It will be submitted for inclusion in 3.5. > > Larry > -- thanks, shafi ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-04 6:40 ` Mohammed Shafi @ 2012-05-04 6:48 ` Mohammed Shafi 2012-05-04 13:45 ` Larry Finger 0 siblings, 1 reply; 25+ messages in thread From: Mohammed Shafi @ 2012-05-04 6:48 UTC (permalink / raw) To: Larry Finger; +Cc: Johannes Berg, Catalin Marinas, wireless On Fri, May 4, 2012 at 12:10 PM, Mohammed Shafi <shafi.wireless@gmail.com> wrote: > Hi Larry, > > On Fri, May 4, 2012 at 11:47 AM, Larry Finger <Larry.Finger@lwfinger.net> wrote: >> On 05/03/2012 01:38 PM, Johannes Berg wrote: >>> >>> diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c >>> index 5b7053c..40d3ff4 100644 >>> --- a/net/mac80211/agg-tx.c >>> +++ b/net/mac80211/agg-tx.c >>> @@ -421,16 +421,22 @@ static void >>> sta_tx_agg_session_timer_expired(unsigned long data) >>> struct tid_ampdu_tx *tid_tx; >>> unsigned long timeout; >>> >>> - tid_tx = rcu_dereference_protected_tid_tx(sta, *ptid); >>> - if (!tid_tx) >>> + rcu_read_lock(); >>> + tid_tx = rcu_dereference(sta->ampdu_mlme.tid_tx[*ptid]); Larry, Johannes seems to use rcu_dereference straight way just like in 'sta_addba_resp_timer_expired' and have it protected by rcu_read_locks >>> + if (!tid_tx) { >>> + rcu_read_unlock(); >>> return; >>> + } >>> >>> timeout = tid_tx->last_tx + TU_TO_JIFFIES(tid_tx->timeout); >>> if (time_is_after_jiffies(timeout)) { >>> mod_timer(&tid_tx->session_timer, timeout); >>> + rcu_read_unlock(); >>> return; >>> } >>> >>> + rcu_read_unlock(); >>> + >>> #ifdef CONFIG_MAC80211_HT_DEBUG >>> printk(KERN_DEBUG "tx session timer expired on tid %d\n", >>> (u16)*ptid); >>> #endif >> >> >> This patch is the same as was proposed by Mohammed on May 1. I must have >> messed up the testing as I thought it failed; however, it is now working. > > honestly i am not sure about the technical details, but Johannes patch > seems to follow almost similar > what's is done in 'sta_addba_resp_timer_expired'. while thought that > is being also with mutex under hold > but the difference is the lock is unhold before > ieee80211_stop_tx_ba_session is called. > >> >> Mohammed - I plan on submitting this to John with a "From:" from you along >> with your s-o-b. Is that OK. It will be submitted for inclusion in 3.5. >> >> Larry >> > > > -- > thanks, > shafi -- thanks, shafi ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-04 6:48 ` Mohammed Shafi @ 2012-05-04 13:45 ` Larry Finger 2012-05-04 14:35 ` Mohammed Shafi 0 siblings, 1 reply; 25+ messages in thread From: Larry Finger @ 2012-05-04 13:45 UTC (permalink / raw) To: Mohammed Shafi; +Cc: Johannes Berg, Catalin Marinas, wireless On 05/04/2012 01:48 AM, Mohammed Shafi wrote: > On Fri, May 4, 2012 at 12:10 PM, Mohammed Shafi > <shafi.wireless@gmail.com> wrote: >> Hi Larry, >> >> On Fri, May 4, 2012 at 11:47 AM, Larry Finger<Larry.Finger@lwfinger.net> wrote: >>> On 05/03/2012 01:38 PM, Johannes Berg wrote: >>>> >>>> diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c >>>> index 5b7053c..40d3ff4 100644 >>>> --- a/net/mac80211/agg-tx.c >>>> +++ b/net/mac80211/agg-tx.c >>>> @@ -421,16 +421,22 @@ static void >>>> sta_tx_agg_session_timer_expired(unsigned long data) >>>> struct tid_ampdu_tx *tid_tx; >>>> unsigned long timeout; >>>> >>>> - tid_tx = rcu_dereference_protected_tid_tx(sta, *ptid); >>>> - if (!tid_tx) >>>> + rcu_read_lock(); >>>> + tid_tx = rcu_dereference(sta->ampdu_mlme.tid_tx[*ptid]); > > Larry, Johannes seems to use rcu_dereference straight way just like in > 'sta_addba_resp_timer_expired' > and have it protected by rcu_read_locks Sorry. I missed that subtle difference. Submitted "from Johannes". Larry ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-04 13:45 ` Larry Finger @ 2012-05-04 14:35 ` Mohammed Shafi 0 siblings, 0 replies; 25+ messages in thread From: Mohammed Shafi @ 2012-05-04 14:35 UTC (permalink / raw) To: Larry Finger; +Cc: Johannes Berg, Catalin Marinas, wireless On Fri, May 4, 2012 at 7:15 PM, Larry Finger <Larry.Finger@lwfinger.net> wrote: > On 05/04/2012 01:48 AM, Mohammed Shafi wrote: >> >> On Fri, May 4, 2012 at 12:10 PM, Mohammed Shafi >> <shafi.wireless@gmail.com> wrote: >>> >>> Hi Larry, >>> >>> On Fri, May 4, 2012 at 11:47 AM, Larry Finger<Larry.Finger@lwfinger.net> >>> wrote: >>>> >>>> On 05/03/2012 01:38 PM, Johannes Berg wrote: >>>>> >>>>> >>>>> diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c >>>>> index 5b7053c..40d3ff4 100644 >>>>> --- a/net/mac80211/agg-tx.c >>>>> +++ b/net/mac80211/agg-tx.c >>>>> @@ -421,16 +421,22 @@ static void >>>>> sta_tx_agg_session_timer_expired(unsigned long data) >>>>> struct tid_ampdu_tx *tid_tx; >>>>> unsigned long timeout; >>>>> >>>>> - tid_tx = rcu_dereference_protected_tid_tx(sta, *ptid); >>>>> - if (!tid_tx) >>>>> + rcu_read_lock(); >>>>> + tid_tx = rcu_dereference(sta->ampdu_mlme.tid_tx[*ptid]); >> >> >> Larry, Johannes seems to use rcu_dereference straight way just like in >> 'sta_addba_resp_timer_expired' >> and have it protected by rcu_read_locks > > > Sorry. I missed that subtle difference. Submitted "from Johannes". no problem thanks! yeah but it seems my gmail seems to be identify those changes!! -- thanks, shafi ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-02 17:07 ` Johannes Berg 2012-05-02 20:09 ` Paul E. McKenney @ 2012-05-03 3:02 ` Larry Finger 2012-05-03 8:47 ` Catalin Marinas 1 sibling, 1 reply; 25+ messages in thread From: Larry Finger @ 2012-05-03 3:02 UTC (permalink / raw) To: Johannes Berg; +Cc: Catalin Marinas, Mohammed Shafi, wireless, Paul E. McKenney On 05/02/2012 12:07 PM, Johannes Berg wrote: > On Wed, 2012-05-02 at 11:00 +0100, Catalin Marinas wrote: > >>> Your patch does not help. I still get the following dump in the log: > > The patch is also wrong, we hold the mutex there and can't hold RCU read > lock. > >>> #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 >>> #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] >>> run_timer_softirq+0xfa/0x6e0 >>> #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa0449ff0>] >>> sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] >>> >>> stack backtrace: >>> Pid: 622, comm: kmemleak Not tainted 3.4.0-rc5-wl+ #287 >>> Call Trace: >>> <IRQ> [<ffffffff8109309d>] lockdep_rcu_suspicious+0xfd/0x130 >>> [<ffffffffa044a1cf>] sta_tx_agg_session_timer_expired+0x1df/0x2a0 [mac80211] >>> [<ffffffffa0449ff0>] ? ieee80211_start_tx_ba_session+0x450/0x450 [mac80211] >>> [<ffffffff810485c5>] run_timer_softirq+0x185/0x6e0 >>> >>> As kmemleak seems to be involved, I have added Catalin Marinas to the Cc list. > >> Looking at the code and the logs, ieee80211_start_tx_ba_session() calls > > I'm almost certain that ieee80211_start_tx_ba_session() is a bogus > calltrace entry, since we're in a timer and that's not called from a > timer. > >> rcu_dereference_protected_tid_tx() which calls >> rcu_dereference_protected() with the (lockdep_is_held(&sta->lock) || >> lockdep_is_held(&sta->ampdu_mlme.mtx)) condition which is false. As the >> kernel log says, none of these locks are held, hence the warning. > > So does that just mean we need to add rcu_read_lock_held() to the > conditions? I thought that wasn't necessary? +Paul. If you meant the following Index: wireless-testing-new/net/mac80211/sta_info.h =================================================================== --- wireless-testing-new.orig/net/mac80211/sta_info.h +++ wireless-testing-new/net/mac80211/sta_info.h @@ -453,7 +453,8 @@ rcu_dereference_protected_tid_tx(struct { return rcu_dereference_protected(sta->ampdu_mlme.tid_tx[tid], lockdep_is_held(&sta->lock) || - lockdep_is_held(&sta->ampdu_mlme.mtx)); + lockdep_is_held(&sta->ampdu_mlme.mtx) || + rcu_read_lock_held()); } #define STA_HASH_SIZE 256 It does not help. Larry ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-03 3:02 ` Larry Finger @ 2012-05-03 8:47 ` Catalin Marinas 2012-05-03 16:54 ` Larry Finger 0 siblings, 1 reply; 25+ messages in thread From: Catalin Marinas @ 2012-05-03 8:47 UTC (permalink / raw) To: Larry Finger; +Cc: Johannes Berg, Mohammed Shafi, wireless, Paul E. McKenney On Thu, May 03, 2012 at 04:02:12AM +0100, Larry Finger wrote: > On 05/02/2012 12:07 PM, Johannes Berg wrote: > > On Wed, 2012-05-02 at 11:00 +0100, Catalin Marinas wrote: > > > >>> Your patch does not help. I still get the following dump in the log: > > > > The patch is also wrong, we hold the mutex there and can't hold RCU read > > lock. > > > >>> #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 > >>> #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] > >>> run_timer_softirq+0xfa/0x6e0 > >>> #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa0449ff0>] > >>> sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] > >>> > >>> stack backtrace: > >>> Pid: 622, comm: kmemleak Not tainted 3.4.0-rc5-wl+ #287 > >>> Call Trace: > >>> <IRQ> [<ffffffff8109309d>] lockdep_rcu_suspicious+0xfd/0x130 > >>> [<ffffffffa044a1cf>] sta_tx_agg_session_timer_expired+0x1df/0x2a0 [mac80211] > >>> [<ffffffffa0449ff0>] ? ieee80211_start_tx_ba_session+0x450/0x450 [mac80211] > >>> [<ffffffff810485c5>] run_timer_softirq+0x185/0x6e0 > >>> > >>> As kmemleak seems to be involved, I have added Catalin Marinas to the Cc list. > > > >> Looking at the code and the logs, ieee80211_start_tx_ba_session() calls > > > > I'm almost certain that ieee80211_start_tx_ba_session() is a bogus > > calltrace entry, since we're in a timer and that's not called from a > > timer. > > > >> rcu_dereference_protected_tid_tx() which calls > >> rcu_dereference_protected() with the (lockdep_is_held(&sta->lock) || > >> lockdep_is_held(&sta->ampdu_mlme.mtx)) condition which is false. As the > >> kernel log says, none of these locks are held, hence the warning. > > > > So does that just mean we need to add rcu_read_lock_held() to the > > conditions? I thought that wasn't necessary? +Paul. > > If you meant the following > > Index: wireless-testing-new/net/mac80211/sta_info.h > =================================================================== > --- wireless-testing-new.orig/net/mac80211/sta_info.h > +++ wireless-testing-new/net/mac80211/sta_info.h > @@ -453,7 +453,8 @@ rcu_dereference_protected_tid_tx(struct > { > return rcu_dereference_protected(sta->ampdu_mlme.tid_tx[tid], IIUC, Paul suggested that you should use rcu_dereference_check() here instead as the protected one is not safe in this context. -- Catalin ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-03 8:47 ` Catalin Marinas @ 2012-05-03 16:54 ` Larry Finger 2012-05-03 17:12 ` Paul E. McKenney 0 siblings, 1 reply; 25+ messages in thread From: Larry Finger @ 2012-05-03 16:54 UTC (permalink / raw) To: Catalin Marinas; +Cc: Johannes Berg, Mohammed Shafi, wireless, Paul E. McKenney On 05/03/2012 03:47 AM, Catalin Marinas wrote: > > IIUC, Paul suggested that you should use rcu_dereference_check() here > instead as the protected one is not safe in this context. This patch also fails to fix the problem. Did I do what Paul suggested? Index: wireless-testing-new/net/mac80211/sta_info.h =================================================================== --- wireless-testing-new.orig/net/mac80211/sta_info.h +++ wireless-testing-new/net/mac80211/sta_info.h @@ -452,7 +452,7 @@ void ieee80211_assign_tid_tx(struct sta_ static inline struct tid_ampdu_tx * rcu_dereference_protected_tid_tx(struct sta_info *sta, int tid) { - return rcu_dereference_protected(sta->ampdu_mlme.tid_tx[tid], + return rcu_dereference_check(sta->ampdu_mlme.tid_tx[tid], lockdep_is_held(&sta->lock) || lockdep_is_held(&sta->ampdu_mlme.mtx)); } ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-03 16:54 ` Larry Finger @ 2012-05-03 17:12 ` Paul E. McKenney 2012-05-03 17:46 ` Larry Finger 0 siblings, 1 reply; 25+ messages in thread From: Paul E. McKenney @ 2012-05-03 17:12 UTC (permalink / raw) To: Larry Finger; +Cc: Catalin Marinas, Johannes Berg, Mohammed Shafi, wireless On Thu, May 03, 2012 at 11:54:29AM -0500, Larry Finger wrote: > On 05/03/2012 03:47 AM, Catalin Marinas wrote: > > > >IIUC, Paul suggested that you should use rcu_dereference_check() here > >instead as the protected one is not safe in this context. > > This patch also fails to fix the problem. Did I do what Paul suggested? That is indeed what I suggested! What locks does lockdep say are held? Thanx, Paul > Index: wireless-testing-new/net/mac80211/sta_info.h > =================================================================== > --- wireless-testing-new.orig/net/mac80211/sta_info.h > +++ wireless-testing-new/net/mac80211/sta_info.h > @@ -452,7 +452,7 @@ void ieee80211_assign_tid_tx(struct sta_ > static inline struct tid_ampdu_tx * > rcu_dereference_protected_tid_tx(struct sta_info *sta, int tid) > { > - return rcu_dereference_protected(sta->ampdu_mlme.tid_tx[tid], > + return rcu_dereference_check(sta->ampdu_mlme.tid_tx[tid], > lockdep_is_held(&sta->lock) || > lockdep_is_held(&sta->ampdu_mlme.mtx)); > } > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-03 17:12 ` Paul E. McKenney @ 2012-05-03 17:46 ` Larry Finger 2012-05-03 18:22 ` Paul E. McKenney 0 siblings, 1 reply; 25+ messages in thread From: Larry Finger @ 2012-05-03 17:46 UTC (permalink / raw) To: paulmck; +Cc: Catalin Marinas, Johannes Berg, Mohammed Shafi, wireless On 05/03/2012 12:12 PM, Paul E. McKenney wrote: > On Thu, May 03, 2012 at 11:54:29AM -0500, Larry Finger wrote: >> On 05/03/2012 03:47 AM, Catalin Marinas wrote: >>> >>> IIUC, Paul suggested that you should use rcu_dereference_check() here >>> instead as the protected one is not safe in this context. >> >> This patch also fails to fix the problem. Did I do what Paul suggested? > > That is indeed what I suggested! > > What locks does lockdep say are held? It varies from instance to instance. I have seen 1, 2, or 3. Those are the following: #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] run_timer_softirq+0xfa/0x6e0 #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa03c1ff0>] sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] When only 1 lock is held, it is "&tid_tx->session_timer", and that one is held in every case. I think that means we need to OR it with the other lockdep_is_held() arguments in the rcu_dereference_xxxxx() call, but I did not seem to get the syntax right. Larry ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-03 17:46 ` Larry Finger @ 2012-05-03 18:22 ` Paul E. McKenney 2012-05-03 18:32 ` Larry Finger 0 siblings, 1 reply; 25+ messages in thread From: Paul E. McKenney @ 2012-05-03 18:22 UTC (permalink / raw) To: Larry Finger; +Cc: Catalin Marinas, Johannes Berg, Mohammed Shafi, wireless On Thu, May 03, 2012 at 12:46:02PM -0500, Larry Finger wrote: > On 05/03/2012 12:12 PM, Paul E. McKenney wrote: > >On Thu, May 03, 2012 at 11:54:29AM -0500, Larry Finger wrote: > >>On 05/03/2012 03:47 AM, Catalin Marinas wrote: > >>> > >>>IIUC, Paul suggested that you should use rcu_dereference_check() here > >>>instead as the protected one is not safe in this context. > >> > >>This patch also fails to fix the problem. Did I do what Paul suggested? > > > >That is indeed what I suggested! > > > >What locks does lockdep say are held? > > It varies from instance to instance. I have seen 1, 2, or 3. Those > are the following: > > #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 > #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] > run_timer_softirq+0xfa/0x6e0 > #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa03c1ff0>] > sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] > > > When only 1 lock is held, it is "&tid_tx->session_timer", and that > one is held in every case. I think that means we need to OR it with > the other lockdep_is_held() arguments in the rcu_dereference_xxxxx() > call, but I did not seem to get the syntax right. If any one of several locks (call them a, b, and c) is sufficient to protect the data, and if an rcu_read_lock() also suffices, then something like the following would work: return rcu_dereference_check(sta->ampdu_mlme.tid_tx[tid], lockdep_is_held(&a) || lockdep_is_held(&b) || lockdep_is_held(&c)); But I must defer to the developers and maintainers of that code as to exactly which combinations of locks and RCU are required. The rcu_read_lock_held() is supplied by rcu_dereference_check(), so I am surprised that you got a splat that included rcu_read_lock. Thanx, Paul ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Suspicious RCU usage in mac80211 2012-05-03 18:22 ` Paul E. McKenney @ 2012-05-03 18:32 ` Larry Finger 0 siblings, 0 replies; 25+ messages in thread From: Larry Finger @ 2012-05-03 18:32 UTC (permalink / raw) To: paulmck; +Cc: Catalin Marinas, Johannes Berg, Mohammed Shafi, wireless On 05/03/2012 01:22 PM, Paul E. McKenney wrote: > On Thu, May 03, 2012 at 12:46:02PM -0500, Larry Finger wrote: >> On 05/03/2012 12:12 PM, Paul E. McKenney wrote: >>> On Thu, May 03, 2012 at 11:54:29AM -0500, Larry Finger wrote: >>>> On 05/03/2012 03:47 AM, Catalin Marinas wrote: >>>>> >>>>> IIUC, Paul suggested that you should use rcu_dereference_check() here >>>>> instead as the protected one is not safe in this context. >>>> >>>> This patch also fails to fix the problem. Did I do what Paul suggested? >>> >>> That is indeed what I suggested! >>> >>> What locks does lockdep say are held? >> >> It varies from instance to instance. I have seen 1, 2, or 3. Those >> are the following: >> >> #0: (scan_mutex){+.+...}, at: [<ffffffff8113b0d6>] kmemleak_scan_thread+0x56/0xd0 >> #1: (&tid_tx->session_timer){+.-...}, at: [<ffffffff8104853a>] >> run_timer_softirq+0xfa/0x6e0 >> #2: (rcu_read_lock){.+.+..}, at: [<ffffffffa03c1ff0>] >> sta_tx_agg_session_timer_expired+0x0/0x2a0 [mac80211] >> >> >> When only 1 lock is held, it is "&tid_tx->session_timer", and that >> one is held in every case. I think that means we need to OR it with >> the other lockdep_is_held() arguments in the rcu_dereference_xxxxx() >> call, but I did not seem to get the syntax right. > > If any one of several locks (call them a, b, and c) is sufficient to > protect the data, and if an rcu_read_lock() also suffices, then something > like the following would work: > > return rcu_dereference_check(sta->ampdu_mlme.tid_tx[tid], > lockdep_is_held(&a) || > lockdep_is_held(&b) || > lockdep_is_held(&c)); > > But I must defer to the developers and maintainers of that code > as to exactly which combinations of locks and RCU are required. > The rcu_read_lock_held() is supplied by rcu_dereference_check(), so I > am surprised that you got a splat that included rcu_read_lock. That was before I switched to rcu_dereference_check(). Sorry if that misled you. Larry ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2012-05-04 14:36 UTC | newest] Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2012-04-10 3:19 Suspicious RCU usage in mac80211 Larry Finger 2012-04-12 3:31 ` Johannes Berg 2012-04-12 3:51 ` Larry Finger 2012-04-12 3:54 ` Johannes Berg 2012-04-12 15:51 ` Larry Finger 2012-04-12 15:55 ` Johannes Berg 2012-05-01 14:25 ` Mohammed Shafi 2012-05-01 19:18 ` Larry Finger 2012-05-02 5:02 ` Mohammed Shafi 2012-05-02 10:00 ` Catalin Marinas 2012-05-02 17:07 ` Johannes Berg 2012-05-02 20:09 ` Paul E. McKenney 2012-05-03 18:38 ` Johannes Berg 2012-05-04 6:17 ` Larry Finger 2012-05-04 6:40 ` Mohammed Shafi 2012-05-04 6:48 ` Mohammed Shafi 2012-05-04 13:45 ` Larry Finger 2012-05-04 14:35 ` Mohammed Shafi 2012-05-03 3:02 ` Larry Finger 2012-05-03 8:47 ` Catalin Marinas 2012-05-03 16:54 ` Larry Finger 2012-05-03 17:12 ` Paul E. McKenney 2012-05-03 17:46 ` Larry Finger 2012-05-03 18:22 ` Paul E. McKenney 2012-05-03 18:32 ` Larry Finger
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.