From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758117Ab3BKQwc (ORCPT ); Mon, 11 Feb 2013 11:52:32 -0500 Received: from mail-ie0-f172.google.com ([209.85.223.172]:36126 "EHLO mail-ie0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757893Ab3BKQw3 (ORCPT ); Mon, 11 Feb 2013 11:52:29 -0500 Message-ID: <1360601545.20362.24.camel@edumazet-glaptop> Subject: Re: kernel 3.7.6, l2tp, qdisc_tx circular locking From: Eric Dumazet To: Denys Fedoryshchenko Cc: netdev@vger.kernel.org, davem@davemloft.net, linux-kernel@vger.kernel.org Date: Mon, 11 Feb 2013 08:52:25 -0800 In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.2.3-0ubuntu6 Content-Transfer-Encoding: 7bit Mime-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 2013-02-11 at 14:28 +0200, Denys Fedoryshchenko wrote: > Hello, > > I was running l2tp connections with accel-ppp as test, and got this: > Nothing unusual in kernel, x86_64, 2 CPU, gentoo, gcc 4.6.3 > If you need more information - let me know please. > > > [ 7575.690820] > [ 7575.690834] ====================================================== > [ 7575.690850] [ INFO: possible circular locking dependency detected ] > [ 7575.690870] 3.7.6-wbuild #6 Not tainted > [ 7575.690882] ------------------------------------------------------- > [ 7575.690901] swapper/1/0 is trying to acquire lock: > [ 7575.690916] (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, > at: [] dev_queue_xmit+0x90/0x34d > [ 7575.690972] > [ 7575.690972] but task is already holding lock: > [ 7575.690988] (l2tp_sock){+.-...}, at: [] > l2tp_xmit_skb+0x1b8/0x453 [l2tp_core] > [ 7575.691037] > [ 7575.691037] which lock already depends on the new lock. > [ 7575.691037] > [ 7575.691059] > [ 7575.691059] the existing dependency chain (in reverse order) is: > [ 7575.691082] > [ 7575.691082] -> #4 (l2tp_sock){+.-...}: > [ 7575.691119] [] lock_acquire+0x7e/0x94 > [ 7575.691141] [] _raw_spin_lock+0x2c/0x3b > [ 7575.691615] [] l2tp_xmit_skb+0x1b8/0x453 > [l2tp_core] > [ 7575.692088] [] pppol2tp_xmit+0x122/0x15d > [l2tp_ppp] > [ 7575.692560] [] ppp_channel_push+0x48/0xad > [ 7575.693031] [] ppp_write+0xc1/0xcd > [ 7575.693499] [] vfs_write+0xa3/0xca > [ 7575.693969] [] sys_write+0x54/0x78 > [ 7575.694436] [] > system_call_fastpath+0x16/0x1b > [ 7575.694907] > [ 7575.694907] -> #3 (&(&pch->downl)->rlock){+.-...}: > [ 7575.695830] [] lock_acquire+0x7e/0x94 > [ 7575.696297] [] _raw_spin_lock_bh+0x31/0x40 > [ 7575.696767] [] ppp_push+0x69/0x507 > [ 7575.697235] [] ppp_xmit_process+0x44a/0x4ff > [ 7575.697703] [] ppp_write+0xb7/0xcd > [ 7575.698170] [] vfs_write+0xa3/0xca > [ 7575.698638] [] sys_write+0x54/0x78 > [ 7575.699105] [] > system_call_fastpath+0x16/0x1b > [ 7575.699575] > [ 7575.699575] -> #2 (&(&ppp->wlock)->rlock){+.-...}: > [ 7575.700498] [] lock_acquire+0x7e/0x94 > [ 7575.700966] [] _raw_spin_lock_bh+0x31/0x40 > [ 7575.701435] [] ppp_xmit_process+0x23/0x4ff > [ 7575.701905] [] ppp_start_xmit+0x128/0x143 > [ 7575.702373] [] > dev_hard_start_xmit+0x267/0x371 > [ 7575.702843] [] sch_direct_xmit+0x70/0x14f > [ 7575.703312] [] dev_queue_xmit+0x152/0x34d > [ 7575.703869] [] neigh_direct_output+0xc/0xe > [ 7575.704336] [] > ip_finish_output2+0x268/0x2e5 > [ 7575.704806] [] ip_finish_output+0x46/0x4b > [ 7575.705274] [] ip_output+0x63/0x67 > [ 7575.705742] [] ip_forward_finish+0x6b/0x70 > [ 7575.706211] [] ip_forward+0x205/0x285 > [ 7575.706678] [] ip_rcv_finish+0x2b3/0x2cb > [ 7575.707147] [] > NF_HOOK.constprop.8+0x4c/0x55 > [ 7575.707617] [] ip_rcv+0x22b/0x259 > [ 7575.708084] [] > __netif_receive_skb+0x458/0x4c5 > [ 7575.708556] [] netif_receive_skb+0x56/0x8b > [ 7575.709026] [] napi_skb_finish+0x21/0x53 > [ 7575.709495] [] napi_gro_receive+0x2a/0x2f > [ 7575.709963] [] igb_poll+0x634/0x909 > [ 7575.710430] [] net_rx_action+0xa3/0x1b9 > [ 7575.710897] [] __do_softirq+0xa8/0x157 > [ 7575.711368] [] call_softirq+0x1c/0x26 > [ 7575.711835] [] do_softirq+0x38/0x83 > [ 7575.712304] [] irq_exit+0x4e/0xad > [ 7575.712771] [] do_IRQ+0x89/0xa0 > [ 7575.713236] [] ret_from_intr+0x0/0x13 > [ 7575.713703] [] cpuidle_enter+0x12/0x14 > [ 7575.714171] [] > cpuidle_enter_state+0x10/0x39 > [ 7575.714642] [] cpuidle_idle_call+0x7e/0xa4 > [ 7575.715112] [] cpu_idle+0x58/0xa2 > [ 7575.715579] [] start_secondary+0x188/0x18d > [ 7575.716049] > [ 7575.716049] -> #1 (_xmit_PPP#2){+.-...}: > [ 7575.716977] [] lock_acquire+0x7e/0x94 > [ 7575.717443] [] _raw_spin_lock+0x2c/0x3b > [ 7575.718000] [] sch_direct_xmit+0x49/0x14f > [ 7575.718467] [] dev_queue_xmit+0x152/0x34d > [ 7575.718933] [] neigh_direct_output+0xc/0xe > [ 7575.719403] [] > ip_finish_output2+0x268/0x2e5 > [ 7575.719873] [] ip_finish_output+0x46/0x4b > [ 7575.720344] [] ip_output+0x63/0x67 > [ 7575.720812] [] ip_forward_finish+0x6b/0x70 > [ 7575.721282] [] ip_forward+0x205/0x285 > [ 7575.721750] [] ip_rcv_finish+0x2b3/0x2cb > [ 7575.722219] [] > NF_HOOK.constprop.8+0x4c/0x55 > [ 7575.722689] [] ip_rcv+0x22b/0x259 > [ 7575.723158] [] > __netif_receive_skb+0x458/0x4c5 > [ 7575.723628] [] netif_receive_skb+0x56/0x8b > [ 7575.724098] [] napi_skb_finish+0x21/0x53 > [ 7575.724566] [] napi_gro_receive+0x2a/0x2f > [ 7575.725034] [] igb_poll+0x634/0x909 > [ 7575.725501] [] net_rx_action+0xa3/0x1b9 > [ 7575.725970] [] __do_softirq+0xa8/0x157 > [ 7575.726439] [] call_softirq+0x1c/0x26 > [ 7575.726907] [] do_softirq+0x38/0x83 > [ 7575.727375] [] irq_exit+0x4e/0xad > [ 7575.727841] [] do_IRQ+0x89/0xa0 > [ 7575.728308] [] ret_from_intr+0x0/0x13 > [ 7575.728779] [] cpuidle_enter+0x12/0x14 > [ 7575.740374] [] > cpuidle_enter_state+0x10/0x39 > [ 7575.740846] [] cpuidle_idle_call+0x7e/0xa4 > [ 7575.741315] [] cpu_idle+0x58/0xa2 > [ 7575.741784] [] start_secondary+0x188/0x18d > [ 7575.742256] > [ 7575.742256] -> #0 (dev->qdisc_tx_busylock ?: > &qdisc_tx_busylock){+.-...}: > [ 7575.743190] [] __lock_acquire+0xaf1/0xdce > [ 7575.743661] [] lock_acquire+0x7e/0x94 > [ 7575.744129] [] _raw_spin_lock+0x2c/0x3b > [ 7575.744605] [] dev_queue_xmit+0x90/0x34d > [ 7575.745077] [] > ip_finish_output2+0x25a/0x2e5 > [ 7575.745548] [] ip_finish_output+0x46/0x4b > [ 7575.746017] [] ip_output+0x63/0x67 > [ 7575.746570] [] dst_output+0x18/0x1c > [ 7575.747036] [] ip_local_out+0x1b/0x1f > [ 7575.747502] [] ip_queue_xmit+0x2d3/0x337 > [ 7575.747970] [] l2tp_xmit_skb+0x404/0x453 > [l2tp_core] > [ 7575.748442] [] pppol2tp_xmit+0x122/0x15d > [l2tp_ppp] > [ 7575.748916] [] ppp_push+0x7f/0x507 > [ 7575.749383] [] ppp_xmit_process+0x44a/0x4ff > [ 7575.749852] [] ppp_start_xmit+0x128/0x143 > [ 7575.750320] [] > dev_hard_start_xmit+0x2ef/0x371 > [ 7575.750790] [] sch_direct_xmit+0x70/0x14f > [ 7575.751258] [] dev_queue_xmit+0x152/0x34d > [ 7575.751727] [] neigh_direct_output+0xc/0xe > [ 7575.752196] [] > ip_finish_output2+0x268/0x2e5 > [ 7575.752667] [] ip_finish_output+0x46/0x4b > [ 7575.753136] [] ip_output+0x63/0x67 > [ 7575.753604] [] ip_forward_finish+0x6b/0x70 > [ 7575.754073] [] ip_forward+0x205/0x285 > [ 7575.754541] [] ip_rcv_finish+0x2b3/0x2cb > [ 7575.755009] [] > NF_HOOK.constprop.8+0x4c/0x55 > [ 7575.755478] [] ip_rcv+0x22b/0x259 > [ 7575.755945] [] > __netif_receive_skb+0x458/0x4c5 > [ 7575.756414] [] netif_receive_skb+0x56/0x8b > [ 7575.756884] [] napi_gro_complete+0xd1/0xdc > [ 7575.757354] [] napi_gro_flush+0x4c/0x68 > [ 7575.757822] [] napi_complete+0x19/0x4e > [ 7575.758290] [] igb_poll+0x6c5/0x909 > [ 7575.758759] [] net_rx_action+0xa3/0x1b9 > [ 7575.759228] [] __do_softirq+0xa8/0x157 > [ 7575.759697] [] call_softirq+0x1c/0x26 > [ 7575.760164] [] do_softirq+0x38/0x83 > [ 7575.760721] [] irq_exit+0x4e/0xad > [ 7575.761184] [] do_IRQ+0x89/0xa0 > [ 7575.761648] [] ret_from_intr+0x0/0x13 > [ 7575.762117] [] cpuidle_enter+0x12/0x14 > [ 7575.762585] [] > cpuidle_enter_state+0x10/0x39 > [ 7575.763056] [] cpuidle_idle_call+0x7e/0xa4 > [ 7575.763525] [] cpu_idle+0x58/0xa2 > [ 7575.763992] [] start_secondary+0x188/0x18d > [ 7575.764462] > [ 7575.764462] other info that might help us debug this: > [ 7575.764462] > [ 7575.765817] Chain exists of: > [ 7575.765817] dev->qdisc_tx_busylock ?: &qdisc_tx_busylock --> > &(&pch->downl)->rlock --> l2tp_sock > [ 7575.765817] > [ 7575.767652] Possible unsafe locking scenario: > [ 7575.767652] > [ 7575.768560] CPU0 CPU1 > [ 7575.769021] ---- ---- > [ 7575.769482] lock(l2tp_sock); > [ 7575.769948] > lock(&(&pch->downl)->rlock); > [ 7575.770422] lock(l2tp_sock); > [ 7575.770895] lock(dev->qdisc_tx_busylock ?: &qdisc_tx_busylock); > [ 7575.771368] > [ 7575.771368] *** DEADLOCK *** > [ 7575.771368] > [ 7575.772824] 10 locks held by swapper/1/0: > [ 7575.773283] #0: (rcu_read_lock){.+.+..}, at: [] > rcu_lock_acquire+0x0/0x29 > [ 7575.774223] #1: (rcu_read_lock_bh){.+....}, at: > [] rcu_lock_acquire+0x0/0x29 > [ 7575.775248] #2: (rcu_read_lock_bh){.+....}, at: > [] rcu_lock_acquire+0x0/0x29 > [ 7575.776184] #3: (_xmit_PPP#2){+.-...}, at: [] > sch_direct_xmit+0x49/0x14f > [ 7575.777133] #4: (&(&ppp->wlock)->rlock){+.-...}, at: > [] ppp_xmit_process+0x23/0x4ff > [ 7575.778072] #5: (&(&pch->downl)->rlock){+.-...}, at: > [] ppp_push+0x69/0x507 > [ 7575.779010] #6: (l2tp_sock){+.-...}, at: [] > l2tp_xmit_skb+0x1b8/0x453 [l2tp_core] > [ 7575.779951] #7: (rcu_read_lock){.+.+..}, at: [] > rcu_lock_acquire+0x0/0x29 > [ 7575.780891] #8: (rcu_read_lock_bh){.+....}, at: > [] rcu_lock_acquire+0x0/0x29 > [ 7575.781830] #9: (rcu_read_lock_bh){.+....}, at: > [] rcu_lock_acquire+0x0/0x29 > [ 7575.782769] > [ 7575.782769] stack backtrace: > [ 7575.783674] Pid: 0, comm: swapper/1 Not tainted 3.7.6-wbuild #6 > [ 7575.784139] Call Trace: > [ 7575.784596] [] > print_circular_bug+0x1f8/0x209 > [ 7575.785075] [] __lock_acquire+0xaf1/0xdce > [ 7575.785540] [] lock_acquire+0x7e/0x94 > [ 7575.786004] [] ? dev_queue_xmit+0x90/0x34d > [ 7575.786470] [] ? dev_seq_stop+0xb/0xb > [ 7575.786935] [] _raw_spin_lock+0x2c/0x3b > [ 7575.787398] [] ? dev_queue_xmit+0x90/0x34d > [ 7575.787864] [] dev_queue_xmit+0x90/0x34d > [ 7575.788328] [] ip_finish_output2+0x25a/0x2e5 > [ 7575.788794] [] ip_finish_output+0x46/0x4b > [ 7575.789347] [] ip_output+0x63/0x67 > [ 7575.789808] [] dst_output+0x18/0x1c > [ 7575.790271] [] ip_local_out+0x1b/0x1f > [ 7575.790735] [] ip_queue_xmit+0x2d3/0x337 > [ 7575.791200] [] l2tp_xmit_skb+0x404/0x453 > [l2tp_core] > [ 7575.791668] [] pppol2tp_xmit+0x122/0x15d > [l2tp_ppp] > [ 7575.792135] [] ppp_push+0x7f/0x507 > [ 7575.792600] [] ? > _raw_spin_unlock_irqrestore+0x3a/0x41 > [ 7575.793069] [] ? > trace_hardirqs_on_caller+0x107/0x158 > [ 7575.793536] [] ? trace_hardirqs_on+0xd/0xf > [ 7575.794002] [] ppp_xmit_process+0x44a/0x4ff > [ 7575.794467] [] ppp_start_xmit+0x128/0x143 > [ 7575.794933] [] dev_hard_start_xmit+0x2ef/0x371 > [ 7575.795400] [] sch_direct_xmit+0x70/0x14f > [ 7575.795864] [] dev_queue_xmit+0x152/0x34d > [ 7575.796328] [] neigh_direct_output+0xc/0xe > [ 7575.796795] [] ip_finish_output2+0x268/0x2e5 > [ 7575.797261] [] ip_finish_output+0x46/0x4b > [ 7575.797726] [] ip_output+0x63/0x67 > [ 7575.798190] [] ip_forward_finish+0x6b/0x70 > [ 7575.798656] [] ip_forward+0x205/0x285 > [ 7575.799120] [] ip_rcv_finish+0x2b3/0x2cb > [ 7575.799586] [] ? skb_dst.isra.7+0x58/0x58 > [ 7575.800051] [] NF_HOOK.constprop.8+0x4c/0x55 > [ 7575.800517] [] ip_rcv+0x22b/0x259 > [ 7575.800981] [] __netif_receive_skb+0x458/0x4c5 > [ 7575.801448] [] netif_receive_skb+0x56/0x8b > [ 7575.801913] [] napi_gro_complete+0xd1/0xdc > [ 7575.802378] [] napi_gro_flush+0x4c/0x68 > [ 7575.802843] [] ? rcu_read_unlock+0x1c/0x1e > [ 7575.803395] [] napi_complete+0x19/0x4e > [ 7575.803856] [] igb_poll+0x6c5/0x909 > [ 7575.804317] [] ? __lock_acquire+0x5b9/0xdce > [ 7575.804783] [] net_rx_action+0xa3/0x1b9 > [ 7575.805247] [] ? __do_softirq+0x70/0x157 > [ 7575.805712] [] __do_softirq+0xa8/0x157 > [ 7575.806176] [] call_softirq+0x1c/0x26 > [ 7575.806641] [] do_softirq+0x38/0x83 > [ 7575.807106] [] irq_exit+0x4e/0xad > [ 7575.807569] [] do_IRQ+0x89/0xa0 > [ 7575.808032] [] common_interrupt+0x6f/0x6f > [ 7575.808495] [] ? > retint_restore_args+0xe/0xe > [ 7575.808974] [] ? intel_idle+0xeb/0x111 > [ 7575.809443] [] ? intel_idle+0xe4/0x111 > [ 7575.809909] [] cpuidle_enter+0x12/0x14 > [ 7575.810373] [] cpuidle_enter_state+0x10/0x39 > [ 7575.810840] [] cpuidle_idle_call+0x7e/0xa4 > [ 7575.811306] [] cpu_idle+0x58/0xa2 > [ 7575.811770] [] start_secondary+0x188/0x18d Hmm... It seems keeping the socket locked is not needed and painful for LOCKDEP. Could you try the following patch ? diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c index dcfd64e..7baf88f 100644 --- a/net/l2tp/l2tp_core.c +++ b/net/l2tp/l2tp_core.c @@ -1251,8 +1251,11 @@ int l2tp_xmit_skb(struct l2tp_session *session, struct sk_buff *skb, int hdr_len } l2tp_skb_set_owner_w(skb, sk); + bh_unlock_sock(sk); l2tp_xmit_core(session, skb, fl, data_len); + return ret; + out_unlock: bh_unlock_sock(sk);