All of lore.kernel.org
 help / color / mirror / Atom feed
* Deadlock, L2TP over IP are not working, 3.4.1
@ 2012-06-06  9:54 Denys Fedoryshchenko
  2012-06-07 20:53 ` Francois Romieu
  2012-06-25 15:35 ` [PATCH] net: l2tp_eth: use LLTX to avoid LOCKDEP splats Eric Dumazet
  0 siblings, 2 replies; 13+ messages in thread
From: Denys Fedoryshchenko @ 2012-06-06  9:54 UTC (permalink / raw)
  To: davem, netdev, linux-kernel

It seems l2tp are not working, at least for me, due some bug

Script i uses, to reproduce:
SERVER=192.168.11.2
LOCALIP=`curl http://${SERVER}:8080/myip`
ID=`curl http://${SERVER}:8080/tunid` # It will generate some number, 
let's say 2
echo ID: ${ID}
modprobe l2tp_ip
modprobe l2tp_eth
ip l2tp add tunnel remote ${SERVER} local ${LOCALIP} tunnel_id ${ID} 
peer_tunnel_id ${ID} encap ip
ip l2tp add session name tun100 tunnel_id ${ID} session_id 1 
peer_session_id 1
ip link set dev tun100 up
ip addr add dev tun100 10.0.6.${ID}/24

Here is report for latest stable kernel. I can reproduce it on multiple 
pc's.
It is new setup, so i am not sure it was working on old kernels or not 
(regression or not).

[ 8683.927442] ======================================================
[ 8683.927555] [ INFO: possible circular locking dependency detected ]
[ 8683.927672] 3.4.1-build-0061 #14 Not tainted
[ 8683.927782] -------------------------------------------------------
[ 8683.927895] swapper/0/0 is trying to acquire lock:
[ 8683.928007]  (slock-AF_INET){+.-...}, at: [<e0fc73ec>] 
l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]
[ 8683.928121] but task is already holding lock:
[ 8683.928121]  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>] 
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] which lock already depends on the new lock.
[ 8683.928121]
[ 8683.928121]
[ 8683.928121] the existing dependency chain (in reverse order) is:
[ 8683.928121]
[ 8683.928121] -> #1 (_xmit_ETHER#2){+.-...}:
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<c0304e0c>] ip_send_reply+0xf2/0x1ce
[ 8683.928121]        [<c0317dbc>] tcp_v4_send_reset+0x153/0x16f
[ 8683.928121]        [<c0317f4a>] tcp_v4_do_rcv+0x172/0x194
[ 8683.928121]        [<c031929b>] tcp_v4_rcv+0x387/0x5a0
[ 8683.928121]        [<c03001d0>] ip_local_deliver_finish+0x13a/0x1e9
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c030075b>] ip_local_deliver+0x41/0x45
[ 8683.928121]        [<c03005dd>] ip_rcv_finish+0x31a/0x33c
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c0300960>] ip_rcv+0x201/0x23d
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02deae8>] netif_receive_skb+0x4e/0x7d
[ 8683.928121]        [<e08d5ef3>] rtl8139_poll+0x243/0x33d [8139too]
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] -> #0 (slock-AF_INET){+.-...}:
[ 8683.928121]        [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e 
[l2tp_core]
[ 8683.928121]        [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f 
[l2tp_eth]
[ 8683.928121]        [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]        [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]        [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]        [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]        [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] other info that might help us debug this:
[ 8683.928121]
[ 8683.928121]  Possible unsafe locking scenario:
[ 8683.928121]
[ 8683.928121]        CPU0                    CPU1
[ 8683.928121]        ----                    ----
[ 8683.928121]   lock(_xmit_ETHER#2);
[ 8683.928121]                                lock(slock-AF_INET);
[ 8683.928121]                                lock(_xmit_ETHER#2);
[ 8683.928121]   lock(slock-AF_INET);
[ 8683.928121]
[ 8683.928121]  *** DEADLOCK ***
[ 8683.928121]
[ 8683.928121] 3 locks held by swapper/0/0:
[ 8683.928121]  #0:  (rcu_read_lock){.+.+..}, at: [<c02dbc10>] 
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #1:  (rcu_read_lock_bh){.+....}, at: [<c02dbc10>] 
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #2:  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>] 
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] stack backtrace:
[ 8683.928121] Pid: 0, comm: swapper/0 Not tainted 3.4.1-build-0061 #14
[ 8683.928121] Call Trace:
[ 8683.928121]  [<c034bdd2>] ? printk+0x18/0x1a
[ 8683.928121]  [<c0158904>] print_circular_bug+0x1ac/0x1b6
[ 8683.928121]  [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]  [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f [l2tp_eth]
[ 8683.928121]  [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]  [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]  [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]  [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]  [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]  [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]  [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]  [<c012b23a>] ? local_bh_enable+0xd/0xd
[ 8683.928121]  <IRQ>  [<c012b4d0>] ? irq_exit+0x41/0x91
[ 8683.928121]  [<c0103c6f>] ? do_IRQ+0x79/0x8d
[ 8683.928121]  [<c0157ea1>] ? trace_hardirqs_off_caller+0x2e/0x86
[ 8683.928121]  [<c034ef6e>] ? common_interrupt+0x2e/0x34
[ 8683.928121]  [<c0108a33>] ? default_idle+0x23/0x38
[ 8683.928121]  [<c01091a8>] ? cpu_idle+0x55/0x6f
[ 8683.928121]  [<c033df25>] ? rest_init+0xa1/0xa7
[ 8683.928121]  [<c033de84>] ? __read_lock_failed+0x14/0x14
[ 8683.928121]  [<c0498745>] ? start_kernel+0x303/0x30a
[ 8683.928121]  [<c0498209>] ? repair_env_string+0x51/0x51
[ 8683.928121]  [<c04980a8>] ? i386_start_kernel+0xa8/0xaf


[158595.436934]
[158595.437018] ======================================================
[158595.437111] [ INFO: possible circular locking dependency detected ]
[158595.437198] 3.4.0-build-0061 #12 Tainted: G        W
[158595.437281] -------------------------------------------------------
[158595.437365] swapper/0/0 is trying to acquire lock:
[158595.437447]  (slock-AF_INET){+.-...}, at: [<f86453ec>] 
l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[158595.437613]
[158595.437613] but task is already holding lock:
[158595.437763]  (_xmit_ETHER#2){+.-...}, at: [<c02f09b9>] 
sch_direct_xmit+0x36/0x119
[158595.437837]
[158595.437837] which lock already depends on the new lock.
[158595.437837]
[158595.437837]
[158595.437837] the existing dependency chain (in reverse order) is:
[158595.437837]
[158595.437837] -> #1 (_xmit_ETHER#2){+.-...}:
[158595.437837]        [<c015a6d1>] lock_acquire+0x71/0x85
[158595.437837]        [<c034de94>] _raw_spin_lock_irqsave+0x40/0x50
[158595.437837]        [<c017c1f2>] get_page_from_freelist+0x227/0x398
[158595.437837]        [<c017c5a7>] __alloc_pages_nodemask+0xef/0x5f9
[158595.437837]        [<c019c34f>] alloc_slab_page+0x1d/0x21
[158595.437837]        [<c019c39f>] new_slab+0x4c/0x164
[158595.437837]        [<c019d259>] 
__slab_alloc.clone.59.clone.64+0x247/0x2de
[158595.437837]        [<c019dd21>] __kmalloc_track_caller+0x55/0xa4
[158595.437837]        [<c02d56fb>] __alloc_skb+0x51/0x100
[158595.437837]        [<c02d2cfa>] sock_alloc_send_pskb+0x9e/0x263
[158595.437837]        [<c02d2ed7>] sock_alloc_send_skb+0x18/0x1d
[158595.437837]        [<c0303e04>] 
__ip_append_data.clone.52+0x302/0x6dc
[158595.437837]        [<c030494c>] ip_append_data+0x80/0x88
[158595.437837]        [<c03209dd>] icmp_push_reply+0x5c/0x101
[158595.437837]        [<c0321555>] icmp_send+0x31d/0x342
[158595.437837]        [<f862b05c>] send_unreach+0x19/0x1b [ipt_REJECT]
[158595.437837]        [<f862b0f5>] reject_tg+0x53/0x2de [ipt_REJECT]
[158595.437837]        [<c033359a>] ipt_do_table+0x3ad/0x410
[158595.437837]        [<f856c0c4>] iptable_filter_hook+0x56/0x5e 
[iptable_filter]
[158595.437837]        [<c02f9941>] nf_iterate+0x36/0x5c
[158595.437837]        [<c02f99bf>] nf_hook_slow+0x58/0xf1
[158595.437837]        [<c0301f33>] ip_forward+0x295/0x2a2
[158595.437837]        [<c0300969>] ip_rcv_finish+0x31a/0x33c
[158595.437837]        [<c03009d1>] NF_HOOK.clone.11+0x46/0x4d
[158595.437837]        [<c0300cec>] ip_rcv+0x201/0x23d
[158595.437837]        [<c02deca7>] __netif_receive_skb+0x329/0x378
[158595.437837]        [<c02dee74>] netif_receive_skb+0x4e/0x7d
[158595.437837]        [<c02def60>] napi_skb_finish+0x1e/0x34
[158595.437837]        [<c02df389>] napi_gro_receive+0x20/0x24
[158595.437837]        [<f850e213>] rtl8169_poll+0x2e6/0x52c [r8169]
[158595.437837]        [<c02df48f>] net_rx_action+0x90/0x15d
[158595.437837]        [<c012b42d>] __do_softirq+0x7b/0x118
[158595.437837]
[158595.437837] -> #0 (slock-AF_INET){+.-...}:
[158595.437837]        [<c015a08b>] __lock_acquire+0x9a3/0xc27
[158595.437837]        [<c015a6d1>] lock_acquire+0x71/0x85
[158595.437837]        [<c034ddad>] _raw_spin_lock+0x33/0x40
[158595.437837]        [<f86453ec>] l2tp_xmit_skb+0x173/0x47e 
[l2tp_core]
[158595.437837]        [<f86591fb>] l2tp_eth_dev_xmit+0x1a/0x2f 
[l2tp_eth]
[158595.437837]        [<c02e0573>] dev_hard_start_xmit+0x333/0x3f2
[158595.437837]        [<c02f09d8>] sch_direct_xmit+0x55/0x119
[158595.437837]        [<c02e08b4>] dev_queue_xmit+0x282/0x418
[158595.437837]        [<c031f887>] NF_HOOK.clone.19+0x45/0x4c
[158595.437837]        [<c031f8b0>] arp_xmit+0x22/0x24
[158595.437837]        [<c031f8f3>] arp_send+0x41/0x48
[158595.437837]        [<c031fe09>] arp_process+0x289/0x491
[158595.437837]        [<c031f887>] NF_HOOK.clone.19+0x45/0x4c
[158595.437837]        [<c031fb2c>] arp_rcv+0xb1/0xc3
[158595.437837]        [<c02deca7>] __netif_receive_skb+0x329/0x378
[158595.437837]        [<c02ded5f>] process_backlog+0x69/0x130
[158595.437837]        [<c02df48f>] net_rx_action+0x90/0x15d
[158595.437837]        [<c012b42d>] __do_softirq+0x7b/0x118
[158595.437837]
[158595.437837] other info that might help us debug this:
[158595.437837]
[158595.437837]  Possible unsafe locking scenario:
[158595.437837]
[158595.437837]        CPU0                    CPU1
[158595.437837]        ----                    ----
[158595.437837]   lock(_xmit_ETHER#2);
[158595.437837]                                lock(slock-AF_INET);
[158595.437837]                                lock(_xmit_ETHER#2);
[158595.437837]   lock(slock-AF_INET);
[158595.437837]
[158595.437837]  *** DEADLOCK ***
[158595.437837]
[158595.437837] 3 locks held by swapper/0/0:
[158595.437837]  #0:  (rcu_read_lock){.+.+..}, at: [<c02dbf9c>] 
rcu_lock_acquire+0x0/0x30
[158595.437837]  #1:  (rcu_read_lock_bh){.+....}, at: [<c02dbf9c>] 
rcu_lock_acquire+0x0/0x30
[158595.437837]  #2:  (_xmit_ETHER#2){+.-...}, at: [<c02f09b9>] 
sch_direct_xmit+0x36/0x119
[158595.437837]
[158595.437837] stack backtrace:
[158595.437837] Pid: 0, comm: swapper/0 Tainted: G        W    
3.4.0-build-0061 #12
[158595.437837] Call Trace:
[158595.437837]  [<c034c156>] ? printk+0x18/0x1a
[158595.437837]  [<c0158a74>] print_circular_bug+0x1ac/0x1b6
[158595.437837]  [<c015a08b>] __lock_acquire+0x9a3/0xc27
[158595.437837]  [<c015a6d1>] lock_acquire+0x71/0x85
[158595.437837]  [<f86453ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[158595.437837]  [<c034ddad>] _raw_spin_lock+0x33/0x40
[158595.437837]  [<f86453ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[158595.437837]  [<f86453ec>] l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[158595.437837]  [<f86591fb>] l2tp_eth_dev_xmit+0x1a/0x2f [l2tp_eth]
[158595.437837]  [<c02e0573>] dev_hard_start_xmit+0x333/0x3f2
[158595.437837]  [<c02f09d8>] sch_direct_xmit+0x55/0x119
[158595.437837]  [<c02e08b4>] dev_queue_xmit+0x282/0x418
[158595.437837]  [<c02e0632>] ? dev_hard_start_xmit+0x3f2/0x3f2
[158595.437837]  [<c031f887>] NF_HOOK.clone.19+0x45/0x4c
[158595.437837]  [<c031f8b0>] arp_xmit+0x22/0x24
[158595.437837]  [<c02e0632>] ? dev_hard_start_xmit+0x3f2/0x3f2
[158595.437837]  [<c031f8f3>] arp_send+0x41/0x48
[158595.437837]  [<c031fe09>] arp_process+0x289/0x491
[158595.437837]  [<c031fb80>] ? __neigh_lookup.clone.20+0x42/0x42
[158595.437837]  [<c031f887>] NF_HOOK.clone.19+0x45/0x4c
[158595.437837]  [<c031fb2c>] arp_rcv+0xb1/0xc3
[158595.437837]  [<c031fb80>] ? __neigh_lookup.clone.20+0x42/0x42
[158595.437837]  [<c02deca7>] __netif_receive_skb+0x329/0x378
[158595.437837]  [<c02ded5f>] process_backlog+0x69/0x130
[158595.437837]  [<c02df48f>] net_rx_action+0x90/0x15d
[158595.437837]  [<c012b42d>] __do_softirq+0x7b/0x118
[158595.437837]  [<c013236e>] ? do_send_specific+0xb/0x8f
[158595.437837]  [<c012b3b2>] ? local_bh_enable+0xd/0xd
[158595.437837]  <IRQ>  [<c012b648>] ? irq_exit+0x41/0x91
[158595.437837]  [<c0103c73>] ? do_IRQ+0x79/0x8d
[158595.437837]  [<c0158011>] ? trace_hardirqs_off_caller+0x2e/0x86
[158595.437837]  [<c034f2ee>] ? common_interrupt+0x2e/0x34
[158595.437837]  [<c015007b>] ? ktime_get_ts+0x8f/0x9b
[158595.437837]  [<c0108a0a>] ? mwait_idle+0x50/0x5a
[158595.437837]  [<c01091ac>] ? cpu_idle+0x55/0x6f
[158595.437837]  [<c033e2b1>] ? rest_init+0xa1/0xa7
[158595.437837]  [<c033e210>] ? __read_lock_failed+0x14/0x14
[158595.437837]  [<c049874f>] ? start_kernel+0x30d/0x314
[158595.437837]  [<c0498209>] ? repair_env_string+0x51/0x51
[158595.437837]  [<c04980a8>] ? i386_start_kernel+0xa8/0xaf

[63546.808787]
[63546.809025] ======================================================
[63546.809259] [ INFO: possible circular locking dependency detected ]
[63546.809494] 3.4.1-build-0061 #14 Not tainted
[63546.809685] -------------------------------------------------------
[63546.809685] swapper/0/0 is trying to acquire lock:
[63546.809685]  (slock-AF_INET){+.-...}, at: [<f8c593ec>] 
l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[63546.809685]
[63546.809685] but task is already holding lock:
[63546.809685]  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>] 
sch_direct_xmit+0x36/0x119
[63546.809685]
[63546.809685] which lock already depends on the new lock.
[63546.809685]
[63546.809685]
[63546.809685] the existing dependency chain (in reverse order) is:
[63546.809685]
[63546.809685] -> #1 (_xmit_ETHER#2){+.-...}:
[63546.809685]        [<c015a561>] lock_acquire+0x71/0x85
[63546.809685]        [<c034dc06>] _raw_spin_lock_bh+0x38/0x45
[63546.809685]        [<c02a4e8a>] ppp_push+0x59/0x4b3
[63546.809685]        [<c02a66b9>] ppp_xmit_process+0x41b/0x4be
[63546.809685]        [<c02a69b9>] ppp_write+0x90/0xa1
[63546.809685]        [<c01a2e8c>] vfs_write+0x7e/0xab
[63546.809685]        [<c01a2ffc>] sys_write+0x3d/0x5e
[63546.809685]        [<c034e191>] syscall_call+0x7/0xb
[63546.809685]
[63546.809685] -> #0 (slock-AF_INET){+.-...}:
[63546.809685]        [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[63546.809685]        [<c015a561>] lock_acquire+0x71/0x85
[63546.809685]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[63546.809685]        [<f8c593ec>] l2tp_xmit_skb+0x173/0x47e 
[l2tp_core]
[63546.809685]        [<f8c751fb>] l2tp_eth_dev_xmit+0x1a/0x2f 
[l2tp_eth]
[63546.809685]        [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[63546.809685]        [<c02f064c>] sch_direct_xmit+0x55/0x119
[63546.809685]        [<c02e0528>] dev_queue_xmit+0x282/0x418
[63546.809685]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[63546.809685]        [<c031f524>] arp_xmit+0x22/0x24
[63546.809685]        [<c031f567>] arp_send+0x41/0x48
[63546.809685]        [<c031fa7d>] arp_process+0x289/0x491
[63546.809685]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[63546.809685]        [<c031f7a0>] arp_rcv+0xb1/0xc3
[63546.809685]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[63546.809685]        [<c02de9d3>] process_backlog+0x69/0x130
[63546.809685]        [<c02df103>] net_rx_action+0x90/0x15d
[63546.809685]        [<c012b2b5>] __do_softirq+0x7b/0x118
[63546.809685]
[63546.809685] other info that might help us debug this:
[63546.809685]
[63546.809685]  Possible unsafe locking scenario:
[63546.809685]
[63546.809685]        CPU0                    CPU1
[63546.809685]        ----                    ----
[63546.809685]   lock(_xmit_ETHER#2);
[63546.809685]                                lock(slock-AF_INET);
[63546.809685]                                lock(_xmit_ETHER#2);
[63546.809685]   lock(slock-AF_INET);
[63546.809685]
[63546.809685]  *** DEADLOCK ***
[63546.809685]
[63546.809685] 3 locks held by swapper/0/0:
[63546.809685]  #0:  (rcu_read_lock){.+.+..}, at: [<c02dbc10>] 
rcu_lock_acquire+0x0/0x30
[63546.809685]  #1:  (rcu_read_lock_bh){.+....}, at: [<c02dbc10>] 
rcu_lock_acquire+0x0/0x30
[63546.809685]  #2:  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>] 
sch_direct_xmit+0x36/0x119
[63546.809685]
[63546.809685] stack backtrace:
[63546.809685] Pid: 0, comm: swapper/0 Not tainted 3.4.1-build-0061 #14
[63546.809685] Call Trace:
[63546.809685]  [<c034bdd2>] ? printk+0x18/0x1a
[63546.809685]  [<c0158904>] print_circular_bug+0x1ac/0x1b6
[63546.809685]  [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[63546.809685]  [<c015a561>] lock_acquire+0x71/0x85
[63546.809685]  [<f8c593ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[63546.809685]  [<c034da2d>] _raw_spin_lock+0x33/0x40
[63546.809685]  [<f8c593ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[63546.809685]  [<f8c593ec>] l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[63546.809685]  [<f8c751fb>] l2tp_eth_dev_xmit+0x1a/0x2f [l2tp_eth]
[63546.809685]  [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[63546.809685]  [<c02f064c>] sch_direct_xmit+0x55/0x119
[63546.809685]  [<c02e0528>] dev_queue_xmit+0x282/0x418
[63546.809685]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[63546.809685]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[63546.809685]  [<c031f524>] arp_xmit+0x22/0x24
[63546.809685]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[63546.809685]  [<c031f567>] arp_send+0x41/0x48
[63546.809685]  [<c031fa7d>] arp_process+0x289/0x491
[63546.809685]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[63546.809685]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[63546.809685]  [<c031f7a0>] arp_rcv+0xb1/0xc3
[63546.809685]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[63546.809685]  [<c02de91b>] __netif_receive_skb+0x329/0x378
[63546.809685]  [<c02de9d3>] process_backlog+0x69/0x130
[63546.809685]  [<c02df103>] net_rx_action+0x90/0x15d
[63546.809685]  [<c012b2b5>] __do_softirq+0x7b/0x118
[63546.809685]  [<c012b23a>] ? local_bh_enable+0xd/0xd
[63546.809685]  [<c012b23a>] ? local_bh_enable+0xd/0xd
[63546.809685]  <IRQ>  [<c012b4d0>] ? irq_exit+0x41/0x91
[63546.809685]  [<c0103c6f>] ? do_IRQ+0x79/0x8d
[63546.809685]  [<c0157ea1>] ? trace_hardirqs_off_caller+0x2e/0x86
[63546.809685]  [<c034ef6e>] ? common_interrupt+0x2e/0x34
[63546.809685]  [<c015007b>] ? do_gettimeofday+0x20/0x29
[63546.809685]  [<c0108a06>] ? mwait_idle+0x50/0x5a
[63546.809685]  [<c01091a8>] ? cpu_idle+0x55/0x6f
[63546.809685]  [<c033df25>] ? rest_init+0xa1/0xa7
[63546.809685]  [<c033de84>] ? __read_lock_failed+0x14/0x14
[63546.809685]  [<c0498745>] ? start_kernel+0x303/0x30a
[63546.809685]  [<c0498209>] ? repair_env_string+0x51/0x51
[63546.809685]  [<c04980a8>] ? i386_start_kernel+0xa8/0xaf

---
Denys Fedoryshchenko, Network Engineer, Virtual ISP S.A.L.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-06  9:54 Deadlock, L2TP over IP are not working, 3.4.1 Denys Fedoryshchenko
@ 2012-06-07 20:53 ` Francois Romieu
  2012-06-07 21:30   ` Eric Dumazet
  2012-06-25 15:35 ` [PATCH] net: l2tp_eth: use LLTX to avoid LOCKDEP splats Eric Dumazet
  1 sibling, 1 reply; 13+ messages in thread
From: Francois Romieu @ 2012-06-07 20:53 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: davem, netdev, linux-kernel

Denys Fedoryshchenko <denys@visp.net.lb> :
[...]
> [ 8683.927442] ======================================================
> [ 8683.927555] [ INFO: possible circular locking dependency detected ]
> [ 8683.927672] 3.4.1-build-0061 #14 Not tainted
> [ 8683.927782] -------------------------------------------------------
> [ 8683.927895] swapper/0/0 is trying to acquire lock:
> [ 8683.928007]  (slock-AF_INET){+.-...}, at: [<e0fc73ec>]
> l2tp_xmit_skb+0x173/0x47e [l2tp_core]
> [ 8683.928121]
> [ 8683.928121] but task is already holding lock:
> [ 8683.928121]  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>]
> sch_direct_xmit+0x36/0x119
> [ 8683.928121]
> [ 8683.928121] which lock already depends on the new lock.

Any reason why it could not be made LLTX ?

(untested patch against -git, applies to 3.4.1 with some offset)

diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c
index 443591d..5725258 100644
--- a/net/l2tp/l2tp_eth.c
+++ b/net/l2tp/l2tp_eth.c
@@ -36,12 +36,20 @@
 /* Default device name. May be overridden by name specified by user */
 #define L2TP_ETH_DEV_NAME	"l2tpeth%d"
 
+struct l2tp_eth_stats {
+	u64			packets;
+	u64			bytes;
+	struct u64_stats_sync	syncp;
+};
+
 /* via netdev_priv() */
 struct l2tp_eth {
 	struct net_device	*dev;
 	struct sock		*tunnel_sock;
 	struct l2tp_session	*session;
 	struct list_head	list;
+	struct l2tp_eth_stats	rstats;
+	struct l2tp_eth_stats	tstats;
 };
 
 /* via l2tp_session_priv() */
@@ -87,25 +95,56 @@ static int l2tp_eth_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct l2tp_eth *priv = netdev_priv(dev);
 	struct l2tp_session *session = priv->session;
+	struct l2tp_eth_stats *tstats = &priv->tstats;
 
 	l2tp_xmit_skb(session, skb, session->hdr_len);
 
-	dev->stats.tx_bytes += skb->len;
-	dev->stats.tx_packets++;
+	u64_stats_update_begin(&tstats->syncp);
+	tstats->packets++;
+	tstats->bytes += skb->len;
+	u64_stats_update_end(&tstats->syncp);
 
 	return 0;
 }
 
+static struct rtnl_link_stats64 *
+l2tp_eth_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
+{
+	struct l2tp_eth *priv = netdev_priv(dev);
+	struct l2tp_eth_stats *s;
+	unsigned int start;
+
+	s = &priv->rstats;
+	do {
+		start = u64_stats_fetch_begin_bh(&s->syncp);
+		stats->rx_packets	= s->packets;
+		stats->rx_bytes		= s->bytes;
+	} while (u64_stats_fetch_retry_bh(&s->syncp, start));
+
+	s = &priv->tstats;
+	do {
+		start = u64_stats_fetch_begin_bh(&s->syncp);
+		stats->tx_packets	= s->packets;
+		stats->tx_bytes		= s->bytes;
+	} while (u64_stats_fetch_retry_bh(&s->syncp, start));
+
+	stats->rx_errors = dev->stats.rx_errors;
+
+	return stats;
+}
+
 static struct net_device_ops l2tp_eth_netdev_ops = {
 	.ndo_init		= l2tp_eth_dev_init,
 	.ndo_uninit		= l2tp_eth_dev_uninit,
 	.ndo_start_xmit		= l2tp_eth_dev_xmit,
+	.ndo_get_stats64	= l2tp_eth_get_stats64,
 };
 
 static void l2tp_eth_dev_setup(struct net_device *dev)
 {
 	ether_setup(dev);
-	dev->priv_flags &= ~IFF_TX_SKB_SHARING;
+	dev->features		|= NETIF_F_LLTX;
+	dev->priv_flags		&= ~IFF_TX_SKB_SHARING;
 	dev->netdev_ops		= &l2tp_eth_netdev_ops;
 	dev->destructor		= free_netdev;
 }
@@ -139,8 +178,13 @@ static void l2tp_eth_dev_recv(struct l2tp_session *session, struct sk_buff *skb,
 	nf_reset(skb);
 
 	if (dev_forward_skb(dev, skb) == NET_RX_SUCCESS) {
-		dev->stats.rx_packets++;
-		dev->stats.rx_bytes += data_len;
+		struct l2tp_eth *priv = netdev_priv(dev);
+		struct l2tp_eth_stats *rstats = &priv->rstats;
+
+		u64_stats_update_begin(&rstats->syncp);
+		rstats->packets++;
+		rstats->bytes += data_len;
+		u64_stats_update_end(&rstats->syncp);
 	} else
 		dev->stats.rx_errors++;
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-07 20:53 ` Francois Romieu
@ 2012-06-07 21:30   ` Eric Dumazet
  2012-06-07 22:37     ` Francois Romieu
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2012-06-07 21:30 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Denys Fedoryshchenko, davem, netdev, linux-kernel

On Thu, 2012-06-07 at 22:53 +0200, Francois Romieu wrote:

> Any reason why it could not be made LLTX ?
> 

>  /* via l2tp_session_priv() */
> @@ -87,25 +95,56 @@ static int l2tp_eth_dev_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
>  	struct l2tp_eth *priv = netdev_priv(dev);
>  	struct l2tp_session *session = priv->session;
> +	struct l2tp_eth_stats *tstats = &priv->tstats;
>  
>  	l2tp_xmit_skb(session, skb, session->hdr_len);
>  
> -	dev->stats.tx_bytes += skb->len;
> -	dev->stats.tx_packets++;
> +	u64_stats_update_begin(&tstats->syncp);
> +	tstats->packets++;
> +	tstats->bytes += skb->len;
> +	u64_stats_update_end(&tstats->syncp);
>  
>  	return 0;
>  }
>  

Its racy.

If LLTX is used, this means several cpus can execute this code at the
same time.

You need percpu stats, or use atomic primitives.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-07 21:30   ` Eric Dumazet
@ 2012-06-07 22:37     ` Francois Romieu
  2012-06-08  5:47       ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Francois Romieu @ 2012-06-07 22:37 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Denys Fedoryshchenko, davem, netdev, linux-kernel

Eric Dumazet <eric.dumazet@gmail.com> :
[...]
> If LLTX is used, this means several cpus can execute this code at the
> same time.
> 
> You need percpu stats, or use atomic primitives.

Would adding percpu stats not be frown upon ?

As atomic will defeat the 64 bits stats on 32 bits, I should probably stick
to plain bh disabling lock.

-- 
Ueimor

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-07 22:37     ` Francois Romieu
@ 2012-06-08  5:47       ` Eric Dumazet
  2012-06-08 15:41         ` Benjamin LaHaise
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2012-06-08  5:47 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Denys Fedoryshchenko, davem, netdev, linux-kernel

On Fri, 2012-06-08 at 00:37 +0200, Francois Romieu wrote:
> Eric Dumazet <eric.dumazet@gmail.com> :
> [...]
> > If LLTX is used, this means several cpus can execute this code at the
> > same time.
> > 
> > You need percpu stats, or use atomic primitives.
> 
> Would adding percpu stats not be frown upon ?
> 

I have no idea how many l2tp_eth devices are setup at once in typical
conf.

> As atomic will defeat the 64 bits stats on 32 bits, I should probably stick
> to plain bh disabling lock.
> 

Not sure what you mean by "atomic will defeat the 64 bits stats on 32
bits", since you also need atomic for "32 bits stats" or "64 bits stats
on 64 bit arches"

I personally would just use "unsigned long". If people want 64bit stats
for their l2tp_eth, they certainly can use a 64bit host.




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-08  5:47       ` Eric Dumazet
@ 2012-06-08 15:41         ` Benjamin LaHaise
  2012-06-08 15:56           ` Denys Fedoryshchenko
  0 siblings, 1 reply; 13+ messages in thread
From: Benjamin LaHaise @ 2012-06-08 15:41 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Francois Romieu, Denys Fedoryshchenko, davem, netdev, linux-kernel

On Fri, Jun 08, 2012 at 07:47:18AM +0200, Eric Dumazet wrote:
> I have no idea how many l2tp_eth devices are setup at once in typical
> conf.

Depends on the usage scenario.  L2TP is commonly used for terminating 
customer connections by wholesale ISPs.  In that kind of edge routing 
use-case, tens of thousands of interfaces are easily possible.

		-ben
-- 
"Thought is the essence of where you are now."

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-08 15:41         ` Benjamin LaHaise
@ 2012-06-08 15:56           ` Denys Fedoryshchenko
  2012-06-08 16:04             ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Denys Fedoryshchenko @ 2012-06-08 15:56 UTC (permalink / raw)
  To: Benjamin LaHaise
  Cc: Eric Dumazet, Francois Romieu, davem, netdev, linux-kernel

On 2012-06-08 18:41, Benjamin LaHaise wrote:
> On Fri, Jun 08, 2012 at 07:47:18AM +0200, Eric Dumazet wrote:
>> I have no idea how many l2tp_eth devices are setup at once in 
>> typical
>> conf.
>
> Depends on the usage scenario.  L2TP is commonly used for terminating
> customer connections by wholesale ISPs.  In that kind of edge routing
> use-case, tens of thousands of interfaces are easily possible.
>
> 		-ben
In my case it is few hundreds. I am not sure it is typical case, but i 
really will like if this setup
will give me good performance. Since Linux usually used on the PC's, i 
don't think they will scale more
than 2-3 Gbps. Also L2TP pseudowire (l2tp_eth) usually done not 
directly to end-users, but to LAC's, so i think
it is up to thousand.

Right now i am routing around 700Mbps over there, and noticed some 
problem with pskb_expand_head,
but that's separate question i will ask, if i will not be able to sort 
it out by myself.


---
Denys Fedoryshchenko, Network Engineer, Virtual ISP S.A.L.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-08 15:56           ` Denys Fedoryshchenko
@ 2012-06-08 16:04             ` Eric Dumazet
  2012-06-10 12:14               ` Hong zhi guo
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2012-06-08 16:04 UTC (permalink / raw)
  To: Denys Fedoryshchenko
  Cc: Benjamin LaHaise, Francois Romieu, davem, netdev, linux-kernel

On Fri, 2012-06-08 at 18:56 +0300, Denys Fedoryshchenko wrote:

> Right now i am routing around 700Mbps over there, and noticed some 
> problem with pskb_expand_head,
> but that's separate question i will ask, if i will not be able to sort 
> it out by myself.
> 

Before spending too much time on this, make sure you use a kernel
including commit 617c8c11236 ( skb: avoid unnecessary reallocations in
__skb_cow )




^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-08 16:04             ` Eric Dumazet
@ 2012-06-10 12:14               ` Hong zhi guo
  2012-06-10 13:31                 ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Hong zhi guo @ 2012-06-10 12:14 UTC (permalink / raw)
  To: Eric Dumazet, davem
  Cc: Denys Fedoryshchenko, Benjamin LaHaise, Francois Romieu, netdev,
	linux-kernel

Is there any conclusion about the problem?  Will bh_lock_sock_nested
fix the lockdep violation?

On Sat, Jun 9, 2012 at 12:04 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2012-06-08 at 18:56 +0300, Denys Fedoryshchenko wrote:
>
>> Right now i am routing around 700Mbps over there, and noticed some
>> problem with pskb_expand_head,
>> but that's separate question i will ask, if i will not be able to sort
>> it out by myself.
>>
>
> Before spending too much time on this, make sure you use a kernel
> including commit 617c8c11236 ( skb: avoid unnecessary reallocations in
> __skb_cow )
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
best regards
hong zhi guo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-10 12:14               ` Hong zhi guo
@ 2012-06-10 13:31                 ` Eric Dumazet
  2012-06-10 13:38                   ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2012-06-10 13:31 UTC (permalink / raw)
  To: Hong zhi guo
  Cc: davem, Denys Fedoryshchenko, Benjamin LaHaise, Francois Romieu,
	netdev, linux-kernel

On Sun, 2012-06-10 at 20:14 +0800, Hong zhi guo wrote:
> Is there any conclusion about the problem?  Will bh_lock_sock_nested
> fix the lockdep violation?

lockdep violation could indeed be fixed like that, but LLTX seems more
appropriate.

But there is still a bug because skb->len is used after 
l2tp_xmit_skb(session, skb, session->hdr_len);

There is also a bug in l2tp_core.c, because it assumes RX path is not
reentrant. That should be fixed eventually.

I believe we could avoid percpu stuf and new locking, adding the
following unions in net_device_stats. It seems enough to have machine
long words stats, no need to force 64bit stats on 32bit arches.

 include/linux/netdevice.h |   40 ++++++++++++++++++++++++++++--------
 net/l2tp/l2tp_eth.c       |   29 +++++++++++++-------------
 2 files changed, 47 insertions(+), 22 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d94cb14..1dee75a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -171,14 +171,38 @@ static inline bool dev_xmit_complete(int rc)
  */
 
 struct net_device_stats {
-	unsigned long	rx_packets;
-	unsigned long	tx_packets;
-	unsigned long	rx_bytes;
-	unsigned long	tx_bytes;
-	unsigned long	rx_errors;
-	unsigned long	tx_errors;
-	unsigned long	rx_dropped;
-	unsigned long	tx_dropped;
+	union {
+		unsigned long	rx_packets;
+		atomic_long_t   rx_packets_atomic;
+	};
+	union {
+		unsigned long	tx_packets;
+		atomic_long_t	tx_packets_atomic;
+	};
+	union {
+		unsigned long	rx_bytes;
+		atomic_long_t	rx_bytes_atomic;
+	};
+	union {
+		unsigned long	tx_bytes;
+		atomic_long_t	tx_bytes_atomic;
+	};
+	union {
+		unsigned long	rx_errors;
+		atomic_long_t	rx_errors_atomic;
+	};
+	union {
+		unsigned long	tx_errors;
+		atomic_long_t	tx_errors_atomic;
+	};
+	union {
+		unsigned long	rx_dropped;
+		atomic_long_t	rx_dropped_atomic;
+	};
+	union {
+		unsigned long	tx_dropped;
+		atomic_long_t	tx_dropped_atomic;
+	};
 	unsigned long	multicast;
 	unsigned long	collisions;
 	unsigned long	rx_length_errors;
diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c
index 185f12f..9e80ec4 100644
--- a/net/l2tp/l2tp_eth.c
+++ b/net/l2tp/l2tp_eth.c
@@ -83,20 +83,20 @@ static void l2tp_eth_dev_uninit(struct net_device *dev)
 	dev_put(dev);
 }
 
-static int l2tp_eth_dev_xmit(struct sk_buff *skb, struct net_device *dev)
+static netdev_tx_t l2tp_eth_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct l2tp_eth *priv = netdev_priv(dev);
 	struct l2tp_session *session = priv->session;
 
-	l2tp_xmit_skb(session, skb, session->hdr_len);
+	atomic_long_add(skb->len, &dev->stats.tx_bytes_atomic);
+	atomic_long_inc(&dev->stats.tx_packets_atomic);
 
-	dev->stats.tx_bytes += skb->len;
-	dev->stats.tx_packets++;
+	l2tp_xmit_skb(session, skb, session->hdr_len);
 
-	return 0;
+	return NETDEV_TX_OK;
 }
 
-static struct net_device_ops l2tp_eth_netdev_ops = {
+static const struct net_device_ops l2tp_eth_netdev_ops = {
 	.ndo_init		= l2tp_eth_dev_init,
 	.ndo_uninit		= l2tp_eth_dev_uninit,
 	.ndo_start_xmit		= l2tp_eth_dev_xmit,
@@ -106,8 +106,9 @@ static void l2tp_eth_dev_setup(struct net_device *dev)
 {
 	ether_setup(dev);
 	dev->priv_flags &= ~IFF_TX_SKB_SHARING;
-	dev->netdev_ops		= &l2tp_eth_netdev_ops;
-	dev->destructor		= free_netdev;
+	dev->features	|= NETIF_F_LLTX;
+	dev->netdev_ops	= &l2tp_eth_netdev_ops;
+	dev->destructor	= free_netdev;
 }
 
 static void l2tp_eth_dev_recv(struct l2tp_session *session, struct sk_buff *skb, int data_len)
@@ -139,15 +140,15 @@ static void l2tp_eth_dev_recv(struct l2tp_session *session, struct sk_buff *skb,
 	nf_reset(skb);
 
 	if (dev_forward_skb(dev, skb) == NET_RX_SUCCESS) {
-		dev->stats.rx_packets++;
-		dev->stats.rx_bytes += data_len;
-	} else
-		dev->stats.rx_errors++;
-
+		atomic_long_inc(&dev->stats.rx_packets_atomic);
+		atomic_long_add(data_len, &dev->stats.rx_bytes_atomic);
+	} else {
+		atomic_long_inc(&dev->stats.rx_errors_atomic);
+	}
 	return;
 
 error:
-	dev->stats.rx_errors++;
+	atomic_long_inc(&dev->stats.rx_errors_atomic);
 	kfree_skb(skb);
 }
 



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Deadlock, L2TP over IP are not working, 3.4.1
  2012-06-10 13:31                 ` Eric Dumazet
@ 2012-06-10 13:38                   ` Eric Dumazet
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Dumazet @ 2012-06-10 13:38 UTC (permalink / raw)
  To: Hong zhi guo
  Cc: davem, Denys Fedoryshchenko, Benjamin LaHaise, Francois Romieu,
	netdev, linux-kernel

On Sun, 2012-06-10 at 15:31 +0200, Eric Dumazet wrote:

>  include/linux/netdevice.h |   40 ++++++++++++++++++++++++++++--------
>  net/l2tp/l2tp_eth.c       |   29 +++++++++++++-------------
>  2 files changed, 47 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index d94cb14..1dee75a 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -171,14 +171,38 @@ static inline bool dev_xmit_complete(int rc)
>   */
>  
>  struct net_device_stats {
> -	unsigned long	rx_packets;
> -	unsigned long	tx_packets;
> -	unsigned long	rx_bytes;
> -	unsigned long	tx_bytes;
> -	unsigned long	rx_errors;
> -	unsigned long	tx_errors;
> -	unsigned long	rx_dropped;
> -	unsigned long	tx_dropped;
> +	union {
> +		unsigned long	rx_packets;
> +		atomic_long_t   rx_packets_atomic;
> +	};
> +	union {
> +		unsigned long	tx_packets;
> +		atomic_long_t	tx_packets_atomic;
> +	};
> +	union {
> +		unsigned long	rx_bytes;
> +		atomic_long_t	rx_bytes_atomic;
> +	};
> +	union {
> +		unsigned long	tx_bytes;
> +		atomic_long_t	tx_bytes_atomic;
> +	};
> +	union {
> +		unsigned long	rx_errors;
> +		atomic_long_t	rx_errors_atomic;
> +	};
> +	union {
> +		unsigned long	tx_errors;
> +		atomic_long_t	tx_errors_atomic;
> +	};
> +	union {
> +		unsigned long	rx_dropped;
> +		atomic_long_t	rx_dropped_atomic;
> +	};
> +	union {
> +		unsigned long	tx_dropped;
> +		atomic_long_t	tx_dropped_atomic;
> +	};

Other choice would be to have a new structure

struct net_device_atomic_stats {
	atomic_long_t rx_packets;
	atomic_long_t tx_packets;
	atomic_long_t rx_bytes;
	atomic_long_t tx_bytes;
	atomic_long_t rx_errors;
	atomic_long_t tx_errors;
	atomic_long_t rx_dropped;
	atomic_long_t tx_dropped;
...
};

and to union it in :

struct net_device {
	...
	union {
		struct net_device_stats	       stats;
		struct net_device_atomic_stats atom_stats;
	};
	...
};



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] net: l2tp_eth: use LLTX to avoid LOCKDEP splats
  2012-06-06  9:54 Deadlock, L2TP over IP are not working, 3.4.1 Denys Fedoryshchenko
  2012-06-07 20:53 ` Francois Romieu
@ 2012-06-25 15:35 ` Eric Dumazet
  2012-06-26 23:43   ` David Miller
  1 sibling, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2012-06-25 15:35 UTC (permalink / raw)
  To: Denys Fedoryshchenko, David Miller
  Cc: netdev, Hong zhi guo, James Chapman, Francois Romieu

From: Eric Dumazet <edumazet@google.com>

Denys Fedoryshchenko reported a LOCKDEP issue with l2tp code.


[ 8683.927442] ======================================================
[ 8683.927555] [ INFO: possible circular locking dependency detected ]
[ 8683.927672] 3.4.1-build-0061 #14 Not tainted
[ 8683.927782] -------------------------------------------------------
[ 8683.927895] swapper/0/0 is trying to acquire lock:
[ 8683.928007]  (slock-AF_INET){+.-...}, at: [<e0fc73ec>] 
l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]
[ 8683.928121] but task is already holding lock:
[ 8683.928121]  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>] 
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] which lock already depends on the new lock.
[ 8683.928121]
[ 8683.928121]
[ 8683.928121] the existing dependency chain (in reverse order) is:
[ 8683.928121]
[ 8683.928121] -> #1 (_xmit_ETHER#2){+.-...}:
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<c0304e0c>] ip_send_reply+0xf2/0x1ce
[ 8683.928121]        [<c0317dbc>] tcp_v4_send_reset+0x153/0x16f
[ 8683.928121]        [<c0317f4a>] tcp_v4_do_rcv+0x172/0x194
[ 8683.928121]        [<c031929b>] tcp_v4_rcv+0x387/0x5a0
[ 8683.928121]        [<c03001d0>] ip_local_deliver_finish+0x13a/0x1e9
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c030075b>] ip_local_deliver+0x41/0x45
[ 8683.928121]        [<c03005dd>] ip_rcv_finish+0x31a/0x33c
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c0300960>] ip_rcv+0x201/0x23d
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02deae8>] netif_receive_skb+0x4e/0x7d
[ 8683.928121]        [<e08d5ef3>] rtl8139_poll+0x243/0x33d [8139too]
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] -> #0 (slock-AF_INET){+.-...}:
[ 8683.928121]        [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e 
[l2tp_core]
[ 8683.928121]        [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f 
[l2tp_eth]
[ 8683.928121]        [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]        [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]        [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]        [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]        [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] other info that might help us debug this:
[ 8683.928121]
[ 8683.928121]  Possible unsafe locking scenario:
[ 8683.928121]
[ 8683.928121]        CPU0                    CPU1
[ 8683.928121]        ----                    ----
[ 8683.928121]   lock(_xmit_ETHER#2);
[ 8683.928121]                                lock(slock-AF_INET);
[ 8683.928121]                                lock(_xmit_ETHER#2);
[ 8683.928121]   lock(slock-AF_INET);
[ 8683.928121]
[ 8683.928121]  *** DEADLOCK ***
[ 8683.928121]
[ 8683.928121] 3 locks held by swapper/0/0:
[ 8683.928121]  #0:  (rcu_read_lock){.+.+..}, at: [<c02dbc10>] 
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #1:  (rcu_read_lock_bh){.+....}, at: [<c02dbc10>] 
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #2:  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>] 
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] stack backtrace:
[ 8683.928121] Pid: 0, comm: swapper/0 Not tainted 3.4.1-build-0061 #14
[ 8683.928121] Call Trace:
[ 8683.928121]  [<c034bdd2>] ? printk+0x18/0x1a
[ 8683.928121]  [<c0158904>] print_circular_bug+0x1ac/0x1b6
[ 8683.928121]  [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]  [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f [l2tp_eth]
[ 8683.928121]  [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]  [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]  [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]  [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]  [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]  [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]  [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]  [<c012b23a>] ? local_bh_enable+0xd/0xd
[ 8683.928121]  <IRQ>  [<c012b4d0>] ? irq_exit+0x41/0x91
[ 8683.928121]  [<c0103c6f>] ? do_IRQ+0x79/0x8d
[ 8683.928121]  [<c0157ea1>] ? trace_hardirqs_off_caller+0x2e/0x86
[ 8683.928121]  [<c034ef6e>] ? common_interrupt+0x2e/0x34
[ 8683.928121]  [<c0108a33>] ? default_idle+0x23/0x38
[ 8683.928121]  [<c01091a8>] ? cpu_idle+0x55/0x6f
[ 8683.928121]  [<c033df25>] ? rest_init+0xa1/0xa7
[ 8683.928121]  [<c033de84>] ? __read_lock_failed+0x14/0x14
[ 8683.928121]  [<c0498745>] ? start_kernel+0x303/0x30a
[ 8683.928121]  [<c0498209>] ? repair_env_string+0x51/0x51
[ 8683.928121]  [<c04980a8>] ? i386_start_kernel+0xa8/0xaf

It appears that like most virtual devices, l2tp should be converted to
LLTX mode.

This patch takes care of statistics using atomic_long in both RX and TX
paths, and fix a bug in l2tp_eth_dev_recv(), which was caching skb->data
before a pskb_may_pull() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Cc: James Chapman <jchapman@katalix.com>
Cc: Hong zhi guo <honkiko@gmail.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
---
Should be applied after "net: l2tp_eth: fix l2tp_eth_dev_xmit race"

 net/l2tp/l2tp_eth.c |   43 +++++++++++++++++++++++++++++++-----------
 1 file changed, 32 insertions(+), 11 deletions(-)

diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c
index c3738f4..47b259f 100644
--- a/net/l2tp/l2tp_eth.c
+++ b/net/l2tp/l2tp_eth.c
@@ -42,6 +42,11 @@ struct l2tp_eth {
 	struct sock		*tunnel_sock;
 	struct l2tp_session	*session;
 	struct list_head	list;
+	atomic_long_t		tx_bytes;
+	atomic_long_t		tx_packets;
+	atomic_long_t		rx_bytes;
+	atomic_long_t		rx_packets;
+	atomic_long_t		rx_errors;
 };
 
 /* via l2tp_session_priv() */
@@ -88,24 +93,40 @@ static int l2tp_eth_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct l2tp_eth *priv = netdev_priv(dev);
 	struct l2tp_session *session = priv->session;
 
-	dev->stats.tx_bytes += skb->len;
-	dev->stats.tx_packets++;
+	atomic_long_add(skb->len, &priv->tx_bytes);
+	atomic_long_inc(&priv->tx_packets);
 
 	l2tp_xmit_skb(session, skb, session->hdr_len);
 
 	return NETDEV_TX_OK;
 }
 
+static struct rtnl_link_stats64 *l2tp_eth_get_stats64(struct net_device *dev,
+						      struct rtnl_link_stats64 *stats)
+{
+	struct l2tp_eth *priv = netdev_priv(dev);
+
+	stats->tx_bytes   = atomic_long_read(&priv->tx_bytes);
+	stats->tx_packets = atomic_long_read(&priv->tx_packets);
+	stats->rx_bytes   = atomic_long_read(&priv->rx_bytes);
+	stats->rx_packets = atomic_long_read(&priv->rx_packets);
+	stats->rx_errors  = atomic_long_read(&priv->rx_errors);
+	return stats;
+}
+
+
 static struct net_device_ops l2tp_eth_netdev_ops = {
 	.ndo_init		= l2tp_eth_dev_init,
 	.ndo_uninit		= l2tp_eth_dev_uninit,
 	.ndo_start_xmit		= l2tp_eth_dev_xmit,
+	.ndo_get_stats64	= l2tp_eth_get_stats64,
 };
 
 static void l2tp_eth_dev_setup(struct net_device *dev)
 {
 	ether_setup(dev);
-	dev->priv_flags &= ~IFF_TX_SKB_SHARING;
+	dev->priv_flags		&= ~IFF_TX_SKB_SHARING;
+	dev->features		|= NETIF_F_LLTX;
 	dev->netdev_ops		= &l2tp_eth_netdev_ops;
 	dev->destructor		= free_netdev;
 }
@@ -114,17 +135,17 @@ static void l2tp_eth_dev_recv(struct l2tp_session *session, struct sk_buff *skb,
 {
 	struct l2tp_eth_sess *spriv = l2tp_session_priv(session);
 	struct net_device *dev = spriv->dev;
+	struct l2tp_eth *priv = netdev_priv(dev);
 
 	if (session->debug & L2TP_MSG_DATA) {
 		unsigned int length;
-		u8 *ptr = skb->data;
 
 		length = min(32u, skb->len);
 		if (!pskb_may_pull(skb, length))
 			goto error;
 
 		pr_debug("%s: eth recv\n", session->name);
-		print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, ptr, length);
+		print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, skb->data, length);
 	}
 
 	if (!pskb_may_pull(skb, sizeof(ETH_HLEN)))
@@ -139,15 +160,15 @@ static void l2tp_eth_dev_recv(struct l2tp_session *session, struct sk_buff *skb,
 	nf_reset(skb);
 
 	if (dev_forward_skb(dev, skb) == NET_RX_SUCCESS) {
-		dev->stats.rx_packets++;
-		dev->stats.rx_bytes += data_len;
-	} else
-		dev->stats.rx_errors++;
-
+		atomic_long_inc(&priv->rx_packets);
+		atomic_long_add(data_len, &priv->rx_bytes);
+	} else {
+		atomic_long_inc(&priv->rx_errors);
+	}
 	return;
 
 error:
-	dev->stats.rx_errors++;
+	atomic_long_inc(&priv->rx_errors);
 	kfree_skb(skb);
 }
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] net: l2tp_eth: use LLTX to avoid LOCKDEP splats
  2012-06-25 15:35 ` [PATCH] net: l2tp_eth: use LLTX to avoid LOCKDEP splats Eric Dumazet
@ 2012-06-26 23:43   ` David Miller
  0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2012-06-26 23:43 UTC (permalink / raw)
  To: eric.dumazet; +Cc: denys, netdev, honkiko, jchapman, romieu

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 25 Jun 2012 17:35:45 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> Denys Fedoryshchenko reported a LOCKDEP issue with l2tp code.
 ...
> It appears that like most virtual devices, l2tp should be converted to
> LLTX mode.
> 
> This patch takes care of statistics using atomic_long in both RX and TX
> paths, and fix a bug in l2tp_eth_dev_recv(), which was caching skb->data
> before a pskb_may_pull() call.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
> Cc: James Chapman <jchapman@katalix.com>
> Cc: Hong zhi guo <honkiko@gmail.com>
> Cc: Francois Romieu <romieu@fr.zoreil.com>
> ---
> Should be applied after "net: l2tp_eth: fix l2tp_eth_dev_xmit race"

Applied, thanks Eric.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2012-06-26 23:43 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-06  9:54 Deadlock, L2TP over IP are not working, 3.4.1 Denys Fedoryshchenko
2012-06-07 20:53 ` Francois Romieu
2012-06-07 21:30   ` Eric Dumazet
2012-06-07 22:37     ` Francois Romieu
2012-06-08  5:47       ` Eric Dumazet
2012-06-08 15:41         ` Benjamin LaHaise
2012-06-08 15:56           ` Denys Fedoryshchenko
2012-06-08 16:04             ` Eric Dumazet
2012-06-10 12:14               ` Hong zhi guo
2012-06-10 13:31                 ` Eric Dumazet
2012-06-10 13:38                   ` Eric Dumazet
2012-06-25 15:35 ` [PATCH] net: l2tp_eth: use LLTX to avoid LOCKDEP splats Eric Dumazet
2012-06-26 23:43   ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.