All of lore.kernel.org
 help / color / mirror / Atom feed
* softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
@ 2011-07-12  9:23 ` Thomas De Schampheleire
  0 siblings, 0 replies; 20+ messages in thread
From: Thomas De Schampheleire @ 2011-07-12  9:23 UTC (permalink / raw)
  To: Eric Dumazet, linuxppc-dev; +Cc: Ronny Meeus, David Miller, netdev

Hi,

I'm adding the linuxppc-dev mailing list since this may be pointing to
an irq/softirq problem in the powerpc architecture-specific code...

On Tue, Jul 12, 2011 at 8:58 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 12 juillet 2011 à 08:38 +0200, Ronny Meeus a écrit :
>> On Tue, Jul 12, 2011 at 5:27 AM, David Miller <davem@davemloft.net> wrote:
>> > From: Ronny Meeus <ronny.meeus@gmail.com>
>> > Date: Sat, 11 Jun 2011 07:04:09 +0200
>> >
>> >> I was running a test: 1 application was sending raw Ethernet packets
>> >> on a physical looped interface while a second application was
>> >> receiving packets, so the latter application receives each packet 2
>> >> times (once while sending from the context of the first application
>> >> and a second time while receiving from the hardware).  After some
>> >> time, the test blocks due to a spinlock reentrance issue in
>> >> af_packet. Both the sending application and the softIRQ receiving
>> >> packets enter the spinlock code. After applying the patch below, the
>> >> issue is resolved.
>> >>
>> >> Signed-off-by: Ronny Meeus <ronny.meeus@gmail.com>
>> >
>> > The packet receive hooks should always be called with software
>> > interrupts disabled, it is a bug if this is not happening.  Your
>> > patch should not be necessary at all.
>> >
>> >
>>
>> Can you be a bit more specific on where the software interrupts should
>> be disabled?
>>
>> Below you find the information I get after switching on the spin_lock
>> issue detection in the kernel.
>> In this run also I-PIPE was active but this issue is also seen with
>> I-PIPE disabled.
>>
>
> This seems a bug, but in softirq handling in your arch
>
>> [   96.450034] BUG: spinlock recursion on CPU#0, send_eth_socket/1907
>> [   96.540451]  lock: eaeb8c9c, .magic: dead4ead, .owner:
>> send_eth_socket/1907, .owner_cpu: 0
>> [   96.656060] Call Trace: (unreliable)0161000 success=90] [c000789c]
>> show_stack+0x78/0x18c160001
>> [   96.827988] [efff3dd0] [c01e2a50] spin_bug+0xa8/0xc0=0000162000 succ
>> [   96.920712] [efff3df0] [c01e2b9c] do_raw_spin_lock+0x70/0x1c4ount=0000163000
>> [   97.022823] [efff3e20] [c0388d88] _raw_spin_lock+0x10/0x2001
>> [   97.121800] [efff3e30] [c03391a0] tpacket_rcv+0x274/0x61c164001
>> [   97.219733] [efff3e80] [c02d5970] __netif_receive_skb+0x1a8/0x36c0000165001
>> [   97.326001] [efff3eb0] [c02d6234] netif_receive_skb+0x98/0xacess=0000166001
>> [   97.427060] [efff3ee0] [c029e6c4] ingress_rx_default_dqrr+0x42c/0x4b8
>> [   97.504194] [efff3f10] [c02ba524] qman_poll_dqrr+0x1e0/0x284
>> [   97.571948] [efff3f50] [c029ff3c] dpaa_eth_poll+0x34/0xd0
>> [   97.636579] [efff3f70] [c02d6514] net_rx_action+0xc0/0x1e8
>> [   97.702256] [efff3fa0] [c0034d28] __do_softirq+0x138/0x210
>> [   97.767928] [efff3ff0] [c0010128] call_do_softirq+0x14/0x24
>> [   97.834641] [ec479a90] [c0004a00] do_softirq+0xb4/0xec
>
> We got an IRQ, and we start do_softirq() from irq_exit()
>
>> [   97.896148] [ec479ab0] [c0034a44] irq_exit+0x60/0xb8
>> [   97.955573] [ec479ac0] [c000a490] __ipipe_do_IRQ+0x88/0xc0
>> [   98.021253] [ec479ae0] [c0070214] __ipipe_sync_stage+0x1f0/0x27c
>> [   98.093176] [ec479b20] [c0009f28] __ipipe_handle_irq+0x1b8/0x1e8
>> [   98.165106] [ec479b50] [c000a210] __ipipe_grab_irq+0x18c/0x1bc
>> [   98.234947] [ec479b80] [c0011520] __ipipe_ret_from_except+0x0/0xc
>> [   98.307915] --- Exception: 501 at __packet_get_status+0x48/0x70
>> [   98.307920]     LR = __packet_get_status+0x44/0x70
>> [   98.436082] [ec479c40] [00000578] 0x578 (unreliable)
>> [   98.495524] [ec479c50] [c0338360] packet_lookup_frame+0x48/0x70
>> [   98.566405] [ec479c60] [c03391b4] tpacket_rcv+0x288/0x61c
>
> Here we were in BH disabled section, since dev_queue_xmit() contains :
>
>        rcu_read_lock_bh()
>
>
>> [   98.631037] [ec479cb0] [c02d762c] dev_hard_start_xmit+0x164/0x588
>> [   98.704003] [ec479cf0] [c0338d6c] packet_sendmsg+0x8c4/0x988
>> [   98.771758] [ec479d70] [c02c3838] sock_sendmsg+0x90/0xb4
>> [   98.835348] [ec479e40] [c02c4420] sys_sendto+0xdc/0x120
>> [   98.897891] [ec479f10] [c02c57d0] sys_socketcall+0x148/0x210
>> [   98.965648] [ec479f40] [c001084c] ret_from_syscall+0x0/0x3c
>> [   99.032361] --- Exception: c01 at 0x48051f00
>> [   99.032365]     LR = 0x4808e030
>> [  100.563009] BUG: spinlock lockup on CPU#0, send_eth_socket/1907, eaeb8c9c
>> [  100.644283] Call Trace:
>> [  100.673480] [efff3db0] [c000789c] show_stack+0x78/0x18c (unreliable)
>> [  100.749589] [efff3df0] [c01e2c94] do_raw_spin_lock+0x168/0x1c4
>> [  100.819430] [efff3e20] [c0388d88] _raw_spin_lock+0x10/0x20
>> [  100.885102] [efff3e30] [c03391a0] tpacket_rcv+0x274/0x61c
>> [  100.949733] [efff3e80] [c02d5970] __netif_receive_skb+0x1a8/0x36c
>> [  101.022699] [efff3eb0] [c02d6234] netif_receive_skb+0x98/0xac
>> [  101.091497] [efff3ee0] [c029e6c4] ingress_rx_default_dqrr+0x42c/0x4b8
>> [  101.168628] [efff3f10] [c02ba524] qman_poll_dqrr+0x1e0/0x284
>> [  101.236385] [efff3f50] [c029ff3c] dpaa_eth_poll+0x34/0xd0
>> [  101.301016] [efff3f70] [c02d6514] net_rx_action+0xc0/0x1e8
>> [  101.366692] [efff3fa0] [c0034d28] __do_softirq+0x138/0x210
>> [  101.432364] [efff3ff0] [c0010128] call_do_softirq+0x14/0x24
>> [  101.499078] [ec479a90] [c0004a00] do_softirq+0xb4/0xec
>> [  101.560585] [ec479ab0] [c0034a44] irq_exit+0x60/0xb8
>> [  101.620003] [ec479ac0] [c000a490] __ipipe_do_IRQ+0x88/0xc0
>> [  101.685678] [ec479ae0] [c0070214] __ipipe_sync_stage+0x1f0/0x27c
>> [  101.757601] [ec479b20] [c0009f28] __ipipe_handle_irq+0x1b8/0x1e8
>> [  101.829523] [ec479b50] [c000a210] __ipipe_grab_irq+0x18c/0x1bc
>> [  101.899363] [ec479b80] [c0011520] __ipipe_ret_from_except+0x0/0xc
>> [  101.972333] --- Exception: 501 at __packet_get_status+0x48/0x70
>> [  101.972338]     LR = __packet_get_status+0x44/0x70
>> [  102.100490] [ec479c40] [00000578] 0x578 (unreliable)
>> [  102.159929] [ec479c50] [c0338360] packet_lookup_frame+0x48/0x70
>> [  102.230810] [ec479c60] [c03391b4] tpacket_rcv+0x288/0x61c
>> [  102.295442] [ec479cb0] [c02d762c] dev_hard_start_xmit+0x164/0x588
>> [  102.368407] [ec479cf0] [c0338d6c] packet_sendmsg+0x8c4/0x988
>> [  102.436162] [ec479d70] [c02c3838] sock_sendmsg+0x90/0xb4
>> [  102.499747] [ec479e40] [c02c4420] sys_sendto+0xdc/0x120
>> [  102.562296] [ec479f10] [c02c57d0] sys_socketcall+0x148/0x210
>> [  102.630052] [ec479f40] [c001084c] ret_from_syscall+0x0/0x3c
>> [  102.696773] --- Exception: c01 at 0x48051f00
>> [  102.696777]     LR = 0x4808e030
>>


Note that the reason we are seeing this problem, may be because the
kernel we are using contains some patches from Freescale.
Specifically, in dev_queue_xmit(), support is added for hardware queue
handling, just before entering the rcu_read_lock_bh():

        if (dev->features & NETIF_F_HW_QDISC) {
                txq = dev_pick_tx(dev, skb);
                return dev_hard_start_xmit(skb, dev, txq);
        }

        /* Disable soft irqs for various locks below. Also
         * stops preemption for RCU.
         */
        rcu_read_lock_bh();

We just tried moving the escaping to dev_hard_start_xmit() after
taking the lock, but this gives a large number of other problems, e.g.

[   78.662428] BUG: sleeping function called from invalid context at
mm/slab.c:3101
[   78.751004] in_atomic(): 1, irqs_disabled(): 0, pid: 1908, name:
send_eth_socket
[   78.839582] Call Trace:
[   78.868784] [ec537b70] [c000789c] show_stack+0x78/0x18c (unreliable)
[   78.944905] [ec537bb0] [c0022900] __might_sleep+0x100/0x118
[   79.011636] [ec537bc0] [c00facc4] kmem_cache_alloc+0x48/0x118
[   79.080446] [ec537be0] [c02cd0e8] __alloc_skb+0x50/0x130
[   79.144047] [ec537c00] [c02cdf5c] skb_copy+0x44/0xc8
[   79.203478] [ec537c20] [c029f904] dpa_tx+0x154/0x758
[   79.262907] [ec537c80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
[   79.335878] [ec537cc0] [c02d7aac] dev_queue_xmit+0x5c/0x3a4
[   79.402602] [ec537cf0] [c0338d4c] packet_sendmsg+0x8c4/0x988
[   79.470363] [ec537d70] [c02c3838] sock_sendmsg+0x90/0xb4
[   79.533960] [ec537e40] [c02c4420] sys_sendto+0xdc/0x120
[   79.596514] [ec537f10] [c02c57d0] sys_socketcall+0x148/0x210
[   79.664287] [ec537f40] [c001084c] ret_from_syscall+0x0/0x3c
[   79.731015] --- Exception: c01 at 0x48051f00
[   79.731019]     LR = 0x4808e030


Note that this may just be the cause for us seeing this problem. If
indeed the main problem is irq_exit() invoking softirqs in a locked
context, then this patch adding hardware queue support is not really
relevant.

Any suggestions from the developers at linuxppc-dev are very welcome...

Thomas

^ permalink raw reply	[flat|nested] 20+ messages in thread

* softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
@ 2011-07-12  9:23 ` Thomas De Schampheleire
  0 siblings, 0 replies; 20+ messages in thread
From: Thomas De Schampheleire @ 2011-07-12  9:23 UTC (permalink / raw)
  To: Eric Dumazet, linuxppc-dev; +Cc: netdev, Ronny Meeus, David Miller

Hi,

I'm adding the linuxppc-dev mailing list since this may be pointing to
an irq/softirq problem in the powerpc architecture-specific code...

On Tue, Jul 12, 2011 at 8:58 AM, Eric Dumazet <eric.dumazet@gmail.com> wrot=
e:
> Le mardi 12 juillet 2011 =E0 08:38 +0200, Ronny Meeus a =E9crit :
>> On Tue, Jul 12, 2011 at 5:27 AM, David Miller <davem@davemloft.net> wrot=
e:
>> > From: Ronny Meeus <ronny.meeus@gmail.com>
>> > Date: Sat, 11 Jun 2011 07:04:09 +0200
>> >
>> >> I was running a test: 1 application was sending raw Ethernet packets
>> >> on a physical looped interface while a second application was
>> >> receiving packets, so the latter application receives each packet 2
>> >> times (once while sending from the context of the first application
>> >> and a second time while receiving from the hardware). =A0After some
>> >> time, the test blocks due to a spinlock reentrance issue in
>> >> af_packet. Both the sending application and the softIRQ receiving
>> >> packets enter the spinlock code. After applying the patch below, the
>> >> issue is resolved.
>> >>
>> >> Signed-off-by: Ronny Meeus <ronny.meeus@gmail.com>
>> >
>> > The packet receive hooks should always be called with software
>> > interrupts disabled, it is a bug if this is not happening. =A0Your
>> > patch should not be necessary at all.
>> >
>> >
>>
>> Can you be a bit more specific on where the software interrupts should
>> be disabled?
>>
>> Below you find the information I get after switching on the spin_lock
>> issue detection in the kernel.
>> In this run also I-PIPE was active but this issue is also seen with
>> I-PIPE disabled.
>>
>
> This seems a bug, but in softirq handling in your arch
>
>> [ =A0 96.450034] BUG: spinlock recursion on CPU#0, send_eth_socket/1907
>> [ =A0 96.540451] =A0lock: eaeb8c9c, .magic: dead4ead, .owner:
>> send_eth_socket/1907, .owner_cpu: 0
>> [ =A0 96.656060] Call Trace: (unreliable)0161000 success=3D90] [c000789c=
]
>> show_stack+0x78/0x18c160001
>> [ =A0 96.827988] [efff3dd0] [c01e2a50] spin_bug+0xa8/0xc0=3D0000162000 s=
ucc
>> [ =A0 96.920712] [efff3df0] [c01e2b9c] do_raw_spin_lock+0x70/0x1c4ount=
=3D0000163000
>> [ =A0 97.022823] [efff3e20] [c0388d88] _raw_spin_lock+0x10/0x2001
>> [ =A0 97.121800] [efff3e30] [c03391a0] tpacket_rcv+0x274/0x61c164001
>> [ =A0 97.219733] [efff3e80] [c02d5970] __netif_receive_skb+0x1a8/0x36c00=
00165001
>> [ =A0 97.326001] [efff3eb0] [c02d6234] netif_receive_skb+0x98/0xacess=3D=
0000166001
>> [ =A0 97.427060] [efff3ee0] [c029e6c4] ingress_rx_default_dqrr+0x42c/0x4=
b8
>> [ =A0 97.504194] [efff3f10] [c02ba524] qman_poll_dqrr+0x1e0/0x284
>> [ =A0 97.571948] [efff3f50] [c029ff3c] dpaa_eth_poll+0x34/0xd0
>> [ =A0 97.636579] [efff3f70] [c02d6514] net_rx_action+0xc0/0x1e8
>> [ =A0 97.702256] [efff3fa0] [c0034d28] __do_softirq+0x138/0x210
>> [ =A0 97.767928] [efff3ff0] [c0010128] call_do_softirq+0x14/0x24
>> [ =A0 97.834641] [ec479a90] [c0004a00] do_softirq+0xb4/0xec
>
> We got an IRQ, and we start do_softirq() from irq_exit()
>
>> [ =A0 97.896148] [ec479ab0] [c0034a44] irq_exit+0x60/0xb8
>> [ =A0 97.955573] [ec479ac0] [c000a490] __ipipe_do_IRQ+0x88/0xc0
>> [ =A0 98.021253] [ec479ae0] [c0070214] __ipipe_sync_stage+0x1f0/0x27c
>> [ =A0 98.093176] [ec479b20] [c0009f28] __ipipe_handle_irq+0x1b8/0x1e8
>> [ =A0 98.165106] [ec479b50] [c000a210] __ipipe_grab_irq+0x18c/0x1bc
>> [ =A0 98.234947] [ec479b80] [c0011520] __ipipe_ret_from_except+0x0/0xc
>> [ =A0 98.307915] --- Exception: 501 at __packet_get_status+0x48/0x70
>> [ =A0 98.307920] =A0 =A0 LR =3D __packet_get_status+0x44/0x70
>> [ =A0 98.436082] [ec479c40] [00000578] 0x578 (unreliable)
>> [ =A0 98.495524] [ec479c50] [c0338360] packet_lookup_frame+0x48/0x70
>> [ =A0 98.566405] [ec479c60] [c03391b4] tpacket_rcv+0x288/0x61c
>
> Here we were in BH disabled section, since dev_queue_xmit() contains :
>
> =A0 =A0 =A0 =A0rcu_read_lock_bh()
>
>
>> [ =A0 98.631037] [ec479cb0] [c02d762c] dev_hard_start_xmit+0x164/0x588
>> [ =A0 98.704003] [ec479cf0] [c0338d6c] packet_sendmsg+0x8c4/0x988
>> [ =A0 98.771758] [ec479d70] [c02c3838] sock_sendmsg+0x90/0xb4
>> [ =A0 98.835348] [ec479e40] [c02c4420] sys_sendto+0xdc/0x120
>> [ =A0 98.897891] [ec479f10] [c02c57d0] sys_socketcall+0x148/0x210
>> [ =A0 98.965648] [ec479f40] [c001084c] ret_from_syscall+0x0/0x3c
>> [ =A0 99.032361] --- Exception: c01 at 0x48051f00
>> [ =A0 99.032365] =A0 =A0 LR =3D 0x4808e030
>> [ =A0100.563009] BUG: spinlock lockup on CPU#0, send_eth_socket/1907, ea=
eb8c9c
>> [ =A0100.644283] Call Trace:
>> [ =A0100.673480] [efff3db0] [c000789c] show_stack+0x78/0x18c (unreliable=
)
>> [ =A0100.749589] [efff3df0] [c01e2c94] do_raw_spin_lock+0x168/0x1c4
>> [ =A0100.819430] [efff3e20] [c0388d88] _raw_spin_lock+0x10/0x20
>> [ =A0100.885102] [efff3e30] [c03391a0] tpacket_rcv+0x274/0x61c
>> [ =A0100.949733] [efff3e80] [c02d5970] __netif_receive_skb+0x1a8/0x36c
>> [ =A0101.022699] [efff3eb0] [c02d6234] netif_receive_skb+0x98/0xac
>> [ =A0101.091497] [efff3ee0] [c029e6c4] ingress_rx_default_dqrr+0x42c/0x4=
b8
>> [ =A0101.168628] [efff3f10] [c02ba524] qman_poll_dqrr+0x1e0/0x284
>> [ =A0101.236385] [efff3f50] [c029ff3c] dpaa_eth_poll+0x34/0xd0
>> [ =A0101.301016] [efff3f70] [c02d6514] net_rx_action+0xc0/0x1e8
>> [ =A0101.366692] [efff3fa0] [c0034d28] __do_softirq+0x138/0x210
>> [ =A0101.432364] [efff3ff0] [c0010128] call_do_softirq+0x14/0x24
>> [ =A0101.499078] [ec479a90] [c0004a00] do_softirq+0xb4/0xec
>> [ =A0101.560585] [ec479ab0] [c0034a44] irq_exit+0x60/0xb8
>> [ =A0101.620003] [ec479ac0] [c000a490] __ipipe_do_IRQ+0x88/0xc0
>> [ =A0101.685678] [ec479ae0] [c0070214] __ipipe_sync_stage+0x1f0/0x27c
>> [ =A0101.757601] [ec479b20] [c0009f28] __ipipe_handle_irq+0x1b8/0x1e8
>> [ =A0101.829523] [ec479b50] [c000a210] __ipipe_grab_irq+0x18c/0x1bc
>> [ =A0101.899363] [ec479b80] [c0011520] __ipipe_ret_from_except+0x0/0xc
>> [ =A0101.972333] --- Exception: 501 at __packet_get_status+0x48/0x70
>> [ =A0101.972338] =A0 =A0 LR =3D __packet_get_status+0x44/0x70
>> [ =A0102.100490] [ec479c40] [00000578] 0x578 (unreliable)
>> [ =A0102.159929] [ec479c50] [c0338360] packet_lookup_frame+0x48/0x70
>> [ =A0102.230810] [ec479c60] [c03391b4] tpacket_rcv+0x288/0x61c
>> [ =A0102.295442] [ec479cb0] [c02d762c] dev_hard_start_xmit+0x164/0x588
>> [ =A0102.368407] [ec479cf0] [c0338d6c] packet_sendmsg+0x8c4/0x988
>> [ =A0102.436162] [ec479d70] [c02c3838] sock_sendmsg+0x90/0xb4
>> [ =A0102.499747] [ec479e40] [c02c4420] sys_sendto+0xdc/0x120
>> [ =A0102.562296] [ec479f10] [c02c57d0] sys_socketcall+0x148/0x210
>> [ =A0102.630052] [ec479f40] [c001084c] ret_from_syscall+0x0/0x3c
>> [ =A0102.696773] --- Exception: c01 at 0x48051f00
>> [ =A0102.696777] =A0 =A0 LR =3D 0x4808e030
>>


Note that the reason we are seeing this problem, may be because the
kernel we are using contains some patches from Freescale.
Specifically, in dev_queue_xmit(), support is added for hardware queue
handling, just before entering the rcu_read_lock_bh():

        if (dev->features & NETIF_F_HW_QDISC) {
                txq =3D dev_pick_tx(dev, skb);
                return dev_hard_start_xmit(skb, dev, txq);
        }

        /* Disable soft irqs for various locks below. Also
         * stops preemption for RCU.
         */
        rcu_read_lock_bh();

We just tried moving the escaping to dev_hard_start_xmit() after
taking the lock, but this gives a large number of other problems, e.g.

[   78.662428] BUG: sleeping function called from invalid context at
mm/slab.c:3101
[   78.751004] in_atomic(): 1, irqs_disabled(): 0, pid: 1908, name:
send_eth_socket
[   78.839582] Call Trace:
[   78.868784] [ec537b70] [c000789c] show_stack+0x78/0x18c (unreliable)
[   78.944905] [ec537bb0] [c0022900] __might_sleep+0x100/0x118
[   79.011636] [ec537bc0] [c00facc4] kmem_cache_alloc+0x48/0x118
[   79.080446] [ec537be0] [c02cd0e8] __alloc_skb+0x50/0x130
[   79.144047] [ec537c00] [c02cdf5c] skb_copy+0x44/0xc8
[   79.203478] [ec537c20] [c029f904] dpa_tx+0x154/0x758
[   79.262907] [ec537c80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
[   79.335878] [ec537cc0] [c02d7aac] dev_queue_xmit+0x5c/0x3a4
[   79.402602] [ec537cf0] [c0338d4c] packet_sendmsg+0x8c4/0x988
[   79.470363] [ec537d70] [c02c3838] sock_sendmsg+0x90/0xb4
[   79.533960] [ec537e40] [c02c4420] sys_sendto+0xdc/0x120
[   79.596514] [ec537f10] [c02c57d0] sys_socketcall+0x148/0x210
[   79.664287] [ec537f40] [c001084c] ret_from_syscall+0x0/0x3c
[   79.731015] --- Exception: c01 at 0x48051f00
[   79.731019]     LR =3D 0x4808e030


Note that this may just be the cause for us seeing this problem. If
indeed the main problem is irq_exit() invoking softirqs in a locked
context, then this patch adding hardware queue support is not really
relevant.

Any suggestions from the developers at linuxppc-dev are very welcome...

Thomas

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
  2011-07-12  9:23 ` Thomas De Schampheleire
@ 2011-07-12 10:01   ` David Miller
  -1 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-07-12 10:01 UTC (permalink / raw)
  To: patrickdepinguin+linuxppc; +Cc: eric.dumazet, linuxppc-dev, ronny.meeus, netdev

From: Thomas De Schampheleire <patrickdepinguin+linuxppc@gmail.com>
Date: Tue, 12 Jul 2011 11:23:28 +0200

> Note that the reason we are seeing this problem, may be because the
> kernel we are using contains some patches from Freescale.
> Specifically, in dev_queue_xmit(), support is added for hardware queue
> handling, just before entering the rcu_read_lock_bh():
> 
>         if (dev->features & NETIF_F_HW_QDISC) {
>                 txq = dev_pick_tx(dev, skb);
>                 return dev_hard_start_xmit(skb, dev, txq);
>         }
> 
>         /* Disable soft irqs for various locks below. Also
>          * stops preemption for RCU.
>          */
>         rcu_read_lock_bh();
> 
> We just tried moving the escaping to dev_hard_start_xmit() after
> taking the lock, but this gives a large number of other problems, e.g.

This is definitely why you are seeing this behavior.

You cannot invoke dev_hard_start_xmit() without softirqs
being disabled.  It breaks everything.

This is what happens when you integrate networking patches
which were not reviewed and vetted on netdev.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
@ 2011-07-12 10:01   ` David Miller
  0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-07-12 10:01 UTC (permalink / raw)
  To: patrickdepinguin+linuxppc; +Cc: linuxppc-dev, ronny.meeus, eric.dumazet, netdev

From: Thomas De Schampheleire <patrickdepinguin+linuxppc@gmail.com>
Date: Tue, 12 Jul 2011 11:23:28 +0200

> Note that the reason we are seeing this problem, may be because the
> kernel we are using contains some patches from Freescale.
> Specifically, in dev_queue_xmit(), support is added for hardware queue
> handling, just before entering the rcu_read_lock_bh():
> 
>         if (dev->features & NETIF_F_HW_QDISC) {
>                 txq = dev_pick_tx(dev, skb);
>                 return dev_hard_start_xmit(skb, dev, txq);
>         }
> 
>         /* Disable soft irqs for various locks below. Also
>          * stops preemption for RCU.
>          */
>         rcu_read_lock_bh();
> 
> We just tried moving the escaping to dev_hard_start_xmit() after
> taking the lock, but this gives a large number of other problems, e.g.

This is definitely why you are seeing this behavior.

You cannot invoke dev_hard_start_xmit() without softirqs
being disabled.  It breaks everything.

This is what happens when you integrate networking patches
which were not reviewed and vetted on netdev.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
  2011-07-12  9:23 ` Thomas De Schampheleire
@ 2011-07-12 10:10   ` Eric Dumazet
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2011-07-12 10:10 UTC (permalink / raw)
  To: Thomas De Schampheleire; +Cc: linuxppc-dev, Ronny Meeus, David Miller, netdev

Le mardi 12 juillet 2011 à 11:23 +0200, Thomas De Schampheleire a
écrit :
> Hi,
> 
> I'm adding the linuxppc-dev mailing list since this may be pointing to
> an irq/softirq problem in the powerpc architecture-specific code...

> 
> Note that the reason we are seeing this problem, may be because the
> kernel we are using contains some patches from Freescale.
> Specifically, in dev_queue_xmit(), support is added for hardware queue
> handling, just before entering the rcu_read_lock_bh():
> 

Oh well, what a mess.

>         if (dev->features & NETIF_F_HW_QDISC) {
>                 txq = dev_pick_tx(dev, skb);



>                 return dev_hard_start_xmit(skb, dev, txq);
	This need to be :
		local_bh_disable();
		rc = dev_hard_start_xmit(skb, dev, txq);
		local_bh_enable();
		return rc;


>         }
> 
>         /* Disable soft irqs for various locks below. Also
>          * stops preemption for RCU.
>          */
>         rcu_read_lock_bh();
> 
> We just tried moving the escaping to dev_hard_start_xmit() after
> taking the lock, but this gives a large number of other problems, e.g.
> 
> [   78.662428] BUG: sleeping function called from invalid context at
> mm/slab.c:3101
> [   78.751004] in_atomic(): 1, irqs_disabled(): 0, pid: 1908, name:
> send_eth_socket
> [   78.839582] Call Trace:
> [   78.868784] [ec537b70] [c000789c] show_stack+0x78/0x18c (unreliable)
> [   78.944905] [ec537bb0] [c0022900] __might_sleep+0x100/0x118
> [   79.011636] [ec537bc0] [c00facc4] kmem_cache_alloc+0x48/0x118
> [   79.080446] [ec537be0] [c02cd0e8] __alloc_skb+0x50/0x130
> [   79.144047] [ec537c00] [c02cdf5c] skb_copy+0x44/0xc8
> [   79.203478] [ec537c20] [c029f904] dpa_tx+0x154/0x758

doing GFP_KERNEL allocations in dpa_tx() is wrong, for sure.


> [   79.262907] [ec537c80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
> [   79.335878] [ec537cc0] [c02d7aac] dev_queue_xmit+0x5c/0x3a4
> [   79.402602] [ec537cf0] [c0338d4c] packet_sendmsg+0x8c4/0x988
> [   79.470363] [ec537d70] [c02c3838] sock_sendmsg+0x90/0xb4
> [   79.533960] [ec537e40] [c02c4420] sys_sendto+0xdc/0x120
> [   79.596514] [ec537f10] [c02c57d0] sys_socketcall+0x148/0x210
> [   79.664287] [ec537f40] [c001084c] ret_from_syscall+0x0/0x3c
> [   79.731015] --- Exception: c01 at 0x48051f00
> [   79.731019]     LR = 0x4808e030
> 
> 
> Note that this may just be the cause for us seeing this problem. If
> indeed the main problem is irq_exit() invoking softirqs in a locked
> context, then this patch adding hardware queue support is not really
> relevant.

irq_exit() is fine. This is because BH are not masked because of the
Freescale patches.

Really, suggesting an af_packet patch to solve a problem introduced in
an out of tree patch is insane.

You guys hould have clearly stated you were using an alien kernel.




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
@ 2011-07-12 10:10   ` Eric Dumazet
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2011-07-12 10:10 UTC (permalink / raw)
  To: Thomas De Schampheleire; +Cc: linuxppc-dev, netdev, Ronny Meeus, David Miller

Le mardi 12 juillet 2011 à 11:23 +0200, Thomas De Schampheleire a
écrit :
> Hi,
> 
> I'm adding the linuxppc-dev mailing list since this may be pointing to
> an irq/softirq problem in the powerpc architecture-specific code...

> 
> Note that the reason we are seeing this problem, may be because the
> kernel we are using contains some patches from Freescale.
> Specifically, in dev_queue_xmit(), support is added for hardware queue
> handling, just before entering the rcu_read_lock_bh():
> 

Oh well, what a mess.

>         if (dev->features & NETIF_F_HW_QDISC) {
>                 txq = dev_pick_tx(dev, skb);



>                 return dev_hard_start_xmit(skb, dev, txq);
	This need to be :
		local_bh_disable();
		rc = dev_hard_start_xmit(skb, dev, txq);
		local_bh_enable();
		return rc;


>         }
> 
>         /* Disable soft irqs for various locks below. Also
>          * stops preemption for RCU.
>          */
>         rcu_read_lock_bh();
> 
> We just tried moving the escaping to dev_hard_start_xmit() after
> taking the lock, but this gives a large number of other problems, e.g.
> 
> [   78.662428] BUG: sleeping function called from invalid context at
> mm/slab.c:3101
> [   78.751004] in_atomic(): 1, irqs_disabled(): 0, pid: 1908, name:
> send_eth_socket
> [   78.839582] Call Trace:
> [   78.868784] [ec537b70] [c000789c] show_stack+0x78/0x18c (unreliable)
> [   78.944905] [ec537bb0] [c0022900] __might_sleep+0x100/0x118
> [   79.011636] [ec537bc0] [c00facc4] kmem_cache_alloc+0x48/0x118
> [   79.080446] [ec537be0] [c02cd0e8] __alloc_skb+0x50/0x130
> [   79.144047] [ec537c00] [c02cdf5c] skb_copy+0x44/0xc8
> [   79.203478] [ec537c20] [c029f904] dpa_tx+0x154/0x758

doing GFP_KERNEL allocations in dpa_tx() is wrong, for sure.


> [   79.262907] [ec537c80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
> [   79.335878] [ec537cc0] [c02d7aac] dev_queue_xmit+0x5c/0x3a4
> [   79.402602] [ec537cf0] [c0338d4c] packet_sendmsg+0x8c4/0x988
> [   79.470363] [ec537d70] [c02c3838] sock_sendmsg+0x90/0xb4
> [   79.533960] [ec537e40] [c02c4420] sys_sendto+0xdc/0x120
> [   79.596514] [ec537f10] [c02c57d0] sys_socketcall+0x148/0x210
> [   79.664287] [ec537f40] [c001084c] ret_from_syscall+0x0/0x3c
> [   79.731015] --- Exception: c01 at 0x48051f00
> [   79.731019]     LR = 0x4808e030
> 
> 
> Note that this may just be the cause for us seeing this problem. If
> indeed the main problem is irq_exit() invoking softirqs in a locked
> context, then this patch adding hardware queue support is not really
> relevant.

irq_exit() is fine. This is because BH are not masked because of the
Freescale patches.

Really, suggesting an af_packet patch to solve a problem introduced in
an out of tree patch is insane.

You guys hould have clearly stated you were using an alien kernel.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
  2011-07-12 10:10   ` Eric Dumazet
@ 2011-07-12 12:03     ` Ronny Meeus
  -1 siblings, 0 replies; 20+ messages in thread
From: Ronny Meeus @ 2011-07-12 12:03 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Thomas De Schampheleire, linuxppc-dev, David Miller, netdev, afleming

On Tue, Jul 12, 2011 at 12:10 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 12 juillet 2011 à 11:23 +0200, Thomas De Schampheleire a
> écrit :
>> Hi,
>>
>> I'm adding the linuxppc-dev mailing list since this may be pointing to
>> an irq/softirq problem in the powerpc architecture-specific code...
>
>>
>> Note that the reason we are seeing this problem, may be because the
>> kernel we are using contains some patches from Freescale.
>> Specifically, in dev_queue_xmit(), support is added for hardware queue
>> handling, just before entering the rcu_read_lock_bh():
>>
>
> Oh well, what a mess.
>
>>         if (dev->features & NETIF_F_HW_QDISC) {
>>                 txq = dev_pick_tx(dev, skb);
>
>
>
>>                 return dev_hard_start_xmit(skb, dev, txq);
>        This need to be :
>                local_bh_disable();
>                rc = dev_hard_start_xmit(skb, dev, txq);
>                local_bh_enable();
>                return rc;
>
>
>>         }
>>
>>         /* Disable soft irqs for various locks below. Also
>>          * stops preemption for RCU.
>>          */
>>         rcu_read_lock_bh();
>>
>> We just tried moving the escaping to dev_hard_start_xmit() after
>> taking the lock, but this gives a large number of other problems, e.g.
>>
>> [   78.662428] BUG: sleeping function called from invalid context at
>> mm/slab.c:3101
>> [   78.751004] in_atomic(): 1, irqs_disabled(): 0, pid: 1908, name:
>> send_eth_socket
>> [   78.839582] Call Trace:
>> [   78.868784] [ec537b70] [c000789c] show_stack+0x78/0x18c (unreliable)
>> [   78.944905] [ec537bb0] [c0022900] __might_sleep+0x100/0x118
>> [   79.011636] [ec537bc0] [c00facc4] kmem_cache_alloc+0x48/0x118
>> [   79.080446] [ec537be0] [c02cd0e8] __alloc_skb+0x50/0x130
>> [   79.144047] [ec537c00] [c02cdf5c] skb_copy+0x44/0xc8
>> [   79.203478] [ec537c20] [c029f904] dpa_tx+0x154/0x758
>
> doing GFP_KERNEL allocations in dpa_tx() is wrong, for sure.
>
>
>> [   79.262907] [ec537c80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
>> [   79.335878] [ec537cc0] [c02d7aac] dev_queue_xmit+0x5c/0x3a4
>> [   79.402602] [ec537cf0] [c0338d4c] packet_sendmsg+0x8c4/0x988
>> [   79.470363] [ec537d70] [c02c3838] sock_sendmsg+0x90/0xb4
>> [   79.533960] [ec537e40] [c02c4420] sys_sendto+0xdc/0x120
>> [   79.596514] [ec537f10] [c02c57d0] sys_socketcall+0x148/0x210
>> [   79.664287] [ec537f40] [c001084c] ret_from_syscall+0x0/0x3c
>> [   79.731015] --- Exception: c01 at 0x48051f00
>> [   79.731019]     LR = 0x4808e030
>>
>>
>> Note that this may just be the cause for us seeing this problem. If
>> indeed the main problem is irq_exit() invoking softirqs in a locked
>> context, then this patch adding hardware queue support is not really
>> relevant.
>
> irq_exit() is fine. This is because BH are not masked because of the
> Freescale patches.
>
> Really, suggesting an af_packet patch to solve a problem introduced in
> an out of tree patch is insane.
>
> You guys hould have clearly stated you were using an alien kernel.
>
>
>
>

Sorry for not mentioning we were using a patched kernel.
I was not aware that the code involved was patched by the FreeScale
patches we applied. The code found in the stack dumps is not
implemented in FSL specific files.

While reading the code of af_packet I saw that the spin_lock_bh is
used in several places while this is not the case in the tpacket_rcv
function. Since we had a locking issue in that code, I thought that my
patch would be OK.
I was not aware that for that specific function (tpacket_rcv) a
different lock primitive must be used. A suggestion for improvement:
it would be better to document this pre-condition in the code.

After doing the change you proposed our code now looks like:

>---if (dev->features & NETIF_F_HW_QDISC) {
>--->---txq = dev_pick_tx(dev, skb);
>--->---local_bh_disable();
>--->---rc = dev_hard_start_xmit(skb, dev, txq);
>--->---local_bh_enable();
>--->---return rc;
>---}

>---/* Disable soft irqs for various locks below. Also
>--- * stops preemption for RCU.
>--- */
>---rcu_read_lock_bh();

but we still see the issue "BUG: sleeping function called from invalid context":

[   91.015989] BUG: sleeping function called from invalid context at
include/linux/skbuff.h:786
[   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
[   91.200461] Call Trace:
[   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
[   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
[   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758
[   91.431957] [ec58bc80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
[   91.504952] [ec58bcc0] [c02d7ab0] dev_queue_xmit+0x60/0x3ac
[   91.571692] [ec58bcf0] [c0338d54] packet_sendmsg+0x8c4/0x988
[   91.639457] [ec58bd70] [c02c3838] sock_sendmsg+0x90/0xb4
[   91.703066] [ec58be40] [c02c4420] sys_sendto+0xdc/0x120
[   91.765646] [ec58bf10] [c02c57d0] sys_socketcall+0x148/0x210
[   91.833420] [ec58bf40] [c001084c] ret_from_syscall+0x0/0x3c
[   91.900153] --- Exception: c01 at 0x4824df00
[   91.900157]     LR = 0x4828a030


The FreeScale patch that introduced this code was created by Andy
Fleming <afleming@freescale.com> (Put in CC).

The purpose of the patch is:
"
Subject: [PATCH] net: Add support for handling queueing in hardware

The QDisc code does a bunch of locking which is unnecessary if
you have hardware which handles all of the queueing.  Add
support for this, and skip over all of the queueing code if
the feature is enabled on a given device.
"

Ronny

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
@ 2011-07-12 12:03     ` Ronny Meeus
  0 siblings, 0 replies; 20+ messages in thread
From: Ronny Meeus @ 2011-07-12 12:03 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linuxppc-dev, netdev, afleming, Thomas De Schampheleire, David Miller

On Tue, Jul 12, 2011 at 12:10 PM, Eric Dumazet <eric.dumazet@gmail.com> wro=
te:
> Le mardi 12 juillet 2011 =E0 11:23 +0200, Thomas De Schampheleire a
> =E9crit :
>> Hi,
>>
>> I'm adding the linuxppc-dev mailing list since this may be pointing to
>> an irq/softirq problem in the powerpc architecture-specific code...
>
>>
>> Note that the reason we are seeing this problem, may be because the
>> kernel we are using contains some patches from Freescale.
>> Specifically, in dev_queue_xmit(), support is added for hardware queue
>> handling, just before entering the rcu_read_lock_bh():
>>
>
> Oh well, what a mess.
>
>> =A0 =A0 =A0 =A0 if (dev->features & NETIF_F_HW_QDISC) {
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 txq =3D dev_pick_tx(dev, skb);
>
>
>
>> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 return dev_hard_start_xmit(skb, dev, txq=
);
> =A0 =A0 =A0 =A0This need to be :
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0local_bh_disable();
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0rc =3D dev_hard_start_xmit(skb, dev, txq);
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0local_bh_enable();
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0return rc;
>
>
>> =A0 =A0 =A0 =A0 }
>>
>> =A0 =A0 =A0 =A0 /* Disable soft irqs for various locks below. Also
>> =A0 =A0 =A0 =A0 =A0* stops preemption for RCU.
>> =A0 =A0 =A0 =A0 =A0*/
>> =A0 =A0 =A0 =A0 rcu_read_lock_bh();
>>
>> We just tried moving the escaping to dev_hard_start_xmit() after
>> taking the lock, but this gives a large number of other problems, e.g.
>>
>> [ =A0 78.662428] BUG: sleeping function called from invalid context at
>> mm/slab.c:3101
>> [ =A0 78.751004] in_atomic(): 1, irqs_disabled(): 0, pid: 1908, name:
>> send_eth_socket
>> [ =A0 78.839582] Call Trace:
>> [ =A0 78.868784] [ec537b70] [c000789c] show_stack+0x78/0x18c (unreliable=
)
>> [ =A0 78.944905] [ec537bb0] [c0022900] __might_sleep+0x100/0x118
>> [ =A0 79.011636] [ec537bc0] [c00facc4] kmem_cache_alloc+0x48/0x118
>> [ =A0 79.080446] [ec537be0] [c02cd0e8] __alloc_skb+0x50/0x130
>> [ =A0 79.144047] [ec537c00] [c02cdf5c] skb_copy+0x44/0xc8
>> [ =A0 79.203478] [ec537c20] [c029f904] dpa_tx+0x154/0x758
>
> doing GFP_KERNEL allocations in dpa_tx() is wrong, for sure.
>
>
>> [ =A0 79.262907] [ec537c80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
>> [ =A0 79.335878] [ec537cc0] [c02d7aac] dev_queue_xmit+0x5c/0x3a4
>> [ =A0 79.402602] [ec537cf0] [c0338d4c] packet_sendmsg+0x8c4/0x988
>> [ =A0 79.470363] [ec537d70] [c02c3838] sock_sendmsg+0x90/0xb4
>> [ =A0 79.533960] [ec537e40] [c02c4420] sys_sendto+0xdc/0x120
>> [ =A0 79.596514] [ec537f10] [c02c57d0] sys_socketcall+0x148/0x210
>> [ =A0 79.664287] [ec537f40] [c001084c] ret_from_syscall+0x0/0x3c
>> [ =A0 79.731015] --- Exception: c01 at 0x48051f00
>> [ =A0 79.731019] =A0 =A0 LR =3D 0x4808e030
>>
>>
>> Note that this may just be the cause for us seeing this problem. If
>> indeed the main problem is irq_exit() invoking softirqs in a locked
>> context, then this patch adding hardware queue support is not really
>> relevant.
>
> irq_exit() is fine. This is because BH are not masked because of the
> Freescale patches.
>
> Really, suggesting an af_packet patch to solve a problem introduced in
> an out of tree patch is insane.
>
> You guys hould have clearly stated you were using an alien kernel.
>
>
>
>

Sorry for not mentioning we were using a patched kernel.
I was not aware that the code involved was patched by the FreeScale
patches we applied. The code found in the stack dumps is not
implemented in FSL specific files.

While reading the code of af_packet I saw that the spin_lock_bh is
used in several places while this is not the case in the tpacket_rcv
function. Since we had a locking issue in that code, I thought that my
patch would be OK.
I was not aware that for that specific function (tpacket_rcv) a
different lock primitive must be used. A suggestion for improvement:
it would be better to document this pre-condition in the code.

After doing the change you proposed our code now looks like:

>---if (dev->features & NETIF_F_HW_QDISC) {
>--->---txq =3D dev_pick_tx(dev, skb);
>--->---local_bh_disable();
>--->---rc =3D dev_hard_start_xmit(skb, dev, txq);
>--->---local_bh_enable();
>--->---return rc;
>---}

>---/* Disable soft irqs for various locks below. Also
>--- * stops preemption for RCU.
>--- */
>---rcu_read_lock_bh();

but we still see the issue "BUG: sleeping function called from invalid cont=
ext":

[   91.015989] BUG: sleeping function called from invalid context at
include/linux/skbuff.h:786
[   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1=
842
[   91.200461] Call Trace:
[   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
[   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
[   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758
[   91.431957] [ec58bc80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
[   91.504952] [ec58bcc0] [c02d7ab0] dev_queue_xmit+0x60/0x3ac
[   91.571692] [ec58bcf0] [c0338d54] packet_sendmsg+0x8c4/0x988
[   91.639457] [ec58bd70] [c02c3838] sock_sendmsg+0x90/0xb4
[   91.703066] [ec58be40] [c02c4420] sys_sendto+0xdc/0x120
[   91.765646] [ec58bf10] [c02c57d0] sys_socketcall+0x148/0x210
[   91.833420] [ec58bf40] [c001084c] ret_from_syscall+0x0/0x3c
[   91.900153] --- Exception: c01 at 0x4824df00
[   91.900157]     LR =3D 0x4828a030


The FreeScale patch that introduced this code was created by Andy
Fleming <afleming@freescale.com> (Put in CC).

The purpose of the patch is:
"
Subject: [PATCH] net: Add support for handling queueing in hardware

The QDisc code does a bunch of locking which is unnecessary if
you have hardware which handles all of the queueing.  Add
support for this, and skip over all of the queueing code if
the feature is enabled on a given device.
"

Ronny

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
  2011-07-12 12:03     ` Ronny Meeus
@ 2011-07-12 12:08       ` David Miller
  -1 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-07-12 12:08 UTC (permalink / raw)
  To: ronny.meeus
  Cc: eric.dumazet, patrickdepinguin+linuxppc, linuxppc-dev, netdev, afleming

From: Ronny Meeus <ronny.meeus@gmail.com>
Date: Tue, 12 Jul 2011 14:03:04 +0200

> but we still see the issue "BUG: sleeping function called from invalid context":
> 
> [   91.015989] BUG: sleeping function called from invalid context at
> include/linux/skbuff.h:786
> [   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
> [   91.200461] Call Trace:
> [   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
> [   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
> [   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758

Because this dpa driver's transmit method is doing something else that
is not allowed in software interrupt context.

You must remove all things that might sleep in this driver's
->ndo_start_xmit method, and I do mean everything.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
@ 2011-07-12 12:08       ` David Miller
  0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-07-12 12:08 UTC (permalink / raw)
  To: ronny.meeus
  Cc: linuxppc-dev, afleming, patrickdepinguin+linuxppc, eric.dumazet, netdev

From: Ronny Meeus <ronny.meeus@gmail.com>
Date: Tue, 12 Jul 2011 14:03:04 +0200

> but we still see the issue "BUG: sleeping function called from invalid context":
> 
> [   91.015989] BUG: sleeping function called from invalid context at
> include/linux/skbuff.h:786
> [   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
> [   91.200461] Call Trace:
> [   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
> [   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
> [   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758

Because this dpa driver's transmit method is doing something else that
is not allowed in software interrupt context.

You must remove all things that might sleep in this driver's
->ndo_start_xmit method, and I do mean everything.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
  2011-07-12 12:08       ` David Miller
@ 2011-07-12 12:13         ` David Miller
  -1 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-07-12 12:13 UTC (permalink / raw)
  To: ronny.meeus
  Cc: eric.dumazet, patrickdepinguin+linuxppc, linuxppc-dev, netdev, afleming

From: David Miller <davem@davemloft.net>
Date: Tue, 12 Jul 2011 05:08:17 -0700 (PDT)

> From: Ronny Meeus <ronny.meeus@gmail.com>
> Date: Tue, 12 Jul 2011 14:03:04 +0200
> 
>> but we still see the issue "BUG: sleeping function called from invalid context":
>> 
>> [   91.015989] BUG: sleeping function called from invalid context at
>> include/linux/skbuff.h:786
>> [   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
>> [   91.200461] Call Trace:
>> [   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
>> [   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
>> [   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758
> 
> Because this dpa driver's transmit method is doing something else that
> is not allowed in software interrupt context.
> 
> You must remove all things that might sleep in this driver's
> ->ndo_start_xmit method, and I do mean everything.

Also this whole HW QOS feature bit facility is beyond bogus.

What if the user enables a qdisc that the hardware can't handle, or a
configuration of a hw supported qdisc that the hardware can't support?

What if I have packet classification and packet actions enabled in the
packet scheduler?

These changes are terrible, and we really need you guys to sort out
your problems with these changes yoursleves because your wounds are
entirely self-inflicted and totally not our problem.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
@ 2011-07-12 12:13         ` David Miller
  0 siblings, 0 replies; 20+ messages in thread
From: David Miller @ 2011-07-12 12:13 UTC (permalink / raw)
  To: ronny.meeus
  Cc: linuxppc-dev, afleming, patrickdepinguin+linuxppc, eric.dumazet, netdev

From: David Miller <davem@davemloft.net>
Date: Tue, 12 Jul 2011 05:08:17 -0700 (PDT)

> From: Ronny Meeus <ronny.meeus@gmail.com>
> Date: Tue, 12 Jul 2011 14:03:04 +0200
> 
>> but we still see the issue "BUG: sleeping function called from invalid context":
>> 
>> [   91.015989] BUG: sleeping function called from invalid context at
>> include/linux/skbuff.h:786
>> [   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
>> [   91.200461] Call Trace:
>> [   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
>> [   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
>> [   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758
> 
> Because this dpa driver's transmit method is doing something else that
> is not allowed in software interrupt context.
> 
> You must remove all things that might sleep in this driver's
> ->ndo_start_xmit method, and I do mean everything.

Also this whole HW QOS feature bit facility is beyond bogus.

What if the user enables a qdisc that the hardware can't handle, or a
configuration of a hw supported qdisc that the hardware can't support?

What if I have packet classification and packet actions enabled in the
packet scheduler?

These changes are terrible, and we really need you guys to sort out
your problems with these changes yoursleves because your wounds are
entirely self-inflicted and totally not our problem.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
  2011-07-12 12:03     ` Ronny Meeus
@ 2011-07-12 15:27       ` Eric Dumazet
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2011-07-12 15:27 UTC (permalink / raw)
  To: Ronny Meeus
  Cc: Thomas De Schampheleire, linuxppc-dev, David Miller, netdev, afleming

Le mardi 12 juillet 2011 à 14:03 +0200, Ronny Meeus a écrit :

> Sorry for not mentioning we were using a patched kernel.
> I was not aware that the code involved was patched by the FreeScale
> patches we applied. The code found in the stack dumps is not
> implemented in FSL specific files.
> 
> While reading the code of af_packet I saw that the spin_lock_bh is
> used in several places while this is not the case in the tpacket_rcv
> function. Since we had a locking issue in that code, I thought that my
> patch would be OK.
> I was not aware that for that specific function (tpacket_rcv) a
> different lock primitive must be used. A suggestion for improvement:
> it would be better to document this pre-condition in the code.
> 
> After doing the change you proposed our code now looks like:
> 
> >---if (dev->features & NETIF_F_HW_QDISC) {
> >--->---txq = dev_pick_tx(dev, skb);
> >--->---local_bh_disable();
> >--->---rc = dev_hard_start_xmit(skb, dev, txq);
> >--->---local_bh_enable();
> >--->---return rc;
> >---}
> 
> >---/* Disable soft irqs for various locks below. Also
> >--- * stops preemption for RCU.
> >--- */
> >---rcu_read_lock_bh();
> 
> but we still see the issue "BUG: sleeping function called from invalid context":

Of course you are if this is the only change you did.

> 
> [   91.015989] BUG: sleeping function called from invalid context at
> include/linux/skbuff.h:786
> [   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
> [   91.200461] Call Trace:
> [   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
> [   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
> [   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758


Please read again my mail : 

I said : "doing GFP_KERNEL allocations in dpa_tx() is wrong, for sure."

I dont have this code, but I suspect it's using : skb_copy(skb,
GFP_KERNEL)

Just say no, use GFP_ATOMIC instead.

Real question is : why skb_copy() is done, since its slow as hell.

> [   91.431957] [ec58bc80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
> [   91.504952] [ec58bcc0] [c02d7ab0] dev_queue_xmit+0x60/0x3ac
> [   91.571692] [ec58bcf0] [c0338d54] packet_sendmsg+0x8c4/0x988
> [   91.639457] [ec58bd70] [c02c3838] sock_sendmsg+0x90/0xb4
> [   91.703066] [ec58be40] [c02c4420] sys_sendto+0xdc/0x120
> [   91.765646] [ec58bf10] [c02c57d0] sys_socketcall+0x148/0x210
> [   91.833420] [ec58bf40] [c001084c] ret_from_syscall+0x0/0x3c
> [   91.900153] --- Exception: c01 at 0x4824df00
> [   91.900157]     LR = 0x4828a030
> 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
@ 2011-07-12 15:27       ` Eric Dumazet
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2011-07-12 15:27 UTC (permalink / raw)
  To: Ronny Meeus
  Cc: linuxppc-dev, netdev, afleming, Thomas De Schampheleire, David Miller

Le mardi 12 juillet 2011 à 14:03 +0200, Ronny Meeus a écrit :

> Sorry for not mentioning we were using a patched kernel.
> I was not aware that the code involved was patched by the FreeScale
> patches we applied. The code found in the stack dumps is not
> implemented in FSL specific files.
> 
> While reading the code of af_packet I saw that the spin_lock_bh is
> used in several places while this is not the case in the tpacket_rcv
> function. Since we had a locking issue in that code, I thought that my
> patch would be OK.
> I was not aware that for that specific function (tpacket_rcv) a
> different lock primitive must be used. A suggestion for improvement:
> it would be better to document this pre-condition in the code.
> 
> After doing the change you proposed our code now looks like:
> 
> >---if (dev->features & NETIF_F_HW_QDISC) {
> >--->---txq = dev_pick_tx(dev, skb);
> >--->---local_bh_disable();
> >--->---rc = dev_hard_start_xmit(skb, dev, txq);
> >--->---local_bh_enable();
> >--->---return rc;
> >---}
> 
> >---/* Disable soft irqs for various locks below. Also
> >--- * stops preemption for RCU.
> >--- */
> >---rcu_read_lock_bh();
> 
> but we still see the issue "BUG: sleeping function called from invalid context":

Of course you are if this is the only change you did.

> 
> [   91.015989] BUG: sleeping function called from invalid context at
> include/linux/skbuff.h:786
> [   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
> [   91.200461] Call Trace:
> [   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
> [   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
> [   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758


Please read again my mail : 

I said : "doing GFP_KERNEL allocations in dpa_tx() is wrong, for sure."

I dont have this code, but I suspect it's using : skb_copy(skb,
GFP_KERNEL)

Just say no, use GFP_ATOMIC instead.

Real question is : why skb_copy() is done, since its slow as hell.

> [   91.431957] [ec58bc80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
> [   91.504952] [ec58bcc0] [c02d7ab0] dev_queue_xmit+0x60/0x3ac
> [   91.571692] [ec58bcf0] [c0338d54] packet_sendmsg+0x8c4/0x988
> [   91.639457] [ec58bd70] [c02c3838] sock_sendmsg+0x90/0xb4
> [   91.703066] [ec58be40] [c02c4420] sys_sendto+0xdc/0x120
> [   91.765646] [ec58bf10] [c02c57d0] sys_socketcall+0x148/0x210
> [   91.833420] [ec58bf40] [c001084c] ret_from_syscall+0x0/0x3c
> [   91.900153] --- Exception: c01 at 0x4824df00
> [   91.900157]     LR = 0x4828a030
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
  2011-07-12 15:27       ` Eric Dumazet
@ 2011-07-12 19:52         ` Ronny Meeus
  -1 siblings, 0 replies; 20+ messages in thread
From: Ronny Meeus @ 2011-07-12 19:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Thomas De Schampheleire, linuxppc-dev, David Miller, netdev, afleming

On Tue, Jul 12, 2011 at 5:27 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Le mardi 12 juillet 2011 à 14:03 +0200, Ronny Meeus a écrit :
>
>> Sorry for not mentioning we were using a patched kernel.
>> I was not aware that the code involved was patched by the FreeScale
>> patches we applied. The code found in the stack dumps is not
>> implemented in FSL specific files.
>>
>> While reading the code of af_packet I saw that the spin_lock_bh is
>> used in several places while this is not the case in the tpacket_rcv
>> function. Since we had a locking issue in that code, I thought that my
>> patch would be OK.
>> I was not aware that for that specific function (tpacket_rcv) a
>> different lock primitive must be used. A suggestion for improvement:
>> it would be better to document this pre-condition in the code.
>>
>> After doing the change you proposed our code now looks like:
>>
>> >---if (dev->features & NETIF_F_HW_QDISC) {
>> >--->---txq = dev_pick_tx(dev, skb);
>> >--->---local_bh_disable();
>> >--->---rc = dev_hard_start_xmit(skb, dev, txq);
>> >--->---local_bh_enable();
>> >--->---return rc;
>> >---}
>>
>> >---/* Disable soft irqs for various locks below. Also
>> >--- * stops preemption for RCU.
>> >--- */
>> >---rcu_read_lock_bh();
>>
>> but we still see the issue "BUG: sleeping function called from invalid context":
>
> Of course you are if this is the only change you did.
>
>>
>> [   91.015989] BUG: sleeping function called from invalid context at
>> include/linux/skbuff.h:786
>> [   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
>> [   91.200461] Call Trace:
>> [   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
>> [   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
>> [   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758
>
>
> Please read again my mail :
>
> I said : "doing GFP_KERNEL allocations in dpa_tx() is wrong, for sure."
>
> I dont have this code, but I suspect it's using : skb_copy(skb,
> GFP_KERNEL)
>
> Just say no, use GFP_ATOMIC instead.
>
> Real question is : why skb_copy() is done, since its slow as hell.
>
>> [   91.431957] [ec58bc80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
>> [   91.504952] [ec58bcc0] [c02d7ab0] dev_queue_xmit+0x60/0x3ac
>> [   91.571692] [ec58bcf0] [c0338d54] packet_sendmsg+0x8c4/0x988
>> [   91.639457] [ec58bd70] [c02c3838] sock_sendmsg+0x90/0xb4
>> [   91.703066] [ec58be40] [c02c4420] sys_sendto+0xdc/0x120
>> [   91.765646] [ec58bf10] [c02c57d0] sys_socketcall+0x148/0x210
>> [   91.833420] [ec58bf40] [c001084c] ret_from_syscall+0x0/0x3c
>> [   91.900153] --- Exception: c01 at 0x4824df00
>> [   91.900157]     LR = 0x4828a030
>>
>
>
>

I have identified the piece of code, it was a call to skb_unshare. I
have changed it into GFP_ATOMIC.

>---if (skb_cloned(skb))
>--->---skb = skb_unshare(skb, GFP_ATOMIC);

After doing this change, I do not see the issue anymore. At least not
for the test I'm doing right now.
After seeing all your comments today, it might be that other issues popup later.

BTW Are there any good sites (or books) that document this part of the
Linux kernel?

Best regards,
Ronny

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface)
@ 2011-07-12 19:52         ` Ronny Meeus
  0 siblings, 0 replies; 20+ messages in thread
From: Ronny Meeus @ 2011-07-12 19:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linuxppc-dev, netdev, afleming, Thomas De Schampheleire, David Miller

On Tue, Jul 12, 2011 at 5:27 PM, Eric Dumazet <eric.dumazet@gmail.com> wrot=
e:
> Le mardi 12 juillet 2011 =E0 14:03 +0200, Ronny Meeus a =E9crit :
>
>> Sorry for not mentioning we were using a patched kernel.
>> I was not aware that the code involved was patched by the FreeScale
>> patches we applied. The code found in the stack dumps is not
>> implemented in FSL specific files.
>>
>> While reading the code of af_packet I saw that the spin_lock_bh is
>> used in several places while this is not the case in the tpacket_rcv
>> function. Since we had a locking issue in that code, I thought that my
>> patch would be OK.
>> I was not aware that for that specific function (tpacket_rcv) a
>> different lock primitive must be used. A suggestion for improvement:
>> it would be better to document this pre-condition in the code.
>>
>> After doing the change you proposed our code now looks like:
>>
>> >---if (dev->features & NETIF_F_HW_QDISC) {
>> >--->---txq =3D dev_pick_tx(dev, skb);
>> >--->---local_bh_disable();
>> >--->---rc =3D dev_hard_start_xmit(skb, dev, txq);
>> >--->---local_bh_enable();
>> >--->---return rc;
>> >---}
>>
>> >---/* Disable soft irqs for various locks below. Also
>> >--- * stops preemption for RCU.
>> >--- */
>> >---rcu_read_lock_bh();
>>
>> but we still see the issue "BUG: sleeping function called from invalid c=
ontext":
>
> Of course you are if this is the only change you did.
>
>>
>> [ =A0 91.015989] BUG: sleeping function called from invalid context at
>> include/linux/skbuff.h:786
>> [ =A0 91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NM=
TX_T1842
>> [ =A0 91.200461] Call Trace:
>> [ =A0 91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable=
)
>> [ =A0 91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
>> [ =A0 91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758
>
>
> Please read again my mail :
>
> I said : "doing GFP_KERNEL allocations in dpa_tx() is wrong, for sure."
>
> I dont have this code, but I suspect it's using : skb_copy(skb,
> GFP_KERNEL)
>
> Just say no, use GFP_ATOMIC instead.
>
> Real question is : why skb_copy() is done, since its slow as hell.
>
>> [ =A0 91.431957] [ec58bc80] [c02d78ec] dev_hard_start_xmit+0x424/0x588
>> [ =A0 91.504952] [ec58bcc0] [c02d7ab0] dev_queue_xmit+0x60/0x3ac
>> [ =A0 91.571692] [ec58bcf0] [c0338d54] packet_sendmsg+0x8c4/0x988
>> [ =A0 91.639457] [ec58bd70] [c02c3838] sock_sendmsg+0x90/0xb4
>> [ =A0 91.703066] [ec58be40] [c02c4420] sys_sendto+0xdc/0x120
>> [ =A0 91.765646] [ec58bf10] [c02c57d0] sys_socketcall+0x148/0x210
>> [ =A0 91.833420] [ec58bf40] [c001084c] ret_from_syscall+0x0/0x3c
>> [ =A0 91.900153] --- Exception: c01 at 0x4824df00
>> [ =A0 91.900157] =A0 =A0 LR =3D 0x4828a030
>>
>
>
>

I have identified the piece of code, it was a call to skb_unshare. I
have changed it into GFP_ATOMIC.

>---if (skb_cloned(skb))
>--->---skb =3D skb_unshare(skb, GFP_ATOMIC);

After doing this change, I do not see the issue anymore. At least not
for the test I'm doing right now.
After seeing all your comments today, it might be that other issues popup l=
ater.

BTW Are there any good sites (or books) that document this part of the
Linux kernel?

Best regards,
Ronny

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
  2011-07-12 12:13         ` David Miller
@ 2011-07-13  7:20           ` Thomas De Schampheleire
  -1 siblings, 0 replies; 20+ messages in thread
From: Thomas De Schampheleire @ 2011-07-13  7:20 UTC (permalink / raw)
  To: David Miller; +Cc: ronny.meeus, eric.dumazet, linuxppc-dev, netdev, afleming

On Tue, Jul 12, 2011 at 2:13 PM, David Miller <davem@davemloft.net> wrote:
> From: David Miller <davem@davemloft.net>
> Date: Tue, 12 Jul 2011 05:08:17 -0700 (PDT)
>
>> From: Ronny Meeus <ronny.meeus@gmail.com>
>> Date: Tue, 12 Jul 2011 14:03:04 +0200
>>
>>> but we still see the issue "BUG: sleeping function called from invalid context":
>>>
>>> [   91.015989] BUG: sleeping function called from invalid context at
>>> include/linux/skbuff.h:786
>>> [   91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: NMTX_T1842
>>> [   91.200461] Call Trace:
>>> [   91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliable)
>>> [   91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
>>> [   91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758
>>
>> Because this dpa driver's transmit method is doing something else that
>> is not allowed in software interrupt context.
>>
>> You must remove all things that might sleep in this driver's
>> ->ndo_start_xmit method, and I do mean everything.
>
> Also this whole HW QOS feature bit facility is beyond bogus.
>
> What if the user enables a qdisc that the hardware can't handle, or a
> configuration of a hw supported qdisc that the hardware can't support?
>
> What if I have packet classification and packet actions enabled in the
> packet scheduler?
>
> These changes are terrible, and we really need you guys to sort out
> your problems with these changes yoursleves because your wounds are
> entirely self-inflicted and totally not our problem.
>

Of course, you make very valid points.

I just want to mention that, although you seem to think that we
shouldn't have created this thread in the first place, your and Eric's
remarks have actually helped us in identifying the problem and
increasing our understanding.

So, thanks for that.

Thomas

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
@ 2011-07-13  7:20           ` Thomas De Schampheleire
  0 siblings, 0 replies; 20+ messages in thread
From: Thomas De Schampheleire @ 2011-07-13  7:20 UTC (permalink / raw)
  To: David Miller; +Cc: linuxppc-dev, afleming, ronny.meeus, eric.dumazet, netdev

On Tue, Jul 12, 2011 at 2:13 PM, David Miller <davem@davemloft.net> wrote:
> From: David Miller <davem@davemloft.net>
> Date: Tue, 12 Jul 2011 05:08:17 -0700 (PDT)
>
>> From: Ronny Meeus <ronny.meeus@gmail.com>
>> Date: Tue, 12 Jul 2011 14:03:04 +0200
>>
>>> but we still see the issue "BUG: sleeping function called from invalid =
context":
>>>
>>> [ =A0 91.015989] BUG: sleeping function called from invalid context at
>>> include/linux/skbuff.h:786
>>> [ =A0 91.117096] in_atomic(): 1, irqs_disabled(): 0, pid: 1865, name: N=
MTX_T1842
>>> [ =A0 91.200461] Call Trace:
>>> [ =A0 91.229672] [ec58bbd0] [c000789c] show_stack+0x78/0x18c (unreliabl=
e)
>>> [ =A0 91.305791] [ec58bc10] [c0022900] __might_sleep+0x100/0x118
>>> [ =A0 91.372524] [ec58bc20] [c029f8d8] dpa_tx+0x128/0x758
>>
>> Because this dpa driver's transmit method is doing something else that
>> is not allowed in software interrupt context.
>>
>> You must remove all things that might sleep in this driver's
>> ->ndo_start_xmit method, and I do mean everything.
>
> Also this whole HW QOS feature bit facility is beyond bogus.
>
> What if the user enables a qdisc that the hardware can't handle, or a
> configuration of a hw supported qdisc that the hardware can't support?
>
> What if I have packet classification and packet actions enabled in the
> packet scheduler?
>
> These changes are terrible, and we really need you guys to sort out
> your problems with these changes yoursleves because your wounds are
> entirely self-inflicted and totally not our problem.
>

Of course, you make very valid points.

I just want to mention that, although you seem to think that we
shouldn't have created this thread in the first place, your and Eric's
remarks have actually helped us in identifying the problem and
increasing our understanding.

So, thanks for that.

Thomas

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
  2011-07-13  7:20           ` Thomas De Schampheleire
@ 2011-07-13  7:38             ` Eric Dumazet
  -1 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2011-07-13  7:38 UTC (permalink / raw)
  To: Thomas De Schampheleire
  Cc: David Miller, ronny.meeus, linuxppc-dev, netdev, afleming

Le mercredi 13 juillet 2011 à 09:20 +0200, Thomas De Schampheleire a
écrit :

> I just want to mention that, although you seem to think that we
> shouldn't have created this thread in the first place, your and Eric's
> remarks have actually helped us in identifying the problem and
> increasing our understanding.
> 

Thread was fine IMHO, but should have mentioned the context of "non
standard linux-2.6 or net-next-2.6 trees"

I would like to mention NETIF_F_LLTX existence.

While being marked deprecated, it allows a lockless tx path :

The device transmit handler can then perform custom actions, including
its own locking if needed.

If the user wants a specific Qdisc setup (instead of a no qdisc one), it
still can.

veth, loopback, dummy, macvlan & bond devices use NETIF_F_LLTX for
example.

But ndo_start_xmit() must always be called with BH disabled.




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: softirqs are invoked while bottom halves are masked
@ 2011-07-13  7:38             ` Eric Dumazet
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2011-07-13  7:38 UTC (permalink / raw)
  To: Thomas De Schampheleire
  Cc: linuxppc-dev, netdev, afleming, David Miller, ronny.meeus

Le mercredi 13 juillet 2011 à 09:20 +0200, Thomas De Schampheleire a
écrit :

> I just want to mention that, although you seem to think that we
> shouldn't have created this thread in the first place, your and Eric's
> remarks have actually helped us in identifying the problem and
> increasing our understanding.
> 

Thread was fine IMHO, but should have mentioned the context of "non
standard linux-2.6 or net-next-2.6 trees"

I would like to mention NETIF_F_LLTX existence.

While being marked deprecated, it allows a lockless tx path :

The device transmit handler can then perform custom actions, including
its own locking if needed.

If the user wants a specific Qdisc setup (instead of a no qdisc one), it
still can.

veth, loopback, dummy, macvlan & bond devices use NETIF_F_LLTX for
example.

But ndo_start_xmit() must always be called with BH disabled.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2011-07-13  7:39 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-12  9:23 softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface) Thomas De Schampheleire
2011-07-12  9:23 ` Thomas De Schampheleire
2011-07-12 10:01 ` softirqs are invoked while bottom halves are masked David Miller
2011-07-12 10:01   ` David Miller
2011-07-12 10:10 ` softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface) Eric Dumazet
2011-07-12 10:10   ` Eric Dumazet
2011-07-12 12:03   ` Ronny Meeus
2011-07-12 12:03     ` Ronny Meeus
2011-07-12 12:08     ` softirqs are invoked while bottom halves are masked David Miller
2011-07-12 12:08       ` David Miller
2011-07-12 12:13       ` David Miller
2011-07-12 12:13         ` David Miller
2011-07-13  7:20         ` Thomas De Schampheleire
2011-07-13  7:20           ` Thomas De Schampheleire
2011-07-13  7:38           ` Eric Dumazet
2011-07-13  7:38             ` Eric Dumazet
2011-07-12 15:27     ` softirqs are invoked while bottom halves are masked (was: Re: [PATCH] [PATCH] Fix deadlock in af_packet while stressing raw ethernet socket interface) Eric Dumazet
2011-07-12 15:27       ` Eric Dumazet
2011-07-12 19:52       ` Ronny Meeus
2011-07-12 19:52         ` Ronny Meeus

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.