All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [BUG] A panic caused by null pointer dereference aftering updating to
       [not found] ` <CAHz2CGU5aPA5QJ9ED9g5kahD_jJaeLF+VOotUSBFXDkY+EH8ow@mail.gmail.com>
@ 2014-04-14 15:19   ` James Chapman
  2014-04-14 17:01     ` Zhan Jianyu
  0 siblings, 1 reply; 20+ messages in thread
From: James Chapman @ 2014-04-14 15:19 UTC (permalink / raw)
  To: Zhan Jianyu, davem, edumazet, joe; +Cc: LKML, netdev

Please send the complete oops message.

Is this a regression? If so, do you know what the last kernel version
was that worked?

Thanks
James

On 14/04/14 14:33, Zhan Jianyu wrote:
> When I tried to connect my VPN, I got a panic, saying
> a NULL poiter dereference at 0x00000000000002c0
> 
> I came across this bug twice today, after updateing to
> Linux-3.15-rc1.
> 
> Below are some panic message(hand copy,not complete)
> =====
> 
> Kernel panic - not syncing: Fatal exception in interupt
> 
> RIP ip_queue_xmit+0x20/0x3e0
> Call Trace:
> l2tp_xmit_skb+0x335/0x6c0 [l2tp_core]
> ? skb_free_head+0x1e/0x80
> pppol2tp_xmit+0x141/0x210 [l2tp_ppp]
> ppp_channel_push+0x50/0xd0 [ppp_generic]
> ppp_write+0xa3/0xec [ppp_generic]
> vfs_write
> Sys_wirte
> ? __audit_syscall_exit
> system_call_fastpath
> 
> =====
> 
> I've tried to figure it out.
> I disassembled ip_queue_xmit, found that the null
> dereference is caused by the first argument of
> ip_queue_xmit(), which is sk_buff pointer became
> NULL.
> 
> This seems some async skb freeing is in progress?
> 
> Regards,
> Jianyu Zhan
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 15:19   ` [BUG] A panic caused by null pointer dereference aftering updating to James Chapman
@ 2014-04-14 17:01     ` Zhan Jianyu
  2014-04-14 17:12       ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: Zhan Jianyu @ 2014-04-14 17:01 UTC (permalink / raw)
  To: James Chapman; +Cc: davem, Eric Dumazet, joe, LKML, netdev

On Mon, Apr 14, 2014 at 11:19 PM, James Chapman <jchapman@katalix.com> wrote:
> Please send the complete oops message.

Hi, complete oops message is :


[  100.243737] BUG: unable to handle kernel NULL pointer dereference
at 00000000000002c0
[  100.244985] IP: [<ffffffff815d4cb0>] ip_queue_xmit+0x20/0x3e0
[  100.262266] PGD 0
[  100.288395] Oops: 0000 [#1] SMP
[  100.325955] Modules linked in: l2tp-ppp l2tp-netlink l2tp_core
pppoe vmblock vsock vmmemctl vmhgfs acpiphp snd_ens1371 gameport
snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm
snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event
snd_seq snd_timer snd_seq_device ppdev psmouse serio_raw fbcon
tileblit font bitblit softcursor snd parport_pc soundcore
snd_page_alloc vmci i2c_piix4 vga16fb vgastate intel_agp agpgart
shpchp lp parport floppy pcnet32 mii mptspi mptscsih mptbase
scsi_transport_spi vmxnet
[  1100.472178] CPU: 0 PID: 1849 Comm: pppd Tainted: G        IOE
3.15.0-rc1+ #13
[  1100.540018] Hardware name: LENOVO 20216/v2QY041, BIOS
74CN34WW(v2.0k) 07/04/2013
[  1100.562844] task: ffff88003ee48000 ti: ffff880257a4e000 task.ti:
ffff880257a4e000
[  1100.584351] RIP: 0010: [<ffffffff815d4cb0>]  [<ffffffff815d4cb0>]
ip_queue_xmit+0x20/0x3e0
[  1100.584351] RSP: 0018: ffff880257a4fdd8 EFLAGS: 00010293
[  1100.609358] RAX: ffff88003fe48e00 RBX:ffff88003f5ad100 RCX:0000000000000014
[  1100.631467] RDX: ffffffff81ccab40   RSI:  ffff88003fe49110
RDI:ffff88003f5ad100
[  1100.657664] RBP: ffff880257a4fe00 R08:  ffff88807ff06476
R09:0000000000000022
[  1100.657694] R10:  ffffe90009726600 R11:  ffff88003fe49110
R12:0000000000000000
[  1100.657694] R13:  ffff88025a087480 R14:  ffff880025b561400
R15:ffff88003fe49110
[  1100.688994] FS:    00007f8857f8f840(0000)
GS:ffff88026f200000(0000) knlGS:000000000000
[  1100.697694] CS: 0010 DS: 0000 ES: 0000 CR0:0000000080050033
[  1100.697694] CR2:0000000000002c0 CR3: 0000000260190e00 CR4: 00000000001407f0
[  1100.706083] Stack:
[  1100.731783]  ffff88025cadbe00 ffff88003fe48e00 0000000000000022
ffff88025b561400
[  1100.759324]  ffff88003f5ad100  ffff880257a4fe68 ffffffffa081a6e5
ffff88003f5ad100
[  1100.811396]  ffff880257a4fe28  ffffffff8158686e   ffff880257a4fe68
 ffffffff000000008
[  1100.891922] Call Trace:
[  1100.916257]  [<ffffffffa081a6e5>] l2tp_xmit_skb+0x335/0x6c0 [l2tp_core]
[  1100.943670]  [<ffffffff8158686e>] ? skb_free_head+0x1e/0x80
[  1100.970905]  [<ffffffffa0830da1>] pppol2tp_xmit+0x141/0x210 [l2tp_ppp]
[  1100.995542]  [<ffffffffa0805230>] ppp_channel_push+0x50/0xd0 [ppp_generic]
[  1101.024087]  [<ffffffffa0805523>] ppp_write+0xa3/0xec [ppp_generic]
[  1101.133447]  [<ffffffff811c7bfa>] vfs_write+0xba/0x1e0
[  1101.228987]  [<ffffffff811c8746>] Sys_wirte+0x46/0xb0
[  1101.428494]  [<ffffffff8110aca6>] ? __audit_syscall_exit+0x1f6/0x2a0
[  1101.898345]  [<ffffffff816a90e9>] system_call_fastpath+0x16/0x1b
[  1102.024087] Code: e9 31 fe ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00
55 48 89 e5 41 57 49 89 f7 41 55 41 54 53 48 89 fb 4c 8b 67 18 4c 8b
67 18 4c 8b 6f 58 <4d> 8b b4 24 c0 02 00 00 49 83 e5 fe 0f 84 3e 02 00
00 4d 85 f6
[  1102.879592] RIP: [<ffffffff815d4cb0>] ip_queue_xmit+0x20/0x3e0
[  1103.179592]   RsP<FFFF880257A4FDD8>
[  1103.834682] CR2: 00000000000002C0
[  1104.043682] kernel panic - not syncing: Fatal exception in interrupt
[  1104.789539] Offset: 0x0 from 0xffffffff81000000 (relocation range:
0xffffffff80000000-0xffffffff9fffffff)

>Is this a regression? If so, do you know what the last kernel version
>was that worked?

I think this is a regression. 3.14 is the good working version. Today,
after I updated to 3.15-rc1,  evertime I tried connect my VPN, it
always panics.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 17:01     ` Zhan Jianyu
@ 2014-04-14 17:12       ` Eric Dumazet
  2014-04-14 17:34         ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2014-04-14 17:12 UTC (permalink / raw)
  To: Zhan Jianyu; +Cc: James Chapman, davem, Eric Dumazet, joe, LKML, netdev

On Tue, 2014-04-15 at 01:01 +0800, Zhan Jianyu wrote:
> On Mon, Apr 14, 2014 at 11:19 PM, James Chapman <jchapman@katalix.com> wrote:
> > Please send the complete oops message.
> 
> Hi, complete oops message is :
> 
> 
> [  100.243737] BUG: unable to handle kernel NULL pointer dereference
> at 00000000000002c0
> [  100.244985] IP: [<ffffffff815d4cb0>] ip_queue_xmit+0x20/0x3e0
> [  100.262266] PGD 0
> [  100.288395] Oops: 0000 [#1] SMP
> [  100.325955] Modules linked in: l2tp-ppp l2tp-netlink l2tp_core
> pppoe vmblock vsock vmmemctl vmhgfs acpiphp snd_ens1371 gameport
> snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm
> snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event
> snd_seq snd_timer snd_seq_device ppdev psmouse serio_raw fbcon
> tileblit font bitblit softcursor snd parport_pc soundcore
> snd_page_alloc vmci i2c_piix4 vga16fb vgastate intel_agp agpgart
> shpchp lp parport floppy pcnet32 mii mptspi mptscsih mptbase
> scsi_transport_spi vmxnet
> [  1100.472178] CPU: 0 PID: 1849 Comm: pppd Tainted: G        IOE
> 3.15.0-rc1+ #13
> [  1100.540018] Hardware name: LENOVO 20216/v2QY041, BIOS
> 74CN34WW(v2.0k) 07/04/2013
> [  1100.562844] task: ffff88003ee48000 ti: ffff880257a4e000 task.ti:
> ffff880257a4e000
> [  1100.584351] RIP: 0010: [<ffffffff815d4cb0>]  [<ffffffff815d4cb0>]
> ip_queue_xmit+0x20/0x3e0
> [  1100.584351] RSP: 0018: ffff880257a4fdd8 EFLAGS: 00010293
> [  1100.609358] RAX: ffff88003fe48e00 RBX:ffff88003f5ad100 RCX:0000000000000014
> [  1100.631467] RDX: ffffffff81ccab40   RSI:  ffff88003fe49110
> RDI:ffff88003f5ad100
> [  1100.657664] RBP: ffff880257a4fe00 R08:  ffff88807ff06476
> R09:0000000000000022
> [  1100.657694] R10:  ffffe90009726600 R11:  ffff88003fe49110
> R12:0000000000000000
> [  1100.657694] R13:  ffff88025a087480 R14:  ffff880025b561400
> R15:ffff88003fe49110
> [  1100.688994] FS:    00007f8857f8f840(0000)
> GS:ffff88026f200000(0000) knlGS:000000000000
> [  1100.697694] CS: 0010 DS: 0000 ES: 0000 CR0:0000000080050033
> [  1100.697694] CR2:0000000000002c0 CR3: 0000000260190e00 CR4: 00000000001407f0
> [  1100.706083] Stack:
> [  1100.731783]  ffff88025cadbe00 ffff88003fe48e00 0000000000000022
> ffff88025b561400
> [  1100.759324]  ffff88003f5ad100  ffff880257a4fe68 ffffffffa081a6e5
> ffff88003f5ad100
> [  1100.811396]  ffff880257a4fe28  ffffffff8158686e   ffff880257a4fe68
>  ffffffff000000008
> [  1100.891922] Call Trace:
> [  1100.916257]  [<ffffffffa081a6e5>] l2tp_xmit_skb+0x335/0x6c0 [l2tp_core]
> [  1100.943670]  [<ffffffff8158686e>] ? skb_free_head+0x1e/0x80
> [  1100.970905]  [<ffffffffa0830da1>] pppol2tp_xmit+0x141/0x210 [l2tp_ppp]
> [  1100.995542]  [<ffffffffa0805230>] ppp_channel_push+0x50/0xd0 [ppp_generic]
> [  1101.024087]  [<ffffffffa0805523>] ppp_write+0xa3/0xec [ppp_generic]
> [  1101.133447]  [<ffffffff811c7bfa>] vfs_write+0xba/0x1e0
> [  1101.228987]  [<ffffffff811c8746>] Sys_wirte+0x46/0xb0
> [  1101.428494]  [<ffffffff8110aca6>] ? __audit_syscall_exit+0x1f6/0x2a0
> [  1101.898345]  [<ffffffff816a90e9>] system_call_fastpath+0x16/0x1b
> [  1102.024087] Code: e9 31 fe ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00
> 55 48 89 e5 41 57 49 89 f7 41 55 41 54 53 48 89 fb 4c 8b 67 18 4c 8b
> 67 18 4c 8b 6f 58 <4d> 8b b4 24 c0 02 00 00 49 83 e5 fe 0f 84 3e 02 00
> 00 4d 85 f6
> [  1102.879592] RIP: [<ffffffff815d4cb0>] ip_queue_xmit+0x20/0x3e0
> [  1103.179592]   RsP<FFFF880257A4FDD8>
> [  1103.834682] CR2: 00000000000002C0
> [  1104.043682] kernel panic - not syncing: Fatal exception in interrupt
> [  1104.789539] Offset: 0x0 from 0xffffffff81000000 (relocation range:
> 0xffffffff80000000-0xffffffff9fffffff)
> 
> >Is this a regression? If so, do you know what the last kernel version
> >was that worked?
> 
> I think this is a regression. 3.14 is the good working version. Today,
> after I updated to 3.15-rc1,  evertime I tried connect my VPN, it
> always panics.
> --

Hmm, it seems commit 31c70d5956fc l2tp: keep original skb ownership
is the problem.

ip_queue_xmit() assumes the socket attached to skb is an inet socket.




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 17:12       ` Eric Dumazet
@ 2014-04-14 17:34         ` David Miller
  2014-04-14 17:40           ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2014-04-14 17:34 UTC (permalink / raw)
  To: eric.dumazet; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 14 Apr 2014 10:12:29 -0700

> Hmm, it seems commit 31c70d5956fc l2tp: keep original skb ownership
> is the problem.
> 
> ip_queue_xmit() assumes the socket attached to skb is an inet socket.

This is similar to the "send over AF_PACKET" issue we were discussing
the other week.

It seems we need a real resolution to this issue.

To recap:

1) We want to charge memory to the top-level socket.

2) However during encapsulations etc. we can end up in IP stack
   which expects only IP sockets to be attached to skb->sk

I suspect that in the short term we may have to bit the bullet and
compromise #2, and do flow control via the tunnel's socket.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 17:34         ` David Miller
@ 2014-04-14 17:40           ` Eric Dumazet
  2014-04-14 17:49             ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2014-04-14 17:40 UTC (permalink / raw)
  To: David Miller; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

On Mon, 2014-04-14 at 13:34 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 14 Apr 2014 10:12:29 -0700
> 
> > Hmm, it seems commit 31c70d5956fc l2tp: keep original skb ownership
> > is the problem.
> > 
> > ip_queue_xmit() assumes the socket attached to skb is an inet socket.
> 
> This is similar to the "send over AF_PACKET" issue we were discussing
> the other week.
> 
> It seems we need a real resolution to this issue.
> 
> To recap:
> 
> 1) We want to charge memory to the top-level socket.
> 
> 2) However during encapsulations etc. we can end up in IP stack
>    which expects only IP sockets to be attached to skb->sk
> 
> I suspect that in the short term we may have to bit the bullet and
> compromise #2, and do flow control via the tunnel's socket.

Or add a sk parameter to ip_queue_xmit(), to break the assumption "sk =
skb->sk", which happened to be generally true, but is ambiguous for IP
tunnels.




^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 17:40           ` Eric Dumazet
@ 2014-04-14 17:49             ` David Miller
  2014-04-14 17:54               ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2014-04-14 17:49 UTC (permalink / raw)
  To: eric.dumazet; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 14 Apr 2014 10:40:04 -0700

> On Mon, 2014-04-14 at 13:34 -0400, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Mon, 14 Apr 2014 10:12:29 -0700
>> 
>> > Hmm, it seems commit 31c70d5956fc l2tp: keep original skb ownership
>> > is the problem.
>> > 
>> > ip_queue_xmit() assumes the socket attached to skb is an inet socket.
>> 
>> This is similar to the "send over AF_PACKET" issue we were discussing
>> the other week.
>> 
>> It seems we need a real resolution to this issue.
>> 
>> To recap:
>> 
>> 1) We want to charge memory to the top-level socket.
>> 
>> 2) However during encapsulations etc. we can end up in IP stack
>>    which expects only IP sockets to be attached to skb->sk
>> 
>> I suspect that in the short term we may have to bit the bullet and
>> compromise #2, and do flow control via the tunnel's socket.
> 
> Or add a sk parameter to ip_queue_xmit(), to break the assumption "sk =
> skb->sk", which happened to be generally true, but is ambiguous for IP
> tunnels.

Yep that would handle the l2tp cases, but vxlan needs something different
since it goes through iptunnel_xmit().

So perhaps as a quick fix we can add an 'sk' arg to both
ip_queue_xmit() and ip_local_out().

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 17:49             ` David Miller
@ 2014-04-14 17:54               ` Eric Dumazet
  2014-04-14 18:02                 ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2014-04-14 17:54 UTC (permalink / raw)
  To: David Miller; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

On Mon, 2014-04-14 at 13:49 -0400, David Miller wrote:

> Yep that would handle the l2tp cases, but vxlan needs something different
> since it goes through iptunnel_xmit().
> 
> So perhaps as a quick fix we can add an 'sk' arg to both
> ip_queue_xmit() and ip_local_out().

Right, I am working on this right now.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 17:54               ` Eric Dumazet
@ 2014-04-14 18:02                 ` David Miller
  2014-04-14 18:19                   ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2014-04-14 18:02 UTC (permalink / raw)
  To: eric.dumazet; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 14 Apr 2014 10:54:42 -0700

> On Mon, 2014-04-14 at 13:49 -0400, David Miller wrote:
> 
>> Yep that would handle the l2tp cases, but vxlan needs something different
>> since it goes through iptunnel_xmit().
>> 
>> So perhaps as a quick fix we can add an 'sk' arg to both
>> ip_queue_xmit() and ip_local_out().
> 
> Right, I am working on this right now.

Thanks a lot Eric.

BTW, it occurs to me that there may be some other spots in the output path
that expect that if SKB is ETH_P_IP then skb->sk is IP socket.  For example
somewhere in netfilter or packet classifier paths.

Just FYI... that was one of the things I was going to audit.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 18:02                 ` David Miller
@ 2014-04-14 18:19                   ` Eric Dumazet
  2014-04-14 18:48                     ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2014-04-14 18:19 UTC (permalink / raw)
  To: David Miller; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

On Mon, 2014-04-14 at 14:02 -0400, David Miller wrote:

> BTW, it occurs to me that there may be some other spots in the output path
> that expect that if SKB is ETH_P_IP then skb->sk is IP socket.  For example
> somewhere in netfilter or packet classifier paths.
> 
> Just FYI... that was one of the things I was going to audit.

packet classifiers cannot have such assumptions for sure.

About iptunnel_xmit(), I do not think there is an issue, as we do not
enter ip_queue_xmit() on this path, but ip_local_out().

ip_local_out() doesn't use skb->sk , unless some netfilter module
uses this. I am not sure how the previous behavior could be useful in
this case, as all packets were sharing same socket ownership.

net/netfilter/xt_owner.c for example has better coverage if it can
really have a pointer to the user socket, not the internal socket used
by l2tp or vxlan tunnel.





^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 18:19                   ` Eric Dumazet
@ 2014-04-14 18:48                     ` David Miller
  2014-04-14 19:06                       ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2014-04-14 18:48 UTC (permalink / raw)
  To: eric.dumazet; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 14 Apr 2014 11:19:26 -0700

> ip_local_out() doesn't use skb->sk

It does Eric.

We had just such a report with this in the backtrace, when AF_PACKET
sends over vxlan devices.

The problem is ip_mc_output().

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 18:48                     ` David Miller
@ 2014-04-14 19:06                       ` Eric Dumazet
  2014-04-14 19:22                         ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2014-04-14 19:06 UTC (permalink / raw)
  To: David Miller; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

On Mon, 2014-04-14 at 14:48 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 14 Apr 2014 11:19:26 -0700
> 
> > ip_local_out() doesn't use skb->sk
> 
> It does Eric.
> 

Hmmm, right...

> We had just such a report with this in the backtrace, when AF_PACKET
> sends over vxlan devices.
> 
> The problem is ip_mc_output().

So this means that : User socket wanted sk_mc_loop(sk), but because
vxlan changed skb->sk to internal socket, we were doing something else
anyway.

There are a lot of undocumented features here... like

 	skb->priority = sk->sk_priority;
 	skb->mark = sk->sk_mark;

which socket do we want here (in ip_queue_xmit()) ?

Its interesting to see ip6_xmit() already has a 'struct sock *sk'
parameter...

This was the preliminary patch I tested :

 include/net/inet6_connection_sock.h |    2 +-
 include/net/inet_connection_sock.h  |    2 +-
 include/net/ip.h                    |    2 +-
 net/dccp/output.c                   |    2 +-
 net/ipv4/ip_output.c                |    5 +++--
 net/ipv4/tcp_output.c               |    2 +-
 net/ipv6/inet6_connection_sock.c    |    3 +--
 net/l2tp/l2tp_core.c                |    4 ++--
 net/l2tp/l2tp_ip.c                  |    2 +-
 net/sctp/protocol.c                 |    2 +-
 10 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/include/net/inet6_connection_sock.h b/include/net/inet6_connection_sock.h
index f981ba7adeed..74af137304be 100644
--- a/include/net/inet6_connection_sock.h
+++ b/include/net/inet6_connection_sock.h
@@ -40,7 +40,7 @@ void inet6_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
 
 void inet6_csk_addr2sockaddr(struct sock *sk, struct sockaddr *uaddr);
 
-int inet6_csk_xmit(struct sk_buff *skb, struct flowi *fl);
+int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 
 struct dst_entry *inet6_csk_update_pmtu(struct sock *sk, u32 mtu);
 #endif /* _INET6_CONNECTION_SOCK_H */
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index c55aeed41ace..7a4313887568 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -36,7 +36,7 @@ struct tcp_congestion_ops;
  * (i.e. things that depend on the address family)
  */
 struct inet_connection_sock_af_ops {
-	int	    (*queue_xmit)(struct sk_buff *skb, struct flowi *fl);
+	int	    (*queue_xmit)(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 	void	    (*send_check)(struct sock *sk, struct sk_buff *skb);
 	int	    (*rebuild_header)(struct sock *sk);
 	void	    (*sk_rx_dst_set)(struct sock *sk, const struct sk_buff *skb);
diff --git a/include/net/ip.h b/include/net/ip.h
index 25064c28e059..77e73d293e09 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -111,7 +111,7 @@ int ip_do_nat(struct sk_buff *skb);
 void ip_send_check(struct iphdr *ip);
 int __ip_local_out(struct sk_buff *skb);
 int ip_local_out(struct sk_buff *skb);
-int ip_queue_xmit(struct sk_buff *skb, struct flowi *fl);
+int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 void ip_init(void);
 int ip_append_data(struct sock *sk, struct flowi4 *fl4,
 		   int getfrag(void *from, char *to, int offset, int len,
diff --git a/net/dccp/output.c b/net/dccp/output.c
index 8876078859da..0248e8a3460c 100644
--- a/net/dccp/output.c
+++ b/net/dccp/output.c
@@ -138,7 +138,7 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb)
 
 		DCCP_INC_STATS(DCCP_MIB_OUTSEGS);
 
-		err = icsk->icsk_af_ops->queue_xmit(skb, &inet->cork.fl);
+		err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
 		return net_xmit_eval(err);
 	}
 	return -ENOBUFS;
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 1a0755fea491..7ad68b860935 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -315,9 +315,9 @@ static void ip_copy_addrs(struct iphdr *iph, const struct flowi4 *fl4)
 	       sizeof(fl4->saddr) + sizeof(fl4->daddr));
 }
 
-int ip_queue_xmit(struct sk_buff *skb, struct flowi *fl)
+/* Note: skb->sk can be different from sk, in case of tunnels */
+int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl)
 {
-	struct sock *sk = skb->sk;
 	struct inet_sock *inet = inet_sk(sk);
 	struct ip_options_rcu *inet_opt;
 	struct flowi4 *fl4;
@@ -389,6 +389,7 @@ packet_routed:
 	ip_select_ident_more(skb, &rt->dst, sk,
 			     (skb_shinfo(skb)->gso_segs ?: 1) - 1);
 
+	/* TODO : should we use skb->sk here instead of sk ? */
 	skb->priority = sk->sk_priority;
 	skb->mark = sk->sk_mark;
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 699fb102e971..025e25093984 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -981,7 +981,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		TCP_ADD_STATS(sock_net(sk), TCP_MIB_OUTSEGS,
 			      tcp_skb_pcount(skb));
 
-	err = icsk->icsk_af_ops->queue_xmit(skb, &inet->cork.fl);
+	err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
 	if (likely(err <= 0))
 		return err;
 
diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index c9138189415a..d4ade34ab375 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -224,9 +224,8 @@ static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
 	return dst;
 }
 
-int inet6_csk_xmit(struct sk_buff *skb, struct flowi *fl_unused)
+int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl_unused)
 {
-	struct sock *sk = skb->sk;
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct flowi6 fl6;
 	struct dst_entry *dst;
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 47f7a5490555..a4e37d7158dc 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1131,10 +1131,10 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb,
 	skb->local_df = 1;
 #if IS_ENABLED(CONFIG_IPV6)
 	if (tunnel->sock->sk_family == PF_INET6 && !tunnel->v4mapped)
-		error = inet6_csk_xmit(skb, NULL);
+		error = inet6_csk_xmit(tunnel->sock, skb, NULL);
 	else
 #endif
-		error = ip_queue_xmit(skb, fl);
+		error = ip_queue_xmit(tunnel->sock, skb, fl);
 
 	/* Update stats */
 	if (error >= 0) {
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 0b44d855269c..3397fe6897c0 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -487,7 +487,7 @@ static int l2tp_ip_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 
 xmit:
 	/* Queue the packet to IP for output */
-	rc = ip_queue_xmit(skb, &inet->cork.fl);
+	rc = ip_queue_xmit(sk, skb, &inet->cork.fl);
 	rcu_read_unlock();
 
 error:
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 4e1d0fcb028e..c09757fbf803 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -957,7 +957,7 @@ static inline int sctp_v4_xmit(struct sk_buff *skb,
 
 	SCTP_INC_STATS(sock_net(&inet->sk), SCTP_MIB_OUTSCTPPACKS);
 
-	return ip_queue_xmit(skb, &transport->fl);
+	return ip_queue_xmit(&inet->sk, skb, &transport->fl);
 }
 
 static struct sctp_af sctp_af_inet;



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 19:06                       ` Eric Dumazet
@ 2014-04-14 19:22                         ` David Miller
  2014-04-14 21:23                           ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2014-04-14 19:22 UTC (permalink / raw)
  To: eric.dumazet; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 14 Apr 2014 12:06:36 -0700

> On Mon, 2014-04-14 at 14:48 -0400, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Mon, 14 Apr 2014 11:19:26 -0700
>> 
>> > ip_local_out() doesn't use skb->sk
>> 
>> It does Eric.
>> 
> 
> Hmmm, right...
> 
>> We had just such a report with this in the backtrace, when AF_PACKET
>> sends over vxlan devices.
>> 
>> The problem is ip_mc_output().
> 
> So this means that : User socket wanted sk_mc_loop(sk), but because
> vxlan changed skb->sk to internal socket, we were doing something else
> anyway.

Actually the exact opposite is happening.  vxlan does not override skb->sk,
and leaves it as AF_PACKET's socket.  Then we crash in ip_mc_output() because
it only expects IP sockets, not AF_PACKET ones, attached to skb->sk.

> There are a lot of undocumented features here... like
> 
>  	skb->priority = sk->sk_priority;
>  	skb->mark = sk->sk_mark;
> 
> which socket do we want here (in ip_queue_xmit()) ?

Probably that of tunnel.

> Its interesting to see ip6_xmit() already has a 'struct sock *sk'
> parameter...
> 
> This was the preliminary patch I tested :

You still need to make similar transformation to ip_local_out() as mentioned
above.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 19:22                         ` David Miller
@ 2014-04-14 21:23                           ` Eric Dumazet
  2014-04-14 22:51                             ` David Miller
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2014-04-14 21:23 UTC (permalink / raw)
  To: David Miller; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

On Mon, 2014-04-14 at 15:22 -0400, David Miller wrote:


> 
> You still need to make similar transformation to ip_local_out() as mentioned
> above.

I was trying to cook a not too invasive patch.

Changing ip_local_out() means changing dst_output() / nf_hook()  and
hundred of call sites.

Unless I misunderstood you ?



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 21:23                           ` Eric Dumazet
@ 2014-04-14 22:51                             ` David Miller
  2014-04-15  0:14                               ` Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: David Miller @ 2014-04-14 22:51 UTC (permalink / raw)
  To: eric.dumazet; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 14 Apr 2014 14:23:24 -0700

> I was trying to cook a not too invasive patch.
> 
> Changing ip_local_out() means changing dst_output() / nf_hook()  and
> hundred of call sites.
> 
> Unless I misunderstood you ?

You could make ip_local_out(skb) be __ip_local_out(skb, skb->sk) and
make tunnel output use __ip_local_out().

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-14 22:51                             ` David Miller
@ 2014-04-15  0:14                               ` Eric Dumazet
  2014-04-15  6:32                                 ` Zhan Jianyu
  0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2014-04-15  0:14 UTC (permalink / raw)
  To: David Miller; +Cc: nasa4836, jchapman, edumazet, joe, linux-kernel, netdev

On Mon, 2014-04-14 at 18:51 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 14 Apr 2014 14:23:24 -0700
> 
> > I was trying to cook a not too invasive patch.
> > 
> > Changing ip_local_out() means changing dst_output() / nf_hook()  and
> > hundred of call sites.
> > 
> > Unless I misunderstood you ?
> 
> You could make ip_local_out(skb) be __ip_local_out(skb, skb->sk) and
> make tunnel output use __ip_local_out().

OK lets try ;)

Zhan, could you try following patch, thanks !

 drivers/net/vxlan.c                 |    4 ++--
 include/net/dst.h                   |   14 +++++++++++---
 include/net/inet6_connection_sock.h |    2 +-
 include/net/inet_connection_sock.h  |    2 +-
 include/net/ip.h                    |   13 +++++++++----
 include/net/ip_tunnels.h            |    2 +-
 net/core/dst.c                      |    4 ++--
 net/dccp/output.c                   |    2 +-
 net/ipv4/ip_output.c                |   16 ++++++++--------
 net/ipv4/ip_tunnel.c                |    2 +-
 net/ipv4/ip_tunnel_core.c           |    4 ++--
 net/ipv4/route.c                    |    4 ++--
 net/ipv4/tcp_output.c               |    2 +-
 net/ipv6/inet6_connection_sock.c    |    3 +--
 net/ipv6/sit.c                      |    5 +++--
 net/l2tp/l2tp_core.c                |    4 ++--
 net/l2tp/l2tp_ip.c                  |    2 +-
 net/openvswitch/vport-gre.c         |    2 +-
 net/sctp/protocol.c                 |    2 +-
 19 files changed, 51 insertions(+), 38 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index c55e316373a1..82355d5d155a 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1755,8 +1755,8 @@ int vxlan_xmit_skb(struct vxlan_sock *vs,
 	if (err)
 		return err;
 
-	return iptunnel_xmit(rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df,
-			     false);
+	return iptunnel_xmit(vs->sock->sk, rt, skb, src, dst, IPPROTO_UDP,
+			     tos, ttl, df, false);
 }
 EXPORT_SYMBOL_GPL(vxlan_xmit_skb);
 
diff --git a/include/net/dst.h b/include/net/dst.h
index 46ed958e0c6e..71c60f42be48 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -45,7 +45,7 @@ struct dst_entry {
 	void			*__pad1;
 #endif
 	int			(*input)(struct sk_buff *);
-	int			(*output)(struct sk_buff *);
+	int			(*output)(struct sock *sk, struct sk_buff *skb);
 
 	unsigned short		flags;
 #define DST_HOST		0x0001
@@ -367,7 +367,11 @@ static inline struct dst_entry *skb_dst_pop(struct sk_buff *skb)
 	return child;
 }
 
-int dst_discard(struct sk_buff *skb);
+int dst_discard_sk(struct sock *sk, struct sk_buff *skb);
+static inline int dst_discard(struct sk_buff *skb)
+{
+	return dst_discard_sk(skb->sk, skb);
+}
 void *dst_alloc(struct dst_ops *ops, struct net_device *dev, int initial_ref,
 		int initial_obsolete, unsigned short flags);
 void __dst_free(struct dst_entry *dst);
@@ -449,9 +453,13 @@ static inline void dst_set_expires(struct dst_entry *dst, int timeout)
 }
 
 /* Output packet to network from transport.  */
+static inline int dst_output_sk(struct sock *sk, struct sk_buff *skb)
+{
+	return skb_dst(skb)->output(sk, skb);
+}
 static inline int dst_output(struct sk_buff *skb)
 {
-	return skb_dst(skb)->output(skb);
+	return dst_output_sk(skb->sk, skb);
 }
 
 /* Input packet from network to transport.  */
diff --git a/include/net/inet6_connection_sock.h b/include/net/inet6_connection_sock.h
index f981ba7adeed..74af137304be 100644
--- a/include/net/inet6_connection_sock.h
+++ b/include/net/inet6_connection_sock.h
@@ -40,7 +40,7 @@ void inet6_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
 
 void inet6_csk_addr2sockaddr(struct sock *sk, struct sockaddr *uaddr);
 
-int inet6_csk_xmit(struct sk_buff *skb, struct flowi *fl);
+int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 
 struct dst_entry *inet6_csk_update_pmtu(struct sock *sk, u32 mtu);
 #endif /* _INET6_CONNECTION_SOCK_H */
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index c55aeed41ace..7a4313887568 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -36,7 +36,7 @@ struct tcp_congestion_ops;
  * (i.e. things that depend on the address family)
  */
 struct inet_connection_sock_af_ops {
-	int	    (*queue_xmit)(struct sk_buff *skb, struct flowi *fl);
+	int	    (*queue_xmit)(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 	void	    (*send_check)(struct sock *sk, struct sk_buff *skb);
 	int	    (*rebuild_header)(struct sock *sk);
 	void	    (*sk_rx_dst_set)(struct sock *sk, const struct sk_buff *skb);
diff --git a/include/net/ip.h b/include/net/ip.h
index 25064c28e059..3ec2b0fb9d83 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -104,14 +104,19 @@ int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
 	   struct net_device *orig_dev);
 int ip_local_deliver(struct sk_buff *skb);
 int ip_mr_input(struct sk_buff *skb);
-int ip_output(struct sk_buff *skb);
-int ip_mc_output(struct sk_buff *skb);
+int ip_output(struct sock *sk, struct sk_buff *skb);
+int ip_mc_output(struct sock *sk, struct sk_buff *skb);
 int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *));
 int ip_do_nat(struct sk_buff *skb);
 void ip_send_check(struct iphdr *ip);
 int __ip_local_out(struct sk_buff *skb);
-int ip_local_out(struct sk_buff *skb);
-int ip_queue_xmit(struct sk_buff *skb, struct flowi *fl);
+int ip_local_out_sk(struct sock *sk, struct sk_buff *skb);
+static inline int ip_local_out(struct sk_buff *skb)
+{
+	return ip_local_out_sk(skb->sk, skb);
+}
+
+int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 void ip_init(void);
 int ip_append_data(struct sock *sk, struct flowi4 *fl4,
 		   int getfrag(void *from, char *to, int offset, int len,
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index e77c10405d51..a4daf9eb8562 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -153,7 +153,7 @@ static inline u8 ip_tunnel_ecn_encap(u8 tos, const struct iphdr *iph,
 }
 
 int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, __be16 inner_proto);
-int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
+int iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb,
 		  __be32 src, __be32 dst, __u8 proto,
 		  __u8 tos, __u8 ttl, __be16 df, bool xnet);
 
diff --git a/net/core/dst.c b/net/core/dst.c
index ca4231ec7347..7443c2725c9c 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -142,12 +142,12 @@ loop:
 	mutex_unlock(&dst_gc_mutex);
 }
 
-int dst_discard(struct sk_buff *skb)
+int dst_discard_sk(struct sock *sk, struct sk_buff *skb)
 {
 	kfree_skb(skb);
 	return 0;
 }
-EXPORT_SYMBOL(dst_discard);
+EXPORT_SYMBOL(dst_discard_sk);
 
 const u32 dst_default_metrics[RTAX_MAX + 1] = {
 	/* This initializer is needed to force linker to place this variable
diff --git a/net/dccp/output.c b/net/dccp/output.c
index 8876078859da..0248e8a3460c 100644
--- a/net/dccp/output.c
+++ b/net/dccp/output.c
@@ -138,7 +138,7 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb)
 
 		DCCP_INC_STATS(DCCP_MIB_OUTSEGS);
 
-		err = icsk->icsk_af_ops->queue_xmit(skb, &inet->cork.fl);
+		err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
 		return net_xmit_eval(err);
 	}
 	return -ENOBUFS;
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 1a0755fea491..1cbeba5edff9 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -101,17 +101,17 @@ int __ip_local_out(struct sk_buff *skb)
 		       skb_dst(skb)->dev, dst_output);
 }
 
-int ip_local_out(struct sk_buff *skb)
+int ip_local_out_sk(struct sock *sk, struct sk_buff *skb)
 {
 	int err;
 
 	err = __ip_local_out(skb);
 	if (likely(err == 1))
-		err = dst_output(skb);
+		err = dst_output_sk(sk, skb);
 
 	return err;
 }
-EXPORT_SYMBOL_GPL(ip_local_out);
+EXPORT_SYMBOL_GPL(ip_local_out_sk);
 
 static inline int ip_select_ttl(struct inet_sock *inet, struct dst_entry *dst)
 {
@@ -226,9 +226,8 @@ static int ip_finish_output(struct sk_buff *skb)
 		return ip_finish_output2(skb);
 }
 
-int ip_mc_output(struct sk_buff *skb)
+int ip_mc_output(struct sock *sk, struct sk_buff *skb)
 {
-	struct sock *sk = skb->sk;
 	struct rtable *rt = skb_rtable(skb);
 	struct net_device *dev = rt->dst.dev;
 
@@ -287,7 +286,7 @@ int ip_mc_output(struct sk_buff *skb)
 			    !(IPCB(skb)->flags & IPSKB_REROUTED));
 }
 
-int ip_output(struct sk_buff *skb)
+int ip_output(struct sock *sk, struct sk_buff *skb)
 {
 	struct net_device *dev = skb_dst(skb)->dev;
 
@@ -315,9 +314,9 @@ static void ip_copy_addrs(struct iphdr *iph, const struct flowi4 *fl4)
 	       sizeof(fl4->saddr) + sizeof(fl4->daddr));
 }
 
-int ip_queue_xmit(struct sk_buff *skb, struct flowi *fl)
+/* Note: skb->sk can be different from sk, in case of tunnels */
+int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl)
 {
-	struct sock *sk = skb->sk;
 	struct inet_sock *inet = inet_sk(sk);
 	struct ip_options_rcu *inet_opt;
 	struct flowi4 *fl4;
@@ -389,6 +388,7 @@ packet_routed:
 	ip_select_ident_more(skb, &rt->dst, sk,
 			     (skb_shinfo(skb)->gso_segs ?: 1) - 1);
 
+	/* TODO : should we use skb->sk here instead of sk ? */
 	skb->priority = sk->sk_priority;
 	skb->mark = sk->sk_mark;
 
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index e77381d1df9a..484d0ce27ef7 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -670,7 +670,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
 		return;
 	}
 
-	err = iptunnel_xmit(rt, skb, fl4.saddr, fl4.daddr, protocol,
+	err = iptunnel_xmit(skb->sk, rt, skb, fl4.saddr, fl4.daddr, protocol,
 			    tos, ttl, df, !net_eq(tunnel->net, dev_net(dev)));
 	iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
 
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index e0c2b1d2ea4e..bcf206c79005 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -46,7 +46,7 @@
 #include <net/netns/generic.h>
 #include <net/rtnetlink.h>
 
-int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
+int iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb,
 		  __be32 src, __be32 dst, __u8 proto,
 		  __u8 tos, __u8 ttl, __be16 df, bool xnet)
 {
@@ -76,7 +76,7 @@ int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
 	iph->ttl	=	ttl;
 	__ip_select_ident(iph, &rt->dst, (skb_shinfo(skb)->gso_segs ?: 1) - 1);
 
-	err = ip_local_out(skb);
+	err = ip_local_out_sk(sk, skb);
 	if (unlikely(net_xmit_eval(err)))
 		pkt_len = 0;
 	return pkt_len;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 34d094cadb11..f2279d4470c4 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1129,7 +1129,7 @@ static void ipv4_link_failure(struct sk_buff *skb)
 		dst_set_expires(&rt->dst, 0);
 }
 
-static int ip_rt_bug(struct sk_buff *skb)
+static int ip_rt_bug(struct sock *sk, struct sk_buff *skb)
 {
 	pr_debug("%s: %pI4 -> %pI4, %s\n",
 		 __func__, &ip_hdr(skb)->saddr, &ip_hdr(skb)->daddr,
@@ -2218,7 +2218,7 @@ struct dst_entry *ipv4_blackhole_route(struct net *net, struct dst_entry *dst_or
 
 		new->__use = 1;
 		new->input = dst_discard;
-		new->output = dst_discard;
+		new->output = dst_discard_sk;
 
 		new->dev = ort->dst.dev;
 		if (new->dev)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 699fb102e971..025e25093984 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -981,7 +981,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		TCP_ADD_STATS(sock_net(sk), TCP_MIB_OUTSEGS,
 			      tcp_skb_pcount(skb));
 
-	err = icsk->icsk_af_ops->queue_xmit(skb, &inet->cork.fl);
+	err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
 	if (likely(err <= 0))
 		return err;
 
diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index c9138189415a..d4ade34ab375 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -224,9 +224,8 @@ static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
 	return dst;
 }
 
-int inet6_csk_xmit(struct sk_buff *skb, struct flowi *fl_unused)
+int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl_unused)
 {
-	struct sock *sk = skb->sk;
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct flowi6 fl6;
 	struct dst_entry *dst;
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 1693c8d885f0..8da8268d65f8 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -974,8 +974,9 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
 		goto out;
 	}
 
-	err = iptunnel_xmit(rt, skb, fl4.saddr, fl4.daddr, IPPROTO_IPV6, tos,
-			    ttl, df, !net_eq(tunnel->net, dev_net(dev)));
+	err = iptunnel_xmit(skb->sk, rt, skb, fl4.saddr, fl4.daddr,
+			    IPPROTO_IPV6, tos, ttl, df,
+			    !net_eq(tunnel->net, dev_net(dev)));
 	iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
 	return NETDEV_TX_OK;
 
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 47f7a5490555..a4e37d7158dc 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1131,10 +1131,10 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb,
 	skb->local_df = 1;
 #if IS_ENABLED(CONFIG_IPV6)
 	if (tunnel->sock->sk_family == PF_INET6 && !tunnel->v4mapped)
-		error = inet6_csk_xmit(skb, NULL);
+		error = inet6_csk_xmit(tunnel->sock, skb, NULL);
 	else
 #endif
-		error = ip_queue_xmit(skb, fl);
+		error = ip_queue_xmit(tunnel->sock, skb, fl);
 
 	/* Update stats */
 	if (error >= 0) {
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 0b44d855269c..3397fe6897c0 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -487,7 +487,7 @@ static int l2tp_ip_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 
 xmit:
 	/* Queue the packet to IP for output */
-	rc = ip_queue_xmit(skb, &inet->cork.fl);
+	rc = ip_queue_xmit(sk, skb, &inet->cork.fl);
 	rcu_read_unlock();
 
 error:
diff --git a/net/openvswitch/vport-gre.c b/net/openvswitch/vport-gre.c
index a3d6951602db..ebb6e2442554 100644
--- a/net/openvswitch/vport-gre.c
+++ b/net/openvswitch/vport-gre.c
@@ -174,7 +174,7 @@ static int gre_tnl_send(struct vport *vport, struct sk_buff *skb)
 
 	skb->local_df = 1;
 
-	return iptunnel_xmit(rt, skb, fl.saddr,
+	return iptunnel_xmit(skb->sk, rt, skb, fl.saddr,
 			     OVS_CB(skb)->tun_key->ipv4_dst, IPPROTO_GRE,
 			     OVS_CB(skb)->tun_key->ipv4_tos,
 			     OVS_CB(skb)->tun_key->ipv4_ttl, df, false);
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 4e1d0fcb028e..c09757fbf803 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -957,7 +957,7 @@ static inline int sctp_v4_xmit(struct sk_buff *skb,
 
 	SCTP_INC_STATS(sock_net(&inet->sk), SCTP_MIB_OUTSCTPPACKS);
 
-	return ip_queue_xmit(skb, &transport->fl);
+	return ip_queue_xmit(&inet->sk, skb, &transport->fl);
 }
 
 static struct sctp_af sctp_af_inet;



^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [BUG] A panic caused by null pointer dereference aftering updating to
  2014-04-15  0:14                               ` Eric Dumazet
@ 2014-04-15  6:32                                 ` Zhan Jianyu
  2014-04-15 14:33                                   ` [PATCH net] ipv4: add a sock pointer to ip_queue_xmit() and friends Eric Dumazet
  0 siblings, 1 reply; 20+ messages in thread
From: Zhan Jianyu @ 2014-04-15  6:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, James Chapman, Eric Dumazet, joe, LKML, netdev

On Tue, Apr 15, 2014 at 8:14 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> Zhan, could you try following patch, thanks !
>
>  drivers/net/vxlan.c                 |    4 ++--
>  include/net/dst.h                   |   14 +++++++++++---
>  include/net/inet6_connection_sock.h |    2 +-
>  include/net/inet_connection_sock.h  |    2 +-
>  include/net/ip.h                    |   13 +++++++++----
>  include/net/ip_tunnels.h            |    2 +-
>  net/core/dst.c                      |    4 ++--
>  net/dccp/output.c                   |    2 +-
>  net/ipv4/ip_output.c                |   16 ++++++++--------
>  net/ipv4/ip_tunnel.c                |    2 +-
>  net/ipv4/ip_tunnel_core.c           |    4 ++--
>  net/ipv4/route.c                    |    4 ++--
>  net/ipv4/tcp_output.c               |    2 +-
>  net/ipv6/inet6_connection_sock.c    |    3 +--
>  net/ipv6/sit.c                      |    5 +++--
>  net/l2tp/l2tp_core.c                |    4 ++--
>  net/l2tp/l2tp_ip.c                  |    2 +-
>  net/openvswitch/vport-gre.c         |    2 +-
>  net/sctp/protocol.c                 |    2 +-
>  19 files changed, 51 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
> index c55e316373a1..82355d5d155a 100644
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -1755,8 +1755,8 @@ int vxlan_xmit_skb(struct vxlan_sock *vs,
>         if (err)
>                 return err;
>
> -       return iptunnel_xmit(rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df,
> -                            false);
> +       return iptunnel_xmit(vs->sock->sk, rt, skb, src, dst, IPPROTO_UDP,
> +                            tos, ttl, df, false);
>  }
>  EXPORT_SYMBOL_GPL(vxlan_xmit_skb);
>
> diff --git a/include/net/dst.h b/include/net/dst.h
> index 46ed958e0c6e..71c60f42be48 100644
> --- a/include/net/dst.h
> +++ b/include/net/dst.h
> @@ -45,7 +45,7 @@ struct dst_entry {
>         void                    *__pad1;
>  #endif
>         int                     (*input)(struct sk_buff *);
> -       int                     (*output)(struct sk_buff *);
> +       int                     (*output)(struct sock *sk, struct sk_buff *skb);
>
>         unsigned short          flags;
>  #define DST_HOST               0x0001
> @@ -367,7 +367,11 @@ static inline struct dst_entry *skb_dst_pop(struct sk_buff *skb)
>         return child;
>  }
>
> -int dst_discard(struct sk_buff *skb);
> +int dst_discard_sk(struct sock *sk, struct sk_buff *skb);
> +static inline int dst_discard(struct sk_buff *skb)
> +{
> +       return dst_discard_sk(skb->sk, skb);
> +}
>  void *dst_alloc(struct dst_ops *ops, struct net_device *dev, int initial_ref,
>                 int initial_obsolete, unsigned short flags);
>  void __dst_free(struct dst_entry *dst);
> @@ -449,9 +453,13 @@ static inline void dst_set_expires(struct dst_entry *dst, int timeout)
>  }
>
>  /* Output packet to network from transport.  */
> +static inline int dst_output_sk(struct sock *sk, struct sk_buff *skb)
> +{
> +       return skb_dst(skb)->output(sk, skb);
> +}
>  static inline int dst_output(struct sk_buff *skb)
>  {
> -       return skb_dst(skb)->output(skb);
> +       return dst_output_sk(skb->sk, skb);
>  }
>
>  /* Input packet from network to transport.  */
> diff --git a/include/net/inet6_connection_sock.h b/include/net/inet6_connection_sock.h
> index f981ba7adeed..74af137304be 100644
> --- a/include/net/inet6_connection_sock.h
> +++ b/include/net/inet6_connection_sock.h
> @@ -40,7 +40,7 @@ void inet6_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
>
>  void inet6_csk_addr2sockaddr(struct sock *sk, struct sockaddr *uaddr);
>
> -int inet6_csk_xmit(struct sk_buff *skb, struct flowi *fl);
> +int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
>
>  struct dst_entry *inet6_csk_update_pmtu(struct sock *sk, u32 mtu);
>  #endif /* _INET6_CONNECTION_SOCK_H */
> diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
> index c55aeed41ace..7a4313887568 100644
> --- a/include/net/inet_connection_sock.h
> +++ b/include/net/inet_connection_sock.h
> @@ -36,7 +36,7 @@ struct tcp_congestion_ops;
>   * (i.e. things that depend on the address family)
>   */
>  struct inet_connection_sock_af_ops {
> -       int         (*queue_xmit)(struct sk_buff *skb, struct flowi *fl);
> +       int         (*queue_xmit)(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
>         void        (*send_check)(struct sock *sk, struct sk_buff *skb);
>         int         (*rebuild_header)(struct sock *sk);
>         void        (*sk_rx_dst_set)(struct sock *sk, const struct sk_buff *skb);
> diff --git a/include/net/ip.h b/include/net/ip.h
> index 25064c28e059..3ec2b0fb9d83 100644
> --- a/include/net/ip.h
> +++ b/include/net/ip.h
> @@ -104,14 +104,19 @@ int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
>            struct net_device *orig_dev);
>  int ip_local_deliver(struct sk_buff *skb);
>  int ip_mr_input(struct sk_buff *skb);
> -int ip_output(struct sk_buff *skb);
> -int ip_mc_output(struct sk_buff *skb);
> +int ip_output(struct sock *sk, struct sk_buff *skb);
> +int ip_mc_output(struct sock *sk, struct sk_buff *skb);
>  int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *));
>  int ip_do_nat(struct sk_buff *skb);
>  void ip_send_check(struct iphdr *ip);
>  int __ip_local_out(struct sk_buff *skb);
> -int ip_local_out(struct sk_buff *skb);
> -int ip_queue_xmit(struct sk_buff *skb, struct flowi *fl);
> +int ip_local_out_sk(struct sock *sk, struct sk_buff *skb);
> +static inline int ip_local_out(struct sk_buff *skb)
> +{
> +       return ip_local_out_sk(skb->sk, skb);
> +}
> +
> +int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
>  void ip_init(void);
>  int ip_append_data(struct sock *sk, struct flowi4 *fl4,
>                    int getfrag(void *from, char *to, int offset, int len,
> diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
> index e77c10405d51..a4daf9eb8562 100644
> --- a/include/net/ip_tunnels.h
> +++ b/include/net/ip_tunnels.h
> @@ -153,7 +153,7 @@ static inline u8 ip_tunnel_ecn_encap(u8 tos, const struct iphdr *iph,
>  }
>
>  int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, __be16 inner_proto);
> -int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
> +int iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb,
>                   __be32 src, __be32 dst, __u8 proto,
>                   __u8 tos, __u8 ttl, __be16 df, bool xnet);
>
> diff --git a/net/core/dst.c b/net/core/dst.c
> index ca4231ec7347..7443c2725c9c 100644
> --- a/net/core/dst.c
> +++ b/net/core/dst.c
> @@ -142,12 +142,12 @@ loop:
>         mutex_unlock(&dst_gc_mutex);
>  }
>
> -int dst_discard(struct sk_buff *skb)
> +int dst_discard_sk(struct sock *sk, struct sk_buff *skb)
>  {
>         kfree_skb(skb);
>         return 0;
>  }
> -EXPORT_SYMBOL(dst_discard);
> +EXPORT_SYMBOL(dst_discard_sk);
>
>  const u32 dst_default_metrics[RTAX_MAX + 1] = {
>         /* This initializer is needed to force linker to place this variable
> diff --git a/net/dccp/output.c b/net/dccp/output.c
> index 8876078859da..0248e8a3460c 100644
> --- a/net/dccp/output.c
> +++ b/net/dccp/output.c
> @@ -138,7 +138,7 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb)
>
>                 DCCP_INC_STATS(DCCP_MIB_OUTSEGS);
>
> -               err = icsk->icsk_af_ops->queue_xmit(skb, &inet->cork.fl);
> +               err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
>                 return net_xmit_eval(err);
>         }
>         return -ENOBUFS;
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 1a0755fea491..1cbeba5edff9 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -101,17 +101,17 @@ int __ip_local_out(struct sk_buff *skb)
>                        skb_dst(skb)->dev, dst_output);
>  }
>
> -int ip_local_out(struct sk_buff *skb)
> +int ip_local_out_sk(struct sock *sk, struct sk_buff *skb)
>  {
>         int err;
>
>         err = __ip_local_out(skb);
>         if (likely(err == 1))
> -               err = dst_output(skb);
> +               err = dst_output_sk(sk, skb);
>
>         return err;
>  }
> -EXPORT_SYMBOL_GPL(ip_local_out);
> +EXPORT_SYMBOL_GPL(ip_local_out_sk);
>
>  static inline int ip_select_ttl(struct inet_sock *inet, struct dst_entry *dst)
>  {
> @@ -226,9 +226,8 @@ static int ip_finish_output(struct sk_buff *skb)
>                 return ip_finish_output2(skb);
>  }
>
> -int ip_mc_output(struct sk_buff *skb)
> +int ip_mc_output(struct sock *sk, struct sk_buff *skb)
>  {
> -       struct sock *sk = skb->sk;
>         struct rtable *rt = skb_rtable(skb);
>         struct net_device *dev = rt->dst.dev;
>
> @@ -287,7 +286,7 @@ int ip_mc_output(struct sk_buff *skb)
>                             !(IPCB(skb)->flags & IPSKB_REROUTED));
>  }
>
> -int ip_output(struct sk_buff *skb)
> +int ip_output(struct sock *sk, struct sk_buff *skb)
>  {
>         struct net_device *dev = skb_dst(skb)->dev;
>
> @@ -315,9 +314,9 @@ static void ip_copy_addrs(struct iphdr *iph, const struct flowi4 *fl4)
>                sizeof(fl4->saddr) + sizeof(fl4->daddr));
>  }
>
> -int ip_queue_xmit(struct sk_buff *skb, struct flowi *fl)
> +/* Note: skb->sk can be different from sk, in case of tunnels */
> +int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl)
>  {
> -       struct sock *sk = skb->sk;
>         struct inet_sock *inet = inet_sk(sk);
>         struct ip_options_rcu *inet_opt;
>         struct flowi4 *fl4;
> @@ -389,6 +388,7 @@ packet_routed:
>         ip_select_ident_more(skb, &rt->dst, sk,
>                              (skb_shinfo(skb)->gso_segs ?: 1) - 1);
>
> +       /* TODO : should we use skb->sk here instead of sk ? */
>         skb->priority = sk->sk_priority;
>         skb->mark = sk->sk_mark;
>
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index e77381d1df9a..484d0ce27ef7 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -670,7 +670,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
>                 return;
>         }
>
> -       err = iptunnel_xmit(rt, skb, fl4.saddr, fl4.daddr, protocol,
> +       err = iptunnel_xmit(skb->sk, rt, skb, fl4.saddr, fl4.daddr, protocol,
>                             tos, ttl, df, !net_eq(tunnel->net, dev_net(dev)));
>         iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
>
> diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
> index e0c2b1d2ea4e..bcf206c79005 100644
> --- a/net/ipv4/ip_tunnel_core.c
> +++ b/net/ipv4/ip_tunnel_core.c
> @@ -46,7 +46,7 @@
>  #include <net/netns/generic.h>
>  #include <net/rtnetlink.h>
>
> -int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
> +int iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb,
>                   __be32 src, __be32 dst, __u8 proto,
>                   __u8 tos, __u8 ttl, __be16 df, bool xnet)
>  {
> @@ -76,7 +76,7 @@ int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
>         iph->ttl        =       ttl;
>         __ip_select_ident(iph, &rt->dst, (skb_shinfo(skb)->gso_segs ?: 1) - 1);
>
> -       err = ip_local_out(skb);
> +       err = ip_local_out_sk(sk, skb);
>         if (unlikely(net_xmit_eval(err)))
>                 pkt_len = 0;
>         return pkt_len;
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 34d094cadb11..f2279d4470c4 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -1129,7 +1129,7 @@ static void ipv4_link_failure(struct sk_buff *skb)
>                 dst_set_expires(&rt->dst, 0);
>  }
>
> -static int ip_rt_bug(struct sk_buff *skb)
> +static int ip_rt_bug(struct sock *sk, struct sk_buff *skb)
>  {
>         pr_debug("%s: %pI4 -> %pI4, %s\n",
>                  __func__, &ip_hdr(skb)->saddr, &ip_hdr(skb)->daddr,
> @@ -2218,7 +2218,7 @@ struct dst_entry *ipv4_blackhole_route(struct net *net, struct dst_entry *dst_or
>
>                 new->__use = 1;
>                 new->input = dst_discard;
> -               new->output = dst_discard;
> +               new->output = dst_discard_sk;
>
>                 new->dev = ort->dst.dev;
>                 if (new->dev)
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 699fb102e971..025e25093984 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -981,7 +981,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
>                 TCP_ADD_STATS(sock_net(sk), TCP_MIB_OUTSEGS,
>                               tcp_skb_pcount(skb));
>
> -       err = icsk->icsk_af_ops->queue_xmit(skb, &inet->cork.fl);
> +       err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
>         if (likely(err <= 0))
>                 return err;
>
> diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
> index c9138189415a..d4ade34ab375 100644
> --- a/net/ipv6/inet6_connection_sock.c
> +++ b/net/ipv6/inet6_connection_sock.c
> @@ -224,9 +224,8 @@ static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
>         return dst;
>  }
>
> -int inet6_csk_xmit(struct sk_buff *skb, struct flowi *fl_unused)
> +int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl_unused)
>  {
> -       struct sock *sk = skb->sk;
>         struct ipv6_pinfo *np = inet6_sk(sk);
>         struct flowi6 fl6;
>         struct dst_entry *dst;
> diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
> index 1693c8d885f0..8da8268d65f8 100644
> --- a/net/ipv6/sit.c
> +++ b/net/ipv6/sit.c
> @@ -974,8 +974,9 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
>                 goto out;
>         }
>
> -       err = iptunnel_xmit(rt, skb, fl4.saddr, fl4.daddr, IPPROTO_IPV6, tos,
> -                           ttl, df, !net_eq(tunnel->net, dev_net(dev)));
> +       err = iptunnel_xmit(skb->sk, rt, skb, fl4.saddr, fl4.daddr,
> +                           IPPROTO_IPV6, tos, ttl, df,
> +                           !net_eq(tunnel->net, dev_net(dev)));
>         iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
>         return NETDEV_TX_OK;
>
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 47f7a5490555..a4e37d7158dc 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -1131,10 +1131,10 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb,
>         skb->local_df = 1;
>  #if IS_ENABLED(CONFIG_IPV6)
>         if (tunnel->sock->sk_family == PF_INET6 && !tunnel->v4mapped)
> -               error = inet6_csk_xmit(skb, NULL);
> +               error = inet6_csk_xmit(tunnel->sock, skb, NULL);
>         else
>  #endif
> -               error = ip_queue_xmit(skb, fl);
> +               error = ip_queue_xmit(tunnel->sock, skb, fl);
>
>         /* Update stats */
>         if (error >= 0) {
> diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
> index 0b44d855269c..3397fe6897c0 100644
> --- a/net/l2tp/l2tp_ip.c
> +++ b/net/l2tp/l2tp_ip.c
> @@ -487,7 +487,7 @@ static int l2tp_ip_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
>
>  xmit:
>         /* Queue the packet to IP for output */
> -       rc = ip_queue_xmit(skb, &inet->cork.fl);
> +       rc = ip_queue_xmit(sk, skb, &inet->cork.fl);
>         rcu_read_unlock();
>
>  error:
> diff --git a/net/openvswitch/vport-gre.c b/net/openvswitch/vport-gre.c
> index a3d6951602db..ebb6e2442554 100644
> --- a/net/openvswitch/vport-gre.c
> +++ b/net/openvswitch/vport-gre.c
> @@ -174,7 +174,7 @@ static int gre_tnl_send(struct vport *vport, struct sk_buff *skb)
>
>         skb->local_df = 1;
>
> -       return iptunnel_xmit(rt, skb, fl.saddr,
> +       return iptunnel_xmit(skb->sk, rt, skb, fl.saddr,
>                              OVS_CB(skb)->tun_key->ipv4_dst, IPPROTO_GRE,
>                              OVS_CB(skb)->tun_key->ipv4_tos,
>                              OVS_CB(skb)->tun_key->ipv4_ttl, df, false);

Hi, Eric,

I've applied the patch, it seems now works for me, at least no panic any more:-)
Thanks for all you guys.

Tested-by: Jianyu Zhan <nasa4836@gmail.com>


Regards,
Jianyu Zhan

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH net] ipv4: add a sock pointer to ip_queue_xmit() and friends
  2014-04-15  6:32                                 ` Zhan Jianyu
@ 2014-04-15 14:33                                   ` Eric Dumazet
  2014-04-15 17:16                                     ` David Miller
  2014-04-21 13:20                                     ` lucien xin
  0 siblings, 2 replies; 20+ messages in thread
From: Eric Dumazet @ 2014-04-15 14:33 UTC (permalink / raw)
  To: Zhan Jianyu, David Miller; +Cc: James Chapman, netdev, lucien xin

From: Eric Dumazet <edumazet@google.com>

ip_queue_xmit() assumes the skb it has to transmit is attached to an
inet socket. Commit 31c70d5956fc ("l2tp: keep original skb ownership")
changed l2tp to not change skb ownership and thus broke this assumption.

One fix is to add a new 'struct sock *sk' parameter to ip_queue_xmit(),
so that we do not assume skb->sk points to the socket used by l2tp
tunnel.

The same problem happened with vxlan and ip_mc_output() : The
sk_mc_loop() test triggers a WARN_ON() when the provider of the packet
is an AF_PACKET socket.

Same fix for this problem : dst->output() method gets an additional
'struct sock *sk' parameter. This needs a cascade of changes so that
this parameter can be propagated from vxlan to final consumer.

With help from David Miller.

Reported-by: Zhan Jianyu <nasa4836@gmail.com>
Tested-by: Zhan Jianyu <nasa4836@gmail.com>
Reported-by: lucien xin <lucien.xin@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Fixes: 31c70d5956fc ("l2tp: keep original skb ownership")
Fixes: 8f646c922d55 ("vxlan: keep original skb ownership")
---

David, you asked me to take care of both problems, I now realize the two
patches are not in the same linux version, so it might be needed to
split in two separate patches to ease backport.

Lucien, can you test this patch solves the problem you reported ?

 drivers/net/vxlan.c                 |    4 ++--
 include/net/dst.h                   |   14 +++++++++++---
 include/net/inet6_connection_sock.h |    2 +-
 include/net/inet_connection_sock.h  |    2 +-
 include/net/ip.h                    |   13 +++++++++----
 include/net/ip_tunnels.h            |    2 +-
 net/core/dst.c                      |    4 ++--
 net/dccp/output.c                   |    2 +-
 net/ipv4/ip_output.c                |   16 ++++++++--------
 net/ipv4/ip_tunnel.c                |    2 +-
 net/ipv4/ip_tunnel_core.c           |    4 ++--
 net/ipv4/route.c                    |    4 ++--
 net/ipv4/tcp_output.c               |    2 +-
 net/ipv6/inet6_connection_sock.c    |    3 +--
 net/ipv6/sit.c                      |    5 +++--
 net/l2tp/l2tp_core.c                |    4 ++--
 net/l2tp/l2tp_ip.c                  |    2 +-
 net/openvswitch/vport-gre.c         |    2 +-
 net/sctp/protocol.c                 |    2 +-
 19 files changed, 51 insertions(+), 38 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index c55e316373a1..82355d5d155a 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1755,8 +1755,8 @@ int vxlan_xmit_skb(struct vxlan_sock *vs,
 	if (err)
 		return err;
 
-	return iptunnel_xmit(rt, skb, src, dst, IPPROTO_UDP, tos, ttl, df,
-			     false);
+	return iptunnel_xmit(vs->sock->sk, rt, skb, src, dst, IPPROTO_UDP,
+			     tos, ttl, df, false);
 }
 EXPORT_SYMBOL_GPL(vxlan_xmit_skb);
 
diff --git a/include/net/dst.h b/include/net/dst.h
index 46ed958e0c6e..71c60f42be48 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -45,7 +45,7 @@ struct dst_entry {
 	void			*__pad1;
 #endif
 	int			(*input)(struct sk_buff *);
-	int			(*output)(struct sk_buff *);
+	int			(*output)(struct sock *sk, struct sk_buff *skb);
 
 	unsigned short		flags;
 #define DST_HOST		0x0001
@@ -367,7 +367,11 @@ static inline struct dst_entry *skb_dst_pop(struct sk_buff *skb)
 	return child;
 }
 
-int dst_discard(struct sk_buff *skb);
+int dst_discard_sk(struct sock *sk, struct sk_buff *skb);
+static inline int dst_discard(struct sk_buff *skb)
+{
+	return dst_discard_sk(skb->sk, skb);
+}
 void *dst_alloc(struct dst_ops *ops, struct net_device *dev, int initial_ref,
 		int initial_obsolete, unsigned short flags);
 void __dst_free(struct dst_entry *dst);
@@ -449,9 +453,13 @@ static inline void dst_set_expires(struct dst_entry *dst, int timeout)
 }
 
 /* Output packet to network from transport.  */
+static inline int dst_output_sk(struct sock *sk, struct sk_buff *skb)
+{
+	return skb_dst(skb)->output(sk, skb);
+}
 static inline int dst_output(struct sk_buff *skb)
 {
-	return skb_dst(skb)->output(skb);
+	return dst_output_sk(skb->sk, skb);
 }
 
 /* Input packet from network to transport.  */
diff --git a/include/net/inet6_connection_sock.h b/include/net/inet6_connection_sock.h
index f981ba7adeed..74af137304be 100644
--- a/include/net/inet6_connection_sock.h
+++ b/include/net/inet6_connection_sock.h
@@ -40,7 +40,7 @@ void inet6_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
 
 void inet6_csk_addr2sockaddr(struct sock *sk, struct sockaddr *uaddr);
 
-int inet6_csk_xmit(struct sk_buff *skb, struct flowi *fl);
+int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 
 struct dst_entry *inet6_csk_update_pmtu(struct sock *sk, u32 mtu);
 #endif /* _INET6_CONNECTION_SOCK_H */
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index c55aeed41ace..7a4313887568 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -36,7 +36,7 @@ struct tcp_congestion_ops;
  * (i.e. things that depend on the address family)
  */
 struct inet_connection_sock_af_ops {
-	int	    (*queue_xmit)(struct sk_buff *skb, struct flowi *fl);
+	int	    (*queue_xmit)(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 	void	    (*send_check)(struct sock *sk, struct sk_buff *skb);
 	int	    (*rebuild_header)(struct sock *sk);
 	void	    (*sk_rx_dst_set)(struct sock *sk, const struct sk_buff *skb);
diff --git a/include/net/ip.h b/include/net/ip.h
index 25064c28e059..3ec2b0fb9d83 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -104,14 +104,19 @@ int ip_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt,
 	   struct net_device *orig_dev);
 int ip_local_deliver(struct sk_buff *skb);
 int ip_mr_input(struct sk_buff *skb);
-int ip_output(struct sk_buff *skb);
-int ip_mc_output(struct sk_buff *skb);
+int ip_output(struct sock *sk, struct sk_buff *skb);
+int ip_mc_output(struct sock *sk, struct sk_buff *skb);
 int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *));
 int ip_do_nat(struct sk_buff *skb);
 void ip_send_check(struct iphdr *ip);
 int __ip_local_out(struct sk_buff *skb);
-int ip_local_out(struct sk_buff *skb);
-int ip_queue_xmit(struct sk_buff *skb, struct flowi *fl);
+int ip_local_out_sk(struct sock *sk, struct sk_buff *skb);
+static inline int ip_local_out(struct sk_buff *skb)
+{
+	return ip_local_out_sk(skb->sk, skb);
+}
+
+int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 void ip_init(void);
 int ip_append_data(struct sock *sk, struct flowi4 *fl4,
 		   int getfrag(void *from, char *to, int offset, int len,
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index e77c10405d51..a4daf9eb8562 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -153,7 +153,7 @@ static inline u8 ip_tunnel_ecn_encap(u8 tos, const struct iphdr *iph,
 }
 
 int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, __be16 inner_proto);
-int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
+int iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb,
 		  __be32 src, __be32 dst, __u8 proto,
 		  __u8 tos, __u8 ttl, __be16 df, bool xnet);
 
diff --git a/net/core/dst.c b/net/core/dst.c
index ca4231ec7347..7443c2725c9c 100644
--- a/net/core/dst.c
+++ b/net/core/dst.c
@@ -142,12 +142,12 @@ loop:
 	mutex_unlock(&dst_gc_mutex);
 }
 
-int dst_discard(struct sk_buff *skb)
+int dst_discard_sk(struct sock *sk, struct sk_buff *skb)
 {
 	kfree_skb(skb);
 	return 0;
 }
-EXPORT_SYMBOL(dst_discard);
+EXPORT_SYMBOL(dst_discard_sk);
 
 const u32 dst_default_metrics[RTAX_MAX + 1] = {
 	/* This initializer is needed to force linker to place this variable
diff --git a/net/dccp/output.c b/net/dccp/output.c
index 8876078859da..0248e8a3460c 100644
--- a/net/dccp/output.c
+++ b/net/dccp/output.c
@@ -138,7 +138,7 @@ static int dccp_transmit_skb(struct sock *sk, struct sk_buff *skb)
 
 		DCCP_INC_STATS(DCCP_MIB_OUTSEGS);
 
-		err = icsk->icsk_af_ops->queue_xmit(skb, &inet->cork.fl);
+		err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
 		return net_xmit_eval(err);
 	}
 	return -ENOBUFS;
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 1a0755fea491..1cbeba5edff9 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -101,17 +101,17 @@ int __ip_local_out(struct sk_buff *skb)
 		       skb_dst(skb)->dev, dst_output);
 }
 
-int ip_local_out(struct sk_buff *skb)
+int ip_local_out_sk(struct sock *sk, struct sk_buff *skb)
 {
 	int err;
 
 	err = __ip_local_out(skb);
 	if (likely(err == 1))
-		err = dst_output(skb);
+		err = dst_output_sk(sk, skb);
 
 	return err;
 }
-EXPORT_SYMBOL_GPL(ip_local_out);
+EXPORT_SYMBOL_GPL(ip_local_out_sk);
 
 static inline int ip_select_ttl(struct inet_sock *inet, struct dst_entry *dst)
 {
@@ -226,9 +226,8 @@ static int ip_finish_output(struct sk_buff *skb)
 		return ip_finish_output2(skb);
 }
 
-int ip_mc_output(struct sk_buff *skb)
+int ip_mc_output(struct sock *sk, struct sk_buff *skb)
 {
-	struct sock *sk = skb->sk;
 	struct rtable *rt = skb_rtable(skb);
 	struct net_device *dev = rt->dst.dev;
 
@@ -287,7 +286,7 @@ int ip_mc_output(struct sk_buff *skb)
 			    !(IPCB(skb)->flags & IPSKB_REROUTED));
 }
 
-int ip_output(struct sk_buff *skb)
+int ip_output(struct sock *sk, struct sk_buff *skb)
 {
 	struct net_device *dev = skb_dst(skb)->dev;
 
@@ -315,9 +314,9 @@ static void ip_copy_addrs(struct iphdr *iph, const struct flowi4 *fl4)
 	       sizeof(fl4->saddr) + sizeof(fl4->daddr));
 }
 
-int ip_queue_xmit(struct sk_buff *skb, struct flowi *fl)
+/* Note: skb->sk can be different from sk, in case of tunnels */
+int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl)
 {
-	struct sock *sk = skb->sk;
 	struct inet_sock *inet = inet_sk(sk);
 	struct ip_options_rcu *inet_opt;
 	struct flowi4 *fl4;
@@ -389,6 +388,7 @@ packet_routed:
 	ip_select_ident_more(skb, &rt->dst, sk,
 			     (skb_shinfo(skb)->gso_segs ?: 1) - 1);
 
+	/* TODO : should we use skb->sk here instead of sk ? */
 	skb->priority = sk->sk_priority;
 	skb->mark = sk->sk_mark;
 
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index e77381d1df9a..484d0ce27ef7 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -670,7 +670,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
 		return;
 	}
 
-	err = iptunnel_xmit(rt, skb, fl4.saddr, fl4.daddr, protocol,
+	err = iptunnel_xmit(skb->sk, rt, skb, fl4.saddr, fl4.daddr, protocol,
 			    tos, ttl, df, !net_eq(tunnel->net, dev_net(dev)));
 	iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
 
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index e0c2b1d2ea4e..bcf206c79005 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -46,7 +46,7 @@
 #include <net/netns/generic.h>
 #include <net/rtnetlink.h>
 
-int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
+int iptunnel_xmit(struct sock *sk, struct rtable *rt, struct sk_buff *skb,
 		  __be32 src, __be32 dst, __u8 proto,
 		  __u8 tos, __u8 ttl, __be16 df, bool xnet)
 {
@@ -76,7 +76,7 @@ int iptunnel_xmit(struct rtable *rt, struct sk_buff *skb,
 	iph->ttl	=	ttl;
 	__ip_select_ident(iph, &rt->dst, (skb_shinfo(skb)->gso_segs ?: 1) - 1);
 
-	err = ip_local_out(skb);
+	err = ip_local_out_sk(sk, skb);
 	if (unlikely(net_xmit_eval(err)))
 		pkt_len = 0;
 	return pkt_len;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 34d094cadb11..f2279d4470c4 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1129,7 +1129,7 @@ static void ipv4_link_failure(struct sk_buff *skb)
 		dst_set_expires(&rt->dst, 0);
 }
 
-static int ip_rt_bug(struct sk_buff *skb)
+static int ip_rt_bug(struct sock *sk, struct sk_buff *skb)
 {
 	pr_debug("%s: %pI4 -> %pI4, %s\n",
 		 __func__, &ip_hdr(skb)->saddr, &ip_hdr(skb)->daddr,
@@ -2218,7 +2218,7 @@ struct dst_entry *ipv4_blackhole_route(struct net *net, struct dst_entry *dst_or
 
 		new->__use = 1;
 		new->input = dst_discard;
-		new->output = dst_discard;
+		new->output = dst_discard_sk;
 
 		new->dev = ort->dst.dev;
 		if (new->dev)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 699fb102e971..025e25093984 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -981,7 +981,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
 		TCP_ADD_STATS(sock_net(sk), TCP_MIB_OUTSEGS,
 			      tcp_skb_pcount(skb));
 
-	err = icsk->icsk_af_ops->queue_xmit(skb, &inet->cork.fl);
+	err = icsk->icsk_af_ops->queue_xmit(sk, skb, &inet->cork.fl);
 	if (likely(err <= 0))
 		return err;
 
diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index c9138189415a..d4ade34ab375 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -224,9 +224,8 @@ static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
 	return dst;
 }
 
-int inet6_csk_xmit(struct sk_buff *skb, struct flowi *fl_unused)
+int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl_unused)
 {
-	struct sock *sk = skb->sk;
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct flowi6 fl6;
 	struct dst_entry *dst;
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 1693c8d885f0..8da8268d65f8 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -974,8 +974,9 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
 		goto out;
 	}
 
-	err = iptunnel_xmit(rt, skb, fl4.saddr, fl4.daddr, IPPROTO_IPV6, tos,
-			    ttl, df, !net_eq(tunnel->net, dev_net(dev)));
+	err = iptunnel_xmit(skb->sk, rt, skb, fl4.saddr, fl4.daddr,
+			    IPPROTO_IPV6, tos, ttl, df,
+			    !net_eq(tunnel->net, dev_net(dev)));
 	iptunnel_xmit_stats(err, &dev->stats, dev->tstats);
 	return NETDEV_TX_OK;
 
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 47f7a5490555..a4e37d7158dc 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1131,10 +1131,10 @@ static int l2tp_xmit_core(struct l2tp_session *session, struct sk_buff *skb,
 	skb->local_df = 1;
 #if IS_ENABLED(CONFIG_IPV6)
 	if (tunnel->sock->sk_family == PF_INET6 && !tunnel->v4mapped)
-		error = inet6_csk_xmit(skb, NULL);
+		error = inet6_csk_xmit(tunnel->sock, skb, NULL);
 	else
 #endif
-		error = ip_queue_xmit(skb, fl);
+		error = ip_queue_xmit(tunnel->sock, skb, fl);
 
 	/* Update stats */
 	if (error >= 0) {
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 0b44d855269c..3397fe6897c0 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -487,7 +487,7 @@ static int l2tp_ip_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 
 xmit:
 	/* Queue the packet to IP for output */
-	rc = ip_queue_xmit(skb, &inet->cork.fl);
+	rc = ip_queue_xmit(sk, skb, &inet->cork.fl);
 	rcu_read_unlock();
 
 error:
diff --git a/net/openvswitch/vport-gre.c b/net/openvswitch/vport-gre.c
index a3d6951602db..ebb6e2442554 100644
--- a/net/openvswitch/vport-gre.c
+++ b/net/openvswitch/vport-gre.c
@@ -174,7 +174,7 @@ static int gre_tnl_send(struct vport *vport, struct sk_buff *skb)
 
 	skb->local_df = 1;
 
-	return iptunnel_xmit(rt, skb, fl.saddr,
+	return iptunnel_xmit(skb->sk, rt, skb, fl.saddr,
 			     OVS_CB(skb)->tun_key->ipv4_dst, IPPROTO_GRE,
 			     OVS_CB(skb)->tun_key->ipv4_tos,
 			     OVS_CB(skb)->tun_key->ipv4_ttl, df, false);
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 4e1d0fcb028e..c09757fbf803 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -957,7 +957,7 @@ static inline int sctp_v4_xmit(struct sk_buff *skb,
 
 	SCTP_INC_STATS(sock_net(&inet->sk), SCTP_MIB_OUTSCTPPACKS);
 
-	return ip_queue_xmit(skb, &transport->fl);
+	return ip_queue_xmit(&inet->sk, skb, &transport->fl);
 }
 
 static struct sctp_af sctp_af_inet;

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH net] ipv4: add a sock pointer to ip_queue_xmit() and friends
  2014-04-15 14:33                                   ` [PATCH net] ipv4: add a sock pointer to ip_queue_xmit() and friends Eric Dumazet
@ 2014-04-15 17:16                                     ` David Miller
  2014-04-15 20:06                                       ` Eric Dumazet
  2014-04-21 13:20                                     ` lucien xin
  1 sibling, 1 reply; 20+ messages in thread
From: David Miller @ 2014-04-15 17:16 UTC (permalink / raw)
  To: eric.dumazet; +Cc: nasa4836, jchapman, netdev, lucien.xin

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 15 Apr 2014 07:33:12 -0700

> diff --git a/net/core/dst.c b/net/core/dst.c
> index ca4231ec7347..7443c2725c9c 100644
> --- a/net/core/dst.c
> +++ b/net/core/dst.c
> @@ -142,12 +142,12 @@ loop:
>  	mutex_unlock(&dst_gc_mutex);
>  }
>  
> -int dst_discard(struct sk_buff *skb)
> +int dst_discard_sk(struct sock *sk, struct sk_buff *skb)
>  {
>  	kfree_skb(skb);
>  	return 0;
>  }
> -EXPORT_SYMBOL(dst_discard);
> +EXPORT_SYMBOL(dst_discard_sk);
>  
>  const u32 dst_default_metrics[RTAX_MAX + 1] = {
>  	/* This initializer is needed to force linker to place this variable

This doesn't compile, I'd expect to see hunks such as:

@@ -184,7 +184,7 @@ void *dst_alloc(struct dst_ops *ops, struct net_device *dev,
 	dst->xfrm = NULL;
 #endif
 	dst->input = dst_discard;
-	dst->output = dst_discard;
+	dst->output = dst_discard_sk;
 	dst->error = 0;
 	dst->obsolete = initial_obsolete;
 	dst->header_len = 0;

for this file, but I don't.  Maybe you posted an outdated revision of your
patch?  There also aren't the necessary decnet and XFRM changes, layers
which also use dst->output().

Anyways, I'm splitting up your patches and sorting this out.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net] ipv4: add a sock pointer to ip_queue_xmit() and friends
  2014-04-15 17:16                                     ` David Miller
@ 2014-04-15 20:06                                       ` Eric Dumazet
  0 siblings, 0 replies; 20+ messages in thread
From: Eric Dumazet @ 2014-04-15 20:06 UTC (permalink / raw)
  To: David Miller; +Cc: nasa4836, jchapman, netdev, lucien.xin

On Tue, 2014-04-15 at 13:16 -0400, David Miller wrote:

> This doesn't compile, I'd expect to see hunks such as:
> 
> @@ -184,7 +184,7 @@ void *dst_alloc(struct dst_ops *ops, struct net_device *dev,
>  	dst->xfrm = NULL;
>  #endif
>  	dst->input = dst_discard;
> -	dst->output = dst_discard;
> +	dst->output = dst_discard_sk;
>  	dst->error = 0;
>  	dst->obsolete = initial_obsolete;
>  	dst->header_len = 0;
> 
> for this file, but I don't.  Maybe you posted an outdated revision of your
> patch?  There also aren't the necessary decnet and XFRM changes, layers
> which also use dst->output().
> 
> Anyways, I'm splitting up your patches and sorting this out.

Hmmpff. The truth is I did a "make allyesconfig;make -j24", and the
build completed, but I missed warnings.

And the thing even booted and ran properly.

Sorry for this, and thanks for taking care of the mess.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH net] ipv4: add a sock pointer to ip_queue_xmit() and friends
  2014-04-15 14:33                                   ` [PATCH net] ipv4: add a sock pointer to ip_queue_xmit() and friends Eric Dumazet
  2014-04-15 17:16                                     ` David Miller
@ 2014-04-21 13:20                                     ` lucien xin
  1 sibling, 0 replies; 20+ messages in thread
From: lucien xin @ 2014-04-21 13:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Zhan Jianyu, David Miller, James Chapman, network dev

On Tue, Apr 15, 2014 at 10:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Reported-by: Zhan Jianyu <nasa4836@gmail.com>
> Tested-by: Zhan Jianyu <nasa4836@gmail.com>
> Reported-by: lucien xin <lucien.xin@gmail.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: James Chapman <jchapman@katalix.com>
> Fixes: 31c70d5956fc ("l2tp: keep original skb ownership")
> Fixes: 8f646c922d55 ("vxlan: keep original skb ownership")
> ---
>
> David, you asked me to take care of both problems, I now realize the two
> patches are not in the same linux version, so it might be needed to
> split in two separate patches to ease backport.
>
> Lucien, can you test this patch solves the problem you reported ?
>

sorry for late,  I have tested vxlan dhclient with the patch, no WARN().

thanks,

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2014-04-21 13:20 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAHz2CGWs-JH0v3dWAMBDrKqkP6ygb48YZ-qkc=KsFqz5yO7f+Q@mail.gmail.com>
     [not found] ` <CAHz2CGU5aPA5QJ9ED9g5kahD_jJaeLF+VOotUSBFXDkY+EH8ow@mail.gmail.com>
2014-04-14 15:19   ` [BUG] A panic caused by null pointer dereference aftering updating to James Chapman
2014-04-14 17:01     ` Zhan Jianyu
2014-04-14 17:12       ` Eric Dumazet
2014-04-14 17:34         ` David Miller
2014-04-14 17:40           ` Eric Dumazet
2014-04-14 17:49             ` David Miller
2014-04-14 17:54               ` Eric Dumazet
2014-04-14 18:02                 ` David Miller
2014-04-14 18:19                   ` Eric Dumazet
2014-04-14 18:48                     ` David Miller
2014-04-14 19:06                       ` Eric Dumazet
2014-04-14 19:22                         ` David Miller
2014-04-14 21:23                           ` Eric Dumazet
2014-04-14 22:51                             ` David Miller
2014-04-15  0:14                               ` Eric Dumazet
2014-04-15  6:32                                 ` Zhan Jianyu
2014-04-15 14:33                                   ` [PATCH net] ipv4: add a sock pointer to ip_queue_xmit() and friends Eric Dumazet
2014-04-15 17:16                                     ` David Miller
2014-04-15 20:06                                       ` Eric Dumazet
2014-04-21 13:20                                     ` lucien xin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.