netdev.vger.kernel.org archive mirror
* ipv6: oops in datagram.c line 260
@ 2014-12-10  5:43 Chris Ruehl
  2014-12-24 13:42 ` Chris Ruehl
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Ruehl @ 2014-12-10  5:43 UTC (permalink / raw)
  To: netdev

Hi all,

We are running a Dell server which crashes frequently with vanilla 3.14.25
(Dell crash video snapshot).

The capture can be viewed here: http://www.gtsys.com.hk/~chris/datagram_c_line260.png

Sadly, the capture doesn't show the full trace, so we lack information.
The first line I can see in the crash video from the iDRAC is:
tcp_transmit_skb+0x461

RIP [<ffffffff815da587>] ipv6_local_error+0x17/0x140

The null pointer dereference happens here:
  Type "apropos word" to search for commands related to "word"...
Reading symbols from net/ipv6/datagram.o...done.
(gdb) list *(ipv6_local_error+0x17)
0xae7 is in ipv6_local_error (net/ipv6/datagram.c:260).
255        struct ipv6_pinfo *np = inet6_sk(sk);
256        struct sock_exterr_skb *serr;
257        struct ipv6hdr *iph;
258        struct sk_buff *skb;
259
260        if (!np->recverr)
261            return;
262
263        skb = alloc_skb(sizeof(struct ipv6hdr), GFP_ATOMIC);
264        if (!skb)
(gdb) quit


We are running 6in4 with an IPsec tunnel on the IPv6 side. I found a pull
request from Steffen Klassert here:
     http://article.gmane.org/gmane.linux.network/281469

which might be relevant to this problem.

For the time being I have added

         if (np == NULL){
                 LIMIT_NETDEBUG(KERN_DEBUG "ipv6_pinfo is NULL\n");
                 return;
         }

as a workaround to stop the server from crashing.


With kind regards
Chris
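
The workaround in this message can be illustrated with a small user-space model. This is only a sketch: `struct model_sock`, `struct model_ipv6_pinfo`, `model_inet6_sk()` and `model_ipv6_local_error()` are hypothetical stand-ins and not kernel code. It assumes the per-socket IPv6 state pointer is simply NULL for a socket that was not created as an IPv6 socket, which is what the oops at line 260 suggests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-socket IPv6 state. */
struct model_ipv6_pinfo {
    int recverr;
};

/* Hypothetical stand-in for a socket; pinet6 stays NULL for IPv4 sockets. */
struct model_sock {
    struct model_ipv6_pinfo *pinet6;
};

/* Returns the IPv6 state of the socket, or NULL if it has none. */
static struct model_ipv6_pinfo *model_inet6_sk(const struct model_sock *sk)
{
    assert(sk != NULL);
    return sk->pinet6;
}

/*
 * Returns 1 if an error would be queued, 0 if it is skipped because
 * recverr is off, and -1 if the socket has no IPv6 state at all.
 */
static int model_ipv6_local_error(const struct model_sock *sk)
{
    const struct model_ipv6_pinfo *np = model_inet6_sk(sk);

    if (np == NULL)     /* the guard added as a workaround */
        return -1;
    if (!np->recverr)   /* without the guard, this line dereferences NULL */
        return 0;
    return 1;
}
```

The guard only hides the symptom: it does not explain why an IPv6 error handler is reached for a socket without IPv6 state.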


* Re: ipv6: oops in datagram.c line 260
  2014-12-10  5:43 ipv6: oops in datagram.c line 260 Chris Ruehl
@ 2014-12-24 13:42 ` Chris Ruehl
  2015-01-06 16:01   ` Hannes Frederic Sowa
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Ruehl @ 2014-12-24 13:42 UTC (permalink / raw)
  To: netdev; +Cc: davem, steffen.klassert

Caught it!

I updated the kernel to 3.14.27, added a WARN_ON() to the function, and caught
the oops after 5 days.

As mentioned, we are running IPv6 in IPv4 with a couple of IPsec tunnels on the
IPv6 side.

Code change:
void ipv6_local_error(struct sock *sk, int err, struct flowi6 *fl6, u32 info)
{
         struct ipv6_pinfo *np = inet6_sk(sk);
         struct sock_exterr_skb *serr;
         struct ipv6hdr *iph;
         struct sk_buff *skb;

         if (np == NULL){
                 LIMIT_NETDEBUG(KERN_CRIT "ipv6_pinfo is NULL\n");
                 WARN_ON(1);
                 return;
         }



[447604.244357] ipv6_pinfo is NULL
[447604.273733] ------------[ cut here ]------------
[447604.303628] WARNING: CPU: 7 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[447604.366173] Modules linked in: ipmi_si vhost_net vhost macvtap macvlan 
xt_policy authenc esp6 xfrm4_mode_tunnel xfrm6_mode_tunnel mpt3sas mpt2sas 
raid_class scsi_transport_sas mptctl mptbase ipt_MASQUERADE iptable_nat 
nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack 
ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp ipmi_devintf dell_rbu 
ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables 
x_tables xfrm_user xfrm4_tunnel ipcomp xfrm_ipcomp esp4 ah4 deflate ctr 
twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 
twofish_common camellia_generic camellia_aesni_avx_x86_64 camellia_x86_64 
serpent_avx_x86_64 serpent_sse2_x86_64 xts serpent_generic blowfish_generic 
blowfish_x86_64 blowfish_common cast5_avx_x86_64 cast5_generic cast_common 
des_generic cmac xcbc rmd160 crypto_null af_key xfrm_algo sit ip_tunnel tunnel4 
bridge stp llc xfs libcrc32c intel_rapl x86_pkg_temp_thermal intel_powerclamp 
coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul gpio_ich 
ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul joydev glue_helper 
ablk_helper cryptd dcdbas shpchp wmi mei_me mei acpi_power_meter lpc_ich dummy 
lp parport hid_generic tg3 usbhid hid ahci megaraid_sas ptp libahci pps_core 
[last unloaded: ipmi_si]
[447605.087999] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.14.27 #11
[447605.139687] Hardware name: Dell Inc. PowerEdge R420/0CN7CM, BIOS 2.3.3 07/10/2014
[447605.242931]  0000000000000009 ffff8806172e3b48 ffffffff815ffd58 0000000000000000
[447605.349130]  ffff8806172e3b80 ffffffff81043c23 ffff8800a16322e8 ffff880037daa1c0
[447605.459659]  ffff88000b026800 0000000000000000 ffff880037daa4b8 ffff8806172e3b90
[447605.576385] Call Trace:
[447605.634243]  <IRQ>  [<ffffffff815ffd58>] dump_stack+0x45/0x56
[447605.692870]  [<ffffffff81043c23>] warn_slowpath_common+0x73/0x90
[447605.751097]  [<ffffffff81043cf5>] warn_slowpath_null+0x15/0x20
[447605.808000]  [<ffffffff815da6db>] ipv6_local_error+0x16b/0x1a0
[447605.863821]  [<ffffffff815e29d0>] xfrm6_local_error+0x60/0x90
[447605.918493]  [<ffffffff8150b485>] ? skb_dequeue+0x15/0x70
[447605.971871]  [<ffffffff815a6cc1>] xfrm_local_error+0x51/0x70
[447606.024218]  [<ffffffff8159ca15>] xfrm4_extract_output+0x75/0xb0
[447606.075630]  [<ffffffff815a6c5a>] xfrm_inner_extract_output+0x6a/0x80
[447606.126055]  [<ffffffff815e27a2>] xfrm6_prepare_output+0x12/0x60
[447606.175310]  [<ffffffff815a6ed0>] xfrm_output_resume+0x1f0/0x370
[447606.223406]  [<ffffffff8151a486>] ? skb_checksum_help+0x76/0x190
[447606.270572]  [<ffffffff815a709b>] xfrm_output+0x3b/0xf0
[447606.316454]  [<ffffffff815e2ae0>] ? xfrm6_extract_output+0xe0/0xe0
[447606.361803]  [<ffffffff815e2af7>] xfrm6_output_finish+0x17/0x20
[447606.406053]  [<ffffffff8159cad6>] xfrm4_output+0x46/0x80
[447606.448694]  [<ffffffff81550a80>] ip_local_out+0x20/0x30
[447606.489952]  [<ffffffff81550dd5>] ip_queue_xmit+0x135/0x3c0
[447606.530017]  [<ffffffff815672e1>] tcp_transmit_skb+0x461/0x8c0
[447606.569362]  [<ffffffff8156786e>] tcp_write_xmit+0x12e/0xb20
[447606.607876]  [<ffffffff815669ff>] ? tcp_current_mss+0x4f/0x70
[447606.645723]  [<ffffffff8156b320>] ? tcp_write_timer_handler+0x1b0/0x1b0
[447606.682837]  [<ffffffff81569487>] tcp_send_loss_probe+0x37/0x1f0
[447606.719000]  [<ffffffff8156b320>] ? tcp_write_timer_handler+0x1b0/0x1b0
[447606.754537]  [<ffffffff8156b1bb>] tcp_write_timer_handler+0x4b/0x1b0
[447606.789266]  [<ffffffff8156b320>] ? tcp_write_timer_handler+0x1b0/0x1b0
[447606.823242]  [<ffffffff8156b378>] tcp_write_timer+0x58/0x60
[447606.856047]  [<ffffffff8104e848>] call_timer_fn.isra.32+0x18/0x80
[447606.888029]  [<ffffffff8104ea1a>] run_timer_softirq+0x16a/0x200
[447606.920224]  [<ffffffff81047efc>] __do_softirq+0xec/0x250
[447606.951850]  [<ffffffff810482f5>] irq_exit+0xf5/0x100
[447606.982665]  [<ffffffff8102bc6f>] smp_apic_timer_interrupt+0x3f/0x50
[447607.014382]  [<ffffffff8160d98a>] apic_timer_interrupt+0x6a/0x70
[447607.046175]  <EOI>  [<ffffffff8104f336>] ? get_next_timer_interrupt+0x1d6/0x250
[447607.111311]  [<ffffffff814d45a7>] ? cpuidle_enter_state+0x47/0xc0
[447607.145850]  [<ffffffff814d45a3>] ? cpuidle_enter_state+0x43/0xc0
[447607.179625]  [<ffffffff814d46b6>] cpuidle_idle_call+0x96/0x130
[447607.213531]  [<ffffffff8100b909>] arch_cpu_idle+0x9/0x20
[447607.247052]  [<ffffffff810925ba>] cpu_startup_entry+0xda/0x1d0
[447607.280775]  [<ffffffff81029d22>] start_secondary+0x212/0x2c0
[447607.314555] ---[ end trace 6ff3826b6e4fdf67 ]---


Could someone take a closer look at this problem?

Regards
Chris
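
Each frame in the trace above has the form `[<address>] symbol+0xoffset/0xsize`, which is the same `symbol+offset` notation used with gdb's `list *(...)` earlier in the thread. A minimal sketch of decoding one frame follows; `struct trace_entry`, `parse_trace_entry()` and `function_start()` are hypothetical helper names, and frames carrying a `?` marker are not treated specially.

```c
#include <assert.h>
#include <stdio.h>

struct trace_entry {
    unsigned long long addr;    /* address printed inside [<...>] */
    char symbol[64];            /* function name */
    unsigned long long offset;  /* offset of addr inside the function */
    unsigned long long size;    /* total size of the function */
};

/* Parses "[<addr>] symbol+0xoff/0xsize"; returns 0 on success, -1 otherwise. */
static int parse_trace_entry(const char *line, struct trace_entry *out)
{
    assert(line != NULL && out != NULL);
    if (sscanf(line, " [<%llx>] %63[^+]+0x%llx/0x%llx",
               &out->addr, out->symbol, &out->offset, &out->size) != 4)
        return -1;
    return 0;
}

/* Start address of the function that contains the frame address. */
static unsigned long long function_start(const struct trace_entry *e)
{
    return e->addr - e->offset;
}
```

Feeding it the `ipv6_local_error+0x16b/0x1a0` frame yields offset 0x16b within a function of 0x1a0 bytes.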


* Re: ipv6: oops in datagram.c line 260
  2014-12-24 13:42 ` Chris Ruehl
@ 2015-01-06 16:01   ` Hannes Frederic Sowa
  2015-01-07  7:22     ` Steffen Klassert
  2015-01-26  8:35     ` Steffen Klassert
  0 siblings, 2 replies; 12+ messages in thread
From: Hannes Frederic Sowa @ 2015-01-06 16:01 UTC (permalink / raw)
  To: Chris Ruehl; +Cc: netdev, davem, steffen.klassert

On Wed, 2014-12-24 at 21:42 +0800, Chris Ruehl wrote:
> [447604.244357] ipv6_pinfo is NULL
> [447604.273733] ------------[ cut here ]------------
> [447604.303628] WARNING: CPU: 7 PID: 0 at net/ipv6/datagram.c:262 
> ipv6_local_error+0x16b/0x1a0()
> [[...]]
> [last unloaded: ipmi_si]
> [447605.087999] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.14.27 #11
> [447605.139687] Hardware name: Dell Inc. PowerEdge R420/0CN7CM, BIOS 2.3.3 
> 07/10/2014
> [447605.242931]  0000000000000009 ffff8806172e3b48 ffffffff815ffd58 0000000000000000
> [447605.349130]  ffff8806172e3b80 ffffffff81043c23 ffff8800a16322e8 ffff880037daa1c0
> [447605.459659]  ffff88000b026800 0000000000000000 ffff880037daa4b8 ffff8806172e3b90
> [447605.576385] Call Trace:
> [447605.634243]  <IRQ>  [<ffffffff815ffd58>] dump_stack+0x45/0x56
> [447605.692870]  [<ffffffff81043c23>] warn_slowpath_common+0x73/0x90
> [447605.751097]  [<ffffffff81043cf5>] warn_slowpath_null+0x15/0x20
> [447605.808000]  [<ffffffff815da6db>] ipv6_local_error+0x16b/0x1a0
> [447605.863821]  [<ffffffff815e29d0>] xfrm6_local_error+0x60/0x90
> [447605.918493]  [<ffffffff8150b485>] ? skb_dequeue+0x15/0x70
> [447605.971871]  [<ffffffff815a6cc1>] xfrm_local_error+0x51/0x70
> [447606.024218]  [<ffffffff8159ca15>] xfrm4_extract_output+0x75/0xb0
> [447606.075630]  [<ffffffff815a6c5a>] xfrm_inner_extract_output+0x6a/0x80
> [447606.126055]  [<ffffffff815e27a2>] xfrm6_prepare_output+0x12/0x60
> [447606.175310]  [<ffffffff815a6ed0>] xfrm_output_resume+0x1f0/0x370
> [447606.223406]  [<ffffffff8151a486>] ? skb_checksum_help+0x76/0x190
> [447606.270572]  [<ffffffff815a709b>] xfrm_output+0x3b/0xf0
> [447606.316454]  [<ffffffff815e2ae0>] ? xfrm6_extract_output+0xe0/0xe0
> [447606.361803]  [<ffffffff815e2af7>] xfrm6_output_finish+0x17/0x20
> [447606.406053]  [<ffffffff8159cad6>] xfrm4_output+0x46/0x80
> [447606.448694]  [<ffffffff81550a80>] ip_local_out+0x20/0x30
> [447606.489952]  [<ffffffff81550dd5>] ip_queue_xmit+0x135/0x3c0
> [447606.530017]  [<ffffffff815672e1>] tcp_transmit_skb+0x461/0x8c0
> [447606.569362]  [<ffffffff8156786e>] tcp_write_xmit+0x12e/0xb20
> [447606.607876]  [<ffffffff815669ff>] ? tcp_current_mss+0x4f/0x70
> [447606.645723]  [<ffffffff8156b320>] ? tcp_write_timer_handler+0x1b0/0x1b0
> [447606.682837]  [<ffffffff81569487>] tcp_send_loss_probe+0x37/0x1f0
> [447606.719000]  [<ffffffff8156b320>] ? tcp_write_timer_handler+0x1b0/0x1b0
> [447606.754537]  [<ffffffff8156b1bb>] tcp_write_timer_handler+0x4b/0x1b0
> [447606.789266]  [<ffffffff8156b320>] ? tcp_write_timer_handler+0x1b0/0x1b0
> [447606.823242]  [<ffffffff8156b378>] tcp_write_timer+0x58/0x60
> [447606.856047]  [<ffffffff8104e848>] call_timer_fn.isra.32+0x18/0x80
> [447606.888029]  [<ffffffff8104ea1a>] run_timer_softirq+0x16a/0x200
> [447606.920224]  [<ffffffff81047efc>] __do_softirq+0xec/0x250
> [447606.951850]  [<ffffffff810482f5>] irq_exit+0xf5/0x100
> [447606.982665]  [<ffffffff8102bc6f>] smp_apic_timer_interrupt+0x3f/0x50
> [447607.014382]  [<ffffffff8160d98a>] apic_timer_interrupt+0x6a/0x70
> [447607.046175]  <EOI>  [<ffffffff8104f336>] ? get_next_timer_interrupt+0x1d6/0x250
> [447607.111311]  [<ffffffff814d45a7>] ? cpuidle_enter_state+0x47/0xc0
> [447607.145850]  [<ffffffff814d45a3>] ? cpuidle_enter_state+0x43/0xc0
> [447607.179625]  [<ffffffff814d46b6>] cpuidle_idle_call+0x96/0x130
> [447607.213531]  [<ffffffff8100b909>] arch_cpu_idle+0x9/0x20
> [447607.247052]  [<ffffffff810925ba>] cpu_startup_entry+0xda/0x1d0
> [447607.280775]  [<ffffffff81029d22>] start_secondary+0x212/0x2c0
> [447607.314555] ---[ end trace 6ff3826b6e4fdf67 ]---
> 

Thanks for the report!

xfrm6_output_finish unconditionally resets skb->protocol so we try to
dispatch to the IPv6 handler, even though tcp just sends an IPv4 packet.

Hairy. I'll have a look.

Bye,
Hannes
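
The dispatch problem described in this message can be sketched as a user-space model. Everything here is an assumption for illustration: `struct model_skb`, `model_error_dispatch()` and `model_xfrm6_output_finish()` are hypothetical names, and the model only captures the idea that the error handler is chosen from the packet's protocol tag rather than from the sending socket.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ETH_P_IP   0x0800  /* EtherType for IPv4 */
#define MODEL_ETH_P_IPV6 0x86DD  /* EtherType for IPv6 */

enum model_family { MODEL_AF_NONE, MODEL_AF_INET, MODEL_AF_INET6 };

struct model_skb {
    unsigned short protocol;          /* host byte order in this model */
    enum model_family socket_family;  /* family of the sending socket */
};

/* Picks the error handler family purely from the protocol tag. */
static enum model_family model_error_dispatch(const struct model_skb *skb)
{
    assert(skb != NULL);
    if (skb->protocol == MODEL_ETH_P_IP)
        return MODEL_AF_INET;
    if (skb->protocol == MODEL_ETH_P_IPV6)
        return MODEL_AF_INET6;
    return MODEL_AF_NONE;
}

/* Models the unconditional reset done on the IPv6 output path. */
static void model_xfrm6_output_finish(struct model_skb *skb)
{
    assert(skb != NULL);
    skb->protocol = MODEL_ETH_P_IPV6;
}
```

Once the tag has been overwritten, a packet from an IPv4 TCP socket is handed to the IPv6 error handler, which then finds no IPv6 state on the socket.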


* Re: ipv6: oops in datagram.c line 260
  2015-01-06 16:01   ` Hannes Frederic Sowa
@ 2015-01-07  7:22     ` Steffen Klassert
  2015-01-07 10:45       ` Hannes Frederic Sowa
  2015-01-26  8:35     ` Steffen Klassert
  1 sibling, 1 reply; 12+ messages in thread
From: Steffen Klassert @ 2015-01-07  7:22 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Chris Ruehl, netdev, davem

On Tue, Jan 06, 2015 at 05:01:13PM +0100, Hannes Frederic Sowa wrote:
> 
> xfrm6_output_finish unconditionally resets skb->protocol so we try to
> dispatch to the IPv6 handler, even though tcp just sends an IPv4 packet.

Maybe we had better dispatch based on sk->sk_family; this should always
give the right address family of the socket.


* Re: ipv6: oops in datagram.c line 260
  2015-01-07  7:22     ` Steffen Klassert
@ 2015-01-07 10:45       ` Hannes Frederic Sowa
  2015-01-07 12:26         ` Steffen Klassert
  0 siblings, 1 reply; 12+ messages in thread
From: Hannes Frederic Sowa @ 2015-01-07 10:45 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Chris Ruehl, netdev, davem

On Wed, 2015-01-07 at 08:22 +0100, Steffen Klassert wrote:
> On Tue, Jan 06, 2015 at 05:01:13PM +0100, Hannes Frederic Sowa wrote:
> > 
> > xfrm6_output_finish unconditionally resets skb->protocol so we try to
> > dispatch to the IPv6 handler, even though tcp just sends an IPv4 packet.
> 
> Maybe we better dispatch based on sk->sk_family, this should give
> always the right address family of the socket.

The original problem was dealing with IPv4/v6 mapped traffic. Processing
local errors from unconnected UDP sockets which are emitting both IPv4
and IPv6 frames won't play nicely with sk->sk_family I fear.

Bye,
Hannes


* Re: ipv6: oops in datagram.c line 260
  2015-01-07 10:45       ` Hannes Frederic Sowa
@ 2015-01-07 12:26         ` Steffen Klassert
  0 siblings, 0 replies; 12+ messages in thread
From: Steffen Klassert @ 2015-01-07 12:26 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Chris Ruehl, netdev, davem

On Wed, Jan 07, 2015 at 11:45:02AM +0100, Hannes Frederic Sowa wrote:
> On Mi, 2015-01-07 at 08:22 +0100, Steffen Klassert wrote:
> > On Tue, Jan 06, 2015 at 05:01:13PM +0100, Hannes Frederic Sowa wrote:
> > > 
> > > xfrm6_output_finish unconditionally resets skb->protocol so we try to
> > > dispatch to the IPv6 handler, even though tcp just sends an IPv4 packet.
> > 
> > Maybe we better dispatch based on sk->sk_family, this should give
> > always the right address family of the socket.
> 
> The original problem was dealing with IPv4/v6 mapped traffic. Processing
> local errors from unconnected UDP sockets which are emitting both IPv4
> and IPv6 frames won't play nicely with sk->sk_family I fear.

Good point. Unfortunately, it is not as easy to fix as I thought.


* Re: ipv6: oops in datagram.c line 260
  2015-01-06 16:01   ` Hannes Frederic Sowa
  2015-01-07  7:22     ` Steffen Klassert
@ 2015-01-26  8:35     ` Steffen Klassert
  2015-01-27  4:20       ` Chris Ruehl
       [not found]       ` <54C71AFB.40300@gtsys.com.hk>
  1 sibling, 2 replies; 12+ messages in thread
From: Steffen Klassert @ 2015-01-26  8:35 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: Chris Ruehl, netdev, davem

On Tue, Jan 06, 2015 at 05:01:13PM +0100, Hannes Frederic Sowa wrote:
> Thanks for the report!
> 
> xfrm6_output_finish unconditionally resets skb->protocol so we try to
> dispatch to the IPv6 handler, even though tcp just sends an IPv4 packet.
> 

It looks like we can postpone setting skb->protocol to the
xfrm{4,6}_prepare_output() functions, where we finally switch to
outer mode.

This has two implications:

- We reset skb->protocol only for tunnel modes, which should be ok.

- This affects the xfrm_output_gso() codepath on interfamily
  tunnels. skb_mac_gso_segment() dispatches to the gso_segment()
  callback functions via skb->protocol. So currently we dispatch to
  the gso_segment() function of the outer mode, which looks
  wrong to me. If we postpone setting skb->protocol
  to xfrm{4,6}_prepare_output(), we dispatch to the inner mode
  here.

Unfortunately I was not able to reproduce the problem on our test
setup. Chris, could you check whether the patch below fixes your
problem?

Subject: [PATCH RFC] xfrm: Fix local error reporting crash with interfamily
 tunnels

We set the outer mode protocol too early. As a result, the
local error handler might dispatch to the wrong address family
and report the error to a wrong socket type. We fix this by
setting the outer protocol to the skb after we accessed the
inner mode for the last time, right before we do the actual
encapsulation where we switch finally to the outer mode.

Reported-by: Chris Ruehl <chris.ruehl@gtsys.com.hk>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/xfrm4_output.c | 2 +-
 net/ipv6/xfrm6_output.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index d5f6bd9..dab7381 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -63,6 +63,7 @@ int xfrm4_prepare_output(struct xfrm_state *x, struct sk_buff *skb)
 		return err;
 
 	IPCB(skb)->flags |= IPSKB_XFRM_TUNNEL_SIZE;
+	skb->protocol = htons(ETH_P_IP);
 
 	return x->outer_mode->output2(x, skb);
 }
@@ -71,7 +72,6 @@ EXPORT_SYMBOL(xfrm4_prepare_output);
 int xfrm4_output_finish(struct sk_buff *skb)
 {
 	memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
-	skb->protocol = htons(ETH_P_IP);
 
 #ifdef CONFIG_NETFILTER
 	IPCB(skb)->flags |= IPSKB_XFRM_TRANSFORMED;
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index ca3f29b..010f8bd 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -114,6 +114,7 @@ int xfrm6_prepare_output(struct xfrm_state *x, struct sk_buff *skb)
 		return err;
 
 	skb->ignore_df = 1;
+	skb->protocol = htons(ETH_P_IPV6);
 
 	return x->outer_mode->output2(x, skb);
 }
@@ -122,7 +123,6 @@ EXPORT_SYMBOL(xfrm6_prepare_output);
 int xfrm6_output_finish(struct sk_buff *skb)
 {
 	memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
-	skb->protocol = htons(ETH_P_IPV6);
 
 #ifdef CONFIG_NETFILTER
 	IP6CB(skb)->flags |= IP6SKB_XFRM_TRANSFORMED;
-- 
1.9.1
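
The effect of moving the assignment can be sketched with a user-space model of the two orderings. This is an illustration under stated assumptions, not the kernel code path: `struct model_skb`, `model_inner_extract()`, `model_output_old()` and `model_output_new()` are hypothetical names, and the model assumes an IPv4 inner packet on an IPv6 outer tunnel whose inner-mode step may report a local error.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ETH_P_IP   0x0800  /* EtherType for IPv4 */
#define MODEL_ETH_P_IPV6 0x86DD  /* EtherType for IPv6 */

struct model_skb {
    unsigned short protocol;       /* current protocol tag */
    unsigned short seen_by_error;  /* tag seen by the error path, 0 if none */
};

/* Inner-mode step; when too_big is set it reports a local error and fails. */
static int model_inner_extract(struct model_skb *skb, int too_big)
{
    assert(skb != NULL);
    if (too_big) {
        skb->seen_by_error = skb->protocol;
        return -1;
    }
    return 0;
}

/* Old ordering: the outer protocol is set before the inner-mode step runs. */
static int model_output_old(struct model_skb *skb, int too_big)
{
    skb->protocol = MODEL_ETH_P_IPV6;
    return model_inner_extract(skb, too_big);
}

/* Patched ordering: the outer protocol is set only after the inner-mode step. */
static int model_output_new(struct model_skb *skb, int too_big)
{
    int err = model_inner_extract(skb, too_big);

    if (err)
        return err;
    skb->protocol = MODEL_ETH_P_IPV6;
    return 0;
}
```

With the old ordering the error path sees the outer (IPv6) tag for an IPv4 packet; with the patched ordering it still sees the inner tag, and the outer tag is applied only when encapsulation goes ahead.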


* Re: ipv6: oops in datagram.c line 260
  2015-01-26  8:35     ` Steffen Klassert
@ 2015-01-27  4:20       ` Chris Ruehl
       [not found]       ` <54C71AFB.40300@gtsys.com.hk>
  1 sibling, 0 replies; 12+ messages in thread
From: Chris Ruehl @ 2015-01-27  4:20 UTC (permalink / raw)
  To: Steffen Klassert, Hannes Frederic Sowa; +Cc: netdev, davem

On Monday, January 26, 2015 04:35 PM, Steffen Klassert wrote:
> Looks like we can postpone the setting of skb->protocol to the
> xfrm{4,6}_prepare_output() functions where we finally switch to
> outer mode.
>
> This has two implications:
>
> - We reset skb->protocol only for tunnel modes, should be ok.
>
> - This affects the xfrm_output_gso() codepath on interfamily
>    tunnels. skb_mac_gso_segment() dispatches to the gso_segment()
>    callback functions via skb->protocol. So we dispatch to
>    the gso_segment() function of the outer mode what looks
>    wrong to me. If we postpone the setting of skb->protocol
>    to the xfrm{4,6}_prepare_output() we dispatch to inner mode
>    here.
>
> Unfortunately I was not able to reproduce the problem on our test
> setup. Chris could you try if the the patch below fixes your
> problem?
>
Steffen,

I will apply the patch and let you know. I will keep my warning in place so we
will see if it still hits (hopefully not).
After applying the patch it can take a couple of days until we know; see
below:
root@sh1:/home/chris/kernel.d/linux-3.14.x# dmesg | grep WARNING
[447604.303628] WARNING: CPU: 7 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[1738973.489326] WARNING: CPU: 7 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[1738973.678786] WARNING: CPU: 6 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[2795700.233928] WARNING: CPU: 7 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[2805335.085370] WARNING: CPU: 0 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[2881267.252047] WARNING: CPU: 6 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[3042311.131764] WARNING: CPU: 7 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[3061315.974711] WARNING: CPU: 6 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[3070653.051669] WARNING: CPU: 6 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[3089456.783231] WARNING: CPU: 6 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[3098986.926483] WARNING: CPU: 6 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()
[3118180.833934] WARNING: CPU: 6 PID: 0 at net/ipv6/datagram.c:262 ipv6_local_error+0x16b/0x1a0()

Thanks
Chris


* Re: ipv6: oops in datagram.c line 260
       [not found]       ` <54C71AFB.40300@gtsys.com.hk>
@ 2015-01-27 11:58         ` Steffen Klassert
  2015-01-28  3:50           ` Chris Ruehl
  2015-02-06  7:37           ` Chris Ruehl
  0 siblings, 2 replies; 12+ messages in thread
From: Steffen Klassert @ 2015-01-27 11:58 UTC (permalink / raw)
  To: Chris Ruehl; +Cc: Hannes Frederic Sowa, netdev, davem

On Tue, Jan 27, 2015 at 12:58:35PM +0800, Chris Ruehl wrote:
> 
>    Steffen,
> 
>    Your patch does not apply to vanilla v3.14.29; can you cross-check, please?

Sorry, this patch was based on the net tree.

>    I'm sorry, but we are running a production system and I can't make too much
>    noise here!
>    Your patch is already partly in 3.14.29, but
>    skb->protocol = htons(ETH_P_IP)
>    has not been removed from xfrm4/6_output_finish(). So I did this:
> 
>    --- linux-3.14.x/net/ipv4/xfrm4_output.c.orig    2015-01-27 12:50:01.830651344 +0800
>    +++ linux-3.14.x/net/ipv4/xfrm4_output.c    2015-01-27 12:51:13.280386355 +0800
>    @@ -82,7 +82,6 @@
>         IPCB(skb)->flags |= IPSKB_XFRM_TRANSFORMED;
>     #endif
>     
>    -    skb->protocol = htons(ETH_P_IP);
>         return xfrm_output(skb);
>     }
>     
>    --- linux-3.14.x/net/ipv6/xfrm6_output.c.orig    2015-01-27 12:49:39.260735321 +0800
>    +++ linux-3.14.x/net/ipv6/xfrm6_output.c    2015-01-27 12:50:47.280482636 +0800
>    @@ -132,7 +132,6 @@
>         IP6CB(skb)->flags |= IP6SKB_XFRM_TRANSFORMED;
>     #endif
>     
>    -    skb->protocol = htons(ETH_P_IPV6);
>         return xfrm_output(skb);
>     }

Yes, that should be ok. Here is the complete patch for v3.14.29:

Subject: [PATCH RFC v3.14.29] xfrm: Fix local error reporting crash with interfamily tunnels

We set the outer mode protocol too early. As a result, the
local error handler might dispatch to the wrong address family
and report the error to a wrong socket type. We fix this by
setting the outer protocol on the skb only after we accessed the
inner mode for the last time, right before we do the actual
encapsulation where we switch finally to the outer mode.
The settings in xfrm{4,6}_output_finish() are removed.

Reported-by: Chris Ruehl <chris.ruehl@gtsys.com.hk>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/xfrm4_output.c |    1 -
 net/ipv6/xfrm6_output.c |    1 -
 2 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c
index baa0f63..0cb9606 100644
--- a/net/ipv4/xfrm4_output.c
+++ b/net/ipv4/xfrm4_output.c
@@ -82,7 +82,6 @@ int xfrm4_output_finish(struct sk_buff *skb)
 	IPCB(skb)->flags |= IPSKB_XFRM_TRANSFORMED;
 #endif
 
-	skb->protocol = htons(ETH_P_IP);
 	return xfrm_output(skb);
 }
 
diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 6cd625e..98396cf 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -132,7 +132,6 @@ int xfrm6_output_finish(struct sk_buff *skb)
 	IP6CB(skb)->flags |= IP6SKB_XFRM_TRANSFORMED;
 #endif
 
-	skb->protocol = htons(ETH_P_IPV6);
 	return xfrm_output(skb);
 }
 
-- 
1.7.2.5


* Re: ipv6: oops in datagram.c line 260
  2015-01-27 11:58         ` Steffen Klassert
@ 2015-01-28  3:50           ` Chris Ruehl
  2015-02-06  7:37           ` Chris Ruehl
  1 sibling, 0 replies; 12+ messages in thread
From: Chris Ruehl @ 2015-01-28  3:50 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Hannes Frederic Sowa, netdev, davem

On Tuesday, January 27, 2015 07:58 PM, Steffen Klassert wrote:
> Yes, that should be ok. Here is the complete patch for v3.14.29:
Applied. Now waiting for a window to reboot the system, and then we wait :0)

Chris


* Re: ipv6: oops in datagram.c line 260
  2015-01-27 11:58         ` Steffen Klassert
  2015-01-28  3:50           ` Chris Ruehl
@ 2015-02-06  7:37           ` Chris Ruehl
  2015-02-10  9:57             ` Steffen Klassert
  1 sibling, 1 reply; 12+ messages in thread
From: Chris Ruehl @ 2015-02-06  7:37 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Hannes Frederic Sowa, netdev, davem

On Tuesday, January 27, 2015 07:58 PM, Steffen Klassert wrote:
> Yes, that should be ok. Here is the complete patch for v3.14.29:
Hi Steffen,

The server has been up for 6 days with no more problems.
Please apply the patch!

Thank you very much
Chris

Tested-by: Chris Ruehl <chris.ruehl@gtsys.com.hk>


* Re: ipv6: oops in datagram.c line 260
  2015-02-06  7:37           ` Chris Ruehl
@ 2015-02-10  9:57             ` Steffen Klassert
  0 siblings, 0 replies; 12+ messages in thread
From: Steffen Klassert @ 2015-02-10  9:57 UTC (permalink / raw)
  To: Chris Ruehl; +Cc: Hannes Frederic Sowa, netdev, davem

On Fri, Feb 06, 2015 at 03:37:52PM +0800, Chris Ruehl wrote:
> Hi Steffen,
> 
> The server has been up for 6 days with no more problems.
> Please apply the patch!
> 
> Thank you very much
> Chris
> 
> Tested-by: Chris Ruehl <chris.ruehl@gtsys.com.hk>

Now applied to the ipsec tree. Thanks a lot for testing, Chris!


