* Repeatable IPv6 crash in 3.19.0-1
@ 2015-02-27 21:37 Brian Rak
  2015-02-28  0:48 ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Rak @ 2015-02-27 21:37 UTC (permalink / raw)
  To: netdev

I've been seeing a crash under 3.19.0 that seems to occur when I put 
heavy traffic across a macvtap/veth interface.

We have a KVM guest attached to a veth pair using macvtap.  We're 
routing IPv6 traffic into one end of the veth pair using some static 
routes.  We do *not* have proxy_ndp enabled (though, we are using some 
software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).

I've been able to reproduce this pretty easily by downloading some large 
files from the guest.  We see two traces in a row when this occurs:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 6520 at arch/x86/kernel/smp.c:124 
native_smp_send_reschedule+0x5f/0x70()
Modules linked in: ip_set netconsole configfs xt_comment ebt_ip6 
ip6table_mangle veth xt_physdev br_netfilter ebt_arp ebt_ip ebtable_nat 
ebtables cls_fw sch_sfq sch_htb vhost_net macvtap macvlan vhost tun 
kvm_intel kvm 8021q garp nfnetlink_queue nfnetlink_log nfnetlink 
bluetooth rfkill bridge stp llc xt_CHECKSUM iptable_mangle ipt_REJECT 
nf_reject_ipv4 iptable_filter ip_tables ip6t_REJECT nf_reject_ipv6 
ip6table_filter ip6_tables ipv6 joydev iTCO_wdt iTCO_vendor_support 
8250_fintek ipmi_devintf ipmi_si ipmi_msghandler microcode pcspkr 
i2c_i801 sg lpc_ich igb dca ptp pps_core hwmon shpchp xhci_pci xhci_hcd 
ie31200_edac edac_core ext4 jbd2 mbcache sd_mod ahci libahci video ttm 
drm_kms_helper sysimgblt sysfillrect syscopyarea dm_mirror 
dm_region_hash dm_log dm_mod
CPU: 0 PID: 6520 Comm: vhost-6518 Tainted: G      D 
3.19.0-1.el6.elrepo.x86_64 #1
Hardware name: Supermicro X10SLH-F/X10SLM+-F/X10SLH-F/X10SLM+-F, BIOS 
1.1a 12/03/2013
  000000000000007c ffff88041fc035a0 ffffffff816754e2 000000000000007c
  0000000000000000 ffff88041fc035e0 ffffffff81074bc5 ffff88041fc03600
  ffff88041fc53f00 0000000000000001 ffff88041fc13f00 ffff8803f6a11150
Call Trace:
  <IRQ>  [<ffffffff816754e2>] dump_stack+0x48/0x5e
  [<ffffffff81074bc5>] warn_slowpath_common+0x95/0xe0
  [<ffffffff81074c2a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff8104749f>] native_smp_send_reschedule+0x5f/0x70
  [<ffffffff810a83fa>] trigger_load_balance+0x14a/0x1f0
  [<ffffffff81099a06>] scheduler_tick+0xa6/0xe0
  [<ffffffff810da121>] update_process_times+0x51/0x70
  [<ffffffff810eb919>] tick_sched_handle+0x39/0x80
  [<ffffffff810ebb62>] tick_sched_timer+0x52/0xa0
  [<ffffffff810dc9d3>] __run_hrtimer+0x83/0x1d0
  [<ffffffff810ebb10>] ? tick_nohz_handler+0xc0/0xc0
  [<ffffffff810dcd46>] hrtimer_interrupt+0x106/0x250
  [<ffffffff8104a249>] local_apic_timer_interrupt+0x39/0x60
  [<ffffffff8167c7d5>] smp_apic_timer_interrupt+0x45/0x60
  [<ffffffff8167a87d>] apic_timer_interrupt+0x6d/0x80
  [<ffffffff81675362>] ? panic+0x1c0/0x206
  [<ffffffff8167535b>] ? panic+0x1b9/0x206
  [<ffffffff810185ca>] oops_end+0xea/0xf0
  [<ffffffff810602c5>] no_context+0x125/0x200
  [<ffffffff810604cd>] __bad_area_nosemaphore+0x12d/0x230
  [<ffffffffa02f726c>] ? ip6t_do_table+0x29c/0x6e0 [ip6_tables]
  [<ffffffffa0331ed0>] ? deliver_clone+0x60/0x60 [bridge]
  [<ffffffff810605e3>] bad_area_nosemaphore+0x13/0x20
  [<ffffffff81060b76>] __do_page_fault+0x336/0x520
  [<ffffffffa03320b9>] ? br_dev_queue_push_xmit+0x1e9/0x200 [bridge]
  [<ffffffff81060e6c>] do_page_fault+0x2c/0x40
  [<ffffffff8167b928>] page_fault+0x28/0x30
  [<ffffffffa02836a3>] ? ip6_finish_output2+0x193/0x490 [ipv6]
  [<ffffffff815d9e4d>] ? nf_hook_slow+0x7d/0x150
  [<ffffffffa0283e10>] ? ip6_xmit+0x470/0x470 [ipv6]
  [<ffffffffa0282a00>] ? ip6_forward_proxy_check+0x150/0x150 [ipv6]
  [<ffffffffa0283ea5>] ip6_finish_output+0x95/0xd0 [ipv6]
  [<ffffffffa0283f58>] ip6_output+0x78/0xb0 [ipv6]
  [<ffffffffa0282a16>] ip6_forward_finish+0x16/0x20 [ipv6]
  [<ffffffffa0284548>] ip6_forward+0x5b8/0x7a0 [ipv6]
  [<ffffffffa0290cac>] ? ip6_route_input+0xbc/0xe0 [ipv6]
  [<ffffffffa028590d>] ip6_rcv_finish+0x9d/0xb0 [ipv6]
  [<ffffffffa0285c88>] ipv6_rcv+0x368/0x4d0 [ipv6]
  [<ffffffff815a8274>] __netif_receive_skb_core+0x4b4/0x640
  [<ffffffff815a8427>] __netif_receive_skb+0x27/0x70
  [<ffffffff815a8562>] process_backlog+0xf2/0x1b0
  [<ffffffff815a8de3>] napi_poll+0xd3/0x1c0
  [<ffffffff810e9664>] ? clockevents_program_event+0x74/0x120
  [<ffffffff815a8f60>] net_rx_action+0x90/0x1c0
  [<ffffffff81078b3b>] __do_softirq+0xfb/0x2a0
  [<ffffffff8167b53c>] do_softirq_own_stack+0x1c/0x30
  <EOI>  [<ffffffff81078645>] do_softirq+0x55/0x60
  [<ffffffff81078728>] __local_bh_enable_ip+0x88/0x90
  [<ffffffff815a9c67>] __dev_queue_xmit+0x227/0x5a0
  [<ffffffff815aa000>] dev_queue_xmit+0x10/0x20
  [<ffffffffa04b4417>] macvtap_get_user+0x437/0x5d0 [macvtap]
  [<ffffffffa04a1172>] ? vhost_get_vq_desc+0x152/0x300 [vhost]
  [<ffffffffa04b45d5>] macvtap_sendmsg+0x25/0x30 [macvtap]
  [<ffffffffa04b9f8b>] handle_tx+0x27b/0x480 [vhost_net]
  [<ffffffffa04ba1c5>] handle_tx_kick+0x15/0x20 [vhost_net]
  [<ffffffffa04a0f6d>] vhost_worker+0x10d/0x1c0 [vhost]
  [<ffffffffa04a0e60>] ? vhost_dev_init+0x1d0/0x1d0 [vhost]
  [<ffffffff8109244e>] kthread+0xce/0xf0
  [<ffffffff81092380>] ? kthread_freezable_should_stop+0x70/0x70
  [<ffffffff816798bc>] ret_from_fork+0x7c/0xb0
  [<ffffffff81092380>] ? kthread_freezable_should_stop+0x70/0x70
---[ end trace eb7c35e4dfea0d83 ]---
BUG: unable to handle kernel paging request at ffff880408812ffe
IP: [<ffffffffa027b6a3>] ip6_finish_output2+0x193/0x490 [ipv6]
PGD 211e067 PUD 2121067 PMD 409339063 PTE 8000000408812161
Oops: 0003 [#1] SMP
Modules linked in: netconsole configfs ip_set xt_comment ebt_ip6 
ip6table_mangle veth xt_physdev br_netfilter ebt_arp ebt_ip ebtable_nat 
ebtables cls_fw sch_sfq sch_htb vhost_net macvtap macvlan vhost tun 
kvm_intel kvm 8021q garp nfnetlink_queue nfnetlink_log nfnetlink 
bluetooth rfkill bridge stp llc joydev xt_CHECKSUM iptable_mangle 
ipt_REJECT nf_reject_ipv4 iptable_filter ip_tables ip6t_REJECT 
nf_reject_ipv6 ip6table_filter ip6_tables ipv6 iTCO_wdt 
iTCO_vendor_support 8250_fintek ipmi_devintf ipmi_si ipmi_msghandler 
microcode pcspkr i2c_i801 sg lpc_ich igb dca ptp pps_core hwmon shpchp 
xhci_pci xhci_hcd ie31200_edac edac_core ext4 jbd2 mbcache sd_mod ahci 
libahci video ttm drm_kms_helper sysimgblt sysfillrect syscopyarea 
dm_mirror dm_region_hash dm_log dm_mod
CPU: 7 PID: 8187 Comm: vhost-8184 Not tainted 3.19.0-1.el6.elrepo.x86_64 #1
Hardware name: Supermicro X10SLH-F/X10SLM+-F/X10SLH-F/X10SLM+-F, BIOS 
1.1a 12/03/2013
task: ffff8803f391c050 ti: ffff88040c128000 task.ti: ffff88040c128000
RIP: 0010:[<ffffffffa027b6a3>]  [<ffffffffa027b6a3>] 
ip6_finish_output2+0x193/0x490 [ipv6]
RSP: 0018:ffff88041fdc3be8  EFLAGS: 00010283
RAX: ffff88040881300e RBX: ffff8803cfcd3a00 RCX: ffff88040d1c52e4
RDX: 7f813e3323000000 RSI: ffff88040bcee168 RDI: ffff8803f65b55c0
RBP: ffff88041fdc3c38 R08: ffff8803d36283d8 R09: 00000000ff332302
R10: 00000000000080fe R11: 000000007f813efe R12: 000000000000000e
R13: ffff88040d1c5200 R14: ffff88040d1c52f0 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88041fdc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff880408812ffe CR3: 00000000d1613000 CR4: 00000000001427e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
  ffffffffa027be10 ffff880380000000 ffffffffa027aa00 0000000a00000002
  ffffffff81d5e380 ffff8803cfcd3a00 00000000000005dc ffffffff81d25340
  ffff88040881300e ffff880408813000 ffff88041fdc3c58 ffffffffa027bea5
Call Trace:
  <IRQ>
  [<ffffffffa027be10>] ? ip6_xmit+0x470/0x470 [ipv6]
  [<ffffffffa027aa00>] ? ip6_forward_proxy_check+0x150/0x150 [ipv6]
  [<ffffffffa027bea5>] ip6_finish_output+0x95/0xd0 [ipv6]
  [<ffffffffa027bf58>] ip6_output+0x78/0xb0 [ipv6]
  [<ffffffffa027aa16>] ip6_forward_finish+0x16/0x20 [ipv6]
  [<ffffffffa027c548>] ip6_forward+0x5b8/0x7a0 [ipv6]
  [<ffffffffa0288cac>] ? ip6_route_input+0xbc/0xe0 [ipv6]
  [<ffffffffa027d90d>] ip6_rcv_finish+0x9d/0xb0 [ipv6]
  [<ffffffffa027dc88>] ipv6_rcv+0x368/0x4d0 [ipv6]
  [<ffffffff815a8274>] __netif_receive_skb_core+0x4b4/0x640
  [<ffffffff815a8427>] __netif_receive_skb+0x27/0x70
  [<ffffffff815a8562>] process_backlog+0xf2/0x1b0
  [<ffffffff815a8de3>] napi_poll+0xd3/0x1c0
  [<ffffffff815a8f60>] net_rx_action+0x90/0x1c0
  [<ffffffff81078b3b>] __do_softirq+0xfb/0x2a0
  [<ffffffff8167b53c>] do_softirq_own_stack+0x1c/0x30
  <EOI>
  [<ffffffff81078645>] do_softirq+0x55/0x60
  [<ffffffff81078728>] __local_bh_enable_ip+0x88/0x90
  [<ffffffff815a9c67>] __dev_queue_xmit+0x227/0x5a0
  [<ffffffff815aa000>] dev_queue_xmit+0x10/0x20
  [<ffffffffa04b0417>] macvtap_get_user+0x437/0x5d0 [macvtap]
  [<ffffffffa049d172>] ? vhost_get_vq_desc+0x152/0x300 [vhost]
  [<ffffffffa04b05d5>] macvtap_sendmsg+0x25/0x30 [macvtap]
  [<ffffffffa04b5f8b>] handle_tx+0x27b/0x480 [vhost_net]
  [<ffffffffa04b61c5>] handle_tx_kick+0x15/0x20 [vhost_net]
  [<ffffffffa049cf6d>] vhost_worker+0x10d/0x1c0 [vhost]
  [<ffffffffa049ce60>] ? vhost_dev_init+0x1d0/0x1d0 [vhost]
  [<ffffffff8109244e>] kthread+0xce/0xf0
  [<ffffffff81092380>] ? kthread_freezable_should_stop+0x70/0x70
  [<ffffffff816798bc>] ret_from_fork+0x7c/0xb0
  [<ffffffff81092380>] ? kthread_freezable_should_stop+0x70/0x70
Code: 00 00 44 8b 39 41 f6 c7 01 0f 85 8d 02 00 00 45 0f b7 a5 e0 00 00 
00 41 83 fc 10 0f 8f 82 02 00 00 49 8b 16 48 8b 83 d8 00 00 00 <48> 89 
50 f0 49 8b 56 08 48 89 50 f8 45 3b bd e4 00 00 00 75 c2
RIP  [<ffffffffa027b6a3>] ip6_finish_output2+0x193/0x490 [ipv6]
  RSP <ffff88041fdc3be8>
CR2: ffff880408812ffe
---[ end trace d743d347dba40c49 ]---


* Re: Repeatable IPv6 crash in 3.19.0-1
  2015-02-27 21:37 Repeatable IPv6 crash in 3.19.0-1 Brian Rak
@ 2015-02-28  0:48 ` Eric Dumazet
  2015-02-28  1:16   ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2015-02-28  0:48 UTC (permalink / raw)
  To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 16:37 -0500, Brian Rak wrote:
> I've been seeing a crash under 3.19.0 that seems to occur when I put 
> heavy traffic across a macvtap/veth interface.
> 
> We have a KVM guest attached to a veth pair using macvtap.  We're 
> routing IPv6 traffic into one end of the veth pair using some static 
> routes.  We do *not* have proxy_ndp enabled (though, we are using some 
> software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).
> 
> I've been able to reproduce this pretty easily by downloading some large 
> files from the guest.  We see two traces in a row when this occurs:


Nice !

Crash is in neigh_hh_output()

-> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);

And there are only 14 bytes of headroom instead of 16.

Some layer did not align skb_headroom(skb) to HH_DATA_MOD for ethernet
header.

IPv4 has a paranoid section; IPv6 does not:

        /* Be paranoid, rather than too clever. */
        if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
                struct sk_buff *skb2;

                skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));
                if (skb2 == NULL) {
                        kfree_skb(skb);
                        return -ENOMEM;
                }
                if (skb->sk)
                        skb_set_owner_w(skb2, skb->sk);
                consume_skb(skb);
                skb = skb2;
        }
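
The arithmetic behind this diagnosis can be checked against the oops above. What follows is a minimal sketch, not kernel code. It assumes that RAX in the register dump holds skb->data and that the value ffff880408813000 seen on the stack is skb->head; both are readings of the dump, not something stated in the thread. hh_copy_start() and bytes_before_head() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define HH_DATA_MOD 16  /* size of the cached-header copy in neigh_hh_output() */

/* Values taken from the second trace; their interpretation is an assumption. */
#define OOPS_RAX  0xffff88040881300eULL  /* read here as skb->data */
#define OOPS_CR2  0xffff880408812ffeULL  /* faulting address */
#define OOPS_HEAD 0xffff880408813000ULL  /* read here as skb->head */

/* The 16-byte copy starts HH_DATA_MOD bytes below skb->data. */
static_assert(OOPS_RAX - HH_DATA_MOD == OOPS_CR2,
              "copy start should equal the faulting address");

/* Lowest address written by memcpy(skb->data - HH_DATA_MOD, ...). */
static inline uint64_t hh_copy_start(uint64_t skb_data)
{
	return skb_data - HH_DATA_MOD;
}

/* Number of bytes the copy writes below the start of the buffer. */
static inline uint64_t bytes_before_head(uint64_t skb_head, uint64_t skb_data)
{
	uint64_t start = hh_copy_start(skb_data);

	return start < skb_head ? skb_head - start : 0;
}
```

Under those assumptions the copy begins at ffff880408812ffe, the address reported in CR2, and 2 bytes land below the buffer, which is consistent with 14 bytes of headroom where 16 are expected.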


* Re: Repeatable IPv6 crash in 3.19.0-1
  2015-02-28  0:48 ` Eric Dumazet
@ 2015-02-28  1:16   ` Eric Dumazet
  2015-02-28  1:54     ` Brian Rak
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2015-02-28  1:16 UTC (permalink / raw)
  To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 16:48 -0800, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 16:37 -0500, Brian Rak wrote:
> > I've been seeing a crash under 3.19.0 that seems to occur when I put 
> > heavy traffic across a macvtap/veth interface.
> > 
> > We have a KVM guest attached to a veth pair using macvtap.  We're 
> > routing IPv6 traffic into one end of the veth pair using some static 
> > routes.  We do *not* have proxy_ndp enabled (though, we are using some 
> > software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).
> > 
> > I've been able to reproduce this pretty easily by downloading some large 
> > files from the guest.  We see two traces in a row when this occurs:
> 
> 
> Nice !
> 
> Crash is in neigh_hh_output()
> 
> -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
> 
> And there is only 14 bytes of headroom instead of 16.
> 
> Some layer did not align skb_headroom(skb) to HH_DATA_MOD for ethernet
> header.

Could you try the following patch?

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index e40fdfccc9c10df4ea8676a1dd59275d5d9c6b88..27ecc5c4fa2665cd42ac1ca81717255f85507113 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -654,11 +654,14 @@ static void macvtap_skb_to_vnet_hdr(struct macvtap_queue *q,
 	} /* else everything is zero */
 }
 
+/* Neighbour code has some assumptions on HH_DATA_MOD alignment */
+#define MACVTAP_RESERVE HH_DATA_OFF(ETH_HLEN)
+
 /* Get packet from user space buffer */
 static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 				struct iov_iter *from, int noblock)
 {
-	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
+	int good_linear = SKB_MAX_HEAD(MACVTAP_RESERVE);
 	struct sk_buff *skb;
 	struct macvlan_dev *vlan;
 	unsigned long total_len = iov_iter_count(from);
@@ -722,7 +725,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 			linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len);
 	}
 
-	skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
+	skb = macvtap_alloc_skb(&q->sk, MACVTAP_RESERVE, copylen,
 				linear, noblock, &err);
 	if (!skb)
 		goto err;
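
For readers following along, the value of MACVTAP_RESERVE can be worked out by hand. The sketch below copies the HH_DATA_OFF() rounding from include/linux/netdevice.h into a standalone file. headroom_after_eth_pull() is a hypothetical helper, and it rests on the assumption that the 14-byte Ethernet header supplied by the guest is later pulled and therefore counts as headroom when the neighbour code runs.

```c
#include <assert.h>

#define ETH_HLEN    14  /* Ethernet header length in bytes */
#define HH_DATA_MOD 16  /* cached hardware headers are copied in 16-byte units */

/* Padding that rounds a header of __len bytes up to a multiple of
 * HH_DATA_MOD, written the same way as the kernel macro. */
#define HH_DATA_OFF(__len) \
	(HH_DATA_MOD - ((((__len) - 1) & (HH_DATA_MOD - 1)) + 1))

#define MACVTAP_RESERVE HH_DATA_OFF(ETH_HLEN)

static_assert(MACVTAP_RESERVE == 2,
              "a 14-byte header needs 2 bytes of padding to reach 16");

/* Hypothetical helper: headroom in front of the network header once the
 * Ethernet header written by the guest has been pulled off. */
static inline int headroom_after_eth_pull(int reserve)
{
	return reserve + ETH_HLEN;
}
```

With a reserve of 0 (NET_IP_ALIGN on x86) this gives 14 bytes, the figure quoted earlier in the thread; with MACVTAP_RESERVE it gives the 16 bytes the neighbour code expects.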


* Re: Repeatable IPv6 crash in 3.19.0-1
  2015-02-28  1:16   ` Eric Dumazet
@ 2015-02-28  1:54     ` Brian Rak
  2015-02-28  2:01       ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Rak @ 2015-02-28  1:54 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev



On 2/27/2015 8:16 PM, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 16:48 -0800, Eric Dumazet wrote:
>> On Fri, 2015-02-27 at 16:37 -0500, Brian Rak wrote:
>>> I've been seeing a crash under 3.19.0 that seems to occur when I put
>>> heavy traffic across a macvtap/veth interface.
>>>
>>> We have a KVM guest attached to a veth pair using macvtap.  We're
>>> routing IPv6 traffic into one end of the veth pair using some static
>>> routes.  We do *not* have proxy_ndp enabled (though, we are using some
>>> software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).
>>>
>>> I've been able to reproduce this pretty easily by downloading some large
>>> files from the guest.  We see two traces in a row when this occurs:
>>
>>
>> Nice !
>>
>> Crash is in neigh_hh_output()
>>
>> -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
>>
>> And there is only 14 bytes of headroom instead of 16.
>>
>> Some layer did not align skb_headroom(skb) to HH_DATA_MOD for ethernet
>> header.
>
> Could you try following patch ?
>
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index e40fdfccc9c10df4ea8676a1dd59275d5d9c6b88..27ecc5c4fa2665cd42ac1ca81717255f85507113 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -654,11 +654,14 @@ static void macvtap_skb_to_vnet_hdr(struct macvtap_queue *q,
>   	} /* else everything is zero */
>   }
>
> +/* Neighbour code has some assumptions on HH_DATA_MOD alignment */
> +#define MACVTAP_RESERVE HH_DATA_OFF(ETH_HLEN)
> +
>   /* Get packet from user space buffer */
>   static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>   				struct iov_iter *from, int noblock)
>   {
> -	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
> +	int good_linear = SKB_MAX_HEAD(MACVTAP_RESERVE);
>   	struct sk_buff *skb;
>   	struct macvlan_dev *vlan;
>   	unsigned long total_len = iov_iter_count(from);
> @@ -722,7 +725,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>   			linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len);
>   	}
>
> -	skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
> +	skb = macvtap_alloc_skb(&q->sk, MACVTAP_RESERVE, copylen,
>   				linear, noblock, &err);
>   	if (!skb)
>   		goto err;
>
>

Wow, that was *much* faster than I was expecting, thanks a bunch!

I can confirm that the patch resolves the issue.  I've been able to put
a whole bunch of IPv6 traffic through the interface now, whereas before
even a minor amount of traffic would crash the host.

Thanks again!


* Re: Repeatable IPv6 crash in 3.19.0-1
  2015-02-28  1:54     ` Brian Rak
@ 2015-02-28  2:01       ` Eric Dumazet
  2015-02-28  2:03         ` Eric Dumazet
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2015-02-28  2:01 UTC (permalink / raw)
  To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 20:54 -0500, Brian Rak wrote:

> Wow, that was *much* faster then I was expecting, thanks a bunch!
> 
> I can confirm that resolves the issue.. I've tested this and it fixes 
> the issue perfectly.  I've been able to put a whole bunch of IPv6 
> traffic through the interface now, whereas before even a minor amount of 
> traffic would crash the host.
> 
> Thanks again!

Interesting...

Was a prior version of the Linux kernel fine?


* Re: Repeatable IPv6 crash in 3.19.0-1
  2015-02-28  2:01       ` Eric Dumazet
@ 2015-02-28  2:03         ` Eric Dumazet
  2015-02-28  2:11           ` Brian Rak
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2015-02-28  2:03 UTC (permalink / raw)
  To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 18:01 -0800, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 20:54 -0500, Brian Rak wrote:
> 
> > Wow, that was *much* faster then I was expecting, thanks a bunch!
> > 
> > I can confirm that resolves the issue.. I've tested this and it fixes 
> > the issue perfectly.  I've been able to put a whole bunch of IPv6 
> > traffic through the interface now, whereas before even a minor amount of 
> > traffic would crash the host.
> > 
> > Thanks again!
> 
> Interesting...
> 
> Had a prior version of linux kernel been fine ?

Or maybe you recently switched on this config option ?

CONFIG_DEBUG_PAGEALLOC=y


* Re: Repeatable IPv6 crash in 3.19.0-1
  2015-02-28  2:03         ` Eric Dumazet
@ 2015-02-28  2:11           ` Brian Rak
  2015-02-28  2:21             ` Eric Dumazet
  2015-02-28  2:35             ` [PATCH net] macvtap: make sure neighbour code can push ethernet header Eric Dumazet
  0 siblings, 2 replies; 10+ messages in thread
From: Brian Rak @ 2015-02-28  2:11 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev



On 2/27/2015 9:03 PM, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 18:01 -0800, Eric Dumazet wrote:
>> On Fri, 2015-02-27 at 20:54 -0500, Brian Rak wrote:
>>
>>> Wow, that was *much* faster then I was expecting, thanks a bunch!
>>>
>>> I can confirm that resolves the issue.. I've tested this and it fixes
>>> the issue perfectly.  I've been able to put a whole bunch of IPv6
>>> traffic through the interface now, whereas before even a minor amount of
>>> traffic would crash the host.
>>>
>>> Thanks again!
>>
>> Interesting...
>>
>> Had a prior version of linux kernel been fine ?
>
> Or maybe you recently switched on this config option ?
>
> CONFIG_DEBUG_PAGEALLOC=y
>
>
>

We've only recently started using this veth/macvtap combo, so it's 
possible this has been around for a while and we just hadn't noticed.

I don't have any info on older kernels currently.  I *think* I've seen 
crashes on 3.17.1, but I didn't save any stack traces, so I can't be sure.

CONFIG_DEBUG_PAGEALLOC is not set, and never has been.


* Re: Repeatable IPv6 crash in 3.19.0-1
  2015-02-28  2:11           ` Brian Rak
@ 2015-02-28  2:21             ` Eric Dumazet
  2015-02-28  2:35             ` [PATCH net] macvtap: make sure neighbour code can push ethernet header Eric Dumazet
  1 sibling, 0 replies; 10+ messages in thread
From: Eric Dumazet @ 2015-02-28  2:21 UTC (permalink / raw)
  To: Brian Rak; +Cc: netdev

On Fri, 2015-02-27 at 21:11 -0500, Brian Rak wrote:

> We've only recently started using this veth/macvtap combo, so it's 
> possible this has been around for awhile and we just hadn't noticed.
> 
> I don't have any info on older kernels currently.  I *think* I've seen 
> crashes on 3.17.1, but I didn't save any stack traces, so I can't be sure.
> 
> CONFIG_DEBUG_PAGEALLOC is not set, and never has been.

OK, thanks for the confirmation. I'll send an official patch.

(I guess same patch is also needed for drivers/net/tun.c)


* [PATCH net] macvtap: make sure neighbour code can push ethernet header
  2015-02-28  2:11           ` Brian Rak
  2015-02-28  2:21             ` Eric Dumazet
@ 2015-02-28  2:35             ` Eric Dumazet
  2015-03-01  5:30               ` David Miller
  1 sibling, 1 reply; 10+ messages in thread
From: Eric Dumazet @ 2015-02-28  2:35 UTC (permalink / raw)
  To: Brian Rak, David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

Brian reported crashes using IPv6 traffic with macvtap/veth combo.

I tracked the crashes to neigh_hh_output()

-> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);

Neighbour code assumes headroom to push Ethernet header is
at least 16 bytes.

It appears macvtap has only 14 bytes available on arches
where NET_IP_ALIGN is 0 (like x86)

Effect is a corruption of 2 bytes right before skb->head,
and possible crashes if accessing nonexistent memory.

This fix should also increase IPv4 performance, as paranoid code
in ip_finish_output2() won't have to call skb_realloc_headroom()

Reported-by: Brian Rak <brak@vultr.com>
Tested-by: Brian Rak <brak@vultr.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/macvtap.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index e40fdfccc9c10df4ea8676a1dd59275d5d9c6b88..27ecc5c4fa2665cd42ac1ca81717255f85507113 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -654,11 +654,14 @@ static void macvtap_skb_to_vnet_hdr(struct macvtap_queue *q,
 	} /* else everything is zero */
 }
 
+/* Neighbour code has some assumptions on HH_DATA_MOD alignment */
+#define MACVTAP_RESERVE HH_DATA_OFF(ETH_HLEN)
+
 /* Get packet from user space buffer */
 static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 				struct iov_iter *from, int noblock)
 {
-	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
+	int good_linear = SKB_MAX_HEAD(MACVTAP_RESERVE);
 	struct sk_buff *skb;
 	struct macvlan_dev *vlan;
 	unsigned long total_len = iov_iter_count(from);
@@ -722,7 +725,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 			linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len);
 	}
 
-	skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
+	skb = macvtap_alloc_skb(&q->sk, MACVTAP_RESERVE, copylen,
 				linear, noblock, &err);
 	if (!skb)
 		goto err;
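
The IPv4 remark in the commit message can be illustrated with a small standalone sketch. It reproduces the LL_RESERVED_SPACE() rounding from include/linux/netdevice.h with the two device fields passed as plain arguments; LL_RESERVED() and ipv4_needs_realloc() are hypothetical names, and the condition mirrors the "paranoid" check quoted earlier in the thread rather than calling any kernel code.

```c
#include <assert.h>
#include <stdbool.h>

#define HH_DATA_MOD 16

/* Same rounding as the kernel's LL_RESERVED_SPACE(dev), taking
 * hard_header_len and needed_headroom directly instead of a net_device. */
#define LL_RESERVED(hard_header_len, needed_headroom) \
	((((hard_header_len) + (needed_headroom)) & ~(HH_DATA_MOD - 1)) + HH_DATA_MOD)

static_assert(LL_RESERVED(14, 0) == 16,
              "an Ethernet device asks for 16 bytes of link-layer headroom");

/* Mirrors: skb_headroom(skb) < hh_len && dev->header_ops */
static inline bool ipv4_needs_realloc(unsigned int headroom,
				      unsigned int hh_len,
				      bool has_header_ops)
{
	return headroom < hh_len && has_header_ops;
}
```

With 14 bytes of headroom the check is true and the IPv4 path would reallocate the skb; with 16 bytes it is false, which is why the patch is expected to save that reallocation.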


* Re: [PATCH net] macvtap: make sure neighbour code can push ethernet header
  2015-02-28  2:35             ` [PATCH net] macvtap: make sure neighbour code can push ethernet header Eric Dumazet
@ 2015-03-01  5:30               ` David Miller
  0 siblings, 0 replies; 10+ messages in thread
From: David Miller @ 2015-03-01  5:30 UTC (permalink / raw)
  To: eric.dumazet; +Cc: brak, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 27 Feb 2015 18:35:35 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> Brian reported crashes using IPv6 traffic with macvtap/veth combo.
> 
> I tracked the crashes in neigh_hh_output()
> 
> -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
> 
> Neighbour code assumes headroom to push Ethernet header is
> at least 16 bytes.
> 
> It appears macvtap has only 14 bytes available on arches
> where NET_IP_ALIGN is 0 (like x86)
> 
> Effect is a corruption of 2 bytes right before skb->head,
> and possible crashes if accessing non existing memory.
> 
> This fix should also increase IPv4 performance, as paranoid code
> in ip_finish_output2() wont have to call skb_realloc_headroom()
> 
> Reported-by: Brian Rak <brak@vultr.com>
> Tested-by: Brian Rak <brak@vultr.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable, thanks.

