From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sander Eikelenboom
Subject: Re: [linux-4.4-mw] BUG: unable to handle kernel paging request ip_vs_out.constprop
Date: Thu, 12 Nov 2015 16:16:45 +0100
Message-ID:
References: <06dc952f98f54da2c4d85b31e5fa9826@eikelenboom.it> <1447337353.22599.14.camel@edumazet-glaptop2.roam.corp.google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: netdev@vger.kernel.org, netfilter-devel@vger.kernel.org
To: Eric Dumazet
Return-path:
In-Reply-To: <1447337353.22599.14.camel@edumazet-glaptop2.roam.corp.google.com>
Sender: netfilter-devel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

On 2015-11-12 15:09, Eric Dumazet wrote:
> On Thu, 2015-11-12 at 11:08 +0100, Sander Eikelenboom wrote:
>> Hi All,
>>
>> Just got a crash with a linux-4.4-mw kernel.
>> I'm using a routed bridge and, apart from the splat below, I also get some
>> other interesting messages that aren't there in 4.3 (and are perhaps of
>> interest for the crash as well):
>> [ 207.033768] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x0000000400004803, left 0x0000000400114813
>> [ 207.033780] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x0000000400004803, left 0x0000000400114813
>> [ 207.245435] xen_bridge: error setting offload STP state on port 1(vif1.0)
>> [ 207.245442] vif vif-1-0 vif1.0: failed to set HW ageing time
>> [ 207.245443] xen_bridge: error setting offload STP state on port 1(vif1.0)
>> [ 207.245491] vif vif-1-0 vif1.0: set_features() failed (-1); wanted 0x0000000400004803, left 0x0000000400114813
>>
>> The commit message for the commit that introduced the "set HW ageing time"
>> error message doesn't seem to tell me much about its purpose.
>> If it's not related, I can report it as a separate issue.
>>
>> --
>> Sander
>>
>> The crash:
>> [ 354.328687] BUG: unable to handle kernel paging request at ffff880049aa8000
>> [ 354.350206] IP: [] ip_vs_out.constprop.25+0x47/0x60
>> [ 354.360882] PGD 2212067 PUD 25b4067 PMD 5ffb6067 PTE 0
>> [ 354.371587] Oops: 0000 [#1] SMP
>> [ 354.382143] Modules linked in:
>> [ 354.392537] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.3.0-mw-20151111-linus-doflr+ #1
>> [ 354.403105] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
>> [ 354.413666] task: ffffffff82218580 ti: ffffffff82200000 task.ti: ffffffff82200000
>> [ 354.424255] RIP: e030:[] [] ip_vs_out.constprop.25+0x47/0x60
>> [ 354.434742] RSP: e02b:ffff88005f6034b0 EFLAGS: 00010246
>> [ 354.445006] RAX: 0000000000000001 RBX: ffff88005f6034f8 RCX: ffff880049aa7ce0
>> [ 354.455262] RDX: ffff88003c0e5500 RSI: 0000000000000003 RDI: ffff880004e0e800
>> [ 354.465422] RBP: ffff88005f6034b8 R08: 0000000000000014 R09: 0000000000000003
>> [ 354.475508] R10: 0000000000000001 R11: ffff880040f394cc R12: ffff88005f603528
>> [ 354.485567] R13: ffff88003c0e5500 R14: ffffffff822da2e8 R15: ffff88003c0e5500
>> [ 354.495595] FS: 00007f0243c2b700(0000) GS:ffff88005f600000(0000) knlGS:0000000000000000
>> [ 354.505474] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 354.515135] CR2: ffff880049aa8000 CR3: 0000000059271000 CR4: 0000000000000660
>> [ 354.524794] Stack:
>> [ 354.534319] ffffffff81a074fc ffff88005f6034e8 ffffffff8199e138 ffff88003c0e5500
>> [ 354.543981] ffff88005f603528 ffff88003c0e5500 0000000000000000 ffff88005f603518
>> [ 354.553577] ffffffff8199e1af ffff880005300048 ffff88003c0e5500 ffffffff822da2e8
>> [ 354.563160] Call Trace:
>> [ 354.572418]
>> [ 354.572480] [] ? ip_vs_local_reply4+0x1c/0x20
>> [ 354.590458] [] nf_iterate+0x58/0x70
>> [ 354.599372] [] nf_hook_slow+0x5f/0xb0
>> [ 354.608245] [] __ip_local_out+0x9e/0xb0
>> [ 354.617036] [] ? ip_forward_options+0x1a0/0x1a0
>> [ 354.625874] [] ip_local_out+0x17/0x40
>> [ 354.634383] [] ip_build_and_send_pkt+0x148/0x1c0
>> [ 354.642715] [] tcp_v4_send_synack+0x56/0xa0
>> [ 354.650893] [] ? inet_csk_reqsk_queue_hash_add+0x68/0x90
>> [ 354.659083] [] tcp_conn_request+0x95d/0x970
>> [ 354.667196] [] ? __local_bh_enable_ip+0x26/0x90
>> [ 354.675246] [] tcp_v4_conn_request+0x47/0x50
>> [ 354.683254] [] tcp_rcv_state_process+0x183/0xca0
>> [ 354.691004] [] tcp_v4_do_rcv+0x5c/0x1f0
>> [ 354.698533] [] tcp_v4_rcv+0x987/0x9a0
>> [ 354.705968] [] ? ipv4_confirm+0x78/0xf0
>> [ 354.713370] [] ip_local_deliver_finish+0x84/0x120
>> [ 354.720739] [] ip_local_deliver+0x42/0xd0
>> [ 354.728029] [] ? inet_del_offload+0x40/0x40
>> [ 354.735270] [] ip_rcv_finish+0x106/0x320
>> [ 354.742413] [] ip_rcv+0x211/0x370
>> [ 354.749268] [] ? ip_local_deliver_finish+0x120/0x120
>> [ 354.755929] [] __netif_receive_skb_core+0x2cb/0x970
>> [ 354.762535] [] ? nf_nat_setup_info+0x7a/0x2f0
>> [ 354.769131] [] __netif_receive_skb+0x11/0x70
>> [ 354.775481] [] netif_receive_skb_internal+0x1e/0x80
>> [ 354.781638] [] ? nf_hook_slow+0x5f/0xb0
>> [ 354.787771] [] netif_receive_skb+0x9/0x10
>> [ 354.793916] [] br_handle_frame_finish+0x178/0x4b0
>> [ 354.800077] [] ? nf_nat_ipv4_fn+0x167/0x1e0
>> [ 354.806260] [] ? br_handle_local_finish+0x50/0x50
>> [ 354.812405] [] br_nf_pre_routing_finish+0x183/0x360
>> [ 354.818574] [] ? br_netif_receive_skb+0x10/0x10
>> [ 354.824775] [] br_nf_pre_routing+0x2a7/0x380
>> [ 354.830780] [] ? br_nf_forward_ip+0x3f0/0x3f0
>> [ 354.836567] [] nf_iterate+0x58/0x70
>> [ 354.842281] [] nf_hook_slow+0x5f/0xb0
>> [ 354.847886] [] br_handle_frame+0x1a2/0x290
>> [ 354.853520] [] ? br_netif_receive_skb+0x10/0x10
>> [ 354.859206] [] ? br_handle_frame_finish+0x4b0/0x4b0
>> [ 354.864824] [] __netif_receive_skb_core+0x12b/0x970
>> [ 354.870350] [] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
>> [ 354.875880] [] __netif_receive_skb+0x11/0x70
>> [ 354.881293] [] netif_receive_skb_internal+0x1e/0x80
>> [ 354.886653] [] netif_receive_skb+0x9/0x10
>> [ 354.891918] [] xenvif_tx_action+0x693/0x820
>> [ 354.897170] [] xenvif_poll+0x29/0x70
>> [ 354.902426] [] net_rx_action+0x1f7/0x300
>> [ 354.907636] [] __do_softirq+0x103/0x210
>> [ 354.912837] [] irq_exit+0x4b/0xa0
>> [ 354.917940] [] xen_evtchn_do_upcall+0x30/0x40
>> [ 354.923051] [] xen_do_hypervisor_callback+0x1e/0x40
>> [ 354.928089]
>> [ 354.928175] [] ? xen_hypercall_sched_op+0xa/0x20
>> [ 354.938047] [] ? xen_hypercall_sched_op+0xa/0x20
>> [ 354.942985] [] ? xen_safe_halt+0x10/0x20
>> [ 354.947859] [] ? default_idle+0x13/0x20
>> [ 354.952664] [] ? arch_cpu_idle+0xa/0x10
>> [ 354.957470] [] ? default_idle_call+0x2e/0x50
>> [ 354.962291] [] ? cpu_startup_entry+0x272/0x2e0
>> [ 354.967063] [] ? rest_init+0x77/0x80
>> [ 354.971854] [] ? start_kernel+0x438/0x445
>> [ 354.976640] [] ? x86_64_start_reservations+0x2a/0x2c
>> [ 354.981457] [] ? xen_start_kernel+0x555/0x561
>> [ 354.986277] Code: 48 f7 42 58 fe ff ff ff b8 01 00 00 00 74 13 8b 4f 04 85 c9 74 0a 55 48 89 e5 e8 05 fa ff ff 5d f3 c3 f3 c3 66 83 79 10 02 75 d5 <80> b9 20 03 00 00 00 79 cc c3 66 66 66 66 66 66 2e 0f 1f 84 00
>> [ 354.996803] RIP [] ip_vs_out.constprop.25+0x47/0x60
>> [ 355.002021] RSP
>> [ 355.007159] CR2: ffff880049aa8000
>> [ 355.012294] ---[ end trace 5b3b3b699aee4fc6 ]---
>> [ 355.017424] Kernel panic - not syncing: Fatal exception in interrupt
>> [ 355.022732] Kernel Offset: disabled
>> (XEN) [2015-11-11 15:45:14.718] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>
>> (gdb) list *0xffffffff81a074a7
>> 0xffffffff81a074a7 is in ip_vs_out (net/netfilter/ipvs/ip_vs_core.c:1192).
>> 1187		if (unlikely(skb->sk != NULL && hooknum == NF_INET_LOCAL_OUT &&
>> 1188			     af == AF_INET)) {
>> 1189			struct sock *sk = skb->sk;
>> 1190			struct inet_sock *inet = inet_sk(skb->sk);
>> 1191
>> 1192			if (inet && sk->sk_family == PF_INET && inet->nodefrag)
>> 1193				return NF_ACCEPT;
>> 1194		}
>> 1195
>> 1196		if (unlikely(!skb_dst(skb)))
>>
>
> Thanks for the report, please try following patch :

Hi Eric,

Thanks for the patch!
I have it up and running now, but since I don't have a clear trigger, it
will take 1 or 2 days before I can report back.

--
Sander

> diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
> index 1e24fff53e4b..f57b4dcdb233 100644
> --- a/net/netfilter/ipvs/ip_vs_core.c
> +++ b/net/netfilter/ipvs/ip_vs_core.c
> @@ -1176,6 +1176,7 @@ ip_vs_out(struct netns_ipvs *ipvs, unsigned int hooknum, struct sk_buff *skb, in
>  	struct ip_vs_protocol *pp;
>  	struct ip_vs_proto_data *pd;
>  	struct ip_vs_conn *cp;
> +	struct sock *sk;
>
>  	EnterFunction(11);
>
> @@ -1183,13 +1184,12 @@ ip_vs_out(struct netns_ipvs *ipvs, unsigned int hooknum, struct sk_buff *skb, in
>  	if (skb->ipvs_property)
>  		return NF_ACCEPT;
>
> +	sk = skb_to_full_sk(skb);
>  	/* Bad... Do not break raw sockets */
> -	if (unlikely(skb->sk != NULL && hooknum == NF_INET_LOCAL_OUT &&
> +	if (unlikely(sk && hooknum == NF_INET_LOCAL_OUT &&
>  		     af == AF_INET)) {
> -		struct sock *sk = skb->sk;
> -		struct inet_sock *inet = inet_sk(skb->sk);
>
> -		if (inet && sk->sk_family == PF_INET && inet->nodefrag)
> +		if (sk->sk_family == PF_INET && inet_sk(sk)->nodefrag)
>  			return NF_ACCEPT;
>  	}
>
> @@ -1681,6 +1681,7 @@ ip_vs_in(struct netns_ipvs *ipvs, unsigned int hooknum, struct sk_buff *skb, int
>  	struct ip_vs_conn *cp;
>  	int ret, pkts;
>  	int conn_reuse_mode;
> +	struct sock *sk;
>
>  	/* Already marked as IPVS request or reply? */
>  	if (skb->ipvs_property)
> @@ -1708,12 +1709,11 @@ ip_vs_in(struct netns_ipvs *ipvs, unsigned int hooknum, struct sk_buff *skb, int
>  	ip_vs_fill_iph_skb(af, skb, false, &iph);
>
>  	/* Bad... Do not break raw sockets */
> -	if (unlikely(skb->sk != NULL && hooknum == NF_INET_LOCAL_OUT &&
> +	sk = skb_to_full_sk(skb);
> +	if (unlikely(sk && hooknum == NF_INET_LOCAL_OUT &&
>  		     af == AF_INET)) {
> -		struct sock *sk = skb->sk;
> -		struct inet_sock *inet = inet_sk(skb->sk);
>
> -		if (inet && sk->sk_family == PF_INET && inet->nodefrag)
> +		if (sk->sk_family == PF_INET && inet_sk(sk)->nodefrag)
>  			return NF_ACCEPT;
>  	}
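
For readers following the patch: the faulting bytes in the Code: line (<80> b9 20 03 00 00 00) appear to decode to a byte compare at offset 0x320 from RCX, and RCX (ffff880049aa7ce0) + 0x320 equals the CR2 address (ffff880049aa8000). That is consistent with inet->nodefrag being read past the end of an object smaller than a full inet_sock. Since the trace goes through tcp_v4_send_synack, skb->sk is presumably a request socket here, which is what skb_to_full_sk() is meant to handle. The following is a minimal userspace sketch of that idea, not kernel code: every model_* name and struct layout is hypothetical and simplified, and only the state and family constants mirror the kernel's values.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; values mirror the kernel's
 * TCP_LISTEN, TCP_NEW_SYN_RECV and PF_INET, layouts are simplified. */
enum { MODEL_TCP_LISTEN = 10, MODEL_TCP_NEW_SYN_RECV = 12 };
enum { MODEL_PF_INET = 2 };

/* Fields common to every socket flavour. */
struct model_sock {
	unsigned short family;
	unsigned char state;
};

/* Full socket: common part first, then inet-specific fields. */
struct model_inet_sock {
	struct model_sock sk;
	unsigned char nodefrag;
};

/* Request (mini) socket: has no nodefrag field, only a listener pointer. */
struct model_request_sock {
	struct model_sock sk;
	struct model_sock *listener;
};

/* Map a request socket to its full listener; pass anything else through. */
static struct model_sock *model_to_full_sk(struct model_sock *sk)
{
	if (sk && sk->state == MODEL_TCP_NEW_SYN_RECV)
		return ((struct model_request_sock *)sk)->listener;
	return sk;
}

/* The raw-socket check from the patch, expressed on the model types:
 * nodefrag is only read after converting to a full socket. */
static int model_is_nodefrag(struct model_sock *skb_sk)
{
	struct model_sock *sk = model_to_full_sk(skb_sk);

	if (!sk || sk->family != MODEL_PF_INET)
		return 0;
	return ((struct model_inet_sock *)sk)->nodefrag != 0;
}
```

Without the conversion step, the cast to the full socket type would be applied to the smaller request socket, which is the kind of out-of-bounds read the oops above suggests.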