* Kernel panic in eth_header
@ 2019-02-05 16:29 Andrew
  2019-02-05 16:57 ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew @ 2019-02-05 16:29 UTC (permalink / raw)
  To: Netdev

Hi all.

After upgrading a PPPoE BRAS to kernel 4.9.153, I got a kernel panic
after 3 days of uptime.

Unfortunately, the kernel was compiled without debug info. I rebuilt it
with debug info enabled (the function addresses are the same; I compared
the vmlinux symbol maps), and it reports that the panic is at
net/ethernet/eth.c:88.

The kernel panic trace is below. igb is the vendor driver, version
5.3.5.4. What extra info is needed?

[263565.106441] BUG: unable to handle kernel paging request at ffff88015a4d2dd4
[263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
[263565.119030] PGD 1e8f067 [263565.121474] PUD 0
[263565.123580]
[263565.125166] Oops: 0002 [#1] SMP
[263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
[263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O    4.9.153-x86_64 #1
[263565.183996] Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
[263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
[263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] eth_header+0x3b/0xc0
[263565.209225] RSP: 0018:ffff88007fa83c58  EFLAGS: 00010286
[263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: ffff8800682434a0
[263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: ffff880077aab000
[263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 0000000000000574
[263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: ffff8800682434a0
[263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 0000000000000008
[263565.250719] FS:  0000000000000000(0000) GS:ffff88007fa80000(0000) knlGS:0000000000000000
[263565.258894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 00000000000006f0
[263565.271944] Stack:
[263565.274041]  ffff880077aab000 ffff880068243400 ffff88007a745000 ffff8800682434a0
[263565.281582]  0000000000000002 ffffffff81571d09 ffff880068243400 ffff88007fa83d00
[263565.289121]  ffff88007a745000 ffff880077aab000 ffff88007a712000 ffffffff815a8c61
[263565.296661] Call Trace:
[263565.299193]  <IRQ> [263565.301205] [<ffffffff81571d09>] ? neigh_connected_output+0xa9/0x100
[263565.307740]  [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
[263565.313920]  [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
[263565.319319]  [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
[263565.324631]  [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
[263565.330030]  [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
[263565.336556]  [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
[263565.342128]  [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
[263565.348134]  [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
[263565.353361]  [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
[263565.359888]  [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
[263565.366586]  [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
[263565.373373]  [<ffffffff81567632>] ? process_backlog+0x92/0x130
[263565.379291]  [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
[263565.385124]  [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
[263565.390784]  [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
[263565.396008]  [<ffffffff81626cd6>] ? call_function_single_interrupt+0x96/0xa0
[263565.403141]  <EOI> [263565.405153] [<ffffffff81623eb0>] ? __sched_text_end+0x2/0x2
[263565.410907]  [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
[263565.416741]  [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
[263565.422314]  [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
[263565.428492]  [<ffffffff8104c261>] ? start_secondary+0x161/0x180
[263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
[263565.454534] RIP  [<ffffffff8158e48b>] eth_header+0x3b/0xc0
[263565.460124]  RSP <ffff88007fa83c58>
[263565.463696] CR2: ffff88015a4d2dd4
[263565.467104] ---[ end trace a1bcaf3618724adf ]---
[263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
[263565.478245] Kernel Offset: disabled
[263565.481818] Rebooting in 5 seconds..



* Re: Kernel panic in eth_header
  2019-02-05 16:29 Kernel panic in eth_header Andrew
@ 2019-02-05 16:57 ` Eric Dumazet
  2019-02-05 19:34   ` Florian Fainelli
  2019-02-05 19:34   ` Florian Fainelli
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2019-02-05 16:57 UTC (permalink / raw)
  To: Andrew, Netdev



On 02/05/2019 08:29 AM, Andrew wrote:
> Hi all.
> 
> After upgrading a PPPoE BRAS to kernel 4.9.153, I got a kernel panic after 3 days of uptime.
> 
> Unfortunately, the kernel was compiled without debug info. I rebuilt it with debug info enabled (the function addresses are the same; I compared the vmlinux symbol maps), and it reports that the panic is at net/ethernet/eth.c:88.
> 
> The kernel panic trace is below. igb is the vendor driver, version 5.3.5.4. What extra info is needed?
> 
> [263565.106441] BUG: unable to handle kernel paging request at ffff88015a4d2dd4
> [263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
> [263565.119030] PGD 1e8f067 [263565.121474] PUD 0
> [263565.123580]
> [263565.125166] Oops: 0002 [#1] SMP
> [263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
> [263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O    4.9.153-x86_64 #1
> [263565.183996] Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
> [263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
> [263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] eth_header+0x3b/0xc0
> [263565.209225] RSP: 0018:ffff88007fa83c58  EFLAGS: 00010286
> [263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: ffff8800682434a0
> [263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: ffff880077aab000
> [263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 0000000000000574
> [263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: ffff8800682434a0
> [263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 0000000000000008
> [263565.250719] FS:  0000000000000000(0000) GS:ffff88007fa80000(0000) knlGS:0000000000000000
> [263565.258894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 00000000000006f0
> [263565.271944] Stack:
> [263565.274041]  ffff880077aab000 ffff880068243400 ffff88007a745000 ffff8800682434a0
> [263565.281582]  0000000000000002 ffffffff81571d09 ffff880068243400 ffff88007fa83d00
> [263565.289121]  ffff88007a745000 ffff880077aab000 ffff88007a712000 ffffffff815a8c61
> [263565.296661] Call Trace:
> [263565.299193]  <IRQ> [263565.301205] [<ffffffff81571d09>] ? neigh_connected_output+0xa9/0x100
> [263565.307740]  [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
> [263565.313920]  [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
> [263565.319319]  [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
> [263565.324631]  [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
> [263565.330030]  [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
> [263565.336556]  [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
> [263565.342128]  [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
> [263565.348134]  [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
> [263565.353361]  [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
> [263565.359888]  [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
> [263565.366586]  [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
> [263565.373373]  [<ffffffff81567632>] ? process_backlog+0x92/0x130
> [263565.379291]  [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
> [263565.385124]  [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
> [263565.390784]  [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
> [263565.396008]  [<ffffffff81626cd6>] ? call_function_single_interrupt+0x96/0xa0
> [263565.403141]  <EOI> [263565.405153] [<ffffffff81623eb0>] ? __sched_text_end+0x2/0x2
> [263565.410907]  [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
> [263565.416741]  [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
> [263565.422314]  [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
> [263565.428492]  [<ffffffff8104c261>] ? start_secondary+0x161/0x180
> [263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
> [263565.454534] RIP  [<ffffffff8158e48b>] eth_header+0x3b/0xc0
> [263565.460124]  RSP <ffff88007fa83c58>
> [263565.463696] CR2: ffff88015a4d2dd4
> [263565.467104] ---[ end trace a1bcaf3618724adf ]---
> [263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
> [263565.478245] Kernel Offset: disabled
> [263565.481818] Rebooting in 5 seconds..
> 


This is a well-known issue; a fix should land in the stable branches shortly.

diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index f8bbd693c19c247e41839c2d0b5318ca51b23ee8..d95b32af4a0e3f552405c9e61cc372729834160c 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -425,6 +425,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
         * fragment.
         */
 
+       err = -EINVAL;
        /* Find out where to put this fragment.  */
        prev_tail = qp->q.fragments_tail;
        if (!prev_tail)
@@ -501,7 +502,6 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 
 discard_qp:
        inet_frag_kill(&qp->q);
-       err = -EINVAL;
        __IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
 err:
        kfree_skb(skb);





* Re: Kernel panic in eth_header
  2019-02-05 16:57 ` Eric Dumazet
@ 2019-02-05 19:34   ` Florian Fainelli
  2019-02-05 20:13     ` Eric Dumazet
  2019-02-05 19:34   ` Florian Fainelli
  1 sibling, 1 reply; 9+ messages in thread
From: Florian Fainelli @ 2019-02-05 19:34 UTC (permalink / raw)
  To: Eric Dumazet, Andrew, Netdev

On 2/5/19 8:57 AM, Eric Dumazet wrote:
> 
> 
> On 02/05/2019 08:29 AM, Andrew wrote:
>> Hi all.
>>
>> After upgrading a PPPoE BRAS to kernel 4.9.153, I got a kernel panic after 3 days of uptime.
>>
>> Unfortunately, the kernel was compiled without debug info. I rebuilt it with debug info enabled (the function addresses are the same; I compared the vmlinux symbol maps), and it reports that the panic is at net/ethernet/eth.c:88.
>>
>> The kernel panic trace is below. igb is the vendor driver, version 5.3.5.4. What extra info is needed?
>>
>> [263565.106441] BUG: unable to handle kernel paging request at ffff88015a4d2dd4
>> [263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>> [263565.119030] PGD 1e8f067 [263565.121474] PUD 0
>> [263565.123580]
>> [263565.125166] Oops: 0002 [#1] SMP
>> [263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
>> [263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O    4.9.153-x86_64 #1
>> [263565.183996] Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
>> [263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
>> [263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>> [263565.209225] RSP: 0018:ffff88007fa83c58  EFLAGS: 00010286
>> [263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: ffff8800682434a0
>> [263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: ffff880077aab000
>> [263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 0000000000000574
>> [263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: ffff8800682434a0
>> [263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 0000000000000008
>> [263565.250719] FS:  0000000000000000(0000) GS:ffff88007fa80000(0000) knlGS:0000000000000000
>> [263565.258894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 00000000000006f0
>> [263565.271944] Stack:
>> [263565.274041]  ffff880077aab000 ffff880068243400 ffff88007a745000 ffff8800682434a0
>> [263565.281582]  0000000000000002 ffffffff81571d09 ffff880068243400 ffff88007fa83d00
>> [263565.289121]  ffff88007a745000 ffff880077aab000 ffff88007a712000 ffffffff815a8c61
>> [263565.296661] Call Trace:
>> [263565.299193]  <IRQ> [263565.301205] [<ffffffff81571d09>] ? neigh_connected_output+0xa9/0x100
>> [263565.307740]  [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
>> [263565.313920]  [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
>> [263565.319319]  [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
>> [263565.324631]  [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
>> [263565.330030]  [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
>> [263565.336556]  [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
>> [263565.342128]  [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
>> [263565.348134]  [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
>> [263565.353361]  [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
>> [263565.359888]  [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
>> [263565.366586]  [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
>> [263565.373373]  [<ffffffff81567632>] ? process_backlog+0x92/0x130
>> [263565.379291]  [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
>> [263565.385124]  [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
>> [263565.390784]  [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
>> [263565.396008]  [<ffffffff81626cd6>] ? call_function_single_interrupt+0x96/0xa0
>> [263565.403141]  <EOI> [263565.405153] [<ffffffff81623eb0>] ? __sched_text_end+0x2/0x2
>> [263565.410907]  [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
>> [263565.416741]  [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
>> [263565.422314]  [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
>> [263565.428492]  [<ffffffff8104c261>] ? start_secondary+0x161/0x180
>> [263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
>> [263565.454534] RIP  [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>> [263565.460124]  RSP <ffff88007fa83c58>
>> [263565.463696] CR2: ffff88015a4d2dd4
>> [263565.467104] ---[ end trace a1bcaf3618724adf ]---
>> [263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
>> [263565.478245] Kernel Offset: disabled
>> [263565.481818] Rebooting in 5 seconds..
>>
> 
> 
> This is a well-known issue; a fix should land in the stable branches shortly.

Is Peter or yourself doing the backport? David would only take care of
the two most recent stable kernels.

Sorry about missing that change as part of the FragmentSmack backport to
4.9...

> 
> diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
> index f8bbd693c19c247e41839c2d0b5318ca51b23ee8..d95b32af4a0e3f552405c9e61cc372729834160c 100644
> --- a/net/ipv4/ip_fragment.c
> +++ b/net/ipv4/ip_fragment.c
> @@ -425,6 +425,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
>          * fragment.
>          */
>  
> +       err = -EINVAL;
>         /* Find out where to put this fragment.  */
>         prev_tail = qp->q.fragments_tail;
>         if (!prev_tail)
> @@ -501,7 +502,6 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
>  
>  discard_qp:
>         inet_frag_kill(&qp->q);
> -       err = -EINVAL;
>         __IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
>  err:
>         kfree_skb(skb);
> 
> 
> 


-- 
Florian


* Re: Kernel panic in eth_header
  2019-02-05 19:34   ` Florian Fainelli
@ 2019-02-05 20:13     ` Eric Dumazet
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2019-02-05 20:13 UTC (permalink / raw)
  To: Florian Fainelli, Eric Dumazet, Andrew, Netdev



On 02/05/2019 11:34 AM, Florian Fainelli wrote:
> On 2/5/19 8:57 AM, Eric Dumazet wrote:
>>
>>
>> On 02/05/2019 08:29 AM, Andrew wrote:
>>> Hi all.
>>>
>>> After upgrading a PPPoE BRAS to kernel 4.9.153, I got a kernel panic after 3 days of uptime.
>>>
>>> Unfortunately, the kernel was compiled without debug info. I rebuilt it with debug info enabled (the function addresses are the same; I compared the vmlinux symbol maps), and it reports that the panic is at net/ethernet/eth.c:88.
>>>
>>> The kernel panic trace is below. igb is the vendor driver, version 5.3.5.4. What extra info is needed?
>>>
>>> [263565.106441] BUG: unable to handle kernel paging request at ffff88015a4d2dd4
>>> [263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>>> [263565.119030] PGD 1e8f067 [263565.121474] PUD 0
>>> [263565.123580]
>>> [263565.125166] Oops: 0002 [#1] SMP
>>> [263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
>>> [263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O    4.9.153-x86_64 #1
>>> [263565.183996] Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
>>> [263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
>>> [263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>>> [263565.209225] RSP: 0018:ffff88007fa83c58  EFLAGS: 00010286
>>> [263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: ffff8800682434a0
>>> [263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: ffff880077aab000
>>> [263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 0000000000000574
>>> [263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: ffff8800682434a0
>>> [263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 0000000000000008
>>> [263565.250719] FS:  0000000000000000(0000) GS:ffff88007fa80000(0000) knlGS:0000000000000000
>>> [263565.258894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 00000000000006f0
>>> [263565.271944] Stack:
>>> [263565.274041]  ffff880077aab000 ffff880068243400 ffff88007a745000 ffff8800682434a0
>>> [263565.281582]  0000000000000002 ffffffff81571d09 ffff880068243400 ffff88007fa83d00
>>> [263565.289121]  ffff88007a745000 ffff880077aab000 ffff88007a712000 ffffffff815a8c61
>>> [263565.296661] Call Trace:
>>> [263565.299193]  <IRQ> [263565.301205] [<ffffffff81571d09>] ? neigh_connected_output+0xa9/0x100
>>> [263565.307740]  [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
>>> [263565.313920]  [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
>>> [263565.319319]  [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
>>> [263565.324631]  [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
>>> [263565.330030]  [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
>>> [263565.336556]  [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
>>> [263565.342128]  [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
>>> [263565.348134]  [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
>>> [263565.353361]  [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
>>> [263565.359888]  [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
>>> [263565.366586]  [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
>>> [263565.373373]  [<ffffffff81567632>] ? process_backlog+0x92/0x130
>>> [263565.379291]  [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
>>> [263565.385124]  [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
>>> [263565.390784]  [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
>>> [263565.396008]  [<ffffffff81626cd6>] ? call_function_single_interrupt+0x96/0xa0
>>> [263565.403141]  <EOI> [263565.405153] [<ffffffff81623eb0>] ? __sched_text_end+0x2/0x2
>>> [263565.410907]  [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
>>> [263565.416741]  [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
>>> [263565.422314]  [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
>>> [263565.428492]  [<ffffffff8104c261>] ? start_secondary+0x161/0x180
>>> [263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
>>> [263565.454534] RIP  [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>>> [263565.460124]  RSP <ffff88007fa83c58>
>>> [263565.463696] CR2: ffff88015a4d2dd4
>>> [263565.467104] ---[ end trace a1bcaf3618724adf ]---
>>> [263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
>>> [263565.478245] Kernel Offset: disabled
>>> [263565.481818] Rebooting in 5 seconds..
>>>
>>
>>
>> This is a well-known issue; a fix should land in the stable branches shortly.
> 
> Is Peter or yourself doing the backport? David would only take care of
> the two most recent stable kernels.
> 
> Sorry about missing that change as part of the FragmentSmack backport to
> 4.9...


Greg took care of this for the trees he manages.





* Re: Kernel panic in eth_header
  2019-02-05 19:34   ` Florian Fainelli
@ 2019-02-05 20:21     ` Andrew
  2019-02-05 20:28       ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew @ 2019-02-05 20:21 UTC (permalink / raw)
  To: Netdev

On 05.02.2019 21:34, Florian Fainelli wrote:
> On 2/5/19 8:57 AM, Eric Dumazet wrote:
>>
>> On 02/05/2019 08:29 AM, Andrew wrote:
>>> Hi all.
>>>
>>> After upgrading a PPPoE BRAS to kernel 4.9.153, I got a kernel panic after 3 days of uptime.
>>>
>>> Unfortunately, the kernel was compiled without debug info. I rebuilt it with debug info enabled (the function addresses are the same; I compared the vmlinux symbol maps), and it reports that the panic is at net/ethernet/eth.c:88.
>>>
>>> The kernel panic trace is below. igb is the vendor driver, version 5.3.5.4. What extra info is needed?
>>>
>>> [263565.106441] BUG: unable to handle kernel paging request at ffff88015a4d2dd4
>>> [263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>>> [263565.119030] PGD 1e8f067 [263565.121474] PUD 0
>>> [263565.123580]
>>> [263565.125166] Oops: 0002 [#1] SMP
>>> [263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
>>> [263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O    4.9.153-x86_64 #1
>>> [263565.183996] Hardware name: System manufacturer System Product Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
>>> [263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
>>> [263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>>> [263565.209225] RSP: 0018:ffff88007fa83c58  EFLAGS: 00010286
>>> [263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: ffff8800682434a0
>>> [263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: ffff880077aab000
>>> [263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 0000000000000574
>>> [263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: ffff8800682434a0
>>> [263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 0000000000000008
>>> [263565.250719] FS:  0000000000000000(0000) GS:ffff88007fa80000(0000) knlGS:0000000000000000
>>> [263565.258894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 00000000000006f0
>>> [263565.271944] Stack:
>>> [263565.274041]  ffff880077aab000 ffff880068243400 ffff88007a745000 ffff8800682434a0
>>> [263565.281582]  0000000000000002 ffffffff81571d09 ffff880068243400 ffff88007fa83d00
>>> [263565.289121]  ffff88007a745000 ffff880077aab000 ffff88007a712000 ffffffff815a8c61
>>> [263565.296661] Call Trace:
>>> [263565.299193]  <IRQ> [263565.301205] [<ffffffff81571d09>] ? neigh_connected_output+0xa9/0x100
>>> [263565.307740]  [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
>>> [263565.313920]  [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
>>> [263565.319319]  [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
>>> [263565.324631]  [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
>>> [263565.330030]  [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
>>> [263565.336556]  [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
>>> [263565.342128]  [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
>>> [263565.348134]  [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
>>> [263565.353361]  [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
>>> [263565.359888]  [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
>>> [263565.366586]  [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
>>> [263565.373373]  [<ffffffff81567632>] ? process_backlog+0x92/0x130
>>> [263565.379291]  [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
>>> [263565.385124]  [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
>>> [263565.390784]  [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
>>> [263565.396008]  [<ffffffff81626cd6>] ? call_function_single_interrupt+0x96/0xa0
>>> [263565.403141]  <EOI> [263565.405153] [<ffffffff81623eb0>] ? __sched_text_end+0x2/0x2
>>> [263565.410907]  [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
>>> [263565.416741]  [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
>>> [263565.422314]  [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
>>> [263565.428492]  [<ffffffff8104c261>] ? start_secondary+0x161/0x180
>>> [263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
>>> [263565.454534] RIP  [<ffffffff8158e48b>] eth_header+0x3b/0xc0
>>> [263565.460124]  RSP <ffff88007fa83c58>
>>> [263565.463696] CR2: ffff88015a4d2dd4
>>> [263565.467104] ---[ end trace a1bcaf3618724adf ]---
>>> [263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
>>> [263565.478245] Kernel Offset: disabled
>>> [263565.481818] Rebooting in 5 seconds..
>>>
>>
>> This is a well-known issue; a fix should come shortly in the stable branches.
> Is Peter or yourself doing the backport? David would only take care of
> the two most recent stable kernels.
>
> Sorry about missing that change as part of the FragmentSmack backport to
> 4.9...

I think the backport will be trivial - at least the patch applies cleanly to 
4.9 (just with offset differences).

I'll test it.

Btw, are there any test conditions to quickly check whether the patch 
helps? The crash reproduces at unpredictable intervals (tens of hours 
of quite heavy load).


>> diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
>> index f8bbd693c19c247e41839c2d0b5318ca51b23ee8..d95b32af4a0e3f552405c9e61cc372729834160c 100644
>> --- a/net/ipv4/ip_fragment.c
>> +++ b/net/ipv4/ip_fragment.c
>> @@ -425,6 +425,7 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
>>           * fragment.
>>           */
>>   
>> +       err = -EINVAL;
>>          /* Find out where to put this fragment.  */
>>          prev_tail = qp->q.fragments_tail;
>>          if (!prev_tail)
>> @@ -501,7 +502,6 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
>>   
>>   discard_qp:
>>          inet_frag_kill(&qp->q);
>> -       err = -EINVAL;
>>          __IP_INC_STATS(net, IPSTATS_MIB_REASM_OVERLAPS);
>>   err:
>>          kfree_skb(skb);
>>
>>
>>
>
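
For readers following the diff: it moves the `err = -EINVAL` assignment
from the `discard_qp:` label to a point before the insertion logic, so
later jumps to the shared `err:` label carry an error value that was set
up front. Below is a minimal userspace sketch of that general pattern. It
is not the kernel code: `toy_buf`, `toy_queue` and the `patched` flag are
made-up names, and the assumption that `err` could still be 0 on such a
path in the unpatched 4.9 backport is a reading of the diff, not
something stated in the thread.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a packet buffer; not a kernel type. */
struct toy_buf {
	int freed;	/* set to 1 where the real code would free the buffer */
};

/*
 * Toy queueing function with a shared error label.
 * Intended contract: return 0 only if the queue kept the buffer;
 * return a negative errno if the buffer was dropped.
 * patched == 0 models "err not set before the goto",
 * patched == 1 models the ordering introduced by the diff.
 */
static int toy_queue(struct toy_buf *b, int is_duplicate, int patched)
{
	int err = 0;	/* assumed: an earlier step succeeded and left err at 0 */

	if (patched)
		err = -EINVAL;	/* set before any later "goto err" */

	if (is_duplicate)
		goto err;	/* buffer is dropped through the shared label */

	return 0;		/* queue keeps the buffer */

err:
	b->freed = 1;		/* stands in for kfree_skb(skb) */
	return err;
}

/* Returns 1 when the return value and the buffer state agree. */
static int toy_contract_holds(int ret, const struct toy_buf *b)
{
	return (ret == 0) == (b->freed == 0);
}
```

With `patched == 0` and a duplicate buffer, `toy_queue()` returns 0 even
though the buffer was dropped, so a caller trusting the return value would
keep using released memory. With `patched == 1` the same path returns
-EINVAL and the contract holds.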



* Re: Kernel panic in eth_header
  2019-02-05 20:21     ` Andrew
@ 2019-02-05 20:28       ` Eric Dumazet
  2019-02-05 23:09         ` Andrew
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2019-02-05 20:28 UTC (permalink / raw)
  To: Andrew, Netdev



On 02/05/2019 12:21 PM, Andrew wrote:

> I think the backport will be trivial - at least the patch applies cleanly to 4.9 (just with offset differences).
> 
> I'll test it.
> 
> Btw, are there any test conditions to quickly check whether the patch helps? The crash reproduces at unpredictable intervals (tens of hours of quite heavy load).
>

Build your kernel with CONFIG_KASAN=y

Then run the tests Peter wrote.

4c3510483d26420d2c2c7cc075ad872286cc5932 selftests: net: ip_defrag: cover new IPv6 defrag behavior
3271a4821882a64214acc1bd7b173900ec70c9bf selftests: net: fix/improve ip_defrag selftest
bccc17118bcf3c62c947361d51760334f6602f43 selftests/net: add ipv6 tests to ip_defrag selftest
02c7f38b7ace9f1b2ddb7a88139127eef4cf8706 selftests/net: add ip_defrag selftest




* Re: Kernel panic in eth_header
  2019-02-05 20:28       ` Eric Dumazet
@ 2019-02-05 23:09         ` Andrew
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew @ 2019-02-05 23:09 UTC (permalink / raw)
  To: Netdev

Thanks. At least the IPv4 tests passed (the IPv4 overlap test fails the 
first time but passes on subsequent runs; on IPv6 I got 'send_fragment: 
operation not permitted', which I didn't look into because I don't use 
IPv6). No KASAN warnings in dmesg.

On 05.02.2019 22:28, Eric Dumazet wrote:
>
> On 02/05/2019 12:21 PM, Andrew wrote:
>
>> I think the backport will be trivial - at least the patch applies cleanly to 4.9 (just with offset differences).
>>
>> I'll test it.
>>
>> Btw, are there any test conditions to quickly check whether the patch helps? The crash reproduces at unpredictable intervals (tens of hours of quite heavy load).
>>
> Build your kernel with CONFIG_KASAN=y
>
> Then run the tests Peter wrote.
>
> 4c3510483d26420d2c2c7cc075ad872286cc5932 selftests: net: ip_defrag: cover new IPv6 defrag behavior
> 3271a4821882a64214acc1bd7b173900ec70c9bf selftests: net: fix/improve ip_defrag selftest
> bccc17118bcf3c62c947361d51760334f6602f43 selftests/net: add ipv6 tests to ip_defrag selftest
> 02c7f38b7ace9f1b2ddb7a88139127eef4cf8706 selftests/net: add ip_defrag selftest
>
>



* Kernel panic in eth_header
@ 2019-02-05 16:09 Andrew
  0 siblings, 0 replies; 9+ messages in thread
From: Andrew @ 2019-02-05 16:09 UTC (permalink / raw)
  To: Netdev

Hi all.

After upgrading a PPPoE BRAS to kernel 4.9.153 I got a kernel panic 
after 3 days of uptime.

Unfortunately the kernel was compiled without debug info; I rebuilt it 
with debug info enabled (the function addresses are the same - I 
compared the vmlinux symbol maps) - it says that the panic is in 
net/ethernet/eth.c:88

The kernel panic trace is below. igb is from the vendor, ver. 5.3.5.4. 
What extra info is needed?

[263565.106441] BUG: unable to handle kernel paging request at 
ffff88015a4d2dd4
[263565.113527] IP: [<ffffffff8158e48b>] eth_header+0x3b/0xc0
[263565.119030] PGD 1e8f067 [263565.121474] PUD 0
[263565.123580]
[263565.125166] Oops: 0002 [#1] SMP
[263565.128398] Modules linked in: xt_nat iptable_nat nf_conntrack_ipv4 
nf_defrag_ipv4 nf_nat_ipv4 iptable_filter xt_length xt_TCPMSS xt_tcpudp 
xt_mark xt_dscp iptable_mangle ip_tables x_tables nf_nat_pptp 
nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_proto_gre nf_nat 
nf_conntrack sch_sfq sch_htb cls_u32 sch_ingress sch_prio sch_tbf 
cls_flow cls_fw act_police ifb 8021q mrp garp stp llc softdog pppoe 
pppox ppp_generic slhc i2c_nforce2 i2c_core igb(O) parport_pc dca 
parport thermal asus_atk0110 fan ptp k10temp hwmon pps_core nv_tco
[263565.176083] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G           O    
4.9.153-x86_64 #1
[263565.183996] Hardware name: System manufacturer System Product 
Name/M2N-E, BIOS ASUS M2N-E ACPI BIOS Revision 5001 03/23/2010
[263565.195289] task: ffff88007d0f5200 task.stack: ffffc9000006c000
[263565.201295] RIP: 0010:[<ffffffff8158e48b>] [<ffffffff8158e48b>] 
eth_header+0x3b/0xc0
[263565.209225] RSP: 0018:ffff88007fa83c58  EFLAGS: 00010286
[263565.214622] RAX: ffff88015a4d2dc8 RBX: 0000000000000008 RCX: 
ffff8800682434a0
[263565.221843] RDX: ffff88015a4d2dc8 RSI: ffff88015a4d2dc8 RDI: 
ffff880077aab000
[263565.229062] RBP: ffff88007b663d90 R08: ffff88007b663d90 R09: 
0000000000000574
[263565.236281] R10: ffff88007d1fa000 R11: 0000000000000000 R12: 
ffff8800682434a0
[263565.243501] R13: ffff88007d1fa000 R14: 0000000000000574 R15: 
0000000000000008
[263565.250719] FS:  0000000000000000(0000) GS:ffff88007fa80000(0000) 
knlGS:0000000000000000
[263565.258894] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[263565.264725] CR2: ffff88015a4d2dd4 CR3: 000000007ad73000 CR4: 
00000000000006f0
[263565.271944] Stack:
[263565.274041]  ffff880077aab000 ffff880068243400 ffff88007a745000 
ffff8800682434a0
[263565.281582]  0000000000000002 ffffffff81571d09 ffff880068243400 
ffff88007fa83d00
[263565.289121]  ffff88007a745000 ffff880077aab000 ffff88007a712000 
ffffffff815a8c61
[263565.296661] Call Trace:
[263565.299193]  <IRQ> [263565.301205] [<ffffffff81571d09>] ? 
neigh_connected_output+0xa9/0x100
[263565.307740]  [<ffffffff815a8c61>] ? ip_finish_output2+0x221/0x400
[263565.313920]  [<ffffffff8159e144>] ? nf_iterate+0x54/0x60
[263565.319319]  [<ffffffff815ab2fa>] ? ip_output+0x6a/0xf0
[263565.324631]  [<ffffffff8159e102>] ? nf_iterate+0x12/0x60
[263565.330030]  [<ffffffff815aa6e0>] ? ip_fragment.constprop.5+0x80/0x80
[263565.336556]  [<ffffffff815a73b6>] ? ip_forward+0x396/0x480
[263565.342128]  [<ffffffff815a6fb0>] ? ip_check_defrag+0x1e0/0x1e0
[263565.348134]  [<ffffffff815a5a2e>] ? ip_rcv+0x2ae/0x370
[263565.353361]  [<ffffffffa0107c02>] ? pppoe_rcv_core+0xd2/0x160 [pppoe]
[263565.359888]  [<ffffffff815a5170>] ? ip_local_deliver_finish+0x1d0/0x1d0
[263565.366586]  [<ffffffff81562a57>] ? __netif_receive_skb_core+0x527/0xa80
[263565.373373]  [<ffffffff81567632>] ? process_backlog+0x92/0x130
[263565.379291]  [<ffffffff8156745d>] ? net_rx_action+0x24d/0x390
[263565.385124]  [<ffffffff81628374>] ? __do_softirq+0xf4/0x2a0
[263565.390784]  [<ffffffff8107136c>] ? irq_exit+0xbc/0xd0
[263565.396008]  [<ffffffff81626cd6>] ? 
call_function_single_interrupt+0x96/0xa0
[263565.403141]  <EOI> [263565.405153] [<ffffffff81623eb0>] ? 
__sched_text_end+0x2/0x2
[263565.410907]  [<ffffffff81624182>] ? native_safe_halt+0x2/0x10
[263565.416741]  [<ffffffff81623ec8>] ? default_idle+0x18/0xd0
[263565.422314]  [<ffffffff810a7a46>] ? cpu_startup_entry+0x126/0x220
[263565.428492]  [<ffffffff8104c261>] ? start_secondary+0x161/0x180
[263565.434496] Code: 0e 00 00 00 53 89 d3 49 89 cc 4c 89 c5 45 89 ce e8 
bb 8a fc ff 66 83 fb 01 48 89 c6 74 44 66 83 fb 04 74 3e 66 c1 c3 08 48 
85 ed <66> 89 58 0c 74 40 8b 45 00 4d 85 e4 89 46 06 0f b7 45 04 66 89
[263565.454534] RIP  [<ffffffff8158e48b>] eth_header+0x3b/0xc0
[263565.460124]  RSP <ffff88007fa83c58>
[263565.463696] CR2: ffff88015a4d2dd4
[263565.467104] ---[ end trace a1bcaf3618724adf ]---
[263565.471807] Kernel panic - not syncing: Fatal exception in interrupt
[263565.478245] Kernel Offset: disabled
[263565.481818] Rebooting in 5 seconds..
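
A note on reading the dump above, as a sketch rather than an authoritative
analysis. On x86 the number after "Oops:" is the page-fault error code:
bit 0 clear means the page was not present, bit 1 set means a write, bit 2
clear means the fault came from kernel mode. The faulting address in CR2
is RAX plus 0xc, which matches the offset of the protocol field in an
Ethernet header (6 + 6 bytes of addresses), and RBX holds 0x0008, the
byte-swapped form of 0x0800 (the IPv4 ethertype). The helper names and the
`toy_ethhdr` struct below are made up for illustration; the register
values are copied from the trace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86 page-fault error code bits, as printed after "Oops:". */
#define PF_PROT  0x01u	/* 0: page not present, 1: protection violation */
#define PF_WRITE 0x02u	/* 0: read, 1: write */
#define PF_USER  0x04u	/* 0: kernel mode, 1: user mode */

struct fault_info {
	int not_present;
	int is_write;
	int from_user;
};

/* Split the error code into the three low flag bits. */
static struct fault_info decode_oops(uint32_t code)
{
	struct fault_info f;

	f.not_present = !(code & PF_PROT);
	f.is_write = !!(code & PF_WRITE);
	f.from_user = !!(code & PF_USER);
	return f;
}

/* Illustrative Ethernet header layout: dst MAC, src MAC, ethertype. */
struct toy_ethhdr {
	uint8_t dst[6];
	uint8_t src[6];
	uint16_t proto;
};

/* Values copied from the register dump in the trace. */
static const uint64_t OOPS_RAX = 0xffff88015a4d2dc8ULL;
static const uint64_t OOPS_CR2 = 0xffff88015a4d2dd4ULL;

/* Swap the two bytes of a 16-bit value (what htons does on little-endian). */
static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Returns 1 if CR2 is exactly the protocol field of a header at RAX. */
static int fault_is_proto_store(void)
{
	return OOPS_RAX + offsetof(struct toy_ethhdr, proto) == OOPS_CR2;
}
```

Calling `decode_oops(0x0002)` and `fault_is_proto_store()` from a small
`main()` is enough to try this. Under this reading, the crash is a kernel
write of the ethertype into a header pointer that points at unmapped
memory, which is consistent with the eth.c:88 location reported above.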



end of thread, other threads:[~2019-02-05 23:11 UTC | newest]

Thread overview: 9+ messages
2019-02-05 16:29 Kernel panic in eth_header Andrew
2019-02-05 16:57 ` Eric Dumazet
2019-02-05 19:34   ` Florian Fainelli
2019-02-05 20:13     ` Eric Dumazet
2019-02-05 19:34   ` Florian Fainelli
2019-02-05 20:21     ` Andrew
2019-02-05 20:28       ` Eric Dumazet
2019-02-05 23:09         ` Andrew
  -- strict thread matches above, loose matches on Subject: below --
2019-02-05 16:09 Andrew
