From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: Fw: [Bug 201423] New: eth0: hw csum failure
Date: Mon, 15 Oct 2018 08:41:47 -0700
Message-ID:
References: <20181015081519.0bf076bc@xeon-e3>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: netdev, rossi.f@inwind.it
To: Stephen Hemminger
Return-path:
Received: from mail-io1-f46.google.com ([209.85.166.46]:44627 "EHLO
	mail-io1-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1726422AbeJOX1s (ORCPT );
	Mon, 15 Oct 2018 19:27:48 -0400
Received: by mail-io1-f46.google.com with SMTP id s6-v6so3960007ioa.11
	for ; Mon, 15 Oct 2018 08:42:01 -0700 (PDT)
In-Reply-To: <20181015081519.0bf076bc@xeon-e3>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger wrote:
>
>
>
> Begin forwarded message:
>
> Date: Sun, 14 Oct 2018 10:42:48 +0000
> From: bugzilla-daemon@bugzilla.kernel.org
> To: stephen@networkplumber.org
> Subject: [Bug 201423] New: eth0: hw csum failure
>
>
> https://bugzilla.kernel.org/show_bug.cgi?id=201423
>
> Bug ID: 201423
> Summary: eth0: hw csum failure
> Product: Networking
> Version: 2.5
> Kernel Version: 4.19.0-rc7
> Hardware: Intel
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: normal
> Priority: P1
> Component: Other
> Assignee: stephen@networkplumber.org
> Reporter: rossi.f@inwind.it
> Regression: No
>
> I have a P6T DELUXE V2 motherboard and using the sky2 driver for the ethernet
> ports. I get the following error message:
>
> [ 433.727397] eth0: hw csum failure
> [ 433.727406] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
> [ 433.727406] Hardware name: System manufacturer System Product Name/P6T
> DELUXE V2, BIOS 1202 12/22/2010
> [ 433.727407] Call Trace:
> [ 433.727409]
> [ 433.727415] dump_stack+0x46/0x5b
> [ 433.727419] __skb_checksum_complete+0xb0/0xc0
> [ 433.727423] tcp_v4_rcv+0x528/0xb60
> [ 433.727426] ? ipt_do_table+0x2d0/0x400
> [ 433.727429] ip_local_deliver_finish+0x5a/0x110
> [ 433.727430] ip_local_deliver+0xe1/0xf0
> [ 433.727431] ? ip_sublist_rcv_finish+0x60/0x60
> [ 433.727432] ip_rcv+0xca/0xe0
> [ 433.727434] ? ip_rcv_finish_core.isra.0+0x300/0x300
> [ 433.727436] __netif_receive_skb_one_core+0x4b/0x70
> [ 433.727438] netif_receive_skb_internal+0x4e/0x130
> [ 433.727439] napi_gro_receive+0x6a/0x80
> [ 433.727442] sky2_poll+0x707/0xd20
> [ 433.727446] ? rcu_check_callbacks+0x1b4/0x900
> [ 433.727447] net_rx_action+0x237/0x380
> [ 433.727449] __do_softirq+0xdc/0x1e0
> [ 433.727452] irq_exit+0xa9/0xb0
> [ 433.727453] do_IRQ+0x45/0xc0
> [ 433.727455] common_interrupt+0xf/0xf
> [ 433.727456]
> [ 433.727459] RIP: 0010:cpuidle_enter_state+0x124/0x200
> [ 433.727461] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 8f
> ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 89 e1
> 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
> [ 433.727462] RSP: 0000:ffffc900000a3e98 EFLAGS: 00000282 ORIG_RAX:
> ffffffffffffffde
> [ 433.727463] RAX: ffff880237b1f280 RBX: 0000000000000004 RCX:
> 000000000000001f
> [ 433.727464] RDX: 20c49ba5e353f7cf RSI: 000000002fe419c1 RDI:
> 0000000000000000
> [ 433.727465] RBP: ffff880237b263a0 R08: 0000000000000714 R09:
> 000000650512105d
> [ 433.727465] R10: 00000000ffffffff R11: 0000000000000342 R12:
> 00000064fc2a8b1c
> [ 433.727466] R13: 00000064fc25b35f R14: 0000000000000004 R15:
> ffffffff8204af20
> [ 433.727468] ? cpuidle_enter_state+0x119/0x200
> [ 433.727471] do_idle+0x1bf/0x200
> [ 433.727473] cpu_startup_entry+0x6a/0x70
> [ 433.727475] start_secondary+0x17f/0x1c0
> [ 433.727476] secondary_startup_64+0xa4/0xb0
> [ 441.662954] eth0: hw csum failure
> [ 441.662959] CPU: 4 PID: 4347 Comm: radeon_cs:0 Not tainted 4.19.0-rc7 #19
> [ 441.662960] Hardware name: System manufacturer System Product Name/P6T
> DELUXE V2, BIOS 1202 12/22/2010
> [ 441.662960] Call Trace:
> [ 441.662963]
> [ 441.662968] dump_stack+0x46/0x5b
> [ 441.662972] __skb_checksum_complete+0xb0/0xc0
> [ 441.662975] tcp_v4_rcv+0x528/0xb60
> [ 441.662979] ? ipt_do_table+0x2d0/0x400
> [ 441.662981] ip_local_deliver_finish+0x5a/0x110
> [ 441.662983] ip_local_deliver+0xe1/0xf0
> [ 441.662985] ? ip_sublist_rcv_finish+0x60/0x60
> [ 441.662986] ip_rcv+0xca/0xe0
> [ 441.662988] ? ip_rcv_finish_core.isra.0+0x300/0x300
> [ 441.662990] __netif_receive_skb_one_core+0x4b/0x70
> [ 441.662993] netif_receive_skb_internal+0x4e/0x130
> [ 441.662994] napi_gro_receive+0x6a/0x80
> [ 441.662998] sky2_poll+0x707/0xd20
> [ 441.663000] net_rx_action+0x237/0x380
> [ 441.663002] __do_softirq+0xdc/0x1e0
> [ 441.663005] irq_exit+0xa9/0xb0
> [ 441.663007] do_IRQ+0x45/0xc0
> [ 441.663009] common_interrupt+0xf/0xf
> [ 441.663010]
> [ 441.663012] RIP: 0010:merge+0x22/0xb0
> [ 441.663014] Code: c3 31 c0 c3 90 90 90 90 41 56 41 55 41 54 55 48 89 d5 53
> 48 89 cb 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 c0 <48> 85 c9
> 74 70 48 85 d2 74 6b 49 89 fd 49 89 f6 49 89 e4 eb 14 48
> [ 441.663015] RSP: 0018:ffffc9000090b988 EFLAGS: 00000246 ORIG_RAX:
> ffffffffffffffde
> [ 441.663017] RAX: 0000000000000000 RBX: ffff88021ab2d408 RCX:
> ffff88021ab2d408
> [ 441.663018] RDX: ffff88021ab2d388 RSI: ffffffffa021c440 RDI:
> 0000000000000000
> [ 441.663019] RBP: ffff88021ab2d388 R08: 0000000000005ecf R09:
> 0000000000008500
> [ 441.663020] R10: ffffea000877ec00 R11: ffff880236803500 R12:
> ffffffffa021c440
> [ 441.663021] R13: ffff88021ab2d448 R14: 0000000000000004 R15:
> ffffc9000090b9e0
> [ 441.663048] ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
> [ 441.663063] ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
> [ 441.663065] ? merge+0x57/0xb0
> [ 441.663080] ? radeon_irq_kms_set_irq_n_enabled+0x120/0x120 [radeon]
> [ 441.663082] list_sort+0x8b/0x230
> [ 441.663094] radeon_cs_parser_fini+0xdf/0x110 [radeon]
> [ 441.663110] radeon_cs_ioctl+0x2a4/0x710 [radeon]
> [ 441.663113] ? __switch_to_asm+0x34/0x70
> [ 441.663114] ? __switch_to_asm+0x40/0x70
> [ 441.663130] ? radeon_cs_parser_init+0x20/0x20 [radeon]
> [ 441.663141] drm_ioctl_kernel+0xa3/0xe0 [drm]
> [ 441.663149] drm_ioctl+0x2e2/0x380 [drm]
> [ 441.663164] ? radeon_cs_parser_init+0x20/0x20 [radeon]
> [ 441.663168] ? page_add_new_anon_rmap+0x42/0x70
> [ 441.663171] do_vfs_ioctl+0x9a/0x600
> [ 441.663173] ksys_ioctl+0x35/0x60
> [ 441.663175] __x64_sys_ioctl+0x11/0x20
> [ 441.663177] do_syscall_64+0x3d/0xf0
> [ 441.663179] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 441.663180] RIP: 0033:0x7f9377377f37
> [ 441.663182] Code: 00 00 00 75 0c 48 c7 c0 ff ff ff ff 48 83 c4 18 c3 e8 ad
> db 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 10 00 00 00 0f 05 <48> 3d 01
> f0 ff ff 73 01 c3 48 8b 0d 21 4f 2c 00 f7 d8 64 89 01 48
> [ 441.663183] RSP: 002b:00007f92c3130d28 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [ 441.663185] RAX: ffffffffffffffda RBX: 0000564498327ec0 RCX:
> 00007f9377377f37
> [ 441.663186] RDX: 0000564498337ec8 RSI: 00000000c0206466 RDI:
> 0000000000000010
> [ 441.663186] RBP: 0000564498337ec8 R08: 0000000000000000 R09:
> 0000000000000000
> [ 441.663187] R10: 0000000000000000 R11: 0000000000000246 R12:
> 00000000c0206466
> [ 441.663188] R13: 0000000000000010 R14: 0000000000000000 R15:
> 0000564497a38120
> [ 462.833418] eth0: hw csum failure
> [ 462.833428] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.19.0-rc7 #19
> [ 462.833429] Hardware name: System manufacturer System Product Name/P6T
> DELUXE V2, BIOS 1202 12/22/2010
> [ 462.833429] Call Trace:
> [ 462.833432]
> [ 462.833438] dump_stack+0x46/0x5b
> [ 462.833442] __skb_checksum_complete+0xb0/0xc0
> [ 462.833446] tcp_v4_rcv+0x528/0xb60
> [ 462.833449] ? ipt_do_table+0x2d0/0x400
> [ 462.833452] ip_local_deliver_finish+0x5a/0x110
> [ 462.833454] ip_local_deliver+0xe1/0xf0
> [ 462.833455] ? ip_sublist_rcv_finish+0x60/0x60
> [ 462.833457] ip_rcv+0xca/0xe0
> [ 462.833459] ? ip_rcv_finish_core.isra.0+0x300/0x300
> [ 462.833461] __netif_receive_skb_one_core+0x4b/0x70
> [ 462.833464] netif_receive_skb_internal+0x4e/0x130
> [ 462.833466] napi_gro_receive+0x6a/0x80
> [ 462.833469] sky2_poll+0x707/0xd20
> [ 462.833471] net_rx_action+0x237/0x380
> [ 462.833474] __do_softirq+0xdc/0x1e0
> [ 462.833477] irq_exit+0xa9/0xb0
> [ 462.833479] do_IRQ+0x45/0xc0
> [ 462.833481] common_interrupt+0xf/0xf
> [ 462.833482]
> [ 462.833486] RIP: 0010:cpuidle_enter_state+0x124/0x200
> [ 462.833488] Code: 53 60 89 c3 e8 dd 90 ad ff 65 8b 3d 96 58 a7 7e e8 d1 8f
> ad ff 31 ff 49 89 c4 e8 27 99 ad ff fb 48 ba cf f7 53 e3 a5 9b c4 20 <4c> 89 e1
> 4c 29 e9 48 89 c8 48 c1 f9 3f 48 f7 ea b8 ff ff ff 7f 48
> [ 462.833489] RSP: 0018:ffffc900000a3e98 EFLAGS: 00000282 ORIG_RAX:
> ffffffffffffffde
> [ 462.833491] RAX: ffff880237b1f280 RBX: 0000000000000004 RCX:
> 000000000000001f
> [ 462.833492] RDX: 20c49ba5e353f7cf RSI: 000000002fe419c1 RDI:
> 0000000000000000
> [ 462.833493] RBP: ffff880237b263a0 R08: 0000000000000000 R09:
> 0000000000000000
> [ 462.833494] R10: 00000000ffffffff R11: 0000000000000273 R12:
> 0000006bc3052131
> [ 462.833495] R13: 0000006bc2f99f57 R14: 0000000000000004 R15:
> ffffffff8204af20
> [ 462.833498] ? cpuidle_enter_state+0x119/0x200
> [ 462.833503] do_idle+0x1bf/0x200
> [ 462.833506] cpu_startup_entry+0x6a/0x70
> [ 462.833510] start_secondary+0x17f/0x1c0
> [ 462.833513] secondary_startup_64+0xa4/0xb0
>
> Something is changed between 4.17.12 and 4.18, after bisecting the problem I
> got the following first bad commit:
>
> commit 88078d98d1bb085d72af8437707279e203524fa5
> Author: Eric Dumazet
> Date: Wed Apr 18 11:43:15 2018 -0700
>
>     net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
>
>     After working on IP defragmentation lately, I found that some large
>     packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
>     zero paddings on the last (small) fragment.
>
>     While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed
>     to CHECKSUM_NONE, forcing a full csum validation, even if all prior
>     fragments had CHECKSUM_COMPLETE set.
>
>     We can instead compute the checksum of the part we are trimming,
>     usually smaller than the part we keep.
>
>     Signed-off-by: Eric Dumazet
>     Signed-off-by: David S. Miller
>

Thanks for bisecting!

This commit is known to expose some NIC/driver bugs.

Look at commit 12b03558cef6d655d0d394f5e98a6fd07c1f6c0f
("net: sungem: fix rx checksum support") for one driver needing a fix.

I assume SKY2_HW_NEW_LE is not set on your NIC?
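
For readers following along, the idea in the quoted commit message (checksum only the part being trimmed and subtract it from the hardware-provided sum) can be sketched in user space roughly as below. This is an illustration of the ones' complement arithmetic only, not the kernel code: the helper names csum_words(), csum_sub(), trim_rcsum() and same_csum() are made up for this sketch, and it assumes the trimmed tail starts at an even byte offset. It also shows why a NIC or driver whose sum does not cover exactly the bytes the stack expects would end up with a mismatch after trimming.

```python
def _fold(total: int) -> int:
    """Fold carries back into the low 16 bits (ones' complement add)."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum_words(data: bytes) -> int:
    """16-bit ones' complement sum of data, without the final inversion.

    Odd-length input is padded with one zero byte, as in RFC 1071.
    """
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    return _fold(total)


def csum_sub(a: int, b: int) -> int:
    """Ones' complement subtraction: add the bitwise complement of b."""
    return _fold(a + (~b & 0xFFFF))


def trim_rcsum(hw_csum: int, trimmed_tail: bytes) -> int:
    """Hypothetical helper: adjust a full-packet sum after dropping the tail.

    Assumes the tail begins at an even offset within the summed data.
    """
    return csum_sub(hw_csum, csum_words(trimmed_tail))


def same_csum(a: int, b: int) -> bool:
    """Compare sums, treating 0x0000 and 0xFFFF as the same value (zero)."""
    return a % 0xFFFF == b % 0xFFFF


payload = bytes(range(1, 41))            # 40 bytes the stack keeps
trailer = b"\xde\xad\xbe\xef\x00\x00"    # bytes beyond the IP length, trimmed

# Well-behaved NIC: the sum covers payload and trailer.
good_hw = csum_words(payload + trailer)
good_ok = same_csum(trim_rcsum(good_hw, trailer), csum_words(payload))

# Misbehaving NIC/driver (assumed scenario): the sum silently skipped the
# trailer, so subtracting the trailer now corrupts the result.
bad_hw = csum_words(payload)
bad_ok = same_csum(trim_rcsum(bad_hw, trailer), csum_words(payload))

print("good NIC matches:", good_ok)
print("bad NIC matches:", bad_ok)
```

Before that commit, trimming fell back to CHECKSUM_NONE and a full software check, which would plausibly have hidden a wrong hardware sum on padded frames; with the subtraction approach the hardware value keeps being trusted, so any discrepancy surfaces as "hw csum failure".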