From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eric Dumazet
Subject: Re: Fw: [Bug 201423] New: eth0: hw csum failure
Date: Fri, 19 Oct 2018 15:25:40 -0700
Message-ID:
References: <20181015081519.0bf076bc@xeon-e3> <4693819f-4a76-532f-9b24-d4328183c807@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: Stephen Hemminger, netdev, rossi.f@inwind.it, Dimitris Michailidis
To: Eric Dumazet, Eric Dumazet, andre@tomt.net
Return-path:
Received: from mail-pg1-f193.google.com ([209.85.215.193]:41470 "EHLO mail-pg1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726403AbeJTGdl (ORCPT ); Sat, 20 Oct 2018 02:33:41 -0400
Received: by mail-pg1-f193.google.com with SMTP id 23-v6so16330034pgc.8 for ; Fri, 19 Oct 2018 15:25:42 -0700 (PDT)
In-Reply-To: <4693819f-4a76-532f-9b24-d4328183c807@gmail.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

On 10/19/2018 02:58 PM, Eric Dumazet wrote:
>
>
> On 10/16/2018 06:00 AM, Eric Dumazet wrote:
>> On Mon, Oct 15, 2018 at 11:30 PM Andre Tomt wrote:
>>>
>>> On 15.10.2018 17:41, Eric Dumazet wrote:
>>>> On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
>>>>> Something is changed between 4.17.12 and 4.18, after bisecting the problem I
>>>>> got the following first bad commit:
>>>>>
>>>>> commit 88078d98d1bb085d72af8437707279e203524fa5
>>>>> Author: Eric Dumazet
>>>>> Date: Wed Apr 18 11:43:15 2018 -0700
>>>>>
>>>>> net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
>>>>>
>>>>> After working on IP defragmentation lately, I found that some large
>>>>> packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
>>>>> zero paddings on the last (small) fragment.
>>>>>
>>>>> While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed
>>>>> to CHECKSUM_NONE, forcing a full csum validation, even if all prior
>>>>> fragments had CHECKSUM_COMPLETE set.
>>>>>
>>>>> We can instead compute the checksum of the part we are trimming,
>>>>> usually smaller than the part we keep.
>>>>>
>>>>> Signed-off-by: Eric Dumazet
>>>>> Signed-off-by: David S. Miller
>>>>>
>>>>
>>>> Thanks for bisecting !
>>>>
>>>> This commit is known to expose some NIC/driver bugs.
>>>>
>>>> Look at commit 12b03558cef6d655d0d394f5e98a6fd07c1f6c0f
>>>> ("net: sungem: fix rx checksum support") for one driver needing a fix.
>>>>
>>>> I assume SKY2_HW_NEW_LE is not set on your NIC ?
>>>>
>>>
>>> I've seen similar on several systems with mlx4 cards when using 4.18.x -
>>> that is hw csum failure followed by some backtrace.
>>>
>>> Only seems to happen on systems dealing with quite a bit of UDP.
>>>
>>
>> Strange, because mlx4 on IPv6+UDP should not use CHECKSUM_COMPLETE,
>> but CHECKSUM_UNNECESSARY
>>
>> It would be nice to track this a bit further, maybe by providing the
>> full packet content.
>>
>>> Example from 4.18.10:
>>>> [635607.740574] p0xe0: hw csum failure
>>>> [635607.740598] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
>>>> [635607.740599] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 05/02/2017
>>>> [635607.740599] Call Trace:
>>>> [635607.740602]
>>>> [635607.740611] dump_stack+0x5c/0x7b
>>>> [635607.740617] __skb_gro_checksum_complete+0x9a/0xa0
>>>> [635607.740621] udp6_gro_receive+0x211/0x290
>>>> [635607.740624] ipv6_gro_receive+0x1a8/0x390
>>>> [635607.740627] dev_gro_receive+0x33e/0x550
>>>> [635607.740628] napi_gro_frags+0xa2/0x210
>>>> [635607.740635] mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
>>>> [635607.740648] ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
>>>> [635607.740654] ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
>>>> [635607.740657] mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
>>>> [635607.740658] net_rx_action+0xe0/0x2e0
>>>> [635607.740662] __do_softirq+0xd8/0x2e5
>>>> [635607.740666] irq_exit+0xb4/0xc0
>>>> [635607.740667] do_IRQ+0x85/0xd0
>>>> [635607.740670] common_interrupt+0xf/0xf
>>>> [635607.740671]
>>>> [635607.740675] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
>>>> [635607.740675] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 0f 1f 44 00 00 <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
>>>> [635607.740701] RSP: 0018:ffffa5c206353ea8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd9
>>>> [635607.740703] RAX: ffff8d72ffd20f00 RBX: 00024214f597c5b0 RCX: 000000000000001f
>>>> [635607.740703] RDX: 00024214f597c5b0 RSI: 0000000000020780 RDI: 0000000000000000
>>>> [635607.740704] RBP: 0000000000000004 R08: 002542bfbefa99fa R09: 00000000ffffffff
>>>> [635607.740705] R10: ffffa5c206353e88 R11: 00000000000000c5 R12: ffffffffaf0aaf78
>>>> [635607.740706] R13: ffff8d72ffd297d8 R14: 0000000000000000 R15: 00024214f58c2ed5
>>>> [635607.740709] ? cpuidle_enter_state+0x91/0x2a0
>>>> [635607.740712] do_idle+0x1d0/0x240
>>>> [635607.740715] cpu_startup_entry+0x5f/0x70
>>>> [635607.740719] start_secondary+0x185/0x1a0
>>>> [635607.740722] secondary_startup_64+0xa5/0xb0
>>>> [635607.740731] p0xe0: hw csum failure
>>>> [635607.740745] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
>>>> [635607.740746] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 05/02/2017
>>>> [635607.740746] Call Trace:
>>>> [635607.740747]
>>>> [635607.740750] dump_stack+0x5c/0x7b
>>>> [635607.740755] __skb_checksum_complete+0xb8/0xd0
>>>> [635607.740760] __udp6_lib_rcv+0xa6b/0xa70
>>>> [635607.740767] ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
>>>> [635607.740770] ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
>>>> [635607.740774] ip6_input_finish+0xc0/0x460
>>>> [635607.740776] ip6_input+0x2b/0x90
>>>> [635607.740778] ? ip6_rcv_finish+0x110/0x110
>>>> [635607.740780] ipv6_rcv+0x2cd/0x4b0
>>>> [635607.740783] ? udp6_lib_lookup_skb+0x59/0x80
>>>> [635607.740785] __netif_receive_skb_core+0x455/0xb30
>>>> [635607.740788] ? ipv6_gro_receive+0x1a8/0x390
>>>> [635607.740790] ? netif_receive_skb_internal+0x24/0xb0
>>>> [635607.740792] netif_receive_skb_internal+0x24/0xb0
>>>> [635607.740793] napi_gro_frags+0x165/0x210
>>>> [635607.740796] mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
>>>> [635607.740802] ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
>>>> [635607.740807] ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
>>>> [635607.740810] mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
>>>> [635607.740811] net_rx_action+0xe0/0x2e0
>>>> [635607.740813] __do_softirq+0xd8/0x2e5
>>>> [635607.740816] irq_exit+0xb4/0xc0
>>>> [635607.740817] do_IRQ+0x85/0xd0
>>>> [635607.740820] common_interrupt+0xf/0xf
>>>> [635607.740821]
>>>> [635607.740823] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
>>>> [635607.740823] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 0f 1f 44 00 00 <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
>>>> [635607.740848] RSP: 0018:ffffa5c206353ea8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd9
>>>> [635607.740849] RAX: ffff8d72ffd20f00 RBX: 00024214f597c5b0 RCX: 000000000000001f
>>>> [635607.740850] RDX: 00024214f597c5b0 RSI: 0000000000020780 RDI: 0000000000000000
>>>> [635607.740851] RBP: 0000000000000004 R08: 002542bfbefa99fa R09: 00000000ffffffff
>>>> [635607.740852] R10: ffffa5c206353e88 R11: 00000000000000c5 R12: ffffffffaf0aaf78
>>>> [635607.740853] R13: ffff8d72ffd297d8 R14: 0000000000000000 R15: 00024214f58c2ed5
>>>> [635607.740855] ? cpuidle_enter_state+0x91/0x2a0
>>>> [635607.740857] do_idle+0x1d0/0x240
>>>> [635607.740859] cpu_startup_entry+0x5f/0x70
>>>> [635607.740861] start_secondary+0x185/0x1a0
>>>> [635607.740863] secondary_startup_64+0xa5/0xb0
>
> As a matter of fact Dimitris found the issue in the patch and is working on a fix involving csum_block_sub()
>
> The problem comes from trimming an odd number of bytes.

More exactly, trimming bytes starting at an odd offset.
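
To illustrate why the starting offset of the trimmed bytes matters, here is a minimal user-space sketch. It is not the kernel code and not the pending fix: the helpers fold16(), sum16(), block_sub_sketch() and trim_csum_sketch() are hypothetical names, and the sketch assumes 16-bit folded sums over big-endian words, whereas the kernel's csum_block_sub() works on 32-bit __wsum values. The idea is the same, though: a block that sat at an odd offset had its bytes in the opposite 16-bit lanes, so its standalone sum must be byte-swapped before it is subtracted from the total.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t fold16(uint32_t s)
{
    while (s >> 16)
        s = (s & 0xffff) + (s >> 16);
    return (uint16_t)s;
}

/*
 * Ones' complement sum of buf as big-endian 16-bit words, computed as if
 * buf started at an even offset. An odd trailing byte is zero-padded.
 */
static uint16_t sum16(const uint8_t *buf, size_t len)
{
    uint32_t s = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        s = fold16(s + (((uint32_t)buf[i] << 8) | buf[i + 1]));
    if (i < len)
        s = fold16(s + ((uint32_t)buf[i] << 8));
    return (uint16_t)s;
}

/*
 * Remove a block's standalone sum from a total (hypothetical helper).
 * If the block started at an odd offset, swap its two bytes first so
 * that each byte is subtracted from the lane it was added to.
 */
static uint16_t block_sub_sketch(uint16_t total, uint16_t block, size_t offset)
{
    if (offset & 1)
        block = (uint16_t)((block << 8) | (block >> 8));
    /* Ones' complement subtraction: add the bitwise complement. */
    return fold16((uint32_t)total + (uint16_t)~block);
}

/*
 * Sum of the first new_len bytes, derived from the full sum by
 * subtracting only the trimmed tail (hypothetical helper).
 */
static uint16_t trim_csum_sketch(const uint8_t *buf, size_t len, size_t new_len)
{
    assert(new_len <= len);
    return block_sub_sketch(sum16(buf, len),
                            sum16(buf + new_len, len - new_len),
                            new_len);
}
```

For the 9-byte buffer 01 02 03 04 05 06 07 08 09, the full sum is 0x1914. Trimming to 5 bytes removes a tail that starts at offset 5; its standalone sum is 0x0e10, and only the byte-swapped value 0x100e gives the correct remaining sum 0x0906 when subtracted. Subtracting 0x0e10 directly gives 0x0b04, which is the kind of mismatch that would surface as a "hw csum failure".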