From: Eric Dumazet <eric.dumazet@gmail.com>
To: Eric Dumazet <eric.dumazet@gmail.com>,
	Eric Dumazet <edumazet@google.com>,
	andre@tomt.net
Cc: Stephen Hemminger <stephen@networkplumber.org>,
	netdev <netdev@vger.kernel.org>,
	rossi.f@inwind.it, Dimitris Michailidis <dmichail@google.com>
Subject: Re: Fw: [Bug 201423] New: eth0: hw csum failure
Date: Fri, 19 Oct 2018 15:25:40 -0700
Message-ID: <cd6f5d1f-bd89-5da2-96be-85c2311ca0a1@gmail.com>
In-Reply-To: <4693819f-4a76-532f-9b24-d4328183c807@gmail.com>



On 10/19/2018 02:58 PM, Eric Dumazet wrote:
> 
> 
> On 10/16/2018 06:00 AM, Eric Dumazet wrote:
>> On Mon, Oct 15, 2018 at 11:30 PM Andre Tomt <andre@tomt.net> wrote:
>>>
>>> On 15.10.2018 17:41, Eric Dumazet wrote:
>>>> On Mon, Oct 15, 2018 at 8:15 AM Stephen Hemminger
>>>>> Something is changed between 4.17.12 and 4.18, after bisecting the problem I
>>>>> got the following first bad commit:
>>>>>
>>>>> commit 88078d98d1bb085d72af8437707279e203524fa5
>>>>> Author: Eric Dumazet <edumazet@google.com>
>>>>> Date:   Wed Apr 18 11:43:15 2018 -0700
>>>>>
>>>>>      net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
>>>>>
>>>>>      After working on IP defragmentation lately, I found that some large
>>>>>      packets defeat CHECKSUM_COMPLETE optimization because of NIC adding
>>>>>      zero paddings on the last (small) fragment.
>>>>>
>>>>>      While removing the padding with pskb_trim_rcsum(), we set skb->ip_summed
>>>>>      to CHECKSUM_NONE, forcing a full csum validation, even if all prior
>>>>>      fragments had CHECKSUM_COMPLETE set.
>>>>>
>>>>>      We can instead compute the checksum of the part we are trimming,
>>>>>      usually smaller than the part we keep.
>>>>>
>>>>>      Signed-off-by: Eric Dumazet <edumazet@google.com>
>>>>>      Signed-off-by: David S. Miller <davem@davemloft.net>
>>>>>
>>>>
>>>> Thanks for bisecting!
>>>>
>>>> This commit is known to expose some NIC/driver bugs.
>>>>
>>>> Look at commit 12b03558cef6d655d0d394f5e98a6fd07c1f6c0f
>>>> ("net: sungem: fix rx checksum support")  for one driver needing a fix.
>>>>
>>>> I assume SKY2_HW_NEW_LE is not set on your NIC ?
>>>>
>>>
>>> I've seen similar on several systems with mlx4 cards when using 4.18.x -
>>> that is hw csum failure followed by some backtrace.
>>>
>>> Only seems to happen on systems dealing with quite a bit of UDP.
>>>
>>
>> Strange, because mlx4 on IPv6+UDP should not use CHECKSUM_COMPLETE,
>> but CHECKSUM_UNNECESSARY
>>
>> It would be nice to track this a bit further, maybe by providing the
>> full packet content.
>>
>>> Example from 4.18.10:
>>>> [635607.740574] p0xe0: hw csum failure
>>>> [635607.740598] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
>>>> [635607.740599] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 05/02/2017
>>>> [635607.740599] Call Trace:
>>>> [635607.740602]  <IRQ>
>>>> [635607.740611]  dump_stack+0x5c/0x7b
>>>> [635607.740617]  __skb_gro_checksum_complete+0x9a/0xa0
>>>> [635607.740621]  udp6_gro_receive+0x211/0x290
>>>> [635607.740624]  ipv6_gro_receive+0x1a8/0x390
>>>> [635607.740627]  dev_gro_receive+0x33e/0x550
>>>> [635607.740628]  napi_gro_frags+0xa2/0x210
>>>> [635607.740635]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
>>>> [635607.740648]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
>>>> [635607.740654]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
>>>> [635607.740657]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
>>>> [635607.740658]  net_rx_action+0xe0/0x2e0
>>>> [635607.740662]  __do_softirq+0xd8/0x2e5
>>>> [635607.740666]  irq_exit+0xb4/0xc0
>>>> [635607.740667]  do_IRQ+0x85/0xd0
>>>> [635607.740670]  common_interrupt+0xf/0xf
>>>> [635607.740671]  </IRQ>
>>>> [635607.740675] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
>>>> [635607.740675] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 0f 1f 44 00 00 <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
>>>> [635607.740701] RSP: 0018:ffffa5c206353ea8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd9
>>>> [635607.740703] RAX: ffff8d72ffd20f00 RBX: 00024214f597c5b0 RCX: 000000000000001f
>>>> [635607.740703] RDX: 00024214f597c5b0 RSI: 0000000000020780 RDI: 0000000000000000
>>>> [635607.740704] RBP: 0000000000000004 R08: 002542bfbefa99fa R09: 00000000ffffffff
>>>> [635607.740705] R10: ffffa5c206353e88 R11: 00000000000000c5 R12: ffffffffaf0aaf78
>>>> [635607.740706] R13: ffff8d72ffd297d8 R14: 0000000000000000 R15: 00024214f58c2ed5
>>>> [635607.740709]  ? cpuidle_enter_state+0x91/0x2a0
>>>> [635607.740712]  do_idle+0x1d0/0x240
>>>> [635607.740715]  cpu_startup_entry+0x5f/0x70
>>>> [635607.740719]  start_secondary+0x185/0x1a0
>>>> [635607.740722]  secondary_startup_64+0xa5/0xb0
>>>> [635607.740731] p0xe0: hw csum failure
>>>> [635607.740745] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.18.0-1 #1
>>>> [635607.740746] Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 05/02/2017
>>>> [635607.740746] Call Trace:
>>>> [635607.740747]  <IRQ>
>>>> [635607.740750]  dump_stack+0x5c/0x7b
>>>> [635607.740755]  __skb_checksum_complete+0xb8/0xd0
>>>> [635607.740760]  __udp6_lib_rcv+0xa6b/0xa70
>>>> [635607.740767]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
>>>> [635607.740770]  ? nft_do_chain_inet+0x7a/0xd0 [nf_tables]
>>>> [635607.740774]  ip6_input_finish+0xc0/0x460
>>>> [635607.740776]  ip6_input+0x2b/0x90
>>>> [635607.740778]  ? ip6_rcv_finish+0x110/0x110
>>>> [635607.740780]  ipv6_rcv+0x2cd/0x4b0
>>>> [635607.740783]  ? udp6_lib_lookup_skb+0x59/0x80
>>>> [635607.740785]  __netif_receive_skb_core+0x455/0xb30
>>>> [635607.740788]  ? ipv6_gro_receive+0x1a8/0x390
>>>> [635607.740790]  ? netif_receive_skb_internal+0x24/0xb0
>>>> [635607.740792]  netif_receive_skb_internal+0x24/0xb0
>>>> [635607.740793]  napi_gro_frags+0x165/0x210
>>>> [635607.740796]  mlx4_en_process_rx_cq+0xa01/0xb40 [mlx4_en]
>>>> [635607.740802]  ? mlx4_cq_completion+0x23/0x70 [mlx4_core]
>>>> [635607.740807]  ? mlx4_eq_int+0x373/0xc80 [mlx4_core]
>>>> [635607.740810]  mlx4_en_poll_rx_cq+0x55/0xf0 [mlx4_en]
>>>> [635607.740811]  net_rx_action+0xe0/0x2e0
>>>> [635607.740813]  __do_softirq+0xd8/0x2e5
>>>> [635607.740816]  irq_exit+0xb4/0xc0
>>>> [635607.740817]  do_IRQ+0x85/0xd0
>>>> [635607.740820]  common_interrupt+0xf/0xf
>>>> [635607.740821]  </IRQ>
>>>> [635607.740823] RIP: 0010:cpuidle_enter_state+0xb4/0x2a0
>>>> [635607.740823] Code: 31 ff e8 df a6 ba ff 45 84 f6 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d8 01 00 00 31 ff e8 13 81 bf ff fb 66 0f 1f 44 00 00 <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
>>>> [635607.740848] RSP: 0018:ffffa5c206353ea8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd9
>>>> [635607.740849] RAX: ffff8d72ffd20f00 RBX: 00024214f597c5b0 RCX: 000000000000001f
>>>> [635607.740850] RDX: 00024214f597c5b0 RSI: 0000000000020780 RDI: 0000000000000000
>>>> [635607.740851] RBP: 0000000000000004 R08: 002542bfbefa99fa R09: 00000000ffffffff
>>>> [635607.740852] R10: ffffa5c206353e88 R11: 00000000000000c5 R12: ffffffffaf0aaf78
>>>> [635607.740853] R13: ffff8d72ffd297d8 R14: 0000000000000000 R15: 00024214f58c2ed5
>>>> [635607.740855]  ? cpuidle_enter_state+0x91/0x2a0
>>>> [635607.740857]  do_idle+0x1d0/0x240
>>>> [635607.740859]  cpu_startup_entry+0x5f/0x70
>>>> [635607.740861]  start_secondary+0x185/0x1a0
>>>> [635607.740863]  secondary_startup_64+0xa5/0xb0
> 
> As a matter of fact Dimitris found the issue in the patch and is working on a fix involving csum_block_sub()
> 
> The problem comes from trimming an odd number of bytes.

More exactly, trimming bytes starting at an odd offset.
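
To illustrate why the starting offset matters, here is a minimal userspace
sketch. It is not the kernel code and not the fix Dimitris is working on; the
helper names (csum16, oc_add, oc_sub, swab16, block_sub) and the 7-byte sample
packet are made up for the example. It models the 16-bit ones-complement sum:
a block that starts at an odd offset has its bytes in the opposite halves of
the 16-bit words of the enclosing sum, so its standalone checksum has to be
byte-swapped before it is subtracted. The kernel's csum_block_sub() takes an
offset argument for this purpose.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones-complement sum of buf; buf[0] is the high byte of word 0. */
static uint16_t csum16(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd length: pad with a zero byte */
		sum += (uint32_t)buf[i] << 8;
	while (sum >> 16)		/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones-complement addition with end-around carry. */
static uint16_t oc_add(uint16_t a, uint16_t b)
{
	uint32_t s = (uint32_t)a + b;

	s = (s & 0xffff) + (s >> 16);
	return (uint16_t)s;
}

/* Ones-complement subtraction: add the complement. */
static uint16_t oc_sub(uint16_t a, uint16_t b)
{
	return oc_add(a, (uint16_t)~b);
}

static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Remove the checksum of a block from 'total'. 'part' was computed over the
 * block alone, as if it started at offset 0. If the block really starts at
 * an odd offset inside the data covered by 'total', swap its bytes first.
 */
static uint16_t block_sub(uint16_t total, uint16_t part, size_t offset)
{
	if (offset & 1)
		part = swab16(part);
	return oc_sub(total, part);
}

/*
 * Hypothetical 7-byte packet: keep 5 bytes, trim 2 bytes starting at
 * offset 5. Returns 1 if only the offset-aware subtraction matches a
 * checksum recomputed from scratch over the kept bytes.
 */
static int odd_offset_trim_demo(void)
{
	static const uint8_t pkt[7] = {
		0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde
	};
	uint16_t total = csum16(pkt, 7);
	uint16_t tail = csum16(pkt + 5, 2);
	uint16_t expected = csum16(pkt, 5);

	return oc_sub(total, tail) != expected &&
	       block_sub(total, tail, 5) == expected;
}
```

Note that the trimmed length in this example is even (2 bytes); only the odd
starting offset makes the plain subtraction go wrong.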

