linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
       [not found] ` <fbb95c11-240c-1a11-0a62-0483908c577e@gmail.com>
@ 2018-06-16  7:14   ` Mathieu Malaterre
  2018-06-16 12:45     ` Eric Dumazet
  2018-06-17 10:09     ` Andreas Schwab
  0 siblings, 2 replies; 16+ messages in thread
From: Mathieu Malaterre @ 2018-06-16  7:14 UTC (permalink / raw)
  To: eric.dumazet
  Cc: David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
	Meelis Roos, schwab, netdev, linuxppc-dev

Hi Eric,

On Fri, Jun 15, 2018 at 9:14 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 06/15/2018 11:56 AM, Mathieu Malaterre wrote:
> > This reverts commit 88078d98d1bb085d72af8437707279e203524fa5.
> >
> > It causes regressions for people using chips driven by the sungem
> > driver. Suspicion is that the skb->csum value isn't being adjusted
> > properly.
> >
> > Symptoms as seen on G4+sungem are:
> >
> > [   34.023281] eth0: hw csum failure
> > [   34.023438] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #2
> > [   34.023618] Call Trace:
> > [   34.023707] [dffedbd0] [c069ddac] __skb_checksum_complete+0xf0/0x108 (unreliable)
> > [   34.023948] [dffedbf0] [c0777a70] tcp_v4_rcv+0x604/0xe00
> > [   34.024118] [dffedc70] [c0731624] ip_local_deliver_finish+0xa8/0x3c4
> > [   34.024315] [dffedcb0] [c0732430] ip_local_deliver+0xf0/0x154
> > [   34.024493] [dffedcf0] [c07328dc] ip_rcv+0x448/0x774
> > [   34.024653] [dffedd50] [c06aeae0] __netif_receive_skb_core+0x5e8/0x1184
> > [   34.024857] [dffedde0] [c06bba20] napi_gro_receive+0x160/0x22c
> > [   34.025044] [dffede10] [e14b2590] gem_poll+0x7fc/0x1ac0 [sungem]
> > [   34.025228] [dffedee0] [c06bacf0] net_rx_action+0x34c/0x618
> > [   34.025402] [dffedf60] [c07fd27c] __do_softirq+0x16c/0x5f0
> > [   34.025575] [dffedfd0] [c0064c7c] irq_exit+0x110/0x1a8
> > [   34.025738] [dffedff0] [c0016170] call_do_irq+0x24/0x3c
> > [   34.025903] [c0cf7e80] [c0009a84] do_IRQ+0x98/0x1a0
> > [   34.026055] [c0cf7eb0] [c001b474] ret_from_except+0x0/0x14
> > [   34.026225] --- interrupt: 501 at arch_cpu_idle+0x30/0x78
> >                    LR = arch_cpu_idle+0x30/0x78
> > [   34.026510] [c0cf7f70] [c0cf6000] 0xc0cf6000 (unreliable)
> > [   34.026682] [c0cf7f80] [c00a3868] do_idle+0xc4/0x158
> > [   34.026835] [c0cf7fb0] [c00a3ab0] cpu_startup_entry+0x20/0x28
> > [   34.027013] [c0cf7fc0] [c0998820] start_kernel+0x47c/0x490
> > [   34.027181] [c0cf7ff0] [00003444] 0x3444
> >
> > See commit 7ce5a27f2ef8 ("Revert "net: Handle CHECKSUM_COMPLETE more
> > adequately in pskb_trim_rcsum()."") for previous reference.
>
> This fix seems to hide a bug in csum functions on this architecture.

That's odd since it seems to only affect g4+sungem user. None of the
ppc64 seems to be having it. And some ppc32 users are not even seeing
it.

> Or a bug on this NIC when receiving a small packet (less than 60 bytes).
> Maybe the padding bytes are not included in NIC provided csum, and not 0.

Ok in that case the bug is located in
./drivers/net/ethernet/sun/sungem.c that seems more likely. I'll try
to understand that code, then.

Thanks

>
>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-16  7:14   ` [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends" Mathieu Malaterre
@ 2018-06-16 12:45     ` Eric Dumazet
  2018-06-17 10:27       ` Andreas Schwab
  2018-06-17 10:09     ` Andreas Schwab
  1 sibling, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-16 12:45 UTC (permalink / raw)
  To: Mathieu Malaterre
  Cc: David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
	Meelis Roos, schwab, netdev, linuxppc-dev



On 06/16/2018 12:14 AM, Mathieu Malaterre wrote:
> Hi Eric,
> 
> On Fri, Jun 15, 2018 at 9:14 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>
>>
>> On 06/15/2018 11:56 AM, Mathieu Malaterre wrote:
>>> This reverts commit 88078d98d1bb085d72af8437707279e203524fa5.
>>>
>>> It causes regressions for people using chips driven by the sungem
>>> driver. Suspicion is that the skb->csum value isn't being adjusted
>>> properly.
>>>
>>> Symptoms as seen on G4+sungem are:
>>>
>>> [   34.023281] eth0: hw csum failure
>>> [   34.023438] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #2
>>> [   34.023618] Call Trace:
>>> [   34.023707] [dffedbd0] [c069ddac] __skb_checksum_complete+0xf0/0x108 (unreliable)
>>> [   34.023948] [dffedbf0] [c0777a70] tcp_v4_rcv+0x604/0xe00
>>> [   34.024118] [dffedc70] [c0731624] ip_local_deliver_finish+0xa8/0x3c4
>>> [   34.024315] [dffedcb0] [c0732430] ip_local_deliver+0xf0/0x154
>>> [   34.024493] [dffedcf0] [c07328dc] ip_rcv+0x448/0x774
>>> [   34.024653] [dffedd50] [c06aeae0] __netif_receive_skb_core+0x5e8/0x1184
>>> [   34.024857] [dffedde0] [c06bba20] napi_gro_receive+0x160/0x22c
>>> [   34.025044] [dffede10] [e14b2590] gem_poll+0x7fc/0x1ac0 [sungem]
>>> [   34.025228] [dffedee0] [c06bacf0] net_rx_action+0x34c/0x618
>>> [   34.025402] [dffedf60] [c07fd27c] __do_softirq+0x16c/0x5f0
>>> [   34.025575] [dffedfd0] [c0064c7c] irq_exit+0x110/0x1a8
>>> [   34.025738] [dffedff0] [c0016170] call_do_irq+0x24/0x3c
>>> [   34.025903] [c0cf7e80] [c0009a84] do_IRQ+0x98/0x1a0
>>> [   34.026055] [c0cf7eb0] [c001b474] ret_from_except+0x0/0x14
>>> [   34.026225] --- interrupt: 501 at arch_cpu_idle+0x30/0x78
>>>                    LR = arch_cpu_idle+0x30/0x78
>>> [   34.026510] [c0cf7f70] [c0cf6000] 0xc0cf6000 (unreliable)
>>> [   34.026682] [c0cf7f80] [c00a3868] do_idle+0xc4/0x158
>>> [   34.026835] [c0cf7fb0] [c00a3ab0] cpu_startup_entry+0x20/0x28
>>> [   34.027013] [c0cf7fc0] [c0998820] start_kernel+0x47c/0x490
>>> [   34.027181] [c0cf7ff0] [00003444] 0x3444
>>>
>>> See commit 7ce5a27f2ef8 ("Revert "net: Handle CHECKSUM_COMPLETE more
>>> adequately in pskb_trim_rcsum()."") for previous reference.
>>
>> This fix seems to hide a bug in csum functions on this architecture.
> 
> That's odd since it seems to only affect g4+sungem user. None of the
> ppc64 seems to be having it. And some ppc32 users are not even seeing
> it.
> 
>> Or a bug on this NIC when receiving a small packet (less than 60 bytes).
>> Maybe the padding bytes are not included in NIC provided csum, and not 0.
> 
> Ok in that case the bug is located in
> ./drivers/net/ethernet/sun/sungem.c that seems more likely. I'll try
> to understand that code, then.


I would try something like :

Basically do not bother using CHECKSUM_COMPLETE for small frames that might have been padded.

Since we need to bring one cache line in eth_type_trans() and further header processing,
computing the checksum in software will be almost free anyway.

diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..071039f211a8a33153e888bd4014314ba5e686a4 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -855,9 +855,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
                        skb = copy_skb;
                }
 
-               csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
-               skb->csum = csum_unfold(csum);
-               skb->ip_summed = CHECKSUM_COMPLETE;
+               if (len > ETH_ZLEN) {
+                       csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
+                       skb->csum = csum_unfold(csum);
+                       skb->ip_summed = CHECKSUM_COMPLETE;
+               }
                skb->protocol = eth_type_trans(skb, gp->dev);
 
                napi_gro_receive(&gp->napi, skb);

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-16  7:14   ` [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends" Mathieu Malaterre
  2018-06-16 12:45     ` Eric Dumazet
@ 2018-06-17 10:09     ` Andreas Schwab
  1 sibling, 0 replies; 16+ messages in thread
From: Andreas Schwab @ 2018-06-17 10:09 UTC (permalink / raw)
  To: Mathieu Malaterre
  Cc: eric.dumazet, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev

On Jun 16 2018, Mathieu Malaterre <malat@debian.org> wrote:

> That's odd since it seems to only affect g4+sungem user. None of the
> ppc64 seems to be having it.

I'm also seeing it on a PowerMac G5.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-16 12:45     ` Eric Dumazet
@ 2018-06-17 10:27       ` Andreas Schwab
  2018-06-17 22:54         ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Andreas Schwab @ 2018-06-17 10:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev

On Jun 16 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> I would try something like :
>
> Basically do not bother using CHECKSUM_COMPLETE for small frames that might have been padded.
>
> Since we need to bring one cache line in eth_type_trans() and further header processing,
> computing the checksum in software will be almost free anyway.
>
> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..071039f211a8a33153e888bd4014314ba5e686a4 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -855,9 +855,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
>                         skb = copy_skb;
>                 }
>  
> -               csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
> -               skb->csum = csum_unfold(csum);
> -               skb->ip_summed = CHECKSUM_COMPLETE;
> +               if (len > ETH_ZLEN) {
> +                       csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
> +                       skb->csum = csum_unfold(csum);
> +                       skb->ip_summed = CHECKSUM_COMPLETE;
> +               }
>                 skb->protocol = eth_type_trans(skb, gp->dev);
>  
>                 napi_gro_receive(&gp->napi, skb);

That doesn't change anything.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-17 10:27       ` Andreas Schwab
@ 2018-06-17 22:54         ` Eric Dumazet
  2018-06-18 17:54           ` Andreas Schwab
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-17 22:54 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev



On 06/17/2018 03:27 AM, Andreas Schwab wrote:

> 
> That doesn't change anything.

OK, thanks !

Oh this is silly, please try :

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
 int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
 {
        if (skb->ip_summed == CHECKSUM_COMPLETE) {
-               int delta = skb->len - len;
+               __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
 
-               skb->csum = csum_sub(skb->csum,
-                                    skb_checksum(skb, len, delta, 0));
+               skb->csum = csum_block_sub(skb->csum, csumdiff, len);
        }
        return __pskb_trim(skb, len);
 }

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-17 22:54         ` Eric Dumazet
@ 2018-06-18 17:54           ` Andreas Schwab
  2018-06-18 18:18             ` Eric Dumazet
  2018-06-18 18:29             ` Mathieu Malaterre
  0 siblings, 2 replies; 16+ messages in thread
From: Andreas Schwab @ 2018-06-18 17:54 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev

On Jun 17 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Oh this is silly, please try :
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
>  int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
>  {
>         if (skb->ip_summed == CHECKSUM_COMPLETE) {
> -               int delta = skb->len - len;
> +               __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
>  
> -               skb->csum = csum_sub(skb->csum,
> -                                    skb_checksum(skb, len, delta, 0));
> +               skb->csum = csum_block_sub(skb->csum, csumdiff, len);
>         }
>         return __pskb_trim(skb, len);
>  }

That doesn't help either.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-18 17:54           ` Andreas Schwab
@ 2018-06-18 18:18             ` Eric Dumazet
  2018-06-18 18:45               ` Mathieu Malaterre
  2018-06-18 18:29             ` Mathieu Malaterre
  1 sibling, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-18 18:18 UTC (permalink / raw)
  To: Andreas Schwab, Eric Dumazet
  Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev



On 06/18/2018 10:54 AM, Andreas Schwab wrote:
> On Jun 17 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> Oh this is silly, please try :
>>
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
>>  int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
>>  {
>>         if (skb->ip_summed == CHECKSUM_COMPLETE) {
>> -               int delta = skb->len - len;
>> +               __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
>>  
>> -               skb->csum = csum_sub(skb->csum,
>> -                                    skb_checksum(skb, len, delta, 0));
>> +               skb->csum = csum_block_sub(skb->csum, csumdiff, len);
>>         }
>>         return __pskb_trim(skb, len);
>>  }
> 
> That doesn't help either.
> 
> Andreas.
> 

Then maybe NIC provided csum is not correct.

It does not compute a checksum on all the frame, but part of it.

You could use this patch to double check.

diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..277859ea73e35271a10b02011120fca248ec8e71 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -857,6 +857,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
 
                csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
                skb->csum = csum_unfold(csum);
+               {
+               __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
+               if (csum != csum_fold(rsum))
+                       pr_err_ratelimited("sungem wrong csum : %x/%x, len %u bytes\n", csum, csum_fold(rsum), len);
+               }
                skb->ip_summed = CHECKSUM_COMPLETE;
                skb->protocol = eth_type_trans(skb, gp->dev);
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-18 17:54           ` Andreas Schwab
  2018-06-18 18:18             ` Eric Dumazet
@ 2018-06-18 18:29             ` Mathieu Malaterre
  1 sibling, 0 replies; 16+ messages in thread
From: Mathieu Malaterre @ 2018-06-18 18:29 UTC (permalink / raw)
  To: schwab
  Cc: eric.dumazet, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev

On Mon, Jun 18, 2018 at 7:54 PM Andreas Schwab <schwab@linux-m68k.org> wrote:
>
> On Jun 17 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> > Oh this is silly, please try :
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
> >  int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
> >  {
> >         if (skb->ip_summed == CHECKSUM_COMPLETE) {
> > -               int delta = skb->len - len;
> > +               __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
> >
> > -               skb->csum = csum_sub(skb->csum,
> > -                                    skb_checksum(skb, len, delta, 0));
> > +               skb->csum = csum_block_sub(skb->csum, csumdiff, len);
> >         }
> >         return __pskb_trim(skb, len);
> >  }
>
> That doesn't help either.

seconded (setup g4+sungem):

[  100.272676] eth0: hw csum failure
[  100.272710] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #6
[  100.272716] Call Trace:
[  100.272733] [dffedbd0] [c069ddb8]
__skb_checksum_complete+0xf0/0x108 (unreliable)
[  100.272752] [dffedbf0] [c078ea28] __udp4_lib_rcv+0x238/0xf98
[  100.272767] [dffedc70] [c0731630] ip_local_deliver_finish+0xa8/0x3c4
[  100.272777] [dffedcb0] [c073243c] ip_local_deliver+0xf0/0x154
[  100.272786] [dffedcf0] [c07328e8] ip_rcv+0x448/0x774
[  100.272800] [dffedd50] [c06aeaec] __netif_receive_skb_core+0x5e8/0x1184
[  100.272811] [dffedde0] [c06bba2c] napi_gro_receive+0x160/0x22c
[  100.272828] [dffede10] [e1571590] gem_poll+0x7fc/0x1ac0 [sungem]
[  100.272837] [dffedee0] [c06bacfc] net_rx_action+0x34c/0x618
[  100.272849] [dffedf60] [c07fd28c] __do_softirq+0x16c/0x5f0
[  100.272863] [dffedfd0] [c0064c7c] irq_exit+0x110/0x1a8
[  100.272877] [dffedff0] [c0016170] call_do_irq+0x24/0x3c
[  100.272890] [c0cf7e80] [c0009a84] do_IRQ+0x98/0x1a0
[  100.272900] [c0cf7eb0] [c001b474] ret_from_except+0x0/0x14
[  100.272911] --- interrupt: 501 at arch_cpu_idle+0x30/0x78
                   LR = arch_cpu_idle+0x30/0x78
[  100.272920] [c0cf7f70] [c0cf6000] 0xc0cf6000 (unreliable)
[  100.272935] [c0cf7f80] [c00a3868] do_idle+0xc4/0x158
[  100.272944] [c0cf7fb0] [c00a3ab4] cpu_startup_entry+0x24/0x28
[  100.272955] [c0cf7fc0] [c0998820] start_kernel+0x47c/0x490
[  100.272963] [c0cf7ff0] [00003444] 0x3444


> Andreas.
>
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-18 18:18             ` Eric Dumazet
@ 2018-06-18 18:45               ` Mathieu Malaterre
  2018-06-18 23:29                 ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Mathieu Malaterre @ 2018-06-18 18:45 UTC (permalink / raw)
  To: eric.dumazet
  Cc: schwab, David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
	Meelis Roos, netdev, linuxppc-dev

On Mon, Jun 18, 2018 at 8:18 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 06/18/2018 10:54 AM, Andreas Schwab wrote:
> > On Jun 17 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> >> Oh this is silly, please try :
> >>
> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
> >> --- a/net/core/skbuff.c
> >> +++ b/net/core/skbuff.c
> >> @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
> >>  int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
> >>  {
> >>         if (skb->ip_summed == CHECKSUM_COMPLETE) {
> >> -               int delta = skb->len - len;
> >> +               __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
> >>
> >> -               skb->csum = csum_sub(skb->csum,
> >> -                                    skb_checksum(skb, len, delta, 0));
> >> +               skb->csum = csum_block_sub(skb->csum, csumdiff, len);
> >>         }
> >>         return __pskb_trim(skb, len);
> >>  }
> >
> > That doesn't help either.
> >
> > Andreas.
> >
>
> Then maybe NIC provided csum is not correct.
>
> It does not compute a checksum on all the frame, but part of it.
>
> You could use this patch to double check.
>
> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..277859ea73e35271a10b02011120fca248ec8e71 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -857,6 +857,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
>
>                 csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
>                 skb->csum = csum_unfold(csum);
> +               {
> +               __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
> +               if (csum != csum_fold(rsum))
> +                       pr_err_ratelimited("sungem wrong csum : %x/%x, len %u bytes\n", csum, csum_fold(rsum), len);
> +               }
>                 skb->ip_summed = CHECKSUM_COMPLETE;
>                 skb->protocol = eth_type_trans(skb, gp->dev);
>
>

Here is what I get on my side

[   53.628847] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
[   53.667063] sungem: sungem wrong csum : eea8/6eec, len 149 bytes
[   58.648952] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
[   58.669414] sungem: sungem wrong csum : 5245/b50, len 149 bytes
[   63.674451] sungem: sungem wrong csum : 2d8/5abd, len 149 bytes
[   68.678233] sungem: sungem wrong csum : b8fc/a498, len 149 bytes
[   73.685771] sungem: sungem wrong csum : 374/5a21, len 149 bytes
[   78.689089] sungem: sungem wrong csum : d81/5014, len 149 bytes
[   83.683261] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
[   83.690193] sungem: sungem wrong csum : c2f7/9a9d, len 149 bytes
[   88.692511] sungem: sungem wrong csum : f4d8/68bc, len 149 bytes
[   93.699915] sungem: sungem wrong csum : 1370/4a25, len 149 bytes
[   98.703136] sungem: sungem wrong csum : e1b5/7bdf, len 149 bytes
[  103.704230] sungem: sungem wrong csum : 5321/a74, len 149 bytes
[  108.688912] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
[  108.706559] sungem: sungem wrong csum : ddbc/7fd8, len 149 bytes
[  113.713189] sungem: sungem wrong csum : 5a65/330, len 149 bytes
[  113.891697] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
[  118.717151] sungem: sungem wrong csum : f7c8/65cc, len 149 bytes
[  123.722680] sungem: sungem wrong csum : 3d7a/201b, len 149 bytes
[  128.726524] sungem: sungem wrong csum : c8fd/9497, len 149 bytes
[  133.732045] sungem: sungem wrong csum : de0d/7f87, len 149 bytes
[  135.529163] sungem: sungem wrong csum : 3089/b6dd, len 96 bytes
[  135.529208] eth0: hw csum failure
[  135.529220] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #7
[  135.529226] Call Trace:
[  135.529243] [dffedbe0] [c069ddac]
__skb_checksum_complete+0xf0/0x108 (unreliable)


>
>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-18 18:45               ` Mathieu Malaterre
@ 2018-06-18 23:29                 ` Eric Dumazet
  2018-06-18 23:36                   ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-18 23:29 UTC (permalink / raw)
  To: Mathieu Malaterre
  Cc: schwab, David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
	Meelis Roos, netdev, linuxppc-dev



On 06/18/2018 11:45 AM, Mathieu Malaterre wrote:

> 
> Here is what I get on my side
> 
> [   53.628847] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
> [   53.667063] sungem: sungem wrong csum : eea8/6eec, len 149 bytes
> [   58.648952] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
> [   58.669414] sungem: sungem wrong csum : 5245/b50, len 149 bytes
> [   63.674451] sungem: sungem wrong csum : 2d8/5abd, len 149 bytes
> [   68.678233] sungem: sungem wrong csum : b8fc/a498, len 149 bytes
> [   73.685771] sungem: sungem wrong csum : 374/5a21, len 149 bytes
> [   78.689089] sungem: sungem wrong csum : d81/5014, len 149 bytes
> [   83.683261] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
> [   83.690193] sungem: sungem wrong csum : c2f7/9a9d, len 149 bytes
> [   88.692511] sungem: sungem wrong csum : f4d8/68bc, len 149 bytes
> [   93.699915] sungem: sungem wrong csum : 1370/4a25, len 149 bytes
> [   98.703136] sungem: sungem wrong csum : e1b5/7bdf, len 149 bytes
> [  103.704230] sungem: sungem wrong csum : 5321/a74, len 149 bytes
> [  108.688912] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
> [  108.706559] sungem: sungem wrong csum : ddbc/7fd8, len 149 bytes
> [  113.713189] sungem: sungem wrong csum : 5a65/330, len 149 bytes
> [  113.891697] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
> [  118.717151] sungem: sungem wrong csum : f7c8/65cc, len 149 bytes
> [  123.722680] sungem: sungem wrong csum : 3d7a/201b, len 149 bytes
> [  128.726524] sungem: sungem wrong csum : c8fd/9497, len 149 bytes
> [  133.732045] sungem: sungem wrong csum : de0d/7f87, len 149 bytes
> [  135.529163] sungem: sungem wrong csum : 3089/b6dd, len 96 bytes
> [  135.529208] eth0: hw csum failure
> [  135.529220] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #7
> [  135.529226] Call Trace:
> [  135.529243] [dffedbe0] [c069ddac]
> __skb_checksum_complete+0xf0/0x108 (unreliable)

Thanks, then I guess next step would be to dump the content of the frames
having a wrong checksum, hoping we find an easy way to discard the CHECKSUM_COMPLETE
in a selective way.

Otherwise, we will need to remove CHECKSUM_COMPLETE setting in this driver.

diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..77a761f95be788bb86c8d917f613c9084818f826 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -857,6 +857,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
 
                csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
                skb->csum = csum_unfold(csum);
+               {
+               __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
+               if (csum != csum_fold(rsum) && net_ratelimit())
+                       pr_err("sungem wrong csum : %04x/%04x, len %u bytes\n",
+                               csum, csum_fold(rsum), len);
+                       print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,
+                                           16, 1, skb->data, len, true);
+               }
                skb->ip_summed = CHECKSUM_COMPLETE;
                skb->protocol = eth_type_trans(skb, gp->dev);
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-18 23:29                 ` Eric Dumazet
@ 2018-06-18 23:36                   ` Eric Dumazet
  2018-06-19 19:10                     ` Andreas Schwab
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-18 23:36 UTC (permalink / raw)
  To: Eric Dumazet, Mathieu Malaterre
  Cc: schwab, David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
	Meelis Roos, netdev, linuxppc-dev



On 06/18/2018 04:29 PM, Eric Dumazet wrote:
> 
> 
> On 06/18/2018 11:45 AM, Mathieu Malaterre wrote:
> 
>>
>> Here is what I get on my side
>>
>> [   53.628847] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
>> [   53.667063] sungem: sungem wrong csum : eea8/6eec, len 149 bytes
>> [   58.648952] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
>> [   58.669414] sungem: sungem wrong csum : 5245/b50, len 149 bytes
>> [   63.674451] sungem: sungem wrong csum : 2d8/5abd, len 149 bytes
>> [   68.678233] sungem: sungem wrong csum : b8fc/a498, len 149 bytes
>> [   73.685771] sungem: sungem wrong csum : 374/5a21, len 149 bytes
>> [   78.689089] sungem: sungem wrong csum : d81/5014, len 149 bytes
>> [   83.683261] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
>> [   83.690193] sungem: sungem wrong csum : c2f7/9a9d, len 149 bytes
>> [   88.692511] sungem: sungem wrong csum : f4d8/68bc, len 149 bytes
>> [   93.699915] sungem: sungem wrong csum : 1370/4a25, len 149 bytes
>> [   98.703136] sungem: sungem wrong csum : e1b5/7bdf, len 149 bytes
>> [  103.704230] sungem: sungem wrong csum : 5321/a74, len 149 bytes
>> [  108.688912] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
>> [  108.706559] sungem: sungem wrong csum : ddbc/7fd8, len 149 bytes
>> [  113.713189] sungem: sungem wrong csum : 5a65/330, len 149 bytes
>> [  113.891697] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
>> [  118.717151] sungem: sungem wrong csum : f7c8/65cc, len 149 bytes
>> [  123.722680] sungem: sungem wrong csum : 3d7a/201b, len 149 bytes
>> [  128.726524] sungem: sungem wrong csum : c8fd/9497, len 149 bytes
>> [  133.732045] sungem: sungem wrong csum : de0d/7f87, len 149 bytes
>> [  135.529163] sungem: sungem wrong csum : 3089/b6dd, len 96 bytes
>> [  135.529208] eth0: hw csum failure
>> [  135.529220] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #7
>> [  135.529226] Call Trace:
>> [  135.529243] [dffedbe0] [c069ddac]
>> __skb_checksum_complete+0xf0/0x108 (unreliable)
> 
> Thanks, then I guess next step would be to dump the content of the frames
> having a wrong checksum, hoping we find an easy way to discard the CHECKSUM_COMPLETE
> in a selective way.
> 
> Otherwise, we will need to remove CHECKSUM_COMPLETE setting in this driver.
> 
> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..77a761f95be788bb86c8d917f613c9084818f826 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -857,6 +857,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
>  
>                 csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
>                 skb->csum = csum_unfold(csum);
> +               {
> +               __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
> +               if (csum != csum_fold(rsum) && net_ratelimit())
> +                       pr_err("sungem wrong csum : %04x/%04x, len %u bytes\n",
> +                               csum, csum_fold(rsum), len);
> +                       print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,


DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or crossing page boundaries)

> +                                           16, 1, skb->data, len, true);
> +               }
>                 skb->ip_summed = CHECKSUM_COMPLETE;
>                 skb->protocol = eth_type_trans(skb, gp->dev);
>  
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-18 23:36                   ` Eric Dumazet
@ 2018-06-19 19:10                     ` Andreas Schwab
  2018-06-19 20:10                       ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Andreas Schwab @ 2018-06-19 19:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev

On Jun 18 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or crossing page boundaries)

DUMP_PREFIX_ADDRESS is useless for that purpose.

Here are some samples of broken csums:

[  853.849225] sungem: sungem wrong csum : 9886/07be, len 94 bytes, c0000001fa187e02
[  853.849232] raw data: 00000000: 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 10  ...C.b...Q....E.
[  853.849235] raw data: 00000010: 00 4c cb a0 40 00 40 11 d9 97 c0 a8 0a 01 c0 a8  .L..@.@.........
[  853.849237] raw data: 00000020: 0a 07 00 7b 00 7b 00 38 69 e1 1c 03 0c f7 00 00  ...{.{.8i.......
[  853.849240] raw data: 00000030: 08 f0 00 00 15 f0 c0 35 67 67 de d3 ca c9 d9 5b  .......5gg.....[
[  853.849242] raw data: 00000040: 1f ff de d3 d2 86 8f 67 fa f2 de d3 d2 86 8f 38  .......g.......8
[  853.849244] raw data: 00000050: 2f ff de d3 d2 86 8f 3b ff ff d1 93 bc 50        /......;.....P

[  857.883052] sungem: sungem wrong csum : dbb4/c48f, len 94 bytes, c0000001fa185882
[  857.883058] raw data: 00000000: 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 00  ...C.b...Q....E.
[  857.883070] raw data: 00000010: 00 4c a1 97 40 00 3a 11 ce ed d9 5b 2c 11 c0 a8  .L..@.:....[,...
[  857.883080] raw data: 00000020: 0a 07 00 7b 00 7b 00 38 14 4b 24 02 06 ea 00 00  ...{.{.8.K$.....
[  857.883085] raw data: 00000030: 00 0b 00 00 02 99 c0 a8 64 09 de d3 d2 5a 36 e4  ........d....Z6.
[  857.883090] raw data: 00000040: bc f5 de d3 d2 8a 8f 2c 17 44 de d3 d2 8a 93 8b  .......,.D......
[  857.883094] raw data: 00000050: d7 b7 de d3 d2 8a 93 97 69 6e 39 7b d2 5a        ........in9{.Z

[  858.124689] sungem: sungem wrong csum : 1f4f/02d0, len 118 bytes, c0000001fa185602
[  858.124700] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01  ...C.b.=~LH...a.
[  858.124705] raw data: 00000010: 1e b1 00 3c 06 40 20 01 0a 62 17 11 88 01 00 00  ...<.@ ..b......
[  858.124709] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00  .....8 ..b......
[  858.124714] raw data: 00000030: 00 00 00 00 00 07 94 b4 00 16 86 f5 29 e8 36 cb  ............).6.
[  858.124718] raw data: 00000040: 50 49 80 18 05 93 9a 53 00 00 01 01 08 0a 58 b2  PI.....S......X.
[  858.124723] raw data: 00000050: de 54 61 5f 2f 3c 00 00 00 10 cc 08 55 f7 da 21  .Ta_/<......U..!
[  858.124727] raw data: 00000060: f4 60 0a 6b 3c aa b9 b3 7e 61 10 b8 c2 be 9a 0b  .`.k<...~a......
[  858.124731] raw data: 00000070: c7 e9 5b 97 1b ac                                ..[...

[  858.126522] sungem: sungem wrong csum : 0836/19e9, len 90 bytes, c0000001fa185382
[  858.126530] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01  ...C.b.=~LH...a.
[  858.126532] raw data: 00000010: 1e b1 00 20 06 40 20 01 0a 62 17 11 88 01 00 00  ... .@ ..b......
[  858.126535] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00  .....8 ..b......
[  858.126537] raw data: 00000030: 00 00 00 00 00 07 94 b4 00 16 86 f5 2a 04 36 cb  ............*.6.
[  858.126540] raw data: 00000040: 50 65 80 10 05 93 3e 56 00 00 01 01 08 0a 58 b2  Pe....>V......X.
[  858.126542] raw data: 00000050: de 56 61 5f 30 4d 1d 58 42 d2                    .Va_0M.XB.

[  858.131559] sungem: sungem wrong csum : 5891/c98d, len 90 bytes, c0000001fa185102
[  858.131567] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01  ...C.b.=~LH...a.
[  858.131570] raw data: 00000010: 1e b1 00 20 06 40 20 01 0a 62 17 11 88 01 00 00  ... .@ ..b......
[  858.131572] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00  .....8 ..b......
[  858.131574] raw data: 00000030: 00 00 00 00 00 07 94 b4 00 16 86 f5 2a 04 36 cb  ............*.6.
[  858.131577] raw data: 00000040: 50 a1 80 10 05 93 3e 10 00 00 01 01 08 0a 58 b2  P.....>.......X.
[  858.131579] raw data: 00000050: de 5b 61 5f 30 52 3f ea 70 9b                    .[a_0R?.p.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-19 19:10                     ` Andreas Schwab
@ 2018-06-19 20:10                       ` Eric Dumazet
  2018-06-19 22:10                         ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-19 20:10 UTC (permalink / raw)
  To: Andreas Schwab, Eric Dumazet
  Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev



On 06/19/2018 12:10 PM, Andreas Schwab wrote:
> On Jun 18 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or crossing page boundaries)
> 
> DUMP_PREFIX_ADDRESS is useless for that purpose.
> 
> Here are some samples of broken csums:
> 
> [  853.849225] sungem: sungem wrong csum : 9886/07be, len 94 bytes, c0000001fa187e02
> [  853.849232] raw data: 00000000: 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 10  ...C.b...Q....E.
> [  853.849235] raw data: 00000010: 00 4c cb a0 40 00 40 11 d9 97 c0 a8 0a 01 c0 a8  .L..@.@.........
> [  853.849237] raw data: 00000020: 0a 07 00 7b 00 7b 00 38 69 e1 1c 03 0c f7 00 00  ...{.{.8i.......
> [  853.849240] raw data: 00000030: 08 f0 00 00 15 f0 c0 35 67 67 de d3 ca c9 d9 5b  .......5gg.....[
> [  853.849242] raw data: 00000040: 1f ff de d3 d2 86 8f 67 fa f2 de d3 d2 86 8f 38  .......g.......8
> [  853.849244] raw data: 00000050: 2f ff de d3 d2 86 8f 3b ff ff d1 93 bc 50        /......;.....P

Thanks.

4 bytes in excess.

Might be the FCS, and it does not look like provided csum has a relation with it.

For some reason FCS stripping was disabled by :

commit 3e32011d4da6424b3bc65b1e1a047e30ac9882c7
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date:   Mon May 19 09:39:11 2003 -0700

    [SUNGEM]: Updates from PowerPC people.
    
    Support more chips and split out all the complex PHY
    handling into a seperate file.


Since this NIC never had CHECKSUM_COMPLETE support (since we have to trim each skb,
thus were forcing ip_summed to CHECKSUM_NONE) we probably should remove it and be happy.

Unless you guys find a way to let the NIC strip the FCS, and double check the csum is a real csum ;)

diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..85439308375e95c3854e4a1561697d69ec85399b 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -760,7 +760,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
        struct net_device *dev = gp->dev;
        int entry, drops, work_done = 0;
        u32 done;
-       __sum16 csum;
 
        if (netif_msg_rx_status(gp))
                printk(KERN_DEBUG "%s: rx interrupt, done: %d, rx_new: %d\n",
@@ -855,9 +854,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
                        skb = copy_skb;
                }
 
-               csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
-               skb->csum = csum_unfold(csum);
-               skb->ip_summed = CHECKSUM_COMPLETE;
                skb->protocol = eth_type_trans(skb, gp->dev);
 
                napi_gro_receive(&gp->napi, skb);

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-19 20:10                       ` Eric Dumazet
@ 2018-06-19 22:10                         ` Eric Dumazet
  2018-06-19 22:32                           ` Andreas Schwab
  0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-19 22:10 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev



On 06/19/2018 01:10 PM, Eric Dumazet wrote:
> 
> 
> On 06/19/2018 12:10 PM, Andreas Schwab wrote:
>> On Jun 18 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>> DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or crossing page boundaries)
>>
>> DUMP_PREFIX_ADDRESS is useless for that purpose.
>>
>> Here are some samples of broken csums:
>>
>> [  853.849225] sungem: sungem wrong csum : 9886/07be, len 94 bytes, c0000001fa187e02
>> [  853.849232] raw data: 00000000: 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 10  ...C.b...Q....E.
>> [  853.849235] raw data: 00000010: 00 4c cb a0 40 00 40 11 d9 97 c0 a8 0a 01 c0 a8  .L..@.@.........
>> [  853.849237] raw data: 00000020: 0a 07 00 7b 00 7b 00 38 69 e1 1c 03 0c f7 00 00  ...{.{.8i.......
>> [  853.849240] raw data: 00000030: 08 f0 00 00 15 f0 c0 35 67 67 de d3 ca c9 d9 5b  .......5gg.....[
>> [  853.849242] raw data: 00000040: 1f ff de d3 d2 86 8f 67 fa f2 de d3 d2 86 8f 38  .......g.......8
>> [  853.849244] raw data: 00000050: 2f ff de d3 d2 86 8f 3b ff ff d1 93 bc 50        /......;.....P
> 
> Thanks.
> 
> 4 bytes in excess.
> 
> Might be the FCS, and it does not look like provided csum has a relation with it.
> 
> For some reason FCS stripping was disabled by :
> 
> commit 3e32011d4da6424b3bc65b1e1a047e30ac9882c7
> Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date:   Mon May 19 09:39:11 2003 -0700
> 
>     [SUNGEM]: Updates from PowerPC people.
>     
>     Support more chips and split out all the complex PHY
>     handling into a seperate file.
> 
> 
> Since this NIC never had CHECKSUM_COMPLETE support (since we have to trim each skb,
> thus were forcing ip_summed to CHECKSUM_NONE) we probably should remove it and be happy.
> 
> Unless you guys find a way to let the NIC strip the FCS, and double check the csum is a real csum ;)
> 
> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..85439308375e95c3854e4a1561697d69ec85399b 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -760,7 +760,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
>         struct net_device *dev = gp->dev;
>         int entry, drops, work_done = 0;
>         u32 done;
> -       __sum16 csum;
>  
>         if (netif_msg_rx_status(gp))
>                 printk(KERN_DEBUG "%s: rx interrupt, done: %d, rx_new: %d\n",
> @@ -855,9 +854,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
>                         skb = copy_skb;
>                 }
>  
> -               csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
> -               skb->csum = csum_unfold(csum);
> -               skb->ip_summed = CHECKSUM_COMPLETE;
>                 skb->protocol = eth_type_trans(skb, gp->dev);
>  
>                 napi_gro_receive(&gp->napi, skb);
> 


If you have time, you also could check if changing the suspect (14 / 2) to ETH_HLEN would help
(and also enabling STRIP_FCS)



diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..672d6748ab44f0890e92d5ca55d6ff6834c20dc9 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -60,8 +60,7 @@
 #include <linux/sungem_phy.h>
 #include "sungem.h"
 
-/* Stripping FCS is causing problems, disabled for now */
-#undef STRIP_FCS
+#define STRIP_FCS
 
 #define DEFAULT_MSG    (NETIF_MSG_DRV          | \
                         NETIF_MSG_PROBE        | \
@@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
        writel(desc_dma & 0xffffffff, gp->regs + RXDMA_DBLOW);
        writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
        val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
-              ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
+              (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
        writel(val, gp->regs + RXDMA_CFG);
        if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
                writel(((5 & RXDMA_BLANK_IPKTS) |
@@ -857,6 +856,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
 
                csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
                skb->csum = csum_unfold(csum);
+               {
+               __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
+               if (csum != csum_fold(rsum) && net_ratelimit())
+                       pr_err("sungem wrong csum : %x/%x, len %u bytes\n",
+                               csum, csum_fold(rsum), len);
+                       print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,
+                                           16, 1, skb->data, len, true);
+               }
                skb->ip_summed = CHECKSUM_COMPLETE;
                skb->protocol = eth_type_trans(skb, gp->dev);
 
@@ -1761,7 +1768,7 @@ static void gem_init_dma(struct gem *gp)
        writel(0, gp->regs + TXDMA_KICK);
 
        val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
-              ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
+              (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
        writel(val, gp->regs + RXDMA_CFG);
 
        writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-19 22:10                         ` Eric Dumazet
@ 2018-06-19 22:32                           ` Andreas Schwab
  2018-06-19 22:40                             ` Eric Dumazet
  0 siblings, 1 reply; 16+ messages in thread
From: Andreas Schwab @ 2018-06-19 22:32 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev

On Jun 19 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:

> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..672d6748ab44f0890e92d5ca55d6ff6834c20dc9 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -60,8 +60,7 @@
>  #include <linux/sungem_phy.h>
>  #include "sungem.h"
>  
> -/* Stripping FCS is causing problems, disabled for now */
> -#undef STRIP_FCS
> +#define STRIP_FCS
>  
>  #define DEFAULT_MSG    (NETIF_MSG_DRV          | \
>                          NETIF_MSG_PROBE        | \
> @@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
>         writel(desc_dma & 0xffffffff, gp->regs + RXDMA_DBLOW);
>         writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
>         val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
> -              ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
> +              (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
>         writel(val, gp->regs + RXDMA_CFG);
>         if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
>                 writel(((5 & RXDMA_BLANK_IPKTS) |
> @@ -857,6 +856,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
>  
>                 csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
>                 skb->csum = csum_unfold(csum);
> +               {
> +               __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
> +               if (csum != csum_fold(rsum) && net_ratelimit())
> +                       pr_err("sungem wrong csum : %x/%x, len %u bytes\n",
> +                               csum, csum_fold(rsum), len);
> +                       print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,
> +                                           16, 1, skb->data, len, true);
> +               }
>                 skb->ip_summed = CHECKSUM_COMPLETE;
>                 skb->protocol = eth_type_trans(skb, gp->dev);
>  
> @@ -1761,7 +1768,7 @@ static void gem_init_dma(struct gem *gp)
>         writel(0, gp->regs + TXDMA_KICK);
>  
>         val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
> -              ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
> +              (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
>         writel(val, gp->regs + RXDMA_CFG);
>  
>         writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);

With that patch I still get the wrong csum messages, but no longer the
hw csum failure messages (tested on a PowerMac G5).

[  662.659767] sungem: sungem wrong csum : 8359/7ca6, len 86 bytes, c0000001fee9cc02
[  662.659775] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01  ...C.b.=~LH...a.
[  662.659778] raw data: 00000010: 1c 1e 00 20 06 40 20 01 0a 62 17 11 88 01 00 00  ... .@ ..b......
[  662.659780] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00  .....8 ..b......
[  662.659783] raw data: 00000030: 00 00 00 00 00 07 9a 18 00 16 c1 9a 7e ea ea 44  ............~..D
[  662.659785] raw data: 00000040: fb 4a 80 10 05 93 44 08 00 00 01 01 08 0a 59 68  .J....D.......Yh
[  662.659788] raw data: 00000050: ba e2 0e bb ac ae                                ......

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
  2018-06-19 22:32                           ` Andreas Schwab
@ 2018-06-19 22:40                             ` Eric Dumazet
  0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2018-06-19 22:40 UTC (permalink / raw)
  To: Andreas Schwab, Eric Dumazet
  Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
	Christophe LEROY, Meelis Roos, netdev, linuxppc-dev



On 06/19/2018 03:32 PM, Andreas Schwab wrote:
> On Jun 19 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
>> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..672d6748ab44f0890e92d5ca55d6ff6834c20dc9 100644
>> --- a/drivers/net/ethernet/sun/sungem.c
>> +++ b/drivers/net/ethernet/sun/sungem.c
>> @@ -60,8 +60,7 @@
>>  #include <linux/sungem_phy.h>
>>  #include "sungem.h"
>>  
>> -/* Stripping FCS is causing problems, disabled for now */
>> -#undef STRIP_FCS
>> +#define STRIP_FCS
>>  
>>  #define DEFAULT_MSG    (NETIF_MSG_DRV          | \
>>                          NETIF_MSG_PROBE        | \
>> @@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
>>         writel(desc_dma & 0xffffffff, gp->regs + RXDMA_DBLOW);
>>         writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
>>         val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
>> -              ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
>> +              (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
>>         writel(val, gp->regs + RXDMA_CFG);
>>         if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
>>                 writel(((5 & RXDMA_BLANK_IPKTS) |
>> @@ -857,6 +856,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
>>  
>>                 csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
>>                 skb->csum = csum_unfold(csum);
>> +               {
>> +               __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
>> +               if (csum != csum_fold(rsum) && net_ratelimit())
>> +                       pr_err("sungem wrong csum : %x/%x, len %u bytes\n",
>> +                               csum, csum_fold(rsum), len);
>> +                       print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,
>> +                                           16, 1, skb->data, len, true);
>> +               }
>>                 skb->ip_summed = CHECKSUM_COMPLETE;
>>                 skb->protocol = eth_type_trans(skb, gp->dev);
>>  
>> @@ -1761,7 +1768,7 @@ static void gem_init_dma(struct gem *gp)
>>         writel(0, gp->regs + TXDMA_KICK);
>>  
>>         val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
>> -              ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
>> +              (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
>>         writel(val, gp->regs + RXDMA_CFG);
>>  
>>         writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);
> 
> With that patch I still get the wrong csum messages, but no longer the
> hw csum failure messages (tested on a PowerMac G5).
> 
> [  662.659767] sungem: sungem wrong csum : 8359/7ca6, len 86 bytes, c0000001fee9cc02
> [  662.659775] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01  ...C.b.=~LH...a.
> [  662.659778] raw data: 00000010: 1c 1e 00 20 06 40 20 01 0a 62 17 11 88 01 00 00  ... .@ ..b......
> [  662.659780] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00  .....8 ..b......
> [  662.659783] raw data: 00000030: 00 00 00 00 00 07 9a 18 00 16 c1 9a 7e ea ea 44  ............~..D
> [  662.659785] raw data: 00000040: fb 4a 80 10 05 93 44 08 00 00 01 01 08 0a 59 68  .J....D.......Yh
> [  662.659788] raw data: 00000050: ba e2 0e bb ac ae                                ......
> 
> Andreas.
> 

Note that 8359 and 7ca6 are the same really (a missing ~ to invert csum_partial())

So the bug was that :

1) Driver programmed a wrong start offset for the csum  (7 bytes instead of 14 bytes to skip Ethernet Header)

2) FCS was not stripped.

Basically CHECKSUM_COMPLETE support never worked, but this was hidden by the fact that linux stack
had to throw away this CHECKSUM_COMPLETE because the FCS had to be removed.

Thanks !

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-06-19 22:41 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20180615185645.8921-1-malat@debian.org>
     [not found] ` <fbb95c11-240c-1a11-0a62-0483908c577e@gmail.com>
2018-06-16  7:14   ` [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends" Mathieu Malaterre
2018-06-16 12:45     ` Eric Dumazet
2018-06-17 10:27       ` Andreas Schwab
2018-06-17 22:54         ` Eric Dumazet
2018-06-18 17:54           ` Andreas Schwab
2018-06-18 18:18             ` Eric Dumazet
2018-06-18 18:45               ` Mathieu Malaterre
2018-06-18 23:29                 ` Eric Dumazet
2018-06-18 23:36                   ` Eric Dumazet
2018-06-19 19:10                     ` Andreas Schwab
2018-06-19 20:10                       ` Eric Dumazet
2018-06-19 22:10                         ` Eric Dumazet
2018-06-19 22:32                           ` Andreas Schwab
2018-06-19 22:40                             ` Eric Dumazet
2018-06-18 18:29             ` Mathieu Malaterre
2018-06-17 10:09     ` Andreas Schwab

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).