* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
[not found] ` <fbb95c11-240c-1a11-0a62-0483908c577e@gmail.com>
@ 2018-06-16 7:14 ` Mathieu Malaterre
2018-06-16 12:45 ` Eric Dumazet
2018-06-17 10:09 ` Andreas Schwab
0 siblings, 2 replies; 16+ messages in thread
From: Mathieu Malaterre @ 2018-06-16 7:14 UTC (permalink / raw)
To: eric.dumazet
Cc: David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
Meelis Roos, schwab, netdev, linuxppc-dev
Hi Eric,
On Fri, Jun 15, 2018 at 9:14 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 06/15/2018 11:56 AM, Mathieu Malaterre wrote:
> > This reverts commit 88078d98d1bb085d72af8437707279e203524fa5.
> >
> > It causes regressions for people using chips driven by the sungem
> > driver. Suspicion is that the skb->csum value isn't being adjusted
> > properly.
> >
> > Symptoms as seen on G4+sungem are:
> >
> > [ 34.023281] eth0: hw csum failure
> > [ 34.023438] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #2
> > [ 34.023618] Call Trace:
> > [ 34.023707] [dffedbd0] [c069ddac] __skb_checksum_complete+0xf0/0x108 (unreliable)
> > [ 34.023948] [dffedbf0] [c0777a70] tcp_v4_rcv+0x604/0xe00
> > [ 34.024118] [dffedc70] [c0731624] ip_local_deliver_finish+0xa8/0x3c4
> > [ 34.024315] [dffedcb0] [c0732430] ip_local_deliver+0xf0/0x154
> > [ 34.024493] [dffedcf0] [c07328dc] ip_rcv+0x448/0x774
> > [ 34.024653] [dffedd50] [c06aeae0] __netif_receive_skb_core+0x5e8/0x1184
> > [ 34.024857] [dffedde0] [c06bba20] napi_gro_receive+0x160/0x22c
> > [ 34.025044] [dffede10] [e14b2590] gem_poll+0x7fc/0x1ac0 [sungem]
> > [ 34.025228] [dffedee0] [c06bacf0] net_rx_action+0x34c/0x618
> > [ 34.025402] [dffedf60] [c07fd27c] __do_softirq+0x16c/0x5f0
> > [ 34.025575] [dffedfd0] [c0064c7c] irq_exit+0x110/0x1a8
> > [ 34.025738] [dffedff0] [c0016170] call_do_irq+0x24/0x3c
> > [ 34.025903] [c0cf7e80] [c0009a84] do_IRQ+0x98/0x1a0
> > [ 34.026055] [c0cf7eb0] [c001b474] ret_from_except+0x0/0x14
> > [ 34.026225] --- interrupt: 501 at arch_cpu_idle+0x30/0x78
> > LR = arch_cpu_idle+0x30/0x78
> > [ 34.026510] [c0cf7f70] [c0cf6000] 0xc0cf6000 (unreliable)
> > [ 34.026682] [c0cf7f80] [c00a3868] do_idle+0xc4/0x158
> > [ 34.026835] [c0cf7fb0] [c00a3ab0] cpu_startup_entry+0x20/0x28
> > [ 34.027013] [c0cf7fc0] [c0998820] start_kernel+0x47c/0x490
> > [ 34.027181] [c0cf7ff0] [00003444] 0x3444
> >
> > See commit 7ce5a27f2ef8 ("Revert "net: Handle CHECKSUM_COMPLETE more
> > adequately in pskb_trim_rcsum()."") for previous reference.
>
> This fix seems to hide a bug in csum functions on this architecture.
That's odd since it seems to only affect g4+sungem user. None of the
ppc64 seems to be having it. And some ppc32 users are not even seeing
it.
> Or a bug on this NIC when receiving a small packet (less than 60 bytes).
> Maybe the padding bytes are not included in NIC provided csum, and not 0.
Ok in that case the bug is located in
./drivers/net/ethernet/sun/sungem.c that seems more likely. I'll try
to understand that code, then.
Thanks
>
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-16 7:14 ` [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends" Mathieu Malaterre
@ 2018-06-16 12:45 ` Eric Dumazet
2018-06-17 10:27 ` Andreas Schwab
2018-06-17 10:09 ` Andreas Schwab
1 sibling, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-16 12:45 UTC (permalink / raw)
To: Mathieu Malaterre
Cc: David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
Meelis Roos, schwab, netdev, linuxppc-dev
On 06/16/2018 12:14 AM, Mathieu Malaterre wrote:
> Hi Eric,
>
> On Fri, Jun 15, 2018 at 9:14 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>
>>
>> On 06/15/2018 11:56 AM, Mathieu Malaterre wrote:
>>> This reverts commit 88078d98d1bb085d72af8437707279e203524fa5.
>>>
>>> It causes regressions for people using chips driven by the sungem
>>> driver. Suspicion is that the skb->csum value isn't being adjusted
>>> properly.
>>>
>>> Symptoms as seen on G4+sungem are:
>>>
>>> [ 34.023281] eth0: hw csum failure
>>> [ 34.023438] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #2
>>> [ 34.023618] Call Trace:
>>> [ 34.023707] [dffedbd0] [c069ddac] __skb_checksum_complete+0xf0/0x108 (unreliable)
>>> [ 34.023948] [dffedbf0] [c0777a70] tcp_v4_rcv+0x604/0xe00
>>> [ 34.024118] [dffedc70] [c0731624] ip_local_deliver_finish+0xa8/0x3c4
>>> [ 34.024315] [dffedcb0] [c0732430] ip_local_deliver+0xf0/0x154
>>> [ 34.024493] [dffedcf0] [c07328dc] ip_rcv+0x448/0x774
>>> [ 34.024653] [dffedd50] [c06aeae0] __netif_receive_skb_core+0x5e8/0x1184
>>> [ 34.024857] [dffedde0] [c06bba20] napi_gro_receive+0x160/0x22c
>>> [ 34.025044] [dffede10] [e14b2590] gem_poll+0x7fc/0x1ac0 [sungem]
>>> [ 34.025228] [dffedee0] [c06bacf0] net_rx_action+0x34c/0x618
>>> [ 34.025402] [dffedf60] [c07fd27c] __do_softirq+0x16c/0x5f0
>>> [ 34.025575] [dffedfd0] [c0064c7c] irq_exit+0x110/0x1a8
>>> [ 34.025738] [dffedff0] [c0016170] call_do_irq+0x24/0x3c
>>> [ 34.025903] [c0cf7e80] [c0009a84] do_IRQ+0x98/0x1a0
>>> [ 34.026055] [c0cf7eb0] [c001b474] ret_from_except+0x0/0x14
>>> [ 34.026225] --- interrupt: 501 at arch_cpu_idle+0x30/0x78
>>> LR = arch_cpu_idle+0x30/0x78
>>> [ 34.026510] [c0cf7f70] [c0cf6000] 0xc0cf6000 (unreliable)
>>> [ 34.026682] [c0cf7f80] [c00a3868] do_idle+0xc4/0x158
>>> [ 34.026835] [c0cf7fb0] [c00a3ab0] cpu_startup_entry+0x20/0x28
>>> [ 34.027013] [c0cf7fc0] [c0998820] start_kernel+0x47c/0x490
>>> [ 34.027181] [c0cf7ff0] [00003444] 0x3444
>>>
>>> See commit 7ce5a27f2ef8 ("Revert "net: Handle CHECKSUM_COMPLETE more
>>> adequately in pskb_trim_rcsum()."") for previous reference.
>>
>> This fix seems to hide a bug in csum functions on this architecture.
>
> That's odd since it seems to only affect g4+sungem user. None of the
> ppc64 seems to be having it. And some ppc32 users are not even seeing
> it.
>
>> Or a bug on this NIC when receiving a small packet (less than 60 bytes).
>> Maybe the padding bytes are not included in NIC provided csum, and not 0.
>
> Ok in that case the bug is located in
> ./drivers/net/ethernet/sun/sungem.c that seems more likely. I'll try
> to understand that code, then.
I would try something like :
Basically do not bother using CHECKSUM_COMPLETE for small frames that might have been padded.
Since we need to bring one cache line in eth_type_trans() and further header processing,
computing the checksum in software will be almost free anyway.
diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..071039f211a8a33153e888bd4014314ba5e686a4 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -855,9 +855,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
skb = copy_skb;
}
- csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
- skb->csum = csum_unfold(csum);
- skb->ip_summed = CHECKSUM_COMPLETE;
+ if (len > ETH_ZLEN) {
+ csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
+ skb->csum = csum_unfold(csum);
+ skb->ip_summed = CHECKSUM_COMPLETE;
+ }
skb->protocol = eth_type_trans(skb, gp->dev);
napi_gro_receive(&gp->napi, skb);
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-16 7:14 ` [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends" Mathieu Malaterre
2018-06-16 12:45 ` Eric Dumazet
@ 2018-06-17 10:09 ` Andreas Schwab
1 sibling, 0 replies; 16+ messages in thread
From: Andreas Schwab @ 2018-06-17 10:09 UTC (permalink / raw)
To: Mathieu Malaterre
Cc: eric.dumazet, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On Jun 16 2018, Mathieu Malaterre <malat@debian.org> wrote:
> That's odd since it seems to only affect g4+sungem user. None of the
> ppc64 seems to be having it.
I'm also seeing it on a PowerMac G5.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-16 12:45 ` Eric Dumazet
@ 2018-06-17 10:27 ` Andreas Schwab
2018-06-17 22:54 ` Eric Dumazet
0 siblings, 1 reply; 16+ messages in thread
From: Andreas Schwab @ 2018-06-17 10:27 UTC (permalink / raw)
To: Eric Dumazet
Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On Jun 16 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> I would try something like :
>
> Basically do not bother using CHECKSUM_COMPLETE for small frames that might have been padded.
>
> Since we need to bring one cache line in eth_type_trans() and further header processing,
> computing the checksum in software will be almost free anyway.
>
> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..071039f211a8a33153e888bd4014314ba5e686a4 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -855,9 +855,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
> skb = copy_skb;
> }
>
> - csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
> - skb->csum = csum_unfold(csum);
> - skb->ip_summed = CHECKSUM_COMPLETE;
> + if (len > ETH_ZLEN) {
> + csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
> + skb->csum = csum_unfold(csum);
> + skb->ip_summed = CHECKSUM_COMPLETE;
> + }
> skb->protocol = eth_type_trans(skb, gp->dev);
>
> napi_gro_receive(&gp->napi, skb);
That doesn't change anything.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-17 10:27 ` Andreas Schwab
@ 2018-06-17 22:54 ` Eric Dumazet
2018-06-18 17:54 ` Andreas Schwab
0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-17 22:54 UTC (permalink / raw)
To: Andreas Schwab
Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On 06/17/2018 03:27 AM, Andreas Schwab wrote:
>
> That doesn't change anything.
OK, thanks !
Oh this is silly, please try :
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
{
if (skb->ip_summed == CHECKSUM_COMPLETE) {
- int delta = skb->len - len;
+ __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
- skb->csum = csum_sub(skb->csum,
- skb_checksum(skb, len, delta, 0));
+ skb->csum = csum_block_sub(skb->csum, csumdiff, len);
}
return __pskb_trim(skb, len);
}
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-17 22:54 ` Eric Dumazet
@ 2018-06-18 17:54 ` Andreas Schwab
2018-06-18 18:18 ` Eric Dumazet
2018-06-18 18:29 ` Mathieu Malaterre
0 siblings, 2 replies; 16+ messages in thread
From: Andreas Schwab @ 2018-06-18 17:54 UTC (permalink / raw)
To: Eric Dumazet
Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On Jun 17 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> Oh this is silly, please try :
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
> int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
> {
> if (skb->ip_summed == CHECKSUM_COMPLETE) {
> - int delta = skb->len - len;
> + __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
>
> - skb->csum = csum_sub(skb->csum,
> - skb_checksum(skb, len, delta, 0));
> + skb->csum = csum_block_sub(skb->csum, csumdiff, len);
> }
> return __pskb_trim(skb, len);
> }
That doesn't help either.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-18 17:54 ` Andreas Schwab
@ 2018-06-18 18:18 ` Eric Dumazet
2018-06-18 18:45 ` Mathieu Malaterre
2018-06-18 18:29 ` Mathieu Malaterre
1 sibling, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-18 18:18 UTC (permalink / raw)
To: Andreas Schwab, Eric Dumazet
Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On 06/18/2018 10:54 AM, Andreas Schwab wrote:
> On Jun 17 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> Oh this is silly, please try :
>>
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
>> int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
>> {
>> if (skb->ip_summed == CHECKSUM_COMPLETE) {
>> - int delta = skb->len - len;
>> + __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
>>
>> - skb->csum = csum_sub(skb->csum,
>> - skb_checksum(skb, len, delta, 0));
>> + skb->csum = csum_block_sub(skb->csum, csumdiff, len);
>> }
>> return __pskb_trim(skb, len);
>> }
>
> That doesn't help either.
>
> Andreas.
>
Then maybe NIC provided csum is not correct.
It does not compute a checksum on all the frame, but part of it.
You could use this patch to double check.
diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..277859ea73e35271a10b02011120fca248ec8e71 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -857,6 +857,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
skb->csum = csum_unfold(csum);
+ {
+ __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
+ if (csum != csum_fold(rsum))
+ pr_err_ratelimited("sungem wrong csum : %x/%x, len %u bytes\n", csum, csum_fold(rsum), len);
+ }
skb->ip_summed = CHECKSUM_COMPLETE;
skb->protocol = eth_type_trans(skb, gp->dev);
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-18 17:54 ` Andreas Schwab
2018-06-18 18:18 ` Eric Dumazet
@ 2018-06-18 18:29 ` Mathieu Malaterre
1 sibling, 0 replies; 16+ messages in thread
From: Mathieu Malaterre @ 2018-06-18 18:29 UTC (permalink / raw)
To: schwab
Cc: eric.dumazet, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On Mon, Jun 18, 2018 at 7:54 PM Andreas Schwab <schwab@linux-m68k.org> wrote:
>
> On Jun 17 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> > Oh this is silly, please try :
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
> > int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
> > {
> > if (skb->ip_summed == CHECKSUM_COMPLETE) {
> > - int delta = skb->len - len;
> > + __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
> >
> > - skb->csum = csum_sub(skb->csum,
> > - skb_checksum(skb, len, delta, 0));
> > + skb->csum = csum_block_sub(skb->csum, csumdiff, len);
> > }
> > return __pskb_trim(skb, len);
> > }
>
> That doesn't help either.
seconded (setup g4+sungem):
[ 100.272676] eth0: hw csum failure
[ 100.272710] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #6
[ 100.272716] Call Trace:
[ 100.272733] [dffedbd0] [c069ddb8]
__skb_checksum_complete+0xf0/0x108 (unreliable)
[ 100.272752] [dffedbf0] [c078ea28] __udp4_lib_rcv+0x238/0xf98
[ 100.272767] [dffedc70] [c0731630] ip_local_deliver_finish+0xa8/0x3c4
[ 100.272777] [dffedcb0] [c073243c] ip_local_deliver+0xf0/0x154
[ 100.272786] [dffedcf0] [c07328e8] ip_rcv+0x448/0x774
[ 100.272800] [dffedd50] [c06aeaec] __netif_receive_skb_core+0x5e8/0x1184
[ 100.272811] [dffedde0] [c06bba2c] napi_gro_receive+0x160/0x22c
[ 100.272828] [dffede10] [e1571590] gem_poll+0x7fc/0x1ac0 [sungem]
[ 100.272837] [dffedee0] [c06bacfc] net_rx_action+0x34c/0x618
[ 100.272849] [dffedf60] [c07fd28c] __do_softirq+0x16c/0x5f0
[ 100.272863] [dffedfd0] [c0064c7c] irq_exit+0x110/0x1a8
[ 100.272877] [dffedff0] [c0016170] call_do_irq+0x24/0x3c
[ 100.272890] [c0cf7e80] [c0009a84] do_IRQ+0x98/0x1a0
[ 100.272900] [c0cf7eb0] [c001b474] ret_from_except+0x0/0x14
[ 100.272911] --- interrupt: 501 at arch_cpu_idle+0x30/0x78
LR = arch_cpu_idle+0x30/0x78
[ 100.272920] [c0cf7f70] [c0cf6000] 0xc0cf6000 (unreliable)
[ 100.272935] [c0cf7f80] [c00a3868] do_idle+0xc4/0x158
[ 100.272944] [c0cf7fb0] [c00a3ab4] cpu_startup_entry+0x24/0x28
[ 100.272955] [c0cf7fc0] [c0998820] start_kernel+0x47c/0x490
[ 100.272963] [c0cf7ff0] [00003444] 0x3444
> Andreas.
>
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
> "And now for something completely different."
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-18 18:18 ` Eric Dumazet
@ 2018-06-18 18:45 ` Mathieu Malaterre
2018-06-18 23:29 ` Eric Dumazet
0 siblings, 1 reply; 16+ messages in thread
From: Mathieu Malaterre @ 2018-06-18 18:45 UTC (permalink / raw)
To: eric.dumazet
Cc: schwab, David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
Meelis Roos, netdev, linuxppc-dev
On Mon, Jun 18, 2018 at 8:18 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 06/18/2018 10:54 AM, Andreas Schwab wrote:
> > On Jun 17 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> >> Oh this is silly, please try :
> >>
> >> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >> index c642304f178ce0a4e1358d59e45032a39f76fb3f..54dd9c18ecad817812898d6f335e1794a07dabbe 100644
> >> --- a/net/core/skbuff.c
> >> +++ b/net/core/skbuff.c
> >> @@ -1845,10 +1845,9 @@ EXPORT_SYMBOL(___pskb_trim);
> >> int pskb_trim_rcsum_slow(struct sk_buff *skb, unsigned int len)
> >> {
> >> if (skb->ip_summed == CHECKSUM_COMPLETE) {
> >> - int delta = skb->len - len;
> >> + __wsum csumdiff = skb_checksum(skb, len, skb->len - len, 0);
> >>
> >> - skb->csum = csum_sub(skb->csum,
> >> - skb_checksum(skb, len, delta, 0));
> >> + skb->csum = csum_block_sub(skb->csum, csumdiff, len);
> >> }
> >> return __pskb_trim(skb, len);
> >> }
> >
> > That doesn't help either.
> >
> > Andreas.
> >
>
> Then maybe NIC provided csum is not correct.
>
> It does not compute a checksum on all the frame, but part of it.
>
> You could use this patch to double check.
>
> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..277859ea73e35271a10b02011120fca248ec8e71 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -857,6 +857,11 @@ static int gem_rx(struct gem *gp, int work_to_do)
>
> csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
> skb->csum = csum_unfold(csum);
> + {
> + __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
> + if (csum != csum_fold(rsum))
> + pr_err_ratelimited("sungem wrong csum : %x/%x, len %u bytes\n", csum, csum_fold(rsum), len);
> + }
> skb->ip_summed = CHECKSUM_COMPLETE;
> skb->protocol = eth_type_trans(skb, gp->dev);
>
>
Here is what I get on my side
[ 53.628847] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
[ 53.667063] sungem: sungem wrong csum : eea8/6eec, len 149 bytes
[ 58.648952] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
[ 58.669414] sungem: sungem wrong csum : 5245/b50, len 149 bytes
[ 63.674451] sungem: sungem wrong csum : 2d8/5abd, len 149 bytes
[ 68.678233] sungem: sungem wrong csum : b8fc/a498, len 149 bytes
[ 73.685771] sungem: sungem wrong csum : 374/5a21, len 149 bytes
[ 78.689089] sungem: sungem wrong csum : d81/5014, len 149 bytes
[ 83.683261] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
[ 83.690193] sungem: sungem wrong csum : c2f7/9a9d, len 149 bytes
[ 88.692511] sungem: sungem wrong csum : f4d8/68bc, len 149 bytes
[ 93.699915] sungem: sungem wrong csum : 1370/4a25, len 149 bytes
[ 98.703136] sungem: sungem wrong csum : e1b5/7bdf, len 149 bytes
[ 103.704230] sungem: sungem wrong csum : 5321/a74, len 149 bytes
[ 108.688912] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
[ 108.706559] sungem: sungem wrong csum : ddbc/7fd8, len 149 bytes
[ 113.713189] sungem: sungem wrong csum : 5a65/330, len 149 bytes
[ 113.891697] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
[ 118.717151] sungem: sungem wrong csum : f7c8/65cc, len 149 bytes
[ 123.722680] sungem: sungem wrong csum : 3d7a/201b, len 149 bytes
[ 128.726524] sungem: sungem wrong csum : c8fd/9497, len 149 bytes
[ 133.732045] sungem: sungem wrong csum : de0d/7f87, len 149 bytes
[ 135.529163] sungem: sungem wrong csum : 3089/b6dd, len 96 bytes
[ 135.529208] eth0: hw csum failure
[ 135.529220] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #7
[ 135.529226] Call Trace:
[ 135.529243] [dffedbe0] [c069ddac]
__skb_checksum_complete+0xf0/0x108 (unreliable)
>
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-18 18:45 ` Mathieu Malaterre
@ 2018-06-18 23:29 ` Eric Dumazet
2018-06-18 23:36 ` Eric Dumazet
0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-18 23:29 UTC (permalink / raw)
To: Mathieu Malaterre
Cc: schwab, David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
Meelis Roos, netdev, linuxppc-dev
On 06/18/2018 11:45 AM, Mathieu Malaterre wrote:
>
> Here is what I get on my side
>
> [ 53.628847] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
> [ 53.667063] sungem: sungem wrong csum : eea8/6eec, len 149 bytes
> [ 58.648952] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
> [ 58.669414] sungem: sungem wrong csum : 5245/b50, len 149 bytes
> [ 63.674451] sungem: sungem wrong csum : 2d8/5abd, len 149 bytes
> [ 68.678233] sungem: sungem wrong csum : b8fc/a498, len 149 bytes
> [ 73.685771] sungem: sungem wrong csum : 374/5a21, len 149 bytes
> [ 78.689089] sungem: sungem wrong csum : d81/5014, len 149 bytes
> [ 83.683261] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
> [ 83.690193] sungem: sungem wrong csum : c2f7/9a9d, len 149 bytes
> [ 88.692511] sungem: sungem wrong csum : f4d8/68bc, len 149 bytes
> [ 93.699915] sungem: sungem wrong csum : 1370/4a25, len 149 bytes
> [ 98.703136] sungem: sungem wrong csum : e1b5/7bdf, len 149 bytes
> [ 103.704230] sungem: sungem wrong csum : 5321/a74, len 149 bytes
> [ 108.688912] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
> [ 108.706559] sungem: sungem wrong csum : ddbc/7fd8, len 149 bytes
> [ 113.713189] sungem: sungem wrong csum : 5a65/330, len 149 bytes
> [ 113.891697] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
> [ 118.717151] sungem: sungem wrong csum : f7c8/65cc, len 149 bytes
> [ 123.722680] sungem: sungem wrong csum : 3d7a/201b, len 149 bytes
> [ 128.726524] sungem: sungem wrong csum : c8fd/9497, len 149 bytes
> [ 133.732045] sungem: sungem wrong csum : de0d/7f87, len 149 bytes
> [ 135.529163] sungem: sungem wrong csum : 3089/b6dd, len 96 bytes
> [ 135.529208] eth0: hw csum failure
> [ 135.529220] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #7
> [ 135.529226] Call Trace:
> [ 135.529243] [dffedbe0] [c069ddac]
> __skb_checksum_complete+0xf0/0x108 (unreliable)
Thanks, then I guess next step would be to dump the content of the frames
having a wrong checksum, hoping we find an easy way to discard the CHECKSUM_COMPLETE
in a selective way.
Otherwise, we will need to remove CHECKSUM_COMPLETE setting in this driver.
diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..77a761f95be788bb86c8d917f613c9084818f826 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -857,6 +857,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
skb->csum = csum_unfold(csum);
+ {
+ __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
+ if (csum != csum_fold(rsum) && net_ratelimit())
+ pr_err("sungem wrong csum : %04x/%04x, len %u bytes\n",
+ csum, csum_fold(rsum), len);
+ print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,
+ 16, 1, skb->data, len, true);
+ }
skb->ip_summed = CHECKSUM_COMPLETE;
skb->protocol = eth_type_trans(skb, gp->dev);
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-18 23:29 ` Eric Dumazet
@ 2018-06-18 23:36 ` Eric Dumazet
2018-06-19 19:10 ` Andreas Schwab
0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-18 23:36 UTC (permalink / raw)
To: Eric Dumazet, Mathieu Malaterre
Cc: schwab, David S. Miller, Eric Dumazet, LKML, Christophe LEROY,
Meelis Roos, netdev, linuxppc-dev
On 06/18/2018 04:29 PM, Eric Dumazet wrote:
>
>
> On 06/18/2018 11:45 AM, Mathieu Malaterre wrote:
>
>>
>> Here is what I get on my side
>>
>> [ 53.628847] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
>> [ 53.667063] sungem: sungem wrong csum : eea8/6eec, len 149 bytes
>> [ 58.648952] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
>> [ 58.669414] sungem: sungem wrong csum : 5245/b50, len 149 bytes
>> [ 63.674451] sungem: sungem wrong csum : 2d8/5abd, len 149 bytes
>> [ 68.678233] sungem: sungem wrong csum : b8fc/a498, len 149 bytes
>> [ 73.685771] sungem: sungem wrong csum : 374/5a21, len 149 bytes
>> [ 78.689089] sungem: sungem wrong csum : d81/5014, len 149 bytes
>> [ 83.683261] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
>> [ 83.690193] sungem: sungem wrong csum : c2f7/9a9d, len 149 bytes
>> [ 88.692511] sungem: sungem wrong csum : f4d8/68bc, len 149 bytes
>> [ 93.699915] sungem: sungem wrong csum : 1370/4a25, len 149 bytes
>> [ 98.703136] sungem: sungem wrong csum : e1b5/7bdf, len 149 bytes
>> [ 103.704230] sungem: sungem wrong csum : 5321/a74, len 149 bytes
>> [ 108.688912] sungem: sungem wrong csum : 2095/3d06, len 64 bytes
>> [ 108.706559] sungem: sungem wrong csum : ddbc/7fd8, len 149 bytes
>> [ 113.713189] sungem: sungem wrong csum : 5a65/330, len 149 bytes
>> [ 113.891697] sungem: sungem wrong csum : 4e04/f97, len 64 bytes
>> [ 118.717151] sungem: sungem wrong csum : f7c8/65cc, len 149 bytes
>> [ 123.722680] sungem: sungem wrong csum : 3d7a/201b, len 149 bytes
>> [ 128.726524] sungem: sungem wrong csum : c8fd/9497, len 149 bytes
>> [ 133.732045] sungem: sungem wrong csum : de0d/7f87, len 149 bytes
>> [ 135.529163] sungem: sungem wrong csum : 3089/b6dd, len 96 bytes
>> [ 135.529208] eth0: hw csum failure
>> [ 135.529220] CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0+ #7
>> [ 135.529226] Call Trace:
>> [ 135.529243] [dffedbe0] [c069ddac]
>> __skb_checksum_complete+0xf0/0x108 (unreliable)
>
> Thanks, then I guess next step would be to dump the content of the frames
> having a wrong checksum, hoping we find an easy way to discard the CHECKSUM_COMPLETE
> in a selective way.
>
> Otherwise, we will need to remove CHECKSUM_COMPLETE setting in this driver.
>
> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..77a761f95be788bb86c8d917f613c9084818f826 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -857,6 +857,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
>
> csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
> skb->csum = csum_unfold(csum);
> + {
> + __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
> + if (csum != csum_fold(rsum) && net_ratelimit())
> + pr_err("sungem wrong csum : %04x/%04x, len %u bytes\n",
> + csum, csum_fold(rsum), len);
> + print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,
DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or crossing page boundaries)
> + 16, 1, skb->data, len, true);
> + }
> skb->ip_summed = CHECKSUM_COMPLETE;
> skb->protocol = eth_type_trans(skb, gp->dev);
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-18 23:36 ` Eric Dumazet
@ 2018-06-19 19:10 ` Andreas Schwab
2018-06-19 20:10 ` Eric Dumazet
0 siblings, 1 reply; 16+ messages in thread
From: Andreas Schwab @ 2018-06-19 19:10 UTC (permalink / raw)
To: Eric Dumazet
Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On Jun 18 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or crossing page boundaries)
DUMP_PREFIX_ADDRESS is useless for that purpose.
Here are some samples of broken csums:
[ 853.849225] sungem: sungem wrong csum : 9886/07be, len 94 bytes, c0000001fa187e02
[ 853.849232] raw data: 00000000: 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 10 ...C.b...Q....E.
[ 853.849235] raw data: 00000010: 00 4c cb a0 40 00 40 11 d9 97 c0 a8 0a 01 c0 a8 .L..@.@.........
[ 853.849237] raw data: 00000020: 0a 07 00 7b 00 7b 00 38 69 e1 1c 03 0c f7 00 00 ...{.{.8i.......
[ 853.849240] raw data: 00000030: 08 f0 00 00 15 f0 c0 35 67 67 de d3 ca c9 d9 5b .......5gg.....[
[ 853.849242] raw data: 00000040: 1f ff de d3 d2 86 8f 67 fa f2 de d3 d2 86 8f 38 .......g.......8
[ 853.849244] raw data: 00000050: 2f ff de d3 d2 86 8f 3b ff ff d1 93 bc 50 /......;.....P
[ 857.883052] sungem: sungem wrong csum : dbb4/c48f, len 94 bytes, c0000001fa185882
[ 857.883058] raw data: 00000000: 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 00 ...C.b...Q....E.
[ 857.883070] raw data: 00000010: 00 4c a1 97 40 00 3a 11 ce ed d9 5b 2c 11 c0 a8 .L..@.:....[,...
[ 857.883080] raw data: 00000020: 0a 07 00 7b 00 7b 00 38 14 4b 24 02 06 ea 00 00 ...{.{.8.K$.....
[ 857.883085] raw data: 00000030: 00 0b 00 00 02 99 c0 a8 64 09 de d3 d2 5a 36 e4 ........d....Z6.
[ 857.883090] raw data: 00000040: bc f5 de d3 d2 8a 8f 2c 17 44 de d3 d2 8a 93 8b .......,.D......
[ 857.883094] raw data: 00000050: d7 b7 de d3 d2 8a 93 97 69 6e 39 7b d2 5a ........in9{.Z
[ 858.124689] sungem: sungem wrong csum : 1f4f/02d0, len 118 bytes, c0000001fa185602
[ 858.124700] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01 ...C.b.=~LH...a.
[ 858.124705] raw data: 00000010: 1e b1 00 3c 06 40 20 01 0a 62 17 11 88 01 00 00 ...<.@ ..b......
[ 858.124709] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00 .....8 ..b......
[ 858.124714] raw data: 00000030: 00 00 00 00 00 07 94 b4 00 16 86 f5 29 e8 36 cb ............).6.
[ 858.124718] raw data: 00000040: 50 49 80 18 05 93 9a 53 00 00 01 01 08 0a 58 b2 PI.....S......X.
[ 858.124723] raw data: 00000050: de 54 61 5f 2f 3c 00 00 00 10 cc 08 55 f7 da 21 .Ta_/<......U..!
[ 858.124727] raw data: 00000060: f4 60 0a 6b 3c aa b9 b3 7e 61 10 b8 c2 be 9a 0b .`.k<...~a......
[ 858.124731] raw data: 00000070: c7 e9 5b 97 1b ac ..[...
[ 858.126522] sungem: sungem wrong csum : 0836/19e9, len 90 bytes, c0000001fa185382
[ 858.126530] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01 ...C.b.=~LH...a.
[ 858.126532] raw data: 00000010: 1e b1 00 20 06 40 20 01 0a 62 17 11 88 01 00 00 ... .@ ..b......
[ 858.126535] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00 .....8 ..b......
[ 858.126537] raw data: 00000030: 00 00 00 00 00 07 94 b4 00 16 86 f5 2a 04 36 cb ............*.6.
[ 858.126540] raw data: 00000040: 50 65 80 10 05 93 3e 56 00 00 01 01 08 0a 58 b2 Pe....>V......X.
[ 858.126542] raw data: 00000050: de 56 61 5f 30 4d 1d 58 42 d2 .Va_0M.XB.
[ 858.131559] sungem: sungem wrong csum : 5891/c98d, len 90 bytes, c0000001fa185102
[ 858.131567] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01 ...C.b.=~LH...a.
[ 858.131570] raw data: 00000010: 1e b1 00 20 06 40 20 01 0a 62 17 11 88 01 00 00 ... .@ ..b......
[ 858.131572] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00 .....8 ..b......
[ 858.131574] raw data: 00000030: 00 00 00 00 00 07 94 b4 00 16 86 f5 2a 04 36 cb ............*.6.
[ 858.131577] raw data: 00000040: 50 a1 80 10 05 93 3e 10 00 00 01 01 08 0a 58 b2 P.....>.......X.
[ 858.131579] raw data: 00000050: de 5b 61 5f 30 52 3f ea 70 9b .[a_0R?.p.
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-19 19:10 ` Andreas Schwab
@ 2018-06-19 20:10 ` Eric Dumazet
2018-06-19 22:10 ` Eric Dumazet
0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-19 20:10 UTC (permalink / raw)
To: Andreas Schwab, Eric Dumazet
Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On 06/19/2018 12:10 PM, Andreas Schwab wrote:
> On Jun 18 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or crossing page boundaries)
>
> DUMP_PREFIX_ADDRESS is useless for that purpose.
>
> Here are some samples of broken csums:
>
> [ 853.849225] sungem: sungem wrong csum : 9886/07be, len 94 bytes, c0000001fa187e02
> [ 853.849232] raw data: 00000000: 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 10 ...C.b...Q....E.
> [ 853.849235] raw data: 00000010: 00 4c cb a0 40 00 40 11 d9 97 c0 a8 0a 01 c0 a8 .L..@.@.........
> [ 853.849237] raw data: 00000020: 0a 07 00 7b 00 7b 00 38 69 e1 1c 03 0c f7 00 00 ...{.{.8i.......
> [ 853.849240] raw data: 00000030: 08 f0 00 00 15 f0 c0 35 67 67 de d3 ca c9 d9 5b .......5gg.....[
> [ 853.849242] raw data: 00000040: 1f ff de d3 d2 86 8f 67 fa f2 de d3 d2 86 8f 38 .......g.......8
> [ 853.849244] raw data: 00000050: 2f ff de d3 d2 86 8f 3b ff ff d1 93 bc 50 /......;.....P
Thanks.
4 bytes in excess.
Might be the FCS, and it does not look like provided csum has a relation with it.
For some reason FCS stripping was disabled by :
commit 3e32011d4da6424b3bc65b1e1a047e30ac9882c7
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date: Mon May 19 09:39:11 2003 -0700
[SUNGEM]: Updates from PowerPC people.
Support more chips and split out all the complex PHY
handling into a seperate file.
Since this NIC never had CHECKSUM_COMPLETE support (since we have to trim each skb,
thus were forcing ip_summed to CHECKSUM_NONE) we probably should remove it and be happy.
Unless you guys find a way to let the NIC strip the FCS, and double check the csum is a real csum ;)
diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..85439308375e95c3854e4a1561697d69ec85399b 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -760,7 +760,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
struct net_device *dev = gp->dev;
int entry, drops, work_done = 0;
u32 done;
- __sum16 csum;
if (netif_msg_rx_status(gp))
printk(KERN_DEBUG "%s: rx interrupt, done: %d, rx_new: %d\n",
@@ -855,9 +854,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
skb = copy_skb;
}
- csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
- skb->csum = csum_unfold(csum);
- skb->ip_summed = CHECKSUM_COMPLETE;
skb->protocol = eth_type_trans(skb, gp->dev);
napi_gro_receive(&gp->napi, skb);
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-19 20:10 ` Eric Dumazet
@ 2018-06-19 22:10 ` Eric Dumazet
2018-06-19 22:32 ` Andreas Schwab
0 siblings, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2018-06-19 22:10 UTC (permalink / raw)
To: Andreas Schwab
Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On 06/19/2018 01:10 PM, Eric Dumazet wrote:
>
>
> On 06/19/2018 12:10 PM, Andreas Schwab wrote:
>> On Jun 18 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>> DUMP_PREFIX_ADDRESS might give us more information (say alignment problem, or crossing page boundaries)
>>
>> DUMP_PREFIX_ADDRESS is useless for that purpose.
>>
>> Here are some samples of broken csums:
>>
>> [ 853.849225] sungem: sungem wrong csum : 9886/07be, len 94 bytes, c0000001fa187e02
>> [ 853.849232] raw data: 00000000: 00 0d 93 43 81 62 18 d6 c7 51 b8 1c 08 00 45 10 ...C.b...Q....E.
>> [ 853.849235] raw data: 00000010: 00 4c cb a0 40 00 40 11 d9 97 c0 a8 0a 01 c0 a8 .L..@.@.........
>> [ 853.849237] raw data: 00000020: 0a 07 00 7b 00 7b 00 38 69 e1 1c 03 0c f7 00 00 ...{.{.8i.......
>> [ 853.849240] raw data: 00000030: 08 f0 00 00 15 f0 c0 35 67 67 de d3 ca c9 d9 5b .......5gg.....[
>> [ 853.849242] raw data: 00000040: 1f ff de d3 d2 86 8f 67 fa f2 de d3 d2 86 8f 38 .......g.......8
>> [ 853.849244] raw data: 00000050: 2f ff de d3 d2 86 8f 3b ff ff d1 93 bc 50 /......;.....P
>
> Thanks.
>
> 4 bytes in excess.
>
> Might be the FCS, and it does not look like provided csum has a relation with it.
>
> For some reason FCS stripping was disabled by :
>
> commit 3e32011d4da6424b3bc65b1e1a047e30ac9882c7
> Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Date: Mon May 19 09:39:11 2003 -0700
>
> [SUNGEM]: Updates from PowerPC people.
>
> Support more chips and split out all the complex PHY
> handling into a seperate file.
>
>
> Since this NIC never had CHECKSUM_COMPLETE support (since we have to trim each skb,
> thus were forcing ip_summed to CHECKSUM_NONE) we probably should remove it and be happy.
>
> Unless you guys find a way to let the NIC strip the FCS, and double check the csum is a real csum ;)
>
> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..85439308375e95c3854e4a1561697d69ec85399b 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -760,7 +760,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
> struct net_device *dev = gp->dev;
> int entry, drops, work_done = 0;
> u32 done;
> - __sum16 csum;
>
> if (netif_msg_rx_status(gp))
> printk(KERN_DEBUG "%s: rx interrupt, done: %d, rx_new: %d\n",
> @@ -855,9 +854,6 @@ static int gem_rx(struct gem *gp, int work_to_do)
> skb = copy_skb;
> }
>
> - csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
> - skb->csum = csum_unfold(csum);
> - skb->ip_summed = CHECKSUM_COMPLETE;
> skb->protocol = eth_type_trans(skb, gp->dev);
>
> napi_gro_receive(&gp->napi, skb);
>
If you have time, you also could check if changing the suspect (14 / 2) to ETH_HLEN would help
(and also enabling STRIP_FCS)
diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..672d6748ab44f0890e92d5ca55d6ff6834c20dc9 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -60,8 +60,7 @@
#include <linux/sungem_phy.h>
#include "sungem.h"
-/* Stripping FCS is causing problems, disabled for now */
-#undef STRIP_FCS
+#define STRIP_FCS
#define DEFAULT_MSG (NETIF_MSG_DRV | \
NETIF_MSG_PROBE | \
@@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
writel(desc_dma & 0xffffffff, gp->regs + RXDMA_DBLOW);
writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
- ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
+ (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
writel(val, gp->regs + RXDMA_CFG);
if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
writel(((5 & RXDMA_BLANK_IPKTS) |
@@ -857,6 +856,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
skb->csum = csum_unfold(csum);
+ {
+ __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
+ if (csum != csum_fold(rsum) && net_ratelimit())
+ pr_err("sungem wrong csum : %x/%x, len %u bytes\n",
+ csum, csum_fold(rsum), len);
+ print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,
+ 16, 1, skb->data, len, true);
+ }
skb->ip_summed = CHECKSUM_COMPLETE;
skb->protocol = eth_type_trans(skb, gp->dev);
@@ -1761,7 +1768,7 @@ static void gem_init_dma(struct gem *gp)
writel(0, gp->regs + TXDMA_KICK);
val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
- ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
+ (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
writel(val, gp->regs + RXDMA_CFG);
writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-19 22:10 ` Eric Dumazet
@ 2018-06-19 22:32 ` Andreas Schwab
2018-06-19 22:40 ` Eric Dumazet
0 siblings, 1 reply; 16+ messages in thread
From: Andreas Schwab @ 2018-06-19 22:32 UTC (permalink / raw)
To: Eric Dumazet
Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On Jun 19 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..672d6748ab44f0890e92d5ca55d6ff6834c20dc9 100644
> --- a/drivers/net/ethernet/sun/sungem.c
> +++ b/drivers/net/ethernet/sun/sungem.c
> @@ -60,8 +60,7 @@
> #include <linux/sungem_phy.h>
> #include "sungem.h"
>
> -/* Stripping FCS is causing problems, disabled for now */
> -#undef STRIP_FCS
> +#define STRIP_FCS
>
> #define DEFAULT_MSG (NETIF_MSG_DRV | \
> NETIF_MSG_PROBE | \
> @@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
> writel(desc_dma & 0xffffffff, gp->regs + RXDMA_DBLOW);
> writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
> val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
> - ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
> + (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
> writel(val, gp->regs + RXDMA_CFG);
> if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
> writel(((5 & RXDMA_BLANK_IPKTS) |
> @@ -857,6 +856,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
>
> csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
> skb->csum = csum_unfold(csum);
> + {
> + __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
> + if (csum != csum_fold(rsum) && net_ratelimit())
> + pr_err("sungem wrong csum : %x/%x, len %u bytes\n",
> + csum, csum_fold(rsum), len);
> + print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,
> + 16, 1, skb->data, len, true);
> + }
> skb->ip_summed = CHECKSUM_COMPLETE;
> skb->protocol = eth_type_trans(skb, gp->dev);
>
> @@ -1761,7 +1768,7 @@ static void gem_init_dma(struct gem *gp)
> writel(0, gp->regs + TXDMA_KICK);
>
> val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
> - ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
> + (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
> writel(val, gp->regs + RXDMA_CFG);
>
> writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);
With that patch I still get the wrong csum messages, but no longer the
hw csum failure messages (tested on a PowerMac G5).
[ 662.659767] sungem: sungem wrong csum : 8359/7ca6, len 86 bytes, c0000001fee9cc02
[ 662.659775] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01 ...C.b.=~LH...a.
[ 662.659778] raw data: 00000010: 1c 1e 00 20 06 40 20 01 0a 62 17 11 88 01 00 00 ... .@ ..b......
[ 662.659780] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00 .....8 ..b......
[ 662.659783] raw data: 00000030: 00 00 00 00 00 07 9a 18 00 16 c1 9a 7e ea ea 44 ............~..D
[ 662.659785] raw data: 00000040: fb 4a 80 10 05 93 44 08 00 00 01 01 08 0a 59 68 .J....D.......Yh
[ 662.659788] raw data: 00000050: ba e2 0e bb ac ae ......
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"
2018-06-19 22:32 ` Andreas Schwab
@ 2018-06-19 22:40 ` Eric Dumazet
0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2018-06-19 22:40 UTC (permalink / raw)
To: Andreas Schwab, Eric Dumazet
Cc: Mathieu Malaterre, David S. Miller, Eric Dumazet, LKML,
Christophe LEROY, Meelis Roos, netdev, linuxppc-dev
On 06/19/2018 03:32 PM, Andreas Schwab wrote:
> On Jun 19 2018, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>> diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
>> index 7a16d40a72d13cf1d522e8a3a396c826fe76f9b9..672d6748ab44f0890e92d5ca55d6ff6834c20dc9 100644
>> --- a/drivers/net/ethernet/sun/sungem.c
>> +++ b/drivers/net/ethernet/sun/sungem.c
>> @@ -60,8 +60,7 @@
>> #include <linux/sungem_phy.h>
>> #include "sungem.h"
>>
>> -/* Stripping FCS is causing problems, disabled for now */
>> -#undef STRIP_FCS
>> +#define STRIP_FCS
>>
>> #define DEFAULT_MSG (NETIF_MSG_DRV | \
>> NETIF_MSG_PROBE | \
>> @@ -435,7 +434,7 @@ static int gem_rxmac_reset(struct gem *gp)
>> writel(desc_dma & 0xffffffff, gp->regs + RXDMA_DBLOW);
>> writel(RX_RING_SIZE - 4, gp->regs + RXDMA_KICK);
>> val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
>> - ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
>> + (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
>> writel(val, gp->regs + RXDMA_CFG);
>> if (readl(gp->regs + GREG_BIFCFG) & GREG_BIFCFG_M66EN)
>> writel(((5 & RXDMA_BLANK_IPKTS) |
>> @@ -857,6 +856,14 @@ static int gem_rx(struct gem *gp, int work_to_do)
>>
>> csum = (__force __sum16)htons((status & RXDCTRL_TCPCSUM) ^ 0xffff);
>> skb->csum = csum_unfold(csum);
>> + {
>> + __wsum rsum = csum_partial(skb->data + ETH_HLEN, len - ETH_HLEN, 0);
>> + if (csum != csum_fold(rsum) && net_ratelimit())
>> + pr_err("sungem wrong csum : %x/%x, len %u bytes\n",
>> + csum, csum_fold(rsum), len);
>> + print_hex_dump(KERN_ERR, "raw data: ", DUMP_PREFIX_OFFSET,
>> + 16, 1, skb->data, len, true);
>> + }
>> skb->ip_summed = CHECKSUM_COMPLETE;
>> skb->protocol = eth_type_trans(skb, gp->dev);
>>
>> @@ -1761,7 +1768,7 @@ static void gem_init_dma(struct gem *gp)
>> writel(0, gp->regs + TXDMA_KICK);
>>
>> val = (RXDMA_CFG_BASE | (RX_OFFSET << 10) |
>> - ((14 / 2) << 13) | RXDMA_CFG_FTHRESH_128);
>> + (ETH_HLEN << 13) | RXDMA_CFG_FTHRESH_128);
>> writel(val, gp->regs + RXDMA_CFG);
>>
>> writel(desc_dma >> 32, gp->regs + RXDMA_DBHI);
>
> With that patch I still get the wrong csum messages, but no longer the
> hw csum failure messages (tested on a PowerMac G5).
>
> [ 662.659767] sungem: sungem wrong csum : 8359/7ca6, len 86 bytes, c0000001fee9cc02
> [ 662.659775] raw data: 00000000: 00 0d 93 43 81 62 d4 3d 7e 4c 48 b7 86 dd 61 01 ...C.b.=~LH...a.
> [ 662.659778] raw data: 00000010: 1c 1e 00 20 06 40 20 01 0a 62 17 11 88 01 00 00 ... .@ ..b......
> [ 662.659780] raw data: 00000020: 00 00 00 00 0a 38 20 01 0a 62 17 11 88 01 00 00 .....8 ..b......
> [ 662.659783] raw data: 00000030: 00 00 00 00 00 07 9a 18 00 16 c1 9a 7e ea ea 44 ............~..D
> [ 662.659785] raw data: 00000040: fb 4a 80 10 05 93 44 08 00 00 01 01 08 0a 59 68 .J....D.......Yh
> [ 662.659788] raw data: 00000050: ba e2 0e bb ac ae ......
>
> Andreas.
>
Note that 8359 and 7ca6 are the same really (a missing ~ to invert csum_partial())
So the bug was that :
1) Driver programmed a wrong start offset for the csum (7 bytes instead of 14 bytes to skip Ethernet Header)
2) FCS was not stripped.
Basically CHECKSUM_COMPLETE support never worked, but this was hidden by the fact that linux stack
had to throw away this CHECKSUM_COMPLETE because the FCS had to be removed.
Thanks !
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2018-06-19 22:41 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <20180615185645.8921-1-malat@debian.org>
[not found] ` <fbb95c11-240c-1a11-0a62-0483908c577e@gmail.com>
2018-06-16 7:14 ` [PATCH] Revert "net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends" Mathieu Malaterre
2018-06-16 12:45 ` Eric Dumazet
2018-06-17 10:27 ` Andreas Schwab
2018-06-17 22:54 ` Eric Dumazet
2018-06-18 17:54 ` Andreas Schwab
2018-06-18 18:18 ` Eric Dumazet
2018-06-18 18:45 ` Mathieu Malaterre
2018-06-18 23:29 ` Eric Dumazet
2018-06-18 23:36 ` Eric Dumazet
2018-06-19 19:10 ` Andreas Schwab
2018-06-19 20:10 ` Eric Dumazet
2018-06-19 22:10 ` Eric Dumazet
2018-06-19 22:32 ` Andreas Schwab
2018-06-19 22:40 ` Eric Dumazet
2018-06-18 18:29 ` Mathieu Malaterre
2018-06-17 10:09 ` Andreas Schwab
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).