All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] net: small csum optimizations
@ 2021-11-24 20:24 Eric Dumazet
  2021-11-24 20:24 ` [PATCH net-next 1/2] gro: optimize skb_gro_postpull_rcsum() Eric Dumazet
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Eric Dumazet @ 2021-11-24 20:24 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet

From: Eric Dumazet <edumazet@google.com>

After recent x86 csum_partial() optimizations, we can more easily
see in kernel profiles costs of add/adc operations that could
be avoided, by feeding a non zero third argument to csum_partial()

Eric Dumazet (2):
  gro: optimize skb_gro_postpull_rcsum()
  net: optimize skb_postpull_rcsum()

 include/linux/skbuff.h | 6 +++++-
 include/net/gro.h      | 4 ++--
 2 files changed, 7 insertions(+), 3 deletions(-)

-- 
2.34.0.rc2.393.gf8c9666880-goog


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH net-next 1/2] gro: optimize skb_gro_postpull_rcsum()
  2021-11-24 20:24 [PATCH net-next 0/2] net: small csum optimizations Eric Dumazet
@ 2021-11-24 20:24 ` Eric Dumazet
  2021-11-24 20:24 ` [PATCH net-next 2/2] net: optimize skb_postpull_rcsum() Eric Dumazet
  2021-11-26  5:20 ` [PATCH net-next 0/2] net: small csum optimizations patchwork-bot+netdevbpf
  2 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2021-11-24 20:24 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet

From: Eric Dumazet <edumazet@google.com>

We can leverage third argument to csum_partial():

  X = csum_sub(X, csum_partial(start, len, 0));

  -->

  X = csum_add(X, ~csum_partial(start, len, 0));

  -->

  X = ~csum_partial(start, len, ~X);

This removes one add/adc pair and its dependency against the carry flag.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/gro.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/gro.h b/include/net/gro.h
index 9c22a010369cb89f9511d78cc322be56170d7b20..b1139fca7c435ca0c353c4ed17628dd7f3df4401 100644
--- a/include/net/gro.h
+++ b/include/net/gro.h
@@ -173,8 +173,8 @@ static inline void skb_gro_postpull_rcsum(struct sk_buff *skb,
 					const void *start, unsigned int len)
 {
 	if (NAPI_GRO_CB(skb)->csum_valid)
-		NAPI_GRO_CB(skb)->csum = csum_sub(NAPI_GRO_CB(skb)->csum,
-						  csum_partial(start, len, 0));
+		NAPI_GRO_CB(skb)->csum = ~csum_partial(start, len,
+						       ~NAPI_GRO_CB(skb)->csum);
 }
 
 /* GRO checksum functions. These are logical equivalents of the normal
-- 
2.34.0.rc2.393.gf8c9666880-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-11-24 20:24 [PATCH net-next 0/2] net: small csum optimizations Eric Dumazet
  2021-11-24 20:24 ` [PATCH net-next 1/2] gro: optimize skb_gro_postpull_rcsum() Eric Dumazet
@ 2021-11-24 20:24 ` Eric Dumazet
  2021-11-25  9:41   ` David Laight
  2021-12-02 13:10   ` Vladimir Oltean
  2021-11-26  5:20 ` [PATCH net-next 0/2] net: small csum optimizations patchwork-bot+netdevbpf
  2 siblings, 2 replies; 26+ messages in thread
From: Eric Dumazet @ 2021-11-24 20:24 UTC (permalink / raw)
  To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet

From: Eric Dumazet <edumazet@google.com>

Remove one pair of add/adc instructions and their dependency
against carry flag.

We can leverage third argument to csum_partial():

  X = csum_block_sub(X, csum_partial(start, len, 0), 0);

  -->

  X = csum_block_add(X, ~csum_partial(start, len, 0), 0);

  -->

  X = ~csum_partial(start, len, ~X);

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/skbuff.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index eba256af64a577b458998845f2dc01a5ec80745a..eae4bd3237a41cc1b60b44c91fbfe21dfdd8f117 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3485,7 +3485,11 @@ __skb_postpull_rcsum(struct sk_buff *skb, const void *start, unsigned int len,
 static inline void skb_postpull_rcsum(struct sk_buff *skb,
 				      const void *start, unsigned int len)
 {
-	__skb_postpull_rcsum(skb, start, len, 0);
+	if (skb->ip_summed == CHECKSUM_COMPLETE)
+		skb->csum = ~csum_partial(start, len, ~skb->csum);
+	else if (skb->ip_summed == CHECKSUM_PARTIAL &&
+		 skb_checksum_start_offset(skb) < 0)
+		skb->ip_summed = CHECKSUM_NONE;
 }
 
 static __always_inline void
-- 
2.34.0.rc2.393.gf8c9666880-goog


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* RE: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-11-24 20:24 ` [PATCH net-next 2/2] net: optimize skb_postpull_rcsum() Eric Dumazet
@ 2021-11-25  9:41   ` David Laight
  2021-11-25 13:32     ` Eric Dumazet
  2021-12-02 13:10   ` Vladimir Oltean
  1 sibling, 1 reply; 26+ messages in thread
From: David Laight @ 2021-11-25  9:41 UTC (permalink / raw)
  To: 'Eric Dumazet', David S . Miller, Jakub Kicinski
  Cc: netdev, Eric Dumazet

From: Eric Dumazet
> Sent: 24 November 2021 20:25
> 
> From: Eric Dumazet <edumazet@google.com>
> 
> Remove one pair of add/adc instructions and their dependency
> against carry flag.
> 
> We can leverage third argument to csum_partial():
> 
>   X = csum_block_sub(X, csum_partial(start, len, 0), 0);
> 
>   -->
> 
>   X = csum_block_add(X, ~csum_partial(start, len, 0), 0);
> 
>   -->
> 
>   X = ~csum_partial(start, len, ~X);

That doesn't seem to refer to the change in this file.

	David
 
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/skbuff.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index eba256af64a577b458998845f2dc01a5ec80745a..eae4bd3237a41cc1b60b44c91fbfe21dfdd8f117 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -3485,7 +3485,11 @@ __skb_postpull_rcsum(struct sk_buff *skb, const void *start, unsigned int len,
>  static inline void skb_postpull_rcsum(struct sk_buff *skb,
>  				      const void *start, unsigned int len)
>  {
> -	__skb_postpull_rcsum(skb, start, len, 0);
> +	if (skb->ip_summed == CHECKSUM_COMPLETE)
> +		skb->csum = ~csum_partial(start, len, ~skb->csum);
> +	else if (skb->ip_summed == CHECKSUM_PARTIAL &&
> +		 skb_checksum_start_offset(skb) < 0)
> +		skb->ip_summed = CHECKSUM_NONE;
>  }
> 
>  static __always_inline void
> --
> 2.34.0.rc2.393.gf8c9666880-goog

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-11-25  9:41   ` David Laight
@ 2021-11-25 13:32     ` Eric Dumazet
  2021-11-25 14:29       ` David Laight
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2021-11-25 13:32 UTC (permalink / raw)
  To: David Laight; +Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

On Thu, Nov 25, 2021 at 1:41 AM David Laight <David.Laight@aculab.com> wrote:
>
> From: Eric Dumazet
> > Sent: 24 November 2021 20:25
> >
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Remove one pair of add/adc instructions and their dependency
> > against carry flag.
> >
> > We can leverage third argument to csum_partial():
> >
> >   X = csum_block_sub(X, csum_partial(start, len, 0), 0);
> >
> >   -->
> >
> >   X = csum_block_add(X, ~csum_partial(start, len, 0), 0);
> >
> >   -->
> >
> >   X = ~csum_partial(start, len, ~X);
>
> That doesn't seem to refer to the change in this file.
>

It is describing the change.

The first step, of copying first the __skb_postpull_rcsum(skb, start,
len, 0) content
into __skb_postpull_rcsum() was kind of obvious.



>         David
>
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > ---
> >  include/linux/skbuff.h | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index eba256af64a577b458998845f2dc01a5ec80745a..eae4bd3237a41cc1b60b44c91fbfe21dfdd8f117 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -3485,7 +3485,11 @@ __skb_postpull_rcsum(struct sk_buff *skb, const void *start, unsigned int len,
> >  static inline void skb_postpull_rcsum(struct sk_buff *skb,
> >                                     const void *start, unsigned int len)
> >  {
> > -     __skb_postpull_rcsum(skb, start, len, 0);
> > +     if (skb->ip_summed == CHECKSUM_COMPLETE)
> > +             skb->csum = ~csum_partial(start, len, ~skb->csum);
> > +     else if (skb->ip_summed == CHECKSUM_PARTIAL &&
> > +              skb_checksum_start_offset(skb) < 0)
> > +             skb->ip_summed = CHECKSUM_NONE;
> >  }
> >
> >  static __always_inline void
> > --
> > 2.34.0.rc2.393.gf8c9666880-goog
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-11-25 13:32     ` Eric Dumazet
@ 2021-11-25 14:29       ` David Laight
  0 siblings, 0 replies; 26+ messages in thread
From: David Laight @ 2021-11-25 14:29 UTC (permalink / raw)
  To: 'Eric Dumazet'
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

There is another optimisation you can do that removes a conditional
from the end of the checksum generation.

The 'adc' sum is reduced to 16 bits leaving a value [1..0xffff].
This is then inverted to get the checksum, but the range is
then [0..0xfffe].
So 0 then has to be converted to 0xffff.

If you add 1 to thw inut checksum one of the csum_partial() calls
the adc sum is one too big, so the inverted value is one too small.
Adding 1 to the inverted value fixes this and leaves a checksum
in the correct range.

Potentially the invert+increment can be done as a negate prior
to the final masking with 0xffff.
(Which the compiler may well sort out for you.)

You do need to know 'early' that the checksum is going to get
inverted - or too many places might add in the extra 'one'.

On 64bit systems the 'input checksum' to csum_partial() can
(almost certainly) be made a long - with a proviso that the
value must not exceed 2**56 because the function might want
to add a partial word to it.

I'm also not sure how well any of this runs on mips-like cpu
that don't have a carry flag (I think this includes riscV).
On 64bit cpu it may be best to add 32bit values to 64bit registers.

With 2 memory read ports it is even possibly that an x86 cpu
can do 8 bytes/clock by adding 32 bit values to two registers.
However the reads would have to be aligned and arranged to
avoid cache bank conflicts.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 0/2] net: small csum optimizations
  2021-11-24 20:24 [PATCH net-next 0/2] net: small csum optimizations Eric Dumazet
  2021-11-24 20:24 ` [PATCH net-next 1/2] gro: optimize skb_gro_postpull_rcsum() Eric Dumazet
  2021-11-24 20:24 ` [PATCH net-next 2/2] net: optimize skb_postpull_rcsum() Eric Dumazet
@ 2021-11-26  5:20 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 26+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-11-26  5:20 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, kuba, netdev, edumazet

Hello:

This series was applied to netdev/net-next.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 24 Nov 2021 12:24:44 -0800 you wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> After recent x86 csum_partial() optimizations, we can more easily
> see in kernel profiles costs of add/adc operations that could
> be avoided, by feeding a non zero third argument to csum_partial()
> 
> Eric Dumazet (2):
>   gro: optimize skb_gro_postpull_rcsum()
>   net: optimize skb_postpull_rcsum()
> 
> [...]

Here is the summary with links:
  - [net-next,1/2] gro: optimize skb_gro_postpull_rcsum()
    https://git.kernel.org/netdev/net-next/c/0bd28476f636
  - [net-next,2/2] net: optimize skb_postpull_rcsum()
    https://git.kernel.org/netdev/net-next/c/29c3002644bd

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-11-24 20:24 ` [PATCH net-next 2/2] net: optimize skb_postpull_rcsum() Eric Dumazet
  2021-11-25  9:41   ` David Laight
@ 2021-12-02 13:10   ` Vladimir Oltean
  2021-12-02 14:51     ` Eric Dumazet
  2021-12-02 15:06     ` David Laight
  1 sibling, 2 replies; 26+ messages in thread
From: Vladimir Oltean @ 2021-12-02 13:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, netdev, Eric Dumazet, David Laight

Hello,

On Wed, Nov 24, 2021 at 12:24:46PM -0800, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> Remove one pair of add/adc instructions and their dependency
> against carry flag.
> 
> We can leverage third argument to csum_partial():
> 
>   X = csum_block_sub(X, csum_partial(start, len, 0), 0);
> 
>   -->
> 
>   X = csum_block_add(X, ~csum_partial(start, len, 0), 0);
> 
>   -->
> 
>   X = ~csum_partial(start, len, ~X);
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  include/linux/skbuff.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index eba256af64a577b458998845f2dc01a5ec80745a..eae4bd3237a41cc1b60b44c91fbfe21dfdd8f117 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -3485,7 +3485,11 @@ __skb_postpull_rcsum(struct sk_buff *skb, const void *start, unsigned int len,
>  static inline void skb_postpull_rcsum(struct sk_buff *skb,
>  				      const void *start, unsigned int len)
>  {
> -	__skb_postpull_rcsum(skb, start, len, 0);
> +	if (skb->ip_summed == CHECKSUM_COMPLETE)
> +		skb->csum = ~csum_partial(start, len, ~skb->csum);
> +	else if (skb->ip_summed == CHECKSUM_PARTIAL &&
> +		 skb_checksum_start_offset(skb) < 0)
> +		skb->ip_summed = CHECKSUM_NONE;
>  }
>  
>  static __always_inline void
> -- 
> 2.34.0.rc2.393.gf8c9666880-goog
> 

I am seeing some errors after this patch, and I am not able to
understand why. Specifically, __skb_gro_checksum_complete() hits this
condition:

	wsum = skb_checksum(skb, skb_gro_offset(skb), skb_gro_len(skb), 0);

	/* NAPI_GRO_CB(skb)->csum holds pseudo checksum */
	sum = csum_fold(csum_add(NAPI_GRO_CB(skb)->csum, wsum));
	/* See comments in __skb_checksum_complete(). */
	if (likely(!sum)) {
		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
		    !skb->csum_complete_sw)
			netdev_rx_csum_fault(skb->dev, skb);
	}

To test, I am using a DSA switch network interface with an IPv4 address
and pinging through it.

The idea is as follows: an enetc port is attached to a switch, and that
switch places a frame header before the Ethernet header.
The enetc calculates the L2 payload (actually what it perceives as L2
payload, since it has no insight into the switch header format) checksum
and puts it in skb->csum:
https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/freescale/enetc/enetc.c#L726

Then, the switch driver packet type handler is invoked, and this strips
that header and recalculates the checksum (then it changes skb->dev and
this is how pinging is done through the DSA interface, but that is
irrelevant).
https://elixir.bootlin.com/linux/latest/source/net/dsa/tag_ocelot.c#L56

There seems to be a disparity when the skb->csum is calculated by
skb_postpull_rcsum as zero. Before, it was calculated as 0xffff.

Below is a dump added by me in skb_postpull_rcsum when the checksums
calculated through both methods differ. I've kept the old implementation
inside skb->csum and this is what skb_dump() sees:

[   99.891524] skb csum of 20 bytes (20 to left of skb->data) using method 1: 0x0 method 2 0xffffffff, orig 0xf470
[   99.901701] skb len=84 headroom=98 headlen=84 tailroom=1546
[   99.901701] mac=(84,-6) net=(78,0) trans=78
[   99.901701] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[   99.901701] csum(0xffffffff ip_summed=2 complete_sw=0 valid=0 level=0)
[   99.901701] hash(0x0 sw=0 l4=0) proto=0x00f8 pkttype=3 iif=7
[   99.929916] dev name=eno2 feat=0x0x00020100001149a9
[   99.934831] skb headroom: 00000000: 00 c0 b1 a4 ff ff 00 00 00 e0 b1 a4 ff ff 00 00
[   99.942533] skb headroom: 00000010: 00 6f 5b 02 f8 14 ff ff 40 62 5b 02 f8 14 ff ff
[   99.950232] skb headroom: 00000020: 21 6f 5b 02 f8 14 ff ff 00 00 00 00 00 00 00 00
[   99.957931] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00
[   99.965631] skb headroom: 00000040: 88 80 00 0a 00 33 9d 40 f0 41 01 80 00 00 08 0f
[   99.973330] skb headroom: 00000050: 00 10 00 00 00 04 9f 05 f6 28 ba ae e4 b6 2c 3d
[   99.981028] skb headroom: 00000060: 08 00
[   99.985062] skb linear:   00000000: 45 00 00 54 27 ac 00 00 40 01 09 a8 c0 a8 64 03
[   99.992762] skb linear:   00000010: c0 a8 64 01 00 00 10 e6 01 5c 00 04 49 30 a7 61
[  100.000462] skb linear:   00000020: 00 00 00 00 3d 55 01 00 00 00 00 00 10 11 12 13
[  100.008162] skb linear:   00000030: 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23
[  100.015862] skb linear:   00000040: 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33
[  100.023561] skb linear:   00000050: 34 35 36 37

And below is the same output as above, but annotated by me with some comments:

[   99.891524] skb csum of 20 bytes (20 to left of skb->data) using method 1: 0x0 method 2 0xffffffff, orig 0xf470
[   99.901701] skb len=84 headroom=98 headlen=84 tailroom=1546
[   99.901701] mac=(84,-6) net=(78,0) trans=78
               ^^^^^^^^^^^^^^^^^^^^^^
               since the print is done from ocelot_rcv, the network and
               transport headers haven't yet been updated

[   99.901701] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[   99.901701] csum(0xffffffff ip_summed=2 complete_sw=0 valid=0 level=0)
[   99.901701] hash(0x0 sw=0 l4=0) proto=0x00f8 pkttype=3 iif=7
[   99.929916] dev name=eno2 feat=0x0x00020100001149a9
[   99.934831] skb headroom: 00000000: 00 c0 b1 a4 ff ff 00 00 00 e0 b1 a4 ff ff 00 00
[   99.942533] skb headroom: 00000010: 00 6f 5b 02 f8 14 ff ff 40 62 5b 02 f8 14 ff ff
[   99.950232] skb headroom: 00000020: 21 6f 5b 02 f8 14 ff ff 00 00 00 00 00 00 00 00
[   99.957931] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00

                                       here is where the enetc saw the          the "start" variable (old skb->data)
                                       beginning of the frame                   points here
                                       v                                         v
[   99.965631] skb headroom: 00000040: 88 80 00 0a 00 33 9d 40 f0 41 01 80 00 00 08 0f

                                                   OCELOT_TAG_LEN bytes
                                                   later is the real MAC header
                                                   v
[   99.973330] skb headroom: 00000050: 00 10 00 00 00 04 9f 05 f6 28 ba ae e4 b6 2c 3d
[   99.981028] skb headroom: 00000060: 08 00
                                       ^
                                       the skb_postpull_rcsum is called from "start"
                                       until the first byte prior to this one

[   99.985062] skb linear:   00000000: 45 00 00 54 27 ac 00 00 40 01 09 a8 c0 a8 64 03
[   99.992762] skb linear:   00000010: c0 a8 64 01 00 00 10 e6 01 5c 00 04 49 30 a7 61
[  100.000462] skb linear:   00000020: 00 00 00 00 3d 55 01 00 00 00 00 00 10 11 12 13
[  100.008162] skb linear:   00000030: 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23
[  100.015862] skb linear:   00000040: 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33
[  100.023561] skb linear:   00000050: 34 35 36 37

Do you have some suggestions as to what may be wrong? Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 13:10   ` Vladimir Oltean
@ 2021-12-02 14:51     ` Eric Dumazet
  2021-12-02 16:29       ` Vladimir Oltean
  2021-12-02 15:06     ` David Laight
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2021-12-02 14:51 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, David Laight

Hi Vladimir

On Thu, Dec 2, 2021 at 5:10 AM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> Hello,
>
> On Wed, Nov 24, 2021 at 12:24:46PM -0800, Eric Dumazet wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Remove one pair of add/adc instructions and their dependency
> > against carry flag.
> >
> > We can leverage third argument to csum_partial():
> >
> >   X = csum_block_sub(X, csum_partial(start, len, 0), 0);
> >
> >   -->
> >
> >   X = csum_block_add(X, ~csum_partial(start, len, 0), 0);
> >
> >   -->
> >
> >   X = ~csum_partial(start, len, ~X);
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > ---
> >  include/linux/skbuff.h | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index eba256af64a577b458998845f2dc01a5ec80745a..eae4bd3237a41cc1b60b44c91fbfe21dfdd8f117 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -3485,7 +3485,11 @@ __skb_postpull_rcsum(struct sk_buff *skb, const void *start, unsigned int len,
> >  static inline void skb_postpull_rcsum(struct sk_buff *skb,
> >                                     const void *start, unsigned int len)
> >  {
> > -     __skb_postpull_rcsum(skb, start, len, 0);
> > +     if (skb->ip_summed == CHECKSUM_COMPLETE)
> > +             skb->csum = ~csum_partial(start, len, ~skb->csum);
> > +     else if (skb->ip_summed == CHECKSUM_PARTIAL &&
> > +              skb_checksum_start_offset(skb) < 0)
> > +             skb->ip_summed = CHECKSUM_NONE;
> >  }
> >
> >  static __always_inline void
> > --
> > 2.34.0.rc2.393.gf8c9666880-goog
> >
>
> I am seeing some errors after this patch, and I am not able to
> understand why. Specifically, __skb_gro_checksum_complete() hits this
> condition:

There were two patches, one for GRO, one for skb_postpull_rcsum()

I am a bit confused by your report. Which one is causing problems ?

>
>         wsum = skb_checksum(skb, skb_gro_offset(skb), skb_gro_len(skb), 0);
>
>         /* NAPI_GRO_CB(skb)->csum holds pseudo checksum */
>         sum = csum_fold(csum_add(NAPI_GRO_CB(skb)->csum, wsum));
>         /* See comments in __skb_checksum_complete(). */
>         if (likely(!sum)) {
>                 if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE) &&
>                     !skb->csum_complete_sw)
>                         netdev_rx_csum_fault(skb->dev, skb);
>         }
>
> To test, I am using a DSA switch network interface with an IPv4 address
> and pinging through it.
>
> The idea is as follows: an enetc port is attached to a switch, and that
> switch places a frame header before the Ethernet header.
> The enetc calculates the L2 payload (actually what it perceives as L2
> payload, since it has no insight into the switch header format) checksum
> and puts it in skb->csum:
> https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/freescale/enetc/enetc.c#L726
>
> Then, the switch driver packet type handler is invoked, and this strips
> that header and recalculates the checksum (then it changes skb->dev and
> this is how pinging is done through the DSA interface, but that is
> irrelevant).
> https://elixir.bootlin.com/linux/latest/source/net/dsa/tag_ocelot.c#L56
>
> There seems to be a disparity when the skb->csum is calculated by
> skb_postpull_rcsum as zero. Before, it was calculated as 0xffff.

skb->csum is 32bit, so there are about 2^16 different values for a
given Internet checksum

>
> Below is a dump added by me in skb_postpull_rcsum when the checksums
> calculated through both methods differ. I've kept the old implementation
> inside skb->csum and this is what skb_dump() sees:
>
> [   99.891524] skb csum of 20 bytes (20 to left of skb->data) using method 1: 0x0 method 2 0xffffffff, orig 0xf470
> [   99.901701] skb len=84 headroom=98 headlen=84 tailroom=1546
> [   99.901701] mac=(84,-6) net=(78,0) trans=78
> [   99.901701] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> [   99.901701] csum(0xffffffff ip_summed=2 complete_sw=0 valid=0 level=0)
> [   99.901701] hash(0x0 sw=0 l4=0) proto=0x00f8 pkttype=3 iif=7
> [   99.929916] dev name=eno2 feat=0x0x00020100001149a9
> [   99.934831] skb headroom: 00000000: 00 c0 b1 a4 ff ff 00 00 00 e0 b1 a4 ff ff 00 00
> [   99.942533] skb headroom: 00000010: 00 6f 5b 02 f8 14 ff ff 40 62 5b 02 f8 14 ff ff
> [   99.950232] skb headroom: 00000020: 21 6f 5b 02 f8 14 ff ff 00 00 00 00 00 00 00 00
> [   99.957931] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00
> [   99.965631] skb headroom: 00000040: 88 80 00 0a 00 33 9d 40 f0 41 01 80 00 00 08 0f
> [   99.973330] skb headroom: 00000050: 00 10 00 00 00 04 9f 05 f6 28 ba ae e4 b6 2c 3d
> [   99.981028] skb headroom: 00000060: 08 00
> [   99.985062] skb linear:   00000000: 45 00 00 54 27 ac 00 00 40 01 09 a8 c0 a8 64 03
> [   99.992762] skb linear:   00000010: c0 a8 64 01 00 00 10 e6 01 5c 00 04 49 30 a7 61
> [  100.000462] skb linear:   00000020: 00 00 00 00 3d 55 01 00 00 00 00 00 10 11 12 13
> [  100.008162] skb linear:   00000030: 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23
> [  100.015862] skb linear:   00000040: 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33
> [  100.023561] skb linear:   00000050: 34 35 36 37
>
> And below is the same output as above, but annotated by me with some comments:
>
> [   99.891524] skb csum of 20 bytes (20 to left of skb->data) using method 1: 0x0 method 2 0xffffffff, orig 0xf470
> [   99.901701] skb len=84 headroom=98 headlen=84 tailroom=1546
> [   99.901701] mac=(84,-6) net=(78,0) trans=78
>                ^^^^^^^^^^^^^^^^^^^^^^
>                since the print is done from ocelot_rcv, the network and
>                transport headers haven't yet been updated
>
> [   99.901701] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> [   99.901701] csum(0xffffffff ip_summed=2 complete_sw=0 valid=0 level=0)
> [   99.901701] hash(0x0 sw=0 l4=0) proto=0x00f8 pkttype=3 iif=7
> [   99.929916] dev name=eno2 feat=0x0x00020100001149a9
> [   99.934831] skb headroom: 00000000: 00 c0 b1 a4 ff ff 00 00 00 e0 b1 a4 ff ff 00 00
> [   99.942533] skb headroom: 00000010: 00 6f 5b 02 f8 14 ff ff 40 62 5b 02 f8 14 ff ff
> [   99.950232] skb headroom: 00000020: 21 6f 5b 02 f8 14 ff ff 00 00 00 00 00 00 00 00
> [   99.957931] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 40 00 00 00 00 00 00
>
>                                        here is where the enetc saw the          the "start" variable (old skb->data)
>                                        beginning of the frame                   points here
>                                        v                                         v
> [   99.965631] skb headroom: 00000040: 88 80 00 0a 00 33 9d 40 f0 41 01 80 00 00 08 0f
>
>                                                    OCELOT_TAG_LEN bytes
>                                                    later is the real MAC header
>                                                    v
> [   99.973330] skb headroom: 00000050: 00 10 00 00 00 04 9f 05 f6 28 ba ae e4 b6 2c 3d
> [   99.981028] skb headroom: 00000060: 08 00
>                                        ^
>                                        the skb_postpull_rcsum is called from "start"
>                                        until the first byte prior to this one
>
> [   99.985062] skb linear:   00000000: 45 00 00 54 27 ac 00 00 40 01 09 a8 c0 a8 64 03
> [   99.992762] skb linear:   00000010: c0 a8 64 01 00 00 10 e6 01 5c 00 04 49 30 a7 61
> [  100.000462] skb linear:   00000020: 00 00 00 00 3d 55 01 00 00 00 00 00 10 11 12 13
> [  100.008162] skb linear:   00000030: 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23
> [  100.015862] skb linear:   00000040: 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33
> [  100.023561] skb linear:   00000050: 34 35 36 37
>
> Do you have some suggestions as to what may be wrong? Thanks.

What kind of traffic is triggering the fault ? TCP, UDP, something else ?

Do you have a stack trace to provide, because it is not clear from
where the issue is detected.

Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 13:10   ` Vladimir Oltean
  2021-12-02 14:51     ` Eric Dumazet
@ 2021-12-02 15:06     ` David Laight
  2021-12-02 15:22       ` Eric Dumazet
  1 sibling, 1 reply; 26+ messages in thread
From: David Laight @ 2021-12-02 15:06 UTC (permalink / raw)
  To: 'Vladimir Oltean', Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, netdev, Eric Dumazet

From: Vladimir Oltean
> Sent: 02 December 2021 13:11
...
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -3485,7 +3485,11 @@ __skb_postpull_rcsum(struct sk_buff *skb, const void *start, unsigned int
> len,
> >  static inline void skb_postpull_rcsum(struct sk_buff *skb,
> >  				      const void *start, unsigned int len)
> >  {
> > -	__skb_postpull_rcsum(skb, start, len, 0);
> > +	if (skb->ip_summed == CHECKSUM_COMPLETE)
> > +		skb->csum = ~csum_partial(start, len, ~skb->csum);

You can't do that, the domain is 1..0xffff (or maybe 0xffffffff).
The invert has to convert ~0 to ~0 not zero.
...
> There seems to be a disparity when the skb->csum is calculated by
> skb_postpull_rcsum as zero. Before, it was calculated as 0xffff.

Which is what that will do for some inputs at least.
Maybe:
		skb->csum = 1 + ~csum_partial(start, len, ~skb->csum + 1);
is right.
I think that is the same as:
		skb->csum = -csum_partial(start, len, -skb->csum);
Although letting the compiler do that transform probably makes
the code easier to read.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 15:06     ` David Laight
@ 2021-12-02 15:22       ` Eric Dumazet
  0 siblings, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2021-12-02 15:22 UTC (permalink / raw)
  To: David Laight
  Cc: Vladimir Oltean, Eric Dumazet, David S . Miller, Jakub Kicinski,
	netdev, David Lebrun

On Thu, Dec 2, 2021 at 7:06 AM David Laight <David.Laight@aculab.com> wrote:
>
> From: Vladimir Oltean
> > Sent: 02 December 2021 13:11
> ...
> > > --- a/include/linux/skbuff.h
> > > +++ b/include/linux/skbuff.h
> > > @@ -3485,7 +3485,11 @@ __skb_postpull_rcsum(struct sk_buff *skb, const void *start, unsigned int
> > len,
> > >  static inline void skb_postpull_rcsum(struct sk_buff *skb,
> > >                                   const void *start, unsigned int len)
> > >  {
> > > -   __skb_postpull_rcsum(skb, start, len, 0);
> > > +   if (skb->ip_summed == CHECKSUM_COMPLETE)
> > > +           skb->csum = ~csum_partial(start, len, ~skb->csum);
>
> You can't do that, the domain is 1..0xffff (or maybe 0xffffffff).
> The invert has to convert ~0 to ~0 not zero.
> ...
> > There seems to be a disparity when the skb->csum is calculated by
> > skb_postpull_rcsum as zero. Before, it was calculated as 0xffff.
>
> Which is what that will do for some inputs at least.
> Maybe:
>                 skb->csum = 1 + ~csum_partial(start, len, ~skb->csum + 1);
> is right.
> I think that is the same as:
>                 skb->csum = -csum_partial(start, len, -skb->csum);
> Although letting the compiler do that transform probably makes
> the code easier to read.
>
>

Interesting, update_csum_diff4() and update_csum_diff16() seem to both use.

skb->csum = ~csum_partial((char *)diff, sizeof(diff), ~skb->csum);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 14:51     ` Eric Dumazet
@ 2021-12-02 16:29       ` Vladimir Oltean
  2021-12-02 19:32         ` Eric Dumazet
  0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2021-12-02 16:29 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, David Laight

On Thu, Dec 02, 2021 at 06:51:47AM -0800, Eric Dumazet wrote:
> Hi Vladimir
> > I am seeing some errors after this patch, and I am not able to
> > understand why. Specifically, __skb_gro_checksum_complete() hits this
> > condition:
> 
> There were two patches, one for GRO, one for skb_postpull_rcsum()
> 
> I am a bit confused by your report. Which one is causing problems ?

I'm sorry, indeed it seems that I missed to provide that info.
Anyway, it is the skb_postpull_rcsum() call from the DSA switch driver,
that I pointed to, which seems to be problematic.

[  754.211845] mscc_felix 0000:00:00.5 swp0: hw csum failure
[  754.217670] skb len=64 headroom=118 headlen=64 tailroom=1546
[  754.217670] mac=(84,14) net=(98,20) trans=118
[  754.217670] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[  754.217670] csum(0x0 ip_summed=2 complete_sw=0 valid=0 level=0)
[  754.217670] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=9
[  754.246905] dev name=swp0 feat=0x0x0002000000195829
[  754.253200] skb headroom: 00000000: ef be ad de ef be ad de ef be ad de ef be ad de
[  754.261751] skb headroom: 00000010: ef be ad de ef be ad de ef be ad de ef be ad de
[  754.269444] skb headroom: 00000020: ef be ad de ef be ad de ef be ad de ef be ad de
[  754.277135] skb headroom: 00000030: ef be ad de ef be ad de ef be ad de ef be ad de
[  754.284826] skb headroom: 00000040: 88 80 00 0a 00 3e 6b e3 36 a1 01 80 00 00 00 0f
[  754.292516] skb headroom: 00000050: 00 10 00 00 d2 ee 27 92 2d 6c 6a b6 a6 22 19 47
[  754.300207] skb headroom: 00000060: 08 00 45 00 00 54 2d 7c 00 00 40 01 03 d9 c0 a8
[  754.307897] skb headroom: 00000070: 64 02 c0 a8 64 01
[  754.312971] skb linear:   00000000: 00 00 60 4d 03 af 00 01 77 eb a8 61 00 00 00 00
[  754.320662] skb linear:   00000010: b6 e2 06 00 00 00 00 00 10 11 12 13 14 15 16 17
[  754.328352] skb linear:   00000020: 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27
[  754.336042] skb linear:   00000030: 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37
[  754.343732] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 ef be ad de ef be
[  754.351423] skb tailroom: 00000010: ad de ef be ad de ef be ad de ef be ad de ef be
(irrelevant tailroom trimmed)
[  755.088130] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc3-next-20211202-07010-ga9b9500ffaac-dirty #1531
[  755.098169] Hardware name: LS1028A RDB Board (DT)
[  755.102885] Call trace:
[  755.105333]  dump_backtrace+0x0/0x1ac
[  755.109016]  show_stack+0x18/0x70
[  755.112341]  dump_stack_lvl+0x68/0x84
[  755.116022]  dump_stack+0x18/0x34
[  755.119345]  netdev_rx_csum_fault+0x60/0x64
[  755.123549]  __skb_checksum_complete+0x104/0x10c
[  755.128180]  icmp_rcv+0x9c/0x3f0
[  755.131421]  ip_protocol_deliver_rcu+0x40/0x220
[  755.135965]  ip_local_deliver_finish+0x68/0x84
[  755.140421]  ip_local_deliver+0x7c/0x120
[  755.144353]  ip_sublist_rcv_finish+0x48/0x70
[  755.148643]  ip_sublist_rcv+0x168/0x1f0
[  755.152489]  ip_list_rcv+0xf8/0x1a0
[  755.155985]  __netif_receive_skb_list_core+0x184/0x214
[  755.161142]  netif_receive_skb_list_internal+0x180/0x29c
[  755.166471]  napi_complete_done+0x68/0x1bc
[  755.170581]  gro_cell_poll+0x80/0xa0
[  755.174176]  __napi_poll+0x38/0x184
[  755.177674]  net_rx_action+0xe8/0x280
[  755.181347]  __do_softirq+0x124/0x2a0
[  755.185019]  __irq_exit_rcu+0xe4/0x100
[  755.188782]  irq_exit_rcu+0x10/0x1c
[  755.192278]  el1_interrupt+0x38/0x84
[  755.195864]  el1h_64_irq_handler+0x18/0x24
[  755.199972]  el1h_64_irq+0x78/0x7c
[  755.203380]  cpuidle_enter_state+0x12c/0x2f0
[  755.207671]  cpuidle_enter+0x38/0x50
[  755.211256]  do_idle+0x214/0x29c
[  755.214495]  cpu_startup_entry+0x24/0x80
[  755.218428]  rest_init+0xe4/0xf4
[  755.221664]  arch_call_rest_init+0x10/0x1c
[  755.225778]  start_kernel+0x628/0x668
[  755.229450]  __primary_switched+0xc0/0xc8

> > There seems to be a disparity when the skb->csum is calculated by
> > skb_postpull_rcsum as zero. Before, it was calculated as 0xffff.
> 
> skb->csum is 32bit, so there are about 2^16 different values for a
> given Internet checksum

I meant 0xffffffff, sorry. It is visible in the skb_dump output that it
was 0xffffffff before and now it is 0.

> > Do you have some suggestions as to what may be wrong? Thanks.
> 
> What kind of traffic is triggering the fault ? TCP, UDP, something else ?

The simplest to reproduce would be for ICMP. I'm pretty sure I had a
stack trace with TCP as well, but I don't seem to be able to reproduce
that right now.

> Do you have a stack trace to provide, because it is not clear from
> where the issue is detected.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 16:29       ` Vladimir Oltean
@ 2021-12-02 19:32         ` Eric Dumazet
  2021-12-02 20:37           ` Eric Dumazet
  2021-12-02 20:40           ` Vladimir Oltean
  0 siblings, 2 replies; 26+ messages in thread
From: Eric Dumazet @ 2021-12-02 19:32 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, David Laight

On Thu, Dec 2, 2021 at 8:29 AM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Thu, Dec 02, 2021 at 06:51:47AM -0800, Eric Dumazet wrote:
> > Hi Vladimir
> > > I am seeing some errors after this patch, and I am not able to
> > > understand why. Specifically, __skb_gro_checksum_complete() hits this
> > > condition:
> >
> > There were two patches, one for GRO, one for skb_postpull_rcsum()
> >
> > I am a bit confused by your report. Which one is causing problems ?
>
> I'm sorry, indeed it seems that I missed to provide that info.
> Anyway, it is the skb_postpull_rcsum() call from the DSA switch driver,
> that I pointed to, which seems to be problematic.
>
> [  754.211845] mscc_felix 0000:00:00.5 swp0: hw csum failure
> [  754.217670] skb len=64 headroom=118 headlen=64 tailroom=1546
> [  754.217670] mac=(84,14) net=(98,20) trans=118
> [  754.217670] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> [  754.217670] csum(0x0 ip_summed=2 complete_sw=0 valid=0 level=0)
> [  754.217670] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=9
> [  754.246905] dev name=swp0 feat=0x0x0002000000195829
> [  754.253200] skb headroom: 00000000: ef be ad de ef be ad de ef be ad de ef be ad de
> [  754.261751] skb headroom: 00000010: ef be ad de ef be ad de ef be ad de ef be ad de
> [  754.269444] skb headroom: 00000020: ef be ad de ef be ad de ef be ad de ef be ad de
> [  754.277135] skb headroom: 00000030: ef be ad de ef be ad de ef be ad de ef be ad de
> [  754.284826] skb headroom: 00000040: 88 80 00 0a 00 3e 6b e3 36 a1 01 80 00 00 00 0f
> [  754.292516] skb headroom: 00000050: 00 10 00 00 d2 ee 27 92 2d 6c 6a b6 a6 22 19 47
> [  754.300207] skb headroom: 00000060: 08 00 45 00 00 54 2d 7c 00 00 40 01 03 d9 c0 a8
> [  754.307897] skb headroom: 00000070: 64 02 c0 a8 64 01
> [  754.312971] skb linear:   00000000: 00 00 60 4d 03 af 00 01 77 eb a8 61 00 00 00 00
> [  754.320662] skb linear:   00000010: b6 e2 06 00 00 00 00 00 10 11 12 13 14 15 16 17
> [  754.328352] skb linear:   00000020: 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27
> [  754.336042] skb linear:   00000030: 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37
> [  754.343732] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 ef be ad de ef be
> [  754.351423] skb tailroom: 00000010: ad de ef be ad de ef be ad de ef be ad de ef be
> (irrelevant tailroom trimmed)
> [  755.088130] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc3-next-20211202-07010-ga9b9500ffaac-dirty #1531
> [  755.098169] Hardware name: LS1028A RDB Board (DT)
> [  755.102885] Call trace:
> [  755.105333]  dump_backtrace+0x0/0x1ac
> [  755.109016]  show_stack+0x18/0x70
> [  755.112341]  dump_stack_lvl+0x68/0x84
> [  755.116022]  dump_stack+0x18/0x34
> [  755.119345]  netdev_rx_csum_fault+0x60/0x64
> [  755.123549]  __skb_checksum_complete+0x104/0x10c
> [  755.128180]  icmp_rcv+0x9c/0x3f0
> [  755.131421]  ip_protocol_deliver_rcu+0x40/0x220
> [  755.135965]  ip_local_deliver_finish+0x68/0x84
> [  755.140421]  ip_local_deliver+0x7c/0x120
> [  755.144353]  ip_sublist_rcv_finish+0x48/0x70
> [  755.148643]  ip_sublist_rcv+0x168/0x1f0
> [  755.152489]  ip_list_rcv+0xf8/0x1a0
> [  755.155985]  __netif_receive_skb_list_core+0x184/0x214
> [  755.161142]  netif_receive_skb_list_internal+0x180/0x29c
> [  755.166471]  napi_complete_done+0x68/0x1bc
> [  755.170581]  gro_cell_poll+0x80/0xa0
> [  755.174176]  __napi_poll+0x38/0x184
> [  755.177674]  net_rx_action+0xe8/0x280
> [  755.181347]  __do_softirq+0x124/0x2a0
> [  755.185019]  __irq_exit_rcu+0xe4/0x100
> [  755.188782]  irq_exit_rcu+0x10/0x1c
> [  755.192278]  el1_interrupt+0x38/0x84
> [  755.195864]  el1h_64_irq_handler+0x18/0x24
> [  755.199972]  el1h_64_irq+0x78/0x7c
> [  755.203380]  cpuidle_enter_state+0x12c/0x2f0
> [  755.207671]  cpuidle_enter+0x38/0x50
> [  755.211256]  do_idle+0x214/0x29c
> [  755.214495]  cpu_startup_entry+0x24/0x80
> [  755.218428]  rest_init+0xe4/0xf4
> [  755.221664]  arch_call_rest_init+0x10/0x1c
> [  755.225778]  start_kernel+0x628/0x668
> [  755.229450]  __primary_switched+0xc0/0xc8
>
> > > There seems to be a disparity when the skb->csum is calculated by
> > > skb_postpull_rcsum as zero. Before, it was calculated as 0xffff.
> >
> > skb->csum is 32bit, so there are about 2^16 different values for a
> > given Internet checksum
>
> I meant 0xffffffff, sorry. It is visible in the skb_dump output that it
> was 0xffffffff before and now it is 0.
>
> > > Do you have some suggestions as to what may be wrong? Thanks.
> >
> > What kind of traffic is triggering the fault ? TCP, UDP, something else ?
>
> The simplest to reproduce would be for ICMP. I'm pretty sure I had a
> stack trace with TCP as well, but I don't seem to be able to reproduce
> that right now.
>
> > Do you have a stack trace to provide, because it is not clear from
> > where the issue is detected.

Thanks Vladimir

I think that maybe the issue is that the initial skb->csum is zero,
and the csum_parttial(removed_block) is also zero.

But the initial skb->csum should not be zero if you have a non " all
zero"  frame.

Can you double check this in drivers/net/ethernet/freescale/enetc/enetc.c ?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 19:32         ` Eric Dumazet
@ 2021-12-02 20:37           ` Eric Dumazet
  2021-12-02 21:07             ` Vladimir Oltean
  2021-12-02 20:40           ` Vladimir Oltean
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2021-12-02 20:37 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, David Laight

On Thu, Dec 2, 2021 at 11:32 AM Eric Dumazet <edumazet@google.com> wrote:
>
> Thanks Vladimir
>
> I think that maybe the issue is that the initial skb->csum is zero,
> and the csum_parttial(removed_block) is also zero.
>
> But the initial skb->csum should not be zero if you have a non " all
> zero"  frame.
>
> Can you double check this in drivers/net/ethernet/freescale/enetc/enetc.c ?


Yes, I am not sure why the csum is inverted in enetc_get_offloads()

Perhaps

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
b/drivers/net/ethernet/freescale/enetc/enetc.c
index 504e12554079e306e477b9619f272d6e96527377..72524f14cae0093763f8a3f57b1e08e31bc4df1a
100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -986,7 +986,7 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
        if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
                u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);

-               skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
+               skb->csum = csum_unfold((__force __sum16)htons(inet_csum));
                skb->ip_summed = CHECKSUM_COMPLETE;
        }

If this does not work, then maybe :

diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
b/drivers/net/ethernet/freescale/enetc/enetc.c
index 504e12554079e306e477b9619f272d6e96527377..d190faa9a8242f9f3f962dd30b9f4409a83ee697
100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -987,7 +987,8 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
                u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);

                skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
-               skb->ip_summed = CHECKSUM_COMPLETE;
+               if (likely(skb->csum))
+                       skb->ip_summed = CHECKSUM_COMPLETE;
        }

        if (le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_VLAN) {

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 19:32         ` Eric Dumazet
  2021-12-02 20:37           ` Eric Dumazet
@ 2021-12-02 20:40           ` Vladimir Oltean
  2021-12-02 20:58             ` Vladimir Oltean
  2021-12-02 20:58             ` David Laight
  1 sibling, 2 replies; 26+ messages in thread
From: Vladimir Oltean @ 2021-12-02 20:40 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, David Laight

On Thu, Dec 02, 2021 at 11:32:09AM -0800, Eric Dumazet wrote:
> On Thu, Dec 2, 2021 at 8:29 AM Vladimir Oltean <olteanv@gmail.com> wrote:
> >
> > On Thu, Dec 02, 2021 at 06:51:47AM -0800, Eric Dumazet wrote:
> > > Hi Vladimir
> > > > I am seeing some errors after this patch, and I am not able to
> > > > understand why. Specifically, __skb_gro_checksum_complete() hits this
> > > > condition:
> > >
> > > There were two patches, one for GRO, one for skb_postpull_rcsum()
> > >
> > > I am a bit confused by your report. Which one is causing problems ?
> >
> > I'm sorry, indeed it seems that I missed to provide that info.
> > Anyway, it is the skb_postpull_rcsum() call from the DSA switch driver,
> > that I pointed to, which seems to be problematic.
> >
> > [  754.211845] mscc_felix 0000:00:00.5 swp0: hw csum failure
> > [  754.217670] skb len=64 headroom=118 headlen=64 tailroom=1546
> > [  754.217670] mac=(84,14) net=(98,20) trans=118
> > [  754.217670] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> > [  754.217670] csum(0x0 ip_summed=2 complete_sw=0 valid=0 level=0)
> > [  754.217670] hash(0x0 sw=0 l4=0) proto=0x0800 pkttype=0 iif=9
> > [  754.246905] dev name=swp0 feat=0x0x0002000000195829
> > [  754.253200] skb headroom: 00000000: ef be ad de ef be ad de ef be ad de ef be ad de
> > [  754.261751] skb headroom: 00000010: ef be ad de ef be ad de ef be ad de ef be ad de
> > [  754.269444] skb headroom: 00000020: ef be ad de ef be ad de ef be ad de ef be ad de
> > [  754.277135] skb headroom: 00000030: ef be ad de ef be ad de ef be ad de ef be ad de
> > [  754.284826] skb headroom: 00000040: 88 80 00 0a 00 3e 6b e3 36 a1 01 80 00 00 00 0f
> > [  754.292516] skb headroom: 00000050: 00 10 00 00 d2 ee 27 92 2d 6c 6a b6 a6 22 19 47
> > [  754.300207] skb headroom: 00000060: 08 00 45 00 00 54 2d 7c 00 00 40 01 03 d9 c0 a8
> > [  754.307897] skb headroom: 00000070: 64 02 c0 a8 64 01
> > [  754.312971] skb linear:   00000000: 00 00 60 4d 03 af 00 01 77 eb a8 61 00 00 00 00
> > [  754.320662] skb linear:   00000010: b6 e2 06 00 00 00 00 00 10 11 12 13 14 15 16 17
> > [  754.328352] skb linear:   00000020: 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25 26 27
> > [  754.336042] skb linear:   00000030: 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 34 35 36 37
> > [  754.343732] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 ef be ad de ef be
> > [  754.351423] skb tailroom: 00000010: ad de ef be ad de ef be ad de ef be ad de ef be
> > (irrelevant tailroom trimmed)
> > [  755.088130] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc3-next-20211202-07010-ga9b9500ffaac-dirty #1531
> > [  755.098169] Hardware name: LS1028A RDB Board (DT)
> > [  755.102885] Call trace:
> > [  755.105333]  dump_backtrace+0x0/0x1ac
> > [  755.109016]  show_stack+0x18/0x70
> > [  755.112341]  dump_stack_lvl+0x68/0x84
> > [  755.116022]  dump_stack+0x18/0x34
> > [  755.119345]  netdev_rx_csum_fault+0x60/0x64
> > [  755.123549]  __skb_checksum_complete+0x104/0x10c
> > [  755.128180]  icmp_rcv+0x9c/0x3f0
> > [  755.131421]  ip_protocol_deliver_rcu+0x40/0x220
> > [  755.135965]  ip_local_deliver_finish+0x68/0x84
> > [  755.140421]  ip_local_deliver+0x7c/0x120
> > [  755.144353]  ip_sublist_rcv_finish+0x48/0x70
> > [  755.148643]  ip_sublist_rcv+0x168/0x1f0
> > [  755.152489]  ip_list_rcv+0xf8/0x1a0
> > [  755.155985]  __netif_receive_skb_list_core+0x184/0x214
> > [  755.161142]  netif_receive_skb_list_internal+0x180/0x29c
> > [  755.166471]  napi_complete_done+0x68/0x1bc
> > [  755.170581]  gro_cell_poll+0x80/0xa0
> > [  755.174176]  __napi_poll+0x38/0x184
> > [  755.177674]  net_rx_action+0xe8/0x280
> > [  755.181347]  __do_softirq+0x124/0x2a0
> > [  755.185019]  __irq_exit_rcu+0xe4/0x100
> > [  755.188782]  irq_exit_rcu+0x10/0x1c
> > [  755.192278]  el1_interrupt+0x38/0x84
> > [  755.195864]  el1h_64_irq_handler+0x18/0x24
> > [  755.199972]  el1h_64_irq+0x78/0x7c
> > [  755.203380]  cpuidle_enter_state+0x12c/0x2f0
> > [  755.207671]  cpuidle_enter+0x38/0x50
> > [  755.211256]  do_idle+0x214/0x29c
> > [  755.214495]  cpu_startup_entry+0x24/0x80
> > [  755.218428]  rest_init+0xe4/0xf4
> > [  755.221664]  arch_call_rest_init+0x10/0x1c
> > [  755.225778]  start_kernel+0x628/0x668
> > [  755.229450]  __primary_switched+0xc0/0xc8
> >
> > > > There seems to be a disparity when the skb->csum is calculated by
> > > > skb_postpull_rcsum as zero. Before, it was calculated as 0xffff.
> > >
> > > skb->csum is 32bit, so there are about 2^16 different values for a
> > > given Internet checksum
> >
> > I meant 0xffffffff, sorry. It is visible in the skb_dump output that it
> > was 0xffffffff before and now it is 0.
> >
> > > > Do you have some suggestions as to what may be wrong? Thanks.
> > >
> > > What kind of traffic is triggering the fault ? TCP, UDP, something else ?
> >
> > The simplest to reproduce would be for ICMP. I'm pretty sure I had a
> > stack trace with TCP as well, but I don't seem to be able to reproduce
> > that right now.
> >
> > > Do you have a stack trace to provide, because it is not clear from
> > > where the issue is detected.
>
> Thanks Vladimir
>
> I think that maybe the issue is that the initial skb->csum is zero,
> and the csum_parttial(removed_block) is also zero.
>
> But the initial skb->csum should not be zero if you have a non " all
> zero"  frame.
>
> Can you double check this in drivers/net/ethernet/freescale/enetc/enetc.c ?

To me it looks like the strange part is that the checksum of the removed
block (printed by me as "csum_partial(start, len, 0)" inside
skb_postpull_rcsum()) is the same as the skb->csum itself.

[   66.287583] fsl_enetc 0000:00:00.2 eno2: enetc_get_offloads 991: skb 0xffff4050c3671f00 csum 0x3c1d
[   66.296716] skb csum of 20 bytes (20 to the left of skb->data) using old method: 0x0, new method: 0xffffffff, orig csum 0x3c1d, csum of removed block 0x3c1d
[   66.310786] skb len=84 headroom=98 headlen=84 tailroom=1546
[   66.310786] mac=(84,-6) net=(78,0) trans=78
[   66.310786] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[   66.310786] csum(0xffffffff ip_summed=2 complete_sw=0 valid=0 level=0)
[   66.310786] hash(0x0 sw=0 l4=0) proto=0x00f8 pkttype=3 iif=7
[   66.338997] dev name=eno2 feat=0x0x00020100001149a9
[   66.343904] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   66.351600] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   66.359295] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   66.366990] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   66.374685] skb headroom: 00000040: 88 80 00 0a 00 38 c1 17 3d 01 01 80 00 00 08 0f
[   66.382379] skb headroom: 00000050: 00 10 00 00 52 f3 98 af f9 8c d2 ee 27 92 2d 6c
[   66.390073] skb headroom: 00000060: 08 00
[   66.394105] skb linear:   00000000: 45 00 00 54 c8 59 40 00 40 01 28 fb c0 a8 64 01
[   66.401799] skb linear:   00000010: c0 a8 64 02 08 00 ee 94 06 98 00 04 03 2e a9 61
[   66.409493] skb linear:   00000020: 00 00 00 00 8b 6c 0c 00 00 00 00 00 10 11 12 13
[   66.417187] skb linear:   00000030: 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23
[   66.424880] skb linear:   00000040: 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33
[   66.432574] skb linear:   00000050: 34 35 36 37
[   66.437128] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   67.181735] fsl_enetc 0000:00:00.2 eno2: ocelot_rcv 131: skb 0xffff4050c3671f00 csum before skb_postpull_rcsum 0x3c1d, after 0xffffffff

This isn't the case for some other traffic streams, here is some iperf3 TCP:

[   52.336847] fsl_enetc 0000:00:00.2 eno2: enetc_get_offloads 991: skb 0xffff4050c44ab500 csum 0x18ce
[   52.345930] fsl_enetc 0000:00:00.2 eno2: enetc_get_offloads 991: skb 0xffff4050c3671f00 csum 0x18ce
[   52.355014] fsl_enetc 0000:00:00.2 eno2: enetc_get_offloads 991: skb 0xffff4050c3671e00 csum 0x18ce
[   52.397629] fsl_enetc 0000:00:00.2 eno2: ocelot_rcv 131: skb 0xffff4050c44ab500 csum before skb_postpull_rcsum 0x18ce, after 0xffffdcb0
[   52.409853] fsl_enetc 0000:00:00.2 eno2: ocelot_rcv 131: skb 0xffff4050c3671f00 csum before skb_postpull_rcsum 0x18ce, after 0xffffdcb0
[   52.422076] fsl_enetc 0000:00:00.2 eno2: ocelot_rcv 131: skb 0xffff4050c3671e00 csum before skb_postpull_rcsum 0x18ce, after 0xffffdcb0

When skb->csum isn't equal to the csum of the removed block, the two
implementations of skb_postpull_rcsum() agree.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 20:40           ` Vladimir Oltean
@ 2021-12-02 20:58             ` Vladimir Oltean
  2021-12-02 20:58             ` David Laight
  1 sibling, 0 replies; 26+ messages in thread
From: Vladimir Oltean @ 2021-12-02 20:58 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, David Laight

On Thu, Dec 02, 2021 at 10:40:36PM +0200, Vladimir Oltean wrote:
> To me it looks like the strange part is that the checksum of the removed
> block (printed by me as "csum_partial(start, len, 0)" inside
> skb_postpull_rcsum()) is the same as the skb->csum itself.
> 
> [   66.287583] fsl_enetc 0000:00:00.2 eno2: enetc_get_offloads 991: skb 0xffff4050c3671f00 csum 0x3c1d
> [   66.296716] skb csum of 20 bytes (20 to the left of skb->data) using old method: 0x0, new method: 0xffffffff, orig csum 0x3c1d, csum of removed block 0x3c1d

sorry, this line is confusing, what is printed as the "old method" is in
fact the "new method" and viceversa. I tried to rename them from
something clearer than "method 1" and "method 2" and failed.

> [   66.310786] skb len=84 headroom=98 headlen=84 tailroom=1546
> [   66.310786] mac=(84,-6) net=(78,0) trans=78
> [   66.310786] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> [   66.310786] csum(0xffffffff ip_summed=2 complete_sw=0 valid=0 level=0)
> [   66.310786] hash(0x0 sw=0 l4=0) proto=0x00f8 pkttype=3 iif=7
> [   66.338997] dev name=eno2 feat=0x0x00020100001149a9
> [   66.343904] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   66.351600] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   66.359295] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   66.366990] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   66.374685] skb headroom: 00000040: 88 80 00 0a 00 38 c1 17 3d 01 01 80 00 00 08 0f
> [   66.382379] skb headroom: 00000050: 00 10 00 00 52 f3 98 af f9 8c d2 ee 27 92 2d 6c
> [   66.390073] skb headroom: 00000060: 08 00
> [   66.394105] skb linear:   00000000: 45 00 00 54 c8 59 40 00 40 01 28 fb c0 a8 64 01
> [   66.401799] skb linear:   00000010: c0 a8 64 02 08 00 ee 94 06 98 00 04 03 2e a9 61
> [   66.409493] skb linear:   00000020: 00 00 00 00 8b 6c 0c 00 00 00 00 00 10 11 12 13
> [   66.417187] skb linear:   00000030: 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23
> [   66.424880] skb linear:   00000040: 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33
> [   66.432574] skb linear:   00000050: 34 35 36 37
> [   66.437128] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> [   67.181735] fsl_enetc 0000:00:00.2 eno2: ocelot_rcv 131: skb 0xffff4050c3671f00 csum before skb_postpull_rcsum 0x3c1d, after 0xffffffff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 20:40           ` Vladimir Oltean
  2021-12-02 20:58             ` Vladimir Oltean
@ 2021-12-02 20:58             ` David Laight
  2021-12-02 21:40               ` Vladimir Oltean
  1 sibling, 1 reply; 26+ messages in thread
From: David Laight @ 2021-12-02 20:58 UTC (permalink / raw)
  To: 'Vladimir Oltean', Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

> To me it looks like the strange part is that the checksum of the removed
> block (printed by me as "csum_partial(start, len, 0)" inside
> skb_postpull_rcsum()) is the same as the skb->csum itself.

If you are removing all the bytes that made the original checksum
that will happen.
And that might be true for the packets you are building.

Try replacing both ~ with -.
So replace:
		skb->csum = ~csum_partial(start, len, ~skb->csum);
with:
		skb->csum = -csum_partial(start, len, -skb->csum);

That should geneate ~0u instead 0 (if I've got my maths right).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 20:37           ` Eric Dumazet
@ 2021-12-02 21:07             ` Vladimir Oltean
  0 siblings, 0 replies; 26+ messages in thread
From: Vladimir Oltean @ 2021-12-02 21:07 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev, David Laight

On Thu, Dec 02, 2021 at 12:37:35PM -0800, Eric Dumazet wrote:
> On Thu, Dec 2, 2021 at 11:32 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > Thanks Vladimir
> >
> > I think that maybe the issue is that the initial skb->csum is zero,
> > and the csum_parttial(removed_block) is also zero.
> >
> > But the initial skb->csum should not be zero if you have a non " all
> > zero"  frame.
> >
> > Can you double check this in drivers/net/ethernet/freescale/enetc/enetc.c ?
> 
> Yes, I am not sure why the csum is inverted in enetc_get_offloads()

I guess it's inverted because the hardware doesn't provide its one's
complement.

> Perhaps
> 
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
> b/drivers/net/ethernet/freescale/enetc/enetc.c
> index 504e12554079e306e477b9619f272d6e96527377..72524f14cae0093763f8a3f57b1e08e31bc4df1a
> 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> @@ -986,7 +986,7 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
>         if (rx_ring->ndev->features & NETIF_F_RXCSUM) {
>                 u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> 
> -               skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> +               skb->csum = csum_unfold((__force __sum16)htons(inet_csum));
>                 skb->ip_summed = CHECKSUM_COMPLETE;
>         }
> 
> If this does not work, then maybe :
> 
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
> b/drivers/net/ethernet/freescale/enetc/enetc.c
> index 504e12554079e306e477b9619f272d6e96527377..d190faa9a8242f9f3f962dd30b9f4409a83ee697
> 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> @@ -987,7 +987,8 @@ static void enetc_get_offloads(struct enetc_bdr *rx_ring,
>                 u16 inet_csum = le16_to_cpu(rxbd->r.inet_csum);
> 
>                 skb->csum = csum_unfold((__force __sum16)~htons(inet_csum));
> -               skb->ip_summed = CHECKSUM_COMPLETE;
> +               if (likely(skb->csum))
> +                       skb->ip_summed = CHECKSUM_COMPLETE;
>         }
> 
>         if (le16_to_cpu(rxbd->r.flags) & ENETC_RXBD_FLAG_VLAN) {

I guess you aren't interested any longer in the result of these changes,
since the csum isn't zero in enetc?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 20:58             ` David Laight
@ 2021-12-02 21:40               ` Vladimir Oltean
  2021-12-03 14:51                 ` David Laight
  2021-12-03 14:57                 ` David Laight
  0 siblings, 2 replies; 26+ messages in thread
From: Vladimir Oltean @ 2021-12-02 21:40 UTC (permalink / raw)
  To: David Laight
  Cc: Eric Dumazet, Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

On Thu, Dec 02, 2021 at 08:58:46PM +0000, David Laight wrote:
> > To me it looks like the strange part is that the checksum of the removed
> > block (printed by me as "csum_partial(start, len, 0)" inside
> > skb_postpull_rcsum()) is the same as the skb->csum itself.
> 
> If you are removing all the bytes that made the original checksum
> that will happen.
> And that might be true for the packets you are building.

Yes, I am not removing all the bytes that made up the original L2
payload csum. Let me pull up the skb_dump from my original message:

                        here is where the enetc saw the          the "start" variable (old skb->data)
                        beginning of the frame                   points here
                        v                                         v
skb headroom: 00000040: 88 80 00 0a 00 33 9d 40 f0 41 01 80 00 00 08 0f

                              OCELOT_TAG_LEN bytes into the frame,
                              the real MAC header can be found
                                    v
skb headroom: 00000050: 00 10 00 00 00 04 9f 05 f6 28 ba ae e4 b6 2c 3d
skb headroom: 00000060: 08 00
skb linear:   00000000: 45 00 00 54 27 ac 00 00 40 01 09 a8 c0 a8 64 03
                        ^
                        the skb_postpull_rcsum is called from "start"
                        pointer until the first byte prior to this one

skb linear:   00000010: c0 a8 64 01 00 00 10 e6 01 5c 00 04 49 30 a7 61
skb linear:   00000020: 00 00 00 00 3d 55 01 00 00 00 00 00 10 11 12 13
skb linear:   00000030: 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23
skb linear:   00000040: 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33
skb linear:   00000050: 34 35 36 37

So skb_postpull_rcsum() is called from "skb headroom" offset 0x4e to
offset 0x61 inclusive (0x61 - 0x4e + 1 = 20 == OCELOT_TAG_LEN).

However as I understand it, the CHECKSUM_COMPLETE of this packet is
calculated by enetc from "skb headroom" offset 0x4e and all the way
until "skb linear" offset 0x53. So there is still a good chunk of packet
to go. That's why it is still a mystery to me why the checksums would be
equal. They still are, with your change suggested below, of course, but
at least there is no splat now.

> 
> Try replacing both ~ with -.
> So replace:
> 		skb->csum = ~csum_partial(start, len, ~skb->csum);
> with:
> 		skb->csum = -csum_partial(start, len, -skb->csum);
> 
> That should geneate ~0u instead 0 (if I've got my maths right).

Indeed, replacing both one's complement operations with two's complement
seems to produce correct results (consistent with old code) in all cases
that I am testing with (ICMP, TCP, UDP). Thanks!

> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 21:40               ` Vladimir Oltean
@ 2021-12-03 14:51                 ` David Laight
  2021-12-03 14:57                 ` David Laight
  1 sibling, 0 replies; 26+ messages in thread
From: David Laight @ 2021-12-03 14:51 UTC (permalink / raw)
  To: 'Vladimir Oltean'
  Cc: Eric Dumazet, Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

From: Vladimir Oltean
> Sent: 02 December 2021 21:40
> 
> On Thu, Dec 02, 2021 at 08:58:46PM +0000, David Laight wrote:
> > > To me it looks like the strange part is that the checksum of the removed
> > > block (printed by me as "csum_partial(start, len, 0)" inside
> > > skb_postpull_rcsum()) is the same as the skb->csum itself.
> >
> > If you are removing all the bytes that made the original checksum
> > that will happen.
> > And that might be true for the packets you are building.
> 
> Yes, I am not removing all the bytes that made up the original L2
> payload csum. Let me pull up the skb_dump from my original message:
> 
>                         here is where the enetc saw the          the "start" variable (old skb->data)
>                         beginning of the frame                   points here
>                         v                                         v
> skb headroom: 00000040: 88 80 00 0a 00 33 9d 40 f0 41 01 80 00 00 08 0f
> 
>                               OCELOT_TAG_LEN bytes into the frame,
>                               the real MAC header can be found
>                                     v
> skb headroom: 00000050: 00 10 00 00 00 04 9f 05 f6 28 ba ae e4 b6 2c 3d
> skb headroom: 00000060: 08 00
> skb linear:   00000000: 45 00 00 54 27 ac 00 00 40 01 09 a8 c0 a8 64 03
>                         ^
>                         the skb_postpull_rcsum is called from "start"
>                         pointer until the first byte prior to this one
> 
> skb linear:   00000010: c0 a8 64 01 00 00 10 e6 01 5c 00 04 49 30 a7 61
> skb linear:   00000020: 00 00 00 00 3d 55 01 00 00 00 00 00 10 11 12 13
> skb linear:   00000030: 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23
> skb linear:   00000040: 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33
> skb linear:   00000050: 34 35 36 37
> 
> So skb_postpull_rcsum() is called from "skb headroom" offset 0x4e to
> offset 0x61 inclusive (0x61 - 0x4e + 1 = 20 == OCELOT_TAG_LEN).
> 
> However as I understand it, the CHECKSUM_COMPLETE of this packet is
> calculated by enetc from "skb headroom" offset 0x4e and all the way
> until "skb linear" offset 0x53. So there is still a good chunk of packet
> to go. That's why it is still a mystery to me why the checksums would be
> equal
...

Possibly because the rest of the packet actually has a valid checksum
(ie 0xffff) that (somewhere) got reduced to 16 bits.
If the checksum of the header were then added, and later removed
you'd end up inverting ~0u.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-02 21:40               ` Vladimir Oltean
  2021-12-03 14:51                 ` David Laight
@ 2021-12-03 14:57                 ` David Laight
  2021-12-03 16:14                   ` Vladimir Oltean
  1 sibling, 1 reply; 26+ messages in thread
From: David Laight @ 2021-12-03 14:57 UTC (permalink / raw)
  To: 'Vladimir Oltean'
  Cc: Eric Dumazet, Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

From: Vladimir Oltean
> Sent: 02 December 2021 21:40
...
> >
> > Try replacing both ~ with -.
> > So replace:
> > 		skb->csum = ~csum_partial(start, len, ~skb->csum);
> > with:
> > 		skb->csum = -csum_partial(start, len, -skb->csum);
> >
> > That should geneate ~0u instead 0 (if I've got my maths right).
> 
> Indeed, replacing both one's complement operations with two's complement
> seems to produce correct results (consistent with old code) in all cases
> that I am testing with (ICMP, TCP, UDP). Thanks!

You need to generate (or persuade Eric to generate) a patch.
I don't have the right source tree.

Any code that does ~csum_partial() is 'dubious' unless followed
by a check for 0.
The two's compliment negate save the conditional - provided the
offset of 1 can be added in earlier.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-03 14:57                 ` David Laight
@ 2021-12-03 16:14                   ` Vladimir Oltean
  2021-12-03 16:30                     ` Eric Dumazet
  0 siblings, 1 reply; 26+ messages in thread
From: Vladimir Oltean @ 2021-12-03 16:14 UTC (permalink / raw)
  To: David Laight
  Cc: Eric Dumazet, Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

On Fri, Dec 03, 2021 at 02:57:04PM +0000, David Laight wrote:
> From: Vladimir Oltean
> > Sent: 02 December 2021 21:40
> ...
> > >
> > > Try replacing both ~ with -.
> > > So replace:
> > > 		skb->csum = ~csum_partial(start, len, ~skb->csum);
> > > with:
> > > 		skb->csum = -csum_partial(start, len, -skb->csum);
> > >
> > > That should geneate ~0u instead 0 (if I've got my maths right).
> > 
> > Indeed, replacing both one's complement operations with two's complement
> > seems to produce correct results (consistent with old code) in all cases
> > that I am testing with (ICMP, TCP, UDP). Thanks!
> 
> You need to generate (or persuade Eric to generate) a patch.
> I don't have the right source tree.
> 
> Any code that does ~csum_partial() is 'dubious' unless followed
> by a check for 0.
> The two's compliment negate save the conditional - provided the
> offset of 1 can be added in earlier.
> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
> 

Eric, could you please send a patch with this change?
If you want and if it helps, I can also help you reproduce this locally
using the dsa_loop mockup driver.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-03 16:14                   ` Vladimir Oltean
@ 2021-12-03 16:30                     ` Eric Dumazet
  2021-12-03 16:47                       ` David Laight
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2021-12-03 16:30 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: David Laight, Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

On Fri, Dec 3, 2021 at 8:14 AM Vladimir Oltean <olteanv@gmail.com> wrote:
>
> On Fri, Dec 03, 2021 at 02:57:04PM +0000, David Laight wrote:
> > From: Vladimir Oltean
> > > Sent: 02 December 2021 21:40
> > ...
> > > >
> > > > Try replacing both ~ with -.
> > > > So replace:
> > > >           skb->csum = ~csum_partial(start, len, ~skb->csum);
> > > > with:
> > > >           skb->csum = -csum_partial(start, len, -skb->csum);
> > > >
> > > > That should geneate ~0u instead 0 (if I've got my maths right).
> > >
> > > Indeed, replacing both one's complement operations with two's complement
> > > seems to produce correct results (consistent with old code) in all cases
> > > that I am testing with (ICMP, TCP, UDP). Thanks!
> >
> > You need to generate (or persuade Eric to generate) a patch.
> > I don't have the right source tree.
> >
> > Any code that does ~csum_partial() is 'dubious' unless followed
> > by a check for 0.
> > The two's compliment negate save the conditional - provided the
> > offset of 1 can be added in earlier.
> >
> >       David
> >
> > -
> > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> > Registration No: 1397386 (Wales)
> >
>
> Eric, could you please send a patch with this change?

Sure, I will do this today, after more testing.

> If you want and if it helps, I can also help you reproduce this locally
> using the dsa_loop mockup driver.

No need, thanks !

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-03 16:30                     ` Eric Dumazet
@ 2021-12-03 16:47                       ` David Laight
  2021-12-03 16:58                         ` Eric Dumazet
  0 siblings, 1 reply; 26+ messages in thread
From: David Laight @ 2021-12-03 16:47 UTC (permalink / raw)
  To: 'Eric Dumazet', Vladimir Oltean
  Cc: Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

> > Eric, could you please send a patch with this change?
> 
> Sure, I will do this today, after more testing.

I've just done a quick grep and found two ~csum_partial() in
include/net/seg6.h.
Both are wrong (and completely horrid).

There are also 40 csum_partial(buf, len, 0).
If all the buffer is zero they'll return zero - invalid.
They ought to be changed to csum_partial(buf, len, 0xffff).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-03 16:47                       ` David Laight
@ 2021-12-03 16:58                         ` Eric Dumazet
  2021-12-03 17:41                           ` David Laight
  0 siblings, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2021-12-03 16:58 UTC (permalink / raw)
  To: David Laight, David Lebrun
  Cc: Vladimir Oltean, Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

On Fri, Dec 3, 2021 at 8:47 AM David Laight <David.Laight@aculab.com> wrote:
>
> > > Eric, could you please send a patch with this change?
> >
> > Sure, I will do this today, after more testing.
>
> I've just done a quick grep and found two ~csum_partial() in
> include/net/seg6.h.

This is what I already mentioned in this email thread, and the reason
I have CCed David Lebrun.

https://marc.info/?l=linux-netdev&m=163845851801840&w=2

David, can you comment on this ?


> Both are wrong (and completely horrid).
>
> There are also 40 csum_partial(buf, len, 0).
> If all the buffer is zero they'll return zero - invalid.
> They ought to be changed to csum_partial(buf, len, 0xffff).
>

Please point where all zero buffers can be valid in the first place.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: [PATCH net-next 2/2] net: optimize skb_postpull_rcsum()
  2021-12-03 16:58                         ` Eric Dumazet
@ 2021-12-03 17:41                           ` David Laight
  0 siblings, 0 replies; 26+ messages in thread
From: David Laight @ 2021-12-03 17:41 UTC (permalink / raw)
  To: 'Eric Dumazet', David Lebrun
  Cc: Vladimir Oltean, Eric Dumazet, David S . Miller, Jakub Kicinski, netdev

...
> > There are also 40 csum_partial(buf, len, 0).
> > If all the buffer is zero they'll return zero - invalid.
> > They ought to be changed to csum_partial(buf, len, 0xffff).
> >
> 
> Please point where all zero buffers can be valid in the first place.

They probably can't be.
But I bet a lot of the code was written without realising it mattered.

A quick scan does imply that they are all probably ok.

But I have spotted 18 ~csum_fold() - they may be dubious.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2021-12-03 17:41 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-24 20:24 [PATCH net-next 0/2] net: small csum optimizations Eric Dumazet
2021-11-24 20:24 ` [PATCH net-next 1/2] gro: optimize skb_gro_postpull_rcsum() Eric Dumazet
2021-11-24 20:24 ` [PATCH net-next 2/2] net: optimize skb_postpull_rcsum() Eric Dumazet
2021-11-25  9:41   ` David Laight
2021-11-25 13:32     ` Eric Dumazet
2021-11-25 14:29       ` David Laight
2021-12-02 13:10   ` Vladimir Oltean
2021-12-02 14:51     ` Eric Dumazet
2021-12-02 16:29       ` Vladimir Oltean
2021-12-02 19:32         ` Eric Dumazet
2021-12-02 20:37           ` Eric Dumazet
2021-12-02 21:07             ` Vladimir Oltean
2021-12-02 20:40           ` Vladimir Oltean
2021-12-02 20:58             ` Vladimir Oltean
2021-12-02 20:58             ` David Laight
2021-12-02 21:40               ` Vladimir Oltean
2021-12-03 14:51                 ` David Laight
2021-12-03 14:57                 ` David Laight
2021-12-03 16:14                   ` Vladimir Oltean
2021-12-03 16:30                     ` Eric Dumazet
2021-12-03 16:47                       ` David Laight
2021-12-03 16:58                         ` Eric Dumazet
2021-12-03 17:41                           ` David Laight
2021-12-02 15:06     ` David Laight
2021-12-02 15:22       ` Eric Dumazet
2021-11-26  5:20 ` [PATCH net-next 0/2] net: small csum optimizations patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.