* [PATCH net v2 0/2] Fix a regression in the Gemini ethernet controller.
@ 2023-12-16 19:36 Linus Walleij
  2023-12-16 19:36 ` [PATCH net v2 1/2] net: ethernet: cortina: Drop software checksum and TSO Linus Walleij
  2023-12-16 19:36 ` [PATCH net v2 2/2] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes Linus Walleij
  0 siblings, 2 replies; 9+ messages in thread
From: Linus Walleij @ 2023-12-16 19:36 UTC (permalink / raw)
  To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: netdev, Linus Walleij

These fixes were developed on top of the earlier fixes.

Finding the right solution is hard because the Gemini checksumming
engine is completely undocumented in the datasheets.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
Changes in v2:
- Drop the TSO and length checks altogether; this was never
  working properly.
- Plan to make a proper TSO implementation in the next kernel
  cycle.
- Link to v1: https://lore.kernel.org/r/20231215-new-gemini-ethernet-regression-v1-0-93033544be23@linaro.org

---
Linus Walleij (2):
      net: ethernet: cortina: Drop software checksum and TSO
      net: ethernet: cortina: Bypass checksumming engine of alien ethertypes

 drivers/net/ethernet/cortina/gemini.c | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)
---
base-commit: 33cc938e65a98f1d29d0a18403dbbee050dcad9a
change-id: 20231203-new-gemini-ethernet-regression-3c672de9cfd9

Best regards,
-- 
Linus Walleij <linus.walleij@linaro.org>



* [PATCH net v2 1/2] net: ethernet: cortina: Drop software checksum and TSO
  2023-12-16 19:36 [PATCH net v2 0/2] Fix a regression in the Gemini ethernet controller Linus Walleij
@ 2023-12-16 19:36 ` Linus Walleij
  2023-12-18 23:23   ` Jakub Kicinski
  2023-12-16 19:36 ` [PATCH net v2 2/2] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes Linus Walleij
  1 sibling, 1 reply; 9+ messages in thread
From: Linus Walleij @ 2023-12-16 19:36 UTC (permalink / raw)
  To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: netdev, Linus Walleij

The recent change to allow large frames without hardware checksumming
slotted in software checksumming in the driver if hardware could not
do it.

This will however upset TSO (TCP Segmentation Offload). Typical
error dumps include this:

skb len=2961 headroom=222 headlen=66 tailroom=0
(...)
WARNING: CPU: 0 PID: 956 at net/core/dev.c:3259 skb_warn_bad_offload+0x7c/0x108
gemini-ethernet-port: caps=(0x0000010000154813, 0x00002007ffdd7889)

And the packets do not go through.

After investigating, I narrowed it down to the introduction of
software checksumming in the driver.

This makes a bit of sense: since the segmenting of packets is done by
the hardware, the hardware also needs to keep track of the
checksumming.

That raises the question of why large TCP or UDP packets also have to
bypass the checksumming (as e.g. ICMP does). If the hardware splits them
into smaller packets according to the MTU setting and checksums them,
why is the bypass needed? I don't know. I do know from tests that it is
needed: the OpenWrt web server uhttpd starts sending big skbs (up to
2047 bytes, the max MTU), and above 1514 bytes transmission starts to
fail and hang unless the bypass bit is set: the frames do not get
through.

Drop the size check and the offloading features for now: this
needs to be fixed up properly.
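
To make the change in control flow concrete, here is a rough userspace
model of the decision the transmit path makes before and after this
patch. It is only a sketch based on the hunk below: struct tx_decision
and the tx_decide_*() functions are hypothetical names, and whatever the
driver does inside the CHECKSUM_PARTIAL branch is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

#define ETH_FRAME_LEN 1514 /* max frame length without FCS, as in the kernel headers */

/* Hypothetical summary of what the transmit path decides for one frame. */
struct tx_decision {
	bool sw_csum;  /* checksum computed in software via skb_checksum_help() */
	bool bypass;   /* TSS_BYPASS_BIT set, hardware engine skipped */
	bool hw_csum;  /* CHECKSUM_PARTIAL branch taken (contents not modelled) */
};

/* Before the patch: the decision is keyed on the frame length. */
static struct tx_decision tx_decide_old(size_t len, bool csum_partial)
{
	struct tx_decision d = { false, false, false };

	if (len >= ETH_FRAME_LEN) {
		d.sw_csum = csum_partial;
		d.bypass = true;
	} else if (csum_partial) {
		d.hw_csum = true;
	}
	return d;
}

/* After the patch: the frame length no longer influences the decision. */
static struct tx_decision tx_decide_new(size_t len, bool csum_partial)
{
	struct tx_decision d = { false, false, false };

	(void)len;
	d.hw_csum = csum_partial;
	return d;
}
```

With the skb from the error dump above (len=2961), the old model takes
the software checksum and bypass route, while the new one leaves the
frame to the CHECKSUM_PARTIAL branch.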

Suggested-by: Eric Dumazet <edumazet@google.com>
Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/net/ethernet/cortina/gemini.c | 21 ++-------------------
 1 file changed, 2 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
index 78287cfcbf63..6a7ea051391a 100644
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -79,8 +79,7 @@ MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
 #define GMAC0_IRQ4_8 (GMAC0_MIB_INT_BIT | GMAC0_RX_OVERRUN_INT_BIT)
 
 #define GMAC_OFFLOAD_FEATURES (NETIF_F_SG | NETIF_F_IP_CSUM | \
-		NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM | \
-		NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6)
+	       NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM )
 
 /**
  * struct gmac_queue_page - page buffer per-page info
@@ -1145,7 +1144,6 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
 	dma_addr_t mapping;
 	unsigned short mtu;
 	void *buffer;
-	int ret;
 
 	mtu  = ETH_HLEN;
 	mtu += netdev->mtu;
@@ -1160,22 +1158,7 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
 		word3 |= mtu;
 	}
 
-	if (skb->len >= ETH_FRAME_LEN) {
-		/* Hardware offloaded checksumming isn't working on frames
-		 * bigger than 1514 bytes. A hypothesis about this is that the
-		 * checksum buffer is only 1518 bytes, so when the frames get
-		 * bigger they get truncated, or the last few bytes get
-		 * overwritten by the FCS.
-		 *
-		 * Just use software checksumming and bypass on bigger frames.
-		 */
-		if (skb->ip_summed == CHECKSUM_PARTIAL) {
-			ret = skb_checksum_help(skb);
-			if (ret)
-				return ret;
-		}
-		word1 |= TSS_BYPASS_BIT;
-	} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
+	if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		int tcp = 0;
 
 		/* We do not switch off the checksumming on non TCP/UDP

-- 
2.34.1



* [PATCH net v2 2/2] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes
  2023-12-16 19:36 [PATCH net v2 0/2] Fix a regression in the Gemini ethernet controller Linus Walleij
  2023-12-16 19:36 ` [PATCH net v2 1/2] net: ethernet: cortina: Drop software checksum and TSO Linus Walleij
@ 2023-12-16 19:36 ` Linus Walleij
  2023-12-18 14:50   ` Eric Dumazet
  1 sibling, 1 reply; 9+ messages in thread
From: Linus Walleij @ 2023-12-16 19:36 UTC (permalink / raw)
  To: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: netdev, Linus Walleij

We had workarounds where the ethernet checksumming engine would be bypassed
for larger frames. This fixed devices using DSA, but regressed devices
where the ethernet was connected directly to a PHY.

The devices with a PHY connected directly can't handle large frames
either way, with or without bypass. Looking at the size of the frame
is probably just wrong.

Rework the workaround such that we just bypass the checksumming engine if
the ethertype inside the actual frame is something other than 0x0800
(IPv4) or 0x86dd (IPv6). These are the only frames the checksumming engine
can actually handle. VLAN framing (0x8100) also works fine.

We can't inspect skb->protocol because DSA frames will sometimes have a
custom ethertype even though skb->protocol is e.g. 0x0800.

After this both devices with direct ethernet attached such as D-Link
DNS-313 and devices with a DSA switch with a custom ethertype such as
D-Link DIR-685 work fine.
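
As a rough illustration of the rule described above, the following
userspace sketch reads the ethertype from the raw frame bytes, steps
over one 802.1Q tag, and decides whether the engine should be bypassed.
It is not the driver code: frame_ethertype() and needs_csum_bypass() are
hypothetical names, the length checks are an added precaution, and
treating a too-short frame as "bypass" is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN    6      /* bytes in a MAC address */
#define ETH_P_IP    0x0800
#define ETH_P_IPV6  0x86DD
#define ETH_P_8021Q 0x8100
#define VLAN_HLEN   4      /* TPID + TCI */

/* Read a big-endian 16-bit value without alignment assumptions. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: fetch the ethertype actually present in the
 * frame, skipping one 802.1Q tag. Returns false if the frame is too
 * short to contain the field.
 */
static bool frame_ethertype(const uint8_t *frame, size_t len, uint16_t *type)
{
	size_t off = 2 * ETH_ALEN;

	if (len < off + 2)
		return false;
	*type = read_be16(frame + off);
	if (*type == ETH_P_8021Q) {
		off += VLAN_HLEN;
		if (len < off + 2)
			return false;
		*type = read_be16(frame + off);
	}
	return true;
}

/* Bypass the engine for anything that is not plain or VLAN-tagged IP. */
static bool needs_csum_bypass(const uint8_t *frame, size_t len)
{
	uint16_t type;

	if (!frame_ethertype(frame, len, &type))
		return true; /* assumption: unreadable header counts as non-IP */
	return type != ETH_P_IP && type != ETH_P_IPV6;
}
```

A frame carrying 0x0800 or a VLAN-tagged frame carrying 0x86dd would go
through the engine, while a frame with any other ethertype in bytes
12..13 (for example a switch tag) would get the bypass bit.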

Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
 drivers/net/ethernet/cortina/gemini.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
index 6a7ea051391a..1400f19bf05b 100644
--- a/drivers/net/ethernet/cortina/gemini.c
+++ b/drivers/net/ethernet/cortina/gemini.c
@@ -1143,7 +1143,9 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
 	skb_frag_t *skb_frag;
 	dma_addr_t mapping;
 	unsigned short mtu;
+	u16 ethertype;
 	void *buffer;
+	__be16 *p;
 
 	mtu  = ETH_HLEN;
 	mtu += netdev->mtu;
@@ -1158,7 +1160,24 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
 		word3 |= mtu;
 	}
 
-	if (skb->ip_summed == CHECKSUM_PARTIAL) {
+	/* Dig out the ethertype actually in the buffer and not what the
+	 * protocol claims to be. This is the raw data that the checksumming
+	 * offload engine will have to deal with.
+	 */
+	p = (__be16 *)(skb->data + 2 * ETH_ALEN);
+	ethertype = ntohs(*p);
+	if (ethertype == ETH_P_8021Q) {
+		p += 2; /* +2 sizeof(__be16) */
+		ethertype = ntohs(*p);
+	}
+
+	if (ethertype != ETH_P_IP && ethertype != ETH_P_IPV6) {
+		/* Hardware offloaded checksumming isn't working on non-IP frames.
+		 * This happens for example on some DSA switches using a custom
+		 * ethertype. Just bypass the engine for those.
+		 */
+		word1 |= TSS_BYPASS_BIT;
+	} else if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		int tcp = 0;
 
 		/* We do not switch off the checksumming on non TCP/UDP

-- 
2.34.1



* Re: [PATCH net v2 2/2] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes
  2023-12-16 19:36 ` [PATCH net v2 2/2] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes Linus Walleij
@ 2023-12-18 14:50   ` Eric Dumazet
  2023-12-18 23:41     ` Linus Walleij
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2023-12-18 14:50 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Hans Ulli Kroll, David S. Miller, Jakub Kicinski, Paolo Abeni, netdev

On Sat, Dec 16, 2023 at 8:36 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> We had workarounds where the ethernet checksumming engine would be bypassed
> for larger frames. This fixed devices using DSA, but regressed devices
> where the ethernet was connected directly to a PHY.
>
> The devices with a PHY connected directly can't handle large frames
> either way, with or without bypass. Looking at the size of the frame
> is probably just wrong.
>
> Rework the workaround such that we just bypass the checksumming engine if
> the ethertype inside the actual frame is something other than 0x0800
> (IPv4) or 0x86dd (IPv6). These are the only frames the checksumming engine
> can actually handle. VLAN framing (0x8100) also works fine.
>
> We can't inspect skb->protocol because DSA frames will sometimes have a
> custom ethertype even though skb->protocol is e.g. 0x0800.
>
> After this both devices with direct ethernet attached such as D-Link
> DNS-313 and devices with a DSA switch with a custom ethertype such as
> D-Link DIR-685 work fine.
>
> Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
>  drivers/net/ethernet/cortina/gemini.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c
> index 6a7ea051391a..1400f19bf05b 100644
> --- a/drivers/net/ethernet/cortina/gemini.c
> +++ b/drivers/net/ethernet/cortina/gemini.c
> @@ -1143,7 +1143,9 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
>         skb_frag_t *skb_frag;
>         dma_addr_t mapping;
>         unsigned short mtu;
> +       u16 ethertype;
>         void *buffer;
> +       __be16 *p;
>
>         mtu  = ETH_HLEN;
>         mtu += netdev->mtu;
> @@ -1158,7 +1160,24 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb,
>                 word3 |= mtu;
>         }
>
> -       if (skb->ip_summed == CHECKSUM_PARTIAL) {
> +       /* Dig out the ethertype actually in the buffer and not what the
> +        * protocol claims to be. This is the raw data that the checksumming
> +        * offload engine will have to deal with.
> +        */
> +       p = (__be16 *)(skb->data + 2 * ETH_ALEN);
> +       ethertype = ntohs(*p);
> +       if (ethertype == ETH_P_8021Q) {
> +               p += 2; /* +2 sizeof(__be16) */
> +               ethertype = ntohs(*p);
> +       }

Presumably all you need is to call vlan_get_protocol() ?

> +
> +       if (ethertype != ETH_P_IP && ethertype != ETH_P_IPV6) {
> +               /* Hardware offloaded checksumming isn't working on non-IP frames.
> +                * This happens for example on some DSA switches using a custom
> +                * ethertype. Just bypass the engine for those.
> +                */
> +               word1 |= TSS_BYPASS_BIT;
> +       } else if (skb->ip_summed == CHECKSUM_PARTIAL) {
>                 int tcp = 0;
>
>                 /* We do not switch off the checksumming on non TCP/UDP
>
> --
> 2.34.1
>


* Re: [PATCH net v2 1/2] net: ethernet: cortina: Drop software checksum and TSO
  2023-12-16 19:36 ` [PATCH net v2 1/2] net: ethernet: cortina: Drop software checksum and TSO Linus Walleij
@ 2023-12-18 23:23   ` Jakub Kicinski
  2023-12-19 14:24     ` Linus Walleij
  0 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2023-12-18 23:23 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Paolo Abeni, netdev

On Sat, 16 Dec 2023 20:36:52 +0100 Linus Walleij wrote:
> -		NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM | \
> -		NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6)
> +	       NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM )

nit: checkpatch is really upset about this space before )


* Re: [PATCH net v2 2/2] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes
  2023-12-18 14:50   ` Eric Dumazet
@ 2023-12-18 23:41     ` Linus Walleij
  2023-12-19  9:14       ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Linus Walleij @ 2023-12-18 23:41 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hans Ulli Kroll, David S. Miller, Jakub Kicinski, Paolo Abeni, netdev

On Mon, Dec 18, 2023 at 3:50 PM Eric Dumazet <edumazet@google.com> wrote:
> On Sat, Dec 16, 2023 at 8:36 PM Linus Walleij <linus.walleij@linaro.org> wrote:

> > +       /* Dig out the ethertype actually in the buffer and not what the
> > +        * protocol claims to be. This is the raw data that the checksumming
> > +        * offload engine will have to deal with.
> > +        */
> > +       p = (__be16 *)(skb->data + 2 * ETH_ALEN);
> > +       ethertype = ntohs(*p);
> > +       if (ethertype == ETH_P_8021Q) {
> > +               p += 2; /* +2 sizeof(__be16) */
> > +               ethertype = ntohs(*p);
> > +       }
>
> Presumably all you need is to call vlan_get_protocol() ?

Sadly no. As the comment says: we want the ethertype that is actually in the
skb, not what is in skb->protocol, and the code in vlan_get_protocol() just
trusts skb->protocol to be the ethertype in the frame, especially if vlan
is not used.

This is often what we want: DSA switches will "wash" custom ethertypes
before they go out, but in this case the custom ethertype upsets the
ethernet checksum engine used as conduit interface toward the DSA
switch.
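
A small userspace sketch of the mismatch, for illustration only: the
struct below is a hypothetical, heavily reduced stand-in for an skb, and
0x8899 is used merely as an example of a non-IP ethertype that a switch
tag might put on the wire. The real skb->protocol is a __be16; host
order is used here to keep the sketch short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for struct sk_buff. */
struct mini_skb {
	uint16_t protocol;   /* what the stack believes the payload is (host order) */
	const uint8_t *data; /* frame bytes as they will reach the MAC */
	size_t len;
};

/* Ethertype at bytes 12..13 of the frame, or 0 if the frame is too short. */
static uint16_t wire_ethertype(const struct mini_skb *skb)
{
	if (skb->len < 14)
		return 0;
	return (uint16_t)((skb->data[12] << 8) | skb->data[13]);
}

/* True when the metadata and the bytes on the wire disagree. */
static bool protocol_differs_from_wire(const struct mini_skb *skb)
{
	return skb->protocol != wire_ethertype(skb);
}
```

For a tagged frame whose metadata still says 0x0800, the two values
differ, and it is the one on the wire that the checksum engine sees.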

Yours,
Linus Walleij


* Re: [PATCH net v2 2/2] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes
  2023-12-18 23:41     ` Linus Walleij
@ 2023-12-19  9:14       ` Eric Dumazet
  2023-12-19 14:22         ` Linus Walleij
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2023-12-19  9:14 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Hans Ulli Kroll, David S. Miller, Jakub Kicinski, Paolo Abeni, netdev

On Tue, Dec 19, 2023 at 12:42 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Mon, Dec 18, 2023 at 3:50 PM Eric Dumazet <edumazet@google.com> wrote:
> > On Sat, Dec 16, 2023 at 8:36 PM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> > > +       /* Dig out the ethertype actually in the buffer and not what the
> > > +        * protocol claims to be. This is the raw data that the checksumming
> > > +        * offload engine will have to deal with.
> > > +        */
> > > +       p = (__be16 *)(skb->data + 2 * ETH_ALEN);
> > > +       ethertype = ntohs(*p);
> > > +       if (ethertype == ETH_P_8021Q) {
> > > +               p += 2; /* +2 sizeof(__be16) */
> > > +               ethertype = ntohs(*p);
> > > +       }
> >
> > Presumably all you need is to call vlan_get_protocol() ?
>
> Sadly no. As the comment says: we want the ethertype that is actually in the
> skb, not what is in skb->protocol, and the code in vlan_get_protocol() just
> trusts skb->protocol to be the ethertype in the frame, especially if vlan
> is not used.
>
> This is often what we want: DSA switches will "wash" custom ethertypes
> before they go out, but in this case the custom ethertype upsets the
> ethernet checksum engine used as conduit interface toward the DSA
> switch.

The problem is that your code is missing a skb_header_pointer() or
pskb_may_pull() call...
Second "ethertype = ntohs(*p);" might access uninitialized data.

If this is a common operation, perhaps use a common helper from all drivers,
this would help code review a bit...
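
To sketch the concern in userspace terms: when a buffer consists of a
linear head plus paged data, a field at a given offset is only safe to
dereference directly if it lies inside the head. The model below
imitates the spirit of skb_header_pointer() on a hypothetical two-part
buffer; struct split_buf and buf_header_pointer() are made-up names, and
this is not how the kernel implements it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-part buffer: a linear head plus one paged fragment. */
struct split_buf {
	const uint8_t *head;
	size_t headlen;
	const uint8_t *frag;
	size_t fraglen;
};

/*
 * Return a pointer to `len` bytes starting at `offset`. Point into the
 * head when the whole range is linear, otherwise copy the bytes into
 * `tmp`. Return NULL when the range lies outside the buffer.
 */
static const void *buf_header_pointer(const struct split_buf *b, size_t offset,
				      size_t len, void *tmp)
{
	size_t total = b->headlen + b->fraglen;
	uint8_t *out = tmp;
	size_t i;

	if (offset > total || len > total - offset)
		return NULL;
	if (offset + len <= b->headlen)
		return b->head + offset;
	for (i = 0; i < len; i++) {
		size_t pos = offset + i;

		out[i] = pos < b->headlen ? b->head[pos]
					  : b->frag[pos - b->headlen];
	}
	return tmp;
}
```

With a 14-byte head, the outer ethertype at offset 12 is read in place,
but the inner ethertype of a VLAN-tagged frame at offset 16 has to be
copied out of the fragment, and a read past the end yields NULL instead
of stray data.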


* Re: [PATCH net v2 2/2] net: ethernet: cortina: Bypass checksumming engine of alien ethertypes
  2023-12-19  9:14       ` Eric Dumazet
@ 2023-12-19 14:22         ` Linus Walleij
  0 siblings, 0 replies; 9+ messages in thread
From: Linus Walleij @ 2023-12-19 14:22 UTC (permalink / raw)
  To: Eric Dumazet, Maxime Chevallier
  Cc: Hans Ulli Kroll, David S. Miller, Jakub Kicinski, Paolo Abeni, netdev

On Tue, Dec 19, 2023 at 10:15 AM Eric Dumazet <edumazet@google.com> wrote:
> On Tue, Dec 19, 2023 at 12:42 AM Linus Walleij <linus.walleij@linaro.org> wrote:

> > This is often what we want: DSA switches will "wash" custom ethertypes
> > before they go out, but in this case the custom ethertype upsets the
> > ethernet checksum engine used as conduit interface toward the DSA
> > switch.
>
> The problem is that your code is missing a skb_header_pointer() or
> pskb_may_pull() call...
> Second "ethertype = ntohs(*p);" might access uninitialized data.

Yeah, this needs to be done properly, looking at skb->len etc.

> If this is a common operation, perhaps use a common helper from all drivers,
> this would help code review a bit...

You are right, Maxime opened a discussion on it in parallel,
I'll cook something up!

Yours,
Linus Walleij


* Re: [PATCH net v2 1/2] net: ethernet: cortina: Drop software checksum and TSO
  2023-12-18 23:23   ` Jakub Kicinski
@ 2023-12-19 14:24     ` Linus Walleij
  0 siblings, 0 replies; 9+ messages in thread
From: Linus Walleij @ 2023-12-19 14:24 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Hans Ulli Kroll, David S. Miller, Eric Dumazet, Paolo Abeni, netdev

On Tue, Dec 19, 2023 at 12:23 AM Jakub Kicinski <kuba@kernel.org> wrote:
> On Sat, 16 Dec 2023 20:36:52 +0100 Linus Walleij wrote:

> > -             NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM | \
> > -             NETIF_F_TSO | NETIF_F_TSO_ECN | NETIF_F_TSO6)
> > +            NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM )
>
> nit: checkpatch is really upset about this space before )

I'll fix it. Eric and Maxime opened up the idea of a generic helper to
extract the ethertype from a buffer, so I'll fix this in the next iteration.

Yours,
Linus Walleij
