netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* DSA tags seem to break checksum offloading on DWMAC100
@ 2023-12-15  9:54 Romain Gantois
  2023-12-15 10:30 ` Andrew Lunn
  2023-12-16  6:15 ` Richard Tresidder
  0 siblings, 2 replies; 4+ messages in thread
From: Romain Gantois @ 2023-12-15  9:54 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Vladimir Oltean, Alexandre Torgue,
	Jose Abreu
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, Maxime Chevallier, Thomas Petazzoni, Linus Walleij


A similar issue was just reported for a different MAC driver:

https://lore.kernel.org/netdev/20231215-new-gemini-ethernet-regression-v1-2-93033544be23@linaro.org/T/#u

Hello everyone,

I was rebasing on net-next an out-of-tree stmmac driver for the RZN1 GMAC
IP, and I noticed that something broke all TCP transmissions going through the
GMAC1 Ethernet controller. This MAC controller was connected to a 88E6352
Marvell switch through its CPU port. Further investigation revealed that
egressing packets had an invalid TCP checksum, which caused them to be dropped
at the receiving side's kernel.

A bisection on the transmitting side's kernel showed that the commit that caused
the bug was:

    6b2c6e4a938f (net: stmmac: propagate feature flags to vlan, 2023-04-17)

This stmmac patch makes it so that most of the feature flags of stmmac net
devices are copied to its vlan features. Some of these flags are then
transmitted to DSA user devices. The NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM flags 
that control checksum offloading, are responsible for the bug.

Relevant call chain: 
dsa_user_xmit -> ...[packet is tagged] -> __dev_queue_xmit \
-> validate_xmit_skb and stmmac_xmit

If checksum offloading is enabled, stmmac will set it in hardware at the
following location:

https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/enh_desc.c#L322

Then the hardware computes an incorrect checksum. I believe that this is caused
by the presence of DSA tags in the frames, although I can't be 100% sure of
this.

So it seems like a solution is needed to prevent checksum offloading by Ethernet
drivers when DSA tags are in used. I'm not sure how this would be done, since
DSA is purposefully kept quite separate from CPU port drivers. What are your
thoughts on this?

Best Regards,

-- 
Romain Gantois, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: DSA tags seem to break checksum offloading on DWMAC100
  2023-12-15  9:54 DSA tags seem to break checksum offloading on DWMAC100 Romain Gantois
@ 2023-12-15 10:30 ` Andrew Lunn
  2023-12-15 13:45   ` Vladimir Oltean
  2023-12-16  6:15 ` Richard Tresidder
  1 sibling, 1 reply; 4+ messages in thread
From: Andrew Lunn @ 2023-12-15 10:30 UTC (permalink / raw)
  To: Romain Gantois
  Cc: Florian Fainelli, Vladimir Oltean, Alexandre Torgue, Jose Abreu,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, Maxime Chevallier, Thomas Petazzoni, Linus Walleij

> So it seems like a solution is needed to prevent checksum offloading by Ethernet
> drivers when DSA tags are in used. I'm not sure how this would be done, since
> DSA is purposefully kept quite separate from CPU port drivers. What are your
> thoughts on this?

It is not as simple as that, because some Ethernet drivers do know how
to correctly calculate checksums when there is a DSA
header. e.g. Marvell and Broadcom devices can do this, when combined
with Marvell/Broadcom switches. I don't know how the Broadcom driver
does this, but on the Marvell Ethernet drivers, there is a value you
set in the transmit descriptor to indicate how big the headers are
before the IP header. Its normally used to skip over the VLAN tag, but
it can also be used to skip over the DSA header.

So i would suggest you look at the data sheet and see if there is
anything similar, a way to tell the hardware where the IP header
actually is in the frame. If you can do that, you can then actually
make use of the hardware checksum, rather than disable it.

     Andrew

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: DSA tags seem to break checksum offloading on DWMAC100
  2023-12-15 10:30 ` Andrew Lunn
@ 2023-12-15 13:45   ` Vladimir Oltean
  0 siblings, 0 replies; 4+ messages in thread
From: Vladimir Oltean @ 2023-12-15 13:45 UTC (permalink / raw)
  To: Romain Gantois, Andrew Lunn, Corinna Vinschen
  Cc: Florian Fainelli, Alexandre Torgue, Jose Abreu, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev,
	Maxime Chevallier, Thomas Petazzoni, Linus Walleij

On Fri, Dec 15, 2023 at 11:30:47AM +0100, Andrew Lunn wrote:
> > So it seems like a solution is needed to prevent checksum offloading by Ethernet
> > drivers when DSA tags are in used. I'm not sure how this would be done, since
> > DSA is purposefully kept quite separate from CPU port drivers. What are your
> > thoughts on this?
> 
> It is not as simple as that, because some Ethernet drivers do know how
> to correctly calculate checksums when there is a DSA
> header. e.g. Marvell and Broadcom devices can do this, when combined
> with Marvell/Broadcom switches. I don't know how the Broadcom driver
> does this, but on the Marvell Ethernet drivers, there is a value you
> set in the transmit descriptor to indicate how big the headers are
> before the IP header. Its normally used to skip over the VLAN tag, but
> it can also be used to skip over the DSA header.
> 
> So i would suggest you look at the data sheet and see if there is
> anything similar, a way to tell the hardware where the IP header
> actually is in the frame. If you can do that, you can then actually
> make use of the hardware checksum, rather than disable it.
> 
>      Andrew

There's a chance that the DWMAC cannot be told what is the offset of the
IP header - it finds it by itself. Looking at DWMAC documentation for
the TDES3_CHECKSUM_INSERTION_SHIFT bit, it seems to me that this is the
case. Table "Transmit Checksum Offload Engine Functions for Different
Packet Types" also describes for which packet types this will work, and
for which it won't. My guess is that the DWMAC would classify DSA-tagged
packets as "Non-IPv4 or IPv6 packet", but they need checksums for the
inner transport layer nonetheless. I think that transport checksumming
of IP over IP tunnels is also broken, because stmmac says to the stack
it will checksum them, but this table says it won't.

We say this in Documentation/networking/dsa/dsa.rst:

  If the DSA conduit driver still uses the legacy NETIF_F_IP_CSUM or
  NETIF_F_IPV6_CSUM in vlan_features, the offload might only work if the
  offload hardware already expects that specific tag (perhaps due to
  matching vendors). DSA user ports inherit those flags from the
  conduit, and it is up to the driver to correctly fall back to software
  checksum when the IP header is not where the hardware expects. If that
  check is ineffective, the packets might go to the network without a
  proper checksum (the checksum field will have the pseudo IP header
  sum).

TL;DR: the simplest fix would be to revert commit 6b2c6e4a938f ("net:
stmmac: propagate feature flags to vlan") in 'net', and rework
stmmac_xmit() in 'net-next' to only use csum_insertion for those packets
where it will work, and fall back to skb_checksum_help() otherwise.
The existing check with coe_unsupported per TX queue is insufficient.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: DSA tags seem to break checksum offloading on DWMAC100
  2023-12-15  9:54 DSA tags seem to break checksum offloading on DWMAC100 Romain Gantois
  2023-12-15 10:30 ` Andrew Lunn
@ 2023-12-16  6:15 ` Richard Tresidder
  1 sibling, 0 replies; 4+ messages in thread
From: Richard Tresidder @ 2023-12-16  6:15 UTC (permalink / raw)
  To: Romain Gantois, Andrew Lunn, Florian Fainelli, Vladimir Oltean,
	Alexandre Torgue, Jose Abreu
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, Maxime Chevallier, Thomas Petazzoni, Linus Walleij

On 15/12/2023 5:54 pm, Romain Gantois wrote:

> A similar issue was just reported for a different MAC driver:
>
> https://lore.kernel.org/netdev/20231215-new-gemini-ethernet-regression-v1-2-93033544be23@linaro.org/T/#u
>
> Hello everyone,
>
> I was rebasing on net-next an out-of-tree stmmac driver for the RZN1 GMAC
> IP, and I noticed that something broke all TCP transmissions going through the
> GMAC1 Ethernet controller. This MAC controller was connected to a 88E6352
> Marvell switch through its CPU port. Further investigation revealed that
> egressing packets had an invalid TCP checksum, which caused them to be dropped
> at the receiving side's kernel.
>
> A bisection on the transmitting side's kernel showed that the commit that caused
> the bug was:
>
>      6b2c6e4a938f (net: stmmac: propagate feature flags to vlan, 2023-04-17)
>
> This stmmac patch makes it so that most of the feature flags of stmmac net
> devices are copied to its vlan features. Some of these flags are then
> transmitted to DSA user devices. The NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM flags
> that control checksum offloading, are responsible for the bug.
>
> Relevant call chain:
> dsa_user_xmit -> ...[packet is tagged] -> __dev_queue_xmit \
> -> validate_xmit_skb and stmmac_xmit
>
> If checksum offloading is enabled, stmmac will set it in hardware at the
> following location:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/net/ethernet/stmicro/stmmac/enh_desc.c#L322
>
> Then the hardware computes an incorrect checksum. I believe that this is caused
> by the presence of DSA tags in the frames, although I can't be 100% sure of
> this.
>
> So it seems like a solution is needed to prevent checksum offloading by Ethernet
> drivers when DSA tags are in used. I'm not sure how this would be done, since
> DSA is purposefully kept quite separate from CPU port drivers. What are your
> thoughts on this?
>
> Best Regards,
I'm working through a very similar issue in this conversation:
https://lore.kernel.org/netdev/e5c6c75f-2dfa-4e50-a1fb-6bf4cdb617c2@electromag.com.au/
It's the same commit that I found when bisecting that caused the issue.
We have this same stmmac interfacing with a 88E6352 running with the DSA 
driver.

Cheers
     Richard Tresidder


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-12-16  6:22 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-15  9:54 DSA tags seem to break checksum offloading on DWMAC100 Romain Gantois
2023-12-15 10:30 ` Andrew Lunn
2023-12-15 13:45   ` Vladimir Oltean
2023-12-16  6:15 ` Richard Tresidder

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).