All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kurt Kanzenbach <kurt@linutronix.de>
To: Vladimir Oltean <olteanv@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Andrew Lunn <andrew@lunn.ch>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	Florian Fainelli <f.fainelli@gmail.com>,
	netdev@vger.kernel.org
Subject: Re: [PATCH net-next] net: dsa: hellcreek: Print warning only once
Date: Thu, 01 Sep 2022 08:21:33 +0200	[thread overview]
Message-ID: <87czcfzkb6.fsf@kurt> (raw)
In-Reply-To: <20220831234329.w7wnxy4u3e5i6ydl@skbuf>

[-- Attachment #1: Type: text/plain, Size: 8317 bytes --]

On Thu Sep 01 2022, Vladimir Oltean wrote:
> On Wed, Aug 31, 2022 at 09:34:22PM +0200, Kurt Kanzenbach wrote:
>> I've plugged my hellcreek test device into a Cisco switch and saw these
>> messages flying by. It said it failed to get source port 0 at a constant
>> rate. Turns out the Cisco switch does RSTP by default. Every STP frame
>> received has source port 0 which doesn't make any sense. The switch adds
>> a tail tag to every frame towards the CPU. All STP frames have their
>> tail tag not really at the end of the frame. It's off by exactly four
>> bytes. So, maybe it's the FCS.
>
> Hmm, I'm not sure I'm awake enough to be looking at this. So AFAIU, some
> DWMAC versions can be configured to strip the padding and FCS from
> "Length" frames (EtherType <= 0x600) via the ACS bit in the MAC
> configuration register, and some others can also be configured to strip
> the FCS from "Type" frames (EtherType > 0x800) via the CST bit of the
> same register. Ok, I'll go with that, although I'm confused as to why
> some DWMACs would have ACS but not CST available.
>
> The stmmac driver does not support NETIF_F_RXFCS, so it wants frames
> passed up the stack to never have the FCS. That is good. Except that the
> frames passed via the RX buffer descriptors may sometimes have RX FCS,
> and sometimes may not.
>
> This means that the driver needs to make a determination of whether the
> hardware has already stripped the FCS or not, before attempting to strip
> the FCS by itself.
>
> So we may have:
>
> (a) FCS stripped by HW, left alone by SW => frame looks ok
> (b) FCS stripped by HW and also by SW => frame is truncated by 4 bytes
> (c) FCS left alone by HW, left alone by SW => frame has 4 bytes too many (the FCS)
> (d) FCS left alone by HW, stripped by SW => frame looks ok
>
> It seems from what you're saying that you're under circumstance (c).

Yes, exactly.

>
>> The DSA master is a older stmmac. It has this code here:
>> 
>> |drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
>> |	/* Clear ACS bit because Ethernet switch tagging formats such as
>> |	 * Broadcom tags can look like invalid LLC/SNAP packets and cause the
>> |	 * hardware to truncate packets on reception.
>> |	 */
>> |	if (netdev_uses_dsa(dev) || !priv->plat->enh_desc)
>> |		value &= ~GMAC_CONTROL_ACS;
>> |
>> 
>> Actually, this has to be done. Without disabling the ACS bit the STP
>> frames are truncated and the trailer tag is gone.
>
> So from Florian's comment above, he was under case (b), different than yours.
> I don't understand why you say that when ACS is set, "the STP frames are
> truncated and the trailer tag is gone". Simply altering the ACS bit
> isn't going to change the determination made by stmmac_rx(). My guess
> based on the current input I have is that it would work fine for you
> (but possibly not for Florian).

I thought so too. However, altering the ACS Bit didn't help at all.

>
>> Then, there is
>> 
>> |drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> |		/* ACS is set; GMAC core strips PAD/FCS for IEEE 802.3
>> |		 * Type frames (LLC/LLC-SNAP)
>> |		 *
>> |		 * llc_snap is never checked in GMAC >= 4, so this ACS
>> |		 * feature is always disabled and packets need to be
>> |		 * stripped manually.
>> |		 */
>> |		if (likely(!(status & rx_not_ls)) &&
>> |		    (likely(priv->synopsys_id >= DWMAC_CORE_4_00) ||
>> |		     unlikely(status != llc_snap))) {
>> |			if (buf2_len)
>> |				buf2_len -= ETH_FCS_LEN;
>> |			else
>> |				buf1_len -= ETH_FCS_LEN;
>> |
>> |			len -= ETH_FCS_LEN;
>> |		}
>> 
>> Great. Unfortunately the stmmac used here is a dwmac-3.70. So, the FCS
>> doesn't seem to be stripped for the STP frames.
>
> Yes, neither by hardware nor by software.
>
> It isn't stripped by hardware due to the ACS setting. And it isn't
> stripped by software because, although the synopsys_id is smaller than 4.00,
> there's only a single stmmac_desc_ops :: rx_status implementation which
> will ever return "llc_snap" as parse result for a frame, and that isn't
> yours.
>
>> Now, I'm currently testing the patch below and it seems to work:
>> 
>> |root@tsn:~# dmesg | grep -i 'Failed to get source port'
>> |root@tsn:~# tcpdump -i lan0 stp
>> |tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
>> |listening on lan0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
>> |19:25:17.031699 STP 802.1w, Rapid STP, Flags [Learn, Forward, Agreement], bridge-id 8000.2c:1a:05:28:06:c1.8006, length 36
>> 
>> Thanks,
>> Kurt
>> 
>> Patch:
>> 
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 9c1e19ea6fcd..74f348e27005 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -41,6 +41,7 @@
>>  #include <linux/bpf_trace.h>
>>  #include <net/pkt_cls.h>
>>  #include <net/xdp_sock_drv.h>
>> +#include <net/dsa.h>
>>  #include "stmmac_ptp.h"
>>  #include "stmmac.h"
>>  #include "stmmac_xdp.h"
>> @@ -5209,6 +5210,7 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
>>                  */
>>                 if (likely(!(status & rx_not_ls)) &&
>>                     (likely(priv->synopsys_id >= DWMAC_CORE_4_00) ||
>> +                    unlikely(netdev_uses_dsa(priv->dev)) ||
>
> This change forces FCS stripping in software for all DWMACs used as DSA
> masters.
>
> Florian solved his problem by moving from case (b) to (d), so the change
> should continue to work for him as well.
>
>>                      unlikely(status != llc_snap))) {
>>                         if (buf2_len)
>>                                 buf2_len -= ETH_FCS_LEN;
>
> My 2 cents on this topic are:
>
> 1. The software determination is way too complicated and hard to reason
>    about, there are too many tests in the fast path, and you are adding
>    one more (actually two, if you look at the implementation of netdev_uses_dsa).

Yes, exactly. That's why I dislike the extra check.

>    Additionally, someone will probably need to modify the zerocopy rx
>    procedure as well, in a not so distant future.

Yes, i know.

>    It must be taken into consideration how much worse would the
>    performance be if the driver would just configure the hardware to
>    never selectively strip the FCS (i.e. for some packets but not all;
>    in practice this means never strip the FCS. Ideally it would
>    *always* strip the FCS, but I'm not sure whether this is possible
>    with other hardware pre-4.0).  Considering that for the common case
>    of IP traffic the FCS is already stripped in software, I think
>    there will be a net gain by simplifying stmmac_rx() and leaving
>    just the "BD really is last" check.

I was thinking exactly the same. The FCS strip determination logic seems
complicated and is performed in the fast path.

From what I see, the stmmac strips the FCS in hardware only for IEEE
802.3 type frames and older versions. Meaning in most cases today the
FCS is stripped in software anyway.

We could do some measurements e.g., with perf to determine whether
removing the FCS logic has positive or negative effects?

>
> 2. The FCS stripping logic actually looks wrong to me.
>
> 			if (buf2_len) {
> 				buf2_len -= ETH_FCS_LEN;
> 				len -= ETH_FCS_LEN;
> 			} else if (buf1_len) {
> 				buf1_len -= ETH_FCS_LEN;
> 				len -= ETH_FCS_LEN;
> 			}
>
>    The "if (x != 0) x -= 4" idiom seems classically wrong, in that x may
>    still be < 4, and this will result in a negative buf2_len, or buf1_len.
>
>    Applied to reality, this would mean a scatter/gather frame where the FCS
>    is split between 2 buffers. I don't think the driver handles this case
>    well at all.

Oh, well.

>
> How large can you configure the hellcreek switch to send packets towards
> the DSA master? Could you force a packet size (including hellcreek tail tag)
> comparable to dma_conf->dma_buf_sz?

I don't think so.

> Or if not, perhaps you could hack the way in which stmmac_set_bfsize()
> works, or one of the constants?

I'm not sure if i follow you here.

Thanks,
Kurt

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 861 bytes --]

  reply	other threads:[~2022-09-01  6:21 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-30 16:34 [PATCH net-next] net: dsa: hellcreek: Print warning only once Kurt Kanzenbach
2022-08-30 19:04 ` Andrew Lunn
2022-08-31 15:26 ` Vladimir Oltean
2022-08-31 19:34   ` Kurt Kanzenbach
2022-08-31 23:43     ` Vladimir Oltean
2022-09-01  6:21       ` Kurt Kanzenbach [this message]
2022-09-01 11:39         ` Vladimir Oltean
2022-09-03 13:24           ` Kurt Kanzenbach
2022-09-03 16:44             ` Vladimir Oltean
2022-09-03 17:25             ` Florian Fainelli
2022-09-05 12:28               ` Kurt Kanzenbach
2022-09-01  2:55 ` Jakub Kicinski
2022-09-01  3:00 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87czcfzkb6.fsf@kurt \
    --to=kurt@linutronix.de \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=f.fainelli@gmail.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.