All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag
@ 2021-06-05 19:37 Matthew Hagan
  2021-06-05 20:35 ` Andrew Lunn
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Hagan @ 2021-06-05 19:37 UTC (permalink / raw)
  Cc: Matthew Hagan, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	Vladimir Oltean, David S. Miller, Jakub Kicinski, netdev,
	linux-kernel

The qca_tag_rcv function unconditionally expects a QCA tag to be present
between source MAC and EtherType. However if an upstream switch is used,
this may create a special case where VLAN tags are subsequently inserted
between the source MAC and the QCA tag. Thus when qca_tag_rcv is called,
it will attempt to read the 802.1q TPID as a QCA tag. This results in
complication since the TPID will pass the QCA tag version checking on bits
14 and 15, but the resulting packet after trimming the TPID will be
unusable.

The tested case is a Meraki MX65 which features two QCA8337 switches with
their CPU ports attached to a BCM58625 switch ports 4 and 5 respectively.
In this case a VLAN tag with VID 0 is added by the upstream BCM switch
when the port is unconfigured and packets with this VLAN tag or without
will be accepted at the BCM's CPU port. However, it is arguably possible
that other switches may be configured to drop VLAN untagged traffic at
their respective CPU port. Thus where packets are VLAN untagged, the
default VLAN tag, added by the upstream switch, should be maintained. Where
inbound packets are already VLAN tagged when arriving at the QCA switch, we
should replace the default VLAN tag, added by the upstream port, with the
correct VLAN tag.

This patch introduces:
  1 - A check for a VLAN tag before EtherType. If found, skip past this to
      find the QCA tag.
  2 - Check for a second VLAN tag after the QCA tag if one was found in 1.
      If found, remove both the initial VLAN tag and the QCA tag. If not
      found, remove only the QCA tag to maintain the VLAN tag added by the
      upstream switch.

Signed-off-by: Matthew Hagan <mnhagan88@gmail.com>
---
 net/dsa/tag_qca.c | 41 +++++++++++++++++++++++++++++++----------
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index 88181b52f480..e5273a27bf8a 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -52,18 +52,27 @@ static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev,
 				   struct packet_type *pt)
 {
 	u8 ver;
-	u16  hdr;
-	int port;
-	__be16 *phdr;
+	u16 hdr, vlan_hdr;
+	int port, vlan_offset = 0, vlan_skip = 0;
+	__be16 *phdr, *vlan_phdr;
 
 	if (unlikely(!pskb_may_pull(skb, QCA_HDR_LEN)))
 		return NULL;
 
-	/* The QCA header is added by the switch between src addr and Ethertype
-	 * At this point, skb->data points to ethertype so header should be
-	 * right before
+	/* The QCA header is added by the switch between src addr and
+	 * Ethertype. Normally at this point, skb->data points to ethertype so the
+	 * header should be right before. However if a VLAN tag has subsequently
+	 * been added upstream, we need to skip past it to find the QCA header.
 	 */
-	phdr = (__be16 *)(skb->data - 2);
+	vlan_phdr = (__be16 *)(skb->data - 2);
+	vlan_hdr = ntohs(*vlan_phdr);
+
+	/* Check for VLAN tag before QCA tag */
+	if (!(vlan_hdr ^ ETH_P_8021Q))
+		vlan_offset = VLAN_HLEN;
+
+	/* Look for QCA tag at the correct location */
+	phdr = (__be16 *)(skb->data - 2 + vlan_offset);
 	hdr = ntohs(*phdr);
 
 	/* Make sure the version is correct */
@@ -71,10 +80,22 @@ static struct sk_buff *qca_tag_rcv(struct sk_buff *skb, struct net_device *dev,
 	if (unlikely(ver != QCA_HDR_VERSION))
 		return NULL;
 
+	/* Check for second VLAN tag after QCA tag if one was found prior */
+	if (!!(vlan_offset)) {
+		vlan_phdr = (__be16 *)(skb->data + 4);
+		vlan_hdr = ntohs(*vlan_phdr);
+		if (!!(vlan_hdr ^ ETH_P_8021Q)) {
+		/* Do not remove existing tag in case a tag is required */
+			vlan_offset = 0;
+			vlan_skip = VLAN_HLEN;
+		}
+	}
+
 	/* Remove QCA tag and recalculate checksum */
-	skb_pull_rcsum(skb, QCA_HDR_LEN);
-	memmove(skb->data - ETH_HLEN, skb->data - ETH_HLEN - QCA_HDR_LEN,
-		ETH_HLEN - QCA_HDR_LEN);
+	skb_pull_rcsum(skb, QCA_HDR_LEN + vlan_offset);
+	memmove(skb->data - ETH_HLEN,
+		skb->data - ETH_HLEN - QCA_HDR_LEN - vlan_offset,
+		ETH_HLEN - QCA_HDR_LEN + vlan_skip);
 
 	/* Get source port information */
 	port = (hdr & QCA_HDR_RECV_SOURCE_PORT_MASK);
-- 
2.26.3


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag
  2021-06-05 19:37 [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag Matthew Hagan
@ 2021-06-05 20:35 ` Andrew Lunn
  2021-06-05 22:39   ` Matthew Hagan
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Lunn @ 2021-06-05 20:35 UTC (permalink / raw)
  To: Matthew Hagan
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, netdev, linux-kernel

> The tested case is a Meraki MX65 which features two QCA8337 switches with
> their CPU ports attached to a BCM58625 switch ports 4 and 5 respectively.

Hi Matthew

The BCM58625 switch is also running DSA? What does you device tree
look like? I know Florian has used two broadcom switches in cascade
and did not have problems.

    Andrew

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag
  2021-06-05 20:35 ` Andrew Lunn
@ 2021-06-05 22:39   ` Matthew Hagan
  2021-06-06  0:53     ` Vladimir Oltean
  0 siblings, 1 reply; 9+ messages in thread
From: Matthew Hagan @ 2021-06-05 22:39 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Vivien Didelot, Florian Fainelli, Vladimir Oltean,
	David S. Miller, Jakub Kicinski, netdev, linux-kernel

On 05/06/2021 21:35, Andrew Lunn wrote:

>> The tested case is a Meraki MX65 which features two QCA8337 switches with
>> their CPU ports attached to a BCM58625 switch ports 4 and 5 respectively.
> Hi Matthew
>
> The BCM58625 switch is also running DSA? What does you device tree
> look like? I know Florian has used two broadcom switches in cascade
> and did not have problems.
>
>     Andrew

Hi Andrew

I did discuss this with Florian, who recommended I submit the changes. Can
confirm the b53 DSA driver is being used. The issue here is that tagging
must occur on all ports. We can't selectively disable for ports 4 and 5
where the QCA switches are attached, thus this patch is required to get
things working.

Setup is like this:
                       sw0p2     sw0p4            sw1p2     sw1p4 
    wan1    wan2  sw0p1  +  sw0p3  +  sw0p5  sw1p1  +  sw1p3  +  sw1p5
     +       +      +    |    +    |    +      +    |    +    |    +
     |       |      |    |    |    |    |      |    |    |    |    |
     |       |    +--+----+----+----+----+-+ +--+----+----+----+----+-+
     |       |    |         QCA8337        | |        QCA8337         |
     |       |    +------------+-----------+ +-----------+------------+
     |       |             sw0 |                     sw1 |
+----+-------+-----------------+-------------------------+------------+
|    0       1    BCM58625     4                         5            |
+----+-------+-----------------+-------------------------+------------+

Relevant sections of the device tree are as follows:

mdio@0 {
    reg = <0x0>;
    #address-cells = <1>;
    #size-cells = <0>;

    phy_port6: phy@0 {
        reg = <0>;
    };

    phy_port7: phy@1 {
        reg = <1>;
    };

    phy_port8: phy@2 {
        reg = <2>;
    };

    phy_port9: phy@3 {
        reg = <3>;
    };

    phy_port10: phy@4 {
        reg = <4>;
    };

    switch@10 {
        compatible = "qca,qca8337";
        #address-cells = <1>;
        #size-cells = <0>;
        reg = <0x10>;
        dsa,member = <1 0>;

        ports {
            #address-cells = <1>;
            #size-cells = <0>;
            port@0 {
                reg = <0>;
                label = "cpu";
                ethernet = <&sgmii1>;
                phy-mode = "sgmii";
                fixed-link {
                    speed = <1000>;
                    full-duplex;
                };
            };

            port@1 {
                reg = <1>;
                label = "sw1p1";
                phy-handle = <&phy_port6>;
            };

            port@2 {
                reg = <2>;
                label = "sw1p2";
                phy-handle = <&phy_port7>;
            };

            port@3 {
                reg = <3>;
                label = "sw1p3";
                phy-handle = <&phy_port8>;
            };

            port@4 {
                reg = <4>;
                label = "sw1p4";
                phy-handle = <&phy_port9>;
            };

            port@5 {
                reg = <5>;
                label = "sw1p5";
                phy-handle = <&phy_port10>;
            };
        };
    };
};

mdio-mii@2000 {
    reg = <0x2000>;
    #address-cells = <1>;
    #size-cells = <0>;

    phy_port1: phy@0 {
        reg = <0>;
    };

    phy_port2: phy@1 {
        reg = <1>;
    };

    phy_port3: phy@2 {
        reg = <2>;
    };

    phy_port4: phy@3 {
        reg = <3>;
    };

    phy_port5: phy@4 {
        reg = <4>;
    };

    switch@10 {
        compatible = "qca,qca8337";
        #address-cells = <1>;
        #size-cells = <0>;
        reg = <0x10>;
        dsa,member = <2 0>;

        ports {
            #address-cells = <1>;
            #size-cells = <0>;
            port@0 {
                reg = <0>;
                label = "cpu";
                ethernet = <&sgmii0>;
                phy-mode = "sgmii";
                fixed-link {
                    speed = <1000>;
                    full-duplex;
                };
            };

            port@1 {
                reg = <1>;
                label = "sw0p1";
                phy-handle = <&phy_port1>;
            };

            port@2 {
                reg = <2>;
                label = "sw0p2";
                phy-handle = <&phy_port2>;
            };

            port@3 {
                reg = <3>;
                label = "sw0p3";
                phy-handle = <&phy_port3>;
            };

            port@4 {
                reg = <4>;
                label = "sw0p4";
                phy-handle = <&phy_port4>;
            };

            port@5 {
                reg = <5>;
                label = "sw0p5";
                phy-handle = <&phy_port5>;
            };
        };
    };
};


&srab {
    compatible = "brcm,bcm58625-srab", "brcm,nsp-srab";
    status = "okay";
    dsa,member = <0 0>;

    ports {
        #address-cells = <1>;
        #size-cells = <0>;

        port@0 {
            label = "wan1";
            reg = <0>;
        };

        port@1 {
            label = "wan2";
            reg = <1>;
        };

        sgmii0: port@4 {
            label = "sw0";
            reg = <4>;
            fixed-link {
                speed = <1000>;
                full-duplex;
            };
        };

        sgmii1: port@5 {
            label = "sw1";
            reg = <5>;
            fixed-link {
                speed = <1000>;
                full-duplex;
            };
        };

        port@8 {
            ethernet = <&amac2>;
            label = "cpu";
            reg = <8>;
            fixed-link {
                speed = <1000>;
                full-duplex;
            };
        };
    };
};

Matthew


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag
  2021-06-05 22:39   ` Matthew Hagan
@ 2021-06-06  0:53     ` Vladimir Oltean
  2021-06-06  3:34       ` Florian Fainelli
  2021-06-06 13:09       ` Matthew Hagan
  0 siblings, 2 replies; 9+ messages in thread
From: Vladimir Oltean @ 2021-06-06  0:53 UTC (permalink / raw)
  To: Matthew Hagan
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, netdev, linux-kernel

Hi Matthew,

On Sat, Jun 05, 2021 at 11:39:24PM +0100, Matthew Hagan wrote:
> On 05/06/2021 21:35, Andrew Lunn wrote:
> 
> >> The tested case is a Meraki MX65 which features two QCA8337 switches with
> >> their CPU ports attached to a BCM58625 switch ports 4 and 5 respectively.
> > Hi Matthew
> >
> > The BCM58625 switch is also running DSA? What does you device tree
> > look like? I know Florian has used two broadcom switches in cascade
> > and did not have problems.
> >
> >     Andrew
> 
> Hi Andrew
> 
> I did discuss this with Florian, who recommended I submit the changes. Can
> confirm the b53 DSA driver is being used. The issue here is that tagging
> must occur on all ports. We can't selectively disable for ports 4 and 5
> where the QCA switches are attached, thus this patch is required to get
> things working.
> 
> Setup is like this:
>                        sw0p2     sw0p4            sw1p2     sw1p4 
>     wan1    wan2  sw0p1  +  sw0p3  +  sw0p5  sw1p1  +  sw1p3  +  sw1p5
>      +       +      +    |    +    |    +      +    |    +    |    +
>      |       |      |    |    |    |    |      |    |    |    |    |
>      |       |    +--+----+----+----+----+-+ +--+----+----+----+----+-+
>      |       |    |         QCA8337        | |        QCA8337         |
>      |       |    +------------+-----------+ +-----------+------------+
>      |       |             sw0 |                     sw1 |
> +----+-------+-----------------+-------------------------+------------+
> |    0       1    BCM58625     4                         5            |
> +----+-------+-----------------+-------------------------+------------+

It is a bit unconventional for the upstream Broadcom switch, which is a
DSA master of its own, to insert a VLAN ID of zero out of the blue,
especially if it operates in standalone mode. Supposedly sw0 and sw1 are
not under a bridge net device, are they?

If I'm not mistaken, this patch should solve your problem?

-----------------------------[ cut here ]-----------------------------
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 3ca6b394dd5f..d6655b516bd8 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1462,6 +1462,7 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
 	struct b53_device *dev = ds->priv;
 	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
 	bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
+	bool really_untagged = false;
 	struct b53_vlan *vl;
 	int err;
 
@@ -1474,10 +1475,10 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
 	b53_get_vlan_entry(dev, vlan->vid, vl);
 
 	if (vlan->vid == 0 && vlan->vid == b53_default_pvid(dev))
-		untagged = true;
+		really_untagged = true;
 
 	vl->members |= BIT(port);
-	if (untagged && !dsa_is_cpu_port(ds, port))
+	if (really_untagged || (untagged && !dsa_is_cpu_port(ds, port)))
 		vl->untag |= BIT(port);
 	else
 		vl->untag &= ~BIT(port);
-----------------------------[ cut here ]-----------------------------

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag
  2021-06-06  0:53     ` Vladimir Oltean
@ 2021-06-06  3:34       ` Florian Fainelli
  2021-06-06  9:38         ` Vladimir Oltean
  2021-06-06 13:09       ` Matthew Hagan
  1 sibling, 1 reply; 9+ messages in thread
From: Florian Fainelli @ 2021-06-06  3:34 UTC (permalink / raw)
  To: Vladimir Oltean, Matthew Hagan
  Cc: Andrew Lunn, Vivien Didelot, David S. Miller, Jakub Kicinski,
	netdev, linux-kernel



On 6/5/2021 5:53 PM, Vladimir Oltean wrote:
> Hi Matthew,
> 
> On Sat, Jun 05, 2021 at 11:39:24PM +0100, Matthew Hagan wrote:
>> On 05/06/2021 21:35, Andrew Lunn wrote:
>>
>>>> The tested case is a Meraki MX65 which features two QCA8337 switches with
>>>> their CPU ports attached to a BCM58625 switch ports 4 and 5 respectively.
>>> Hi Matthew
>>>
>>> The BCM58625 switch is also running DSA? What does you device tree
>>> look like? I know Florian has used two broadcom switches in cascade
>>> and did not have problems.
>>>
>>>     Andrew
>>
>> Hi Andrew
>>
>> I did discuss this with Florian, who recommended I submit the changes. Can
>> confirm the b53 DSA driver is being used. The issue here is that tagging
>> must occur on all ports. We can't selectively disable for ports 4 and 5
>> where the QCA switches are attached, thus this patch is required to get
>> things working.
>>
>> Setup is like this:
>>                        sw0p2     sw0p4            sw1p2     sw1p4 
>>     wan1    wan2  sw0p1  +  sw0p3  +  sw0p5  sw1p1  +  sw1p3  +  sw1p5
>>      +       +      +    |    +    |    +      +    |    +    |    +
>>      |       |      |    |    |    |    |      |    |    |    |    |
>>      |       |    +--+----+----+----+----+-+ +--+----+----+----+----+-+
>>      |       |    |         QCA8337        | |        QCA8337         |
>>      |       |    +------------+-----------+ +-----------+------------+
>>      |       |             sw0 |                     sw1 |
>> +----+-------+-----------------+-------------------------+------------+
>> |    0       1    BCM58625     4                         5            |
>> +----+-------+-----------------+-------------------------+------------+
> 
> It is a bit unconventional for the upstream Broadcom switch, which is a
> DSA master of its own, to insert a VLAN ID of zero out of the blue,
> especially if it operates in standalone mode. Supposedly sw0 and sw1 are
> not under a bridge net device, are they?

This is because of the need (or desire) to always tag the CPU port
regardless of the untagged VLAN that one of its downstream port is being
added to. Despite talking with Matthew about this before, I had not
realized that dsa_port_is_cpu() will return true for ports 4 and 5 when
a VLAN is added to one of the two QCA8337 switches because from the
perspective of that switch, those ports have been set as DSA_PORT_TYPE_CPU.

This may also mean that b53_setup() needs fixing as well while it
iterates over the ports of the switch though I am not sure how we could
fix that yet.

> 
> If I'm not mistaken, this patch should solve your problem?

How about this:

diff --git a/drivers/net/dsa/b53/b53_common.c
b/drivers/net/dsa/b53/b53_common.c
index 3ca6b394dd5f..6dfcff9018fd 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1455,6 +1455,22 @@ static int b53_vlan_prepare(struct dsa_switch
*ds, int port,
        return 0;
 }

+static inline bool b53_vlan_can_untag(struct dsa_switch *ds, int port)
+{
+       /* If this switch port is a CPU port */
+       if (dsa_is_cpu_port(ds, port)) {
+               /* We permit untagging to be configured if it is the DSA
+                * master of another switch (cascading).
+                */
+               if (dsa_slave_dev_check(dsa_to_port(ds, port)->master))
+                       return true;
+
+               return false;
+       }
+
+       return true;
+}
+
 int b53_vlan_add(struct dsa_switch *ds, int port,
                 const struct switchdev_obj_port_vlan *vlan,
                 struct netlink_ext_ack *extack)
@@ -1477,7 +1493,7 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
                untagged = true;

        vl->members |= BIT(port);
-       if (untagged && !dsa_is_cpu_port(ds, port))
+       if (untagged && b53_vlan_can_untag(ds, port))
                vl->untag |= BIT(port);
        else
                vl->untag &= ~BIT(port);
@@ -1514,7 +1530,7 @@ int b53_vlan_del(struct dsa_switch *ds, int port,
        if (pvid == vlan->vid)
                pvid = b53_default_pvid(dev);

-       if (untagged && !dsa_is_cpu_port(ds, port))
+       if (untagged && b53_vlan_can_untag(ds, port))
                vl->untag &= ~(BIT(port));

        b53_set_vlan_entry(dev, vlan->vid, vl);
-- 
Florian

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag
  2021-06-06  3:34       ` Florian Fainelli
@ 2021-06-06  9:38         ` Vladimir Oltean
  2021-06-07 17:01           ` Florian Fainelli
  0 siblings, 1 reply; 9+ messages in thread
From: Vladimir Oltean @ 2021-06-06  9:38 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Matthew Hagan, Andrew Lunn, Vivien Didelot, David S. Miller,
	Jakub Kicinski, netdev, linux-kernel

Hi Florian,

On Sat, Jun 05, 2021 at 08:34:06PM -0700, Florian Fainelli wrote:
> On 6/5/2021 5:53 PM, Vladimir Oltean wrote:
> > Hi Matthew,
> > 
> > On Sat, Jun 05, 2021 at 11:39:24PM +0100, Matthew Hagan wrote:
> >> On 05/06/2021 21:35, Andrew Lunn wrote:
> >>
> >>>> The tested case is a Meraki MX65 which features two QCA8337 switches with
> >>>> their CPU ports attached to a BCM58625 switch ports 4 and 5 respectively.
> >>> Hi Matthew
> >>>
> >>> The BCM58625 switch is also running DSA? What does you device tree
> >>> look like? I know Florian has used two broadcom switches in cascade
> >>> and did not have problems.
> >>>
> >>>     Andrew
> >>
> >> Hi Andrew
> >>
> >> I did discuss this with Florian, who recommended I submit the changes. Can
> >> confirm the b53 DSA driver is being used. The issue here is that tagging
> >> must occur on all ports. We can't selectively disable for ports 4 and 5
> >> where the QCA switches are attached, thus this patch is required to get
> >> things working.
> >>
> >> Setup is like this:
> >>                        sw0p2     sw0p4            sw1p2     sw1p4 
> >>     wan1    wan2  sw0p1  +  sw0p3  +  sw0p5  sw1p1  +  sw1p3  +  sw1p5
> >>      +       +      +    |    +    |    +      +    |    +    |    +
> >>      |       |      |    |    |    |    |      |    |    |    |    |
> >>      |       |    +--+----+----+----+----+-+ +--+----+----+----+----+-+
> >>      |       |    |         QCA8337        | |        QCA8337         |
> >>      |       |    +------------+-----------+ +-----------+------------+
> >>      |       |             sw0 |                     sw1 |
> >> +----+-------+-----------------+-------------------------+------------+
> >> |    0       1    BCM58625     4                         5            |
> >> +----+-------+-----------------+-------------------------+------------+
> > 
> > It is a bit unconventional for the upstream Broadcom switch, which is a
> > DSA master of its own, to insert a VLAN ID of zero out of the blue,
> > especially if it operates in standalone mode. Supposedly sw0 and sw1 are
> > not under a bridge net device, are they?
> 
> This is because of the need (or desire) to always tag the CPU port
> regardless of the untagged VLAN that one of its downstream port is being
> added to. Despite talking with Matthew about this before, I had not
> realized that dsa_port_is_cpu() will return true for ports 4 and 5 when
> a VLAN is added to one of the two QCA8337 switches because from the
> perspective of that switch, those ports have been set as DSA_PORT_TYPE_CPU.

It will not, the ports maintain the same roles regardless of whether
there is another switch attached to them or not. For the BCM58625
switch, ports 4 and 5 are user ports with net devices that each happen
to be DSA masters for 2 QCA8337 switches, and port 8 is the CPU port.

When a DSA user port is a DSA master for another switch, tag stacking
takes place - first the rcv() from tag_brcm.c runs, then the rcv() from
tag_qca.c runs - you taught me this, in fact.

My point is that the Broadcom switch should leave the packet in a state
where tag_qca.c can work with it without being aware that it has been
first processed by another switch. This is why I asked Matthew whether
he configured any bridging between BCM58625 ports 4 and 5, and any
bridge VLANs. I am not completely sure we should start modifying our DSA
taggers under the assumption that VLANs might just pop up everywhere -
I simply don't see a compelling use case to let that happen and justify
the complexity.

In this case, my suspicion is that the root of the issue is the
resolution from commit d965a5432d4c ("net: dsa: b53: Ensure the default
VID is untagged"). It seems like it wanted to treat VID 0 as untagged if
it's the pvid, but it only treats it as untagged in one direction.

For the network stack, I think there are checks scattered in
__netif_receive_skb_core that make it treat a skb with VID == 0 as if it
was untagged, so the fact that untagged packets are sent as egress-tagged
with VID=0 by the Broadcom CPU port (8) towards the system, and received
as VLAN-tagged by tag_brcm.c, is not that big of a problem. The problem
only appears when there is another DSA switch downstream of it, because
it shifts the expected position of the DSA tag in tag_qca.c.

DSA switch drivers don't normally send all packets as egress-tagged
towards the CPU. If they do, they ought to be more careful and not let
VLAN tags escape their tagging driver, if there was no VLAN tag to begin
with in the packet as seen on the wire.

We might make a justifiable exception in the case where DSA_TAG_PROTO_NONE
is used, but in this case, my understanding is that BCM58625 uses
DSA_TAG_PROTO_BRCM_PREPEND, so I'm not sure why sending packets towards
the CPU with VID=0 instead of untagged makes that big of a difference.

> 
> This may also mean that b53_setup() needs fixing as well while it
> iterates over the ports of the switch though I am not sure how we could
> fix that yet.
> 
> > 
> > If I'm not mistaken, this patch should solve your problem?
> 
> How about this:
> 
> diff --git a/drivers/net/dsa/b53/b53_common.c
> b/drivers/net/dsa/b53/b53_common.c
> index 3ca6b394dd5f..6dfcff9018fd 100644
> --- a/drivers/net/dsa/b53/b53_common.c
> +++ b/drivers/net/dsa/b53/b53_common.c
> @@ -1455,6 +1455,22 @@ static int b53_vlan_prepare(struct dsa_switch
> *ds, int port,
>         return 0;
>  }
> 
> +static inline bool b53_vlan_can_untag(struct dsa_switch *ds, int port)
> +{
> +       /* If this switch port is a CPU port */
> +       if (dsa_is_cpu_port(ds, port)) {
              this matches only for port == 8

> +               /* We permit untagging to be configured if it is the DSA
> +                * master of another switch (cascading).
> +                */
> +               if (dsa_slave_dev_check(dsa_to_port(ds, port)->master))
                                          and the master of port 8 is the "brcm,nsp-amac" controller
                      which is not a DSA slave port
> +                       return true;
> +
> +               return false;
                  so this will still return false
> +       }
> +
> +       return true;
> +}
> +
>  int b53_vlan_add(struct dsa_switch *ds, int port,
>                  const struct switchdev_obj_port_vlan *vlan,
>                  struct netlink_ext_ack *extack)
> @@ -1477,7 +1493,7 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
>                 untagged = true;
> 
>         vl->members |= BIT(port);
> -       if (untagged && !dsa_is_cpu_port(ds, port))
> +       if (untagged && b53_vlan_can_untag(ds, port))
>                 vl->untag |= BIT(port);
>         else
>                 vl->untag &= ~BIT(port);
> @@ -1514,7 +1530,7 @@ int b53_vlan_del(struct dsa_switch *ds, int port,
>         if (pvid == vlan->vid)
>                 pvid = b53_default_pvid(dev);
> 
> -       if (untagged && !dsa_is_cpu_port(ds, port))
> +       if (untagged && b53_vlan_can_untag(ds, port))
          and VID 0 will still be sent as egress-tagged by the BCM58625 on port 8.
>                 vl->untag &= ~(BIT(port));
> 
>         b53_set_vlan_entry(dev, vlan->vid, vl);
> -- 
> Florian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag
  2021-06-06  0:53     ` Vladimir Oltean
  2021-06-06  3:34       ` Florian Fainelli
@ 2021-06-06 13:09       ` Matthew Hagan
  2021-06-06 19:27         ` Vladimir Oltean
  1 sibling, 1 reply; 9+ messages in thread
From: Matthew Hagan @ 2021-06-06 13:09 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, netdev, linux-kernel

On 06/06/2021 01:53, Vladimir Oltean wrote:

> It is a bit unconventional for the upstream Broadcom switch, which is a
> DSA master of its own, to insert a VLAN ID of zero out of the blue,
> especially if it operates in standalone mode. Supposedly sw0 and sw1 are
> not under a bridge net device, are they?

sw0 and sw1 are brought up but otherwise left unconfigured. The bridge
consists of the user ports only (wanN and swNpN). A side note here is that
your "net: dsa: don't set skb->offload_fwd_mark when not offloading the
bridge" patch is also in use. Would setting up a bridge for sw0/sw1 not
have implications for receiving unknown frames on one port, that have been
sent from another port of the same switch? Since unknown frames will go to
the CPU, dp->bridge_dev would return the bridge name, setting
offload_fwd_mark=1 thus preventing those frames being sent back out
sw0/sw1 to its other ports.

>
> If I'm not mistaken, this patch should solve your problem?
>
> -----------------------------[ cut here ]-----------------------------
> diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
> index 3ca6b394dd5f..d6655b516bd8 100644
> --- a/drivers/net/dsa/b53/b53_common.c
> +++ b/drivers/net/dsa/b53/b53_common.c
> @@ -1462,6 +1462,7 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
>  	struct b53_device *dev = ds->priv;
>  	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
>  	bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
> +	bool really_untagged = false;
>  	struct b53_vlan *vl;
>  	int err;
>  
> @@ -1474,10 +1475,10 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
>  	b53_get_vlan_entry(dev, vlan->vid, vl);
>  
>  	if (vlan->vid == 0 && vlan->vid == b53_default_pvid(dev))
> -		untagged = true;
> +		really_untagged = true;
>  
>  	vl->members |= BIT(port);
> -	if (untagged && !dsa_is_cpu_port(ds, port))
> +	if (really_untagged || (untagged && !dsa_is_cpu_port(ds, port)))
>  		vl->untag |= BIT(port);
>  	else
>  		vl->untag &= ~BIT(port);
> -----------------------------[ cut here ]-----------------------------
>
This does seem to sort the issue as well in this case. Thanks!

Matthew


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag
  2021-06-06 13:09       ` Matthew Hagan
@ 2021-06-06 19:27         ` Vladimir Oltean
  0 siblings, 0 replies; 9+ messages in thread
From: Vladimir Oltean @ 2021-06-06 19:27 UTC (permalink / raw)
  To: Matthew Hagan
  Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, David S. Miller,
	Jakub Kicinski, netdev, linux-kernel

On Sun, Jun 06, 2021 at 02:09:24PM +0100, Matthew Hagan wrote:
> On 06/06/2021 01:53, Vladimir Oltean wrote:
> 
> > It is a bit unconventional for the upstream Broadcom switch, which is a
> > DSA master of its own, to insert a VLAN ID of zero out of the blue,
> > especially if it operates in standalone mode. Supposedly sw0 and sw1 are
> > not under a bridge net device, are they?
> 
> sw0 and sw1 are brought up but otherwise left unconfigured. The bridge
> consists of the user ports only (wanN and swNpN). A side note here is that
> your "net: dsa: don't set skb->offload_fwd_mark when not offloading the
> bridge" patch is also in use. Would setting up a bridge for sw0/sw1 not
> have implications for receiving unknown frames on one port, that have been
> sent from another port of the same switch? Since unknown frames will go to
> the CPU, dp->bridge_dev would return the bridge name, setting
> offload_fwd_mark=1 thus preventing those frames being sent back out
> sw0/sw1 to its other ports.

What you have is called "cross-chip bridging for disjoint DSA trees" and
has some level of support since this series:
http://patchwork.ozlabs.org/project/netdev/cover/20200510163743.18032-1-olteanv@gmail.com/

What you can/should do is:

create a bridge between sw0 and sw1 (say br1)
create another bridge between wanN and swNpM (say br0)

Here's the secret:

- the br1 bridge only performs hardware acceleration for forwarding
  between the 2 QCA switches. The Broadcom switch is still able to
  understand enough (aka the destination MAC) of the packets coming from
  the QCA switches, even if they are DSA tagged, in order to perform L2
  forwarding to the other port or to the CPU port. And because br0 != br1,
  what you mentioned above for skb->offload_fwd_mark does not actually
  matter - there are only 2 ports in br1, and both are part of the same
  hardware domain, so the software bridge doesn't need to forward any
  packet. As for br0, by the time the packets from swNpM reach the
  software bridge, there is no longer any indication that they were
  originally processed by the Broadcom ports as DSA masters, so there is
  no problem forwarding them in software to the other Broadcom ports
  (not DSA masters).
- the br0 bridge, in the presence of br1, doesn't have to do software
  forwarding of packets between the 2 QCA switches - br1 handles it. But
  even if br1 did not exist, it still could. How?
  nbp_switchdev_allowed_egress() will happily forward packets between
  ports with different offload_fwd_mark values. These are derived from:

  nbp_switchdev_mark_set
  -> dev_get_port_parent_id
     -> devlink_compat_switch_id_get
        -> &devlink_port->attrs.switch_id

  populated by &devlink_port->attrs.switch_id based on dst->index.

  Otherwise said, the devlink switch id is equal to the DSA switch tree
  index.

  But you already set the dsa,member properties properly (i.e. each QCA
  switch is in its own tree with a unique index):

    switch@10 {
        compatible = "qca,qca8337";
        dsa,member = <1 0>;
    };

    switch@10 {
        compatible = "qca,qca8337";
        dsa,member = <2 0>;
    };

  So there is in fact no problem.

The "net: dsa: don't set skb->offload_fwd_mark when not offloading the bridge"
patch was not accepted yet, am I right?
https://patchwork.kernel.org/project/netdevbpf/patch/20210318231829.3892920-15-olteanv@gmail.com/
Why are you using RFC patches instead of asking for them to be submitted
properly? :)

Regardless of that patch being present or not (which affects a different
use case which I cannot see how it relates to this), I think there is a bug:

if the DSA master sets skb->offload_fwd_mark = true, and then the DSA
tagger gets to process that same skb, and it wants to indicate it is a
standalone / non-offloading port, currently it will simply not set
skb->offload_fwd_mark = true. But not setting it to true is different
than always setting skb->offload_fwd_mark to true or false - it just
works because we assume that skb->offload_fwd_mark is initially false,
which is obv not true if the DSA master had already set it to true.
So if there is a provable bug caused by this, we might need a patch
which sets skb->offload_fwd_mark = false in dsa_switch_rcv(), right
before the call to cpu_dp->rcv(), in order to satisfy the taggers'
expectation that we do indeed start from a blank slate.
Does that make sense?

> >
> > If I'm not mistaken, this patch should solve your problem?
> >
> > -----------------------------[ cut here ]-----------------------------
> > diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
> > index 3ca6b394dd5f..d6655b516bd8 100644
> > --- a/drivers/net/dsa/b53/b53_common.c
> > +++ b/drivers/net/dsa/b53/b53_common.c
> > @@ -1462,6 +1462,7 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
> >  	struct b53_device *dev = ds->priv;
> >  	bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
> >  	bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
> > +	bool really_untagged = false;
> >  	struct b53_vlan *vl;
> >  	int err;
> >  
> > @@ -1474,10 +1475,10 @@ int b53_vlan_add(struct dsa_switch *ds, int port,
> >  	b53_get_vlan_entry(dev, vlan->vid, vl);
> >  
> >  	if (vlan->vid == 0 && vlan->vid == b53_default_pvid(dev))
> > -		untagged = true;
> > +		really_untagged = true;
> >  
> >  	vl->members |= BIT(port);
> > -	if (untagged && !dsa_is_cpu_port(ds, port))
> > +	if (really_untagged || (untagged && !dsa_is_cpu_port(ds, port)))
> >  		vl->untag |= BIT(port);
> >  	else
> >  		vl->untag &= ~BIT(port);
> > -----------------------------[ cut here ]-----------------------------
> >
> This does seem to sort the issue as well in this case. Thanks!

I'm sure Florian will explain some of the additional constraints around
why we might want the Broadcom switches to send untagged packets as
tagged with VID 0 towards the CPU, so the patch might suffer some
changes until submitted proper. At the very least, the same configuration
needs to work regardless of the value of CONFIG_VLAN_8021Q.
Currently, the default VLAN configuration done by b53_configure_vlan()
is overwritten by:

vlan_device_event (the "adding VLAN 0 to HW filter on device %s" message)
-> vlan_vid_add
   -> ...
      -> dsa_slave_vlan_rx_add_vid
         -> dsa_port_vlan_add
            -> ...
               -> b53_vlan_add

so my point is that it is not robust to only fix the case where this
chain of events happens, because CONFIG_VLAN_8021Q is entirely optional,
and if it is not enabled, nobody will add VID 0 to the RX filter of the
net devices, so the configuration of VID 0 on the Broadcom ports will be
left as it is done by the switch initialization code, and that code
should produce the same results.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag
  2021-06-06  9:38         ` Vladimir Oltean
@ 2021-06-07 17:01           ` Florian Fainelli
  0 siblings, 0 replies; 9+ messages in thread
From: Florian Fainelli @ 2021-06-07 17:01 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Matthew Hagan, Andrew Lunn, Vivien Didelot, David S. Miller,
	Jakub Kicinski, netdev, linux-kernel

Hello Vladimir,

On 6/6/2021 2:38 AM, Vladimir Oltean wrote:
> Hi Florian,
> 
> On Sat, Jun 05, 2021 at 08:34:06PM -0700, Florian Fainelli wrote:
>> On 6/5/2021 5:53 PM, Vladimir Oltean wrote:
>>> Hi Matthew,
>>>
>>> On Sat, Jun 05, 2021 at 11:39:24PM +0100, Matthew Hagan wrote:
>>>> On 05/06/2021 21:35, Andrew Lunn wrote:
>>>>
>>>>>> The tested case is a Meraki MX65 which features two QCA8337 switches with
>>>>>> their CPU ports attached to a BCM58625 switch ports 4 and 5 respectively.
>>>>> Hi Matthew
>>>>>
>>>>> The BCM58625 switch is also running DSA? What does you device tree
>>>>> look like? I know Florian has used two broadcom switches in cascade
>>>>> and did not have problems.
>>>>>
>>>>>     Andrew
>>>>
>>>> Hi Andrew
>>>>
>>>> I did discuss this with Florian, who recommended I submit the changes. Can
>>>> confirm the b53 DSA driver is being used. The issue here is that tagging
>>>> must occur on all ports. We can't selectively disable for ports 4 and 5
>>>> where the QCA switches are attached, thus this patch is required to get
>>>> things working.
>>>>
>>>> Setup is like this:
>>>>                        sw0p2     sw0p4            sw1p2     sw1p4 
>>>>     wan1    wan2  sw0p1  +  sw0p3  +  sw0p5  sw1p1  +  sw1p3  +  sw1p5
>>>>      +       +      +    |    +    |    +      +    |    +    |    +
>>>>      |       |      |    |    |    |    |      |    |    |    |    |
>>>>      |       |    +--+----+----+----+----+-+ +--+----+----+----+----+-+
>>>>      |       |    |         QCA8337        | |        QCA8337         |
>>>>      |       |    +------------+-----------+ +-----------+------------+
>>>>      |       |             sw0 |                     sw1 |
>>>> +----+-------+-----------------+-------------------------+------------+
>>>> |    0       1    BCM58625     4                         5            |
>>>> +----+-------+-----------------+-------------------------+------------+
>>>
>>> It is a bit unconventional for the upstream Broadcom switch, which is a
>>> DSA master of its own, to insert a VLAN ID of zero out of the blue,
>>> especially if it operates in standalone mode. Supposedly sw0 and sw1 are
>>> not under a bridge net device, are they?
>>
>> This is because of the need (or desire) to always tag the CPU port
>> regardless of the untagged VLAN that one of its downstream port is being
>> added to. Despite talking with Matthew about this before, I had not
>> realized that dsa_port_is_cpu() will return true for ports 4 and 5 when
>> a VLAN is added to one of the two QCA8337 switches because from the
>> perspective of that switch, those ports have been set as DSA_PORT_TYPE_CPU.
> 
> It will not, the ports maintain the same roles regardless of whether
> there is another switch attached to them or not. For the BCM58625
> switch, ports 4 and 5 are user ports with net devices that each happen
> to be DSA masters for 2 QCA8337 switches, and port 8 is the CPU port.

Right, I was not thinking properly while submitting that counter
proposal it does not make sense and neither did my explanation, I was
just too keen on thinking that the problem would be that one of the user
facing port (even if they happen to be the "CPU" port of another switch)
would be adding the tag on egress when the problem is that the CPU port
is egress tagged to begin with. This is not a problem for normal user
ports as you say because the network stack has all smarts about dealing
with that. I would like to get a proper tcpdump capture of the DSA
master first to make sure

> 
> When a DSA user port is a DSA master for another switch, tag stacking
> takes place - first the rcv() from tag_brcm.c runs, then the rcv() from
> tag_qca.c runs - you taught me this, in fact.
> 
> My point is that the Broadcom switch should leave the packet in a state
> where tag_qca.c can work with it without being aware that it has been
> first processed by another switch. This is why I asked Matthew whether
> he configured any bridging between BCM58625 ports 4 and 5, and any
> bridge VLANs. I am not completely sure we should start modifying our DSA
> taggers under the assumption that VLANs might just pop up everywhere -
> I simply don't see a compelling use case to let that happen and justify
> the complexity.
> 
> In this case, my suspicion is that the root of the issue is the
> resolution from commit d965a5432d4c ("net: dsa: b53: Ensure the default
> VID is untagged"). It seems like it wanted to treat VID 0 as untagged if
> it's the pvid, but it only treats it as untagged in one direction.

We only have control over the egress tagging attribute on Broadcom
switches and there is no "egress unmodified" unlike Marvell switches, so
we do indeed have an asymmetrical configuration in that the following
happens:

- user port egress untagged -> egress tagged towards CPU port
- CPU port egress untagged frame -> egress untagged towards user port

The CPU port, by virtue of using Broadcom tags can override any VLAN
membership.

> 
> For the network stack, I think there are checks scattered in
> __netif_receive_skb_core that make it treat a skb with VID == 0 as if it
> was untagged, so the fact that untagged packets are sent as egress-tagged
> with VID=0 by the Broadcom CPU port (8) towards the system, and received
> as VLAN-tagged by tag_brcm.c, is not that big of a problem. The problem
> only appears when there is another DSA switch downstream of it, because
> it shifts the expected position of the DSA tag in tag_qca.c.
> 
> DSA switch drivers don't normally send all packets as egress-tagged
> towards the CPU. If they do, they ought to be more careful and not let
> VLAN tags escape their tagging driver, if there was no VLAN tag to begin
> with in the packet as seen on the wire.
> 
> We might make a justifiable exception in the case where DSA_TAG_PROTO_NONE
> is used, but in this case, my understanding is that BCM58625 uses
> DSA_TAG_PROTO_BRCM_PREPEND, so I'm not sure why sending packets towards
> the CPU with VID=0 instead of untagged makes that big of a difference.

I don't think there is any justification for doing what b53 does
anymore. Let me sleep on it for a day and submit a patch for Matthew to
try out.
-- 
Florian

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-06-07 17:02 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-05 19:37 [RFC PATCH net-next] net: dsa: tag_qca: Check for upstream VLAN tag Matthew Hagan
2021-06-05 20:35 ` Andrew Lunn
2021-06-05 22:39   ` Matthew Hagan
2021-06-06  0:53     ` Vladimir Oltean
2021-06-06  3:34       ` Florian Fainelli
2021-06-06  9:38         ` Vladimir Oltean
2021-06-07 17:01           ` Florian Fainelli
2021-06-06 13:09       ` Matthew Hagan
2021-06-06 19:27         ` Vladimir Oltean

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.