dsa/mv88e6xxx: leaking packets on MV88E6341 switch

* dsa/mv88e6xxx: leaking packets on MV88E6341 switch
@ 2020-09-30 10:09 Peter Vollmer
  2020-09-30 10:28 ` Vladimir Oltean
  2020-09-30 19:19 ` Andrew Lunn
From: Peter Vollmer @ 2020-09-30 10:09 UTC
  To: Network Development

Hi all,
I am currently investigating a packet leak on a platform consisting of
an Armada 37xx SoC, an MV88E6341 switch (via SGMII) and an MV88E1512
PHY (via RGMII). We are using the mainline 5.4.y kernel.

The switch and phy setup is defined in the flat device tree as follows:

&eth0 {
        phy-mode = "rgmii-id";
        phy = <&ethphy0>;
        status = "okay";
};

&eth1 {
        phy-mode = "sgmii";
        status = "okay";

        fixed-link {
                speed = <2500>;
                full-duplex;
        };
};

&mdio {
        reset-gpios = <&gpiosb 0 GPIO_ACTIVE_LOW>;
        reset-delay-us = <2>;

        ethphy0: ethernet-phy@0 {
                reg = <0x0>;
                status = "okay";
        };

        switch0: switch0@1 {
                compatible = "marvell,mv88e6085";
                #address-cells = <1>;
                #size-cells = <0>;
                reg = <1>;
                cpu-port = <5>;
                dsa,member = <0 0>;
                status = "okay";

                ports {
                        #address-cells = <0x1>;
                        #size-cells = <0x0>;

                        port@1 {
                                reg = <1>;
                                label = "lan0";
                                phy-handle = <&switch0phy1>;
                        };
                        port@2 {
                                reg = <2>;
                                label = "lan1";
                                phy-handle = <&switch0phy2>;
                        };

                        port@3 {
                                reg = <3>;
                                label = "lan2";
                                phy-handle = <&switch0phy3>;
                        };

                        port@4 {
                                reg = <4>;
                                label = "lan3";
                                phy-handle = <&switch0phy4>;
                        };

                        port@5 {
                                reg = <5>;
                                label = "cpu";
                                ethernet = <&eth1>;
                        };
                };

                mdio {
                        #address-cells = <1>;
                        #size-cells = <0>;

                        switch0phy1: switch0phy0@11 {
                                reg = <0x11>;
                        };
                        switch0phy2: switch0phy1@12 {
                                reg = <0x12>;
                        };
                        switch0phy3: switch0phy2@13 {
                                reg = <0x13>;
                        };
                        switch0phy4: switch0phy2@14 {
                                reg = <0x14>;
                        };
                };
        };
};

lan0..lan3 are members of the br0 bridge interface.

The problem is that for an ICMP ping lan0->eth0, the ICMP echo request
packets are leaking (i.e. flooded) to all other ports lan1..lan3,
while the ping reply eth0->lan0 arrives correctly at lan0 without any
leaked packets on lan1..lan3.
The problem temporarily goes away for ~280 seconds after I toggle the
multicast flag of the bridge interface (ifconfig br0 [-]multicast).
We also noticed asymmetric maximum network throughput: UDP traffic
lan0->eth0 is much slower than in the direction eth0->lan0.

My assumption is that in our case the source MAC address of the bridge
(or eth1) interface is not correctly learned by the switch, so it
floods the packets in the reverse direction to all ports (CPU port 5
and the other lan ports). It seems that the DSA packets ingressing on
CPU port 5 (eth0->lan0) are sent as DSA MGMT frames, and those do not
seem to be used for address learning.
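
To make the assumption above concrete, here is a minimal toy model in
Python. It is only a sketch of a generic learning switch, not the
mv88e6xxx driver or the real ATU; the class name, the MAC labels "H"
(host on lan0) and "R" (the address behind the CPU port) and the
`learn` flag are made up for illustration. It assumes that frames
injected on CPU port 5 skip source address learning.

```python
class LearningSwitch:
    """Toy learning switch: a MAC -> port table plus unknown-unicast flooding."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.fdb = {}  # forwarding database: MAC address -> port number

    def forward(self, in_port, src, dst, learn=True):
        """Return the sorted list of egress ports for one frame."""
        if learn:
            # Normal frames teach the switch where the source MAC lives.
            self.fdb[src] = in_port
        out = self.fdb.get(dst)
        if out is None:
            # Unknown destination: flood to every port except the ingress port.
            return sorted(self.ports - {in_port})
        # Known destination: forward to that port only (drop if same port).
        return [] if out == in_port else [out]


sw = LearningSwitch(range(1, 6))  # ports 1..4 = lan0..lan3, port 5 = CPU

# Request lan0 -> CPU: "R" is unknown, so the frame is flooded.
print(sw.forward(1, "H", "R"))               # [2, 3, 4, 5]
# Reply injected on the CPU port without learning: "H" is known.
print(sw.forward(5, "R", "H", learn=False))  # [1]
# The next request is flooded again because "R" was never learned.
print(sw.forward(1, "H", "R"))               # [2, 3, 4, 5]
```

If the reply were allowed to update the table (learn=True), the next
request in this model would go to port 5 only, which is the behaviour
I would have expected from the switch.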

Is this a known effect for this kind of setup, and is there something
we can do about it?

What would be the best way to debug this? Is there a way to dump the
ATU MAC tables to see what's going on with the address learning?

Many thanks and best regards

Peter
