* warnings from MTU setting on switch ports @ 2020-11-30 14:30 Rasmus Villemoes 2020-11-30 16:04 ` Vladimir Oltean 0 siblings, 1 reply; 15+ messages in thread From: Rasmus Villemoes @ 2020-11-30 14:30 UTC (permalink / raw) To: Network Development; +Cc: Vladimir Oltean, Florian Fainelli Hi, Updating our mpc8309 board to 5.9, we're starting to get [ 0.709832] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 0 [ 0.720721] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 1 [ 0.731002] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 2 [ 0.741333] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 3 [ 0.752220] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 4 [ 0.764231] eth1: mtu greater than device maximum [ 0.769022] ucc_geth e0102000.ethernet eth1: error -22 setting MTU to include DSA overhead So it does say "nonfatal", but do we have to live with those warnings on every boot going forward, or is there something that we could do to silence it? It's a mv88e6250 switch with cpu port connected to a ucc_geth interface; the ucc_geth driver indeed does not implement ndo_change_mtu and has ->max_mtu set to the default of 1500. Rasmus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: warnings from MTU setting on switch ports 2020-11-30 14:30 warnings from MTU setting on switch ports Rasmus Villemoes @ 2020-11-30 16:04 ` Vladimir Oltean 2020-11-30 22:13 ` Rasmus Villemoes 0 siblings, 1 reply; 15+ messages in thread From: Vladimir Oltean @ 2020-11-30 16:04 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: Network Development, Florian Fainelli Hi Rasmus, On Mon, Nov 30, 2020 at 03:30:50PM +0100, Rasmus Villemoes wrote: > Hi, > > Updating our mpc8309 board to 5.9, we're starting to get > > [ 0.709832] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 0 > [ 0.720721] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 1 > [ 0.731002] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 2 > [ 0.741333] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 3 > [ 0.752220] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 4 > [ 0.764231] eth1: mtu greater than device maximum > [ 0.769022] ucc_geth e0102000.ethernet eth1: error -22 setting MTU to include DSA overhead > > So it does say "nonfatal", but do we have to live with those warnings on > every boot going forward, or is there something that we could do to > silence it? > > It's a mv88e6250 switch with cpu port connected to a ucc_geth interface; > the ucc_geth driver indeed does not implement ndo_change_mtu and has > ->max_mtu set to the default of 1500. To suppress the warning: commit 4349abdb409b04a5ed4ca4d2c1df7ef0cc16f6bd Author: Vladimir Oltean <olteanv@gmail.com> Date: Tue Sep 8 02:25:56 2020 +0300 net: dsa: don't print non-fatal MTU error if not supported Commit 72579e14a1d3 ("net: dsa: don't fail to probe if we couldn't set the MTU") changed, for some reason, the "err && err != -EOPNOTSUPP" check into a simple "err". This causes the MTU warning to be printed even for drivers that don't have the MTU operations implemented. Fix that. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> But you might also want to look into adding .ndo_change_mtu for ucc_geth. If you are able to pass MTU-sized traffic through your mv88e6085, then it is probably the case that the mpc8309 already supports larger packets than 1500 bytes, and it is simply a matter of letting the stack know about that. The warning is there to give people a clue for the reason why MTU-sized traffic might not work over DSA. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: warnings from MTU setting on switch ports 2020-11-30 16:04 ` Vladimir Oltean @ 2020-11-30 22:13 ` Rasmus Villemoes 2020-11-30 22:35 ` Vladimir Oltean 2020-12-05 14:49 ` vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) Rasmus Villemoes 0 siblings, 2 replies; 15+ messages in thread From: Rasmus Villemoes @ 2020-11-30 22:13 UTC (permalink / raw) To: Vladimir Oltean; +Cc: Network Development, Florian Fainelli On 30/11/2020 17.04, Vladimir Oltean wrote: > Hi Rasmus, > > On Mon, Nov 30, 2020 at 03:30:50PM +0100, Rasmus Villemoes wrote: >> Hi, >> >> Updating our mpc8309 board to 5.9, we're starting to get >> >> [ 0.709832] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 0 >> [ 0.720721] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 1 >> [ 0.731002] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 2 >> [ 0.741333] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 3 >> [ 0.752220] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 4 >> [ 0.764231] eth1: mtu greater than device maximum >> [ 0.769022] ucc_geth e0102000.ethernet eth1: error -22 setting MTU to include DSA overhead >> >> So it does say "nonfatal", but do we have to live with those warnings on >> every boot going forward, or is there something that we could do to >> silence it? >> >> It's a mv88e6250 switch with cpu port connected to a ucc_geth interface; >> the ucc_geth driver indeed does not implement ndo_change_mtu and has >> ->max_mtu set to the default of 1500. > > To suppress the warning: > > commit 4349abdb409b04a5ed4ca4d2c1df7ef0cc16f6bd > Author: Vladimir Oltean <olteanv@gmail.com> > Date: Tue Sep 8 02:25:56 2020 +0300 > > net: dsa: don't print non-fatal MTU error if not supported > Thanks, but I don't think that will change anything. -34 is -ERANGE. > But you might also want to look into adding .ndo_change_mtu for > ucc_geth. Well, that was what I first did, but I'm incapable of making sense of the QE reference manual. Perhaps, given the domain of your email address, you could poke someone that might know what would need to be done? In any case, something else seems to be broken with 5.9; no network traffic seems to be working (but the bridge creation etc. seems to work just the same, link status works, and "ip link show" shows the same things as in 4.19). So until I figure that out I can't play around with modifying ucc_geth. If you are able to pass MTU-sized traffic through your > mv88e6085, then it is probably the case that the mpc8309 already > supports larger packets than 1500 bytes, and it is simply a matter of > letting the stack know about that. Perhaps, but I don't know how I should test that given that 1500 give-or-take is hardcoded. FWIW, on a 4.19 kernel, I can do 'ping -s X -M do' for X up to 1472 for IPv4 and 1452 for IPv6, but I don't think that tells me much about what the hardware could do. A thought: Shouldn't the initialization of slave_dev->max_mtu in dsa_slave_create() be capped by master->max_mtu minus tag overhead? Rasmus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: warnings from MTU setting on switch ports 2020-11-30 22:13 ` Rasmus Villemoes @ 2020-11-30 22:35 ` Vladimir Oltean 2020-11-30 22:50 ` Rasmus Villemoes 2020-11-30 23:11 ` Andrew Lunn 2020-12-05 14:49 ` vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) Rasmus Villemoes 1 sibling, 2 replies; 15+ messages in thread From: Vladimir Oltean @ 2020-11-30 22:35 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: Network Development, Florian Fainelli, Andrew Lunn On Mon, Nov 30, 2020 at 11:13:39PM +0100, Rasmus Villemoes wrote: > On 30/11/2020 17.04, Vladimir Oltean wrote: > > Hi Rasmus, > > > > On Mon, Nov 30, 2020 at 03:30:50PM +0100, Rasmus Villemoes wrote: > >> Hi, > >> > >> Updating our mpc8309 board to 5.9, we're starting to get > >> > >> [ 0.709832] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 0 > >> [ 0.720721] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 1 > >> [ 0.731002] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 2 > >> [ 0.741333] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 3 > >> [ 0.752220] mv88e6085 mdio@e0102120:10: nonfatal error -34 setting MTU on port 4 > >> [ 0.764231] eth1: mtu greater than device maximum > >> [ 0.769022] ucc_geth e0102000.ethernet eth1: error -22 setting MTU to include DSA overhead > >> > >> So it does say "nonfatal", but do we have to live with those warnings on > >> every boot going forward, or is there something that we could do to > >> silence it? > >> > >> It's a mv88e6250 switch with cpu port connected to a ucc_geth interface; > >> the ucc_geth driver indeed does not implement ndo_change_mtu and has > >> ->max_mtu set to the default of 1500. > > > > To suppress the warning: > > > > commit 4349abdb409b04a5ed4ca4d2c1df7ef0cc16f6bd > > Author: Vladimir Oltean <olteanv@gmail.com> > > Date: Tue Sep 8 02:25:56 2020 +0300 > > > > net: dsa: don't print non-fatal MTU error if not supported > > > > Thanks, but I don't think that will change anything. -34 is -ERANGE. Yup, I thought of that after responding. > > But you might also want to look into adding .ndo_change_mtu for > > ucc_geth. > > Well, that was what I first did, but I'm incapable of making sense of > the QE reference manual. Perhaps, given the domain of your email > address, you could poke someone that might know what would need to be done? Well, you could poke Leo Li, the QUICC engine maintainer, yourself too. What exactly from the QEIWRM.pdf are you incapable of making sense of? The MFLR register appears to be at offset 0x4c within the Rx Global Parameter RAM. > > If you are able to pass MTU-sized traffic through your > > mv88e6085, then it is probably the case that the mpc8309 already > > supports larger packets than 1500 bytes, and it is simply a matter of > > letting the stack know about that. > > Perhaps, but I don't know how I should test that given that 1500 > give-or-take is hardcoded. Very simply, you could run iperf3 or any other TCP stream (ssh, whatever). That will eventually increase the TCP segment size to the MTU. If the TCP stream doesn't hang, then the QUICC engine can receive packets over the MTU. > FWIW, on a 4.19 kernel, I can do 'ping -s X -M do' for X up to 1472 > for IPv4 and 1452 for IPv6, but I don't think that tells me much about > what the hardware could do. IP will get fragmented by the stack to the interface's MTU. > A thought: Shouldn't the initialization of slave_dev->max_mtu in > dsa_slave_create() be capped by master->max_mtu minus tag overhead? Talk to Andrew: https://www.spinics.net/lists/netdev/msg645810.html ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: warnings from MTU setting on switch ports 2020-11-30 22:35 ` Vladimir Oltean @ 2020-11-30 22:50 ` Rasmus Villemoes 2020-11-30 23:11 ` Andrew Lunn 1 sibling, 0 replies; 15+ messages in thread From: Rasmus Villemoes @ 2020-11-30 22:50 UTC (permalink / raw) To: Vladimir Oltean; +Cc: Network Development, Florian Fainelli, Andrew Lunn On 30/11/2020 23.35, Vladimir Oltean wrote: > On Mon, Nov 30, 2020 at 11:13:39PM +0100, Rasmus Villemoes wrote: >> FWIW, on a 4.19 kernel, I can do 'ping -s X -M do' for X up to 1472 >> for IPv4 and 1452 for IPv6, but I don't think that tells me much about >> what the hardware could do. > > IP will get fragmented by the stack to the interface's MTU. Nope, that's what the '-M do' is for: # ping -M do -s 1880 169.254.246.13 PING 169.254.246.13 (169.254.246.13) 1880(1908) bytes of data. ping: local error: message too long, mtu=1500 Rasmus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: warnings from MTU setting on switch ports 2020-11-30 22:35 ` Vladimir Oltean 2020-11-30 22:50 ` Rasmus Villemoes @ 2020-11-30 23:11 ` Andrew Lunn 1 sibling, 0 replies; 15+ messages in thread From: Andrew Lunn @ 2020-11-30 23:11 UTC (permalink / raw) To: Vladimir Oltean; +Cc: Rasmus Villemoes, Network Development, Florian Fainelli > > A thought: Shouldn't the initialization of slave_dev->max_mtu in > > dsa_slave_create() be capped by master->max_mtu minus tag overhead? > > Talk to Andrew: > https://www.spinics.net/lists/netdev/msg645810.html Yes, this is historic. DSA started life with Marvell switches connected to Marvell MACs. And Marvell MACs always allowed frames bigger than the MTU to be sent/received. A few more MACs were paired with Marvell switches, and they also happened to allow bigger frames than the MTU to be used. So it all worked fine until a MAC/Switch pair came along where the MAC did not pass frames bigger than the MTU. We could not break backwards compatibility, so decided to ask the MAC to up its MTU, but not error out if it failed. It was like that for a long time, until Vladimir added jumbo support. Then you do need to know if the MAC supports bigger MTU, so you can disallow jumbo. So this warning got added. Andrew ^ permalink raw reply [flat|nested] 15+ messages in thread
* vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) 2020-11-30 22:13 ` Rasmus Villemoes 2020-11-30 22:35 ` Vladimir Oltean @ 2020-12-05 14:49 ` Rasmus Villemoes 2020-12-05 15:45 ` Russell King - ARM Linux admin 2020-12-05 19:03 ` vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) Vladimir Oltean 1 sibling, 2 replies; 15+ messages in thread From: Rasmus Villemoes @ 2020-12-05 14:49 UTC (permalink / raw) To: Vladimir Oltean Cc: Network Development, Florian Fainelli, Russell King - ARM Linux On 30/11/2020 23.13, Rasmus Villemoes wrote: > On 30/11/2020 17.04, Vladimir Oltean wrote: > Thanks, but I don't think that will change anything. -34 is -ERANGE. >> But you might also want to look into adding .ndo_change_mtu for >> ucc_geth. > > Well, that was what I first did, but I'm incapable of making sense of > the QE reference manual. Perhaps, given the domain of your email > address, you could poke someone that might know what would need to be done? > > In any case, something else seems to be broken with 5.9; no network > traffic seems to be working (but the bridge creation etc. seems to work > just the same, link status works, and "ip link show" shows the same > things as in 4.19). So until I figure that out I can't play around with > modifying ucc_geth. So, I found out that the problem disappers when I disable vlan_filtering, and googling then led me to Russell's patches from around March (https://patchwork.ozlabs.org/project/netdev/cover/20200218114515.GL18808@shell.armlinux.org.uk/). But, unlike from what I gather from Russell's description, the problem is there whether or not the bridge is created with vlan_filtering enabled from the outset or not. Also, cherry-picking 517648c8d1 to 5.9.12 doesn't help. The problem also exists on 5.4.80, and (somewhat naively) backporting 54a0ed0df496 as well as 517648c8d1 doesn't change that. So I'm out of ideas. I also tried booting a 5.3, but that had some horrible UBI/nand failure, and since the mv88e6250 support only landed in 5.3, it's rather difficult to try kernels further back. Does anyone know what might have happened between 4.19 and 5.4 that caused this change in behaviour for Marvell switches? Rasmus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) 2020-12-05 14:49 ` vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) Rasmus Villemoes @ 2020-12-05 15:45 ` Russell King - ARM Linux admin 2020-12-05 16:02 ` vlan_filtering=1 breaks all traffic Rasmus Villemoes 2020-12-05 19:03 ` vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) Vladimir Oltean 1 sibling, 1 reply; 15+ messages in thread From: Russell King - ARM Linux admin @ 2020-12-05 15:45 UTC (permalink / raw) To: Rasmus Villemoes; +Cc: Vladimir Oltean, Network Development, Florian Fainelli On Sat, Dec 05, 2020 at 03:49:02PM +0100, Rasmus Villemoes wrote: > So, I found out that the problem disappers when I disable > vlan_filtering, and googling then led me to Russell's patches from > around March > (https://patchwork.ozlabs.org/project/netdev/cover/20200218114515.GL18808@shell.armlinux.org.uk/). > > But, unlike from what I gather from Russell's description, the problem > is there whether or not the bridge is created with vlan_filtering > enabled from the outset or not. No. My problem is where the bridge is created _without_ vlan_filtering enabled, and is subsequently enabled. That caused traffic to stop as soon as vlan_filtering was enabled. Note that if the bridge were created with vlan_filtering enabled from the start, there would be no problem. > Also, cherry-picking 517648c8d1 to > 5.9.12 doesn't help. The problem also exists on 5.4.80, and (somewhat > naively) backporting 54a0ed0df496 as well as 517648c8d1 doesn't change that. I'm not sure what 517648c8d1 is - this isn't a mainline or net-next commit ID. 54a0ed0df496 is a modification by Vladimir Oltean of my original patch ("net: dsa: mv88e6xxx: fix vlan setup") that fixed it for Marvell DSA, dropping the Marvell DSA bits of the patch, thereby neutering it for its intended purpose, while still mostly maintaining the commit description which implies that it fixes Marvell DSA. I'm afraid that I've now given up dealing with anything that involves interaction with the impossible Vladimir Oltean - the problems you are experiencing are a direct result of Vladimir Oltean's attitudes towards fellow kernel developers. You will, however, find that the problem was subsequently fixed by 1fb74191988f on top of 54a0ed0df496, which adds the part that Vladimir Oltean failed to carry forward. It's all an unnecessary horrible mess. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: vlan_filtering=1 breaks all traffic 2020-12-05 15:45 ` Russell King - ARM Linux admin @ 2020-12-05 16:02 ` Rasmus Villemoes 0 siblings, 0 replies; 15+ messages in thread From: Rasmus Villemoes @ 2020-12-05 16:02 UTC (permalink / raw) To: Russell King - ARM Linux admin Cc: Vladimir Oltean, Network Development, Florian Fainelli On 05/12/2020 16.45, Russell King - ARM Linux admin wrote: > On Sat, Dec 05, 2020 at 03:49:02PM +0100, Rasmus Villemoes wrote: >> So, I found out that the problem disappers when I disable >> vlan_filtering, and googling then led me to Russell's patches from >> around March >> (https://patchwork.ozlabs.org/project/netdev/cover/20200218114515.GL18808@shell.armlinux.org.uk/). >> >> But, unlike from what I gather from Russell's description, the problem >> is there whether or not the bridge is created with vlan_filtering >> enabled from the outset or not. > > No. My problem is where the bridge is created _without_ vlan_filtering > enabled, and is subsequently enabled. That caused traffic to stop as > soon as vlan_filtering was enabled. > > Note that if the bridge were created with vlan_filtering enabled from > the start, there would be no problem. That was what I was trying to say: I see the problem whether or not vlan_filtering is enabled from the beginning (hence the "unlike..."). So since disabling vlan_filtering (or never enabling it in the first place) makes traffic flow, the symptoms look similar to the problem you described (and that was what led me to those patches). >> Also, cherry-picking 517648c8d1 to >> 5.9.12 doesn't help. The problem also exists on 5.4.80, and (somewhat >> naively) backporting 54a0ed0df496 as well as 517648c8d1 doesn't change that. > > I'm not sure what 517648c8d1 is - this isn't a mainline or net-next > commit ID. Sorry, I mistakenly took the sha1 from the 5.4 branch I cherry-picked it to - I did indeed mean your subsequent 1fb74191988f setting the flag for mv88e6xxxx. > You will, however, find that the problem was subsequently fixed by > 1fb74191988f on top of 54a0ed0df496, Unfortunately not. So either it's simply some completely different underlying issue, or there's something else going on in my setup that prevents the configure_vlan_while_not_filtering patches from fixing it. Thanks, Rasmus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) 2020-12-05 14:49 ` vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) Rasmus Villemoes 2020-12-05 15:45 ` Russell King - ARM Linux admin @ 2020-12-05 19:03 ` Vladimir Oltean 2020-12-05 20:13 ` vlan_filtering=1 breaks all traffic Rasmus Villemoes 1 sibling, 1 reply; 15+ messages in thread From: Vladimir Oltean @ 2020-12-05 19:03 UTC (permalink / raw) To: Rasmus Villemoes Cc: Network Development, Florian Fainelli, Russell King - ARM Linux Hi Rasmus, On Sat, Dec 05, 2020 at 03:49:02PM +0100, Rasmus Villemoes wrote: > So I'm out of ideas. I also tried booting a 5.3, but that had some > horrible UBI/nand failure, Test with a ramdisk maybe? > and since the mv88e6250 support only landed > in 5.3, it's rather difficult to try kernels further back. > Does anyone know what might have happened between 4.19 and 5.4 that > caused this change in behaviour for Marvell switches? I think the most important question is: what commands are you running, and how are you determining that "it doesn't work"? What type of traffic are you sending, and how are you receiving it? I don't own a 6250 switch, but the 6390 and 6190. On my switches, both termination and forwarding work fine with VLAN filtering enabled - tested just now. This is with the official v5.9 tag: commit bbf5c979011a099af5dc76498918ed7df445635b (HEAD, tag: v5.9) Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Sun Oct 11 14:15:50 2020 -0700 Linux 5.9 It's interesting that it is you who added VLAN support for the 6250 in commit bec8e5725281 ("net: dsa: mv88e6xxx: implement vtu_getnext and vtu_loadpurge for mv88e6250") which appeared around the v5.3 timeframe. When you tested it then, did you apply the same test as you did now? It is very confusing that you mention v4.19, since of course, it is the v5.3 tag where the 6250 support for VLANs has appeared, just as you said. As far as I can gather from your email, the only kernel where your testing passes is a kernel that nobody else can test, am I right? So it either is a problem specific to the 6250, or a problem that I did not catch with the trivial setup I had below: # uname -a Linux buildroot 5.9.0-mox #1526 SMP PREEMPT Sat Dec 5 20:53:11 EET 2020 aarch64 GNU/Linux # ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up # ip link set lan4 master br0 [ 56.393743] br0: port 1(lan4) entered blocking state [ 56.396614] br0: port 1(lan4) entered disabled state [ 56.408327] device lan4 entered promiscuous mode [ 56.410255] device eth1 entered promiscuous mode [ 57.155280] br0: port 1(lan4) entered blocking state [ 57.157639] br0: port 1(lan4) entered forwarding state # # ip addr add 192.168.100.2/24 dev br0 # ping 192.168.100.1 [...] --- 192.168.100.1 ping statistics --- 18 packets transmitted, 18 received, 0% packet loss, time 17034ms rtt min/avg/max/mdev = 1.389/1.643/2.578/0.266 ms # bridge vlan del dev lan4 vid 1 # bridge vlan add dev lan4 vid 100 pvid untagged && bridge vlan add dev br0 vid 100 pvid untagged self # ping 192.168.100.1 [...] ^C--- 192.168.100.1 ping statistics --- 3 packets transmitted, 3 received, 0% packet loss, time 2005ms rtt min/avg/max/mdev = 0.993/1.358/1.633/0.269 ms ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: vlan_filtering=1 breaks all traffic 2020-12-05 19:03 ` vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) Vladimir Oltean @ 2020-12-05 20:13 ` Rasmus Villemoes 2020-12-06 19:45 ` Vladimir Oltean 0 siblings, 1 reply; 15+ messages in thread From: Rasmus Villemoes @ 2020-12-05 20:13 UTC (permalink / raw) To: Vladimir Oltean Cc: Network Development, Florian Fainelli, Russell King - ARM Linux On 05/12/2020 20.03, Vladimir Oltean wrote: > Hi Rasmus, > > On Sat, Dec 05, 2020 at 03:49:02PM +0100, Rasmus Villemoes wrote: >> So I'm out of ideas. I also tried booting a 5.3, but that had some >> horrible UBI/nand failure, > > Test with a ramdisk maybe? > >> and since the mv88e6250 support only landed >> in 5.3, it's rather difficult to try kernels further back. >> Does anyone know what might have happened between 4.19 and 5.4 that >> caused this change in behaviour for Marvell switches? > > I think the most important question is: what commands are you running, > and how are you determining that "it doesn't work"? What type of traffic > are you sending, and how are you receiving it? > Ping, ssh, you-name-it. Testing with both IPv4 and IPv6LL. > I don't own a 6250 switch, but the 6390 and 6190. On my switches, both > termination and forwarding work fine with VLAN filtering enabled - > tested just now. This is with the official v5.9 tag: > > commit bbf5c979011a099af5dc76498918ed7df445635b (HEAD, tag: v5.9) > Author: Linus Torvalds <torvalds@linux-foundation.org> > Date: Sun Oct 11 14:15:50 2020 -0700 > > Linux 5.9 > > It's interesting that it is you who added VLAN support for the 6250 Not just VLAN, all of 6250 was added by me around that time. in > commit bec8e5725281 ("net: dsa: mv88e6xxx: implement vtu_getnext and > vtu_loadpurge for mv88e6250") which appeared around the v5.3 timeframe. > When you tested it then, did you apply the same test as you did now? > > It is very confusing that you mention v4.19, since of course, it is the > v5.3 tag where the 6250 support for VLANs has appeared, just as you said. We're running with the set of 6250 patches backported to 4.19, and the testing was/is done on that. I'd love to be able to run Linus' master branch at any time, and that's actually what I'm currently in the process of getting closer to (apart from the switch issue, 5.9 seems to work just fine for us with just about five out-of-tree patches, none network-related). > As far as I can gather from your email, the only kernel where your > testing passes is a kernel that nobody else can test, am I right? In a sense, yes, but nobody else has the hardware that I have either. > So it either is a problem specific to the 6250, That is certainly a possibility. or a problem that I did > not catch with the trivial setup I had below: > > # uname -a > Linux buildroot 5.9.0-mox #1526 SMP PREEMPT Sat Dec 5 20:53:11 EET 2020 aarch64 GNU/Linux > # ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up > # ip link set lan4 master br0 > [ 56.393743] br0: port 1(lan4) entered blocking state > [ 56.396614] br0: port 1(lan4) entered disabled state > [ 56.408327] device lan4 entered promiscuous mode > [ 56.410255] device eth1 entered promiscuous mode > [ 57.155280] br0: port 1(lan4) entered blocking state > [ 57.157639] br0: port 1(lan4) entered forwarding state > # > # ip addr add 192.168.100.2/24 dev br0 > # ping 192.168.100.1 > [...] > --- 192.168.100.1 ping statistics --- > 18 packets transmitted, 18 received, 0% packet loss, time 17034ms > rtt min/avg/max/mdev = 1.389/1.643/2.578/0.266 ms Yup, that corresponds pretty much to what I do. Just for good measure, I tried doing exactly the above (with only a change in IP address), and... it worked. So, first thought was "perhaps it's because you bring up br0 before adding the ports". But no, bringing it up after still works. Second thought: "portS - hm, only one port is added here", and indeed, once I add two or more ports to the bridge, it stops working. Removing all but the single port that has a cable plugged in makes it work again. It doesn't seem to matter whether the other ports are up or down. I should probably mention that wireshark says that ARP (ipv4) and neighbor solicitation (ipv6ll) packets do reach my laptop when I attempt the ping. If I start by doing a succesful ping (i.e., no other ports added), then add another port, then do a ping, the ping packets do reach the laptop (and of course get answered). So the problem does not appear to be on egress. Rasmus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: vlan_filtering=1 breaks all traffic 2020-12-05 20:13 ` vlan_filtering=1 breaks all traffic Rasmus Villemoes @ 2020-12-06 19:45 ` Vladimir Oltean 2020-12-07 19:43 ` Rasmus Villemoes 0 siblings, 1 reply; 15+ messages in thread From: Vladimir Oltean @ 2020-12-06 19:45 UTC (permalink / raw) To: Rasmus Villemoes Cc: Network Development, Florian Fainelli, Russell King - ARM Linux On Sat, Dec 05, 2020 at 09:13:37PM +0100, Rasmus Villemoes wrote: > Yup, that corresponds pretty much to what I do. Just for good measure, I > tried doing exactly the above (with only a change in IP address), and... > it worked. So, first thought was "perhaps it's because you bring up br0 > before adding the ports". But no, bringing it up after still works. > Second thought: "portS - hm, only one port is added here", and indeed, > once I add two or more ports to the bridge, it stops working. Removing > all but the single port that has a cable plugged in makes it work again. > It doesn't seem to matter whether the other ports are up or down. > > I should probably mention that wireshark says that ARP (ipv4) and > neighbor solicitation (ipv6ll) packets do reach my laptop when I attempt > the ping. If I start by doing a succesful ping (i.e., no other ports > added), then add another port, then do a ping, the ping packets do reach > the laptop (and of course get answered). So the problem does not appear > to be on egress. It would be interesting to see what is the ingress drop reason, if that could be deduced from the drop counters that are incrementing in ethtool -S. I am not confident that it can be a VTU issue, given the fact that you have not said that you see VTU violation warnings, which are fairly loud on mv88e6xxx. The procedure of joining a new port to the bridge does alter the VTU, since that second port needs to be a part of the default_pvid of the bridge as soon as it goes up (VID 1). However that is not the main thing that bridging a new port does - instead, the in-chip Port VLAN map is altered. In theory it _would_ be possible (even if unlikely) for the VTU to get overwritten by the second port join, in a way that removes the first port from the bridge's VID 1. Remember that this issue only seems to be observable on 8250, so it seems logical to search in 8250 specific code first (therefore making the VTU a more likely suspect than the in-chip Port VLAN map). Since you've already made the effort to boot kernel 5.9, you could make the extra leap to try out the 5.10 rc's and look at the VTU using Andrew's devlink based tool: https://github.com/lunn/mv88e6xxx_dump # devlink dev mdio_bus/d0032004.mdio-mii:11 mdio_bus/d0032004.mdio-mii:12 mdio_bus/d0032004.mdio-mii:10 # ./mv88e6xxx_dump --device mdio_bus/d0032004.mdio-mii:10 --vtu VTU: V - a member, egress not modified U - a member, egress untagged T - a member, egress tagged X - not a member, Ingress frames with VID discarded P VID 0123456789a FID SID QPrio FPrio VidPolicy # ip link add br0 type bridge vlan_filtering 1 # ip link set lan4 master br0 [ 74.443547] br0: port 1(lan4) entered blocking state [ 74.446037] br0: port 1(lan4) entered disabled state [ 74.461416] device lan4 entered promiscuous mode [ 74.463564] device eth1 entered promiscuous mode # ./mv88e6xxx_dump --device mdio_bus/d0032004.mdio-mii:10 --vtu VTU: V - a member, egress not modified U - a member, egress untagged T - a member, egress tagged X - not a member, Ingress frames with VID discarded P VID 0123456789a FID SID QPrio FPrio VidPolicy 0 1 XXXXUXXXXVV 1 0 - - 0 # ip link set lan5 master br0 [ 84.533120] br0: port 2(lan5) entered blocking state [ 84.535563] br0: port 2(lan5) entered disabled state [ 84.552022] device lan5 entered promiscuous mode # # ./mv88e6xxx_dump --device mdio_bus/d0032004.mdio-mii:10 --vtu VTU: V - a member, egress not modified U - a member, egress untagged T - a member, egress tagged X - not a member, Ingress frames with VID discarded P VID 0123456789a FID SID QPrio FPrio VidPolicy 0 1 XXXXUUXXXVV 1 0 - - 0 You would expect to see two U's for your ports, and not something else like X. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: vlan_filtering=1 breaks all traffic 2020-12-06 19:45 ` Vladimir Oltean @ 2020-12-07 19:43 ` Rasmus Villemoes 2020-12-07 20:15 ` Vladimir Oltean 0 siblings, 1 reply; 15+ messages in thread From: Rasmus Villemoes @ 2020-12-07 19:43 UTC (permalink / raw) To: Vladimir Oltean Cc: Network Development, Florian Fainelli, Russell King - ARM Linux On 06/12/2020 20.45, Vladimir Oltean wrote: > It would be interesting to see what is the ingress drop reason, if that > could be deduced from the drop counters that are incrementing in ethtool -S. I don't see anything obvious from running ethtool -S before/during/after ping. > Since you've already made the effort to boot kernel 5.9, you could make > the extra leap to try out the 5.10 rc's and look at the VTU using > Andrew's devlink based tool: > https://github.com/lunn/mv88e6xxx_dump > > # devlink dev > mdio_bus/d0032004.mdio-mii:11 > mdio_bus/d0032004.mdio-mii:12 > mdio_bus/d0032004.mdio-mii:10 > # ./mv88e6xxx_dump --device mdio_bus/d0032004.mdio-mii:10 --vtu > VTU: Thanks for the hint. Unfortunately: # uname -a Linux (none) 5.10.0-rc7-00035-g66d777e1729d #194 Mon Dec 7 16:00:30 CET 2020 ppc GNU/Linux # devlink dev mdio_bus/mdio@e0102120:10 # mv88e6xxx_dump --device mdio_bus/mdio@e0102120:10 --vtu VTU: Error: devlink: The requested region does not exist. devlink answers: Invalid argument Unable to snapshot vtu --atu, --global1 and --global2 does work, but the latter two say "Unknown mv88e6xxx chip 186a" (and 186a is 6250 in hex, so I think that should have been printed in decimal to reduce confusion). Whether that has anything to do with --vtu not working I don't know - the global1/2 registers to seem to get printed correctly. Rasmus ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: vlan_filtering=1 breaks all traffic 2020-12-07 19:43 ` Rasmus Villemoes @ 2020-12-07 20:15 ` Vladimir Oltean 2020-12-08 2:22 ` Andrew Lunn 0 siblings, 1 reply; 15+ messages in thread From: Vladimir Oltean @ 2020-12-07 20:15 UTC (permalink / raw) To: Rasmus Villemoes, Andrew Lunn Cc: Network Development, Florian Fainelli, Russell King - ARM Linux On Mon, Dec 07, 2020 at 08:43:18PM +0100, Rasmus Villemoes wrote: > # uname -a > Linux (none) 5.10.0-rc7-00035-g66d777e1729d #194 Mon Dec 7 16:00:30 CET > 2020 ppc GNU/Linux > # devlink dev > mdio_bus/mdio@e0102120:10 > # mv88e6xxx_dump --device mdio_bus/mdio@e0102120:10 --vtu > VTU: > Error: devlink: The requested region does not exist. > devlink answers: Invalid argument > Unable to snapshot vtu > > --atu, --global1 and --global2 does work, but the latter two say > "Unknown mv88e6xxx chip 186a" (and 186a is 6250 in hex, so I think that > should have been printed in decimal to reduce confusion). Whether that > has anything to do with --vtu not working I don't know - the global1/2 > registers to seem to get printed correctly. Ahh, this is probably because Andrew didn't know how many bits is the FID field width for 6250. The code has the 6250 treated the same as the "default" case, which is to error out: cmd_vtu: switch (ctx->chip) { case MV88E6190: case MV88E6191: case MV88E6290: case MV88E6390: return vtu_mv88e6xxx(ctx, 0x7ff); case MV88E6171: case MV88E6175: case MV88E6350: case MV88E6351: case MV88E6172: case MV88E6176: case MV88E6240: case MV88E6352: return vtu_mv88e6xxx(ctx, 0x7ff); case MV88E6141: case MV88E6341: return vtu_mv88e6xxx(ctx, 0xff); case MV88E6320: case MV88E6321: return vtu_mv88e6xxx(ctx, 0x7ff); case MV88E6220: case MV88E6250: case MV88E6131: case MV88E6185: case MV88E6123: case MV88E6161: case MV88E6165: default: printf("Unknown mv88e6xxx chip %x\n", ctx->chip); } You can probably figure out the width of that field based on your datasheet. You can probably just put a value in there and get some output regardless of whether it's actually correct or not. The kernel takes care of issuing the VTU GetNext operation and performs a raw read of the VTU Data registers, which are passed to user space (and where they get interpreted). Nonetheless, I'm adding Andrew as well. Andrew, this is the thread, for context, in case you haven't been following. The threading got broken multiple times, it seems. https://lore.kernel.org/netdev/6424c14e-bd25-2a06-cf0b-f1a07f9a3604@prevas.dk/ ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: vlan_filtering=1 breaks all traffic 2020-12-07 20:15 ` Vladimir Oltean @ 2020-12-08 2:22 ` Andrew Lunn 0 siblings, 0 replies; 15+ messages in thread From: Andrew Lunn @ 2020-12-08 2:22 UTC (permalink / raw) To: Vladimir Oltean Cc: Rasmus Villemoes, Network Development, Florian Fainelli, Russell King - ARM Linux On Mon, Dec 07, 2020 at 08:15:28PM +0000, Vladimir Oltean wrote: > On Mon, Dec 07, 2020 at 08:43:18PM +0100, Rasmus Villemoes wrote: > > # uname -a > > Linux (none) 5.10.0-rc7-00035-g66d777e1729d #194 Mon Dec 7 16:00:30 CET > > 2020 ppc GNU/Linux > > # devlink dev > > mdio_bus/mdio@e0102120:10 > > # mv88e6xxx_dump --device mdio_bus/mdio@e0102120:10 --vtu > > VTU: > > Error: devlink: The requested region does not exist. > > devlink answers: Invalid argument > > Unable to snapshot vtu VTU dumping is new. It is in net-next, but not 5.10-rc7. atu, global1 and global2 are part for 5.10-rc7. Andrew ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-12-08 2:23 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-11-30 14:30 warnings from MTU setting on switch ports Rasmus Villemoes 2020-11-30 16:04 ` Vladimir Oltean 2020-11-30 22:13 ` Rasmus Villemoes 2020-11-30 22:35 ` Vladimir Oltean 2020-11-30 22:50 ` Rasmus Villemoes 2020-11-30 23:11 ` Andrew Lunn 2020-12-05 14:49 ` vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) Rasmus Villemoes 2020-12-05 15:45 ` Russell King - ARM Linux admin 2020-12-05 16:02 ` vlan_filtering=1 breaks all traffic Rasmus Villemoes 2020-12-05 19:03 ` vlan_filtering=1 breaks all traffic (was: Re: warnings from MTU setting on switch ports) Vladimir Oltean 2020-12-05 20:13 ` vlan_filtering=1 breaks all traffic Rasmus Villemoes 2020-12-06 19:45 ` Vladimir Oltean 2020-12-07 19:43 ` Rasmus Villemoes 2020-12-07 20:15 ` Vladimir Oltean 2020-12-08 2:22 ` Andrew Lunn
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.