* [BUG] mv88e6xxx: tx regression in v5.3 @ 2019-12-11 9:48 Baruch Siach 2019-12-11 13:11 ` Andrew Lunn 0 siblings, 1 reply; 27+ messages in thread From: Baruch Siach @ 2019-12-11 9:48 UTC (permalink / raw) To: Andrew Lunn, Vivien Didelot; +Cc: netdev, Denis Odintsov Hi Andrew, Vivien, Since kernel v5.3 (tested v5.3.15), the 88E6141 switch on SolidRun Clearfog GT-8K stopped transmitting packets on switch connected ports. Kernel v5.2 works fine (tested v5.2.21). Here are the relevant kernel v5.3 log lines: [ 2.867424] mv88e6085 f412a200.mdio-mii:04: switch 0x3400 detected: Marvell 88E6141, revision 0 [ 2.927445] libphy: mdio: probed [ 3.578496] mv88e6085 f412a200.mdio-mii:04 lan2 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:11] driver [Marvell 88E6390] [ 3.595674] mv88e6085 f412a200.mdio-mii:04 lan1 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:12] driver [Marvell 88E6390] [ 3.612797] mv88e6085 f412a200.mdio-mii:04 lan4 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:13] driver [Marvell 88E6390] [ 3.629910] mv88e6085 f412a200.mdio-mii:04 lan3 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:14] driver [Marvell 88E6390] [ 3.646049] mv88e6085 f412a200.mdio-mii:04: configuring for phy/ link mode [ 3.654451] DSA: tree 0 setup ... [ 10.784521] mvpp2 f4000000.ethernet eth2: configuring for fixed/2500base-x link mode [ 10.792401] mvpp2 f4000000.ethernet eth2: Link is Up - 2.5Gbps/Full - flow control off [ 19.817981] mv88e6085 f412a200.mdio-mii:04 lan1: configuring for phy/ link mode [ 19.827083] 8021q: adding VLAN 0 to HW filter on device lan1 [ 21.577276] mv88e6085 f412a200.mdio-mii:04 lan1: Link is Up - 100Mbps/Full - flow control rx/tx [ 21.586030] IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready The Tx count on the lan1 interface increments, but the ARP packets don't show on the network. Do you have any idea? Thanks, baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il - ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-11 9:48 [BUG] mv88e6xxx: tx regression in v5.3 Baruch Siach @ 2019-12-11 13:11 ` Andrew Lunn 2019-12-11 17:07 ` Baruch Siach 0 siblings, 1 reply; 27+ messages in thread From: Andrew Lunn @ 2019-12-11 13:11 UTC (permalink / raw) To: Baruch Siach; +Cc: Vivien Didelot, netdev, Denis Odintsov On Wed, Dec 11, 2019 at 11:48:57AM +0200, Baruch Siach wrote: > Hi Andrew, Vivien, > > Since kernel v5.3 (tested v5.3.15), the 88E6141 switch on SolidRun > Clearfog GT-8K stopped transmitting packets on switch connected > ports. Kernel v5.2 works fine (tested v5.2.21). > > Here are the relevant kernel v5.3 log lines: > > [ 2.867424] mv88e6085 f412a200.mdio-mii:04: switch 0x3400 detected: Marvell 88E6141, revision 0 > [ 2.927445] libphy: mdio: probed > [ 3.578496] mv88e6085 f412a200.mdio-mii:04 lan2 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:11] driver [Marvell 88E6390] > [ 3.595674] mv88e6085 f412a200.mdio-mii:04 lan1 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:12] driver [Marvell 88E6390] > [ 3.612797] mv88e6085 f412a200.mdio-mii:04 lan4 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:13] driver [Marvell 88E6390] > [ 3.629910] mv88e6085 f412a200.mdio-mii:04 lan3 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:14] driver [Marvell 88E6390] > [ 3.646049] mv88e6085 f412a200.mdio-mii:04: configuring for phy/ link mode > [ 3.654451] DSA: tree 0 setup > ... > [ 10.784521] mvpp2 f4000000.ethernet eth2: configuring for fixed/2500base-x link mode > [ 10.792401] mvpp2 f4000000.ethernet eth2: Link is Up - 2.5Gbps/Full - flow control off > [ 19.817981] mv88e6085 f412a200.mdio-mii:04 lan1: configuring for phy/ link mode > [ 19.827083] 8021q: adding VLAN 0 to HW filter on device lan1 > [ 21.577276] mv88e6085 f412a200.mdio-mii:04 lan1: Link is Up - 100Mbps/Full - flow control rx/tx > [ 21.586030] IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready > > The Tx count on the lan1 interface increments, but the ARP packets don't > show on the network. Hi Baruch I don't know of an issues. If the MAC TX counter increases, it sounds like it is a PHY issue? Does 100Mbps/Full make sense for this link? Probably your best bet is to do a git bisect to find which commit broke it. Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-11 13:11 ` Andrew Lunn @ 2019-12-11 17:07 ` Baruch Siach 2019-12-11 17:49 ` Andrew Lunn 0 siblings, 1 reply; 27+ messages in thread From: Baruch Siach @ 2019-12-11 17:07 UTC (permalink / raw) To: Andrew Lunn; +Cc: Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Andrew, On Wed, Dec 11 2019, Andrew Lunn wrote: > On Wed, Dec 11, 2019 at 11:48:57AM +0200, Baruch Siach wrote: >> Hi Andrew, Vivien, >> >> Since kernel v5.3 (tested v5.3.15), the 88E6141 switch on SolidRun >> Clearfog GT-8K stopped transmitting packets on switch connected >> ports. Kernel v5.2 works fine (tested v5.2.21). >> >> Here are the relevant kernel v5.3 log lines: >> >> [ 2.867424] mv88e6085 f412a200.mdio-mii:04: switch 0x3400 detected: Marvell 88E6141, revision 0 >> [ 2.927445] libphy: mdio: probed >> [ 3.578496] mv88e6085 f412a200.mdio-mii:04 lan2 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:11] driver [Marvell 88E6390] >> [ 3.595674] mv88e6085 f412a200.mdio-mii:04 lan1 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:12] driver [Marvell 88E6390] >> [ 3.612797] mv88e6085 f412a200.mdio-mii:04 lan4 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:13] driver [Marvell 88E6390] >> [ 3.629910] mv88e6085 f412a200.mdio-mii:04 lan3 (uninitialized): PHY [!cp1!config-space@f4000000!mdio@12a200!switch0@4!mdio:14] driver [Marvell 88E6390] >> [ 3.646049] mv88e6085 f412a200.mdio-mii:04: configuring for phy/ link mode >> [ 3.654451] DSA: tree 0 setup >> ... >> [ 10.784521] mvpp2 f4000000.ethernet eth2: configuring for fixed/2500base-x link mode >> [ 10.792401] mvpp2 f4000000.ethernet eth2: Link is Up - 2.5Gbps/Full - flow control off >> [ 19.817981] mv88e6085 f412a200.mdio-mii:04 lan1: configuring for phy/ link mode >> [ 19.827083] 8021q: adding VLAN 0 to HW filter on device lan1 >> [ 21.577276] mv88e6085 f412a200.mdio-mii:04 lan1: Link is Up - 100Mbps/Full - flow control rx/tx >> [ 21.586030] IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready >> >> The Tx count on the lan1 interface increments, but the ARP packets don't >> show on the network. > > Hi Baruch > > I don't know of an issues. > > If the MAC TX counter increases, it sounds like it is a PHY issue? > Does 100Mbps/Full make sense for this link? 100Mbps switch is what I have at the other side of the link. Works perfectly with v5.2. > Probably your best bet is to do a git bisect to find which commit > broke it. Bisect points at 7fb5a711545d ("net: dsa: mv88e6xxx: drop adjust_link to enabled phylink"). Reverting this commit on top of v5.3.15 fixes the issue (and brings the warning back). As I understand, this basically reverts the driver migration to phylink. What might be the issue with phylink? Thanks, baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il - ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-11 17:07 ` Baruch Siach @ 2019-12-11 17:49 ` Andrew Lunn 2019-12-12 8:50 ` Baruch Siach 0 siblings, 1 reply; 27+ messages in thread From: Andrew Lunn @ 2019-12-11 17:49 UTC (permalink / raw) To: Baruch Siach; +Cc: Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein > Bisect points at 7fb5a711545d ("net: dsa: mv88e6xxx: drop adjust_link to > enabled phylink"). Reverting this commit on top of v5.3.15 fixes the > issue (and brings the warning back). As I understand, this basically > reverts the driver migration to phylink. What might be the issue with > phylink? That suggests the MAC is wrongly running at 1G, and the PHY at 100M, which is why it does not work. You probably want to add #define DEBUG to the top of phylink.c, and scatter some debug prints in mv88e6xxx_mac_config(). My guess would be, either mv88e6xxx_mac_config() is existing before configuring the MAC, or mv88e6xxx_port_setup_mac() has wrongly decided nothing has changed and so has not configured the MAC. Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-11 17:49 ` Andrew Lunn @ 2019-12-12 8:50 ` Baruch Siach 2019-12-12 13:14 ` Andrew Lunn 0 siblings, 1 reply; 27+ messages in thread From: Baruch Siach @ 2019-12-12 8:50 UTC (permalink / raw) To: Andrew Lunn; +Cc: Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Andrew, On Wed, Dec 11, 2019 at 06:49:38PM +0100, Andrew Lunn wrote: > > Bisect points at 7fb5a711545d ("net: dsa: mv88e6xxx: drop adjust_link to > > enabled phylink"). Reverting this commit on top of v5.3.15 fixes the > > issue (and brings the warning back). As I understand, this basically > > reverts the driver migration to phylink. What might be the issue with > > phylink? > > That suggests the MAC is wrongly running at 1G, and the PHY at 100M, > which is why it does not work. I reproduced the same issue with 1Gb link as well: [ 63.510782] mv88e6085 f412a200.mdio-mii:04 lan1: Link is Up - 1Gbps/Full - flow control rx/tx [ 63.519357] IPv6: ADDRCONF(NETDEV_CHANGE): lan1: link becomes ready > You probably want to add #define DEBUG to the top of phylink.c, and > scatter some debug prints in mv88e6xxx_mac_config(). My guess would > be, either mv88e6xxx_mac_config() is existing before configuring the > MAC, or mv88e6xxx_port_setup_mac() has wrongly decided nothing has > changed and so has not configured the MAC. As you guessed, mv88e6xxx_mac_config() exits early because mv88e6xxx_phy_is_internal() returns true for port number 2, and 'mode' is MLO_AN_PHY. What is the right MAC/PHY setup flow in this case? Thanks, baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 8:50 ` Baruch Siach @ 2019-12-12 13:14 ` Andrew Lunn 2019-12-12 14:19 ` Andreas Tobler 2019-12-12 15:08 ` Baruch Siach 0 siblings, 2 replies; 27+ messages in thread From: Andrew Lunn @ 2019-12-12 13:14 UTC (permalink / raw) To: Baruch Siach; +Cc: Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein > As you guessed, mv88e6xxx_mac_config() exits early because > mv88e6xxx_phy_is_internal() returns true for port number 2, and 'mode' is > MLO_AN_PHY. What is the right MAC/PHY setup flow in this case? So this goes back to commit d700ec4118f9d5e88db8f678e7342f28c93037b9 Author: Marek Vasut <marex@denx.de> Date: Wed Sep 12 00:15:24 2018 +0200 net: dsa: mv88e6xxx: Make sure to configure ports with external PHYs The MV88E6xxx can have external PHYs attached to certain ports and those PHYs could even be on different MDIO bus than the one within the switch. This patch makes sure that ports with such PHYs are configured correctly according to the information provided by the PHY. @@ -709,13 +717,17 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, struct mv88e6xxx_chip *chip = ds->priv; int speed, duplex, link, pause, err; - if (mode == MLO_AN_PHY) + if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) return; The idea being, that the MAC has direct knowledge of the PHY configuration because it is internal. There is no need to configure the MAC, it does it itself. This assumption seems wrong for the switch you have. I think it is just a optimisation. So we can probably remove this phy internal test. And } else if (!mv88e6xxx_phy_is_internal(ds, port)) { also needs to change. It would be interesting to know if the MAC is completely wrongly configured, or it is just a subset of parameters. Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 13:14 ` Andrew Lunn @ 2019-12-12 14:19 ` Andreas Tobler 2019-12-12 15:08 ` Baruch Siach 1 sibling, 0 replies; 27+ messages in thread From: Andreas Tobler @ 2019-12-12 14:19 UTC (permalink / raw) To: Andrew Lunn, Baruch Siach Cc: Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein On 12.12.19 14:14, Andrew Lunn wrote: >> As you guessed, mv88e6xxx_mac_config() exits early because >> mv88e6xxx_phy_is_internal() returns true for port number 2, and 'mode' is >> MLO_AN_PHY. What is the right MAC/PHY setup flow in this case? > > So this goes back to > > commit d700ec4118f9d5e88db8f678e7342f28c93037b9 > Author: Marek Vasut <marex@denx.de> > Date: Wed Sep 12 00:15:24 2018 +0200 > > net: dsa: mv88e6xxx: Make sure to configure ports with external PHYs > > The MV88E6xxx can have external PHYs attached to certain ports and those > PHYs could even be on different MDIO bus than the one within the switch. > This patch makes sure that ports with such PHYs are configured correctly > according to the information provided by the PHY. > > @@ -709,13 +717,17 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > struct mv88e6xxx_chip *chip = ds->priv; > int speed, duplex, link, pause, err; > > - if (mode == MLO_AN_PHY) > + if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > return; > > The idea being, that the MAC has direct knowledge of the PHY > configuration because it is internal. There is no need to configure > the MAC, it does it itself. > > This assumption seems wrong for the switch you have. > > I think it is just a optimisation. So we can probably remove this phy > internal test. > > And > } else if (!mv88e6xxx_phy_is_internal(ds, port)) { > > also needs to change. > > It would be interesting to know if the MAC is completely wrongly > configured, or it is just a subset of parameters. Interesting that you mention the patch above. I jump in because I have an issue since the above patch went in. In my case (see below) the links are up and running fine, but when I disconnect the lan on port 8 or 9, I do not get the IP back. The same applies if I set the device down and up again. I loose the ip and do not get it again. If I set the link in the above 'else if' clause to LINK_UNFORCED, I get my ip's back and the links are working again. } else if (!mv88e6xxx_phy_is_internal(ds, port)) { - link = state->link; + link = LINK_UNFORCED; speed = state->speed; I hope I did not hijack the thread an my situation is related. Otherwise please let me know. Thanks, Andreas The lan0/1/2 are 100M, lan3/4 are 1G my switch info from the log, 5.4.2: mv88e6085 f1072004.mdio-mii:00: switch 0x990 detected: Marvell 88E6097/88E6097F, revision 2 libphy: mv88e6xxx SMI: probed mv88e6085 f1072004.mdio-mii:00 lan0 (uninitialized): PHY [mv88e6xxx-1:05] driver [Generic PHY] mv88e6085 f1072004.mdio-mii:00 lan1 (uninitialized): PHY [mv88e6xxx-1:06] driver [Generic PHY] mv88e6085 f1072004.mdio-mii:00 lan2 (uninitialized): PHY [mv88e6xxx-1:07] driver [Generic PHY] mv88e6085 f1072004.mdio-mii:00 lan3 (uninitialized): PHY [mv88e6xxx-1:08] driver [Marvell 88E1112] mv88e6085 f1072004.mdio-mii:00 lan4 (uninitialized): PHY [mv88e6xxx-1:09] driver [Marvell 88E1112] -- onway ag Andreas Tobler Software Engineer Stauffacherstrasse 16, CH-8004 Zürich Tel: +41 55 214 18 42 andreas.tobler@onway.ch www.onway.ch ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 13:14 ` Andrew Lunn 2019-12-12 14:19 ` Andreas Tobler @ 2019-12-12 15:08 ` Baruch Siach 2019-12-12 15:13 ` Andrew Lunn 1 sibling, 1 reply; 27+ messages in thread From: Baruch Siach @ 2019-12-12 15:08 UTC (permalink / raw) To: Andrew Lunn; +Cc: Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Andrew, On Thu, Dec 12, 2019 at 02:14:48PM +0100, Andrew Lunn wrote: > > As you guessed, mv88e6xxx_mac_config() exits early because > > mv88e6xxx_phy_is_internal() returns true for port number 2, and 'mode' is > > MLO_AN_PHY. What is the right MAC/PHY setup flow in this case? > > So this goes back to > > commit d700ec4118f9d5e88db8f678e7342f28c93037b9 > Author: Marek Vasut <marex@denx.de> > Date: Wed Sep 12 00:15:24 2018 +0200 > > net: dsa: mv88e6xxx: Make sure to configure ports with external PHYs > > The MV88E6xxx can have external PHYs attached to certain ports and those > PHYs could even be on different MDIO bus than the one within the switch. > This patch makes sure that ports with such PHYs are configured correctly > according to the information provided by the PHY. > > @@ -709,13 +717,17 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > struct mv88e6xxx_chip *chip = ds->priv; > int speed, duplex, link, pause, err; > > - if (mode == MLO_AN_PHY) > + if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > return; > > The idea being, that the MAC has direct knowledge of the PHY > configuration because it is internal. There is no need to configure > the MAC, it does it itself. > > This assumption seems wrong for the switch you have. > > I think it is just a optimisation. So we can probably remove this phy > internal test. > > And > } else if (!mv88e6xxx_phy_is_internal(ds, port)) { > > also needs to change. > > It would be interesting to know if the MAC is completely wrongly > configured, or it is just a subset of parameters. I compared phylib to phylink calls to mv88e6xxx_port_setup_mac(). It turns out that the phylink adds mv88e6xxx_port_setup_mac() call for the cpu port (port 5 in my case) with these parameters and call stack: [ 4.219148] mv88e6xxx_port_setup_mac: port: 5 link: 0 speed: -1 duplex: 255 [ 4.226144] CPU: 2 PID: 21 Comm: kworker/2:0 Not tainted 5.3.15-00003-gb9bb09189d02-dirty #104 [ 4.234795] Hardware name: SolidRun ClearFog GT 8K (DT) [ 4.240044] Workqueue: events deferred_probe_work_func [ 4.245205] Call trace: [ 4.247661] dump_backtrace+0x0/0x128 [ 4.251339] show_stack+0x14/0x1c [ 4.254669] dump_stack+0xa4/0xd0 [ 4.257998] mv88e6xxx_port_setup_mac+0x78/0x2a0 [ 4.262635] mv88e6xxx_mac_config+0xd0/0x154 [ 4.266924] dsa_port_phylink_mac_config+0x2c/0x38 [ 4.271736] phylink_mac_config+0xe0/0x1cc [ 4.275849] phylink_start+0xc8/0x224 [ 4.279527] dsa_port_link_register_of+0xe8/0x1b0 [ 4.284251] dsa_register_switch+0x7fc/0x908 [ 4.288539] mv88e6xxx_probe+0x62c/0x66c [ 4.292478] mdio_probe+0x30/0x5c [ 4.295806] really_probe+0x1d0/0x280 [ 4.299483] driver_probe_device+0xd4/0xe4 [ 4.303596] __device_attach_driver+0x94/0xa0 [ 4.307971] bus_for_each_drv+0x94/0xb4 [ 4.311823] __device_attach+0xc0/0x12c [ 4.315674] device_initial_probe+0x10/0x18 [ 4.319875] bus_probe_device+0x2c/0x8c [ 4.323726] deferred_probe_work_func+0x84/0x98 [ 4.328276] process_one_work+0x19c/0x258 [ 4.332303] process_scheduled_works+0x3c/0x40 [ 4.336765] worker_thread+0x228/0x2f8 [ 4.340530] kthread+0x114/0x124 [ 4.343771] ret_from_fork+0x10/0x18 This hunk removed that mv88e6xxx_port_setup_mac() call, and fixed Tx for me on v5.3.15: diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index d0a97eb73a37..f0457274b5a9 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) return; + if (dsa_is_cpu_port(ds, port)) + return; + if (mode == MLO_AN_FIXED) { link = LINK_FORCED_UP; speed = state->speed; Is that the right solution? Or maybe mv88e6xxx_port_setup_mac() should be called again when the cpu interface goes up? baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 15:08 ` Baruch Siach @ 2019-12-12 15:13 ` Andrew Lunn 2019-12-12 15:23 ` Baruch Siach 0 siblings, 1 reply; 27+ messages in thread From: Andrew Lunn @ 2019-12-12 15:13 UTC (permalink / raw) To: Baruch Siach; +Cc: Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein > I compared phylib to phylink calls to mv88e6xxx_port_setup_mac(). It turns out > that the phylink adds mv88e6xxx_port_setup_mac() call for the cpu port (port 5 > in my case) with these parameters and call stack: > > [ 4.219148] mv88e6xxx_port_setup_mac: port: 5 link: 0 speed: -1 duplex: 255 > [ 4.226144] CPU: 2 PID: 21 Comm: kworker/2:0 Not tainted 5.3.15-00003-gb9bb09189d02-dirty #104 > [ 4.234795] Hardware name: SolidRun ClearFog GT 8K (DT) > [ 4.240044] Workqueue: events deferred_probe_work_func > [ 4.245205] Call trace: > [ 4.247661] dump_backtrace+0x0/0x128 > [ 4.251339] show_stack+0x14/0x1c > [ 4.254669] dump_stack+0xa4/0xd0 > [ 4.257998] mv88e6xxx_port_setup_mac+0x78/0x2a0 > [ 4.262635] mv88e6xxx_mac_config+0xd0/0x154 > [ 4.266924] dsa_port_phylink_mac_config+0x2c/0x38 > [ 4.271736] phylink_mac_config+0xe0/0x1cc > [ 4.275849] phylink_start+0xc8/0x224 > [ 4.279527] dsa_port_link_register_of+0xe8/0x1b0 > [ 4.284251] dsa_register_switch+0x7fc/0x908 > [ 4.288539] mv88e6xxx_probe+0x62c/0x66c > [ 4.292478] mdio_probe+0x30/0x5c > [ 4.295806] really_probe+0x1d0/0x280 > [ 4.299483] driver_probe_device+0xd4/0xe4 > [ 4.303596] __device_attach_driver+0x94/0xa0 > [ 4.307971] bus_for_each_drv+0x94/0xb4 > [ 4.311823] __device_attach+0xc0/0x12c > [ 4.315674] device_initial_probe+0x10/0x18 > [ 4.319875] bus_probe_device+0x2c/0x8c > [ 4.323726] deferred_probe_work_func+0x84/0x98 > [ 4.328276] process_one_work+0x19c/0x258 > [ 4.332303] process_scheduled_works+0x3c/0x40 > [ 4.336765] worker_thread+0x228/0x2f8 > [ 4.340530] kthread+0x114/0x124 > [ 4.343771] ret_from_fork+0x10/0x18 > > This hunk removed that mv88e6xxx_port_setup_mac() call, and fixed Tx for me on > v5.3.15: > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > index d0a97eb73a37..f0457274b5a9 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.c > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > return; > > + if (dsa_is_cpu_port(ds, port)) > + return; > + > if (mode == MLO_AN_FIXED) { > link = LINK_FORCED_UP; > speed = state->speed; > > Is that the right solution? What needs testing is: port@0 { reg = <0>; label = "cpu"; ethernet = <&fec1>; fixed-link { speed = <100>; full-duplex; }; At some point, there is a call to configure the CPU port to 100Mbps, because the SoC Ethernet does not support 1G. We need to ensure this does not break with your change. Thanks Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 15:13 ` Andrew Lunn @ 2019-12-12 15:23 ` Baruch Siach 2019-12-12 18:17 ` Baruch Siach 2019-12-12 18:36 ` Marek Behun 0 siblings, 2 replies; 27+ messages in thread From: Baruch Siach @ 2019-12-12 15:23 UTC (permalink / raw) To: Andrew Lunn; +Cc: Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Andrew, On Thu, Dec 12, 2019 at 04:13:55PM +0100, Andrew Lunn wrote: > > I compared phylib to phylink calls to mv88e6xxx_port_setup_mac(). It turns out > > that the phylink adds mv88e6xxx_port_setup_mac() call for the cpu port (port 5 > > in my case) with these parameters and call stack: > > > > [ 4.219148] mv88e6xxx_port_setup_mac: port: 5 link: 0 speed: -1 duplex: 255 > > [ 4.226144] CPU: 2 PID: 21 Comm: kworker/2:0 Not tainted 5.3.15-00003-gb9bb09189d02-dirty #104 > > [ 4.234795] Hardware name: SolidRun ClearFog GT 8K (DT) > > [ 4.240044] Workqueue: events deferred_probe_work_func > > [ 4.245205] Call trace: > > [ 4.247661] dump_backtrace+0x0/0x128 > > [ 4.251339] show_stack+0x14/0x1c > > [ 4.254669] dump_stack+0xa4/0xd0 > > [ 4.257998] mv88e6xxx_port_setup_mac+0x78/0x2a0 > > [ 4.262635] mv88e6xxx_mac_config+0xd0/0x154 > > [ 4.266924] dsa_port_phylink_mac_config+0x2c/0x38 > > [ 4.271736] phylink_mac_config+0xe0/0x1cc > > [ 4.275849] phylink_start+0xc8/0x224 > > [ 4.279527] dsa_port_link_register_of+0xe8/0x1b0 > > [ 4.284251] dsa_register_switch+0x7fc/0x908 > > [ 4.288539] mv88e6xxx_probe+0x62c/0x66c > > [ 4.292478] mdio_probe+0x30/0x5c > > [ 4.295806] really_probe+0x1d0/0x280 > > [ 4.299483] driver_probe_device+0xd4/0xe4 > > [ 4.303596] __device_attach_driver+0x94/0xa0 > > [ 4.307971] bus_for_each_drv+0x94/0xb4 > > [ 4.311823] __device_attach+0xc0/0x12c > > [ 4.315674] device_initial_probe+0x10/0x18 > > [ 4.319875] bus_probe_device+0x2c/0x8c > > [ 4.323726] deferred_probe_work_func+0x84/0x98 > > [ 4.328276] process_one_work+0x19c/0x258 > > [ 4.332303] process_scheduled_works+0x3c/0x40 > > [ 4.336765] worker_thread+0x228/0x2f8 > > [ 4.340530] kthread+0x114/0x124 > > [ 4.343771] ret_from_fork+0x10/0x18 > > > > This hunk removed that mv88e6xxx_port_setup_mac() call, and fixed Tx for me on > > v5.3.15: > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > > index d0a97eb73a37..f0457274b5a9 100644 > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > > if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > > return; > > > > + if (dsa_is_cpu_port(ds, port)) > > + return; > > + > > if (mode == MLO_AN_FIXED) { > > link = LINK_FORCED_UP; > > speed = state->speed; > > > > Is that the right solution? > > What needs testing is: > > port@0 { > reg = <0>; > label = "cpu"; > ethernet = <&fec1>; > > fixed-link { > speed = <100>; > full-duplex; > }; > > At some point, there is a call to configure the CPU port to 100Mbps, > because the SoC Ethernet does not support 1G. We need to ensure this > does not break with your change. So maybe this: diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index d0a97eb73a37..84ca4f36a778 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) return; + if (mode != MLO_AN_FIXED && dsa_is_cpu_port(ds, port)) + return; + if (mode == MLO_AN_FIXED) { link = LINK_FORCED_UP; speed = state->speed; baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 15:23 ` Baruch Siach @ 2019-12-12 18:17 ` Baruch Siach 2019-12-12 18:32 ` Marek Behun 2019-12-12 18:36 ` Marek Behun 1 sibling, 1 reply; 27+ messages in thread From: Baruch Siach @ 2019-12-12 18:17 UTC (permalink / raw) To: Andrew Lunn, Marek Behún Cc: Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Andrew, Marek, On Thu, Dec 12, 2019 at 05:23:57PM +0200, Baruch Siach wrote: > On Thu, Dec 12, 2019 at 04:13:55PM +0100, Andrew Lunn wrote: > > > I compared phylib to phylink calls to mv88e6xxx_port_setup_mac(). It turns out > > > that the phylink adds mv88e6xxx_port_setup_mac() call for the cpu port (port 5 > > > in my case) with these parameters and call stack: > > > > > > [ 4.219148] mv88e6xxx_port_setup_mac: port: 5 link: 0 speed: -1 duplex: 255 > > > [ 4.226144] CPU: 2 PID: 21 Comm: kworker/2:0 Not tainted 5.3.15-00003-gb9bb09189d02-dirty #104 > > > [ 4.234795] Hardware name: SolidRun ClearFog GT 8K (DT) > > > [ 4.240044] Workqueue: events deferred_probe_work_func > > > [ 4.245205] Call trace: > > > [ 4.247661] dump_backtrace+0x0/0x128 > > > [ 4.251339] show_stack+0x14/0x1c > > > [ 4.254669] dump_stack+0xa4/0xd0 > > > [ 4.257998] mv88e6xxx_port_setup_mac+0x78/0x2a0 > > > [ 4.262635] mv88e6xxx_mac_config+0xd0/0x154 > > > [ 4.266924] dsa_port_phylink_mac_config+0x2c/0x38 > > > [ 4.271736] phylink_mac_config+0xe0/0x1cc > > > [ 4.275849] phylink_start+0xc8/0x224 > > > [ 4.279527] dsa_port_link_register_of+0xe8/0x1b0 > > > [ 4.284251] dsa_register_switch+0x7fc/0x908 > > > [ 4.288539] mv88e6xxx_probe+0x62c/0x66c > > > [ 4.292478] mdio_probe+0x30/0x5c > > > [ 4.295806] really_probe+0x1d0/0x280 > > > [ 4.299483] driver_probe_device+0xd4/0xe4 > > > [ 4.303596] __device_attach_driver+0x94/0xa0 > > > [ 4.307971] bus_for_each_drv+0x94/0xb4 > > > [ 4.311823] __device_attach+0xc0/0x12c > > > [ 4.315674] device_initial_probe+0x10/0x18 > > > [ 4.319875] bus_probe_device+0x2c/0x8c > > > [ 4.323726] deferred_probe_work_func+0x84/0x98 > > > [ 4.328276] process_one_work+0x19c/0x258 > > > [ 4.332303] process_scheduled_works+0x3c/0x40 > > > [ 4.336765] worker_thread+0x228/0x2f8 > > > [ 4.340530] kthread+0x114/0x124 > > > [ 4.343771] ret_from_fork+0x10/0x18 > > > > > > This hunk removed that mv88e6xxx_port_setup_mac() call, and fixed Tx for me on > > > v5.3.15: > > > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > > > index d0a97eb73a37..f0457274b5a9 100644 > > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > > @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > > > if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > > > return; > > > > > > + if (dsa_is_cpu_port(ds, port)) > > > + return; > > > + > > > if (mode == MLO_AN_FIXED) { > > > link = LINK_FORCED_UP; > > > speed = state->speed; > > > > > > Is that the right solution? > > > > What needs testing is: > > > > port@0 { > > reg = <0>; > > label = "cpu"; > > ethernet = <&fec1>; > > > > fixed-link { > > speed = <100>; > > full-duplex; > > }; > > > > At some point, there is a call to configure the CPU port to 100Mbps, > > because the SoC Ethernet does not support 1G. We need to ensure this > > does not break with your change. > > So maybe this: > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > index d0a97eb73a37..84ca4f36a778 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.c > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > return; > > + if (mode != MLO_AN_FIXED && dsa_is_cpu_port(ds, port)) > + return; > + > if (mode == MLO_AN_FIXED) { > link = LINK_FORCED_UP; > speed = state->speed; This is not enough to fix v5.4, though. Commit 7a3007d22e8dc ("net: dsa: mv88e6xxx: fully support SERDES on Topaz family") breaks switch Tx on SolidRun Clearfog GT-8K in much the same way. I could not easily revert 7a3007d22e8dc on top of v5.4, but this is enough to make Tx work again with v5.4: diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 7b5b73499e37..e7e6400a994e 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -3191,7 +3191,6 @@ static const struct mv88e6xxx_ops mv88e6141_ops = { .port_disable_pri_override = mv88e6xxx_port_disable_pri_override, .port_link_state = mv88e6352_port_link_state, .port_get_cmode = mv88e6352_port_get_cmode, - .port_set_cmode = mv88e6341_port_set_cmode, .port_setup_message_port = mv88e6xxx_setup_message_port, .stats_snapshot = mv88e6390_g1_stats_snapshot, .stats_set_histogram = mv88e6095_g1_stats_set_histogram, Marek, do you have any idea how to properly fix this? Thanks, baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 18:17 ` Baruch Siach @ 2019-12-12 18:32 ` Marek Behun 2019-12-12 18:48 ` Baruch Siach 0 siblings, 1 reply; 27+ messages in thread From: Marek Behun @ 2019-12-12 18:32 UTC (permalink / raw) To: Baruch Siach Cc: Andrew Lunn, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein On Thu, 12 Dec 2019 20:17:29 +0200 Baruch Siach <baruch@tkos.co.il> wrote: > This is not enough to fix v5.4, though. Commit 7a3007d22e8dc ("net: dsa: > mv88e6xxx: fully support SERDES on Topaz family") breaks switch Tx on SolidRun > Clearfog GT-8K in much the same way. I could not easily revert 7a3007d22e8dc > on top of v5.4, but this is enough to make Tx work again with v5.4: > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > index 7b5b73499e37..e7e6400a994e 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.c > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > @@ -3191,7 +3191,6 @@ static const struct mv88e6xxx_ops mv88e6141_ops = { > .port_disable_pri_override = mv88e6xxx_port_disable_pri_override, > .port_link_state = mv88e6352_port_link_state, > .port_get_cmode = mv88e6352_port_get_cmode, > - .port_set_cmode = mv88e6341_port_set_cmode, > .port_setup_message_port = mv88e6xxx_setup_message_port, > .stats_snapshot = mv88e6390_g1_stats_snapshot, > .stats_set_histogram = mv88e6095_g1_stats_set_histogram, > > Marek, do you have any idea how to properly fix this? > > Thanks, > baruch > Hi Baruch, can you tell on which port the port_set_cmode is called when things break? Because if it is other than port 5, than method mv88e6341_port_set_cmode does nothing. If it is port 5, do you have cpu connected to the switch via port 5? What phy mode? Marek ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 18:32 ` Marek Behun @ 2019-12-12 18:48 ` Baruch Siach 0 siblings, 0 replies; 27+ messages in thread From: Baruch Siach @ 2019-12-12 18:48 UTC (permalink / raw) To: Marek Behun Cc: Andrew Lunn, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Marek, On Thu, Dec 12, 2019 at 07:32:18PM +0100, Marek Behun wrote: > On Thu, 12 Dec 2019 20:17:29 +0200 > Baruch Siach <baruch@tkos.co.il> wrote: > > > This is not enough to fix v5.4, though. Commit 7a3007d22e8dc ("net: dsa: > > mv88e6xxx: fully support SERDES on Topaz family") breaks switch Tx on SolidRun > > Clearfog GT-8K in much the same way. I could not easily revert 7a3007d22e8dc > > on top of v5.4, but this is enough to make Tx work again with v5.4: > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > > index 7b5b73499e37..e7e6400a994e 100644 > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > @@ -3191,7 +3191,6 @@ static const struct mv88e6xxx_ops mv88e6141_ops = { > > .port_disable_pri_override = mv88e6xxx_port_disable_pri_override, > > .port_link_state = mv88e6352_port_link_state, > > .port_get_cmode = mv88e6352_port_get_cmode, > > - .port_set_cmode = mv88e6341_port_set_cmode, > > .port_setup_message_port = mv88e6xxx_setup_message_port, > > .stats_snapshot = mv88e6390_g1_stats_snapshot, > > .stats_set_histogram = mv88e6095_g1_stats_set_histogram, > > > > Marek, do you have any idea how to properly fix this? > > can you tell on which port the port_set_cmode is called when things > break? Because if it is other than port 5, than method > mv88e6341_port_set_cmode does nothing. If it is port 5, do you have cpu > connected to the switch via port 5? What phy mode? This is indeed port 5. It is connected to the cpu (Armada 8040). PHY mode is 19 (PHY_INTERFACE_MODE_2500BASEX). I have now tested v5.5-rc1 which behaves just like v5.4. The same "fix" works with v5.5-rc1. Thanks, baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 15:23 ` Baruch Siach 2019-12-12 18:17 ` Baruch Siach @ 2019-12-12 18:36 ` Marek Behun 2019-12-12 19:06 ` Baruch Siach 1 sibling, 1 reply; 27+ messages in thread From: Marek Behun @ 2019-12-12 18:36 UTC (permalink / raw) To: Baruch Siach Cc: Andrew Lunn, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein On Thu, 12 Dec 2019 17:23:55 +0200 Baruch Siach <baruch@tkos.co.il> wrote: > Hi Andrew, > > On Thu, Dec 12, 2019 at 04:13:55PM +0100, Andrew Lunn wrote: > > > I compared phylib to phylink calls to mv88e6xxx_port_setup_mac(). It turns out > > > that the phylink adds mv88e6xxx_port_setup_mac() call for the cpu port (port 5 > > > in my case) with these parameters and call stack: > > > > > > [ 4.219148] mv88e6xxx_port_setup_mac: port: 5 link: 0 speed: -1 duplex: 255 > > > [ 4.226144] CPU: 2 PID: 21 Comm: kworker/2:0 Not tainted 5.3.15-00003-gb9bb09189d02-dirty #104 > > > [ 4.234795] Hardware name: SolidRun ClearFog GT 8K (DT) > > > [ 4.240044] Workqueue: events deferred_probe_work_func > > > [ 4.245205] Call trace: > > > [ 4.247661] dump_backtrace+0x0/0x128 > > > [ 4.251339] show_stack+0x14/0x1c > > > [ 4.254669] dump_stack+0xa4/0xd0 > > > [ 4.257998] mv88e6xxx_port_setup_mac+0x78/0x2a0 > > > [ 4.262635] mv88e6xxx_mac_config+0xd0/0x154 > > > [ 4.266924] dsa_port_phylink_mac_config+0x2c/0x38 > > > [ 4.271736] phylink_mac_config+0xe0/0x1cc > > > [ 4.275849] phylink_start+0xc8/0x224 > > > [ 4.279527] dsa_port_link_register_of+0xe8/0x1b0 > > > [ 4.284251] dsa_register_switch+0x7fc/0x908 > > > [ 4.288539] mv88e6xxx_probe+0x62c/0x66c > > > [ 4.292478] mdio_probe+0x30/0x5c > > > [ 4.295806] really_probe+0x1d0/0x280 > > > [ 4.299483] driver_probe_device+0xd4/0xe4 > > > [ 4.303596] __device_attach_driver+0x94/0xa0 > > > [ 4.307971] bus_for_each_drv+0x94/0xb4 > > > [ 4.311823] __device_attach+0xc0/0x12c > > > [ 4.315674] device_initial_probe+0x10/0x18 > > > [ 4.319875] bus_probe_device+0x2c/0x8c > > > [ 4.323726] deferred_probe_work_func+0x84/0x98 > > > [ 4.328276] process_one_work+0x19c/0x258 > > > [ 4.332303] process_scheduled_works+0x3c/0x40 > > > [ 4.336765] worker_thread+0x228/0x2f8 > > > [ 4.340530] kthread+0x114/0x124 > > > [ 4.343771] ret_from_fork+0x10/0x18 > > > > > > This hunk removed that mv88e6xxx_port_setup_mac() call, and fixed Tx for me on > > > v5.3.15: > > > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > > > index d0a97eb73a37..f0457274b5a9 100644 > > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > > @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > > > if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > > > return; > > > > > > + if (dsa_is_cpu_port(ds, port)) > > > + return; > > > + > > > if (mode == MLO_AN_FIXED) { > > > link = LINK_FORCED_UP; > > > speed = state->speed; > > > > > > Is that the right solution? > > > > What needs testing is: > > > > port@0 { > > reg = <0>; > > label = "cpu"; > > ethernet = <&fec1>; > > > > fixed-link { > > speed = <100>; > > full-duplex; > > }; > > > > At some point, there is a call to configure the CPU port to 100Mbps, > > because the SoC Ethernet does not support 1G. We need to ensure this > > does not break with your change. > > So maybe this: > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > index d0a97eb73a37..84ca4f36a778 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.c > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > return; > > + if (mode != MLO_AN_FIXED && dsa_is_cpu_port(ds, port)) > + return; > + > if (mode == MLO_AN_FIXED) { > link = LINK_FORCED_UP; > speed = state->speed; > > baruch > No, if your switch is connected to the cpu via 100mbps on the cpu side, you have to add fixed-link node as Andrew suggested. This is what we were trying to do with the Topaz patches, so that dsa port nodes support fixed-link via phylink. Marek ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 18:36 ` Marek Behun @ 2019-12-12 19:06 ` Baruch Siach 2019-12-12 19:31 ` Andrew Lunn 0 siblings, 1 reply; 27+ messages in thread From: Baruch Siach @ 2019-12-12 19:06 UTC (permalink / raw) To: Marek Behun Cc: Andrew Lunn, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Marek, On Thu, Dec 12, 2019 at 07:36:11PM +0100, Marek Behun wrote: > On Thu, 12 Dec 2019 17:23:55 +0200 > Baruch Siach <baruch@tkos.co.il> wrote: > > > > On Thu, Dec 12, 2019 at 04:13:55PM +0100, Andrew Lunn wrote: > > > > I compared phylib to phylink calls to mv88e6xxx_port_setup_mac(). It turns out > > > > that the phylink adds mv88e6xxx_port_setup_mac() call for the cpu port (port 5 > > > > in my case) with these parameters and call stack: > > > > > > > > [ 4.219148] mv88e6xxx_port_setup_mac: port: 5 link: 0 speed: -1 duplex: 255 > > > > [ 4.226144] CPU: 2 PID: 21 Comm: kworker/2:0 Not tainted 5.3.15-00003-gb9bb09189d02-dirty #104 > > > > [ 4.234795] Hardware name: SolidRun ClearFog GT 8K (DT) > > > > [ 4.240044] Workqueue: events deferred_probe_work_func > > > > [ 4.245205] Call trace: > > > > [ 4.247661] dump_backtrace+0x0/0x128 > > > > [ 4.251339] show_stack+0x14/0x1c > > > > [ 4.254669] dump_stack+0xa4/0xd0 > > > > [ 4.257998] mv88e6xxx_port_setup_mac+0x78/0x2a0 > > > > [ 4.262635] mv88e6xxx_mac_config+0xd0/0x154 > > > > [ 4.266924] dsa_port_phylink_mac_config+0x2c/0x38 > > > > [ 4.271736] phylink_mac_config+0xe0/0x1cc > > > > [ 4.275849] phylink_start+0xc8/0x224 > > > > [ 4.279527] dsa_port_link_register_of+0xe8/0x1b0 > > > > [ 4.284251] dsa_register_switch+0x7fc/0x908 > > > > [ 4.288539] mv88e6xxx_probe+0x62c/0x66c > > > > [ 4.292478] mdio_probe+0x30/0x5c > > > > [ 4.295806] really_probe+0x1d0/0x280 > > > > [ 4.299483] driver_probe_device+0xd4/0xe4 > > > > [ 4.303596] __device_attach_driver+0x94/0xa0 > > > > [ 4.307971] bus_for_each_drv+0x94/0xb4 > > > > [ 4.311823] __device_attach+0xc0/0x12c > > > > [ 4.315674] device_initial_probe+0x10/0x18 > > > > [ 4.319875] bus_probe_device+0x2c/0x8c > > > > [ 4.323726] deferred_probe_work_func+0x84/0x98 > > > > [ 4.328276] process_one_work+0x19c/0x258 > > > > [ 4.332303] process_scheduled_works+0x3c/0x40 > > > > [ 4.336765] worker_thread+0x228/0x2f8 > > > > [ 4.340530] kthread+0x114/0x124 > > > > [ 4.343771] ret_from_fork+0x10/0x18 > > > > > > > > This hunk removed that mv88e6xxx_port_setup_mac() call, and fixed Tx for me on > > > > v5.3.15: > > > > > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > > > > index d0a97eb73a37..f0457274b5a9 100644 > > > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > > > @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > > > > if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > > > > return; > > > > > > > > + if (dsa_is_cpu_port(ds, port)) > > > > + return; > > > > + > > > > if (mode == MLO_AN_FIXED) { > > > > link = LINK_FORCED_UP; > > > > speed = state->speed; > > > > > > > > Is that the right solution? > > > > > > What needs testing is: > > > > > > port@0 { > > > reg = <0>; > > > label = "cpu"; > > > ethernet = <&fec1>; > > > > > > fixed-link { > > > speed = <100>; > > > full-duplex; > > > }; > > > > > > At some point, there is a call to configure the CPU port to 100Mbps, > > > because the SoC Ethernet does not support 1G. We need to ensure this > > > does not break with your change. > > > > So maybe this: > > > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c > > index d0a97eb73a37..84ca4f36a778 100644 > > --- a/drivers/net/dsa/mv88e6xxx/chip.c > > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > > @@ -611,6 +611,9 @@ static void mv88e6xxx_mac_config(struct dsa_switch *ds, int port, > > if ((mode == MLO_AN_PHY) && mv88e6xxx_phy_is_internal(ds, port)) > > return; > > > > + if (mode != MLO_AN_FIXED && dsa_is_cpu_port(ds, port)) > > + return; > > + > > if (mode == MLO_AN_FIXED) { > > link = LINK_FORCED_UP; > > speed = state->speed; > > No, if your switch is connected to the cpu via 100mbps on the cpu side, > you have to add fixed-link node as Andrew suggested. This is what we > were trying to do with the Topaz patches, so that dsa port nodes > support fixed-link via phylink. Thanks for the clarification. This also works without the hunk above: diff --git a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts index bd881497b872..8f61cae9d3b0 100644 --- a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts +++ b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts @@ -408,6 +408,11 @@ port@5 { reg = <5>; label = "cpu"; ethernet = <&cp1_eth2>; + + fixed-link { + speed = <2500>; + full-duplex; + }; }; }; With this I get calls to mv88e6341_port_set_cmode() with PHY mode 0. So the other "fix" is still needed. Thanks, baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 19:06 ` Baruch Siach @ 2019-12-12 19:31 ` Andrew Lunn 2019-12-12 19:41 ` Marek Behun 0 siblings, 1 reply; 27+ messages in thread From: Andrew Lunn @ 2019-12-12 19:31 UTC (permalink / raw) To: Baruch Siach Cc: Marek Behun, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein > diff --git a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts > index bd881497b872..8f61cae9d3b0 100644 > --- a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts > +++ b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts > @@ -408,6 +408,11 @@ port@5 { > reg = <5>; > label = "cpu"; > ethernet = <&cp1_eth2>; > + > + fixed-link { > + speed = <2500>; > + full-duplex; > + }; > }; > }; The DSA driver is expected to configure the CPU port at its maximum speed. You should only add a fixed link if you need to slow it down. I expect 2500 is the maximum speed of this port. Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 19:31 ` Andrew Lunn @ 2019-12-12 19:41 ` Marek Behun 2019-12-12 20:49 ` Andrew Lunn 2019-12-15 10:13 ` Baruch Siach 0 siblings, 2 replies; 27+ messages in thread From: Marek Behun @ 2019-12-12 19:41 UTC (permalink / raw) To: Andrew Lunn Cc: Baruch Siach, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein On Thu, 12 Dec 2019 20:31:29 +0100 Andrew Lunn <andrew@lunn.ch> wrote: > > diff --git a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts > > index bd881497b872..8f61cae9d3b0 100644 > > --- a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts > > +++ b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts > > @@ -408,6 +408,11 @@ port@5 { > > reg = <5>; > > label = "cpu"; > > ethernet = <&cp1_eth2>; > > + > > + fixed-link { > > + speed = <2500>; > > + full-duplex; > > + }; > > }; > > }; > > The DSA driver is expected to configure the CPU port at its maximum > speed. You should only add a fixed link if you need to slow it down. > I expect 2500 is the maximum speed of this port. > > Andrew Baruch, if the cpu port is in 2500 base-x, remove the fixed-link and do this: port@5 { reg = <5>; label = "cpu"; ethernet = <&cp1_eth2>; phy-mode = "2500base-x"; managed = "in-band-status"; }; Andrew, if the dsa driver is expected to do that, the code certainly does not do so. For example in mv88e6xxx_port_set_cmode you have: /* Default to a slow mode, so freeing up SERDES interfaces for * other ports which might use them for SFPs. */ if (mode == PHY_INTERFACE_MODE_NA) mode = PHY_INTERFACE_MODE_1000BASEX; Marek ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 19:41 ` Marek Behun @ 2019-12-12 20:49 ` Andrew Lunn 2019-12-15 10:13 ` Baruch Siach 1 sibling, 0 replies; 27+ messages in thread From: Andrew Lunn @ 2019-12-12 20:49 UTC (permalink / raw) To: Marek Behun Cc: Baruch Siach, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein > Baruch, if the cpu port is in 2500 base-x, remove the fixed-link and do > this: > > port@5 { > reg = <5>; > label = "cpu"; > ethernet = <&cp1_eth2>; > phy-mode = "2500base-x"; > managed = "in-band-status"; > }; > > Andrew, if the dsa driver is expected to do that, the code certainly > does not do so. For example in mv88e6xxx_port_set_cmode you have: > /* Default to a slow mode, so freeing up SERDES interfaces for > * other ports which might use them for SFPs. > */ > if (mode == PHY_INTERFACE_MODE_NA) > mode = PHY_INTERFACE_MODE_1000BASEX; Yah, Ports 9 and 10 of 6390X are a bit odd. They can do 10Gbps. But only if you set the correct phy-mode. Then they will default to 10G. If these ports are not doing 10Gbps they can lend there SERDES interfaces to other ports. There are a few boards which want this lending, connecting lots of SFPs using these SERDES interfaces. And there are other boards which do use the ports at 10G as DSA links. DSA links also default to the maximum speed of the port, and that does work if you set the correct phy-mode. So in general, if the port supports > 1Gbps, you need to set the phy-mode for CPU and DSA ports. It will then default to the maximum speed for that mode. Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-12 19:41 ` Marek Behun 2019-12-12 20:49 ` Andrew Lunn @ 2019-12-15 10:13 ` Baruch Siach 2019-12-15 11:10 ` Baruch Siach 2019-12-18 14:30 ` Marek Behún 1 sibling, 2 replies; 27+ messages in thread From: Baruch Siach @ 2019-12-15 10:13 UTC (permalink / raw) To: Marek Behun Cc: Andrew Lunn, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Marek, On Thu, Dec 12 2019, Marek Behun wrote: > On Thu, 12 Dec 2019 20:31:29 +0100 > Andrew Lunn <andrew@lunn.ch> wrote: > >> > diff --git a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts >> > index bd881497b872..8f61cae9d3b0 100644 >> > --- a/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts >> > +++ b/arch/arm64/boot/dts/marvell/armada-8040-clearfog-gt-8k.dts >> > @@ -408,6 +408,11 @@ port@5 { >> > reg = <5>; >> > label = "cpu"; >> > ethernet = <&cp1_eth2>; >> > + >> > + fixed-link { >> > + speed = <2500>; >> > + full-duplex; >> > + }; >> > }; >> > }; >> >> The DSA driver is expected to configure the CPU port at its maximum >> speed. You should only add a fixed link if you need to slow it down. >> I expect 2500 is the maximum speed of this port. > > Baruch, if the cpu port is in 2500 base-x, remove the fixed-link and do > this: > > port@5 { > reg = <5>; > label = "cpu"; > ethernet = <&cp1_eth2>; > phy-mode = "2500base-x"; > managed = "in-band-status"; > }; Thanks. That is enough to fix the phylink issue triggered by commit 7fb5a711545 ("net: dsa: mv88e6xxx: drop adjust_link to enabled phylink"). The Clearfog GT-8K DT has also this on the cpu side: &cp1_eth2 { status = "okay"; phy-mode = "2500base-x"; phys = <&cp1_comphy5 2>; fixed-link { speed = <2500>; full-duplex; }; }; Should I drop fixed-link here as well? The call to mv88e6341_port_set_cmode() introduced in commit 7a3007d22e8 ("net: dsa: mv88e6xxx: fully support SERDES on Topaz family") still breaks port 5 (cpu) configuration. When called, its mode parameter is set to PHY_INTERFACE_MODE_2500BASEX (19). Any idea? Thanks, baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il - ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-15 10:13 ` Baruch Siach @ 2019-12-15 11:10 ` Baruch Siach 2019-12-15 14:53 ` Andrew Lunn 2019-12-18 14:30 ` Marek Behún 1 sibling, 1 reply; 27+ messages in thread From: Baruch Siach @ 2019-12-15 11:10 UTC (permalink / raw) To: Marek Behun Cc: Andrew Lunn, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Marek, On Sun, Dec 15 2019, Baruch Siach wrote: > The call to mv88e6341_port_set_cmode() introduced in commit 7a3007d22e8 > ("net: dsa: mv88e6xxx: fully support SERDES on Topaz family") still > breaks port 5 (cpu) configuration. When called, its mode parameter is > set to PHY_INTERFACE_MODE_2500BASEX (19). > > Any idea? I dug a little further here. It turns out that mv88e6xxx_port_set_cmode() does not do any register write, because it exits early here with cmode == 0xb: /* cmode doesn't change, nothing to do for us */ if (cmode == chip->ports[port].cmode) return 0; mv88e6341_port_set_cmode_writable() breaks the port configuration with its call to mv88e6xxx_port_hidden_write(). Before this call mv88e6352_port_get_cmode() sets cmode to 0xb. After this call cmode is 6. This breaks the assumption that the equality test above relies on. This fixes cpu port configuration for me: diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 7fe256c5739d..a6c320978bcf 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -427,10 +427,6 @@ static int mv88e6xxx_port_set_cmode(struct mv88e6xxx_chip *chip, int port, cmode = 0; } - /* cmode doesn't change, nothing to do for us */ - if (cmode == chip->ports[port].cmode) - return 0; - lane = mv88e6xxx_serdes_get_lane(chip, port); if (lane) { if (chip->ports[port].serdes_irq) { Does that make sense? Thanks, baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il - ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-15 11:10 ` Baruch Siach @ 2019-12-15 14:53 ` Andrew Lunn 2019-12-15 15:08 ` Baruch Siach 0 siblings, 1 reply; 27+ messages in thread From: Andrew Lunn @ 2019-12-15 14:53 UTC (permalink / raw) To: Baruch Siach Cc: Marek Behun, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein > This fixes cpu port configuration for me: > > diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c > index 7fe256c5739d..a6c320978bcf 100644 > --- a/drivers/net/dsa/mv88e6xxx/port.c > +++ b/drivers/net/dsa/mv88e6xxx/port.c > @@ -427,10 +427,6 @@ static int mv88e6xxx_port_set_cmode(struct mv88e6xxx_chip *chip, int port, > cmode = 0; > } > > - /* cmode doesn't change, nothing to do for us */ > - if (cmode == chip->ports[port].cmode) > - return 0; > - > lane = mv88e6xxx_serdes_get_lane(chip, port); > if (lane) { > if (chip->ports[port].serdes_irq) { > > Does that make sense? This needs testing on a 6390, with a port 9 or 10 using fixed link. We have had issues in the past where mac_config() has been called once per second, and each time it reconfigured the MAC, causing the link to go down/up. So we try to avoid doing work which is not requires and which could upset the link. Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-15 14:53 ` Andrew Lunn @ 2019-12-15 15:08 ` Baruch Siach 2019-12-15 16:14 ` Andrew Lunn 0 siblings, 1 reply; 27+ messages in thread From: Baruch Siach @ 2019-12-15 15:08 UTC (permalink / raw) To: Andrew Lunn Cc: Marek Behun, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein, Heiner Kallweit Hi Andrew, On Sun, Dec 15 2019, Andrew Lunn wrote: >> This fixes cpu port configuration for me: >> >> diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c >> index 7fe256c5739d..a6c320978bcf 100644 >> --- a/drivers/net/dsa/mv88e6xxx/port.c >> +++ b/drivers/net/dsa/mv88e6xxx/port.c >> @@ -427,10 +427,6 @@ static int mv88e6xxx_port_set_cmode(struct mv88e6xxx_chip *chip, int port, >> cmode = 0; >> } >> >> - /* cmode doesn't change, nothing to do for us */ >> - if (cmode == chip->ports[port].cmode) >> - return 0; >> - >> lane = mv88e6xxx_serdes_get_lane(chip, port); >> if (lane) { >> if (chip->ports[port].serdes_irq) { >> >> Does that make sense? > > This needs testing on a 6390, with a port 9 or 10 using fixed link. We > have had issues in the past where mac_config() has been called once > per second, and each time it reconfigured the MAC, causing the link to > go down/up. So we try to avoid doing work which is not requires and > which could upset the link. You refer to ed8fe20205ac ("net: dsa: mv88e6xxx: prevent interrupt storm caused by mv88e6390x_port_set_cmode") that introduced this code. The alternative is to call ->port_get_cmode() to refresh the cmode cache after mv88e6xxx_port_hidden_write(). What do you think? Thanks, baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il - ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-15 15:08 ` Baruch Siach @ 2019-12-15 16:14 ` Andrew Lunn 2019-12-17 18:26 ` Baruch Siach 0 siblings, 1 reply; 27+ messages in thread From: Andrew Lunn @ 2019-12-15 16:14 UTC (permalink / raw) To: Baruch Siach Cc: Marek Behun, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein, Heiner Kallweit On Sun, Dec 15, 2019 at 05:08:01PM +0200, Baruch Siach wrote: > Hi Andrew, > > On Sun, Dec 15 2019, Andrew Lunn wrote: > >> This fixes cpu port configuration for me: > >> > >> diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c > >> index 7fe256c5739d..a6c320978bcf 100644 > >> --- a/drivers/net/dsa/mv88e6xxx/port.c > >> +++ b/drivers/net/dsa/mv88e6xxx/port.c > >> @@ -427,10 +427,6 @@ static int mv88e6xxx_port_set_cmode(struct mv88e6xxx_chip *chip, int port, > >> cmode = 0; > >> } > >> > >> - /* cmode doesn't change, nothing to do for us */ > >> - if (cmode == chip->ports[port].cmode) > >> - return 0; > >> - > >> lane = mv88e6xxx_serdes_get_lane(chip, port); > >> if (lane) { > >> if (chip->ports[port].serdes_irq) { > >> > >> Does that make sense? > > > > This needs testing on a 6390, with a port 9 or 10 using fixed link. We > > have had issues in the past where mac_config() has been called once > > per second, and each time it reconfigured the MAC, causing the link to > > go down/up. So we try to avoid doing work which is not requires and > > which could upset the link. > > You refer to ed8fe20205ac ("net: dsa: mv88e6xxx: prevent interrupt storm > caused by mv88e6390x_port_set_cmode") that introduced this code. > > The alternative is to call ->port_get_cmode() to refresh the cmode cache > after mv88e6xxx_port_hidden_write(). Refreshing the cmode after mv88e6xxx_port_hidden_write() sounds like a good idea. But please limit it to just switches which need to make cmode writable. cmode is rather fragile and the 6390 family is easy to break in this area. Thanks Andrew ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-15 16:14 ` Andrew Lunn @ 2019-12-17 18:26 ` Baruch Siach 0 siblings, 0 replies; 27+ messages in thread From: Baruch Siach @ 2019-12-17 18:26 UTC (permalink / raw) To: Andrew Lunn Cc: Marek Behun, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein, Heiner Kallweit Hi Andrew, On Sun, Dec 15, 2019 at 05:14:23PM +0100, Andrew Lunn wrote: > On Sun, Dec 15, 2019 at 05:08:01PM +0200, Baruch Siach wrote: > > On Sun, Dec 15 2019, Andrew Lunn wrote: > > >> This fixes cpu port configuration for me: > > >> > > >> diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c > > >> index 7fe256c5739d..a6c320978bcf 100644 > > >> --- a/drivers/net/dsa/mv88e6xxx/port.c > > >> +++ b/drivers/net/dsa/mv88e6xxx/port.c > > >> @@ -427,10 +427,6 @@ static int mv88e6xxx_port_set_cmode(struct mv88e6xxx_chip *chip, int port, > > >> cmode = 0; > > >> } > > >> > > >> - /* cmode doesn't change, nothing to do for us */ > > >> - if (cmode == chip->ports[port].cmode) > > >> - return 0; > > >> - > > >> lane = mv88e6xxx_serdes_get_lane(chip, port); > > >> if (lane) { > > >> if (chip->ports[port].serdes_irq) { > > >> > > >> Does that make sense? > > > > > > This needs testing on a 6390, with a port 9 or 10 using fixed link. We > > > have had issues in the past where mac_config() has been called once > > > per second, and each time it reconfigured the MAC, causing the link to > > > go down/up. So we try to avoid doing work which is not requires and > > > which could upset the link. > > > > You refer to ed8fe20205ac ("net: dsa: mv88e6xxx: prevent interrupt storm > > caused by mv88e6390x_port_set_cmode") that introduced this code. > > > > The alternative is to call ->port_get_cmode() to refresh the cmode cache > > after mv88e6xxx_port_hidden_write(). > > Refreshing the cmode after mv88e6xxx_port_hidden_write() sounds like a > good idea. But please limit it to just switches which need to make > cmode writable. cmode is rather fragile and the 6390 family is easy to > break in this area. This turned out to be much harder than expected. cmode update after mv88e6xxx_port_hidden_write() breaks the serdes configuration. The link change irq does not trigger on serdes interface up. The strange thing is that if I add 10ms delay after cmode read (or anywhere before the mv88e6xxx_port_write() call in mv88e6xxx_port_set_cmode()) it works again. I have no idea what is going on here. The cmode cache is used in many places, so maybe setting it to an invalid value (6) is not a good idea. I ended up with cmode write force in mv88e6341_port_set_cmode_writable() like this: diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index 8a8e38bfb161..70284f100d87 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -233,6 +233,7 @@ struct mv88e6xxx_port { u64 vtu_member_violation; u64 vtu_miss_violation; u8 cmode; + bool force_cmode; bool mirror_ingress; bool mirror_egress; unsigned int serdes_irq; diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 7fe256c5739d..8e8724eb5669 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -427,9 +427,10 @@ static int mv88e6xxx_port_set_cmode(struct mv88e6xxx_chip *chip, int port, cmode = 0; } - /* cmode doesn't change, nothing to do for us */ - if (cmode == chip->ports[port].cmode) + /* cmode doesn't change, nothing to do for us unless forced */ + if (cmode == chip->ports[port].cmode && !chip->ports[port].force_cmode) return 0; + chip->ports[port].force_cmode = false; lane = mv88e6xxx_serdes_get_lane(chip, port); if (lane) { @@ -516,6 +517,8 @@ static int mv88e6341_port_set_cmode_writable(struct mv88e6xxx_chip *chip, if (port != 5) return -EOPNOTSUPP; + chip->ports[port].force_cmode = true; + addr = chip->info->port_base_addr + port; err = mv88e6xxx_port_hidden_read(chip, 0x7, addr, 0, ®); Does that look right? baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-15 10:13 ` Baruch Siach 2019-12-15 11:10 ` Baruch Siach @ 2019-12-18 14:30 ` Marek Behún 2019-12-18 16:04 ` Baruch Siach 2019-12-19 9:04 ` Baruch Siach 1 sibling, 2 replies; 27+ messages in thread From: Marek Behún @ 2019-12-18 14:30 UTC (permalink / raw) To: Baruch Siach Cc: Andrew Lunn, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Baruch, On Sun, 15 Dec 2019 12:13:25 +0200 Baruch Siach <baruch@tkos.co.il> wrote: > Thanks. That is enough to fix the phylink issue triggered by commit > 7fb5a711545 ("net: dsa: mv88e6xxx: drop adjust_link to enabled > phylink"). > > The Clearfog GT-8K DT has also this on the cpu side: > > &cp1_eth2 { > status = "okay"; > phy-mode = "2500base-x"; > phys = <&cp1_comphy5 2>; > fixed-link { > speed = <2500>; > full-duplex; > }; > }; > > Should I drop fixed-link here as well? I would think yes. phy-mode = 2500base-x should already force 2500mbps, the fixed-link should be irrelevant. Whether this is truly the case I do not know, but on Turris Mox I do not use fixed-link with these. > > The call to mv88e6341_port_set_cmode() introduced in commit > 7a3007d22e8 ("net: dsa: mv88e6xxx: fully support SERDES on Topaz > family") still breaks port 5 (cpu) configuration. When called, its > mode parameter is set to PHY_INTERFACE_MODE_2500BASEX (19). > > Any idea? I shall look into this. On Turris Mox this works, so I will have to do some experiments. > Thanks, > baruch > > -- > http://baruch.siach.name/blog/ ~. .~ Tk Open > Systems > =}------------------------------------------------ooO--U--Ooo------------{= > - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il > - ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-18 14:30 ` Marek Behún @ 2019-12-18 16:04 ` Baruch Siach 2019-12-19 9:04 ` Baruch Siach 1 sibling, 0 replies; 27+ messages in thread From: Baruch Siach @ 2019-12-18 16:04 UTC (permalink / raw) To: Marek Behún Cc: Andrew Lunn, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Marek, On Wed, Dec 18 2019, Marek Behún wrote: > On Sun, 15 Dec 2019 12:13:25 +0200 > Baruch Siach <baruch@tkos.co.il> wrote: > >> Thanks. That is enough to fix the phylink issue triggered by commit >> 7fb5a711545 ("net: dsa: mv88e6xxx: drop adjust_link to enabled >> phylink"). >> >> The Clearfog GT-8K DT has also this on the cpu side: >> >> &cp1_eth2 { >> status = "okay"; >> phy-mode = "2500base-x"; >> phys = <&cp1_comphy5 2>; >> fixed-link { >> speed = <2500>; >> full-duplex; >> }; >> }; >> >> Should I drop fixed-link here as well? > > I would think yes. phy-mode = 2500base-x should already force 2500mbps, > the fixed-link should be irrelevant. Whether this is truly the case I > do not know, but on Turris Mox I do not use fixed-link with these. OK. I'll test that. >> The call to mv88e6341_port_set_cmode() introduced in commit >> 7a3007d22e8 ("net: dsa: mv88e6xxx: fully support SERDES on Topaz >> family") still breaks port 5 (cpu) configuration. When called, its >> mode parameter is set to PHY_INTERFACE_MODE_2500BASEX (19). >> >> Any idea? > > I shall look into this. On Turris Mox this works, so I will have to do > some experiments. I have made some progress on this issue. See my other messages on this thread. My latest suggestion is here: https://lore.kernel.org/netdev/20191217182643.augknhx57pafnelv@sapphire.tkos.co.il/ I now think that the force_cmode field is not necessary. We should just add a 'force' parameter to mv88e6xxx_port_set_cmode(), and set it true in mv88e6341_port_set_cmode(). baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il - ^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [BUG] mv88e6xxx: tx regression in v5.3 2019-12-18 14:30 ` Marek Behún 2019-12-18 16:04 ` Baruch Siach @ 2019-12-19 9:04 ` Baruch Siach 1 sibling, 0 replies; 27+ messages in thread From: Baruch Siach @ 2019-12-19 9:04 UTC (permalink / raw) To: Marek Behún Cc: Andrew Lunn, Vivien Didelot, netdev, Denis Odintsov, Hubert Feurstein Hi Marek, On Wed, Dec 18, 2019 at 03:30:35PM +0100, Marek Behún wrote: > On Sun, 15 Dec 2019 12:13:25 +0200 > Baruch Siach <baruch@tkos.co.il> wrote: > > > Thanks. That is enough to fix the phylink issue triggered by commit > > 7fb5a711545 ("net: dsa: mv88e6xxx: drop adjust_link to enabled > > phylink"). > > > > The Clearfog GT-8K DT has also this on the cpu side: > > > > &cp1_eth2 { > > status = "okay"; > > phy-mode = "2500base-x"; > > phys = <&cp1_comphy5 2>; > > fixed-link { > > speed = <2500>; > > full-duplex; > > }; > > }; > > > > Should I drop fixed-link here as well? > > I would think yes. phy-mode = 2500base-x should already force 2500mbps, > the fixed-link should be irrelevant. Whether this is truly the case I > do not know, but on Turris Mox I do not use fixed-link with these. It doesn't work here without fixed-link. The link state remains down. baruch > > The call to mv88e6341_port_set_cmode() introduced in commit > > 7a3007d22e8 ("net: dsa: mv88e6xxx: fully support SERDES on Topaz > > family") still breaks port 5 (cpu) configuration. When called, its > > mode parameter is set to PHY_INTERFACE_MODE_2500BASEX (19). > > > > Any idea? > > I shall look into this. On Turris Mox this works, so I will have to do > some experiments. -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}------------------------------------------------ooO--U--Ooo------------{= - baruch@tkos.co.il - tel: +972.2.679.5364, http://www.tkos.co.il - ^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2019-12-19 9:04 UTC | newest] Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-12-11 9:48 [BUG] mv88e6xxx: tx regression in v5.3 Baruch Siach 2019-12-11 13:11 ` Andrew Lunn 2019-12-11 17:07 ` Baruch Siach 2019-12-11 17:49 ` Andrew Lunn 2019-12-12 8:50 ` Baruch Siach 2019-12-12 13:14 ` Andrew Lunn 2019-12-12 14:19 ` Andreas Tobler 2019-12-12 15:08 ` Baruch Siach 2019-12-12 15:13 ` Andrew Lunn 2019-12-12 15:23 ` Baruch Siach 2019-12-12 18:17 ` Baruch Siach 2019-12-12 18:32 ` Marek Behun 2019-12-12 18:48 ` Baruch Siach 2019-12-12 18:36 ` Marek Behun 2019-12-12 19:06 ` Baruch Siach 2019-12-12 19:31 ` Andrew Lunn 2019-12-12 19:41 ` Marek Behun 2019-12-12 20:49 ` Andrew Lunn 2019-12-15 10:13 ` Baruch Siach 2019-12-15 11:10 ` Baruch Siach 2019-12-15 14:53 ` Andrew Lunn 2019-12-15 15:08 ` Baruch Siach 2019-12-15 16:14 ` Andrew Lunn 2019-12-17 18:26 ` Baruch Siach 2019-12-18 14:30 ` Marek Behún 2019-12-18 16:04 ` Baruch Siach 2019-12-19 9:04 ` Baruch Siach
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.