All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v3 0/7] Polling be gone on LAN95xx
@ 2022-05-12  8:42 Lukas Wunner
  2022-05-12  8:42 ` [PATCH net-next v3 1/7] usbnet: Run unregister_netdev() before unbind() again Lukas Wunner
                   ` (7 more replies)
  0 siblings, 8 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12  8:42 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth

Do away with link status polling on LAN95xx USB Ethernet
and rely on interrupts instead, thereby reducing bus traffic,
CPU overhead and improving interface bringup latency.

Link to v2:
https://lore.kernel.org/netdev/cover.1651574194.git.lukas@wunner.de/

Only change since v2:

* Patch [5/7]:
  * Drop call to __irq_enter_raw() which worked around a warning in
    generic_handle_domain_irq().  That warning is gone since
    792ea6a074ae (queued on tip.git/irq/urgent).
    (Marc Zyngier, Thomas Gleixner)


Lukas Wunner (7):
  usbnet: Run unregister_netdev() before unbind() again
  usbnet: smsc95xx: Don't clear read-only PHY interrupt
  usbnet: smsc95xx: Don't reset PHY behind PHY driver's back
  usbnet: smsc95xx: Avoid link settings race on interrupt reception
  usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid
    polling
  net: phy: smsc: Cache interrupt mask
  net: phy: smsc: Cope with hot-removal in interrupt handler

 drivers/net/phy/smsc.c         |  28 +++---
 drivers/net/usb/asix_devices.c |   6 +-
 drivers/net/usb/smsc95xx.c     | 152 +++++++++++++++------------------
 drivers/net/usb/usbnet.c       |   6 +-
 4 files changed, 88 insertions(+), 104 deletions(-)

-- 
2.35.2


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH net-next v3 1/7] usbnet: Run unregister_netdev() before unbind() again
  2022-05-12  8:42 [PATCH net-next v3 0/7] Polling be gone on LAN95xx Lukas Wunner
@ 2022-05-12  8:42 ` Lukas Wunner
  2022-05-12  8:42 ` [PATCH net-next v3 2/7] usbnet: smsc95xx: Don't clear read-only PHY interrupt Lukas Wunner
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12  8:42 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth

Commit 2c9d6c2b871d ("usbnet: run unbind() before unregister_netdev()")
sought to fix a use-after-free on disconnect of USB Ethernet adapters.

It turns out that a different fix is necessary to address the issue:
https://lore.kernel.org/netdev/18b3541e5372bc9b9fc733d422f4e698c089077c.1650177997.git.lukas@wunner.de/

So the commit was not necessary.

The commit made binding and unbinding of USB Ethernet asymmetrical:
Before, usbnet_probe() first invoked the ->bind() callback and then
register_netdev().  usbnet_disconnect() mirrored that by first invoking
unregister_netdev() and then ->unbind().

Since the commit, the order in usbnet_disconnect() is reversed and no
longer mirrors usbnet_probe().

One consequence is that a PHY disconnected (and stopped) in ->unbind()
is afterwards stopped once more by unregister_netdev() as it closes the
netdev before unregistering.  That necessitates a contortion in ->stop()
because the PHY may only be stopped if it hasn't already been
disconnected.

Reverting the commit allows making the call to phy_stop() unconditional
in ->stop().

Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Oliver Neukum <oneukum@suse.com>
Cc: Martyn Welch <martyn.welch@collabora.com>
Cc: Andrew Lunn <andrew@lunn.ch>
---
 drivers/net/usb/asix_devices.c | 6 +-----
 drivers/net/usb/smsc95xx.c     | 3 +--
 drivers/net/usb/usbnet.c       | 6 +++---
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/net/usb/asix_devices.c b/drivers/net/usb/asix_devices.c
index 38e47a93fb83..5b5eb630c4b7 100644
--- a/drivers/net/usb/asix_devices.c
+++ b/drivers/net/usb/asix_devices.c
@@ -795,11 +795,7 @@ static int ax88772_stop(struct usbnet *dev)
 {
 	struct asix_common_private *priv = dev->driver_priv;
 
-	/* On unplugged USB, we will get MDIO communication errors and the
-	 * PHY will be set in to PHY_HALTED state.
-	 */
-	if (priv->phydev->state != PHY_HALTED)
-		phy_stop(priv->phydev);
+	phy_stop(priv->phydev);
 
 	return 0;
 }
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index 4ef61f6b85df..edf0492ad489 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -1243,8 +1243,7 @@ static int smsc95xx_start_phy(struct usbnet *dev)
 
 static int smsc95xx_stop(struct usbnet *dev)
 {
-	if (dev->net->phydev)
-		phy_stop(dev->net->phydev);
+	phy_stop(dev->net->phydev);
 
 	return 0;
 }
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 9a6450f796dc..36b24ec11650 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1616,9 +1616,6 @@ void usbnet_disconnect (struct usb_interface *intf)
 		   xdev->bus->bus_name, xdev->devpath,
 		   dev->driver_info->description);
 
-	if (dev->driver_info->unbind)
-		dev->driver_info->unbind(dev, intf);
-
 	net = dev->net;
 	unregister_netdev (net);
 
@@ -1626,6 +1623,9 @@ void usbnet_disconnect (struct usb_interface *intf)
 
 	usb_scuttle_anchored_urbs(&dev->deferred);
 
+	if (dev->driver_info->unbind)
+		dev->driver_info->unbind(dev, intf);
+
 	usb_kill_urb(dev->interrupt);
 	usb_free_urb(dev->interrupt);
 	kfree(dev->padding_pkt);
-- 
2.35.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH net-next v3 2/7] usbnet: smsc95xx: Don't clear read-only PHY interrupt
  2022-05-12  8:42 [PATCH net-next v3 0/7] Polling be gone on LAN95xx Lukas Wunner
  2022-05-12  8:42 ` [PATCH net-next v3 1/7] usbnet: Run unregister_netdev() before unbind() again Lukas Wunner
@ 2022-05-12  8:42 ` Lukas Wunner
  2022-05-12 11:57   ` Lukas Wunner
  2022-05-12  8:42 ` [PATCH net-next v3 3/7] usbnet: smsc95xx: Don't reset PHY behind PHY driver's back Lukas Wunner
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12  8:42 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth

Upon receiving data from the Interrupt Endpoint, the SMSC LAN95xx driver
attempts to clear the signaled interrupts by writing "all ones" to the
Interrupt Status Register.

However the driver only ever enables a single type of interrupt, namely
the PHY Interrupt.  And according to page 119 of the LAN950x datasheet,
its bit in the Interrupt Status Register is read-only.  There's no other
way to clear it than in a separate PHY register:

https://www.microchip.com/content/dam/mchp/documents/UNG/ProductDocuments/DataSheets/LAN950x-Data-Sheet-DS00001875D.pdf

Consequently, writing "all ones" to the Interrupt Status Register is
pointless and can be dropped.

Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/net/usb/smsc95xx.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index edf0492ad489..2cb44d65bbc3 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -572,10 +572,6 @@ static int smsc95xx_link_reset(struct usbnet *dev)
 	unsigned long flags;
 	int ret;
 
-	ret = smsc95xx_write_reg(dev, INT_STS, INT_STS_CLEAR_ALL_);
-	if (ret < 0)
-		return ret;
-
 	spin_lock_irqsave(&pdata->mac_cr_lock, flags);
 	if (pdata->phydev->duplex != DUPLEX_FULL) {
 		pdata->mac_cr &= ~MAC_CR_FDPX_;
-- 
2.35.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH net-next v3 3/7] usbnet: smsc95xx: Don't reset PHY behind PHY driver's back
  2022-05-12  8:42 [PATCH net-next v3 0/7] Polling be gone on LAN95xx Lukas Wunner
  2022-05-12  8:42 ` [PATCH net-next v3 1/7] usbnet: Run unregister_netdev() before unbind() again Lukas Wunner
  2022-05-12  8:42 ` [PATCH net-next v3 2/7] usbnet: smsc95xx: Don't clear read-only PHY interrupt Lukas Wunner
@ 2022-05-12  8:42 ` Lukas Wunner
  2022-05-12 12:12   ` Lukas Wunner
  2022-05-12  8:42 ` [PATCH net-next v3 4/7] usbnet: smsc95xx: Avoid link settings race on interrupt reception Lukas Wunner
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12  8:42 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth

smsc95xx_reset() resets the PHY behind the PHY driver's back, which
seems like a bad idea generally.  Remove that portion of the function.

We're about to use PHY interrupts instead of polling to detect link
changes on SMSC LAN95xx chips.  Because smsc95xx_reset() is called from
usbnet_open(), PHY interrupt settings are lost whenever the net_device
is brought up.

There are two other callers of smsc95xx_reset(), namely smsc95xx_bind()
and smsc95xx_reset_resume(), and both may indeed benefit from a PHY
reset.  However they already perform one through their calls to
phy_connect_direct() and phy_init_hw().

Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: Martyn Welch <martyn.welch@collabora.com>
Cc: Gabriel Hojda <ghojda@yo2urs.ro>
---
 drivers/net/usb/smsc95xx.c | 18 ------------------
 1 file changed, 18 deletions(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index 2cb44d65bbc3..6c37c7adde1b 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -887,24 +887,6 @@ static int smsc95xx_reset(struct usbnet *dev)
 		return ret;
 	}
 
-	ret = smsc95xx_write_reg(dev, PM_CTRL, PM_CTL_PHY_RST_);
-	if (ret < 0)
-		return ret;
-
-	timeout = 0;
-	do {
-		msleep(10);
-		ret = smsc95xx_read_reg(dev, PM_CTRL, &read_buf);
-		if (ret < 0)
-			return ret;
-		timeout++;
-	} while ((read_buf & PM_CTL_PHY_RST_) && (timeout < 100));
-
-	if (timeout >= 100) {
-		netdev_warn(dev->net, "timeout waiting for PHY Reset\n");
-		return ret;
-	}
-
 	ret = smsc95xx_set_mac_address(dev);
 	if (ret < 0)
 		return ret;
-- 
2.35.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH net-next v3 4/7] usbnet: smsc95xx: Avoid link settings race on interrupt reception
  2022-05-12  8:42 [PATCH net-next v3 0/7] Polling be gone on LAN95xx Lukas Wunner
                   ` (2 preceding siblings ...)
  2022-05-12  8:42 ` [PATCH net-next v3 3/7] usbnet: smsc95xx: Don't reset PHY behind PHY driver's back Lukas Wunner
@ 2022-05-12  8:42 ` Lukas Wunner
  2022-05-12 12:15   ` Lukas Wunner
  2022-05-12  8:42 ` [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling Lukas Wunner
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12  8:42 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth

When a PHY interrupt is signaled, the SMSC LAN95xx driver updates the
MAC full duplex mode and PHY flow control registers based on cached data
in struct phy_device:

  smsc95xx_status()                 # raises EVENT_LINK_RESET
    usbnet_deferred_kevent()
      smsc95xx_link_reset()         # uses cached data in phydev

Simultaneously, phylib polls link status once per second and updates
that cached data:

  phy_state_machine()
    phy_check_link_status()
      phy_read_status()
        lan87xx_read_status()
          genphy_read_status()      # updates cached data in phydev

If smsc95xx_link_reset() wins the race against genphy_read_status(),
the registers may be updated based on stale data.

E.g. if the link was previously down, phydev->duplex is set to
DUPLEX_UNKNOWN and that's what smsc95xx_link_reset() will use, even
though genphy_read_status() may update it to DUPLEX_FULL afterwards.

PHY interrupts are currently only enabled on suspend to trigger wakeup,
so the impact of the race is limited, but we're about to enable them
perpetually.

Avoid the race by delaying execution of smsc95xx_link_reset() until
phy_state_machine() has done its job and calls back via
smsc95xx_handle_link_change().

Signaling EVENT_LINK_RESET on wakeup is not necessary because phylib
picks up link status changes through polling.  So drop the declaration
of a ->link_reset() callback.

Note that the semicolon on a line by itself added in smsc95xx_status()
is a placeholder for a function call which will be added in a subsequent
commit.  That function call will actually handle the INT_ENP_PHY_INT_
interrupt.

Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/net/usb/smsc95xx.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index 6c37c7adde1b..6309ff8e75de 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -566,7 +566,7 @@ static int smsc95xx_phy_update_flowcontrol(struct usbnet *dev)
 	return smsc95xx_write_reg(dev, AFC_CFG, afc_cfg);
 }
 
-static int smsc95xx_link_reset(struct usbnet *dev)
+static void smsc95xx_mac_update_fullduplex(struct usbnet *dev)
 {
 	struct smsc95xx_priv *pdata = dev->driver_priv;
 	unsigned long flags;
@@ -583,14 +583,16 @@ static int smsc95xx_link_reset(struct usbnet *dev)
 	spin_unlock_irqrestore(&pdata->mac_cr_lock, flags);
 
 	ret = smsc95xx_write_reg(dev, MAC_CR, pdata->mac_cr);
-	if (ret < 0)
-		return ret;
+	if (ret < 0) {
+		if (ret != -ENODEV)
+			netdev_warn(dev->net,
+				    "Error updating MAC full duplex mode\n");
+		return;
+	}
 
 	ret = smsc95xx_phy_update_flowcontrol(dev);
 	if (ret < 0)
 		netdev_warn(dev->net, "Error updating PHY flow control\n");
-
-	return ret;
 }
 
 static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
@@ -607,7 +609,7 @@ static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
 	netif_dbg(dev, link, dev->net, "intdata: 0x%08X\n", intdata);
 
 	if (intdata & INT_ENP_PHY_INT_)
-		usbnet_defer_kevent(dev, EVENT_LINK_RESET);
+		;
 	else
 		netdev_warn(dev->net, "unexpected interrupt, intdata=0x%08X\n",
 			    intdata);
@@ -1070,6 +1072,7 @@ static void smsc95xx_handle_link_change(struct net_device *net)
 	struct usbnet *dev = netdev_priv(net);
 
 	phy_print_status(net->phydev);
+	smsc95xx_mac_update_fullduplex(dev);
 	usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
 }
 
@@ -1975,7 +1978,6 @@ static const struct driver_info smsc95xx_info = {
 	.description	= "smsc95xx USB 2.0 Ethernet",
 	.bind		= smsc95xx_bind,
 	.unbind		= smsc95xx_unbind,
-	.link_reset	= smsc95xx_link_reset,
 	.reset		= smsc95xx_reset,
 	.check_connect	= smsc95xx_start_phy,
 	.stop		= smsc95xx_stop,
-- 
2.35.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-12  8:42 [PATCH net-next v3 0/7] Polling be gone on LAN95xx Lukas Wunner
                   ` (3 preceding siblings ...)
  2022-05-12  8:42 ` [PATCH net-next v3 4/7] usbnet: smsc95xx: Avoid link settings race on interrupt reception Lukas Wunner
@ 2022-05-12  8:42 ` Lukas Wunner
       [not found]   ` <CGME20220517101846eucas1p2c132f7e7032ed00996e222e9cc6cdf99@eucas1p2.samsung.com>
  2022-05-12  8:42 ` [PATCH net-next v3 6/7] net: phy: smsc: Cache interrupt mask Lukas Wunner
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12  8:42 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth

Link status of SMSC LAN95xx chips is polled once per second, even though
they're capable of signaling PHY interrupts through the MAC layer.

Forward those interrupts to the PHY driver to avoid polling.  Benefits
are reduced bus traffic, reduced CPU overhead and quicker interface
bringup.

Polling was introduced in 2016 by commit d69d16949346 ("usbnet:
smsc95xx: fix link detection for disabled autonegotiation").
Back then, the LAN95xx driver neglected to enable the ENERGYON interrupt,
hence couldn't detect link-up events when auto-negotiation was disabled.
The proper solution would have been to enable the ENERGYON interrupt
instead of polling.

Since then, PHY handling was moved from the LAN95xx driver to the SMSC
PHY driver with commit 05b35e7eb9a1 ("smsc95xx: add phylib support").
That PHY driver is capable of link detection with auto-negotiation
disabled because it enables the ENERGYON interrupt.

Note that signaling interrupts through the MAC layer not only works with
the integrated PHY, but also with an external PHY, provided its
interrupt pin is attached to LAN95xx's nPHY_INT pin.

In the unlikely event that the interrupt pin of an external PHY is
attached to a GPIO of the SoC (or not connected at all), the driver can
be amended to retrieve the irq from the PHY's of_node.

To forward PHY interrupts to phylib, it is not sufficient to call
phy_mac_interrupt().  Instead, the PHY's interrupt handler needs to run
so that PHY interrupts are cleared.  That's because according to page
119 of the LAN950x datasheet, "The source of this interrupt is a level.
The interrupt persists until it is cleared in the PHY."

https://www.microchip.com/content/dam/mchp/documents/UNG/ProductDocuments/DataSheets/LAN950x-Data-Sheet-DS00001875D.pdf

Therefore, create an IRQ domain with a single IRQ for the PHY.  In the
future, the IRQ domain may be extended to support the 11 GPIOs on the
LAN95xx.

Normally the PHY interrupt should be masked until the PHY driver has
cleared it.  However masking requires a (sleeping) USB transaction and
interrupts are received in (non-sleepable) softirq context.  I decided
not to mask the interrupt at all (by using the dummy_irq_chip's noop
->irq_mask() callback):  The USB interrupt endpoint is polled in 1 msec
intervals and normally that's sufficient to wake the PHY driver's IRQ
thread and have it clear the interrupt.  If it does take longer, worst
thing that can happen is the IRQ thread is woken again.  No big deal.

Because PHY interrupts are now perpetually enabled, there's no need to
selectively enable them on suspend.  So remove all invocations of
smsc95xx_enable_phy_wakeup_interrupts().

In smsc95xx_resume(), move the call of phy_init_hw() before
usbnet_resume() (which restarts the status URB) to ensure that the PHY
is fully initialized when an interrupt is handled.

Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch> # from a PHY perspective
Cc: Andre Edich <andre.edich@microchip.com>
---
Only change since v2:
 * Drop call to __irq_enter_raw() which worked around a warning in
   generic_handle_domain_irq().  That warning is gone since
   792ea6a074ae (queued on tip.git/irq/urgent).
   (Marc Zyngier, Thomas Gleixner)

 drivers/net/usb/smsc95xx.c | 113 ++++++++++++++++++++-----------------
 1 file changed, 61 insertions(+), 52 deletions(-)

diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index 6309ff8e75de..bd03e16f98a1 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -18,6 +18,8 @@
 #include <linux/usb/usbnet.h>
 #include <linux/slab.h>
 #include <linux/of_net.h>
+#include <linux/irq.h>
+#include <linux/irqdomain.h>
 #include <linux/mdio.h>
 #include <linux/phy.h>
 #include <net/selftests.h>
@@ -53,6 +55,9 @@
 #define SUSPEND_ALLMODES		(SUSPEND_SUSPEND0 | SUSPEND_SUSPEND1 | \
 					 SUSPEND_SUSPEND2 | SUSPEND_SUSPEND3)
 
+#define SMSC95XX_NR_IRQS		(1) /* raise to 12 for GPIOs */
+#define PHY_HWIRQ			(SMSC95XX_NR_IRQS - 1)
+
 struct smsc95xx_priv {
 	u32 mac_cr;
 	u32 hash_hi;
@@ -61,6 +66,9 @@ struct smsc95xx_priv {
 	spinlock_t mac_cr_lock;
 	u8 features;
 	u8 suspend_flags;
+	struct irq_chip irqchip;
+	struct irq_domain *irqdomain;
+	struct fwnode_handle *irqfwnode;
 	struct mii_bus *mdiobus;
 	struct phy_device *phydev;
 };
@@ -597,6 +605,8 @@ static void smsc95xx_mac_update_fullduplex(struct usbnet *dev)
 
 static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
 {
+	struct smsc95xx_priv *pdata = dev->driver_priv;
+	unsigned long flags;
 	u32 intdata;
 
 	if (urb->actual_length != 4) {
@@ -608,11 +618,15 @@ static void smsc95xx_status(struct usbnet *dev, struct urb *urb)
 	intdata = get_unaligned_le32(urb->transfer_buffer);
 	netif_dbg(dev, link, dev->net, "intdata: 0x%08X\n", intdata);
 
+	local_irq_save(flags);
+
 	if (intdata & INT_ENP_PHY_INT_)
-		;
+		generic_handle_domain_irq(pdata->irqdomain, PHY_HWIRQ);
 	else
 		netdev_warn(dev->net, "unexpected interrupt, intdata=0x%08X\n",
 			    intdata);
+
+	local_irq_restore(flags);
 }
 
 /* Enable or disable Tx & Rx checksum offload engines */
@@ -1080,8 +1094,9 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
 {
 	struct smsc95xx_priv *pdata;
 	bool is_internal_phy;
+	char usb_path[64];
+	int ret, phy_irq;
 	u32 val;
-	int ret;
 
 	printk(KERN_INFO SMSC_CHIPNAME " v" SMSC_DRIVER_VERSION "\n");
 
@@ -1121,10 +1136,38 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
 	if (ret)
 		goto free_pdata;
 
+	/* create irq domain for use by PHY driver and GPIO consumers */
+	usb_make_path(dev->udev, usb_path, sizeof(usb_path));
+	pdata->irqfwnode = irq_domain_alloc_named_fwnode(usb_path);
+	if (!pdata->irqfwnode) {
+		ret = -ENOMEM;
+		goto free_pdata;
+	}
+
+	pdata->irqdomain = irq_domain_create_linear(pdata->irqfwnode,
+						    SMSC95XX_NR_IRQS,
+						    &irq_domain_simple_ops,
+						    pdata);
+	if (!pdata->irqdomain) {
+		ret = -ENOMEM;
+		goto free_irqfwnode;
+	}
+
+	phy_irq = irq_create_mapping(pdata->irqdomain, PHY_HWIRQ);
+	if (!phy_irq) {
+		ret = -ENOENT;
+		goto remove_irqdomain;
+	}
+
+	pdata->irqchip = dummy_irq_chip;
+	pdata->irqchip.name = SMSC_CHIPNAME;
+	irq_set_chip_and_handler_name(phy_irq, &pdata->irqchip,
+				      handle_simple_irq, "phy");
+
 	pdata->mdiobus = mdiobus_alloc();
 	if (!pdata->mdiobus) {
 		ret = -ENOMEM;
-		goto free_pdata;
+		goto dispose_irq;
 	}
 
 	ret = smsc95xx_read_reg(dev, HW_CFG, &val);
@@ -1157,6 +1200,7 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
 		goto unregister_mdio;
 	}
 
+	pdata->phydev->irq = phy_irq;
 	pdata->phydev->is_internal = is_internal_phy;
 
 	/* detect device revision as different features may be available */
@@ -1199,6 +1243,15 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
 free_mdio:
 	mdiobus_free(pdata->mdiobus);
 
+dispose_irq:
+	irq_dispose_mapping(phy_irq);
+
+remove_irqdomain:
+	irq_domain_remove(pdata->irqdomain);
+
+free_irqfwnode:
+	irq_domain_free_fwnode(pdata->irqfwnode);
+
 free_pdata:
 	kfree(pdata);
 	return ret;
@@ -1211,6 +1264,9 @@ static void smsc95xx_unbind(struct usbnet *dev, struct usb_interface *intf)
 	phy_disconnect(dev->net->phydev);
 	mdiobus_unregister(pdata->mdiobus);
 	mdiobus_free(pdata->mdiobus);
+	irq_dispose_mapping(irq_find_mapping(pdata->irqdomain, PHY_HWIRQ));
+	irq_domain_remove(pdata->irqdomain);
+	irq_domain_free_fwnode(pdata->irqfwnode);
 	netif_dbg(dev, ifdown, dev->net, "free pdata\n");
 	kfree(pdata);
 }
@@ -1235,29 +1291,6 @@ static u32 smsc_crc(const u8 *buffer, size_t len, int filter)
 	return crc << ((filter % 2) * 16);
 }
 
-static int smsc95xx_enable_phy_wakeup_interrupts(struct usbnet *dev, u16 mask)
-{
-	int ret;
-
-	netdev_dbg(dev->net, "enabling PHY wakeup interrupts\n");
-
-	/* read to clear */
-	ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_SRC);
-	if (ret < 0)
-		return ret;
-
-	/* enable interrupt source */
-	ret = smsc95xx_mdio_read_nopm(dev, PHY_INT_MASK);
-	if (ret < 0)
-		return ret;
-
-	ret |= mask;
-
-	smsc95xx_mdio_write_nopm(dev, PHY_INT_MASK, ret);
-
-	return 0;
-}
-
 static int smsc95xx_link_ok_nopm(struct usbnet *dev)
 {
 	int ret;
@@ -1424,7 +1457,6 @@ static int smsc95xx_enter_suspend3(struct usbnet *dev)
 static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
 {
 	struct smsc95xx_priv *pdata = dev->driver_priv;
-	int ret;
 
 	if (!netif_running(dev->net)) {
 		/* interface is ifconfig down so fully power down hw */
@@ -1443,27 +1475,10 @@ static int smsc95xx_autosuspend(struct usbnet *dev, u32 link_up)
 		}
 
 		netdev_dbg(dev->net, "autosuspend entering SUSPEND1\n");
-
-		/* enable PHY wakeup events for if cable is attached */
-		ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
-			PHY_INT_MASK_ANEG_COMP_);
-		if (ret < 0) {
-			netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
-			return ret;
-		}
-
 		netdev_info(dev->net, "entering SUSPEND1 mode\n");
 		return smsc95xx_enter_suspend1(dev);
 	}
 
-	/* enable PHY wakeup events so we remote wakeup if cable is pulled */
-	ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
-		PHY_INT_MASK_LINK_DOWN_);
-	if (ret < 0) {
-		netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
-		return ret;
-	}
-
 	netdev_dbg(dev->net, "autosuspend entering SUSPEND3\n");
 	return smsc95xx_enter_suspend3(dev);
 }
@@ -1529,13 +1544,6 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 	}
 
 	if (pdata->wolopts & WAKE_PHY) {
-		ret = smsc95xx_enable_phy_wakeup_interrupts(dev,
-			(PHY_INT_MASK_ANEG_COMP_ | PHY_INT_MASK_LINK_DOWN_));
-		if (ret < 0) {
-			netdev_warn(dev->net, "error enabling PHY wakeup ints\n");
-			goto done;
-		}
-
 		/* if link is down then configure EDPD and enter SUSPEND1,
 		 * otherwise enter SUSPEND0 below
 		 */
@@ -1769,11 +1777,12 @@ static int smsc95xx_resume(struct usb_interface *intf)
 			return ret;
 	}
 
+	phy_init_hw(pdata->phydev);
+
 	ret = usbnet_resume(intf);
 	if (ret < 0)
 		netdev_warn(dev->net, "usbnet_resume error\n");
 
-	phy_init_hw(pdata->phydev);
 	return ret;
 }
 
-- 
2.35.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH net-next v3 6/7] net: phy: smsc: Cache interrupt mask
  2022-05-12  8:42 [PATCH net-next v3 0/7] Polling be gone on LAN95xx Lukas Wunner
                   ` (4 preceding siblings ...)
  2022-05-12  8:42 ` [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling Lukas Wunner
@ 2022-05-12  8:42 ` Lukas Wunner
  2022-05-12 12:17   ` Lukas Wunner
  2022-05-12  8:42 ` [PATCH net-next v3 7/7] net: phy: smsc: Cope with hot-removal in interrupt handler Lukas Wunner
  2022-05-13 10:40 ` [PATCH net-next v3 0/7] Polling be gone on LAN95xx patchwork-bot+netdevbpf
  7 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12  8:42 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth

Cache the interrupt mask to avoid re-reading it from the PHY upon every
interrupt.

This will simplify a subsequent commit which detects hot-removal in the
interrupt handler and bails out.

Analyzing and debugging PHY transactions also becomes simpler if such
redundant reads are avoided.

Last not least, interrupt overhead and latency is slightly improved.

Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/net/phy/smsc.c | 24 +++++++++++-------------
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
index 92225d0fc246..e0eadea4b4b5 100644
--- a/drivers/net/phy/smsc.c
+++ b/drivers/net/phy/smsc.c
@@ -44,6 +44,7 @@ static struct smsc_hw_stat smsc_hw_stats[] = {
 };
 
 struct smsc_phy_priv {
+	u16 intmask;
 	bool energy_enable;
 	struct clk *refclk;
 };
@@ -58,7 +59,6 @@ static int smsc_phy_ack_interrupt(struct phy_device *phydev)
 static int smsc_phy_config_intr(struct phy_device *phydev)
 {
 	struct smsc_phy_priv *priv = phydev->priv;
-	u16 intmask = 0;
 	int rc;
 
 	if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
@@ -66,12 +66,15 @@ static int smsc_phy_config_intr(struct phy_device *phydev)
 		if (rc)
 			return rc;
 
-		intmask = MII_LAN83C185_ISF_INT4 | MII_LAN83C185_ISF_INT6;
+		priv->intmask = MII_LAN83C185_ISF_INT4 | MII_LAN83C185_ISF_INT6;
 		if (priv->energy_enable)
-			intmask |= MII_LAN83C185_ISF_INT7;
-		rc = phy_write(phydev, MII_LAN83C185_IM, intmask);
+			priv->intmask |= MII_LAN83C185_ISF_INT7;
+
+		rc = phy_write(phydev, MII_LAN83C185_IM, priv->intmask);
 	} else {
-		rc = phy_write(phydev, MII_LAN83C185_IM, intmask);
+		priv->intmask = 0;
+
+		rc = phy_write(phydev, MII_LAN83C185_IM, 0);
 		if (rc)
 			return rc;
 
@@ -83,13 +86,8 @@ static int smsc_phy_config_intr(struct phy_device *phydev)
 
 static irqreturn_t smsc_phy_handle_interrupt(struct phy_device *phydev)
 {
-	int irq_status, irq_enabled;
-
-	irq_enabled = phy_read(phydev, MII_LAN83C185_IM);
-	if (irq_enabled < 0) {
-		phy_error(phydev);
-		return IRQ_NONE;
-	}
+	struct smsc_phy_priv *priv = phydev->priv;
+	int irq_status;
 
 	irq_status = phy_read(phydev, MII_LAN83C185_ISF);
 	if (irq_status < 0) {
@@ -97,7 +95,7 @@ static irqreturn_t smsc_phy_handle_interrupt(struct phy_device *phydev)
 		return IRQ_NONE;
 	}
 
-	if (!(irq_status & irq_enabled))
+	if (!(irq_status & priv->intmask))
 		return IRQ_NONE;
 
 	phy_trigger_machine(phydev);
-- 
2.35.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH net-next v3 7/7] net: phy: smsc: Cope with hot-removal in interrupt handler
  2022-05-12  8:42 [PATCH net-next v3 0/7] Polling be gone on LAN95xx Lukas Wunner
                   ` (5 preceding siblings ...)
  2022-05-12  8:42 ` [PATCH net-next v3 6/7] net: phy: smsc: Cache interrupt mask Lukas Wunner
@ 2022-05-12  8:42 ` Lukas Wunner
  2022-05-12 12:19   ` Lukas Wunner
  2022-05-13 10:40 ` [PATCH net-next v3 0/7] Polling be gone on LAN95xx patchwork-bot+netdevbpf
  7 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12  8:42 UTC (permalink / raw)
  To: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth

If reading the Interrupt Source Flag register fails with -ENODEV, then
the PHY has been hot-removed and the correct response is to bail out
instead of throwing a WARN splat and attempting to suspend the PHY.
The PHY should be stopped in due course anyway as the kernel
asynchronously tears down the device.

Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
Signed-off-by: Lukas Wunner <lukas@wunner.de>
---
 drivers/net/phy/smsc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/smsc.c b/drivers/net/phy/smsc.c
index e0eadea4b4b5..1b54684b68a0 100644
--- a/drivers/net/phy/smsc.c
+++ b/drivers/net/phy/smsc.c
@@ -91,7 +91,9 @@ static irqreturn_t smsc_phy_handle_interrupt(struct phy_device *phydev)
 
 	irq_status = phy_read(phydev, MII_LAN83C185_ISF);
 	if (irq_status < 0) {
-		phy_error(phydev);
+		if (irq_status != -ENODEV)
+			phy_error(phydev);
+
 		return IRQ_NONE;
 	}
 
-- 
2.35.2


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 2/7] usbnet: smsc95xx: Don't clear read-only PHY interrupt
  2022-05-12  8:42 ` [PATCH net-next v3 2/7] usbnet: smsc95xx: Don't clear read-only PHY interrupt Lukas Wunner
@ 2022-05-12 11:57   ` Lukas Wunner
  0 siblings, 0 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12 11:57 UTC (permalink / raw)
  To: netdev; +Cc: Andrew Lunn

On Thu, May 12, 2022 at 10:42:02AM +0200, Lukas Wunner wrote:
> Upon receiving data from the Interrupt Endpoint, the SMSC LAN95xx driver
> attempts to clear the signaled interrupts by writing "all ones" to the
> Interrupt Status Register.
> 
> However the driver only ever enables a single type of interrupt, namely
> the PHY Interrupt.  And according to page 119 of the LAN950x datasheet,
> its bit in the Interrupt Status Register is read-only.  There's no other
> way to clear it than in a separate PHY register:
> 
> https://www.microchip.com/content/dam/mchp/documents/UNG/ProductDocuments/DataSheets/LAN950x-Data-Sheet-DS00001875D.pdf
> 
> Consequently, writing "all ones" to the Interrupt Status Register is
> pointless and can be dropped.
> 
> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
> Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
> Signed-off-by: Lukas Wunner <lukas@wunner.de>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Forgot to add Andrew's R-b which he kindly provided here:
https://lore.kernel.org/netdev/YnGqtCDVHHpo%2FS+L@lunn.ch/

Let's see if patchwork picks the tag up if I add it like this.
My apologies for the inconvenience.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 3/7] usbnet: smsc95xx: Don't reset PHY behind PHY driver's back
  2022-05-12  8:42 ` [PATCH net-next v3 3/7] usbnet: smsc95xx: Don't reset PHY behind PHY driver's back Lukas Wunner
@ 2022-05-12 12:12   ` Lukas Wunner
  0 siblings, 0 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12 12:12 UTC (permalink / raw)
  To: netdev; +Cc: Andrew Lunn

On Thu, May 12, 2022 at 10:42:03AM +0200, Lukas Wunner wrote:
> smsc95xx_reset() resets the PHY behind the PHY driver's back, which
> seems like a bad idea generally.  Remove that portion of the function.
> 
> We're about to use PHY interrupts instead of polling to detect link
> changes on SMSC LAN95xx chips.  Because smsc95xx_reset() is called from
> usbnet_open(), PHY interrupt settings are lost whenever the net_device
> is brought up.
> 
> There are two other callers of smsc95xx_reset(), namely smsc95xx_bind()
> and smsc95xx_reset_resume(), and both may indeed benefit from a PHY
> reset.  However they already perform one through their calls to
> phy_connect_direct() and phy_init_hw().
> 
> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
> Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: Martyn Welch <martyn.welch@collabora.com>
> Cc: Gabriel Hojda <ghojda@yo2urs.ro>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Andrew kindly provided this tag here:
https://lore.kernel.org/netdev/YnGq401sOeC0zwt6@lunn.ch/

Forgot to add it to the commit.
Sending it in separately so patchwork picks it up.
My apologies for the inconvenience.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 4/7] usbnet: smsc95xx: Avoid link settings race on interrupt reception
  2022-05-12  8:42 ` [PATCH net-next v3 4/7] usbnet: smsc95xx: Avoid link settings race on interrupt reception Lukas Wunner
@ 2022-05-12 12:15   ` Lukas Wunner
  0 siblings, 0 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12 12:15 UTC (permalink / raw)
  To: netdev; +Cc: Andrew Lunn

On Thu, May 12, 2022 at 10:42:04AM +0200, Lukas Wunner wrote:
> When a PHY interrupt is signaled, the SMSC LAN95xx driver updates the
> MAC full duplex mode and PHY flow control registers based on cached data
> in struct phy_device:
> 
>   smsc95xx_status()                 # raises EVENT_LINK_RESET
>     usbnet_deferred_kevent()
>       smsc95xx_link_reset()         # uses cached data in phydev
> 
> Simultaneously, phylib polls link status once per second and updates
> that cached data:
> 
>   phy_state_machine()
>     phy_check_link_status()
>       phy_read_status()
>         lan87xx_read_status()
>           genphy_read_status()      # updates cached data in phydev
> 
> If smsc95xx_link_reset() wins the race against genphy_read_status(),
> the registers may be updated based on stale data.
> 
> E.g. if the link was previously down, phydev->duplex is set to
> DUPLEX_UNKNOWN and that's what smsc95xx_link_reset() will use, even
> though genphy_read_status() may update it to DUPLEX_FULL afterwards.
> 
> PHY interrupts are currently only enabled on suspend to trigger wakeup,
> so the impact of the race is limited, but we're about to enable them
> perpetually.
> 
> Avoid the race by delaying execution of smsc95xx_link_reset() until
> phy_state_machine() has done its job and calls back via
> smsc95xx_handle_link_change().
> 
> Signaling EVENT_LINK_RESET on wakeup is not necessary because phylib
> picks up link status changes through polling.  So drop the declaration
> of a ->link_reset() callback.
> 
> Note that the semicolon on a line by itself added in smsc95xx_status()
> is a placeholder for a function call which will be added in a subsequent
> commit.  That function call will actually handle the INT_ENP_PHY_INT_
> interrupt.
> 
> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
> Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
> Signed-off-by: Lukas Wunner <lukas@wunner.de>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Andrew kindly provided this tag here:
https://lore.kernel.org/netdev/YnGrUdmqtzt3Ogn+@lunn.ch/

Forgot to add it to the commit.
Sending it in separately so patchwork picks it up.
My apologies for the inconvenience.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 6/7] net: phy: smsc: Cache interrupt mask
  2022-05-12  8:42 ` [PATCH net-next v3 6/7] net: phy: smsc: Cache interrupt mask Lukas Wunner
@ 2022-05-12 12:17   ` Lukas Wunner
  0 siblings, 0 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12 12:17 UTC (permalink / raw)
  To: netdev; +Cc: Andrew Lunn

On Thu, May 12, 2022 at 10:42:06AM +0200, Lukas Wunner wrote:
> Cache the interrupt mask to avoid re-reading it from the PHY upon every
> interrupt.
> 
> This will simplify a subsequent commit which detects hot-removal in the
> interrupt handler and bails out.
> 
> Analyzing and debugging PHY transactions also becomes simpler if such
> redundant reads are avoided.
> 
> Last not least, interrupt overhead and latency is slightly improved.
> 
> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
> Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
> Signed-off-by: Lukas Wunner <lukas@wunner.de>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Andrew kindly provided this tag here:
https://lore.kernel.org/netdev/YnGsDtscYkh28PFm@lunn.ch/

Forgot to add it to the commit.
Sending it in separately so patchwork picks it up.
My apologies for the inconvenience.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 7/7] net: phy: smsc: Cope with hot-removal in interrupt handler
  2022-05-12  8:42 ` [PATCH net-next v3 7/7] net: phy: smsc: Cope with hot-removal in interrupt handler Lukas Wunner
@ 2022-05-12 12:19   ` Lukas Wunner
  0 siblings, 0 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-05-12 12:19 UTC (permalink / raw)
  To: netdev; +Cc: Andrew Lunn

On Thu, May 12, 2022 at 10:42:07AM +0200, Lukas Wunner wrote:
> If reading the Interrupt Source Flag register fails with -ENODEV, then
> the PHY has been hot-removed and the correct response is to bail out
> instead of throwing a WARN splat and attempting to suspend the PHY.
> The PHY should be stopped in due course anyway as the kernel
> asynchronously tears down the device.
> 
> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
> Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
> Signed-off-by: Lukas Wunner <lukas@wunner.de>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

Andrew kindly provided this tag here:
https://lore.kernel.org/netdev/YnGsKQC1WxihasYs@lunn.ch/

Forgot to add it to the commit.
Sending it in separately so patchwork picks it up.
My apologies for the inconvenience.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 0/7] Polling be gone on LAN95xx
  2022-05-12  8:42 [PATCH net-next v3 0/7] Polling be gone on LAN95xx Lukas Wunner
                   ` (6 preceding siblings ...)
  2022-05-12  8:42 ` [PATCH net-next v3 7/7] net: phy: smsc: Cope with hot-removal in interrupt handler Lukas Wunner
@ 2022-05-13 10:40 ` patchwork-bot+netdevbpf
  7 siblings, 0 replies; 40+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-05-13 10:40 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: davem, kuba, pabeni, edumazet, netdev, linux-usb,
	steve.glendinning, UNGLinuxDriver, oneukum, andre.edich, linux,
	martyn.welch, ghojda, chf.fritz, LinoSanfilippo, p.rosenberger,
	hkallweit1, andrew, linux, fntoth

Hello:

This series was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Thu, 12 May 2022 10:42:00 +0200 you wrote:
> Do away with link status polling on LAN95xx USB Ethernet
> and rely on interrupts instead, thereby reducing bus traffic,
> CPU overhead and improving interface bringup latency.
> 
> Link to v2:
> https://lore.kernel.org/netdev/cover.1651574194.git.lukas@wunner.de/
> 
> [...]

Here is the summary with links:
  - [net-next,v3,1/7] usbnet: Run unregister_netdev() before unbind() again
    https://git.kernel.org/netdev/net-next/c/d1408f6b4dd7
  - [net-next,v3,2/7] usbnet: smsc95xx: Don't clear read-only PHY interrupt
    https://git.kernel.org/netdev/net-next/c/3108871f1922
  - [net-next,v3,3/7] usbnet: smsc95xx: Don't reset PHY behind PHY driver's back
    https://git.kernel.org/netdev/net-next/c/14021da69811
  - [net-next,v3,4/7] usbnet: smsc95xx: Avoid link settings race on interrupt reception
    https://git.kernel.org/netdev/net-next/c/8960f878e39f
  - [net-next,v3,5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
    https://git.kernel.org/netdev/net-next/c/1ce8b37241ed
  - [net-next,v3,6/7] net: phy: smsc: Cache interrupt mask
    https://git.kernel.org/netdev/net-next/c/7e8b617eb93f
  - [net-next,v3,7/7] net: phy: smsc: Cope with hot-removal in interrupt handler
    https://git.kernel.org/netdev/net-next/c/1e7b81edebc1

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
       [not found]   ` <CGME20220517101846eucas1p2c132f7e7032ed00996e222e9cc6cdf99@eucas1p2.samsung.com>
@ 2022-05-17 10:18     ` Marek Szyprowski
  2022-05-19 19:08       ` Lukas Wunner
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Szyprowski @ 2022-05-17 10:18 UTC (permalink / raw)
  To: Lukas Wunner, David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet
  Cc: netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

Hi Lukas,

On 12.05.2022 10:42, Lukas Wunner wrote:
> Link status of SMSC LAN95xx chips is polled once per second, even though
> they're capable of signaling PHY interrupts through the MAC layer.
>
> Forward those interrupts to the PHY driver to avoid polling.  Benefits
> are reduced bus traffic, reduced CPU overhead and quicker interface
> bringup.
>
> Polling was introduced in 2016 by commit d69d16949346 ("usbnet:
> smsc95xx: fix link detection for disabled autonegotiation").
> Back then, the LAN95xx driver neglected to enable the ENERGYON interrupt,
> hence couldn't detect link-up events when auto-negotiation was disabled.
> The proper solution would have been to enable the ENERGYON interrupt
> instead of polling.
>
> Since then, PHY handling was moved from the LAN95xx driver to the SMSC
> PHY driver with commit 05b35e7eb9a1 ("smsc95xx: add phylib support").
> That PHY driver is capable of link detection with auto-negotiation
> disabled because it enables the ENERGYON interrupt.
>
> Note that signaling interrupts through the MAC layer not only works with
> the integrated PHY, but also with an external PHY, provided its
> interrupt pin is attached to LAN95xx's nPHY_INT pin.
>
> In the unlikely event that the interrupt pin of an external PHY is
> attached to a GPIO of the SoC (or not connected at all), the driver can
> be amended to retrieve the irq from the PHY's of_node.
>
> To forward PHY interrupts to phylib, it is not sufficient to call
> phy_mac_interrupt().  Instead, the PHY's interrupt handler needs to run
> so that PHY interrupts are cleared.  That's because according to page
> 119 of the LAN950x datasheet, "The source of this interrupt is a level.
> The interrupt persists until it is cleared in the PHY."
>
> https://www.microchip.com/content/dam/mchp/documents/UNG/ProductDocuments/DataSheets/LAN950x-Data-Sheet-DS00001875D.pdf
>
> Therefore, create an IRQ domain with a single IRQ for the PHY.  In the
> future, the IRQ domain may be extended to support the 11 GPIOs on the
> LAN95xx.
>
> Normally the PHY interrupt should be masked until the PHY driver has
> cleared it.  However masking requires a (sleeping) USB transaction and
> interrupts are received in (non-sleepable) softirq context.  I decided
> not to mask the interrupt at all (by using the dummy_irq_chip's noop
> ->irq_mask() callback):  The USB interrupt endpoint is polled in 1 msec
> intervals and normally that's sufficient to wake the PHY driver's IRQ
> thread and have it clear the interrupt.  If it does take longer, worst
> thing that can happen is the IRQ thread is woken again.  No big deal.
>
> Because PHY interrupts are now perpetually enabled, there's no need to
> selectively enable them on suspend.  So remove all invocations of
> smsc95xx_enable_phy_wakeup_interrupts().
>
> In smsc95xx_resume(), move the call of phy_init_hw() before
> usbnet_resume() (which restarts the status URB) to ensure that the PHY
> is fully initialized when an interrupt is handled.
>
> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> # LAN9514/9512/9500
> Tested-by: Ferry Toth <fntoth@gmail.com> # LAN9514
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Andrew Lunn <andrew@lunn.ch> # from a PHY perspective
> Cc: Andre Edich <andre.edich@microchip.com>

This patch landed in the recent linux next-20220516 as commit 
1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to 
avoid polling"). Unfortunately it breaks smsc95xx usb ethernet operation 
after system suspend-resume cycle. On the Odroid XU3 board I got the 
following warning in the kernel log:

# time rtcwake -s10 -mmem
rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 09:16:07 2022
PM: suspend entry (deep)
Filesystems sync: 0.001 seconds
Freezing user space processes ... (elapsed 0.002 seconds) done.
OOM killer disabled.
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
printk: Suspending console(s) (use no_console_suspend to debug)
smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
------------[ cut here ]------------
WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946 
phy_state_machine+0x98/0x28c
Modules linked in: snd_soc_hdmi_codec snd_soc_odroid governor_passive 
snd_soc_i2s exynos_bus snd_soc_idma snd_soc_s3c_dma exynosdrm 
analogix_dp snd_soc_max98090 snd_soc_core ac97_bus snd_pcm_dmaengine 
snd_pcm clk_s2mps11 rtc_s5m snd_timer snd soundcore ina2xx exynos_gsc 
pwm_samsung exynos_adc s5p_jpeg ohci_exynosv4l2_mem2mem phy_exynos_usb2 
panfrost ehci_exynos s5p_mfc drm_shmem_helper videobuf2_dma_contig 
videobuf2_memops videobuf2_v4l2 videobuf2_common gpu_sched videodev mc 
exynos_ppmu exynos5422_dmc exynos_nocp s5p_sss rtc_s3c exynos_rng 
s3c2410_wdt s5p_cec pwm_fan
CPU: 2 PID: 73 Comm: kworker/2:1 Tainted: G        W 
5.18.0-rc6-01433-g1ce8b37241ed #5040
Hardware name: Samsung Exynos (Flattened Device Tree)
Workqueue: events_power_efficient phy_state_machine
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x40/0x4c
  dump_stack_lvl from __warn+0xc8/0x13c
  __warn from warn_slowpath_fmt+0x5c/0xb4
  warn_slowpath_fmt from phy_state_machine+0x98/0x28c
  phy_state_machine from process_one_work+0x1ec/0x4cc
  process_one_work from worker_thread+0x58/0x54c
  worker_thread from kthread+0xd0/0xec
  kthread from ret_from_fork+0x14/0x2c
Exception stack(0xf0aa9fb0 to 0xf0aa9ff8)
...
---[ end trace 0000000000000000 ]---

It looks that the driver's suspend/resume operations might need some 
adjustments. After the system suspend/resume cycle the driver is not 
operational anymore. Reverting the $subject patch on top of linux 
next-20220516 restores ethernet operation after system suspend/resume.

> ---
> Only change since v2:
>   * Drop call to __irq_enter_raw() which worked around a warning in
>     generic_handle_domain_irq().  That warning is gone since
>     792ea6a074ae (queued on tip.git/irq/urgent).
>     (Marc Zyngier, Thomas Gleixner)
>
>   drivers/net/usb/smsc95xx.c | 113 ++++++++++++++++++++-----------------
>   1 file changed, 61 insertions(+), 52 deletions(-)

 > ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-17 10:18     ` Marek Szyprowski
@ 2022-05-19 19:08       ` Lukas Wunner
  2022-05-19 21:22         ` Marek Szyprowski
  0 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-05-19 19:08 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
> This patch landed in the recent linux next-20220516 as commit 
> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to 
> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet operation 
> after system suspend-resume cycle. On the Odroid XU3 board I got the 
> following warning in the kernel log:
> 
> # time rtcwake -s10 -mmem
> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 09:16:07 2022
> PM: suspend entry (deep)
> Filesystems sync: 0.001 seconds
> Freezing user space processes ... (elapsed 0.002 seconds) done.
> OOM killer disabled.
> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> printk: Suspending console(s) (use no_console_suspend to debug)
> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
> phy_state_machine+0x98/0x28c
[...]
> It looks that the driver's suspend/resume operations might need some 
> adjustments. After the system suspend/resume cycle the driver is not 
> operational anymore. Reverting the $subject patch on top of linux 
> next-20220516 restores ethernet operation after system suspend/resume.

Thanks a lot for the report.  It seems the PHY is signaling a link change
shortly before system sleep and by the time the phy_state_machine() worker
gets around to handle it, the device has already been suspended and thus
refuses any further USB requests with -EHOSTUNREACH (-113):

usb_suspend_both()
  usb_suspend_interface()
    smsc95xx_suspend()
      usbnet_suspend()
        __usbnet_status_stop_force() # stops interrupt polling,
                                     # link change is signaled before this

  udev->can_submit = 0               # refuse further URBs

Assuming the above theory is correct, calling phy_stop_machine()
after usbnet_suspend() would be sufficient to fix the issue.
It cancels the phy_state_machine() worker.

The small patch below does that.  Could you give it a spin?

Taking a step back though, I'm wondering if there's a bigger problem here:
This is a USB device, so we stop receiving interrupts once the Interrupt
Endpoint is no longer polled.  But what if a PHY's interrupt is attached
to a GPIO of the SoC and that interrupt is raised while the system is
suspending?  The interrupt handler may likewise try to reach an
inaccessible (suspended) device.

The right thing to do would probably be to signal wakeup.  But the
PHY drivers' irq handlers instead schedule the phy_state_machine().
Perhaps we need something like the following at the top of
phy_state_machine():

	if (phydev->suspended) {
		pm_wakeup_dev_event(&phydev->mdio.dev, 0, true);
		return;
	}

However, phydev->suspended is set at the *bottom* of phy_suspend(),
it would have to be set at the *top* of mdio_bus_phy_suspend()
for the above to be correct.  Hmmm...

Thanks,

Lukas

-- >8 --
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index bd03e16..d351a6c 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -1201,6 +1201,7 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
 	}
 
 	pdata->phydev->irq = phy_irq;
+	pdata->phydev->mac_managed_pm = true;
 	pdata->phydev->is_internal = is_internal_phy;
 
 	/* detect device revision as different features may be available */
@@ -1496,6 +1497,9 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
 		return ret;
 	}
 
+	if (netif_running(dev->net))
+		phy_stop(pdata->phydev);
+
 	if (pdata->suspend_flags) {
 		netdev_warn(dev->net, "error during last resume\n");
 		pdata->suspend_flags = 0;
@@ -1778,6 +1782,8 @@ static int smsc95xx_resume(struct usb_interface *intf)
 	}
 
 	phy_init_hw(pdata->phydev);
+	if (netif_running(dev->net))
+		phy_start(pdata->phydev);
 
 	ret = usbnet_resume(intf);
 	if (ret < 0)

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-19 19:08       ` Lukas Wunner
@ 2022-05-19 21:22         ` Marek Szyprowski
  2022-05-23  9:43           ` Lukas Wunner
  2022-08-26  6:51           ` Marek Szyprowski
  0 siblings, 2 replies; 40+ messages in thread
From: Marek Szyprowski @ 2022-05-19 21:22 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

Hi Lukas,

On 19.05.2022 21:08, Lukas Wunner wrote:
> On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
>> This patch landed in the recent linux next-20220516 as commit
>> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to
>> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet operation
>> after system suspend-resume cycle. On the Odroid XU3 board I got the
>> following warning in the kernel log:
>>
>> # time rtcwake -s10 -mmem
>> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 09:16:07 2022
>> PM: suspend entry (deep)
>> Filesystems sync: 0.001 seconds
>> Freezing user space processes ... (elapsed 0.002 seconds) done.
>> OOM killer disabled.
>> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>> printk: Suspending console(s) (use no_console_suspend to debug)
>> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
>> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
>> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
>> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
>> ------------[ cut here ]------------
>> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
>> phy_state_machine+0x98/0x28c
> [...]
>> It looks that the driver's suspend/resume operations might need some
>> adjustments. After the system suspend/resume cycle the driver is not
>> operational anymore. Reverting the $subject patch on top of linux
>> next-20220516 restores ethernet operation after system suspend/resume.
> Thanks a lot for the report.  It seems the PHY is signaling a link change
> shortly before system sleep and by the time the phy_state_machine() worker
> gets around to handle it, the device has already been suspended and thus
> refuses any further USB requests with -EHOSTUNREACH (-113):
>
> usb_suspend_both()
>    usb_suspend_interface()
>      smsc95xx_suspend()
>        usbnet_suspend()
>          __usbnet_status_stop_force() # stops interrupt polling,
>                                       # link change is signaled before this
>
>    udev->can_submit = 0               # refuse further URBs
>
> Assuming the above theory is correct, calling phy_stop_machine()
> after usbnet_suspend() would be sufficient to fix the issue.
> It cancels the phy_state_machine() worker.
>
> The small patch below does that.  Could you give it a spin?

That's it. Your analysis is right and the patch fixes the issue. Thanks!

Feel free to add:

Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

> Taking a step back though, I'm wondering if there's a bigger problem here:
> This is a USB device, so we stop receiving interrupts once the Interrupt
> Endpoint is no longer polled.  But what if a PHY's interrupt is attached
> to a GPIO of the SoC and that interrupt is raised while the system is
> suspending?  The interrupt handler may likewise try to reach an
> inaccessible (suspended) device.
>
> The right thing to do would probably be to signal wakeup.  But the
> PHY drivers' irq handlers instead schedule the phy_state_machine().
> Perhaps we need something like the following at the top of
> phy_state_machine():
>
> 	if (phydev->suspended) {
> 		pm_wakeup_dev_event(&phydev->mdio.dev, 0, true);
> 		return;
> 	}
>
> However, phydev->suspended is set at the *bottom* of phy_suspend(),
> it would have to be set at the *top* of mdio_bus_phy_suspend()
> for the above to be correct.  Hmmm...
Well, your concern sounds valid, but I don't have a board with such hw 
configuration, so I cannot really test.
> Thanks,
>
> Lukas
>
> -- >8 --
> diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
> index bd03e16..d351a6c 100644
> --- a/drivers/net/usb/smsc95xx.c
> +++ b/drivers/net/usb/smsc95xx.c
> @@ -1201,6 +1201,7 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
>   	}
>   
>   	pdata->phydev->irq = phy_irq;
> +	pdata->phydev->mac_managed_pm = true;
>   	pdata->phydev->is_internal = is_internal_phy;
>   
>   	/* detect device revision as different features may be available */
> @@ -1496,6 +1497,9 @@ static int smsc95xx_suspend(struct usb_interface *intf, pm_message_t message)
>   		return ret;
>   	}
>   
> +	if (netif_running(dev->net))
> +		phy_stop(pdata->phydev);
> +
>   	if (pdata->suspend_flags) {
>   		netdev_warn(dev->net, "error during last resume\n");
>   		pdata->suspend_flags = 0;
> @@ -1778,6 +1782,8 @@ static int smsc95xx_resume(struct usb_interface *intf)
>   	}
>   
>   	phy_init_hw(pdata->phydev);
> +	if (netif_running(dev->net))
> +		phy_start(pdata->phydev);
>   
>   	ret = usbnet_resume(intf);
>   	if (ret < 0)
>
Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-19 21:22         ` Marek Szyprowski
@ 2022-05-23  9:43           ` Lukas Wunner
  2022-05-23 11:38             ` Marek Szyprowski
                               ` (2 more replies)
  2022-08-26  6:51           ` Marek Szyprowski
  1 sibling, 3 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-05-23  9:43 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

On Thu, May 19, 2022 at 11:22:36PM +0200, Marek Szyprowski wrote:
> On 19.05.2022 21:08, Lukas Wunner wrote:
> > Taking a step back though, I'm wondering if there's a bigger problem here:
> > This is a USB device, so we stop receiving interrupts once the Interrupt
> > Endpoint is no longer polled.  But what if a PHY's interrupt is attached
> > to a GPIO of the SoC and that interrupt is raised while the system is
> > suspending?  The interrupt handler may likewise try to reach an
> > inaccessible (suspended) device.
> >
> > The right thing to do would probably be to signal wakeup.  But the
> > PHY drivers' irq handlers instead schedule the phy_state_machine().
> > Perhaps we need something like the following at the top of
> > phy_state_machine():
> >
> > 	if (phydev->suspended) {
> > 		pm_wakeup_dev_event(&phydev->mdio.dev, 0, true);
> > 		return;
> > 	}
> >
> > However, phydev->suspended is set at the *bottom* of phy_suspend(),
> > it would have to be set at the *top* of mdio_bus_phy_suspend()
> > for the above to be correct.  Hmmm...
> 
> Well, your concern sounds valid, but I don't have a board with such hw 
> configuration, so I cannot really test.

I'm torn whether I should submit the quick fix in my last e-mail
or attempt to address the deeper issue.  The quick fix would ensure
v5.19-rc1 isn't broken, but if possible I'd rather address the deeper
issue...

Below is another patch.  Would you mind testing if it fixes the problem
for you?  It's a replacement for the patch in my last e-mail and seeks
to fix the problem for all drivers, not just smsc95xx.  If you don't
have time to test it, let me know and I'll just submit the quick fix
in my previous e-mail.

BTW, getting a PHY interrupt on suspend seems like a corner case to me,
so I'm amazed you found this and seem to be able to reproduce it 100%.
Out of curiosity, is this a CI test you're performing?

Thanks,

Lukas

-- >8 --

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index ef62f357b76d..c2442c38d312 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -31,6 +31,7 @@
 #include <linux/io.h>
 #include <linux/uaccess.h>
 #include <linux/atomic.h>
+#include <linux/suspend.h>
 #include <net/netlink.h>
 #include <net/genetlink.h>
 #include <net/sock.h>
@@ -976,6 +977,25 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
 	struct phy_driver *drv = phydev->drv;
 	irqreturn_t ret;
 
+	if (IS_ENABLED(CONFIG_PM_SLEEP) &&
+	    (phydev->mdio.dev.power.is_prepared ||
+	     phydev->mdio.dev.power.is_suspended)) {
+		struct net_device *netdev = phydev->attached_dev;
+
+		if (netdev) {
+			struct device *parent = netdev->dev.parent;
+
+			if (netdev->wol_enabled)
+				pm_system_wakeup();
+			else if (device_may_wakeup(&netdev->dev))
+				pm_wakeup_dev_event(&netdev->dev, 0, true);
+			else if (parent && device_may_wakeup(parent))
+				pm_wakeup_dev_event(parent, 0, true);
+		}
+
+		return IRQ_HANDLED;
+	}
+
 	mutex_lock(&phydev->lock);
 	ret = drv->handle_interrupt(phydev);
 	mutex_unlock(&phydev->lock);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 431a8719c635..da6d70ddf167 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -283,8 +283,11 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
 	 * may call phy routines that try to grab the same lock, and that may
 	 * lead to a deadlock.
 	 */
-	if (phydev->attached_dev && phydev->adjust_link)
+	if (phydev->attached_dev && phydev->adjust_link) {
+		if (phy_interrupt_is_valid(phydev))
+			synchronize_irq(phydev->irq);
 		phy_stop_machine(phydev);
+	}
 
 	if (!mdio_bus_phy_may_suspend(phydev))
 		return 0;

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-23  9:43           ` Lukas Wunner
@ 2022-05-23 11:38             ` Marek Szyprowski
  2022-05-23 12:34             ` Andrew Lunn
  2022-05-24  1:08             ` Andrew Lunn
  2 siblings, 0 replies; 40+ messages in thread
From: Marek Szyprowski @ 2022-05-23 11:38 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

Hi Lukas,

On 23.05.2022 11:43, Lukas Wunner wrote:
> On Thu, May 19, 2022 at 11:22:36PM +0200, Marek Szyprowski wrote:
>> On 19.05.2022 21:08, Lukas Wunner wrote:
>>> Taking a step back though, I'm wondering if there's a bigger problem here:
>>> This is a USB device, so we stop receiving interrupts once the Interrupt
>>> Endpoint is no longer polled.  But what if a PHY's interrupt is attached
>>> to a GPIO of the SoC and that interrupt is raised while the system is
>>> suspending?  The interrupt handler may likewise try to reach an
>>> inaccessible (suspended) device.
>>>
>>> The right thing to do would probably be to signal wakeup.  But the
>>> PHY drivers' irq handlers instead schedule the phy_state_machine().
>>> Perhaps we need something like the following at the top of
>>> phy_state_machine():
>>>
>>> 	if (phydev->suspended) {
>>> 		pm_wakeup_dev_event(&phydev->mdio.dev, 0, true);
>>> 		return;
>>> 	}
>>>
>>> However, phydev->suspended is set at the *bottom* of phy_suspend(),
>>> it would have to be set at the *top* of mdio_bus_phy_suspend()
>>> for the above to be correct.  Hmmm...
>> Well, your concern sounds valid, but I don't have a board with such hw
>> configuration, so I cannot really test.
> I'm torn whether I should submit the quick fix in my last e-mail
> or attempt to address the deeper issue.  The quick fix would ensure
> v5.19-rc1 isn't broken, but if possible I'd rather address the deeper
> issue...
>
> Below is another patch.  Would you mind testing if it fixes the problem
> for you?  It's a replacement for the patch in my last e-mail and seeks
> to fix the problem for all drivers, not just smsc95xx.  If you don't
> have time to test it, let me know and I'll just submit the quick fix
> in my previous e-mail.

I've just tested it on top of next-20220519 and I was not able to 
reproduce the issue, so it looks it also fixes the issue. :)

Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>

> BTW, getting a PHY interrupt on suspend seems like a corner case to me,
> so I'm amazed you found this and seem to be able to reproduce it 100%.
> Out of curiosity, is this a CI test you're performing?

I've have some semi-automated (based on simple bash scripts) tests 
utilizing remote test boards (with remote power on/off control, serial 
console, tftp booting). This issue was quite easy to reproduce, even 
manually. Maybe it is somehow specific to the Odroid-XU3/XU3-lite boards 
and the way the smsc95xx USB ethernet chip is connected there, but it 
happens there usually in 2 of 3 suspend/resume tests.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-23  9:43           ` Lukas Wunner
  2022-05-23 11:38             ` Marek Szyprowski
@ 2022-05-23 12:34             ` Andrew Lunn
  2022-05-23 13:47               ` Lukas Wunner
  2022-05-24  1:08             ` Andrew Lunn
  2 siblings, 1 reply; 40+ messages in thread
From: Andrew Lunn @ 2022-05-23 12:34 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Russell King, Ferry Toth,
	Krzysztof Kozlowski, 'Linux Samsung SOC'

> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -283,8 +283,11 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
>  	 * may call phy routines that try to grab the same lock, and that may
>  	 * lead to a deadlock.
>  	 */
> -	if (phydev->attached_dev && phydev->adjust_link)
> +	if (phydev->attached_dev && phydev->adjust_link) {
> +		if (phy_interrupt_is_valid(phydev))
> +			synchronize_irq(phydev->irq);
>  		phy_stop_machine(phydev);
> +	}

What is this hunk trying to achieve? As far as i know, interrupts have
not been disabled. So as soon as the call to synchronize_irq()
finishes, could well be another interrupt happens.

	  Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-23 12:34             ` Andrew Lunn
@ 2022-05-23 13:47               ` Lukas Wunner
  2022-05-24  0:52                 ` Andrew Lunn
  0 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-05-23 13:47 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Russell King, Ferry Toth,
	Krzysztof Kozlowski, 'Linux Samsung SOC'

On Mon, May 23, 2022 at 02:34:49PM +0200, Andrew Lunn wrote:
> > --- a/drivers/net/phy/phy_device.c
> > +++ b/drivers/net/phy/phy_device.c
> > @@ -283,8 +283,11 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
> >  	 * may call phy routines that try to grab the same lock, and that may
> >  	 * lead to a deadlock.
> >  	 */
> > -	if (phydev->attached_dev && phydev->adjust_link)
> > +	if (phydev->attached_dev && phydev->adjust_link) {
> > +		if (phy_interrupt_is_valid(phydev))
> > +			synchronize_irq(phydev->irq);
> >  		phy_stop_machine(phydev);
> > +	}
> 
> What is this hunk trying to achieve? As far as i know, interrupts have
> not been disabled. So as soon as the call to synchronize_irq()
> finishes, could well be another interrupt happens.

That other interrupt would bail out of phy_interrupt() because
the is_prepared flag is set on the PHY's struct device, see
first hunk of the patch.

The problem is that an interrupt may occur before the system
sleep transition commences.  phy_interrupt() will notice that
is_prepared is not (yet) set, hence invokes drv->handle_interrupt().
Let's say the IRQ thread is preempted at that point, the system
sleep transition is started and mdio_bus_phy_suspend() is run.
It calls phy_stop_machine(), so the state machine is now stopped.
Now phy_interrupt() continues, and the PHY driver's ->handle_interrupt()
callback starts the state machine.  Boom, that's not what we want.

So the synchronize_irq() ensures that any already running
phy_interrupt() runs to completion before phy_stop_machine()
is called.  It doesn't matter if another interrupt occurs
because then is_prepared will have been set and therefore
phy_interrupt() won't call drv->handle_interrupt().

Let me know if I haven't explained it in sufficient clarity,
I'll be happy to try again. :)

I'm more concerned about the first hunk of the patch because I'm
not sure I got the wakeup stuff right...

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-23 13:47               ` Lukas Wunner
@ 2022-05-24  0:52                 ` Andrew Lunn
  0 siblings, 0 replies; 40+ messages in thread
From: Andrew Lunn @ 2022-05-24  0:52 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Russell King, Ferry Toth,
	Krzysztof Kozlowski, 'Linux Samsung SOC'

On Mon, May 23, 2022 at 03:47:09PM +0200, Lukas Wunner wrote:
> On Mon, May 23, 2022 at 02:34:49PM +0200, Andrew Lunn wrote:
> > > --- a/drivers/net/phy/phy_device.c
> > > +++ b/drivers/net/phy/phy_device.c
> > > @@ -283,8 +283,11 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
> > >  	 * may call phy routines that try to grab the same lock, and that may
> > >  	 * lead to a deadlock.
> > >  	 */
> > > -	if (phydev->attached_dev && phydev->adjust_link)
> > > +	if (phydev->attached_dev && phydev->adjust_link) {
> > > +		if (phy_interrupt_is_valid(phydev))
> > > +			synchronize_irq(phydev->irq);
> > >  		phy_stop_machine(phydev);
> > > +	}
> > 
> > What is this hunk trying to achieve? As far as i know, interrupts have
> > not been disabled. So as soon as the call to synchronize_irq()
> > finishes, could well be another interrupt happens.
> 
> That other interrupt would bail out of phy_interrupt() because
> the is_prepared flag is set on the PHY's struct device, see
> first hunk of the patch.
> 
> The problem is that an interrupt may occur before the system
> sleep transition commences.  phy_interrupt() will notice that
> is_prepared is not (yet) set, hence invokes drv->handle_interrupt().
> Let's say the IRQ thread is preempted at that point, the system
> sleep transition is started and mdio_bus_phy_suspend() is run.
> It calls phy_stop_machine(), so the state machine is now stopped.
> Now phy_interrupt() continues, and the PHY driver's ->handle_interrupt()
> callback starts the state machine.  Boom, that's not what we want.
> 
> So the synchronize_irq() ensures that any already running
> phy_interrupt() runs to completion before phy_stop_machine()
> is called.  It doesn't matter if another interrupt occurs
> because then is_prepared will have been set and therefore
> phy_interrupt() won't call drv->handle_interrupt().
> 
> Let me know if I haven't explained it in sufficient clarity,
> I'll be happy to try again. :)

I think some comments are needed. If i don't understand what is going
on, i'm sure others don't as well.

    Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-23  9:43           ` Lukas Wunner
  2022-05-23 11:38             ` Marek Szyprowski
  2022-05-23 12:34             ` Andrew Lunn
@ 2022-05-24  1:08             ` Andrew Lunn
  2022-05-24  6:16               ` Marek Szyprowski
  2022-05-24 12:13               ` Lukas Wunner
  2 siblings, 2 replies; 40+ messages in thread
From: Andrew Lunn @ 2022-05-24  1:08 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Russell King, Ferry Toth,
	Krzysztof Kozlowski, 'Linux Samsung SOC'

> @@ -976,6 +977,25 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
>  	struct phy_driver *drv = phydev->drv;
>  	irqreturn_t ret;
>  
> +	if (IS_ENABLED(CONFIG_PM_SLEEP) &&
> +	    (phydev->mdio.dev.power.is_prepared ||
> +	     phydev->mdio.dev.power.is_suspended)) {
> +		struct net_device *netdev = phydev->attached_dev;
> +
> +		if (netdev) {
> +			struct device *parent = netdev->dev.parent;
> +
> +			if (netdev->wol_enabled)
> +				pm_system_wakeup();
> +			else if (device_may_wakeup(&netdev->dev))
> +				pm_wakeup_dev_event(&netdev->dev, 0, true);
> +			else if (parent && device_may_wakeup(parent))
> +				pm_wakeup_dev_event(parent, 0, true);
> +		}
> +
> +		return IRQ_HANDLED;

I'm not sure you can just throw the interrupt away. There have been
issues with WoL, where the WoL signal has been applied to a PMC, not
an actual interrupt. Yet the PHY driver assumes it is an
interrupt. And in order for WoL to work correctly, it needs the
interrupt handler to be called. We said the hardware is broken, WoL
cannot work for that setup.

Here you have correct hardware, but you are throwing the interrupt
away, which will have the same result. So i think you need to abort
the suspend, get the bus working again, and call the interrupt
handler. If this is a WoL interrupt you are supposed to be waking up
anyway.

	Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-24  1:08             ` Andrew Lunn
@ 2022-05-24  6:16               ` Marek Szyprowski
  2022-05-24 12:03                 ` Andrew Lunn
  2022-05-24 12:13               ` Lukas Wunner
  1 sibling, 1 reply; 40+ messages in thread
From: Marek Szyprowski @ 2022-05-24  6:16 UTC (permalink / raw)
  To: Andrew Lunn, Lukas Wunner
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Russell King, Ferry Toth,
	Krzysztof Kozlowski, 'Linux Samsung SOC'

On 24.05.2022 03:08, Andrew Lunn wrote:
>> @@ -976,6 +977,25 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
>>   	struct phy_driver *drv = phydev->drv;
>>   	irqreturn_t ret;
>>   
>> +	if (IS_ENABLED(CONFIG_PM_SLEEP) &&
>> +	    (phydev->mdio.dev.power.is_prepared ||
>> +	     phydev->mdio.dev.power.is_suspended)) {
>> +		struct net_device *netdev = phydev->attached_dev;
>> +
>> +		if (netdev) {
>> +			struct device *parent = netdev->dev.parent;
>> +
>> +			if (netdev->wol_enabled)
>> +				pm_system_wakeup();
>> +			else if (device_may_wakeup(&netdev->dev))
>> +				pm_wakeup_dev_event(&netdev->dev, 0, true);
>> +			else if (parent && device_may_wakeup(parent))
>> +				pm_wakeup_dev_event(parent, 0, true);
>> +		}
>> +
>> +		return IRQ_HANDLED;
> I'm not sure you can just throw the interrupt away. There have been
> issues with WoL, where the WoL signal has been applied to a PMC, not
> an actual interrupt. Yet the PHY driver assumes it is an
> interrupt. And in order for WoL to work correctly, it needs the
> interrupt handler to be called. We said the hardware is broken, WoL
> cannot work for that setup.
>
> Here you have correct hardware, but you are throwing the interrupt
> away, which will have the same result. So i think you need to abort
> the suspend, get the bus working again, and call the interrupt
> handler. If this is a WoL interrupt you are supposed to be waking up
> anyway.

This hardware doesn't support wake-on-lan. It looks somehow that it 
manages to throw an interrupt just a moment before the power regulator 
for the whole usb bus is cut off.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-24  6:16               ` Marek Szyprowski
@ 2022-05-24 12:03                 ` Andrew Lunn
  2022-05-26 13:55                   ` Lukas Wunner
  0 siblings, 1 reply; 40+ messages in thread
From: Andrew Lunn @ 2022-05-24 12:03 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Lukas Wunner, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Russell King, Ferry Toth,
	Krzysztof Kozlowski, 'Linux Samsung SOC'

On Tue, May 24, 2022 at 08:16:23AM +0200, Marek Szyprowski wrote:
> On 24.05.2022 03:08, Andrew Lunn wrote:
> >> @@ -976,6 +977,25 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
> >>   	struct phy_driver *drv = phydev->drv;
> >>   	irqreturn_t ret;
> >>   
> >> +	if (IS_ENABLED(CONFIG_PM_SLEEP) &&
> >> +	    (phydev->mdio.dev.power.is_prepared ||
> >> +	     phydev->mdio.dev.power.is_suspended)) {
> >> +		struct net_device *netdev = phydev->attached_dev;
> >> +
> >> +		if (netdev) {
> >> +			struct device *parent = netdev->dev.parent;
> >> +
> >> +			if (netdev->wol_enabled)
> >> +				pm_system_wakeup();
> >> +			else if (device_may_wakeup(&netdev->dev))
> >> +				pm_wakeup_dev_event(&netdev->dev, 0, true);
> >> +			else if (parent && device_may_wakeup(parent))
> >> +				pm_wakeup_dev_event(parent, 0, true);
> >> +		}
> >> +
> >> +		return IRQ_HANDLED;
> > I'm not sure you can just throw the interrupt away. There have been
> > issues with WoL, where the WoL signal has been applied to a PMC, not
> > an actual interrupt. Yet the PHY driver assumes it is an
> > interrupt. And in order for WoL to work correctly, it needs the
> > interrupt handler to be called. We said the hardware is broken, WoL
> > cannot work for that setup.
> >
> > Here you have correct hardware, but you are throwing the interrupt
> > away, which will have the same result. So i think you need to abort
> > the suspend, get the bus working again, and call the interrupt
> > handler. If this is a WoL interrupt you are supposed to be waking up
> > anyway.
> 
> This hardware doesn't support wake-on-lan. It looks somehow that it 
> manages to throw an interrupt just a moment before the power regulator 
> for the whole usb bus is cut off.

Your hardware might not support WOL, but this is generic code. It
needs to work for all hardware.

As for this hardware, if it does not support WOL, why are interrupts
still enabled?

      Andrew

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-24  1:08             ` Andrew Lunn
  2022-05-24  6:16               ` Marek Szyprowski
@ 2022-05-24 12:13               ` Lukas Wunner
  2022-06-06  1:28                 ` Andrew Lunn
  1 sibling, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-05-24 12:13 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Russell King, Ferry Toth,
	Krzysztof Kozlowski, 'Linux Samsung SOC'

On Tue, May 24, 2022 at 03:08:07AM +0200, Andrew Lunn wrote:
> > @@ -976,6 +977,25 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
> >  	struct phy_driver *drv = phydev->drv;
> >  	irqreturn_t ret;
> >  
> > +	if (IS_ENABLED(CONFIG_PM_SLEEP) &&
> > +	    (phydev->mdio.dev.power.is_prepared ||
> > +	     phydev->mdio.dev.power.is_suspended)) {
> > +		struct net_device *netdev = phydev->attached_dev;
> > +
> > +		if (netdev) {
> > +			struct device *parent = netdev->dev.parent;
> > +
> > +			if (netdev->wol_enabled)
> > +				pm_system_wakeup();
> > +			else if (device_may_wakeup(&netdev->dev))
> > +				pm_wakeup_dev_event(&netdev->dev, 0, true);
> > +			else if (parent && device_may_wakeup(parent))
> > +				pm_wakeup_dev_event(parent, 0, true);
> > +		}
> > +
> > +		return IRQ_HANDLED;
> 
> I'm not sure you can just throw the interrupt away. There have been
> issues with WoL, where the WoL signal has been applied to a PMC, not
> an actual interrupt. Yet the PHY driver assumes it is an
> interrupt. And in order for WoL to work correctly, it needs the
> interrupt handler to be called. We said the hardware is broken, WoL
> cannot work for that setup.
> 
> Here you have correct hardware, but you are throwing the interrupt
> away, which will have the same result. So i think you need to abort
> the suspend, get the bus working again, and call the interrupt
> handler. If this is a WoL interrupt you are supposed to be waking up
> anyway.

mdio_bus_phy_resume() does trigger the state machine via
phy_start_machine(), so link state changes *are* detected after wakeup.

But you're saying that's not sufficient and you really want the
PHY driver's IRQ handler to be called, do I understand that correctly?

That could be achieved with a flag indicating that the IRQ handler
needs to be rerun after resume.  A simple invocation of irq_wake_thread()
will then achieve that.

It has also occurred to me that not clearing the IRQ may lead to
an interrupt storm if it's level-triggered.  So I need to disable
the IRQ and re-enable it after the PHY has been resumed.
Back to the drawing board...

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-24 12:03                 ` Andrew Lunn
@ 2022-05-26 13:55                   ` Lukas Wunner
  0 siblings, 0 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-05-26 13:55 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Russell King, Ferry Toth,
	Krzysztof Kozlowski, 'Linux Samsung SOC'

On Tue, May 24, 2022 at 02:03:52PM +0200, Andrew Lunn wrote:
> As for this hardware, if it does not support WOL, why are interrupts
> still enabled?

LAN95xx chips do support WoL and will signal a USB wake event.
But whether that actually results in resume from system sleep
depends on the capabilities of the SoC and its USB host controller.

LAN95xx supports a variety of wake options (WoL, PHY Energy Detect, ...)
and can use either its integrated SMSC PHY or an external PHY.
I'm not sure all wake options will work with arbitrary external PHYs.

If WoL or Wake on PHY Energy Detect is not used, we just program the
LAN95xx to enter a deeper power state which results in the respective
wake events being ignored.  As a result, interrupts may be left enabled
even though they're not used as a wakeup source.  The phylib doesn't
provide an API to selectively disable or enable interrupts, other than
phy_stop() and phy_start(), which does a lot more.

The patch I've submitted today treats such unnecessarily enabled
interrupts leniently:  It will not signal wakeup if that wasn't enabled
and just remembers that an interrupt occurred.  The interrupt will be
replayed upon resume and that's it.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-24 12:13               ` Lukas Wunner
@ 2022-06-06  1:28                 ` Andrew Lunn
  0 siblings, 0 replies; 40+ messages in thread
From: Andrew Lunn @ 2022-06-06  1:28 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Russell King, Ferry Toth,
	Krzysztof Kozlowski, 'Linux Samsung SOC'

> mdio_bus_phy_resume() does trigger the state machine via
> phy_start_machine(), so link state changes *are* detected after wakeup.
> 
> But you're saying that's not sufficient and you really want the
> PHY driver's IRQ handler to be called, do I understand that correctly?

It is an interrupt, so i would expect the handler to be called. I've
never looked deeply how the kernel handles this, but maybe there is
some core support for this. The kernel does know about wake up
interrupts. The interesting bit is how do you defer the interrupt
until you have enough of the system running again you can actually
service the interrupt.

PHY interrupts mostly are level, not edge, because there are multiple
sources of interrupts within the PHY. So you do need to clear the
interrupt source, or you are going to get a storm, as you pointed out.
But being a level might actually help you. It fires once to get you
out of sleep, and then fires again when the interrupt controller is
resumed and is enabled.

	  Andrew
 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-05-19 21:22         ` Marek Szyprowski
  2022-05-23  9:43           ` Lukas Wunner
@ 2022-08-26  6:51           ` Marek Szyprowski
  2022-08-26  7:19             ` Lukas Wunner
  1 sibling, 1 reply; 40+ messages in thread
From: Marek Szyprowski @ 2022-08-26  6:51 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

On 19.05.2022 23:22, Marek Szyprowski wrote:
> On 19.05.2022 21:08, Lukas Wunner wrote:
>> On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
>>> This patch landed in the recent linux next-20220516 as commit
>>> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY 
>>> driver to
>>> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet 
>>> operation
>>> after system suspend-resume cycle. On the Odroid XU3 board I got the
>>> following warning in the kernel log:
>>>
>>> # time rtcwake -s10 -mmem
>>> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 09:16:07 2022
>>> PM: suspend entry (deep)
>>> Filesystems sync: 0.001 seconds
>>> Freezing user space processes ... (elapsed 0.002 seconds) done.
>>> OOM killer disabled.
>>> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>>> printk: Suspending console(s) (use no_console_suspend to debug)
>>> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
>>> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
>>> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
>>> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
>>> phy_state_machine+0x98/0x28c
>> [...]
>>> It looks that the driver's suspend/resume operations might need some
>>> adjustments. After the system suspend/resume cycle the driver is not
>>> operational anymore. Reverting the $subject patch on top of linux
>>> next-20220516 restores ethernet operation after system suspend/resume.
>> Thanks a lot for the report.  It seems the PHY is signaling a link 
>> change
>> shortly before system sleep and by the time the phy_state_machine() 
>> worker
>> gets around to handle it, the device has already been suspended and thus
>> refuses any further USB requests with -EHOSTUNREACH (-113):
>>
>> usb_suspend_both()
>>    usb_suspend_interface()
>>      smsc95xx_suspend()
>>        usbnet_suspend()
>>          __usbnet_status_stop_force() # stops interrupt polling,
>>                                       # link change is signaled 
>> before this
>>
>>    udev->can_submit = 0               # refuse further URBs
>>
>> Assuming the above theory is correct, calling phy_stop_machine()
>> after usbnet_suspend() would be sufficient to fix the issue.
>> It cancels the phy_state_machine() worker.
>>
>> The small patch below does that.  Could you give it a spin?
>
> That's it. Your analysis is right and the patch fixes the issue. Thanks!
>
> Feel free to add:
>
> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
>
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>


Gentle ping for the final patch...

It looks that the similar fix is posted for other drivers, i.e.:

https://lore.kernel.org/all/20220825023951.3220-1-f.fainelli@gmail.com/


>
>> Taking a step back though, I'm wondering if there's a bigger problem 
>> here:
>> This is a USB device, so we stop receiving interrupts once the Interrupt
>> Endpoint is no longer polled.  But what if a PHY's interrupt is attached
>> to a GPIO of the SoC and that interrupt is raised while the system is
>> suspending?  The interrupt handler may likewise try to reach an
>> inaccessible (suspended) device.
>>
>> The right thing to do would probably be to signal wakeup.  But the
>> PHY drivers' irq handlers instead schedule the phy_state_machine().
>> Perhaps we need something like the following at the top of
>> phy_state_machine():
>>
>>     if (phydev->suspended) {
>>         pm_wakeup_dev_event(&phydev->mdio.dev, 0, true);
>>         return;
>>     }
>>
>> However, phydev->suspended is set at the *bottom* of phy_suspend(),
>> it would have to be set at the *top* of mdio_bus_phy_suspend()
>> for the above to be correct.  Hmmm...
> Well, your concern sounds valid, but I don't have a board with such hw 
> configuration, so I cannot really test.
>> Thanks,
>>
>> Lukas
>>
>> -- >8 --
>> diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
>> index bd03e16..d351a6c 100644
>> --- a/drivers/net/usb/smsc95xx.c
>> +++ b/drivers/net/usb/smsc95xx.c
>> @@ -1201,6 +1201,7 @@ static int smsc95xx_bind(struct usbnet *dev, 
>> struct usb_interface *intf)
>>       }
>>         pdata->phydev->irq = phy_irq;
>> +    pdata->phydev->mac_managed_pm = true;
>>       pdata->phydev->is_internal = is_internal_phy;
>>         /* detect device revision as different features may be 
>> available */
>> @@ -1496,6 +1497,9 @@ static int smsc95xx_suspend(struct 
>> usb_interface *intf, pm_message_t message)
>>           return ret;
>>       }
>>   +    if (netif_running(dev->net))
>> +        phy_stop(pdata->phydev);
>> +
>>       if (pdata->suspend_flags) {
>>           netdev_warn(dev->net, "error during last resume\n");
>>           pdata->suspend_flags = 0;
>> @@ -1778,6 +1782,8 @@ static int smsc95xx_resume(struct usb_interface 
>> *intf)
>>       }
>>         phy_init_hw(pdata->phydev);
>> +    if (netif_running(dev->net))
>> +        phy_start(pdata->phydev);
>>         ret = usbnet_resume(intf);
>>       if (ret < 0)
>>
> Best regards

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-08-26  6:51           ` Marek Szyprowski
@ 2022-08-26  7:19             ` Lukas Wunner
  2022-08-26  7:41               ` Marek Szyprowski
  0 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-08-26  7:19 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

On Fri, Aug 26, 2022 at 08:51:58AM +0200, Marek Szyprowski wrote:
> On 19.05.2022 23:22, Marek Szyprowski wrote:
> > On 19.05.2022 21:08, Lukas Wunner wrote:
> >> On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
> >>> This patch landed in the recent linux next-20220516 as commit
> >>> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY 
> >>> driver to
> >>> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet 
> >>> operation
> >>> after system suspend-resume cycle. On the Odroid XU3 board I got the
> >>> following warning in the kernel log:
> >>>
> >>> # time rtcwake -s10 -mmem
> >>> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 09:16:07 2022
> >>> PM: suspend entry (deep)
> >>> Filesystems sync: 0.001 seconds
> >>> Freezing user space processes ... (elapsed 0.002 seconds) done.
> >>> OOM killer disabled.
> >>> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> >>> printk: Suspending console(s) (use no_console_suspend to debug)
> >>> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
> >>> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
> >>> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
> >>> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
> >>> ------------[ cut here ]------------
> >>> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
> >>> phy_state_machine+0x98/0x28c
> >> [...]
> >>> It looks that the driver's suspend/resume operations might need some
> >>> adjustments. After the system suspend/resume cycle the driver is not
> >>> operational anymore. Reverting the $subject patch on top of linux
> >>> next-20220516 restores ethernet operation after system suspend/resume.
> >> Thanks a lot for the report. It seems the PHY is signaling a link 
> >> change
> >> shortly before system sleep and by the time the phy_state_machine() 
> >> worker
> >> gets around to handle it, the device has already been suspended and thus
> >> refuses any further USB requests with -EHOSTUNREACH (-113):
[...]
> >> Assuming the above theory is correct, calling phy_stop_machine()
> >> after usbnet_suspend() would be sufficient to fix the issue.
> >> It cancels the phy_state_machine() worker.
> >>
> >> The small patch below does that. Could you give it a spin?
> >
> > That's it. Your analysis is right and the patch fixes the issue. Thanks!
> >
> > Feel free to add:
> >
> > Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >
> > Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> 
> Gentle ping for the final patch...

Hm?  Actually this issue is supposed to be fixed by mainline commit
1758bde2e4aa ("net: phy: Don't trigger state machine while in suspend").

The initial fix attempt that you're replying to should not be necessary
with that commit.

Are you still seeing issues even with 1758bde2e4aa applied?
Or are you maybe using a custom downstream tree which is missing that commit?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-08-26  7:19             ` Lukas Wunner
@ 2022-08-26  7:41               ` Marek Szyprowski
  2022-08-26  7:53                 ` Lukas Wunner
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Szyprowski @ 2022-08-26  7:41 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

Hi Lukas,

On 26.08.2022 09:19, Lukas Wunner wrote:
> On Fri, Aug 26, 2022 at 08:51:58AM +0200, Marek Szyprowski wrote:
>> On 19.05.2022 23:22, Marek Szyprowski wrote:
>>> On 19.05.2022 21:08, Lukas Wunner wrote:
>>>> On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
>>>>> This patch landed in the recent linux next-20220516 as commit
>>>>> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY
>>>>> driver to
>>>>> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet
>>>>> operation
>>>>> after system suspend-resume cycle. On the Odroid XU3 board I got the
>>>>> following warning in the kernel log:
>>>>>
>>>>> # time rtcwake -s10 -mmem
>>>>> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 09:16:07 2022
>>>>> PM: suspend entry (deep)
>>>>> Filesystems sync: 0.001 seconds
>>>>> Freezing user space processes ... (elapsed 0.002 seconds) done.
>>>>> OOM killer disabled.
>>>>> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>>>>> printk: Suspending console(s) (use no_console_suspend to debug)
>>>>> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
>>>>> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
>>>>> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
>>>>> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
>>>>> ------------[ cut here ]------------
>>>>> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
>>>>> phy_state_machine+0x98/0x28c
>>>> [...]
>>>>> It looks that the driver's suspend/resume operations might need some
>>>>> adjustments. After the system suspend/resume cycle the driver is not
>>>>> operational anymore. Reverting the $subject patch on top of linux
>>>>> next-20220516 restores ethernet operation after system suspend/resume.
>>>> Thanks a lot for the report. It seems the PHY is signaling a link
>>>> change
>>>> shortly before system sleep and by the time the phy_state_machine()
>>>> worker
>>>> gets around to handle it, the device has already been suspended and thus
>>>> refuses any further USB requests with -EHOSTUNREACH (-113):
> [...]
>>>> Assuming the above theory is correct, calling phy_stop_machine()
>>>> after usbnet_suspend() would be sufficient to fix the issue.
>>>> It cancels the phy_state_machine() worker.
>>>>
>>>> The small patch below does that. Could you give it a spin?
>>> That's it. Your analysis is right and the patch fixes the issue. Thanks!
>>>
>>> Feel free to add:
>>>
>>> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>
>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> Gentle ping for the final patch...
> Hm?  Actually this issue is supposed to be fixed by mainline commit
> 1758bde2e4aa ("net: phy: Don't trigger state machine while in suspend").
>
> The initial fix attempt that you're replying to should not be necessary
> with that commit.
>
> Are you still seeing issues even with 1758bde2e4aa applied?
> Or are you maybe using a custom downstream tree which is missing that commit?

On Linux next-20220825 I still get the following warning during 
suspend/resume cycle:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 1483 at drivers/net/phy/phy_device.c:323 
mdio_bus_phy_resume+0x10c/0x110
Modules linked in: exynos_gsc s5p_jpeg s5p_mfc videobuf2_dma_contig 
v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev 
mc s5p_cec
CPU: 0 PID: 1483 Comm: rtcwake Not tainted 6.0.0-rc2-next-20220825 #5482
Hardware name: Samsung Exynos (Flattened Device Tree)
  unwind_backtrace from show_stack+0x10/0x14
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from __warn+0xc8/0x220
  __warn from warn_slowpath_fmt+0x5c/0xb4
  warn_slowpath_fmt from mdio_bus_phy_resume+0x10c/0x110
  mdio_bus_phy_resume from dpm_run_callback+0x94/0x208
  dpm_run_callback from device_resume+0x124/0x21c
  device_resume from dpm_resume+0x108/0x278
  dpm_resume from dpm_resume_end+0xc/0x18
  dpm_resume_end from suspend_devices_and_enter+0x208/0x70c
  suspend_devices_and_enter from pm_suspend+0x364/0x430
  pm_suspend from state_store+0x68/0xc8
  state_store from kernfs_fop_write_iter+0x110/0x1d4
  kernfs_fop_write_iter from vfs_write+0x1c4/0x2ac
  vfs_write from ksys_write+0x5c/0xd4
  ksys_write from ret_fast_syscall+0x0/0x1c
Exception stack(0xf2ee5fa8 to 0xf2ee5ff0)
5fa0:                   00000004 0002b438 00000004 0002b438 00000004 
00000000
5fc0: 00000004 0002b438 000291b0 00000004 0002b438 00000004 befd9c1c 
00028160
5fe0: 0000006c befd9ae8 b6eb4148 b6f118a4
irq event stamp: 58381
hardirqs last  enabled at (58393): [<c019ff28>] vprintk_emit+0x320/0x344
hardirqs last disabled at (58400): [<c019fedc>] vprintk_emit+0x2d4/0x344
softirqs last  enabled at (58258): [<c0101694>] __do_softirq+0x354/0x618
softirqs last disabled at (58247): [<c012dd18>] __irq_exit_rcu+0x140/0x1ec
---[ end trace 0000000000000000 ]---

The mentioned patch fixes it.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-08-26  7:41               ` Marek Szyprowski
@ 2022-08-26  7:53                 ` Lukas Wunner
  2022-08-26  8:29                   ` Marek Szyprowski
  0 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-08-26  7:53 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

On Fri, Aug 26, 2022 at 09:41:46AM +0200, Marek Szyprowski wrote:
> On 26.08.2022 09:19, Lukas Wunner wrote:
> > On Fri, Aug 26, 2022 at 08:51:58AM +0200, Marek Szyprowski wrote:
> >> On 19.05.2022 23:22, Marek Szyprowski wrote:
> >>> On 19.05.2022 21:08, Lukas Wunner wrote:
> >>>> On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
> >>>>> This patch landed in the recent linux next-20220516 as commit
> >>>>> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY
> >>>>> driver to
> >>>>> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet
> >>>>> operation
> >>>>> after system suspend-resume cycle. On the Odroid XU3 board I got the
> >>>>> following warning in the kernel log:
> >>>>>
> >>>>> # time rtcwake -s10 -mmem
> >>>>> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 09:16:07 2022
> >>>>> PM: suspend entry (deep)
> >>>>> Filesystems sync: 0.001 seconds
> >>>>> Freezing user space processes ... (elapsed 0.002 seconds) done.
> >>>>> OOM killer disabled.
> >>>>> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
> >>>>> printk: Suspending console(s) (use no_console_suspend to debug)
> >>>>> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
> >>>>> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
> >>>>> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
> >>>>> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
> >>>>> ------------[ cut here ]------------
> >>>>> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
> >>>>> phy_state_machine+0x98/0x28c
> >>>> [...]
> >>>>> It looks that the driver's suspend/resume operations might need some
> >>>>> adjustments. After the system suspend/resume cycle the driver is not
> >>>>> operational anymore. Reverting the $subject patch on top of linux
> >>>>> next-20220516 restores ethernet operation after system suspend/resume.
> >>>> Thanks a lot for the report. It seems the PHY is signaling a link
> >>>> change
> >>>> shortly before system sleep and by the time the phy_state_machine()
> >>>> worker
> >>>> gets around to handle it, the device has already been suspended and thus
> >>>> refuses any further USB requests with -EHOSTUNREACH (-113):
> > [...]
> >>>> Assuming the above theory is correct, calling phy_stop_machine()
> >>>> after usbnet_suspend() would be sufficient to fix the issue.
> >>>> It cancels the phy_state_machine() worker.
> >>>>
> >>>> The small patch below does that. Could you give it a spin?
> >>> That's it. Your analysis is right and the patch fixes the issue. Thanks!
> >>>
> >>> Feel free to add:
> >>>
> >>> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >>>
> >>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >> Gentle ping for the final patch...
> > Hm?  Actually this issue is supposed to be fixed by mainline commit
> > 1758bde2e4aa ("net: phy: Don't trigger state machine while in suspend").
> >
> > The initial fix attempt that you're replying to should not be necessary
> > with that commit.
> >
> > Are you still seeing issues even with 1758bde2e4aa applied?
> > Or are you maybe using a custom downstream tree which is missing that commit?
> 
> On Linux next-20220825 I still get the following warning during 
> suspend/resume cycle:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 1483 at drivers/net/phy/phy_device.c:323 
> mdio_bus_phy_resume+0x10c/0x110
> Modules linked in: exynos_gsc s5p_jpeg s5p_mfc videobuf2_dma_contig 
> v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev 
> mc s5p_cec
> CPU: 0 PID: 1483 Comm: rtcwake Not tainted 6.0.0-rc2-next-20220825 #5482
> Hardware name: Samsung Exynos (Flattened Device Tree)
>   unwind_backtrace from show_stack+0x10/0x14
>   show_stack from dump_stack_lvl+0x58/0x70
>   dump_stack_lvl from __warn+0xc8/0x220
>   __warn from warn_slowpath_fmt+0x5c/0xb4
>   warn_slowpath_fmt from mdio_bus_phy_resume+0x10c/0x110
>   mdio_bus_phy_resume from dpm_run_callback+0x94/0x208
>   dpm_run_callback from device_resume+0x124/0x21c
>   device_resume from dpm_resume+0x108/0x278
>   dpm_resume from dpm_resume_end+0xc/0x18
>   dpm_resume_end from suspend_devices_and_enter+0x208/0x70c
>   suspend_devices_and_enter from pm_suspend+0x364/0x430
>   pm_suspend from state_store+0x68/0xc8
>   state_store from kernfs_fop_write_iter+0x110/0x1d4
>   kernfs_fop_write_iter from vfs_write+0x1c4/0x2ac
>   vfs_write from ksys_write+0x5c/0xd4
>   ksys_write from ret_fast_syscall+0x0/0x1c
> Exception stack(0xf2ee5fa8 to 0xf2ee5ff0)
> 5fa0:                   00000004 0002b438 00000004 0002b438 00000004 
> 00000000
> 5fc0: 00000004 0002b438 000291b0 00000004 0002b438 00000004 befd9c1c 
> 00028160
> 5fe0: 0000006c befd9ae8 b6eb4148 b6f118a4
> irq event stamp: 58381
> hardirqs last  enabled at (58393): [<c019ff28>] vprintk_emit+0x320/0x344
> hardirqs last disabled at (58400): [<c019fedc>] vprintk_emit+0x2d4/0x344
> softirqs last  enabled at (58258): [<c0101694>] __do_softirq+0x354/0x618
> softirqs last disabled at (58247): [<c012dd18>] __irq_exit_rcu+0x140/0x1ec
> ---[ end trace 0000000000000000 ]---
> 
> The mentioned patch fixes it.

Color me confused.

With "the mentioned patch", are you referring to 1758bde2e4aa
or are you referring to the little test patch in this e-mail:
https://lore.kernel.org/netdev/20220519190841.GA30869@wunner.de/

There's a Tested-by from you on 1758bde2e4aa, so I assume the
commit fixed the issue at the time.  Does it still occur
intermittently?  Or does it occur every time?  In the latter case,
I'd assume some other change was made in the meantime and that
other change exposed the issue again...

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-08-26  7:53                 ` Lukas Wunner
@ 2022-08-26  8:29                   ` Marek Szyprowski
  2022-08-29 11:40                     ` Marek Szyprowski
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Szyprowski @ 2022-08-26  8:29 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

Hi Lukas,

On 26.08.2022 09:53, Lukas Wunner wrote:
> On Fri, Aug 26, 2022 at 09:41:46AM +0200, Marek Szyprowski wrote:
>> On 26.08.2022 09:19, Lukas Wunner wrote:
>>> On Fri, Aug 26, 2022 at 08:51:58AM +0200, Marek Szyprowski wrote:
>>>> On 19.05.2022 23:22, Marek Szyprowski wrote:
>>>>> On 19.05.2022 21:08, Lukas Wunner wrote:
>>>>>> On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
>>>>>>> This patch landed in the recent linux next-20220516 as commit
>>>>>>> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY
>>>>>>> driver to
>>>>>>> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet
>>>>>>> operation
>>>>>>> after system suspend-resume cycle. On the Odroid XU3 board I got the
>>>>>>> following warning in the kernel log:
>>>>>>>
>>>>>>> # time rtcwake -s10 -mmem
>>>>>>> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 09:16:07 2022
>>>>>>> PM: suspend entry (deep)
>>>>>>> Filesystems sync: 0.001 seconds
>>>>>>> Freezing user space processes ... (elapsed 0.002 seconds) done.
>>>>>>> OOM killer disabled.
>>>>>>> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
>>>>>>> printk: Suspending console(s) (use no_console_suspend to debug)
>>>>>>> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
>>>>>>> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
>>>>>>> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
>>>>>>> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
>>>>>>> ------------[ cut here ]------------
>>>>>>> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
>>>>>>> phy_state_machine+0x98/0x28c
>>>>>> [...]
>>>>>>> It looks that the driver's suspend/resume operations might need some
>>>>>>> adjustments. After the system suspend/resume cycle the driver is not
>>>>>>> operational anymore. Reverting the $subject patch on top of linux
>>>>>>> next-20220516 restores ethernet operation after system suspend/resume.
>>>>>> Thanks a lot for the report. It seems the PHY is signaling a link
>>>>>> change
>>>>>> shortly before system sleep and by the time the phy_state_machine()
>>>>>> worker
>>>>>> gets around to handle it, the device has already been suspended and thus
>>>>>> refuses any further USB requests with -EHOSTUNREACH (-113):
>>> [...]
>>>>>> Assuming the above theory is correct, calling phy_stop_machine()
>>>>>> after usbnet_suspend() would be sufficient to fix the issue.
>>>>>> It cancels the phy_state_machine() worker.
>>>>>>
>>>>>> The small patch below does that. Could you give it a spin?
>>>>> That's it. Your analysis is right and the patch fixes the issue. Thanks!
>>>>>
>>>>> Feel free to add:
>>>>>
>>>>> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>>
>>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>> Gentle ping for the final patch...
>>> Hm?  Actually this issue is supposed to be fixed by mainline commit
>>> 1758bde2e4aa ("net: phy: Don't trigger state machine while in suspend").
>>>
>>> The initial fix attempt that you're replying to should not be necessary
>>> with that commit.
>>>
>>> Are you still seeing issues even with 1758bde2e4aa applied?
>>> Or are you maybe using a custom downstream tree which is missing that commit?
>> On Linux next-20220825 I still get the following warning during
>> suspend/resume cycle:
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 1483 at drivers/net/phy/phy_device.c:323
>> mdio_bus_phy_resume+0x10c/0x110
>> Modules linked in: exynos_gsc s5p_jpeg s5p_mfc videobuf2_dma_contig
>> v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev
>> mc s5p_cec
>> CPU: 0 PID: 1483 Comm: rtcwake Not tainted 6.0.0-rc2-next-20220825 #5482
>> Hardware name: Samsung Exynos (Flattened Device Tree)
>>    unwind_backtrace from show_stack+0x10/0x14
>>    show_stack from dump_stack_lvl+0x58/0x70
>>    dump_stack_lvl from __warn+0xc8/0x220
>>    __warn from warn_slowpath_fmt+0x5c/0xb4
>>    warn_slowpath_fmt from mdio_bus_phy_resume+0x10c/0x110
>>    mdio_bus_phy_resume from dpm_run_callback+0x94/0x208
>>    dpm_run_callback from device_resume+0x124/0x21c
>>    device_resume from dpm_resume+0x108/0x278
>>    dpm_resume from dpm_resume_end+0xc/0x18
>>    dpm_resume_end from suspend_devices_and_enter+0x208/0x70c
>>    suspend_devices_and_enter from pm_suspend+0x364/0x430
>>    pm_suspend from state_store+0x68/0xc8
>>    state_store from kernfs_fop_write_iter+0x110/0x1d4
>>    kernfs_fop_write_iter from vfs_write+0x1c4/0x2ac
>>    vfs_write from ksys_write+0x5c/0xd4
>>    ksys_write from ret_fast_syscall+0x0/0x1c
>> Exception stack(0xf2ee5fa8 to 0xf2ee5ff0)
>> 5fa0:                   00000004 0002b438 00000004 0002b438 00000004
>> 00000000
>> 5fc0: 00000004 0002b438 000291b0 00000004 0002b438 00000004 befd9c1c
>> 00028160
>> 5fe0: 0000006c befd9ae8 b6eb4148 b6f118a4
>> irq event stamp: 58381
>> hardirqs last  enabled at (58393): [<c019ff28>] vprintk_emit+0x320/0x344
>> hardirqs last disabled at (58400): [<c019fedc>] vprintk_emit+0x2d4/0x344
>> softirqs last  enabled at (58258): [<c0101694>] __do_softirq+0x354/0x618
>> softirqs last disabled at (58247): [<c012dd18>] __irq_exit_rcu+0x140/0x1ec
>> ---[ end trace 0000000000000000 ]---
>>
>> The mentioned patch fixes it.
> Color me confused.
>
> With "the mentioned patch", are you referring to 1758bde2e4aa
> or are you referring to the little test patch in this e-mail:
> https://lore.kernel.org/netdev/20220519190841.GA30869@wunner.de/
>
> There's a Tested-by from you on 1758bde2e4aa, so I assume the
> commit fixed the issue at the time.  Does it still occur
> intermittently?  Or does it occur every time?  In the latter case,
> I'd assume some other change was made in the meantime and that
> other change exposed the issue again...

The issue seems to be exposed again. On 1758bde2e4aa everything works 
fine and there is no warning during suspend/resume cycle. Then I've 
noticed that warning again while playing with current linux-next. I will 
check when it got broken and report again.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-08-26  8:29                   ` Marek Szyprowski
@ 2022-08-29 11:40                     ` Marek Szyprowski
  2022-09-18 19:13                       ` Lukas Wunner
  0 siblings, 1 reply; 40+ messages in thread
From: Marek Szyprowski @ 2022-08-29 11:40 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

Hi All,

On 26.08.2022 10:29, Marek Szyprowski wrote:
> On 26.08.2022 09:53, Lukas Wunner wrote:
>> On Fri, Aug 26, 2022 at 09:41:46AM +0200, Marek Szyprowski wrote:
>>> On 26.08.2022 09:19, Lukas Wunner wrote:
>>>> On Fri, Aug 26, 2022 at 08:51:58AM +0200, Marek Szyprowski wrote:
>>>>> On 19.05.2022 23:22, Marek Szyprowski wrote:
>>>>>> On 19.05.2022 21:08, Lukas Wunner wrote:
>>>>>>> On Tue, May 17, 2022 at 12:18:45PM +0200, Marek Szyprowski wrote:
>>>>>>>> This patch landed in the recent linux next-20220516 as commit
>>>>>>>> 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY
>>>>>>>> driver to
>>>>>>>> avoid polling"). Unfortunately it breaks smsc95xx usb ethernet
>>>>>>>> operation
>>>>>>>> after system suspend-resume cycle. On the Odroid XU3 board I 
>>>>>>>> got the
>>>>>>>> following warning in the kernel log:
>>>>>>>>
>>>>>>>> # time rtcwake -s10 -mmem
>>>>>>>> rtcwake: wakeup from "mem" using /dev/rtc0 at Tue May 17 
>>>>>>>> 09:16:07 2022
>>>>>>>> PM: suspend entry (deep)
>>>>>>>> Filesystems sync: 0.001 seconds
>>>>>>>> Freezing user space processes ... (elapsed 0.002 seconds) done.
>>>>>>>> OOM killer disabled.
>>>>>>>> Freezing remaining freezable tasks ... (elapsed 0.001 seconds) 
>>>>>>>> done.
>>>>>>>> printk: Suspending console(s) (use no_console_suspend to debug)
>>>>>>>> smsc95xx 4-1.1:1.0 eth0: entering SUSPEND2 mode
>>>>>>>> smsc95xx 4-1.1:1.0 eth0: Failed to read reg index 0x00000114: -113
>>>>>>>> smsc95xx 4-1.1:1.0 eth0: Error reading MII_ACCESS
>>>>>>>> smsc95xx 4-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
>>>>>>>> ------------[ cut here ]------------
>>>>>>>> WARNING: CPU: 2 PID: 73 at drivers/net/phy/phy.c:946
>>>>>>>> phy_state_machine+0x98/0x28c
>>>>>>> [...]
>>>>>>>> It looks that the driver's suspend/resume operations might need 
>>>>>>>> some
>>>>>>>> adjustments. After the system suspend/resume cycle the driver 
>>>>>>>> is not
>>>>>>>> operational anymore. Reverting the $subject patch on top of linux
>>>>>>>> next-20220516 restores ethernet operation after system 
>>>>>>>> suspend/resume.
>>>>>>> Thanks a lot for the report. It seems the PHY is signaling a link
>>>>>>> change
>>>>>>> shortly before system sleep and by the time the phy_state_machine()
>>>>>>> worker
>>>>>>> gets around to handle it, the device has already been suspended 
>>>>>>> and thus
>>>>>>> refuses any further USB requests with -EHOSTUNREACH (-113):
>>>> [...]
>>>>>>> Assuming the above theory is correct, calling phy_stop_machine()
>>>>>>> after usbnet_suspend() would be sufficient to fix the issue.
>>>>>>> It cancels the phy_state_machine() worker.
>>>>>>>
>>>>>>> The small patch below does that. Could you give it a spin?
>>>>>> That's it. Your analysis is right and the patch fixes the issue. 
>>>>>> Thanks!
>>>>>>
>>>>>> Feel free to add:
>>>>>>
>>>>>> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>>>
>>>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>> Gentle ping for the final patch...
>>>> Hm?  Actually this issue is supposed to be fixed by mainline commit
>>>> 1758bde2e4aa ("net: phy: Don't trigger state machine while in 
>>>> suspend").
>>>>
>>>> The initial fix attempt that you're replying to should not be 
>>>> necessary
>>>> with that commit.
>>>>
>>>> Are you still seeing issues even with 1758bde2e4aa applied?
>>>> Or are you maybe using a custom downstream tree which is missing 
>>>> that commit?
>>> On Linux next-20220825 I still get the following warning during
>>> suspend/resume cycle:
>>>
>>> ------------[ cut here ]------------
>>> WARNING: CPU: 0 PID: 1483 at drivers/net/phy/phy_device.c:323
>>> mdio_bus_phy_resume+0x10c/0x110
>>> Modules linked in: exynos_gsc s5p_jpeg s5p_mfc videobuf2_dma_contig
>>> v4l2_mem2mem videobuf2_memops videobuf2_v4l2 videobuf2_common videodev
>>> mc s5p_cec
>>> CPU: 0 PID: 1483 Comm: rtcwake Not tainted 6.0.0-rc2-next-20220825 
>>> #5482
>>> Hardware name: Samsung Exynos (Flattened Device Tree)
>>>    unwind_backtrace from show_stack+0x10/0x14
>>>    show_stack from dump_stack_lvl+0x58/0x70
>>>    dump_stack_lvl from __warn+0xc8/0x220
>>>    __warn from warn_slowpath_fmt+0x5c/0xb4
>>>    warn_slowpath_fmt from mdio_bus_phy_resume+0x10c/0x110
>>>    mdio_bus_phy_resume from dpm_run_callback+0x94/0x208
>>>    dpm_run_callback from device_resume+0x124/0x21c
>>>    device_resume from dpm_resume+0x108/0x278
>>>    dpm_resume from dpm_resume_end+0xc/0x18
>>>    dpm_resume_end from suspend_devices_and_enter+0x208/0x70c
>>>    suspend_devices_and_enter from pm_suspend+0x364/0x430
>>>    pm_suspend from state_store+0x68/0xc8
>>>    state_store from kernfs_fop_write_iter+0x110/0x1d4
>>>    kernfs_fop_write_iter from vfs_write+0x1c4/0x2ac
>>>    vfs_write from ksys_write+0x5c/0xd4
>>>    ksys_write from ret_fast_syscall+0x0/0x1c
>>> Exception stack(0xf2ee5fa8 to 0xf2ee5ff0)
>>> 5fa0:                   00000004 0002b438 00000004 0002b438 00000004
>>> 00000000
>>> 5fc0: 00000004 0002b438 000291b0 00000004 0002b438 00000004 befd9c1c
>>> 00028160
>>> 5fe0: 0000006c befd9ae8 b6eb4148 b6f118a4
>>> irq event stamp: 58381
>>> hardirqs last  enabled at (58393): [<c019ff28>] 
>>> vprintk_emit+0x320/0x344
>>> hardirqs last disabled at (58400): [<c019fedc>] 
>>> vprintk_emit+0x2d4/0x344
>>> softirqs last  enabled at (58258): [<c0101694>] 
>>> __do_softirq+0x354/0x618
>>> softirqs last disabled at (58247): [<c012dd18>] 
>>> __irq_exit_rcu+0x140/0x1ec
>>> ---[ end trace 0000000000000000 ]---
>>>
>>> The mentioned patch fixes it.
>> Color me confused.
>>
>> With "the mentioned patch", are you referring to 1758bde2e4aa
>> or are you referring to the little test patch in this e-mail:
>> https://lore.kernel.org/netdev/20220519190841.GA30869@wunner.de/
>>
>> There's a Tested-by from you on 1758bde2e4aa, so I assume the
>> commit fixed the issue at the time.  Does it still occur
>> intermittently?  Or does it occur every time?  In the latter case,
>> I'd assume some other change was made in the meantime and that
>> other change exposed the issue again...
>
> The issue seems to be exposed again. On 1758bde2e4aa everything works 
> fine and there is no warning during suspend/resume cycle. Then I've 
> noticed that warning again while playing with current linux-next. I 
> will check when it got broken and report again.

I've finally traced what has happened. I've double checked and indeed 
the 1758bde2e4aa commit fixed the issue on next-20220516 kernel and as 
such it has been merged to linus tree. Then the commit 744d23c71af3 
("net: phy: Warn about incorrect mdio_bus_phy_resume() state") has been 
merged to linus tree, which triggers a new warning during the 
suspend/resume cycle with smsc95xx driver. Please note, that the 
smsc95xx still works fine regardless that warning. However it look that 
the commit 1758bde2e4aa only hide a real problem, which the commit 
744d23c71af3 warns about.

Probably a proper fix for smsc95xx driver is to call phy_stop/start 
during suspend/resume cycle, like in similar patches for other drivers:

https://lore.kernel.org/all/20220825023951.3220-1-f.fainelli@gmail.com/

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-08-29 11:40                     ` Marek Szyprowski
@ 2022-09-18 19:13                       ` Lukas Wunner
  2022-09-18 20:41                         ` Florian Fainelli
  2022-09-22 13:21                         ` Marek Szyprowski
  0 siblings, 2 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-09-18 19:13 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC',
	Florian Fainelli

[cc += Florian]

On Mon, Aug 29, 2022 at 01:40:05PM +0200, Marek Szyprowski wrote:
> I've finally traced what has happened. I've double checked and indeed 
> the 1758bde2e4aa commit fixed the issue on next-20220516 kernel and as 
> such it has been merged to linus tree. Then the commit 744d23c71af3 
> ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") has been 
> merged to linus tree, which triggers a new warning during the 
> suspend/resume cycle with smsc95xx driver. Please note, that the 
> smsc95xx still works fine regardless that warning. However it look that 
> the commit 1758bde2e4aa only hide a real problem, which the commit 
> 744d23c71af3 warns about.
> 
> Probably a proper fix for smsc95xx driver is to call phy_stop/start 
> during suspend/resume cycle, like in similar patches for other drivers:
> 
> https://lore.kernel.org/all/20220825023951.3220-1-f.fainelli@gmail.com/

No, smsc95xx.c relies on mdio_bus_phy_{suspend,resume}() and there's
no need to call phy_{stop,start}().

744d23c71af3 was flawed and 6dbe852c379f has already fixed a portion
of the fallout.

However the WARN() condition still seems too broad and causes false
positives such as in your case.  In particular, mdio_bus_phy_suspend()
may leave the device in PHY_UP state, so that's a legal state that
needs to be exempted from the WARN().

Does the issue still appear even after 6dbe852c379f?

If it does, could you test whether exempting PHY_UP silences the
gratuitous WARN splat?  I.e.:

-	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY);
+	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
+		phydev->state != PHY_UP);

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-09-18 19:13                       ` Lukas Wunner
@ 2022-09-18 20:41                         ` Florian Fainelli
  2022-09-18 20:55                           ` Lukas Wunner
  2022-09-22 13:21                         ` Marek Szyprowski
  1 sibling, 1 reply; 40+ messages in thread
From: Florian Fainelli @ 2022-09-18 20:41 UTC (permalink / raw)
  To: Lukas Wunner, Marek Szyprowski
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'



On 9/18/2022 12:13 PM, Lukas Wunner wrote:
> [cc += Florian]
> 
> On Mon, Aug 29, 2022 at 01:40:05PM +0200, Marek Szyprowski wrote:
>> I've finally traced what has happened. I've double checked and indeed
>> the 1758bde2e4aa commit fixed the issue on next-20220516 kernel and as
>> such it has been merged to linus tree. Then the commit 744d23c71af3
>> ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") has been
>> merged to linus tree, which triggers a new warning during the
>> suspend/resume cycle with smsc95xx driver. Please note, that the
>> smsc95xx still works fine regardless that warning. However it look that
>> the commit 1758bde2e4aa only hide a real problem, which the commit
>> 744d23c71af3 warns about.
>>
>> Probably a proper fix for smsc95xx driver is to call phy_stop/start
>> during suspend/resume cycle, like in similar patches for other drivers:
>>
>> https://lore.kernel.org/all/20220825023951.3220-1-f.fainelli@gmail.com/
> 
> No, smsc95xx.c relies on mdio_bus_phy_{suspend,resume}() and there's
> no need to call phy_{stop,start}() >
> 744d23c71af3 was flawed and 6dbe852c379f has already fixed a portion
> of the fallout.
> 
> However the WARN() condition still seems too broad and causes false
> positives such as in your case.  In particular, mdio_bus_phy_suspend()
> may leave the device in PHY_UP state, so that's a legal state that
> needs to be exempted from the WARN().

How is that a legal state when the PHY should be suspended? Even if we 
are interrupt driven, the state machine should be stopped, does not mean 
that Wake-on-LAN or other activity interrupts should be disabled.

> 
> Does the issue still appear even after 6dbe852c379f?
> 
> If it does, could you test whether exempting PHY_UP silences the
> gratuitous WARN splat?  I.e.:

If you allow PHY_UP, then the warning becomes effectively useless, so I 
don't believe this is quite what you want to do here.
-- 
Florian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-09-18 20:41                         ` Florian Fainelli
@ 2022-09-18 20:55                           ` Lukas Wunner
  2022-09-18 22:11                             ` Florian Fainelli
  0 siblings, 1 reply; 40+ messages in thread
From: Lukas Wunner @ 2022-09-18 20:55 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

On Sun, Sep 18, 2022 at 01:41:13PM -0700, Florian Fainelli wrote:
> On 9/18/2022 12:13 PM, Lukas Wunner wrote:
> > On Mon, Aug 29, 2022 at 01:40:05PM +0200, Marek Szyprowski wrote:
> > > I've finally traced what has happened. I've double checked and indeed
> > > the 1758bde2e4aa commit fixed the issue on next-20220516 kernel and as
> > > such it has been merged to linus tree. Then the commit 744d23c71af3
> > > ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") has been
> > > merged to linus tree, which triggers a new warning during the
> > > suspend/resume cycle with smsc95xx driver. Please note, that the
> > > smsc95xx still works fine regardless that warning. However it look that
> > > the commit 1758bde2e4aa only hide a real problem, which the commit
> > > 744d23c71af3 warns about.
> > > 
> > > Probably a proper fix for smsc95xx driver is to call phy_stop/start
> > > during suspend/resume cycle, like in similar patches for other drivers:
> > > 
> > > https://lore.kernel.org/all/20220825023951.3220-1-f.fainelli@gmail.com/
> > 
> > No, smsc95xx.c relies on mdio_bus_phy_{suspend,resume}() and there's
> > no need to call phy_{stop,start}() >
> > 744d23c71af3 was flawed and 6dbe852c379f has already fixed a portion
> > of the fallout.
> > 
> > However the WARN() condition still seems too broad and causes false
> > positives such as in your case.  In particular, mdio_bus_phy_suspend()
> > may leave the device in PHY_UP state, so that's a legal state that
> > needs to be exempted from the WARN().
> 
> How is that a legal state when the PHY should be suspended? Even if we are
> interrupt driven, the state machine should be stopped, does not mean that
> Wake-on-LAN or other activity interrupts should be disabled.

mdio_bus_phy_suspend()
  phy_stop_machine()
    phydev->state = PHY_UP  #  if (phydev->state >= PHY_UP)

So apparently PHY_UP is a legal state for a suspended PHY.


> > Does the issue still appear even after 6dbe852c379f?
> > 
> > If it does, could you test whether exempting PHY_UP silences the
> > gratuitous WARN splat?  I.e.:
> 
> If you allow PHY_UP, then the warning becomes effectively useless, so I
> don't believe this is quite what you want to do here.

Hm, maybe the WARN() should be dropped altogether?

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-09-18 20:55                           ` Lukas Wunner
@ 2022-09-18 22:11                             ` Florian Fainelli
  2022-09-23  4:20                               ` Lukas Wunner
  0 siblings, 1 reply; 40+ messages in thread
From: Florian Fainelli @ 2022-09-18 22:11 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'



On 9/18/2022 1:55 PM, Lukas Wunner wrote:
> On Sun, Sep 18, 2022 at 01:41:13PM -0700, Florian Fainelli wrote:
>> On 9/18/2022 12:13 PM, Lukas Wunner wrote:
>>> On Mon, Aug 29, 2022 at 01:40:05PM +0200, Marek Szyprowski wrote:
>>>> I've finally traced what has happened. I've double checked and indeed
>>>> the 1758bde2e4aa commit fixed the issue on next-20220516 kernel and as
>>>> such it has been merged to linus tree. Then the commit 744d23c71af3
>>>> ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") has been
>>>> merged to linus tree, which triggers a new warning during the
>>>> suspend/resume cycle with smsc95xx driver. Please note, that the
>>>> smsc95xx still works fine regardless that warning. However it look that
>>>> the commit 1758bde2e4aa only hide a real problem, which the commit
>>>> 744d23c71af3 warns about.
>>>>
>>>> Probably a proper fix for smsc95xx driver is to call phy_stop/start
>>>> during suspend/resume cycle, like in similar patches for other drivers:
>>>>
>>>> https://lore.kernel.org/all/20220825023951.3220-1-f.fainelli@gmail.com/
>>>
>>> No, smsc95xx.c relies on mdio_bus_phy_{suspend,resume}() and there's
>>> no need to call phy_{stop,start}() >
>>> 744d23c71af3 was flawed and 6dbe852c379f has already fixed a portion
>>> of the fallout.
>>>
>>> However the WARN() condition still seems too broad and causes false
>>> positives such as in your case.  In particular, mdio_bus_phy_suspend()
>>> may leave the device in PHY_UP state, so that's a legal state that
>>> needs to be exempted from the WARN().
>>
>> How is that a legal state when the PHY should be suspended? Even if we are
>> interrupt driven, the state machine should be stopped, does not mean that
>> Wake-on-LAN or other activity interrupts should be disabled.
> 
> mdio_bus_phy_suspend()
>    phy_stop_machine()
>      phydev->state = PHY_UP  #  if (phydev->state >= PHY_UP)
> 
> So apparently PHY_UP is a legal state for a suspended PHY.

It is not clear to me why, however. Sure it does ensure that when we 
resume we set needs_aneg = true but this feels like a hack in the sense 
that we are setting the PHY in a provisional state in anticipation for 
what might come next.

> 
> 
>>> Does the issue still appear even after 6dbe852c379f?
>>>
>>> If it does, could you test whether exempting PHY_UP silences the
>>> gratuitous WARN splat?  I.e.:
>>
>> If you allow PHY_UP, then the warning becomes effectively useless, so I
>> don't believe this is quite what you want to do here.
> 
> Hm, maybe the WARN() should be dropped altogether?

And then be left with debugging similar problems that prompted me to 
submit the patch in the first place, no thank you. I guess I would 
rather accept that PHY_UP needs to be special cased then.
-- 
Florian

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-09-18 19:13                       ` Lukas Wunner
  2022-09-18 20:41                         ` Florian Fainelli
@ 2022-09-22 13:21                         ` Marek Szyprowski
  1 sibling, 0 replies; 40+ messages in thread
From: Marek Szyprowski @ 2022-09-22 13:21 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	netdev, linux-usb, Steve Glendinning, UNGLinuxDriver,
	Oliver Neukum, Andre Edich, Oleksij Rempel, Martyn Welch,
	Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC',
	Florian Fainelli

On 18.09.2022 21:13, Lukas Wunner wrote:
> [cc += Florian]
>
> On Mon, Aug 29, 2022 at 01:40:05PM +0200, Marek Szyprowski wrote:
>> I've finally traced what has happened. I've double checked and indeed
>> the 1758bde2e4aa commit fixed the issue on next-20220516 kernel and as
>> such it has been merged to linus tree. Then the commit 744d23c71af3
>> ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") has been
>> merged to linus tree, which triggers a new warning during the
>> suspend/resume cycle with smsc95xx driver. Please note, that the
>> smsc95xx still works fine regardless that warning. However it look that
>> the commit 1758bde2e4aa only hide a real problem, which the commit
>> 744d23c71af3 warns about.
>>
>> Probably a proper fix for smsc95xx driver is to call phy_stop/start
>> during suspend/resume cycle, like in similar patches for other drivers:
>>
>> https://lore.kernel.org/all/20220825023951.3220-1-f.fainelli@gmail.com/
> No, smsc95xx.c relies on mdio_bus_phy_{suspend,resume}() and there's
> no need to call phy_{stop,start}().
>
> 744d23c71af3 was flawed and 6dbe852c379f has already fixed a portion
> of the fallout.
>
> However the WARN() condition still seems too broad and causes false
> positives such as in your case.  In particular, mdio_bus_phy_suspend()
> may leave the device in PHY_UP state, so that's a legal state that
> needs to be exempted from the WARN().
>
> Does the issue still appear even after 6dbe852c379f?
>
> If it does, could you test whether exempting PHY_UP silences the
> gratuitous WARN splat?  I.e.:
>
> -	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY);
> +	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
> +		phydev->state != PHY_UP);

Yes, this hides this warning in my case. I've tested on linux 
next-20220921 with the above change.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling
  2022-09-18 22:11                             ` Florian Fainelli
@ 2022-09-23  4:20                               ` Lukas Wunner
  0 siblings, 0 replies; 40+ messages in thread
From: Lukas Wunner @ 2022-09-23  4:20 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Marek Szyprowski, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, netdev, linux-usb, Steve Glendinning,
	UNGLinuxDriver, Oliver Neukum, Andre Edich, Oleksij Rempel,
	Martyn Welch, Gabriel Hojda, Christoph Fritz, Lino Sanfilippo,
	Philipp Rosenberger, Heiner Kallweit, Andrew Lunn, Russell King,
	Ferry Toth, Krzysztof Kozlowski, 'Linux Samsung SOC'

On Sun, Sep 18, 2022 at 03:11:47PM -0700, Florian Fainelli wrote:
> On 9/18/2022 1:55 PM, Lukas Wunner wrote:
> > On Sun, Sep 18, 2022 at 01:41:13PM -0700, Florian Fainelli wrote:
> > > On 9/18/2022 12:13 PM, Lukas Wunner wrote:
> > > > On Mon, Aug 29, 2022 at 01:40:05PM +0200, Marek Szyprowski wrote:
> > > > > I've finally traced what has happened. I've double checked and indeed
> > > > > the 1758bde2e4aa commit fixed the issue on next-20220516 kernel and as
> > > > > such it has been merged to linus tree. Then the commit 744d23c71af3
> > > > > ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") has been
> > > > > merged to linus tree, which triggers a new warning during the
> > > > > suspend/resume cycle with smsc95xx driver. Please note, that the
> > > > > smsc95xx still works fine regardless that warning. However it look that
> > > > > the commit 1758bde2e4aa only hide a real problem, which the commit
> > > > > 744d23c71af3 warns about.
> > > > > 
> > > > > Probably a proper fix for smsc95xx driver is to call phy_stop/start
> > > > > during suspend/resume cycle, like in similar patches for other drivers:
> > > > > 
> > > > > https://lore.kernel.org/all/20220825023951.3220-1-f.fainelli@gmail.com/
> > > > 
> > > > No, smsc95xx.c relies on mdio_bus_phy_{suspend,resume}() and there's
> > > > no need to call phy_{stop,start}() >
> > > > 744d23c71af3 was flawed and 6dbe852c379f has already fixed a portion
> > > > of the fallout.
> > > > 
> > > > However the WARN() condition still seems too broad and causes false
> > > > positives such as in your case.  In particular, mdio_bus_phy_suspend()
> > > > may leave the device in PHY_UP state, so that's a legal state that
> > > > needs to be exempted from the WARN().
> > > 
> > > How is that a legal state when the PHY should be suspended? Even if we are
> > > interrupt driven, the state machine should be stopped, does not mean that
> > > Wake-on-LAN or other activity interrupts should be disabled.
> > 
> > mdio_bus_phy_suspend()
> >    phy_stop_machine()
> >      phydev->state = PHY_UP  #  if (phydev->state >= PHY_UP)
> > 
> > So apparently PHY_UP is a legal state for a suspended PHY.
> 
> It is not clear to me why, however. Sure it does ensure that when we resume
> we set needs_aneg = true but this feels like a hack in the sense that we are
> setting the PHY in a provisional state in anticipation for what might come
> next.

I've just submitted a fix so that at least v6.0 doesn't get released
with a false-positive WARN splat on resume:

https://lore.kernel.org/netdev/8128fdb51eeebc9efbf3776a4097363a1317aaf1.1663905575.git.lukas@wunner.de/

I guess we can look into making the state setting more logical in a
separate step.


> > > If you allow PHY_UP, then the warning becomes effectively useless, so I
> > > don't believe this is quite what you want to do here.
> > 
> > Hm, maybe the WARN() should be dropped altogether?
> 
> And then be left with debugging similar problems that prompted me to submit
> the patch in the first place, no thank you. I guess I would rather accept
> that PHY_UP needs to be special cased then.

I've interpreted that as an Acked-by for exempting PHY_UP.
If that was not what you wanted, please speak up.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2022-09-23  4:20 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-12  8:42 [PATCH net-next v3 0/7] Polling be gone on LAN95xx Lukas Wunner
2022-05-12  8:42 ` [PATCH net-next v3 1/7] usbnet: Run unregister_netdev() before unbind() again Lukas Wunner
2022-05-12  8:42 ` [PATCH net-next v3 2/7] usbnet: smsc95xx: Don't clear read-only PHY interrupt Lukas Wunner
2022-05-12 11:57   ` Lukas Wunner
2022-05-12  8:42 ` [PATCH net-next v3 3/7] usbnet: smsc95xx: Don't reset PHY behind PHY driver's back Lukas Wunner
2022-05-12 12:12   ` Lukas Wunner
2022-05-12  8:42 ` [PATCH net-next v3 4/7] usbnet: smsc95xx: Avoid link settings race on interrupt reception Lukas Wunner
2022-05-12 12:15   ` Lukas Wunner
2022-05-12  8:42 ` [PATCH net-next v3 5/7] usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling Lukas Wunner
     [not found]   ` <CGME20220517101846eucas1p2c132f7e7032ed00996e222e9cc6cdf99@eucas1p2.samsung.com>
2022-05-17 10:18     ` Marek Szyprowski
2022-05-19 19:08       ` Lukas Wunner
2022-05-19 21:22         ` Marek Szyprowski
2022-05-23  9:43           ` Lukas Wunner
2022-05-23 11:38             ` Marek Szyprowski
2022-05-23 12:34             ` Andrew Lunn
2022-05-23 13:47               ` Lukas Wunner
2022-05-24  0:52                 ` Andrew Lunn
2022-05-24  1:08             ` Andrew Lunn
2022-05-24  6:16               ` Marek Szyprowski
2022-05-24 12:03                 ` Andrew Lunn
2022-05-26 13:55                   ` Lukas Wunner
2022-05-24 12:13               ` Lukas Wunner
2022-06-06  1:28                 ` Andrew Lunn
2022-08-26  6:51           ` Marek Szyprowski
2022-08-26  7:19             ` Lukas Wunner
2022-08-26  7:41               ` Marek Szyprowski
2022-08-26  7:53                 ` Lukas Wunner
2022-08-26  8:29                   ` Marek Szyprowski
2022-08-29 11:40                     ` Marek Szyprowski
2022-09-18 19:13                       ` Lukas Wunner
2022-09-18 20:41                         ` Florian Fainelli
2022-09-18 20:55                           ` Lukas Wunner
2022-09-18 22:11                             ` Florian Fainelli
2022-09-23  4:20                               ` Lukas Wunner
2022-09-22 13:21                         ` Marek Szyprowski
2022-05-12  8:42 ` [PATCH net-next v3 6/7] net: phy: smsc: Cache interrupt mask Lukas Wunner
2022-05-12 12:17   ` Lukas Wunner
2022-05-12  8:42 ` [PATCH net-next v3 7/7] net: phy: smsc: Cope with hot-removal in interrupt handler Lukas Wunner
2022-05-12 12:19   ` Lukas Wunner
2022-05-13 10:40 ` [PATCH net-next v3 0/7] Polling be gone on LAN95xx patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.