netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Subject: [PATCH v3 net-next 0/4] net:phy: Fix LAN87xx external reset
@ 2021-01-15 17:19 Badel, Laurent
  2021-01-15 17:36 ` Heiner Kallweit
  0 siblings, 1 reply; 3+ messages in thread
From: Badel, Laurent @ 2021-01-15 17:19 UTC (permalink / raw)
  To: davem, m.felsch, fugang.duan, kuba, andrew, Heiner Kallweit,
	linux, p.zabel, lgirdwood, broonie, robh+dt, richard.leitner,
	netdev, devicetree, f.fainelli, marex

Description: 
External PHY reset from the FEC driver was introduced in commit [1] to 
mitigate an issue with iMX SoCs and LAN87xx PHYs. The issue occurs 
because the FEC driver turns off the reference clock for power saving 
reasons [2], which doesn't work out well with LAN87xx PHYs which require 
a running REF_CLK during the power-up sequence. As a result, the PHYs 
occasionnally (and unpredictably) fail to establish a stable link and
require a hardware reset to work reliably.

As previously noted [3] the solution in [1] integrates poorly with the
PHY abstraction layer, and it also performs many unnecessary resets. This
patch series suggests a simpler solution to this problem, namely to hold
the PHY in reset during the time between the PHY driver probe and the first
opening of the FEC driver.

To illustrate why this is sufficient, below is a representation of the PHY
RST and REF_CLK status at relevant time points (note that RST signal is
active-low for LAN8710/20):

 1. During system boot when the PHY is probed:
 RST    111111111111111111111000001111111111111
 CLK    000011111111111111111111111111111000000
 REF_CLK is enabled during fec_probe(), and there is a short reset pulse
 due to mdiobus_register_gpiod() which calls gpiod_get_optional() with
 the GPIOD_OUT_LOW which sets the initial value to 0. The reset is
 de-asserted by phy_device_register() shortly after.  After that, the PHY
 runs without clock until the FEC is opened, which causes issues.

 2. At first opening of the FEC:
 RST    111111111111111111111111111100000111111
 CLK    000000000011111111111111111111111111111
 After REF_CLK is enabled, phy_reset_after_clk_enable() causes a
 short reset pulse. Reset is needed here because the PHY was running 
 without clock before. 
   
 3. At closing of the FEC driver:
 RST    111110000000000000000000000000000000000                 
 CLK    111111111111000000000000000000000000000
 FEC first disconnects the PHY, which asserts the reset, and then 
 disables the clock.
   
 4. At subsequent openings of the FEC:
 RST    000000000000000011111111111110000011111                  
 CLK    000000000011111111111111111111111111111
 FEC first enables the clock, then connects to the PHY which releases 
 the reset. Here the second reset pulse (phy_reset_after_clk_enable()) 
 is unnecessary, because REF_CLK is already running when the reset is 
 first deasserted. 
  
This illustrates that the only place where the extra reset pulse is 
actually needed, is at the first opening of the FEC driver, and the reason
it is needed in the first place, is because the PHY has been running 
without clock after it was probed. 

Extensive testing with LAN8720 confirmed that the REF_CLK can be disabled
without problems as long as the PHY is either in reset or in power-down 
mode (which is relevant for suspend-to-ram as well). Therefore, instead 
of relying on extra calls to phy_reset_after_clk_enable(), the issue 
addressed by commit [1] can be simply fixed by keeping the PHY in reset 
when exiting from phy_probe(). In this way the PHY will always be in reset
or power-down whenever the REF_CLK is turned off.

This should not cause issues, since as per the PAL documentation any 
driver that has business with the PHY should at least call phy_attach(), 
which will deassert the reset in due time. Therefore this fix probably 
works equally well for any PHY, but out of caution the patch uses the 
existing PHY_RST_AFTER_CLK_EN driver flag (which it renames), to implement
the fix only for LAN8710/20/40 PHYs.

Previous versions:
This is the 3rd version of the series;  below is a short description of
the previous versions.

v1: 
The solution in [1] has the unfortunate side-effect of breaking the PHY 
interrupt system due to the hardware reset erasing the interrupt mask of
the PHY. Patch series v1 suggested performing the extra reset before the 
PHY is configured, by moving the call to phy_reset_after_clk_enable() from
the FEC into phy_init_hw() instead. The patch was re-examinated after 
finding an issue during resume from suspend, where the PHY also seemed to
require a hardware reset to work properly. 
Further investigation showed that this is in fact due to another
peculiarity of the LAN87xx, which also erase their interrupt mask upon 
software reset (which is done by phy_init_hw() on resuming from 
suspend-to-ram), and is thus a separate issue that will be addressed in 
a separate patch. 

v2:
During this time the kernel had moved on and 2 new commits rendered the v1
fix unnecessary: 
  [3] allows the extra PHY reset to still be performed from the FEC, but 
  before the interrupt mask is configured, thereby fixing the above 
  interrupt mask issue.
  [4] allows LAN87xx to take control of the REF_CLK directly, preventing
  the FEC from disabling it and thus circumventing the entire REF_CLK 
  issue.
Patch v2 proposed to fix 2 potential issues with the solution from [4], 
namely that (i) failing to set the PHY "clocks" DT property would silently 
break the system (because FEC succeeds in disabling the REF_CLK, yet the 
extra reset has been removed), and (ii) keeping the REF_CLK enabled
defeated the power-saving purpose of commit [2].

The present patch fixes (i), and leaves it up to the user to use the 
power-friendly clock management of [2] (leave the DT clocks property 
unset), or keep the REF_CLK always enabled (set the clocks property). 
It also simplifies the code by removing all calls to 
phy_reset_after_clk_enable() and related code, and the function
phy_reset_after_clk_enable() altogether.  

Tests: against net-next (5.11-rc3) with LAN8720 and LAN8742 and iMX283 
SoC. Unfortunately unable to test LAN8740 which has a different form 
factor.

References:
[1] commit 1b0a83ac04e3 ("net: fec: add phy_reset_after_clk_enable()
    support")
[2] commit e8fcfcd5684a ("net: fec: optimize the clock management to save
    power")
[3] commit 64a632da538a ("net: fec: Fix phy_device lookup for 
    phy_reset_after_clk_enable()")
[4] commit bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in
    support")

Laurent Badel (4):
  Remove PHY reset in fec_main.c
  Remove phy_reset_after_clk_enable()
  Rename PHY_RST_AFTER_CLK_EN to PHY_RST_AFTER_PROBE
  Add PHY reset after probe for PHYs with PHY_RST_AFTER_PROBE flag 

 drivers/net/ethernet/freescale/fec_main.c | 40 -----------------------
 drivers/net/phy/phy_device.c              | 26 +--------------
 drivers/net/phy/smsc.c                    |  4 +--
 include/linux/phy.h                       |  3 +-
 4 files changed, 4 insertions(+), 69 deletions(-)

-- 
2.17.1



-----------------------------
Eaton Industries Manufacturing GmbH ~ Registered place of business: Route de la Longeraie 7, 1110, Morges, Switzerland 

-----------------------------


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Subject: [PATCH v3 net-next 0/4] net:phy: Fix LAN87xx external reset
  2021-01-15 17:19 Subject: [PATCH v3 net-next 0/4] net:phy: Fix LAN87xx external reset Badel, Laurent
@ 2021-01-15 17:36 ` Heiner Kallweit
  2021-01-15 21:27   ` [EXTERNAL] " Badel, Laurent
  0 siblings, 1 reply; 3+ messages in thread
From: Heiner Kallweit @ 2021-01-15 17:36 UTC (permalink / raw)
  To: Badel, Laurent, davem, m.felsch, fugang.duan, kuba, andrew,
	linux, p.zabel, lgirdwood, broonie, robh+dt, richard.leitner,
	netdev, devicetree, f.fainelli, marex

On 15.01.2021 18:19, Badel, Laurent wrote:
> Description: 
> External PHY reset from the FEC driver was introduced in commit [1] to 
> mitigate an issue with iMX SoCs and LAN87xx PHYs. The issue occurs 
> because the FEC driver turns off the reference clock for power saving 
> reasons [2], which doesn't work out well with LAN87xx PHYs which require 
> a running REF_CLK during the power-up sequence. As a result, the PHYs 
> occasionnally (and unpredictably) fail to establish a stable link and
> require a hardware reset to work reliably.
> 
> As previously noted [3] the solution in [1] integrates poorly with the
> PHY abstraction layer, and it also performs many unnecessary resets. This
> patch series suggests a simpler solution to this problem, namely to hold
> the PHY in reset during the time between the PHY driver probe and the first
> opening of the FEC driver.
> 
> To illustrate why this is sufficient, below is a representation of the PHY
> RST and REF_CLK status at relevant time points (note that RST signal is
> active-low for LAN8710/20):
> 
>  1. During system boot when the PHY is probed:
>  RST    111111111111111111111000001111111111111
>  CLK    000011111111111111111111111111111000000
>  REF_CLK is enabled during fec_probe(), and there is a short reset pulse
>  due to mdiobus_register_gpiod() which calls gpiod_get_optional() with
>  the GPIOD_OUT_LOW which sets the initial value to 0. The reset is
>  de-asserted by phy_device_register() shortly after.  After that, the PHY
>  runs without clock until the FEC is opened, which causes issues.
> 
>  2. At first opening of the FEC:
>  RST    111111111111111111111111111100000111111
>  CLK    000000000011111111111111111111111111111
>  After REF_CLK is enabled, phy_reset_after_clk_enable() causes a
>  short reset pulse. Reset is needed here because the PHY was running 
>  without clock before. 
>    
>  3. At closing of the FEC driver:
>  RST    111110000000000000000000000000000000000                 
>  CLK    111111111111000000000000000000000000000
>  FEC first disconnects the PHY, which asserts the reset, and then 
>  disables the clock.
>    
>  4. At subsequent openings of the FEC:
>  RST    000000000000000011111111111110000011111                  
>  CLK    000000000011111111111111111111111111111
>  FEC first enables the clock, then connects to the PHY which releases 
>  the reset. Here the second reset pulse (phy_reset_after_clk_enable()) 
>  is unnecessary, because REF_CLK is already running when the reset is 
>  first deasserted. 
>   
> This illustrates that the only place where the extra reset pulse is 
> actually needed, is at the first opening of the FEC driver, and the reason
> it is needed in the first place, is because the PHY has been running 
> without clock after it was probed. 
> 
> Extensive testing with LAN8720 confirmed that the REF_CLK can be disabled
> without problems as long as the PHY is either in reset or in power-down 
> mode (which is relevant for suspend-to-ram as well). Therefore, instead 
> of relying on extra calls to phy_reset_after_clk_enable(), the issue 
> addressed by commit [1] can be simply fixed by keeping the PHY in reset 
> when exiting from phy_probe(). In this way the PHY will always be in reset
> or power-down whenever the REF_CLK is turned off.
> 
> This should not cause issues, since as per the PAL documentation any 
> driver that has business with the PHY should at least call phy_attach(), 
> which will deassert the reset in due time. Therefore this fix probably 
> works equally well for any PHY, but out of caution the patch uses the 
> existing PHY_RST_AFTER_CLK_EN driver flag (which it renames), to implement
> the fix only for LAN8710/20/40 PHYs.
> 
> Previous versions:
> This is the 3rd version of the series;  below is a short description of
> the previous versions.
> 
> v1: 
> The solution in [1] has the unfortunate side-effect of breaking the PHY 
> interrupt system due to the hardware reset erasing the interrupt mask of
> the PHY. Patch series v1 suggested performing the extra reset before the 
> PHY is configured, by moving the call to phy_reset_after_clk_enable() from
> the FEC into phy_init_hw() instead. The patch was re-examinated after 
> finding an issue during resume from suspend, where the PHY also seemed to
> require a hardware reset to work properly. 
> Further investigation showed that this is in fact due to another
> peculiarity of the LAN87xx, which also erase their interrupt mask upon 
> software reset (which is done by phy_init_hw() on resuming from 
> suspend-to-ram), and is thus a separate issue that will be addressed in 
> a separate patch. 
> 
> v2:
> During this time the kernel had moved on and 2 new commits rendered the v1
> fix unnecessary: 
>   [3] allows the extra PHY reset to still be performed from the FEC, but 
>   before the interrupt mask is configured, thereby fixing the above 
>   interrupt mask issue.
>   [4] allows LAN87xx to take control of the REF_CLK directly, preventing
>   the FEC from disabling it and thus circumventing the entire REF_CLK 
>   issue.
> Patch v2 proposed to fix 2 potential issues with the solution from [4], 
> namely that (i) failing to set the PHY "clocks" DT property would silently 
> break the system (because FEC succeeds in disabling the REF_CLK, yet the 
> extra reset has been removed), and (ii) keeping the REF_CLK enabled
> defeated the power-saving purpose of commit [2].
> 
> The present patch fixes (i), and leaves it up to the user to use the 
> power-friendly clock management of [2] (leave the DT clocks property 
> unset), or keep the REF_CLK always enabled (set the clocks property). 
> It also simplifies the code by removing all calls to 
> phy_reset_after_clk_enable() and related code, and the function
> phy_reset_after_clk_enable() altogether.  
> 
> Tests: against net-next (5.11-rc3) with LAN8720 and LAN8742 and iMX283 
> SoC. Unfortunately unable to test LAN8740 which has a different form 
> factor.
> 
> References:
> [1] commit 1b0a83ac04e3 ("net: fec: add phy_reset_after_clk_enable()
>     support")
> [2] commit e8fcfcd5684a ("net: fec: optimize the clock management to save
>     power")
> [3] commit 64a632da538a ("net: fec: Fix phy_device lookup for 
>     phy_reset_after_clk_enable()")
> [4] commit bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in
>     support")
> 
> Laurent Badel (4):
>   Remove PHY reset in fec_main.c
>   Remove phy_reset_after_clk_enable()
>   Rename PHY_RST_AFTER_CLK_EN to PHY_RST_AFTER_PROBE
>   Add PHY reset after probe for PHYs with PHY_RST_AFTER_PROBE flag 
> 
>  drivers/net/ethernet/freescale/fec_main.c | 40 -----------------------
>  drivers/net/phy/phy_device.c              | 26 +--------------
>  drivers/net/phy/smsc.c                    |  4 +--
>  include/linux/phy.h                       |  3 +-
>  4 files changed, 4 insertions(+), 69 deletions(-)
> 

W/o knowing the exact issue in detail: For the sake of bisectability,
shouldn't patches 4 and a modified patch 3 be first? After patches 1 and 2
we may be in trouble, right?

And it seems you composed the mails manually:
- subject is prefixed with "Subject:"
- patch subjects in the cover letter don't match the ones of the
  respective patches

For the patch prefixes: Please add a space between net: and phy:

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [EXTERNAL]  Re: Subject: [PATCH v3 net-next 0/4] net:phy: Fix LAN87xx external reset
  2021-01-15 17:36 ` Heiner Kallweit
@ 2021-01-15 21:27   ` Badel, Laurent
  0 siblings, 0 replies; 3+ messages in thread
From: Badel, Laurent @ 2021-01-15 21:27 UTC (permalink / raw)
  To: Heiner Kallweit, davem, m.felsch, fugang.duan, kuba, andrew,
	linux, p.zabel, lgirdwood, broonie, robh+dt, richard.leitner,
	netdev, devicetree, f.fainelli, marex

 > W/o knowing the exact issue in detail: For the sake of bisectability,
> shouldn't patches 4 and a modified patch 3 be first? After patches 1
> and 2 we may be in trouble, right?

Thank you very much for the quick reply. You are right, this would be a better way to split the changes. 
I will do as you suggest and resubmit on Monday.  


> 
> And it seems you composed the mails manually:
> - subject is prefixed with "Subject:"
> - patch subjects in the cover letter don't match the ones of the
>   respective patches
> 
My bad, I did copy-paste the output of git into the email client and this slipped through - I will be more careful in the future. Sorry about this. 

> For the patch prefixes: Please add a space between net: and phy:
Will do, thank you for pointing this out. 

Best regards
Laurent


-----------------------------
Eaton Industries Manufacturing GmbH ~ Registered place of business: Route de la Longeraie 7, 1110, Morges, Switzerland 

-----------------------------


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-01-15 21:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-15 17:19 Subject: [PATCH v3 net-next 0/4] net:phy: Fix LAN87xx external reset Badel, Laurent
2021-01-15 17:36 ` Heiner Kallweit
2021-01-15 21:27   ` [EXTERNAL] " Badel, Laurent

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).