netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 net-next 0/5] net: phy: Fix SMSC LAN87xx external reset
@ 2021-01-18 16:57 Badel, Laurent
  2021-01-18 20:02 ` Marco Felsch
  0 siblings, 1 reply; 3+ messages in thread
From: Badel, Laurent @ 2021-01-18 16:57 UTC (permalink / raw)
  To: Badel, Laurent, davem, m.felsch, fugang.duan, kuba, andrew,
	linux, p.zabel, lgirdwood, broonie, robh+dt, richard.leitner,
	netdev, devicetree, f.fainelli, marex

Description:
External PHY reset from the FEC driver was introduced in commit [1] to 
mitigate an issue with iMX SoCs and LAN87xx PHYs. The issue occurs 
because the FEC driver turns off the reference clock for power saving 
reasons [2], which doesn't work out well with LAN87xx PHYs which require 
a running REF_CLK during the power-up sequence. As a result, the PHYs 
occasionally (and unpredictably) fail to establish a stable link and 
require a hardware reset to work reliably.

As previously noted [3], the solution in [1] integrates poorly with the
PHY abstraction layer, and it also performs many unnecessary resets. This
patch series suggests a simpler solution to this problem, namely to hold
the PHY in reset during the time between the PHY driver probe and the first
opening of the FEC driver.

To illustrate why this is sufficient, below is a representation of the PHY
RST and REF_CLK status at relevant time points (note that RST signal is
active-low for LAN87xx):

 1. During system boot when the PHY is probed:
 RST    111111111111111111111000001111111111111
 CLK    000011111111111111111111111111111000000
 REF_CLK is enabled during fec_probe(), and there is a short reset pulse
 due to mdiobus_register_gpiod() which calls gpiod_get_optional() with
 the GPIOD_OUT_LOW flag, which sets the initial value to 0. The reset is
 deasserted by phy_device_register() shortly after.  After that, the PHY
 runs without clock until the FEC is opened, which causes the unstable 
 link issue.

 2. At first opening of the FEC:
 RST    111111111111111111111111111100000111111
 CLK    000000000011111111111111111111111111111
 After REF_CLK is enabled, phy_reset_after_clk_enable() causes a
 short reset pulse. Reset is needed here because the PHY was running 
 without clock before. 
   
 3. At closing of the FEC driver:
 RST    111110000000000000000000000000000000000                 
 CLK    111111111111000000000000000000000000000
 FEC first disconnects the PHY, which asserts the reset, and then 
 disables the clock.
   
 4. At subsequent openings of the FEC:
 RST    000000000000000011111111111110000011111                  
 CLK    000000000011111111111111111111111111111
 FEC first enables the clock, then connects to the PHY which releases 
 the reset. Here the second reset pulse (phy_reset_after_clk_enable()) 
 is unnecessary, because REF_CLK is already running when the reset is 
 first deasserted. 
  
This illustrates that the only place where the extra reset pulse is 
actually needed, is at the first opening of the FEC driver, and the reason
it is needed in the first place, is because the PHY has been running 
without clock after it was probed. 

Extensive testing with LAN8720 confirmed that the REF_CLK can be disabled
without problems as long as the PHY is either in reset or in power-down 
mode (which is relevant for suspend-to-ram as well). Therefore, instead 
of relying on extra calls to phy_reset_after_clk_enable(), the issue 
addressed by commit [1] can be simply fixed by keeping the PHY in reset 
when exiting from phy_probe(). In this way the PHY will always be in reset
or power-down whenever the REF_CLK is turned off.

This should not cause issues, since as per the PAL documentation any 
driver that has business with the PHY should at least call phy_attach(), 
which will deassert the reset in due time. Therefore this fix probably 
works equally well for any PHY, but out of caution the patch uses the 
existing PHY_RST_AFTER_CLK_EN driver flag (which it renames), to implement
the fix only for LAN87xx PHYs.

Previous versions:
This is the 4th version of the series;  below is a short description of
the previous versions.

v1: 
The solution in [1] has the unfortunate side-effect of breaking the PHY 
interrupt system due to the hardware reset erasing the interrupt mask of
the PHY. Patch series v1 suggested performing the extra reset before the 
PHY is configured, by moving the call to phy_reset_after_clk_enable() from
the FEC into phy_init_hw() instead. The patch was re-examined after 
finding an issue during resume from suspend, where the PHY also seemed to
require a hardware reset to work properly. 
Further investigation showed that this is in fact due to another
peculiarity of the LAN87xx, which also erase their interrupt mask upon 
software reset (which is done by phy_init_hw() on resuming from 
suspend-to-ram), and is thus a separate issue that will be addressed in 
a separate patch. 

v2:
During this time the kernel had moved on and 2 new commits rendered the v1
fix unnecessary: 
  [3] allows the extra PHY reset to still be performed from the FEC, but 
  before the interrupt mask is configured, thereby fixing the above 
  interrupt mask issue.
  [4] allows LAN87xx to take control of the REF_CLK directly, preventing
  the FEC from disabling it and thus circumventing the entire REF_CLK 
  issue.
Patch v2 proposed to fix 2 potential issues with the solution from [4], 
namely that (i) failing to set the PHY "clocks" DT property would silently 
break the system (because FEC succeeds in disabling the REF_CLK, yet the 
extra reset has been removed), and (ii) keeping the REF_CLK enabled
defeated the power-saving purpose of commit [2].
The present patch fixes (i), and leaves it up to the user to use the 
power-friendly clock management of [2] (leave the DT clocks property 
unset), or keep the REF_CLK always enabled (set the clocks property). 
It also simplifies the code by removing all calls to 
phy_reset_after_clk_enable() and related code, and the function
phy_reset_after_clk_enable() altogether.  

v3:
The same content as v4, except that splitting of the patches has been 
amended to make sure that the PHY works correctly after every successive
patch. 

Tests: against net-next (5.11-rc3) with LAN8720 and LAN8742 and iMX283 
SoC. Unfortunately unable to test LAN8740 which has a different form 
factor.

References:
[1] commit 1b0a83ac04e3 ("net: fec: add phy_reset_after_clk_enable()
    support")
[2] commit e8fcfcd5684a ("net: fec: optimize the clock management to save
    power")
[3] commit 64a632da538a ("net: fec: Fix phy_device lookup for 
    phy_reset_after_clk_enable()")
[4] commit bedd8d78aba3 ("net: phy: smsc: LAN8710/20: add phy refclk in
    support")


Laurent Badel (5):
  net: phy: Add PHY_RST_AFTER_PROBE flag
  net: phy: net: phy: Hold SMSC LAN87xx in reset after probe
  net: fec: Remove PHY reset in fec_main.c
  net: phy: Remove phy_reset_after_clk_enable()
  net: phy: Remove PHY_RST_AFTER_CLK_EN flag

 drivers/net/ethernet/freescale/fec_main.c | 40 -----------------------
 drivers/net/phy/phy_device.c              | 26 +--------------
 drivers/net/phy/smsc.c                    |  4 +--
 include/linux/phy.h                       |  3 +-
 4 files changed, 4 insertions(+), 69 deletions(-)

-- 
2.17.1



-----------------------------
Eaton Industries Manufacturing GmbH ~ Registered place of business: Route de la Longeraie 7, 1110, Morges, Switzerland 

-----------------------------


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v4 net-next 0/5] net: phy: Fix SMSC LAN87xx external reset
  2021-01-18 16:57 [PATCH v4 net-next 0/5] net: phy: Fix SMSC LAN87xx external reset Badel, Laurent
@ 2021-01-18 20:02 ` Marco Felsch
  2021-01-19 13:25   ` [EXTERNAL] " Badel, Laurent
  0 siblings, 1 reply; 3+ messages in thread
From: Marco Felsch @ 2021-01-18 20:02 UTC (permalink / raw)
  To: Badel, Laurent
  Cc: davem, fugang.duan, kuba, andrew, linux, p.zabel, lgirdwood,
	broonie, robh+dt, richard.leitner, netdev, devicetree,
	f.fainelli, marex

Hi Laurent,

thanks for your patches :) Can you check your setup since we get 6
individual emails: 'git send-email --thread ...' ;)

On 21-01-18 16:57, Badel, Laurent wrote:
> Description:
> External PHY reset from the FEC driver was introduced in commit [1] to 
> mitigate an issue with iMX SoCs and LAN87xx PHYs. The issue occurs 
> because the FEC driver turns off the reference clock for power saving 
> reasons [2], which doesn't work out well with LAN87xx PHYs which require 
> a running REF_CLK during the power-up sequence.

Not only during the power-up sequence. The complete phy internal state
machine (the hardware state machine) gets confused if the clock is
turned off randomly.

> As a result, the PHYs 
> occasionally (and unpredictably) fail to establish a stable link and 
> require a hardware reset to work reliably.
> 
> As previously noted [3], the solution in [1] integrates poorly with the
> PHY abstraction layer, and it also performs many unnecessary resets. This
> patch series suggests a simpler solution to this problem, namely to hold
> the PHY in reset during the time between the PHY driver probe and the first
> opening of the FEC driver.

Holding the Phy within reset during the FEC is in reset seems wrong to
me because: The clock can be supplied by an external crystal/oszi. This
would add unnecessary delays. Also this is again a FEC/SMSC combination
fix again. The phy has the same problem on other hosts if they are the
clock provider and toggling the ref-clk.

> To illustrate why this is sufficient, below is a representation of the PHY
> RST and REF_CLK status at relevant time points (note that RST signal is
> active-low for LAN87xx):
> 
>  1. During system boot when the PHY is probed:
>  RST    111111111111111111111000001111111111111
>  CLK    000011111111111111111111111111111000000
>  REF_CLK is enabled during fec_probe(), and there is a short reset pulse
>  due to mdiobus_register_gpiod() which calls gpiod_get_optional() with

There is also a deprecated "phy-reset-gpios" did you test this as well?

>  the GPIOD_OUT_LOW flag, which sets the initial value to 0. The reset is
>  deasserted by phy_device_register() shortly after.  After that, the PHY
>  runs without clock until the FEC is opened, which causes the unstable 
>  link issue.

Nope that's not true, you can specify the clock within the device-tree
so the fec-ref-clk isn't disabled anymore.

>  2. At first opening of the FEC:

...

> Extensive testing with LAN8720 confirmed that the REF_CLK can be disabled
> without problems as long as the PHY is either in reset or in power-down 
> mode (which is relevant for suspend-to-ram as well).

You can't disable the clock. What you listing here means that the smsc
phy needs to be re-initialized after such an clock loss. If we can
disbale the clock randomly we wouldn't need to re-initialize the phy
again.

Regards,
  Marco

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [EXTERNAL]  Re: [PATCH v4 net-next 0/5] net: phy: Fix SMSC LAN87xx external reset
  2021-01-18 20:02 ` Marco Felsch
@ 2021-01-19 13:25   ` Badel, Laurent
  0 siblings, 0 replies; 3+ messages in thread
From: Badel, Laurent @ 2021-01-19 13:25 UTC (permalink / raw)
  To: Marco Felsch
  Cc: davem, fugang.duan, kuba, andrew, linux, p.zabel, lgirdwood,
	broonie, robh+dt, richard.leitner, netdev, devicetree,
	f.fainelli, marex

Thank you very much to everyone who took the time to read and comment 
on the patch. 

Given that Marco doesn't seem to agree with the main idea of the patch, 
I don't think it is worth sending the corrected version so I will leave
it as is. 

My replies to Marco's comments are below.

Best regards,

Laurent

> Hi Laurent,
> 
> thanks for your patches :) Can you check your setup since we get 6
> individual emails: 'git send-email --thread ...' ;)

Hi Marco, thank you for you time reviewing and sorry about the threading
of the emails, I will see to it next time.

> 
> On 21-01-18 16:57, Badel, Laurent wrote:
> > Description:
> > External PHY reset from the FEC driver was introduced in commit [1]
> to
> > mitigate an issue with iMX SoCs and LAN87xx PHYs. The issue occurs
> > because the FEC driver turns off the reference clock for power
> saving
> > reasons [2], which doesn't work out well with LAN87xx PHYs which
> > require a running REF_CLK during the power-up sequence.
> 
> Not only during the power-up sequence. The complete phy internal state
> machine (the hardware state machine) gets confused if the clock is
> turned off randomly.

Yes, you are right. LAN87xx don't like the REF_CLK being turned off while 
they are running, which is understandable, so this is why you should 
either avoid turning it off, or if you do, make sure you don't do it
while the PHY is running.

> 
> > As a result, the PHYs
> > occasionally (and unpredictably) fail to establish a stable link and
> > require a hardware reset to work reliably.
> >
> > As previously noted [3], the solution in [1] integrates poorly with
> > the PHY abstraction layer, and it also performs many unnecessary
> > resets. This patch series suggests a simpler solution to this
> problem,
> > namely to hold the PHY in reset during the time between the PHY
> driver
> > probe and the first opening of the FEC driver.
> 
> Holding the Phy within reset during the FEC is in reset seems wrong to
> me because: The clock can be supplied by an external crystal/oszi.
> This would add unnecessary delays. Also this is again a FEC/SMSC
> combination fix again. The phy has the same problem on other hosts if
> they are the clock provider and toggling the ref-clk.

It's true that the PHY will be kept in reset until the interface is up, 
but then I don't really see the point of having the PHY running if the 
MAC is not listening anyway. When the interface is taken down, the PHY
layer asserts the reset, so this is basically the only place where 
the interface is down but the PHY is not held in reset, so in my view
it made sense to do this.
But fair enough, keeping the clock on does the job and that is what 
matters in the end. This is at the expense of power management though, 
for example the clock stays on during suspend-to-ram, though this
perhaps not the end of the world indeed.

> > To illustrate why this is sufficient, below is a representation of
> the
> > PHY RST and REF_CLK status at relevant time points (note that RST
> > signal is active-low for LAN87xx):
> >
> >  1. During system boot when the PHY is probed:
> >  RST    111111111111111111111000001111111111111
> >  CLK    000011111111111111111111111111111000000
> >  REF_CLK is enabled during fec_probe(), and there is a short reset
> > pulse  due to mdiobus_register_gpiod() which calls
> > gpiod_get_optional() with
> 
> There is also a deprecated "phy-reset-gpios" did you test this as
> well?

I gave it a try, but since FEC only uses this to reset the PHY at probe 
time, there is not much to expect. As can be expected, if we set 
phy-reset-gpios in the fec node, but not reset-gpios in the phy node, 
the PHY layer is unable to reset the PHY, so disabling the REF_CLK 
would mean trouble indeed. 
 
> >  the GPIOD_OUT_LOW flag, which sets the initial value to 0. The
> reset
> > is  deasserted by phy_device_register() shortly after.  After that,
> > the PHY  runs without clock until the FEC is opened, which causes
> the
> > unstable  link issue.
> 
> Nope that's not true, you can specify the clock within the device-tree
> so the fec-ref-clk isn't disabled anymore.

Yes right, I was assuming clocks is not set in reference to the original
solution in [1] but that may not have been obvious. 
 
> > Extensive testing with LAN8720 confirmed that the REF_CLK can be
> > disabled without problems as long as the PHY is either in reset or
> in
> > power-down mode (which is relevant for suspend-to-ram as well).
> 
> You can't disable the clock. What you listing here means that the smsc
> phy needs to be re-initialized after such an clock loss. If we can
> disbale the clock randomly we wouldn't need to re-initialize the phy
> again.

Fair enough, if you correctly configure your DT with the clocks property, 
the clock will stay on.  

Best regards,

Laurent


-----------------------------
Eaton Industries Manufacturing GmbH ~ Registered place of business: Route de la Longeraie 7, 1110, Morges, Switzerland 

-----------------------------


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-01-19 14:12 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-18 16:57 [PATCH v4 net-next 0/5] net: phy: Fix SMSC LAN87xx external reset Badel, Laurent
2021-01-18 20:02 ` Marco Felsch
2021-01-19 13:25   ` [EXTERNAL] " Badel, Laurent

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).