All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] net: sfp: reset fault retry counter on successful reinitialisation
@ 2022-08-12 13:04 Stefan Mahr
  2022-08-12 15:07 ` Russell King (Oracle)
  0 siblings, 1 reply; 3+ messages in thread
From: Stefan Mahr @ 2022-08-12 13:04 UTC (permalink / raw)
  To: Russell King, netdev; +Cc: linux-kernel, Stefan Mahr

This patch resets the fault retry counter to the default value, if the
module reinitialisation was successful. Otherwise without resetting
the counter, five (N_FAULT/N_FAULT_INIT) single TX_FAULT events will
deactivate the module persistently.

In case the reinitialisation was not successful after five retries,
the module is still being deactivated.

Signed-off-by: Stefan Mahr <dac922@gmx.de>
---
 drivers/net/phy/sfp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 63f90fe9a4d2..a8d7a713222a 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -2263,6 +2263,9 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 		} else if (event == SFP_E_TIMEOUT || event == SFP_E_TX_CLEAR) {
 			dev_info(sfp->dev, "module transmit fault recovered\n");
 			sfp_sm_link_check_los(sfp);
+
+			/* Reset the fault retry count */
+			sfp->sm_fault_retries = N_FAULT;
 		}
 		break;

--
2.25.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] net: sfp: reset fault retry counter on successful reinitialisation
  2022-08-12 13:04 [PATCH] net: sfp: reset fault retry counter on successful reinitialisation Stefan Mahr
@ 2022-08-12 15:07 ` Russell King (Oracle)
  2022-08-12 15:45   ` Stefan Mahr
  0 siblings, 1 reply; 3+ messages in thread
From: Russell King (Oracle) @ 2022-08-12 15:07 UTC (permalink / raw)
  To: Stefan Mahr; +Cc: netdev, linux-kernel

On Fri, Aug 12, 2022 at 03:04:38PM +0200, Stefan Mahr wrote:
> This patch resets the fault retry counter to the default value, if the
> module reinitialisation was successful. Otherwise without resetting
> the counter, five (N_FAULT/N_FAULT_INIT) single TX_FAULT events will
> deactivate the module persistently.

This is intentional - if a module keeps asserting its TX_FAULT status,
then there is something wrong with it, and for an optical module it
means that the laser is overloading (producing more output than it
should.) That is a safety issue.

So, if the module keeps indicating that its laser is misbehaving the
only right thing to do is to shut it down permanently, and have
someone investigate.

What issue are you seeing that needs a different behaviour?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] net: sfp: reset fault retry counter on successful reinitialisation
  2022-08-12 15:07 ` Russell King (Oracle)
@ 2022-08-12 15:45   ` Stefan Mahr
  0 siblings, 0 replies; 3+ messages in thread
From: Stefan Mahr @ 2022-08-12 15:45 UTC (permalink / raw)
  To: Russell King (Oracle); +Cc: netdev, linux-kernel

> On Fri, Aug 12, 2022 at 03:04:38PM +0200, Stefan Mahr wrote:
>> This patch resets the fault retry counter to the default value, if the
>> module reinitialisation was successful. Otherwise without resetting
>> the counter, five (N_FAULT/N_FAULT_INIT) single TX_FAULT events will
>> deactivate the module persistently.
>
> This is intentional - if a module keeps asserting its TX_FAULT status,
> then there is something wrong with it, and for an optical module it
> means that the laser is overloading (producing more output than it
> should.) That is a safety issue.

Yes, this behaviour is not changed by the patch. When TX_FAULT is true
persistently for 5 retries, the module will still be disabled.

>
> So, if the module keeps indicating that its laser is misbehaving the
> only right thing to do is to shut it down permanently, and have
> someone investigate.
>
> What issue are you seeing that needs a different behaviour?
>

In my case two different SFP modules (1000Base-BX) indicate short
TX_FAULT events for less than one second and only one per week -
sometimes more, sometimes less. So the idea was to reset the counter if
reinitialisation was successful, so rare single TX_FAULT events can be
ignored.

However, you are right: If a module reports TX_FAULT several times,
something serious may be wrong.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-08-12 15:45 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-12 13:04 [PATCH] net: sfp: reset fault retry counter on successful reinitialisation Stefan Mahr
2022-08-12 15:07 ` Russell King (Oracle)
2022-08-12 15:45   ` Stefan Mahr

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.