netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 00/17] Allow slow to initialise GPON modules to work
@ 2019-11-10 14:05 Russell King - ARM Linux admin
  2019-11-10 14:06 ` [PATCH net-next 01/17] net: sfp: move sfp sub-state machines into separate functions Russell King
                   ` (18 more replies)
  0 siblings, 19 replies; 32+ messages in thread
From: Russell King - ARM Linux admin @ 2019-11-10 14:05 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Some GPON modules take longer than the SFF MSA specified time to
initialise and respond to transactions on the I2C bus for either
both 0x50 and 0x51, or 0x51 bus addresses.  Technically these modules
are non-compliant with the SFP Multi-Source Agreement, they have
been around for some time, so are difficult to just ignore.

Most of the patch series is restructuring the code to make it more
readable, and split various things into separate functions.

We split the three state machines into three separate functions, and
re-arrange them to start probing the module as soon as a module has
been detected (without waiting for the network device.)  We try to
read the module's EEPROM, retrying quickly for the first second, and
then once every five seconds for about a minute until we have read
the EEPROM.  So that the kernel isn't entirely silent, we print a
message indicating that we're waiting for the module to respond after
the first second, or when all retries have expired.

Once the module ID has been read, we kick off a delayed work queue
which attempts to register the hwmon, retrying for up to a minute if
the monitoring parameters are unreadable; this allows us to proceed
with module initialisation independently of the hwmon state.

With high-power modules, we wait for the netdev to be attached before
switching the module power mode, and retry this in a similar way to
before until we have successfully read and written the EEPROM at 0x51.

We also move the handling of the TX_DISABLE signal entirely to the main
state machine, and avoid probing any on-board PHY while TX_FAULT is
set.

 drivers/net/phy/sfp.c | 526 +++++++++++++++++++++++++++++++++++---------------
 1 file changed, 369 insertions(+), 157 deletions(-)

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH net-next 01/17] net: sfp: move sfp sub-state machines into separate functions
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 17:56   ` Andrew Lunn
  2019-11-10 14:06 ` [PATCH net-next 02/17] net: sfp: move tx disable on device down to main state machine Russell King
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Move the SFP sub-state machines out of the main state machine function,
in preparation for it doing a bit more with the device state.  By doing
so, we ensure that our debug after the main state machine is always
printed.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 74 +++++++++++++++++++++++++------------------
 1 file changed, 43 insertions(+), 31 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index e36c04c26866..f0e325324b23 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1481,19 +1481,34 @@ static void sfp_sm_mod_remove(struct sfp *sfp)
 	dev_info(sfp->dev, "module removed\n");
 }
 
-static void sfp_sm_event(struct sfp *sfp, unsigned int event)
+/* This state machine tracks the netdev up/down state */
+static void sfp_sm_device(struct sfp *sfp, unsigned int event)
 {
-	mutex_lock(&sfp->sm_mutex);
+	switch (sfp->sm_dev_state) {
+	default:
+		if (event == SFP_E_DEV_UP)
+			sfp->sm_dev_state = SFP_DEV_UP;
+		break;
 
-	dev_dbg(sfp->dev, "SM: enter %s:%s:%s event %s\n",
-		mod_state_to_str(sfp->sm_mod_state),
-		dev_state_to_str(sfp->sm_dev_state),
-		sm_state_to_str(sfp->sm_state),
-		event_to_str(event));
+	case SFP_DEV_UP:
+		if (event == SFP_E_DEV_DOWN) {
+			/* If the module has a PHY, avoid raising TX disable
+			 * as this resets the PHY. Otherwise, raise it to
+			 * turn the laser off.
+			 */
+			if (!sfp->mod_phy)
+				sfp_module_tx_disable(sfp);
+			sfp->sm_dev_state = SFP_DEV_DOWN;
+		}
+		break;
+	}
+}
 
-	/* This state machine tracks the insert/remove state of
-	 * the module, and handles probing the on-board EEPROM.
-	 */
+/* This state machine tracks the insert/remove state of
+ * the module, and handles probing the on-board EEPROM.
+ */
+static void sfp_sm_module(struct sfp *sfp, unsigned int event)
+{
 	switch (sfp->sm_mod_state) {
 	default:
 		if (event == SFP_E_INSERT && sfp->attached) {
@@ -1533,27 +1548,10 @@ static void sfp_sm_event(struct sfp *sfp, unsigned int event)
 		}
 		break;
 	}
+}
 
-	/* This state machine tracks the netdev up/down state */
-	switch (sfp->sm_dev_state) {
-	default:
-		if (event == SFP_E_DEV_UP)
-			sfp->sm_dev_state = SFP_DEV_UP;
-		break;
-
-	case SFP_DEV_UP:
-		if (event == SFP_E_DEV_DOWN) {
-			/* If the module has a PHY, avoid raising TX disable
-			 * as this resets the PHY. Otherwise, raise it to
-			 * turn the laser off.
-			 */
-			if (!sfp->mod_phy)
-				sfp_module_tx_disable(sfp);
-			sfp->sm_dev_state = SFP_DEV_DOWN;
-		}
-		break;
-	}
-
+static void sfp_sm_main(struct sfp *sfp, unsigned int event)
+{
 	/* Some events are global */
 	if (sfp->sm_state != SFP_S_DOWN &&
 	    (sfp->sm_mod_state != SFP_MOD_PRESENT ||
@@ -1564,7 +1562,6 @@ static void sfp_sm_event(struct sfp *sfp, unsigned int event)
 		if (sfp->mod_phy)
 			sfp_sm_phy_detach(sfp);
 		sfp_sm_next(sfp, SFP_S_DOWN, 0);
-		mutex_unlock(&sfp->sm_mutex);
 		return;
 	}
 
@@ -1619,6 +1616,21 @@ static void sfp_sm_event(struct sfp *sfp, unsigned int event)
 	case SFP_S_TX_DISABLE:
 		break;
 	}
+}
+
+static void sfp_sm_event(struct sfp *sfp, unsigned int event)
+{
+	mutex_lock(&sfp->sm_mutex);
+
+	dev_dbg(sfp->dev, "SM: enter %s:%s:%s event %s\n",
+		mod_state_to_str(sfp->sm_mod_state),
+		dev_state_to_str(sfp->sm_dev_state),
+		sm_state_to_str(sfp->sm_state),
+		event_to_str(event));
+
+	sfp_sm_module(sfp, event);
+	sfp_sm_device(sfp, event);
+	sfp_sm_main(sfp, event);
 
 	dev_dbg(sfp->dev, "SM: exit %s:%s:%s\n",
 		mod_state_to_str(sfp->sm_mod_state),
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 02/17] net: sfp: move tx disable on device down to main state machine
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
  2019-11-10 14:06 ` [PATCH net-next 01/17] net: sfp: move sfp sub-state machines into separate functions Russell King
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 18:00   ` Andrew Lunn
  2019-11-10 14:06 ` [PATCH net-next 03/17] net: sfp: rename sfp_sm_ins_next() as sfp_sm_mod_next() Russell King
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Move the tx disable assertion on device down to the main state
machine.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index f0e325324b23..47f204b227e8 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1491,15 +1491,8 @@ static void sfp_sm_device(struct sfp *sfp, unsigned int event)
 		break;
 
 	case SFP_DEV_UP:
-		if (event == SFP_E_DEV_DOWN) {
-			/* If the module has a PHY, avoid raising TX disable
-			 * as this resets the PHY. Otherwise, raise it to
-			 * turn the laser off.
-			 */
-			if (!sfp->mod_phy)
-				sfp_module_tx_disable(sfp);
+		if (event == SFP_E_DEV_DOWN)
 			sfp->sm_dev_state = SFP_DEV_DOWN;
-		}
 		break;
 	}
 }
@@ -1561,6 +1554,7 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 			sfp_sm_link_down(sfp);
 		if (sfp->mod_phy)
 			sfp_sm_phy_detach(sfp);
+		sfp_module_tx_disable(sfp);
 		sfp_sm_next(sfp, SFP_S_DOWN, 0);
 		return;
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 03/17] net: sfp: rename sfp_sm_ins_next() as sfp_sm_mod_next()
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
  2019-11-10 14:06 ` [PATCH net-next 01/17] net: sfp: move sfp sub-state machines into separate functions Russell King
  2019-11-10 14:06 ` [PATCH net-next 02/17] net: sfp: move tx disable on device down to main state machine Russell King
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 18:01   ` Andrew Lunn
  2019-11-10 14:06 ` [PATCH net-next 04/17] net: sfp: handle module remove outside state machine Russell King
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

sfp_sm_ins_next() modifies the module state machine.  Change it's name
to reflect this.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 47f204b227e8..f56a19b26924 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1182,7 +1182,7 @@ static void sfp_sm_next(struct sfp *sfp, unsigned int state,
 	sfp_sm_set_timer(sfp, timeout);
 }
 
-static void sfp_sm_ins_next(struct sfp *sfp, unsigned int state,
+static void sfp_sm_mod_next(struct sfp *sfp, unsigned int state,
 			    unsigned int timeout)
 {
 	sfp->sm_mod_state = state;
@@ -1506,22 +1506,22 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 	default:
 		if (event == SFP_E_INSERT && sfp->attached) {
 			sfp_module_tx_disable(sfp);
-			sfp_sm_ins_next(sfp, SFP_MOD_PROBE, T_PROBE_INIT);
+			sfp_sm_mod_next(sfp, SFP_MOD_PROBE, T_PROBE_INIT);
 		}
 		break;
 
 	case SFP_MOD_PROBE:
 		if (event == SFP_E_REMOVE) {
-			sfp_sm_ins_next(sfp, SFP_MOD_EMPTY, 0);
+			sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
 		} else if (event == SFP_E_TIMEOUT) {
 			int val = sfp_sm_mod_probe(sfp);
 
 			if (val == 0)
-				sfp_sm_ins_next(sfp, SFP_MOD_PRESENT, 0);
+				sfp_sm_mod_next(sfp, SFP_MOD_PRESENT, 0);
 			else if (val > 0)
-				sfp_sm_ins_next(sfp, SFP_MOD_HPOWER, val);
+				sfp_sm_mod_next(sfp, SFP_MOD_HPOWER, val);
 			else if (val != -EAGAIN)
-				sfp_sm_ins_next(sfp, SFP_MOD_ERROR, 0);
+				sfp_sm_mod_next(sfp, SFP_MOD_ERROR, 0);
 			else
 				sfp_sm_set_timer(sfp, T_PROBE_RETRY);
 		}
@@ -1529,7 +1529,7 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 
 	case SFP_MOD_HPOWER:
 		if (event == SFP_E_TIMEOUT) {
-			sfp_sm_ins_next(sfp, SFP_MOD_PRESENT, 0);
+			sfp_sm_mod_next(sfp, SFP_MOD_PRESENT, 0);
 			break;
 		}
 		/* fallthrough */
@@ -1537,7 +1537,7 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 	case SFP_MOD_ERROR:
 		if (event == SFP_E_REMOVE) {
 			sfp_sm_mod_remove(sfp);
-			sfp_sm_ins_next(sfp, SFP_MOD_EMPTY, 0);
+			sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
 		}
 		break;
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 04/17] net: sfp: handle module remove outside state machine
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (2 preceding siblings ...)
  2019-11-10 14:06 ` [PATCH net-next 03/17] net: sfp: rename sfp_sm_ins_next() as sfp_sm_mod_next() Russell King
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 18:07   ` Andrew Lunn
  2019-11-10 14:06 ` [PATCH net-next 05/17] net: sfp: rename T_PROBE_WAIT to T_SERIAL Russell King
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Removing a module resets the module state machine back to its initial
state.  Rather than explicitly handling this in every state, handle it
early on outside of the state machine.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index f56a19b26924..e34370c4a6c5 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1502,6 +1502,14 @@ static void sfp_sm_device(struct sfp *sfp, unsigned int event)
  */
 static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 {
+	/* Handle remove event globally, it resets this state machine */
+	if (event == SFP_E_REMOVE) {
+		if (sfp->sm_mod_state > SFP_MOD_PROBE)
+			sfp_sm_mod_remove(sfp);
+		sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
+		return;
+	}
+
 	switch (sfp->sm_mod_state) {
 	default:
 		if (event == SFP_E_INSERT && sfp->attached) {
@@ -1511,9 +1519,7 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 		break;
 
 	case SFP_MOD_PROBE:
-		if (event == SFP_E_REMOVE) {
-			sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
-		} else if (event == SFP_E_TIMEOUT) {
+		if (event == SFP_E_TIMEOUT) {
 			int val = sfp_sm_mod_probe(sfp);
 
 			if (val == 0)
@@ -1535,10 +1541,6 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 		/* fallthrough */
 	case SFP_MOD_PRESENT:
 	case SFP_MOD_ERROR:
-		if (event == SFP_E_REMOVE) {
-			sfp_sm_mod_remove(sfp);
-			sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
-		}
 		break;
 	}
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 05/17] net: sfp: rename T_PROBE_WAIT to T_SERIAL
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (3 preceding siblings ...)
  2019-11-10 14:06 ` [PATCH net-next 04/17] net: sfp: handle module remove outside state machine Russell King
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 18:08   ` Andrew Lunn
  2019-11-10 14:06 ` [PATCH net-next 06/17] net: sfp: parse SFP power requirement earlier Russell King
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

SFF-8472 rev 12.2 defines the time for the serial bus to become ready
using t_serial.  Use this as our identifier for this timeout to make
it clear what we are referring to.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index e34370c4a6c5..955ada116ec9 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -149,11 +149,10 @@ static const enum gpiod_flags gpio_flags[] = {
  * the same length on the PCB, which means it's possible for MOD DEF 0 to
  * connect before the I2C bus on MOD DEF 1/2.
  *
- * The SFP MSA specifies 300ms as t_init (the time taken for TX_FAULT to
- * be deasserted) but makes no mention of the earliest time before we can
- * access the I2C EEPROM.  However, Avago modules require 300ms.
+ * The SFF-8472 specifies t_serial ("Time from power on until module is
+ * ready for data transmission over the two wire serial bus.") as 300ms.
  */
-#define T_PROBE_INIT	msecs_to_jiffies(300)
+#define T_SERIAL	msecs_to_jiffies(300)
 #define T_HPOWER_LEVEL	msecs_to_jiffies(300)
 #define T_PROBE_RETRY	msecs_to_jiffies(100)
 
@@ -1497,8 +1496,8 @@ static void sfp_sm_device(struct sfp *sfp, unsigned int event)
 	}
 }
 
-/* This state machine tracks the insert/remove state of
- * the module, and handles probing the on-board EEPROM.
+/* This state machine tracks the insert/remove state of the module, probes
+ * the on-board EEPROM, and sets up the power level.
  */
 static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 {
@@ -1514,7 +1513,7 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 	default:
 		if (event == SFP_E_INSERT && sfp->attached) {
 			sfp_module_tx_disable(sfp);
-			sfp_sm_mod_next(sfp, SFP_MOD_PROBE, T_PROBE_INIT);
+			sfp_sm_mod_next(sfp, SFP_MOD_PROBE, T_SERIAL);
 		}
 		break;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 06/17] net: sfp: parse SFP power requirement earlier
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (4 preceding siblings ...)
  2019-11-10 14:06 ` [PATCH net-next 05/17] net: sfp: rename T_PROBE_WAIT to T_SERIAL Russell King
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 18:10   ` Andrew Lunn
  2019-11-10 14:06 ` [PATCH net-next 07/17] net: sfp: avoid power switch on address-change modules Russell King
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Parse the SFP power requirement earlier, in preparation for moving the
power level setup code.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 42 +++++++++++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 13 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 955ada116ec9..b105bbe7720a 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -198,6 +198,8 @@ struct sfp {
 	unsigned int sm_retries;
 
 	struct sfp_eeprom_id id;
+	unsigned int module_power_mW;
+
 #if IS_ENABLED(CONFIG_HWMON)
 	struct sfp_diag diag;
 	struct device *hwmon_dev;
@@ -1311,17 +1313,14 @@ static void sfp_sm_mod_init(struct sfp *sfp)
 		sfp_sm_probe_phy(sfp);
 }
 
-static int sfp_sm_mod_hpower(struct sfp *sfp)
+static int sfp_module_parse_power(struct sfp *sfp)
 {
-	u32 power;
-	u8 val;
-	int err;
+	u32 power_mW = 1000;
 
-	power = 1000;
 	if (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_POWER_DECL))
-		power = 1500;
+		power_mW = 1500;
 	if (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_HIGH_POWER_LEVEL))
-		power = 2000;
+		power_mW = 2000;
 
 	if (sfp->id.ext.sff8472_compliance == SFP_SFF8472_COMPLIANCE_NONE &&
 	    (sfp->id.ext.diagmon & (SFP_DIAGMON_DDM | SFP_DIAGMON_ADDRMODE)) !=
@@ -1330,23 +1329,33 @@ static int sfp_sm_mod_hpower(struct sfp *sfp)
 		 * or requires an address change sequence, so assume that
 		 * the module powers up in the indicated power mode.
 		 */
-		if (power > sfp->max_power_mW) {
+		if (power_mW > sfp->max_power_mW) {
 			dev_err(sfp->dev,
 				"Host does not support %u.%uW modules\n",
-				power / 1000, (power / 100) % 10);
+				power_mW / 1000, (power_mW / 100) % 10);
 			return -EINVAL;
 		}
 		return 0;
 	}
 
-	if (power > sfp->max_power_mW) {
+	if (power_mW > sfp->max_power_mW) {
 		dev_warn(sfp->dev,
 			 "Host does not support %u.%uW modules, module left in power mode 1\n",
-			 power / 1000, (power / 100) % 10);
+			 power_mW / 1000, (power_mW / 100) % 10);
 		return 0;
 	}
 
-	if (power <= 1000)
+	sfp->module_power_mW = power_mW;
+
+	return 0;
+}
+
+static int sfp_sm_mod_hpower(struct sfp *sfp)
+{
+	u8 val;
+	int err;
+
+	if (sfp->module_power_mW <= 1000)
 		return 0;
 
 	err = sfp_read(sfp, true, SFP_EXT_STATUS, &val, sizeof(val));
@@ -1366,7 +1375,8 @@ static int sfp_sm_mod_hpower(struct sfp *sfp)
 	}
 
 	dev_info(sfp->dev, "Module switched to %u.%uW power level\n",
-		 power / 1000, (power / 100) % 10);
+		 sfp->module_power_mW / 1000,
+		 (sfp->module_power_mW / 100) % 10);
 	return T_HPOWER_LEVEL;
 
 err:
@@ -1453,6 +1463,11 @@ static int sfp_sm_mod_probe(struct sfp *sfp)
 		dev_warn(sfp->dev,
 			 "module address swap to access page 0xA2 is not supported.\n");
 
+	/* Parse the module power requirement */
+	ret = sfp_module_parse_power(sfp);
+	if (ret < 0)
+		return ret;
+
 	ret = sfp_hwmon_insert(sfp);
 	if (ret < 0)
 		return ret;
@@ -1476,6 +1491,7 @@ static void sfp_sm_mod_remove(struct sfp *sfp)
 	sfp_module_tx_disable(sfp);
 
 	memset(&sfp->id, 0, sizeof(sfp->id));
+	sfp->module_power_mW = 0;
 
 	dev_info(sfp->dev, "module removed\n");
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 07/17] net: sfp: avoid power switch on address-change modules
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (5 preceding siblings ...)
  2019-11-10 14:06 ` [PATCH net-next 06/17] net: sfp: parse SFP power requirement earlier Russell King
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 18:12   ` Andrew Lunn
  2019-11-10 14:06 ` [PATCH net-next 08/17] net: sfp: control TX_DISABLE and phy only from main state machine Russell King
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

If the module indicates that it requires an address change sequence to
switch between address 0x50 and 0x51, which we don't support, we can't
write to the register that controls the power mode to switch to high
power mode.  Warn the user that the module may not be functional in
this case, and don't try to change the power mode.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index b105bbe7720a..7accd24a6875 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1322,25 +1322,34 @@ static int sfp_module_parse_power(struct sfp *sfp)
 	if (sfp->id.ext.options & cpu_to_be16(SFP_OPTIONS_HIGH_POWER_LEVEL))
 		power_mW = 2000;
 
-	if (sfp->id.ext.sff8472_compliance == SFP_SFF8472_COMPLIANCE_NONE &&
-	    (sfp->id.ext.diagmon & (SFP_DIAGMON_DDM | SFP_DIAGMON_ADDRMODE)) !=
-	    SFP_DIAGMON_DDM) {
-		/* The module appears not to implement bus address 0xa2,
-		 * or requires an address change sequence, so assume that
-		 * the module powers up in the indicated power mode.
-		 */
-		if (power_mW > sfp->max_power_mW) {
+	if (power_mW > sfp->max_power_mW) {
+		/* Module power specification exceeds the allowed maximum. */
+		if (sfp->id.ext.sff8472_compliance ==
+			SFP_SFF8472_COMPLIANCE_NONE &&
+		    !(sfp->id.ext.diagmon & SFP_DIAGMON_DDM)) {
+			/* The module appears not to implement bus address
+			 * 0xa2, so assume that the module powers up in the
+			 * indicated mode.
+			 */
 			dev_err(sfp->dev,
 				"Host does not support %u.%uW modules\n",
 				power_mW / 1000, (power_mW / 100) % 10);
 			return -EINVAL;
+		} else {
+			dev_warn(sfp->dev,
+				 "Host does not support %u.%uW modules, module left in power mode 1\n",
+				 power_mW / 1000, (power_mW / 100) % 10);
+			return 0;
 		}
-		return 0;
 	}
 
-	if (power_mW > sfp->max_power_mW) {
+	/* If the module requires a higher power mode, but also requires
+	 * an address change sequence, warn the user that the module may
+	 * not be functional.
+	 */
+	if (sfp->id.ext.diagmon & SFP_DIAGMON_ADDRMODE && power_mW > 1000) {
 		dev_warn(sfp->dev,
-			 "Host does not support %u.%uW modules, module left in power mode 1\n",
+			 "Address Change Sequence not supported but module requies %u.%uW, module may not be functional\n",
 			 power_mW / 1000, (power_mW / 100) % 10);
 		return 0;
 	}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 08/17] net: sfp: control TX_DISABLE and phy only from main state machine
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (6 preceding siblings ...)
  2019-11-10 14:06 ` [PATCH net-next 07/17] net: sfp: avoid power switch on address-change modules Russell King
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 18:14   ` Andrew Lunn
  2019-11-10 14:06 ` [PATCH net-next 09/17] net: sfp: split the PHY probe from sfp_sm_mod_init() Russell King
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

We initialise TX_DISABLE when the sfp cage is probed, and then
maintain its state in the main state machine.  However, the module
state machine:
- negates it when detecting a newly inserted module when it's already
  guaranteed to be negated.
- negates it when the module is removed, but the main state machine
  will do this anyway.

Make TX_DISABLE entirely controlled by the main state machine.

The main state machine also probes the module for a PHY, and removes
the PHY when the the module is removed.  Hence, removing the PHY in
sfp_sm_module_remove() is also redundant, and is a left-over from
when we tried to probe for the PHY from the module state machine.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 7accd24a6875..bd55584e193d 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1494,11 +1494,6 @@ static void sfp_sm_mod_remove(struct sfp *sfp)
 
 	sfp_hwmon_remove(sfp);
 
-	if (sfp->mod_phy)
-		sfp_sm_phy_detach(sfp);
-
-	sfp_module_tx_disable(sfp);
-
 	memset(&sfp->id, 0, sizeof(sfp->id));
 	sfp->module_power_mW = 0;
 
@@ -1536,10 +1531,8 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 
 	switch (sfp->sm_mod_state) {
 	default:
-		if (event == SFP_E_INSERT && sfp->attached) {
-			sfp_module_tx_disable(sfp);
+		if (event == SFP_E_INSERT && sfp->attached)
 			sfp_sm_mod_next(sfp, SFP_MOD_PROBE, T_SERIAL);
-		}
 		break;
 
 	case SFP_MOD_PROBE:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 09/17] net: sfp: split the PHY probe from sfp_sm_mod_init()
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (7 preceding siblings ...)
  2019-11-10 14:06 ` [PATCH net-next 08/17] net: sfp: control TX_DISABLE and phy only from main state machine Russell King
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 18:19   ` Andrew Lunn
  2019-11-10 14:06 ` [PATCH net-next 10/17] net: sfp: eliminate mdelay() from PHY probe Russell King
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Move the PHY probe into a separate function, splitting it from
sfp_sm_mod_init().  This will allow us to eliminate the 50ms mdelay()
inside the state machine.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index bd55584e193d..6fa32246ba41 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1290,14 +1290,10 @@ static void sfp_sm_fault(struct sfp *sfp, bool warn)
 static void sfp_sm_mod_init(struct sfp *sfp)
 {
 	sfp_module_tx_enable(sfp);
+}
 
-	/* Wait t_init before indicating that the link is up, provided the
-	 * current state indicates no TX_FAULT.  If TX_FAULT clears before
-	 * this time, that's fine too.
-	 */
-	sfp_sm_next(sfp, SFP_S_INIT, T_INIT_JIFFIES);
-	sfp->sm_retries = 5;
-
+static void sfp_sm_probe_for_phy(struct sfp *sfp)
+{
 	/* Setting the serdes link mode is guesswork: there's no
 	 * field in the EEPROM which indicates what mode should
 	 * be used.
@@ -1582,8 +1578,17 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 	switch (sfp->sm_state) {
 	case SFP_S_DOWN:
 		if (sfp->sm_mod_state == SFP_MOD_PRESENT &&
-		    sfp->sm_dev_state == SFP_DEV_UP)
+		    sfp->sm_dev_state == SFP_DEV_UP) {
 			sfp_sm_mod_init(sfp);
+			sfp_sm_probe_for_phy(sfp);
+
+			/* Wait t_init before indicating that the link is up,
+			 * provided the current state indicates no TX_FAULT. If
+			 * TX_FAULT clears before this time, that's fine too.
+			 */
+			sfp_sm_next(sfp, SFP_S_INIT, T_INIT_JIFFIES);
+			sfp->sm_retries = 5;
+		}
 		break;
 
 	case SFP_S_INIT:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 10/17] net: sfp: eliminate mdelay() from PHY probe
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (8 preceding siblings ...)
  2019-11-10 14:06 ` [PATCH net-next 09/17] net: sfp: split the PHY probe from sfp_sm_mod_init() Russell King
@ 2019-11-10 14:06 ` Russell King
  2019-11-10 19:37   ` Andrew Lunn
  2019-11-10 14:07 ` [PATCH net-next 11/17] net: sfp: allow fault processing to transition to other states Russell King
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 32+ messages in thread
From: Russell King @ 2019-11-10 14:06 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Rather than using mdelay() to wait before probing the PHY (which holds
several locks, including the rtnl lock), add an extra wait state to
the state machine to introduce the 50ms delay without holding any
locks.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 52 +++++++++++++++++++++++++++++++++----------
 1 file changed, 40 insertions(+), 12 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 6fa32246ba41..db015e2cb616 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -54,6 +54,7 @@ enum {
 	SFP_DEV_UP,
 
 	SFP_S_DOWN = 0,
+	SFP_S_WAIT,
 	SFP_S_INIT,
 	SFP_S_WAIT_LOS,
 	SFP_S_LINK_UP,
@@ -110,6 +111,7 @@ static const char *event_to_str(unsigned short event)
 
 static const char * const sm_state_strings[] = {
 	[SFP_S_DOWN] = "down",
+	[SFP_S_WAIT] = "wait",
 	[SFP_S_INIT] = "init",
 	[SFP_S_WAIT_LOS] = "wait_los",
 	[SFP_S_LINK_UP] = "link_up",
@@ -141,6 +143,7 @@ static const enum gpiod_flags gpio_flags[] = {
 	GPIOD_ASIS,
 };
 
+#define T_WAIT		msecs_to_jiffies(50)
 #define T_INIT_JIFFIES	msecs_to_jiffies(300)
 #define T_RESET_US	10
 #define T_FAULT_RECOVER	msecs_to_jiffies(1000)
@@ -161,9 +164,6 @@ static const enum gpiod_flags gpio_flags[] = {
  */
 #define SFP_PHY_ADDR	22
 
-/* Give this long for the PHY to reset. */
-#define T_PHY_RESET_MS	50
-
 struct sff_data {
 	unsigned int gpios;
 	bool (*module_supported)(const struct sfp_eeprom_id *id);
@@ -1204,8 +1204,6 @@ static void sfp_sm_probe_phy(struct sfp *sfp)
 	struct phy_device *phy;
 	int err;
 
-	msleep(T_PHY_RESET_MS);
-
 	phy = mdiobus_scan(sfp->i2c_mii, SFP_PHY_ADDR);
 	if (phy == ERR_PTR(-ENODEV)) {
 		dev_info(sfp->dev, "no PHY detected\n");
@@ -1560,6 +1558,8 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 
 static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 {
+	unsigned long timeout;
+
 	/* Some events are global */
 	if (sfp->sm_state != SFP_S_DOWN &&
 	    (sfp->sm_mod_state != SFP_MOD_PRESENT ||
@@ -1577,17 +1577,45 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 	/* The main state machine */
 	switch (sfp->sm_state) {
 	case SFP_S_DOWN:
-		if (sfp->sm_mod_state == SFP_MOD_PRESENT &&
-		    sfp->sm_dev_state == SFP_DEV_UP) {
-			sfp_sm_mod_init(sfp);
-			sfp_sm_probe_for_phy(sfp);
+		if (sfp->sm_mod_state != SFP_MOD_PRESENT ||
+		    sfp->sm_dev_state != SFP_DEV_UP)
+			break;
+
+		sfp_sm_mod_init(sfp);
+
+		/* Initialise the fault clearance retries */
+		sfp->sm_retries = 5;
+
+		/* We need to check the TX_FAULT state, which is not defined
+		 * while TX_DISABLE is asserted. The earliest we want to do
+		 * anything (such as probe for a PHY) is 50ms.
+		 */
+		sfp_sm_next(sfp, SFP_S_WAIT, T_WAIT);
+		break;
 
+	case SFP_S_WAIT:
+		if (event != SFP_E_TIMEOUT)
+			break;
+
+		sfp_sm_probe_for_phy(sfp);
+
+		if (sfp->state & SFP_F_TX_FAULT) {
 			/* Wait t_init before indicating that the link is up,
 			 * provided the current state indicates no TX_FAULT. If
 			 * TX_FAULT clears before this time, that's fine too.
 			 */
-			sfp_sm_next(sfp, SFP_S_INIT, T_INIT_JIFFIES);
-			sfp->sm_retries = 5;
+			timeout = T_INIT_JIFFIES;
+			if (timeout > T_WAIT)
+				timeout -= T_WAIT;
+			else
+				timeout = 1;
+
+			sfp_sm_next(sfp, SFP_S_INIT, timeout);
+		} else {
+			/* TX_FAULT is not asserted, assume the module has
+			 * finished initialising.
+			 */
+			goto init_done;
 		}
 		break;
 
@@ -1595,7 +1623,7 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 		if (event == SFP_E_TIMEOUT && sfp->state & SFP_F_TX_FAULT)
 			sfp_sm_fault(sfp, true);
 		else if (event == SFP_E_TIMEOUT || event == SFP_E_TX_CLEAR)
-			sfp_sm_link_check_los(sfp);
+	init_done:	sfp_sm_link_check_los(sfp);
 		break;
 
 	case SFP_S_WAIT_LOS:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 11/17] net: sfp: allow fault processing to transition to other states
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (9 preceding siblings ...)
  2019-11-10 14:06 ` [PATCH net-next 10/17] net: sfp: eliminate mdelay() from PHY probe Russell King
@ 2019-11-10 14:07 ` Russell King
  2019-11-10 14:07 ` [PATCH net-next 12/17] net: sfp: ensure TX_FAULT has deasserted before probing the PHY Russell King
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 32+ messages in thread
From: Russell King @ 2019-11-10 14:07 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Add the next state to sfp_sm_fault() so that it can branch to other
states. This will be necessary to improve the initialisation path.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index db015e2cb616..bdc633cc7c30 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1271,7 +1271,7 @@ static bool sfp_los_event_inactive(struct sfp *sfp, unsigned int event)
 		event == SFP_E_LOS_LOW);
 }
 
-static void sfp_sm_fault(struct sfp *sfp, bool warn)
+static void sfp_sm_fault(struct sfp *sfp, unsigned int next_state, bool warn)
 {
 	if (sfp->sm_retries && !--sfp->sm_retries) {
 		dev_err(sfp->dev,
@@ -1281,7 +1281,7 @@ static void sfp_sm_fault(struct sfp *sfp, bool warn)
 		if (warn)
 			dev_err(sfp->dev, "module transmit fault indicated\n");
 
-		sfp_sm_next(sfp, SFP_S_TX_FAULT, T_FAULT_RECOVER);
+		sfp_sm_next(sfp, next_state, T_FAULT_RECOVER);
 	}
 }
 
@@ -1621,14 +1621,14 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 
 	case SFP_S_INIT:
 		if (event == SFP_E_TIMEOUT && sfp->state & SFP_F_TX_FAULT)
-			sfp_sm_fault(sfp, true);
+			sfp_sm_fault(sfp, SFP_S_TX_FAULT, true);
 		else if (event == SFP_E_TIMEOUT || event == SFP_E_TX_CLEAR)
 	init_done:	sfp_sm_link_check_los(sfp);
 		break;
 
 	case SFP_S_WAIT_LOS:
 		if (event == SFP_E_TX_FAULT)
-			sfp_sm_fault(sfp, true);
+			sfp_sm_fault(sfp, SFP_S_TX_FAULT, true);
 		else if (sfp_los_event_inactive(sfp, event))
 			sfp_sm_link_up(sfp);
 		break;
@@ -1636,7 +1636,7 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 	case SFP_S_LINK_UP:
 		if (event == SFP_E_TX_FAULT) {
 			sfp_sm_link_down(sfp);
-			sfp_sm_fault(sfp, true);
+			sfp_sm_fault(sfp, SFP_S_TX_FAULT, true);
 		} else if (sfp_los_event_active(sfp, event)) {
 			sfp_sm_link_down(sfp);
 			sfp_sm_next(sfp, SFP_S_WAIT_LOS, 0);
@@ -1652,7 +1652,7 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 
 	case SFP_S_REINIT:
 		if (event == SFP_E_TIMEOUT && sfp->state & SFP_F_TX_FAULT) {
-			sfp_sm_fault(sfp, false);
+			sfp_sm_fault(sfp, SFP_S_TX_FAULT, false);
 		} else if (event == SFP_E_TIMEOUT || event == SFP_E_TX_CLEAR) {
 			dev_info(sfp->dev, "module transmit fault recovered\n");
 			sfp_sm_link_check_los(sfp);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 12/17] net: sfp: ensure TX_FAULT has deasserted before probing the PHY
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (10 preceding siblings ...)
  2019-11-10 14:07 ` [PATCH net-next 11/17] net: sfp: allow fault processing to transition to other states Russell King
@ 2019-11-10 14:07 ` Russell King
  2019-11-10 14:07 ` [PATCH net-next 13/17] net: sfp: track upstream's attachment state in state machine Russell King
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 32+ messages in thread
From: Russell King @ 2019-11-10 14:07 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

TX_FAULT should be deasserted to indicate that the module has completed
its initialisation.  This may include the on-board PHY, so wait until
the module has deasserted TX_FAULT before probing the PHY.

This means that we need an extra state to handle a TX_FAULT that
remains set for longer than t_init, since using the existing handling
state would bypass the PHY probe.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index bdc633cc7c30..91fd218ba6b6 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -56,6 +56,7 @@ enum {
 	SFP_S_DOWN = 0,
 	SFP_S_WAIT,
 	SFP_S_INIT,
+	SFP_S_INIT_TX_FAULT,
 	SFP_S_WAIT_LOS,
 	SFP_S_LINK_UP,
 	SFP_S_TX_FAULT,
@@ -113,6 +114,7 @@ static const char * const sm_state_strings[] = {
 	[SFP_S_DOWN] = "down",
 	[SFP_S_WAIT] = "wait",
 	[SFP_S_INIT] = "init",
+	[SFP_S_INIT_TX_FAULT] = "init_tx_fault",
 	[SFP_S_WAIT_LOS] = "wait_los",
 	[SFP_S_LINK_UP] = "link_up",
 	[SFP_S_TX_FAULT] = "tx_fault",
@@ -1597,8 +1599,6 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 		if (event != SFP_E_TIMEOUT)
 			break;
 
-		sfp_sm_probe_for_phy(sfp);
-
 		if (sfp->state & SFP_F_TX_FAULT) {
 			/* Wait t_init before indicating that the link is up,
 			 * provided the current state indicates no TX_FAULT. If
@@ -1620,10 +1620,29 @@ static void sfp_sm_main(struct sfp *sfp, unsigned int event)
 		break;
 
 	case SFP_S_INIT:
-		if (event == SFP_E_TIMEOUT && sfp->state & SFP_F_TX_FAULT)
-			sfp_sm_fault(sfp, SFP_S_TX_FAULT, true);
-		else if (event == SFP_E_TIMEOUT || event == SFP_E_TX_CLEAR)
-	init_done:	sfp_sm_link_check_los(sfp);
+		if (event == SFP_E_TIMEOUT && sfp->state & SFP_F_TX_FAULT) {
+			/* TX_FAULT is still asserted after t_init, so assume
+			 * there is a fault.
+			 */
+			sfp_sm_fault(sfp, SFP_S_INIT_TX_FAULT,
+				     sfp->sm_retries == 5);
+		} else if (event == SFP_E_TIMEOUT || event == SFP_E_TX_CLEAR) {
+	init_done:	/* TX_FAULT deasserted or we timed out with TX_FAULT
+			 * clear.  Probe for the PHY and check the LOS state.
+			 */
+			sfp_sm_probe_for_phy(sfp);
+			sfp_sm_link_check_los(sfp);
+
+			/* Reset the fault retry count */
+			sfp->sm_retries = 5;
+		}
+		break;
+
+	case SFP_S_INIT_TX_FAULT:
+		if (event == SFP_E_TIMEOUT) {
+			sfp_module_tx_fault_reset(sfp);
+			sfp_sm_next(sfp, SFP_S_INIT, T_INIT_JIFFIES);
+		}
 		break;
 
 	case SFP_S_WAIT_LOS:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 13/17] net: sfp: track upstream's attachment state in state machine
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (11 preceding siblings ...)
  2019-11-10 14:07 ` [PATCH net-next 12/17] net: sfp: ensure TX_FAULT has deasserted before probing the PHY Russell King
@ 2019-11-10 14:07 ` Russell King
  2019-11-10 14:07 ` [PATCH net-next 14/17] net: sfp: split power mode switching from probe Russell King
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 32+ messages in thread
From: Russell King @ 2019-11-10 14:07 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Track the upstream's attachment state in the state machine rather than
maintaining a boolean, which ensures that we have a strict order of
ATTACH followed by an UP event - we can never believe that a newly
attached upstream will be anything but down.

Rearrange the order of state machines so we run the module state
machine after the upstream device's state machine, so the module state
machine can check the current state of the device and take action to
e.g. reset back to empty state when the upstream is detached.

This is to allow the module detection to run independently of the
network device becoming available.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 42 +++++++++++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 13 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 91fd218ba6b6..95e0dd4a52df 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -36,6 +36,8 @@ enum {
 
 	SFP_E_INSERT = 0,
 	SFP_E_REMOVE,
+	SFP_E_DEV_ATTACH,
+	SFP_E_DEV_DETACH,
 	SFP_E_DEV_DOWN,
 	SFP_E_DEV_UP,
 	SFP_E_TX_FAULT,
@@ -50,7 +52,8 @@ enum {
 	SFP_MOD_PRESENT,
 	SFP_MOD_ERROR,
 
-	SFP_DEV_DOWN = 0,
+	SFP_DEV_DETACHED = 0,
+	SFP_DEV_DOWN,
 	SFP_DEV_UP,
 
 	SFP_S_DOWN = 0,
@@ -80,6 +83,7 @@ static const char *mod_state_to_str(unsigned short mod_state)
 }
 
 static const char * const dev_state_strings[] = {
+	[SFP_DEV_DETACHED] = "detached",
 	[SFP_DEV_DOWN] = "down",
 	[SFP_DEV_UP] = "up",
 };
@@ -94,6 +98,8 @@ static const char *dev_state_to_str(unsigned short dev_state)
 static const char * const event_strings[] = {
 	[SFP_E_INSERT] = "insert",
 	[SFP_E_REMOVE] = "remove",
+	[SFP_E_DEV_ATTACH] = "dev_attach",
+	[SFP_E_DEV_DETACH] = "dev_detach",
 	[SFP_E_DEV_DOWN] = "dev_down",
 	[SFP_E_DEV_UP] = "dev_up",
 	[SFP_E_TX_FAULT] = "tx_fault",
@@ -188,7 +194,6 @@ struct sfp {
 	struct gpio_desc *gpio[GPIO_MAX];
 	int gpio_irq[GPIO_MAX];
 
-	bool attached;
 	struct mutex st_mutex;			/* Protects state */
 	unsigned int state;
 	struct delayed_work poll;
@@ -1496,17 +1501,26 @@ static void sfp_sm_mod_remove(struct sfp *sfp)
 	dev_info(sfp->dev, "module removed\n");
 }
 
-/* This state machine tracks the netdev up/down state */
+/* This state machine tracks the upstream's state */
 static void sfp_sm_device(struct sfp *sfp, unsigned int event)
 {
 	switch (sfp->sm_dev_state) {
 	default:
-		if (event == SFP_E_DEV_UP)
+		if (event == SFP_E_DEV_ATTACH)
+			sfp->sm_dev_state = SFP_DEV_DOWN;
+		break;
+
+	case SFP_DEV_DOWN:
+		if (event == SFP_E_DEV_DETACH)
+			sfp->sm_dev_state = SFP_DEV_DETACHED;
+		else if (event == SFP_E_DEV_UP)
 			sfp->sm_dev_state = SFP_DEV_UP;
 		break;
 
 	case SFP_DEV_UP:
-		if (event == SFP_E_DEV_DOWN)
+		if (event == SFP_E_DEV_DETACH)
+			sfp->sm_dev_state = SFP_DEV_DETACHED;
+		else if (event == SFP_E_DEV_DOWN)
 			sfp->sm_dev_state = SFP_DEV_DOWN;
 		break;
 	}
@@ -1517,17 +1531,20 @@ static void sfp_sm_device(struct sfp *sfp, unsigned int event)
  */
 static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 {
-	/* Handle remove event globally, it resets this state machine */
-	if (event == SFP_E_REMOVE) {
+	/* Handle remove event globally, it resets this state machine.
+	 * Also deal with upstream detachment.
+	 */
+	if (event == SFP_E_REMOVE || sfp->sm_dev_state < SFP_DEV_DOWN) {
 		if (sfp->sm_mod_state > SFP_MOD_PROBE)
 			sfp_sm_mod_remove(sfp);
-		sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
+		if (sfp->sm_mod_state != SFP_MOD_EMPTY)
+			sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
 		return;
 	}
 
 	switch (sfp->sm_mod_state) {
 	default:
-		if (event == SFP_E_INSERT && sfp->attached)
+		if (event == SFP_E_INSERT)
 			sfp_sm_mod_next(sfp, SFP_MOD_PROBE, T_SERIAL);
 		break;
 
@@ -1693,8 +1710,8 @@ static void sfp_sm_event(struct sfp *sfp, unsigned int event)
 		sm_state_to_str(sfp->sm_state),
 		event_to_str(event));
 
-	sfp_sm_module(sfp, event);
 	sfp_sm_device(sfp, event);
+	sfp_sm_module(sfp, event);
 	sfp_sm_main(sfp, event);
 
 	dev_dbg(sfp->dev, "SM: exit %s:%s:%s\n",
@@ -1707,15 +1724,14 @@ static void sfp_sm_event(struct sfp *sfp, unsigned int event)
 
 static void sfp_attach(struct sfp *sfp)
 {
-	sfp->attached = true;
+	sfp_sm_event(sfp, SFP_E_DEV_ATTACH);
 	if (sfp->state & SFP_F_PRESENT)
 		sfp_sm_event(sfp, SFP_E_INSERT);
 }
 
 static void sfp_detach(struct sfp *sfp)
 {
-	sfp->attached = false;
-	sfp_sm_event(sfp, SFP_E_REMOVE);
+	sfp_sm_event(sfp, SFP_E_DEV_DETACH);
 }
 
 static void sfp_start(struct sfp *sfp)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 14/17] net: sfp: split power mode switching from probe
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (12 preceding siblings ...)
  2019-11-10 14:07 ` [PATCH net-next 13/17] net: sfp: track upstream's attachment state in state machine Russell King
@ 2019-11-10 14:07 ` Russell King
  2019-11-10 14:07 ` [PATCH net-next 15/17] net: sfp: move module insert reporting out of probe Russell King
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 32+ messages in thread
From: Russell King @ 2019-11-10 14:07 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Switch the power mode switching from the probe, so that we don't
repeatedly re-probe the SFP device if there is a problem accessing
the registers at I2C address 0x51.

In splitting this out, we can also fix a bug where we leave the module
in high-power mode when the upstream device is detached but the module
is still inserted.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 101 ++++++++++++++++++++++++++----------------
 1 file changed, 64 insertions(+), 37 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 95e0dd4a52df..1d58e0d0478b 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -49,6 +49,7 @@ enum {
 	SFP_MOD_EMPTY = 0,
 	SFP_MOD_PROBE,
 	SFP_MOD_HPOWER,
+	SFP_MOD_WAITPWR,
 	SFP_MOD_PRESENT,
 	SFP_MOD_ERROR,
 
@@ -71,6 +72,7 @@ static const char  * const mod_state_strings[] = {
 	[SFP_MOD_EMPTY] = "empty",
 	[SFP_MOD_PROBE] = "probe",
 	[SFP_MOD_HPOWER] = "hpower",
+	[SFP_MOD_WAITPWR] = "waitpwr",
 	[SFP_MOD_PRESENT] = "present",
 	[SFP_MOD_ERROR] = "error",
 };
@@ -1360,37 +1362,34 @@ static int sfp_module_parse_power(struct sfp *sfp)
 	return 0;
 }
 
-static int sfp_sm_mod_hpower(struct sfp *sfp)
+static int sfp_sm_mod_hpower(struct sfp *sfp, bool enable)
 {
 	u8 val;
 	int err;
 
-	if (sfp->module_power_mW <= 1000)
-		return 0;
-
 	err = sfp_read(sfp, true, SFP_EXT_STATUS, &val, sizeof(val));
 	if (err != sizeof(val)) {
 		dev_err(sfp->dev, "Failed to read EEPROM: %d\n", err);
-		err = -EAGAIN;
-		goto err;
+		return -EAGAIN;
 	}
 
-	val |= BIT(0);
+	if (enable)
+		val |= BIT(0);
+	else
+		val &= ~BIT(0);
 
 	err = sfp_write(sfp, true, SFP_EXT_STATUS, &val, sizeof(val));
 	if (err != sizeof(val)) {
 		dev_err(sfp->dev, "Failed to write EEPROM: %d\n", err);
-		err = -EAGAIN;
-		goto err;
+		return -EAGAIN;
 	}
 
-	dev_info(sfp->dev, "Module switched to %u.%uW power level\n",
-		 sfp->module_power_mW / 1000,
-		 (sfp->module_power_mW / 100) % 10);
-	return T_HPOWER_LEVEL;
+	if (enable)
+		dev_info(sfp->dev, "Module switched to %u.%uW power level\n",
+			 sfp->module_power_mW / 1000,
+			 (sfp->module_power_mW / 100) % 10);
 
-err:
-	return err;
+	return 0;
 }
 
 static int sfp_sm_mod_probe(struct sfp *sfp)
@@ -1486,7 +1485,7 @@ static int sfp_sm_mod_probe(struct sfp *sfp)
 	if (ret < 0)
 		return ret;
 
-	return sfp_sm_mod_hpower(sfp);
+	return 0;
 }
 
 static void sfp_sm_mod_remove(struct sfp *sfp)
@@ -1531,13 +1530,22 @@ static void sfp_sm_device(struct sfp *sfp, unsigned int event)
  */
 static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 {
-	/* Handle remove event globally, it resets this state machine.
-	 * Also deal with upstream detachment.
-	 */
-	if (event == SFP_E_REMOVE || sfp->sm_dev_state < SFP_DEV_DOWN) {
+	int err;
+
+	/* Handle remove event globally, it resets this state machine */
+	if (event == SFP_E_REMOVE) {
 		if (sfp->sm_mod_state > SFP_MOD_PROBE)
 			sfp_sm_mod_remove(sfp);
-		if (sfp->sm_mod_state != SFP_MOD_EMPTY)
+		sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
+		return;
+	}
+
+	/* Handle device detach globally */
+	if (sfp->sm_dev_state < SFP_DEV_DOWN) {
+		if (sfp->module_power_mW > 1000 &&
+		    sfp->sm_mod_state > SFP_MOD_HPOWER)
+			sfp_sm_mod_hpower(sfp, false);
+		if (sfp->sm_mod_state > SFP_MOD_EMPTY)
 			sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
 		return;
 	}
@@ -1549,26 +1557,45 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 		break;
 
 	case SFP_MOD_PROBE:
-		if (event == SFP_E_TIMEOUT) {
-			int val = sfp_sm_mod_probe(sfp);
-
-			if (val == 0)
-				sfp_sm_mod_next(sfp, SFP_MOD_PRESENT, 0);
-			else if (val > 0)
-				sfp_sm_mod_next(sfp, SFP_MOD_HPOWER, val);
-			else if (val != -EAGAIN)
-				sfp_sm_mod_next(sfp, SFP_MOD_ERROR, 0);
-			else
-				sfp_sm_set_timer(sfp, T_PROBE_RETRY);
+		if (event != SFP_E_TIMEOUT)
+			break;
+
+		err = sfp_sm_mod_probe(sfp);
+		if (err == -EAGAIN) {
+			sfp_sm_set_timer(sfp, T_PROBE_RETRY);
+			break;
+		}
+		if (err < 0) {
+			sfp_sm_mod_next(sfp, SFP_MOD_ERROR, 0);
+			break;
 		}
-		break;
 
+		/* If this is a power level 1 module, we are done */
+		if (sfp->module_power_mW <= 1000)
+			goto insert;
+
+		sfp_sm_mod_next(sfp, SFP_MOD_HPOWER, 0);
+		/* fall through */
 	case SFP_MOD_HPOWER:
-		if (event == SFP_E_TIMEOUT) {
-			sfp_sm_mod_next(sfp, SFP_MOD_PRESENT, 0);
+		/* Enable high power mode */
+		err = sfp_sm_mod_hpower(sfp, true);
+		if (err == 0)
+			sfp_sm_mod_next(sfp, SFP_MOD_WAITPWR, T_HPOWER_LEVEL);
+		else if (err != -EAGAIN)
+			sfp_sm_mod_next(sfp, SFP_MOD_ERROR, 0);
+		else
+			sfp_sm_set_timer(sfp, T_PROBE_RETRY);
+		break;
+
+	case SFP_MOD_WAITPWR:
+		/* Wait for T_HPOWER_LEVEL to time out */
+		if (event != SFP_E_TIMEOUT)
 			break;
-		}
-		/* fallthrough */
+
+	insert:
+		sfp_sm_mod_next(sfp, SFP_MOD_PRESENT, 0);
+		break;
+
 	case SFP_MOD_PRESENT:
 	case SFP_MOD_ERROR:
 		break;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 15/17] net: sfp: move module insert reporting out of probe
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (13 preceding siblings ...)
  2019-11-10 14:07 ` [PATCH net-next 14/17] net: sfp: split power mode switching from probe Russell King
@ 2019-11-10 14:07 ` Russell King
  2019-11-10 14:07 ` [PATCH net-next 16/17] net: sfp: allow sfp to probe slow to initialise GPON modules Russell King
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 32+ messages in thread
From: Russell King @ 2019-11-10 14:07 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Move the module insertion reporting out of the probe handling, but
after we have detected that the upstream has attached (since that is
whom we are reporting insertion to.)

Only report module removal if we had previously reported a module
insertion.

This gives cleaner semantics, and means we can probe the module before
we have an upstream attached.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 58 +++++++++++++++++++++++++++++--------------
 1 file changed, 40 insertions(+), 18 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 1d58e0d0478b..5aaeee461d06 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -47,11 +47,12 @@ enum {
 	SFP_E_TIMEOUT,
 
 	SFP_MOD_EMPTY = 0,
+	SFP_MOD_ERROR,
 	SFP_MOD_PROBE,
+	SFP_MOD_WAITDEV,
 	SFP_MOD_HPOWER,
 	SFP_MOD_WAITPWR,
 	SFP_MOD_PRESENT,
-	SFP_MOD_ERROR,
 
 	SFP_DEV_DETACHED = 0,
 	SFP_DEV_DOWN,
@@ -70,11 +71,12 @@ enum {
 
 static const char  * const mod_state_strings[] = {
 	[SFP_MOD_EMPTY] = "empty",
+	[SFP_MOD_ERROR] = "error",
 	[SFP_MOD_PROBE] = "probe",
+	[SFP_MOD_WAITDEV] = "waitdev",
 	[SFP_MOD_HPOWER] = "hpower",
 	[SFP_MOD_WAITPWR] = "waitpwr",
 	[SFP_MOD_PRESENT] = "present",
-	[SFP_MOD_ERROR] = "error",
 };
 
 static const char *mod_state_to_str(unsigned short mod_state)
@@ -1481,16 +1483,13 @@ static int sfp_sm_mod_probe(struct sfp *sfp)
 	if (ret < 0)
 		return ret;
 
-	ret = sfp_module_insert(sfp->sfp_bus, &sfp->id);
-	if (ret < 0)
-		return ret;
-
 	return 0;
 }
 
 static void sfp_sm_mod_remove(struct sfp *sfp)
 {
-	sfp_module_remove(sfp->sfp_bus);
+	if (sfp->sm_mod_state > SFP_MOD_WAITDEV)
+		sfp_module_remove(sfp->sfp_bus);
 
 	sfp_hwmon_remove(sfp);
 
@@ -1541,12 +1540,12 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 	}
 
 	/* Handle device detach globally */
-	if (sfp->sm_dev_state < SFP_DEV_DOWN) {
+	if (sfp->sm_dev_state < SFP_DEV_DOWN &&
+	    sfp->sm_mod_state > SFP_MOD_WAITDEV) {
 		if (sfp->module_power_mW > 1000 &&
 		    sfp->sm_mod_state > SFP_MOD_HPOWER)
 			sfp_sm_mod_hpower(sfp, false);
-		if (sfp->sm_mod_state > SFP_MOD_EMPTY)
-			sfp_sm_mod_next(sfp, SFP_MOD_EMPTY, 0);
+		sfp_sm_mod_next(sfp, SFP_MOD_WAITDEV, 0);
 		return;
 	}
 
@@ -1557,6 +1556,7 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 		break;
 
 	case SFP_MOD_PROBE:
+		/* Wait for T_PROBE_INIT to time out */
 		if (event != SFP_E_TIMEOUT)
 			break;
 
@@ -1570,6 +1570,20 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 			break;
 		}
 
+		sfp_sm_mod_next(sfp, SFP_MOD_WAITDEV, 0);
+		/* fall through */
+	case SFP_MOD_WAITDEV:
+		/* Ensure that the device is attached before proceeding */
+		if (sfp->sm_dev_state < SFP_DEV_DOWN)
+			break;
+
+		/* Report the module insertion to the upstream device */
+		err = sfp_module_insert(sfp->sfp_bus, &sfp->id);
+		if (err < 0) {
+			sfp_sm_mod_next(sfp, SFP_MOD_ERROR, 0);
+			break;
+		}
+
 		/* If this is a power level 1 module, we are done */
 		if (sfp->module_power_mW <= 1000)
 			goto insert;
@@ -1579,12 +1593,17 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 	case SFP_MOD_HPOWER:
 		/* Enable high power mode */
 		err = sfp_sm_mod_hpower(sfp, true);
-		if (err == 0)
-			sfp_sm_mod_next(sfp, SFP_MOD_WAITPWR, T_HPOWER_LEVEL);
-		else if (err != -EAGAIN)
-			sfp_sm_mod_next(sfp, SFP_MOD_ERROR, 0);
-		else
-			sfp_sm_set_timer(sfp, T_PROBE_RETRY);
+		if (err < 0) {
+			if (err != -EAGAIN) {
+				sfp_module_remove(sfp->sfp_bus);
+				sfp_sm_mod_next(sfp, SFP_MOD_ERROR, 0);
+			} else {
+				sfp_sm_set_timer(sfp, T_PROBE_RETRY);
+			}
+			break;
+		}
+
+		sfp_sm_mod_next(sfp, SFP_MOD_WAITPWR, T_HPOWER_LEVEL);
 		break;
 
 	case SFP_MOD_WAITPWR:
@@ -1752,8 +1771,6 @@ static void sfp_sm_event(struct sfp *sfp, unsigned int event)
 static void sfp_attach(struct sfp *sfp)
 {
 	sfp_sm_event(sfp, SFP_E_DEV_ATTACH);
-	if (sfp->state & SFP_F_PRESENT)
-		sfp_sm_event(sfp, SFP_E_INSERT);
 }
 
 static void sfp_detach(struct sfp *sfp)
@@ -2021,6 +2038,11 @@ static int sfp_probe(struct platform_device *pdev)
 		sfp->state |= SFP_F_RATE_SELECT;
 	sfp_set_state(sfp, sfp->state);
 	sfp_module_tx_disable(sfp);
+	if (sfp->state & SFP_F_PRESENT) {
+		rtnl_lock();
+		sfp_sm_event(sfp, SFP_E_INSERT);
+		rtnl_unlock();
+	}
 
 	for (i = 0; i < GPIO_MAX; i++) {
 		if (gpio_flags[i] != GPIOD_IN || !sfp->gpio[i])
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 16/17] net: sfp: allow sfp to probe slow to initialise GPON modules
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (14 preceding siblings ...)
  2019-11-10 14:07 ` [PATCH net-next 15/17] net: sfp: move module insert reporting out of probe Russell King
@ 2019-11-10 14:07 ` Russell King
  2019-11-10 14:07 ` [PATCH net-next 17/17] net: sfp: allow modules with slow diagnostics to probe Russell King
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 32+ messages in thread
From: Russell King @ 2019-11-10 14:07 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

Some GPON modules (e.g. Huawei MA5671A) take a significant amount of
time to start responding on the I2C bus, contary to the SFF
specifications.

Work around this by implementing a two-level timeout strategy, where
we initially quickly retry for the module, and then use a slower retry
after we exceed a maximum number of quick attempts.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 38 ++++++++++++++++++++++++++++----------
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 5aaeee461d06..68d91fae2077 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -167,9 +167,12 @@ static const enum gpiod_flags gpio_flags[] = {
  * The SFF-8472 specifies t_serial ("Time from power on until module is
  * ready for data transmission over the two wire serial bus.") as 300ms.
  */
-#define T_SERIAL	msecs_to_jiffies(300)
-#define T_HPOWER_LEVEL	msecs_to_jiffies(300)
-#define T_PROBE_RETRY	msecs_to_jiffies(100)
+#define T_SERIAL		msecs_to_jiffies(300)
+#define T_HPOWER_LEVEL		msecs_to_jiffies(300)
+#define T_PROBE_RETRY_INIT	msecs_to_jiffies(100)
+#define R_PROBE_RETRY_INIT	10
+#define T_PROBE_RETRY_SLOW	msecs_to_jiffies(5000)
+#define R_PROBE_RETRY_SLOW	12
 
 /* SFP modules appear to always have their PHY configured for bus address
  * 0x56 (which with mdio-i2c, translates to a PHY address of 22).
@@ -204,6 +207,8 @@ struct sfp {
 	struct delayed_work timeout;
 	struct mutex sm_mutex;			/* Protects state machine */
 	unsigned char sm_mod_state;
+	unsigned char sm_mod_tries_init;
+	unsigned char sm_mod_tries;
 	unsigned char sm_dev_state;
 	unsigned short sm_state;
 	unsigned int sm_retries;
@@ -1394,7 +1399,7 @@ static int sfp_sm_mod_hpower(struct sfp *sfp, bool enable)
 	return 0;
 }
 
-static int sfp_sm_mod_probe(struct sfp *sfp)
+static int sfp_sm_mod_probe(struct sfp *sfp, bool report)
 {
 	/* SFP module inserted - read I2C data */
 	struct sfp_eeprom_id id;
@@ -1404,7 +1409,8 @@ static int sfp_sm_mod_probe(struct sfp *sfp)
 
 	ret = sfp_read(sfp, false, 0, &id, sizeof(id));
 	if (ret < 0) {
-		dev_err(sfp->dev, "failed to read EEPROM: %d\n", ret);
+		if (report)
+			dev_err(sfp->dev, "failed to read EEPROM: %d\n", ret);
 		return -EAGAIN;
 	}
 
@@ -1551,8 +1557,11 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 
 	switch (sfp->sm_mod_state) {
 	default:
-		if (event == SFP_E_INSERT)
+		if (event == SFP_E_INSERT) {
 			sfp_sm_mod_next(sfp, SFP_MOD_PROBE, T_SERIAL);
+			sfp->sm_mod_tries_init = R_PROBE_RETRY_INIT;
+			sfp->sm_mod_tries = R_PROBE_RETRY_SLOW;
+		}
 		break;
 
 	case SFP_MOD_PROBE:
@@ -1560,10 +1569,19 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 		if (event != SFP_E_TIMEOUT)
 			break;
 
-		err = sfp_sm_mod_probe(sfp);
+		err = sfp_sm_mod_probe(sfp, sfp->sm_mod_tries == 1);
 		if (err == -EAGAIN) {
-			sfp_sm_set_timer(sfp, T_PROBE_RETRY);
-			break;
+			if (sfp->sm_mod_tries_init &&
+			   --sfp->sm_mod_tries_init) {
+				sfp_sm_set_timer(sfp, T_PROBE_RETRY_INIT);
+				break;
+			} else if (sfp->sm_mod_tries && --sfp->sm_mod_tries) {
+				if (sfp->sm_mod_tries == R_PROBE_RETRY_SLOW - 1)
+					dev_warn(sfp->dev,
+						 "please wait, module slow to respond\n");
+				sfp_sm_set_timer(sfp, T_PROBE_RETRY_SLOW);
+				break;
+			}
 		}
 		if (err < 0) {
 			sfp_sm_mod_next(sfp, SFP_MOD_ERROR, 0);
@@ -1598,7 +1616,7 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 				sfp_module_remove(sfp->sfp_bus);
 				sfp_sm_mod_next(sfp, SFP_MOD_ERROR, 0);
 			} else {
-				sfp_sm_set_timer(sfp, T_PROBE_RETRY);
+				sfp_sm_set_timer(sfp, T_PROBE_RETRY_INIT);
 			}
 			break;
 		}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH net-next 17/17] net: sfp: allow modules with slow diagnostics to probe
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (15 preceding siblings ...)
  2019-11-10 14:07 ` [PATCH net-next 16/17] net: sfp: allow sfp to probe slow to initialise GPON modules Russell King
@ 2019-11-10 14:07 ` Russell King
  2019-11-10 17:52 ` [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Andrew Lunn
  2019-11-12  0:18 ` David Miller
  18 siblings, 0 replies; 32+ messages in thread
From: Russell King @ 2019-11-10 14:07 UTC (permalink / raw)
  To: Andrew Lunn, Florian Fainelli, Heiner Kallweit; +Cc: David S. Miller, netdev

When a module is inserted, we attempt to read read the ID from address
0x50.  Once we are able to read the ID, we immediately attempt to
initialise the hwmon support by reading from address 0x51.  If this
fails, then we fall into error state, and assume that the module is
not usable.

Modules such as the ALCATELLUCENT 3FE46541AA use a real EEPROM for
I2C address 0x50, which responds immediately.  However, address 0x51
is an emulated, which only becomes available once the on-board firmware
has booted.  This prompts us to fall into the error state.

Since the module may be usable without diagnostics, arrange for the
hwmon probe independent of the rest of the SFP itself, retrying every
5s for up to about 60s for the monitoring to become available, and
print an error message if it doesn't become available.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/sfp.c | 96 +++++++++++++++++++++++++++++++++----------
 1 file changed, 74 insertions(+), 22 deletions(-)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 68d91fae2077..69bedef96ca7 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -218,6 +218,8 @@ struct sfp {
 
 #if IS_ENABLED(CONFIG_HWMON)
 	struct sfp_diag diag;
+	struct delayed_work hwmon_probe;
+	unsigned int hwmon_tries;
 	struct device *hwmon_dev;
 	char *hwmon_name;
 #endif
@@ -1096,29 +1098,27 @@ static const struct hwmon_chip_info sfp_hwmon_chip_info = {
 	.info = sfp_hwmon_info,
 };
 
-static int sfp_hwmon_insert(struct sfp *sfp)
+static void sfp_hwmon_probe(struct work_struct *work)
 {
+	struct sfp *sfp = container_of(work, struct sfp, hwmon_probe.work);
 	int err, i;
 
-	if (sfp->id.ext.sff8472_compliance == SFP_SFF8472_COMPLIANCE_NONE)
-		return 0;
-
-	if (!(sfp->id.ext.diagmon & SFP_DIAGMON_DDM))
-		return 0;
-
-	if (sfp->id.ext.diagmon & SFP_DIAGMON_ADDRMODE)
-		/* This driver in general does not support address
-		 * change.
-		 */
-		return 0;
-
 	err = sfp_read(sfp, true, 0, &sfp->diag, sizeof(sfp->diag));
-	if (err < 0)
-		return err;
+	if (err < 0) {
+		if (sfp->hwmon_tries--) {
+			mod_delayed_work(system_wq, &sfp->hwmon_probe,
+					 T_PROBE_RETRY_SLOW);
+		} else {
+			dev_warn(sfp->dev, "hwmon probe failed: %d\n", err);
+		}
+		return;
+	}
 
 	sfp->hwmon_name = kstrdup(dev_name(sfp->dev), GFP_KERNEL);
-	if (!sfp->hwmon_name)
-		return -ENODEV;
+	if (!sfp->hwmon_name) {
+		dev_err(sfp->dev, "out of memory for hwmon name\n");
+		return;
+	}
 
 	for (i = 0; sfp->hwmon_name[i]; i++)
 		if (hwmon_is_bad_char(sfp->hwmon_name[i]))
@@ -1128,18 +1128,52 @@ static int sfp_hwmon_insert(struct sfp *sfp)
 							 sfp->hwmon_name, sfp,
 							 &sfp_hwmon_chip_info,
 							 NULL);
+	if (IS_ERR(sfp->hwmon_dev))
+		dev_err(sfp->dev, "failed to register hwmon device: %ld\n",
+			PTR_ERR(sfp->hwmon_dev));
+}
+
+static int sfp_hwmon_insert(struct sfp *sfp)
+{
+	if (sfp->id.ext.sff8472_compliance == SFP_SFF8472_COMPLIANCE_NONE)
+		return 0;
 
-	return PTR_ERR_OR_ZERO(sfp->hwmon_dev);
+	if (!(sfp->id.ext.diagmon & SFP_DIAGMON_DDM))
+		return 0;
+
+	if (sfp->id.ext.diagmon & SFP_DIAGMON_ADDRMODE)
+		/* This driver in general does not support address
+		 * change.
+		 */
+		return 0;
+
+	mod_delayed_work(system_wq, &sfp->hwmon_probe, 1);
+	sfp->hwmon_tries = R_PROBE_RETRY_SLOW;
+
+	return 0;
 }
 
 static void sfp_hwmon_remove(struct sfp *sfp)
 {
+	cancel_delayed_work_sync(&sfp->hwmon_probe);
 	if (!IS_ERR_OR_NULL(sfp->hwmon_dev)) {
 		hwmon_device_unregister(sfp->hwmon_dev);
 		sfp->hwmon_dev = NULL;
 		kfree(sfp->hwmon_name);
 	}
 }
+
+static int sfp_hwmon_init(struct sfp *sfp)
+{
+	INIT_DELAYED_WORK(&sfp->hwmon_probe, sfp_hwmon_probe);
+
+	return 0;
+}
+
+static void sfp_hwmon_exit(struct sfp *sfp)
+{
+	cancel_delayed_work_sync(&sfp->hwmon_probe);
+}
 #else
 static int sfp_hwmon_insert(struct sfp *sfp)
 {
@@ -1149,6 +1183,15 @@ static int sfp_hwmon_insert(struct sfp *sfp)
 static void sfp_hwmon_remove(struct sfp *sfp)
 {
 }
+
+static int sfp_hwmon_init(struct sfp *sfp)
+{
+	return 0;
+}
+
+static void sfp_hwmon_exit(struct sfp *sfp)
+{
+}
 #endif
 
 /* Helpers */
@@ -1485,10 +1528,6 @@ static int sfp_sm_mod_probe(struct sfp *sfp, bool report)
 	if (ret < 0)
 		return ret;
 
-	ret = sfp_hwmon_insert(sfp);
-	if (ret < 0)
-		return ret;
-
 	return 0;
 }
 
@@ -1637,6 +1676,15 @@ static void sfp_sm_module(struct sfp *sfp, unsigned int event)
 	case SFP_MOD_ERROR:
 		break;
 	}
+
+#if IS_ENABLED(CONFIG_HWMON)
+	if (sfp->sm_mod_state >= SFP_MOD_WAITDEV &&
+	    IS_ERR_OR_NULL(sfp->hwmon_dev)) {
+		err = sfp_hwmon_insert(sfp);
+		if (err)
+			dev_warn(sfp->dev, "hwmon probe failed: %d\n", err);
+	}
+#endif
 }
 
 static void sfp_sm_main(struct sfp *sfp, unsigned int event)
@@ -1938,6 +1986,8 @@ static struct sfp *sfp_alloc(struct device *dev)
 	INIT_DELAYED_WORK(&sfp->poll, sfp_poll);
 	INIT_DELAYED_WORK(&sfp->timeout, sfp_timeout);
 
+	sfp_hwmon_init(sfp);
+
 	return sfp;
 }
 
@@ -1945,6 +1995,8 @@ static void sfp_cleanup(void *data)
 {
 	struct sfp *sfp = data;
 
+	sfp_hwmon_exit(sfp);
+
 	cancel_delayed_work_sync(&sfp->poll);
 	cancel_delayed_work_sync(&sfp->timeout);
 	if (sfp->i2c_mii) {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 00/17] Allow slow to initialise GPON modules to work
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (16 preceding siblings ...)
  2019-11-10 14:07 ` [PATCH net-next 17/17] net: sfp: allow modules with slow diagnostics to probe Russell King
@ 2019-11-10 17:52 ` Andrew Lunn
  2019-11-10 19:47   ` Russell King - ARM Linux admin
  2019-11-12  0:18 ` David Miller
  18 siblings, 1 reply; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 17:52 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:05:30PM +0000, Russell King - ARM Linux admin wrote:
> Some GPON modules take longer than the SFF MSA specified time to
> initialise and respond to transactions on the I2C bus for either
> both 0x50 and 0x51, or 0x51 bus addresses.  Technically these modules
> are non-compliant with the SFP Multi-Source Agreement, they have
> been around for some time, so are difficult to just ignore.

Hi Russell

We are seeing quite a few SFP/SFF which violate the spec. Do you think
there is any value in naming and shaming in the kernel logs SFP which
don't conform to the standard? If you need to wait longer than 1
second for the EEPROM to become readable, print the vendor name from
the EEPROM and warn it is not conforment. If the diagnostic page is
not immediately available, again, print the vendor name warn it is not
conforment?

      Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 01/17] net: sfp: move sfp sub-state machines into separate functions
  2019-11-10 14:06 ` [PATCH net-next 01/17] net: sfp: move sfp sub-state machines into separate functions Russell King
@ 2019-11-10 17:56   ` Andrew Lunn
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 17:56 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:13PM +0000, Russell King wrote:
> Move the SFP sub-state machines out of the main state machine function,
> in preparation for it doing a bit more with the device state.  By doing
> so, we ensure that our debug after the main state machine is always
> printed.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 02/17] net: sfp: move tx disable on device down to main state machine
  2019-11-10 14:06 ` [PATCH net-next 02/17] net: sfp: move tx disable on device down to main state machine Russell King
@ 2019-11-10 18:00   ` Andrew Lunn
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 18:00 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:18PM +0000, Russell King wrote:
> Move the tx disable assertion on device down to the main state
> machine.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 03/17] net: sfp: rename sfp_sm_ins_next() as sfp_sm_mod_next()
  2019-11-10 14:06 ` [PATCH net-next 03/17] net: sfp: rename sfp_sm_ins_next() as sfp_sm_mod_next() Russell King
@ 2019-11-10 18:01   ` Andrew Lunn
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 18:01 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:23PM +0000, Russell King wrote:
> sfp_sm_ins_next() modifies the module state machine.  Change it's name
> to reflect this.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 04/17] net: sfp: handle module remove outside state machine
  2019-11-10 14:06 ` [PATCH net-next 04/17] net: sfp: handle module remove outside state machine Russell King
@ 2019-11-10 18:07   ` Andrew Lunn
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 18:07 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:28PM +0000, Russell King wrote:
> Removing a module resets the module state machine back to its initial
> state.  Rather than explicitly handling this in every state, handle it
> early on outside of the state machine.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 05/17] net: sfp: rename T_PROBE_WAIT to T_SERIAL
  2019-11-10 14:06 ` [PATCH net-next 05/17] net: sfp: rename T_PROBE_WAIT to T_SERIAL Russell King
@ 2019-11-10 18:08   ` Andrew Lunn
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 18:08 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:33PM +0000, Russell King wrote:
> SFF-8472 rev 12.2 defines the time for the serial bus to become ready
> using t_serial.  Use this as our identifier for this timeout to make
> it clear what we are referring to.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 06/17] net: sfp: parse SFP power requirement earlier
  2019-11-10 14:06 ` [PATCH net-next 06/17] net: sfp: parse SFP power requirement earlier Russell King
@ 2019-11-10 18:10   ` Andrew Lunn
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 18:10 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:39PM +0000, Russell King wrote:
> Parse the SFP power requirement earlier, in preparation for moving the
> power level setup code.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 07/17] net: sfp: avoid power switch on address-change modules
  2019-11-10 14:06 ` [PATCH net-next 07/17] net: sfp: avoid power switch on address-change modules Russell King
@ 2019-11-10 18:12   ` Andrew Lunn
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 18:12 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:44PM +0000, Russell King wrote:
> If the module indicates that it requires an address change sequence to
> switch between address 0x50 and 0x51, which we don't support, we can't
> write to the register that controls the power mode to switch to high
> power mode.  Warn the user that the module may not be functional in
> this case, and don't try to change the power mode.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 08/17] net: sfp: control TX_DISABLE and phy only from main state machine
  2019-11-10 14:06 ` [PATCH net-next 08/17] net: sfp: control TX_DISABLE and phy only from main state machine Russell King
@ 2019-11-10 18:14   ` Andrew Lunn
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 18:14 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:49PM +0000, Russell King wrote:
> We initialise TX_DISABLE when the sfp cage is probed, and then
> maintain its state in the main state machine.  However, the module
> state machine:
> - negates it when detecting a newly inserted module when it's already
>   guaranteed to be negated.
> - negates it when the module is removed, but the main state machine
>   will do this anyway.
> 
> Make TX_DISABLE entirely controlled by the main state machine.
> 
> The main state machine also probes the module for a PHY, and removes
> the PHY when the the module is removed.  Hence, removing the PHY in
> sfp_sm_module_remove() is also redundant, and is a left-over from
> when we tried to probe for the PHY from the module state machine.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 09/17] net: sfp: split the PHY probe from sfp_sm_mod_init()
  2019-11-10 14:06 ` [PATCH net-next 09/17] net: sfp: split the PHY probe from sfp_sm_mod_init() Russell King
@ 2019-11-10 18:19   ` Andrew Lunn
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 18:19 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:54PM +0000, Russell King wrote:
> Move the PHY probe into a separate function, splitting it from
> sfp_sm_mod_init().  This will allow us to eliminate the 50ms mdelay()
> inside the state machine.
> 
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>


Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 10/17] net: sfp: eliminate mdelay() from PHY probe
  2019-11-10 14:06 ` [PATCH net-next 10/17] net: sfp: eliminate mdelay() from PHY probe Russell King
@ 2019-11-10 19:37   ` Andrew Lunn
  2019-11-10 19:59     ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 32+ messages in thread
From: Andrew Lunn @ 2019-11-10 19:37 UTC (permalink / raw)
  To: Russell King; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 02:06:59PM +0000, Russell King wrote:
> Rather than using mdelay() to wait before probing the PHY (which holds
> several locks, including the rtnl lock), add an extra wait state to
> the state machine to introduce the 50ms delay without holding any
> locks.

Hi Russell

Is this 50ms part of the standard? Or did you find that empirically?

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 00/17] Allow slow to initialise GPON modules to work
  2019-11-10 17:52 ` [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Andrew Lunn
@ 2019-11-10 19:47   ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 32+ messages in thread
From: Russell King - ARM Linux admin @ 2019-11-10 19:47 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 06:52:17PM +0100, Andrew Lunn wrote:
> On Sun, Nov 10, 2019 at 02:05:30PM +0000, Russell King - ARM Linux admin wrote:
> > Some GPON modules take longer than the SFF MSA specified time to
> > initialise and respond to transactions on the I2C bus for either
> > both 0x50 and 0x51, or 0x51 bus addresses.  Technically these modules
> > are non-compliant with the SFP Multi-Source Agreement, they have
> > been around for some time, so are difficult to just ignore.
> 
> Hi Russell
> 
> We are seeing quite a few SFP/SFF which violate the spec. Do you think
> there is any value in naming and shaming in the kernel logs SFP which
> don't conform to the standard? If you need to wait longer than 1
> second for the EEPROM to become readable, print the vendor name from
> the EEPROM and warn it is not conforment. If the diagnostic page is
> not immediately available, again, print the vendor name warn it is not
> conforment?

I really don't think it will achieve anything.  Once something is
established in the market, it's difficult to get the vendor to change
it.

In some cases, it's not possible to change it without an entire
hardware redesign, and we're not going to achieve that by "naming and
shaming" when most places this will be used is in embedded devices
where hardly anyone looks at the kernel message log.

It is annoying that there are these modules that do not conform, but
we have many instances in the kernel of hardware that doesn't quite
conform, yet we still make it work.

I have another fun case with another module - a copper SFP+ module that
has a Broadcom Clause 45 NBASE-T PHY on it.  On reset, it quickly
becomes accessible via the I2C bus, but it hasn't finished
initialising.  We're soo quick in the kernel that we read the IDs and
bind the PHY driver for it, and attempt to set the advertisements - but
because the PHY hasn't finished initialising, the kernels
advertisements get overwritten by the PHY (presumably EEPROM loading,
or maybe via PHY firmware initialisation.)

I've stumbled over (very annoyingly) the fact that OpenWRT is carrying
a patch for the SFP code to deal with PHYs that need longer to
initialise - but there has been no upstream report of it afaics:

https://github.com/openwrt/openwrt/blob/master/target/linux/mvebu/patches-4.19/450-reprobe_sfp_phy.patch

That may be due to us not quite checking the TX_FAULT line correctly
(which these patches change) - we had assumed that the PHY would
always be available after 50ms, but the spec actually says that
modules are allowed 300ms to startup (even longer, 90s, for cooled
laser optical modules, which I haven't published the patches for.)
End of the startup is signalled by TX_FAULT being deasserted.  However,
the copper PHYs I've seen so far tie TX_FAULT to ground on the module,
and I have no cooled laser modules to test with.  Maybe all copper PHY
modules are non-compliant...

Also, remember that there is nothing in the EEPROM which tells us what
mode the host serdes should operate in - we work that out by best-
guessing today, and so far we're getting away with it.  There are,
however, copper SFP modules that are 1G only that use 1000BASE-X,
where the only difference from their 1G/100M/10M cousins is a different
part number and use SGMII by default.

What I'm basically saying is that relying on the "specification" is
all well and good, but if we implmented the letter of the spec, we
would only allow 1000BASE-X with SFP and 10GBASE-R with SFP+ cages,
and wouldn't have copper SFPs working.  Technically, the majority of
those on the market are non-compliant.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 10/17] net: sfp: eliminate mdelay() from PHY probe
  2019-11-10 19:37   ` Andrew Lunn
@ 2019-11-10 19:59     ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 32+ messages in thread
From: Russell King - ARM Linux admin @ 2019-11-10 19:59 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Florian Fainelli, Heiner Kallweit, David S. Miller, netdev

On Sun, Nov 10, 2019 at 08:37:19PM +0100, Andrew Lunn wrote:
> On Sun, Nov 10, 2019 at 02:06:59PM +0000, Russell King wrote:
> > Rather than using mdelay() to wait before probing the PHY (which holds
> > several locks, including the rtnl lock), add an extra wait state to
> > the state machine to introduce the 50ms delay without holding any
> > locks.
> 
> Hi Russell
> 
> Is this 50ms part of the standard? Or did you find that empirically?

From what I'm aware, there is no standard for copper SFPs.

It's something that I found works with the Source Photonics module I
have, and after feedback via Jon Nettleton from folk using the SolidRun
platforms.

Hence, it's something I'm preserving in the kernel at the moment, as
we know it works for most people so far.  We can increase it within
reason if necessary (see my reply to your 00/17 reply.)

From what I can tell, accessing the PHY on I2C address 0x56 just
happens to work for many of the modules because most SFPs use the
Marvell 88E1111 - but there are modules out there (Mikrotik S-RJ01)
that have an Atheros AR8033 PHY which is inaccessible from the host
(and hence the results of the negotiation can't be fully known.)

However, I have a 10M/100M/1G/10G copper SFP+ which uses a Broadcom
Clause 45 PHY which is also accessible (via a slightly different I2C
protocol from the Clause 22 variety) via address 0x56, and there are
other copper SFP+ modules that I've found online that use the same
protocol - but with a delay between the I2C write and I2C read to
read a Clause 45 register.  So, 0x56 *appears* to be some kind of
informally agreed I2C address for SFP PHYs.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH net-next 00/17] Allow slow to initialise GPON modules to work
  2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
                   ` (17 preceding siblings ...)
  2019-11-10 17:52 ` [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Andrew Lunn
@ 2019-11-12  0:18 ` David Miller
  18 siblings, 0 replies; 32+ messages in thread
From: David Miller @ 2019-11-12  0:18 UTC (permalink / raw)
  To: linux; +Cc: andrew, f.fainelli, hkallweit1, netdev

From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Date: Sun, 10 Nov 2019 14:05:30 +0000

> Some GPON modules take longer than the SFF MSA specified time to
> initialise and respond to transactions on the I2C bus for either
> both 0x50 and 0x51, or 0x51 bus addresses.  Technically these modules
> are non-compliant with the SFP Multi-Source Agreement, they have
> been around for some time, so are difficult to just ignore.
> 
> Most of the patch series is restructuring the code to make it more
> readable, and split various things into separate functions.
> 
> We split the three state machines into three separate functions, and
> re-arrange them to start probing the module as soon as a module has
> been detected (without waiting for the network device.)  We try to
> read the module's EEPROM, retrying quickly for the first second, and
> then once every five seconds for about a minute until we have read
> the EEPROM.  So that the kernel isn't entirely silent, we print a
> message indicating that we're waiting for the module to respond after
> the first second, or when all retries have expired.
> 
> Once the module ID has been read, we kick off a delayed work queue
> which attempts to register the hwmon, retrying for up to a minute if
> the monitoring parameters are unreadable; this allows us to proceed
> with module initialisation independently of the hwmon state.
> 
> With high-power modules, we wait for the netdev to be attached before
> switching the module power mode, and retry this in a similar way to
> before until we have successfully read and written the EEPROM at 0x51.
> 
> We also move the handling of the TX_DISABLE signal entirely to the main
> state machine, and avoid probing any on-board PHY while TX_FAULT is
> set.

Series applied.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2019-11-12  0:18 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-10 14:05 [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Russell King - ARM Linux admin
2019-11-10 14:06 ` [PATCH net-next 01/17] net: sfp: move sfp sub-state machines into separate functions Russell King
2019-11-10 17:56   ` Andrew Lunn
2019-11-10 14:06 ` [PATCH net-next 02/17] net: sfp: move tx disable on device down to main state machine Russell King
2019-11-10 18:00   ` Andrew Lunn
2019-11-10 14:06 ` [PATCH net-next 03/17] net: sfp: rename sfp_sm_ins_next() as sfp_sm_mod_next() Russell King
2019-11-10 18:01   ` Andrew Lunn
2019-11-10 14:06 ` [PATCH net-next 04/17] net: sfp: handle module remove outside state machine Russell King
2019-11-10 18:07   ` Andrew Lunn
2019-11-10 14:06 ` [PATCH net-next 05/17] net: sfp: rename T_PROBE_WAIT to T_SERIAL Russell King
2019-11-10 18:08   ` Andrew Lunn
2019-11-10 14:06 ` [PATCH net-next 06/17] net: sfp: parse SFP power requirement earlier Russell King
2019-11-10 18:10   ` Andrew Lunn
2019-11-10 14:06 ` [PATCH net-next 07/17] net: sfp: avoid power switch on address-change modules Russell King
2019-11-10 18:12   ` Andrew Lunn
2019-11-10 14:06 ` [PATCH net-next 08/17] net: sfp: control TX_DISABLE and phy only from main state machine Russell King
2019-11-10 18:14   ` Andrew Lunn
2019-11-10 14:06 ` [PATCH net-next 09/17] net: sfp: split the PHY probe from sfp_sm_mod_init() Russell King
2019-11-10 18:19   ` Andrew Lunn
2019-11-10 14:06 ` [PATCH net-next 10/17] net: sfp: eliminate mdelay() from PHY probe Russell King
2019-11-10 19:37   ` Andrew Lunn
2019-11-10 19:59     ` Russell King - ARM Linux admin
2019-11-10 14:07 ` [PATCH net-next 11/17] net: sfp: allow fault processing to transition to other states Russell King
2019-11-10 14:07 ` [PATCH net-next 12/17] net: sfp: ensure TX_FAULT has deasserted before probing the PHY Russell King
2019-11-10 14:07 ` [PATCH net-next 13/17] net: sfp: track upstream's attachment state in state machine Russell King
2019-11-10 14:07 ` [PATCH net-next 14/17] net: sfp: split power mode switching from probe Russell King
2019-11-10 14:07 ` [PATCH net-next 15/17] net: sfp: move module insert reporting out of probe Russell King
2019-11-10 14:07 ` [PATCH net-next 16/17] net: sfp: allow sfp to probe slow to initialise GPON modules Russell King
2019-11-10 14:07 ` [PATCH net-next 17/17] net: sfp: allow modules with slow diagnostics to probe Russell King
2019-11-10 17:52 ` [PATCH net-next 00/17] Allow slow to initialise GPON modules to work Andrew Lunn
2019-11-10 19:47   ` Russell King - ARM Linux admin
2019-11-12  0:18 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).