linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 00/11] Make C45 autoprobe more robust
@ 2020-05-22 21:30 Jeremy Linton
  2020-05-22 21:30 ` [RFC 01/11] net: phy: Don't report success if devices weren't found Jeremy Linton
                   ` (10 more replies)
  0 siblings, 11 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

It would be nice if we could depend on the c45 scanner
to identify standards complaint phys or fail cleanly
enough that we can turn around and continue probing the
bus for c22 devices.

In order to pull this off we should be looking at a larger
range of MMD addresses, as well as doing a better job of judging
if a phy appears to be responding correctly. Once that is in
place we then allow a MDIO bus to report that its c45 capable
with a c22 fallback.

So this set is really a heavy RFC, and I have my own set of
questions about it.

First, it seems like its ok to scan reserved parts of the MMD space,
given the existing code is scanning MMD 0. Should we do a better
job of blocking out the reserved areas? Or was this really what
the commit to sanitize the c22 capability was fixing (avoid a
probe at location 0).

Secondly, are there parts of the system that are depending on
"stub" MDIO devices being created?The DT code looks ok, but I
think the existing code path left open a number possibilities
where devices are created without valid IDs. The commit which
cleared the c22 capability registers from the device id list
doesn't make much sense to me except to create bugus devices
by avoiding breaking out of the devices loop early. There were
a couple other cases (all 0 device lists, device lists
reporting devices that respond as 0xFFFFFFFF to the id registers).

Do we want to probe some of the additional package id registers?

Do we want a more "strict" flag for fully compliant MDIO/PHY
combinations or are we ok with extra stub devices? The 3rd to last
set is just using the C45_FIRST flag for it.

What have I missed?

Jeremy Linton (11):
  net: phy: Don't report success if devices weren't found
  net: phy: Simplify MMD device list termination
  net: phy: refactor c45 phy identification sequence
  net: phy: Handle c22 regs presence better
  net: phy: Scan the entire MMD device space
  net: phy: Hoist no phy detected state
  net: phy: reset invalid phy reads of 0 back to 0xffffffff
  net: phy: Allow mdio buses to auto-probe c45 devices
  net: phy: Refuse to consider phy_id=0 a valid phy
  net: example acpize xgmac_mdio
  net: example xgmac enable extended scanning

 drivers/net/ethernet/freescale/xgmac_mdio.c |  28 +++--
 drivers/net/phy/mdio_bus.c                  |   9 +-
 drivers/net/phy/phy_device.c                | 112 ++++++++++++--------
 include/linux/phy.h                         |   7 +-
 4 files changed, 100 insertions(+), 56 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 61+ messages in thread

* [RFC 01/11] net: phy: Don't report success if devices weren't found
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-23 18:20   ` Russell King - ARM Linux admin
  2020-05-22 21:30 ` [RFC 02/11] net: phy: Simplify MMD device list termination Jeremy Linton
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

C45 devices are to return 0 for registers they haven't
implemented. This means in theory we can terminate the
device search loop without finding any MMDs. In that
case we want to immediately return indicating that
nothing was found rather than continuing to probe
and falling into the success state at the bottom.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/phy/phy_device.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index ac2784192472..245899b58a7d 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -742,6 +742,12 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 		}
 	}
 
+	/* no reported devices */
+	if (*devs == 0) {
+		*phy_id = 0xffffffff;
+		return 0;
+	}
+
 	/* Now probe Device Identifiers for each device present. */
 	for (i = 1; i < num_ids; i++) {
 		if (!(c45_ids->devices_in_package & (1 << i)))
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 02/11] net: phy: Simplify MMD device list termination
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
  2020-05-22 21:30 ` [RFC 01/11] net: phy: Don't report success if devices weren't found Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-23 18:36   ` Russell King - ARM Linux admin
  2020-05-22 21:30 ` [RFC 03/11] net: phy: refactor c45 phy identification sequence Jeremy Linton
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

Since we are already checking for *devs == 0 after
the loop terminates, we can add a mostly F's check
as well. With that change we can simplify the return/break
sequence inside the loop.

Add a valid_phy_id() macro for this, since we will be using it
in a couple other places.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/phy/phy_device.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 245899b58a7d..7746c07b97fe 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -695,6 +695,11 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
 	return 0;
 }
 
+static bool valid_phy_id(int val)
+{
+	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
+}
+
 /**
  * get_phy_c45_ids - reads the specified addr for its 802.3-c45 IDs.
  * @bus: the target MII bus
@@ -732,18 +737,12 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 			phy_reg = get_phy_c45_devs_in_pkg(bus, addr, 0, devs);
 			if (phy_reg < 0)
 				return -EIO;
-			/* no device there, let's get out of here */
-			if ((*devs & 0x1fffffff) == 0x1fffffff) {
-				*phy_id = 0xffffffff;
-				return 0;
-			} else {
-				break;
-			}
+			break;
 		}
 	}
 
 	/* no reported devices */
-	if (*devs == 0) {
+	if (!valid_phy_id(*devs)) {
 		*phy_id = 0xffffffff;
 		return 0;
 	}
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 03/11] net: phy: refactor c45 phy identification sequence
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
  2020-05-22 21:30 ` [RFC 01/11] net: phy: Don't report success if devices weren't found Jeremy Linton
  2020-05-22 21:30 ` [RFC 02/11] net: phy: Simplify MMD device list termination Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-23 15:28   ` Andrew Lunn
  2020-05-23 18:30   ` Russell King - ARM Linux admin
  2020-05-22 21:30 ` [RFC 04/11] net: phy: Handle c22 regs presence better Jeremy Linton
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

Lets factor out the phy id logic, and make it generic
so that it can be used for c22 and c45.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/phy/phy_device.c | 65 +++++++++++++++++++-----------------
 1 file changed, 35 insertions(+), 30 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 7746c07b97fe..f0761fa5e40b 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -695,6 +695,29 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
 	return 0;
 }
 
+static int _get_phy_id(struct mii_bus *bus, int addr, int dev_addr,
+		       u32 *phy_id, bool c45)
+{
+	int phy_reg, reg_addr;
+
+	int reg_base = c45 ? (MII_ADDR_C45 | dev_addr << 16) : 0;
+
+	reg_addr =  reg_base | MII_PHYSID1;
+	phy_reg = mdiobus_read(bus, addr, reg_addr);
+	if (phy_reg < 0)
+		return -EIO;
+
+	*phy_id = phy_reg << 16;
+
+	reg_addr = reg_base | MII_PHYSID2;
+	phy_reg = mdiobus_read(bus, addr, reg_addr);
+	if (phy_reg < 0)
+		return -EIO;
+	*phy_id |= phy_reg;
+
+	return 0;
+}
+
 static bool valid_phy_id(int val)
 {
 	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
@@ -715,17 +738,17 @@ static bool valid_phy_id(int val)
  */
 static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 			   struct phy_c45_device_ids *c45_ids) {
-	int phy_reg;
-	int i, reg_addr;
+	int ret;
+	int i;
 	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
 	u32 *devs = &c45_ids->devices_in_package;
 
 	/* Find first non-zero Devices In package. Device zero is reserved
 	 * for 802.3 c45 complied PHYs, so don't probe it at first.
 	 */
-	for (i = 1; i < num_ids && *devs == 0; i++) {
-		phy_reg = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
-		if (phy_reg < 0)
+	for (i = 0; i < num_ids && *devs == 0; i++) {
+		ret = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
+		if (ret < 0)
 			return -EIO;
 
 		if ((*devs & 0x1fffffff) == 0x1fffffff) {
@@ -752,17 +775,9 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 		if (!(c45_ids->devices_in_package & (1 << i)))
 			continue;
 
-		reg_addr = MII_ADDR_C45 | i << 16 | MII_PHYSID1;
-		phy_reg = mdiobus_read(bus, addr, reg_addr);
-		if (phy_reg < 0)
-			return -EIO;
-		c45_ids->device_ids[i] = phy_reg << 16;
-
-		reg_addr = MII_ADDR_C45 | i << 16 | MII_PHYSID2;
-		phy_reg = mdiobus_read(bus, addr, reg_addr);
-		if (phy_reg < 0)
-			return -EIO;
-		c45_ids->device_ids[i] |= phy_reg;
+		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
+		if (ret < 0)
+			return ret;
 	}
 	*phy_id = 0;
 	return 0;
@@ -787,27 +802,17 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 static int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id,
 		      bool is_c45, struct phy_c45_device_ids *c45_ids)
 {
-	int phy_reg;
+	int ret;
 
 	if (is_c45)
 		return get_phy_c45_ids(bus, addr, phy_id, c45_ids);
 
-	/* Grab the bits from PHYIR1, and put them in the upper half */
-	phy_reg = mdiobus_read(bus, addr, MII_PHYSID1);
-	if (phy_reg < 0) {
+	ret = _get_phy_id(bus, addr, 0, phy_id, false);
+	if (ret < 0) {
 		/* returning -ENODEV doesn't stop bus scanning */
-		return (phy_reg == -EIO || phy_reg == -ENODEV) ? -ENODEV : -EIO;
+		return (ret == -EIO || ret == -ENODEV) ? -ENODEV : -EIO;
 	}
 
-	*phy_id = phy_reg << 16;
-
-	/* Grab the bits from PHYIR2, and put them in the lower half */
-	phy_reg = mdiobus_read(bus, addr, MII_PHYSID2);
-	if (phy_reg < 0)
-		return -EIO;
-
-	*phy_id |= phy_reg;
-
 	return 0;
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
                   ` (2 preceding siblings ...)
  2020-05-22 21:30 ` [RFC 03/11] net: phy: refactor c45 phy identification sequence Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-23 18:37   ` Russell King - ARM Linux admin
  2020-05-22 21:30 ` [RFC 05/11] net: phy: Scan the entire MMD device space Jeremy Linton
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

Until this point, we have been sanitizing the c22
regs presence bit out of all the MMD device lists.
This is incorrect as it causes the 0xFFFFFFFF checks
to incorrectly fail. Further, it turns out that we
want to utilize this flag to make a determination that
there is actually a phy at this location and we should
be accessing it using c22.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/phy/phy_device.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index f0761fa5e40b..2d677490ecab 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
 		return -EIO;
 	*devices_in_package |= phy_reg;
 
-	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
-	*devices_in_package &= ~BIT(0);
-
 	return 0;
 }
 
@@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 	int i;
 	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
 	u32 *devs = &c45_ids->devices_in_package;
+	bool c22_present = false;
+	bool valid_id = false;
 
 	/* Find first non-zero Devices In package. Device zero is reserved
 	 * for 802.3 c45 complied PHYs, so don't probe it at first.
@@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 		return 0;
 	}
 
+	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
+	c22_present = *devs & BIT(0);
+	*devs &= ~BIT(0);
+
 	/* Now probe Device Identifiers for each device present. */
 	for (i = 1; i < num_ids; i++) {
 		if (!(c45_ids->devices_in_package & (1 << i)))
@@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
 		if (ret < 0)
 			return ret;
+		if (valid_phy_id(c45_ids->device_ids[i]))
+			valid_id = true;
+	}
+
+	if (!valid_id && c22_present) {
+		*phy_id = 0xffffffff;
+	        return 0;
 	}
 	*phy_id = 0;
 	return 0;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 05/11] net: phy: Scan the entire MMD device space
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
                   ` (3 preceding siblings ...)
  2020-05-22 21:30 ` [RFC 04/11] net: phy: Handle c22 regs presence better Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-22 21:30 ` [RFC 06/11] net: phy: Hoist no phy detected state Jeremy Linton
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

The spec identifies devices in the top of
the 32-bit device space. Some phys are actually
responding that high. Lets try and capture their
information as well.

Starting at the reserved address 0, lets scan
every single possible MMD address. The spec
seems to indicate that every MMD should respond
with the same devices list. But it seems this
is being interpreted that only implemented MMDs
need respond. Since it doesn't appear to hurt to
scan reserved addresses, and the spec says that
access to unimplemented registers should return 0
(despite this some devices appear to be returning
0xFFFFFFFF) we are just going to ignore anything
that doesn't look like a valid return.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/phy/phy_device.c | 8 +++-----
 include/linux/phy.h          | 2 +-
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 2d677490ecab..360c3a72c498 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -743,7 +743,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 	bool valid_id = false;
 
 	/* Find first non-zero Devices In package. Device zero is reserved
-	 * for 802.3 c45 complied PHYs, so don't probe it at first.
+	 * for 802.3 c45 complied PHYs, We will ask it for a devices list,
+	 * but later we won't ask for identification from it.
 	 */
 	for (i = 0; i < num_ids && *devs == 0; i++) {
 		ret = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
@@ -756,10 +757,7 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 			 *  10G PHYs have zero Devices In package,
 			 *  e.g. Cortina CS4315/CS4340 PHY.
 			 */
-			phy_reg = get_phy_c45_devs_in_pkg(bus, addr, 0, devs);
-			if (phy_reg < 0)
-				return -EIO;
-			break;
+			*devs = 0;
 		}
 	}
 
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 2432ca463ddc..480a6b153227 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -346,7 +346,7 @@ enum phy_state {
  */
 struct phy_c45_device_ids {
 	u32 devices_in_package;
-	u32 device_ids[8];
+	u32 device_ids[32];
 };
 
 struct macsec_context;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 06/11] net: phy: Hoist no phy detected state
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
                   ` (4 preceding siblings ...)
  2020-05-22 21:30 ` [RFC 05/11] net: phy: Scan the entire MMD device space Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-22 21:30 ` [RFC 07/11] net: phy: reset invalid phy reads of 0 back to 0xffffffff Jeremy Linton
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

Default initializing the phy_id to "invalid" allows
us to avoid setting it on the error returns.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/phy/phy_device.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 360c3a72c498..b2cd22d6315c 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -742,6 +742,7 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 	bool c22_present = false;
 	bool valid_id = false;
 
+	*phy_id = 0xffffffff;
 	/* Find first non-zero Devices In package. Device zero is reserved
 	 * for 802.3 c45 complied PHYs, We will ask it for a devices list,
 	 * but later we won't ask for identification from it.
@@ -762,10 +763,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 	}
 
 	/* no reported devices */
-	if (!valid_phy_id(*devs)) {
-		*phy_id = 0xffffffff;
+	if (!valid_phy_id(*devs))
 		return 0;
-	}
 
 	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
 	c22_present = *devs & BIT(0);
@@ -783,10 +782,9 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 			valid_id = true;
 	}
 
-	if (!valid_id && c22_present) {
-		*phy_id = 0xffffffff;
+	if (!valid_id && c22_present)
 	        return 0;
-	}
+
 	*phy_id = 0;
 	return 0;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 07/11] net: phy: reset invalid phy reads of 0 back to 0xffffffff
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
                   ` (5 preceding siblings ...)
  2020-05-22 21:30 ` [RFC 06/11] net: phy: Hoist no phy detected state Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-23 18:44   ` Russell King - ARM Linux admin
  2020-05-22 21:30 ` [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices Jeremy Linton
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

MMD's in the device list sometimes return 0 for their id.
When that happens lets reset the id back to 0xfffffff so
that we don't get a stub device created for it.

This is a questionable commit, but i'm tossing it out
there along with the comment that reading the spec
seems to indicate that maybe there are further registers
that could be probed in an attempt to resolve some futher
"bad" phys. It sort of comes down to do we want unused phy
devices floating around (potentially unmatched in dt) or
do we want to cut them off early and let DT create them
directly.

For the ACPI case, we don't really have a good way to match
them, and since it hasn't been working I think its perfectly
reasonable at this point to expect phy's to implement enough
of the standard that we can detect them and attach a phy
specific driver.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/phy/phy_device.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index b2cd22d6315c..acdada865864 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -780,6 +780,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 			return ret;
 		if (valid_phy_id(c45_ids->device_ids[i]))
 			valid_id = true;
+		else
+			c45_ids->device_ids[i] = 0xffffffff;
+
+		/* consider probing PKGID per 45.2.12.2.1? */
 	}
 
 	if (!valid_id && c22_present)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
                   ` (6 preceding siblings ...)
  2020-05-22 21:30 ` [RFC 07/11] net: phy: reset invalid phy reads of 0 back to 0xffffffff Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-24 14:44   ` Andrew Lunn
  2020-05-22 21:30 ` [RFC 09/11] net: phy: Refuse to consider phy_id=0 a valid phy Jeremy Linton
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

The mdiobus_scan logic is currently hardcoded to only
work with c22 devices. This works fairly well in most
cases, but its possible a c45 device doesn't respond
despite being a standard phy. If the parent hardware
is capable, it makes sense to scan for c45 devices before
falling back to c22.

As we want this to reflect the capabilities of the STA,
lets add a field to the mii_bus structure to represent
the capability. That way devices can opt into the extended
scanning. Existing users should continue to default to c22
only scanning as long as they are zero'ing the structure
before use.

At the moment its a yes/no option, but in the future
it may be useful to extend this to c45 only policy, or
add additional classes and policies.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/phy/mdio_bus.c | 9 +++++++--
 include/linux/phy.h        | 5 +++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 7a4eb3f2cb74..3ab618e15f2c 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -732,10 +732,15 @@ EXPORT_SYMBOL(mdiobus_free);
  */
 struct phy_device *mdiobus_scan(struct mii_bus *bus, int addr)
 {
-	struct phy_device *phydev;
+	struct phy_device *phydev = ERR_PTR(-ENODEV);
 	int err;
 
-	phydev = get_phy_device(bus, addr, false);
+	if (bus->probe_capabilities == MDIOBUS_C45_FIRST)
+		phydev = get_phy_device(bus, addr, true);
+
+	if (IS_ERR(phydev))
+		phydev = get_phy_device(bus, addr, false);
+
 	if (IS_ERR(phydev))
 		return phydev;
 
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 480a6b153227..d68e1ebab484 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -275,6 +275,11 @@ struct mii_bus {
 	int reset_delay_us;
 	/* RESET GPIO descriptor pointer */
 	struct gpio_desc *reset_gpiod;
+	/* bus capabilities, used for probing */
+	enum {
+		MDIOBUS_C22_ONLY = 0,
+		MDIOBUS_C45_FIRST,
+	} probe_capabilities;
 };
 #define to_mii_bus(d) container_of(d, struct mii_bus, dev)
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 09/11] net: phy: Refuse to consider phy_id=0 a valid phy
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
                   ` (7 preceding siblings ...)
  2020-05-22 21:30 ` [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-22 21:30 ` [RFC 10/11] net: example acpize xgmac_mdio Jeremy Linton
  2020-05-22 21:30 ` [RFC 11/11] net: example xgmac enable extended scanning Jeremy Linton
  10 siblings, 0 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

This is another one of those questionable commits. In this
case a bus tagged C45_FIRST refuses to create phys
where the phy id is invalid. In general this is probably a
good idea, but might cause problems. Another idea might be to
create an additional flag (MDIOBUS_STRICT_ID?) for this case.

Or we just ignore it and accept that the probe logic as it
stands potentially creates bogus phy devices, to avoid the
case where an actual phy exists but isn't responding correctly.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/phy/phy_device.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index acdada865864..e74f2ef6f12b 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -789,7 +789,9 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
 	if (!valid_id && c22_present)
 	        return 0;
 
-	*phy_id = 0;
+	if (valid_id || bus->probe_capabilities != MDIOBUS_C45_FIRST)
+		*phy_id = 0;
+
 	return 0;
 }
 
@@ -853,6 +855,10 @@ struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45)
 	if ((phy_id & 0x1fffffff) == 0x1fffffff)
 		return ERR_PTR(-ENODEV);
 
+	/* Strict scanning should also ignore phy_id = 0 */
+	if (phy_id == 0 && bus->probe_capabilities == MDIOBUS_C45_FIRST)
+		return ERR_PTR(-ENODEV);
+
 	return phy_device_create(bus, addr, phy_id, is_c45, &c45_ids);
 }
 EXPORT_SYMBOL(get_phy_device);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 10/11] net: example acpize xgmac_mdio
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
                   ` (8 preceding siblings ...)
  2020-05-22 21:30 ` [RFC 09/11] net: phy: Refuse to consider phy_id=0 a valid phy Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  2020-05-23 18:48   ` Russell King - ARM Linux admin
  2020-05-22 21:30 ` [RFC 11/11] net: example xgmac enable extended scanning Jeremy Linton
  10 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/ethernet/freescale/xgmac_mdio.c | 27 +++++++++++++--------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c
index c82c85ef5fb3..96ee3bd89983 100644
--- a/drivers/net/ethernet/freescale/xgmac_mdio.c
+++ b/drivers/net/ethernet/freescale/xgmac_mdio.c
@@ -245,14 +245,14 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
 {
 	struct device_node *np = pdev->dev.of_node;
 	struct mii_bus *bus;
-	struct resource res;
+	struct resource *res;
 	struct mdio_fsl_priv *priv;
 	int ret;
 
-	ret = of_address_to_resource(np, 0, &res);
-	if (ret) {
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
 		dev_err(&pdev->dev, "could not obtain address\n");
-		return ret;
+		return -EINVAL;
 	}
 
 	bus = mdiobus_alloc_size(sizeof(struct mdio_fsl_priv));
@@ -263,21 +263,21 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
 	bus->read = xgmac_mdio_read;
 	bus->write = xgmac_mdio_write;
 	bus->parent = &pdev->dev;
-	snprintf(bus->id, MII_BUS_ID_SIZE, "%llx", (unsigned long long)res.start);
+	snprintf(bus->id, MII_BUS_ID_SIZE, "%llx", (unsigned long long)res->start);
 
 	/* Set the PHY base address */
 	priv = bus->priv;
-	priv->mdio_base = of_iomap(np, 0);
+	priv->mdio_base = devm_platform_ioremap_resource(pdev, 0);
 	if (!priv->mdio_base) {
 		ret = -ENOMEM;
 		goto err_ioremap;
 	}
 
-	priv->is_little_endian = of_property_read_bool(pdev->dev.of_node,
-						       "little-endian");
+	priv->is_little_endian = device_property_read_bool(&pdev->dev,
+							   "little-endian");
 
-	priv->has_a011043 = of_property_read_bool(pdev->dev.of_node,
-						  "fsl,erratum-a011043");
+	priv->has_a011043 = device_property_read_bool(&pdev->dev,
+						      "fsl,erratum-a011043");
 
 	ret = of_mdiobus_register(bus, np);
 	if (ret) {
@@ -320,10 +320,17 @@ static const struct of_device_id xgmac_mdio_match[] = {
 };
 MODULE_DEVICE_TABLE(of, xgmac_mdio_match);
 
+static const struct acpi_device_id xgmac_acpi_match[] = {
+	{ "NXP0006", (kernel_ulong_t)NULL },
+	{ },
+};
+MODULE_DEVICE_TABLE(acpi, xgmac_acpi_match);
+
 static struct platform_driver xgmac_mdio_driver = {
 	.driver = {
 		.name = "fsl-fman_xmdio",
 		.of_match_table = xgmac_mdio_match,
+		.acpi_match_table = xgmac_acpi_match,
 	},
 	.probe = xgmac_mdio_probe,
 	.remove = xgmac_mdio_remove,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [RFC 11/11] net: example xgmac enable extended scanning
  2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
                   ` (9 preceding siblings ...)
  2020-05-22 21:30 ` [RFC 10/11] net: example acpize xgmac_mdio Jeremy Linton
@ 2020-05-22 21:30 ` Jeremy Linton
  10 siblings, 0 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-22 21:30 UTC (permalink / raw)
  To: netdev
  Cc: davem, andrew, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel, Jeremy Linton

Since we know the xgmac hardware always has a c45
complaint bus, lets try scanning for c45 capable
phys first. If we fail to find any, then it with
fall back to c22 automatically.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/net/ethernet/freescale/xgmac_mdio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c
index 96ee3bd89983..1d7313031be6 100644
--- a/drivers/net/ethernet/freescale/xgmac_mdio.c
+++ b/drivers/net/ethernet/freescale/xgmac_mdio.c
@@ -263,6 +263,7 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
 	bus->read = xgmac_mdio_read;
 	bus->write = xgmac_mdio_write;
 	bus->parent = &pdev->dev;
+	bus->probe_capabilities = MDIOBUS_C45_FIRST;
 	snprintf(bus->id, MII_BUS_ID_SIZE, "%llx", (unsigned long long)res->start);
 
 	/* Set the PHY base address */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [RFC 03/11] net: phy: refactor c45 phy identification sequence
  2020-05-22 21:30 ` [RFC 03/11] net: phy: refactor c45 phy identification sequence Jeremy Linton
@ 2020-05-23 15:28   ` Andrew Lunn
  2020-05-23 17:16     ` Jeremy Linton
  2020-05-23 17:32     ` Jeremy Linton
  2020-05-23 18:30   ` Russell King - ARM Linux admin
  1 sibling, 2 replies; 61+ messages in thread
From: Andrew Lunn @ 2020-05-23 15:28 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel

On Fri, May 22, 2020 at 04:30:51PM -0500, Jeremy Linton wrote:
> Lets factor out the phy id logic, and make it generic
> so that it can be used for c22 and c45.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/net/phy/phy_device.c | 65 +++++++++++++++++++-----------------
>  1 file changed, 35 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 7746c07b97fe..f0761fa5e40b 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -695,6 +695,29 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>  	return 0;
>  }
>  
> +static int _get_phy_id(struct mii_bus *bus, int addr, int dev_addr,
> +		       u32 *phy_id, bool c45)

Hi Jeremy

How about read_phy_id() so you can avoid the _ prefix.

>  static bool valid_phy_id(int val)
>  {
>  	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
> @@ -715,17 +738,17 @@ static bool valid_phy_id(int val)
>   */
>  static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>  			   struct phy_c45_device_ids *c45_ids) {
> -	int phy_reg;
> -	int i, reg_addr;
> +	int ret;
> +	int i;
>  	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>  	u32 *devs = &c45_ids->devices_in_package;
>  
>  	/* Find first non-zero Devices In package. Device zero is reserved
>  	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>  	 */
> -	for (i = 1; i < num_ids && *devs == 0; i++) {
> -		phy_reg = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
> -		if (phy_reg < 0)
> +	for (i = 0; i < num_ids && *devs == 0; i++) {
> +		ret = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
> +		if (ret < 0)
>  			return -EIO;

Renaming reg_addr to ret does not belong in this patch.

	 Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 03/11] net: phy: refactor c45 phy identification sequence
  2020-05-23 15:28   ` Andrew Lunn
@ 2020-05-23 17:16     ` Jeremy Linton
  2020-05-23 17:32     ` Jeremy Linton
  1 sibling, 0 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-23 17:16 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, davem, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

Thanks for taking a look at this!

On 5/23/20 10:28 AM, Andrew Lunn wrote:
> On Fri, May 22, 2020 at 04:30:51PM -0500, Jeremy Linton wrote:
>> Lets factor out the phy id logic, and make it generic
>> so that it can be used for c22 and c45.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/net/phy/phy_device.c | 65 +++++++++++++++++++-----------------
>>   1 file changed, 35 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index 7746c07b97fe..f0761fa5e40b 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -695,6 +695,29 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>   	return 0;
>>   }
>>   
>> +static int _get_phy_id(struct mii_bus *bus, int addr, int dev_addr,
>> +		       u32 *phy_id, bool c45)
> 
> Hi Jeremy
> 
> How about read_phy_id() so you can avoid the _ prefix.

Yes, that sounds good.

> 
>>   static bool valid_phy_id(int val)
>>   {
>>   	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
>> @@ -715,17 +738,17 @@ static bool valid_phy_id(int val)
>>    */
>>   static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>   			   struct phy_c45_device_ids *c45_ids) {
>> -	int phy_reg;
>> -	int i, reg_addr;
>> +	int ret;
>> +	int i;
>>   	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>>   	u32 *devs = &c45_ids->devices_in_package;
>>   
>>   	/* Find first non-zero Devices In package. Device zero is reserved
>>   	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>>   	 */
>> -	for (i = 1; i < num_ids && *devs == 0; i++) {
>> -		phy_reg = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
>> -		if (phy_reg < 0)
>> +	for (i = 0; i < num_ids && *devs == 0; i++) {
>> +		ret = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
>> +		if (ret < 0)
>>   			return -EIO;
> 
> Renaming reg_addr to ret does not belong in this patch.

Sure, that makes sense. The rename was a last min change when I was 
shuffling the args around.

Thanks,



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 03/11] net: phy: refactor c45 phy identification sequence
  2020-05-23 15:28   ` Andrew Lunn
  2020-05-23 17:16     ` Jeremy Linton
@ 2020-05-23 17:32     ` Jeremy Linton
  2020-05-23 19:12       ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-23 17:32 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, davem, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/23/20 10:28 AM, Andrew Lunn wrote:
> On Fri, May 22, 2020 at 04:30:51PM -0500, Jeremy Linton wrote:
>> Lets factor out the phy id logic, and make it generic
>> so that it can be used for c22 and c45.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/net/phy/phy_device.c | 65 +++++++++++++++++++-----------------
>>   1 file changed, 35 insertions(+), 30 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index 7746c07b97fe..f0761fa5e40b 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -695,6 +695,29 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>   	return 0;
>>   }
>>   
>> +static int _get_phy_id(struct mii_bus *bus, int addr, int dev_addr,
>> +		       u32 *phy_id, bool c45)
> 
> Hi Jeremy
> 
> How about read_phy_id() so you can avoid the _ prefix.
> 
>>   static bool valid_phy_id(int val)
>>   {
>>   	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
>> @@ -715,17 +738,17 @@ static bool valid_phy_id(int val)
>>    */
>>   static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>   			   struct phy_c45_device_ids *c45_ids) {
>> -	int phy_reg;
>> -	int i, reg_addr;
>> +	int ret;
>> +	int i;
>>   	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>>   	u32 *devs = &c45_ids->devices_in_package;
>>   
>>   	/* Find first non-zero Devices In package. Device zero is reserved
>>   	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>>   	 */
>> -	for (i = 1; i < num_ids && *devs == 0; i++) {
>> -		phy_reg = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
>> -		if (phy_reg < 0)
>> +	for (i = 0; i < num_ids && *devs == 0; i++) {
>> +		ret = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
>> +		if (ret < 0)
>>   			return -EIO;
> 
> Renaming reg_addr to ret does not belong in this patch.
> 

Looks like I changed the loop index in this patch while shuffling things 
around yesterday too. The "for (i = 1/0.." change belongs in 5/11 as well.




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 01/11] net: phy: Don't report success if devices weren't found
  2020-05-22 21:30 ` [RFC 01/11] net: phy: Don't report success if devices weren't found Jeremy Linton
@ 2020-05-23 18:20   ` Russell King - ARM Linux admin
  2020-05-25  2:46     ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-23 18:20 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Fri, May 22, 2020 at 04:30:49PM -0500, Jeremy Linton wrote:
> C45 devices are to return 0 for registers they haven't
> implemented. This means in theory we can terminate the
> device search loop without finding any MMDs. In that
> case we want to immediately return indicating that
> nothing was found rather than continuing to probe
> and falling into the success state at the bottom.

This is a little confusing when you read this comment:

                        /*  If mostly Fs, there is no device there,
                         *  then let's continue to probe more, as some
                         *  10G PHYs have zero Devices In package,
                         *  e.g. Cortina CS4315/CS4340 PHY.
                         */

Since it appears to be talking about the case of a PHY where *devs will
be zero.  However, tracking down the original submission, it seems this
is not the case at all, and the comment is grossly misleading.

It seems these PHYs only report a valid data in the Devices In Package
registers for devad=0 - it has nothing to do with a zero Devices In
Package.

Can I suggest that this comment is fixed while we're changing the code
to explicitly reject this "zero Devices In package" so that it's not
confusing?

Thanks.

> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/net/phy/phy_device.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index ac2784192472..245899b58a7d 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -742,6 +742,12 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>  		}
>  	}
>  
> +	/* no reported devices */
> +	if (*devs == 0) {
> +		*phy_id = 0xffffffff;
> +		return 0;
> +	}
> +
>  	/* Now probe Device Identifiers for each device present. */
>  	for (i = 1; i < num_ids; i++) {
>  		if (!(c45_ids->devices_in_package & (1 << i)))
> -- 
> 2.26.2
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 03/11] net: phy: refactor c45 phy identification sequence
  2020-05-22 21:30 ` [RFC 03/11] net: phy: refactor c45 phy identification sequence Jeremy Linton
  2020-05-23 15:28   ` Andrew Lunn
@ 2020-05-23 18:30   ` Russell King - ARM Linux admin
  2020-05-23 19:51     ` Andrew Lunn
  1 sibling, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-23 18:30 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Fri, May 22, 2020 at 04:30:51PM -0500, Jeremy Linton wrote:
> Lets factor out the phy id logic, and make it generic
> so that it can be used for c22 and c45.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/net/phy/phy_device.c | 65 +++++++++++++++++++-----------------
>  1 file changed, 35 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 7746c07b97fe..f0761fa5e40b 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -695,6 +695,29 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>  	return 0;
>  }
>  
> +static int _get_phy_id(struct mii_bus *bus, int addr, int dev_addr,
> +		       u32 *phy_id, bool c45)
> +{
> +	int phy_reg, reg_addr;
> +
> +	int reg_base = c45 ? (MII_ADDR_C45 | dev_addr << 16) : 0;

	int phy_reg, reg_addr, reg_base;

	reg_base = c45 ? (MII_ADDR_C45 | dev_addr << 16) : 0;

I think would be preferable.

> +
> +	reg_addr =  reg_base | MII_PHYSID1;
> +	phy_reg = mdiobus_read(bus, addr, reg_addr);
> +	if (phy_reg < 0)
> +		return -EIO;
> +
> +	*phy_id = phy_reg << 16;
> +
> +	reg_addr = reg_base | MII_PHYSID2;
> +	phy_reg = mdiobus_read(bus, addr, reg_addr);
> +	if (phy_reg < 0)
> +		return -EIO;
> +	*phy_id |= phy_reg;

The line spacing seems to be a little inconsistent between the above two
register reads.

> +
> +	return 0;
> +}

So, _get_phy_id() returns either zero or -EIO.

> +
>  static bool valid_phy_id(int val)
>  {
>  	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
> @@ -715,17 +738,17 @@ static bool valid_phy_id(int val)
>   */
>  static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>  			   struct phy_c45_device_ids *c45_ids) {
> -	int phy_reg;
> -	int i, reg_addr;
> +	int ret;
> +	int i;
>  	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>  	u32 *devs = &c45_ids->devices_in_package;

I feel a "reverse christmas tree" complaint brewing... yes, the original
code didn't follow it.  Maybe a tidy up while touching this code?

>  
>  	/* Find first non-zero Devices In package. Device zero is reserved
>  	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>  	 */
> -	for (i = 1; i < num_ids && *devs == 0; i++) {
> -		phy_reg = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
> -		if (phy_reg < 0)
> +	for (i = 0; i < num_ids && *devs == 0; i++) {
> +		ret = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
> +		if (ret < 0)
>  			return -EIO;
>  
>  		if ((*devs & 0x1fffffff) == 0x1fffffff) {
> @@ -752,17 +775,9 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>  		if (!(c45_ids->devices_in_package & (1 << i)))
>  			continue;
>  
> -		reg_addr = MII_ADDR_C45 | i << 16 | MII_PHYSID1;
> -		phy_reg = mdiobus_read(bus, addr, reg_addr);
> -		if (phy_reg < 0)
> -			return -EIO;
> -		c45_ids->device_ids[i] = phy_reg << 16;
> -
> -		reg_addr = MII_ADDR_C45 | i << 16 | MII_PHYSID2;
> -		phy_reg = mdiobus_read(bus, addr, reg_addr);
> -		if (phy_reg < 0)
> -			return -EIO;
> -		c45_ids->device_ids[i] |= phy_reg;
> +		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
> +		if (ret < 0)
> +			return ret;

So here we can only propagate the -EIO, so this keeps the semantics.

>  	}
>  	*phy_id = 0;
>  	return 0;
> @@ -787,27 +802,17 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>  static int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id,
>  		      bool is_c45, struct phy_c45_device_ids *c45_ids)
>  {
> -	int phy_reg;
> +	int ret;
>  
>  	if (is_c45)
>  		return get_phy_c45_ids(bus, addr, phy_id, c45_ids);
>  
> -	/* Grab the bits from PHYIR1, and put them in the upper half */
> -	phy_reg = mdiobus_read(bus, addr, MII_PHYSID1);
> -	if (phy_reg < 0) {
> +	ret = _get_phy_id(bus, addr, 0, phy_id, false);
> +	if (ret < 0) {
>  		/* returning -ENODEV doesn't stop bus scanning */
> -		return (phy_reg == -EIO || phy_reg == -ENODEV) ? -ENODEV : -EIO;
> +		return (ret == -EIO || ret == -ENODEV) ? -ENODEV : -EIO;

Since ret will only ever be -EIO here, this can only ever return
-ENODEV, which is a functional change in the code (probably unintended.)
Nevertheless, it's likely introducing a bug if the intention is for
some other return from mdiobus_read() to be handled differently.

>  	}
>  
> -	*phy_id = phy_reg << 16;
> -
> -	/* Grab the bits from PHYIR2, and put them in the lower half */
> -	phy_reg = mdiobus_read(bus, addr, MII_PHYSID2);
> -	if (phy_reg < 0)
> -		return -EIO;

... whereas this one always returns -EIO on any error.

So, I think you have the potential in this patch to introduce a subtle
change of behaviour, which may lead to problems - have you closely
analysed why the code was the way it was, and whether your change of
behaviour is actually valid?

> -
> -	*phy_id |= phy_reg;
> -
>  	return 0;
>  }
>  
> -- 
> 2.26.2
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 02/11] net: phy: Simplify MMD device list termination
  2020-05-22 21:30 ` [RFC 02/11] net: phy: Simplify MMD device list termination Jeremy Linton
@ 2020-05-23 18:36   ` Russell King - ARM Linux admin
  2020-05-25  2:48     ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-23 18:36 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Fri, May 22, 2020 at 04:30:50PM -0500, Jeremy Linton wrote:
> Since we are already checking for *devs == 0 after
> the loop terminates, we can add a mostly F's check
> as well. With that change we can simplify the return/break
> sequence inside the loop.
> 
> Add a valid_phy_id() macro for this, since we will be using it
> in a couple other places.

I'm not sure you have the name of this correct, and your usage layer
in your patch series is correct.

> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/net/phy/phy_device.c | 15 +++++++--------
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 245899b58a7d..7746c07b97fe 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -695,6 +695,11 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>  	return 0;
>  }
>  
> +static bool valid_phy_id(int val)
> +{
> +	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
> +}
> +
>  /**
>   * get_phy_c45_ids - reads the specified addr for its 802.3-c45 IDs.
>   * @bus: the target MII bus
> @@ -732,18 +737,12 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>  			phy_reg = get_phy_c45_devs_in_pkg(bus, addr, 0, devs);
>  			if (phy_reg < 0)
>  				return -EIO;
> -			/* no device there, let's get out of here */
> -			if ((*devs & 0x1fffffff) == 0x1fffffff) {
> -				*phy_id = 0xffffffff;
> -				return 0;
> -			} else {
> -				break;
> -			}
> +			break;
>  		}
>  	}
>  
>  	/* no reported devices */
> -	if (*devs == 0) {
> +	if (!valid_phy_id(*devs)) {

You are using this to validate the "devices in package" value, not the
PHY ID value.  So, IMHO this should be called "valid_devs_in_package()"
or similar.

>  		*phy_id = 0xffffffff;
>  		return 0;
>  	}
> -- 
> 2.26.2
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-22 21:30 ` [RFC 04/11] net: phy: Handle c22 regs presence better Jeremy Linton
@ 2020-05-23 18:37   ` Russell King - ARM Linux admin
  2020-05-25  3:34     ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-23 18:37 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
> Until this point, we have been sanitizing the c22
> regs presence bit out of all the MMD device lists.
> This is incorrect as it causes the 0xFFFFFFFF checks
> to incorrectly fail. Further, it turns out that we
> want to utilize this flag to make a determination that
> there is actually a phy at this location and we should
> be accessing it using c22.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/net/phy/phy_device.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index f0761fa5e40b..2d677490ecab 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>  		return -EIO;
>  	*devices_in_package |= phy_reg;
>  
> -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> -	*devices_in_package &= ~BIT(0);
> -
>  	return 0;
>  }
>  
> @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>  	int i;
>  	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>  	u32 *devs = &c45_ids->devices_in_package;
> +	bool c22_present = false;
> +	bool valid_id = false;
>  
>  	/* Find first non-zero Devices In package. Device zero is reserved
>  	 * for 802.3 c45 complied PHYs, so don't probe it at first.
> @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>  		return 0;
>  	}
>  
> +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> +	c22_present = *devs & BIT(0);
> +	*devs &= ~BIT(0);
> +
>  	/* Now probe Device Identifiers for each device present. */
>  	for (i = 1; i < num_ids; i++) {
>  		if (!(c45_ids->devices_in_package & (1 << i)))
> @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>  		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
>  		if (ret < 0)
>  			return ret;
> +		if (valid_phy_id(c45_ids->device_ids[i]))
> +			valid_id = true;

Here you are using your "devices in package" validator to validate the
PHY ID value.  One of the things it does is mask this value with
0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
looks completely wrong.

> +	}
> +
> +	if (!valid_id && c22_present) {
> +		*phy_id = 0xffffffff;
> +	        return 0;
>  	}
>  	*phy_id = 0;
>  	return 0;
> -- 
> 2.26.2
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 07/11] net: phy: reset invalid phy reads of 0 back to 0xffffffff
  2020-05-22 21:30 ` [RFC 07/11] net: phy: reset invalid phy reads of 0 back to 0xffffffff Jeremy Linton
@ 2020-05-23 18:44   ` Russell King - ARM Linux admin
  2020-05-25  4:20     ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-23 18:44 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Fri, May 22, 2020 at 04:30:55PM -0500, Jeremy Linton wrote:
> MMD's in the device list sometimes return 0 for their id.
> When that happens lets reset the id back to 0xfffffff so
> that we don't get a stub device created for it.
> 
> This is a questionable commit, but i'm tossing it out
> there along with the comment that reading the spec
> seems to indicate that maybe there are further registers
> that could be probed in an attempt to resolve some futher
> "bad" phys. It sort of comes down to do we want unused phy
> devices floating around (potentially unmatched in dt) or
> do we want to cut them off early and let DT create them
> directly.

I'm not sure what you mean "stub device" or "unused phy devices
floating around" - the individual MMDs are not treated as separate
"phy devices", but as one PHY device as a whole.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 10/11] net: example acpize xgmac_mdio
  2020-05-22 21:30 ` [RFC 10/11] net: example acpize xgmac_mdio Jeremy Linton
@ 2020-05-23 18:48   ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-23 18:48 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Fri, May 22, 2020 at 04:30:58PM -0500, Jeremy Linton wrote:
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/net/ethernet/freescale/xgmac_mdio.c | 27 +++++++++++++--------
>  1 file changed, 17 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/xgmac_mdio.c b/drivers/net/ethernet/freescale/xgmac_mdio.c
> index c82c85ef5fb3..96ee3bd89983 100644
> --- a/drivers/net/ethernet/freescale/xgmac_mdio.c
> +++ b/drivers/net/ethernet/freescale/xgmac_mdio.c
> @@ -245,14 +245,14 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
>  {
>  	struct device_node *np = pdev->dev.of_node;
>  	struct mii_bus *bus;
> -	struct resource res;
> +	struct resource *res;
>  	struct mdio_fsl_priv *priv;
>  	int ret;
>  
> -	ret = of_address_to_resource(np, 0, &res);
> -	if (ret) {
> +	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +	if (!res) {
>  		dev_err(&pdev->dev, "could not obtain address\n");
> -		return ret;
> +		return -EINVAL;
>  	}
>  
>  	bus = mdiobus_alloc_size(sizeof(struct mdio_fsl_priv));
> @@ -263,21 +263,21 @@ static int xgmac_mdio_probe(struct platform_device *pdev)
>  	bus->read = xgmac_mdio_read;
>  	bus->write = xgmac_mdio_write;
>  	bus->parent = &pdev->dev;
> -	snprintf(bus->id, MII_BUS_ID_SIZE, "%llx", (unsigned long long)res.start);
> +	snprintf(bus->id, MII_BUS_ID_SIZE, "%llx", (unsigned long long)res->start);
>  
>  	/* Set the PHY base address */
>  	priv = bus->priv;
> -	priv->mdio_base = of_iomap(np, 0);
> +	priv->mdio_base = devm_platform_ioremap_resource(pdev, 0);
>  	if (!priv->mdio_base) {

I think you need to pay greater attention to the return value of
functions - this is one such case, where
devm_platform_ioremap_resource() does not return NULL on failure.
It uses devm_ioremap_resource(), which is documented in lib/devres.c
to return an error-pointer on failure.

>  		ret = -ENOMEM;
>  		goto err_ioremap;
>  	}
>  
> -	priv->is_little_endian = of_property_read_bool(pdev->dev.of_node,
> -						       "little-endian");
> +	priv->is_little_endian = device_property_read_bool(&pdev->dev,
> +							   "little-endian");
>  
> -	priv->has_a011043 = of_property_read_bool(pdev->dev.of_node,
> -						  "fsl,erratum-a011043");
> +	priv->has_a011043 = device_property_read_bool(&pdev->dev,
> +						      "fsl,erratum-a011043");
>  
>  	ret = of_mdiobus_register(bus, np);
>  	if (ret) {
> @@ -320,10 +320,17 @@ static const struct of_device_id xgmac_mdio_match[] = {
>  };
>  MODULE_DEVICE_TABLE(of, xgmac_mdio_match);
>  
> +static const struct acpi_device_id xgmac_acpi_match[] = {
> +	{ "NXP0006", (kernel_ulong_t)NULL },
> +	{ },
> +};
> +MODULE_DEVICE_TABLE(acpi, xgmac_acpi_match);
> +
>  static struct platform_driver xgmac_mdio_driver = {
>  	.driver = {
>  		.name = "fsl-fman_xmdio",
>  		.of_match_table = xgmac_mdio_match,
> +		.acpi_match_table = xgmac_acpi_match,
>  	},
>  	.probe = xgmac_mdio_probe,
>  	.remove = xgmac_mdio_remove,
> -- 
> 2.26.2
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 03/11] net: phy: refactor c45 phy identification sequence
  2020-05-23 17:32     ` Jeremy Linton
@ 2020-05-23 19:12       ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-23 19:12 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Andrew Lunn, netdev, davem, f.fainelli, hkallweit1,
	madalin.bucur, calvin.johnson, linux-kernel

On Sat, May 23, 2020 at 12:32:59PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/23/20 10:28 AM, Andrew Lunn wrote:
> > On Fri, May 22, 2020 at 04:30:51PM -0500, Jeremy Linton wrote:
> > > Lets factor out the phy id logic, and make it generic
> > > so that it can be used for c22 and c45.
> > > 
> > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > ---
> > >   drivers/net/phy/phy_device.c | 65 +++++++++++++++++++-----------------
> > >   1 file changed, 35 insertions(+), 30 deletions(-)
> > > 
> > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > > index 7746c07b97fe..f0761fa5e40b 100644
> > > --- a/drivers/net/phy/phy_device.c
> > > +++ b/drivers/net/phy/phy_device.c
> > > @@ -695,6 +695,29 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
> > >   	return 0;
> > >   }
> > > +static int _get_phy_id(struct mii_bus *bus, int addr, int dev_addr,
> > > +		       u32 *phy_id, bool c45)
> > 
> > Hi Jeremy
> > 
> > How about read_phy_id() so you can avoid the _ prefix.
> > 
> > >   static bool valid_phy_id(int val)
> > >   {
> > >   	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
> > > @@ -715,17 +738,17 @@ static bool valid_phy_id(int val)
> > >    */
> > >   static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > >   			   struct phy_c45_device_ids *c45_ids) {
> > > -	int phy_reg;
> > > -	int i, reg_addr;
> > > +	int ret;
> > > +	int i;
> > >   	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
> > >   	u32 *devs = &c45_ids->devices_in_package;
> > >   	/* Find first non-zero Devices In package. Device zero is reserved
> > >   	 * for 802.3 c45 complied PHYs, so don't probe it at first.
> > >   	 */
> > > -	for (i = 1; i < num_ids && *devs == 0; i++) {
> > > -		phy_reg = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
> > > -		if (phy_reg < 0)
> > > +	for (i = 0; i < num_ids && *devs == 0; i++) {
> > > +		ret = get_phy_c45_devs_in_pkg(bus, addr, i, devs);
> > > +		if (ret < 0)
> > >   			return -EIO;
> > 
> > Renaming reg_addr to ret does not belong in this patch.
> > 
> 
> Looks like I changed the loop index in this patch while shuffling things
> around yesterday too. The "for (i = 1/0.." change belongs in 5/11 as well.

Indeed; that's a change of behaviour - see the CS4315/CS4340 workaround.

Note that MMD 0 is explicitly marked "Reserved" in 802.3, so we
shouldn't be probing it in this way.

Also note that bit 0 of the "devices in package" does not mean that
MMD 0 is accessible; it means that the clause 22 registers are
present:

  "Bit 5.0 is used to indicate that Clause 22 functionality has been
  implemented within a Clause 45 electrical interface device."

which basically means Clause 22 protocol over Clause 45 electrical
interface.

So, we should be careful about poking in MMD 0 if 5.0 is set.

Some places that will break are:

- phylink_phy_read() / phylink_phy_write()

- phy_restart_aneg() / phy_config_aneg() if we have an clause 45
  MDIO interface that can't issue clause 22 MDIO cycles with a PHY
  that sets 5.0.

These comments apply more to your patch 4, but you brought up the
MMD 0 accesses in this patch, so I thought I'd provide it here.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 03/11] net: phy: refactor c45 phy identification sequence
  2020-05-23 18:30   ` Russell King - ARM Linux admin
@ 2020-05-23 19:51     ` Andrew Lunn
  2020-05-23 20:01       ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 61+ messages in thread
From: Andrew Lunn @ 2020-05-23 19:51 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Jeremy Linton, netdev, davem, f.fainelli, hkallweit1,
	madalin.bucur, calvin.johnson, linux-kernel

> >  static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> >  			   struct phy_c45_device_ids *c45_ids) {
> > -	int phy_reg;
> > -	int i, reg_addr;
> > +	int ret;
> > +	int i;
> >  	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
> >  	u32 *devs = &c45_ids->devices_in_package;
> 
> I feel a "reverse christmas tree" complaint brewing... yes, the original
> code didn't follow it.  Maybe a tidy up while touching this code?

At minimum, a patch should not make it worse. ret and i should clearly
be after devs.

> >  static int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id,
> >  		      bool is_c45, struct phy_c45_device_ids *c45_ids)
> >  {
> > -	int phy_reg;
> > +	int ret;
> >  
> >  	if (is_c45)
> >  		return get_phy_c45_ids(bus, addr, phy_id, c45_ids);
> >  
> > -	/* Grab the bits from PHYIR1, and put them in the upper half */
> > -	phy_reg = mdiobus_read(bus, addr, MII_PHYSID1);
> > -	if (phy_reg < 0) {
> > +	ret = _get_phy_id(bus, addr, 0, phy_id, false);
> > +	if (ret < 0) {
> >  		/* returning -ENODEV doesn't stop bus scanning */
> > -		return (phy_reg == -EIO || phy_reg == -ENODEV) ? -ENODEV : -EIO;
> > +		return (ret == -EIO || ret == -ENODEV) ? -ENODEV : -EIO;
> 
> Since ret will only ever be -EIO here, this can only ever return
> -ENODEV, which is a functional change in the code (probably unintended.)
> Nevertheless, it's likely introducing a bug if the intention is for
> some other return from mdiobus_read() to be handled differently.
> 
> >  	}
> >  
> > -	*phy_id = phy_reg << 16;
> > -
> > -	/* Grab the bits from PHYIR2, and put them in the lower half */
> > -	phy_reg = mdiobus_read(bus, addr, MII_PHYSID2);
> > -	if (phy_reg < 0)
> > -		return -EIO;
> 
> ... whereas this one always returns -EIO on any error.
> 
> So, I think you have the potential in this patch to introduce a subtle
> change of behaviour, which may lead to problems - have you closely
> analysed why the code was the way it was, and whether your change of
> behaviour is actually valid?

I could be remembering this wrongly, but i think this is to do with
orion_mdio_xsmi_read() returning -ENODEV, not 0xffffffffff, if there
is no device on the bus at the given address. -EIO is fatal to the
scan, everything stops with the assumption the bus is broken. -ENODEV
should not be fatal to the scan.

   Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 03/11] net: phy: refactor c45 phy identification sequence
  2020-05-23 19:51     ` Andrew Lunn
@ 2020-05-23 20:01       ` Russell King - ARM Linux admin
  2020-05-25  2:37         ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-23 20:01 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Jeremy Linton, netdev, davem, f.fainelli, hkallweit1,
	madalin.bucur, calvin.johnson, linux-kernel

On Sat, May 23, 2020 at 09:51:31PM +0200, Andrew Lunn wrote:
> > >  static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > >  			   struct phy_c45_device_ids *c45_ids) {
> > > -	int phy_reg;
> > > -	int i, reg_addr;
> > > +	int ret;
> > > +	int i;
> > >  	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
> > >  	u32 *devs = &c45_ids->devices_in_package;
> > 
> > I feel a "reverse christmas tree" complaint brewing... yes, the original
> > code didn't follow it.  Maybe a tidy up while touching this code?
> 
> At minimum, a patch should not make it worse. ret and i should clearly
> be after devs.
> 
> > >  static int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id,
> > >  		      bool is_c45, struct phy_c45_device_ids *c45_ids)
> > >  {
> > > -	int phy_reg;
> > > +	int ret;
> > >  
> > >  	if (is_c45)
> > >  		return get_phy_c45_ids(bus, addr, phy_id, c45_ids);
> > >  
> > > -	/* Grab the bits from PHYIR1, and put them in the upper half */
> > > -	phy_reg = mdiobus_read(bus, addr, MII_PHYSID1);
> > > -	if (phy_reg < 0) {
> > > +	ret = _get_phy_id(bus, addr, 0, phy_id, false);
> > > +	if (ret < 0) {
> > >  		/* returning -ENODEV doesn't stop bus scanning */
> > > -		return (phy_reg == -EIO || phy_reg == -ENODEV) ? -ENODEV : -EIO;
> > > +		return (ret == -EIO || ret == -ENODEV) ? -ENODEV : -EIO;
> > 
> > Since ret will only ever be -EIO here, this can only ever return
> > -ENODEV, which is a functional change in the code (probably unintended.)
> > Nevertheless, it's likely introducing a bug if the intention is for
> > some other return from mdiobus_read() to be handled differently.
> > 
> > >  	}
> > >  
> > > -	*phy_id = phy_reg << 16;
> > > -
> > > -	/* Grab the bits from PHYIR2, and put them in the lower half */
> > > -	phy_reg = mdiobus_read(bus, addr, MII_PHYSID2);
> > > -	if (phy_reg < 0)
> > > -		return -EIO;
> > 
> > ... whereas this one always returns -EIO on any error.
> > 
> > So, I think you have the potential in this patch to introduce a subtle
> > change of behaviour, which may lead to problems - have you closely
> > analysed why the code was the way it was, and whether your change of
> > behaviour is actually valid?
> 
> I could be remembering this wrongly, but i think this is to do with
> orion_mdio_xsmi_read() returning -ENODEV, not 0xffffffffff, if there
> is no device on the bus at the given address. -EIO is fatal to the
> scan, everything stops with the assumption the bus is broken. -ENODEV
> should not be fatal to the scan.

Maybe orion_mdio_xsmi_read() should be fixed then?  Also, maybe
adding return code documentation for mdiobus_read() / mdiobus_write()
would help MDIO driver authors have some consistency in what
errors they are expected to return (does anyone know for certain?)

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices
  2020-05-22 21:30 ` [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices Jeremy Linton
@ 2020-05-24 14:44   ` Andrew Lunn
  2020-05-25  4:28     ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Andrew Lunn @ 2020-05-24 14:44 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel

> +++ b/include/linux/phy.h
> @@ -275,6 +275,11 @@ struct mii_bus {
>  	int reset_delay_us;
>  	/* RESET GPIO descriptor pointer */
>  	struct gpio_desc *reset_gpiod;
> +	/* bus capabilities, used for probing */
> +	enum {
> +		MDIOBUS_C22_ONLY = 0,
> +		MDIOBUS_C45_FIRST,
> +	} probe_capabilities;
>  };


I'm not so keen on _FIRST. It suggest _LAST would also be valid.  But
that then suggests this is not a bus property, but a PHY property, and
some PHYs might need _FIRST and other phys need _LAST, and then you
have a bus which has both sorts of PHY on it, and you have a problem.

So i think it would be better to have

	enum {
		MDIOBUS_UNKNOWN = 0,
		MDIOBUS_C22,
		MDIOBUS_C45,
		MDIOBUS_C45_C22,
	} bus_capabilities;

Describe just what the bus master can support.

	 Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 03/11] net: phy: refactor c45 phy identification sequence
  2020-05-23 20:01       ` Russell King - ARM Linux admin
@ 2020-05-25  2:37         ` Jeremy Linton
  0 siblings, 0 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25  2:37 UTC (permalink / raw)
  To: Russell King - ARM Linux admin, Andrew Lunn
  Cc: netdev, davem, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/23/20 3:01 PM, Russell King - ARM Linux admin wrote:
> On Sat, May 23, 2020 at 09:51:31PM +0200, Andrew Lunn wrote:
>>>>   static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>   			   struct phy_c45_device_ids *c45_ids) {
>>>> -	int phy_reg;
>>>> -	int i, reg_addr;
>>>> +	int ret;
>>>> +	int i;
>>>>   	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>>>>   	u32 *devs = &c45_ids->devices_in_package;
>>>
>>> I feel a "reverse christmas tree" complaint brewing... yes, the original
>>> code didn't follow it.  Maybe a tidy up while touching this code?
>>
>> At minimum, a patch should not make it worse. ret and i should clearly
>> be after devs.
>>
>>>>   static int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>   		      bool is_c45, struct phy_c45_device_ids *c45_ids)
>>>>   {
>>>> -	int phy_reg;
>>>> +	int ret;
>>>>   
>>>>   	if (is_c45)
>>>>   		return get_phy_c45_ids(bus, addr, phy_id, c45_ids);
>>>>   
>>>> -	/* Grab the bits from PHYIR1, and put them in the upper half */
>>>> -	phy_reg = mdiobus_read(bus, addr, MII_PHYSID1);
>>>> -	if (phy_reg < 0) {
>>>> +	ret = _get_phy_id(bus, addr, 0, phy_id, false);
>>>> +	if (ret < 0) {
>>>>   		/* returning -ENODEV doesn't stop bus scanning */
>>>> -		return (phy_reg == -EIO || phy_reg == -ENODEV) ? -ENODEV : -EIO;
>>>> +		return (ret == -EIO || ret == -ENODEV) ? -ENODEV : -EIO;
>>>
>>> Since ret will only ever be -EIO here, this can only ever return
>>> -ENODEV, which is a functional change in the code (probably unintended.)
>>> Nevertheless, it's likely introducing a bug if the intention is for
>>> some other return from mdiobus_read() to be handled differently.
>>>
>>>>   	}
>>>>   
>>>> -	*phy_id = phy_reg << 16;
>>>> -
>>>> -	/* Grab the bits from PHYIR2, and put them in the lower half */
>>>> -	phy_reg = mdiobus_read(bus, addr, MII_PHYSID2);
>>>> -	if (phy_reg < 0)
>>>> -		return -EIO;
>>>
>>> ... whereas this one always returns -EIO on any error.
>>>
>>> So, I think you have the potential in this patch to introduce a subtle
>>> change of behaviour, which may lead to problems - have you closely
>>> analysed why the code was the way it was, and whether your change of
>>> behaviour is actually valid?
>>
>> I could be remembering this wrongly, but i think this is to do with
>> orion_mdio_xsmi_read() returning -ENODEV, not 0xffffffffff, if there
>> is no device on the bus at the given address. -EIO is fatal to the
>> scan, everything stops with the assumption the bus is broken. -ENODEV
>> should not be fatal to the scan.
> 
> Maybe orion_mdio_xsmi_read() should be fixed then?  Also, maybe
> adding return code documentation for mdiobus_read() / mdiobus_write()
> would help MDIO driver authors have some consistency in what
> errors they are expected to return (does anyone know for certain?)
> 

My understanding at this point (which is mostly based on the xgmac 
here), is that 0xffffffff is equivalent to "bus responding correctly, 
phy failed to respond at this register location" while any -Eerror is 
understood as "something wrong with bus", and the mdio core then makes a 
choice about terminating just the current phy search (ENODEV), or 
terminating the entire mdio bus (basically everything else) registration.

I will see about clarifying the docs. I need to see if that will end up 
being a bit of a rabbit hole before committing to including that in this 
set.

Which brings up the problem that at least xgmac_mdio doesn't appear to 
handle being told "your bus registration failed" without OOPSing the 
probe routine. I think Calvin is aware of this, and I believe he has 
some additional xgmac/etc patches on top of this set. Although he pinged 
me offline the other day to say that apparently all my hunk shuffling 
broke some of the c45 phy detection I had working earlier in the week.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 01/11] net: phy: Don't report success if devices weren't found
  2020-05-23 18:20   ` Russell King - ARM Linux admin
@ 2020-05-25  2:46     ` Jeremy Linton
  2020-05-25  9:45       ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25  2:46 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

Thanks for taking a look at this.

On 5/23/20 1:20 PM, Russell King - ARM Linux admin wrote:
> On Fri, May 22, 2020 at 04:30:49PM -0500, Jeremy Linton wrote:
>> C45 devices are to return 0 for registers they haven't
>> implemented. This means in theory we can terminate the
>> device search loop without finding any MMDs. In that
>> case we want to immediately return indicating that
>> nothing was found rather than continuing to probe
>> and falling into the success state at the bottom.
> 
> This is a little confusing when you read this comment:
> 
>                          /*  If mostly Fs, there is no device there,
>                           *  then let's continue to probe more, as some
>                           *  10G PHYs have zero Devices In package,
>                           *  e.g. Cortina CS4315/CS4340 PHY.
>                           */
> 
> Since it appears to be talking about the case of a PHY where *devs will
> be zero.  However, tracking down the original submission, it seems this
> is not the case at all, and the comment is grossly misleading.
> 
> It seems these PHYs only report a valid data in the Devices In Package
> registers for devad=0 - it has nothing to do with a zero Devices In
> Package.

Yes, this ended up being my understanding of this commit, and is part of 
my justification for starting the devices search at the reserved address 
0 rather than 1.

> 
> Can I suggest that this comment is fixed while we're changing the code
> to explicitly reject this "zero Devices In package" so that it's not
> confusing?

Its probably better to kill it in 5/11 with a mention that we are 
starting at a reserved address?

OTOH, I'm a bit concerned that reading at 0 as the first address will 
cause problems because the original code was only triggering it after a 
read returning 0xFFFFFFFF at a valid MMD address. It does 
simplify/clarify the loop though. If it weren't for this 0 read, I would 
have tried to avoid some of the additional MMD reserved addresses.


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 02/11] net: phy: Simplify MMD device list termination
  2020-05-23 18:36   ` Russell King - ARM Linux admin
@ 2020-05-25  2:48     ` Jeremy Linton
  2020-05-25  8:09       ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25  2:48 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/23/20 1:36 PM, Russell King - ARM Linux admin wrote:
> On Fri, May 22, 2020 at 04:30:50PM -0500, Jeremy Linton wrote:
>> Since we are already checking for *devs == 0 after
>> the loop terminates, we can add a mostly F's check
>> as well. With that change we can simplify the return/break
>> sequence inside the loop.
>>
>> Add a valid_phy_id() macro for this, since we will be using it
>> in a couple other places.
> 
> I'm not sure you have the name of this correct, and your usage layer
> in your patch series is correct.

Or the name is poor..

> 
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/net/phy/phy_device.c | 15 +++++++--------
>>   1 file changed, 7 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index 245899b58a7d..7746c07b97fe 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -695,6 +695,11 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>   	return 0;
>>   }
>>   
>> +static bool valid_phy_id(int val)
>> +{
>> +	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
>> +}
>> +
>>   /**
>>    * get_phy_c45_ids - reads the specified addr for its 802.3-c45 IDs.
>>    * @bus: the target MII bus
>> @@ -732,18 +737,12 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>   			phy_reg = get_phy_c45_devs_in_pkg(bus, addr, 0, devs);
>>   			if (phy_reg < 0)
>>   				return -EIO;
>> -			/* no device there, let's get out of here */
>> -			if ((*devs & 0x1fffffff) == 0x1fffffff) {
>> -				*phy_id = 0xffffffff;
>> -				return 0;
>> -			} else {
>> -				break;
>> -			}
>> +			break;
>>   		}
>>   	}
>>   
>>   	/* no reported devices */
>> -	if (*devs == 0) {
>> +	if (!valid_phy_id(*devs)) {
> 
> You are using this to validate the "devices in package" value, not the
> PHY ID value.  So, IMHO this should be called "valid_devs_in_package()"
> or similar.

Hmmm, its more "valid_phy_reg()" since it ends up being used to validate 
both the devs in package as well as phy id.



> 
>>   		*phy_id = 0xffffffff;
>>   		return 0;
>>   	}
>> -- 
>> 2.26.2
>>
>>
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-23 18:37   ` Russell King - ARM Linux admin
@ 2020-05-25  3:34     ` Jeremy Linton
  2020-05-25  9:53       ` Russell King - ARM Linux admin
  2020-05-25 10:06       ` Russell King - ARM Linux admin
  0 siblings, 2 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25  3:34 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
> On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
>> Until this point, we have been sanitizing the c22
>> regs presence bit out of all the MMD device lists.
>> This is incorrect as it causes the 0xFFFFFFFF checks
>> to incorrectly fail. Further, it turns out that we
>> want to utilize this flag to make a determination that
>> there is actually a phy at this location and we should
>> be accessing it using c22.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/net/phy/phy_device.c | 16 +++++++++++++---
>>   1 file changed, 13 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index f0761fa5e40b..2d677490ecab 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>   		return -EIO;
>>   	*devices_in_package |= phy_reg;
>>   
>> -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>> -	*devices_in_package &= ~BIT(0);
>> -
>>   	return 0;
>>   }
>>   
>> @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>   	int i;
>>   	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>>   	u32 *devs = &c45_ids->devices_in_package;
>> +	bool c22_present = false;
>> +	bool valid_id = false;
>>   
>>   	/* Find first non-zero Devices In package. Device zero is reserved
>>   	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>> @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>   		return 0;
>>   	}
>>   
>> +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>> +	c22_present = *devs & BIT(0);
>> +	*devs &= ~BIT(0);
>> +
>>   	/* Now probe Device Identifiers for each device present. */
>>   	for (i = 1; i < num_ids; i++) {
>>   		if (!(c45_ids->devices_in_package & (1 << i)))
>> @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>   		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
>>   		if (ret < 0)
>>   			return ret;
>> +		if (valid_phy_id(c45_ids->device_ids[i]))
>> +			valid_id = true;
> 
> Here you are using your "devices in package" validator to validate the
> PHY ID value.  One of the things it does is mask this value with
> 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
> looks completely wrong.

I think in this case I was just using it like the comment in 
get_phy_device() "if the phy_id is mostly F's, there is no device here".

My understanding is that the code is trying to avoid the 0xFFFFFFFF 
returns that seem to indicate "bus ok, phy didn't respond".

I just checked the OUI registration, and while there are a couple OUI's 
registered that have a number of FFF's in them, none of those cases 
seems to overlap sufficiently to cause this to throw them out. Plus a 
phy would also have to have model+revision set to 'F's. So while might 
be possible, if unlikely, at the moment I think the OUI registration 
keeps this from being a problem. Particularly, if i'm reading the 
mapping correctly, the OUI mapping guarantees that the field cannot be 
all '1's due to the OUI having X & M bits cleared. It sort of looks like 
the mapping is trying to lose those bits, by tossing bit 1 & 2, but the 
X & M are in the wrong octet (AFAIK, I just read it three times cause it 
didn't make any sense).



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 07/11] net: phy: reset invalid phy reads of 0 back to 0xffffffff
  2020-05-23 18:44   ` Russell King - ARM Linux admin
@ 2020-05-25  4:20     ` Jeremy Linton
  2020-05-25  8:20       ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25  4:20 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/23/20 1:44 PM, Russell King - ARM Linux admin wrote:
> On Fri, May 22, 2020 at 04:30:55PM -0500, Jeremy Linton wrote:
>> MMD's in the device list sometimes return 0 for their id.
>> When that happens lets reset the id back to 0xfffffff so
>> that we don't get a stub device created for it.
>>
>> This is a questionable commit, but i'm tossing it out
>> there along with the comment that reading the spec
>> seems to indicate that maybe there are further registers
>> that could be probed in an attempt to resolve some futher
>> "bad" phys. It sort of comes down to do we want unused phy
>> devices floating around (potentially unmatched in dt) or
>> do we want to cut them off early and let DT create them
>> directly.
> 
> I'm not sure what you mean "stub device" or "unused phy devices
> floating around" - the individual MMDs are not treated as separate
> "phy devices", but as one PHY device as a whole.
> 

Well, I guess its clearer to say phy/mmd devices with a phy_id=0. Which 
is a problem if we don't have DT overriding the phy_id for a given 
address. Although AFAIK given a couple of the /sys/bus/mdio_bus/devices 
lists I've seen, and after studying this code for a while now, I think 
"bogus" phy's might be getting created*. I was far to easy, to upset the 
cart when I was hacking on this set, and end up with a directory chuck 
full of phys.

So this gets close to one of the questions I asked in the cover letter. 
This patch and 09/11 serve to cut off possibly valid phy's which are 
failing to identify themselves using the standard registers. Which per 
the 802.3 spec there is a blurb about 0 in the id registers for some 
cases. Its not really a critical problem for ACPI machines to have these 
phys around (OTOH, there might be issues with c22 phys on c45 electrical 
buses that respond to c45 reg requests but don't set the c22 regs flag, 
I haven't seen that yet.). I considered dropping this patch, and 9/11 
was a last minute addition. I kept it because I was worried all those 
extra "reserved" MMDs would end up with id = 0's in there and break 
something.

* In places where there isn't actually a phy, likely a large part of the 
problem was clearing the c22 bit, which allowed 0xFFFFFFFF returns to 
slip through the devices list.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices
  2020-05-24 14:44   ` Andrew Lunn
@ 2020-05-25  4:28     ` Jeremy Linton
  2020-05-25  8:25       ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25  4:28 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, davem, f.fainelli, hkallweit1, linux, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/24/20 9:44 AM, Andrew Lunn wrote:
>> +++ b/include/linux/phy.h
>> @@ -275,6 +275,11 @@ struct mii_bus {
>>   	int reset_delay_us;
>>   	/* RESET GPIO descriptor pointer */
>>   	struct gpio_desc *reset_gpiod;
>> +	/* bus capabilities, used for probing */
>> +	enum {
>> +		MDIOBUS_C22_ONLY = 0,
>> +		MDIOBUS_C45_FIRST,
>> +	} probe_capabilities;
>>   };
> 
> 
> I'm not so keen on _FIRST. It suggest _LAST would also be valid.  But
> that then suggests this is not a bus property, but a PHY property, and
> some PHYs might need _FIRST and other phys need _LAST, and then you
> have a bus which has both sorts of PHY on it, and you have a problem.
> 
> So i think it would be better to have
> 
> 	enum {
> 		MDIOBUS_UNKNOWN = 0,
> 		MDIOBUS_C22,
> 		MDIOBUS_C45,
> 		MDIOBUS_C45_C22,
> 	} bus_capabilities;
> 
> Describe just what the bus master can support.

Yes, the naming is reasonable and I will update it in the next patch. I 
went around a bit myself with this naming early on, and the problem I 
saw was that a C45 capable master, can have C45 electrical phy's that 
only respond to c22 requests (AFAIK). So the MDIOBUS_C45 (I think I was 
calling it C45_ONLY) is an invalid selection. Not, that it wouldn't be 
helpful to have a C45_ONLY case, but that the assumption is that you 
wouldn't try and probe c22 registers, which I thought was a mistake.


Thanks,


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 02/11] net: phy: Simplify MMD device list termination
  2020-05-25  2:48     ` Jeremy Linton
@ 2020-05-25  8:09       ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25  8:09 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Sun, May 24, 2020 at 09:48:55PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/23/20 1:36 PM, Russell King - ARM Linux admin wrote:
> > On Fri, May 22, 2020 at 04:30:50PM -0500, Jeremy Linton wrote:
> > > Since we are already checking for *devs == 0 after
> > > the loop terminates, we can add a mostly F's check
> > > as well. With that change we can simplify the return/break
> > > sequence inside the loop.
> > > 
> > > Add a valid_phy_id() macro for this, since we will be using it
> > > in a couple other places.
> > 
> > I'm not sure you have the name of this correct, and your usage layer
> > in your patch series is correct.
> 
> Or the name is poor..
> 
> > 
> > > 
> > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > ---
> > >   drivers/net/phy/phy_device.c | 15 +++++++--------
> > >   1 file changed, 7 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > > index 245899b58a7d..7746c07b97fe 100644
> > > --- a/drivers/net/phy/phy_device.c
> > > +++ b/drivers/net/phy/phy_device.c
> > > @@ -695,6 +695,11 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
> > >   	return 0;
> > >   }
> > > +static bool valid_phy_id(int val)
> > > +{
> > > +	return (val > 0 && ((val & 0x1fffffff) != 0x1fffffff));
> > > +}
> > > +
> > >   /**
> > >    * get_phy_c45_ids - reads the specified addr for its 802.3-c45 IDs.
> > >    * @bus: the target MII bus
> > > @@ -732,18 +737,12 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > >   			phy_reg = get_phy_c45_devs_in_pkg(bus, addr, 0, devs);
> > >   			if (phy_reg < 0)
> > >   				return -EIO;
> > > -			/* no device there, let's get out of here */
> > > -			if ((*devs & 0x1fffffff) == 0x1fffffff) {
> > > -				*phy_id = 0xffffffff;
> > > -				return 0;
> > > -			} else {
> > > -				break;
> > > -			}
> > > +			break;
> > >   		}
> > >   	}
> > >   	/* no reported devices */
> > > -	if (*devs == 0) {
> > > +	if (!valid_phy_id(*devs)) {
> > 
> > You are using this to validate the "devices in package" value, not the
> > PHY ID value.  So, IMHO this should be called "valid_devs_in_package()"
> > or similar.
> 
> Hmmm, its more "valid_phy_reg()" since it ends up being used to validate
> both the devs in package as well as phy id.

I don't think that is a valid use of the code you've put in
valid_phy_id().

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 07/11] net: phy: reset invalid phy reads of 0 back to 0xffffffff
  2020-05-25  4:20     ` Jeremy Linton
@ 2020-05-25  8:20       ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25  8:20 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Sun, May 24, 2020 at 11:20:01PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/23/20 1:44 PM, Russell King - ARM Linux admin wrote:
> > On Fri, May 22, 2020 at 04:30:55PM -0500, Jeremy Linton wrote:
> > > MMD's in the device list sometimes return 0 for their id.
> > > When that happens lets reset the id back to 0xfffffff so
> > > that we don't get a stub device created for it.
> > > 
> > > This is a questionable commit, but i'm tossing it out
> > > there along with the comment that reading the spec
> > > seems to indicate that maybe there are further registers
> > > that could be probed in an attempt to resolve some futher
> > > "bad" phys. It sort of comes down to do we want unused phy
> > > devices floating around (potentially unmatched in dt) or
> > > do we want to cut them off early and let DT create them
> > > directly.
> > 
> > I'm not sure what you mean "stub device" or "unused phy devices
> > floating around" - the individual MMDs are not treated as separate
> > "phy devices", but as one PHY device as a whole.
> > 
> 
> Well, I guess its clearer to say phy/mmd devices with a phy_id=0. Which is a
> problem if we don't have DT overriding the phy_id for a given address.
> Although AFAIK given a couple of the /sys/bus/mdio_bus/devices lists I've
> seen, and after studying this code for a while now, I think "bogus" phy's
> might be getting created*. I was far to easy, to upset the cart when I was
> hacking on this set, and end up with a directory chuck full of phys.
> 
> So this gets close to one of the questions I asked in the cover letter. This
> patch and 09/11 serve to cut off possibly valid phy's which are failing to
> identify themselves using the standard registers. Which per the 802.3 spec
> there is a blurb about 0 in the id registers for some cases. Its not really
> a critical problem for ACPI machines to have these phys around (OTOH, there
> might be issues with c22 phys on c45 electrical buses that respond to c45
> reg requests but don't set the c22 regs flag, I haven't seen that yet.).

If you have a classical clause 22 PHY on a clause 45 bus, it isn't
going to respond to clause 45 cycles, so it isn't going to respond to
a request to read the devices-in-package register, so there is no
"c22 regs" flag.

> I
> considered dropping this patch, and 9/11 was a last minute addition. I kept
> it because I was worried all those extra "reserved" MMDs would end up with
> id = 0's in there and break something.
> 
> * In places where there isn't actually a phy, likely a large part of the
> problem was clearing the c22 bit, which allowed 0xFFFFFFFF returns to slip
> through the devices list.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices
  2020-05-25  4:28     ` Jeremy Linton
@ 2020-05-25  8:25       ` Russell King - ARM Linux admin
  2020-05-25 13:43         ` Andrew Lunn
  2020-05-25 22:09         ` Jeremy Linton
  0 siblings, 2 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25  8:25 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Andrew Lunn, netdev, davem, f.fainelli, hkallweit1,
	madalin.bucur, calvin.johnson, linux-kernel

On Sun, May 24, 2020 at 11:28:52PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/24/20 9:44 AM, Andrew Lunn wrote:
> > > +++ b/include/linux/phy.h
> > > @@ -275,6 +275,11 @@ struct mii_bus {
> > >   	int reset_delay_us;
> > >   	/* RESET GPIO descriptor pointer */
> > >   	struct gpio_desc *reset_gpiod;
> > > +	/* bus capabilities, used for probing */
> > > +	enum {
> > > +		MDIOBUS_C22_ONLY = 0,
> > > +		MDIOBUS_C45_FIRST,
> > > +	} probe_capabilities;
> > >   };
> > 
> > 
> > I'm not so keen on _FIRST. It suggest _LAST would also be valid.  But
> > that then suggests this is not a bus property, but a PHY property, and
> > some PHYs might need _FIRST and other phys need _LAST, and then you
> > have a bus which has both sorts of PHY on it, and you have a problem.
> > 
> > So i think it would be better to have
> > 
> > 	enum {
> > 		MDIOBUS_UNKNOWN = 0,
> > 		MDIOBUS_C22,
> > 		MDIOBUS_C45,
> > 		MDIOBUS_C45_C22,
> > 	} bus_capabilities;
> > 
> > Describe just what the bus master can support.
> 
> Yes, the naming is reasonable and I will update it in the next patch. I went
> around a bit myself with this naming early on, and the problem I saw was
> that a C45 capable master, can have C45 electrical phy's that only respond
> to c22 requests (AFAIK).

If you have a master that can only generate clause 45 cycles, and
someone is daft enough to connect a clause 22 only PHY to it, the
result is hardware that doesn't work - there's no getting around
that.  The MDIO interface can't generate the appropriate cycles to
access the clause 22 PHY.  So, this is not something we need care
about.

> So the MDIOBUS_C45 (I think I was calling it
> C45_ONLY) is an invalid selection. Not, that it wouldn't be helpful to have
> a C45_ONLY case, but that the assumption is that you wouldn't try and probe
> c22 registers, which I thought was a mistake.

MDIOBUS_C45 means "I can generate clause 45 cycles".
MDIOBUS_C22 means "I can generate clause 22 cycles".
MDIOBUS_C45_C22 means "I can generate both clause 45 and clause 22
cycles."

Notice carefully the values these end up with - MDIOBUS_C22 = BIT(0),
MDIOBUS_C45 = BIT(1), MDIOBUS_C45_C22 = BIT(0) | BIT(1).  I suspect
that was no coincidence in Andrew's suggestion.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 01/11] net: phy: Don't report success if devices weren't found
  2020-05-25  2:46     ` Jeremy Linton
@ 2020-05-25  9:45       ` Russell King - ARM Linux admin
  2020-05-25 21:02         ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25  9:45 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Sun, May 24, 2020 at 09:46:55PM -0500, Jeremy Linton wrote:
> Hi,
> 
> Thanks for taking a look at this.
> 
> On 5/23/20 1:20 PM, Russell King - ARM Linux admin wrote:
> > On Fri, May 22, 2020 at 04:30:49PM -0500, Jeremy Linton wrote:
> > > C45 devices are to return 0 for registers they haven't
> > > implemented. This means in theory we can terminate the
> > > device search loop without finding any MMDs. In that
> > > case we want to immediately return indicating that
> > > nothing was found rather than continuing to probe
> > > and falling into the success state at the bottom.
> > 
> > This is a little confusing when you read this comment:
> > 
> >                          /*  If mostly Fs, there is no device there,
> >                           *  then let's continue to probe more, as some
> >                           *  10G PHYs have zero Devices In package,
> >                           *  e.g. Cortina CS4315/CS4340 PHY.
> >                           */
> > 
> > Since it appears to be talking about the case of a PHY where *devs will
> > be zero.  However, tracking down the original submission, it seems this
> > is not the case at all, and the comment is grossly misleading.
> > 
> > It seems these PHYs only report a valid data in the Devices In Package
> > registers for devad=0 - it has nothing to do with a zero Devices In
> > Package.
> 
> Yes, this ended up being my understanding of this commit, and is part of my
> justification for starting the devices search at the reserved address 0
> rather than 1.
> 
> > 
> > Can I suggest that this comment is fixed while we're changing the code
> > to explicitly reject this "zero Devices In package" so that it's not
> > confusing?
> 
> Its probably better to kill it in 5/11 with a mention that we are starting
> at a reserved address?
> 
> OTOH, I'm a bit concerned that reading at 0 as the first address will cause
> problems because the original code was only triggering it after a read
> returning 0xFFFFFFFF at a valid MMD address. It does simplify/clarify the
> loop though. If it weren't for this 0 read, I would have tried to avoid some
> of the additional MMD reserved addresses.

Yes, that is the worry, and as MMD 0 is marked as reserved, I don't
think we should routinely probe it.

As I've already mentioned, note that bit 0 of devices-in-package does
not mean that there is a MMD 0 implemented, even if bit 0 is set.  Bit
0 means that the clause 22 register set is available through clause 22
cycles.  So, simplifying the loop to start at 0 and removing the work-
around could end up breaking Cortina PHYs if they don't set bit 0 in
their devices in package - but I don't see why we should depend on bit 0
being set for their workaround.

So, I think you're going to have to add a work-around to ignore bit 0,
which brings up the question whether this is worth it or not.

Hence, I think this is a "simplifcation" too far.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25  3:34     ` Jeremy Linton
@ 2020-05-25  9:53       ` Russell King - ARM Linux admin
  2020-05-25 10:06       ` Russell King - ARM Linux admin
  1 sibling, 0 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25  9:53 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
> > On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
> > > Until this point, we have been sanitizing the c22
> > > regs presence bit out of all the MMD device lists.
> > > This is incorrect as it causes the 0xFFFFFFFF checks
> > > to incorrectly fail. Further, it turns out that we
> > > want to utilize this flag to make a determination that
> > > there is actually a phy at this location and we should
> > > be accessing it using c22.
> > > 
> > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > ---
> > >   drivers/net/phy/phy_device.c | 16 +++++++++++++---
> > >   1 file changed, 13 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > > index f0761fa5e40b..2d677490ecab 100644
> > > --- a/drivers/net/phy/phy_device.c
> > > +++ b/drivers/net/phy/phy_device.c
> > > @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
> > >   		return -EIO;
> > >   	*devices_in_package |= phy_reg;
> > > -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > -	*devices_in_package &= ~BIT(0);
> > > -
> > >   	return 0;
> > >   }
> > > @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > >   	int i;
> > >   	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
> > >   	u32 *devs = &c45_ids->devices_in_package;
> > > +	bool c22_present = false;
> > > +	bool valid_id = false;
> > >   	/* Find first non-zero Devices In package. Device zero is reserved
> > >   	 * for 802.3 c45 complied PHYs, so don't probe it at first.
> > > @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > >   		return 0;
> > >   	}
> > > +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > +	c22_present = *devs & BIT(0);
> > > +	*devs &= ~BIT(0);
> > > +
> > >   	/* Now probe Device Identifiers for each device present. */
> > >   	for (i = 1; i < num_ids; i++) {
> > >   		if (!(c45_ids->devices_in_package & (1 << i)))
> > > @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > >   		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
> > >   		if (ret < 0)
> > >   			return ret;
> > > +		if (valid_phy_id(c45_ids->device_ids[i]))
> > > +			valid_id = true;
> > 
> > Here you are using your "devices in package" validator to validate the
> > PHY ID value.  One of the things it does is mask this value with
> > 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
> > looks completely wrong.
> 
> I think in this case I was just using it like the comment in
> get_phy_device() "if the phy_id is mostly F's, there is no device here".

Yes, that is certainly an interesting comment.  What's so magic about
this 0x1fffffff?  If it's about the time taken for the bus to rise
to logic 1 when not being actively driven by a PHY, then it actually
makes little sense, because we perform two transations to read each half
of the field, and both should have the same behaviour.  If this was the
issue, we should be masking and testing against 0x1fff1fff rather than
0x1fffffff.

> I just checked the OUI registration, and while there are a couple OUI's
> registered that have a number of FFF's in them, none of those cases seems to
> overlap sufficiently to cause this to throw them out. Plus a phy would also
> have to have model+revision set to 'F's. So while might be possible, if
> unlikely, at the moment I think the OUI registration keeps this from being a
> problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
> guarantees that the field cannot be all '1's due to the OUI having X & M
> bits cleared. It sort of looks like the mapping is trying to lose those
> bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
> just read it three times cause it didn't make any sense).

The most-bits-set OUI that is currently allocated is 5C-FF-FF.  This
would result in a register value of 0x73fffc00 to 0x73ffffff, so as
you say, it should be safe.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25  3:34     ` Jeremy Linton
  2020-05-25  9:53       ` Russell King - ARM Linux admin
@ 2020-05-25 10:06       ` Russell King - ARM Linux admin
  2020-05-25 21:51         ` Jeremy Linton
  1 sibling, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25 10:06 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
> > On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
> > > Until this point, we have been sanitizing the c22
> > > regs presence bit out of all the MMD device lists.
> > > This is incorrect as it causes the 0xFFFFFFFF checks
> > > to incorrectly fail. Further, it turns out that we
> > > want to utilize this flag to make a determination that
> > > there is actually a phy at this location and we should
> > > be accessing it using c22.
> > > 
> > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > ---
> > >   drivers/net/phy/phy_device.c | 16 +++++++++++++---
> > >   1 file changed, 13 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > > index f0761fa5e40b..2d677490ecab 100644
> > > --- a/drivers/net/phy/phy_device.c
> > > +++ b/drivers/net/phy/phy_device.c
> > > @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
> > >   		return -EIO;
> > >   	*devices_in_package |= phy_reg;
> > > -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > -	*devices_in_package &= ~BIT(0);
> > > -
> > >   	return 0;
> > >   }
> > > @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > >   	int i;
> > >   	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
> > >   	u32 *devs = &c45_ids->devices_in_package;
> > > +	bool c22_present = false;
> > > +	bool valid_id = false;
> > >   	/* Find first non-zero Devices In package. Device zero is reserved
> > >   	 * for 802.3 c45 complied PHYs, so don't probe it at first.
> > > @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > >   		return 0;
> > >   	}
> > > +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > +	c22_present = *devs & BIT(0);
> > > +	*devs &= ~BIT(0);
> > > +
> > >   	/* Now probe Device Identifiers for each device present. */
> > >   	for (i = 1; i < num_ids; i++) {
> > >   		if (!(c45_ids->devices_in_package & (1 << i)))
> > > @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > >   		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
> > >   		if (ret < 0)
> > >   			return ret;
> > > +		if (valid_phy_id(c45_ids->device_ids[i]))
> > > +			valid_id = true;
> > 
> > Here you are using your "devices in package" validator to validate the
> > PHY ID value.  One of the things it does is mask this value with
> > 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
> > looks completely wrong.
> 
> I think in this case I was just using it like the comment in
> get_phy_device() "if the phy_id is mostly F's, there is no device here".
> 
> My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
> that seem to indicate "bus ok, phy didn't respond".
> 
> I just checked the OUI registration, and while there are a couple OUI's
> registered that have a number of FFF's in them, none of those cases seems to
> overlap sufficiently to cause this to throw them out. Plus a phy would also
> have to have model+revision set to 'F's. So while might be possible, if
> unlikely, at the moment I think the OUI registration keeps this from being a
> problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
> guarantees that the field cannot be all '1's due to the OUI having X & M
> bits cleared. It sort of looks like the mapping is trying to lose those
> bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
> just read it three times cause it didn't make any sense).

I should also note that we have at least one supported PHY where one
of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
odd numbered registers in one of the vendor MMDs for addresses 0
through 0xefff - which has a bit set in the devices-in-package.

It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
devices-in-package bit is clear in most of the valid MMDs, so we
shouldn't touch it.

These reveal the problem of randomly probing MMDs - they can return
unexpected values and not be as well behaved as we would like them to
be.  Using register 8 to detect presence may be beneficial, but that
may also introduce problems as we haven't used that before (and we
don't know whether any PHY that wrong.)  I know at least the 88x3310
gets it right for all except the vendor MMDs, where the low addresses
appear non-confromant to the 802.3 specs.  Both vendor MMDs are
definitely implemented, just not with anything conforming to 802.3.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices
  2020-05-25  8:25       ` Russell King - ARM Linux admin
@ 2020-05-25 13:43         ` Andrew Lunn
  2020-05-25 22:09         ` Jeremy Linton
  1 sibling, 0 replies; 61+ messages in thread
From: Andrew Lunn @ 2020-05-25 13:43 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Jeremy Linton, netdev, davem, f.fainelli, hkallweit1,
	madalin.bucur, calvin.johnson, linux-kernel

> > > So i think it would be better to have
> > > 
> > > 	enum {
> > > 		MDIOBUS_UNKNOWN = 0,
> > > 		MDIOBUS_C22,
> > > 		MDIOBUS_C45,
> > > 		MDIOBUS_C45_C22,
> > > 	} bus_capabilities;
> > > 
> > > Describe just what the bus master can support.
> > 
> > Yes, the naming is reasonable and I will update it in the next patch. I went
> > around a bit myself with this naming early on, and the problem I saw was
> > that a C45 capable master, can have C45 electrical phy's that only respond
> > to c22 requests (AFAIK).
> 
> If you have a master that can only generate clause 45 cycles, and
> someone is daft enough to connect a clause 22 only PHY to it, the
> result is hardware that doesn't work - there's no getting around
> that.  The MDIO interface can't generate the appropriate cycles to
> access the clause 22 PHY.  So, this is not something we need care
> about.
> 
> > So the MDIOBUS_C45 (I think I was calling it
> > C45_ONLY) is an invalid selection. Not, that it wouldn't be helpful to have
> > a C45_ONLY case, but that the assumption is that you wouldn't try and probe
> > c22 registers, which I thought was a mistake.
> 
> MDIOBUS_C45 means "I can generate clause 45 cycles".
> MDIOBUS_C22 means "I can generate clause 22 cycles".
> MDIOBUS_C45_C22 means "I can generate both clause 45 and clause 22
> cycles."
> 
> Notice carefully the values these end up with - MDIOBUS_C22 = BIT(0),
> MDIOBUS_C45 = BIT(1), MDIOBUS_C45_C22 = BIT(0) | BIT(1).  I suspect
> that was no coincidence in Andrew's suggestion.

Hi Russell

What was a nice side affect. Since i doubt Jeremy is going to go
through every MDIO driver and set the capabilities correctly, i wanted
0 to have a safe meaning. In the code we should treat MDIOBUS_UNKNOWN
and MDIOBUS_C22 identically. But maybe some time in the distant
future, we can make 0 issue a warning.

  Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 01/11] net: phy: Don't report success if devices weren't found
  2020-05-25  9:45       ` Russell King - ARM Linux admin
@ 2020-05-25 21:02         ` Jeremy Linton
  2020-05-25 21:07           ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 21:02 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/25/20 4:45 AM, Russell King - ARM Linux admin wrote:
> On Sun, May 24, 2020 at 09:46:55PM -0500, Jeremy Linton wrote:
>> Hi,
>>
>> Thanks for taking a look at this.
>>
>> On 5/23/20 1:20 PM, Russell King - ARM Linux admin wrote:
>>> On Fri, May 22, 2020 at 04:30:49PM -0500, Jeremy Linton wrote:
>>>> C45 devices are to return 0 for registers they haven't
>>>> implemented. This means in theory we can terminate the
>>>> device search loop without finding any MMDs. In that
>>>> case we want to immediately return indicating that
>>>> nothing was found rather than continuing to probe
>>>> and falling into the success state at the bottom.
>>>
>>> This is a little confusing when you read this comment:
>>>
>>>                           /*  If mostly Fs, there is no device there,
>>>                            *  then let's continue to probe more, as some
>>>                            *  10G PHYs have zero Devices In package,
>>>                            *  e.g. Cortina CS4315/CS4340 PHY.
>>>                            */
>>>
>>> Since it appears to be talking about the case of a PHY where *devs will
>>> be zero.  However, tracking down the original submission, it seems this
>>> is not the case at all, and the comment is grossly misleading.
>>>
>>> It seems these PHYs only report a valid data in the Devices In Package
>>> registers for devad=0 - it has nothing to do with a zero Devices In
>>> Package.
>>
>> Yes, this ended up being my understanding of this commit, and is part of my
>> justification for starting the devices search at the reserved address 0
>> rather than 1.
>>
>>>
>>> Can I suggest that this comment is fixed while we're changing the code
>>> to explicitly reject this "zero Devices In package" so that it's not
>>> confusing?
>>
>> Its probably better to kill it in 5/11 with a mention that we are starting
>> at a reserved address?
>>
>> OTOH, I'm a bit concerned that reading at 0 as the first address will cause
>> problems because the original code was only triggering it after a read
>> returning 0xFFFFFFFF at a valid MMD address. It does simplify/clarify the
>> loop though. If it weren't for this 0 read, I would have tried to avoid some
>> of the additional MMD reserved addresses.
> 
> Yes, that is the worry, and as MMD 0 is marked as reserved, I don't
> think we should routinely probe it.

Hmm, ok, so it gets a bit more complex then. The loop could probe all 
the valid MMD addresses, then if that doesn't appear to have yielded 
anything try the reserved ones? Its actually not a big code change 
because we could have a hardcoded bitfield of valid MMD addresses which 
we check before trying the probe. Then make one pass through the loop 
hitting the valid ones, and if we still didn't find anything, try the 
reserved addresses.


> 
> As I've already mentioned, note that bit 0 of devices-in-package does
> not mean that there is a MMD 0 implemented, even if bit 0 is set.  Bit
> 0 means that the clause 22 register set is available through clause 22
> cycles.  So, simplifying the loop to start at 0 and removing the work-
> around could end up breaking Cortina PHYs if they don't set bit 0 in
> their devices in package - but I don't see why we should depend on bit 0
> being set for their workaround.
Just to be clear this set probes MMD 0 except to ask for the device 
list. That is the same behavior as before. That is because all the 
subsequent id/etc loops are indexed to start at MMD 1. The primary 
difference for the cortiana PHY's AFAIK, is the order we ask for the 
devices list. Before it had to fail at a valid addr before reading 0, 
now it just starts at 0 and continues to probe if that fails. Some of 
this is required (continuing scan on failure) due to phys that "fail" 
reading the devices list for the lower MMD's addresses but work on the 
higher ones.


> 
> So, I think you're going to have to add a work-around to ignore bit 0,
> which brings up the question whether this is worth it or not.

It does ignore bit 0, it gets turned into the C22 regs flag, and 
cleared/ignored in the remainder of the code (do to MMD loop indexes 
starting at 1).

> 
> Hence, I think this is a "simplifcation" too far.
> 

Ok, if i'm understanding correctly, avoid the reserved addresses unless 
we fail to get a device list as before. That isn't a problem, I will 
include that in the next revision.


Thanks,


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 01/11] net: phy: Don't report success if devices weren't found
  2020-05-25 21:02         ` Jeremy Linton
@ 2020-05-25 21:07           ` Russell King - ARM Linux admin
  2020-05-25 21:59             ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25 21:07 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Mon, May 25, 2020 at 04:02:13PM -0500, Jeremy Linton wrote:
> > So, I think you're going to have to add a work-around to ignore bit 0,
> > which brings up the question whether this is worth it or not.
> 
> It does ignore bit 0, it gets turned into the C22 regs flag, and
> cleared/ignored in the remainder of the code (do to MMD loop indexes
> starting at 1).

However, I've already pointed out that that isn't the case in a
number of functions that I listed in another email, and I suspect
was glossed over.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 10:06       ` Russell King - ARM Linux admin
@ 2020-05-25 21:51         ` Jeremy Linton
  2020-05-25 22:01           ` Russell King - ARM Linux admin
  2020-05-25 22:06           ` Andrew Lunn
  0 siblings, 2 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 21:51 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/25/20 5:06 AM, Russell King - ARM Linux admin wrote:
> On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
>> Hi,
>>
>> On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
>>> On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
>>>> Until this point, we have been sanitizing the c22
>>>> regs presence bit out of all the MMD device lists.
>>>> This is incorrect as it causes the 0xFFFFFFFF checks
>>>> to incorrectly fail. Further, it turns out that we
>>>> want to utilize this flag to make a determination that
>>>> there is actually a phy at this location and we should
>>>> be accessing it using c22.
>>>>
>>>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>>>> ---
>>>>    drivers/net/phy/phy_device.c | 16 +++++++++++++---
>>>>    1 file changed, 13 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>>> index f0761fa5e40b..2d677490ecab 100644
>>>> --- a/drivers/net/phy/phy_device.c
>>>> +++ b/drivers/net/phy/phy_device.c
>>>> @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>>>    		return -EIO;
>>>>    	*devices_in_package |= phy_reg;
>>>> -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>> -	*devices_in_package &= ~BIT(0);
>>>> -
>>>>    	return 0;
>>>>    }
>>>> @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>    	int i;
>>>>    	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>>>>    	u32 *devs = &c45_ids->devices_in_package;
>>>> +	bool c22_present = false;
>>>> +	bool valid_id = false;
>>>>    	/* Find first non-zero Devices In package. Device zero is reserved
>>>>    	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>>>> @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>    		return 0;
>>>>    	}
>>>> +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>> +	c22_present = *devs & BIT(0);
>>>> +	*devs &= ~BIT(0);
>>>> +
>>>>    	/* Now probe Device Identifiers for each device present. */
>>>>    	for (i = 1; i < num_ids; i++) {
>>>>    		if (!(c45_ids->devices_in_package & (1 << i)))
>>>> @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>    		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
>>>>    		if (ret < 0)
>>>>    			return ret;
>>>> +		if (valid_phy_id(c45_ids->device_ids[i]))
>>>> +			valid_id = true;
>>>
>>> Here you are using your "devices in package" validator to validate the
>>> PHY ID value.  One of the things it does is mask this value with
>>> 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
>>> looks completely wrong.
>>
>> I think in this case I was just using it like the comment in
>> get_phy_device() "if the phy_id is mostly F's, there is no device here".
>>
>> My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
>> that seem to indicate "bus ok, phy didn't respond".
>>
>> I just checked the OUI registration, and while there are a couple OUI's
>> registered that have a number of FFF's in them, none of those cases seems to
>> overlap sufficiently to cause this to throw them out. Plus a phy would also
>> have to have model+revision set to 'F's. So while might be possible, if
>> unlikely, at the moment I think the OUI registration keeps this from being a
>> problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
>> guarantees that the field cannot be all '1's due to the OUI having X & M
>> bits cleared. It sort of looks like the mapping is trying to lose those
>> bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
>> just read it three times cause it didn't make any sense).
> 
> I should also note that we have at least one supported PHY where one
> of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
> odd numbered registers in one of the vendor MMDs for addresses 0
> through 0xefff - which has a bit set in the devices-in-package.
> 
> It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
> devices-in-package bit is clear in most of the valid MMDs, so we
> shouldn't touch it.
> 
> These reveal the problem of randomly probing MMDs - they can return
> unexpected values and not be as well behaved as we would like them to
> be.  Using register 8 to detect presence may be beneficial, but that
> may also introduce problems as we haven't used that before (and we
> don't know whether any PHY that wrong.)  I know at least the 88x3310
> gets it right for all except the vendor MMDs, where the low addresses
> appear non-confromant to the 802.3 specs.  Both vendor MMDs are
> definitely implemented, just not with anything conforming to 802.3.

Yes, we know even for the NXP reference hardware, one of the phy's 
doesn't probe out correctly because it doesn't respond to the ieee 
defined registers. I think at this point, there really isn't anything we 
can do about that unless we involve the (ACPI) firmware in currently 
nonstandard behaviors.

So, my goals here have been to first, not break anything, and then do a 
slightly better job finding phy's that are (mostly?) responding 
correctly to the 802.3 spec. So we can say "if you hardware is ACPI 
conformant, and you have IEEE conformant phy's you should be ok". So, 
for your example phy, I guess the immediate answer is "use DT" or "find 
a conformant phy", or even "abstract it in the firmware and use a 
mailbox interface".




^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 01/11] net: phy: Don't report success if devices weren't found
  2020-05-25 21:07           ` Russell King - ARM Linux admin
@ 2020-05-25 21:59             ` Jeremy Linton
  0 siblings, 0 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 21:59 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/25/20 4:07 PM, Russell King - ARM Linux admin wrote:
> On Mon, May 25, 2020 at 04:02:13PM -0500, Jeremy Linton wrote:
>>> So, I think you're going to have to add a work-around to ignore bit 0,
>>> which brings up the question whether this is worth it or not.
>>
>> It does ignore bit 0, it gets turned into the C22 regs flag, and
>> cleared/ignored in the remainder of the code (do to MMD loop indexes
>> starting at 1).
> 
> However, I've already pointed out that that isn't the case in a
> number of functions that I listed in another email, and I suspect
> was glossed over.
> 

Hmm, right, I might not be understanding, I'm still considering your 
comments in 4/11 and a couple others..

OTOH, the mmd 0 logic could be completely removed, as its actually been 
broken for a year or so in linux (AFAIK) because the code triggering it 
was disabled when the C22 sanitation patch was merged. OTOH, this patch 
is still clearing the C22 flag from devices, so anything dependent 
entirely on that should have the same behavior as before.

So, there is a bug in the is_valid_phy/device macro, because I messed it 
up when I converted it to a function because its using a signed val, 
when it should be unsigned. I don't think that is what you were hinting 
in 4/11 though.

Thanks,

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 21:51         ` Jeremy Linton
@ 2020-05-25 22:01           ` Russell King - ARM Linux admin
  2020-05-25 22:22             ` Jeremy Linton
  2020-05-25 23:16             ` Jeremy Linton
  2020-05-25 22:06           ` Andrew Lunn
  1 sibling, 2 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25 22:01 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Mon, May 25, 2020 at 04:51:16PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/25/20 5:06 AM, Russell King - ARM Linux admin wrote:
> > On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
> > > Hi,
> > > 
> > > On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
> > > > On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
> > > > > Until this point, we have been sanitizing the c22
> > > > > regs presence bit out of all the MMD device lists.
> > > > > This is incorrect as it causes the 0xFFFFFFFF checks
> > > > > to incorrectly fail. Further, it turns out that we
> > > > > want to utilize this flag to make a determination that
> > > > > there is actually a phy at this location and we should
> > > > > be accessing it using c22.
> > > > > 
> > > > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > > > ---
> > > > >    drivers/net/phy/phy_device.c | 16 +++++++++++++---
> > > > >    1 file changed, 13 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > > > > index f0761fa5e40b..2d677490ecab 100644
> > > > > --- a/drivers/net/phy/phy_device.c
> > > > > +++ b/drivers/net/phy/phy_device.c
> > > > > @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
> > > > >    		return -EIO;
> > > > >    	*devices_in_package |= phy_reg;
> > > > > -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > > > -	*devices_in_package &= ~BIT(0);
> > > > > -
> > > > >    	return 0;
> > > > >    }
> > > > > @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > >    	int i;
> > > > >    	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
> > > > >    	u32 *devs = &c45_ids->devices_in_package;
> > > > > +	bool c22_present = false;
> > > > > +	bool valid_id = false;
> > > > >    	/* Find first non-zero Devices In package. Device zero is reserved
> > > > >    	 * for 802.3 c45 complied PHYs, so don't probe it at first.
> > > > > @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > >    		return 0;
> > > > >    	}
> > > > > +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > > > +	c22_present = *devs & BIT(0);
> > > > > +	*devs &= ~BIT(0);
> > > > > +
> > > > >    	/* Now probe Device Identifiers for each device present. */
> > > > >    	for (i = 1; i < num_ids; i++) {
> > > > >    		if (!(c45_ids->devices_in_package & (1 << i)))
> > > > > @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > >    		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
> > > > >    		if (ret < 0)
> > > > >    			return ret;
> > > > > +		if (valid_phy_id(c45_ids->device_ids[i]))
> > > > > +			valid_id = true;
> > > > 
> > > > Here you are using your "devices in package" validator to validate the
> > > > PHY ID value.  One of the things it does is mask this value with
> > > > 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
> > > > looks completely wrong.
> > > 
> > > I think in this case I was just using it like the comment in
> > > get_phy_device() "if the phy_id is mostly F's, there is no device here".
> > > 
> > > My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
> > > that seem to indicate "bus ok, phy didn't respond".
> > > 
> > > I just checked the OUI registration, and while there are a couple OUI's
> > > registered that have a number of FFF's in them, none of those cases seems to
> > > overlap sufficiently to cause this to throw them out. Plus a phy would also
> > > have to have model+revision set to 'F's. So while might be possible, if
> > > unlikely, at the moment I think the OUI registration keeps this from being a
> > > problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
> > > guarantees that the field cannot be all '1's due to the OUI having X & M
> > > bits cleared. It sort of looks like the mapping is trying to lose those
> > > bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
> > > just read it three times cause it didn't make any sense).
> > 
> > I should also note that we have at least one supported PHY where one
> > of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
> > odd numbered registers in one of the vendor MMDs for addresses 0
> > through 0xefff - which has a bit set in the devices-in-package.
> > 
> > It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
> > devices-in-package bit is clear in most of the valid MMDs, so we
> > shouldn't touch it.
> > 
> > These reveal the problem of randomly probing MMDs - they can return
> > unexpected values and not be as well behaved as we would like them to
> > be.  Using register 8 to detect presence may be beneficial, but that
> > may also introduce problems as we haven't used that before (and we
> > don't know whether any PHY that wrong.)  I know at least the 88x3310
> > gets it right for all except the vendor MMDs, where the low addresses
> > appear non-confromant to the 802.3 specs.  Both vendor MMDs are
> > definitely implemented, just not with anything conforming to 802.3.
> 
> Yes, we know even for the NXP reference hardware, one of the phy's doesn't
> probe out correctly because it doesn't respond to the ieee defined
> registers. I think at this point, there really isn't anything we can do
> about that unless we involve the (ACPI) firmware in currently nonstandard
> behaviors.
> 
> So, my goals here have been to first, not break anything, and then do a
> slightly better job finding phy's that are (mostly?) responding correctly to
> the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
> have IEEE conformant phy's you should be ok". So, for your example phy, I
> guess the immediate answer is "use DT" or "find a conformant phy", or even
> "abstract it in the firmware and use a mailbox interface".

You haven't understood.  The PHY does conform for most of the MMDs,
but there are a number that do not conform.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 21:51         ` Jeremy Linton
  2020-05-25 22:01           ` Russell King - ARM Linux admin
@ 2020-05-25 22:06           ` Andrew Lunn
  2020-05-25 22:17             ` Jeremy Linton
  1 sibling, 1 reply; 61+ messages in thread
From: Andrew Lunn @ 2020-05-25 22:06 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Russell King - ARM Linux admin, netdev, davem, f.fainelli,
	hkallweit1, madalin.bucur, calvin.johnson, linux-kernel

> Yes, we know even for the NXP reference hardware, one of the phy's doesn't
> probe out correctly because it doesn't respond to the ieee defined
> registers. I think at this point, there really isn't anything we can do
> about that unless we involve the (ACPI) firmware in currently nonstandard
> behaviors.
> 
> So, my goals here have been to first, not break anything, and then do a
> slightly better job finding phy's that are (mostly?) responding correctly to
> the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
> have IEEE conformant phy's you should be ok". So, for your example phy, I
> guess the immediate answer is "use DT" or "find a conformant phy", or even
> "abstract it in the firmware and use a mailbox interface".
 
Hi Jeremy

You need to be careful here, when you say "use DT". With a c45 PHY
of_mdiobus_register_phy() calls get_phy_device() to see if the device
exists on the bus. So your changes to get_phy_device() etc, needs to
continue to find devices it used to find, even if they are not fully
complient to 802.3.

	  Andrew
 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices
  2020-05-25  8:25       ` Russell King - ARM Linux admin
  2020-05-25 13:43         ` Andrew Lunn
@ 2020-05-25 22:09         ` Jeremy Linton
  2020-05-25 22:41           ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 22:09 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Andrew Lunn, netdev, davem, f.fainelli, hkallweit1,
	madalin.bucur, calvin.johnson, linux-kernel

Hi,

On 5/25/20 3:25 AM, Russell King - ARM Linux admin wrote:
> On Sun, May 24, 2020 at 11:28:52PM -0500, Jeremy Linton wrote:
>> Hi,
>>
>> On 5/24/20 9:44 AM, Andrew Lunn wrote:
>>>> +++ b/include/linux/phy.h
>>>> @@ -275,6 +275,11 @@ struct mii_bus {
>>>>    	int reset_delay_us;
>>>>    	/* RESET GPIO descriptor pointer */
>>>>    	struct gpio_desc *reset_gpiod;
>>>> +	/* bus capabilities, used for probing */
>>>> +	enum {
>>>> +		MDIOBUS_C22_ONLY = 0,
>>>> +		MDIOBUS_C45_FIRST,
>>>> +	} probe_capabilities;
>>>>    };
>>>
>>>
>>> I'm not so keen on _FIRST. It suggest _LAST would also be valid.  But
>>> that then suggests this is not a bus property, but a PHY property, and
>>> some PHYs might need _FIRST and other phys need _LAST, and then you
>>> have a bus which has both sorts of PHY on it, and you have a problem.
>>>
>>> So i think it would be better to have
>>>
>>> 	enum {
>>> 		MDIOBUS_UNKNOWN = 0,
>>> 		MDIOBUS_C22,
>>> 		MDIOBUS_C45,
>>> 		MDIOBUS_C45_C22,
>>> 	} bus_capabilities;
>>>
>>> Describe just what the bus master can support.
>>
>> Yes, the naming is reasonable and I will update it in the next patch. I went
>> around a bit myself with this naming early on, and the problem I saw was
>> that a C45 capable master, can have C45 electrical phy's that only respond
>> to c22 requests (AFAIK).
> 
> If you have a master that can only generate clause 45 cycles, and
> someone is daft enough to connect a clause 22 only PHY to it, the
> result is hardware that doesn't work - there's no getting around
> that.  The MDIO interface can't generate the appropriate cycles to
> access the clause 22 PHY.  So, this is not something we need care
> about.
> 
>> So the MDIOBUS_C45 (I think I was calling it
>> C45_ONLY) is an invalid selection. Not, that it wouldn't be helpful to have
>> a C45_ONLY case, but that the assumption is that you wouldn't try and probe
>> c22 registers, which I thought was a mistake.
> 
> MDIOBUS_C45 means "I can generate clause 45 cycles".
> MDIOBUS_C22 means "I can generate clause 22 cycles".
> MDIOBUS_C45_C22 means "I can generate both clause 45 and clause 22
> cycles."

Hi, to be clear, we are talking about c45 controllers that can access 
the c22 register space via c45, or controllers which are 
electrically/level shifting to be compatible with c22 voltages/etc?

The nxp hardware in question has 1, 10 and 40Gbit phys on the same MDIO, 
the 1gbit we fall back to c22 registers because it doesn't respond 
correctly to c45 registers. Which is AFAIK what the bit0 C22 regs bit is 
for..

The general logic right now for a C45_FIRST is attempt to detect a c45 
phy and if nothing is detected fall back and attempt c22 register 
access. That is whats picking up the 1G phys. If for whatever reason the 
MDIO controller can't do the right thing to access the c22 regs, I guess 
there really isn't anything we can do about it.


> 
> Notice carefully the values these end up with - MDIOBUS_C22 = BIT(0),
> MDIOBUS_C45 = BIT(1), MDIOBUS_C45_C22 = BIT(0) | BIT(1).  I suspect
> that was no coincidence in Andrew's suggestion.
> 


^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 22:06           ` Andrew Lunn
@ 2020-05-25 22:17             ` Jeremy Linton
  2020-05-25 23:06               ` Andrew Lunn
  2020-05-25 23:07               ` Russell King - ARM Linux admin
  0 siblings, 2 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 22:17 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Russell King - ARM Linux admin, netdev, davem, f.fainelli,
	hkallweit1, madalin.bucur, calvin.johnson, linux-kernel

Hi,

On 5/25/20 5:06 PM, Andrew Lunn wrote:
>> Yes, we know even for the NXP reference hardware, one of the phy's doesn't
>> probe out correctly because it doesn't respond to the ieee defined
>> registers. I think at this point, there really isn't anything we can do
>> about that unless we involve the (ACPI) firmware in currently nonstandard
>> behaviors.
>>
>> So, my goals here have been to first, not break anything, and then do a
>> slightly better job finding phy's that are (mostly?) responding correctly to
>> the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
>> have IEEE conformant phy's you should be ok". So, for your example phy, I
>> guess the immediate answer is "use DT" or "find a conformant phy", or even
>> "abstract it in the firmware and use a mailbox interface".
>   
> Hi Jeremy
> 
> You need to be careful here, when you say "use DT". With a c45 PHY
> of_mdiobus_register_phy() calls get_phy_device() to see if the device
> exists on the bus. So your changes to get_phy_device() etc, needs to
> continue to find devices it used to find, even if they are not fully
> complient to 802.3.
> 

Yes, that is my "don't break anything". But, in a number of cases I 
can't tell if something is an intentional "bug", or what exactly the 
intended side effect actually was. The c22 bit0 sanitation is in this 
bucket, because its basically disabling the MMD0 probe..

I know for sure we find phys that previously weren't found. OTOH, i'm 
not sure how many that were previously "found" are now getting kicked 
out by because they are doing something "bad" that looked like a bug.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 22:01           ` Russell King - ARM Linux admin
@ 2020-05-25 22:22             ` Jeremy Linton
  2020-05-25 23:09               ` Russell King - ARM Linux admin
  2020-05-25 23:16             ` Jeremy Linton
  1 sibling, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 22:22 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On 5/25/20 5:01 PM, Russell King - ARM Linux admin wrote:
> On Mon, May 25, 2020 at 04:51:16PM -0500, Jeremy Linton wrote:
>> Hi,
>>
>> On 5/25/20 5:06 AM, Russell King - ARM Linux admin wrote:
>>> On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
>>>> Hi,
>>>>
>>>> On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
>>>>> On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
>>>>>> Until this point, we have been sanitizing the c22
>>>>>> regs presence bit out of all the MMD device lists.
>>>>>> This is incorrect as it causes the 0xFFFFFFFF checks
>>>>>> to incorrectly fail. Further, it turns out that we
>>>>>> want to utilize this flag to make a determination that
>>>>>> there is actually a phy at this location and we should
>>>>>> be accessing it using c22.
>>>>>>
>>>>>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>>>>>> ---
>>>>>>     drivers/net/phy/phy_device.c | 16 +++++++++++++---
>>>>>>     1 file changed, 13 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>>>>> index f0761fa5e40b..2d677490ecab 100644
>>>>>> --- a/drivers/net/phy/phy_device.c
>>>>>> +++ b/drivers/net/phy/phy_device.c
>>>>>> @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>>>>>     		return -EIO;
>>>>>>     	*devices_in_package |= phy_reg;
>>>>>> -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>>>> -	*devices_in_package &= ~BIT(0);
>>>>>> -
>>>>>>     	return 0;
>>>>>>     }
>>>>>> @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>     	int i;
>>>>>>     	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>>>>>>     	u32 *devs = &c45_ids->devices_in_package;
>>>>>> +	bool c22_present = false;
>>>>>> +	bool valid_id = false;
>>>>>>     	/* Find first non-zero Devices In package. Device zero is reserved
>>>>>>     	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>>>>>> @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>     		return 0;
>>>>>>     	}
>>>>>> +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>>>> +	c22_present = *devs & BIT(0);
>>>>>> +	*devs &= ~BIT(0);
>>>>>> +
>>>>>>     	/* Now probe Device Identifiers for each device present. */
>>>>>>     	for (i = 1; i < num_ids; i++) {
>>>>>>     		if (!(c45_ids->devices_in_package & (1 << i)))
>>>>>> @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>     		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
>>>>>>     		if (ret < 0)
>>>>>>     			return ret;
>>>>>> +		if (valid_phy_id(c45_ids->device_ids[i]))
>>>>>> +			valid_id = true;
>>>>>
>>>>> Here you are using your "devices in package" validator to validate the
>>>>> PHY ID value.  One of the things it does is mask this value with
>>>>> 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
>>>>> looks completely wrong.
>>>>
>>>> I think in this case I was just using it like the comment in
>>>> get_phy_device() "if the phy_id is mostly F's, there is no device here".
>>>>
>>>> My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
>>>> that seem to indicate "bus ok, phy didn't respond".
>>>>
>>>> I just checked the OUI registration, and while there are a couple OUI's
>>>> registered that have a number of FFF's in them, none of those cases seems to
>>>> overlap sufficiently to cause this to throw them out. Plus a phy would also
>>>> have to have model+revision set to 'F's. So while might be possible, if
>>>> unlikely, at the moment I think the OUI registration keeps this from being a
>>>> problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
>>>> guarantees that the field cannot be all '1's due to the OUI having X & M
>>>> bits cleared. It sort of looks like the mapping is trying to lose those
>>>> bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
>>>> just read it three times cause it didn't make any sense).
>>>
>>> I should also note that we have at least one supported PHY where one
>>> of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
>>> odd numbered registers in one of the vendor MMDs for addresses 0
>>> through 0xefff - which has a bit set in the devices-in-package.
>>>
>>> It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
>>> devices-in-package bit is clear in most of the valid MMDs, so we
>>> shouldn't touch it.
>>>
>>> These reveal the problem of randomly probing MMDs - they can return
>>> unexpected values and not be as well behaved as we would like them to
>>> be.  Using register 8 to detect presence may be beneficial, but that
>>> may also introduce problems as we haven't used that before (and we
>>> don't know whether any PHY that wrong.)  I know at least the 88x3310
>>> gets it right for all except the vendor MMDs, where the low addresses
>>> appear non-confromant to the 802.3 specs.  Both vendor MMDs are
>>> definitely implemented, just not with anything conforming to 802.3.
>>
>> Yes, we know even for the NXP reference hardware, one of the phy's doesn't
>> probe out correctly because it doesn't respond to the ieee defined
>> registers. I think at this point, there really isn't anything we can do
>> about that unless we involve the (ACPI) firmware in currently nonstandard
>> behaviors.
>>
>> So, my goals here have been to first, not break anything, and then do a
>> slightly better job finding phy's that are (mostly?) responding correctly to
>> the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
>> have IEEE conformant phy's you should be ok". So, for your example phy, I
>> guess the immediate answer is "use DT" or "find a conformant phy", or even
>> "abstract it in the firmware and use a mailbox interface".
> 
> You haven't understood.  The PHY does conform for most of the MMDs,
> but there are a number that do not conform.

Probably...

Except that i'm not sure how that is a problem at the moment, its still 
going to trigger as a found phy, and walk the same mmd list as before 
requesting drivers. Those drivers haven't changed their behavior so 
where is the problem? If there is a problem its in 7/11 where things are 
getting kicked due to seemingly invalid Ids.

The 1/11 devices=0 case actually appears to be a bug i'm fixing because 
you won't get an ID or a MMD list from that (before or after).



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices
  2020-05-25 22:09         ` Jeremy Linton
@ 2020-05-25 22:41           ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25 22:41 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Andrew Lunn, netdev, davem, f.fainelli, hkallweit1,
	madalin.bucur, calvin.johnson, linux-kernel

On Mon, May 25, 2020 at 05:09:56PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/25/20 3:25 AM, Russell King - ARM Linux admin wrote:
> > On Sun, May 24, 2020 at 11:28:52PM -0500, Jeremy Linton wrote:
> > > Hi,
> > > 
> > > On 5/24/20 9:44 AM, Andrew Lunn wrote:
> > > > > +++ b/include/linux/phy.h
> > > > > @@ -275,6 +275,11 @@ struct mii_bus {
> > > > >    	int reset_delay_us;
> > > > >    	/* RESET GPIO descriptor pointer */
> > > > >    	struct gpio_desc *reset_gpiod;
> > > > > +	/* bus capabilities, used for probing */
> > > > > +	enum {
> > > > > +		MDIOBUS_C22_ONLY = 0,
> > > > > +		MDIOBUS_C45_FIRST,
> > > > > +	} probe_capabilities;
> > > > >    };
> > > > 
> > > > 
> > > > I'm not so keen on _FIRST. It suggest _LAST would also be valid.  But
> > > > that then suggests this is not a bus property, but a PHY property, and
> > > > some PHYs might need _FIRST and other phys need _LAST, and then you
> > > > have a bus which has both sorts of PHY on it, and you have a problem.
> > > > 
> > > > So i think it would be better to have
> > > > 
> > > > 	enum {
> > > > 		MDIOBUS_UNKNOWN = 0,
> > > > 		MDIOBUS_C22,
> > > > 		MDIOBUS_C45,
> > > > 		MDIOBUS_C45_C22,
> > > > 	} bus_capabilities;
> > > > 
> > > > Describe just what the bus master can support.
> > > 
> > > Yes, the naming is reasonable and I will update it in the next patch. I went
> > > around a bit myself with this naming early on, and the problem I saw was
> > > that a C45 capable master, can have C45 electrical phy's that only respond
> > > to c22 requests (AFAIK).
> > 
> > If you have a master that can only generate clause 45 cycles, and
> > someone is daft enough to connect a clause 22 only PHY to it, the
> > result is hardware that doesn't work - there's no getting around
> > that.  The MDIO interface can't generate the appropriate cycles to
> > access the clause 22 PHY.  So, this is not something we need care
> > about.
> > 
> > > So the MDIOBUS_C45 (I think I was calling it
> > > C45_ONLY) is an invalid selection. Not, that it wouldn't be helpful to have
> > > a C45_ONLY case, but that the assumption is that you wouldn't try and probe
> > > c22 registers, which I thought was a mistake.
> > 
> > MDIOBUS_C45 means "I can generate clause 45 cycles".
> > MDIOBUS_C22 means "I can generate clause 22 cycles".
> > MDIOBUS_C45_C22 means "I can generate both clause 45 and clause 22
> > cycles."
> 
> Hi, to be clear, we are talking about c45 controllers that can access the
> c22 register space via c45, or controllers which are electrically/level
> shifting to be compatible with c22 voltages/etc?

To me, Andrew was quite clear that these flags should describe what
the MDIO controller can support, and what I understand from Andrew's
email is:

If it can't support clause 45 accesses, then it should not advertise
MDIOBUS_C45 nor MDIOBUS_C45_C22.

If it can't support clause 22 accesses, then it should not advertise
MDIOBUS_C22.

And that's all there is to it.

If you want to talk about clause 45 access via the clause 22 register
set, that is a property of the PHY, not of the MDIO controller, and
the MDIO controller has no business attempting to describe that
property in any shape or form since it is a PHY property.

If you have access to clause 22 registers, then you likely have the
clause 22 ID registers, and the way phylib currently works, that will
also match a PHY driver.

Once we have a PHY and a driver bound, then even if it is a C45
driver, accesses using the phy_*_mmd() functions will _automatically_
switch to using C22 cycles to the indirect C45 access registers if
the PHY has not been detected as a C45 PHY (phydev->is_c45 is false.)

So, I'm coming to the conclusion that you're making way more work
here than is necessary, your changes to gratuitously change the way
stuff in phylib work which is not risk-free are completely unnecessary
and really not a risk worth taking when it's likely that we already
have the code mostly in place to be able to support your PHY.

I think there are some questionable uses of phydev->is_c45 that
your point about accessing C45 PHYs through C22 indirect registers
brings up which need to be resolved, but I really don't think that's
a completely separate issue.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 22:17             ` Jeremy Linton
@ 2020-05-25 23:06               ` Andrew Lunn
  2020-05-25 23:07               ` Russell King - ARM Linux admin
  1 sibling, 0 replies; 61+ messages in thread
From: Andrew Lunn @ 2020-05-25 23:06 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Russell King - ARM Linux admin, netdev, davem, f.fainelli,
	hkallweit1, madalin.bucur, calvin.johnson, linux-kernel

> I know for sure we find phys that previously weren't found.

That is in itself somewhat dangerous. Those using primitive
configuration systems are probably going to use phy_find_first(),
rather than an address on the bus.  I always recommend against that,
because if another PHY suddenly pops up on the bus, bad things can
happen.

   Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 22:17             ` Jeremy Linton
  2020-05-25 23:06               ` Andrew Lunn
@ 2020-05-25 23:07               ` Russell King - ARM Linux admin
  2020-05-25 23:12                 ` Andrew Lunn
  1 sibling, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25 23:07 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Andrew Lunn, netdev, davem, f.fainelli, hkallweit1,
	madalin.bucur, calvin.johnson, linux-kernel

On Mon, May 25, 2020 at 05:17:27PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/25/20 5:06 PM, Andrew Lunn wrote:
> > > Yes, we know even for the NXP reference hardware, one of the phy's doesn't
> > > probe out correctly because it doesn't respond to the ieee defined
> > > registers. I think at this point, there really isn't anything we can do
> > > about that unless we involve the (ACPI) firmware in currently nonstandard
> > > behaviors.
> > > 
> > > So, my goals here have been to first, not break anything, and then do a
> > > slightly better job finding phy's that are (mostly?) responding correctly to
> > > the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
> > > have IEEE conformant phy's you should be ok". So, for your example phy, I
> > > guess the immediate answer is "use DT" or "find a conformant phy", or even
> > > "abstract it in the firmware and use a mailbox interface".
> > Hi Jeremy
> > 
> > You need to be careful here, when you say "use DT". With a c45 PHY
> > of_mdiobus_register_phy() calls get_phy_device() to see if the device
> > exists on the bus. So your changes to get_phy_device() etc, needs to
> > continue to find devices it used to find, even if they are not fully
> > complient to 802.3.
> > 
> 
> Yes, that is my "don't break anything". But, in a number of cases I can't
> tell if something is an intentional "bug", or what exactly the intended side
> effect actually was. The c22 bit0 sanitation is in this bucket, because its
> basically disabling the MMD0 probe..

I'm really not sure it causes any problem what so ever.  Have you read
the commit adding cortina.c to see what it says - there is an
interesting comment about what it requires in firmware.  That is, it
calls for an explicit "ethernet-phy-id" compatible in DT naming the
PHY ID, but that can't be used for Clause 45 PHYs (it will be ignored.)
So, it will be treated by the kernel as a Clause 22 PHY.

It is presently in use:

arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
arch/powerpc/boot/dts/fsl/t2080rdb.dts: compatible = "ethernet-phy-id13e5.1002";
arch/powerpc/boot/dts/fsl/t2080rdb.dts: compatible = "ethernet-phy-id13e5.1002";

Now, given this, none of the Clause 45 PHY detection code will be
touched while probing for these PHYs, so really that work-around to
read form MMD 0 in the Clause 45 probing function really doesn't seem
to apply to these PHYs.

Next, when you read cortina.c, it becomes obvious that the PHY's MMD 0
doesn't even follow IEEE 802.3 - the ID registers are at register 0/1
not 2/3.  So even if we did try to read the ID from MMD 0, we wouldn't
be reading the ID from the right registers.

Hence, I don't think anything has been broken at all by the commit
you refer to.

> I know for sure we find phys that previously weren't found. OTOH, i'm not
> sure how many that were previously "found" are now getting kicked out by
> because they are doing something "bad" that looked like a bug.

I don't think you've found any problem what so ever.

For these PHYs to be automatically probed, they need to have a DT
string identifying them as clause 45 compliant.  From the DTS files
I've provided above, that isn't the case, this code path won't be
reached, so nothing has been broken.  In any case, for the
reasons I mention above wrt non-standard register layout, it seems
it couldn't have worked via this probing code.

I would dig some more into the history of the change to
get_phy_c45_ids() and how it relates to the addition of the Cortina
driver, but unfortunately my machine is being painfully slow with
git log searching the history that it's way too time consuming for
me to do anything further now, but the conclusion I'm coming to is
that there has been no regression how you think there has been.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 22:22             ` Jeremy Linton
@ 2020-05-25 23:09               ` Russell King - ARM Linux admin
  2020-05-25 23:22                 ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25 23:09 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Mon, May 25, 2020 at 05:22:07PM -0500, Jeremy Linton wrote:
> On 5/25/20 5:01 PM, Russell King - ARM Linux admin wrote:
> > On Mon, May 25, 2020 at 04:51:16PM -0500, Jeremy Linton wrote:
> > > Hi,
> > > 
> > > On 5/25/20 5:06 AM, Russell King - ARM Linux admin wrote:
> > > > On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
> > > > > Hi,
> > > > > 
> > > > > On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
> > > > > > On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
> > > > > > > Until this point, we have been sanitizing the c22
> > > > > > > regs presence bit out of all the MMD device lists.
> > > > > > > This is incorrect as it causes the 0xFFFFFFFF checks
> > > > > > > to incorrectly fail. Further, it turns out that we
> > > > > > > want to utilize this flag to make a determination that
> > > > > > > there is actually a phy at this location and we should
> > > > > > > be accessing it using c22.
> > > > > > > 
> > > > > > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > > > > > ---
> > > > > > >     drivers/net/phy/phy_device.c | 16 +++++++++++++---
> > > > > > >     1 file changed, 13 insertions(+), 3 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > > > > > > index f0761fa5e40b..2d677490ecab 100644
> > > > > > > --- a/drivers/net/phy/phy_device.c
> > > > > > > +++ b/drivers/net/phy/phy_device.c
> > > > > > > @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
> > > > > > >     		return -EIO;
> > > > > > >     	*devices_in_package |= phy_reg;
> > > > > > > -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > > > > > -	*devices_in_package &= ~BIT(0);
> > > > > > > -
> > > > > > >     	return 0;
> > > > > > >     }
> > > > > > > @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > > > >     	int i;
> > > > > > >     	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
> > > > > > >     	u32 *devs = &c45_ids->devices_in_package;
> > > > > > > +	bool c22_present = false;
> > > > > > > +	bool valid_id = false;
> > > > > > >     	/* Find first non-zero Devices In package. Device zero is reserved
> > > > > > >     	 * for 802.3 c45 complied PHYs, so don't probe it at first.
> > > > > > > @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > > > >     		return 0;
> > > > > > >     	}
> > > > > > > +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > > > > > +	c22_present = *devs & BIT(0);
> > > > > > > +	*devs &= ~BIT(0);
> > > > > > > +
> > > > > > >     	/* Now probe Device Identifiers for each device present. */
> > > > > > >     	for (i = 1; i < num_ids; i++) {
> > > > > > >     		if (!(c45_ids->devices_in_package & (1 << i)))
> > > > > > > @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > > > >     		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
> > > > > > >     		if (ret < 0)
> > > > > > >     			return ret;
> > > > > > > +		if (valid_phy_id(c45_ids->device_ids[i]))
> > > > > > > +			valid_id = true;
> > > > > > 
> > > > > > Here you are using your "devices in package" validator to validate the
> > > > > > PHY ID value.  One of the things it does is mask this value with
> > > > > > 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
> > > > > > looks completely wrong.
> > > > > 
> > > > > I think in this case I was just using it like the comment in
> > > > > get_phy_device() "if the phy_id is mostly F's, there is no device here".
> > > > > 
> > > > > My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
> > > > > that seem to indicate "bus ok, phy didn't respond".
> > > > > 
> > > > > I just checked the OUI registration, and while there are a couple OUI's
> > > > > registered that have a number of FFF's in them, none of those cases seems to
> > > > > overlap sufficiently to cause this to throw them out. Plus a phy would also
> > > > > have to have model+revision set to 'F's. So while might be possible, if
> > > > > unlikely, at the moment I think the OUI registration keeps this from being a
> > > > > problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
> > > > > guarantees that the field cannot be all '1's due to the OUI having X & M
> > > > > bits cleared. It sort of looks like the mapping is trying to lose those
> > > > > bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
> > > > > just read it three times cause it didn't make any sense).
> > > > 
> > > > I should also note that we have at least one supported PHY where one
> > > > of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
> > > > odd numbered registers in one of the vendor MMDs for addresses 0
> > > > through 0xefff - which has a bit set in the devices-in-package.
> > > > 
> > > > It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
> > > > devices-in-package bit is clear in most of the valid MMDs, so we
> > > > shouldn't touch it.
> > > > 
> > > > These reveal the problem of randomly probing MMDs - they can return
> > > > unexpected values and not be as well behaved as we would like them to
> > > > be.  Using register 8 to detect presence may be beneficial, but that
> > > > may also introduce problems as we haven't used that before (and we
> > > > don't know whether any PHY that wrong.)  I know at least the 88x3310
> > > > gets it right for all except the vendor MMDs, where the low addresses
> > > > appear non-confromant to the 802.3 specs.  Both vendor MMDs are
> > > > definitely implemented, just not with anything conforming to 802.3.
> > > 
> > > Yes, we know even for the NXP reference hardware, one of the phy's doesn't
> > > probe out correctly because it doesn't respond to the ieee defined
> > > registers. I think at this point, there really isn't anything we can do
> > > about that unless we involve the (ACPI) firmware in currently nonstandard
> > > behaviors.
> > > 
> > > So, my goals here have been to first, not break anything, and then do a
> > > slightly better job finding phy's that are (mostly?) responding correctly to
> > > the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
> > > have IEEE conformant phy's you should be ok". So, for your example phy, I
> > > guess the immediate answer is "use DT" or "find a conformant phy", or even
> > > "abstract it in the firmware and use a mailbox interface".
> > 
> > You haven't understood.  The PHY does conform for most of the MMDs,
> > but there are a number that do not conform.
> 
> Probably...
> 
> Except that i'm not sure how that is a problem at the moment, its still
> going to trigger as a found phy, and walk the same mmd list as before
> requesting drivers. Those drivers haven't changed their behavior so where is
> the problem? If there is a problem its in 7/11 where things are getting
> kicked due to seemingly invalid Ids.
> 
> The 1/11 devices=0 case actually appears to be a bug i'm fixing because you
> won't get an ID or a MMD list from that (before or after).

I think I've just flattened that argument in my immediately preceding
reply on the Cortina situation; I think you've grossly misread that
through not fully researching the history and then finding the
existing users.

There is no bug that you are fixing from what I can see.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 23:07               ` Russell King - ARM Linux admin
@ 2020-05-25 23:12                 ` Andrew Lunn
  2020-05-25 23:46                   ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Andrew Lunn @ 2020-05-25 23:12 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Jeremy Linton, netdev, davem, f.fainelli, hkallweit1,
	madalin.bucur, calvin.johnson, linux-kernel

> arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> arch/powerpc/boot/dts/fsl/t2080rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> arch/powerpc/boot/dts/fsl/t2080rdb.dts: compatible = "ethernet-phy-id13e5.1002";

Hi Jeremy

You are doing this work for NXP right? Maybe you can ask them to go
searching in the cellar and find you one of these boards?

	  Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 22:01           ` Russell King - ARM Linux admin
  2020-05-25 22:22             ` Jeremy Linton
@ 2020-05-25 23:16             ` Jeremy Linton
  2020-05-25 23:30               ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 23:16 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/25/20 5:01 PM, Russell King - ARM Linux admin wrote:
> On Mon, May 25, 2020 at 04:51:16PM -0500, Jeremy Linton wrote:
>> Hi,
>>
>> On 5/25/20 5:06 AM, Russell King - ARM Linux admin wrote:
>>> On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
>>>> Hi,
>>>>
>>>> On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
>>>>> On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
>>>>>> Until this point, we have been sanitizing the c22
>>>>>> regs presence bit out of all the MMD device lists.
>>>>>> This is incorrect as it causes the 0xFFFFFFFF checks
>>>>>> to incorrectly fail. Further, it turns out that we
>>>>>> want to utilize this flag to make a determination that
>>>>>> there is actually a phy at this location and we should
>>>>>> be accessing it using c22.
>>>>>>
>>>>>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>>>>>> ---
>>>>>>     drivers/net/phy/phy_device.c | 16 +++++++++++++---
>>>>>>     1 file changed, 13 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>>>>> index f0761fa5e40b..2d677490ecab 100644
>>>>>> --- a/drivers/net/phy/phy_device.c
>>>>>> +++ b/drivers/net/phy/phy_device.c
>>>>>> @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>>>>>     		return -EIO;
>>>>>>     	*devices_in_package |= phy_reg;
>>>>>> -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>>>> -	*devices_in_package &= ~BIT(0);
>>>>>> -
>>>>>>     	return 0;
>>>>>>     }
>>>>>> @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>     	int i;
>>>>>>     	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>>>>>>     	u32 *devs = &c45_ids->devices_in_package;
>>>>>> +	bool c22_present = false;
>>>>>> +	bool valid_id = false;
>>>>>>     	/* Find first non-zero Devices In package. Device zero is reserved
>>>>>>     	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>>>>>> @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>     		return 0;
>>>>>>     	}
>>>>>> +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>>>> +	c22_present = *devs & BIT(0);
>>>>>> +	*devs &= ~BIT(0);
>>>>>> +
>>>>>>     	/* Now probe Device Identifiers for each device present. */
>>>>>>     	for (i = 1; i < num_ids; i++) {
>>>>>>     		if (!(c45_ids->devices_in_package & (1 << i)))
>>>>>> @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>     		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
>>>>>>     		if (ret < 0)
>>>>>>     			return ret;
>>>>>> +		if (valid_phy_id(c45_ids->device_ids[i]))
>>>>>> +			valid_id = true;
>>>>>
>>>>> Here you are using your "devices in package" validator to validate the
>>>>> PHY ID value.  One of the things it does is mask this value with
>>>>> 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
>>>>> looks completely wrong.
>>>>
>>>> I think in this case I was just using it like the comment in
>>>> get_phy_device() "if the phy_id is mostly F's, there is no device here".
>>>>
>>>> My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
>>>> that seem to indicate "bus ok, phy didn't respond".
>>>>
>>>> I just checked the OUI registration, and while there are a couple OUI's
>>>> registered that have a number of FFF's in them, none of those cases seems to
>>>> overlap sufficiently to cause this to throw them out. Plus a phy would also
>>>> have to have model+revision set to 'F's. So while might be possible, if
>>>> unlikely, at the moment I think the OUI registration keeps this from being a
>>>> problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
>>>> guarantees that the field cannot be all '1's due to the OUI having X & M
>>>> bits cleared. It sort of looks like the mapping is trying to lose those
>>>> bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
>>>> just read it three times cause it didn't make any sense).
>>>
>>> I should also note that we have at least one supported PHY where one
>>> of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
>>> odd numbered registers in one of the vendor MMDs for addresses 0
>>> through 0xefff - which has a bit set in the devices-in-package.
>>>
>>> It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
>>> devices-in-package bit is clear in most of the valid MMDs, so we
>>> shouldn't touch it.
>>>
>>> These reveal the problem of randomly probing MMDs - they can return
>>> unexpected values and not be as well behaved as we would like them to
>>> be.  Using register 8 to detect presence may be beneficial, but that
>>> may also introduce problems as we haven't used that before (and we
>>> don't know whether any PHY that wrong.)  I know at least the 88x3310
>>> gets it right for all except the vendor MMDs, where the low addresses
>>> appear non-confromant to the 802.3 specs.  Both vendor MMDs are
>>> definitely implemented, just not with anything conforming to 802.3.
>>
>> Yes, we know even for the NXP reference hardware, one of the phy's doesn't
>> probe out correctly because it doesn't respond to the ieee defined
>> registers. I think at this point, there really isn't anything we can do
>> about that unless we involve the (ACPI) firmware in currently nonstandard
>> behaviors.
>>
>> So, my goals here have been to first, not break anything, and then do a
>> slightly better job finding phy's that are (mostly?) responding correctly to
>> the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
>> have IEEE conformant phy's you should be ok". So, for your example phy, I
>> guess the immediate answer is "use DT" or "find a conformant phy", or even
>> "abstract it in the firmware and use a mailbox interface".
> 
> You haven't understood.  The PHY does conform for most of the MMDs,
> but there are a number that do not conform.
> 

Maybe I should clarify. This set is still terminating the search for a 
valid MMD device list on the first MMD that responds. It then probes the 
same ID registers of the flagged MMDs as before. What has changed is 
that we search higher into the MMD address space for a device list. So 
previously if a device didn't respond to MMD0-8 it was ignored. Now it 
needs to fail all of 0-31 to be ignored. Similarly for the ID's, if we 
find what appears to be a valid MMD device list, then we will probe not 
only the original 1-8 MMDs for IDs, but 1-31 MMDs for IDs.

So any device which presented a non zero, non "mostly ff's" devices list 
in 0-8 will get the same device list as before. Similarly we probe for 
ids at the same MMD addresses as before, with additional MMD detection 
 >8. This change means we pick up additional phys, and we detect the 
correct MMD ids for more devices.

The possible negative differences are in 7/11 where we transition a 
device which responded with an ID=0 to 0xFFFFFFFF which will cause the 
code to not request a phy/mmd driver for 00000000 (or potentially some 
bogus OUI's due to the 1fffffff difference). Assuming a valid phy id 
gets applied somewhere, the same phy driver will load, and presumably it 
will know what to do with MMD's in the devices list.

So, I don't see anything in your example above which changes the 
detected MMD devices. Potentially though, not only will you get the 
previous MMD Id's, but you might get a few new ones. Whether MMD2 is 
probed is going to depend on whether MMD0/1's devices list has the MMD2 
device bit set (you weren't clear in the description above).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 23:09               ` Russell King - ARM Linux admin
@ 2020-05-25 23:22                 ` Jeremy Linton
  2020-05-25 23:33                   ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 23:22 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On 5/25/20 6:09 PM, Russell King - ARM Linux admin wrote:
> On Mon, May 25, 2020 at 05:22:07PM -0500, Jeremy Linton wrote:
>> On 5/25/20 5:01 PM, Russell King - ARM Linux admin wrote:
>>> On Mon, May 25, 2020 at 04:51:16PM -0500, Jeremy Linton wrote:
>>>> Hi,
>>>>
>>>> On 5/25/20 5:06 AM, Russell King - ARM Linux admin wrote:
>>>>> On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
>>>>>>> On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
>>>>>>>> Until this point, we have been sanitizing the c22
>>>>>>>> regs presence bit out of all the MMD device lists.
>>>>>>>> This is incorrect as it causes the 0xFFFFFFFF checks
>>>>>>>> to incorrectly fail. Further, it turns out that we
>>>>>>>> want to utilize this flag to make a determination that
>>>>>>>> there is actually a phy at this location and we should
>>>>>>>> be accessing it using c22.
>>>>>>>>
>>>>>>>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>>>>>>>> ---
>>>>>>>>      drivers/net/phy/phy_device.c | 16 +++++++++++++---
>>>>>>>>      1 file changed, 13 insertions(+), 3 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>>>>>>> index f0761fa5e40b..2d677490ecab 100644
>>>>>>>> --- a/drivers/net/phy/phy_device.c
>>>>>>>> +++ b/drivers/net/phy/phy_device.c
>>>>>>>> @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>>>>>>>      		return -EIO;
>>>>>>>>      	*devices_in_package |= phy_reg;
>>>>>>>> -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>>>>>> -	*devices_in_package &= ~BIT(0);
>>>>>>>> -
>>>>>>>>      	return 0;
>>>>>>>>      }
>>>>>>>> @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>>>      	int i;
>>>>>>>>      	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>>>>>>>>      	u32 *devs = &c45_ids->devices_in_package;
>>>>>>>> +	bool c22_present = false;
>>>>>>>> +	bool valid_id = false;
>>>>>>>>      	/* Find first non-zero Devices In package. Device zero is reserved
>>>>>>>>      	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>>>>>>>> @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>>>      		return 0;
>>>>>>>>      	}
>>>>>>>> +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>>>>>> +	c22_present = *devs & BIT(0);
>>>>>>>> +	*devs &= ~BIT(0);
>>>>>>>> +
>>>>>>>>      	/* Now probe Device Identifiers for each device present. */
>>>>>>>>      	for (i = 1; i < num_ids; i++) {
>>>>>>>>      		if (!(c45_ids->devices_in_package & (1 << i)))
>>>>>>>> @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>>>      		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
>>>>>>>>      		if (ret < 0)
>>>>>>>>      			return ret;
>>>>>>>> +		if (valid_phy_id(c45_ids->device_ids[i]))
>>>>>>>> +			valid_id = true;
>>>>>>>
>>>>>>> Here you are using your "devices in package" validator to validate the
>>>>>>> PHY ID value.  One of the things it does is mask this value with
>>>>>>> 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
>>>>>>> looks completely wrong.
>>>>>>
>>>>>> I think in this case I was just using it like the comment in
>>>>>> get_phy_device() "if the phy_id is mostly F's, there is no device here".
>>>>>>
>>>>>> My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
>>>>>> that seem to indicate "bus ok, phy didn't respond".
>>>>>>
>>>>>> I just checked the OUI registration, and while there are a couple OUI's
>>>>>> registered that have a number of FFF's in them, none of those cases seems to
>>>>>> overlap sufficiently to cause this to throw them out. Plus a phy would also
>>>>>> have to have model+revision set to 'F's. So while might be possible, if
>>>>>> unlikely, at the moment I think the OUI registration keeps this from being a
>>>>>> problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
>>>>>> guarantees that the field cannot be all '1's due to the OUI having X & M
>>>>>> bits cleared. It sort of looks like the mapping is trying to lose those
>>>>>> bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
>>>>>> just read it three times cause it didn't make any sense).
>>>>>
>>>>> I should also note that we have at least one supported PHY where one
>>>>> of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
>>>>> odd numbered registers in one of the vendor MMDs for addresses 0
>>>>> through 0xefff - which has a bit set in the devices-in-package.
>>>>>
>>>>> It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
>>>>> devices-in-package bit is clear in most of the valid MMDs, so we
>>>>> shouldn't touch it.
>>>>>
>>>>> These reveal the problem of randomly probing MMDs - they can return
>>>>> unexpected values and not be as well behaved as we would like them to
>>>>> be.  Using register 8 to detect presence may be beneficial, but that
>>>>> may also introduce problems as we haven't used that before (and we
>>>>> don't know whether any PHY that wrong.)  I know at least the 88x3310
>>>>> gets it right for all except the vendor MMDs, where the low addresses
>>>>> appear non-confromant to the 802.3 specs.  Both vendor MMDs are
>>>>> definitely implemented, just not with anything conforming to 802.3.
>>>>
>>>> Yes, we know even for the NXP reference hardware, one of the phy's doesn't
>>>> probe out correctly because it doesn't respond to the ieee defined
>>>> registers. I think at this point, there really isn't anything we can do
>>>> about that unless we involve the (ACPI) firmware in currently nonstandard
>>>> behaviors.
>>>>
>>>> So, my goals here have been to first, not break anything, and then do a
>>>> slightly better job finding phy's that are (mostly?) responding correctly to
>>>> the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
>>>> have IEEE conformant phy's you should be ok". So, for your example phy, I
>>>> guess the immediate answer is "use DT" or "find a conformant phy", or even
>>>> "abstract it in the firmware and use a mailbox interface".
>>>
>>> You haven't understood.  The PHY does conform for most of the MMDs,
>>> but there are a number that do not conform.
>>
>> Probably...
>>
>> Except that i'm not sure how that is a problem at the moment, its still
>> going to trigger as a found phy, and walk the same mmd list as before
>> requesting drivers. Those drivers haven't changed their behavior so where is
>> the problem? If there is a problem its in 7/11 where things are getting
>> kicked due to seemingly invalid Ids.
>>
>> The 1/11 devices=0 case actually appears to be a bug i'm fixing because you
>> won't get an ID or a MMD list from that (before or after).
> 
> I think I've just flattened that argument in my immediately preceding
> reply on the Cortina situation; I think you've grossly misread that
> through not fully researching the history and then finding the
> existing users.
> 
> There is no bug that you are fixing from what I can see.

One of us is missing something,

The "cortina" solution is broken in the current kernel. That is because 
lines 726-742 are dead code due to line 693.

I believe I've understood the problem there, and corrected it in this 
set along with a few others, but its distinctly possible that isn't true.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 23:16             ` Jeremy Linton
@ 2020-05-25 23:30               ` Russell King - ARM Linux admin
  0 siblings, 0 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25 23:30 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Mon, May 25, 2020 at 06:16:18PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/25/20 5:01 PM, Russell King - ARM Linux admin wrote:
> > On Mon, May 25, 2020 at 04:51:16PM -0500, Jeremy Linton wrote:
> > > So, my goals here have been to first, not break anything, and then do a
> > > slightly better job finding phy's that are (mostly?) responding correctly to
> > > the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
> > > have IEEE conformant phy's you should be ok". So, for your example phy, I
> > > guess the immediate answer is "use DT" or "find a conformant phy", or even
> > > "abstract it in the firmware and use a mailbox interface".
> > 
> > You haven't understood.  The PHY does conform for most of the MMDs,
> > but there are a number that do not conform.
> > 
> 
> Maybe I should clarify. This set is still terminating the search for a valid
> MMD device list on the first MMD that responds. It then probes the same ID
> registers of the flagged MMDs as before. What has changed is that we search
> higher into the MMD address space for a device list. So previously if a
> device didn't respond to MMD0-8 it was ignored. Now it needs to fail all of
> 0-31 to be ignored. Similarly for the ID's, if we find what appears to be a
> valid MMD device list, then we will probe not only the original 1-8 MMDs for
> IDs, but 1-31 MMDs for IDs.

Clarification is not required; I understand what you're doing, but you
are not understanding my points.

For the 88x3310, your change means that the list of IDs for this PHY
will not only 0x002b09aX, 0x01410daX (the official IDs), but also
0x00000000 and 0xfffe0000 from MMD 30 and 31 respectively, which are
not real IDs.  That's two incorrect IDs that should actually not be
there.

Here's what the first few registers from MMD 30 and 31 look like on
this PHY:

MMD30:
 Addr  Data
 0000  0000 0000 0000 0000 0000 0000 0000 0000
 0008  0000 0000 0000 0000 0000 0000 0000 0000
 0010  0000 0000 0000 0000 0000 0000 0000 0000

MMD31:
 0000  fffe 0000 fffe 0000 fffe 0000 fffe 0000
 0008  fffe 0000 fffe 0000 fffe 0000 fffe 0000
 0010  fffe 0000 fffe 0000 fffe 0000 fffe 0000

We've got away with it so far on several counts:
1. The code doesn't probe that high for IDs.
2. We have no driver that may match those IDs.

You're taking away (1), so now all it takes is for condition (2) to
be broken, and we can end up with a regression if another driver
appears that may match using those.

So, I would suggest that you avoid reading IDs from MMD 30/31, or
maybe only read the ID from MMDs > 8 if register 8 indicates that
there is a device present at that MMD.  That would be compliant
with IEEE 802.3, preserve our existing behaviour, while allowing
us to expand the IDs that we probe for to have a better chance of
only detecting truely valid IDs.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 23:22                 ` Jeremy Linton
@ 2020-05-25 23:33                   ` Russell King - ARM Linux admin
  2020-05-25 23:42                     ` Jeremy Linton
  0 siblings, 1 reply; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25 23:33 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Mon, May 25, 2020 at 06:22:19PM -0500, Jeremy Linton wrote:
> On 5/25/20 6:09 PM, Russell King - ARM Linux admin wrote:
> > On Mon, May 25, 2020 at 05:22:07PM -0500, Jeremy Linton wrote:
> > > On 5/25/20 5:01 PM, Russell King - ARM Linux admin wrote:
> > > > On Mon, May 25, 2020 at 04:51:16PM -0500, Jeremy Linton wrote:
> > > > > Hi,
> > > > > 
> > > > > On 5/25/20 5:06 AM, Russell King - ARM Linux admin wrote:
> > > > > > On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
> > > > > > > > On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
> > > > > > > > > Until this point, we have been sanitizing the c22
> > > > > > > > > regs presence bit out of all the MMD device lists.
> > > > > > > > > This is incorrect as it causes the 0xFFFFFFFF checks
> > > > > > > > > to incorrectly fail. Further, it turns out that we
> > > > > > > > > want to utilize this flag to make a determination that
> > > > > > > > > there is actually a phy at this location and we should
> > > > > > > > > be accessing it using c22.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > > > > > > > ---
> > > > > > > > >      drivers/net/phy/phy_device.c | 16 +++++++++++++---
> > > > > > > > >      1 file changed, 13 insertions(+), 3 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > > > > > > > > index f0761fa5e40b..2d677490ecab 100644
> > > > > > > > > --- a/drivers/net/phy/phy_device.c
> > > > > > > > > +++ b/drivers/net/phy/phy_device.c
> > > > > > > > > @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
> > > > > > > > >      		return -EIO;
> > > > > > > > >      	*devices_in_package |= phy_reg;
> > > > > > > > > -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > > > > > > > -	*devices_in_package &= ~BIT(0);
> > > > > > > > > -
> > > > > > > > >      	return 0;
> > > > > > > > >      }
> > > > > > > > > @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > > > > > >      	int i;
> > > > > > > > >      	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
> > > > > > > > >      	u32 *devs = &c45_ids->devices_in_package;
> > > > > > > > > +	bool c22_present = false;
> > > > > > > > > +	bool valid_id = false;
> > > > > > > > >      	/* Find first non-zero Devices In package. Device zero is reserved
> > > > > > > > >      	 * for 802.3 c45 complied PHYs, so don't probe it at first.
> > > > > > > > > @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > > > > > >      		return 0;
> > > > > > > > >      	}
> > > > > > > > > +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > > > > > > > +	c22_present = *devs & BIT(0);
> > > > > > > > > +	*devs &= ~BIT(0);
> > > > > > > > > +
> > > > > > > > >      	/* Now probe Device Identifiers for each device present. */
> > > > > > > > >      	for (i = 1; i < num_ids; i++) {
> > > > > > > > >      		if (!(c45_ids->devices_in_package & (1 << i)))
> > > > > > > > > @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > > > > > >      		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
> > > > > > > > >      		if (ret < 0)
> > > > > > > > >      			return ret;
> > > > > > > > > +		if (valid_phy_id(c45_ids->device_ids[i]))
> > > > > > > > > +			valid_id = true;
> > > > > > > > 
> > > > > > > > Here you are using your "devices in package" validator to validate the
> > > > > > > > PHY ID value.  One of the things it does is mask this value with
> > > > > > > > 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
> > > > > > > > looks completely wrong.
> > > > > > > 
> > > > > > > I think in this case I was just using it like the comment in
> > > > > > > get_phy_device() "if the phy_id is mostly F's, there is no device here".
> > > > > > > 
> > > > > > > My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
> > > > > > > that seem to indicate "bus ok, phy didn't respond".
> > > > > > > 
> > > > > > > I just checked the OUI registration, and while there are a couple OUI's
> > > > > > > registered that have a number of FFF's in them, none of those cases seems to
> > > > > > > overlap sufficiently to cause this to throw them out. Plus a phy would also
> > > > > > > have to have model+revision set to 'F's. So while might be possible, if
> > > > > > > unlikely, at the moment I think the OUI registration keeps this from being a
> > > > > > > problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
> > > > > > > guarantees that the field cannot be all '1's due to the OUI having X & M
> > > > > > > bits cleared. It sort of looks like the mapping is trying to lose those
> > > > > > > bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
> > > > > > > just read it three times cause it didn't make any sense).
> > > > > > 
> > > > > > I should also note that we have at least one supported PHY where one
> > > > > > of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
> > > > > > odd numbered registers in one of the vendor MMDs for addresses 0
> > > > > > through 0xefff - which has a bit set in the devices-in-package.
> > > > > > 
> > > > > > It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
> > > > > > devices-in-package bit is clear in most of the valid MMDs, so we
> > > > > > shouldn't touch it.
> > > > > > 
> > > > > > These reveal the problem of randomly probing MMDs - they can return
> > > > > > unexpected values and not be as well behaved as we would like them to
> > > > > > be.  Using register 8 to detect presence may be beneficial, but that
> > > > > > may also introduce problems as we haven't used that before (and we
> > > > > > don't know whether any PHY that wrong.)  I know at least the 88x3310
> > > > > > gets it right for all except the vendor MMDs, where the low addresses
> > > > > > appear non-confromant to the 802.3 specs.  Both vendor MMDs are
> > > > > > definitely implemented, just not with anything conforming to 802.3.
> > > > > 
> > > > > Yes, we know even for the NXP reference hardware, one of the phy's doesn't
> > > > > probe out correctly because it doesn't respond to the ieee defined
> > > > > registers. I think at this point, there really isn't anything we can do
> > > > > about that unless we involve the (ACPI) firmware in currently nonstandard
> > > > > behaviors.
> > > > > 
> > > > > So, my goals here have been to first, not break anything, and then do a
> > > > > slightly better job finding phy's that are (mostly?) responding correctly to
> > > > > the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
> > > > > have IEEE conformant phy's you should be ok". So, for your example phy, I
> > > > > guess the immediate answer is "use DT" or "find a conformant phy", or even
> > > > > "abstract it in the firmware and use a mailbox interface".
> > > > 
> > > > You haven't understood.  The PHY does conform for most of the MMDs,
> > > > but there are a number that do not conform.
> > > 
> > > Probably...
> > > 
> > > Except that i'm not sure how that is a problem at the moment, its still
> > > going to trigger as a found phy, and walk the same mmd list as before
> > > requesting drivers. Those drivers haven't changed their behavior so where is
> > > the problem? If there is a problem its in 7/11 where things are getting
> > > kicked due to seemingly invalid Ids.
> > > 
> > > The 1/11 devices=0 case actually appears to be a bug i'm fixing because you
> > > won't get an ID or a MMD list from that (before or after).
> > 
> > I think I've just flattened that argument in my immediately preceding
> > reply on the Cortina situation; I think you've grossly misread that
> > through not fully researching the history and then finding the
> > existing users.
> > 
> > There is no bug that you are fixing from what I can see.
> 
> One of us is missing something,
> 
> The "cortina" solution is broken in the current kernel. That is because
> lines 726-742 are dead code due to line 693.
> 
> I believe I've understood the problem there, and corrected it in this set
> along with a few others, but its distinctly possible that isn't true.

The code you refer to above is NOT used on the platforms that I have
identified use the Cortina PHY.  If this code is not used, it has not
caused any issue, and there is no breakage due to the change you are
referring to.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 23:33                   ` Russell King - ARM Linux admin
@ 2020-05-25 23:42                     ` Jeremy Linton
  2020-05-25 23:46                       ` Andrew Lunn
  2020-05-25 23:57                       ` Russell King - ARM Linux admin
  0 siblings, 2 replies; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 23:42 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,

On 5/25/20 6:33 PM, Russell King - ARM Linux admin wrote:
> On Mon, May 25, 2020 at 06:22:19PM -0500, Jeremy Linton wrote:
>> On 5/25/20 6:09 PM, Russell King - ARM Linux admin wrote:
>>> On Mon, May 25, 2020 at 05:22:07PM -0500, Jeremy Linton wrote:
>>>> On 5/25/20 5:01 PM, Russell King - ARM Linux admin wrote:
>>>>> On Mon, May 25, 2020 at 04:51:16PM -0500, Jeremy Linton wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 5/25/20 5:06 AM, Russell King - ARM Linux admin wrote:
>>>>>>> On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
>>>>>>>>> On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
>>>>>>>>>> Until this point, we have been sanitizing the c22
>>>>>>>>>> regs presence bit out of all the MMD device lists.
>>>>>>>>>> This is incorrect as it causes the 0xFFFFFFFF checks
>>>>>>>>>> to incorrectly fail. Further, it turns out that we
>>>>>>>>>> want to utilize this flag to make a determination that
>>>>>>>>>> there is actually a phy at this location and we should
>>>>>>>>>> be accessing it using c22.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>>>>>>>>>> ---
>>>>>>>>>>       drivers/net/phy/phy_device.c | 16 +++++++++++++---
>>>>>>>>>>       1 file changed, 13 insertions(+), 3 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>>>>>>>>> index f0761fa5e40b..2d677490ecab 100644
>>>>>>>>>> --- a/drivers/net/phy/phy_device.c
>>>>>>>>>> +++ b/drivers/net/phy/phy_device.c
>>>>>>>>>> @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
>>>>>>>>>>       		return -EIO;
>>>>>>>>>>       	*devices_in_package |= phy_reg;
>>>>>>>>>> -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>>>>>>>> -	*devices_in_package &= ~BIT(0);
>>>>>>>>>> -
>>>>>>>>>>       	return 0;
>>>>>>>>>>       }
>>>>>>>>>> @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>>>>>       	int i;
>>>>>>>>>>       	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
>>>>>>>>>>       	u32 *devs = &c45_ids->devices_in_package;
>>>>>>>>>> +	bool c22_present = false;
>>>>>>>>>> +	bool valid_id = false;
>>>>>>>>>>       	/* Find first non-zero Devices In package. Device zero is reserved
>>>>>>>>>>       	 * for 802.3 c45 complied PHYs, so don't probe it at first.
>>>>>>>>>> @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>>>>>       		return 0;
>>>>>>>>>>       	}
>>>>>>>>>> +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
>>>>>>>>>> +	c22_present = *devs & BIT(0);
>>>>>>>>>> +	*devs &= ~BIT(0);
>>>>>>>>>> +
>>>>>>>>>>       	/* Now probe Device Identifiers for each device present. */
>>>>>>>>>>       	for (i = 1; i < num_ids; i++) {
>>>>>>>>>>       		if (!(c45_ids->devices_in_package & (1 << i)))
>>>>>>>>>> @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
>>>>>>>>>>       		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
>>>>>>>>>>       		if (ret < 0)
>>>>>>>>>>       			return ret;
>>>>>>>>>> +		if (valid_phy_id(c45_ids->device_ids[i]))
>>>>>>>>>> +			valid_id = true;
>>>>>>>>>
>>>>>>>>> Here you are using your "devices in package" validator to validate the
>>>>>>>>> PHY ID value.  One of the things it does is mask this value with
>>>>>>>>> 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
>>>>>>>>> looks completely wrong.
>>>>>>>>
>>>>>>>> I think in this case I was just using it like the comment in
>>>>>>>> get_phy_device() "if the phy_id is mostly F's, there is no device here".
>>>>>>>>
>>>>>>>> My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
>>>>>>>> that seem to indicate "bus ok, phy didn't respond".
>>>>>>>>
>>>>>>>> I just checked the OUI registration, and while there are a couple OUI's
>>>>>>>> registered that have a number of FFF's in them, none of those cases seems to
>>>>>>>> overlap sufficiently to cause this to throw them out. Plus a phy would also
>>>>>>>> have to have model+revision set to 'F's. So while might be possible, if
>>>>>>>> unlikely, at the moment I think the OUI registration keeps this from being a
>>>>>>>> problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
>>>>>>>> guarantees that the field cannot be all '1's due to the OUI having X & M
>>>>>>>> bits cleared. It sort of looks like the mapping is trying to lose those
>>>>>>>> bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
>>>>>>>> just read it three times cause it didn't make any sense).
>>>>>>>
>>>>>>> I should also note that we have at least one supported PHY where one
>>>>>>> of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
>>>>>>> odd numbered registers in one of the vendor MMDs for addresses 0
>>>>>>> through 0xefff - which has a bit set in the devices-in-package.
>>>>>>>
>>>>>>> It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
>>>>>>> devices-in-package bit is clear in most of the valid MMDs, so we
>>>>>>> shouldn't touch it.
>>>>>>>
>>>>>>> These reveal the problem of randomly probing MMDs - they can return
>>>>>>> unexpected values and not be as well behaved as we would like them to
>>>>>>> be.  Using register 8 to detect presence may be beneficial, but that
>>>>>>> may also introduce problems as we haven't used that before (and we
>>>>>>> don't know whether any PHY that wrong.)  I know at least the 88x3310
>>>>>>> gets it right for all except the vendor MMDs, where the low addresses
>>>>>>> appear non-confromant to the 802.3 specs.  Both vendor MMDs are
>>>>>>> definitely implemented, just not with anything conforming to 802.3.
>>>>>>
>>>>>> Yes, we know even for the NXP reference hardware, one of the phy's doesn't
>>>>>> probe out correctly because it doesn't respond to the ieee defined
>>>>>> registers. I think at this point, there really isn't anything we can do
>>>>>> about that unless we involve the (ACPI) firmware in currently nonstandard
>>>>>> behaviors.
>>>>>>
>>>>>> So, my goals here have been to first, not break anything, and then do a
>>>>>> slightly better job finding phy's that are (mostly?) responding correctly to
>>>>>> the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
>>>>>> have IEEE conformant phy's you should be ok". So, for your example phy, I
>>>>>> guess the immediate answer is "use DT" or "find a conformant phy", or even
>>>>>> "abstract it in the firmware and use a mailbox interface".
>>>>>
>>>>> You haven't understood.  The PHY does conform for most of the MMDs,
>>>>> but there are a number that do not conform.
>>>>
>>>> Probably...
>>>>
>>>> Except that i'm not sure how that is a problem at the moment, its still
>>>> going to trigger as a found phy, and walk the same mmd list as before
>>>> requesting drivers. Those drivers haven't changed their behavior so where is
>>>> the problem? If there is a problem its in 7/11 where things are getting
>>>> kicked due to seemingly invalid Ids.
>>>>
>>>> The 1/11 devices=0 case actually appears to be a bug i'm fixing because you
>>>> won't get an ID or a MMD list from that (before or after).
>>>
>>> I think I've just flattened that argument in my immediately preceding
>>> reply on the Cortina situation; I think you've grossly misread that
>>> through not fully researching the history and then finding the
>>> existing users.
>>>
>>> There is no bug that you are fixing from what I can see.
>>
>> One of us is missing something,
>>
>> The "cortina" solution is broken in the current kernel. That is because
>> lines 726-742 are dead code due to line 693.
>>
>> I believe I've understood the problem there, and corrected it in this set
>> along with a few others, but its distinctly possible that isn't true.
> 
> The code you refer to above is NOT used on the platforms that I have
> identified use the Cortina PHY.  If this code is not used, it has not
> caused any issue, and there is no breakage due to the change you are
> referring to.
> 
Right, which is what I sort of expected. Because its falling back to a 
device list of 0xFFFFFFFF, which means probe every single MMD.

Combined with the lack of filtering means that your getting a bunch of 
MMD IDs that potentially are invalid, along with any that happen to be 
valid. Its that behavior (and some others) which were what blew this set 
up from a couple lines of tweaks into this mess.

I don't really see a way to guess at all the "wrong" ids that are being 
pushed into the system. Which is why I started to think about a "strict" 
mode later in the set. Maybe at this point the only way around some of 
these bugs/side effects/etc is just a second c45 probe routine if we 
don't think its possible to implement a variable strictness scanner in 
this code path without losing phys that previously were detected.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 23:12                 ` Andrew Lunn
@ 2020-05-25 23:46                   ` Jeremy Linton
  2020-05-25 23:47                     ` Andrew Lunn
  0 siblings, 1 reply; 61+ messages in thread
From: Jeremy Linton @ 2020-05-25 23:46 UTC (permalink / raw)
  To: Andrew Lunn, Russell King - ARM Linux admin
  Cc: netdev, davem, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

Hi,


On 5/25/20 6:12 PM, Andrew Lunn wrote:
>> arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
>> arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
>> arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
>> arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
>> arch/powerpc/boot/dts/fsl/t2080rdb.dts: compatible = "ethernet-phy-id13e5.1002";
>> arch/powerpc/boot/dts/fsl/t2080rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> 
> Hi Jeremy
> 
> You are doing this work for NXP right? Maybe you can ask them to go
> searching in the cellar and find you one of these boards?

Yes, thats a good idea. I've been talking to various parties to let me 
run this code on some of their "odd" devices.



^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 23:42                     ` Jeremy Linton
@ 2020-05-25 23:46                       ` Andrew Lunn
  2020-05-25 23:57                       ` Russell King - ARM Linux admin
  1 sibling, 0 replies; 61+ messages in thread
From: Andrew Lunn @ 2020-05-25 23:46 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Russell King - ARM Linux admin, netdev, davem, f.fainelli,
	hkallweit1, madalin.bucur, calvin.johnson, linux-kernel

> Right, which is what I sort of expected. Because its falling back to a
> device list of 0xFFFFFFFF, which means probe every single MMD.
> 
> Combined with the lack of filtering means that your getting a bunch of MMD
> IDs that potentially are invalid, along with any that happen to be valid.
> Its that behavior (and some others) which were what blew this set up from a
> couple lines of tweaks into this mess.
> 
> I don't really see a way to guess at all the "wrong" ids that are being
> pushed into the system. Which is why I started to think about a "strict"
> mode later in the set. Maybe at this point the only way around some of these
> bugs/side effects/etc is just a second c45 probe routine if we don't think
> its possible to implement a variable strictness scanner in this code path
> without losing phys that previously were detected.

Hi Jeremy

I really think we need to find one of those boards which have a
cortina and see what it actually does. Can you get in contact with NXP
and see if they can find one?

    Thanks
        Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 23:46                   ` Jeremy Linton
@ 2020-05-25 23:47                     ` Andrew Lunn
  0 siblings, 0 replies; 61+ messages in thread
From: Andrew Lunn @ 2020-05-25 23:47 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Russell King - ARM Linux admin, netdev, davem, f.fainelli,
	hkallweit1, madalin.bucur, calvin.johnson, linux-kernel

On Mon, May 25, 2020 at 06:46:10PM -0500, Jeremy Linton wrote:
> Hi,
> 
> 
> On 5/25/20 6:12 PM, Andrew Lunn wrote:
> > > arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> > > arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> > > arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> > > arch/powerpc/boot/dts/fsl/t4240rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> > > arch/powerpc/boot/dts/fsl/t2080rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> > > arch/powerpc/boot/dts/fsl/t2080rdb.dts: compatible = "ethernet-phy-id13e5.1002";
> > 
> > Hi Jeremy
> > 
> > You are doing this work for NXP right? Maybe you can ask them to go
> > searching in the cellar and find you one of these boards?
> 
> Yes, thats a good idea. I've been talking to various parties to let me run
> this code on some of their "odd" devices.

O.K. great.

Then i think we should all stop emailing for a while until we know
more.

	Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [RFC 04/11] net: phy: Handle c22 regs presence better
  2020-05-25 23:42                     ` Jeremy Linton
  2020-05-25 23:46                       ` Andrew Lunn
@ 2020-05-25 23:57                       ` Russell King - ARM Linux admin
  1 sibling, 0 replies; 61+ messages in thread
From: Russell King - ARM Linux admin @ 2020-05-25 23:57 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: netdev, davem, andrew, f.fainelli, hkallweit1, madalin.bucur,
	calvin.johnson, linux-kernel

On Mon, May 25, 2020 at 06:42:50PM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 5/25/20 6:33 PM, Russell King - ARM Linux admin wrote:
> > On Mon, May 25, 2020 at 06:22:19PM -0500, Jeremy Linton wrote:
> > > On 5/25/20 6:09 PM, Russell King - ARM Linux admin wrote:
> > > > On Mon, May 25, 2020 at 05:22:07PM -0500, Jeremy Linton wrote:
> > > > > On 5/25/20 5:01 PM, Russell King - ARM Linux admin wrote:
> > > > > > On Mon, May 25, 2020 at 04:51:16PM -0500, Jeremy Linton wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > On 5/25/20 5:06 AM, Russell King - ARM Linux admin wrote:
> > > > > > > > On Sun, May 24, 2020 at 10:34:13PM -0500, Jeremy Linton wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > On 5/23/20 1:37 PM, Russell King - ARM Linux admin wrote:
> > > > > > > > > > On Fri, May 22, 2020 at 04:30:52PM -0500, Jeremy Linton wrote:
> > > > > > > > > > > Until this point, we have been sanitizing the c22
> > > > > > > > > > > regs presence bit out of all the MMD device lists.
> > > > > > > > > > > This is incorrect as it causes the 0xFFFFFFFF checks
> > > > > > > > > > > to incorrectly fail. Further, it turns out that we
> > > > > > > > > > > want to utilize this flag to make a determination that
> > > > > > > > > > > there is actually a phy at this location and we should
> > > > > > > > > > > be accessing it using c22.
> > > > > > > > > > > 
> > > > > > > > > > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > > > > > > > > > ---
> > > > > > > > > > >       drivers/net/phy/phy_device.c | 16 +++++++++++++---
> > > > > > > > > > >       1 file changed, 13 insertions(+), 3 deletions(-)
> > > > > > > > > > > 
> > > > > > > > > > > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > > > > > > > > > > index f0761fa5e40b..2d677490ecab 100644
> > > > > > > > > > > --- a/drivers/net/phy/phy_device.c
> > > > > > > > > > > +++ b/drivers/net/phy/phy_device.c
> > > > > > > > > > > @@ -689,9 +689,6 @@ static int get_phy_c45_devs_in_pkg(struct mii_bus *bus, int addr, int dev_addr,
> > > > > > > > > > >       		return -EIO;
> > > > > > > > > > >       	*devices_in_package |= phy_reg;
> > > > > > > > > > > -	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > > > > > > > > > -	*devices_in_package &= ~BIT(0);
> > > > > > > > > > > -
> > > > > > > > > > >       	return 0;
> > > > > > > > > > >       }
> > > > > > > > > > > @@ -742,6 +739,8 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > > > > > > > >       	int i;
> > > > > > > > > > >       	const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
> > > > > > > > > > >       	u32 *devs = &c45_ids->devices_in_package;
> > > > > > > > > > > +	bool c22_present = false;
> > > > > > > > > > > +	bool valid_id = false;
> > > > > > > > > > >       	/* Find first non-zero Devices In package. Device zero is reserved
> > > > > > > > > > >       	 * for 802.3 c45 complied PHYs, so don't probe it at first.
> > > > > > > > > > > @@ -770,6 +769,10 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > > > > > > > >       		return 0;
> > > > > > > > > > >       	}
> > > > > > > > > > > +	/* Bit 0 doesn't represent a device, it indicates c22 regs presence */
> > > > > > > > > > > +	c22_present = *devs & BIT(0);
> > > > > > > > > > > +	*devs &= ~BIT(0);
> > > > > > > > > > > +
> > > > > > > > > > >       	/* Now probe Device Identifiers for each device present. */
> > > > > > > > > > >       	for (i = 1; i < num_ids; i++) {
> > > > > > > > > > >       		if (!(c45_ids->devices_in_package & (1 << i)))
> > > > > > > > > > > @@ -778,6 +781,13 @@ static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
> > > > > > > > > > >       		ret = _get_phy_id(bus, addr, i, &c45_ids->device_ids[i], true);
> > > > > > > > > > >       		if (ret < 0)
> > > > > > > > > > >       			return ret;
> > > > > > > > > > > +		if (valid_phy_id(c45_ids->device_ids[i]))
> > > > > > > > > > > +			valid_id = true;
> > > > > > > > > > 
> > > > > > > > > > Here you are using your "devices in package" validator to validate the
> > > > > > > > > > PHY ID value.  One of the things it does is mask this value with
> > > > > > > > > > 0x1fffffff.  That means you lose some of the vendor OUI.  To me, this
> > > > > > > > > > looks completely wrong.
> > > > > > > > > 
> > > > > > > > > I think in this case I was just using it like the comment in
> > > > > > > > > get_phy_device() "if the phy_id is mostly F's, there is no device here".
> > > > > > > > > 
> > > > > > > > > My understanding is that the code is trying to avoid the 0xFFFFFFFF returns
> > > > > > > > > that seem to indicate "bus ok, phy didn't respond".
> > > > > > > > > 
> > > > > > > > > I just checked the OUI registration, and while there are a couple OUI's
> > > > > > > > > registered that have a number of FFF's in them, none of those cases seems to
> > > > > > > > > overlap sufficiently to cause this to throw them out. Plus a phy would also
> > > > > > > > > have to have model+revision set to 'F's. So while might be possible, if
> > > > > > > > > unlikely, at the moment I think the OUI registration keeps this from being a
> > > > > > > > > problem. Particularly, if i'm reading the mapping correctly, the OUI mapping
> > > > > > > > > guarantees that the field cannot be all '1's due to the OUI having X & M
> > > > > > > > > bits cleared. It sort of looks like the mapping is trying to lose those
> > > > > > > > > bits, by tossing bit 1 & 2, but the X & M are in the wrong octet (AFAIK, I
> > > > > > > > > just read it three times cause it didn't make any sense).
> > > > > > > > 
> > > > > > > > I should also note that we have at least one supported PHY where one
> > > > > > > > of the MMDs returns 0xfffe for even numbered registers and 0x0000 for
> > > > > > > > odd numbered registers in one of the vendor MMDs for addresses 0
> > > > > > > > through 0xefff - which has a bit set in the devices-in-package.
> > > > > > > > 
> > > > > > > > It also returns 0x0082 for almost every register in MMD 2, but MMD 2's
> > > > > > > > devices-in-package bit is clear in most of the valid MMDs, so we
> > > > > > > > shouldn't touch it.
> > > > > > > > 
> > > > > > > > These reveal the problem of randomly probing MMDs - they can return
> > > > > > > > unexpected values and not be as well behaved as we would like them to
> > > > > > > > be.  Using register 8 to detect presence may be beneficial, but that
> > > > > > > > may also introduce problems as we haven't used that before (and we
> > > > > > > > don't know whether any PHY that wrong.)  I know at least the 88x3310
> > > > > > > > gets it right for all except the vendor MMDs, where the low addresses
> > > > > > > > appear non-confromant to the 802.3 specs.  Both vendor MMDs are
> > > > > > > > definitely implemented, just not with anything conforming to 802.3.
> > > > > > > 
> > > > > > > Yes, we know even for the NXP reference hardware, one of the phy's doesn't
> > > > > > > probe out correctly because it doesn't respond to the ieee defined
> > > > > > > registers. I think at this point, there really isn't anything we can do
> > > > > > > about that unless we involve the (ACPI) firmware in currently nonstandard
> > > > > > > behaviors.
> > > > > > > 
> > > > > > > So, my goals here have been to first, not break anything, and then do a
> > > > > > > slightly better job finding phy's that are (mostly?) responding correctly to
> > > > > > > the 802.3 spec. So we can say "if you hardware is ACPI conformant, and you
> > > > > > > have IEEE conformant phy's you should be ok". So, for your example phy, I
> > > > > > > guess the immediate answer is "use DT" or "find a conformant phy", or even
> > > > > > > "abstract it in the firmware and use a mailbox interface".
> > > > > > 
> > > > > > You haven't understood.  The PHY does conform for most of the MMDs,
> > > > > > but there are a number that do not conform.
> > > > > 
> > > > > Probably...
> > > > > 
> > > > > Except that i'm not sure how that is a problem at the moment, its still
> > > > > going to trigger as a found phy, and walk the same mmd list as before
> > > > > requesting drivers. Those drivers haven't changed their behavior so where is
> > > > > the problem? If there is a problem its in 7/11 where things are getting
> > > > > kicked due to seemingly invalid Ids.
> > > > > 
> > > > > The 1/11 devices=0 case actually appears to be a bug i'm fixing because you
> > > > > won't get an ID or a MMD list from that (before or after).
> > > > 
> > > > I think I've just flattened that argument in my immediately preceding
> > > > reply on the Cortina situation; I think you've grossly misread that
> > > > through not fully researching the history and then finding the
> > > > existing users.
> > > > 
> > > > There is no bug that you are fixing from what I can see.
> > > 
> > > One of us is missing something,
> > > 
> > > The "cortina" solution is broken in the current kernel. That is because
> > > lines 726-742 are dead code due to line 693.
> > > 
> > > I believe I've understood the problem there, and corrected it in this set
> > > along with a few others, but its distinctly possible that isn't true.
> > 
> > The code you refer to above is NOT used on the platforms that I have
> > identified use the Cortina PHY.  If this code is not used, it has not
> > caused any issue, and there is no breakage due to the change you are
> > referring to.
> > 
> Right, which is what I sort of expected. Because its falling back to a
> device list of 0xFFFFFFFF, which means probe every single MMD.

No.  no no no no no.

In the platforms that I have identified, the Cortina PHY will be
created by the DT code (drivers/of/of_mdio.c).  The PHY device
will be created by phy_device_create() with is_c45 *false*.
phydev->c45_ids will actually end up containing all-zeros.
So, there is no list of MMDs in this case.

The phy_device_create() path does not call get_phy_c45_ids().
This code is not run for any of the platforms I've identified
for a Cortina PHY.

The workaround for the devices-in-package was added back in
2015, two years _before_ there was a Cortina PHY driver, when
the phylib support for Clause 45 PHYs was in its infancy - there
was something really basic that didn't care what ID the PHY was,
just assumed it was a 10G PHY, no configuration of it, and just
reported link up/down.  So, back then IDs were mostly meaningless
for Clause 45 PHYs.

In 2017, a Cortina PHY driver was added, and we now have some
platforms that use this, and they _totally_ avoid this code path.

The workaround is likely obsolete and redundant, but we've no way
to know if removing it will create a regression.

In any case, for the reasons I've already clearly set out in one of
my previous emails analysing the Cortina situation (which you seem
to have ignored), pointing out that even the MMD 0 register set is
not compatible and would likely not reveal a valid ID, I think it's
highly likely that Cortina PHYs did not work for very long between
2015 and 2017, but continue to work fine today.

Please, rather than immediately writing yet another email trying to
"clarify" something in reply to this email, please go away and think
about some of the points I've raised, read the code as it stands,
not only in phylib but also in drivers/of.  Look at the device tree
descriptions for the boards I've pointed out.  Analyse what the code
would do.  It will help you immensely to have that understanding,
and I'm sure you will come to the same conclusion I have that the
workaround we see in get_phy_c45_ids() is likely obsolete.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2020-05-25 23:57 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-22 21:30 [RFC 00/11] Make C45 autoprobe more robust Jeremy Linton
2020-05-22 21:30 ` [RFC 01/11] net: phy: Don't report success if devices weren't found Jeremy Linton
2020-05-23 18:20   ` Russell King - ARM Linux admin
2020-05-25  2:46     ` Jeremy Linton
2020-05-25  9:45       ` Russell King - ARM Linux admin
2020-05-25 21:02         ` Jeremy Linton
2020-05-25 21:07           ` Russell King - ARM Linux admin
2020-05-25 21:59             ` Jeremy Linton
2020-05-22 21:30 ` [RFC 02/11] net: phy: Simplify MMD device list termination Jeremy Linton
2020-05-23 18:36   ` Russell King - ARM Linux admin
2020-05-25  2:48     ` Jeremy Linton
2020-05-25  8:09       ` Russell King - ARM Linux admin
2020-05-22 21:30 ` [RFC 03/11] net: phy: refactor c45 phy identification sequence Jeremy Linton
2020-05-23 15:28   ` Andrew Lunn
2020-05-23 17:16     ` Jeremy Linton
2020-05-23 17:32     ` Jeremy Linton
2020-05-23 19:12       ` Russell King - ARM Linux admin
2020-05-23 18:30   ` Russell King - ARM Linux admin
2020-05-23 19:51     ` Andrew Lunn
2020-05-23 20:01       ` Russell King - ARM Linux admin
2020-05-25  2:37         ` Jeremy Linton
2020-05-22 21:30 ` [RFC 04/11] net: phy: Handle c22 regs presence better Jeremy Linton
2020-05-23 18:37   ` Russell King - ARM Linux admin
2020-05-25  3:34     ` Jeremy Linton
2020-05-25  9:53       ` Russell King - ARM Linux admin
2020-05-25 10:06       ` Russell King - ARM Linux admin
2020-05-25 21:51         ` Jeremy Linton
2020-05-25 22:01           ` Russell King - ARM Linux admin
2020-05-25 22:22             ` Jeremy Linton
2020-05-25 23:09               ` Russell King - ARM Linux admin
2020-05-25 23:22                 ` Jeremy Linton
2020-05-25 23:33                   ` Russell King - ARM Linux admin
2020-05-25 23:42                     ` Jeremy Linton
2020-05-25 23:46                       ` Andrew Lunn
2020-05-25 23:57                       ` Russell King - ARM Linux admin
2020-05-25 23:16             ` Jeremy Linton
2020-05-25 23:30               ` Russell King - ARM Linux admin
2020-05-25 22:06           ` Andrew Lunn
2020-05-25 22:17             ` Jeremy Linton
2020-05-25 23:06               ` Andrew Lunn
2020-05-25 23:07               ` Russell King - ARM Linux admin
2020-05-25 23:12                 ` Andrew Lunn
2020-05-25 23:46                   ` Jeremy Linton
2020-05-25 23:47                     ` Andrew Lunn
2020-05-22 21:30 ` [RFC 05/11] net: phy: Scan the entire MMD device space Jeremy Linton
2020-05-22 21:30 ` [RFC 06/11] net: phy: Hoist no phy detected state Jeremy Linton
2020-05-22 21:30 ` [RFC 07/11] net: phy: reset invalid phy reads of 0 back to 0xffffffff Jeremy Linton
2020-05-23 18:44   ` Russell King - ARM Linux admin
2020-05-25  4:20     ` Jeremy Linton
2020-05-25  8:20       ` Russell King - ARM Linux admin
2020-05-22 21:30 ` [RFC 08/11] net: phy: Allow mdio buses to auto-probe c45 devices Jeremy Linton
2020-05-24 14:44   ` Andrew Lunn
2020-05-25  4:28     ` Jeremy Linton
2020-05-25  8:25       ` Russell King - ARM Linux admin
2020-05-25 13:43         ` Andrew Lunn
2020-05-25 22:09         ` Jeremy Linton
2020-05-25 22:41           ` Russell King - ARM Linux admin
2020-05-22 21:30 ` [RFC 09/11] net: phy: Refuse to consider phy_id=0 a valid phy Jeremy Linton
2020-05-22 21:30 ` [RFC 10/11] net: example acpize xgmac_mdio Jeremy Linton
2020-05-23 18:48   ` Russell King - ARM Linux admin
2020-05-22 21:30 ` [RFC 11/11] net: example xgmac enable extended scanning Jeremy Linton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).