* Broken ethernet on SolidRun cubox-i
@ 2020-12-26 12:18 Michael Walle
  2020-12-26 12:34 ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Walle @ 2020-12-26 12:18 UTC (permalink / raw)
  To: Russell King
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

Hi Russell,

Ethernet is broken on 5.8+ kernels for "some" cubox-i boards. Some users
reached out to me because they suspected my commit 0465d8f830dc ("net: phy:
at803x: fix PHY ID masks") of causing the breakage, but that turned out
not to be the case. Instead, on the faulty boards the PHY always sits at
PHY address 4:

[    4.008024] libphy: fec_enet_mii_bus: probed
[    4.008151] mdio_bus 2188000.ethernet-1: MDIO device at address 0 is missing.
[    4.010155] fec 2188000.ethernet eth0: registered PHC device 0
[..]
[   19.172510] fec 2188000.ethernet eth0: Unable to connect to phy

# cat /sys/devices/soc0/soc/2100000.bus/2188000.ethernet/mdio_bus/2188000.ethernet-1/2188000.ethernet-1:04/phy_id
0x004dd072
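
For reference, the PHY address can be read straight off that sysfs path:
Linux names MDIO devices "<bus id>:<address>", with the address printed as
two hex digits, so "2188000.ethernet-1:04" is address 4. The small sketch
below only parses the strings shown above. The helper names
(mdio_addr_from_sysfs, split_phy_id) are made up for this illustration and
are not kernel or library APIs.

```python
from typing import Tuple


def mdio_addr_from_sysfs(path: str) -> int:
    """Return the MDIO address encoded in a sysfs MDIO device path."""
    parts = [p for p in path.split("/") if p]
    # Drop the trailing attribute name if the path points at phy_id itself.
    if parts and parts[-1] == "phy_id":
        parts.pop()
    if not parts:
        raise ValueError("empty path")
    # Device name is "<bus id>:<addr>", addr being two hex digits.
    _bus_id, sep, addr_hex = parts[-1].rpartition(":")
    if not sep:
        raise ValueError(f"no ':<addr>' suffix in {parts[-1]!r}")
    return int(addr_hex, 16)


def split_phy_id(phy_id: int) -> Tuple[int, int]:
    """Split a 32-bit PHY ID into the two 16-bit ID register values."""
    return (phy_id >> 16) & 0xFFFF, phy_id & 0xFFFF


PATH = ("/sys/devices/soc0/soc/2100000.bus/2188000.ethernet/mdio_bus/"
        "2188000.ethernet-1/2188000.ethernet-1:04/phy_id")

if __name__ == "__main__":
    print("PHY address:", mdio_addr_from_sysfs(PATH))
    id1, id2 = split_phy_id(0x004DD072)
    print(f"ID registers: 0x{id1:04x} 0x{id2:04x}")
```

Parsing the suffix as hex matters for addresses above 9: a device named
"...:1f" sits at address 31, not 1.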

Thus I suspect your commit 86b08bd5b994 ("ARM: dts: imx6-sr-som: add
ethernet PHY configuration") to be the culprit ;) It pins the PHY to
address 0. I don't know how this worked before: was there autoprobing
when there is no "phy-handle", or did the bootloader fix that up in
place? I don't have any cubox-i.

Anyway, I'm not sure which boards have the PHY at address 4. It could be
per model, affecting only the quad-core i.MX6 one, or it could be worse,
with the exact same model having it at either 0 or 4. [1] might imply the
latter. If that is the case, SolidRun added or removed a pull-down on
LED_ACT at some point during the board revisions.

I've looked at the vendor bootloader and it seems they scan the bus
for the PHY [2].

[1] https://forum.armbian.com/topic/15418-upgrading-cubox-i-buster-from-kernel-57y-to-58y-breaks-ethernet/?do=findComment&comment=116114
[2] https://github.com/SolidRun/u-boot/blob/v2018.01-solidrun-a38x/board/solidrun/mx6cuboxi/mx6cuboxi.c#L160

-michael

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: Broken ethernet on SolidRun cubox-i
  2020-12-26 12:18 Broken ethernet on SolidRun cubox-i Michael Walle
@ 2020-12-26 12:34 ` Russell King - ARM Linux admin
  2020-12-27 15:33   ` Michael Walle
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux admin @ 2020-12-26 12:34 UTC (permalink / raw)
  To: Michael Walle
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

On Sat, Dec 26, 2020 at 01:18:01PM +0100, Michael Walle wrote:
> Thus I suspect your commit 86b08bd5b994 ("ARM: dts: imx6-sr-som: add
> ethernet PHY configuration") to be the culprit ;) This will pin the
> PHY to address 0. I don't know how it was done before; like was there
> autoprobing if there is no "phy-handle" or did the bootloader fix that
> inplace. I don't have any cubox-i.

There is the need to move the AT803x quirk handling into DT rather than
using the PHY quirk system - the problem is, the PHY quirk system
applies the quirks to every iMX6 platform out there, whether the quirk
is right or wrong for that board.

I'd forgotten that there were boards out there with this problem...
the PHY address configuration is done via the LED_ACT pin, and SolidRun
omitted a pull resistor on it, so it "floats" with the leakage current
of the LED/pin - resulting in it sometimes appearing at address 0 and
sometimes at address 4.

I suppose there's no option but to revert the commit - but it needs to
be properly documented /why/ that is the case, and comments added to
ar8035_phy_fixup() in arch/arm/mach-imx/mach-imx6q.c to say why that
quirk can't be removed.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: Broken ethernet on SolidRun cubox-i
  2020-12-26 12:34 ` Russell King - ARM Linux admin
@ 2020-12-27 15:33   ` Michael Walle
  2020-12-27 15:59     ` Michael Walle
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Walle @ 2020-12-27 15:33 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

On 2020-12-26 13:34, Russell King - ARM Linux admin wrote:
> On Sat, Dec 26, 2020 at 01:18:01PM +0100, Michael Walle wrote:
>> Thus I suspect your commit 86b08bd5b994 ("ARM: dts: imx6-sr-som: add
>> ethernet PHY configuration") to be the culprit ;) This will pin the
>> PHY to address 0. I don't know how it was done before; like was there
>> autoprobing if there is no "phy-handle" or did the bootloader fix that
>> inplace. I don't have any cubox-i.
> 
> There is the need to move the AT803x quirk handling into DT rather than
> using the PHY quirk system - the problem is, the PHY quirk system
> applies the quirks to every iMX6 platform out there, whether the quirk
> is right or wrong for that board.
> 
> I'd forgotten that there were boards out there with this problem...
> the PHY address configuration is done via the LED_ACT pin, and SolidRun
> omitted a pull resistor on it, so it "floats" with the leakage current
> of the LED/pin - resulting in it sometimes appearing at address 0 and
> sometimes at address 4.

Hm, I guessed that too, but there must be more to it. The datasheet
says the pin has an internal weak pull-up. Or Atheros messed up and it
doesn't work reliably if there is actually an LED attached to it. But
then, why would any other, stronger pull-up/down work?

> I suppose there's no option but to revert the commit - but it needs to
> be properly documented /why/ that is the case, and comments added to
> ar8035_phy_fixup() in arch/arm/mach-imx/mach-imx6q.c to say why that
> quirk can't be removed.

Will you take care of that? Like I said, I don't have any cubox-i. I
could revert that commit, but it wouldn't be tested.

-michael


* Re: Broken ethernet on SolidRun cubox-i
  2020-12-27 15:33   ` Michael Walle
@ 2020-12-27 15:59     ` Michael Walle
  2020-12-27 16:11       ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Walle @ 2020-12-27 15:59 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

On 2020-12-27 16:33, Michael Walle wrote:
> Mh, I've guessed that too, but there must be more to it. The datasheet
> says it has an internal weak pull-up. Or Atheros messed up and it
> doesn't reliably work if there is actually an LED attached to it. But
> then, why would any other stronger pull-up/down work..

Hm, never mind; the commit log [1] explains it:

   "The LED_ACT pin on the carrier-one boards had a pull down that
   forces the phy address to 0x0; where on CuBox-i and the production
   HummingBoard that pin is connected directly to LED that depending
   on the pull down strength of the LED it might be sampled as '0' or '1'
   thus the phy address might appear as either address 0x0 or 0x4."

So it actually depends on the forward voltage of the LED and the
hi/low thresholds of the AT8035.. nice! Oh and btw. this pin also
switches between high and low-active LED output. So the missing
pull-down might not only switch the PHY address to 4 but also invert
the LED state.

-michael

[1] https://github.com/SolidRun/u-boot/commit/712be3eef69a2b0205d3b87fb5ab5632e36722d7


* Re: Broken ethernet on SolidRun cubox-i
  2020-12-27 15:59     ` Michael Walle
@ 2020-12-27 16:11       ` Russell King - ARM Linux admin
  2020-12-27 18:07         ` Andrew Lunn
  2021-01-08 11:53         ` Russell King - ARM Linux admin
  0 siblings, 2 replies; 15+ messages in thread
From: Russell King - ARM Linux admin @ 2020-12-27 16:11 UTC (permalink / raw)
  To: Michael Walle
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

On Sun, Dec 27, 2020 at 04:59:39PM +0100, Michael Walle wrote:
> Mhh, nevermind, from the commit log [1].
> 
>   "The LED_ACT pin on the carrier-one boards had a pull down that
>   forces the phy address to 0x0; where on CuBox-i and the production
>   HummingBoard that pin is connected directly to LED that depending
>   on the pull down strength of the LED it might be sampled as '0' or '1'
>   thus the phy address might appear as either address 0x0 or 0x4."
> 
> So it actually depends on the forward voltage of the LED and the
> hi/low thresholds of the AT8035.. nice! Oh and btw. this pin also
> switches between high and low-active LED output. So the missing
> pull-down might not only switch the PHY address to 4 but also invert
> the LED state.

Indeed. And whether it appears at address 0 or 4 will depend on many
factors, including temperature - the forward voltage of an LED decreases
by 2mV/°C.
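
To make the failure mode concrete, here is a toy model of the strap
sampling. It assumes only what this thread states: the LED_ACT level
sampled at reset selects between address 0 and address 4 (bit 2 of the
address), and the LED forward voltage falls by 2mV/°C. Everything else is
an assumption for illustration: the pin voltage is taken to simply track
the LED forward voltage, and the 25°C voltage and the input threshold are
invented numbers, not AT8035 datasheet values.

```python
# From the thread: LED forward voltage decreases by 2 mV per degree C.
TEMPCO_V_PER_C = -0.002


def pin_voltage(v_at_25c: float, temp_c: float) -> float:
    """Voltage seen on the strap pin at temp_c (assumed to track the LED)."""
    return v_at_25c + TEMPCO_V_PER_C * (temp_c - 25.0)


def strapped_address(v_pin: float, v_threshold: float) -> int:
    """PHY address selected by the sampled strap level.

    A level sampled as '0' gives address 0x0, '1' gives 0x4,
    i.e. the strap drives bit 2 of the address.
    """
    bit = 1 if v_pin >= v_threshold else 0
    return bit << 2


# Hypothetical values: a board that sits very close to the threshold.
V_LED_25C = 1.71
V_THRESHOLD = 1.70

if __name__ == "__main__":
    for temp in (0, 25, 50):
        v = pin_voltage(V_LED_25C, temp)
        addr = strapped_address(v, V_THRESHOLD)
        print(f"{temp:>3} degC: pin at {v:.3f} V -> PHY address {addr}")
```

With these made-up numbers the same board comes up at address 4 when cold
and at address 0 when warm, which is the kind of board-to-board and
boot-to-boot variation described above.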

I wonder if we can just delete the phy-handle property, and list a
PHY at both address 0 and 4 with the appropriate configuration...


* Re: Broken ethernet on SolidRun cubox-i
  2020-12-27 16:11       ` Russell King - ARM Linux admin
@ 2020-12-27 18:07         ` Andrew Lunn
  2021-01-08 11:53         ` Russell King - ARM Linux admin
  1 sibling, 0 replies; 15+ messages in thread
From: Andrew Lunn @ 2020-12-27 18:07 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Matus Ujhelyi, Michael Walle, Christoph Mattheis, acmattheis,
	linux-arm-kernel

> I wonder if we can just delete the phy-handle property, and list a
> PHY at both address 0 and 4 with the appropriate configuration...

Hi Russell, Michael

The Freescale FEC has an open coded version of phy_find_first(). So if
you don't have a phy-handle, it will go searching for a PHY on the
MDIO bus and should find it at 0 or 4.
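
A rough sketch of the "search the bus for the first PHY" idea, run against
a simulated MDIO bus rather than real hardware. This is not the FEC
driver's code; FakeMdioBus and find_first_phy are names invented for this
example. It relies only on general MDIO conventions: 32 addresses per bus,
the PHY ID split across registers 2 and 3, and reads from an address where
nothing answers coming back as all ones.

```python
from typing import Dict, Optional, Tuple

MII_PHYSID1 = 2    # upper 16 bits of the PHY ID
MII_PHYSID2 = 3    # lower 16 bits of the PHY ID
PHY_MAX_ADDR = 32  # MDIO addresses 0..31


class FakeMdioBus:
    """Simulated bus: maps address -> 32-bit PHY ID (illustration only)."""

    def __init__(self, phys: Dict[int, int]) -> None:
        self._phys = phys

    def read(self, addr: int, reg: int) -> int:
        phy_id = self._phys.get(addr)
        if phy_id is None:
            return 0xFFFF  # nothing drives the bus: reads as all ones
        if reg == MII_PHYSID1:
            return (phy_id >> 16) & 0xFFFF
        if reg == MII_PHYSID2:
            return phy_id & 0xFFFF
        return 0


def find_first_phy(bus: FakeMdioBus) -> Optional[Tuple[int, int]]:
    """Return (address, phy_id) of the lowest-addressed PHY, or None."""
    for addr in range(PHY_MAX_ADDR):
        phy_id = (bus.read(addr, MII_PHYSID1) << 16) | bus.read(addr, MII_PHYSID2)
        if phy_id != 0xFFFFFFFF:
            return addr, phy_id
    return None


if __name__ == "__main__":
    for location in (0, 4):
        bus = FakeMdioBus({location: 0x004DD072})
        print(find_first_phy(bus))
```

Because the scan does not care where the PHY answers, a board strapped to
address 0 and one strapped to address 4 both end up with a usable PHY.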

     Andrew


* Re: Broken ethernet on SolidRun cubox-i
  2020-12-27 16:11       ` Russell King - ARM Linux admin
  2020-12-27 18:07         ` Andrew Lunn
@ 2021-01-08 11:53         ` Russell King - ARM Linux admin
  2021-01-08 11:58           ` Michael Walle
  1 sibling, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-08 11:53 UTC (permalink / raw)
  To: Michael Walle
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel


On Sun, Dec 27, 2020 at 04:11:14PM +0000, Russell King - ARM Linux admin wrote:
> I wonder if we can just delete the phy-handle property, and list a
> PHY at both address 0 and 4 with the appropriate configuration...

Michael, can you try the attached patch please?


[-- Attachment #2: sr-som.diff --]
[-- Type: text/x-diff, Size: 717 bytes --]

diff --git a/arch/arm/boot/dts/imx6qdl-sr-som.dtsi b/arch/arm/boot/dts/imx6qdl-sr-som.dtsi
index b06577808ff4..3db08363d3fb 100644
--- a/arch/arm/boot/dts/imx6qdl-sr-som.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-sr-som.dtsi
@@ -53,7 +53,6 @@
 &fec {
 	pinctrl-names = "default";
 	pinctrl-0 = <&pinctrl_microsom_enet_ar8035>;
-	phy-handle = <&phy>;
 	phy-mode = "rgmii-id";
 	phy-reset-duration = <2>;
 	phy-reset-gpios = <&gpio4 15 GPIO_ACTIVE_LOW>;
@@ -63,10 +62,15 @@
 		#address-cells = <1>;
 		#size-cells = <0>;
 
-		phy: ethernet-phy@0 {
+		ethernet-phy@0 {
 			reg = <0>;
 			qca,clk-out-frequency = <125000000>;
 		};
+
+		ethernet-phy@4 {
+			reg = <4>;
+			qca,clk-out-frequency = <125000000>;
+		};
 	};
 };
 


* Re: Broken ethernet on SolidRun cubox-i
  2021-01-08 11:53         ` Russell King - ARM Linux admin
@ 2021-01-08 11:58           ` Michael Walle
  2021-01-08 12:01             ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 15+ messages in thread
From: Michael Walle @ 2021-01-08 11:58 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

On 2021-01-08 12:53, Russell King - ARM Linux admin wrote:
> Michael, can you try the attached patch please?

I don't have a cubox, but it's just a device tree patch. I could
try to hack one up based on Christoph's dtb, and he could just replace
it on his SD card and test. Seems easy enough.

-- 
-michael


* Re: Broken ethernet on SolidRun cubox-i
  2021-01-08 11:58           ` Michael Walle
@ 2021-01-08 12:01             ` Russell King - ARM Linux admin
  2021-01-08 12:14               ` Michael Walle
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-08 12:01 UTC (permalink / raw)
  To: Michael Walle
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

On Fri, Jan 08, 2021 at 12:58:17PM +0100, Michael Walle wrote:
> I don't have a cubox. But it's just a device tree patch. I could
> try to hack one based on Christophs dtb and he could just replace
> it on his sd card and test. Seems easy enough.

This sounds like a mess of indirection. What is "Christoph's dtb"?
Why are there different dtbs out there for the same platform? If
changes are necessary, why aren't they being submitted to the
mainline kernel?

In fact, why aren't users reporting these problems to mainline kernel
developers? Why do we have to have this tortuous bug reporting route
which makes testing fixes difficult?

This rather makes me not want to care about this.


* Re: Broken ethernet on SolidRun cubox-i
  2021-01-08 12:01             ` Russell King - ARM Linux admin
@ 2021-01-08 12:14               ` Michael Walle
  2021-01-08 12:25                 ` Christoph Mattheis
  2021-01-08 12:37                 ` Russell King - ARM Linux admin
  0 siblings, 2 replies; 15+ messages in thread
From: Michael Walle @ 2021-01-08 12:14 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

On 2021-01-08 13:01, Russell King - ARM Linux admin wrote:
> This sounds like a mess of indirection. What is "Christophs dtb"?
> Why are there different dtbs out there for the same platform? If
> there's changes necessary, why aren't they being submitted to the
> mainline kernel?
> 
> In fact, why aren't users reporting these problems to mainline kernel
> developers? Why do we have to have this tortuous bug reporting route
> which makes testing fixes difficult?
> 
> This rather makes me not want to care about this.

Well, at first this was suspected to be an issue with 'my' change in the
Atheros PHY driver, which turned out not to be the case. I _voluntarily_
tried to debug the issue with a user (Christoph), only to find out
that it is likely caused by the commit mentioned above. So for
starters, why would I care? I just wanted to be kind and provide
some help. If anything, this shows me that I should rather stick to
my own problems.

So please advise Christoph, where he should report this bug.

-michael


* Re: Broken ethernet on SolidRun cubox-i
  2021-01-08 12:14               ` Michael Walle
@ 2021-01-08 12:25                 ` Christoph Mattheis
  2021-01-08 12:27                   ` Russell King - ARM Linux admin
  2021-01-08 12:37                 ` Russell King - ARM Linux admin
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Mattheis @ 2021-01-08 12:25 UTC (permalink / raw)
  To: Michael Walle, Russell King - ARM Linux admin
  Cc: Matus Ujhelyi, acmattheis, linux-arm-kernel

Russell, Michael, I'm here and happy to support whatever you suggest - I have multiple cuboxes here, one impacted, two not - just tell me what I should do - BR Chris


> On 8 January 2021 at 13:14, Michael Walle <michael@walle.cc> wrote:
> Well first it was a suspected issue with 'my' change in the Atheros
> PHY driver, which turned out to be not the case. I _voluntarily_
> tried to debug the issue with a user (Christoph) just to find out
> that it is likely caused by the commit mentioned above. So for
> starters, why would I care? I just wanted to be kind and provide
> some help. If anything, this shows me, I should rather stick to
> my own problems.
> 
> So please advise Christoph, where he should report this bug.
> 
> -michael


* Re: Broken ethernet on SolidRun cubox-i
  2021-01-08 12:25                 ` Christoph Mattheis
@ 2021-01-08 12:27                   ` Russell King - ARM Linux admin
  2021-01-08 12:33                     ` Christoph Mattheis
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-08 12:27 UTC (permalink / raw)
  To: Christoph Mattheis
  Cc: Matus Ujhelyi, Michael Walle, acmattheis, linux-arm-kernel

On Fri, Jan 08, 2021 at 01:25:51PM +0100, Christoph Mattheis wrote:
> Russell, Michael, I'm here and happy to support whatever you suggest - I have multiple cuboxes here, one impacted, two not - just tell me what I should do - BR Chris

Please try the DT patch I sent in my previous email, I'm hoping it will
resolve your problem. Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

* Re: Broken ethernet on SolidRun cubox-i
  2021-01-08 12:27                   ` Russell King - ARM Linux admin
@ 2021-01-08 12:33                     ` Christoph Mattheis
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Mattheis @ 2021-01-08 12:33 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Matus Ujhelyi, Michael Walle, acmattheis, linux-arm-kernel

Thanks, Russell, will try - and let you know the results - BR Chris

> Russell King - ARM Linux admin <linux@armlinux.org.uk> wrote on 8 January 2021 at 13:27:
> 
> 
> On Fri, Jan 08, 2021 at 01:25:51PM +0100, Christoph Mattheis wrote:
> > Russell, Michael, I'm here and happy to support whatever you suggest - I have multiple cuboxes here, one impacted, two not - just tell me what I should do - BR Chris
> 
> Please try the DT patch I sent in my previous email, I'm hoping it will
> resolve your problem. Thanks.
> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

* Re: Broken ethernet on SolidRun cubox-i
  2021-01-08 12:14               ` Michael Walle
  2021-01-08 12:25                 ` Christoph Mattheis
@ 2021-01-08 12:37                 ` Russell King - ARM Linux admin
  2021-01-08 12:51                   ` Michael Walle
  1 sibling, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux admin @ 2021-01-08 12:37 UTC (permalink / raw)
  To: Michael Walle
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

On Fri, Jan 08, 2021 at 01:14:19PM +0100, Michael Walle wrote:
> On 2021-01-08 13:01, Russell King - ARM Linux admin wrote:
> > This sounds like a mess of indirection. What is "Christoph's dtb"?
> > Why are there different dtbs out there for the same platform? If
> > there's changes necessary, why aren't they being submitted to the
> > mainline kernel?
> > 
> > In fact, why aren't users reporting these problems to mainline kernel
> > developers? Why do we have to have this tortuous bug reporting route
> > which makes testing fixes difficult?
> > 
> > This rather makes me not want to care about this.
> 
> Well first it was a suspected issue with 'my' change in the Atheros
> PHY driver, which turned out to be not the case. I _voluntarily_
> tried to debug the issue with a user (Christoph) just to find out
> that it is likely caused by the commit mentioned above. So for
> > starters, why would I care? I just wanted to be kind and provide
> some help. If anything, this shows me, I should rather stick to
> my own problems.
> 
> So please advise Christoph, where he should report this bug.

There is, unfortunately, a long history where the appropriate kernel
developers have had no knowledge of bugs that have been introduced
by changes that have been made, simply because users report them on
things like web forums or other mailing lists.

In the normal course of things, the bugs are only found out about
years later, maybe when someone finally contacts the appropriate
kernel developers to point out the problem.

A case in point was the removal of bogomips from /proc/cpuinfo, which
resulted in kernel developers being roasted by Linus when it was
pointed out that users had been reporting the problem on forums for
over a year. Apparently, it was _our_ fault for not knowing about
them.

Something similar seems to be happening with my SFP work - it seems
people would much rather report problems on random forums around the
Internet than send an email to anyone who can fix the problem in
mainline kernels.

Kernel development is fundamentally a difficult, frustrating and
depressing activity.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

* Re: Broken ethernet on SolidRun cubox-i
  2021-01-08 12:37                 ` Russell King - ARM Linux admin
@ 2021-01-08 12:51                   ` Michael Walle
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Walle @ 2021-01-08 12:51 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: Matus Ujhelyi, acmattheis, Christoph Mattheis, linux-arm-kernel

On 2021-01-08 13:37, Russell King - ARM Linux admin wrote:
> On Fri, Jan 08, 2021 at 01:14:19PM +0100, Michael Walle wrote:
>> On 2021-01-08 13:01, Russell King - ARM Linux admin wrote:
>> > This sounds like a mess of indirection. What is "Christoph's dtb"?
>> > Why are there different dtbs out there for the same platform? If
>> > there's changes necessary, why aren't they being submitted to the
>> > mainline kernel?
>> >
>> > In fact, why aren't users reporting these problems to mainline kernel
>> > developers? Why do we have to have this tortuous bug reporting route
>> > which makes testing fixes difficult?
>> >
>> > This rather makes me not want to care about this.
>> 
>> Well first it was a suspected issue with 'my' change in the Atheros
>> PHY driver, which turned out to be not the case. I _voluntarily_
>> tried to debug the issue with a user (Christoph) just to find out
>> that it is likely caused by the commit mentioned above. So for
>> starters, why would I care? I just wanted to be kind and provide
>> some help. If anything, this shows me, I should rather stick to
>> my own problems.
>> 
>> So please advise Christoph, where he should report this bug.
> 
> There is, unfortunately, a long history where the appropriate kernel
> developers have had no knowledge of bugs that have been introduced
> by changes that have been made, simply because users report them on
> things like web forums or other mailing lists.
> 
> In the normal course of things, the bugs are only found out about
> years later, maybe when someone finally contacts the appropriate
> kernel developers to point out the problem.
> 
> A case in point was the removal of bogomips from /proc/cpuinfo, which
> resulted in kernel developers being roasted by Linus when it was
> pointed out that users had been reporting the problem on forums for
> over a year. Apparently, it was _our_ fault for not knowing about
> them.
> 
> Something similar seems to be happening with my SFP work - it seems
> people would much rather report problems on random forums around the
> Internet than send an email to anyone who can fix the problem in
> mainline kernels.

Maybe they don't know any better.

Christoph approached me privately (through Matus, because he was in
the PHY driver header, I guess). I tried to collect as much
information as possible and then posted it to the LKML. I don't
see anything wrong here.

I already mentioned I don't have any hardware to test. So all I
tried to do was to help Christoph get a device tree blob to easily
test on his hardware.

-michael

> Kernel development is fundamentally a difficult, frustrating and
> depressing activity.

Thread overview: 15+ messages
2020-12-26 12:18 Broken ethernet on SolidRun cubox-i Michael Walle
2020-12-26 12:34 ` Russell King - ARM Linux admin
2020-12-27 15:33   ` Michael Walle
2020-12-27 15:59     ` Michael Walle
2020-12-27 16:11       ` Russell King - ARM Linux admin
2020-12-27 18:07         ` Andrew Lunn
2021-01-08 11:53         ` Russell King - ARM Linux admin
2021-01-08 11:58           ` Michael Walle
2021-01-08 12:01             ` Russell King - ARM Linux admin
2021-01-08 12:14               ` Michael Walle
2021-01-08 12:25                 ` Christoph Mattheis
2021-01-08 12:27                   ` Russell King - ARM Linux admin
2021-01-08 12:33                     ` Christoph Mattheis
2021-01-08 12:37                 ` Russell King - ARM Linux admin
2021-01-08 12:51                   ` Michael Walle
