Netdev Archive on lore.kernel.org
 help / color / Atom feed
* realtek PHY commit bbc4d71d63549 causes regression
@ 2020-10-17 14:20 Ard Biesheuvel
  2020-10-17 14:44 ` Andrew Lunn
  0 siblings, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-17 14:20 UTC (permalink / raw)
  To: open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Ilias Apalodimas, Willy Liu, David S. Miller, Jakub Kicinski,
	Sasha Levin, Andrew Lunn, Florian Fainelli, Heiner Kallweit,
	Masahisa Kojima

Hello all,

I just upgraded my arm64 SynQuacer box to 5.8.16 and lost all network
connectivity. This box has a on-SoC socionext 'netsec' network
controller wired to a Realtek 80211e PHY, and this was working without
problems until the following commit was merged

commit bbc4d71d63549bcd003a430de18a72a742d8c91e
Author: Willy Liu <willy.liu@realtek.com>
Date:   Tue Sep 29 10:10:49 2020 +0800

net: phy: realtek: fix rtl8211e rx/tx delay config

It was backported to v5.8 as b9f0dcfbfc07719be7cc732cda4e609280704605

The commit log in question is a bit vague, as it does not explain what
is wrong with the commit it aims to fix, or what the changes are meant
to achieve.

Since it breaks h/w in the field, can we please revert it? (and
backport the revert?)

Regards,
Ard.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 14:20 realtek PHY commit bbc4d71d63549 causes regression Ard Biesheuvel
@ 2020-10-17 14:44 ` Andrew Lunn
  2020-10-17 14:46   ` Ard Biesheuvel
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2020-10-17 14:44 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Ilias Apalodimas, Willy Liu, David S. Miller, Jakub Kicinski,
	Sasha Levin, Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sat, Oct 17, 2020 at 04:20:36PM +0200, Ard Biesheuvel wrote:
> Hello all,
> 
> I just upgraded my arm64 SynQuacer box to 5.8.16 and lost all network
> connectivity.

Hi Ard

Please could you point me at the DT files.

> This box has a on-SoC socionext 'netsec' network controller wired to
> a Realtek 80211e PHY, and this was working without problems until
> the following commit was merged

It could be this fix has uncovered a bug in the DT file. Before this
fix, if there is an phy-mode property in DT, it could of been ignored.
Now the phy-handle property is correctly implemented. So it could be
the DT has the wrong value, e.g. it has rgmii-rxid when maybe it
should have rgmii-id.

       Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 14:44 ` Andrew Lunn
@ 2020-10-17 14:46   ` Ard Biesheuvel
  2020-10-17 15:11     ` Andrew Lunn
  0 siblings, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-17 14:46 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Ilias Apalodimas, Willy Liu, David S. Miller, Jakub Kicinski,
	Sasha Levin, Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sat, 17 Oct 2020 at 16:44, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Oct 17, 2020 at 04:20:36PM +0200, Ard Biesheuvel wrote:
> > Hello all,
> >
> > I just upgraded my arm64 SynQuacer box to 5.8.16 and lost all network
> > connectivity.
>
> Hi Ard
>
> Please could you point me at the DT files.
>
> > This box has a on-SoC socionext 'netsec' network controller wired to
> > a Realtek 80211e PHY, and this was working without problems until
> > the following commit was merged
>
> It could be this fix has uncovered a bug in the DT file. Before this
> fix, if there is an phy-mode property in DT, it could of been ignored.
> Now the phy-handle property is correctly implemented. So it could be
> the DT has the wrong value, e.g. it has rgmii-rxid when maybe it
> should have rgmii-id.
>

This is an ACPI system. The phy-mode device property is set to 'rgmii'

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 14:46   ` Ard Biesheuvel
@ 2020-10-17 15:11     ` Andrew Lunn
  2020-10-17 15:18       ` Ard Biesheuvel
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2020-10-17 15:11 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Ilias Apalodimas, Willy Liu, David S. Miller, Jakub Kicinski,
	Sasha Levin, Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sat, Oct 17, 2020 at 04:46:23PM +0200, Ard Biesheuvel wrote:
> On Sat, 17 Oct 2020 at 16:44, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Sat, Oct 17, 2020 at 04:20:36PM +0200, Ard Biesheuvel wrote:
> > > Hello all,
> > >
> > > I just upgraded my arm64 SynQuacer box to 5.8.16 and lost all network
> > > connectivity.
> >
> > Hi Ard
> >
> > Please could you point me at the DT files.
> >
> > > This box has a on-SoC socionext 'netsec' network controller wired to
> > > a Realtek 80211e PHY, and this was working without problems until
> > > the following commit was merged
> >
> > It could be this fix has uncovered a bug in the DT file. Before this
> > fix, if there is an phy-mode property in DT, it could of been ignored.
> > Now the phy-handle property is correctly implemented. So it could be
> > the DT has the wrong value, e.g. it has rgmii-rxid when maybe it
> > should have rgmii-id.
> >
> 
> This is an ACPI system. The phy-mode device property is set to 'rgmii'

Hi Ard

Please try rgmii-id.

Also, do you have the schematic? Can you see if there are any
strapping resistors? It could be, there are strapping resistors to put
it into rgmii-id. Now that the phy-mode properties is respected, the
reset defaults are being over-written to rgmii, which breaks the link.
Or the bootloader has already set the PHY mode to rgmii-id.

You can also use '' as the phy-mode, which results in
PHY_INTERFACE_MODE_NA, which effectively means, don't touch the PHY
mode, something else has already set it up. This might actually be the
correct way to go for ACPI. In the DT world, we tend to assume the
bootloader has done the absolute minimum and Linux should configure
everything. The ACPI takes the opposite view, the firmware will do the
basic hardware configuration, and Linux should not touch it, or ask
ACPI to modify it.

      Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 15:11     ` Andrew Lunn
@ 2020-10-17 15:18       ` Ard Biesheuvel
  2020-10-17 16:14         ` Ilias Apalodimas
  0 siblings, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-17 15:18 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Ilias Apalodimas, Willy Liu, David S. Miller, Jakub Kicinski,
	Sasha Levin, Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sat, 17 Oct 2020 at 17:11, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Oct 17, 2020 at 04:46:23PM +0200, Ard Biesheuvel wrote:
> > On Sat, 17 Oct 2020 at 16:44, Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > On Sat, Oct 17, 2020 at 04:20:36PM +0200, Ard Biesheuvel wrote:
> > > > Hello all,
> > > >
> > > > I just upgraded my arm64 SynQuacer box to 5.8.16 and lost all network
> > > > connectivity.
> > >
> > > Hi Ard
> > >
> > > Please could you point me at the DT files.
> > >
> > > > This box has a on-SoC socionext 'netsec' network controller wired to
> > > > a Realtek 80211e PHY, and this was working without problems until
> > > > the following commit was merged
> > >
> > > It could be this fix has uncovered a bug in the DT file. Before this
> > > fix, if there is an phy-mode property in DT, it could of been ignored.
> > > Now the phy-handle property is correctly implemented. So it could be
> > > the DT has the wrong value, e.g. it has rgmii-rxid when maybe it
> > > should have rgmii-id.
> > >
> >
> > This is an ACPI system. The phy-mode device property is set to 'rgmii'
>
> Hi Ard
>
> Please try rgmii-id.
>
> Also, do you have the schematic? Can you see if there are any
> strapping resistors? It could be, there are strapping resistors to put
> it into rgmii-id. Now that the phy-mode properties is respected, the
> reset defaults are being over-written to rgmii, which breaks the link.
> Or the bootloader has already set the PHY mode to rgmii-id.
>
> You can also use '' as the phy-mode, which results in
> PHY_INTERFACE_MODE_NA, which effectively means, don't touch the PHY
> mode, something else has already set it up. This might actually be the
> correct way to go for ACPI. In the DT world, we tend to assume the
> bootloader has done the absolute minimum and Linux should configure
> everything. The ACPI takes the opposite view, the firmware will do the
> basic hardware configuration, and Linux should not touch it, or ask
> ACPI to modify it.
>

Indeed, the firmware should have set this up. This would mean we could
do this in the driver: it currently uses

priv->phy_interface = device_get_phy_mode(&pdev->dev);

Can we just assign that to PHY_INTERFACE_MODE_NA instead?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 15:18       ` Ard Biesheuvel
@ 2020-10-17 16:14         ` Ilias Apalodimas
  2020-10-17 16:17           ` Ard Biesheuvel
  2020-10-17 18:01           ` Andrew Lunn
  0 siblings, 2 replies; 26+ messages in thread
From: Ilias Apalodimas @ 2020-10-17 16:14 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Andrew Lunn, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

Hi Ard, 

On Sat, Oct 17, 2020 at 05:18:16PM +0200, Ard Biesheuvel wrote:
> On Sat, 17 Oct 2020 at 17:11, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Sat, Oct 17, 2020 at 04:46:23PM +0200, Ard Biesheuvel wrote:
> > > On Sat, 17 Oct 2020 at 16:44, Andrew Lunn <andrew@lunn.ch> wrote:
> > > >
> > > > On Sat, Oct 17, 2020 at 04:20:36PM +0200, Ard Biesheuvel wrote:
> > > > > Hello all,
> > > > >
> > > > > I just upgraded my arm64 SynQuacer box to 5.8.16 and lost all network
> > > > > connectivity.
> > > >
> > > > Hi Ard
> > > >
> > > > Please could you point me at the DT files.
> > > >
> > > > > This box has a on-SoC socionext 'netsec' network controller wired to
> > > > > a Realtek 80211e PHY, and this was working without problems until
> > > > > the following commit was merged
> > > >
> > > > It could be this fix has uncovered a bug in the DT file. Before this
> > > > fix, if there is an phy-mode property in DT, it could of been ignored.
> > > > Now the phy-handle property is correctly implemented. So it could be
> > > > the DT has the wrong value, e.g. it has rgmii-rxid when maybe it
> > > > should have rgmii-id.
> > > >
> > >
> > > This is an ACPI system. The phy-mode device property is set to 'rgmii'
> >
> > Hi Ard
> >
> > Please try rgmii-id.
> >
> > Also, do you have the schematic? Can you see if there are any
> > strapping resistors? It could be, there are strapping resistors to put
> > it into rgmii-id. Now that the phy-mode properties is respected, the
> > reset defaults are being over-written to rgmii, which breaks the link.
> > Or the bootloader has already set the PHY mode to rgmii-id.
> >
> > You can also use '' as the phy-mode, which results in
> > PHY_INTERFACE_MODE_NA, which effectively means, don't touch the PHY
> > mode, something else has already set it up. This might actually be the
> > correct way to go for ACPI. In the DT world, we tend to assume the
> > bootloader has done the absolute minimum and Linux should configure
> > everything. The ACPI takes the opposite view, the firmware will do the
> > basic hardware configuration, and Linux should not touch it, or ask
> > ACPI to modify it.
> >
> 
> Indeed, the firmware should have set this up.

Would EDK2 take care of the RGMII Rx/Tx delays even when configured to 
use a DT instead of ACPI?

> This would mean we could > do this in the driver: it currently uses
> 
> priv->phy_interface = device_get_phy_mode(&pdev->dev);
> 
> Can we just assign that to PHY_INTERFACE_MODE_NA instead?

Thanks
/Ilias

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 16:14         ` Ilias Apalodimas
@ 2020-10-17 16:17           ` Ard Biesheuvel
  2020-10-17 16:21             ` Ilias Apalodimas
  2020-10-17 18:04             ` Andrew Lunn
  2020-10-17 18:01           ` Andrew Lunn
  1 sibling, 2 replies; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-17 16:17 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Andrew Lunn, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sat, 17 Oct 2020 at 18:14, Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> Hi Ard,
>
> On Sat, Oct 17, 2020 at 05:18:16PM +0200, Ard Biesheuvel wrote:
> > On Sat, 17 Oct 2020 at 17:11, Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > On Sat, Oct 17, 2020 at 04:46:23PM +0200, Ard Biesheuvel wrote:
> > > > On Sat, 17 Oct 2020 at 16:44, Andrew Lunn <andrew@lunn.ch> wrote:
> > > > >
> > > > > On Sat, Oct 17, 2020 at 04:20:36PM +0200, Ard Biesheuvel wrote:
> > > > > > Hello all,
> > > > > >
> > > > > > I just upgraded my arm64 SynQuacer box to 5.8.16 and lost all network
> > > > > > connectivity.
> > > > >
> > > > > Hi Ard
> > > > >
> > > > > Please could you point me at the DT files.
> > > > >
> > > > > > This box has a on-SoC socionext 'netsec' network controller wired to
> > > > > > a Realtek 80211e PHY, and this was working without problems until
> > > > > > the following commit was merged
> > > > >
> > > > > It could be this fix has uncovered a bug in the DT file. Before this
> > > > > fix, if there is an phy-mode property in DT, it could of been ignored.
> > > > > Now the phy-handle property is correctly implemented. So it could be
> > > > > the DT has the wrong value, e.g. it has rgmii-rxid when maybe it
> > > > > should have rgmii-id.
> > > > >
> > > >
> > > > This is an ACPI system. The phy-mode device property is set to 'rgmii'
> > >
> > > Hi Ard
> > >
> > > Please try rgmii-id.
> > >
> > > Also, do you have the schematic? Can you see if there are any
> > > strapping resistors? It could be, there are strapping resistors to put
> > > it into rgmii-id. Now that the phy-mode properties is respected, the
> > > reset defaults are being over-written to rgmii, which breaks the link.
> > > Or the bootloader has already set the PHY mode to rgmii-id.
> > >
> > > You can also use '' as the phy-mode, which results in
> > > PHY_INTERFACE_MODE_NA, which effectively means, don't touch the PHY
> > > mode, something else has already set it up. This might actually be the
> > > correct way to go for ACPI. In the DT world, we tend to assume the
> > > bootloader has done the absolute minimum and Linux should configure
> > > everything. The ACPI takes the opposite view, the firmware will do the
> > > basic hardware configuration, and Linux should not touch it, or ask
> > > ACPI to modify it.
> > >
> >
> > Indeed, the firmware should have set this up.
>
> Would EDK2 take care of the RGMII Rx/Tx delays even when configured to
> use a DT instead of ACPI?
>

Yes. The network driver has no awareness whatsoever which h/w
description is being provided to the OS.


> > This would mean we could > do this in the driver: it currently uses
> >
> > priv->phy_interface = device_get_phy_mode(&pdev->dev);
> >
> > Can we just assign that to PHY_INTERFACE_MODE_NA instead?
>

I have tried this, and it seems to fix the issue. I will send out a
patch against the netsec driver.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 16:17           ` Ard Biesheuvel
@ 2020-10-17 16:21             ` Ilias Apalodimas
  2020-10-17 18:04             ` Andrew Lunn
  1 sibling, 0 replies; 26+ messages in thread
From: Ilias Apalodimas @ 2020-10-17 16:21 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Andrew Lunn, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

Hi Ard,

[...]
> > > > You can also use '' as the phy-mode, which results in
> > > > PHY_INTERFACE_MODE_NA, which effectively means, don't touch the PHY
> > > > mode, something else has already set it up. This might actually be the
> > > > correct way to go for ACPI. In the DT world, we tend to assume the
> > > > bootloader has done the absolute minimum and Linux should configure
> > > > everything. The ACPI takes the opposite view, the firmware will do the
> > > > basic hardware configuration, and Linux should not touch it, or ask
> > > > ACPI to modify it.
> > > >
> > >
> > > Indeed, the firmware should have set this up.
> >
> > Would EDK2 take care of the RGMII Rx/Tx delays even when configured to
> > use a DT instead of ACPI?
> >
>
> Yes. The network driver has no awareness whatsoever which h/w
> description is being provided to the OS.
>
>
> > > This would mean we could > do this in the driver: it currently uses
> > >
> > > priv->phy_interface = device_get_phy_mode(&pdev->dev);
> > >
> > > Can we just assign that to PHY_INTERFACE_MODE_NA instead?
> >
>
> I have tried this, and it seems to fix the issue. I will send out a
> patch against the netsec driver.

Great thanks!

Cheers
/Ilias

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 16:14         ` Ilias Apalodimas
  2020-10-17 16:17           ` Ard Biesheuvel
@ 2020-10-17 18:01           ` Andrew Lunn
  1 sibling, 0 replies; 26+ messages in thread
From: Andrew Lunn @ 2020-10-17 18:01 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Ard Biesheuvel, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

> Would EDK2 take care of the RGMII Rx/Tx delays even when configured to 
> use a DT instead of ACPI?

But what about those users using u-boot and DT, not EDK2?

    Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 16:17           ` Ard Biesheuvel
  2020-10-17 16:21             ` Ilias Apalodimas
@ 2020-10-17 18:04             ` Andrew Lunn
  2020-10-17 18:11               ` Ard Biesheuvel
  1 sibling, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2020-10-17 18:04 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Ilias Apalodimas, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

> I have tried this, and it seems to fix the issue. I will send out a
> patch against the netsec driver.

Please also fix the firmware so it does not pass rgmii.

If there are pure DT systems, which do require phy-mode to be used, we
will need to revert your proposed change in order to make the MAC
driver work as it should, rather than work around the broken firmware.

	Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 18:04             ` Andrew Lunn
@ 2020-10-17 18:11               ` Ard Biesheuvel
  2020-10-17 18:17                 ` Ard Biesheuvel
  2020-10-17 18:27                 ` Andrew Lunn
  0 siblings, 2 replies; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-17 18:11 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ilias Apalodimas, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sat, 17 Oct 2020 at 20:04, Andrew Lunn <andrew@lunn.ch> wrote:
>
> > I have tried this, and it seems to fix the issue. I will send out a
> > patch against the netsec driver.
>
> Please also fix the firmware so it does not pass rgmii.
>
> If there are pure DT systems, which do require phy-mode to be used, we
> will need to revert your proposed change in order to make the MAC
> driver work as it should, rather than work around the broken firmware.
>

What do you mean by 'pure' DT system? Only EDK2 based firmware exists
for this platform, and it can be configured to boot either in DT or in
ACPI mode. In both cases, it will pass the same, incorrect PHY mode,
and in both cases, the firmware will already have configured the PHY
correctly.

So what I propose to do is drop the handling of the [mandatory]
phy-mode device property from the netsec driver (which is really only
used by this board). As we don't really need a phy-mode to begin with,
and given that firmware exists in the field that passes the wrong
value, the only option I see for passing a value here is to use a
different, *optional* DT property (force-phy-mode or
phy-mode-override) that takes the place of phy-mode. For ACPI boot,
you will just need to fix your firmware if you are using a different
PHY configuration.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 18:11               ` Ard Biesheuvel
@ 2020-10-17 18:17                 ` Ard Biesheuvel
  2020-10-17 18:27                 ` Andrew Lunn
  1 sibling, 0 replies; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-17 18:17 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ilias Apalodimas, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sat, 17 Oct 2020 at 20:11, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Sat, 17 Oct 2020 at 20:04, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > I have tried this, and it seems to fix the issue. I will send out a
> > > patch against the netsec driver.
> >
> > Please also fix the firmware so it does not pass rgmii.
> >
> > If there are pure DT systems, which do require phy-mode to be used, we
> > will need to revert your proposed change in order to make the MAC
> > driver work as it should, rather than work around the broken firmware.
> >
>
> What do you mean by 'pure' DT system? Only EDK2 based firmware exists
> for this platform, and it can be configured to boot either in DT or in
> ACPI mode. In both cases, it will pass the same, incorrect PHY mode,
> and in both cases, the firmware will already have configured the PHY
> correctly.
>
> So what I propose to do is drop the handling of the [mandatory]
> phy-mode device property from the netsec driver (which is really only
> used by this board). As we don't really need a phy-mode to begin with,
> and given that firmware exists in the field that passes the wrong
> value, the only option I see for passing a value here is to use a
> different, *optional* DT property (force-phy-mode or
> phy-mode-override) that takes the place of phy-mode. For ACPI boot,
> you will just need to fix your firmware if you are using a different
> PHY configuration.

... or add a property that makes 'phy-mode' take effect, and default
to ignoring it.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 18:11               ` Ard Biesheuvel
  2020-10-17 18:17                 ` Ard Biesheuvel
@ 2020-10-17 18:27                 ` Andrew Lunn
  2020-10-17 18:55                   ` Ard Biesheuvel
  1 sibling, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2020-10-17 18:27 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Ilias Apalodimas, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sat, Oct 17, 2020 at 08:11:24PM +0200, Ard Biesheuvel wrote:
> On Sat, 17 Oct 2020 at 20:04, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > I have tried this, and it seems to fix the issue. I will send out a
> > > patch against the netsec driver.
> >
> > Please also fix the firmware so it does not pass rgmii.
> >
> > If there are pure DT systems, which do require phy-mode to be used, we
> > will need to revert your proposed change in order to make the MAC
> > driver work as it should, rather than work around the broken firmware.
> >
> 
> What do you mean by 'pure' DT system? Only EDK2 based firmware exists
> for this platform

Currently, only EDK2 based firmware exists. Is there anything stopping
somebody using u-boot? ACPI is aimed for server class systems, on
ARM. If anybody wants to use this SoC in am embedded setting, not
server, then they are more likely to use DT, especially when you need
a complex network, eg. an Ethernet switch. It seems like ACPI is too
simple to support complex network hardware found in some embedded
systems.

> So what I propose to do is drop the handling of the [mandatory]
> phy-mode device property from the netsec driver (which is really only
> used by this board). As we don't really need a phy-mode to begin with,
> and given that firmware exists in the field that passes the wrong
> value, the only option I see for passing a value here is to use a
> different, *optional* DT property (force-phy-mode or
> phy-mode-override) that takes the place of phy-mode.

No, sorry, this is an ACPI problem, not a DT problem. I don't want to
accept DT hacks because of broken ACPI.

We have been through this before, when the Atheros PHY fixed is RGMII
delay support, and lots of platforms broke. Everybody just updated
their DT and were happy. I see no reason why ACPI should be different.

      Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 18:27                 ` Andrew Lunn
@ 2020-10-17 18:55                   ` Ard Biesheuvel
  2020-10-17 19:49                     ` Andrew Lunn
  0 siblings, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-17 18:55 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ilias Apalodimas, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sat, 17 Oct 2020 at 20:27, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sat, Oct 17, 2020 at 08:11:24PM +0200, Ard Biesheuvel wrote:
> > On Sat, 17 Oct 2020 at 20:04, Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > > I have tried this, and it seems to fix the issue. I will send out a
> > > > patch against the netsec driver.
> > >
> > > Please also fix the firmware so it does not pass rgmii.
> > >
> > > If there are pure DT systems, which do require phy-mode to be used, we
> > > will need to revert your proposed change in order to make the MAC
> > > driver work as it should, rather than work around the broken firmware.
> > >
> >
> > What do you mean by 'pure' DT system? Only EDK2 based firmware exists
> > for this platform
>
> Currently, only EDK2 based firmware exists. Is there anything stopping
> somebody using u-boot? ACPI is aimed for server class systems, on
> ARM. If anybody wants to use this SoC in am embedded setting, not
> server, then they are more likely to use DT, especially when you need
> a complex network, eg. an Ethernet switch. It seems like ACPI is too
> simple to support complex network hardware found in some embedded
> systems.
>
> > So what I propose to do is drop the handling of the [mandatory]
> > phy-mode device property from the netsec driver (which is really only
> > used by this board). As we don't really need a phy-mode to begin with,
> > and given that firmware exists in the field that passes the wrong
> > value, the only option I see for passing a value here is to use a
> > different, *optional* DT property (force-phy-mode or
> > phy-mode-override) that takes the place of phy-mode.
>
> No, sorry, this is an ACPI problem, not a DT problem. I don't want to
> accept DT hacks because of broken ACPI.
>
> We have been through this before, when the Atheros PHY fixed is RGMII
> delay support, and lots of platforms broke. Everybody just updated
> their DT and were happy. I see no reason why ACPI should be different.
>

I don't understand why you insist on framing this as a ACPI vs DT
issue. AFAICT, the only meaningful distinction here is between
firmware that configures the PHY and firmware that doesn't.

Broken firmware exists for this platform, and it provides incorrect DT
data as well as incorrect ACPI data, but it does configure the PHY.
Fixing that firmware involves fixing both, and it is easily updatable
on this platform, so it is almost as simple as dropping a new DT file
in your /boot partition. But you still need to do that.

So we can fix this firmware by just setting phy-mode to the empty string, right?

So the only question is how we deal with broken firmware. Again, I
don't see much point in distinguishing between DT and ACPI here, as in
both cases, the same action is required on the part of the user to
change something on their system before they can upgrade their kernel.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 18:55                   ` Ard Biesheuvel
@ 2020-10-17 19:49                     ` Andrew Lunn
  2020-10-17 22:19                       ` Ard Biesheuvel
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2020-10-17 19:49 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Ilias Apalodimas, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

> So we can fix this firmware by just setting phy-mode to the empty
> string, right?

I've never actually tried it, but i think so. There are no DT files
actually doing that, so you really do need to test it and see. And
there might be some differences between device_get_phy_mode() and
of_get_phy_mode().

	Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 19:49                     ` Andrew Lunn
@ 2020-10-17 22:19                       ` Ard Biesheuvel
  2020-10-17 23:02                         ` Andrew Lunn
  2020-10-18 14:20                         ` Masami Hiramatsu
  0 siblings, 2 replies; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-17 22:19 UTC (permalink / raw)
  To: Andrew Lunn, Daniel Thompson, Sumit Garg, Alex Bennée,
	Masami Hiramatsu, steve
  Cc: Ilias Apalodimas, open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

(cc'ing some folks who may care about functional networking on SynQuacer)

On Sat, 17 Oct 2020 at 21:49, Andrew Lunn <andrew@lunn.ch> wrote:
>
> > So we can fix this firmware by just setting phy-mode to the empty
> > string, right?
>
> I've never actually tried it, but i think so. There are no DT files
> actually doing that, so you really do need to test it and see. And
> there might be some differences between device_get_phy_mode() and
> of_get_phy_mode().
>

Yes, that works fine. Fixed now in the latest firmware build [0]

But that still leaves the question whether and how to work around this
for units in the field. Ignoring the PHY mode in the driver would
help, as all known hardware ships with firmware that configures the
PHY with the correct settings, but we will lose the ability to use
other PHY modes in the future, in case the SoC is ever used with DT
based minimal firmware that does not configure networking.

Since ACPI implies rich firmware, we could make ACPI probing of the
driver ignore the phy-mode setting in the DSDT. But if we don't do the
same for DT, it would mean DT users are forced to upgrade their
firmware, and hopefully do so before upgrading to a kernel that breaks
networking. (These boxes are often used headless, so this can be
annoying)


[0] http://snapshots.linaro.org/components/kernel/leg-96boards-developerbox-edk2/83/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 22:19                       ` Ard Biesheuvel
@ 2020-10-17 23:02                         ` Andrew Lunn
  2020-10-18 10:35                           ` Ard Biesheuvel
  2020-10-18 14:20                         ` Masami Hiramatsu
  1 sibling, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2020-10-17 23:02 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Daniel Thompson, Sumit Garg, Alex Bennée, Masami Hiramatsu,
	steve, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sun, Oct 18, 2020 at 12:19:25AM +0200, Ard Biesheuvel wrote:
> (cc'ing some folks who may care about functional networking on SynQuacer)
> 
> On Sat, 17 Oct 2020 at 21:49, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > So we can fix this firmware by just setting phy-mode to the empty
> > > string, right?
> >
> > I've never actually tried it, but i think so. There are no DT files
> > actually doing that, so you really do need to test it and see. And
> > there might be some differences between device_get_phy_mode() and
> > of_get_phy_mode().
> >
> 
> Yes, that works fine. Fixed now in the latest firmware build [0]

Thanks for reporting back on that. It probably needs to be something
we always recommend for ACPI systems, either use "", or preferably no
phy-mode at all, and let the driver default to NA. ACPI and networking
is at a very early stage at the moment, since UEFA says nothing about
it. It will take a while before we figure out the best practices, and
some vendor gets something added to the ACPI specifications.

> But that still leaves the question whether and how to work around this
> for units in the field. Ignoring the PHY mode in the driver would
> help, as all known hardware ships with firmware that configures the
> PHY with the correct settings, but we will lose the ability to use
> other PHY modes in the future, in case the SoC is ever used with DT
> based minimal firmware that does not configure networking.
> 
> Since ACPI implies rich firmware, we could make ACPI probing of the
> driver ignore the phy-mode setting in the DSDT.

I'm O.K. with that, for this driver, but please add a comment in the
code about why ACPI ignores DSDT, because of older broken firmware.

> But if we don't do the same for DT, it would mean DT users are
> forced to upgrade their firmware, and hopefully do so before
> upgrading to a kernel that breaks networking.

Nothing new there. As i said, we have been through this before with
the Atheros PHY and others.

One option would be to move the DT into the kernel and fix it. Most
distributions already use the DT version shipped with the kernel, so
they would automatically get the fixed DT at the same time as the
kernel which needs the fix.

       Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 23:02                         ` Andrew Lunn
@ 2020-10-18 10:35                           ` Ard Biesheuvel
  2020-10-18 10:56                             ` Ard Biesheuvel
  2020-10-18 15:38                             ` Andrew Lunn
  0 siblings, 2 replies; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-18 10:35 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Daniel Thompson, Sumit Garg, Alex Bennée, Masami Hiramatsu,
	steve, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sun, 18 Oct 2020 at 01:02, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sun, Oct 18, 2020 at 12:19:25AM +0200, Ard Biesheuvel wrote:
> > (cc'ing some folks who may care about functional networking on SynQuacer)
> >
> > On Sat, 17 Oct 2020 at 21:49, Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > > So we can fix this firmware by just setting phy-mode to the empty
> > > > string, right?
> > >
> > > I've never actually tried it, but i think so. There are no DT files
> > > actually doing that, so you really do need to test it and see. And
> > > there might be some differences between device_get_phy_mode() and
> > > of_get_phy_mode().
> > >
> >
> > Yes, that works fine. Fixed now in the latest firmware build [0]
>
> Thanks for reporting back on that. It probably needs to be something
> we always recommend for ACPI systems, either use "", or preferably no
> phy-mode at all, and let the driver default to NA. ACPI and networking
> is at a very early stage at the moment, since UEFA says nothing about
> it. It will take a while before we figure out the best practices, and
> some vendor gets something added to the ACPI specifications.
>

You mean MDIO topology, right? Every x86 PC and server in the world
uses ACPI, and networking doesn't seem to be a problem there, so it is
simply a matter of choosing the right abstraction level. I know this
is much easier to achieve when all the network controllers are on PCIe
cards, but the point remains valid: exhaustively describing the entire
SoC like we do for DT is definitely not the right choice for ACPI
systems. This also means that ACPI is simply not the right fit for all
designs, and we should push back harder against the tick-the-box
exercises that are going on to use ACPI for describing unusual designs
that are never going to boot RHEL or other ACPI-only distros anyway.

> > But that still leaves the question whether and how to work around this
> > for units in the field. Ignoring the PHY mode in the driver would
> > help, as all known hardware ships with firmware that configures the
> > PHY with the correct settings, but we will lose the ability to use
> > other PHY modes in the future, in case the SoC is ever used with DT
> > based minimal firmware that does not configure networking.
> >
> > Since ACPI implies rich firmware, we could make ACPI probing of the
> > driver ignore the phy-mode setting in the DSDT.
>
> I'm O.K. with that, for this driver, but please add a comment in the
> code about why ACPI ignores DSDT, because of older broken firmware.
>

Sure.

> > But if we don't do the same for DT, it would mean DT users are
> > forced to upgrade their firmware, and hopefully do so before
> > upgrading to a kernel that breaks networking.
>
> Nothing new there. As i said, we have been through this before with
> the Atheros PHY and others.
>
> One option would be to move the DT into the kernel and fix it. Most
> distributions already use the DT version shipped with the kernel, so
> they would automatically get the fixed DT at the same time as the
> kernel which needs the fix.
>

The DT is not a flat file that you can simply source from elsewhere -
it is constructed at boot using firmware settings and other data that
is different between systems.


I do have a question about the history here, btw. As I understand it,
before commit f81dadbcf7fd ("net: phy: realtek: Add rtl8211e rx/tx
delays config"), the phy-mode setting did not have any effect on
Realtek PHYs in the first place, right? Since otherwise, this platform
would never have worked from the beginning.

So commit f81dadbcf7fd was backported to -stable, even though it
didn't actually work, and had to be fixed in bbc4d71d63549bcd ("net:
phy: realtek: fix rtl8211e rx/tx delay config"), which is the commit
that causes the regression on these boards.

So why was commit f81dadbcf7fd sent to -stable in the first place? It
doesn't have a fixes tag or cc:stable, and given that it is not
actually correct to begin with, there seems to be little justification
for this. Surely, no platform exists in the field that requires this
commit (as it is broken) or the followup fix (since only was never
taken into account before)

In summary, I think that - even if we agree that the way forward is to
fix the firmware and make the driver do the right thing without any
quirks - these patches do not belong in -stable, and should be
reverted, unless anyone can point out a system that was working
before, got broken and was fixed again by f81dadbcf7fd (i.e., a true
regression)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-18 10:35                           ` Ard Biesheuvel
@ 2020-10-18 10:56                             ` Ard Biesheuvel
  2020-10-18 15:45                               ` Andrew Lunn
  2020-10-18 15:38                             ` Andrew Lunn
  1 sibling, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-18 10:56 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Daniel Thompson, Sumit Garg, Alex Bennée, Masami Hiramatsu,
	steve, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sun, 18 Oct 2020 at 12:35, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Sun, 18 Oct 2020 at 01:02, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Sun, Oct 18, 2020 at 12:19:25AM +0200, Ard Biesheuvel wrote:
> > > (cc'ing some folks who may care about functional networking on SynQuacer)
> > >
> > > On Sat, 17 Oct 2020 at 21:49, Andrew Lunn <andrew@lunn.ch> wrote:
> > > >
> > > > > So we can fix this firmware by just setting phy-mode to the empty
> > > > > string, right?
> > > >
> > > > I've never actually tried it, but i think so. There are no DT files
> > > > actually doing that, so you really do need to test it and see. And
> > > > there might be some differences between device_get_phy_mode() and
> > > > of_get_phy_mode().
> > > >
> > >
> > > Yes, that works fine. Fixed now in the latest firmware build [0]
> >
> > Thanks for reporting back on that. It probably needs to be something
> > we always recommend for ACPI systems, either use "", or preferably no
> > phy-mode at all, and let the driver default to NA. ACPI and networking
> > is at a very early stage at the moment, since UEFA says nothing about
> > it. It will take a while before we figure out the best practices, and
> > some vendor gets something added to the ACPI specifications.
> >
>
> You mean MDIO topology, right? Every x86 PC and server in the world
> uses ACPI, and networking doesn't seem to be a problem there, so it is
> simply a matter of choosing the right abstraction level. I know this
> is much easier to achieve when all the network controllers are on PCIe
> cards, but the point remains valid: exhaustively describing the entire
> SoC like we do for DT is definitely not the right choice for ACPI
> systems. This also means that ACPI is simply not the right fit for all
> designs, and we should push back harder against the tick-the-box
> exercises that are going on to use ACPI for describing unusual designs
> that are never going to boot RHEL or other ACPI-only distros anyway.
>
> > > But that still leaves the question whether and how to work around this
> > > for units in the field. Ignoring the PHY mode in the driver would
> > > help, as all known hardware ships with firmware that configures the
> > > PHY with the correct settings, but we will lose the ability to use
> > > other PHY modes in the future, in case the SoC is ever used with DT
> > > based minimal firmware that does not configure networking.
> > >
> > > Since ACPI implies rich firmware, we could make ACPI probing of the
> > > driver ignore the phy-mode setting in the DSDT.
> >
> > I'm O.K. with that, for this driver, but please add a comment in the
> > code about why ACPI ignores DSDT, because of older broken firmware.
> >
>
> Sure.
>
> > > But if we don't do the same for DT, it would mean DT users are
> > > forced to upgrade their firmware, and hopefully do so before
> > > upgrading to a kernel that breaks networking.
> >
> > Nothing new there. As i said, we have been through this before with
> > the Atheros PHY and others.
> >
> > One option would be to move the DT into the kernel and fix it. Most
> > distributions already use the DT version shipped with the kernel, so
> > they would automatically get the fixed DT at the same time as the
> > kernel which needs the fix.
> >
>
> The DT is not a flat file that you can simply source from elsewhere -
> it is constructed at boot using firmware settings and other data that
> is different between systems.
>
>
> I do have a question about the history here, btw. As I understand it,
> before commit f81dadbcf7fd ("net: phy: realtek: Add rtl8211e rx/tx
> delays config"), the phy-mode setting did not have any effect on
> Realtek PHYs in the first place, right? Since otherwise, this platform
> would never have worked from the beginning.
>
> So commit f81dadbcf7fd was backported to -stable, even though it
> didn't actually work, and had to be fixed in bbc4d71d63549bcd ("net:
> phy: realtek: fix rtl8211e rx/tx delay config"), which is the commit
> that causes the regression on these boards.
>
> So why was commit f81dadbcf7fd sent to -stable in the first place? It
> doesn't have a fixes tag or cc:stable, and given that it is not
> actually correct to begin with, there seems to be little justification
> for this. Surely, no platform exists in the field that requires this
> commit (as it is broken) or the followup fix (since only was never
> taken into account before)
>
> In summary, I think that - even if we agree that the way forward is to
> fix the firmware and make the driver do the right thing without any
> quirks - these patches do not belong in -stable, and should be
> reverted, unless anyone can point out a system that was working
> before, got broken and was fixed again by f81dadbcf7fd (i.e., a true
> regression)

Oops - f81dadbcf7fd was never backported, only the fix was. Apologies
for mixing that up.

However, that leaves the question why bbc4d71d63549bcd was backported,
although I understand why the discussion is a bit trickier there. But
if it did not fix a regression, only broken code that never worked in
the first place, I am not convinced it belongs in -stable.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-17 22:19                       ` Ard Biesheuvel
  2020-10-17 23:02                         ` Andrew Lunn
@ 2020-10-18 14:20                         ` Masami Hiramatsu
  1 sibling, 0 replies; 26+ messages in thread
From: Masami Hiramatsu @ 2020-10-18 14:20 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Andrew Lunn, Daniel Thompson, Sumit Garg, Alex Bennée,
	steve, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sun, 18 Oct 2020 00:19:25 +0200
Ard Biesheuvel <ardb@kernel.org> wrote:

> (cc'ing some folks who may care about functional networking on SynQuacer)
> 
> On Sat, 17 Oct 2020 at 21:49, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > So we can fix this firmware by just setting phy-mode to the empty
> > > string, right?
> >
> > I've never actually tried it, but i think so. There are no DT files
> > actually doing that, so you really do need to test it and see. And
> > there might be some differences between device_get_phy_mode() and
> > of_get_phy_mode().
> >
> 
> Yes, that works fine. Fixed now in the latest firmware build [0]

Great! I've just started chasing that bug this Friday.
Thanks!

> But that still leaves the question whether and how to work around this
> for units in the field. Ignoring the PHY mode in the driver would
> help, as all known hardware ships with firmware that configures the
> PHY with the correct settings, but we will lose the ability to use
> other PHY modes in the future, in case the SoC is ever used with DT
> based minimal firmware that does not configure networking.
> 
> Since ACPI implies rich firmware, we could make ACPI probing of the
> driver ignore the phy-mode setting in the DSDT. But if we don't do the
> same for DT, it would mean DT users are forced to upgrade their
> firmware, and hopefully do so before upgrading to a kernel that breaks
> networking. (These boxes are often used headless, so this can be
> annoying)
> 
> 
> [0] http://snapshots.linaro.org/components/kernel/leg-96boards-developerbox-edk2/83/


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-18 10:35                           ` Ard Biesheuvel
  2020-10-18 10:56                             ` Ard Biesheuvel
@ 2020-10-18 15:38                             ` Andrew Lunn
  1 sibling, 0 replies; 26+ messages in thread
From: Andrew Lunn @ 2020-10-18 15:38 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Daniel Thompson, Sumit Garg, Alex Bennée, Masami Hiramatsu,
	steve, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

> > Thanks for reporting back on that. It probably needs to be something
> > we always recommend for ACPI systems, either use "", or preferably no
> > phy-mode at all, and let the driver default to NA. ACPI and networking
> > is at a very early stage at the moment, since UEFA says nothing about
> > it. It will take a while before we figure out the best practices, and
> > some vendor gets something added to the ACPI specifications.
> >
> 
> You mean MDIO topology, right?

That is part of it. I2C for SFPs, ethernet switches, etc.  I think
SFPs are going to be the real sticking point, since when you get above
10Gbps, you are into the land of SFPs, and server platforms are those
which will be going above 10G first.

> Every x86 PC and server in the world uses ACPI, and networking
> doesn't seem to be a problem there, so it is simply a matter of
> choosing the right abstraction level. I know this is much easier to
> achieve when all the network controllers are on PCIe cards, but the
> point remains valid: exhaustively describing the entire SoC like we
> do for DT is definitely not the right choice for ACPI systems.

Agreed. And i am pushing back. But we also have the conflict that some
SoC IP is used in systems that will boot RHEL, Debian, etc, and in
systems that are DT based. Do you want to write two drivers? One
assuming ACPI is doing all the work, and another where DT describes
the system and drivers and network core does all the work configuring
the hardware. For the same piece of hardware?

> I do have a question about the history here, btw. As I understand it,
> before commit f81dadbcf7fd ("net: phy: realtek: Add rtl8211e rx/tx
> delays config"), the phy-mode setting did not have any effect on
> Realtek PHYs in the first place, right? Since otherwise, this platform
> would never have worked from the beginning.

I suspect it did work to some extent, but not fully. So it could be,
it worked for whatever platform Serge was using, but failed in some
other cases, e.g. you board, and left the delays alone.

Later the vendor came along and said the code was wrong, and provided
some bits of information from the datasheet which is not publicly
available. As a result f81dadbcf7fd happened. Since that fixed a
previous patch, it was given a Fixes: tag and i assume back ported.

> So commit f81dadbcf7fd was backported to -stable, even though it
> didn't actually work, and had to be fixed in bbc4d71d63549bcd ("net:
> phy: realtek: fix rtl8211e rx/tx delay config"), which is the commit
> that causes the regression on these boards.
> 
> So why was commit f81dadbcf7fd sent to -stable in the first place?

f81dadbcf7fd first appears in v5.2. I don't see it in 4.19.152, the
LTS kernel older than that. So it is not in -stable.

    Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-18 10:56                             ` Ard Biesheuvel
@ 2020-10-18 15:45                               ` Andrew Lunn
  2020-10-25 14:16                                 ` Ard Biesheuvel
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2020-10-18 15:45 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Daniel Thompson, Sumit Garg, Alex Bennée, Masami Hiramatsu,
	steve, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

> However, that leaves the question why bbc4d71d63549bcd was backported,
> although I understand why the discussion is a bit trickier there. But
> if it did not fix a regression, only broken code that never worked in
> the first place, I am not convinced it belongs in -stable.

Please ask Serge Semin what platform he tested on. I kind of expect it
worked for him, in some limited way, enough that it passed his
testing.

	Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-18 15:45                               ` Andrew Lunn
@ 2020-10-25 14:16                                 ` Ard Biesheuvel
  2020-10-25 14:28                                   ` Andrew Lunn
  0 siblings, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-25 14:16 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Daniel Thompson, Sumit Garg, Alex Bennée, Masami Hiramatsu,
	Steve McIntyre, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sun, 18 Oct 2020 at 17:45, Andrew Lunn <andrew@lunn.ch> wrote:
>
> > However, that leaves the question why bbc4d71d63549bcd was backported,
> > although I understand why the discussion is a bit trickier there. But
> > if it did not fix a regression, only broken code that never worked in
> > the first place, I am not convinced it belongs in -stable.
>
> Please ask Serge Semin what platform he tested on. I kind of expect it
> worked for him, in some limited way, enough that it passed his
> testing.
>

I'll make a note here that a rather large number of platforms got
broken by the same fix for the Realtek PHY driver:

https://lore.kernel.org/lkml/?q=bbc4d71d6354

I seriously doubt whether disabling TX/RX delay when it is enabled by
h/w straps is the right thing to do here.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-25 14:16                                 ` Ard Biesheuvel
@ 2020-10-25 14:28                                   ` Andrew Lunn
  2020-10-25 14:34                                     ` Ard Biesheuvel
  0 siblings, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2020-10-25 14:28 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Daniel Thompson, Sumit Garg, Alex Bennée, Masami Hiramatsu,
	Steve McIntyre, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sun, Oct 25, 2020 at 03:16:36PM +0100, Ard Biesheuvel wrote:
> On Sun, 18 Oct 2020 at 17:45, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > However, that leaves the question why bbc4d71d63549bcd was backported,
> > > although I understand why the discussion is a bit trickier there. But
> > > if it did not fix a regression, only broken code that never worked in
> > > the first place, I am not convinced it belongs in -stable.
> >
> > Please ask Serge Semin what platform he tested on. I kind of expect it
> > worked for him, in some limited way, enough that it passed his
> > testing.
> >
> 
> I'll make a note here that a rather large number of platforms got
> broken by the same fix for the Realtek PHY driver:
> 
> https://lore.kernel.org/lkml/?q=bbc4d71d6354
> 
> I seriously doubt whether disabling TX/RX delay when it is enabled by
> h/w straps is the right thing to do here.

The device tree is explicitly asking for rgmii. If it wanted the
hardware left alone, it should of used PHY_INTERFACE_MODE_NA.

But we might be able to compromise for a cycle or two. As far as i
understand the hardware, we can read the strapping. If we find the
strapping resisters are present, but rgmii is in DT, print a warning
that the device tree needs upgrading, and ignore the DT mode. We can
add this to stable, but not net-next.

    Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-25 14:28                                   ` Andrew Lunn
@ 2020-10-25 14:34                                     ` Ard Biesheuvel
  2020-10-25 14:42                                       ` Andrew Lunn
  0 siblings, 1 reply; 26+ messages in thread
From: Ard Biesheuvel @ 2020-10-25 14:34 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Daniel Thompson, Sumit Garg, Alex Bennée, Masami Hiramatsu,
	Steve McIntyre, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sun, 25 Oct 2020 at 15:29, Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Sun, Oct 25, 2020 at 03:16:36PM +0100, Ard Biesheuvel wrote:
> > On Sun, 18 Oct 2020 at 17:45, Andrew Lunn <andrew@lunn.ch> wrote:
> > >
> > > > However, that leaves the question why bbc4d71d63549bcd was backported,
> > > > although I understand why the discussion is a bit trickier there. But
> > > > if it did not fix a regression, only broken code that never worked in
> > > > the first place, I am not convinced it belongs in -stable.
> > >
> > > Please ask Serge Semin what platform he tested on. I kind of expect it
> > > worked for him, in some limited way, enough that it passed his
> > > testing.
> > >
> >
> > I'll make a note here that a rather large number of platforms got
> > broken by the same fix for the Realtek PHY driver:
> >
> > https://lore.kernel.org/lkml/?q=bbc4d71d6354
> >
> > I seriously doubt whether disabling TX/RX delay when it is enabled by
> > h/w straps is the right thing to do here.
>
> The device tree is explicitly asking for rgmii. If it wanted the
> hardware left alone, it should of used PHY_INTERFACE_MODE_NA.
>

Would you suggest that these DTs remove the phy-mode instead? As I
don't see anyone proposing that.

> But we might be able to compromise for a cycle or two. As far as i
> understand the hardware, we can read the strapping. If we find the
> strapping resisters are present, but rgmii is in DT, print a warning
> that the device tree needs upgrading, and ignore the DT mode. We can
> add this to stable, but not net-next.
>

That sounds reasonable, given how many different platforms seem to be
affected, and production ones may be running stable distro kernels,
and not expecting their Ethernet to fail without warning.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: realtek PHY commit bbc4d71d63549 causes regression
  2020-10-25 14:34                                     ` Ard Biesheuvel
@ 2020-10-25 14:42                                       ` Andrew Lunn
  0 siblings, 0 replies; 26+ messages in thread
From: Andrew Lunn @ 2020-10-25 14:42 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Daniel Thompson, Sumit Garg, Alex Bennée, Masami Hiramatsu,
	Steve McIntyre, Ilias Apalodimas,
	open list:BPF JIT for MIPS (32-BIT AND 64-BIT),
	Willy Liu, David S. Miller, Jakub Kicinski, Sasha Levin,
	Florian Fainelli, Heiner Kallweit, Masahisa Kojima

On Sun, Oct 25, 2020 at 03:34:06PM +0100, Ard Biesheuvel wrote:
> On Sun, 25 Oct 2020 at 15:29, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > On Sun, Oct 25, 2020 at 03:16:36PM +0100, Ard Biesheuvel wrote:
> > > On Sun, 18 Oct 2020 at 17:45, Andrew Lunn <andrew@lunn.ch> wrote:
> > > >
> > > > > However, that leaves the question why bbc4d71d63549bcd was backported,
> > > > > although I understand why the discussion is a bit trickier there. But
> > > > > if it did not fix a regression, only broken code that never worked in
> > > > > the first place, I am not convinced it belongs in -stable.
> > > >
> > > > Please ask Serge Semin what platform he tested on. I kind of expect it
> > > > worked for him, in some limited way, enough that it passed his
> > > > testing.
> > > >
> > >
> > > I'll make a note here that a rather large number of platforms got
> > > broken by the same fix for the Realtek PHY driver:
> > >
> > > https://lore.kernel.org/lkml/?q=bbc4d71d6354
> > >
> > > I seriously doubt whether disabling TX/RX delay when it is enabled by
> > > h/w straps is the right thing to do here.
> >
> > The device tree is explicitly asking for rgmii. If it wanted the
> > hardware left alone, it should of used PHY_INTERFACE_MODE_NA.
> >
> 
> Would you suggest that these DTs remove the phy-mode instead? As I
> don't see anyone proposing that.

What is also O.K, for most MAC drivers. Some might enforce it is
present, in which case, you can set it to "", which will get parsed as
PHY_INTERFACE_MODE_NA. But a few MAC drivers might configure there MII
bus depending on the PHY mode, RGMII vs GMII.

    Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, back to index

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-17 14:20 realtek PHY commit bbc4d71d63549 causes regression Ard Biesheuvel
2020-10-17 14:44 ` Andrew Lunn
2020-10-17 14:46   ` Ard Biesheuvel
2020-10-17 15:11     ` Andrew Lunn
2020-10-17 15:18       ` Ard Biesheuvel
2020-10-17 16:14         ` Ilias Apalodimas
2020-10-17 16:17           ` Ard Biesheuvel
2020-10-17 16:21             ` Ilias Apalodimas
2020-10-17 18:04             ` Andrew Lunn
2020-10-17 18:11               ` Ard Biesheuvel
2020-10-17 18:17                 ` Ard Biesheuvel
2020-10-17 18:27                 ` Andrew Lunn
2020-10-17 18:55                   ` Ard Biesheuvel
2020-10-17 19:49                     ` Andrew Lunn
2020-10-17 22:19                       ` Ard Biesheuvel
2020-10-17 23:02                         ` Andrew Lunn
2020-10-18 10:35                           ` Ard Biesheuvel
2020-10-18 10:56                             ` Ard Biesheuvel
2020-10-18 15:45                               ` Andrew Lunn
2020-10-25 14:16                                 ` Ard Biesheuvel
2020-10-25 14:28                                   ` Andrew Lunn
2020-10-25 14:34                                     ` Ard Biesheuvel
2020-10-25 14:42                                       ` Andrew Lunn
2020-10-18 15:38                             ` Andrew Lunn
2020-10-18 14:20                         ` Masami Hiramatsu
2020-10-17 18:01           ` Andrew Lunn

Netdev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git
	git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \
		netdev@vger.kernel.org
	public-inbox-index netdev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netdev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git