netdev.vger.kernel.org archive mirror
* network unstable on odroid-c1/meson8b.
@ 2019-06-19 20:18 Aymeric
  2019-06-19 22:14 ` Heiner Kallweit
  0 siblings, 1 reply; 6+ messages in thread
From: Aymeric @ 2019-06-19 20:18 UTC (permalink / raw)
  To: netdev; +Cc: linux-amlogic



Hello all,

I have an ODROID-C1 board (meson8b/S805) and I see some network
instability with the current mainline kernel; at the time of writing I
have tested 5.0.y, 5.1.y and 5.2-rc4, and no other versions.

After some discussion on the linux-amlogic mailing list, I was pointed
here to find and, hopefully, fix the issue.
The whole thread on linux-amlogic is available here: [¹]

A short summary:
1. With kernel 3.10.something built by Hardkernel (the board vendor),
the network link comes up at 1 gigabit and stays at 1 gigabit.
2. With mainline kernels 5.0.y and 5.1.y, the network link goes up and
down every few seconds at 1 gigabit (making the board unusable) but
works fine when forced to 100Mb (using the ethtool command).
3. The Ethernet cable is not the cause of the issue (see #4).
4. After a few more checks, I was able to narrow the problem down. It is
only present when the board is connected to my "internet box" (a Livebox
3/Sagemcom) but not with a "stupid" D-Link switch (both are gigabit
capable).
5. With help from Martin on linux-amlogic I tried to disable EEE
in the dtb but it didn't change anything.
6. An extract of the dmesg output, grepped for ethernet and meson while
the issue is occurring, is here: [²].


And the last comment from Martin, which is why I'm sending a mail here:
- the Amlogic SoCs use a DesignWare MAC (Ethernet controller, the driver
is called stmmac) with a Realtek RTL8211F Ethernet PHY.
- there are few Amlogic-specific registers involved: they mostly
control the PHY interface (enabling RMII or RGMII) and the clocks, so
it's very likely that someone on the netdev list has an idea how to
debug this, because a large part of the Ethernet setup is not Amlogic SoC
specific

So if you have any idea how to fix this issue... :)

Thanks in advance,

Aymeric.


[¹]:
http://lists.infradead.org/pipermail/linux-amlogic/2019-June/012341.html
[²]:
https://paste.aplu.fr/?b5eb6df48a9c95b6#sqHk8xhWGwRfagWNpL+u7mIsPGWVWFn2d7xBqika8Kc=
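
As a possible variation on item 2 of the summary, the link can also be kept away from gigabit without turning autonegotiation off, by advertising only the slower modes. The sketch below only computes the mask; the bit values are the legacy link-mode bits documented for "ethtool -s <dev> advertise", while the function name cap_advertised_speed and the interface name eth0 are assumptions made for illustration.

```c
#include <stdint.h>

/* Legacy link-mode bits as accepted by "ethtool -s <dev> advertise <mask>". */
#define ADV_10_HALF    0x001u
#define ADV_10_FULL    0x002u
#define ADV_100_HALF   0x004u
#define ADV_100_FULL   0x008u
#define ADV_1000_HALF  0x010u
#define ADV_1000_FULL  0x020u

/*
 * Hypothetical helper: drop every mode faster than max_mbps from the
 * set of modes the interface supports. Only 10/100/1000 are modelled.
 */
uint32_t cap_advertised_speed(uint32_t supported, unsigned int max_mbps)
{
    uint32_t allowed = 0;

    if (max_mbps >= 10)
        allowed |= ADV_10_HALF | ADV_10_FULL;
    if (max_mbps >= 100)
        allowed |= ADV_100_HALF | ADV_100_FULL;
    if (max_mbps >= 1000)
        allowed |= ADV_1000_HALF | ADV_1000_FULL;

    return supported & allowed;
}
```

With all six modes supported (0x03f) and a cap of 100, the result is 0x00f, which could then be tried with "ethtool -s eth0 advertise 0x00f". Whether the link stays stable that way on this board is not something the thread establishes.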





* Re: network unstable on odroid-c1/meson8b.
  2019-06-19 20:18 network unstable on odroid-c1/meson8b Aymeric
@ 2019-06-19 22:14 ` Heiner Kallweit
  2019-06-20  7:55   ` Aymeric
  0 siblings, 1 reply; 6+ messages in thread
From: Heiner Kallweit @ 2019-06-19 22:14 UTC (permalink / raw)
  To: Aymeric, netdev; +Cc: linux-amlogic, Martin Blumenstingl

On 19.06.2019 22:18, Aymeric wrote:
> Hello all,
> 
> I have an ODROID-C1 board (meson8b/S805) and I see some network
> instability with the current mainline kernel; at the time of writing I
> have tested 5.0.y, 5.1.y and 5.2-rc4, and no other versions.
> 
> After some discussion on the linux-amlogic mailing list, I was pointed
> here to find and, hopefully, fix the issue.
> The whole thread on linux-amlogic is available here: [¹]
> 
> A short summary:
> 1. With kernel 3.10.something built by Hardkernel (the board vendor),
> the network link comes up at 1 gigabit and stays at 1 gigabit.
> 2. With mainline kernels 5.0.y and 5.1.y, the network link goes up and
> down every few seconds at 1 gigabit (making the board unusable) but
> works fine when forced to 100Mb (using the ethtool command).
> 3. The Ethernet cable is not the cause of the issue (see #4).
> 4. After a few more checks, I was able to narrow the problem down. It is
> only present when the board is connected to my "internet box" (a Livebox
> 3/Sagemcom) but not with a "stupid" D-Link switch (both are gigabit
> capable).
> 5. With help from Martin on linux-amlogic I tried to disable EEE
> in the dtb but it didn't change anything.
> 6. An extract of the dmesg output, grepped for ethernet and meson while
> the issue is occurring, is here: [²].
> 
Kernel 3.10 didn't have a dedicated RTL8211F PHY driver yet, therefore
I assume the genphy driver was used. Do you have a line with
"attached PHY driver" in dmesg output of the vendor kernel?

The dedicated PHY driver takes care of the tx delay, if the genphy
driver is used we have to rely on what uboot configured.
But if we indeed had an issue with a misconfigured delay, I think
the connection shouldn't be fine with just another link partner.
Just to have it tested you could make rtl8211f_config_init() in
drivers/net/phy/realtek.c a no-op (in current kernels).

And you could compare at least the basic PHY registers 0x00 - 0x30
with both kernel versions, e.g. with phytool.

> 
> And the last comment from Martin, which is why I'm sending a mail here:
> - the Amlogic SoCs use a DesignWare MAC (Ethernet controller, the driver
> is called stmmac) with a Realtek RTL8211F Ethernet PHY.
> - there are few Amlogic-specific registers involved: they mostly
> control the PHY interface (enabling RMII or RGMII) and the clocks, so
> it's very likely that someone on the netdev list has an idea how to
> debug this, because a large part of the Ethernet setup is not Amlogic SoC
> specific
> 
> So if you have any idea how to fix this issue... :)
> 
> Thanks in advance,
> 
> Aymeric.
> 
Heiner
> 
> [¹]:
> http://lists.infradead.org/pipermail/linux-amlogic/2019-June/012341.html
> [²]:
> https://paste.aplu.fr/?b5eb6df48a9c95b6#sqHk8xhWGwRfagWNpL+u7mIsPGWVWFn2d7xBqika8Kc=
> 
> 
> 



* Re: network unstable on odroid-c1/meson8b.
  2019-06-19 22:14 ` Heiner Kallweit
@ 2019-06-20  7:55   ` Aymeric
  2019-06-20 15:53     ` Heiner Kallweit
  0 siblings, 1 reply; 6+ messages in thread
From: Aymeric @ 2019-06-20  7:55 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: netdev, linux-amlogic, Martin Blumenstingl

Hi,
On 2019-06-20 00:14, Heiner Kallweit wrote:
> On 19.06.2019 22:18, Aymeric wrote:
>> Hello all,
>> 

> Kernel 3.10 didn't have a dedicated RTL8211F PHY driver yet, therefore
> I assume the genphy driver was used. Do you have a line with
> "attached PHY driver" in dmesg output of the vendor kernel?

No.
Here is the full output of the dmesg from vendor kernel [¹].

I've also noticed something strange that might be linked: the MAC address 
of the board is set to a random value when using the mainline kernel, so 
I have to set it manually, but this does not happen with the vendor kernel.

> 
> The dedicated PHY driver takes care of the tx delay, if the genphy
> driver is used we have to rely on what uboot configured.
> But if we indeed had an issue with a misconfigured delay, I think
> the connection shouldn't be fine with just another link partner.
> Just to have it tested you could make rtl8211f_config_init() in
> drivers/net/phy/realtek.c a no-op (in current kernels).
> 

I'm not an expert here; would just adding a "return 0;" here [²] be 
enough?

> And you could compare at least the basic PHY registers 0x00 - 0x30
> with both kernel versions, e.g. with phytool.
> 

They are not the same, but I don't know what I'm looking for, so here are 
the dumps for kernel 3.10 [³] and for kernel 5.1.12 [⁴].

Aymeric

[¹]: 
https://paste.aplu.fr/?38ef95b44ebdbfc3#G666/YbhgU+O+tdC/2HaimUCigm8ZTB44qvQip/HJ5A=
[²]: 
https://github.com/torvalds/linux/blob/241e39004581475b2802cd63c111fec43bb0123e/drivers/net/phy/realtek.c#L164
[³]: 
https://paste.aplu.fr/?2dde1c32d5c68f4c#6xIa8MjTm6jpI6citEJAqFTLMMHDjFZRet/M00/EwjU=
[⁴]: 
https://paste.aplu.fr/?32130e9bcb05dde7#N/xdnvb5GklcJtiOxMpTCm+9gsUliRwH8X3dcwSV+ng=
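
For the register comparison discussed above, a small helper can show which registers differ and which bits are worth a first look. This is only a sketch: the register numbers and bit masks are the standard clause 22 ones (the same values linux/mii.h uses), the function names diff_regs and gigabit_trouble are made up for illustration, and no values from the pastes are assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* IEEE 802.3 clause 22 register numbers (same values as in linux/mii.h). */
enum {
    REG_BMCR     = 0x00,  /* basic mode control */
    REG_BMSR     = 0x01,  /* basic mode status; link bit latches low */
    REG_CTRL1000 = 0x09,  /* 1000BASE-T control */
    REG_STAT1000 = 0x0a   /* 1000BASE-T status */
};

#define STAT1000_MSFAIL    0x8000u  /* master/slave configuration fault */
#define STAT1000_LOCALRXOK 0x2000u  /* local receiver status OK */
#define STAT1000_REMRXOK   0x1000u  /* remote receiver status OK */

/* Print every register that differs; return the number of differences. */
int diff_regs(const uint16_t *a, const uint16_t *b, int count)
{
    int i, ndiff = 0;

    assert(a && b && count >= 0);
    for (i = 0; i < count; i++) {
        if (a[i] != b[i]) {
            printf("reg 0x%02x: 0x%04x -> 0x%04x\n",
                   (unsigned)i, (unsigned)a[i], (unsigned)b[i]);
            ndiff++;
        }
    }
    return ndiff;
}

/*
 * Return 1 if the 1000BASE-T status register reports a master/slave
 * fault or a receiver that is not OK. Only meaningful while the link
 * is (trying to be) up at gigabit speed.
 */
int gigabit_trouble(const uint16_t *regs)
{
    uint16_t st = regs[REG_STAT1000];

    if (st & STAT1000_MSFAIL)
        return 1;
    return !(st & STAT1000_LOCALRXOK) || !(st & STAT1000_REMRXOK);
}
```

The two arrays would be filled from the phytool dumps of the vendor and mainline kernels, taken while the link is flapping. Differences in registers 0x09 and 0x0a are probably the most interesting ones for a problem that only shows up at gigabit speed.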


* Re: network unstable on odroid-c1/meson8b.
  2019-06-20  7:55   ` Aymeric
@ 2019-06-20 15:53     ` Heiner Kallweit
  2019-06-20 20:54       ` Aymeric
  0 siblings, 1 reply; 6+ messages in thread
From: Heiner Kallweit @ 2019-06-20 15:53 UTC (permalink / raw)
  To: Aymeric; +Cc: netdev, linux-amlogic, Martin Blumenstingl

On 20.06.2019 09:55, Aymeric wrote:
> Hi,
> On 2019-06-20 00:14, Heiner Kallweit wrote:
>> On 19.06.2019 22:18, Aymeric wrote:
>>> Hello all,
>>>
> 
>> Kernel 3.10 didn't have a dedicated RTL8211F PHY driver yet, therefore
>> I assume the genphy driver was used. Do you have a line with
>> "attached PHY driver" in dmesg output of the vendor kernel?
> 
> No.
> Here is the full output of the dmesg from vendor kernel [¹].
> 
> I've also noticed something strange that might be linked: the MAC address of the board is set to a random value when using the mainline kernel, so I have to set it manually, but this does not happen with the vendor kernel.
> 
>>
>> The dedicated PHY driver takes care of the tx delay, if the genphy
>> driver is used we have to rely on what uboot configured.
>> But if we indeed had an issue with a misconfigured delay, I think
>> the connection shouldn't be fine with just another link partner.
>> Just to have it tested you could make rtl8211f_config_init() in
>> drivers/net/phy/realtek.c a no-op (in current kernels).
>>
> 
> I'm not an expert here; would just adding a "return 0;" here [²] be enough?
> 
>> And you could compare at least the basic PHY registers 0x00 - 0x30
>> with both kernel versions, e.g. with phytool.
>>
> 
> They are not the same, but I don't know what I'm looking for, so here are the dumps for kernel 3.10 [³] and for kernel 5.1.12 [⁴].
> 
> Aymeric
> 
> [¹]: https://paste.aplu.fr/?38ef95b44ebdbfc3#G666/YbhgU+O+tdC/2HaimUCigm8ZTB44qvQip/HJ5A=
> [²]: https://github.com/torvalds/linux/blob/241e39004581475b2802cd63c111fec43bb0123e/drivers/net/phy/realtek.c#L164
> [³]: https://paste.aplu.fr/?2dde1c32d5c68f4c#6xIa8MjTm6jpI6citEJAqFTLMMHDjFZRet/M00/EwjU=
> [⁴]: https://paste.aplu.fr/?32130e9bcb05dde7#N/xdnvb5GklcJtiOxMpTCm+9gsUliRwH8X3dcwSV+ng=
> 

The vendor kernel has some, but not really much magic:
https://github.com/hardkernel/linux/blob/odroidc-3.10.y/drivers/amlogic/ethernet/phy/am_rtl8211f.c
The write to RTL8211F_PHYCR2 is overwritten later, therefore we don't have to consider it.

The following should make the current Realtek PHY driver behave like in the vendor driver.
Could you test it?


diff --git a/drivers/net/phy/realtek.c b/drivers/net/phy/realtek.c
index a669945eb..f300b1cc9 100644
--- a/drivers/net/phy/realtek.c
+++ b/drivers/net/phy/realtek.c
@@ -163,6 +163,10 @@ static int rtl8211f_config_init(struct phy_device *phydev)
 {
 	u16 val;
 
+	phy_write_paged(phydev, 0x0a43, 0x19, 0x0803);
+	genphy_soft_reset(phydev);
+	return 0;
+
 	/* enable TX-delay for rgmii-{id,txid}, and disable it for rgmii and
 	 * rgmii-rxid. The RX-delay can be enabled by the external RXDLY pin.
 	 */
-- 
2.22.0
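
For readers not familiar with paged PHY registers, the phy_write_paged() call in the patch can be pictured as: remember the current page, select the requested page, write the register, then restore the old page. The model below is a userspace sketch under that assumption, not kernel code; struct mock_phy and the mock_* helpers are hypothetical, and 0x1f is the page-select register the Realtek driver uses.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SELECT 0x1f  /* page-select register used by the Realtek driver */

/* Hypothetical in-memory stand-in for a PHY; not a kernel structure. */
struct mock_phy {
    uint16_t page;       /* currently selected page */
    uint16_t last_page;  /* page that was active for the last data write */
    uint8_t  last_reg;   /* register number of the last data write */
    uint16_t last_val;   /* value of the last data write */
};

/* Only the page-select register is modelled for reads. */
uint16_t mock_read(const struct mock_phy *phy, uint8_t reg)
{
    return reg == PAGE_SELECT ? phy->page : 0;
}

/* A write to PAGE_SELECT switches pages; any other write is recorded. */
void mock_write(struct mock_phy *phy, uint8_t reg, uint16_t val)
{
    if (reg == PAGE_SELECT) {
        phy->page = val;
        return;
    }
    phy->last_page = phy->page;
    phy->last_reg = reg;
    phy->last_val = val;
}

/* Save the current page, write on the requested page, restore the page. */
void paged_write(struct mock_phy *phy, uint16_t page, uint8_t reg, uint16_t val)
{
    uint16_t old_page;

    assert(reg != PAGE_SELECT);
    old_page = mock_read(phy, PAGE_SELECT);
    mock_write(phy, PAGE_SELECT, page);
    mock_write(phy, reg, val);
    mock_write(phy, PAGE_SELECT, old_page);
}
```

Calling paged_write(&phy, 0x0a43, 0x19, 0x0803) on this model records a write of 0x0803 to register 0x19 on page 0x0a43 and leaves the originally selected page active afterwards.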




* Re: network unstable on odroid-c1/meson8b.
  2019-06-20 15:53     ` Heiner Kallweit
@ 2019-06-20 20:54       ` Aymeric
  2019-06-23 18:16         ` Aymeric
  0 siblings, 1 reply; 6+ messages in thread
From: Aymeric @ 2019-06-20 20:54 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: netdev, linux-amlogic, Martin Blumenstingl


On 20/06/2019 at 17:53, Heiner Kallweit wrote:
> The vendor kernel has some, but not really much magic:
> https://github.com/hardkernel/linux/blob/odroidc-3.10.y/drivers/amlogic/ethernet/phy/am_rtl8211f.c
> The write to RTL8211F_PHYCR2 is overwritten later, therefore we don't have to consider it.
>
> The following should make the current Realtek PHY driver behave like in the vendor driver.
> Could you test it?

(sending again for mailing list, sorry, I forgot to force it in plaintext…)

I've applied your patch and tried but it doesn't change anything.

Here is dmesg output and phytool results.

https://paste.aplu.fr/?9735c99907528929#SeCgwR45cgnbDA1tXIVBHCBT8RNct2r41jU6vsguLVc=

-- 
Aymeric


* Re: network unstable on odroid-c1/meson8b.
  2019-06-20 20:54       ` Aymeric
@ 2019-06-23 18:16         ` Aymeric
  0 siblings, 0 replies; 6+ messages in thread
From: Aymeric @ 2019-06-23 18:16 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: Martin Blumenstingl, netdev, linux-amlogic


On 20/06/2019 at 22:54, Aymeric wrote:
> (sending again for mailing list, sorry, I forgot to force it in plaintext…)
>
> I've applied your patch and tried but it doesn't change anything.
>
> Here is dmesg output and phytool results.
>
> https://paste.aplu.fr/?9735c99907528929#SeCgwR45cgnbDA1tXIVBHCBT8RNct2r41jU6vsguLVc=
>
Hello all,

I have some news from a friend who has the same issue as me; his board
is connected to an "intelligent" switch, a Ubiquiti EdgeSwitch.

Also, when he forces the link to 100 it is stable.

Aymeric.

-- 
Aymeric

