netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] fec mdio times out under system stress
@ 2019-08-11 13:37 Russell King - ARM Linux admin
  2019-08-11 14:06 ` Russell King - ARM Linux admin
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Russell King - ARM Linux admin @ 2019-08-11 13:37 UTC (permalink / raw)
  To: linux-arm-kernel, Fabio Estevam, netdev, Andrew Lunn,
	Florian Fainelli, Heiner Kallweit

Hi Fabio,

When I woke up this morning, I found that one of the Hummingboards
had gone offline (as in, lost network link) during the night.
Investigating, I find that the system had gone into OOM, and at
that time, triggered an unrelated:

[4111697.698776] fec 2188000.ethernet eth0: MDIO read timeout
[4111697.712996] MII_DATA: 0x6006796d
[4111697.729415] MII_SPEED: 0x0000001a
[4111697.745232] IEVENT: 0x00000000
[4111697.745242] IMASK: 0x0a8000aa
[4111698.002233] Atheros 8035 ethernet 2188000.ethernet-1:00: PHY state change RUNNING -> HALTED
[4111698.009882] fec 2188000.ethernet eth0: Link is Down

This is on a dual-core iMX6.

It looks like the read actually completed (since MII_DATA contains
the register data) but we somehow lost the interrupt (or maybe
received the interrupt after wait_for_completion_timeout() timed
out.)

From what I can see, the OOM events happened on CPU1, CPU1 was
allocated the FEC interrupt, and the PHY polling that suffered the
MDIO timeout was on CPU0.

Given that IEVENT is zero, it seems that CPU1 had read serviced the
interrupt, but it is not clear how far through processing that it
was - it may be that fec_enet_interrupt() had been delayed by the
OOM condition.

This seems rather fragile - as the system slowing down due to OOM
triggers the network to completely collapse by phylib taking the
PHY offline, making the system inaccessible except through the
console.

In my case, even serial console wasn't operational (except for
magic sysrq).  Not sure what agetty was playing at... so the only
way I could recover any information from the system was to connect
the HDMI and plug in a USB keyboard.

Any thoughts on how FEC MDIO accesses could be made more robust?

Maybe phylib should retry a number of times - but with read-sensitive
registers, if the read has already completed successfully, and its
just a problem with the FEC MDIO hardware, that could cause issues.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] fec mdio times out under system stress
  2019-08-11 13:37 [BUG] fec mdio times out under system stress Russell King - ARM Linux admin
@ 2019-08-11 14:06 ` Russell King - ARM Linux admin
  2019-08-11 14:54 ` Vladimir Oltean
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Russell King - ARM Linux admin @ 2019-08-11 14:06 UTC (permalink / raw)
  To: linux-arm-kernel, Fabio Estevam, netdev, Andrew Lunn,
	Florian Fainelli, Heiner Kallweit

On Sun, Aug 11, 2019 at 02:37:07PM +0100, Russell King - ARM Linux admin wrote:
> Hi Fabio,
> 
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, and at
> that time, triggered an unrelated:
> 
> [4111697.698776] fec 2188000.ethernet eth0: MDIO read timeout
> [4111697.712996] MII_DATA: 0x6006796d
> [4111697.729415] MII_SPEED: 0x0000001a
> [4111697.745232] IEVENT: 0x00000000
> [4111697.745242] IMASK: 0x0a8000aa
> [4111698.002233] Atheros 8035 ethernet 2188000.ethernet-1:00: PHY state change RUNNING -> HALTED
> [4111698.009882] fec 2188000.ethernet eth0: Link is Down
> 
> This is on a dual-core iMX6.
> 
> It looks like the read actually completed (since MII_DATA contains
> the register data) but we somehow lost the interrupt (or maybe
> received the interrupt after wait_for_completion_timeout() timed
> out.)
> 
> From what I can see, the OOM events happened on CPU1, CPU1 was
> allocated the FEC interrupt, and the PHY polling that suffered the
> MDIO timeout was on CPU0.
> 
> Given that IEVENT is zero, it seems that CPU1 had read serviced the
> interrupt, but it is not clear how far through processing that it
> was - it may be that fec_enet_interrupt() had been delayed by the
> OOM condition.
> 
> This seems rather fragile - as the system slowing down due to OOM
> triggers the network to completely collapse by phylib taking the
> PHY offline, making the system inaccessible except through the
> console.
> 
> In my case, even serial console wasn't operational (except for
> magic sysrq).  Not sure what agetty was playing at... so the only
> way I could recover any information from the system was to connect
> the HDMI and plug in a USB keyboard.
> 
> Any thoughts on how FEC MDIO accesses could be made more robust?
> 
> Maybe phylib should retry a number of times - but with read-sensitive
> registers, if the read has already completed successfully, and its
> just a problem with the FEC MDIO hardware, that could cause issues.

I should also note for the phylib people:

After phylib has received an error, and has entered the HALTED state,
downing the interface produces this warning, which seems rather unfair
as the FEC doesn't know that its PHY suffered an error - it is merely
reversing the phy_start() that happened when the interface was opened.

[4144039.099786] ------------[ cut here ]------------
[4144039.109001] WARNING: CPU: 1 PID: 25842 at drivers/net/phy/phy.c:835 fec_enet_close+0x14/0x148
[4144039.124366] called from state HALTED
[4144039.132626] Modules linked in: 8021q brcmfmac brcmutil imx_thermal
snd_soc_imx_spdif snd_soc_imx_audmux nvmem_imx_ocotp cfg80211
snd_soc_sgtl5000 imx_sdma
virt_dma rc_cec snd_soc_fsl_ssi snd_soc_fsl_spdif coda imx_pcm_dma
v4l2_mem2mem
imx_vdoa etnaviv videobuf2_dma_contig gpu_sched dw_hdmi_cec
dw_hdmi_ahb_audio imx6q_cpufreq caamrng caam_jr caam error ip_tables
x_tables [last unloaded: evbug][4144039.177249] CPU: 1 PID: 25842 Comm:
ip Tainted: G        W         5.1.0+ #319
[4144039.189477] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[4144039.201025] [<c00194f0>] (unwind_backtrace) from [<c0014748>] (show_stack+0x10/0x14)
[4144039.213769] [<c0014748>] (show_stack) from [<c082e11c>] (dump_stack+0x9c/0xd4)
[4144039.225959] [<c082e11c>] (dump_stack) from [<c0030f3c>] (__warn+0xf8/0x124)
[4144039.237828] [<c0030f3c>] (__warn) from [<c0031030>] (warn_slowpath_fmt+0x38/0x48)
[4144039.250275] [<c0031030>] (warn_slowpath_fmt) from [<c052cb14>] (fec_enet_close+0x14/0x148)
[4144039.263558] [<c052cb14>] (fec_enet_close) from [<c066b5bc>] (__dev_close_many+0x88/0xf0)
[4144039.276639] [<c066b5bc>] (__dev_close_many) from [<c06740cc>] (__dev_change_flags+0xa4/0x1a0)
[4144039.290130] [<c06740cc>] (__dev_change_flags) from [<c06741e8>] (dev_change_flags+0x18/0x48)
[4144039.303558] [<c06741e8>] (dev_change_flags) from [<c068db80>] (do_setlink+0x29c/0x990)
[4144039.316517] [<c068db80>] (do_setlink) from [<c068eae8>] (__rtnl_newlink+0x4a4/0x72c)
[4144039.329282] [<c068eae8>] (__rtnl_newlink) from [<c068edb0>] (rtnl_newlink+0x40/0x60)
[4144039.342023] [<c068edb0>] (rtnl_newlink) from [<c0689a10>] (rtnetlink_rcv_msg+0x244/0x470)
[4144039.355224] [<c0689a10>] (rtnetlink_rcv_msg) from [<c06d20a0>] (netlink_rcv_skb+0xe0/0xf4)
[4144039.368492] [<c06d20a0>] (netlink_rcv_skb) from [<c06d194c>] (netlink_unicast+0x170/0x1b8)
[4144039.381758] [<c06d194c>] (netlink_unicast) from [<c06d1cdc>] (netlink_sendmsg+0x2b8/0x340)
[4144039.395023] [<c06d1cdc>] (netlink_sendmsg) from [<c064999c>] (sock_sendmsg+0x14/0x24)
[4144039.407825] [<c064999c>] (sock_sendmsg) from [<c064a5a0>] (___sys_sendmsg+0x200/0x214)
[4144039.420714] [<c064a5a0>] (___sys_sendmsg) from [<c064b0c4>] (__sys_sendmsg+0x40/0x6c)
[4144039.433506] [<c064b0c4>] (__sys_sendmsg) from [<c0009000>] (ret_fast_syscall+0x0/0x28)
[4144039.446366] Exception stack(0xe1933fa8 to 0xe1933ff0)
[4144039.456329] 3fa0:                   bea086f0 bea086bc 00000003 bea086d0 00000000 00000000
[4144039.469464] 3fc0: bea086f0 bea086bc 00000000 00000128 0062a278 bea086d0 5d5011f3 0062a000
[4144039.482590] 3fe0: 00000128 bea08680 b6dfc4cb b6d7d6f6
[4144039.492652] irq event stamp: 0
[4144039.500650] hardirqs last  enabled at (0): [<00000000>]   (null)
[4144039.511628] hardirqs last disabled at (0): [<c002e55c>] copy_process.part.4+0x30c/0x19e8
[4144039.524691] softirqs last  enabled at (0): [<c002e55c>] copy_process.part.4+0x30c/0x19e8
[4144039.537691] softirqs last disabled at (0): [<00000000>]   (null)
[4144039.548554] ---[ end trace b5e8d4b0f30ae00b ]---


-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] fec mdio times out under system stress
  2019-08-11 13:37 [BUG] fec mdio times out under system stress Russell King - ARM Linux admin
  2019-08-11 14:06 ` Russell King - ARM Linux admin
@ 2019-08-11 14:54 ` Vladimir Oltean
  2019-08-11 16:03   ` Andrew Lunn
  2019-08-11 16:31 ` Andrew Lunn
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Vladimir Oltean @ 2019-08-11 14:54 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: linux-arm-kernel, Fabio Estevam, netdev, Andrew Lunn,
	Florian Fainelli, Heiner Kallweit, Hubert Feurstein

Hi Russell, Fabio,

On Sun, 11 Aug 2019 at 16:42, Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> Hi Fabio,
>
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, and at
> that time, triggered an unrelated:
>
> [4111697.698776] fec 2188000.ethernet eth0: MDIO read timeout
> [4111697.712996] MII_DATA: 0x6006796d
> [4111697.729415] MII_SPEED: 0x0000001a
> [4111697.745232] IEVENT: 0x00000000
> [4111697.745242] IMASK: 0x0a8000aa
> [4111698.002233] Atheros 8035 ethernet 2188000.ethernet-1:00: PHY state change RUNNING -> HALTED
> [4111698.009882] fec 2188000.ethernet eth0: Link is Down
>
> This is on a dual-core iMX6.
>
> It looks like the read actually completed (since MII_DATA contains
> the register data) but we somehow lost the interrupt (or maybe
> received the interrupt after wait_for_completion_timeout() timed
> out.)
>
> From what I can see, the OOM events happened on CPU1, CPU1 was
> allocated the FEC interrupt, and the PHY polling that suffered the
> MDIO timeout was on CPU0.
>
> Given that IEVENT is zero, it seems that CPU1 had read serviced the
> interrupt, but it is not clear how far through processing that it
> was - it may be that fec_enet_interrupt() had been delayed by the
> OOM condition.
>
> This seems rather fragile - as the system slowing down due to OOM
> triggers the network to completely collapse by phylib taking the
> PHY offline, making the system inaccessible except through the
> console.
>
> In my case, even serial console wasn't operational (except for
> magic sysrq).  Not sure what agetty was playing at... so the only
> way I could recover any information from the system was to connect
> the HDMI and plug in a USB keyboard.
>
> Any thoughts on how FEC MDIO accesses could be made more robust?
>

I think a better question is why is the FEC MDIO controller configured
to emit interrupts anyway (especially since the API built on top does
not benefit in any way from this)? Hubert (copied) sent an interesting
email very recently where he pointed out that this is one of the main
sources of jitter when reading the PTP clock on a Marvell switch
attached over MDIO.

> Maybe phylib should retry a number of times - but with read-sensitive
> registers, if the read has already completed successfully, and its
> just a problem with the FEC MDIO hardware, that could cause issues.
>
> Thanks.
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
> According to speedtest.net: 11.9Mbps down 500kbps up

Regards,
-Vladimir

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] fec mdio times out under system stress
  2019-08-11 14:54 ` Vladimir Oltean
@ 2019-08-11 16:03   ` Andrew Lunn
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Lunn @ 2019-08-11 16:03 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King - ARM Linux admin, Florian Fainelli, netdev,
	Hubert Feurstein, Fabio Estevam, linux-arm-kernel,
	Heiner Kallweit

> I think a better question is why is the FEC MDIO controller configured
> to emit interrupts anyway (especially since the API built on top does
> not benefit in any way from this)? Hubert (copied) sent an interesting
> email very recently where he pointed out that this is one of the main
> sources of jitter when reading the PTP clock on a Marvell switch
> attached over MDIO.

Hi Vladimir

One reason is runtime power management.

For a write operation, you could set it going and return
immediately. Many drivers do, and then when the next read/write
operation comes along, they poll in a loop waiting for the device to
go idle, if it was still busy from the last operation.

However, FEC supports runtime PM. When an MDIO read/write call is
made, it calls runtime PM functions to indicate the device is active.
Once it has completed the MDIO transaction, it calls runtime PM
functions to indicate the device is inactive. These transitions can
cause clocks to be enabled/disabled. If we don't wait around for the
operation to complete, the clock could be disabled too early, and bad
things would happen.

You could replace the interrupt with a sleeping poll, but my guess
would be, that has more jitter than using an interrupt.

      Andrew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] fec mdio times out under system stress
  2019-08-11 13:37 [BUG] fec mdio times out under system stress Russell King - ARM Linux admin
  2019-08-11 14:06 ` Russell King - ARM Linux admin
  2019-08-11 14:54 ` Vladimir Oltean
@ 2019-08-11 16:31 ` Andrew Lunn
  2019-08-11 16:44 ` Andrew Lunn
  2019-08-12 15:10 ` Fabio Estevam
  4 siblings, 0 replies; 7+ messages in thread
From: Andrew Lunn @ 2019-08-11 16:31 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: linux-arm-kernel, Fabio Estevam, netdev, Florian Fainelli,
	Heiner Kallweit

On Sun, Aug 11, 2019 at 02:37:07PM +0100, Russell King - ARM Linux admin wrote:
> Hi Fabio,
> 
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, and at
> that time, triggered an unrelated:
> 
> [4111697.698776] fec 2188000.ethernet eth0: MDIO read timeout
> [4111697.712996] MII_DATA: 0x6006796d
> [4111697.729415] MII_SPEED: 0x0000001a
> [4111697.745232] IEVENT: 0x00000000
> [4111697.745242] IMASK: 0x0a8000aa
> [4111698.002233] Atheros 8035 ethernet 2188000.ethernet-1:00: PHY state change RUNNING -> HALTED
> [4111698.009882] fec 2188000.ethernet eth0: Link is Down
> 
> This is on a dual-core iMX6.
> 
> It looks like the read actually completed (since MII_DATA contains
> the register data) but we somehow lost the interrupt (or maybe
> received the interrupt after wait_for_completion_timeout() timed
> out.)

Hi Russell

The timeout is quite short,

#define FEC_MII_TIMEOUT         30000 /* us */

Looking at the Vybrid datasheet, there does not appear to be any way
to determine if the hardware is busy other than waiting for the
interrupt. There is no 'busy' bit which gets cleared on completion.

So about the only option is to make the timeout bigger.

   Andrew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] fec mdio times out under system stress
  2019-08-11 13:37 [BUG] fec mdio times out under system stress Russell King - ARM Linux admin
                   ` (2 preceding siblings ...)
  2019-08-11 16:31 ` Andrew Lunn
@ 2019-08-11 16:44 ` Andrew Lunn
  2019-08-12 15:10 ` Fabio Estevam
  4 siblings, 0 replies; 7+ messages in thread
From: Andrew Lunn @ 2019-08-11 16:44 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: linux-arm-kernel, Fabio Estevam, netdev, Florian Fainelli,
	Heiner Kallweit

> Maybe phylib should retry a number of times - but with read-sensitive
> registers, if the read has already completed successfully, and its
> just a problem with the FEC MDIO hardware, that could cause issues.

Hi Russell

At the bus level, MDIO cannot fail. The bits get clocked out, and the
bits get clocked in. There is no way for the PHY to stretch the clock
as I2C slaves can. There is nothing like the USB NACK, try again
later.

If something fails, it fails at a higher level, which means it is a
driver issue. In this case, the interrupt got delayed, after the timer
interrupt.

The FEC is also quite unusual in using an interrupt. Most MDIO drivers
poll. And if time gets 'stretched' because the system is too busy,
generally, the right thing happens anyway.

So i don't think it is phylib job to work around this issue.

   Andrew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [BUG] fec mdio times out under system stress
  2019-08-11 13:37 [BUG] fec mdio times out under system stress Russell King - ARM Linux admin
                   ` (3 preceding siblings ...)
  2019-08-11 16:44 ` Andrew Lunn
@ 2019-08-12 15:10 ` Fabio Estevam
  4 siblings, 0 replies; 7+ messages in thread
From: Fabio Estevam @ 2019-08-12 15:10 UTC (permalink / raw)
  To: Russell King - ARM Linux admin
  Cc: moderated list:ARM/FREESCALE IMX / MXC ARM ARCHITECTURE, netdev,
	Andrew Lunn, Florian Fainelli, Heiner Kallweit

Hi Russell,

On Sun, Aug 11, 2019 at 10:37 AM Russell King - ARM Linux admin
<linux@armlinux.org.uk> wrote:
>
> Hi Fabio,
>
> When I woke up this morning, I found that one of the Hummingboards
> had gone offline (as in, lost network link) during the night.
> Investigating, I find that the system had gone into OOM, and at
> that time, triggered an unrelated:
>
> [4111697.698776] fec 2188000.ethernet eth0: MDIO read timeout
> [4111697.712996] MII_DATA: 0x6006796d
> [4111697.729415] MII_SPEED: 0x0000001a
> [4111697.745232] IEVENT: 0x00000000
> [4111697.745242] IMASK: 0x0a8000aa
> [4111698.002233] Atheros 8035 ethernet 2188000.ethernet-1:00: PHY state change RUNNING -> HALTED
> [4111698.009882] fec 2188000.ethernet eth0: Link is Down
>
> This is on a dual-core iMX6.
>
> It looks like the read actually completed (since MII_DATA contains
> the register data) but we somehow lost the interrupt (or maybe
> received the interrupt after wait_for_completion_timeout() timed
> out.)
>
> From what I can see, the OOM events happened on CPU1, CPU1 was
> allocated the FEC interrupt, and the PHY polling that suffered the
> MDIO timeout was on CPU0.
>
> Given that IEVENT is zero, it seems that CPU1 had read serviced the
> interrupt, but it is not clear how far through processing that it
> was - it may be that fec_enet_interrupt() had been delayed by the
> OOM condition.
>
> This seems rather fragile - as the system slowing down due to OOM
> triggers the network to completely collapse by phylib taking the
> PHY offline, making the system inaccessible except through the
> console.
>
> In my case, even serial console wasn't operational (except for
> magic sysrq).  Not sure what agetty was playing at... so the only
> way I could recover any information from the system was to connect
> the HDMI and plug in a USB keyboard.
>
> Any thoughts on how FEC MDIO accesses could be made more robust?

Sorry for the delay. I am currently on vacation with limited e-mail access.

I think it is worth trying Andrew's suggestion to increase FEC_MII_TIMEOUT.

Thanks

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-08-12 15:09 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-11 13:37 [BUG] fec mdio times out under system stress Russell King - ARM Linux admin
2019-08-11 14:06 ` Russell King - ARM Linux admin
2019-08-11 14:54 ` Vladimir Oltean
2019-08-11 16:03   ` Andrew Lunn
2019-08-11 16:31 ` Andrew Lunn
2019-08-11 16:44 ` Andrew Lunn
2019-08-12 15:10 ` Fabio Estevam

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).