All of lore.kernel.org
 help / color / mirror / Atom feed
From: Joakim Zhang <qiangqing.zhang@nxp.com>
To: Russell King <linux@armlinux.org.uk>
Cc: Andrew Lunn <andrew@lunn.ch>, Vladimir Oltean <olteanv@gmail.com>,
	"peppe.cavallaro@st.com" <peppe.cavallaro@st.com>,
	"alexandre.torgue@foss.st.com" <alexandre.torgue@foss.st.com>,
	"joabreu@synopsys.com" <joabreu@synopsys.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"mcoquelin.stm32@gmail.com" <mcoquelin.stm32@gmail.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"f.fainelli@gmail.com" <f.fainelli@gmail.com>,
	"hkallweit1@gmail.com" <hkallweit1@gmail.com>,
	dl-linux-imx <linux-imx@nxp.com>
Subject: RE: [PATCH] net: stmmac: fix MAC not working when system resume back with WoL enabled
Date: Mon, 6 Sep 2021 02:21:46 +0000	[thread overview]
Message-ID: <DB8PR04MB6795B181B5573A081F9B8CE7E6D29@DB8PR04MB6795.eurprd04.prod.outlook.com> (raw)
In-Reply-To: <20210903120127.GW22278@shell.armlinux.org.uk>


Hi Russell,

> -----Original Message-----
> From: Russell King <linux@armlinux.org.uk>
> Sent: 2021年9月3日 20:01
> To: Joakim Zhang <qiangqing.zhang@nxp.com>
> Cc: Andrew Lunn <andrew@lunn.ch>; Vladimir Oltean <olteanv@gmail.com>;
> peppe.cavallaro@st.com; alexandre.torgue@foss.st.com;
> joabreu@synopsys.com; davem@davemloft.net; kuba@kernel.org;
> mcoquelin.stm32@gmail.com; netdev@vger.kernel.org; f.fainelli@gmail.com;
> hkallweit1@gmail.com; dl-linux-imx <linux-imx@nxp.com>
> Subject: Re: [PATCH] net: stmmac: fix MAC not working when system resume
> back with WoL enabled
> 
> On Fri, Sep 03, 2021 at 11:04:57AM +0000, Joakim Zhang wrote:
> >
> > Hi Russell,
> >
> > [...]
> > > > > > > -----Original Message-----
> > > > > > > From: Andrew Lunn <andrew@lunn.ch>
> > > > > > > Sent: 2021年9月2日 20:24
> > > > > > > To: Joakim Zhang <qiangqing.zhang@nxp.com>
> > > > > > > Cc: Russell King <linux@armlinux.org.uk>; Vladimir Oltean
> > > > > > > <olteanv@gmail.com>; peppe.cavallaro@st.com;
> > > > > > > alexandre.torgue@foss.st.com; joabreu@synopsys.com;
> > > > > > > davem@davemloft.net; kuba@kernel.org;
> > > mcoquelin.stm32@gmail.com;
> > > > > > > netdev@vger.kernel.org; f.fainelli@gmail.com;
> > > > > > > hkallweit1@gmail.com; dl-linux-imx <linux-imx@nxp.com>
> > > > > > > Subject: Re: [PATCH] net: stmmac: fix MAC not working when
> > > > > > > system resume back with WoL enabled
> > > > > > >
> > > > > > > > Emm, @andrew@lunn.ch, Andrew is much familiar with FEC,
> > > > > > > > and PHY maintainers, Could you please help put insights here if
> possible?
> > > > > > >
> > > > > > > All the boards i have either have an Ethernet Switch
> > > > > > > connected to the MAC, or a Micrel PHY. None are setup for
> > > > > > > WoL, since it is not used in the use case of these boards.
> > > > > > >
> > > > > > > I think you need to scatter some printk() in various places
> > > > > > > to confirm what is going on. Where is the WoL implemented:
> > > > > > > MAC or PHY, what is suspended or not, etc.
> > > > > >
> > > > > > Thanks Andrew, Russell,
> > > > > >
> > > > > > I confirmed FEC is MAC-based WoL, and PHY is active when
> > > > > > system
> > > > > suspended if MAC-based WoL is active.
> > > > > > I scatter printk() in both phy_device.c and realtek.c phy
> > > > > > driver to debug this
> > > > > for both WoL active and inactive case.
> > > > > >
> > > > > > When MAC-based WoL is active, phy_suspend() is the last point
> > > > > > to actually
> > > > > suspend the PHY, you can see that,
> > > > > > 	phy_ethtool_get_wol(phydev, &wol);
> > > > > > 	if (wol.wolopts || (netdev && netdev->wol_enabled))
> > > > > > 		return -EBUSY;
> > > > > >
> > > > > > Here, netdev is true and netdev->wol_enabled is ture
> > > > > > (net/ethtool/wol.c: ethnl_set_wol() -> dev->wol_enabled =
> > > > > > !!wol.wolopts;) So that phydev->suspend() would not be called,
> > > > > > PHY is active
> > > > > after system suspended. PHY can receive packets and pass to MAC,
> > > > > MAC is responsible for detecting magic packets then generate a
> > > > > wakeup interrupt. So all is fine for FEC, and the behavior is clear.
> > > > >
> > > > > What happens on resume with FEC?
> > > >
> > > > Since we call phy_stop() in fec_suspend(), the link is down, but
> > > > the PHY is active, after receiving magic packets, the system
> > > > resume back; In fec_resume(), after restart/init FEC, we call
> > > > phy_start() to let link up, then
> > > all is going well.
> > >
> > > ... but the link never went down! So I don't understand the last point.
> >
> > Sorry, what the meaning of "the link never went down"? How do you
> > define the link is down? May be I have not get your original point correctly.
> 
> If the link goes down, connectivity between the MAC and the outside world is
> lost - whether that be the link between the MAC and PHY, or PHY and the
> outside world. That's my definition of "link down".
> 
> However, if the MAC is still alive and receiving packets, even for Wake-on-Lan
> purposes, from the outside world then the link can not be down, it must be
> operational and therefore it must be in the "up"
> state.
> 
> I'm talking about the physical state of the link - "up" meaning capable of
> passing packets to and from the MAC, "down" meaning incapable of passing
> packets.

Yes, actually what you described is physical state, what I mean is a soft state.

> > At my side, with MAC-based WoL is active, FEC calls phy_stop() in
> > fec_suspend(), then fec_enet_adjust_link() is called, further
> > fec_stop() is called, FEC only keep necessary receive logic active to
> > service WoL. This is not the link went down? At least I see the log "
> > fec 30be0000.ethernet eth0: Link is Down".
> 
> It looks like calling phy_stop() will force a link-down event to be reported from
> phylib. As I say above though, really, this doesn't affect the physical state of the
> link, because the link has to be up for the WoL packets to be received by the
> MAC.

Yes, both MAC and PHY are active for MAC-based WoL.

> What I don't like about that is that we're saying that the link is down, whereas
> the physical link is actually still up. This is going to make network drivers
> implementation of mac_link_down() rather yucky, especially ones that force the
> physical link down at the MAC end when operating in PHY mode and they see a
> call to mac_link_down() (which they do to stop packet reception.) There's no
> way for them to know whether mac_link_down() is a result of a real physical
> link down event, or whether this is a "soft" link down event as you're describing
> at a suspend.

As I think, if WoL is active then this is a soft link down events; Instead, if WoL is
inactive this is a physical link down events.

> Then there's the issue that some network drivers _must_ see a
> mac_link_down() call to force the link down prior to reconfiguring the link (since
> settings are not allowed to be changed while the physical link is up.) So we start
> to destroy the guarantee that
> mac_link_down() and mac_link_up() will be properly ordered.
> 
> Damn it.
> 
> > > There's a few more questions:
> > > 1. Since the state at this point will be that netdev and phylink believe
> > >    the link to still be up, should phylink_suspend() force a
> > >    netif_carrier_off() event to stop the netdev transmit watchdog - I
> > >    think it ought to, even though the link will actually remain up.
> >
> > Agree.
> 
> Note that phylib in the FEC case will do this just before calling the adjust_link
> function already.
> 
> With phylink, we can make that silent, and I think it should be silent because, as
> I describe above, the physical link isn't actually going down.

Yes.

> > > 2. Should we call mac_link_down() prior to the major reconfig - I think
> > >    we should to keep the mac_link_down()/mac_link_up() calls balanced
> > >    (as we do already guarantee.) Will that do any harm for stmmac if we
> > >    were to call mac_link_down() as a result of a call to
> > >    phylink_resume() ?
> >
> > For STMMAC, I think it's safe, since we calling phylink_resume() before
> re-config MAC.
> > But we design this for common usage, other MAC drivers may call this
> > at the end of resume path, but I think it also safe, like we unplug
> > then plug the cable. However, it will print the LINK DOWN then LINK UP log,
> which is very strange when system resume back. Is there any better solution?
> 
> I think at this point, we should just print the state of the link at resume, which
> should basically be only a "link down" if the link is now physically down at
> resume.
> 
> One thing worries me though - what happens if the link parameters change
> while the system is suspended. The MAC won't be updated with the new link
> parameters. What happens on resume?

There is a state machine in phylink, then this link change events would be captured later
from reschedule the machine, right? If so, this seems not a critical issue. MAC just not working
for a while.

Yes, I am also curious if link parameters changed during system suspended, how phylib or phylink
react when system resume back. Do you know some details?

> > > Why is it different from the .ndo_stop/.ndo_open case, where the PHY
> > > may have been suspended by the actions of .ndo_stop?
> >
> > It's a good question. PHY will suspend by the actions od .ndo_stop,
> > but it will resume before we config MAC. Please see stmmac_open(),
> > stmmac_init_phy()->phylink_of_phy_connect()
> > -> phy_attach_direct(): phy_resume(phydev), where PHY will be resume
> backed.
> 
> Ah, I forgot FEC does the PHY connect/disconnect at open/stop. Thanks.

Best Regards,
Joakim Zhang

      parent reply	other threads:[~2021-09-06  2:22 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-01  9:02 [PATCH] net: stmmac: fix MAC not working when system resume back with WoL enabled Joakim Zhang
2021-09-01  9:13 ` Russell King (Oracle)
2021-09-01 10:21   ` Joakim Zhang
2021-09-01 12:56     ` Russell King (Oracle)
2021-09-02  7:01       ` Joakim Zhang
2021-09-01 15:40     ` Heiner Kallweit
2021-09-02  7:35       ` Joakim Zhang
2021-09-01  9:21 ` Vladimir Oltean
2021-09-01 10:25   ` Joakim Zhang
2021-09-01 10:56     ` Vladimir Oltean
2021-09-01 11:42       ` Joakim Zhang
2021-09-01 13:25         ` Russell King (Oracle)
2021-09-02  7:28           ` Joakim Zhang
2021-09-02  8:32             ` Russell King (Oracle)
2021-09-02 10:26               ` Joakim Zhang
2021-09-02 10:49                 ` Russell King (Oracle)
2021-09-02 11:15                   ` Joakim Zhang
2021-09-02 12:24                     ` Andrew Lunn
2021-09-03  6:51                       ` Joakim Zhang
2021-09-03  8:01                         ` Russell King (Oracle)
2021-09-03  8:39                           ` Joakim Zhang
2021-09-03  9:32                             ` Russell King (Oracle)
2021-09-03 11:04                               ` Joakim Zhang
2021-09-03 12:01                                 ` Russell King (Oracle)
2021-09-03 20:12                                   ` Russell King - ARM Linux admin
2021-09-06  2:29                                     ` Joakim Zhang
2021-09-06  9:34                                       ` Russell King (Oracle)
2021-09-06 10:41                                         ` Joakim Zhang
2021-09-06 11:21                                           ` Russell King (Oracle)
2021-09-06 13:23                                             ` Andrew Lunn
2021-09-07  8:52                                             ` Russell King (Oracle)
2021-09-06  2:21                                   ` Joakim Zhang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DB8PR04MB6795B181B5573A081F9B8CE7E6D29@DB8PR04MB6795.eurprd04.prod.outlook.com \
    --to=qiangqing.zhang@nxp.com \
    --cc=alexandre.torgue@foss.st.com \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=hkallweit1@gmail.com \
    --cc=joabreu@synopsys.com \
    --cc=kuba@kernel.org \
    --cc=linux-imx@nxp.com \
    --cc=linux@armlinux.org.uk \
    --cc=mcoquelin.stm32@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=olteanv@gmail.com \
    --cc=peppe.cavallaro@st.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.