linux-kernel.vger.kernel.org archive mirror
From: Yan Markman <ymarkman@marvell.com>
To: Russell King - ARM Linux <linux@armlinux.org.uk>,
	Florian Fainelli <f.fainelli@gmail.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>,
	Antoine Tenart <antoine.tenart@free-electrons.com>,
	"andrew@lunn.ch" <andrew@lunn.ch>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"gregory.clement@free-electrons.com" 
	<gregory.clement@free-electrons.com>,
	"thomas.petazzoni@free-electrons.com" 
	<thomas.petazzoni@free-electrons.com>,
	"miquel.raynal@free-electrons.com"
	<miquel.raynal@free-electrons.com>,
	"Nadav Haklai" <nadavh@marvell.com>,
	"mw@semihalf.com" <mw@semihalf.com>,
	"Stefan Chulski" <stefanc@marvell.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [EXT] Re: [PATCH net] net: phylink: fix link state on phy-connect
Date: Sat, 2 Dec 2017 11:08:45 +0000
Message-ID: <be9118b70c164ac793b2d79d5cfa1adf@IL-EXCH01.marvell.com>
In-Reply-To: <20171201174730.GM10595@n2100.armlinux.org.uk>

Hi Russell,
Grygorii has raised an additional point (about netif_carrier_off) that I didn't want to start on before finishing the previous one.
On ifconfig down, mac_config() is called, but with link=0.
mac_config() has no knowledge of the intention (up or down) and should be done with ingress/egress disabled,
       so one of the actions of mac_config() is netif_carrier_off().

After calling mac_config(), phylink checks  if (!link  &&  !netif_carrier_ok())  and decides to skip the rest of the link-down handling, since everything appears to be done already...

Removing netif_carrier_off() looks correct, BUT there are cases where the driver then stops working properly (sorry, I can't remember exactly which right now).
So in the end I made the carrier-off there CONDITIONAL on the link state:

static void mvpp2_mac_config(){
	if (state->link)	/* occasionally TRUE on up, but FALSE on down */
		netif_carrier_off(port->dev);	/* YANM */
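
To make the interaction above concrete, here is a minimal userspace sketch of it. It is not the phylink code: struct model_port and every model_* function are hypothetical stand-ins, and the resolve step is simplified to "run the MAC config hook, then act only if the requested link state differs from the carrier state". Under those assumptions it shows how a config hook that drops the carrier itself hides the link-down transition from the resolve step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are kernel functions. */
struct model_port {
	bool carrier_ok;	/* stands in for netif_carrier_ok() */
	int link_down_calls;	/* times the mac_link_down() stand-in ran */
};

typedef void (*model_mac_config_fn)(struct model_port *p, bool link);

/* Config hook that always drops the carrier, whatever the link state. */
static void model_mac_config_unconditional(struct model_port *p, bool link)
{
	(void)link;
	p->carrier_ok = false;
}

/* Config hook that leaves the carrier for the resolve step to manage. */
static void model_mac_config_passive(struct model_port *p, bool link)
{
	(void)p;
	(void)link;
}

/* Simplified resolve: configure the MAC, then handle a state change. */
static void model_resolve(struct model_port *p, bool link,
			  model_mac_config_fn cfg)
{
	assert(p != NULL && cfg != NULL);
	cfg(p, link);
	if (link == p->carrier_ok)
		return;		/* carrier already matches: nothing to do */
	if (!link)
		p->link_down_calls++;	/* link-down handler runs */
	p->carrier_ok = link;
}
```

Starting from a port with carrier_ok set, model_resolve(&port, false, model_mac_config_unconditional) returns with link_down_calls still 0, while the same call with model_mac_config_passive leaves it at 1.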

BTW: It seems your patch below should be present anyway.
+++ b/drivers/net/phy/phylink.c
@@ -798,6 +798,7 @@ void phylink_disconnect_phy(struct phylink *pl)
+		pl->phy_state.link = false;

Thank you
Best regards
Yan Markman

-----Original Message-----
From: Russell King - ARM Linux [mailto:linux@armlinux.org.uk] 
Sent: Friday, December 01, 2017 7:48 PM
To: Florian Fainelli <f.fainelli@gmail.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>; Yan Markman <ymarkman@marvell.com>; Antoine Tenart <antoine.tenart@free-electrons.com>; andrew@lunn.ch; davem@davemloft.net; gregory.clement@free-electrons.com; thomas.petazzoni@free-electrons.com; miquel.raynal@free-electrons.com; Nadav Haklai <nadavh@marvell.com>; mw@semihalf.com; Stefan Chulski <stefanc@marvell.com>; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [EXT] Re: [PATCH net] net: phylink: fix link state on phy-connect

On Fri, Dec 01, 2017 at 09:36:42AM -0800, Florian Fainelli wrote:
> On 12/01/2017 09:24 AM, Russell King - ARM Linux wrote:
> > On Fri, Dec 01, 2017 at 11:07:22AM -0600, Grygorii Strashko wrote:
> >> Hi Russell,
> >>
> >> On 11/30/2017 07:28 AM, Russell King - ARM Linux wrote:
> >>> On Thu, Nov 30, 2017 at 10:10:18AM +0000, Russell King - ARM Linux wrote:
> >>>> On Thu, Nov 30, 2017 at 08:51:21AM +0000, Yan Markman wrote:
> >>>>> The phylink_stop is called before phylink_disconnect_phy You 
> >>>>> could see in mvpp2.c:
> >>>>>
> >>>>> mvpp2_stop_dev() {
> >>>>> 	phylink_stop(port->phylink);
> >>>>> }
> >>>>>
> >>>>> mvpp2_stop()       {
> >>>>> 	mvpp2_stop_dev(port);
> >>>>> 	phylink_disconnect_phy(port->phylink);
> >>>>> }
> >>>>>
> >>>>> .ndo_stop = mvpp2_stop,
> >>>>
> >>>> Sorry, I don't have this in mvpp2.c, so I have no visibility of 
> >>>> what you're working with.
> >>>>
> >>>> What you have above looks correct, and I see no reason why the 
> >>>> p21 patch would not have resolved your issue.  The p21 patch 
> >>>> ensures that phylink_resolve() gets called and completes before 
> >>>> phylink_stop() returns.  In that case, phylink_resolve() will 
> >>>> call the mac_link_down() method if the link is not already down.  
> >>>> It will also print the "Link is Down" message.
> >>>>
> >>>> Florian has already tested this patch after encountering a 
> >>>> similar issue, and has reported that it solves the problem for 
> >>>> him.  I've also tested it with mvneta, and the original mvpp2x driver on Macchiatobin.
> >>>>
> >>>> Maybe there's something different about mvpp2, but as I have no 
> >>>> visibility of that driver and the modifications therein, I can't 
> >>>> comment further other than stating that it works for three 
> >>>> different implementations.
> >>>>
> >>>> Maybe you could try and work out what's going on with the p21 
> >>>> patch in your case?
> >>>
> >>> I think I now realise what's probably going on.
> >>>
> >>> If you call netif_carrier_off() before phylink_stop(), then 
> >>> phylink will believe that the link is already down, and so it 
> >>> won't bother calling
> >>> mac_link_down() - it will believe that the link is already down.
> >>>
> >>> I'll update the documentation for phylink_stop() to spell out this 
> >>> aspect.
> >>>
> >>
> >> There is a pretty high number of net drivers which call
> >> 	netif_carrier_off(dev);
> >> before
> >> 	phy_stop(dev->phydev);
> >> in .ndo_stop() callback.
> >>
> >> As per your comment this seems to be incorrect, so should such calls 
> >> be removed?
> > 
> > Well, I think the question that needs to be asked is this:
> > 
> >   Is calling netif_carrier_off() before phy_stop() safe?
> > 
> > Well, reading the phylib code, this is the answer I've come to:
> > 
> >   Between phy_start() and phy_stop(), phylib is free to manage the
> >   carrier state itself through the phylib state machine.
> > 
> >   This means if you call netif_carrier_off() prior to phy_stop(),
> >   there is nothing preventing the phylib state machine from running,
> >   and a co-incident poll of the PHY could notice that the link has
> >   come up, and re-enable the carrier while your ndo_stop() method
> >   is still running.
> > 
> > So, my conclusion is that this practice is provably racy, though 
> > it's probably not that easy to trigger the race (which is probably 
> > why no one has reported the problem.)
> > 
> > Given that it's racy, it's not something that I think phylink should 
> > care about, and should "softly" discourage it.  So, I'm happy with 
> > what phylink is doing here, and I suggest fixing the drivers for 
> > this race.
> > 
> > In any case, it should result in less code in the drivers - since 
> > the work you need to do when the link goes down is a subset of the 
> > work you need to do when the network interface is taken down.
> > 
> 
> While I agree with all of what was written before, in practice, calling
> netif_carrier_off() when using PHYLIB can cause inconsistent carrier 
> states at most, but it would not be messing up the state machine itself 
> because PHYLIB does not make use of netif_carrier_ok() to make any 
> decisions as to whether the link has dropped or not, it bases its 
> information solely on phydev->link.

Indeed, but the point I'm making is that this sequence is very possible with drivers that mess about by fiddling with stuff before they call phy_stop():

	CPU0					CPU1
	netif_carrier_off()
	mvpp2_egress_disable()
						phy_state_machine()
						 (phydev->state = PHY_AN)
						phy_link_up()
						phy_link_change()
						netif_carrier_on()
						mvpp2_link_event()
						mvpp2_egress_enable()
						mvpp2_ingress_enable()
	mvpp2_port_disable()
	phy_stop(ndev->phydev)

At this point, egress has not been disabled as mvpp2_stop_dev() wants, because the phylib state machine got in before it was stopped, called the adjust link function which then had the effect of re-enabling the egress.

If that doesn't matter, then what's the point of the mvpp2_egress_disable() call in the mvpp2_stop_dev() path... either it matters and the mvpp2_stop_dev() sequence is broken, or it doesn't matter and some of the work that mvpp2_stop_dev() is doing is unnecessary.
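
The interleaving in the CPU0/CPU1 diagram can be replayed deterministically in userspace. The sketch below is only a model under stated assumptions: struct race_port and the race_* functions are hypothetical names, the PHY state machine is reduced to a single poll that sees the link come up, and "stopped" simply means that poll does nothing. It compares the order from the diagram with one possible alternative that stops the state machine before touching anything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the port state touched in the diagram. */
struct race_port {
	bool phy_running;	/* state machine may run (before phy_stop) */
	bool carrier_ok;	/* netif_carrier_ok() stand-in */
	bool egress_on;
	bool ingress_on;
	bool port_enabled;
};

/* CPU1: one poll of the state machine that notices the link is up. */
static void race_phy_poll_link_up(struct race_port *p)
{
	assert(p != NULL);
	if (!p->phy_running)
		return;		/* already stopped: no link-change callback */
	p->carrier_ok = true;	/* netif_carrier_on() stand-in */
	p->egress_on = true;	/* the link event re-enables egress */
	p->ingress_on = true;	/* ... and ingress */
}

/* CPU0, order from the diagram: stop the state machine last. */
static void race_stop_racy(struct race_port *p, bool poll_in_window)
{
	p->carrier_ok = false;
	p->egress_on = false;
	if (poll_in_window)
		race_phy_poll_link_up(p);	/* CPU1 gets in here */
	p->port_enabled = false;
	p->phy_running = false;
}

/* CPU0, alternative order: stop the state machine first. */
static void race_stop_ordered(struct race_port *p, bool poll_in_window)
{
	p->phy_running = false;
	if (poll_in_window)
		race_phy_poll_link_up(p);	/* no effect once stopped */
	p->carrier_ok = false;
	p->egress_on = false;
	p->port_enabled = false;
}
```

With a poll landing in the window, race_stop_racy() returns with egress_on and carrier_ok both true even though the port is disabled; race_stop_ordered() returns with both false for the same poll.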

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
