From: Yan Markman <ymarkman@marvell.com>
To: Russell King - ARM Linux <linux@armlinux.org.uk>,
	Florian Fainelli <f.fainelli@gmail.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>,
	Antoine Tenart <antoine.tenart@free-electrons.com>,
	"andrew@lunn.ch" <andrew@lunn.ch>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"gregory.clement@free-electrons.com" 
	<gregory.clement@free-electrons.com>,
	"thomas.petazzoni@free-electrons.com" 
	<thomas.petazzoni@free-electrons.com>,
	"miquel.raynal@free-electrons.com"
	<miquel.raynal@free-electrons.com>,
	"Nadav Haklai" <nadavh@marvell.com>,
	"mw@semihalf.com" <mw@semihalf.com>,
	"Stefan Chulski" <stefanc@marvell.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [EXT] Re: [PATCH net] net: phylink: fix link state on phy-connect
Date: Sat, 2 Dec 2017 11:08:45 +0000
Message-ID: <be9118b70c164ac793b2d79d5cfa1adf@IL-EXCH01.marvell.com>
In-Reply-To: <20171201174730.GM10595@n2100.armlinux.org.uk>

Hi Russell,
Grygorii has raised an additional point (about netif_carrier_off()); I just didn't want to start on it before finishing the previous one.
On ifconfig down, mac_config() is called, but with LINK=0.
mac_config() has no knowledge of the intention (up or down), and it has to run with ingress/egress disabled,
       so one of its actions is netif_carrier_off().

After calling mac_config(), phylink checks if (!link && !netif_carrier_ok()) and decides to skip the remaining steps, since everything appears to be done already...

REMOVING netif_carrier_off() looks correct, BUT there are cases where the driver then stops working properly (sorry, I can't remember now exactly which).
So in the end I placed a CONDITIONAL carrier-off there, depending on the link:

static void mvpp2_mac_config(){
	if (state->link)	/* occasionally TRUE on up, but FALSE on down */
		netif_carrier_off(port->dev);	/* YANM */

BTW: It seems your patch below should be present anyway.
+++ b/drivers/net/phy/phylink.c
@@ -798,6 +798,7 @@ void phylink_disconnect_phy(struct phylink *pl)
+		pl->phy_state.link = false;

Thank you
Best regards
Yan Markman

-----Original Message-----
From: Russell King - ARM Linux [mailto:linux@armlinux.org.uk] 
Sent: Friday, December 01, 2017 7:48 PM
To: Florian Fainelli <f.fainelli@gmail.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>; Yan Markman <ymarkman@marvell.com>; Antoine Tenart <antoine.tenart@free-electrons.com>; andrew@lunn.ch; davem@davemloft.net; gregory.clement@free-electrons.com; thomas.petazzoni@free-electrons.com; miquel.raynal@free-electrons.com; Nadav Haklai <nadavh@marvell.com>; mw@semihalf.com; Stefan Chulski <stefanc@marvell.com>; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [EXT] Re: [PATCH net] net: phylink: fix link state on phy-connect

On Fri, Dec 01, 2017 at 09:36:42AM -0800, Florian Fainelli wrote:
> On 12/01/2017 09:24 AM, Russell King - ARM Linux wrote:
> > On Fri, Dec 01, 2017 at 11:07:22AM -0600, Grygorii Strashko wrote:
> >> Hi Russell,
> >>
> >> On 11/30/2017 07:28 AM, Russell King - ARM Linux wrote:
> >>> On Thu, Nov 30, 2017 at 10:10:18AM +0000, Russell King - ARM Linux wrote:
> >>>> On Thu, Nov 30, 2017 at 08:51:21AM +0000, Yan Markman wrote:
> >>>>> The phylink_stop is called before phylink_disconnect_phy You 
> >>>>> could see in mvpp2.c:
> >>>>>
> >>>>> mvpp2_stop_dev() {
> >>>>> 	phylink_stop(port->phylink);
> >>>>> }
> >>>>>
> >>>>> mvpp2_stop()       {
> >>>>> 	mvpp2_stop_dev(port);
> >>>>> 	phylink_disconnect_phy(port->phylink);
> >>>>> }
> >>>>>
> >>>>> .ndo_stop = mvpp2_stop,
> >>>>
> >>>> Sorry, I don't have this in mvpp2.c, so I have no visibility of 
> >>>> what you're working with.
> >>>>
> >>>> What you have above looks correct, and I see no reason why the 
> >>>> p21 patch would not have resolved your issue.  The p21 patch 
> >>>> ensures that phylink_resolve() gets called and completes before 
> >>>> phylink_stop() returns.  In that case, phylink_resolve() will 
> >>>> call the mac_link_down() method if the link is not already down.  
> >>>> It will also print the "Link is Down" message.
> >>>>
> >>>> Florian has already tested this patch after encountering a 
> >>>> similar issue, and has reported that it solves the problem for 
> >>>> him.  I've also tested it with mvneta, and the original mvpp2x driver on Macchiatobin.
> >>>>
> >>>> Maybe there's something different about mvpp2, but as I have no 
> >>>> visibility of that driver and the modifications therein, I can't 
> >>>> comment further other than stating that it works for three 
> >>>> different implementations.
> >>>>
> >>>> Maybe you could try and work out what's going on with the p21 
> >>>> patch in your case?
> >>>
> >>> I think I now realise what's probably going on.
> >>>
> >>> If you call netif_carrier_off() before phylink_stop(), then 
> >>> phylink will believe that the link is already down, and so it 
> >>> won't bother calling
> >>> mac_link_down() - it will believe that the link is already down.
> >>>
> >>> I'll update the documentation for phylink_stop() to spell out this 
> >>> aspect.
> >>>
> >>
> >> There are pretty high number of net drivers which do call
> >> 	netif_carrier_off(dev);
> >> before
> >> 	phy_stop(dev->phydev);
> >> in .ndo_stop() callback.
> >>
> >> As per you comment this seems to be incorrect, so should such calls 
> >> be removed?
> > 
> > Well, I think the question that needs to be asked is this:
> > 
> >   Is calling netif_carrier_off() before phy_stop() safe?
> > 
> > Well, reading the phylib code, this is the answer I've come to:
> > 
> >   Between phy_start() and phy_stop(), phylib is free to manage the
> >   carrier state itself through the phylib state machine.
> > 
> >   This means if you call netif_carrier_off() prior to phy_stop(),
> >   there is nothing preventing the phylib state machine from running,
> >   and a co-incident poll of the PHY could notice that the link has
> >   come up, and re-enable the carrier while your ndo_stop() method
> >   is still running.
> > 
> > So, my conclusion is that this practice is provably racy, though 
> > it's probably not that easy to trigger the race (which is probably 
> > why no one has reported the problem.)
> > 
> > Given that it's racy, it's not something that I think phylink should 
> > care about, and should "softly" discourage it.  So, I'm happy with 
> > what phylink is doing here, and I suggest fixing the drivers for 
> > this race.
> > 
> > In any case, it should result in less code in the drivers - since 
> > the work you need to do when the link goes down is a subset of the 
> > work you need to do when the network interface is taken down.
> > 
> 
> While I agree with all of what written before, in practice, calling
> netif_carrier_off() when using PHYLIB can cause inconsistent carrier 
> states at most, but it would not be messing the state machine itself 
> because PHYLIB does not make uses of netif_carrier_ok() to make any 
> decisions as whether the link has dropped or not, it bases its 
> information solely on phydev->link.

Indeed, but the point I'm making is that this sequence is very possible with drivers that mess about by fiddling with stuff before they call phy_stop():

	CPU0					CPU1
	netif_carrier_off()
	mvpp2_egress_disable()
						phy_state_machine()
						 (phydev->state = PHY_AN)
						phy_link_up()
						phy_link_change()
						netif_carrier_on()
						mvpp2_link_event()
						mvpp2_egress_enable()
						mvpp2_ingress_enable()
	mvpp2_port_disable()
	phy_stop(ndev->phydev)

At this point, egress has not been disabled as mvpp2_stop_dev() wants, because the phylib state machine got in before it was stopped, called the adjust link function which then had the effect of re-enabling the egress.

If that doesn't matter, then what's the point of the mvpp2_egress_disable() call in the mvpp2_stop_dev() path... either it matters and the mvpp2_stop_dev() sequence is broken, or it doesn't matter and some of the work that mvpp2_stop_dev() is doing is unnecessary.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up According to speedtest.net: 8.21Mbps down 510kbps up
