netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Daney <ddaney.cavm@gmail.com>
To: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>,
	Florian Fainelli <f.fainelli@gmail.com>
Cc: netdev <netdev@vger.kernel.org>,
	Geert Uytterhoeven <geert+renesas@glider.be>,
	David Miller <davem@davemloft.net>, Andrew Lunn <andrew@lunn.ch>,
	Mans Rullgard <mans@mansr.com>, Mason <slash.tmp@free.fr>
Subject: Re: [PATCH net] Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()"
Date: Thu, 31 Aug 2017 09:36:19 -0700	[thread overview]
Message-ID: <e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com> (raw)
In-Reply-To: <f4bb5ac8-dae8-c0af-7aa6-e546fc0783fa@sigmadesigns.com>

On 08/31/2017 05:29 AM, Marc Gonzalez wrote:
> On 31/08/2017 02:49, Florian Fainelli wrote:
> 
>> This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a ("net: phy:
>> Correctly process PHY_HALTED in phy_stop_machine()") because it is
>> creating the possibility for a NULL pointer dereference.
>>
>> David Daney provide the following call trace and diagram of events:
>>
>> When ndo_stop() is called we call:
>>
>>   phy_disconnect()
>>      +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;
> 
> What does this mean?

I meant that after the call to phy_stop_interrupts(), phydev->irq = 
PHY_POLL;


> 
> On the contrary, phy_stop_interrupts() is only called when *not* polling.

That is the case I have.  We are using interrupts from the phy.


> 
> 	if (phydev->irq > 0)
> 		phy_stop_interrupts(phydev);
> 
>>      +---> phy_stop_machine()
>>      |      +---> phy_state_machine()
>>      |              +----> queue_delayed_work(): Work queued.
> 
> You're referring to the fact that, at the end of phy_state_machine()
> (in polling mode) the code reschedules itself through:
> 
> 	if (phydev->irq == PHY_POLL)
> 		queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, PHY_STATE_TIME * HZ);

Exactly.  The call to phy_disconnect() ensures that there are no more 
interrupts and also that phydev->irq = PHY_POLL

The call to cancel_delayed_work_sync() at the top of phy_stop_machine() 
was meant to ensure that phy_state_machine() was never run again.  No 
interrupts + no queued work means that it should be save to do...

> 
>>      +--->phy_detach() implies: phydev->attached_dev = NULL;

The problem is that by calling phy_state_machine() again (which the 
offending patch added) we now have work scheduled that will try to 
dereference the pointer that was set to NULL as a result of the phy_detach()


>>
>> Now at a later time the queued work does:
>>
>>   phy_state_machine()
>>      +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:
> 
> I tested a sequence of 500 link up / link down in polling mode,
> and saw no such issue. Race condition?
> 

You were lucky.

> For what case in phy_state_machine() is netif_carrier_off()
> being called? Surely not PHY_HALTED?
> 

The phy can be in a variety of states.  It is connected to something 
outside of the system that we don't control, so you cannot assume any 
particular state.  We must have code that doesn't crash the system no 
matter what state the phy is in.

I suspect, but have not checked, that the phy is in PHY_RUNNING.  I 
think that means that because this patch turned the state machine back 
on, it will start transitioning through PHY_UP, PHY_AN, ... and 
eventually get to the crash we see because phydev->attached_dev = NULL


> 
>> The original motivation for this change originated from Marc Gonzales
>> indicating that his network driver did not have its adjust_link callback
>> executing with phydev->link = 0 while he was expecting it.
> 
> I expect the core to call phy_adjust_link() for link changes.
> This used to work back in 3.4 and was broken somewhere along
> the way.
> 
>> PHYLIB has never made any such guarantees ever because phy_stop() merely
>> just tells the workqueue to move into PHY_HALTED state which will happen
>> asynchronously.
> 
> My original proposal was to fix the issue in the driver.
> I'll try locating it in my archives.
> 
> Regards.
> 

  parent reply	other threads:[~2017-08-31 16:36 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-31  0:49 [PATCH net] Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()" Florian Fainelli
2017-08-31  1:47 ` David Miller
2017-08-31 12:29 ` Marc Gonzalez
2017-08-31 14:21   ` Marc Gonzalez
2017-08-31 16:36   ` David Daney [this message]
2017-08-31 16:57     ` Florian Fainelli
2017-08-31 17:49       ` Mason
2017-08-31 17:53         ` Florian Fainelli
2017-08-31 18:12           ` Mason
2017-08-31 18:29             ` Florian Fainelli
2017-09-06 14:33               ` Mason
2017-09-06 17:53                 ` David Daney
2017-09-06 18:00               ` David Daney
2017-09-06 18:59                 ` Florian Fainelli
2017-09-06 20:49                   ` David Daney
2017-09-06 22:51                     ` David Daney
2017-09-06 23:14                       ` Florian Fainelli
2017-09-07  0:10                         ` David Daney
2017-09-07  1:41                           ` Florian Fainelli
2017-09-06 19:14                 ` Mason
2017-08-31 17:35     ` Mason
2017-08-31 17:03   ` Florian Fainelli
2017-08-31 19:09     ` Mason
2017-08-31 19:18       ` Florian Fainelli
2017-09-06 14:55         ` Mason
2017-09-06 19:28           ` Florian Fainelli
2017-09-06 15:51         ` Mason
2017-09-06 19:42           ` Florian Fainelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e24693e8-d8ae-188a-2a38-c9a83fdc94e3@gmail.com \
    --to=ddaney.cavm@gmail.com \
    --cc=andrew@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=geert+renesas@glider.be \
    --cc=mans@mansr.com \
    --cc=marc_gonzalez@sigmadesigns.com \
    --cc=netdev@vger.kernel.org \
    --cc=slash.tmp@free.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).