From: Russell King - ARM Linux admin <linux@armlinux.org.uk>
To: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>,
	John David Anglin <dave.anglin@bell.net>,
	Vivien Didelot <vivien.didelot@savoirfairelinux.com>,
	Florian Fainelli <f.fainelli@gmail.com>,
	netdev@vger.kernel.org
Subject: Re: [PATCH net] dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit
Date: Tue, 12 Feb 2019 16:30:17 +0000
Message-ID: <20190212163017.lwstmgtyw76cwrd7@shell.armlinux.org.uk>
In-Reply-To: <13c1e6d5-c287-0091-3b24-1978f9a18e7e@gmail.com>

On Tue, Feb 12, 2019 at 07:51:05AM +0100, Heiner Kallweit wrote:
> On 12.02.2019 04:58, Andrew Lunn wrote:
> > That change means we don't check the PHY device if it caused an
> > interrupt when its state is less than UP.
> > 
> > What i'm seeing is that the PHY is interrupting pretty early on after
> > a reboot when the previous boot had the interface up.
> > 
> So this means that when going down for reboot the interrupts are not
> properly masked / disabled? Because (at least for net-next) we enable
> interrupts in phy_start() only.

Looking at Linus' tree as opposed to net-next, things do look rather
broken wrt interrupts:

+-phy_attach_direct
  `-phydev->state = PHY_READY
+-phy_prepare_link
+-phy_start_machine
  `-phy_trigger_machine()
`-phy_start_interrupts
  +-request_threaded_irq()
  `-phy_enable_interrupts()
    +-phy_clear_interrupt()
    `-phy_config_interrupt(, PHY_INTERRUPT_ENABLED)

At this point, the PHY is then able to generate interrupts, which,
because phy_start() has not been called and phy_interrupt() checks
that phydev->state >= PHY_UP, get ignored by the interrupt handler
exactly as Andrew is finding.
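
As a rough userspace model of that sequence (not phylib code: the struct,
the helper names and the reduced state enum are all made up for
illustration, and it assumes the PHY keeps its interrupt latched until the
handler clears it), the early interrupt is simply dropped:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for enum phy_state (subset, ordering assumed). */
enum phy_state { PHY_DOWN, PHY_READY, PHY_HALTED, PHY_UP, PHY_RUNNING };
enum irq_result { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

/* Hypothetical model of the PHY, not struct phy_device. */
struct phy_model {
	enum phy_state state;
	bool irq_enabled;	/* interrupt generation enabled in the PHY */
	bool irq_pending;	/* PHY has latched an interrupt condition */
};

/* Models the attach path: state goes to READY, interrupts get enabled. */
static void attach_model(struct phy_model *phy)
{
	phy->state = PHY_READY;
	phy->irq_pending = false;
	phy->irq_enabled = true;
}

/* Models phy_start() moving the state machine to PHY_UP. */
static void start_model(struct phy_model *phy)
{
	phy->state = PHY_UP;
}

/* Models the PHY raising an interrupt, e.g. on a link change. */
static void link_event_model(struct phy_model *phy)
{
	if (phy->irq_enabled)
		phy->irq_pending = true;
}

/* Models a handler that refuses to act unless the PHY is started. */
static enum irq_result interrupt_model(struct phy_model *phy)
{
	if (phy->state < PHY_UP)
		return MODEL_IRQ_NONE;	/* ignored, condition stays latched */

	phy->irq_pending = false;
	return MODEL_IRQ_HANDLED;
}

/* Interrupt arrives after attach but before start: it is ignored. */
static bool early_irq_is_ignored(void)
{
	struct phy_model phy = { .state = PHY_DOWN };

	attach_model(&phy);
	link_event_model(&phy);
	assert(phy.irq_pending);

	return interrupt_model(&phy) == MODEL_IRQ_NONE && phy.irq_pending;
}

/* Same event after start: the handler services and clears it. */
static bool irq_after_start_is_handled(void)
{
	struct phy_model phy = { .state = PHY_DOWN };

	attach_model(&phy);
	start_model(&phy);
	link_event_model(&phy);

	return interrupt_model(&phy) == MODEL_IRQ_HANDLED && !phy.irq_pending;
}
```

The only difference between the two cases is whether the state has reached
PHY_UP by the time the interrupt fires.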

So it looks like 5.0-rc is already in need of this being fixed.

In looking at this, I came across this chunk of code:

static inline bool __phy_is_started(struct phy_device *phydev)
{
        WARN_ON(!mutex_is_locked(&phydev->lock));

        return phydev->state >= PHY_UP;
}

/**
 * phy_is_started - Convenience function to check whether PHY is started
 * @phydev: The phy_device struct
 */
static inline bool phy_is_started(struct phy_device *phydev)
{
        bool started;

        mutex_lock(&phydev->lock);
        started = __phy_is_started(phydev);
        mutex_unlock(&phydev->lock);

        return started;
}

which looks to me like over-complication.  The mutex locking there is
completely pointless - what are you trying to achieve with it?

Let's go through this.  The above is exactly equivalent to:

static inline bool phy_is_started(struct phy_device *phydev)
{
	enum phy_state state;

	mutex_lock(&phydev->lock);
	state = phydev->state;
	mutex_unlock(&phydev->lock);

	return state >= PHY_UP;
}

since it is irrelevant when we do the comparison.  Architectures that
Linux runs on are single-copy atomic, which means that reading
phydev->state is itself an atomic operation.  So, the mutex locking
around that read adds nothing to the atomicity of the entire operation.
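
To illustrate the equivalence outside the kernel, here is a minimal
userspace sketch.  It is an assumption-laden model: struct phy_model and
both helpers are hypothetical, the state enum is reduced, and a relaxed C11
atomic load stands in for the plain single-copy-atomic read (in kernel code
a lockless read would normally be spelled READ_ONCE()).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Simplified stand-in for enum phy_state (subset, ordering assumed). */
enum phy_state { PHY_DOWN, PHY_READY, PHY_HALTED, PHY_UP, PHY_RUNNING };

/* Hypothetical userspace model, not struct phy_device. */
struct phy_model {
	pthread_mutex_t lock;
	_Atomic int state;
};

/* The pattern under discussion: take the lock just to read one word. */
static bool started_locked(struct phy_model *phy)
{
	int state;

	pthread_mutex_lock(&phy->lock);
	state = atomic_load_explicit(&phy->state, memory_order_relaxed);
	pthread_mutex_unlock(&phy->lock);

	/* 'state' is only a snapshot from here on. */
	return state >= PHY_UP;
}

/* The same single read without the lock. */
static bool started_lockless(struct phy_model *phy)
{
	return atomic_load_explicit(&phy->state, memory_order_relaxed) >= PHY_UP;
}

/* Both helpers see the same value when nothing changes in between. */
static bool reads_agree(enum phy_state s)
{
	struct phy_model phy = { .lock = PTHREAD_MUTEX_INITIALIZER };

	atomic_init(&phy.state, s);
	return started_locked(&phy) == started_lockless(&phy);
}

/* Walk every state in the reduced enum. */
static void check_all_states(void)
{
	for (int s = PHY_DOWN; s <= PHY_RUNNING; s++)
		assert(reads_agree((enum phy_state)s));
}
```

Either helper hands the caller a value that can be stale by the time it is
used; the lock changes nothing about that.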

However, whether the entire operation is safe depends on what the rest
of the calling code does with the result.  For example, let's take
this code at the end of phy_state_machine():

        if (phy_polling_mode(phydev) && phy_is_started(phydev))
                phy_queue_state_machine(phydev, PHY_STATE_TIME);

state = PHY_UP
		thread 0			thread 1
						phy_disconnect()
						+-phy_is_started()
		phy_is_started()                |
						`-phy_stop()
						  +-phydev->state = PHY_HALTED
						  `-phy_stop_machine()
						    `-cancel_delayed_work_sync()
		phy_queue_state_machine()
		`-mod_delayed_work()

At this point, the phydev->state_queue work has been queued back onto the
system workqueue despite phy_stop_machine() having been called and
cancel_delayed_work_sync() having been run on it.

The original code in 4.20 did not have this race condition.

Basically, the lock inside phy_is_started() does nothing useful, and
I'd say is dangerously misleading.
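
The interleaving above can be replayed deterministically in a small
userspace model.  This is only a sketch under stated assumptions: struct
phy_model, the work_pending flag (standing in for the state_queue work
being queued) and every helper name are hypothetical, and doing the check
and the requeue under one lock is shown as the generic check-then-act
remedy, not as the actual phylib fix.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Simplified stand-in for enum phy_state (subset, ordering assumed). */
enum phy_state { PHY_DOWN, PHY_READY, PHY_HALTED, PHY_UP, PHY_RUNNING };

/* Hypothetical userspace model, not struct phy_device. */
struct phy_model {
	pthread_mutex_t lock;
	enum phy_state state;
	bool work_pending;	/* models the state machine work being queued */
};

static void phy_model_init(struct phy_model *phy)
{
	pthread_mutex_init(&phy->lock, NULL);
	phy->state = PHY_UP;
	phy->work_pending = false;
}

/* Check in its own critical section; the answer can go stale at once. */
static bool is_started_model(struct phy_model *phy)
{
	bool started;

	pthread_mutex_lock(&phy->lock);
	started = phy->state >= PHY_UP;
	pthread_mutex_unlock(&phy->lock);
	return started;
}

/* Models phy_stop(): halt, then cancel any pending work. */
static void stop_model(struct phy_model *phy)
{
	pthread_mutex_lock(&phy->lock);
	phy->state = PHY_HALTED;
	pthread_mutex_unlock(&phy->lock);

	pthread_mutex_lock(&phy->lock);
	phy->work_pending = false;	/* stand-in for cancelling the work */
	pthread_mutex_unlock(&phy->lock);
}

/* Check and act inside one critical section. */
static bool requeue_if_started(struct phy_model *phy)
{
	bool queued;

	pthread_mutex_lock(&phy->lock);
	queued = phy->state >= PHY_UP;
	if (queued)
		phy->work_pending = true;
	pthread_mutex_unlock(&phy->lock);
	return queued;
}

/* Replays the diagram: check, then the stop, then the stale requeue. */
static bool replay_racy(void)
{
	struct phy_model phy;
	bool started;

	phy_model_init(&phy);
	started = is_started_model(&phy);	/* thread 0 */
	assert(started);			/* state was PHY_UP then */
	stop_model(&phy);			/* thread 1 */
	if (started)
		phy.work_pending = true;	/* thread 0 requeues anyway */
	return phy.work_pending;
}

/* With check and requeue under one lock, only two orders are possible. */
static bool replay_checked_under_lock(bool stop_first)
{
	struct phy_model phy;

	phy_model_init(&phy);
	if (stop_first) {
		stop_model(&phy);
		requeue_if_started(&phy);
	} else {
		requeue_if_started(&phy);
		stop_model(&phy);
	}
	return phy.work_pending;
}
```

In the racy replay the work is still pending after the stop; in both
orderings of the combined version it is not.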

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
