netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] net: sfp: Stop SFP polling and interrupt handling during shutdown
@ 2019-06-06 18:06 Robert Hancock
  2019-06-06 18:09 ` Russell King - ARM Linux admin
  0 siblings, 1 reply; 7+ messages in thread
From: Robert Hancock @ 2019-06-06 18:06 UTC (permalink / raw)
  To: netdev; +Cc: linux, andrew, f.fainelli, hkallweit1, Robert Hancock

SFP device polling can cause problems during the shutdown process if the
parent devices of the network controller have been shut down already.
This problem was seen on the iMX6 platform with PCIe devices, where
accessing the device after the bus is shut down causes a hang.

Stop all delayed work in the SFP driver during the shutdown process, and
set a flag which causes any further state checks or state machine events
(possibly triggered by previous GPIO IRQs) to be skipped.

Signed-off-by: Robert Hancock <hancock@sedsystems.ca>
---

This is an updated version of a previous patch "net: sfp: Stop SFP polling
during shutdown" with the addition of stopping handling of GPIO-triggered
interrupts as well, as pointed out by Russell King.

 drivers/net/phy/sfp.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 554acc8..5fdf573 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -191,6 +191,7 @@ struct sfp {
 	struct delayed_work poll;
 	struct delayed_work timeout;
 	struct mutex sm_mutex;
+	bool shutdown;
 	unsigned char sm_mod_state;
 	unsigned char sm_dev_state;
 	unsigned short sm_state;
@@ -1466,6 +1467,11 @@ static void sfp_sm_mod_remove(struct sfp *sfp)
 static void sfp_sm_event(struct sfp *sfp, unsigned int event)
 {
 	mutex_lock(&sfp->sm_mutex);
+	if (unlikely(sfp->shutdown)) {
+		/* Do not handle any more state machine events. */
+		mutex_unlock(&sfp->sm_mutex);
+		return;
+	}
 
 	dev_dbg(sfp->dev, "SM: enter %s:%s:%s event %s\n",
 		mod_state_to_str(sfp->sm_mod_state),
@@ -1704,6 +1710,13 @@ static void sfp_check_state(struct sfp *sfp)
 {
 	unsigned int state, i, changed;
 
+	mutex_lock(&sfp->sm_mutex);
+	if (unlikely(sfp->shutdown)) {
+		/* No more state checks */
+		mutex_unlock(&sfp->sm_mutex);
+		return;
+	}
+
 	state = sfp_get_state(sfp);
 	changed = state ^ sfp->state;
 	changed &= SFP_F_PRESENT | SFP_F_LOS | SFP_F_TX_FAULT;
@@ -1715,6 +1728,7 @@ static void sfp_check_state(struct sfp *sfp)
 
 	state |= sfp->state & (SFP_F_TX_DISABLE | SFP_F_RATE_SELECT);
 	sfp->state = state;
+	mutex_unlock(&sfp->sm_mutex);
 
 	rtnl_lock();
 	if (changed & SFP_F_PRESENT)
@@ -1928,9 +1942,22 @@ static int sfp_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void sfp_shutdown(struct platform_device *pdev)
+{
+	struct sfp *sfp = platform_get_drvdata(pdev);
+
+	mutex_lock(&sfp->sm_mutex);
+	sfp->shutdown = true;
+	mutex_unlock(&sfp->sm_mutex);
+
+	cancel_delayed_work_sync(&sfp->poll);
+	cancel_delayed_work_sync(&sfp->timeout);
+}
+
 static struct platform_driver sfp_driver = {
 	.probe = sfp_probe,
 	.remove = sfp_remove,
+	.shutdown = sfp_shutdown,
 	.driver = {
 		.name = "sfp",
 		.of_match_table = sfp_of_match,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] net: sfp: Stop SFP polling and interrupt handling during shutdown
  2019-06-06 18:06 [PATCH net-next] net: sfp: Stop SFP polling and interrupt handling during shutdown Robert Hancock
@ 2019-06-06 18:09 ` Russell King - ARM Linux admin
  2019-06-06 20:57   ` Robert Hancock
  0 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux admin @ 2019-06-06 18:09 UTC (permalink / raw)
  To: Robert Hancock; +Cc: netdev, andrew, f.fainelli, hkallweit1

On Thu, Jun 06, 2019 at 12:06:17PM -0600, Robert Hancock wrote:
> SFP device polling can cause problems during the shutdown process if the
> parent devices of the network controller have been shut down already.
> This problem was seen on the iMX6 platform with PCIe devices, where
> accessing the device after the bus is shut down causes a hang.
> 
> Stop all delayed work in the SFP driver during the shutdown process, and
> set a flag which causes any further state checks or state machine events
> (possibly triggered by previous GPIO IRQs) to be skipped.
> 
> Signed-off-by: Robert Hancock <hancock@sedsystems.ca>
> ---
> 
> This is an updated version of a previous patch "net: sfp: Stop SFP polling
> during shutdown" with the addition of stopping handling of GPIO-triggered
> interrupts as well, as pointed out by Russell King.
> 
>  drivers/net/phy/sfp.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
> index 554acc8..5fdf573 100644
> --- a/drivers/net/phy/sfp.c
> +++ b/drivers/net/phy/sfp.c
> @@ -191,6 +191,7 @@ struct sfp {
>  	struct delayed_work poll;
>  	struct delayed_work timeout;
>  	struct mutex sm_mutex;
> +	bool shutdown;
>  	unsigned char sm_mod_state;
>  	unsigned char sm_dev_state;
>  	unsigned short sm_state;
> @@ -1466,6 +1467,11 @@ static void sfp_sm_mod_remove(struct sfp *sfp)
>  static void sfp_sm_event(struct sfp *sfp, unsigned int event)
>  {
>  	mutex_lock(&sfp->sm_mutex);
> +	if (unlikely(sfp->shutdown)) {
> +		/* Do not handle any more state machine events. */
> +		mutex_unlock(&sfp->sm_mutex);
> +		return;
> +	}
>  
>  	dev_dbg(sfp->dev, "SM: enter %s:%s:%s event %s\n",
>  		mod_state_to_str(sfp->sm_mod_state),
> @@ -1704,6 +1710,13 @@ static void sfp_check_state(struct sfp *sfp)
>  {
>  	unsigned int state, i, changed;
>  
> +	mutex_lock(&sfp->sm_mutex);
> +	if (unlikely(sfp->shutdown)) {
> +		/* No more state checks */
> +		mutex_unlock(&sfp->sm_mutex);
> +		return;
> +	}
> +

I don't think you need to add the mutex locking - just check for
sfp->shutdown and be done with it...

> +static void sfp_shutdown(struct platform_device *pdev)
> +{
> +	struct sfp *sfp = platform_get_drvdata(pdev);
> +
> +	mutex_lock(&sfp->sm_mutex);
> +	sfp->shutdown = true;
> +	mutex_unlock(&sfp->sm_mutex);
> +
> +	cancel_delayed_work_sync(&sfp->poll);
> +	cancel_delayed_work_sync(&sfp->timeout);

Since the work cancellation will ensure that the works are not running
at the point they return, and should they then run again, they'll hit
the sfp->shutdown condition.

> +}
> +
>  static struct platform_driver sfp_driver = {
>  	.probe = sfp_probe,
>  	.remove = sfp_remove,
> +	.shutdown = sfp_shutdown,
>  	.driver = {
>  		.name = "sfp",
>  		.of_match_table = sfp_of_match,
> -- 
> 1.8.3.1
> 
> 

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] net: sfp: Stop SFP polling and interrupt handling during shutdown
  2019-06-06 18:09 ` Russell King - ARM Linux admin
@ 2019-06-06 20:57   ` Robert Hancock
  2019-06-07  3:21     ` Florian Fainelli
                       ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Robert Hancock @ 2019-06-06 20:57 UTC (permalink / raw)
  To: Russell King - ARM Linux admin; +Cc: netdev, andrew, f.fainelli, hkallweit1

On 2019-06-06 12:09 p.m., Russell King - ARM Linux admin wrote:
>> @@ -1466,6 +1467,11 @@ static void sfp_sm_mod_remove(struct sfp *sfp)
>>  static void sfp_sm_event(struct sfp *sfp, unsigned int event)
>>  {
>>  	mutex_lock(&sfp->sm_mutex);
>> +	if (unlikely(sfp->shutdown)) {
>> +		/* Do not handle any more state machine events. */
>> +		mutex_unlock(&sfp->sm_mutex);
>> +		return;
>> +	}
>>  
>>  	dev_dbg(sfp->dev, "SM: enter %s:%s:%s event %s\n",
>>  		mod_state_to_str(sfp->sm_mod_state),
>> @@ -1704,6 +1710,13 @@ static void sfp_check_state(struct sfp *sfp)
>>  {
>>  	unsigned int state, i, changed;
>>  
>> +	mutex_lock(&sfp->sm_mutex);
>> +	if (unlikely(sfp->shutdown)) {
>> +		/* No more state checks */
>> +		mutex_unlock(&sfp->sm_mutex);
>> +		return;
>> +	}
>> +
> 
> I don't think you need to add the mutex locking - just check for
> sfp->shutdown and be done with it...

The idea there was to deal with the case where GPIO interrupts were
previously raised before shutdown and not yet handled by the threaded
interrupt handler by the time shutdown is called. After shutdown on the
SFP completes, the bus the GPIO stuff is on could potentially be shut
down at any moment, so we really don't want to be digging into the GPIO
states after that. Locking the mutex there ensures that we don't read a
stale value for the shutdown flag in the interrupt handler, since AFAIK
there's no other synchronization around that value.

It may also be helpful that the lock is now held for the subsequent code
in sfp_check_state that's comparing the previous and new states - it
seems like you could otherwise run into trouble if that function was
being concurrently called from the polling thread and the interrupt
handler (for example if you had an SFP where some GPIOs supported
interrupts and some didn't).

> 
>> +static void sfp_shutdown(struct platform_device *pdev)
>> +{
>> +	struct sfp *sfp = platform_get_drvdata(pdev);
>> +
>> +	mutex_lock(&sfp->sm_mutex);
>> +	sfp->shutdown = true;
>> +	mutex_unlock(&sfp->sm_mutex);
>> +
>> +	cancel_delayed_work_sync(&sfp->poll);
>> +	cancel_delayed_work_sync(&sfp->timeout);
> 
> Since the work cancellation will ensure that the works are not running
> at the point they return, and should they then run again, they'll hit
> the sfp->shutdown condition.
> 
>> +}
>> +
>>  static struct platform_driver sfp_driver = {
>>  	.probe = sfp_probe,
>>  	.remove = sfp_remove,
>> +	.shutdown = sfp_shutdown,
>>  	.driver = {
>>  		.name = "sfp",
>>  		.of_match_table = sfp_of_match,
>> -- 
>> 1.8.3.1
>>
>>
> 

-- 
Robert Hancock
Senior Software Developer
SED Systems, a division of Calian Ltd.
Email: hancock@sedsystems.ca

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] net: sfp: Stop SFP polling and interrupt handling during shutdown
  2019-06-06 20:57   ` Robert Hancock
@ 2019-06-07  3:21     ` Florian Fainelli
  2019-06-07 10:26     ` Russell King - ARM Linux admin
  2019-06-07 10:42     ` Russell King - ARM Linux admin
  2 siblings, 0 replies; 7+ messages in thread
From: Florian Fainelli @ 2019-06-07  3:21 UTC (permalink / raw)
  To: Robert Hancock, Russell King - ARM Linux admin; +Cc: netdev, andrew, hkallweit1



On 6/6/2019 1:57 PM, Robert Hancock wrote:
> On 2019-06-06 12:09 p.m., Russell King - ARM Linux admin wrote:
>>> @@ -1466,6 +1467,11 @@ static void sfp_sm_mod_remove(struct sfp *sfp)
>>>  static void sfp_sm_event(struct sfp *sfp, unsigned int event)
>>>  {
>>>  	mutex_lock(&sfp->sm_mutex);
>>> +	if (unlikely(sfp->shutdown)) {
>>> +		/* Do not handle any more state machine events. */
>>> +		mutex_unlock(&sfp->sm_mutex);
>>> +		return;
>>> +	}
>>>  
>>>  	dev_dbg(sfp->dev, "SM: enter %s:%s:%s event %s\n",
>>>  		mod_state_to_str(sfp->sm_mod_state),
>>> @@ -1704,6 +1710,13 @@ static void sfp_check_state(struct sfp *sfp)
>>>  {
>>>  	unsigned int state, i, changed;
>>>  
>>> +	mutex_lock(&sfp->sm_mutex);
>>> +	if (unlikely(sfp->shutdown)) {
>>> +		/* No more state checks */
>>> +		mutex_unlock(&sfp->sm_mutex);
>>> +		return;
>>> +	}
>>> +
>>
>> I don't think you need to add the mutex locking - just check for
>> sfp->shutdown and be done with it...
> 
> The idea there was to deal with the case where GPIO interrupts were
> previously raised before shutdown and not yet handled by the threaded
> interrupt handler by the time shutdown is called. After shutdown on the
> SFP completes, the bus the GPIO stuff is on could potentially be shut
> down at any moment, so we really don't want to be digging into the GPIO
> states after that. Locking the mutex there ensures that we don't read a
> stale value for the shutdown flag in the interrupt handler, since AFAIK
> there's no other synchronization around that value.
> 
> It may also be helpful that the lock is now held for the subsequent code
> in sfp_check_state that's comparing the previous and new states - it
> seems like you could otherwise run into trouble if that function was
> being concurrently called from the polling thread and the interrupt
> handler (for example if you had an SFP where some GPIOs supported
> interrupts and some didn't).

Would not it be sufficient to call disable_irq() or devm_free_irq() (to
match the devm_request_threaded_irq call) in order to achieve what you
want here?
-- 
Florian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] net: sfp: Stop SFP polling and interrupt handling during shutdown
  2019-06-06 20:57   ` Robert Hancock
  2019-06-07  3:21     ` Florian Fainelli
@ 2019-06-07 10:26     ` Russell King - ARM Linux admin
  2019-06-07 10:42     ` Russell King - ARM Linux admin
  2 siblings, 0 replies; 7+ messages in thread
From: Russell King - ARM Linux admin @ 2019-06-07 10:26 UTC (permalink / raw)
  To: Robert Hancock; +Cc: netdev, andrew, f.fainelli, hkallweit1

On Thu, Jun 06, 2019 at 02:57:22PM -0600, Robert Hancock wrote:
> It may also be helpful that the lock is now held for the subsequent code
> in sfp_check_state that's comparing the previous and new states - it
> seems like you could otherwise run into trouble if that function was
> being concurrently called from the polling thread and the interrupt
> handler (for example if you had an SFP where some GPIOs supported
> interrupts and some didn't).

That's a good point, one that we should address separately.  Rather
than re-using the existing mutex (which would be difficult to hold
across calling the state machine due to locking inversion) how about
this:

 drivers/net/phy/sfp.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 8a21294d1ce8..5ff427dcbb31 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -190,6 +190,7 @@ struct sfp {
 	struct delayed_work poll;
 	struct delayed_work timeout;
 	struct mutex sm_mutex;
+	struct mutex st_mutex;
 	unsigned char sm_mod_state;
 	unsigned char sm_dev_state;
 	unsigned short sm_state;
@@ -1976,6 +1977,7 @@ static void sfp_check_state(struct sfp *sfp)
 {
 	unsigned int state, i, changed;
 
+	mutex_lock(&sfp->st_mutex);
 	state = sfp_get_state(sfp);
 	changed = state ^ sfp->state;
 	changed &= SFP_F_PRESENT | SFP_F_LOS | SFP_F_TX_FAULT;
@@ -2001,6 +2003,7 @@ static void sfp_check_state(struct sfp *sfp)
 		sfp_sm_event(sfp, state & SFP_F_LOS ?
 				SFP_E_LOS_HIGH : SFP_E_LOS_LOW);
 	rtnl_unlock();
+	mutex_unlock(&sfp->st_mutex);
 }
 
 static irqreturn_t sfp_irq(int irq, void *data)
@@ -2031,6 +2034,7 @@ static struct sfp *sfp_alloc(struct device *dev)
 	sfp->dev = dev;
 
 	mutex_init(&sfp->sm_mutex);
+	mutex_init(&sfp->st_mutex);
 	INIT_DELAYED_WORK(&sfp->poll, sfp_poll);
 	INIT_DELAYED_WORK(&sfp->timeout, sfp_timeout);
 
-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] net: sfp: Stop SFP polling and interrupt handling during shutdown
  2019-06-06 20:57   ` Robert Hancock
  2019-06-07  3:21     ` Florian Fainelli
  2019-06-07 10:26     ` Russell King - ARM Linux admin
@ 2019-06-07 10:42     ` Russell King - ARM Linux admin
  2019-06-07 16:08       ` Robert Hancock
  2 siblings, 1 reply; 7+ messages in thread
From: Russell King - ARM Linux admin @ 2019-06-07 10:42 UTC (permalink / raw)
  To: Robert Hancock; +Cc: netdev, andrew, f.fainelli, hkallweit1

On Thu, Jun 06, 2019 at 02:57:22PM -0600, Robert Hancock wrote:
> The idea there was to deal with the case where GPIO interrupts were
> previously raised before shutdown and not yet handled by the threaded
> interrupt handler by the time shutdown is called. After shutdown on the
> SFP completes, the bus the GPIO stuff is on could potentially be shut
> down at any moment, so we really don't want to be digging into the GPIO
> states after that. Locking the mutex there ensures that we don't read a
> stale value for the shutdown flag in the interrupt handler, since AFAIK
> there's no other synchronization around that value.

There are two cases:

1) The interrupt is raised just as sfp_shutdown() is called but before
   the mutex is taken.  We will get the full processing in this case.

2) The interrupt is raised during the mutex-locked bit of sfp_shutdown()
   or after the mutex in sfp_shutdown() is released.  We will get the
   abbreviated processing.

This means that the mutex doesn't provide any protection against full
interrupt processing if it occurs just prior to or during the initial
execution of sfp_shutdown().

All that we need to ensure is that the state of sfp->shutdown is
visible by the time sfp_shutdown() returns, and that none of the
interrupt and worker functions are executing.  We have the worker
functions covered by the synchronous cancelling of them, but not the
interrupts, and as Florian points out, it's probably better to disable
the interrupts, and again, that can be done synchronously to ensure
that the handlers are not running.

If the workers and interrupt handlers are synchronously disabled, we
can be sure by the end of sfp_shutdown() that none of those paths are
executing, so the next time something attempts to trigger them, they
will see sfp->shutdown is set.

I'm not convinced that we even need sfp->shutdown if we have cancelled
the workers and disabled the interrupts.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next] net: sfp: Stop SFP polling and interrupt handling during shutdown
  2019-06-07 10:42     ` Russell King - ARM Linux admin
@ 2019-06-07 16:08       ` Robert Hancock
  0 siblings, 0 replies; 7+ messages in thread
From: Robert Hancock @ 2019-06-07 16:08 UTC (permalink / raw)
  To: Russell King - ARM Linux admin; +Cc: netdev, andrew, f.fainelli, hkallweit1

On 2019-06-07 4:42 a.m., Russell King - ARM Linux admin wrote:
> On Thu, Jun 06, 2019 at 02:57:22PM -0600, Robert Hancock wrote:
>> The idea there was to deal with the case where GPIO interrupts were
>> previously raised before shutdown and not yet handled by the threaded
>> interrupt handler by the time shutdown is called. After shutdown on the
>> SFP completes, the bus the GPIO stuff is on could potentially be shut
>> down at any moment, so we really don't want to be digging into the GPIO
>> states after that. Locking the mutex there ensures that we don't read a
>> stale value for the shutdown flag in the interrupt handler, since AFAIK
>> there's no other synchronization around that value.
> 
> There are two cases:
> 
> 1) The interrupt is raised just as sfp_shutdown() is called but before
>    the mutex is taken.  We will get the full processing in this case.
> 
> 2) The interrupt is raised during the mutex-locked bit of sfp_shutdown()
>    or after the mutex in sfp_shutdown() is released.  We will get the
>    abbreviated processing.
> 
> This means that the mutex doesn't provide any protection against full
> interrupt processing if it occurs just prior to or during the initial
> execution of sfp_shutdown().
> 
> All that we need to ensure is that the state of sfp->shutdown is
> visible by the time sfp_shutdown() returns, and that none of the
> interrupt and worker functions are executing.  We have the worker
> functions covered by the synchronous cancelling of them, but not the
> interrupts, and as Florian points out, it's probably better to disable
> the interrupts, and again, that can be done synchronously to ensure
> that the handlers are not running.
> 
> If the workers and interrupt handlers are synchronously disabled, we
> can be sure by the end of sfp_shutdown() that none of those paths are
> executing, so the next time something attempts to trigger them, they
> will see sfp->shutdown is set.
> 
> I'm not convinced that we even need sfp->shutdown if we have cancelled
> the workers and disabled the interrupts.

I think you're right that if we just free the interrupts we avoid the
need for that flag, as that should ensure that the interrupt handlers
are no longer running or pending. Will resubmit with that approach and
also add a patch to add another mutex to the state checking as you
suggested.

-- 
Robert Hancock
Senior Software Developer
SED Systems, a division of Calian Ltd.
Email: hancock@sedsystems.ca

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-06-07 16:08 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-06 18:06 [PATCH net-next] net: sfp: Stop SFP polling and interrupt handling during shutdown Robert Hancock
2019-06-06 18:09 ` Russell King - ARM Linux admin
2019-06-06 20:57   ` Robert Hancock
2019-06-07  3:21     ` Florian Fainelli
2019-06-07 10:26     ` Russell King - ARM Linux admin
2019-06-07 10:42     ` Russell King - ARM Linux admin
2019-06-07 16:08       ` Robert Hancock

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).