netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH net 0/2] amd-xgbe: Sleeping while atomic during hibernate
@ 2019-10-15 13:49 James Morse
  2019-10-15 13:49 ` [RFC PATCH net 1/2] amd-xgbe: Avoid sleeping in flush_workqueue() while holding a spinlock James Morse
  2019-10-15 13:49 ` [RFC PATCH net 2/2] amd-xgbe: Avoid sleeping in napi_disable() " James Morse
  0 siblings, 2 replies; 5+ messages in thread
From: James Morse @ 2019-10-15 13:49 UTC (permalink / raw)
  To: netdev; +Cc: Tom Lendacky, Dave S . Miller

Hi guys,

While testing hibernate on Seattle, I ran over these two.

I'm not sure the second one is correct, so I'm hoping someone else will
know how to fix it!

Thanks,

James Morse (2):
  amd-xgbe: Avoid sleeping in flush_workqueue() while holding a spinlock
  amd-xgbe: Avoid sleeping in napi_disable() while holding a spinlock

 drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 26 ++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [RFC PATCH net 1/2] amd-xgbe: Avoid sleeping in flush_workqueue() while holding a spinlock
  2019-10-15 13:49 [RFC PATCH net 0/2] amd-xgbe: Sleeping while atomic during hibernate James Morse
@ 2019-10-15 13:49 ` James Morse
  2019-10-15 15:26   ` Lendacky, Thomas
  2019-10-15 13:49 ` [RFC PATCH net 2/2] amd-xgbe: Avoid sleeping in napi_disable() " James Morse
  1 sibling, 1 reply; 5+ messages in thread
From: James Morse @ 2019-10-15 13:49 UTC (permalink / raw)
  To: netdev; +Cc: Tom Lendacky, Dave S . Miller

xgbe_powerdown() takes an irqsave spinlock, then calls flush_workqueue()
which takes a mutex. DEBUG_ATOMIC_SLEEP isn't happy about this:
| BUG: sleeping function called from invalid context at [...] mutex.c:281
| in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 2733, name: bash
| CPU: 4 PID: 2733 Comm: bash Tainted: G        W         5.4.0-rc3 #113
| Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
| Call trace:
|  show_stack+0x24/0x30
|  dump_stack+0xb0/0xf8
|  ___might_sleep+0x124/0x148
|  __might_sleep+0x54/0x90
|  mutex_lock+0x2c/0x80
|  flush_workqueue+0x84/0x420
|  xgbe_powerdown+0x6c/0x108
|  xgbe_platform_suspend+0x34/0x80
|  pm_generic_freeze+0x3c/0x58
|  acpi_subsys_freeze+0x2c/0x38
|  dpm_run_callback+0x3c/0x1e8
|  __device_suspend+0x130/0x468
|  dpm_suspend+0x114/0x388
|  hibernation_snapshot+0xe8/0x378
|  hibernate+0x18c/0x2f8

Drop the lock for the duration of xgbe_powerdown(). We have already
stopeed the timers, so service_work can't be re-queued. Move the
pdata->power_down flag earlier so that it can be used by the interrupt
handler to know not to re-queue the tx_tstamp_work.

Signed-off-by: James Morse <james.morse@arm.com>

---
RFC as I'm not familiar with this driver. I'm happy to test a better fix!
---
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index 98f8f2033154..bfba7effcf9f 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -480,6 +480,8 @@ static void xgbe_isr_task(unsigned long data)
 	struct xgbe_channel *channel;
 	unsigned int dma_isr, dma_ch_isr;
 	unsigned int mac_isr, mac_tssr, mac_mdioisr;
+	unsigned long flags;
+	bool power_down;
 	unsigned int i;
 
 	/* The DMA interrupt status register also reports MAC and MTL
@@ -558,8 +560,14 @@ static void xgbe_isr_task(unsigned long data)
 				/* Read Tx Timestamp to clear interrupt */
 				pdata->tx_tstamp =
 					hw_if->get_tx_tstamp(pdata);
-				queue_work(pdata->dev_workqueue,
-					   &pdata->tx_tstamp_work);
+
+				spin_lock_irqsave(&pdata->lock, flags);
+				power_down = pdata->power_down;
+				spin_unlock_irqrestore(&pdata->lock, flags);
+
+				if (!power_down)
+					queue_work(pdata->dev_workqueue,
+						   &pdata->tx_tstamp_work);
 			}
 		}
 
@@ -1256,16 +1264,22 @@ int xgbe_powerdown(struct net_device *netdev, unsigned int caller)
 
 	netif_tx_stop_all_queues(netdev);
 
+	/* Stop service_work being re-queued by the service_timer */
 	xgbe_stop_timers(pdata);
+
+	/* Stop tx_tstamp_work being re-queued after flush_workqueue() */
+	pdata->power_down = 1;
+
+	/* drop the lock to allow flush_workqueue() to sleep. */
+	spin_unlock_irqrestore(&pdata->lock, flags);
 	flush_workqueue(pdata->dev_workqueue);
+	spin_lock_irqsave(&pdata->lock, flags);
 
 	hw_if->powerdown_tx(pdata);
 	hw_if->powerdown_rx(pdata);
 
 	xgbe_napi_disable(pdata, 0);
 
-	pdata->power_down = 1;
-
 	spin_unlock_irqrestore(&pdata->lock, flags);
 
 	DBGPR("<--xgbe_powerdown\n");
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH net 2/2] amd-xgbe: Avoid sleeping in napi_disable() while holding a spinlock
  2019-10-15 13:49 [RFC PATCH net 0/2] amd-xgbe: Sleeping while atomic during hibernate James Morse
  2019-10-15 13:49 ` [RFC PATCH net 1/2] amd-xgbe: Avoid sleeping in flush_workqueue() while holding a spinlock James Morse
@ 2019-10-15 13:49 ` James Morse
  2019-10-15 15:41   ` Lendacky, Thomas
  1 sibling, 1 reply; 5+ messages in thread
From: James Morse @ 2019-10-15 13:49 UTC (permalink / raw)
  To: netdev; +Cc: Tom Lendacky, Dave S . Miller

xgbe_powerdown() takes an irqsave spinlock, then calls napi_disable()
via xgbe_napi_disable(). napi_disable() might call msleep().
DEBUG_ATOMIC_SLEEP isn't happy about this:
| BUG: sleeping function called from invalid context at ../net/core/dev.c:6332
| in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 2831, name: bash
| CPU: 3 PID: 2831 Comm: bash Tainted: G        W         5.4.0-rc3-00001-g9dbe793f263b #114
| Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
| Call trace:
|  dump_backtrace+0x0/0x160
|  show_stack+0x24/0x30
|  dump_stack+0xb0/0xf8
|  ___might_sleep+0x124/0x148
|  __might_sleep+0x54/0x90
|  napi_disable+0x48/0x140
|  xgbe_napi_disable+0x64/0xc0
|  xgbe_powerdown+0xb0/0x120
|  xgbe_platform_suspend+0x34/0x80
|  pm_generic_freeze+0x3c/0x58
|  acpi_subsys_freeze+0x2c/0x38
|  dpm_run_callback+0x3c/0x1e8
|  __device_suspend+0x130/0x468
|  dpm_suspend+0x114/0x388
|  hibernation_snapshot+0xe8/0x378
|  hibernate+0x18c/0x2f8

Move xgbe_napi_disable() outside the spin_lock()d region of
xgbe_powerdown(). This matches its use in xgbe_stop() ... but this
might only be safe because of the earlier call to xgbe_free_irqs().

Signed-off-by: James Morse <james.morse@arm.com>

---
RFC as I'm not familiar with this driver. I'm happy to test a better fix!
---
 drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
index bfba7effcf9f..a6e6c21e921f 100644
--- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
@@ -1278,10 +1278,10 @@ int xgbe_powerdown(struct net_device *netdev, unsigned int caller)
 	hw_if->powerdown_tx(pdata);
 	hw_if->powerdown_rx(pdata);
 
-	xgbe_napi_disable(pdata, 0);
-
 	spin_unlock_irqrestore(&pdata->lock, flags);
 
+	xgbe_napi_disable(pdata, 0);
+
 	DBGPR("<--xgbe_powerdown\n");
 
 	return 0;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH net 1/2] amd-xgbe: Avoid sleeping in flush_workqueue() while holding a spinlock
  2019-10-15 13:49 ` [RFC PATCH net 1/2] amd-xgbe: Avoid sleeping in flush_workqueue() while holding a spinlock James Morse
@ 2019-10-15 15:26   ` Lendacky, Thomas
  0 siblings, 0 replies; 5+ messages in thread
From: Lendacky, Thomas @ 2019-10-15 15:26 UTC (permalink / raw)
  To: James Morse, netdev; +Cc: Dave S . Miller

On 10/15/19 8:49 AM, James Morse wrote:
> xgbe_powerdown() takes an irqsave spinlock, then calls flush_workqueue()
> which takes a mutex. DEBUG_ATOMIC_SLEEP isn't happy about this:
> | BUG: sleeping function called from invalid context at [...] mutex.c:281
> | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 2733, name: bash
> | CPU: 4 PID: 2733 Comm: bash Tainted: G        W         5.4.0-rc3 #113
> | Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
> | Call trace:
> |  show_stack+0x24/0x30
> |  dump_stack+0xb0/0xf8
> |  ___might_sleep+0x124/0x148
> |  __might_sleep+0x54/0x90
> |  mutex_lock+0x2c/0x80
> |  flush_workqueue+0x84/0x420
> |  xgbe_powerdown+0x6c/0x108
> |  xgbe_platform_suspend+0x34/0x80
> |  pm_generic_freeze+0x3c/0x58
> |  acpi_subsys_freeze+0x2c/0x38
> |  dpm_run_callback+0x3c/0x1e8
> |  __device_suspend+0x130/0x468
> |  dpm_suspend+0x114/0x388
> |  hibernation_snapshot+0xe8/0x378
> |  hibernate+0x18c/0x2f8
> 
> Drop the lock for the duration of xgbe_powerdown(). We have already
> stopeed the timers, so service_work can't be re-queued. Move the
> pdata->power_down flag earlier so that it can be used by the interrupt
> handler to know not to re-queue the tx_tstamp_work.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> ---
> RFC as I'm not familiar with this driver. I'm happy to test a better fix!
> ---
>  drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 22 ++++++++++++++++++----
>  1 file changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> index 98f8f2033154..bfba7effcf9f 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> @@ -480,6 +480,8 @@ static void xgbe_isr_task(unsigned long data)
>  	struct xgbe_channel *channel;
>  	unsigned int dma_isr, dma_ch_isr;
>  	unsigned int mac_isr, mac_tssr, mac_mdioisr;
> +	unsigned long flags;
> +	bool power_down;
>  	unsigned int i;
>  
>  	/* The DMA interrupt status register also reports MAC and MTL
> @@ -558,8 +560,14 @@ static void xgbe_isr_task(unsigned long data)
>  				/* Read Tx Timestamp to clear interrupt */
>  				pdata->tx_tstamp =
>  					hw_if->get_tx_tstamp(pdata);
> -				queue_work(pdata->dev_workqueue,
> -					   &pdata->tx_tstamp_work);
> +
> +				spin_lock_irqsave(&pdata->lock, flags);
> +				power_down = pdata->power_down;
> +				spin_unlock_irqrestore(&pdata->lock, flags);
> +
> +				if (!power_down)
> +					queue_work(pdata->dev_workqueue,
> +						   &pdata->tx_tstamp_work);

Hmm, I think this could race.  You probably want to protect queue_work()
with the spin lock, too, in which case you won't need the power_down var.

>  			}
>  		}
>  
> @@ -1256,16 +1264,22 @@ int xgbe_powerdown(struct net_device *netdev, unsigned int caller)
>  
>  	netif_tx_stop_all_queues(netdev);
>  
> +	/* Stop service_work being re-queued by the service_timer */
>  	xgbe_stop_timers(pdata);
> +
> +	/* Stop tx_tstamp_work being re-queued after flush_workqueue() */
> +	pdata->power_down = 1;

You can probably move this up even further, just after grabbing the lock.

> +
> +	/* drop the lock to allow flush_workqueue() to sleep. */
> +	spin_unlock_irqrestore(&pdata->lock, flags);
>  	flush_workqueue(pdata->dev_workqueue);

Rather than dropping and reacquiring the lock, I think you can get away
with moving this to after dropping the lock below.

Thanks,
Tom

> +	spin_lock_irqsave(&pdata->lock, flags);
>  
>  	hw_if->powerdown_tx(pdata);
>  	hw_if->powerdown_rx(pdata);
>  
>  	xgbe_napi_disable(pdata, 0);
>  
> -	pdata->power_down = 1;
> -
>  	spin_unlock_irqrestore(&pdata->lock, flags);
>  
>  	DBGPR("<--xgbe_powerdown\n");
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH net 2/2] amd-xgbe: Avoid sleeping in napi_disable() while holding a spinlock
  2019-10-15 13:49 ` [RFC PATCH net 2/2] amd-xgbe: Avoid sleeping in napi_disable() " James Morse
@ 2019-10-15 15:41   ` Lendacky, Thomas
  0 siblings, 0 replies; 5+ messages in thread
From: Lendacky, Thomas @ 2019-10-15 15:41 UTC (permalink / raw)
  To: James Morse, netdev; +Cc: Dave S . Miller

On 10/15/19 8:49 AM, James Morse wrote:
> xgbe_powerdown() takes an irqsave spinlock, then calls napi_disable()
> via xgbe_napi_disable(). napi_disable() might call msleep().
> DEBUG_ATOMIC_SLEEP isn't happy about this:
> | BUG: sleeping function called from invalid context at ../net/core/dev.c:6332
> | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 2831, name: bash
> | CPU: 3 PID: 2831 Comm: bash Tainted: G        W         5.4.0-rc3-00001-g9dbe793f263b #114
> | Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
> | Call trace:
> |  dump_backtrace+0x0/0x160
> |  show_stack+0x24/0x30
> |  dump_stack+0xb0/0xf8
> |  ___might_sleep+0x124/0x148
> |  __might_sleep+0x54/0x90
> |  napi_disable+0x48/0x140
> |  xgbe_napi_disable+0x64/0xc0
> |  xgbe_powerdown+0xb0/0x120
> |  xgbe_platform_suspend+0x34/0x80
> |  pm_generic_freeze+0x3c/0x58
> |  acpi_subsys_freeze+0x2c/0x38
> |  dpm_run_callback+0x3c/0x1e8
> |  __device_suspend+0x130/0x468
> |  dpm_suspend+0x114/0x388
> |  hibernation_snapshot+0xe8/0x378
> |  hibernate+0x18c/0x2f8
> 
> Move xgbe_napi_disable() outside the spin_lock()d region of
> xgbe_powerdown(). This matches its use in xgbe_stop() ... but this
> might only be safe because of the earlier call to xgbe_free_irqs().
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> ---
> RFC as I'm not familiar with this driver. I'm happy to test a better fix!
> ---
>  drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> index bfba7effcf9f..a6e6c21e921f 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c
> @@ -1278,10 +1278,10 @@ int xgbe_powerdown(struct net_device *netdev, unsigned int caller)
>  	hw_if->powerdown_tx(pdata);
>  	hw_if->powerdown_rx(pdata);
>  
> -	xgbe_napi_disable(pdata, 0);
> -
>  	spin_unlock_irqrestore(&pdata->lock, flags);
>  
> +	xgbe_napi_disable(pdata, 0);
> +

As far as I can tell, I think this is safe. Whether inside or outside the
spinlock doesn't make a difference to the interrupt routine since it
doesn't acquire this lock. And the suspend/resume functions can't be
called at the same time.

Thanks,
Tom

>  	DBGPR("<--xgbe_powerdown\n");
>  
>  	return 0;
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-10-15 15:41 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-15 13:49 [RFC PATCH net 0/2] amd-xgbe: Sleeping while atomic during hibernate James Morse
2019-10-15 13:49 ` [RFC PATCH net 1/2] amd-xgbe: Avoid sleeping in flush_workqueue() while holding a spinlock James Morse
2019-10-15 15:26   ` Lendacky, Thomas
2019-10-15 13:49 ` [RFC PATCH net 2/2] amd-xgbe: Avoid sleeping in napi_disable() " James Morse
2019-10-15 15:41   ` Lendacky, Thomas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).