All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-wired-lan] [PATCH] ixgbevf: handle race between close and suspend on shutdown
@ 2016-10-31 20:00 Emil Tantilov
  2016-10-31 21:30 ` Alexander Duyck
  0 siblings, 1 reply; 2+ messages in thread
From: Emil Tantilov @ 2016-10-31 20:00 UTC (permalink / raw)
  To: intel-wired-lan

When an interface is part of a namespace it is possible that
ixgbevf_close() may be called while ixgbevf_suspend() is running
which ends up in a double free WARN and/or a BUG in free_msi_irqs()

This patch introduces a new adapter->state bit that will allow us
to handle this rate atomically. The idea is that ixgbevf_close() should
not run if ixgbevf_suspend is in session.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf.h      |    1 +
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |    9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
index 5639fbe..dc4b3b4 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf.h
@@ -446,6 +446,7 @@ enum ixbgevf_state_t {
 	__IXGBEVF_SERVICE_INITED,
 	__IXGBEVF_RESET_REQUESTED,
 	__IXGBEVF_QUEUE_RESET_REQUESTED,
+	__IXGBEVF_CLEAR_IRQS,
 };
 
 enum ixgbevf_boards {
diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index d316f50..8ebe797 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -3242,12 +3242,17 @@ int ixgbevf_close(struct net_device *netdev)
 {
 	struct ixgbevf_adapter *adapter = netdev_priv(netdev);
 
+	if (test_and_set_bit(__IXGBEVF_CLEAR_IRQS, &adapter->state))
+		return 0;
+
 	ixgbevf_down(adapter);
 	ixgbevf_free_irq(adapter);
 
 	ixgbevf_free_all_tx_resources(adapter);
 	ixgbevf_free_all_rx_resources(adapter);
 
+	clear_bit(__IXGBEVF_CLEAR_IRQS, &adapter->state);
+
 	return 0;
 }
 
@@ -3792,6 +3797,9 @@ static int ixgbevf_suspend(struct pci_dev *pdev, pm_message_t state)
 	int retval = 0;
 #endif
 
+	while (test_and_set_bit(__IXGBEVF_CLEAR_IRQS, &adapter->state))
+		usleep_range(1000, 2000);
+
 	netif_device_detach(netdev);
 
 	if (netif_running(netdev)) {
@@ -3813,6 +3821,7 @@ static int ixgbevf_suspend(struct pci_dev *pdev, pm_message_t state)
 	if (!test_and_set_bit(__IXGBEVF_DISABLED, &adapter->state))
 		pci_disable_device(pdev);
 
+	clear_bit(__IXGBEVF_CLEAR_IRQS, &adapter->state);
 	return 0;
 }
 


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* [Intel-wired-lan] [PATCH] ixgbevf: handle race between close and suspend on shutdown
  2016-10-31 20:00 [Intel-wired-lan] [PATCH] ixgbevf: handle race between close and suspend on shutdown Emil Tantilov
@ 2016-10-31 21:30 ` Alexander Duyck
  0 siblings, 0 replies; 2+ messages in thread
From: Alexander Duyck @ 2016-10-31 21:30 UTC (permalink / raw)
  To: intel-wired-lan

On Mon, Oct 31, 2016 at 1:00 PM, Emil Tantilov <emil.s.tantilov@intel.com>
wrote:

> When an interface is part of a namespace it is possible that
> ixgbevf_close() may be called while ixgbevf_suspend() is running
> which ends up in a double free WARN and/or a BUG in free_msi_irqs()
>
> This patch introduces a new adapter->state bit that will allow us
> to handle this rate atomically. The idea is that ixgbevf_close() should
> not run if ixgbevf_suspend is in session.
>
> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
>

I'm pretty sure this isn't the correct fix for this.  You still have the
two technically race against each other because the close could get caught
behind the rtnl lock and then flow through trying to free the interrupt
from it's side.  Also there are scenarios where this can leak interrupts
and such since suspend could set the bit, close could run through take the
rtnl lock, see the bit set so it doesn't free anything, then suspend
resumes and netif_running returns false so you never free the ring
resources.

I think the correct fix for this is to move the rtnl_lock/unlock so that
they are outside of the netif_device_detach() and
ixgbevf_clear_interrupt_scheme() calls that are a part of ixgbevf_suspend.
That way you shouldn't race with close.

It also looks like there is a similar bug in
the ixgbevf_queue_reset_subtask().  It should be taking the rtnl_lock
before checking netif_running, and releasing it at the end of the
function.  You could also add a call to netif_device_present to
ixgbevf_close and if it returns false then just return since the rtnl_lock
will guarantee that if it is set then suspend will have already freed the
resources.

- Alex
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osuosl.org/pipermail/intel-wired-lan/attachments/20161031/ae8fc41e/attachment.html>

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2016-10-31 21:30 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-31 20:00 [Intel-wired-lan] [PATCH] ixgbevf: handle race between close and suspend on shutdown Emil Tantilov
2016-10-31 21:30 ` Alexander Duyck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.