From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeff Kirsher Subject: [net-next 2/6] ixgbe/ixgbevf: Free IRQ when PCI error recovery removes the device Date: Thu, 17 May 2018 09:37:28 -0700 Message-ID: <20180517163732.30910-3-jeffrey.t.kirsher@intel.com> References: <20180517163732.30910-1-jeffrey.t.kirsher@intel.com> Cc: Mauro S M Rodrigues , netdev@vger.kernel.org, nhorman@redhat.com, sassmann@redhat.com, jogreene@redhat.com, Jeff Kirsher To: davem@davemloft.net Return-path: Received: from mga02.intel.com ([134.134.136.20]:13592 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752408AbeEQQgL (ORCPT ); Thu, 17 May 2018 12:36:11 -0400 In-Reply-To: <20180517163732.30910-1-jeffrey.t.kirsher@intel.com> Sender: netdev-owner@vger.kernel.org List-ID: From: Mauro S M Rodrigues Since commit f7f37e7ff2b9 ("ixgbe: handle close/suspend race with netif_device_detach/present") ixgbe_close_suspend is called, from ixgbe_close, only if the device is present, i.e. if it isn't detached. That exposed a situation where IRQs weren't freed if a PCI error recovery system opts to remove the device. For such case the pci channel state is set to pci_channel_io_perm_failure and ixgbe_io_error_detected was returning PCI_ERS_RESULT_DISCONNECT before calling ixgbe_close_suspend consequentially not freeing IRQ and crashing when the remove handler calls pci_disable_device, hitting a BUG_ON at free_msi_irqs, which asserts that there is no non-free IRQ associated with the device to be removed: BUG_ON(irq_has_action(entry->irq + i)); The issue is fixed by calling the ixgbe_close_suspend before evaluate the pci channel state. Reported-by: Naresh Bannoth Reported-by: Abdul Haleem Signed-off-by: Mauro S M Rodrigues Reviewed-by: Alexander Duyck Tested-by: Andrew Bowers Signed-off-by: Jeff Kirsher --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 +++--- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 163b34a9572d..a52d92e182ee 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -10935,14 +10935,14 @@ static pci_ers_result_t ixgbe_io_error_detected(struct pci_dev *pdev, rtnl_lock(); netif_device_detach(netdev); + if (netif_running(netdev)) + ixgbe_close_suspend(adapter); + if (state == pci_channel_io_perm_failure) { rtnl_unlock(); return PCI_ERS_RESULT_DISCONNECT; } - if (netif_running(netdev)) - ixgbe_close_suspend(adapter); - if (!test_and_set_bit(__IXGBE_DISABLED, &adapter->state)) pci_disable_device(pdev); rtnl_unlock(); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 1ccce6cd51fc..9a939dcaf727 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -4747,14 +4747,14 @@ static pci_ers_result_t ixgbevf_io_error_detected(struct pci_dev *pdev, rtnl_lock(); netif_device_detach(netdev); + if (netif_running(netdev)) + ixgbevf_close_suspend(adapter); + if (state == pci_channel_io_perm_failure) { rtnl_unlock(); return PCI_ERS_RESULT_DISCONNECT; } - if (netif_running(netdev)) - ixgbevf_close_suspend(adapter); - if (!test_and_set_bit(__IXGBEVF_DISABLED, &adapter->state)) pci_disable_device(pdev); rtnl_unlock(); -- 2.17.0