* [v1] e1000e: EEH on e1000e adapter detects io perm failure can trigger crash @ 2019-10-03 16:54 David Dai 2019-10-03 17:39 ` Alexander Duyck 0 siblings, 1 reply; 18+ messages in thread From: David Dai @ 2019-10-03 16:54 UTC (permalink / raw) To: jeffrey.t.kirsher, davem Cc: intel-wired-lan, netdev, linux-kernel, zdai, zdai We see the behavior when EEH e1000e adapter detects io permanent failure, it will crash kernel with this stack: EEH: Beginning: 'error_detected(permanent failure)' EEH: PE#900000 (PCI 0115:90:00.1): Invoking e1000e->error_detected(permanent failure) EEH: PE#900000 (PCI 0115:90:00.1): e1000e driver reports: 'disconnect' EEH: PE#900000 (PCI 0115:90:00.0): Invoking e1000e->error_detected(permanent failure) EEH: PE#900000 (PCI 0115:90:00.0): e1000e driver reports: 'disconnect' EEH: Finished:'error_detected(permanent failure)' Oops: Exception in kernel mode, sig: 5 [#1] NIP [c0000000007b1be0] free_msi_irqs+0xa0/0x280 LR [c0000000007b1bd0] free_msi_irqs+0x90/0x280 Call Trace: [c0000004f491ba10] [c0000000007b1bd0] free_msi_irqs+0x90/0x280 (unreliable) [c0000004f491ba70] [c0000000007b260c] pci_disable_msi+0x13c/0x180 [c0000004f491bab0] [d0000000046381ac] e1000_remove+0x234/0x2a0 [e1000e] [c0000004f491baf0] [c000000000783cec] pci_device_remove+0x6c/0x120 [c0000004f491bb30] [c00000000088da6c] device_release_driver_internal+0x2bc/0x3f0 [c0000004f491bb80] [c00000000076f5a8] pci_stop_and_remove_bus_device+0xb8/0x110 [c0000004f491bbc0] [c00000000006e890] pci_hp_remove_devices+0x90/0x130 [c0000004f491bc50] [c00000000004ad34] eeh_handle_normal_event+0x1d4/0x660 [c0000004f491bd10] [c00000000004bf10] eeh_event_handler+0x1c0/0x1e0 [c0000004f491bdc0] [c00000000017c4ac] kthread+0x1ac/0x1c0 [c0000004f491be30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80 Basically the e1000e irqs haven't been freed at the time eeh is trying to remove the the e1000e device. Need to make sure when e1000e_close is called to bring down the NIC, if adapter error_state is pci_channel_io_perm_failure, it should also bring down the link and free irqs. Reported-by: Morumuri Srivalli <smorumu1@in.ibm.com> Signed-off-by: David Dai <zdai@linux.vnet.ibm.com> --- drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index d7d56e4..cf618e1 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -4715,7 +4715,8 @@ int e1000e_close(struct net_device *netdev) pm_runtime_get_sync(&pdev->dev); - if (!test_bit(__E1000_DOWN, &adapter->state)) { + if (!test_bit(__E1000_DOWN, &adapter->state) || + (adapter->pdev->error_state == pci_channel_io_perm_failure)) { e1000e_down(adapter, true); e1000_free_irq(adapter); -- 1.7.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [v1] e1000e: EEH on e1000e adapter detects io perm failure can trigger crash 2019-10-03 16:54 [v1] e1000e: EEH on e1000e adapter detects io perm failure can trigger crash David Dai @ 2019-10-03 17:39 ` Alexander Duyck 2019-10-03 18:50 ` David Z. Dai 0 siblings, 1 reply; 18+ messages in thread From: Alexander Duyck @ 2019-10-03 17:39 UTC (permalink / raw) To: David Dai; +Cc: Jeff Kirsher, David Miller, intel-wired-lan, Netdev, LKML, zdai On Thu, Oct 3, 2019 at 9:59 AM David Dai <zdai@linux.vnet.ibm.com> wrote: > > We see the behavior when EEH e1000e adapter detects io permanent failure, > it will crash kernel with this stack: > EEH: Beginning: 'error_detected(permanent failure)' > EEH: PE#900000 (PCI 0115:90:00.1): Invoking e1000e->error_detected(permanent failure) > EEH: PE#900000 (PCI 0115:90:00.1): e1000e driver reports: 'disconnect' > EEH: PE#900000 (PCI 0115:90:00.0): Invoking e1000e->error_detected(permanent failure) > EEH: PE#900000 (PCI 0115:90:00.0): e1000e driver reports: 'disconnect' > EEH: Finished:'error_detected(permanent failure)' > Oops: Exception in kernel mode, sig: 5 [#1] > NIP [c0000000007b1be0] free_msi_irqs+0xa0/0x280 > LR [c0000000007b1bd0] free_msi_irqs+0x90/0x280 > Call Trace: > [c0000004f491ba10] [c0000000007b1bd0] free_msi_irqs+0x90/0x280 (unreliable) > [c0000004f491ba70] [c0000000007b260c] pci_disable_msi+0x13c/0x180 > [c0000004f491bab0] [d0000000046381ac] e1000_remove+0x234/0x2a0 [e1000e] > [c0000004f491baf0] [c000000000783cec] pci_device_remove+0x6c/0x120 > [c0000004f491bb30] [c00000000088da6c] device_release_driver_internal+0x2bc/0x3f0 > [c0000004f491bb80] [c00000000076f5a8] pci_stop_and_remove_bus_device+0xb8/0x110 > [c0000004f491bbc0] [c00000000006e890] pci_hp_remove_devices+0x90/0x130 > [c0000004f491bc50] [c00000000004ad34] eeh_handle_normal_event+0x1d4/0x660 > [c0000004f491bd10] [c00000000004bf10] eeh_event_handler+0x1c0/0x1e0 > [c0000004f491bdc0] [c00000000017c4ac] kthread+0x1ac/0x1c0 > [c0000004f491be30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80 > > Basically the e1000e irqs haven't been freed at the time eeh is trying to > remove the the e1000e device. > Need to make sure when e1000e_close is called to bring down the NIC, > if adapter error_state is pci_channel_io_perm_failure, it should also > bring down the link and free irqs. > > Reported-by: Morumuri Srivalli <smorumu1@in.ibm.com> > Signed-off-by: David Dai <zdai@linux.vnet.ibm.com> > --- > drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- > 1 files changed, 2 insertions(+), 1 deletions(-) > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > index d7d56e4..cf618e1 100644 > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > @@ -4715,7 +4715,8 @@ int e1000e_close(struct net_device *netdev) > > pm_runtime_get_sync(&pdev->dev); > > - if (!test_bit(__E1000_DOWN, &adapter->state)) { > + if (!test_bit(__E1000_DOWN, &adapter->state) || > + (adapter->pdev->error_state == pci_channel_io_perm_failure)) { > e1000e_down(adapter, true); > e1000_free_irq(adapter); It seems like the issue is the fact that e1000_io_error_detected is calling e1000e_down without the e1000_free_irq() bit. Instead of doing this couldn't you simply add the following to e1000_is_slot_reset in the "result = PCI_ERS_RESULT_DISCONNECT" case: if (netif_running(netdev) e1000_free_irq(adapter); Alternatively we could look at freeing and reallocating the IRQs in the event of an error like we do for the e1000e_pm_freeze and e1000e_pm_thaw cases. That might make more sense since we are dealing with an error we might want to free and reallocate the IRQ resources assigned to the device. Thanks. - Alex ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [v1] e1000e: EEH on e1000e adapter detects io perm failure can trigger crash 2019-10-03 17:39 ` Alexander Duyck @ 2019-10-03 18:50 ` David Z. Dai 2019-10-03 20:39 ` Alexander Duyck 0 siblings, 1 reply; 18+ messages in thread From: David Z. Dai @ 2019-10-03 18:50 UTC (permalink / raw) To: Alexander Duyck Cc: Jeff Kirsher, David Miller, intel-wired-lan, Netdev, LKML, zdai On Thu, 2019-10-03 at 10:39 -0700, Alexander Duyck wrote: > On Thu, Oct 3, 2019 at 9:59 AM David Dai <zdai@linux.vnet.ibm.com> wrote: > > > > We see the behavior when EEH e1000e adapter detects io permanent failure, > > it will crash kernel with this stack: > > EEH: Beginning: 'error_detected(permanent failure)' > > EEH: PE#900000 (PCI 0115:90:00.1): Invoking e1000e->error_detected(permanent failure) > > EEH: PE#900000 (PCI 0115:90:00.1): e1000e driver reports: 'disconnect' > > EEH: PE#900000 (PCI 0115:90:00.0): Invoking e1000e->error_detected(permanent failure) > > EEH: PE#900000 (PCI 0115:90:00.0): e1000e driver reports: 'disconnect' > > EEH: Finished:'error_detected(permanent failure)' > > Oops: Exception in kernel mode, sig: 5 [#1] > > NIP [c0000000007b1be0] free_msi_irqs+0xa0/0x280 > > LR [c0000000007b1bd0] free_msi_irqs+0x90/0x280 > > Call Trace: > > [c0000004f491ba10] [c0000000007b1bd0] free_msi_irqs+0x90/0x280 (unreliable) > > [c0000004f491ba70] [c0000000007b260c] pci_disable_msi+0x13c/0x180 > > [c0000004f491bab0] [d0000000046381ac] e1000_remove+0x234/0x2a0 [e1000e] > > [c0000004f491baf0] [c000000000783cec] pci_device_remove+0x6c/0x120 > > [c0000004f491bb30] [c00000000088da6c] device_release_driver_internal+0x2bc/0x3f0 > > [c0000004f491bb80] [c00000000076f5a8] pci_stop_and_remove_bus_device+0xb8/0x110 > > [c0000004f491bbc0] [c00000000006e890] pci_hp_remove_devices+0x90/0x130 > > [c0000004f491bc50] [c00000000004ad34] eeh_handle_normal_event+0x1d4/0x660 > > [c0000004f491bd10] [c00000000004bf10] eeh_event_handler+0x1c0/0x1e0 > > [c0000004f491bdc0] [c00000000017c4ac] kthread+0x1ac/0x1c0 > > [c0000004f491be30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80 > > > > Basically the e1000e irqs haven't been freed at the time eeh is trying to > > remove the the e1000e device. > > Need to make sure when e1000e_close is called to bring down the NIC, > > if adapter error_state is pci_channel_io_perm_failure, it should also > > bring down the link and free irqs. > > > > Reported-by: Morumuri Srivalli <smorumu1@in.ibm.com> > > Signed-off-by: David Dai <zdai@linux.vnet.ibm.com> > > --- > > drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- > > 1 files changed, 2 insertions(+), 1 deletions(-) > > > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > > index d7d56e4..cf618e1 100644 > > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > > @@ -4715,7 +4715,8 @@ int e1000e_close(struct net_device *netdev) > > > > pm_runtime_get_sync(&pdev->dev); > > > > - if (!test_bit(__E1000_DOWN, &adapter->state)) { > > + if (!test_bit(__E1000_DOWN, &adapter->state) || > > + (adapter->pdev->error_state == pci_channel_io_perm_failure)) { > > e1000e_down(adapter, true); > > e1000_free_irq(adapter); > > It seems like the issue is the fact that e1000_io_error_detected is > calling e1000e_down without the e1000_free_irq() bit. Instead of doing > this couldn't you simply add the following to e1000_is_slot_reset in > the "result = PCI_ERS_RESULT_DISCONNECT" case: > if (netif_running(netdev) > e1000_free_irq(adapter); > > Alternatively we could look at freeing and reallocating the IRQs in > the event of an error like we do for the e1000e_pm_freeze and > e1000e_pm_thaw cases. That might make more sense since we are dealing > with an error we might want to free and reallocate the IRQ resources > assigned to the device. > > Thanks. > > - Alex Thanks for the quick reply and comment! Looked the e1000_io_slot_reset() routine: err = pci_enable_device_mem(pdev); if (err) { dev_err(&pdev->dev, "Cannot re-enable PCI device after reset.\n"); result = PCI_ERS_RESULT_DISCONNECT; } else { I didn't see log message "Cannot re-enable PCI device after reset" at the time of crash. I can still apply the same logic in e1000_io_error_detected() routine: if (state == pci_channel_io_perm_failure) { + if (netif_running(netdev)) + e1000_free_irq(adapter); return PCI_ERS_RESULT_DISCONNECT; } Will test this once the test hardware is available again. Thanks! - David ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [v1] e1000e: EEH on e1000e adapter detects io perm failure can trigger crash 2019-10-03 18:50 ` David Z. Dai @ 2019-10-03 20:39 ` Alexander Duyck 2019-10-04 0:02 ` David Z. Dai 0 siblings, 1 reply; 18+ messages in thread From: Alexander Duyck @ 2019-10-03 20:39 UTC (permalink / raw) To: David Z. Dai Cc: Jeff Kirsher, David Miller, intel-wired-lan, Netdev, LKML, zdai On Thu, Oct 3, 2019 at 11:51 AM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > On Thu, 2019-10-03 at 10:39 -0700, Alexander Duyck wrote: > > On Thu, Oct 3, 2019 at 9:59 AM David Dai <zdai@linux.vnet.ibm.com> wrote: > > > > > > We see the behavior when EEH e1000e adapter detects io permanent failure, > > > it will crash kernel with this stack: > > > EEH: Beginning: 'error_detected(permanent failure)' > > > EEH: PE#900000 (PCI 0115:90:00.1): Invoking e1000e->error_detected(permanent failure) > > > EEH: PE#900000 (PCI 0115:90:00.1): e1000e driver reports: 'disconnect' > > > EEH: PE#900000 (PCI 0115:90:00.0): Invoking e1000e->error_detected(permanent failure) > > > EEH: PE#900000 (PCI 0115:90:00.0): e1000e driver reports: 'disconnect' > > > EEH: Finished:'error_detected(permanent failure)' > > > Oops: Exception in kernel mode, sig: 5 [#1] > > > NIP [c0000000007b1be0] free_msi_irqs+0xa0/0x280 > > > LR [c0000000007b1bd0] free_msi_irqs+0x90/0x280 > > > Call Trace: > > > [c0000004f491ba10] [c0000000007b1bd0] free_msi_irqs+0x90/0x280 (unreliable) > > > [c0000004f491ba70] [c0000000007b260c] pci_disable_msi+0x13c/0x180 > > > [c0000004f491bab0] [d0000000046381ac] e1000_remove+0x234/0x2a0 [e1000e] > > > [c0000004f491baf0] [c000000000783cec] pci_device_remove+0x6c/0x120 > > > [c0000004f491bb30] [c00000000088da6c] device_release_driver_internal+0x2bc/0x3f0 > > > [c0000004f491bb80] [c00000000076f5a8] pci_stop_and_remove_bus_device+0xb8/0x110 > > > [c0000004f491bbc0] [c00000000006e890] pci_hp_remove_devices+0x90/0x130 > > > [c0000004f491bc50] [c00000000004ad34] eeh_handle_normal_event+0x1d4/0x660 > > > [c0000004f491bd10] [c00000000004bf10] eeh_event_handler+0x1c0/0x1e0 > > > [c0000004f491bdc0] [c00000000017c4ac] kthread+0x1ac/0x1c0 > > > [c0000004f491be30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80 > > > > > > Basically the e1000e irqs haven't been freed at the time eeh is trying to > > > remove the the e1000e device. > > > Need to make sure when e1000e_close is called to bring down the NIC, > > > if adapter error_state is pci_channel_io_perm_failure, it should also > > > bring down the link and free irqs. > > > > > > Reported-by: Morumuri Srivalli <smorumu1@in.ibm.com> > > > Signed-off-by: David Dai <zdai@linux.vnet.ibm.com> > > > --- > > > drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- > > > 1 files changed, 2 insertions(+), 1 deletions(-) > > > > > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > > > index d7d56e4..cf618e1 100644 > > > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > > > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > > > @@ -4715,7 +4715,8 @@ int e1000e_close(struct net_device *netdev) > > > > > > pm_runtime_get_sync(&pdev->dev); > > > > > > - if (!test_bit(__E1000_DOWN, &adapter->state)) { > > > + if (!test_bit(__E1000_DOWN, &adapter->state) || > > > + (adapter->pdev->error_state == pci_channel_io_perm_failure)) { > > > e1000e_down(adapter, true); > > > e1000_free_irq(adapter); > > > > It seems like the issue is the fact that e1000_io_error_detected is > > calling e1000e_down without the e1000_free_irq() bit. Instead of doing > > this couldn't you simply add the following to e1000_is_slot_reset in > > the "result = PCI_ERS_RESULT_DISCONNECT" case: > > if (netif_running(netdev) > > e1000_free_irq(adapter); > > > > Alternatively we could look at freeing and reallocating the IRQs in > > the event of an error like we do for the e1000e_pm_freeze and > > e1000e_pm_thaw cases. That might make more sense since we are dealing > > with an error we might want to free and reallocate the IRQ resources > > assigned to the device. > > > > Thanks. > > > > - Alex > > Thanks for the quick reply and comment! > Looked the e1000_io_slot_reset() routine: > err = pci_enable_device_mem(pdev); > if (err) { > dev_err(&pdev->dev, > "Cannot re-enable PCI device after reset.\n"); > result = PCI_ERS_RESULT_DISCONNECT; > } else { > I didn't see log message "Cannot re-enable PCI device after reset" at > the time of crash. > > I can still apply the same logic in e1000_io_error_detected() routine: > if (state == pci_channel_io_perm_failure) { > + if (netif_running(netdev)) > + e1000_free_irq(adapter); > return PCI_ERS_RESULT_DISCONNECT; > } > Will test this once the test hardware is available again. Are you sure this is the path you are hitting? Things aren't adding up. I thought the issue was that the interface for the error handling was calling e1000e_down() but not freeing the IRQs? In the path where you are adding your code I don't see how the __E1000_DOWN would have been set? - Alex ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [v1] e1000e: EEH on e1000e adapter detects io perm failure can trigger crash 2019-10-03 20:39 ` Alexander Duyck @ 2019-10-04 0:02 ` David Z. Dai 2019-10-04 14:35 ` Alexander Duyck 0 siblings, 1 reply; 18+ messages in thread From: David Z. Dai @ 2019-10-04 0:02 UTC (permalink / raw) To: Alexander Duyck Cc: Jeff Kirsher, David Miller, intel-wired-lan, Netdev, LKML, zdai On Thu, 2019-10-03 at 13:39 -0700, Alexander Duyck wrote: > On Thu, Oct 3, 2019 at 11:51 AM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > > > On Thu, 2019-10-03 at 10:39 -0700, Alexander Duyck wrote: > > > On Thu, Oct 3, 2019 at 9:59 AM David Dai <zdai@linux.vnet.ibm.com> wrote: > > > > > > > > We see the behavior when EEH e1000e adapter detects io permanent failure, > > > > it will crash kernel with this stack: > > > > EEH: Beginning: 'error_detected(permanent failure)' > > > > EEH: PE#900000 (PCI 0115:90:00.1): Invoking e1000e->error_detected(permanent failure) > > > > EEH: PE#900000 (PCI 0115:90:00.1): e1000e driver reports: 'disconnect' > > > > EEH: PE#900000 (PCI 0115:90:00.0): Invoking e1000e->error_detected(permanent failure) > > > > EEH: PE#900000 (PCI 0115:90:00.0): e1000e driver reports: 'disconnect' > > > > EEH: Finished:'error_detected(permanent failure)' > > > > Oops: Exception in kernel mode, sig: 5 [#1] > > > > NIP [c0000000007b1be0] free_msi_irqs+0xa0/0x280 > > > > LR [c0000000007b1bd0] free_msi_irqs+0x90/0x280 > > > > Call Trace: > > > > [c0000004f491ba10] [c0000000007b1bd0] free_msi_irqs+0x90/0x280 (unreliable) > > > > [c0000004f491ba70] [c0000000007b260c] pci_disable_msi+0x13c/0x180 > > > > [c0000004f491bab0] [d0000000046381ac] e1000_remove+0x234/0x2a0 [e1000e] > > > > [c0000004f491baf0] [c000000000783cec] pci_device_remove+0x6c/0x120 > > > > [c0000004f491bb30] [c00000000088da6c] device_release_driver_internal+0x2bc/0x3f0 > > > > [c0000004f491bb80] [c00000000076f5a8] pci_stop_and_remove_bus_device+0xb8/0x110 > > > > [c0000004f491bbc0] [c00000000006e890] pci_hp_remove_devices+0x90/0x130 > > > > [c0000004f491bc50] [c00000000004ad34] eeh_handle_normal_event+0x1d4/0x660 > > > > [c0000004f491bd10] [c00000000004bf10] eeh_event_handler+0x1c0/0x1e0 > > > > [c0000004f491bdc0] [c00000000017c4ac] kthread+0x1ac/0x1c0 > > > > [c0000004f491be30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80 > > > > > > > > Basically the e1000e irqs haven't been freed at the time eeh is trying to > > > > remove the the e1000e device. > > > > Need to make sure when e1000e_close is called to bring down the NIC, > > > > if adapter error_state is pci_channel_io_perm_failure, it should also > > > > bring down the link and free irqs. > > > > > > > > Reported-by: Morumuri Srivalli <smorumu1@in.ibm.com> > > > > Signed-off-by: David Dai <zdai@linux.vnet.ibm.com> > > > > --- > > > > drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- > > > > 1 files changed, 2 insertions(+), 1 deletions(-) > > > > > > > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > > > > index d7d56e4..cf618e1 100644 > > > > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > > > > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > > > > @@ -4715,7 +4715,8 @@ int e1000e_close(struct net_device *netdev) > > > > > > > > pm_runtime_get_sync(&pdev->dev); > > > > > > > > - if (!test_bit(__E1000_DOWN, &adapter->state)) { > > > > + if (!test_bit(__E1000_DOWN, &adapter->state) || > > > > + (adapter->pdev->error_state == pci_channel_io_perm_failure)) { > > > > e1000e_down(adapter, true); > > > > e1000_free_irq(adapter); > > > > > > It seems like the issue is the fact that e1000_io_error_detected is > > > calling e1000e_down without the e1000_free_irq() bit. Instead of doing > > > this couldn't you simply add the following to e1000_is_slot_reset in > > > the "result = PCI_ERS_RESULT_DISCONNECT" case: > > > if (netif_running(netdev) > > > e1000_free_irq(adapter); > > > > > > Alternatively we could look at freeing and reallocating the IRQs in > > > the event of an error like we do for the e1000e_pm_freeze and > > > e1000e_pm_thaw cases. That might make more sense since we are dealing > > > with an error we might want to free and reallocate the IRQ resources > > > assigned to the device. > > > > > > Thanks. > > > > > > - Alex > > > > Thanks for the quick reply and comment! > > Looked the e1000_io_slot_reset() routine: > > err = pci_enable_device_mem(pdev); > > if (err) { > > dev_err(&pdev->dev, > > "Cannot re-enable PCI device after reset.\n"); > > result = PCI_ERS_RESULT_DISCONNECT; > > } else { > > I didn't see log message "Cannot re-enable PCI device after reset" at > > the time of crash. > > > > I can still apply the same logic in e1000_io_error_detected() routine: > > if (state == pci_channel_io_perm_failure) { > > + if (netif_running(netdev)) > > + e1000_free_irq(adapter); > > return PCI_ERS_RESULT_DISCONNECT; > > } > > Will test this once the test hardware is available again. > > Are you sure this is the path you are hitting? Things aren't adding up. > > I thought the issue was that the interface for the error handling was > calling e1000e_down() but not freeing the IRQs? In the path where you > are adding your code I don't see how the __E1000_DOWN would have been > set? > > - Alex We see the same stack every time the crash is triggered. My understanding is not that the interface for the error handling was calling e1000e_down() but not freeing IRQs. In our case, on powerpc , if injecting eeh errors to reach preset threshold value, it will be forced to be offline permanently. In e1000e_close() to bring down link, the check: "if (! test_bit(__E1000_DOWN, &adapter->state))" is false, so e1000e_down() and e1000_free_irq() are both not called. IRQs are not freed. When e1000_remove() is called, it sees IRQs are not free, hence crash the kernel. This is the reason I have the original proposed patch to add an extra check in e1000e_close(). For the 2nd change in e1000_io_error_detected() routine, I haven't tested it yet. Pardon me if causing any confusion, and Thanks for your time again! - David ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [v1] e1000e: EEH on e1000e adapter detects io perm failure can trigger crash 2019-10-04 0:02 ` David Z. Dai @ 2019-10-04 14:35 ` Alexander Duyck 2019-10-04 17:04 ` David Z. Dai 0 siblings, 1 reply; 18+ messages in thread From: Alexander Duyck @ 2019-10-04 14:35 UTC (permalink / raw) To: David Z. Dai Cc: Jeff Kirsher, David Miller, intel-wired-lan, Netdev, LKML, zdai On Thu, Oct 3, 2019 at 5:02 PM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > On Thu, 2019-10-03 at 13:39 -0700, Alexander Duyck wrote: > > On Thu, Oct 3, 2019 at 11:51 AM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > > > > > On Thu, 2019-10-03 at 10:39 -0700, Alexander Duyck wrote: > > > > On Thu, Oct 3, 2019 at 9:59 AM David Dai <zdai@linux.vnet.ibm.com> wrote: > > > > > > > > > > We see the behavior when EEH e1000e adapter detects io permanent failure, > > > > > it will crash kernel with this stack: > > > > > EEH: Beginning: 'error_detected(permanent failure)' > > > > > EEH: PE#900000 (PCI 0115:90:00.1): Invoking e1000e->error_detected(permanent failure) > > > > > EEH: PE#900000 (PCI 0115:90:00.1): e1000e driver reports: 'disconnect' > > > > > EEH: PE#900000 (PCI 0115:90:00.0): Invoking e1000e->error_detected(permanent failure) > > > > > EEH: PE#900000 (PCI 0115:90:00.0): e1000e driver reports: 'disconnect' > > > > > EEH: Finished:'error_detected(permanent failure)' > > > > > Oops: Exception in kernel mode, sig: 5 [#1] > > > > > NIP [c0000000007b1be0] free_msi_irqs+0xa0/0x280 > > > > > LR [c0000000007b1bd0] free_msi_irqs+0x90/0x280 > > > > > Call Trace: > > > > > [c0000004f491ba10] [c0000000007b1bd0] free_msi_irqs+0x90/0x280 (unreliable) > > > > > [c0000004f491ba70] [c0000000007b260c] pci_disable_msi+0x13c/0x180 > > > > > [c0000004f491bab0] [d0000000046381ac] e1000_remove+0x234/0x2a0 [e1000e] > > > > > [c0000004f491baf0] [c000000000783cec] pci_device_remove+0x6c/0x120 > > > > > [c0000004f491bb30] [c00000000088da6c] device_release_driver_internal+0x2bc/0x3f0 > > > > > [c0000004f491bb80] [c00000000076f5a8] pci_stop_and_remove_bus_device+0xb8/0x110 > > > > > [c0000004f491bbc0] [c00000000006e890] pci_hp_remove_devices+0x90/0x130 > > > > > [c0000004f491bc50] [c00000000004ad34] eeh_handle_normal_event+0x1d4/0x660 > > > > > [c0000004f491bd10] [c00000000004bf10] eeh_event_handler+0x1c0/0x1e0 > > > > > [c0000004f491bdc0] [c00000000017c4ac] kthread+0x1ac/0x1c0 > > > > > [c0000004f491be30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80 > > > > > > > > > > Basically the e1000e irqs haven't been freed at the time eeh is trying to > > > > > remove the the e1000e device. > > > > > Need to make sure when e1000e_close is called to bring down the NIC, > > > > > if adapter error_state is pci_channel_io_perm_failure, it should also > > > > > bring down the link and free irqs. > > > > > > > > > > Reported-by: Morumuri Srivalli <smorumu1@in.ibm.com> > > > > > Signed-off-by: David Dai <zdai@linux.vnet.ibm.com> > > > > > --- > > > > > drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- > > > > > 1 files changed, 2 insertions(+), 1 deletions(-) > > > > > > > > > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > > > > > index d7d56e4..cf618e1 100644 > > > > > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > > > > > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > > > > > @@ -4715,7 +4715,8 @@ int e1000e_close(struct net_device *netdev) > > > > > > > > > > pm_runtime_get_sync(&pdev->dev); > > > > > > > > > > - if (!test_bit(__E1000_DOWN, &adapter->state)) { > > > > > + if (!test_bit(__E1000_DOWN, &adapter->state) || > > > > > + (adapter->pdev->error_state == pci_channel_io_perm_failure)) { > > > > > e1000e_down(adapter, true); > > > > > e1000_free_irq(adapter); > > > > > > > > It seems like the issue is the fact that e1000_io_error_detected is > > > > calling e1000e_down without the e1000_free_irq() bit. Instead of doing > > > > this couldn't you simply add the following to e1000_is_slot_reset in > > > > the "result = PCI_ERS_RESULT_DISCONNECT" case: > > > > if (netif_running(netdev) > > > > e1000_free_irq(adapter); > > > > > > > > Alternatively we could look at freeing and reallocating the IRQs in > > > > the event of an error like we do for the e1000e_pm_freeze and > > > > e1000e_pm_thaw cases. That might make more sense since we are dealing > > > > with an error we might want to free and reallocate the IRQ resources > > > > assigned to the device. > > > > > > > > Thanks. > > > > > > > > - Alex > > > > > > Thanks for the quick reply and comment! > > > Looked the e1000_io_slot_reset() routine: > > > err = pci_enable_device_mem(pdev); > > > if (err) { > > > dev_err(&pdev->dev, > > > "Cannot re-enable PCI device after reset.\n"); > > > result = PCI_ERS_RESULT_DISCONNECT; > > > } else { > > > I didn't see log message "Cannot re-enable PCI device after reset" at > > > the time of crash. > > > > > > I can still apply the same logic in e1000_io_error_detected() routine: > > > if (state == pci_channel_io_perm_failure) { > > > + if (netif_running(netdev)) > > > + e1000_free_irq(adapter); > > > return PCI_ERS_RESULT_DISCONNECT; > > > } > > > Will test this once the test hardware is available again. > > > > Are you sure this is the path you are hitting? Things aren't adding up. > > > > I thought the issue was that the interface for the error handling was > > calling e1000e_down() but not freeing the IRQs? In the path where you > > are adding your code I don't see how the __E1000_DOWN would have been > > set? > > > > - Alex > We see the same stack every time the crash is triggered. > > My understanding is not that the interface for the error handling was > calling e1000e_down() but not freeing IRQs. That is my understanding as well. However where you talked about adding the code will end up being before we call e1000e_down() won't it? > In our case, on powerpc , if injecting eeh errors to reach preset > threshold value, it will be forced to be offline permanently. > > In e1000e_close() to bring down link, the check: "if (! > test_bit(__E1000_DOWN, &adapter->state))" is false, so e1000e_down() and > e1000_free_irq() are both not called. IRQs are not freed. My concern is mainly that we don't want to mess up the ordering of things or perform the same action multiple times, or do things out-of-order. > When e1000_remove() is called, it sees IRQs are not free, hence crash > the kernel. > > This is the reason I have the original proposed patch to add an extra > check in e1000e_close(). I get that, however the way you said you were going to change things doesn't match up with that. You are freeing the IRQs without first bringing down the interface. > For the 2nd change in e1000_io_error_detected() routine, I haven't > tested it yet. > > Pardon me if causing any confusion, and Thanks for your time again! > > - David I need to take a look at a couple things. I am not sure why the e1000e_close is even checking for the __E1000_DOWN bit. From what I can tell the other drivers are just calling e1000_down() directly without the check so I am not sure if something was added that makes it so that we have to be careful about calling e1000_down more than once or not. If not we could probably just pull that check and simplify all of this. Thanks. - Alex ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [v1] e1000e: EEH on e1000e adapter detects io perm failure can trigger crash 2019-10-04 14:35 ` Alexander Duyck @ 2019-10-04 17:04 ` David Z. Dai 2019-10-04 23:36 ` [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm Alexander Duyck 0 siblings, 1 reply; 18+ messages in thread From: David Z. Dai @ 2019-10-04 17:04 UTC (permalink / raw) To: Alexander Duyck Cc: Jeff Kirsher, David Miller, intel-wired-lan, Netdev, LKML, zdai On Fri, 2019-10-04 at 07:35 -0700, Alexander Duyck wrote: > On Thu, Oct 3, 2019 at 5:02 PM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > > > On Thu, 2019-10-03 at 13:39 -0700, Alexander Duyck wrote: > > > On Thu, Oct 3, 2019 at 11:51 AM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > > > > > > > On Thu, 2019-10-03 at 10:39 -0700, Alexander Duyck wrote: > > > > > On Thu, Oct 3, 2019 at 9:59 AM David Dai <zdai@linux.vnet.ibm.com> wrote: > > > > > > > > > > > > We see the behavior when EEH e1000e adapter detects io permanent failure, > > > > > > it will crash kernel with this stack: > > > > > > EEH: Beginning: 'error_detected(permanent failure)' > > > > > > EEH: PE#900000 (PCI 0115:90:00.1): Invoking e1000e->error_detected(permanent failure) > > > > > > EEH: PE#900000 (PCI 0115:90:00.1): e1000e driver reports: 'disconnect' > > > > > > EEH: PE#900000 (PCI 0115:90:00.0): Invoking e1000e->error_detected(permanent failure) > > > > > > EEH: PE#900000 (PCI 0115:90:00.0): e1000e driver reports: 'disconnect' > > > > > > EEH: Finished:'error_detected(permanent failure)' > > > > > > Oops: Exception in kernel mode, sig: 5 [#1] > > > > > > NIP [c0000000007b1be0] free_msi_irqs+0xa0/0x280 > > > > > > LR [c0000000007b1bd0] free_msi_irqs+0x90/0x280 > > > > > > Call Trace: > > > > > > [c0000004f491ba10] [c0000000007b1bd0] free_msi_irqs+0x90/0x280 (unreliable) > > > > > > [c0000004f491ba70] [c0000000007b260c] pci_disable_msi+0x13c/0x180 > > > > > > [c0000004f491bab0] [d0000000046381ac] e1000_remove+0x234/0x2a0 [e1000e] > > > > > > [c0000004f491baf0] [c000000000783cec] pci_device_remove+0x6c/0x120 > > > > > > [c0000004f491bb30] [c00000000088da6c] device_release_driver_internal+0x2bc/0x3f0 > > > > > > [c0000004f491bb80] [c00000000076f5a8] pci_stop_and_remove_bus_device+0xb8/0x110 > > > > > > [c0000004f491bbc0] [c00000000006e890] pci_hp_remove_devices+0x90/0x130 > > > > > > [c0000004f491bc50] [c00000000004ad34] eeh_handle_normal_event+0x1d4/0x660 > > > > > > [c0000004f491bd10] [c00000000004bf10] eeh_event_handler+0x1c0/0x1e0 > > > > > > [c0000004f491bdc0] [c00000000017c4ac] kthread+0x1ac/0x1c0 > > > > > > [c0000004f491be30] [c00000000000b75c] ret_from_kernel_thread+0x5c/0x80 > > > > > > > > > > > > Basically the e1000e irqs haven't been freed at the time eeh is trying to > > > > > > remove the the e1000e device. > > > > > > Need to make sure when e1000e_close is called to bring down the NIC, > > > > > > if adapter error_state is pci_channel_io_perm_failure, it should also > > > > > > bring down the link and free irqs. > > > > > > > > > > > > Reported-by: Morumuri Srivalli <smorumu1@in.ibm.com> > > > > > > Signed-off-by: David Dai <zdai@linux.vnet.ibm.com> > > > > > > --- > > > > > > drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++- > > > > > > 1 files changed, 2 insertions(+), 1 deletions(-) > > > > > > > > > > > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > > > > > > index d7d56e4..cf618e1 100644 > > > > > > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > > > > > > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > > > > > > @@ -4715,7 +4715,8 @@ int e1000e_close(struct net_device *netdev) > > > > > > > > > > > > pm_runtime_get_sync(&pdev->dev); > > > > > > > > > > > > - if (!test_bit(__E1000_DOWN, &adapter->state)) { > > > > > > + if (!test_bit(__E1000_DOWN, &adapter->state) || > > > > > > + (adapter->pdev->error_state == pci_channel_io_perm_failure)) { > > > > > > e1000e_down(adapter, true); > > > > > > e1000_free_irq(adapter); > > > > > > > > > > It seems like the issue is the fact that e1000_io_error_detected is > > > > > calling e1000e_down without the e1000_free_irq() bit. Instead of doing > > > > > this couldn't you simply add the following to e1000_is_slot_reset in > > > > > the "result = PCI_ERS_RESULT_DISCONNECT" case: > > > > > if (netif_running(netdev) > > > > > e1000_free_irq(adapter); > > > > > > > > > > Alternatively we could look at freeing and reallocating the IRQs in > > > > > the event of an error like we do for the e1000e_pm_freeze and > > > > > e1000e_pm_thaw cases. That might make more sense since we are dealing > > > > > with an error we might want to free and reallocate the IRQ resources > > > > > assigned to the device. > > > > > > > > > > Thanks. > > > > > > > > > > - Alex > > > > > > > > Thanks for the quick reply and comment! > > > > Looked the e1000_io_slot_reset() routine: > > > > err = pci_enable_device_mem(pdev); > > > > if (err) { > > > > dev_err(&pdev->dev, > > > > "Cannot re-enable PCI device after reset.\n"); > > > > result = PCI_ERS_RESULT_DISCONNECT; > > > > } else { > > > > I didn't see log message "Cannot re-enable PCI device after reset" at > > > > the time of crash. > > > > > > > > I can still apply the same logic in e1000_io_error_detected() routine: > > > > if (state == pci_channel_io_perm_failure) { > > > > + if (netif_running(netdev)) > > > > + e1000_free_irq(adapter); > > > > return PCI_ERS_RESULT_DISCONNECT; > > > > } > > > > Will test this once the test hardware is available again. > > > > > > Are you sure this is the path you are hitting? Things aren't adding up. > > > > > > I thought the issue was that the interface for the error handling was > > > calling e1000e_down() but not freeing the IRQs? In the path where you > > > are adding your code I don't see how the __E1000_DOWN would have been > > > set? > > > > > > - Alex > > We see the same stack every time the crash is triggered. > > > > My understanding is not that the interface for the error handling was > > calling e1000e_down() but not freeing IRQs. > > That is my understanding as well. However where you talked about > adding the code will end up being before we call e1000e_down() won't > it? > Agree. It's possible. > > In our case, on powerpc , if injecting eeh errors to reach preset > > threshold value, it will be forced to be offline permanently. > > > > In e1000e_close() to bring down link, the check: "if (! > > test_bit(__E1000_DOWN, &adapter->state))" is false, so e1000e_down() and > > e1000_free_irq() are both not called. IRQs are not freed. > > My concern is mainly that we don't want to mess up the ordering of > things or perform the same action multiple times, or do things > out-of-order. > > > When e1000_remove() is called, it sees IRQs are not free, hence crash > > the kernel. > > > > This is the reason I have the original proposed patch to add an extra > > check in e1000e_close(). > > I get that, however the way you said you were going to change things > doesn't match up with that. You are freeing the IRQs without first > bringing down the interface. > > > For the 2nd change in e1000_io_error_detected() routine, I haven't > > tested it yet. > > > > Pardon me if causing any confusion, and Thanks for your time again! > > > > - David > > I need to take a look at a couple things. I am not sure why the > e1000e_close is even checking for the __E1000_DOWN bit. From what I > can tell the other drivers are just calling e1000_down() directly > without the check so I am not sure if something was added that makes > it so that we have to be careful about calling e1000_down more than > once or not. If not we could probably just pull that check and > simplify all of this. > > Thanks. > > - Alex I noticed this earlier like e1000 doesn't have the __E1000_DOWN bit check in e1000_close(), but e1000e has this bit check in e1000e_close(). If we can pull this check w/o side effect, that will simplify all of this. Thanks! - David ^ permalink raw reply [flat|nested] 18+ messages in thread
* [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-04 17:04 ` David Z. Dai @ 2019-10-04 23:36 ` Alexander Duyck 2019-10-05 2:18 ` David Z. Dai 0 siblings, 1 reply; 18+ messages in thread From: Alexander Duyck @ 2019-10-04 23:36 UTC (permalink / raw) To: zdai Cc: netdev, linux-kernel, alexander.duyck, intel-wired-lan, jeffrey.t.kirsher, zdai, davem From: Alexander Duyck <alexander.h.duyck@linux.intel.com> This patch is meant to address possible race conditions that can exist between network configuration and power management. A similar issue was fixed for igb in commit 9474933caf21 ("igb: close/suspend race in netif_device_detach"). In addition it consolidates the code so that the PCI error handling code will essentially perform the power management freeze on the device prior to attempting a reset, and will thaw the device afterwards if that is what it is planning to do. Otherwise when we call close on the interface it should see it is detached and not attempt to call the logic to down the interface and free the IRQs again. >From what I can tell the check that was adding the check for __E1000_DOWN in e1000e_close was added when runtime power management was added. However it should not be relevant for us as we perform a call to pm_runtime_get_sync before we call e1000_down/free_irq so it should always be back up before we call into this anyway. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> --- I'm putting this out as an RFC for now. I haven't had a chance to do much testing yet, but I have verified no build issues, and the driver appears to load, link, and pass traffic without problems. This should address issues seen with either double freeing or never freeing IRQs that have been seen on this and similar drivers in the past. I'll submit this formally after testing it over the weekend assuming there are no issues. drivers/net/ethernet/intel/e1000e/netdev.c | 33 ++++++++++++++-------------- 1 file changed, 17 insertions(+), 16 deletions(-) diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index d7d56e42a6aa..182a2c8f12d8 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -4715,12 +4715,12 @@ int e1000e_close(struct net_device *netdev) pm_runtime_get_sync(&pdev->dev); - if (!test_bit(__E1000_DOWN, &adapter->state)) { + if (netif_device_present(netdev)) { e1000e_down(adapter, true); e1000_free_irq(adapter); /* Link status message must follow this format */ - pr_info("%s NIC Link is Down\n", adapter->netdev->name); + pr_info("%s NIC Link is Down\n", netdev->name); } napi_disable(&adapter->napi); @@ -6299,6 +6299,7 @@ static int e1000e_pm_freeze(struct device *dev) struct net_device *netdev = dev_get_drvdata(dev); struct e1000_adapter *adapter = netdev_priv(netdev); + rtnl_lock(); netif_device_detach(netdev); if (netif_running(netdev)) { @@ -6313,6 +6314,8 @@ static int e1000e_pm_freeze(struct device *dev) e1000e_down(adapter, false); e1000_free_irq(adapter); } + rtnl_unlock(); + e1000e_reset_interrupt_capability(adapter); /* Allow time for pending master requests to run */ @@ -6626,27 +6629,30 @@ static int __e1000_resume(struct pci_dev *pdev) return 0; } -#ifdef CONFIG_PM_SLEEP static int e1000e_pm_thaw(struct device *dev) { struct net_device *netdev = dev_get_drvdata(dev); struct e1000_adapter *adapter = netdev_priv(netdev); + int rc = 0; e1000e_set_interrupt_capability(adapter); - if (netif_running(netdev)) { - u32 err = e1000_request_irq(adapter); - if (err) - return err; + rtnl_lock(); + if (netif_running(netdev)) { + rc = e1000_request_irq(adapter); + if (rc) + goto err_irq; e1000e_up(adapter); } netif_device_attach(netdev); - - return 0; + rtnl_unlock(); +err_irq: + return rc; } +#ifdef CONFIG_PM_SLEEP static int e1000e_pm_suspend(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); @@ -6821,13 +6827,11 @@ static pci_ers_result_t e1000_io_error_detected(struct pci_dev *pdev, struct net_device *netdev = pci_get_drvdata(pdev); struct e1000_adapter *adapter = netdev_priv(netdev); - netif_device_detach(netdev); + e1000e_pm_freeze(&pdev->dev); if (state == pci_channel_io_perm_failure) return PCI_ERS_RESULT_DISCONNECT; - if (netif_running(netdev)) - e1000e_down(adapter, true); pci_disable_device(pdev); /* Request a slot slot reset. */ @@ -6893,10 +6897,7 @@ static void e1000_io_resume(struct pci_dev *pdev) e1000_init_manageability_pt(adapter); - if (netif_running(netdev)) - e1000e_up(adapter); - - netif_device_attach(netdev); + e1000e_pm_thaw(&pdev->dev); /* If the controller has AMT, do not set DRV_LOAD until the interface * is up. For all other cases, let the f/w know that the h/w is now ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-04 23:36 ` [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm Alexander Duyck @ 2019-10-05 2:18 ` David Z. Dai 2019-10-05 17:22 ` Alexander Duyck 0 siblings, 1 reply; 18+ messages in thread From: David Z. Dai @ 2019-10-05 2:18 UTC (permalink / raw) To: Alexander Duyck Cc: netdev, linux-kernel, intel-wired-lan, jeffrey.t.kirsher, zdai, davem On Fri, 2019-10-04 at 16:36 -0700, Alexander Duyck wrote: > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > This patch is meant to address possible race conditions that can exist > between network configuration and power management. A similar issue was > fixed for igb in commit 9474933caf21 ("igb: close/suspend race in > netif_device_detach"). > > In addition it consolidates the code so that the PCI error handling code > will essentially perform the power management freeze on the device prior to > attempting a reset, and will thaw the device afterwards if that is what it > is planning to do. Otherwise when we call close on the interface it should > see it is detached and not attempt to call the logic to down the interface > and free the IRQs again. > > >From what I can tell the check that was adding the check for __E1000_DOWN > in e1000e_close was added when runtime power management was added. However > it should not be relevant for us as we perform a call to > pm_runtime_get_sync before we call e1000_down/free_irq so it should always > be back up before we call into this anyway. > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> > --- > > I'm putting this out as an RFC for now. I haven't had a chance to do much > testing yet, but I have verified no build issues, and the driver appears > to load, link, and pass traffic without problems. > > This should address issues seen with either double freeing or never freeing > IRQs that have been seen on this and similar drivers in the past. > > I'll submit this formally after testing it over the weekend assuming there > are no issues. > > drivers/net/ethernet/intel/e1000e/netdev.c | 33 ++++++++++++++-------------- > 1 file changed, 17 insertions(+), 16 deletions(-) > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > index d7d56e42a6aa..182a2c8f12d8 100644 > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > @@ -4715,12 +4715,12 @@ int e1000e_close(struct net_device *netdev) > > pm_runtime_get_sync(&pdev->dev); > > - if (!test_bit(__E1000_DOWN, &adapter->state)) { > + if (netif_device_present(netdev)) { > e1000e_down(adapter, true); > e1000_free_irq(adapter); > > /* Link status message must follow this format */ > - pr_info("%s NIC Link is Down\n", adapter->netdev->name); > + pr_info("%s NIC Link is Down\n", netdev->name); > } > > napi_disable(&adapter->napi); > @@ -6299,6 +6299,7 @@ static int e1000e_pm_freeze(struct device *dev) > struct net_device *netdev = dev_get_drvdata(dev); > struct e1000_adapter *adapter = netdev_priv(netdev); > > + rtnl_lock(); > netif_device_detach(netdev); > > if (netif_running(netdev)) { > @@ -6313,6 +6314,8 @@ static int e1000e_pm_freeze(struct device *dev) > e1000e_down(adapter, false); > e1000_free_irq(adapter); > } > + rtnl_unlock(); > + > e1000e_reset_interrupt_capability(adapter); > > /* Allow time for pending master requests to run */ > @@ -6626,27 +6629,30 @@ static int __e1000_resume(struct pci_dev *pdev) > return 0; > } > > -#ifdef CONFIG_PM_SLEEP > static int e1000e_pm_thaw(struct device *dev) > { > struct net_device *netdev = dev_get_drvdata(dev); > struct e1000_adapter *adapter = netdev_priv(netdev); > + int rc = 0; > > e1000e_set_interrupt_capability(adapter); > - if (netif_running(netdev)) { > - u32 err = e1000_request_irq(adapter); > > - if (err) > - return err; > + rtnl_lock(); > + if (netif_running(netdev)) { > + rc = e1000_request_irq(adapter); > + if (rc) > + goto err_irq; > > e1000e_up(adapter); > } > > netif_device_attach(netdev); > - > - return 0; > + rtnl_unlock(); > +err_irq: > + return rc; > } > > +#ifdef CONFIG_PM_SLEEP > static int e1000e_pm_suspend(struct device *dev) > { > struct pci_dev *pdev = to_pci_dev(dev); > @@ -6821,13 +6827,11 @@ static pci_ers_result_t e1000_io_error_detected(struct pci_dev *pdev, > struct net_device *netdev = pci_get_drvdata(pdev); > struct e1000_adapter *adapter = netdev_priv(netdev); > > - netif_device_detach(netdev); > + e1000e_pm_freeze(&pdev->dev); > > if (state == pci_channel_io_perm_failure) > return PCI_ERS_RESULT_DISCONNECT; > > - if (netif_running(netdev)) > - e1000e_down(adapter, true); > pci_disable_device(pdev); > > /* Request a slot slot reset. */ > @@ -6893,10 +6897,7 @@ static void e1000_io_resume(struct pci_dev *pdev) > > e1000_init_manageability_pt(adapter); > > - if (netif_running(netdev)) > - e1000e_up(adapter); > - > - netif_device_attach(netdev); > + e1000e_pm_thaw(&pdev->dev); > > /* If the controller has AMT, do not set DRV_LOAD until the interface > * is up. For all other cases, let the f/w know that the h/w is now In e1000e_pm_thaw(), these 2 lines need to switch order to avoid deadlock. from: + rtnl_unlock(); +err_irq: to: +err_irq: + rtnl_unlock(); I will find hardware to test this patch next week. Will update the test result later. Thanks! - David ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-05 2:18 ` David Z. Dai @ 2019-10-05 17:22 ` Alexander Duyck 2019-10-07 15:50 ` David Z. Dai 0 siblings, 1 reply; 18+ messages in thread From: Alexander Duyck @ 2019-10-05 17:22 UTC (permalink / raw) To: David Z. Dai Cc: Netdev, LKML, intel-wired-lan, Jeff Kirsher, zdai, David Miller On Fri, Oct 4, 2019 at 7:18 PM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > On Fri, 2019-10-04 at 16:36 -0700, Alexander Duyck wrote: > > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > > > This patch is meant to address possible race conditions that can exist > > between network configuration and power management. A similar issue was > > fixed for igb in commit 9474933caf21 ("igb: close/suspend race in > > netif_device_detach"). > > > > In addition it consolidates the code so that the PCI error handling code > > will essentially perform the power management freeze on the device prior to > > attempting a reset, and will thaw the device afterwards if that is what it > > is planning to do. Otherwise when we call close on the interface it should > > see it is detached and not attempt to call the logic to down the interface > > and free the IRQs again. > > > > >From what I can tell the check that was adding the check for __E1000_DOWN > > in e1000e_close was added when runtime power management was added. However > > it should not be relevant for us as we perform a call to > > pm_runtime_get_sync before we call e1000_down/free_irq so it should always > > be back up before we call into this anyway. > > > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > --- > > > > I'm putting this out as an RFC for now. I haven't had a chance to do much > > testing yet, but I have verified no build issues, and the driver appears > > to load, link, and pass traffic without problems. > > > > This should address issues seen with either double freeing or never freeing > > IRQs that have been seen on this and similar drivers in the past. > > > > I'll submit this formally after testing it over the weekend assuming there > > are no issues. > > > > drivers/net/ethernet/intel/e1000e/netdev.c | 33 ++++++++++++++-------------- > > 1 file changed, 17 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > > index d7d56e42a6aa..182a2c8f12d8 100644 > > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c <snip> > > > > -#ifdef CONFIG_PM_SLEEP > > static int e1000e_pm_thaw(struct device *dev) > > { > > struct net_device *netdev = dev_get_drvdata(dev); > > struct e1000_adapter *adapter = netdev_priv(netdev); > > + int rc = 0; > > > > e1000e_set_interrupt_capability(adapter); > > - if (netif_running(netdev)) { > > - u32 err = e1000_request_irq(adapter); > > > > - if (err) > > - return err; > > + rtnl_lock(); > > + if (netif_running(netdev)) { > > + rc = e1000_request_irq(adapter); > > + if (rc) > > + goto err_irq; > > > > e1000e_up(adapter); > > } > > > > netif_device_attach(netdev); > > - > > - return 0; > > + rtnl_unlock(); > > +err_irq: > > + return rc; > > } > > > In e1000e_pm_thaw(), these 2 lines need to switch order to avoid > deadlock. > from: > + rtnl_unlock(); > +err_irq: > > to: > +err_irq: > + rtnl_unlock(); > > I will find hardware to test this patch next week. Will update the test > result later. > > Thanks! - David Thanks for spotting that. I will update my copy of the patch for when I submit the final revision. I'll probably wait to submit it for acceptance until you have had a chance to verify that it resolves the issue you were seeing. Thanks. - Alex ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-05 17:22 ` Alexander Duyck @ 2019-10-07 15:50 ` David Z. Dai 2019-10-07 17:02 ` Alexander Duyck 0 siblings, 1 reply; 18+ messages in thread From: David Z. Dai @ 2019-10-07 15:50 UTC (permalink / raw) To: Alexander Duyck Cc: Netdev, LKML, intel-wired-lan, Jeff Kirsher, zdai, David Miller On Sat, 2019-10-05 at 10:22 -0700, Alexander Duyck wrote: > On Fri, Oct 4, 2019 at 7:18 PM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > > > On Fri, 2019-10-04 at 16:36 -0700, Alexander Duyck wrote: > > > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > > > > > This patch is meant to address possible race conditions that can exist > > > between network configuration and power management. A similar issue was > > > fixed for igb in commit 9474933caf21 ("igb: close/suspend race in > > > netif_device_detach"). > > > > > > In addition it consolidates the code so that the PCI error handling code > > > will essentially perform the power management freeze on the device prior to > > > attempting a reset, and will thaw the device afterwards if that is what it > > > is planning to do. Otherwise when we call close on the interface it should > > > see it is detached and not attempt to call the logic to down the interface > > > and free the IRQs again. > > > > > > >From what I can tell the check that was adding the check for __E1000_DOWN > > > in e1000e_close was added when runtime power management was added. However > > > it should not be relevant for us as we perform a call to > > > pm_runtime_get_sync before we call e1000_down/free_irq so it should always > > > be back up before we call into this anyway. > > > > > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > > --- > > > > > > I'm putting this out as an RFC for now. I haven't had a chance to do much > > > testing yet, but I have verified no build issues, and the driver appears > > > to load, link, and pass traffic without problems. > > > > > > This should address issues seen with either double freeing or never freeing > > > IRQs that have been seen on this and similar drivers in the past. > > > > > > I'll submit this formally after testing it over the weekend assuming there > > > are no issues. > > > > > > drivers/net/ethernet/intel/e1000e/netdev.c | 33 ++++++++++++++-------------- > > > 1 file changed, 17 insertions(+), 16 deletions(-) > > > > > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > > > index d7d56e42a6aa..182a2c8f12d8 100644 > > > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > > > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > > <snip> > > > > > > > -#ifdef CONFIG_PM_SLEEP > > > static int e1000e_pm_thaw(struct device *dev) > > > { > > > struct net_device *netdev = dev_get_drvdata(dev); > > > struct e1000_adapter *adapter = netdev_priv(netdev); > > > + int rc = 0; > > > > > > e1000e_set_interrupt_capability(adapter); > > > - if (netif_running(netdev)) { > > > - u32 err = e1000_request_irq(adapter); > > > > > > - if (err) > > > - return err; > > > + rtnl_lock(); > > > + if (netif_running(netdev)) { > > > + rc = e1000_request_irq(adapter); > > > + if (rc) > > > + goto err_irq; > > > > > > e1000e_up(adapter); > > > } > > > > > > netif_device_attach(netdev); > > > - > > > - return 0; > > > + rtnl_unlock(); > > > +err_irq: > > > + return rc; > > > } > > > > > In e1000e_pm_thaw(), these 2 lines need to switch order to avoid > > deadlock. > > from: > > + rtnl_unlock(); > > +err_irq: > > > > to: > > +err_irq: > > + rtnl_unlock(); > > > > I will find hardware to test this patch next week. Will update the test > > result later. > > > > Thanks! - David > > Thanks for spotting that. I will update my copy of the patch for when > I submit the final revision. > > I'll probably wait to submit it for acceptance until you have had a > chance to verify that it resolves the issue you were seeing. > > Thanks. > > - Alex We have tested on one of the test box. With this patch, it doesn't crash kernel anymore, which is good! However we see this warning message from the log file for irq number 0: [10206.317270] Trying to free already-free IRQ 0 With this stack: [10206.317344] NIP [c00000000018cbf8] __free_irq+0x308/0x370 [10206.317346] LR [c00000000018cbf4] __free_irq+0x304/0x370 [10206.317347] Call Trace: [10206.317348] [c00000008b92b970] [c00000000018cbf4] __free_irq +0x304/0x370 (unreliable) [10206.317351] [c00000008b92ba00] [c00000000018cd84] free_irq+0x84/0xf0 [10206.317358] [c00000008b92ba30] [d000000007449e60] e1000_free_irq +0x98/0xc0 [e1000e] [10206.317365] [c00000008b92ba60] [d000000007458a70] e1000e_pm_freeze +0xb8/0x100 [e1000e] [10206.317372] [c00000008b92baa0] [d000000007458b6c] e1000_io_error_detected+0x34/0x70 [e1000e] [10206.317375] [c00000008b92bad0] [c000000000040358] eeh_report_failure +0xc8/0x190 [10206.317377] [c00000008b92bb20] [c00000000003eb2c] eeh_pe_dev_traverse +0x9c/0x170 [10206.317379] [c00000008b92bbb0] [c000000000040d84] eeh_handle_normal_event+0xe4/0x580 [10206.317382] [c00000008b92bc60] [c000000000041330] eeh_handle_event +0x30/0x340 [10206.317384] [c00000008b92bd10] [c000000000041780] eeh_event_handler +0x140/0x200 [10206.317386] [c00000008b92bdc0] [c0000000001397c8] kthread+0x1a8/0x1b0 [10206.317389] [c00000008b92be30] [c00000000000b560] ret_from_kernel_thread+0x5c/0x7c Thanks! - David ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-07 15:50 ` David Z. Dai @ 2019-10-07 17:02 ` Alexander Duyck 2019-10-07 17:12 ` David Z. Dai 0 siblings, 1 reply; 18+ messages in thread From: Alexander Duyck @ 2019-10-07 17:02 UTC (permalink / raw) To: David Z. Dai Cc: Netdev, LKML, intel-wired-lan, Jeff Kirsher, zdai, David Miller On Mon, Oct 7, 2019 at 8:51 AM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > <snip> > We have tested on one of the test box. > With this patch, it doesn't crash kernel anymore, which is good! > > However we see this warning message from the log file for irq number 0: > [10206.317270] Trying to free already-free IRQ 0 > > With this stack: > [10206.317344] NIP [c00000000018cbf8] __free_irq+0x308/0x370 > [10206.317346] LR [c00000000018cbf4] __free_irq+0x304/0x370 > [10206.317347] Call Trace: > [10206.317348] [c00000008b92b970] [c00000000018cbf4] __free_irq > +0x304/0x370 (unreliable) > [10206.317351] [c00000008b92ba00] [c00000000018cd84] free_irq+0x84/0xf0 > [10206.317358] [c00000008b92ba30] [d000000007449e60] e1000_free_irq > +0x98/0xc0 [e1000e] > [10206.317365] [c00000008b92ba60] [d000000007458a70] e1000e_pm_freeze > +0xb8/0x100 [e1000e] > [10206.317372] [c00000008b92baa0] [d000000007458b6c] > e1000_io_error_detected+0x34/0x70 [e1000e] > [10206.317375] [c00000008b92bad0] [c000000000040358] eeh_report_failure > +0xc8/0x190 > [10206.317377] [c00000008b92bb20] [c00000000003eb2c] eeh_pe_dev_traverse > +0x9c/0x170 > [10206.317379] [c00000008b92bbb0] [c000000000040d84] > eeh_handle_normal_event+0xe4/0x580 > [10206.317382] [c00000008b92bc60] [c000000000041330] eeh_handle_event > +0x30/0x340 > [10206.317384] [c00000008b92bd10] [c000000000041780] eeh_event_handler > +0x140/0x200 > [10206.317386] [c00000008b92bdc0] [c0000000001397c8] kthread+0x1a8/0x1b0 > [10206.317389] [c00000008b92be30] [c00000000000b560] > ret_from_kernel_thread+0x5c/0x7c > > Thanks! - David Hmm. I wonder if it is possibly calling the report e1000_io_error_detected multiple times. If so then the secondary calls to e1000_pm_freeze would cause issues. I will add a check so that we only down the interface and free the IRQs if the interface is in the present and running state. I'll submit an update patch shortly. Thanks. - Alex ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-07 17:02 ` Alexander Duyck @ 2019-10-07 17:12 ` David Z. Dai 2019-10-07 17:23 ` Alexander Duyck 0 siblings, 1 reply; 18+ messages in thread From: David Z. Dai @ 2019-10-07 17:12 UTC (permalink / raw) To: Alexander Duyck Cc: Netdev, LKML, intel-wired-lan, Jeff Kirsher, zdai, David Miller On Mon, 2019-10-07 at 10:02 -0700, Alexander Duyck wrote: > On Mon, Oct 7, 2019 at 8:51 AM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > > > <snip> > > > We have tested on one of the test box. > > With this patch, it doesn't crash kernel anymore, which is good! > > > > However we see this warning message from the log file for irq number 0: > > [10206.317270] Trying to free already-free IRQ 0 > > > > With this stack: > > [10206.317344] NIP [c00000000018cbf8] __free_irq+0x308/0x370 > > [10206.317346] LR [c00000000018cbf4] __free_irq+0x304/0x370 > > [10206.317347] Call Trace: > > [10206.317348] [c00000008b92b970] [c00000000018cbf4] __free_irq > > +0x304/0x370 (unreliable) > > [10206.317351] [c00000008b92ba00] [c00000000018cd84] free_irq+0x84/0xf0 > > [10206.317358] [c00000008b92ba30] [d000000007449e60] e1000_free_irq > > +0x98/0xc0 [e1000e] > > [10206.317365] [c00000008b92ba60] [d000000007458a70] e1000e_pm_freeze > > +0xb8/0x100 [e1000e] > > [10206.317372] [c00000008b92baa0] [d000000007458b6c] > > e1000_io_error_detected+0x34/0x70 [e1000e] > > [10206.317375] [c00000008b92bad0] [c000000000040358] eeh_report_failure > > +0xc8/0x190 > > [10206.317377] [c00000008b92bb20] [c00000000003eb2c] eeh_pe_dev_traverse > > +0x9c/0x170 > > [10206.317379] [c00000008b92bbb0] [c000000000040d84] > > eeh_handle_normal_event+0xe4/0x580 > > [10206.317382] [c00000008b92bc60] [c000000000041330] eeh_handle_event > > +0x30/0x340 > > [10206.317384] [c00000008b92bd10] [c000000000041780] eeh_event_handler > > +0x140/0x200 > > [10206.317386] [c00000008b92bdc0] [c0000000001397c8] kthread+0x1a8/0x1b0 > > [10206.317389] [c00000008b92be30] [c00000000000b560] > > ret_from_kernel_thread+0x5c/0x7c > > > > Thanks! - David > > Hmm. I wonder if it is possibly calling the report > e1000_io_error_detected multiple times. If so then the secondary calls > to e1000_pm_freeze would cause issues. > > I will add a check so that we only down the interface and free the > IRQs if the interface is in the present and running state. > > I'll submit an update patch shortly. > > Thanks. > > - Alex It only complains about IRQ number 0 in the log. Could you please let me know the actual place where you will add the check? I can retest it again. Thanks! - David ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-07 17:12 ` David Z. Dai @ 2019-10-07 17:23 ` Alexander Duyck 2019-10-07 17:27 ` [RFC PATCH v2] " Alexander Duyck 0 siblings, 1 reply; 18+ messages in thread From: Alexander Duyck @ 2019-10-07 17:23 UTC (permalink / raw) To: David Z. Dai Cc: Netdev, LKML, intel-wired-lan, Jeff Kirsher, zdai, David Miller On Mon, Oct 7, 2019 at 10:12 AM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > On Mon, 2019-10-07 at 10:02 -0700, Alexander Duyck wrote: > > On Mon, Oct 7, 2019 at 8:51 AM David Z. Dai <zdai@linux.vnet.ibm.com> wrote: > > > > > > > <snip> > > > > > We have tested on one of the test box. > > > With this patch, it doesn't crash kernel anymore, which is good! > > > > > > However we see this warning message from the log file for irq number 0: > > > [10206.317270] Trying to free already-free IRQ 0 > > > > > > With this stack: > > > [10206.317344] NIP [c00000000018cbf8] __free_irq+0x308/0x370 > > > [10206.317346] LR [c00000000018cbf4] __free_irq+0x304/0x370 > > > [10206.317347] Call Trace: > > > [10206.317348] [c00000008b92b970] [c00000000018cbf4] __free_irq > > > +0x304/0x370 (unreliable) > > > [10206.317351] [c00000008b92ba00] [c00000000018cd84] free_irq+0x84/0xf0 > > > [10206.317358] [c00000008b92ba30] [d000000007449e60] e1000_free_irq > > > +0x98/0xc0 [e1000e] > > > [10206.317365] [c00000008b92ba60] [d000000007458a70] e1000e_pm_freeze > > > +0xb8/0x100 [e1000e] > > > [10206.317372] [c00000008b92baa0] [d000000007458b6c] > > > e1000_io_error_detected+0x34/0x70 [e1000e] > > > [10206.317375] [c00000008b92bad0] [c000000000040358] eeh_report_failure > > > +0xc8/0x190 > > > [10206.317377] [c00000008b92bb20] [c00000000003eb2c] eeh_pe_dev_traverse > > > +0x9c/0x170 > > > [10206.317379] [c00000008b92bbb0] [c000000000040d84] > > > eeh_handle_normal_event+0xe4/0x580 > > > [10206.317382] [c00000008b92bc60] [c000000000041330] eeh_handle_event > > > +0x30/0x340 > > > [10206.317384] [c00000008b92bd10] [c000000000041780] eeh_event_handler > > > +0x140/0x200 > > > [10206.317386] [c00000008b92bdc0] [c0000000001397c8] kthread+0x1a8/0x1b0 > > > [10206.317389] [c00000008b92be30] [c00000000000b560] > > > ret_from_kernel_thread+0x5c/0x7c > > > > > > Thanks! - David > > > > Hmm. I wonder if it is possibly calling the report > > e1000_io_error_detected multiple times. If so then the secondary calls > > to e1000_pm_freeze would cause issues. > > > > I will add a check so that we only down the interface and free the > > IRQs if the interface is in the present and running state. > > > > I'll submit an update patch shortly. > > > > Thanks. > > > > - Alex > It only complains about IRQ number 0 in the log. I suspect that is because the IRQ is already freed. So there are no other interrupts enabled and it thinks it is running in legacy IRQ mode. > Could you please let me know the actual place where you will add the > check? > I can retest it again. > > Thanks! - David So I will need to add another variable called "present" to e1000_pm_freeze, I will assign netif_device_present() to it after taking the rtnl_lock and before I detach the interface, and will test for it and netif_running() before calling the logic that calls e1000_down and e1000_free_irq. - Alex ^ permalink raw reply [flat|nested] 18+ messages in thread
* [RFC PATCH v2] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-07 17:23 ` Alexander Duyck @ 2019-10-07 17:27 ` Alexander Duyck 2019-10-08 20:49 ` David Z. Dai 2020-02-25 9:42 ` Kai-Heng Feng 0 siblings, 2 replies; 18+ messages in thread From: Alexander Duyck @ 2019-10-07 17:27 UTC (permalink / raw) To: alexander.duyck Cc: zdai, netdev, linux-kernel, intel-wired-lan, jeffrey.t.kirsher, zdai, davem From: Alexander Duyck <alexander.h.duyck@linux.intel.com> This patch is meant to address possible race conditions that can exist between network configuration and power management. A similar issue was fixed for igb in commit 9474933caf21 ("igb: close/suspend race in netif_device_detach"). In addition it consolidates the code so that the PCI error handling code will essentially perform the power management freeze on the device prior to attempting a reset, and will thaw the device afterwards if that is what it is planning to do. Otherwise when we call close on the interface it should see it is detached and not attempt to call the logic to down the interface and free the IRQs again. >From what I can tell the check that was adding the check for __E1000_DOWN in e1000e_close was added when runtime power management was added. However it should not be relevant for us as we perform a call to pm_runtime_get_sync before we call e1000_down/free_irq so it should always be back up before we call into this anyway. Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> --- RFC v2: Dropped some unused variables Added logic to check for device present before removing to pm_freeze Fixed misplaced err_irq to before rtnl_unlock() drivers/net/ethernet/intel/e1000e/netdev.c | 40 +++++++++++++++------------- 1 file changed, 21 insertions(+), 19 deletions(-) diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index d7d56e42a6aa..8b4e589aca36 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -4715,12 +4715,12 @@ int e1000e_close(struct net_device *netdev) pm_runtime_get_sync(&pdev->dev); - if (!test_bit(__E1000_DOWN, &adapter->state)) { + if (netif_device_present(netdev)) { e1000e_down(adapter, true); e1000_free_irq(adapter); /* Link status message must follow this format */ - pr_info("%s NIC Link is Down\n", adapter->netdev->name); + pr_info("%s NIC Link is Down\n", netdev->name); } napi_disable(&adapter->napi); @@ -6298,10 +6298,14 @@ static int e1000e_pm_freeze(struct device *dev) { struct net_device *netdev = dev_get_drvdata(dev); struct e1000_adapter *adapter = netdev_priv(netdev); + bool present; + rtnl_lock(); + + present = netif_device_present(netdev); netif_device_detach(netdev); - if (netif_running(netdev)) { + if (present && netif_running(netdev)) { int count = E1000_CHECK_RESET_COUNT; while (test_bit(__E1000_RESETTING, &adapter->state) && count--) @@ -6313,6 +6317,8 @@ static int e1000e_pm_freeze(struct device *dev) e1000e_down(adapter, false); e1000_free_irq(adapter); } + rtnl_unlock(); + e1000e_reset_interrupt_capability(adapter); /* Allow time for pending master requests to run */ @@ -6626,27 +6632,31 @@ static int __e1000_resume(struct pci_dev *pdev) return 0; } -#ifdef CONFIG_PM_SLEEP static int e1000e_pm_thaw(struct device *dev) { struct net_device *netdev = dev_get_drvdata(dev); struct e1000_adapter *adapter = netdev_priv(netdev); + int rc = 0; e1000e_set_interrupt_capability(adapter); - if (netif_running(netdev)) { - u32 err = e1000_request_irq(adapter); - if (err) - return err; + rtnl_lock(); + if (netif_running(netdev)) { + rc = e1000_request_irq(adapter); + if (rc) + goto err_irq; e1000e_up(adapter); } netif_device_attach(netdev); +err_irq: + rtnl_unlock(); - return 0; + return rc; } +#ifdef CONFIG_PM_SLEEP static int e1000e_pm_suspend(struct device *dev) { struct pci_dev *pdev = to_pci_dev(dev); @@ -6818,16 +6828,11 @@ static void e1000_netpoll(struct net_device *netdev) static pci_ers_result_t e1000_io_error_detected(struct pci_dev *pdev, pci_channel_state_t state) { - struct net_device *netdev = pci_get_drvdata(pdev); - struct e1000_adapter *adapter = netdev_priv(netdev); - - netif_device_detach(netdev); + e1000e_pm_freeze(&pdev->dev); if (state == pci_channel_io_perm_failure) return PCI_ERS_RESULT_DISCONNECT; - if (netif_running(netdev)) - e1000e_down(adapter, true); pci_disable_device(pdev); /* Request a slot slot reset. */ @@ -6893,10 +6898,7 @@ static void e1000_io_resume(struct pci_dev *pdev) e1000_init_manageability_pt(adapter); - if (netif_running(netdev)) - e1000e_up(adapter); - - netif_device_attach(netdev); + e1000e_pm_thaw(&pdev->dev); /* If the controller has AMT, do not set DRV_LOAD until the interface * is up. For all other cases, let the f/w know that the h/w is now ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-07 17:27 ` [RFC PATCH v2] " Alexander Duyck @ 2019-10-08 20:49 ` David Z. Dai 2020-02-25 9:42 ` Kai-Heng Feng 1 sibling, 0 replies; 18+ messages in thread From: David Z. Dai @ 2019-10-08 20:49 UTC (permalink / raw) To: Alexander Duyck Cc: netdev, linux-kernel, intel-wired-lan, jeffrey.t.kirsher, zdai, davem On Mon, 2019-10-07 at 10:27 -0700, Alexander Duyck wrote: > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > This patch is meant to address possible race conditions that can exist > between network configuration and power management. A similar issue was > fixed for igb in commit 9474933caf21 ("igb: close/suspend race in > netif_device_detach"). > > In addition it consolidates the code so that the PCI error handling code > will essentially perform the power management freeze on the device prior to > attempting a reset, and will thaw the device afterwards if that is what it > is planning to do. Otherwise when we call close on the interface it should > see it is detached and not attempt to call the logic to down the interface > and free the IRQs again. > > >From what I can tell the check that was adding the check for __E1000_DOWN > in e1000e_close was added when runtime power management was added. However > it should not be relevant for us as we perform a call to > pm_runtime_get_sync before we call e1000_down/free_irq so it should always > be back up before we call into this anyway. > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> > --- > > RFC v2: Dropped some unused variables > Added logic to check for device present before removing to pm_freeze > Fixed misplaced err_irq to before rtnl_unlock() > > drivers/net/ethernet/intel/e1000e/netdev.c | 40 +++++++++++++++------------- > 1 file changed, 21 insertions(+), 19 deletions(-) > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > index d7d56e42a6aa..8b4e589aca36 100644 > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > @@ -4715,12 +4715,12 @@ int e1000e_close(struct net_device *netdev) > > pm_runtime_get_sync(&pdev->dev); > > - if (!test_bit(__E1000_DOWN, &adapter->state)) { > + if (netif_device_present(netdev)) { > e1000e_down(adapter, true); > e1000_free_irq(adapter); > > /* Link status message must follow this format */ > - pr_info("%s NIC Link is Down\n", adapter->netdev->name); > + pr_info("%s NIC Link is Down\n", netdev->name); > } > > napi_disable(&adapter->napi); > @@ -6298,10 +6298,14 @@ static int e1000e_pm_freeze(struct device *dev) > { > struct net_device *netdev = dev_get_drvdata(dev); > struct e1000_adapter *adapter = netdev_priv(netdev); > + bool present; > > + rtnl_lock(); > + > + present = netif_device_present(netdev); > netif_device_detach(netdev); > > - if (netif_running(netdev)) { > + if (present && netif_running(netdev)) { > int count = E1000_CHECK_RESET_COUNT; > > while (test_bit(__E1000_RESETTING, &adapter->state) && count--) > @@ -6313,6 +6317,8 @@ static int e1000e_pm_freeze(struct device *dev) > e1000e_down(adapter, false); > e1000_free_irq(adapter); > } > + rtnl_unlock(); > + > e1000e_reset_interrupt_capability(adapter); > > /* Allow time for pending master requests to run */ > @@ -6626,27 +6632,31 @@ static int __e1000_resume(struct pci_dev *pdev) > return 0; > } > > -#ifdef CONFIG_PM_SLEEP > static int e1000e_pm_thaw(struct device *dev) > { > struct net_device *netdev = dev_get_drvdata(dev); > struct e1000_adapter *adapter = netdev_priv(netdev); > + int rc = 0; > > e1000e_set_interrupt_capability(adapter); > - if (netif_running(netdev)) { > - u32 err = e1000_request_irq(adapter); > > - if (err) > - return err; > + rtnl_lock(); > + if (netif_running(netdev)) { > + rc = e1000_request_irq(adapter); > + if (rc) > + goto err_irq; > > e1000e_up(adapter); > } > > netif_device_attach(netdev); > +err_irq: > + rtnl_unlock(); > > - return 0; > + return rc; > } > > +#ifdef CONFIG_PM_SLEEP > static int e1000e_pm_suspend(struct device *dev) > { > struct pci_dev *pdev = to_pci_dev(dev); > @@ -6818,16 +6828,11 @@ static void e1000_netpoll(struct net_device *netdev) > static pci_ers_result_t e1000_io_error_detected(struct pci_dev *pdev, > pci_channel_state_t state) > { > - struct net_device *netdev = pci_get_drvdata(pdev); > - struct e1000_adapter *adapter = netdev_priv(netdev); > - > - netif_device_detach(netdev); > + e1000e_pm_freeze(&pdev->dev); > > if (state == pci_channel_io_perm_failure) > return PCI_ERS_RESULT_DISCONNECT; > > - if (netif_running(netdev)) > - e1000e_down(adapter, true); > pci_disable_device(pdev); > > /* Request a slot slot reset. */ > @@ -6893,10 +6898,7 @@ static void e1000_io_resume(struct pci_dev *pdev) > > e1000_init_manageability_pt(adapter); > > - if (netif_running(netdev)) > - e1000e_up(adapter); > - > - netif_device_attach(netdev); > + e1000e_pm_thaw(&pdev->dev); > > /* If the controller has AMT, do not set DRV_LOAD until the interface > * is up. For all other cases, let the f/w know that the h/w is now > Tested this v2 version patch. There is no more crash, which is good. And It also eliminates the double free warning message. This patch is good to go. Thanks! - David ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2019-10-07 17:27 ` [RFC PATCH v2] " Alexander Duyck 2019-10-08 20:49 ` David Z. Dai @ 2020-02-25 9:42 ` Kai-Heng Feng 2020-02-25 20:46 ` Greg Kroah-Hartman 1 sibling, 1 reply; 18+ messages in thread From: Kai-Heng Feng @ 2020-02-25 9:42 UTC (permalink / raw) To: Greg Kroah-Hartman, stable Cc: Alexander Duyck, zdai, open list:NETWORKING DRIVERS, open list, moderated list:INTEL ETHERNET DRIVERS, Kirsher, Jeffrey T, zdai, David Miller Hi Greg, > On Oct 8, 2019, at 01:27, Alexander Duyck <alexander.duyck@gmail.com> wrote: > > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > This patch is meant to address possible race conditions that can exist > between network configuration and power management. A similar issue was > fixed for igb in commit 9474933caf21 ("igb: close/suspend race in > netif_device_detach"). > > In addition it consolidates the code so that the PCI error handling code > will essentially perform the power management freeze on the device prior to > attempting a reset, and will thaw the device afterwards if that is what it > is planning to do. Otherwise when we call close on the interface it should > see it is detached and not attempt to call the logic to down the interface > and free the IRQs again. > >> From what I can tell the check that was adding the check for __E1000_DOWN > in e1000e_close was added when runtime power management was added. However > it should not be relevant for us as we perform a call to > pm_runtime_get_sync before we call e1000_down/free_irq so it should always > be back up before we call into this anyway. > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Please merge this commit, a7023819404ac9bd2bb311a4fafd38515cfa71ec to stable v5.14. `modprobe -r e1000e` triggers a null pointer dereference [1] after the the following two patches are applied to v5.4.y: d635e7c4b34e6a630c7a1e8f1a8fd52c3e3ceea7 e1000e: Revert "e1000e: Make watchdog use delayed work" 21c6137939723ed6f5e4aec7882cdfc247304c27 e1000e: Drop unnecessary __E1000_DOWN bit twiddling [1] https://bugs.launchpad.net/bugs/1864303 Kai-Heng > --- > > RFC v2: Dropped some unused variables > Added logic to check for device present before removing to pm_freeze > Fixed misplaced err_irq to before rtnl_unlock() > > drivers/net/ethernet/intel/e1000e/netdev.c | 40 +++++++++++++++------------- > 1 file changed, 21 insertions(+), 19 deletions(-) > > diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c > index d7d56e42a6aa..8b4e589aca36 100644 > --- a/drivers/net/ethernet/intel/e1000e/netdev.c > +++ b/drivers/net/ethernet/intel/e1000e/netdev.c > @@ -4715,12 +4715,12 @@ int e1000e_close(struct net_device *netdev) > > pm_runtime_get_sync(&pdev->dev); > > - if (!test_bit(__E1000_DOWN, &adapter->state)) { > + if (netif_device_present(netdev)) { > e1000e_down(adapter, true); > e1000_free_irq(adapter); > > /* Link status message must follow this format */ > - pr_info("%s NIC Link is Down\n", adapter->netdev->name); > + pr_info("%s NIC Link is Down\n", netdev->name); > } > > napi_disable(&adapter->napi); > @@ -6298,10 +6298,14 @@ static int e1000e_pm_freeze(struct device *dev) > { > struct net_device *netdev = dev_get_drvdata(dev); > struct e1000_adapter *adapter = netdev_priv(netdev); > + bool present; > > + rtnl_lock(); > + > + present = netif_device_present(netdev); > netif_device_detach(netdev); > > - if (netif_running(netdev)) { > + if (present && netif_running(netdev)) { > int count = E1000_CHECK_RESET_COUNT; > > while (test_bit(__E1000_RESETTING, &adapter->state) && count--) > @@ -6313,6 +6317,8 @@ static int e1000e_pm_freeze(struct device *dev) > e1000e_down(adapter, false); > e1000_free_irq(adapter); > } > + rtnl_unlock(); > + > e1000e_reset_interrupt_capability(adapter); > > /* Allow time for pending master requests to run */ > @@ -6626,27 +6632,31 @@ static int __e1000_resume(struct pci_dev *pdev) > return 0; > } > > -#ifdef CONFIG_PM_SLEEP > static int e1000e_pm_thaw(struct device *dev) > { > struct net_device *netdev = dev_get_drvdata(dev); > struct e1000_adapter *adapter = netdev_priv(netdev); > + int rc = 0; > > e1000e_set_interrupt_capability(adapter); > - if (netif_running(netdev)) { > - u32 err = e1000_request_irq(adapter); > > - if (err) > - return err; > + rtnl_lock(); > + if (netif_running(netdev)) { > + rc = e1000_request_irq(adapter); > + if (rc) > + goto err_irq; > > e1000e_up(adapter); > } > > netif_device_attach(netdev); > +err_irq: > + rtnl_unlock(); > > - return 0; > + return rc; > } > > +#ifdef CONFIG_PM_SLEEP > static int e1000e_pm_suspend(struct device *dev) > { > struct pci_dev *pdev = to_pci_dev(dev); > @@ -6818,16 +6828,11 @@ static void e1000_netpoll(struct net_device *netdev) > static pci_ers_result_t e1000_io_error_detected(struct pci_dev *pdev, > pci_channel_state_t state) > { > - struct net_device *netdev = pci_get_drvdata(pdev); > - struct e1000_adapter *adapter = netdev_priv(netdev); > - > - netif_device_detach(netdev); > + e1000e_pm_freeze(&pdev->dev); > > if (state == pci_channel_io_perm_failure) > return PCI_ERS_RESULT_DISCONNECT; > > - if (netif_running(netdev)) > - e1000e_down(adapter, true); > pci_disable_device(pdev); > > /* Request a slot slot reset. */ > @@ -6893,10 +6898,7 @@ static void e1000_io_resume(struct pci_dev *pdev) > > e1000_init_manageability_pt(adapter); > > - if (netif_running(netdev)) > - e1000e_up(adapter); > - > - netif_device_attach(netdev); > + e1000e_pm_thaw(&pdev->dev); > > /* If the controller has AMT, do not set DRV_LOAD until the interface > * is up. For all other cases, let the f/w know that the h/w is now > ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [RFC PATCH v2] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm 2020-02-25 9:42 ` Kai-Heng Feng @ 2020-02-25 20:46 ` Greg Kroah-Hartman 0 siblings, 0 replies; 18+ messages in thread From: Greg Kroah-Hartman @ 2020-02-25 20:46 UTC (permalink / raw) To: Kai-Heng Feng Cc: stable, Alexander Duyck, zdai, open list:NETWORKING DRIVERS, open list, moderated list:INTEL ETHERNET DRIVERS, Kirsher, Jeffrey T, zdai, David Miller On Tue, Feb 25, 2020 at 05:42:26PM +0800, Kai-Heng Feng wrote: > Hi Greg, > > > On Oct 8, 2019, at 01:27, Alexander Duyck <alexander.duyck@gmail.com> wrote: > > > > From: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > > > This patch is meant to address possible race conditions that can exist > > between network configuration and power management. A similar issue was > > fixed for igb in commit 9474933caf21 ("igb: close/suspend race in > > netif_device_detach"). > > > > In addition it consolidates the code so that the PCI error handling code > > will essentially perform the power management freeze on the device prior to > > attempting a reset, and will thaw the device afterwards if that is what it > > is planning to do. Otherwise when we call close on the interface it should > > see it is detached and not attempt to call the logic to down the interface > > and free the IRQs again. > > > >> From what I can tell the check that was adding the check for __E1000_DOWN > > in e1000e_close was added when runtime power management was added. However > > it should not be relevant for us as we perform a call to > > pm_runtime_get_sync before we call e1000_down/free_irq so it should always > > be back up before we call into this anyway. > > > > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> > > Please merge this commit, a7023819404ac9bd2bb311a4fafd38515cfa71ec to stable v5.14. > > `modprobe -r e1000e` triggers a null pointer dereference [1] after the the following two patches are applied to v5.4.y: > d635e7c4b34e6a630c7a1e8f1a8fd52c3e3ceea7 e1000e: Revert "e1000e: Make watchdog use delayed work" > 21c6137939723ed6f5e4aec7882cdfc247304c27 e1000e: Drop unnecessary __E1000_DOWN bit twiddling Now queued up, thanks. greg k-h ^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2020-02-25 20:46 UTC | newest] Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-10-03 16:54 [v1] e1000e: EEH on e1000e adapter detects io perm failure can trigger crash David Dai 2019-10-03 17:39 ` Alexander Duyck 2019-10-03 18:50 ` David Z. Dai 2019-10-03 20:39 ` Alexander Duyck 2019-10-04 0:02 ` David Z. Dai 2019-10-04 14:35 ` Alexander Duyck 2019-10-04 17:04 ` David Z. Dai 2019-10-04 23:36 ` [RFC PATCH] e1000e: Use rtnl_lock to prevent race conditions between net and pci/pm Alexander Duyck 2019-10-05 2:18 ` David Z. Dai 2019-10-05 17:22 ` Alexander Duyck 2019-10-07 15:50 ` David Z. Dai 2019-10-07 17:02 ` Alexander Duyck 2019-10-07 17:12 ` David Z. Dai 2019-10-07 17:23 ` Alexander Duyck 2019-10-07 17:27 ` [RFC PATCH v2] " Alexander Duyck 2019-10-08 20:49 ` David Z. Dai 2020-02-25 9:42 ` Kai-Heng Feng 2020-02-25 20:46 ` Greg Kroah-Hartman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).