* [PATCH] igb: Fix igb_down hung on surprise removal
From: Ying Hsu @ 2023-05-18 7:26 UTC
To: netdev
Cc: grundler, Ying Hsu, David S. Miller, Eric Dumazet, Jakub Kicinski, Jesse Brandeburg, Paolo Abeni, Tony Nguyen, intel-wired-lan, linux-kernel

In a setup where a Thunderbolt hub connects to Ethernet and a display
through USB Type-C, users may experience a hung task timeout when they
remove the cable between the PC and the Thunderbolt hub.
This is because the igb_down function is called multiple times when
the Thunderbolt hub is unplugged. For example, igb_io_error_detected
triggers the first call, and igb_remove triggers the second call.
The second call to igb_down will block at napi_synchronize.
Here's the call trace:
    __schedule+0x3b0/0xddb
    ? __mod_timer+0x164/0x5d3
    schedule+0x44/0xa8
    schedule_timeout+0xb2/0x2a4
    ? run_local_timers+0x4e/0x4e
    msleep+0x31/0x38
    igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4]
    __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4]
    igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4]
    __dev_close_many+0x95/0xec
    dev_close_many+0x6e/0x103
    unregister_netdevice_many+0x105/0x5b1
    unregister_netdevice_queue+0xc2/0x10d
    unregister_netdev+0x1c/0x23
    igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4]
    pci_device_remove+0x3f/0x9c
    device_release_driver_internal+0xfe/0x1b4
    pci_stop_bus_device+0x5b/0x7f
    pci_stop_bus_device+0x30/0x7f
    pci_stop_bus_device+0x30/0x7f
    pci_stop_and_remove_bus_device+0x12/0x19
    pciehp_unconfigure_device+0x76/0xe9
    pciehp_disable_slot+0x6e/0x131
    pciehp_handle_presence_or_link_change+0x7a/0x3f7
    pciehp_ist+0xbe/0x194
    irq_thread_fn+0x22/0x4d
    ? irq_thread+0x1fd/0x1fd
    irq_thread+0x17b/0x1fd
    ? irq_forced_thread_fn+0x5f/0x5f
    kthread+0x142/0x153
    ? __irq_get_irqchip_state+0x46/0x46
    ? kthread_associate_blkcg+0x71/0x71
    ret_from_fork+0x1f/0x30

In this case, igb_io_error_detected detaches the network interface
and requests a PCIe slot reset. However, the PCIe reset callback is
not invoked, and thus the Ethernet connection breaks down.
As the PCIe error in this case is a non-fatal one, requesting a slot
reset can be avoided.
This patch fixes the hung task issue and preserves the Ethernet
connection by ignoring non-fatal PCIe errors.

Signed-off-by: Ying Hsu <yinghsu@chromium.org>
---
This commit has been tested on an HP Elite Dragonfly Chromebook and a
CalDigit TS3+ Thunderbolt hub. The Ethernet driver for the hub is igb.
Non-fatal PCIe errors happen when users hot-plug the cables connected
to the Chromebook or to the external display.

 drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 58872a4c2540..a8b217368ca1 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct igb_adapter *adapter = netdev_priv(netdev);
 
+	if (state == pci_channel_io_normal) {
+		dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
+		return PCI_ERS_RESULT_CAN_RECOVER;
+	}
+
 	netif_device_detach(netdev);
 
 	if (state == pci_channel_io_perm_failure)
-- 
2.40.1.606.ga4b1b128d6-goog
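For readers who want to see the control flow the patch changes, here is a minimal userspace sketch of the decision the handler makes for each channel state. It is not the driver code: the enums are local stand-ins that reuse the kernel's names, `struct model_adapter` and `model_io_error_detected()` are hypothetical, and the "before the patch" tail of the function is reconstructed only from the description above (detach, call igb_down if running, request a slot reset).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins that reuse the kernel's enumerator names (assumption:
 * only the names matter for this sketch, not the numeric values). */
enum pci_channel_state {
	pci_channel_io_normal = 1,
	pci_channel_io_frozen,
	pci_channel_io_perm_failure,
};

enum pci_ers_result {
	PCI_ERS_RESULT_CAN_RECOVER = 2,
	PCI_ERS_RESULT_NEED_RESET,
	PCI_ERS_RESULT_DISCONNECT,
};

/* Hypothetical model of the bits of adapter state the handler touches. */
struct model_adapter {
	bool attached;   /* false after netif_device_detach() */
	bool running;    /* netif_running() */
	int  down_calls; /* how many times igb_down() would have run */
};

/* Models the patched handler: non-fatal errors return early and leave
 * the interface untouched; everything else follows the old path. */
static enum pci_ers_result
model_io_error_detected(struct model_adapter *a, enum pci_channel_state state)
{
	assert(a != NULL);

	if (state == pci_channel_io_normal)     /* the 5 added lines */
		return PCI_ERS_RESULT_CAN_RECOVER;

	a->attached = false;                    /* netif_device_detach() */

	if (state == pci_channel_io_perm_failure)
		return PCI_ERS_RESULT_DISCONNECT;

	if (a->running)
		a->down_calls++;                /* first igb_down() call */

	return PCI_ERS_RESULT_NEED_RESET;
}
```

With the early return, a non-fatal error never reaches the igb_down call in the handler, so a later igb_remove makes the first and only call.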
* Re: [PATCH] igb: Fix igb_down hung on surprise removal
From: Pavan Chebbi @ 2023-05-18 10:36 UTC
To: Ying Hsu
Cc: netdev, grundler, David S. Miller, Eric Dumazet, Jakub Kicinski, Jesse Brandeburg, Paolo Abeni, Tony Nguyen, intel-wired-lan, linux-kernel

On Thu, May 18, 2023 at 12:58 PM Ying Hsu <yinghsu@chromium.org> wrote:
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index 58872a4c2540..a8b217368ca1 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
>         struct net_device *netdev = pci_get_drvdata(pdev);
>         struct igb_adapter *adapter = netdev_priv(netdev);
>
> +       if (state == pci_channel_io_normal) {
> +               dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
> +               return PCI_ERS_RESULT_CAN_RECOVER;
> +       }
> +

This code may be good to have. But not sure if this should be the fix
for igb_down() synchronization.

Intel guys may comment.

>         netif_device_detach(netdev);
>
>         if (state == pci_channel_io_perm_failure)
> --
> 2.40.1.606.ga4b1b128d6-goog
>
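As background to the synchronization concern raised above, the following userspace sketch models why a second igb_down can block at napi_synchronize. It rests on an assumption about the NAPI helpers that the thread itself does not spell out: that disabling NAPI leaves the "scheduled" bit set until NAPI is re-enabled, and that the synchronize step sleeps in a loop until that bit clears (which would match the msleep frame in the trace). All `model_*` names are hypothetical, and the wait is bounded so the example terminates instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a NAPI context; `sched` plays the role of
 * the kernel's "scheduled" state bit (assumption, see lead-in). */
struct model_napi {
	bool sched;
};

/* Assumed behaviour: disabling keeps the bit set so no poll can start. */
static void model_napi_disable(struct model_napi *n)
{
	n->sched = true;
}

/* Assumed behaviour: enabling clears the bit again. */
static void model_napi_enable(struct model_napi *n)
{
	n->sched = false;
}

/* Bounded model of the synchronize step: the real helper would sleep
 * and retry for as long as the bit is set. Returns true if the wait
 * finished, false if max_polls attempts passed with the bit still set. */
static bool model_napi_synchronize(const struct model_napi *n, int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (!n->sched)
			return true;
		/* the kernel would msleep(1) here */
	}
	return false;
}

/* Models the NAPI part of a "down" path: synchronize, then disable.
 * Returns false where the real code would block indefinitely. */
static bool model_down(struct model_napi *n, int max_polls)
{
	assert(n != NULL);

	if (!model_napi_synchronize(n, max_polls))
		return false;
	model_napi_disable(n);
	return true;
}
```

In this model the first `model_down()` succeeds, a second one without an intervening `model_napi_enable()` never completes its wait, and the patch sidesteps that by not taking the first down path for non-fatal errors rather than by guarding the second.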
* Re: [PATCH] igb: Fix igb_down hung on surprise removal
From: Grant Grundler @ 2023-05-22 20:16 UTC
To: Pavan Chebbi
Cc: Ying Hsu, netdev, grundler, David S. Miller, Eric Dumazet, Jakub Kicinski, Jesse Brandeburg, Paolo Abeni, Tony Nguyen, intel-wired-lan, linux-kernel

On Thu, May 18, 2023 at 3:36 AM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote:
>
> On Thu, May 18, 2023 at 12:58 PM Ying Hsu <yinghsu@chromium.org> wrote:
> >
> > diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> > index 58872a4c2540..a8b217368ca1 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_main.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> > @@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
> >         struct net_device *netdev = pci_get_drvdata(pdev);
> >         struct igb_adapter *adapter = netdev_priv(netdev);
> >
> > +       if (state == pci_channel_io_normal) {
> > +               dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
> > +               return PCI_ERS_RESULT_CAN_RECOVER;
> > +       }
> > +
>
> This code may be good to have. But not sure if this should be the fix
> for igb_down() synchronization.

I have the same opinion. This appears to solve the problem - but I
don't know if there is a better way to solve this problem.

> Intel guys may comment.

Ping? Can we please get feedback from IGB/IGC maintainers this week?

(I hope igc maintainers can confirm this isn't an issue for igc.)

cheers,
grant

>
> >         netif_device_detach(netdev);
> >
> >         if (state == pci_channel_io_perm_failure)
> > --
> > 2.40.1.606.ga4b1b128d6-goog
> >
> >
* Re: [PATCH] igb: Fix igb_down hung on surprise removal
From: Tony Nguyen @ 2023-05-23 18:03 UTC
To: Grant Grundler, Pavan Chebbi, Aleksandr Loktionov, Neftin, Sasha, Ruinskiy, Dima
Cc: Ying Hsu, netdev, David S. Miller, Eric Dumazet, Jakub Kicinski, Jesse Brandeburg, Paolo Abeni, intel-wired-lan, linux-kernel

On 5/22/2023 1:16 PM, Grant Grundler wrote:
> On Thu, May 18, 2023 at 3:36 AM Pavan Chebbi <pavan.chebbi@broadcom.com> wrote:
>>
>> On Thu, May 18, 2023 at 12:58 PM Ying Hsu <yinghsu@chromium.org> wrote:
>>>
>>> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
>>> index 58872a4c2540..a8b217368ca1 100644
>>> --- a/drivers/net/ethernet/intel/igb/igb_main.c
>>> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
>>> @@ -9581,6 +9581,11 @@ static pci_ers_result_t igb_io_error_detected(struct pci_dev *pdev,
>>>          struct net_device *netdev = pci_get_drvdata(pdev);
>>>          struct igb_adapter *adapter = netdev_priv(netdev);
>>>
>>> +       if (state == pci_channel_io_normal) {
>>> +               dev_warn(&pdev->dev, "Non-correctable non-fatal error reported.\n");
>>> +               return PCI_ERS_RESULT_CAN_RECOVER;
>>> +       }
>>> +
>>
>> This code may be good to have. But not sure if this should be the fix
>> for igb_down() synchronization.
>
> I have the same opinion. This appears to solve the problem - but I
> don't know if there is a better way to solve this problem.
>
>> Intel guys may comment.
>
> Ping? Can we please get feedback from IGB/IGC maintainers this week?
>
> (I hope igc maintainers can confirm this isn't an issue for igc.)

Adding some of the igb and igc developers.

> cheers,
> grant
>
>>
>>>          netif_device_detach(netdev);
>>>
>>>          if (state == pci_channel_io_perm_failure)
>>> --
>>> 2.40.1.606.ga4b1b128d6-goog
>>>
>>>
* RE: [PATCH] igb: Fix igb_down hung on surprise removal
From: Loktionov, Aleksandr @ 2023-05-24 12:31 UTC
To: Nguyen, Anthony L, Grant Grundler, Pavan Chebbi, Neftin, Sasha, Ruinskiy, Dima
Cc: Ying Hsu, netdev, David S. Miller, Eric Dumazet, Jakub Kicinski, Brandeburg, Jesse, Paolo Abeni, intel-wired-lan, linux-kernel

Good day Tony

We reviewed the patch and have nothing against.

With the best regards
Alex
ND ITP Linux 40G base driver TL
* Re: [PATCH] igb: Fix igb_down hung on surprise removal
From: Grant Grundler @ 2023-05-24 21:01 UTC
To: Loktionov, Aleksandr
Cc: Nguyen, Anthony L, Grant Grundler, Pavan Chebbi, Neftin, Sasha, Ruinskiy, Dima, Ying Hsu, netdev, David S. Miller, Eric Dumazet, Jakub Kicinski, Brandeburg, Jesse, Paolo Abeni, intel-wired-lan, linux-kernel

On Wed, May 24, 2023 at 5:34 AM Loktionov, Aleksandr
<aleksandr.loktionov@intel.com> wrote:
>
> Good day Tony
>
> We reviewed the patch and have nothing against.

Thank you for reviewing!

Can I take this as the equivalent of "Signed-off-by: Loktionov,
Aleksandr <aleksandr.loktionov@intel.com>"?

Or since Tony is listed in MAINTAINERS for drivers/net/ethernet/intel,
is he supposed to provide that?

cheers,
grant

>
> With the best regards
> Alex
> ND ITP Linux 40G base driver TL
* Re: [PATCH] igb: Fix igb_down hung on surprise removal
From: Tony Nguyen @ 2023-05-24 22:22 UTC
To: Grant Grundler, Loktionov, Aleksandr
Cc: Pavan Chebbi, Neftin, Sasha, Ruinskiy, Dima, Ying Hsu, netdev, David S. Miller, Eric Dumazet, Jakub Kicinski, Brandeburg, Jesse, Paolo Abeni, intel-wired-lan, linux-kernel

Hi Grant,

On 5/24/2023 2:01 PM, Grant Grundler wrote:
> On Wed, May 24, 2023 at 5:34 AM Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com> wrote:
>>
>> Good day Tony
>>
>> We reviewed the patch and have nothing against.
>
> Thank you for reviewing!
>
> Can I take this as the equivalent of "Signed-off-by: Loktionov,
> Aleksandr <aleksandr.loktionov@intel.com>"?

Unless a tag is explicitly given, I don't believe one can be inferred.

> Or since Tony is listed in MAINTAINERS for drivers/net/ethernet/intel,
> is he supposed to provide that?

Assuming there's no comments/issues brought up, I'll apply it to the
respective Intel Wired LAN tree for our validation to have a pass at it.
Upon successful completion, I'll send the patch on to netdev for them to
pull. Hope that helps.

Thanks,
Tony
* Re: [PATCH] igb: Fix igb_down hung on surprise removal
From: Grant Grundler @ 2023-05-24 22:34 UTC
To: Tony Nguyen
Cc: Grant Grundler, Loktionov, Aleksandr, Pavan Chebbi, Neftin, Sasha, Ruinskiy, Dima, Ying Hsu, netdev, David S. Miller, Eric Dumazet, Jakub Kicinski, Brandeburg, Jesse, Paolo Abeni, intel-wired-lan, linux-kernel

On Wed, May 24, 2023 at 3:22 PM Tony Nguyen <anthony.l.nguyen@intel.com> wrote:
>
> Hi Grant,
>
> On 5/24/2023 2:01 PM, Grant Grundler wrote:
> > On Wed, May 24, 2023 at 5:34 AM Loktionov, Aleksandr
> > <aleksandr.loktionov@intel.com> wrote:
> >>
> >> Good day Tony
> >>
> >> We reviewed the patch and have nothing against.
> >
> > Thank you for reviewing!
> >
> > Can I take this as the equivalent of "Signed-off-by: Loktionov,
> > Aleksandr <aleksandr.loktionov@intel.com>"?
>
> Unless a tag is explicitly given, I don't believe one can be inferred.

Yes - that's what I thought and was asking in case that's what
Aleksandr meant (and could easily confirm)

>
> > Or since Tony is listed in MAINTAINERS for drivers/net/ethernet/intel,
> > is he supposed to provide that?
>
> Assuming there's no comments/issues brought up, I'll apply it to the
> respective Intel Wired LAN tree for our validation to have a pass at it.
> Upon successful completion, I'll send the patch on to netdev for them to
> pull. Hope that helps.

Yes - that's what I needed to know. Thank you Tony! :)

Ying Hsu will apply this patch to Chrome OS kernels:
https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/4548800

cheers,
grant

>
> Thanks,
> Tony
* RE: [Intel-wired-lan] [PATCH] igb: Fix igb_down hung on surprise removal
From: Pucha, HimasekharX Reddy @ 2023-06-05 4:47 UTC
To: Ying Hsu, netdev
Cc: grundler, intel-wired-lan, Brandeburg, Jesse, linux-kernel, Eric Dumazet, Nguyen, Anthony L, Jakub Kicinski, Paolo Abeni, David S. Miller

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Ying Hsu
> Sent: Thursday, May 18, 2023 12:57 PM
> To: netdev@vger.kernel.org
> Cc: grundler@chromium.org; intel-wired-lan@lists.osuosl.org; Ying Hsu <yinghsu@chromium.org>; Brandeburg, Jesse <jesse.brandeburg@intel.com>; linux-kernel@vger.kernel.org; Eric Dumazet <edumazet@google.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; David S. Miller <davem@davemloft.net>
> Subject: [Intel-wired-lan] [PATCH] igb: Fix igb_down hung on surprise removal
>
> In a setup where a Thunderbolt hub connects to Ethernet and a display through USB Type-C, users may experience a hung task timeout when they remove the cable between the PC and the Thunderbolt hub.
> This is because the igb_down function is called multiple times when the Thunderbolt hub is unplugged. For example, the igb_io_error_detected triggers the first call, and the igb_remove triggers the second call.
> The second call to igb_down will block at napi_synchronize.
> Here's the call trace:
>     __schedule+0x3b0/0xddb
>     ? __mod_timer+0x164/0x5d3
>     schedule+0x44/0xa8
>     schedule_timeout+0xb2/0x2a4
>     ? run_local_timers+0x4e/0x4e
>     msleep+0x31/0x38
>     igb_down+0x12c/0x22a [igb 6615058754948bfde0bf01429257eb59f13030d4]
>     __igb_close+0x6f/0x9c [igb 6615058754948bfde0bf01429257eb59f13030d4]
>     igb_close+0x23/0x2b [igb 6615058754948bfde0bf01429257eb59f13030d4]
>     __dev_close_many+0x95/0xec
>     dev_close_many+0x6e/0x103
>     unregister_netdevice_many+0x105/0x5b1
>     unregister_netdevice_queue+0xc2/0x10d
>     unregister_netdev+0x1c/0x23
>     igb_remove+0xa7/0x11c [igb 6615058754948bfde0bf01429257eb59f13030d4]
>     pci_device_remove+0x3f/0x9c
>     device_release_driver_internal+0xfe/0x1b4
>     pci_stop_bus_device+0x5b/0x7f
>     pci_stop_bus_device+0x30/0x7f
>     pci_stop_bus_device+0x30/0x7f
>     pci_stop_and_remove_bus_device+0x12/0x19
>     pciehp_unconfigure_device+0x76/0xe9
>     pciehp_disable_slot+0x6e/0x131
>     pciehp_handle_presence_or_link_change+0x7a/0x3f7
>     pciehp_ist+0xbe/0x194
>     irq_thread_fn+0x22/0x4d
>     ? irq_thread+0x1fd/0x1fd
>     irq_thread+0x17b/0x1fd
>     ? irq_forced_thread_fn+0x5f/0x5f
>     kthread+0x142/0x153
>     ? __irq_get_irqchip_state+0x46/0x46
>     ? kthread_associate_blkcg+0x71/0x71
>     ret_from_fork+0x1f/0x30
>
> In this case, igb_io_error_detected detaches the network interface and requests a PCIE slot reset, however, the PCIE reset callback is not being invoked and thus the Ethernet connection breaks down.
> As the PCIE error in this case is a non-fatal one, requesting a slot reset can be avoided.
> This patch fixes the task hung issue and preserves Ethernet connection by ignoring non-fatal PCIE errors.
>
> Signed-off-by: Ying Hsu <yinghsu@chromium.org>
> ---
> This commit has been tested on a HP Elite Dragonfly Chromebook and a Caldigit TS3+ Thunderbolt hub. The Ethernet driver for the hub is igb. Non-fatal PCIE errors happen when users hot-plug the cables connected to the chromebook or to the external display.
>
>  drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++
>  1 file changed, 5 insertions(+)
>

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)