netdev.vger.kernel.org archive mirror
* [PATCH] net/tg3: Release IRQs on permanent error
@ 2015-04-24  5:22 Gavin Shan
  2015-04-24 21:59 ` Prashant Sreedharan
  2015-04-25 18:42 ` David Miller
  0 siblings, 2 replies; 3+ messages in thread
From: Gavin Shan @ 2015-04-24  5:22 UTC
  To: netdev; +Cc: davem, prashant, mchan, Gavin Shan

On a permanent EEH error, the PCI device is removed from the system.
In that case we shouldn't set pcierr_recovery to true, because doing
so prevents the driver from releasing the allocated interrupts and
their handlers. As a result, MSI or MSI-X can't be disabled, because
the MSI or MSI-X interrupts still have associated interrupt actions,
which leads to the following stack dump.

Oops: Exception in kernel mode, sig: 5 [#1]
        :
[c0000000003b76a8] .free_msi_irqs+0x80/0x1a0 (unreliable)
[c00000000039f388] .pci_remove_bus_device+0x98/0x110
[c0000000000790f4] .pcibios_remove_pci_devices+0x9c/0x128
[c000000000077b98] .handle_eeh_events+0x2d8/0x4b0
[c0000000000782d0] .eeh_event_handler+0x130/0x1c0
[c000000000022bd4] .kernel_thread+0x54/0x70

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
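For readers unfamiliar with the flag, the following is a minimal userspace
sketch of the logic described above, not tg3 code. It assumes that the
driver's close path returns early while pcierr_recovery is set (inferred
from the description, not quoted from the driver), and every model_* name
and enum value is a hypothetical stand-in for illustration only.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PCI channel states; not kernel code. */
enum model_channel_state {
	MODEL_IO_NORMAL,
	MODEL_IO_FROZEN,       /* recoverable: a reset is expected to follow */
	MODEL_IO_PERM_FAILURE, /* permanent: the device will be removed */
};

/* Hypothetical, minimal model of the driver state involved. */
struct model_dev {
	bool pcierr_recovery; /* true while error recovery is expected */
	bool irqs_allocated;  /* true while IRQ handlers are registered */
};

/* Models the patched error-detected path: only a frozen channel
 * marks the device as being in recovery. */
static void model_error_detected(struct model_dev *d,
				 enum model_channel_state state)
{
	if (state == MODEL_IO_FROZEN)
		d->pcierr_recovery = true;
}

/* Models a close path that refuses to tear down during recovery
 * (an assumption). Returns 0 on success, -1 if blocked. */
static int model_close(struct model_dev *d)
{
	if (d->pcierr_recovery)
		return -1;
	d->irqs_allocated = false; /* IRQs and handlers released */
	return 0;
}

/* Models device removal: it is safe only if no IRQ actions remain
 * after the close attempt. */
static bool model_remove_is_safe(struct model_dev *d)
{
	(void)model_close(d);
	return !d->irqs_allocated;
}
```

In this model, a device that sees MODEL_IO_PERM_FAILURE keeps
pcierr_recovery false, so model_close() releases the IRQs and removal is
safe. Setting the flag unconditionally would leave irqs_allocated true at
removal time, which corresponds to the situation behind the stack dump.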
 drivers/net/ethernet/broadcom/tg3.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 1270b18..069952f 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -18129,7 +18129,9 @@ static pci_ers_result_t tg3_io_error_detected(struct pci_dev *pdev,
 
 	rtnl_lock();
 
-	tp->pcierr_recovery = true;
+	/* We needn't recover from permanent error */
+	if (state == pci_channel_io_frozen)
+		tp->pcierr_recovery = true;
 
 	/* We probably don't have netdev yet */
 	if (!netdev || !netif_running(netdev))
-- 
2.1.0

* Re: [PATCH] net/tg3: Release IRQs on permanent error
  2015-04-24  5:22 [PATCH] net/tg3: Release IRQs on permanent error Gavin Shan
@ 2015-04-24 21:59 ` Prashant Sreedharan
  2015-04-25 18:42 ` David Miller
  1 sibling, 0 replies; 3+ messages in thread
From: Prashant Sreedharan @ 2015-04-24 21:59 UTC
  To: Gavin Shan; +Cc: netdev, davem, mchan

On Fri, 2015-04-24 at 15:22 +1000, Gavin Shan wrote:
> On a permanent EEH error, the PCI device is removed from the system.
> In that case we shouldn't set pcierr_recovery to true, because doing
> so prevents the driver from releasing the allocated interrupts and
> their handlers. As a result, MSI or MSI-X can't be disabled, because
> the MSI or MSI-X interrupts still have associated interrupt actions,
> which leads to the following stack dump.
> 
> Oops: Exception in kernel mode, sig: 5 [#1]
>         :
> [c0000000003b76a8] .free_msi_irqs+0x80/0x1a0 (unreliable)
> [c00000000039f388] .pci_remove_bus_device+0x98/0x110
> [c0000000000790f4] .pcibios_remove_pci_devices+0x9c/0x128
> [c000000000077b98] .handle_eeh_events+0x2d8/0x4b0
> [c0000000000782d0] .eeh_event_handler+0x130/0x1c0
> [c000000000022bd4] .kernel_thread+0x54/0x70
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
>  drivers/net/ethernet/broadcom/tg3.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
> index 1270b18..069952f 100644
> --- a/drivers/net/ethernet/broadcom/tg3.c
> +++ b/drivers/net/ethernet/broadcom/tg3.c
> @@ -18129,7 +18129,9 @@ static pci_ers_result_t tg3_io_error_detected(struct pci_dev *pdev,
>  
>  	rtnl_lock();
>  
> -	tp->pcierr_recovery = true;
> +	/* We needn't recover from permanent error */
> +	if (state == pci_channel_io_frozen)
> +		tp->pcierr_recovery = true;
>  
>  	/* We probably don't have netdev yet */
>  	if (!netdev || !netif_running(netdev))

Acked-by: Prashant Sreedharan <prashant@broadcom.com>

* Re: [PATCH] net/tg3: Release IRQs on permanent error
  2015-04-24  5:22 [PATCH] net/tg3: Release IRQs on permanent error Gavin Shan
  2015-04-24 21:59 ` Prashant Sreedharan
@ 2015-04-25 18:42 ` David Miller
  1 sibling, 0 replies; 3+ messages in thread
From: David Miller @ 2015-04-25 18:42 UTC
  To: gwshan; +Cc: netdev, prashant, mchan

From: Gavin Shan <gwshan@linux.vnet.ibm.com>
Date: Fri, 24 Apr 2015 15:22:23 +1000

> On a permanent EEH error, the PCI device is removed from the system.
> In that case we shouldn't set pcierr_recovery to true, because doing
> so prevents the driver from releasing the allocated interrupts and
> their handlers. As a result, MSI or MSI-X can't be disabled, because
> the MSI or MSI-X interrupts still have associated interrupt actions,
> which leads to the following stack dump.
> 
> Oops: Exception in kernel mode, sig: 5 [#1]
>         :
> [c0000000003b76a8] .free_msi_irqs+0x80/0x1a0 (unreliable)
> [c00000000039f388] .pci_remove_bus_device+0x98/0x110
> [c0000000000790f4] .pcibios_remove_pci_devices+0x9c/0x128
> [c000000000077b98] .handle_eeh_events+0x2d8/0x4b0
> [c0000000000782d0] .eeh_event_handler+0x130/0x1c0
> [c000000000022bd4] .kernel_thread+0x54/0x70
> 
> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>

Applied.
