* [RESEND PATCH] intel-ethernet: warn when fatal read failure happens
@ 2019-05-23  3:22 ` Feng Tang
From: Feng Tang @ 2019-05-23  3:22 UTC
  To: Jeff Kirsher, Sasha Neftin, Aaron F Brown, intel-wired-lan, netdev
  Cc: Feng Tang

Failing to read a HW register is very serious for the igb/igc drivers,
as hw_addr will be set to NULL, causing the adapter to be seen as
"REMOVED".

We saw the error only a few times in the MTBF test for suspend/resume,
and could hardly get any useful info for debugging.

Add a WARN() so that we can get the necessary information about
where and how it happens, and use it to root-cause and fix
this "PCIe link lost" issue.

This affects igb, igc.

Signed-off-by: Feng Tang <feng.tang@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Acked-by: Sasha Neftin <sasha.neftin@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c | 1 +
 drivers/net/ethernet/intel/igc/igc_main.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 39f33afc479c..e5b7e638df28 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -753,6 +753,7 @@ u32 igb_rd32(struct e1000_hw *hw, u32 reg)
 		struct net_device *netdev = igb->netdev;
 		hw->hw_addr = NULL;
 		netdev_err(netdev, "PCIe link lost\n");
+		WARN(1, "igb: Failed to read reg 0x%x!\n", reg);
 	}
 
 	return value;
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 34fa0e60a780..28072b9aa932 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -3934,6 +3934,7 @@ u32 igc_rd32(struct igc_hw *hw, u32 reg)
 		hw->hw_addr = NULL;
 		netif_device_detach(netdev);
 		netdev_err(netdev, "PCIe link lost, device now detached\n");
+		WARN(1, "igc: Failed to read reg 0x%x!\n", reg);
 	}
 
 	return value;
-- 
2.14.1
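
For readers who want to see the failure path end to end, here is a minimal userspace sketch of the pattern the two hunks touch. It is not the driver code: `sketch_hw`, `sketch_rd32`, `SKETCH_WARN` and `SKETCH_NUM_REGS` are hypothetical names, a plain array stands in for the memory-mapped registers, and the "all-ones read, confirmed by re-reading register 0" check is an assumption, since the hunks above do not show the condition. The real WARN() also dumps a stack trace, while this stand-in only reports file, line and function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_NUM_REGS 4u

/* Hypothetical stand-in for the driver's hardware struct. */
struct sketch_hw {
	volatile uint32_t *hw_addr;	/* NULL once the device is treated as removed */
	unsigned int warn_count;	/* how many times the warning fired */
};

/* Userspace stand-in for WARN(): report where the failure was detected. */
#define SKETCH_WARN(hw, fmt, ...)					\
	do {								\
		(hw)->warn_count++;					\
		fprintf(stderr, "WARNING at %s:%d %s(): " fmt,		\
			__FILE__, __LINE__, __func__, __VA_ARGS__);	\
	} while (0)

static uint32_t sketch_rd32(struct sketch_hw *hw, uint32_t reg)
{
	volatile uint32_t *addr = hw->hw_addr;
	uint32_t value;

	assert(reg < SKETCH_NUM_REGS);

	/* Device already marked as removed: every read returns all ones. */
	if (addr == NULL)
		return UINT32_MAX;

	value = addr[reg];

	/* An all-ones read is suspicious; confirm by re-reading register 0. */
	if (value == UINT32_MAX && (reg == 0 || addr[0] == UINT32_MAX)) {
		hw->hw_addr = NULL;
		fprintf(stderr, "PCIe link lost\n");
		SKETCH_WARN(hw, "Failed to read reg 0x%x!\n", (unsigned int)reg);
	}

	return value;
}
```

In this sketch the warning fires only once per loss: after hw_addr is cleared, later reads return early with all ones and never reach the check again. That is why the extra report is useful for a rare suspend/resume failure, because it identifies the first read that detected the loss.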


* Re: [RESEND PATCH] intel-ethernet: warn when fatal read failure happens
  2019-05-23  3:22 ` [Intel-wired-lan] " Feng Tang
@ 2019-05-23 22:55   ` Jeff Kirsher
From: Jeff Kirsher @ 2019-05-23 22:55 UTC
  To: Feng Tang, Sasha Neftin, Aaron F Brown, intel-wired-lan, netdev

On Thu, 2019-05-23 at 11:22 +0800, Feng Tang wrote:
> Failing to read a HW register is very serious for the igb/igc drivers,
> as hw_addr will be set to NULL, causing the adapter to be seen as
> "REMOVED".
> 
> We saw the error only a few times in the MTBF test for suspend/resume,
> and could hardly get any useful info for debugging.
> 
> Add a WARN() so that we can get the necessary information about
> where and how it happens, and use it to root-cause and fix
> this "PCIe link lost" issue.
> 
> This affects igb, igc.
> 
> Signed-off-by: Feng Tang <feng.tang@intel.com>
> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
> Acked-by: Sasha Neftin <sasha.neftin@intel.com>
> ---
>  drivers/net/ethernet/intel/igb/igb_main.c | 1 +
>  drivers/net/ethernet/intel/igc/igc_main.c | 1 +
>  2 files changed, 2 insertions(+)

This patch is already in my next series of 1GbE patches to push to
Dave, so you can expect this to be pushed upstream before the weekend.


Thread overview: 4+ messages
2019-05-23  3:22 [RESEND PATCH] intel-ethernet: warn when fatal read failure happens Feng Tang
2019-05-23  3:22 ` [Intel-wired-lan] " Feng Tang
2019-05-23 22:55 ` Jeff Kirsher
2019-05-23 22:55   ` [Intel-wired-lan] " Jeff Kirsher
