* [PATCH 0/4][RFC] Disable e1000e power management if hardware error is detected
From: Chen Yu @ 2020-11-11  5:50 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Neftin, Sasha, Len Brown, Rafael J. Wysocki, Brandt, Todd E,
	Zhang Rui, Tony Nguyen, Jesse Brandeburg, linux-kernel, Chen Yu

This is a trial patchset that aims to cope with an intermittently
triggered hardware error during system resume.

On some platforms a NIC hardware error is detected during
resume from S3, which leaves the NIC only partially initialized
and in an unstable state afterwards. As a consequence, the
system fails to suspend because of the incorrect NIC status.

In theory, if the NIC cannot be initialized after resume,
it should not take part in system/runtime suspend/resume afterwards.
There are two proposals to deal with this situation:

Either:
1. Each time before the NIC suspends, check the status of the
   NIC by querying the corresponding registers, and bypass the
   suspend callback on this NIC if it is unstable.

Or:
2. During NIC resume, if a hardware error is detected, remove
   the NIC from the power management list entirely.

Proposal 2 was chosen in this patch set because:
1. Proposal 1 requires the e1000e driver to query the status
   of the NIC. However, there seems to be no specific register
   that reports the result of NIC initialization on e1000e.
2. Proposal 1 only bypasses the suspend process while the power
   management framework is still aware of this NIC, which might
   introduce race conditions.
3. Proposal 2 is a clean, platform-independent solution: not
   only e1000e but also other drivers could leverage this
   generic mechanism in the future.
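
To make the intent of proposal 2 concrete, here is a minimal user-space
sketch of the idea. It is only a model under stated assumptions: the
struct and the model_* functions are hypothetical and are not kernel
APIs, and the in_pm_list flag merely stands in for membership in the
PM core's device list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_dev {
	const char *name;
	bool hw_ok;      /* false once a hardware error is seen */
	bool in_pm_list; /* true while the PM core tracks the device */
};

/* Models the resume path: a failed reset drops the device from PM. */
static void model_resume(struct model_dev *dev, int reset_err)
{
	assert(dev != NULL);
	if (reset_err) {
		dev->hw_ok = false;
		dev->in_pm_list = false; /* stands in for device_pm_remove() */
	}
}

/* Models system suspend: only devices in the PM list can veto it. */
static int model_suspend_all(struct model_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!devs[i].in_pm_list)
			continue;  /* a removed device is never consulted */
		if (!devs[i].hw_ok)
			return -1; /* a tracked, broken device blocks suspend */
	}
	return 0;
}
```

In this model a broken device that is still tracked blocks suspend,
while the same device no longer does once the resume path has removed
it from the list.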

Comments appreciated.

Chen Yu (4):
  e1000e: save the return value of e1000e_reset()
  PM: sleep: export device_pm_remove() for driver use
  e1000e: Introduce workqueue to disable the power management
  e1000e: Disable the power management if hardware error detected during
    resume

 drivers/base/power/main.c                  |  1 +
 drivers/base/power/power.h                 |  8 -------
 drivers/net/ethernet/intel/e1000e/e1000.h  |  1 +
 drivers/net/ethernet/intel/e1000e/netdev.c | 27 ++++++++++++++++++----
 include/linux/pm.h                         | 12 ++++++++++
 5 files changed, 37 insertions(+), 12 deletions(-)

-- 
2.17.1


* [PATCH 1/4][RFC] e1000e: save the return value of e1000e_reset()
From: Chen Yu @ 2020-11-11  5:51 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Neftin, Sasha, Len Brown, Rafael J. Wysocki, Brandt, Todd E,
	Zhang Rui, Tony Nguyen, Jesse Brandeburg, linux-kernel, Chen Yu

Sometimes e1000e_reset() might fail during resume from S3 due to
hardware/firmware issues. The result of the reset can be used by
the caller to verify whether the NIC initialized successfully.

Introduce a static function _e1000e_reset(), derived from
e1000e_reset(), that returns the result of the reset.
e1000e_reset() becomes a thin wrapper that discards the result.

No functional change expected.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index b30f00891c03..f7c08426c0d7 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3998,7 +3998,7 @@ static void e1000e_systim_reset(struct e1000_adapter *adapter)
  * set/changed during runtime. After reset the device needs to be
  * properly configured for Rx, Tx etc.
  */
-void e1000e_reset(struct e1000_adapter *adapter)
+static int _e1000e_reset(struct e1000_adapter *adapter)
 {
 	struct e1000_mac_info *mac = &adapter->hw.mac;
 	struct e1000_fc_info *fc = &adapter->hw.fc;
@@ -4191,14 +4191,14 @@ void e1000e_reset(struct e1000_adapter *adapter)
 		default:
 			dev_err(&adapter->pdev->dev,
 				"Invalid PHY type setting EEE advertisement\n");
-			return;
+			return -EINVAL;
 		}
 
 		ret_val = hw->phy.ops.acquire(hw);
 		if (ret_val) {
 			dev_err(&adapter->pdev->dev,
 				"EEE advertisement - unable to acquire PHY\n");
-			return;
+			return -EBUSY;
 		}
 
 		e1000_write_emi_reg_locked(hw, adv_addr,
@@ -4239,6 +4239,12 @@ void e1000e_reset(struct e1000_adapter *adapter)
 		ew32(FEXTNVM9, reg);
 	}
 
+	return 0;
+}
+
+void e1000e_reset(struct e1000_adapter *adapter)
+{
+	_e1000e_reset(adapter);
 }
 
 /**
-- 
2.17.1


* [PATCH 2/4][RFC] PM: sleep: export device_pm_remove() for driver use
From: Chen Yu @ 2020-11-11  5:51 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Neftin, Sasha, Len Brown, Rafael J. Wysocki, Brandt, Todd E,
	Zhang Rui, Tony Nguyen, Jesse Brandeburg, linux-kernel, Chen Yu

Export device_pm_remove() and move its declaration into the
generic power management header so that drivers can use this
interface to disable power management on a device.

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 drivers/base/power/main.c  |  1 +
 drivers/base/power/power.h |  8 --------
 include/linux/pm.h         | 12 ++++++++++++
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index c7ac49042cee..4693da9d7d80 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -164,6 +164,7 @@ void device_pm_remove(struct device *dev)
 	pm_runtime_remove(dev);
 	device_pm_check_callbacks(dev);
 }
+EXPORT_SYMBOL_GPL(device_pm_remove);
 
 /**
  * device_pm_move_before - Move device in the PM core's list of active devices.
diff --git a/drivers/base/power/power.h b/drivers/base/power/power.h
index 54292cdd7808..8c2e45f3e5a9 100644
--- a/drivers/base/power/power.h
+++ b/drivers/base/power/power.h
@@ -20,7 +20,6 @@ static inline void pm_runtime_early_init(struct device *dev)
 
 extern void pm_runtime_init(struct device *dev);
 extern void pm_runtime_reinit(struct device *dev);
-extern void pm_runtime_remove(struct device *dev);
 extern u64 pm_runtime_active_time(struct device *dev);
 
 #define WAKE_IRQ_DEDICATED_ALLOCATED	BIT(0)
@@ -85,7 +84,6 @@ static inline void pm_runtime_early_init(struct device *dev)
 
 static inline void pm_runtime_init(struct device *dev) {}
 static inline void pm_runtime_reinit(struct device *dev) {}
-static inline void pm_runtime_remove(struct device *dev) {}
 
 static inline int dpm_sysfs_add(struct device *dev) { return 0; }
 static inline void dpm_sysfs_remove(struct device *dev) {}
@@ -109,7 +107,6 @@ static inline struct device *to_device(struct list_head *entry)
 
 extern void device_pm_sleep_init(struct device *dev);
 extern void device_pm_add(struct device *);
-extern void device_pm_remove(struct device *);
 extern void device_pm_move_before(struct device *, struct device *);
 extern void device_pm_move_after(struct device *, struct device *);
 extern void device_pm_move_last(struct device *);
@@ -133,11 +130,6 @@ static inline void device_pm_sleep_init(struct device *dev) {}
 
 static inline void device_pm_add(struct device *dev) {}
 
-static inline void device_pm_remove(struct device *dev)
-{
-	pm_runtime_remove(dev);
-}
-
 static inline void device_pm_move_before(struct device *deva,
 					 struct device *devb) {}
 static inline void device_pm_move_after(struct device *deva,
diff --git a/include/linux/pm.h b/include/linux/pm.h
index 47aca6bac1d6..f9ceca6ac7ff 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -701,6 +701,11 @@ struct dev_pm_domain {
  * be able to use wakeup events to exit from runtime low-power states,
  * or from system low-power states such as standby or suspend-to-RAM.
  */
+#ifdef CONFIG_PM
+extern void pm_runtime_remove(struct device *dev);
+#else
+static inline void pm_runtime_remove(struct device *dev) {}
+#endif
 
 #ifdef CONFIG_PM_SLEEP
 extern void device_pm_lock(void);
@@ -753,6 +758,8 @@ extern void pm_generic_complete(struct device *dev);
 extern bool dev_pm_skip_resume(struct device *dev);
 extern bool dev_pm_skip_suspend(struct device *dev);
 
+extern void device_pm_remove(struct device *dev);
+
 #else /* !CONFIG_PM_SLEEP */
 
 #define device_pm_lock() do {} while (0)
@@ -774,6 +781,11 @@ static inline void dpm_for_each_dev(void *data, void (*fn)(struct device *, void
 {
 }
 
+static inline void device_pm_remove(struct device *dev)
+{
+	pm_runtime_remove(dev);
+}
+
 #define pm_generic_prepare		NULL
 #define pm_generic_suspend_late		NULL
 #define pm_generic_suspend_noirq	NULL
-- 
2.17.1


* [PATCH 3/4][RFC] e1000e: Introduce workqueue to disable the power management
From: Chen Yu @ 2020-11-11  5:51 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Neftin, Sasha, Len Brown, Rafael J. Wysocki, Brandt, Todd E,
	Zhang Rui, Tony Nguyen, Jesse Brandeburg, linux-kernel, Chen Yu

Introduce a work item that disables power management for this device.
It is meant to be scheduled when an e1000e hardware error is detected
during resume from S3.
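
The handler recovers the adapter from the embedded work item with
container_of(). A user-space sketch of that lookup, assuming
hypothetical my_* types and a local macro that imitates the kernel's
container_of(); setting pm_removed stands in for the call to
device_pm_remove():

```c
#include <assert.h>
#include <stddef.h>

/* User-space imitation of the kernel's container_of() macro. */
#define my_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, simplified work item: just a callback pointer. */
struct my_work {
	void (*func)(struct my_work *work);
};

/* Hypothetical adapter with the work item embedded in it. */
struct my_adapter {
	int id;
	int pm_removed;
	struct my_work pm_remove_task;
};

/* Handler: walk back from the work item to the enclosing adapter. */
static void my_pm_remove_task(struct my_work *work)
{
	struct my_adapter *adapter;

	assert(work != NULL);
	adapter = my_container_of(work, struct my_adapter, pm_remove_task);
	adapter->pm_removed = 1; /* stands in for device_pm_remove() */
}

/* Stand-in for the worker thread that eventually runs the item. */
static void my_run_work(struct my_work *work)
{
	work->func(work);
}
```

Because the work item is a member of the adapter, no extra pointer has
to be stored to find the adapter again from inside the handler.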

Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 drivers/net/ethernet/intel/e1000e/e1000.h  |  1 +
 drivers/net/ethernet/intel/e1000e/netdev.c | 12 ++++++++++++
 2 files changed, 13 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/e1000.h b/drivers/net/ethernet/intel/e1000e/e1000.h
index ba7a0f8f6937..f50e5716d609 100644
--- a/drivers/net/ethernet/intel/e1000e/e1000.h
+++ b/drivers/net/ethernet/intel/e1000e/e1000.h
@@ -309,6 +309,7 @@ struct e1000_adapter {
 	struct work_struct downshift_task;
 	struct work_struct update_phy_task;
 	struct work_struct print_hang_task;
+	struct work_struct pm_remove_task;
 
 	int phy_hang_count;
 
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index f7c08426c0d7..45e0b1901440 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -6030,6 +6030,16 @@ static void e1000_reset_task(struct work_struct *work)
 	e1000e_reinit_locked(adapter);
 }
 
+static void e1000_pm_remove_task(struct work_struct *work)
+{
+	struct e1000_adapter *adapter;
+	struct device *dev;
+
+	adapter = container_of(work, struct e1000_adapter, pm_remove_task);
+	dev = &adapter->pdev->dev;
+	device_pm_remove(dev);
+}
+
 /**
  * e1000_get_stats64 - Get System Network Statistics
  * @netdev: network interface device structure
@@ -7589,6 +7599,7 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	INIT_WORK(&adapter->downshift_task, e1000e_downshift_workaround);
 	INIT_WORK(&adapter->update_phy_task, e1000e_update_phy_task);
 	INIT_WORK(&adapter->print_hang_task, e1000_print_hw_hang);
+	INIT_WORK(&adapter->pm_remove_task, e1000_pm_remove_task);
 
 	/* Initialize link parameters. User can change them with ethtool */
 	adapter->hw.mac.autoneg = 1;
@@ -7731,6 +7742,7 @@ static void e1000_remove(struct pci_dev *pdev)
 	cancel_work_sync(&adapter->downshift_task);
 	cancel_work_sync(&adapter->update_phy_task);
 	cancel_work_sync(&adapter->print_hang_task);
+	cancel_work_sync(&adapter->pm_remove_task);
 
 	if (adapter->flags & FLAG_HAS_HW_TIMESTAMP) {
 		cancel_work_sync(&adapter->tx_hwtstamp_work);
-- 
2.17.1


* [PATCH 4/4][RFC] e1000e: Disable the power management if hardware error detected during resume
From: Chen Yu @ 2020-11-11  5:52 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: Neftin, Sasha, Len Brown, Rafael J. Wysocki, Brandt, Todd E,
	Zhang Rui, Tony Nguyen, Jesse Brandeburg, linux-kernel, Chen Yu

If a hardware error is detected during resume, the NIC might
be in an unstable state and block subsequent suspends.
A broken device is not expected to impact system-wide suspend, so
this patch disables power management support for this NIC. The
broken NIC is then no longer considered during suspend/resume and
thus cannot prevent the system from suspending or resuming.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205015
Reported-by: "Brandt, Todd E" <todd.e.brandt@intel.com>
Reported-by: Len Brown <len.brown@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 45e0b1901440..08bc544e879a 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -6959,7 +6959,8 @@ static int __e1000_resume(struct pci_dev *pdev)
 		ew32(WUS, ~0);
 	}
 
-	e1000e_reset(adapter);
+	if (_e1000e_reset(adapter))
+		schedule_work(&adapter->pm_remove_task);
 
 	e1000_init_manageability_pt(adapter);
 
-- 
2.17.1

