All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work
@ 2013-02-04 11:55 Konstantin Khlebnikov
  2013-02-04 11:55 ` [PATCH v2 1/7] e1000e: fix pci-device enable-counter balance Konstantin Khlebnikov
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Konstantin Khlebnikov @ 2013-02-04 11:55 UTC (permalink / raw)
  To: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel

This patchset contains some fixes for e1000e diver (broken since v2.6.35)
and some related fixes and useful debug for PCI code.

All together this fixes my regression report for v3.8-rc1:
https://lkml.org/lkml/2013/1/1/25

patchset was seriously reworked since v1:
http://lkml.org/lkml/2013/1/18/147

---

Konstantin Khlebnikov (6):
      e1000e: fix pci-device enable-counter balance
      PCI: don't touch enable_cnt in pci_device_shutdown()
      PCI: catch enable-counter underflows
      e1000e: fix runtime power management transitions
      PCI/PM: warn about incomplete actions in ->runtime_suspend() callback
      e1000e: fix accessing to suspended device

Rafael J. Wysocki (1):
      PCI/PM: clear state_saved during suspend


 drivers/net/ethernet/intel/e1000e/ethtool.c |   13 ++++
 drivers/net/ethernet/intel/e1000e/netdev.c  |   82 +++++++--------------------
 drivers/pci/pci-driver.c                    |   21 +++++--
 drivers/pci/pci.c                           |    3 +
 4 files changed, 53 insertions(+), 66 deletions(-)

-- 
Signature

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v2 1/7] e1000e: fix pci-device enable-counter balance
  2013-02-04 11:55 [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Konstantin Khlebnikov
@ 2013-02-04 11:55 ` Konstantin Khlebnikov
  2013-02-04 11:55 ` [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown() Konstantin Khlebnikov
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Konstantin Khlebnikov @ 2013-02-04 11:55 UTC (permalink / raw)
  To: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel
  Cc: Bruce Allan, Jeff Kirsher

This patch removes redundant and unbalanced pci_disable_device() from
__e1000_shutdown(). pci_clear_master() is enough, device can go into
suspended state with elevated enable_cnt.

Bug was introduced in commit 23606cf5d1192c2b17912cb2ef6e62f9b11de133
("e1000e / PCI / PM: Add basic runtime PM support (rev. 4)") in v2.6.35

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index fbf75fd..49e944f 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5503,7 +5503,7 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake,
 	 */
 	e1000e_release_hw_control(adapter);
 
-	pci_disable_device(pdev);
+	pci_clear_master(pdev);
 
 	return 0;
 }


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
  2013-02-04 11:55 [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Konstantin Khlebnikov
  2013-02-04 11:55 ` [PATCH v2 1/7] e1000e: fix pci-device enable-counter balance Konstantin Khlebnikov
@ 2013-02-04 11:55 ` Konstantin Khlebnikov
  2013-02-04 22:20   ` Khalid Aziz
  2013-02-04 11:56 ` [PATCH v2 3/7] PCI: catch enable-counter underflows Konstantin Khlebnikov
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Konstantin Khlebnikov @ 2013-02-04 11:55 UTC (permalink / raw)
  To: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel
  Cc: Bjorn Helgaas, Khalid Aziz, Andi Kleen, Alan Cox, Matthew Garrett

comment in commit b566a22c23327f18ce941ffad0ca907e50a53d41
("PCI: disable Bus Master on PCI device shutdown") says:
| Disable Bus Master bit on the device in pci_device_shutdown() to ensure PCI
| devices do not continue to DMA data after shutdown.  This can cause memory
| corruption in case of a kexec where the current kernel shuts down and
| transfers control to a new kernel while a PCI device continues to DMA to
| memory that does not belong to it any more in the new kernel.

Seems like pci_clear_master() must be used here instead of pci_disable_device(),
because it disables Bus Muster unconditionally and doesn't changes enable_cnt.

Matthew Garrett and Alan Cox said (see LKML link below) that clearing bus-master
for all PCI devices may lead to unpredictable consequences, some devices ignores
this bit and continues DMA, some of them hang after that or crash whole system.
Probably we should leave here only warning and disable bus-mastering for each
driver individually in ->shutdown() callback.

Link: https://lkml.org/lkml/2012/6/6/278
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Khalid Aziz <khalid.aziz@hp.com>
Cc: linux-pci@vger.kernel.org
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
---
 drivers/pci/pci-driver.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index f79cbcd..dc5bdce 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -392,7 +392,7 @@ static void pci_device_shutdown(struct device *dev)
 	 * Turn off Bus Master bit on the device to tell it to not
 	 * continue to do DMA
 	 */
-	pci_disable_device(pci_dev);
+	pci_clear_master(pci_dev);
 }
 
 #ifdef CONFIG_PM


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 3/7] PCI: catch enable-counter underflows
  2013-02-04 11:55 [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Konstantin Khlebnikov
  2013-02-04 11:55 ` [PATCH v2 1/7] e1000e: fix pci-device enable-counter balance Konstantin Khlebnikov
  2013-02-04 11:55 ` [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown() Konstantin Khlebnikov
@ 2013-02-04 11:56 ` Konstantin Khlebnikov
  2013-02-04 11:56 ` [PATCH v2 4/7] PCI/PM: clear state_saved during suspend Konstantin Khlebnikov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Konstantin Khlebnikov @ 2013-02-04 11:56 UTC (permalink / raw)
  To: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel; +Cc: Bjorn Helgaas

This patch adds single WARN_ONCE() check for catching 'enable_cnt' imbalances.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/pci/pci.c |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 5cb5820..ff93f8f 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1401,6 +1401,9 @@ pci_disable_device(struct pci_dev *dev)
 	if (dr)
 		dr->enabled = 0;
 
+	dev_WARN_ONCE(&dev->dev, atomic_read(&dev->enable_cnt) <= 0,
+			"enable counter underflow");
+
 	if (atomic_sub_return(1, &dev->enable_cnt) != 0)
 		return;
 


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 4/7] PCI/PM: clear state_saved during suspend
  2013-02-04 11:55 [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Konstantin Khlebnikov
                   ` (2 preceding siblings ...)
  2013-02-04 11:56 ` [PATCH v2 3/7] PCI: catch enable-counter underflows Konstantin Khlebnikov
@ 2013-02-04 11:56 ` Konstantin Khlebnikov
  2013-02-04 11:56 ` [PATCH v2 5/7] e1000e: fix runtime power management transitions Konstantin Khlebnikov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Konstantin Khlebnikov @ 2013-02-04 11:56 UTC (permalink / raw)
  To: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel; +Cc: Bjorn Helgaas

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

This patch clears pci_dev->state_saved at the beginnig of suspending.
PCI config state may be saved long before that. Some divers calls
pci_save_state() from its ->probe() callback to get snapshot of sane
configuration space to use in the ->slot_reset() callback.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> # add comment
Cc: linux-pci@vger.kernel.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/pci/pci-driver.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index dc5bdce..f9aa311 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -628,6 +628,7 @@ static int pci_pm_suspend(struct device *dev)
 		goto Fixup;
 	}
 
+	pci_dev->state_saved = false;
 	if (pm->suspend) {
 		pci_power_t prev = pci_dev->current_state;
 		int error;
@@ -774,6 +775,7 @@ static int pci_pm_freeze(struct device *dev)
 		return 0;
 	}
 
+	pci_dev->state_saved = false;
 	if (pm->freeze) {
 		int error;
 
@@ -862,6 +864,7 @@ static int pci_pm_poweroff(struct device *dev)
 		goto Fixup;
 	}
 
+	pci_dev->state_saved = false;
 	if (pm->poweroff) {
 		int error;
 
@@ -987,6 +990,7 @@ static int pci_pm_runtime_suspend(struct device *dev)
 	if (!pm || !pm->runtime_suspend)
 		return -ENOSYS;
 
+	pci_dev->state_saved = false;
 	pci_dev->no_d3cold = false;
 	error = pm->runtime_suspend(dev);
 	suspend_report_result(pm->runtime_suspend, error);


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 5/7] e1000e: fix runtime power management transitions
  2013-02-04 11:55 [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Konstantin Khlebnikov
                   ` (3 preceding siblings ...)
  2013-02-04 11:56 ` [PATCH v2 4/7] PCI/PM: clear state_saved during suspend Konstantin Khlebnikov
@ 2013-02-04 11:56 ` Konstantin Khlebnikov
  2013-02-04 11:56 ` [PATCH v2 6/7] PCI/PM: warn about incomplete actions in ->runtime_suspend() callback Konstantin Khlebnikov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Konstantin Khlebnikov @ 2013-02-04 11:56 UTC (permalink / raw)
  To: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel
  Cc: Bruce Allan, Jeff Kirsher

This patch removes redundant actions from driver and fixes its interaction
with actions in pci-bus runtime power management code.

It removes pci_save_state() from __e1000_shutdown() for normal adapters,
PCI bus callbacks pci_pm_*() will do all this for us. Now __e1000_shutdown()
switches to D3-state only quad-port adapters, because they needs quirk for
clearing false-positive error from downsteam pci-e port.

pci_save_state() now called after clearing bus-master bit, thus __e1000_resume()
and e1000_io_slot_reset() must set it back after restoring configuration space.

This patch set get_link_status before calling pm_runtime_put() in e1000_open()
to allow e1000_idle() get real link status and schedule first runtime suspend.

This patch also enables wakeup for device if management mode is enabled
(like for WoL) as result pci_prepare_to_sleep() would setup wakeup without
special actions like custom 'enable_wakeup' sign.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
---
 drivers/net/ethernet/intel/e1000e/netdev.c |   78 ++++++----------------------
 1 file changed, 18 insertions(+), 60 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 49e944f..708d609 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -3916,6 +3916,7 @@ static int e1000_open(struct net_device *netdev)
 	netif_start_queue(netdev);
 
 	adapter->idle_check = true;
+	hw->mac.get_link_status = true;
 	pm_runtime_put(&pdev->dev);
 
 	/* fire a link status change interrupt to start the watchdog */
@@ -5404,8 +5405,7 @@ release:
 	return retval;
 }
 
-static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake,
-			    bool runtime)
+static int __e1000_shutdown(struct pci_dev *pdev, bool runtime)
 {
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct e1000_adapter *adapter = netdev_priv(netdev);
@@ -5429,10 +5429,6 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake,
 	}
 	e1000e_reset_interrupt_capability(adapter);
 
-	retval = pci_save_state(pdev);
-	if (retval)
-		return retval;
-
 	status = er32(STATUS);
 	if (status & E1000_STATUS_LU)
 		wufc &= ~E1000_WUFC_LNKC;
@@ -5488,13 +5484,6 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake,
 		ew32(WUFC, 0);
 	}
 
-	*enable_wake = !!wufc;
-
-	/* make sure adapter isn't asleep if manageability is enabled */
-	if ((adapter->flags & FLAG_MNG_PT_ENABLED) ||
-	    (hw->mac.ops.check_mng_mode(hw)))
-		*enable_wake = true;
-
 	if (adapter->hw.phy.type == e1000_phy_igp_3)
 		e1000e_igp3_phy_powerdown_workaround_ich8lan(&adapter->hw);
 
@@ -5505,26 +5494,6 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake,
 
 	pci_clear_master(pdev);
 
-	return 0;
-}
-
-static void e1000_power_off(struct pci_dev *pdev, bool sleep, bool wake)
-{
-	if (sleep && wake) {
-		pci_prepare_to_sleep(pdev);
-		return;
-	}
-
-	pci_wake_from_d3(pdev, wake);
-	pci_set_power_state(pdev, PCI_D3hot);
-}
-
-static void e1000_complete_shutdown(struct pci_dev *pdev, bool sleep,
-                                    bool wake)
-{
-	struct net_device *netdev = pci_get_drvdata(pdev);
-	struct e1000_adapter *adapter = netdev_priv(netdev);
-
 	/* The pci-e switch on some quad port adapters will report a
 	 * correctable error when the MAC transitions from D0 to D3.  To
 	 * prevent this we need to mask off the correctable errors on the
@@ -5538,12 +5507,13 @@ static void e1000_complete_shutdown(struct pci_dev *pdev, bool sleep,
 		pcie_capability_write_word(us_dev, PCI_EXP_DEVCTL,
 					   (devctl & ~PCI_EXP_DEVCTL_CERE));
 
-		e1000_power_off(pdev, sleep, wake);
+		pci_save_state(pdev);
+		pci_prepare_to_sleep(pdev);
 
 		pcie_capability_write_word(us_dev, PCI_EXP_DEVCTL, devctl);
-	} else {
-		e1000_power_off(pdev, sleep, wake);
 	}
+
+	return 0;
 }
 
 #ifdef CONFIG_PCIEASPM
@@ -5594,9 +5564,7 @@ static int __e1000_resume(struct pci_dev *pdev)
 	if (aspm_disable_flag)
 		e1000e_disable_aspm(pdev, aspm_disable_flag);
 
-	pci_set_power_state(pdev, PCI_D0);
-	pci_restore_state(pdev);
-	pci_save_state(pdev);
+	pci_set_master(pdev);
 
 	e1000e_set_interrupt_capability(adapter);
 	if (netif_running(netdev)) {
@@ -5662,14 +5630,8 @@ static int __e1000_resume(struct pci_dev *pdev)
 static int e1000_suspend(struct device *dev)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
-	int retval;
-	bool wake;
-
-	retval = __e1000_shutdown(pdev, &wake, false);
-	if (!retval)
-		e1000_complete_shutdown(pdev, true, wake);
 
-	return retval;
+	return __e1000_shutdown(pdev, false);
 }
 
 static int e1000_resume(struct device *dev)
@@ -5692,13 +5654,10 @@ static int e1000_runtime_suspend(struct device *dev)
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 
-	if (e1000e_pm_ready(adapter)) {
-		bool wake;
-
-		__e1000_shutdown(pdev, &wake, true);
-	}
+	if (!e1000e_pm_ready(adapter))
+		return 0;
 
-	return 0;
+	return __e1000_shutdown(pdev, true);
 }
 
 static int e1000_idle(struct device *dev)
@@ -5736,12 +5695,7 @@ static int e1000_runtime_resume(struct device *dev)
 
 static void e1000_shutdown(struct pci_dev *pdev)
 {
-	bool wake = false;
-
-	__e1000_shutdown(pdev, &wake, false);
-
-	if (system_state == SYSTEM_POWER_OFF)
-		e1000_complete_shutdown(pdev, false, wake);
+	__e1000_shutdown(pdev, false);
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
@@ -5862,9 +5816,9 @@ static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev)
 			"Cannot re-enable PCI device after reset.\n");
 		result = PCI_ERS_RESULT_DISCONNECT;
 	} else {
-		pci_set_master(pdev);
 		pdev->state_saved = true;
 		pci_restore_state(pdev);
+		pci_set_master(pdev);
 
 		pci_enable_wake(pdev, PCI_D3hot, 0);
 		pci_enable_wake(pdev, PCI_D3cold, 0);
@@ -6295,7 +6249,11 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	/* initialize the wol settings based on the eeprom settings */
 	adapter->wol = adapter->eeprom_wol;
-	device_set_wakeup_enable(&adapter->pdev->dev, adapter->wol);
+
+	/* make sure adapter isn't asleep if manageability is enabled */
+	if (adapter->wol || (adapter->flags & FLAG_MNG_PT_ENABLED) ||
+	    (hw->mac.ops.check_mng_mode(hw)))
+		device_wakeup_enable(&pdev->dev);
 
 	/* save off EEPROM version number */
 	e1000_read_nvm(&adapter->hw, 5, 1, &adapter->eeprom_vers);


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 6/7] PCI/PM: warn about incomplete actions in ->runtime_suspend() callback
  2013-02-04 11:55 [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Konstantin Khlebnikov
                   ` (4 preceding siblings ...)
  2013-02-04 11:56 ` [PATCH v2 5/7] e1000e: fix runtime power management transitions Konstantin Khlebnikov
@ 2013-02-04 11:56 ` Konstantin Khlebnikov
  2013-02-04 20:22   ` Rafael J. Wysocki
  2013-02-04 11:56 ` [PATCH v2 7/7] e1000e: fix accessing to suspended device Konstantin Khlebnikov
  2013-02-04 20:23 ` [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Rafael J. Wysocki
  7 siblings, 1 reply; 19+ messages in thread
From: Konstantin Khlebnikov @ 2013-02-04 11:56 UTC (permalink / raw)
  To: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel; +Cc: Bjorn Helgaas

Documentation/power/pci.txt says:
| It is expected that the device driver's pm->runtime_suspend() callback will
| not attempt to prepare the device for signaling wakeup or to put it into a
| low-power state.  The driver ought to leave these tasks to the PCI subsystem
| that has all of the information necessary to perform them.

After commit 42eca2302146fed51335b95128e949ee6f54478f
("PCI: Don't touch card regs after runtime suspend D3")
| If the driver takes care of state saving, don't touch any registers on it.
pci_pm_runtime_suspend() thinks if state has been saved by ->runtime_suspend()
that means device alredy prepared for wakeup and probably no longer accessible.

Thus driver must either do all actions or leave all these tasks to PCI subsystem.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org
---
 drivers/pci/pci-driver.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index f9aa311..2b0ff9a 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1009,12 +1009,19 @@ static int pci_pm_runtime_suspend(struct device *dev)
 		return 0;
 	}
 
-	if (!pci_dev->state_saved) {
-		pci_save_state(pci_dev);
-		pci_finish_runtime_suspend(pci_dev);
+	if (pci_dev->state_saved) {
+		WARN_ONCE(pci_dev->current_state == prev,
+			"PCI PM: Power state of device not changed by %pF\n",
+			pm->runtime_suspend);
+		WARN_ONCE(pci_dev_run_wake(pci_dev) &&
+				!pci_dev->wakeup_prepared,
+			"PCI PM: Waking of device not configured by %pF\n",
+			pm->runtime_suspend);
+		return 0;
 	}
 
-	return 0;
+	pci_save_state(pci_dev);
+	return pci_finish_runtime_suspend(pci_dev);
 }
 
 static int pci_pm_runtime_resume(struct device *dev)


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 7/7] e1000e: fix accessing to suspended device
  2013-02-04 11:55 [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Konstantin Khlebnikov
                   ` (5 preceding siblings ...)
  2013-02-04 11:56 ` [PATCH v2 6/7] PCI/PM: warn about incomplete actions in ->runtime_suspend() callback Konstantin Khlebnikov
@ 2013-02-04 11:56 ` Konstantin Khlebnikov
  2013-02-04 20:23 ` [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Rafael J. Wysocki
  7 siblings, 0 replies; 19+ messages in thread
From: Konstantin Khlebnikov @ 2013-02-04 11:56 UTC (permalink / raw)
  To: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel
  Cc: Bruce Allan, Jeff Kirsher

This patch fixes some annoying messages like 'Error reading PHY register' and
'Hardware Erorr' and saves several seconds on reboot.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: e1000-devel@lists.sourceforge.net
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Bruce Allan <bruce.w.allan@intel.com>
---
 drivers/net/ethernet/intel/e1000e/ethtool.c |   13 +++++++++++++
 drivers/net/ethernet/intel/e1000e/netdev.c  |    2 ++
 2 files changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/ethtool.c b/drivers/net/ethernet/intel/e1000e/ethtool.c
index f95bc6e..90afe58 100644
--- a/drivers/net/ethernet/intel/e1000e/ethtool.c
+++ b/drivers/net/ethernet/intel/e1000e/ethtool.c
@@ -35,6 +35,7 @@
 #include <linux/slab.h>
 #include <linux/delay.h>
 #include <linux/vmalloc.h>
+#include <linux/pm_runtime.h>
 
 #include "e1000.h"
 
@@ -2051,7 +2052,19 @@ static int e1000_get_rxnfc(struct net_device *netdev,
 	}
 }
 
+static int e1000e_ethtool_begin(struct net_device *netdev)
+{
+	return pm_runtime_get_sync(netdev->dev.parent);
+}
+
+static void e1000e_ethtool_complete(struct net_device *netdev)
+{
+	pm_runtime_put_sync(netdev->dev.parent);
+}
+
 static const struct ethtool_ops e1000_ethtool_ops = {
+	.begin			= e1000e_ethtool_begin,
+	.complete		= e1000e_ethtool_complete,
 	.get_settings		= e1000_get_settings,
 	.set_settings		= e1000_set_settings,
 	.get_drvinfo		= e1000_get_drvinfo,
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 708d609..e501298 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -4267,6 +4267,7 @@ static void e1000_phy_read_status(struct e1000_adapter *adapter)
 	    (adapter->hw.phy.media_type == e1000_media_type_copper)) {
 		int ret_val;
 
+		pm_runtime_get_sync(&adapter->pdev->dev);
 		ret_val  = e1e_rphy(hw, PHY_CONTROL, &phy->bmcr);
 		ret_val |= e1e_rphy(hw, PHY_STATUS, &phy->bmsr);
 		ret_val |= e1e_rphy(hw, PHY_AUTONEG_ADV, &phy->advertise);
@@ -4277,6 +4278,7 @@ static void e1000_phy_read_status(struct e1000_adapter *adapter)
 		ret_val |= e1e_rphy(hw, PHY_EXT_STATUS, &phy->estatus);
 		if (ret_val)
 			e_warn("Error reading PHY register\n");
+		pm_runtime_put_sync(&adapter->pdev->dev);
 	} else {
 		/* Do not read PHY registers if link is not up
 		 * Set values to typical power-on defaults


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 6/7] PCI/PM: warn about incomplete actions in ->runtime_suspend() callback
  2013-02-04 11:56 ` [PATCH v2 6/7] PCI/PM: warn about incomplete actions in ->runtime_suspend() callback Konstantin Khlebnikov
@ 2013-02-04 20:22   ` Rafael J. Wysocki
  2013-02-04 20:57     ` Konstantin Khlebnikov
  0 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2013-02-04 20:22 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel, Bjorn Helgaas

On Monday, February 04, 2013 03:56:12 PM Konstantin Khlebnikov wrote:
> Documentation/power/pci.txt says:
> | It is expected that the device driver's pm->runtime_suspend() callback will
> | not attempt to prepare the device for signaling wakeup or to put it into a
> | low-power state.  The driver ought to leave these tasks to the PCI subsystem
> | that has all of the information necessary to perform them.
> 
> After commit 42eca2302146fed51335b95128e949ee6f54478f
> ("PCI: Don't touch card regs after runtime suspend D3")
> | If the driver takes care of state saving, don't touch any registers on it.
> pci_pm_runtime_suspend() thinks if state has been saved by ->runtime_suspend()
> that means device alredy prepared for wakeup and probably no longer accessible.
> 
> Thus driver must either do all actions or leave all these tasks to PCI subsystem.

I don't like this one, because it will generate noise for drivers that
legitimately use pci_save_state() in their runtime suspend callbacks
and know what they are doing.

Thanks,
Rafael


> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: linux-pci@vger.kernel.org
> ---
>  drivers/pci/pci-driver.c |   15 +++++++++++----
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index f9aa311..2b0ff9a 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -1009,12 +1009,19 @@ static int pci_pm_runtime_suspend(struct device *dev)
>  		return 0;
>  	}
>  
> -	if (!pci_dev->state_saved) {
> -		pci_save_state(pci_dev);
> -		pci_finish_runtime_suspend(pci_dev);
> +	if (pci_dev->state_saved) {
> +		WARN_ONCE(pci_dev->current_state == prev,
> +			"PCI PM: Power state of device not changed by %pF\n",
> +			pm->runtime_suspend);
> +		WARN_ONCE(pci_dev_run_wake(pci_dev) &&
> +				!pci_dev->wakeup_prepared,
> +			"PCI PM: Waking of device not configured by %pF\n",
> +			pm->runtime_suspend);
> +		return 0;
>  	}
>  
> -	return 0;
> +	pci_save_state(pci_dev);
> +	return pci_finish_runtime_suspend(pci_dev);
>  }
>  
>  static int pci_pm_runtime_resume(struct device *dev)
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work
  2013-02-04 11:55 [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Konstantin Khlebnikov
                   ` (6 preceding siblings ...)
  2013-02-04 11:56 ` [PATCH v2 7/7] e1000e: fix accessing to suspended device Konstantin Khlebnikov
@ 2013-02-04 20:23 ` Rafael J. Wysocki
  2013-02-12  0:34   ` Bjorn Helgaas
  7 siblings, 1 reply; 19+ messages in thread
From: Rafael J. Wysocki @ 2013-02-04 20:23 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel

On Monday, February 04, 2013 03:55:47 PM Konstantin Khlebnikov wrote:
> This patchset contains some fixes for e1000e diver (broken since v2.6.35)
> and some related fixes and useful debug for PCI code.
> 
> All together this fixes my regression report for v3.8-rc1:
> https://lkml.org/lkml/2013/1/1/25
> 
> patchset was seriously reworked since v1:
> http://lkml.org/lkml/2013/1/18/147
> 
> ---
> 
> Konstantin Khlebnikov (6):
>       e1000e: fix pci-device enable-counter balance
>       PCI: don't touch enable_cnt in pci_device_shutdown()
>       PCI: catch enable-counter underflows
>       e1000e: fix runtime power management transitions
>       PCI/PM: warn about incomplete actions in ->runtime_suspend() callback
>       e1000e: fix accessing to suspended device
> 
> Rafael J. Wysocki (1):
>       PCI/PM: clear state_saved during suspend
> 
> 
>  drivers/net/ethernet/intel/e1000e/ethtool.c |   13 ++++
>  drivers/net/ethernet/intel/e1000e/netdev.c  |   82 +++++++--------------------
>  drivers/pci/pci-driver.c                    |   21 +++++--
>  drivers/pci/pci.c                           |    3 +
>  4 files changed, 53 insertions(+), 66 deletions(-)

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

for all patches except for [6/7].

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 6/7] PCI/PM: warn about incomplete actions in ->runtime_suspend() callback
  2013-02-04 20:22   ` Rafael J. Wysocki
@ 2013-02-04 20:57     ` Konstantin Khlebnikov
  0 siblings, 0 replies; 19+ messages in thread
From: Konstantin Khlebnikov @ 2013-02-04 20:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel, Bjorn Helgaas

Rafael J. Wysocki wrote:
> On Monday, February 04, 2013 03:56:12 PM Konstantin Khlebnikov wrote:
>> Documentation/power/pci.txt says:
>> | It is expected that the device driver's pm->runtime_suspend() callback will
>> | not attempt to prepare the device for signaling wakeup or to put it into a
>> | low-power state.  The driver ought to leave these tasks to the PCI subsystem
>> | that has all of the information necessary to perform them.
>>
>> After commit 42eca2302146fed51335b95128e949ee6f54478f
>> ("PCI: Don't touch card regs after runtime suspend D3")
>> | If the driver takes care of state saving, don't touch any registers on it.
>> pci_pm_runtime_suspend() thinks if state has been saved by ->runtime_suspend()
>> that means device alredy prepared for wakeup and probably no longer accessible.
>>
>> Thus driver must either do all actions or leave all these tasks to PCI subsystem.
>
> I don't like this one, because it will generate noise for drivers that
> legitimately use pci_save_state() in their runtime suspend callbacks
> and know what they are doing.

Ok, you're right. Both these warnings can be false-positive.
For example some of wifi cards easily might be weird enough for this.

>
> Thanks,
> Rafael
>
>
>> Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>> Cc: Rafael J. Wysocki<rafael.j.wysocki@intel.com>
>> Cc: Bjorn Helgaas<bhelgaas@google.com>
>> Cc: linux-pci@vger.kernel.org
>> ---
>>   drivers/pci/pci-driver.c |   15 +++++++++++----
>>   1 file changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index f9aa311..2b0ff9a 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -1009,12 +1009,19 @@ static int pci_pm_runtime_suspend(struct device *dev)
>>   		return 0;
>>   	}
>>
>> -	if (!pci_dev->state_saved) {
>> -		pci_save_state(pci_dev);
>> -		pci_finish_runtime_suspend(pci_dev);
>> +	if (pci_dev->state_saved) {
>> +		WARN_ONCE(pci_dev->current_state == prev,
>> +			"PCI PM: Power state of device not changed by %pF\n",
>> +			pm->runtime_suspend);
>> +		WARN_ONCE(pci_dev_run_wake(pci_dev)&&
>> +				!pci_dev->wakeup_prepared,
>> +			"PCI PM: Waking of device not configured by %pF\n",
>> +			pm->runtime_suspend);
>> +		return 0;
>>   	}
>>
>> -	return 0;
>> +	pci_save_state(pci_dev);
>> +	return pci_finish_runtime_suspend(pci_dev);
>>   }
>>
>>   static int pci_pm_runtime_resume(struct device *dev)
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
  2013-02-04 11:55 ` [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown() Konstantin Khlebnikov
@ 2013-02-04 22:20   ` Khalid Aziz
  2013-02-04 23:13     ` Bjorn Helgaas
  0 siblings, 1 reply; 19+ messages in thread
From: Khalid Aziz @ 2013-02-04 22:20 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: e1000-devel, linux-pci, Rafael J. Wysocki, linux-kernel,
	Bjorn Helgaas, Andi Kleen, Alan Cox, Matthew Garrett,
	khalid.aziz

On Mon, 2013-02-04 at 15:55 +0400, Konstantin Khlebnikov wrote:
> Matthew Garrett and Alan Cox said (see LKML link below) that clearing bus-master
> for all PCI devices may lead to unpredictable consequences, some devices ignores
> this bit and continues DMA, some of them hang after that or crash whole system.
> Probably we should leave here only warning and disable bus-mastering for each
> driver individually in ->shutdown() callback.

Agreed that the right place for shutting down a PCI device properly and
clearing its Bus Master bit, is the driver shutdown routine, if only all
drivers supplied a shutdown routine. As it is today, there are too many
drivers that do not provide a shutdown routine, ata_piix, Marvell SATA
driver, ATI AGP driver just to name a few among a large number of them.
Yet kexec is expected to work inspite of these drivers especially since
kdump depends on it. So until all PCI drivers supply a shutdown routine,
this is just a band-aid to disable interrupt and Bus Master bit in
pci_device_shutdown(). Most drivers do seem to supply a suspend and
resume function and it was discussed many years ago if it is feasible to
use the suspend() routine for drivers to shut devices down cleanly.
Maybe it is time to revisit that discussion.

>Cc: Khalid Aziz <khalid.aziz@hp.com>

Please update this to khalid@gonehiking.org

--
Khalid



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
  2013-02-04 22:20   ` Khalid Aziz
@ 2013-02-04 23:13     ` Bjorn Helgaas
  2013-02-05 15:28       ` Khalid Aziz
  0 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2013-02-04 23:13 UTC (permalink / raw)
  To: Khalid Aziz
  Cc: Konstantin Khlebnikov, e1000-devel, linux-pci, Rafael J. Wysocki,
	linux-kernel, Andi Kleen, Alan Cox, Matthew Garrett, khalid.aziz

On Mon, Feb 4, 2013 at 3:20 PM, Khalid Aziz <khalid@gonehiking.org> wrote:
> On Mon, 2013-02-04 at 15:55 +0400, Konstantin Khlebnikov wrote:
>> Matthew Garrett and Alan Cox said (see LKML link below) that clearing bus-master
>> for all PCI devices may lead to unpredictable consequences, some devices ignores
>> this bit and continues DMA, some of them hang after that or crash whole system.
>> Probably we should leave here only warning and disable bus-mastering for each
>> driver individually in ->shutdown() callback.
>
> Agreed that the right place for shutting down a PCI device properly and
> clearing its Bus Master bit, is the driver shutdown routine, if only all
> drivers supplied a shutdown routine. As it is today, there are too many
> drivers that do not provide a shutdown routine, ata_piix, Marvell SATA
> driver, ATI AGP driver just to name a few among a large number of them.
> Yet kexec is expected to work inspite of these drivers especially since
> kdump depends on it. So until all PCI drivers supply a shutdown routine,
> this is just a band-aid to disable interrupt and Bus Master bit in
> pci_device_shutdown(). Most drivers do seem to supply a suspend and
> resume function and it was discussed many years ago if it is feasible to
> use the suspend() routine for drivers to shut devices down cleanly.
> Maybe it is time to revisit that discussion.

This patch as posted doesn't do anything with IRQs.  It only clears
PCI_COMMAND_MASTER.

I'm open to considering something with IRQs, but I don't understand
exactly what we should do.  In your response to the previous version
(https://lkml.org/lkml/2013/1/28/720) you suggested this:

  pci_clear_master(pci_dev);
  pcibios_disable_device(pci_dev);

Did you figure out specifically why pcibios_disable_device() helps?
Using pcibios_disable_device() doesn't seem like the ideal solution
because on most architectures, it is an empty function with no obvious
connection to IRQs.  On x86 with ACPI, it cleans up some ACPI PCI IRQ
stuff, but as far as I can tell, it doesn't actually touch the PCI
device itself or even the IOAPIC to which it's connected, so I'm not
sure how this would help kexec.

Bjorn

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
  2013-02-04 23:13     ` Bjorn Helgaas
@ 2013-02-05 15:28       ` Khalid Aziz
  2013-02-05 19:22         ` Bjorn Helgaas
  0 siblings, 1 reply; 19+ messages in thread
From: Khalid Aziz @ 2013-02-05 15:28 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Konstantin Khlebnikov, e1000-devel, linux-pci, Rafael J. Wysocki,
	linux-kernel, Andi Kleen, Alan Cox, Matthew Garrett, khalid.aziz

On Mon, 2013-02-04 at 16:13 -0700, Bjorn Helgaas wrote:
> On Mon, Feb 4, 2013 at 3:20 PM, Khalid Aziz <khalid@gonehiking.org> wrote:
> > On Mon, 2013-02-04 at 15:55 +0400, Konstantin Khlebnikov wrote:
> >> Matthew Garrett and Alan Cox said (see LKML link below) that clearing bus-master
> >> for all PCI devices may lead to unpredictable consequences, some devices ignores
> >> this bit and continues DMA, some of them hang after that or crash whole system.
> >> Probably we should leave here only warning and disable bus-mastering for each
> >> driver individually in ->shutdown() callback.
> >
> > Agreed that the right place for shutting down a PCI device properly and
> > clearing its Bus Master bit, is the driver shutdown routine, if only all
> > drivers supplied a shutdown routine. As it is today, there are too many
> > drivers that do not provide a shutdown routine, ata_piix, Marvell SATA
> > driver, ATI AGP driver just to name a few among a large number of them.
> > Yet kexec is expected to work inspite of these drivers especially since
> > kdump depends on it. So until all PCI drivers supply a shutdown routine,
> > this is just a band-aid to disable interrupt and Bus Master bit in
> > pci_device_shutdown(). Most drivers do seem to supply a suspend and
> > resume function and it was discussed many years ago if it is feasible to
> > use the suspend() routine for drivers to shut devices down cleanly.
> > Maybe it is time to revisit that discussion.
> 
> This patch as posted doesn't do anything with IRQs.  It only clears
> PCI_COMMAND_MASTER.
> 
> I'm open to considering something with IRQs, but I don't understand
> exactly what we should do.  In your response to the previous version
> (https://lkml.org/lkml/2013/1/28/720) you suggested this:
> 
>   pci_clear_master(pci_dev);
>   pcibios_disable_device(pci_dev);
> 
> Did you figure out specifically why pcibios_disable_device() helps?
> Using pcibios_disable_device() doesn't seem like the ideal solution
> because on most architectures, it is an empty function with no obvious
> connection to IRQs.  On x86 with ACPI, it cleans up some ACPI PCI IRQ
> stuff, but as far as I can tell, it doesn't actually touch the PCI
> device itself or even the IOAPIC to which it's connected, so I'm not
> sure how this would help kexec.
> 
> Bjorn

Hi Bjorn,

My reading of the code was that pcibios_disable_device() does clear the
interrupt on x86 and ia64. I am not deeply familiar with the ACPI code
and I might be interpreting it incorrectly, so please do correct me if I
am reading it incorrectly. Here is the code sequence I see:

pcibios_disable_device() ->
   pcibios_disable_irq() ->
       acpi_pci_irq_disable() -> 
           acpi_pci_link_free_irq() ->
              acpi_evaluate_object(link->device->handle, "_DIS", NULL,
NULL);

My understanding is the evaluation of ACPI _DIS method will disable the
interrupt from the device. Does that sound reasonable?

The problem this code attempts to solve is I/O devices continuing to be
active as we start to boot a kexec'd kernel. That activity can come from
DMA (has been seen with NICs for sure, but can happen from a SATA/IDE
controller as well when a pending read completes). When a DMA activity
overwrites a section of memory area in use by the new kexec'd kernel, it
takes lot of work to narrow that memory corruption down. The right way
to quiesce I/O devices is to call shutdown() function for every active
driver, which pci_device_shutdown() does today. If every driver provided
a proper shutdown() function, we would be done. Since that is not the
case, we need to stop potentially active devices from interfering with
kexec'd kernel. Too many drivers rely upon firmware reinitializing the
device when system is shut down. The two ways I can think of are to stop
DMA by clearing Bus Master bit and turn off the interrupt, which have
been shown to get kexec (and thus kdump) working on machines it didn't
work on before. 

This is a non-trivial problem to solve and I am very open to better
ideas.

--
Khalid


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
  2013-02-05 15:28       ` Khalid Aziz
@ 2013-02-05 19:22         ` Bjorn Helgaas
  2013-02-06  0:21           ` Khalid Aziz
  0 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2013-02-05 19:22 UTC (permalink / raw)
  To: Khalid Aziz
  Cc: Konstantin Khlebnikov, e1000-devel, linux-pci, Rafael J. Wysocki,
	linux-kernel, Andi Kleen, Alan Cox, Matthew Garrett, khalid.aziz

On Tue, Feb 5, 2013 at 8:28 AM, Khalid Aziz <khalid@gonehiking.org> wrote:
> On Mon, 2013-02-04 at 16:13 -0700, Bjorn Helgaas wrote:
>> On Mon, Feb 4, 2013 at 3:20 PM, Khalid Aziz <khalid@gonehiking.org> wrote:
>> > On Mon, 2013-02-04 at 15:55 +0400, Konstantin Khlebnikov wrote:
>> >> Matthew Garrett and Alan Cox said (see LKML link below) that clearing bus-master
>> >> for all PCI devices may lead to unpredictable consequences, some devices ignores
>> >> this bit and continues DMA, some of them hang after that or crash whole system.
>> >> Probably we should leave here only warning and disable bus-mastering for each
>> >> driver individually in ->shutdown() callback.
>> >
>> > Agreed that the right place for shutting down a PCI device properly and
>> > clearing its Bus Master bit, is the driver shutdown routine, if only all
>> > drivers supplied a shutdown routine. As it is today, there are too many
>> > drivers that do not provide a shutdown routine, ata_piix, Marvell SATA
>> > driver, ATI AGP driver just to name a few among a large number of them.
>> > Yet kexec is expected to work inspite of these drivers especially since
>> > kdump depends on it. So until all PCI drivers supply a shutdown routine,
>> > this is just a band-aid to disable interrupt and Bus Master bit in
>> > pci_device_shutdown(). Most drivers do seem to supply a suspend and
>> > resume function and it was discussed many years ago if it is feasible to
>> > use the suspend() routine for drivers to shut devices down cleanly.
>> > Maybe it is time to revisit that discussion.
>>
>> This patch as posted doesn't do anything with IRQs.  It only clears
>> PCI_COMMAND_MASTER.
>>
>> I'm open to considering something with IRQs, but I don't understand
>> exactly what we should do.  In your response to the previous version
>> (https://lkml.org/lkml/2013/1/28/720) you suggested this:
>>
>>   pci_clear_master(pci_dev);
>>   pcibios_disable_device(pci_dev);
>>
>> Did you figure out specifically why pcibios_disable_device() helps?
>> Using pcibios_disable_device() doesn't seem like the ideal solution
>> because on most architectures, it is an empty function with no obvious
>> connection to IRQs.  On x86 with ACPI, it cleans up some ACPI PCI IRQ
>> stuff, but as far as I can tell, it doesn't actually touch the PCI
>> device itself or even the IOAPIC to which it's connected, so I'm not
>> sure how this would help kexec.
>
> My reading of the code was that pcibios_disable_device() does clear the
> interrupt on x86 and ia64. I am not deeply familiar with the ACPI code
> and I might be interpreting it incorrectly, so please do correct me if I
> am reading it incorrectly. Here is the code sequence I see:
>
> pcibios_disable_device() ->
>    pcibios_disable_irq() ->
>        acpi_pci_irq_disable() ->
>            acpi_pci_link_free_irq() ->
>               acpi_evaluate_object(link->device->handle, "_DIS", NULL,
> NULL);
>
> My understanding is the evaluation of ACPI _DIS method will disable the
> interrupt from the device. Does that sound reasonable?

I see the code you're looking at in acpi_pci_link_free_irq(), but we
only evaluate _DIS if link->refcnt == 0, and I don't think refcnt is
ever zero at that point.

refcnt starts out at zero in acpi_pci_link_add() (called when we find
PNP0C0F devices), and it's incremented in acpi_pci_link_allocate_irq()
(called in the pci_enable_device() path), but as far as I can tell,
it's never decremented, so I doubt that _DIS is ever evaluated.

If we did evaluate _DIS, it would act on an "interrupt link" device,
not on the PCI device itself.  I guess that could help, but only for
devices connected to such a link device.  For others, I guess we might
be able to accomplish something similar by updating local APIC and/or
IOAPIC config.  I don't think we do that today, at least not in the
pci_disable_device() path, but it might be something interesting to
explore.  There is also the INTx Disable bit, though it's obviously
only on new PCI devices.

> ... The two ways I can think of are to stop
> DMA by clearing Bus Master bit and turn off the interrupt, which have
> been shown to get kexec (and thus kdump) working on machines it didn't
> work on before.

I was just curious if you had actually verified that _DIS was being
evaluated and making a difference here, or if the Bus Master bit was
really the important part.

Bjorn

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
  2013-02-05 19:22         ` Bjorn Helgaas
@ 2013-02-06  0:21           ` Khalid Aziz
  0 siblings, 0 replies; 19+ messages in thread
From: Khalid Aziz @ 2013-02-06  0:21 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Konstantin Khlebnikov, e1000-devel, linux-pci, Rafael J. Wysocki,
	linux-kernel, Andi Kleen, Alan Cox, Matthew Garrett, khalid.aziz

On Tue, 2013-02-05 at 12:22 -0700, Bjorn Helgaas wrote:
> On Tue, Feb 5, 2013 at 8:28 AM, Khalid Aziz <khalid@gonehiking.org> wrote:
> > On Mon, 2013-02-04 at 16:13 -0700, Bjorn Helgaas wrote:
> >> Did you figure out specifically why pcibios_disable_device() helps?
> >> Using pcibios_disable_device() doesn't seem like the ideal solution
> >> because on most architectures, it is an empty function with no obvious
> >> connection to IRQs.  On x86 with ACPI, it cleans up some ACPI PCI IRQ
> >> stuff, but as far as I can tell, it doesn't actually touch the PCI
> >> device itself or even the IOAPIC to which it's connected, so I'm not
> >> sure how this would help kexec.
> >
> > My reading of the code was that pcibios_disable_device() does clear the
> > interrupt on x86 and ia64. I am not deeply familiar with the ACPI code
> > and I might be interpreting it incorrectly, so please do correct me if I
> > am reading it incorrectly. Here is the code sequence I see:
> >
> > pcibios_disable_device() ->
> >    pcibios_disable_irq() ->
> >        acpi_pci_irq_disable() ->
> >            acpi_pci_link_free_irq() ->
> >               acpi_evaluate_object(link->device->handle, "_DIS", NULL,
> > NULL);
> >
> > My understanding is the evaluation of ACPI _DIS method will disable the
> > interrupt from the device. Does that sound reasonable?
> 
> I see the code you're looking at in acpi_pci_link_free_irq(), but we
> only evaluate _DIS if link->refcnt == 0, and I don't think refcnt is
> ever zero at that point.
> 
> refcnt starts out at zero in acpi_pci_link_add() (called when we find
> PNP0C0F devices), and it's incremented in acpi_pci_link_allocate_irq()
> (called in the pci_enable_device() path), but as far as I can tell,
> it's never decremented, so I doubt that _DIS is ever evaluated.

Ah, that is interesting. I was assuming as we disable PCI devices, the
refcnt would have been decremented and if no one was using the IRQ, we
would evaluate _DIS method and disable the interrupt link.

> 
> If we did evaluate _DIS, it would act on an "interrupt link" device,
> not on the PCI device itself.  

Right, it should be the shutdown() routine for the device driver that
disables interrupt on the device itself. We want to turn interrupt off
one level higher in pci_device_shutdown().

> I guess that could help, but only for
> devices connected to such a link device.  For others, I guess we might
> be able to accomplish something similar by updating local APIC and/or
> IOAPIC config.  I don't think we do that today, at least not in the
> pci_disable_device() path, but it might be something interesting to
> explore.  There is also the INTx Disable bit, though it's obviously
> only on new PCI devices.

Turning interrupt of at local APIC or IOAPIC level sounds like more
reliable thing to do.

> 
> > ... The two ways I can think of are to stop
> > DMA by clearing Bus Master bit and turn off the interrupt, which have
> > been shown to get kexec (and thus kdump) working on machines it didn't
> > work on before.
> 
> I was just curious if you had actually verified that _DIS was being
> evaluated and making a difference here, or if the Bus Master bit was
> really the important part.
> 
> Bjorn

I have been able to reproduce the kexec problem caused by active PCI
devices only in limited cases and in those cases it was really the Bus
Master bit that was important. Keeping interrupts from errant devices
from reaching the kernel until newly kexc'd kernel is initialized is
additional safety measure.

--
Khalid


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work
  2013-02-04 20:23 ` [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Rafael J. Wysocki
@ 2013-02-12  0:34   ` Bjorn Helgaas
  2013-02-12  0:43     ` Bjorn Helgaas
  0 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2013-02-12  0:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Konstantin Khlebnikov, e1000-devel, linux-pci, Rafael J. Wysocki,
	linux-kernel

On Mon, Feb 4, 2013 at 1:23 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Monday, February 04, 2013 03:55:47 PM Konstantin Khlebnikov wrote:
>> This patchset contains some fixes for e1000e diver (broken since v2.6.35)
>> and some related fixes and useful debug for PCI code.
>>
>> All together this fixes my regression report for v3.8-rc1:
>> https://lkml.org/lkml/2013/1/1/25
>>
>> patchset was seriously reworked since v1:
>> http://lkml.org/lkml/2013/1/18/147
>>
>> ---
>>
>> Konstantin Khlebnikov (6):
>>       e1000e: fix pci-device enable-counter balance
>>       PCI: don't touch enable_cnt in pci_device_shutdown()
>>       PCI: catch enable-counter underflows
>>       e1000e: fix runtime power management transitions
>>       PCI/PM: warn about incomplete actions in ->runtime_suspend() callback
>>       e1000e: fix accessing to suspended device
>>
>> Rafael J. Wysocki (1):
>>       PCI/PM: clear state_saved during suspend
>>
>>
>>  drivers/net/ethernet/intel/e1000e/ethtool.c |   13 ++++
>>  drivers/net/ethernet/intel/e1000e/netdev.c  |   82 +++++++--------------------
>>  drivers/pci/pci-driver.c                    |   21 +++++--
>>  drivers/pci/pci.c                           |    3 +
>>  4 files changed, 53 insertions(+), 66 deletions(-)
>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> for all patches except for [6/7].

I added Rafael's acks and applied patches
  [2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
  [3/7] PCI: catch enable-counter underflows
to my pci/konstantin-runtime-pm branch for v3.9.

I assume the others will go through the e1000e maintainer.

Bjorn

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work
  2013-02-12  0:34   ` Bjorn Helgaas
@ 2013-02-12  0:43     ` Bjorn Helgaas
  2013-02-12 20:27       ` Rafael J. Wysocki
  0 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2013-02-12  0:43 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Konstantin Khlebnikov, e1000-devel, linux-pci, Rafael J. Wysocki,
	linux-kernel

On Mon, Feb 11, 2013 at 5:34 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Mon, Feb 4, 2013 at 1:23 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Monday, February 04, 2013 03:55:47 PM Konstantin Khlebnikov wrote:
>>> This patchset contains some fixes for e1000e diver (broken since v2.6.35)
>>> and some related fixes and useful debug for PCI code.
>>>
>>> All together this fixes my regression report for v3.8-rc1:
>>> https://lkml.org/lkml/2013/1/1/25
>>>
>>> patchset was seriously reworked since v1:
>>> http://lkml.org/lkml/2013/1/18/147
>>>
>>> ---
>>>
>>> Konstantin Khlebnikov (6):
>>>       e1000e: fix pci-device enable-counter balance
>>>       PCI: don't touch enable_cnt in pci_device_shutdown()
>>>       PCI: catch enable-counter underflows
>>>       e1000e: fix runtime power management transitions
>>>       PCI/PM: warn about incomplete actions in ->runtime_suspend() callback
>>>       e1000e: fix accessing to suspended device
>>>
>>> Rafael J. Wysocki (1):
>>>       PCI/PM: clear state_saved during suspend
>>>
>>>
>>>  drivers/net/ethernet/intel/e1000e/ethtool.c |   13 ++++
>>>  drivers/net/ethernet/intel/e1000e/netdev.c  |   82 +++++++--------------------
>>>  drivers/pci/pci-driver.c                    |   21 +++++--
>>>  drivers/pci/pci.c                           |    3 +
>>>  4 files changed, 53 insertions(+), 66 deletions(-)
>>
>> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> for all patches except for [6/7].
>
> I added Rafael's acks and applied patches
>   [2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
>   [3/7] PCI: catch enable-counter underflows
> to my pci/konstantin-runtime-pm branch for v3.9.

Oops, I missed
  [4/7] PCI/PM: Clear state_saved during suspend

I applied that one, too.

Bjorn

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work
  2013-02-12  0:43     ` Bjorn Helgaas
@ 2013-02-12 20:27       ` Rafael J. Wysocki
  0 siblings, 0 replies; 19+ messages in thread
From: Rafael J. Wysocki @ 2013-02-12 20:27 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Konstantin Khlebnikov, e1000-devel, linux-pci, Rafael J. Wysocki,
	linux-kernel

On Monday, February 11, 2013 05:43:50 PM Bjorn Helgaas wrote:
> On Mon, Feb 11, 2013 at 5:34 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > On Mon, Feb 4, 2013 at 1:23 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> On Monday, February 04, 2013 03:55:47 PM Konstantin Khlebnikov wrote:
> >>> This patchset contains some fixes for e1000e diver (broken since v2.6.35)
> >>> and some related fixes and useful debug for PCI code.
> >>>
> >>> All together this fixes my regression report for v3.8-rc1:
> >>> https://lkml.org/lkml/2013/1/1/25
> >>>
> >>> patchset was seriously reworked since v1:
> >>> http://lkml.org/lkml/2013/1/18/147
> >>>
> >>> ---
> >>>
> >>> Konstantin Khlebnikov (6):
> >>>       e1000e: fix pci-device enable-counter balance
> >>>       PCI: don't touch enable_cnt in pci_device_shutdown()
> >>>       PCI: catch enable-counter underflows
> >>>       e1000e: fix runtime power management transitions
> >>>       PCI/PM: warn about incomplete actions in ->runtime_suspend() callback
> >>>       e1000e: fix accessing to suspended device
> >>>
> >>> Rafael J. Wysocki (1):
> >>>       PCI/PM: clear state_saved during suspend
> >>>
> >>>
> >>>  drivers/net/ethernet/intel/e1000e/ethtool.c |   13 ++++
> >>>  drivers/net/ethernet/intel/e1000e/netdev.c  |   82 +++++++--------------------
> >>>  drivers/pci/pci-driver.c                    |   21 +++++--
> >>>  drivers/pci/pci.c                           |    3 +
> >>>  4 files changed, 53 insertions(+), 66 deletions(-)
> >>
> >> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>
> >> for all patches except for [6/7].
> >
> > I added Rafael's acks and applied patches
> >   [2/7] PCI: don't touch enable_cnt in pci_device_shutdown()
> >   [3/7] PCI: catch enable-counter underflows
> > to my pci/konstantin-runtime-pm branch for v3.9.
> 
> Oops, I missed
>   [4/7] PCI/PM: Clear state_saved during suspend
> 
> I applied that one, too.

Thanks!


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2013-02-12 20:20 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-04 11:55 [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Konstantin Khlebnikov
2013-02-04 11:55 ` [PATCH v2 1/7] e1000e: fix pci-device enable-counter balance Konstantin Khlebnikov
2013-02-04 11:55 ` [PATCH v2 2/7] PCI: don't touch enable_cnt in pci_device_shutdown() Konstantin Khlebnikov
2013-02-04 22:20   ` Khalid Aziz
2013-02-04 23:13     ` Bjorn Helgaas
2013-02-05 15:28       ` Khalid Aziz
2013-02-05 19:22         ` Bjorn Helgaas
2013-02-06  0:21           ` Khalid Aziz
2013-02-04 11:56 ` [PATCH v2 3/7] PCI: catch enable-counter underflows Konstantin Khlebnikov
2013-02-04 11:56 ` [PATCH v2 4/7] PCI/PM: clear state_saved during suspend Konstantin Khlebnikov
2013-02-04 11:56 ` [PATCH v2 5/7] e1000e: fix runtime power management transitions Konstantin Khlebnikov
2013-02-04 11:56 ` [PATCH v2 6/7] PCI/PM: warn about incomplete actions in ->runtime_suspend() callback Konstantin Khlebnikov
2013-02-04 20:22   ` Rafael J. Wysocki
2013-02-04 20:57     ` Konstantin Khlebnikov
2013-02-04 11:56 ` [PATCH v2 7/7] e1000e: fix accessing to suspended device Konstantin Khlebnikov
2013-02-04 20:23 ` [PATCH v2 0/7] pci/e1000e: return runtime-pm back to work Rafael J. Wysocki
2013-02-12  0:34   ` Bjorn Helgaas
2013-02-12  0:43     ` Bjorn Helgaas
2013-02-12 20:27       ` Rafael J. Wysocki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.