PCI / tg3: Give up chip reset and carrier loss handling if PCI device is not present
diff mbox series

Message ID 7807315.7cU9KldDtM@vostro.rjw.lan
State New, archived
Headers show
Series
  • PCI / tg3: Give up chip reset and carrier loss handling if PCI device is not present
Related show

Commit Message

Rafael J. Wysocki Dec. 1, 2013, 1:34 a.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Modify tg3_chip_reset() and tg3_close() to check if the PCI network
adapter device is accessible at all in order to skip poking it or
trying to handle a carrier loss in vain when that's not the case.
Introduce a special PCI helper function pci_device_is_present()
for this purpose.

Of course, this uncovers the lack of the appropriate RTNL locking
in tg3_suspend() and tg3_resume(), so add that locking in there
too.

These changes prevent tg3 from burning a CPU at 100% load level for
solid several seconds after the Thunderbolt link is disconnected from
a Matrox DS1 docking station.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/net/ethernet/broadcom/tg3.c |   26 +++++++++++++++++++-------
 drivers/pci/pci.c                   |    8 ++++++++
 include/linux/pci.h                 |    1 +
 3 files changed, 28 insertions(+), 7 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Michael Chan Dec. 1, 2013, 8:39 p.m. UTC | #1
On Sun, 2013-12-01 at 02:34 +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Modify tg3_chip_reset() and tg3_close() to check if the PCI network
> adapter device is accessible at all in order to skip poking it or
> trying to handle a carrier loss in vain when that's not the case.
> Introduce a special PCI helper function pci_device_is_present()
> for this purpose.
> 
> Of course, this uncovers the lack of the appropriate RTNL locking
> in tg3_suspend() and tg3_resume(), so add that locking in there
> too.
> 
> These changes prevent tg3 from burning a CPU at 100% load level for
> solid several seconds after the Thunderbolt link is disconnected from
> a Matrox DS1 docking station. 

I'm not familiar with the DS1.  Does the tg3 device get removed through
tg3_remove_one() in this case?  What happens when you reconnect the DS1?


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Rafael J. Wysocki Dec. 2, 2013, 1:18 a.m. UTC | #2
On Sunday, December 01, 2013 12:39:19 PM Michael Chan wrote:
> On Sun, 2013-12-01 at 02:34 +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Modify tg3_chip_reset() and tg3_close() to check if the PCI network
> > adapter device is accessible at all in order to skip poking it or
> > trying to handle a carrier loss in vain when that's not the case.
> > Introduce a special PCI helper function pci_device_is_present()
> > for this purpose.
> > 
> > Of course, this uncovers the lack of the appropriate RTNL locking
> > in tg3_suspend() and tg3_resume(), so add that locking in there
> > too.
> > 
> > These changes prevent tg3 from burning a CPU at 100% load level for
> > solid several seconds after the Thunderbolt link is disconnected from
> > a Matrox DS1 docking station. 
> 
> I'm not familiar with the DS1.  Does the tg3 device get removed through
> tg3_remove_one() in this case?

Yes, this is a normal removal except that the device is gone physically when
it happens.

Thunderbolt cable disconnect is like a power failure for all devices on
the link (that is, all devices in the DS1 in this particular case).

> What happens when you reconnect the DS1?

Well, it works.  ACPIPHP just waits for the removal to complete and then it
handles the hot-add event and the device works normally after that.

The only problem is that without the patch the hot-removal takes a few seconds
and quite a lot of CPU power due to the tight loop spinning in tg3_carrier_off()
called by tg3_close(), while with the patch it just happens almost instantaneously.

And in the Thunderbolt disconnect case doing the entire tg3_chip_reset() is
pointless anyway, because the config space of the device is not available then.

Thanks!
David Miller Dec. 2, 2013, 6:48 p.m. UTC | #3
From: "Michael Chan" <mchan@broadcom.com>
Date: Sun, 1 Dec 2013 12:39:19 -0800

> I'm not familiar with the DS1.  Does the tg3 device get removed through
> tg3_remove_one() in this case?  What happens when you reconnect the DS1?

Michael, is Rafael's explanation sufficient for you?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Michael Chan Dec. 2, 2013, 6:54 p.m. UTC | #4
On Mon, 2013-12-02 at 13:48 -0500, David Miller wrote: 
> From: "Michael Chan" <mchan@broadcom.com>
> Date: Sun, 1 Dec 2013 12:39:19 -0800
> 
> > I'm not familiar with the DS1.  Does the tg3 device get removed through
> > tg3_remove_one() in this case?  What happens when you reconnect the DS1?
> 
> Michael, is Rafael's explanation sufficient for you?
> 

Yes, thanks.

Acked-by: Michael Chan <mchan@broadcom.com>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
David Miller Dec. 2, 2013, 9:02 p.m. UTC | #5
From: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Date: Sun, 01 Dec 2013 02:34:37 +0100

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Modify tg3_chip_reset() and tg3_close() to check if the PCI network
> adapter device is accessible at all in order to skip poking it or
> trying to handle a carrier loss in vain when that's not the case.
> Introduce a special PCI helper function pci_device_is_present()
> for this purpose.
> 
> Of course, this uncovers the lack of the appropriate RTNL locking
> in tg3_suspend() and tg3_resume(), so add that locking in there
> too.
> 
> These changes prevent tg3 from burning a CPU at 100% load level for
> solid several seconds after the Thunderbolt link is disconnected from
> a Matrox DS1 docking station.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Applied, thanks everyone.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Patch
diff mbox series

Index: linux-pm/drivers/net/ethernet/broadcom/tg3.c
===================================================================
--- linux-pm.orig/drivers/net/ethernet/broadcom/tg3.c
+++ linux-pm/drivers/net/ethernet/broadcom/tg3.c
@@ -8932,6 +8932,9 @@  static int tg3_chip_reset(struct tg3 *tp
 	void (*write_op)(struct tg3 *, u32, u32);
 	int i, err;
 
+	if (!pci_device_is_present(tp->pdev))
+		return -ENODEV;
+
 	tg3_nvram_lock(tp);
 
 	tg3_ape_lock(tp, TG3_APE_LOCK_GRC);
@@ -11594,10 +11597,11 @@  static int tg3_close(struct net_device *
 	memset(&tp->net_stats_prev, 0, sizeof(tp->net_stats_prev));
 	memset(&tp->estats_prev, 0, sizeof(tp->estats_prev));
 
-	tg3_power_down_prepare(tp);
-
-	tg3_carrier_off(tp);
+	if (pci_device_is_present(tp->pdev)) {
+		tg3_power_down_prepare(tp);
 
+		tg3_carrier_off(tp);
+	}
 	return 0;
 }
 
@@ -17739,10 +17743,12 @@  static int tg3_suspend(struct device *de
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct tg3 *tp = netdev_priv(dev);
-	int err;
+	int err = 0;
+
+	rtnl_lock();
 
 	if (!netif_running(dev))
-		return 0;
+		goto unlock;
 
 	tg3_reset_task_cancel(tp);
 	tg3_phy_stop(tp);
@@ -17784,6 +17790,8 @@  out:
 			tg3_phy_start(tp);
 	}
 
+unlock:
+	rtnl_unlock();
 	return err;
 }
 
@@ -17792,10 +17800,12 @@  static int tg3_resume(struct device *dev
 	struct pci_dev *pdev = to_pci_dev(device);
 	struct net_device *dev = pci_get_drvdata(pdev);
 	struct tg3 *tp = netdev_priv(dev);
-	int err;
+	int err = 0;
+
+	rtnl_lock();
 
 	if (!netif_running(dev))
-		return 0;
+		goto unlock;
 
 	netif_device_attach(dev);
 
@@ -17819,6 +17829,8 @@  out:
 	if (!err)
 		tg3_phy_start(tp);
 
+unlock:
+	rtnl_unlock();
 	return err;
 }
 #endif /* CONFIG_PM_SLEEP */
Index: linux-pm/drivers/pci/pci.c
===================================================================
--- linux-pm.orig/drivers/pci/pci.c
+++ linux-pm/drivers/pci/pci.c
@@ -4171,6 +4171,14 @@  int pci_set_vga_state(struct pci_dev *de
 	return 0;
 }
 
+bool pci_device_is_present(struct pci_dev *pdev)
+{
+	u32 v;
+
+	return pci_bus_read_dev_vendor_id(pdev->bus, pdev->devfn, &v, 0);
+}
+EXPORT_SYMBOL_GPL(pci_device_is_present);
+
 #define RESOURCE_ALIGNMENT_PARAM_SIZE COMMAND_LINE_SIZE
 static char resource_alignment_param[RESOURCE_ALIGNMENT_PARAM_SIZE] = {0};
 static DEFINE_SPINLOCK(resource_alignment_lock);
Index: linux-pm/include/linux/pci.h
===================================================================
--- linux-pm.orig/include/linux/pci.h
+++ linux-pm/include/linux/pci.h
@@ -961,6 +961,7 @@  void pci_update_resource(struct pci_dev
 int __must_check pci_assign_resource(struct pci_dev *dev, int i);
 int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align);
 int pci_select_bars(struct pci_dev *dev, unsigned long flags);
+bool pci_device_is_present(struct pci_dev *pdev);
 
 /* ROM control related routines */
 int pci_enable_rom(struct pci_dev *pdev);