On Wed, 2012-11-21 at 16:39 -0800, Greg Kroah-Hartman wrote: > 3.0-stable review patch. If anyone has any objections, please let me know. > > ------------------ > > From: Huang Ying > > commit 90b5c1d7c45eeb622302680ff96ed30c1a2b6f0e upstream. > > If a PCI device and its parents are put into D3cold, unbinding the > device will trigger deadlock as follow: > > - driver_unbind > - device_release_driver > - device_lock(dev) <--- previous lock here > - __device_release_driver > - pm_runtime_get_sync > ... > - rpm_resume(dev) > - rpm_resume(dev->parent) > ... > - pci_pm_runtime_resume > ... > - pci_set_power_state > - __pci_start_power_transition > - pci_wakeup_bus(dev->parent->subordinate) > - pci_walk_bus > - device_lock(dev) <--- deadlock here > > > If we do not do device_lock in pci_walk_bus, we can avoid deadlock. > Device_lock in pci_walk_bus is introduced in commit: > d71374dafbba7ec3f67371d3b7e9f6310a588808, corresponding email thread > is: https://lkml.org/lkml/2006/5/26/38. The patch author Zhang Yanmin > said device_lock is added to pci_walk_bus because: > > Some error handling functions call pci_walk_bus. For example, PCIe > aer. Here we lock the device, so the driver wouldn't detach from the > device, as the cb might call driver's callback function. > > So I fixed the deadlock as follows: > > - remove device_lock from pci_walk_bus > - add device_lock into callback if callback will call driver's callback > > I checked pci_walk_bus users one by one, and found only PCIe aer needs > device lock. [...] What about eeh_report_error() in arch/powerpc/platforms/pseries/eeh_driver.c? Also, is the deadlock even possible before this change in Linux 3.6? commit 448bd857d48e69b33ef323739dc6d8ca20d4cda7 Author: Huang Ying Date: Sat Jun 23 10:23:51 2012 +0800 PCI/PM: add PCIe runtime D3cold support Ben. -- Ben Hutchings Never attribute to conspiracy what can adequately be explained by stupidity.