All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5] Fixes to Xen pciback for 3.17.
@ 2014-07-14 16:18 Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS Konrad Rzeszutek Wilk
                   ` (15 more replies)
  0 siblings, 16 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel

Greg: goto GHK

This is v5 version of patches to fix some issues in Xen PCIback.

One of the issues Xen PCI back has that patch:

is fixing is that a deadlock can happen if the PCI device is
assigned to a guest and we try to 'unbind' it from Xen 'pciback' driver.
The issue is rather simple - the SysFS mechanism for the 'unbind' path
takes a device lock and the code in Xen PCI uses the pci_reset_function
which also takes the same lock. Solution is to use the lock-less version
and mandate that callers of said function in Xen pciback take the lock.
Easy enough.

GHK:
To guard against this happening in the future we also add an assert in the
form of lockdep assertion. That is OK except that it looks ugly as we take
it straight from the 'struct device' instead of using an appropriate macro.
See:

+       lockdep_assert_held(&dev->dev.mutex);

(in [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding).

The patch: [PATCH v5 3/6] driver core: Provide an wrapper around the mutex
to do.

introduces a nice wrapper so it is bit cleaner. Greg, if you are OK with
it could you kindly Ack it as I would prefer to put this patchset
via the Xen tree. It would look now as:

-       lockdep_assert_held(&dev->dev.mutex);
+       device_lock_assert(&dev->dev);

I can also squash it in "[PATCH v5 2/6] xen/pciback: Don't deadlock when
 unbinding." but since that one is going through the stable tree I wasn't
sure whether you (Greg KH) would be OK with that.

END GHK:

Thank you!

Patches are also available on my git tree

 git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git devel/pciback-3.17.v5

 Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++
 drivers/xen/xen-pciback/passthrough.c          | 14 +++++++--
 drivers/xen/xen-pciback/pci_stub.c             | 42 ++++++++++++++------------
 drivers/xen/xen-pciback/pciback.h              |  7 +++--
 drivers/xen/xen-pciback/vpci.c                 | 14 +++++++--
 drivers/xen/xen-pciback/xenbus.c               |  4 +--
 include/linux/device.h                         |  5 +++
 7 files changed, 81 insertions(+), 30 deletions(-)

Konrad Rzeszutek Wilk (6):
      xen-pciback: Document the various parameters and attributes in SysFS
      xen/pciback: Don't deadlock when unbinding.
      driver core: Provide an wrapper around the mutex to do lockdep warnings
      xen/pciback: Include the domain id if removing the device whilst still in use
      xen/pciback: Print out the domain owning the device.
      xen/pciback: Remove tons of dereferences


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-28 13:04   ` David Vrabel
  2014-07-28 13:04   ` David Vrabel
  2014-07-14 16:18 ` [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding Konrad Rzeszutek Wilk
                   ` (13 subsequent siblings)
  15 siblings, 2 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

Which hadn't been done with the initial commit.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2: Dropped the parameters and one that is unlikeable.
---
 Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-pciback

diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
new file mode 100644
index 0000000..cdc8340
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-pciback
@@ -0,0 +1,25 @@
+What:           /sys/bus/pci/drivers/pciback/quirks
+Date:           Oct 2011
+KernelVersion:  3.1
+Contact:        xen-devel@lists.xenproject.org
+Description:
+                If the permissive attribute is set, then writing a string in
+                the format of DDDD:BB:DD.F-REG:SIZE:MASK will allow the guest
+                to write and read from the PCI device. That is Domain:Bus:
+                Device.Function-Register:Size:Mask (Domain is optional).
+                For example:
+                #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
+                will allow the guest to read and write to the configuration
+                register 0x0E.
+
+What:           /sys/bus/pci/drivers/pciback/irq_handlers
+Date:           Oct 2011
+KernelVersion:  3.1
+Contact:        xen-devel@lists.xenproject.org
+Description:
+                A list of all of the PCI devices owned by Xen PCI back and
+                whether Xen PCI backend will acknowledge the interrupts received
+                and the amount of interrupts received. Xen PCI back acknowledges
+                said interrupts only when they are level, shared with another
+                guest, and enabled by the guest.
+
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

Which hadn't been done with the initial commit.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2: Dropped the parameters and one that is unlikeable.
---
 Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-pciback

diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
new file mode 100644
index 0000000..cdc8340
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-pciback
@@ -0,0 +1,25 @@
+What:           /sys/bus/pci/drivers/pciback/quirks
+Date:           Oct 2011
+KernelVersion:  3.1
+Contact:        xen-devel@lists.xenproject.org
+Description:
+                If the permissive attribute is set, then writing a string in
+                the format of DDDD:BB:DD.F-REG:SIZE:MASK will allow the guest
+                to write and read from the PCI device. That is Domain:Bus:
+                Device.Function-Register:Size:Mask (Domain is optional).
+                For example:
+                #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
+                will allow the guest to read and write to the configuration
+                register 0x0E.
+
+What:           /sys/bus/pci/drivers/pciback/irq_handlers
+Date:           Oct 2011
+KernelVersion:  3.1
+Contact:        xen-devel@lists.xenproject.org
+Description:
+                A list of all of the PCI devices owned by Xen PCI back and
+                whether Xen PCI backend will acknowledge the interrupts received
+                and the amount of interrupts received. Xen PCI back acknowledges
+                said interrupts only when they are level, shared with another
+                guest, and enabled by the guest.
+
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding.
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-28 13:06   ` David Vrabel
  2014-07-28 13:06   ` David Vrabel
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
                   ` (12 subsequent siblings)
  15 siblings, 2 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk, stable

As commit 0a9fd0152929db372ff61b0d6c280fdd34ae8bdb
'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
explained there are four entry points in this function.
Two of them are when the user fiddles in the SysFS to
unbind a device which might be in use by a guest or not.

Both 'unbind' states will cause a deadlock as the the PCI lock has
already been taken, which then pci_device_reset tries to take.

We can simplify this by requiring that all callers of
pcistub_put_pci_dev MUST hold the device lock. And then
we can just call the lockless version of pci_device_reset.

To make it even simpler we will modify xen_pcibk_release_pci_dev
to quality whether it should take a lock or not - as it ends
up calling xen_pcibk_release_pci_dev and needs to hold the lock.

CC: stable@vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
[v2: Per David Vrabel's suggestion - use lockless version of reset]
[v3: Per Boris suggestion add assertion mechanism]
---
 drivers/xen/xen-pciback/passthrough.c | 14 +++++++++++---
 drivers/xen/xen-pciback/pci_stub.c    | 12 ++++++------
 drivers/xen/xen-pciback/pciback.h     |  7 ++++---
 drivers/xen/xen-pciback/vpci.c        | 14 +++++++++++---
 drivers/xen/xen-pciback/xenbus.c      |  2 +-
 5 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/xen/xen-pciback/passthrough.c b/drivers/xen/xen-pciback/passthrough.c
index 828dddc..f16a30e 100644
--- a/drivers/xen/xen-pciback/passthrough.c
+++ b/drivers/xen/xen-pciback/passthrough.c
@@ -69,7 +69,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
 }
 
 static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
-					struct pci_dev *dev)
+					struct pci_dev *dev, bool lock)
 {
 	struct passthrough_dev_data *dev_data = pdev->pci_dev_data;
 	struct pci_dev_entry *dev_entry, *t;
@@ -87,8 +87,13 @@ static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
 
 	mutex_unlock(&dev_data->lock);
 
-	if (found_dev)
+	if (found_dev) {
+		if (lock)
+			device_lock(&found_dev->dev);
 		pcistub_put_pci_dev(found_dev);
+		if (lock)
+			device_unlock(&found_dev->dev);
+	}
 }
 
 static int __xen_pcibk_init_devices(struct xen_pcibk_device *pdev)
@@ -156,8 +161,11 @@ static void __xen_pcibk_release_devices(struct xen_pcibk_device *pdev)
 	struct pci_dev_entry *dev_entry, *t;
 
 	list_for_each_entry_safe(dev_entry, t, &dev_data->dev_list, list) {
+		struct pci_dev *dev = dev_entry->dev;
 		list_del(&dev_entry->list);
-		pcistub_put_pci_dev(dev_entry->dev);
+		device_lock(&dev->dev);
+		pcistub_put_pci_dev(dev);
+		device_unlock(&dev->dev);
 		kfree(dev_entry);
 	}
 
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index d57a173..d4cae5b 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
  *  - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
  *
  *  As such we have to be careful.
+ *
+ *  To make this easier, the caller has to hold the device lock.
  */
 void pcistub_put_pci_dev(struct pci_dev *dev)
 {
@@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
-
-	/* This is OK - we are running from workqueue context
-	 * and want to inhibit the user from fiddling with 'reset'
-	 */
-	pci_reset_function(dev);
+	lockdep_assert_held(&dev->dev.mutex);
+	__pci_reset_function_locked(dev);
 	pci_restore_state(dev);
 
 	/* This disables the device. */
@@ -567,7 +566,8 @@ static void pcistub_remove(struct pci_dev *dev)
 			/* N.B. This ends up calling pcistub_put_pci_dev which ends up
 			 * doing the FLR. */
 			xen_pcibk_release_pci_dev(found_psdev->pdev,
-						found_psdev->dev);
+						found_psdev->dev,
+						false /* caller holds the lock. */);
 		}
 
 		spin_lock_irqsave(&pcistub_devices_lock, flags);
diff --git a/drivers/xen/xen-pciback/pciback.h b/drivers/xen/xen-pciback/pciback.h
index f72af87..58e38d5 100644
--- a/drivers/xen/xen-pciback/pciback.h
+++ b/drivers/xen/xen-pciback/pciback.h
@@ -99,7 +99,8 @@ struct xen_pcibk_backend {
 		    unsigned int *domain, unsigned int *bus,
 		    unsigned int *devfn);
 	int (*publish)(struct xen_pcibk_device *pdev, publish_pci_root_cb cb);
-	void (*release)(struct xen_pcibk_device *pdev, struct pci_dev *dev);
+	void (*release)(struct xen_pcibk_device *pdev, struct pci_dev *dev,
+                        bool lock);
 	int (*add)(struct xen_pcibk_device *pdev, struct pci_dev *dev,
 		   int devid, publish_pci_dev_cb publish_cb);
 	struct pci_dev *(*get)(struct xen_pcibk_device *pdev,
@@ -122,10 +123,10 @@ static inline int xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
 }
 
 static inline void xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
-					     struct pci_dev *dev)
+					     struct pci_dev *dev, bool lock)
 {
 	if (xen_pcibk_backend && xen_pcibk_backend->release)
-		return xen_pcibk_backend->release(pdev, dev);
+		return xen_pcibk_backend->release(pdev, dev, lock);
 }
 
 static inline struct pci_dev *
diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c
index 51afff9..c99f8bb 100644
--- a/drivers/xen/xen-pciback/vpci.c
+++ b/drivers/xen/xen-pciback/vpci.c
@@ -145,7 +145,7 @@ out:
 }
 
 static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
-					struct pci_dev *dev)
+					struct pci_dev *dev, bool lock)
 {
 	int slot;
 	struct vpci_dev_data *vpci_dev = pdev->pci_dev_data;
@@ -169,8 +169,13 @@ static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
 out:
 	mutex_unlock(&vpci_dev->lock);
 
-	if (found_dev)
+	if (found_dev) {
+		if (lock)
+			device_lock(&found_dev->dev);
 		pcistub_put_pci_dev(found_dev);
+		if (lock)
+			device_unlock(&found_dev->dev);
+	}
 }
 
 static int __xen_pcibk_init_devices(struct xen_pcibk_device *pdev)
@@ -208,8 +213,11 @@ static void __xen_pcibk_release_devices(struct xen_pcibk_device *pdev)
 		struct pci_dev_entry *e, *tmp;
 		list_for_each_entry_safe(e, tmp, &vpci_dev->dev_list[slot],
 					 list) {
+			struct pci_dev *dev = e->dev;
 			list_del(&e->list);
-			pcistub_put_pci_dev(e->dev);
+			device_lock(&dev->dev);
+			pcistub_put_pci_dev(dev);
+			device_unlock(&dev->dev);
 			kfree(e);
 		}
 	}
diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index 4a7e6e0..b3318fd 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -290,7 +290,7 @@ static int xen_pcibk_remove_device(struct xen_pcibk_device *pdev,
 
 	/* N.B. This ends up calling pcistub_put_pci_dev which ends up
 	 * doing the FLR. */
-	xen_pcibk_release_pci_dev(pdev, dev);
+	xen_pcibk_release_pci_dev(pdev, dev, true /* use the lock. */);
 
 out:
 	return err;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding.
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (2 preceding siblings ...)
  2014-07-14 16:18 ` [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` [PATCH v5 3/6] driver core: Provide an wrapper around the mutex to do lockdep warnings Konrad Rzeszutek Wilk
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: stable, Konrad Rzeszutek Wilk

As commit 0a9fd0152929db372ff61b0d6c280fdd34ae8bdb
'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
explained there are four entry points in this function.
Two of them are when the user fiddles in the SysFS to
unbind a device which might be in use by a guest or not.

Both 'unbind' states will cause a deadlock as the the PCI lock has
already been taken, which then pci_device_reset tries to take.

We can simplify this by requiring that all callers of
pcistub_put_pci_dev MUST hold the device lock. And then
we can just call the lockless version of pci_device_reset.

To make it even simpler we will modify xen_pcibk_release_pci_dev
to quality whether it should take a lock or not - as it ends
up calling xen_pcibk_release_pci_dev and needs to hold the lock.

CC: stable@vger.kernel.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
[v2: Per David Vrabel's suggestion - use lockless version of reset]
[v3: Per Boris suggestion add assertion mechanism]
---
 drivers/xen/xen-pciback/passthrough.c | 14 +++++++++++---
 drivers/xen/xen-pciback/pci_stub.c    | 12 ++++++------
 drivers/xen/xen-pciback/pciback.h     |  7 ++++---
 drivers/xen/xen-pciback/vpci.c        | 14 +++++++++++---
 drivers/xen/xen-pciback/xenbus.c      |  2 +-
 5 files changed, 33 insertions(+), 16 deletions(-)

diff --git a/drivers/xen/xen-pciback/passthrough.c b/drivers/xen/xen-pciback/passthrough.c
index 828dddc..f16a30e 100644
--- a/drivers/xen/xen-pciback/passthrough.c
+++ b/drivers/xen/xen-pciback/passthrough.c
@@ -69,7 +69,7 @@ static int __xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
 }
 
 static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
-					struct pci_dev *dev)
+					struct pci_dev *dev, bool lock)
 {
 	struct passthrough_dev_data *dev_data = pdev->pci_dev_data;
 	struct pci_dev_entry *dev_entry, *t;
@@ -87,8 +87,13 @@ static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
 
 	mutex_unlock(&dev_data->lock);
 
-	if (found_dev)
+	if (found_dev) {
+		if (lock)
+			device_lock(&found_dev->dev);
 		pcistub_put_pci_dev(found_dev);
+		if (lock)
+			device_unlock(&found_dev->dev);
+	}
 }
 
 static int __xen_pcibk_init_devices(struct xen_pcibk_device *pdev)
@@ -156,8 +161,11 @@ static void __xen_pcibk_release_devices(struct xen_pcibk_device *pdev)
 	struct pci_dev_entry *dev_entry, *t;
 
 	list_for_each_entry_safe(dev_entry, t, &dev_data->dev_list, list) {
+		struct pci_dev *dev = dev_entry->dev;
 		list_del(&dev_entry->list);
-		pcistub_put_pci_dev(dev_entry->dev);
+		device_lock(&dev->dev);
+		pcistub_put_pci_dev(dev);
+		device_unlock(&dev->dev);
 		kfree(dev_entry);
 	}
 
diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index d57a173..d4cae5b 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
  *  - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
  *
  *  As such we have to be careful.
+ *
+ *  To make this easier, the caller has to hold the device lock.
  */
 void pcistub_put_pci_dev(struct pci_dev *dev)
 {
@@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
-
-	/* This is OK - we are running from workqueue context
-	 * and want to inhibit the user from fiddling with 'reset'
-	 */
-	pci_reset_function(dev);
+	lockdep_assert_held(&dev->dev.mutex);
+	__pci_reset_function_locked(dev);
 	pci_restore_state(dev);
 
 	/* This disables the device. */
@@ -567,7 +566,8 @@ static void pcistub_remove(struct pci_dev *dev)
 			/* N.B. This ends up calling pcistub_put_pci_dev which ends up
 			 * doing the FLR. */
 			xen_pcibk_release_pci_dev(found_psdev->pdev,
-						found_psdev->dev);
+						found_psdev->dev,
+						false /* caller holds the lock. */);
 		}
 
 		spin_lock_irqsave(&pcistub_devices_lock, flags);
diff --git a/drivers/xen/xen-pciback/pciback.h b/drivers/xen/xen-pciback/pciback.h
index f72af87..58e38d5 100644
--- a/drivers/xen/xen-pciback/pciback.h
+++ b/drivers/xen/xen-pciback/pciback.h
@@ -99,7 +99,8 @@ struct xen_pcibk_backend {
 		    unsigned int *domain, unsigned int *bus,
 		    unsigned int *devfn);
 	int (*publish)(struct xen_pcibk_device *pdev, publish_pci_root_cb cb);
-	void (*release)(struct xen_pcibk_device *pdev, struct pci_dev *dev);
+	void (*release)(struct xen_pcibk_device *pdev, struct pci_dev *dev,
+                        bool lock);
 	int (*add)(struct xen_pcibk_device *pdev, struct pci_dev *dev,
 		   int devid, publish_pci_dev_cb publish_cb);
 	struct pci_dev *(*get)(struct xen_pcibk_device *pdev,
@@ -122,10 +123,10 @@ static inline int xen_pcibk_add_pci_dev(struct xen_pcibk_device *pdev,
 }
 
 static inline void xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
-					     struct pci_dev *dev)
+					     struct pci_dev *dev, bool lock)
 {
 	if (xen_pcibk_backend && xen_pcibk_backend->release)
-		return xen_pcibk_backend->release(pdev, dev);
+		return xen_pcibk_backend->release(pdev, dev, lock);
 }
 
 static inline struct pci_dev *
diff --git a/drivers/xen/xen-pciback/vpci.c b/drivers/xen/xen-pciback/vpci.c
index 51afff9..c99f8bb 100644
--- a/drivers/xen/xen-pciback/vpci.c
+++ b/drivers/xen/xen-pciback/vpci.c
@@ -145,7 +145,7 @@ out:
 }
 
 static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
-					struct pci_dev *dev)
+					struct pci_dev *dev, bool lock)
 {
 	int slot;
 	struct vpci_dev_data *vpci_dev = pdev->pci_dev_data;
@@ -169,8 +169,13 @@ static void __xen_pcibk_release_pci_dev(struct xen_pcibk_device *pdev,
 out:
 	mutex_unlock(&vpci_dev->lock);
 
-	if (found_dev)
+	if (found_dev) {
+		if (lock)
+			device_lock(&found_dev->dev);
 		pcistub_put_pci_dev(found_dev);
+		if (lock)
+			device_unlock(&found_dev->dev);
+	}
 }
 
 static int __xen_pcibk_init_devices(struct xen_pcibk_device *pdev)
@@ -208,8 +213,11 @@ static void __xen_pcibk_release_devices(struct xen_pcibk_device *pdev)
 		struct pci_dev_entry *e, *tmp;
 		list_for_each_entry_safe(e, tmp, &vpci_dev->dev_list[slot],
 					 list) {
+			struct pci_dev *dev = e->dev;
 			list_del(&e->list);
-			pcistub_put_pci_dev(e->dev);
+			device_lock(&dev->dev);
+			pcistub_put_pci_dev(dev);
+			device_unlock(&dev->dev);
 			kfree(e);
 		}
 	}
diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index 4a7e6e0..b3318fd 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -290,7 +290,7 @@ static int xen_pcibk_remove_device(struct xen_pcibk_device *pdev,
 
 	/* N.B. This ends up calling pcistub_put_pci_dev which ends up
 	 * doing the FLR. */
-	xen_pcibk_release_pci_dev(pdev, dev);
+	xen_pcibk_release_pci_dev(pdev, dev, true /* use the lock. */);
 
 out:
 	return err;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 3/6] driver core: Provide an wrapper around the mutex to do lockdep warnings
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (3 preceding siblings ...)
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 17:39   ` Greg KH
  2014-07-14 17:39   ` Greg KH
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
                   ` (10 subsequent siblings)
  15 siblings, 2 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

Instead of open-coding it in drivers that want to double check
that their functions are indeed holding the device lock.

CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 2 +-
 include/linux/device.h             | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index d4cae5b..f9bf793 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -278,7 +278,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
-	lockdep_assert_held(&dev->dev.mutex);
+	device_lock_assert(&dev->dev);
 	__pci_reset_function_locked(dev);
 	pci_restore_state(dev);
 
diff --git a/include/linux/device.h b/include/linux/device.h
index af424ac..1d29fb2 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -908,6 +908,11 @@ static inline void device_unlock(struct device *dev)
 	mutex_unlock(&dev->mutex);
 }
 
+static inline void device_lock_assert(struct device *dev)
+{
+	lockdep_assert_held(&dev->mutex);
+}
+
 void driver_init(void);
 
 /*
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 3/6] driver core: Provide an wrapper around the mutex to do lockdep warnings
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (4 preceding siblings ...)
  2014-07-14 16:18 ` [PATCH v5 3/6] driver core: Provide an wrapper around the mutex to do lockdep warnings Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` [PATCH v5 4/6] xen/pciback: Include the domain id if removing the device whilst still in use Konrad Rzeszutek Wilk
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

Instead of open-coding it in drivers that want to double check
that their functions are indeed holding the device lock.

CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 2 +-
 include/linux/device.h             | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index d4cae5b..f9bf793 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -278,7 +278,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
-	lockdep_assert_held(&dev->dev.mutex);
+	device_lock_assert(&dev->dev);
 	__pci_reset_function_locked(dev);
 	pci_restore_state(dev);
 
diff --git a/include/linux/device.h b/include/linux/device.h
index af424ac..1d29fb2 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -908,6 +908,11 @@ static inline void device_unlock(struct device *dev)
 	mutex_unlock(&dev->mutex);
 }
 
+static inline void device_lock_assert(struct device *dev)
+{
+	lockdep_assert_held(&dev->mutex);
+}
+
 void driver_init(void);
 
 /*
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 4/6] xen/pciback: Include the domain id if removing the device whilst still in use
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (6 preceding siblings ...)
  2014-07-14 16:18 ` [PATCH v5 4/6] xen/pciback: Include the domain id if removing the device whilst still in use Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` [PATCH v5 5/6] xen/pciback: Print out the domain owning the device Konrad Rzeszutek Wilk
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

Cleanup the function a bit - also include the id of the
domain that is using the device.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index f9bf793..121c725 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -553,12 +553,14 @@ static void pcistub_remove(struct pci_dev *dev)
 	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
 
 	if (found_psdev) {
-		dev_dbg(&dev->dev, "found device to remove - in use? %p\n",
-			found_psdev->pdev);
+		dev_dbg(&dev->dev, "found device to remove %s\n",
+			found_psdev->pdev ? "- in-use" : "");
 
 		if (found_psdev->pdev) {
-			pr_warn("****** removing device %s while still in-use! ******\n",
-			       pci_name(found_psdev->dev));
+			int domid = xen_find_device_domain_owner(dev);
+
+			pr_warn("****** removing device %s while still in-use by domain %d! ******\n",
+			       pci_name(found_psdev->dev), domid);
 			pr_warn("****** driver domain may still access this device's i/o resources!\n");
 			pr_warn("****** shutdown driver domain before binding device\n");
 			pr_warn("****** to other drivers or domains\n");
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 4/6] xen/pciback: Include the domain id if removing the device whilst still in use
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (5 preceding siblings ...)
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

Cleanup the function a bit - also include the id of the
domain that is using the device.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index f9bf793..121c725 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -553,12 +553,14 @@ static void pcistub_remove(struct pci_dev *dev)
 	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
 
 	if (found_psdev) {
-		dev_dbg(&dev->dev, "found device to remove - in use? %p\n",
-			found_psdev->pdev);
+		dev_dbg(&dev->dev, "found device to remove %s\n",
+			found_psdev->pdev ? "- in-use" : "");
 
 		if (found_psdev->pdev) {
-			pr_warn("****** removing device %s while still in-use! ******\n",
-			       pci_name(found_psdev->dev));
+			int domid = xen_find_device_domain_owner(dev);
+
+			pr_warn("****** removing device %s while still in-use by domain %d! ******\n",
+			       pci_name(found_psdev->dev), domid);
 			pr_warn("****** driver domain may still access this device's i/o resources!\n");
 			pr_warn("****** shutdown driver domain before binding device\n");
 			pr_warn("****** to other drivers or domains\n");
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 5/6] xen/pciback: Print out the domain owning the device.
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (7 preceding siblings ...)
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

We had been printing it only if the device was built with
debug enabled. But this information is useful in the field
to troubleshoot.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/xenbus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index b3318fd..53e2dda 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -246,7 +246,7 @@ static int xen_pcibk_export_device(struct xen_pcibk_device *pdev,
 	if (err)
 		goto out;
 
-	dev_dbg(&dev->dev, "registering for %d\n", pdev->xdev->otherend_id);
+	dev_info(&dev->dev, "registering for %d\n", pdev->xdev->otherend_id);
 	if (xen_register_device_domain_owner(dev,
 					     pdev->xdev->otherend_id) != 0) {
 		dev_err(&dev->dev, "Stealing ownership from dom%d.\n",
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 5/6] xen/pciback: Print out the domain owning the device.
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (8 preceding siblings ...)
  2014-07-14 16:18 ` [PATCH v5 5/6] xen/pciback: Print out the domain owning the device Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` [PATCH v5 6/6] xen/pciback: Remove tons of dereferences Konrad Rzeszutek Wilk
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

We had been printing it only if the device was built with
debug enabled. But this information is useful in the field
to troubleshoot.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/xenbus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index b3318fd..53e2dda 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -246,7 +246,7 @@ static int xen_pcibk_export_device(struct xen_pcibk_device *pdev,
 	if (err)
 		goto out;
 
-	dev_dbg(&dev->dev, "registering for %d\n", pdev->xdev->otherend_id);
+	dev_info(&dev->dev, "registering for %d\n", pdev->xdev->otherend_id);
 	if (xen_register_device_domain_owner(dev,
 					     pdev->xdev->otherend_id) != 0) {
 		dev_err(&dev->dev, "Stealing ownership from dom%d.\n",
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 6/6] xen/pciback: Remove tons of dereferences
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (9 preceding siblings ...)
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

A little cleanup. No functional difference.

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 121c725..1ddd22f 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -631,10 +631,12 @@ static pci_ers_result_t common_process(struct pcistub_device *psdev,
 {
 	pci_ers_result_t res = result;
 	struct xen_pcie_aer_op *aer_op;
+	struct xen_pcibk_device *pdev = psdev->pdev;
+	struct xen_pci_sharedinfo *sh_info = pdev->sh_info;
 	int ret;
 
 	/*with PV AER drivers*/
-	aer_op = &(psdev->pdev->sh_info->aer_op);
+	aer_op = &(sh_info->aer_op);
 	aer_op->cmd = aer_cmd ;
 	/*useful for error_detected callback*/
 	aer_op->err = state;
@@ -655,36 +657,36 @@ static pci_ers_result_t common_process(struct pcistub_device *psdev,
 	* this flag to judge whether we need to check pci-front give aer
 	* service ack signal
 	*/
-	set_bit(_PCIB_op_pending, (unsigned long *)&psdev->pdev->flags);
+	set_bit(_PCIB_op_pending, (unsigned long *)&pdev->flags);
 
 	/*It is possible that a pcifront conf_read_write ops request invokes
 	* the callback which cause the spurious execution of wake_up.
 	* Yet it is harmless and better than a spinlock here
 	*/
 	set_bit(_XEN_PCIB_active,
-		(unsigned long *)&psdev->pdev->sh_info->flags);
+		(unsigned long *)&sh_info->flags);
 	wmb();
-	notify_remote_via_irq(psdev->pdev->evtchn_irq);
+	notify_remote_via_irq(pdev->evtchn_irq);
 
 	ret = wait_event_timeout(xen_pcibk_aer_wait_queue,
 				 !(test_bit(_XEN_PCIB_active, (unsigned long *)
-				 &psdev->pdev->sh_info->flags)), 300*HZ);
+				 &sh_info->flags)), 300*HZ);
 
 	if (!ret) {
 		if (test_bit(_XEN_PCIB_active,
-			(unsigned long *)&psdev->pdev->sh_info->flags)) {
+			(unsigned long *)&sh_info->flags)) {
 			dev_err(&psdev->dev->dev,
 				"pcifront aer process not responding!\n");
 			clear_bit(_XEN_PCIB_active,
-			  (unsigned long *)&psdev->pdev->sh_info->flags);
+			  (unsigned long *)&sh_info->flags);
 			aer_op->err = PCI_ERS_RESULT_NONE;
 			return res;
 		}
 	}
-	clear_bit(_PCIB_op_pending, (unsigned long *)&psdev->pdev->flags);
+	clear_bit(_PCIB_op_pending, (unsigned long *)&pdev->flags);
 
 	if (test_bit(_XEN_PCIF_active,
-		(unsigned long *)&psdev->pdev->sh_info->flags)) {
+		(unsigned long *)&sh_info->flags)) {
 		dev_dbg(&psdev->dev->dev,
 			"schedule pci_conf service in " DRV_NAME "\n");
 		xen_pcibk_test_and_schedule_op(psdev->pdev);
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v5 6/6] xen/pciback: Remove tons of dereferences
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (10 preceding siblings ...)
  2014-07-14 16:18 ` [PATCH v5 6/6] xen/pciback: Remove tons of dereferences Konrad Rzeszutek Wilk
@ 2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-14 17:40 ` [PATCH v5] Fixes to Xen pciback for 3.17 Greg KH
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel
  Cc: Konrad Rzeszutek Wilk

A little cleanup. No functional difference.

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 121c725..1ddd22f 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -631,10 +631,12 @@ static pci_ers_result_t common_process(struct pcistub_device *psdev,
 {
 	pci_ers_result_t res = result;
 	struct xen_pcie_aer_op *aer_op;
+	struct xen_pcibk_device *pdev = psdev->pdev;
+	struct xen_pci_sharedinfo *sh_info = pdev->sh_info;
 	int ret;
 
 	/*with PV AER drivers*/
-	aer_op = &(psdev->pdev->sh_info->aer_op);
+	aer_op = &(sh_info->aer_op);
 	aer_op->cmd = aer_cmd ;
 	/*useful for error_detected callback*/
 	aer_op->err = state;
@@ -655,36 +657,36 @@ static pci_ers_result_t common_process(struct pcistub_device *psdev,
 	* this flag to judge whether we need to check pci-front give aer
 	* service ack signal
 	*/
-	set_bit(_PCIB_op_pending, (unsigned long *)&psdev->pdev->flags);
+	set_bit(_PCIB_op_pending, (unsigned long *)&pdev->flags);
 
 	/*It is possible that a pcifront conf_read_write ops request invokes
 	* the callback which cause the spurious execution of wake_up.
 	* Yet it is harmless and better than a spinlock here
 	*/
 	set_bit(_XEN_PCIB_active,
-		(unsigned long *)&psdev->pdev->sh_info->flags);
+		(unsigned long *)&sh_info->flags);
 	wmb();
-	notify_remote_via_irq(psdev->pdev->evtchn_irq);
+	notify_remote_via_irq(pdev->evtchn_irq);
 
 	ret = wait_event_timeout(xen_pcibk_aer_wait_queue,
 				 !(test_bit(_XEN_PCIB_active, (unsigned long *)
-				 &psdev->pdev->sh_info->flags)), 300*HZ);
+				 &sh_info->flags)), 300*HZ);
 
 	if (!ret) {
 		if (test_bit(_XEN_PCIB_active,
-			(unsigned long *)&psdev->pdev->sh_info->flags)) {
+			(unsigned long *)&sh_info->flags)) {
 			dev_err(&psdev->dev->dev,
 				"pcifront aer process not responding!\n");
 			clear_bit(_XEN_PCIB_active,
-			  (unsigned long *)&psdev->pdev->sh_info->flags);
+			  (unsigned long *)&sh_info->flags);
 			aer_op->err = PCI_ERS_RESULT_NONE;
 			return res;
 		}
 	}
-	clear_bit(_PCIB_op_pending, (unsigned long *)&psdev->pdev->flags);
+	clear_bit(_PCIB_op_pending, (unsigned long *)&pdev->flags);
 
 	if (test_bit(_XEN_PCIF_active,
-		(unsigned long *)&psdev->pdev->sh_info->flags)) {
+		(unsigned long *)&sh_info->flags)) {
 		dev_dbg(&psdev->dev->dev,
 			"schedule pci_conf service in " DRV_NAME "\n");
 		xen_pcibk_test_and_schedule_op(psdev->pdev);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-07-14 17:40 ` [PATCH v5] Fixes to Xen pciback for 3.17 Greg KH
  2014-07-14 17:39   ` Konrad Rzeszutek Wilk
@ 2014-07-14 17:39   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 17:39 UTC (permalink / raw)
  To: Greg KH; +Cc: xen-devel, linux-kernel, boris.ostrovsky, david.vrabel

On Mon, Jul 14, 2014 at 10:40:42AM -0700, Greg KH wrote:
> On Mon, Jul 14, 2014 at 12:18:50PM -0400, Konrad Rzeszutek Wilk wrote:
> > Greg: goto GHK
> > 
> > This is v5 version of patches to fix some issues in Xen PCIback.
> > 
> > One of the issues Xen PCI back has that patch:
> > 
> > is fixing is that a deadlock can happen if the PCI device is
> > assigned to a guest and we try to 'unbind' it from Xen 'pciback' driver.
> > The issue is rather simple - the SysFS mechanism for the 'unbind' path
> > takes a device lock and the code in Xen PCI uses the pci_reset_function
> > which also takes the same lock. Solution is to use the lock-less version
> > and mandate that callers of said function in Xen pciback take the lock.
> > Easy enough.
> > 
> > GHK:
> > To guard against this happening in the future we also add an assert in the
> > form of lockdep assertion. That is OK except that it looks ugly as we take
> > it straight from the 'struct device' instead of using an appropriate macro.
> > See:
> > 
> > +       lockdep_assert_held(&dev->dev.mutex);
> > 
> > (in [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding).
> > 
> > The patch: [PATCH v5 3/6] driver core: Provide an wrapper around the mutex
> > to do.
> > 
> > introduces a nice wrapper so it is bit cleaner. Greg, if you are OK with
> > it could you kindly Ack it as I would prefer to put this patchset
> > via the Xen tree. It would look now as:
> > 
> > -       lockdep_assert_held(&dev->dev.mutex);
> > +       device_lock_assert(&dev->dev);
> > 
> > I can also squash it in "[PATCH v5 2/6] xen/pciback: Don't deadlock when
> >  unbinding." but since that one is going through the stable tree I wasn't
> > sure whether you (Greg KH) would be OK with that.
> 
> You have my ack now, and feel free to squash it into patch 2/6 if you
> want, I don't mind having that in the stable trees.

Fantastic! Thank you.
> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-07-14 17:40 ` [PATCH v5] Fixes to Xen pciback for 3.17 Greg KH
@ 2014-07-14 17:39   ` Konrad Rzeszutek Wilk
  2014-07-14 17:39   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 17:39 UTC (permalink / raw)
  To: Greg KH; +Cc: xen-devel, boris.ostrovsky, linux-kernel, david.vrabel

On Mon, Jul 14, 2014 at 10:40:42AM -0700, Greg KH wrote:
> On Mon, Jul 14, 2014 at 12:18:50PM -0400, Konrad Rzeszutek Wilk wrote:
> > Greg: goto GHK
> > 
> > This is v5 version of patches to fix some issues in Xen PCIback.
> > 
> > One of the issues Xen PCI back has that patch:
> > 
> > is fixing is that a deadlock can happen if the PCI device is
> > assigned to a guest and we try to 'unbind' it from Xen 'pciback' driver.
> > The issue is rather simple - the SysFS mechanism for the 'unbind' path
> > takes a device lock and the code in Xen PCI uses the pci_reset_function
> > which also takes the same lock. Solution is to use the lock-less version
> > and mandate that callers of said function in Xen pciback take the lock.
> > Easy enough.
> > 
> > GHK:
> > To guard against this happening in the future we also add an assert in the
> > form of lockdep assertion. That is OK except that it looks ugly as we take
> > it straight from the 'struct device' instead of using an appropriate macro.
> > See:
> > 
> > +       lockdep_assert_held(&dev->dev.mutex);
> > 
> > (in [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding).
> > 
> > The patch: [PATCH v5 3/6] driver core: Provide an wrapper around the mutex
> > to do.
> > 
> > introduces a nice wrapper so it is bit cleaner. Greg, if you are OK with
> > it could you kindly Ack it as I would prefer to put this patchset
> > via the Xen tree. It would look now as:
> > 
> > -       lockdep_assert_held(&dev->dev.mutex);
> > +       device_lock_assert(&dev->dev);
> > 
> > I can also squash it in "[PATCH v5 2/6] xen/pciback: Don't deadlock when
> >  unbinding." but since that one is going through the stable tree I wasn't
> > sure whether you (Greg KH) would be OK with that.
> 
> You have my ack now, and feel free to squash it into patch 2/6 if you
> want, I don't mind having that in the stable trees.

Fantastic! Thank you.
> 
> thanks,
> 
> greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 3/6] driver core: Provide an wrapper around the mutex to do lockdep warnings
  2014-07-14 16:18 ` [PATCH v5 3/6] driver core: Provide an wrapper around the mutex to do lockdep warnings Konrad Rzeszutek Wilk
  2014-07-14 17:39   ` Greg KH
@ 2014-07-14 17:39   ` Greg KH
  1 sibling, 0 replies; 68+ messages in thread
From: Greg KH @ 2014-07-14 17:39 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, linux-kernel, boris.ostrovsky, david.vrabel

On Mon, Jul 14, 2014 at 12:18:53PM -0400, Konrad Rzeszutek Wilk wrote:
> Instead of open-coding it in drivers that want to double check
> that their functions are indeed holding the device lock.
> 
> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 3/6] driver core: Provide an wrapper around the mutex to do lockdep warnings
  2014-07-14 16:18 ` [PATCH v5 3/6] driver core: Provide an wrapper around the mutex to do lockdep warnings Konrad Rzeszutek Wilk
@ 2014-07-14 17:39   ` Greg KH
  2014-07-14 17:39   ` Greg KH
  1 sibling, 0 replies; 68+ messages in thread
From: Greg KH @ 2014-07-14 17:39 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, boris.ostrovsky, linux-kernel, david.vrabel

On Mon, Jul 14, 2014 at 12:18:53PM -0400, Konrad Rzeszutek Wilk wrote:
> Instead of open-coding it in drivers that want to double check
> that their functions are indeed holding the device lock.
> 
> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (11 preceding siblings ...)
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
@ 2014-07-14 17:40 ` Greg KH
  2014-07-14 17:39   ` Konrad Rzeszutek Wilk
  2014-07-14 17:39   ` Konrad Rzeszutek Wilk
  2014-07-14 17:40 ` Greg KH
                   ` (2 subsequent siblings)
  15 siblings, 2 replies; 68+ messages in thread
From: Greg KH @ 2014-07-14 17:40 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, linux-kernel, boris.ostrovsky, david.vrabel

On Mon, Jul 14, 2014 at 12:18:50PM -0400, Konrad Rzeszutek Wilk wrote:
> Greg: goto GHK
> 
> This is v5 version of patches to fix some issues in Xen PCIback.
> 
> One of the issues Xen PCI back has that patch:
> 
> is fixing is that a deadlock can happen if the PCI device is
> assigned to a guest and we try to 'unbind' it from Xen 'pciback' driver.
> The issue is rather simple - the SysFS mechanism for the 'unbind' path
> takes a device lock and the code in Xen PCI uses the pci_reset_function
> which also takes the same lock. Solution is to use the lock-less version
> and mandate that callers of said function in Xen pciback take the lock.
> Easy enough.
> 
> GHK:
> To guard against this happening in the future we also add an assert in the
> form of lockdep assertion. That is OK except that it looks ugly as we take
> it straight from the 'struct device' instead of using an appropriate macro.
> See:
> 
> +       lockdep_assert_held(&dev->dev.mutex);
> 
> (in [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding).
> 
> The patch: [PATCH v5 3/6] driver core: Provide an wrapper around the mutex
> to do.
> 
> introduces a nice wrapper so it is bit cleaner. Greg, if you are OK with
> it could you kindly Ack it as I would prefer to put this patchset
> via the Xen tree. It would look now as:
> 
> -       lockdep_assert_held(&dev->dev.mutex);
> +       device_lock_assert(&dev->dev);
> 
> I can also squash it in "[PATCH v5 2/6] xen/pciback: Don't deadlock when
>  unbinding." but since that one is going through the stable tree I wasn't
> sure whether you (Greg KH) would be OK with that.

You have my ack now, and feel free to squash it into patch 2/6 if you
want, I don't mind having that in the stable trees.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (12 preceding siblings ...)
  2014-07-14 17:40 ` [PATCH v5] Fixes to Xen pciback for 3.17 Greg KH
@ 2014-07-14 17:40 ` Greg KH
  2014-08-01 15:30 ` David Vrabel
  2014-08-01 15:30 ` David Vrabel
  15 siblings, 0 replies; 68+ messages in thread
From: Greg KH @ 2014-07-14 17:40 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, boris.ostrovsky, linux-kernel, david.vrabel

On Mon, Jul 14, 2014 at 12:18:50PM -0400, Konrad Rzeszutek Wilk wrote:
> Greg: goto GHK
> 
> This is v5 version of patches to fix some issues in Xen PCIback.
> 
> One of the issues Xen PCI back has that patch:
> 
> is fixing is that a deadlock can happen if the PCI device is
> assigned to a guest and we try to 'unbind' it from Xen 'pciback' driver.
> The issue is rather simple - the SysFS mechanism for the 'unbind' path
> takes a device lock and the code in Xen PCI uses the pci_reset_function
> which also takes the same lock. Solution is to use the lock-less version
> and mandate that callers of said function in Xen pciback take the lock.
> Easy enough.
> 
> GHK:
> To guard against this happening in the future we also add an assert in the
> form of lockdep assertion. That is OK except that it looks ugly as we take
> it straight from the 'struct device' instead of using an appropriate macro.
> See:
> 
> +       lockdep_assert_held(&dev->dev.mutex);
> 
> (in [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding).
> 
> The patch: [PATCH v5 3/6] driver core: Provide an wrapper around the mutex
> to do.
> 
> introduces a nice wrapper so it is bit cleaner. Greg, if you are OK with
> it could you kindly Ack it as I would prefer to put this patchset
> via the Xen tree. It would look now as:
> 
> -       lockdep_assert_held(&dev->dev.mutex);
> +       device_lock_assert(&dev->dev);
> 
> I can also squash it in "[PATCH v5 2/6] xen/pciback: Don't deadlock when
>  unbinding." but since that one is going through the stable tree I wasn't
> sure whether you (Greg KH) would be OK with that.

You have my ack now, and feel free to squash it into patch 2/6 if you
want, I don't mind having that in the stable trees.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
@ 2014-07-28 13:04   ` David Vrabel
  2014-07-28 14:56     ` Greg KH
  2014-07-28 14:56     ` Greg KH
  2014-07-28 13:04   ` David Vrabel
  1 sibling, 2 replies; 68+ messages in thread
From: David Vrabel @ 2014-07-28 13:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, gregkh, xen-devel, linux-kernel, boris.ostrovsky

On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> Which hadn't been done with the initial commit.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> v2: Dropped the parameters and one that is unlikeable.
> ---
>  Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-driver-pciback
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
> new file mode 100644
> index 0000000..cdc8340
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-driver-pciback
> @@ -0,0 +1,25 @@
> +What:           /sys/bus/pci/drivers/pciback/quirks
> +Date:           Oct 2011
> +KernelVersion:  3.1
> +Contact:        xen-devel@lists.xenproject.org
> +Description:
> +                If the permissive attribute is set, then writing a string in
> +                the format of DDDD:BB:DD.F-REG:SIZE:MASK will allow the guest
> +                to write and read from the PCI device. That is Domain:Bus:

"...write and read the PCI device's configuration space."

> +                Device.Function-Register:Size:Mask (Domain is optional).
> +                For example:
> +                #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
> +                will allow the guest to read and write to the configuration
> +                register 0x0E.
> +
> +What:           /sys/bus/pci/drivers/pciback/irq_handlers
> +Date:           Oct 2011
> +KernelVersion:  3.1
> +Contact:        xen-devel@lists.xenproject.org
> +Description:
> +                A list of all of the PCI devices owned by Xen PCI back and
> +                whether Xen PCI backend will acknowledge the interrupts received
> +                and the amount of interrupts received. Xen PCI back acknowledges
> +                said interrupts only when they are level, shared with another
> +                guest, and enabled by the guest.

That's not very nice sysfs file.  This sort of thing ought to be in
debugfs.  Perhaps we shouldn't document it for now and try to move it to
debugfs later?

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS
  2014-07-14 16:18 ` Konrad Rzeszutek Wilk
  2014-07-28 13:04   ` David Vrabel
@ 2014-07-28 13:04   ` David Vrabel
  1 sibling, 0 replies; 68+ messages in thread
From: David Vrabel @ 2014-07-28 13:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, gregkh, xen-devel, linux-kernel, boris.ostrovsky

On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> Which hadn't been done with the initial commit.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> v2: Dropped the parameters and one that is unlikeable.
> ---
>  Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-driver-pciback
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
> new file mode 100644
> index 0000000..cdc8340
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-driver-pciback
> @@ -0,0 +1,25 @@
> +What:           /sys/bus/pci/drivers/pciback/quirks
> +Date:           Oct 2011
> +KernelVersion:  3.1
> +Contact:        xen-devel@lists.xenproject.org
> +Description:
> +                If the permissive attribute is set, then writing a string in
> +                the format of DDDD:BB:DD.F-REG:SIZE:MASK will allow the guest
> +                to write and read from the PCI device. That is Domain:Bus:

"...write and read the PCI device's configuration space."

> +                Device.Function-Register:Size:Mask (Domain is optional).
> +                For example:
> +                #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
> +                will allow the guest to read and write to the configuration
> +                register 0x0E.
> +
> +What:           /sys/bus/pci/drivers/pciback/irq_handlers
> +Date:           Oct 2011
> +KernelVersion:  3.1
> +Contact:        xen-devel@lists.xenproject.org
> +Description:
> +                A list of all of the PCI devices owned by Xen PCI back and
> +                whether Xen PCI backend will acknowledge the interrupts received
> +                and the amount of interrupts received. Xen PCI back acknowledges
> +                said interrupts only when they are level, shared with another
> +                guest, and enabled by the guest.

That's not very nice sysfs file.  This sort of thing ought to be in
debugfs.  Perhaps we shouldn't document it for now and try to move it to
debugfs later?

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding.
  2014-07-14 16:18 ` [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding Konrad Rzeszutek Wilk
  2014-07-28 13:06   ` David Vrabel
@ 2014-07-28 13:06   ` David Vrabel
  2014-08-04 18:42     ` Konrad Rzeszutek Wilk
  2014-08-04 18:42     ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 68+ messages in thread
From: David Vrabel @ 2014-07-28 13:06 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, gregkh, xen-devel, linux-kernel, boris.ostrovsky
  Cc: stable

On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> As commit 0a9fd0152929db372ff61b0d6c280fdd34ae8bdb
> 'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
> explained there are four entry points in this function.
> Two of them are when the user fiddles in the SysFS to
> unbind a device which might be in use by a guest or not.
> 
> Both 'unbind' states will cause a deadlock as the the PCI lock has
> already been taken, which then pci_device_reset tries to take.
> 
> We can simplify this by requiring that all callers of
> pcistub_put_pci_dev MUST hold the device lock. And then
> we can just call the lockless version of pci_device_reset.
> 
> To make it even simpler we will modify xen_pcibk_release_pci_dev
> to quality whether it should take a lock or not - as it ends
> up calling xen_pcibk_release_pci_dev and needs to hold the lock.
> 
> CC: stable@vger.kernel.org

This deadlock is for a rather specific and uncommon use case (manually
unbinding a PCI while it is passed-through). Is this critical enough to
warrant a stable backport?

> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding.
  2014-07-14 16:18 ` [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding Konrad Rzeszutek Wilk
@ 2014-07-28 13:06   ` David Vrabel
  2014-07-28 13:06   ` David Vrabel
  1 sibling, 0 replies; 68+ messages in thread
From: David Vrabel @ 2014-07-28 13:06 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, gregkh, xen-devel, linux-kernel, boris.ostrovsky
  Cc: stable

On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> As commit 0a9fd0152929db372ff61b0d6c280fdd34ae8bdb
> 'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
> explained there are four entry points in this function.
> Two of them are when the user fiddles in the SysFS to
> unbind a device which might be in use by a guest or not.
> 
> Both 'unbind' states will cause a deadlock as the the PCI lock has
> already been taken, which then pci_device_reset tries to take.
> 
> We can simplify this by requiring that all callers of
> pcistub_put_pci_dev MUST hold the device lock. And then
> we can just call the lockless version of pci_device_reset.
> 
> To make it even simpler we will modify xen_pcibk_release_pci_dev
> to quality whether it should take a lock or not - as it ends
> up calling xen_pcibk_release_pci_dev and needs to hold the lock.
> 
> CC: stable@vger.kernel.org

This deadlock is for a rather specific and uncommon use case (manually
unbinding a PCI while it is passed-through). Is this critical enough to
warrant a stable backport?

> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS
  2014-07-28 13:04   ` David Vrabel
  2014-07-28 14:56     ` Greg KH
@ 2014-07-28 14:56     ` Greg KH
  2014-08-01 14:59       ` David Vrabel
  2014-08-01 14:59       ` [Xen-devel] " David Vrabel
  1 sibling, 2 replies; 68+ messages in thread
From: Greg KH @ 2014-07-28 14:56 UTC (permalink / raw)
  To: David Vrabel
  Cc: Konrad Rzeszutek Wilk, xen-devel, linux-kernel, boris.ostrovsky

On Mon, Jul 28, 2014 at 02:04:18PM +0100, David Vrabel wrote:
> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> > Which hadn't been done with the initial commit.
> > 
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > ---
> > v2: Dropped the parameters and one that is unlikeable.
> > ---
> >  Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> >  create mode 100644 Documentation/ABI/testing/sysfs-driver-pciback
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
> > new file mode 100644
> > index 0000000..cdc8340
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-driver-pciback
> > @@ -0,0 +1,25 @@
> > +What:           /sys/bus/pci/drivers/pciback/quirks
> > +Date:           Oct 2011
> > +KernelVersion:  3.1
> > +Contact:        xen-devel@lists.xenproject.org
> > +Description:
> > +                If the permissive attribute is set, then writing a string in
> > +                the format of DDDD:BB:DD.F-REG:SIZE:MASK will allow the guest
> > +                to write and read from the PCI device. That is Domain:Bus:
> 
> "...write and read the PCI device's configuration space."

How is this different from the normal pci device config file?

> 
> > +                Device.Function-Register:Size:Mask (Domain is optional).
> > +                For example:
> > +                #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
> > +                will allow the guest to read and write to the configuration
> > +                register 0x0E.
> > +
> > +What:           /sys/bus/pci/drivers/pciback/irq_handlers
> > +Date:           Oct 2011
> > +KernelVersion:  3.1
> > +Contact:        xen-devel@lists.xenproject.org
> > +Description:
> > +                A list of all of the PCI devices owned by Xen PCI back and
> > +                whether Xen PCI backend will acknowledge the interrupts received
> > +                and the amount of interrupts received. Xen PCI back acknowledges
> > +                said interrupts only when they are level, shared with another
> > +                guest, and enabled by the guest.
> 
> That's not very nice sysfs file.  This sort of thing ought to be in
> debugfs.  Perhaps we shouldn't document it for now and try to move it to
> debugfs later?

Move it to debugfs now would be better, that's not an acceptable sysfs
file at all, thanks for pointing it out.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS
  2014-07-28 13:04   ` David Vrabel
@ 2014-07-28 14:56     ` Greg KH
  2014-07-28 14:56     ` Greg KH
  1 sibling, 0 replies; 68+ messages in thread
From: Greg KH @ 2014-07-28 14:56 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, boris.ostrovsky, linux-kernel

On Mon, Jul 28, 2014 at 02:04:18PM +0100, David Vrabel wrote:
> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> > Which hadn't been done with the initial commit.
> > 
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > ---
> > v2: Dropped the parameters and one that is unlikeable.
> > ---
> >  Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> >  create mode 100644 Documentation/ABI/testing/sysfs-driver-pciback
> > 
> > diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
> > new file mode 100644
> > index 0000000..cdc8340
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-driver-pciback
> > @@ -0,0 +1,25 @@
> > +What:           /sys/bus/pci/drivers/pciback/quirks
> > +Date:           Oct 2011
> > +KernelVersion:  3.1
> > +Contact:        xen-devel@lists.xenproject.org
> > +Description:
> > +                If the permissive attribute is set, then writing a string in
> > +                the format of DDDD:BB:DD.F-REG:SIZE:MASK will allow the guest
> > +                to write and read from the PCI device. That is Domain:Bus:
> 
> "...write and read the PCI device's configuration space."

How is this different from the normal pci device config file?

> 
> > +                Device.Function-Register:Size:Mask (Domain is optional).
> > +                For example:
> > +                #echo 00:19.0-E0:2:FF > /sys/bus/pci/drivers/pciback/quirks
> > +                will allow the guest to read and write to the configuration
> > +                register 0x0E.
> > +
> > +What:           /sys/bus/pci/drivers/pciback/irq_handlers
> > +Date:           Oct 2011
> > +KernelVersion:  3.1
> > +Contact:        xen-devel@lists.xenproject.org
> > +Description:
> > +                A list of all of the PCI devices owned by Xen PCI back and
> > +                whether Xen PCI backend will acknowledge the interrupts received
> > +                and the amount of interrupts received. Xen PCI back acknowledges
> > +                said interrupts only when they are level, shared with another
> > +                guest, and enabled by the guest.
> 
> That's not very nice sysfs file.  This sort of thing ought to be in
> debugfs.  Perhaps we shouldn't document it for now and try to move it to
> debugfs later?

Move it to debugfs now would be better, that's not an acceptable sysfs
file at all, thanks for pointing it out.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS
  2014-07-28 14:56     ` Greg KH
  2014-08-01 14:59       ` David Vrabel
@ 2014-08-01 14:59       ` David Vrabel
  1 sibling, 0 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-01 14:59 UTC (permalink / raw)
  To: Greg KH, David Vrabel; +Cc: xen-devel, boris.ostrovsky, linux-kernel

On 28/07/14 15:56, Greg KH wrote:
> On Mon, Jul 28, 2014 at 02:04:18PM +0100, David Vrabel wrote:
>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>> Which hadn't been done with the initial commit.
>>>
>>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> ---
>>> v2: Dropped the parameters and one that is unlikeable.
>>> ---
>>>  Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++++++++++++
>>>  1 file changed, 25 insertions(+)
>>>  create mode 100644 Documentation/ABI/testing/sysfs-driver-pciback
>>>
>>> diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
>>> new file mode 100644
>>> index 0000000..cdc8340
>>> --- /dev/null
>>> +++ b/Documentation/ABI/testing/sysfs-driver-pciback
>>> @@ -0,0 +1,25 @@
>>> +What:           /sys/bus/pci/drivers/pciback/quirks
>>> +Date:           Oct 2011
>>> +KernelVersion:  3.1
>>> +Contact:        xen-devel@lists.xenproject.org
>>> +Description:
>>> +                If the permissive attribute is set, then writing a string in
>>> +                the format of DDDD:BB:DD.F-REG:SIZE:MASK will allow the guest
>>> +                to write and read from the PCI device. That is Domain:Bus:
>>
>> "...write and read the PCI device's configuration space."
> 
> How is this different from the normal pci device config file?

This is setting the permissions for a guest to access the real hardware
PCI config space (instead of the virtualized config space which the
guest has access to by default).

The guest access the config space using the normal methods (the file or
the kernel functions).

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS
  2014-07-28 14:56     ` Greg KH
@ 2014-08-01 14:59       ` David Vrabel
  2014-08-01 14:59       ` [Xen-devel] " David Vrabel
  1 sibling, 0 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-01 14:59 UTC (permalink / raw)
  To: Greg KH, David Vrabel; +Cc: xen-devel, boris.ostrovsky, linux-kernel

On 28/07/14 15:56, Greg KH wrote:
> On Mon, Jul 28, 2014 at 02:04:18PM +0100, David Vrabel wrote:
>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>> Which hadn't been done with the initial commit.
>>>
>>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> ---
>>> v2: Dropped the parameters and one that is unlikeable.
>>> ---
>>>  Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++++++++++++
>>>  1 file changed, 25 insertions(+)
>>>  create mode 100644 Documentation/ABI/testing/sysfs-driver-pciback
>>>
>>> diff --git a/Documentation/ABI/testing/sysfs-driver-pciback b/Documentation/ABI/testing/sysfs-driver-pciback
>>> new file mode 100644
>>> index 0000000..cdc8340
>>> --- /dev/null
>>> +++ b/Documentation/ABI/testing/sysfs-driver-pciback
>>> @@ -0,0 +1,25 @@
>>> +What:           /sys/bus/pci/drivers/pciback/quirks
>>> +Date:           Oct 2011
>>> +KernelVersion:  3.1
>>> +Contact:        xen-devel@lists.xenproject.org
>>> +Description:
>>> +                If the permissive attribute is set, then writing a string in
>>> +                the format of DDDD:BB:DD.F-REG:SIZE:MASK will allow the guest
>>> +                to write and read from the PCI device. That is Domain:Bus:
>>
>> "...write and read the PCI device's configuration space."
> 
> How is this different from the normal pci device config file?

This is setting the permissions for a guest to access the real hardware
PCI config space (instead of the virtualized config space which the
guest has access to by default).

The guest access the config space using the normal methods (the file or
the kernel functions).

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (13 preceding siblings ...)
  2014-07-14 17:40 ` Greg KH
@ 2014-08-01 15:30 ` David Vrabel
  2014-08-04 18:43   ` Konrad Rzeszutek Wilk
  2014-08-04 18:43   ` Konrad Rzeszutek Wilk
  2014-08-01 15:30 ` David Vrabel
  15 siblings, 2 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-01 15:30 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, gregkh, xen-devel, linux-kernel, boris.ostrovsky

On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> Greg: goto GHK
> 
> This is v5 version of patches to fix some issues in Xen PCIback.

Applied to devel/for-linus-3.17.

I dropped the stable Cc for #2 pending a final decision on whether it
really is a stable candidate.

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
                   ` (14 preceding siblings ...)
  2014-08-01 15:30 ` David Vrabel
@ 2014-08-01 15:30 ` David Vrabel
  15 siblings, 0 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-01 15:30 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, gregkh, xen-devel, linux-kernel, boris.ostrovsky

On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> Greg: goto GHK
> 
> This is v5 version of patches to fix some issues in Xen PCIback.

Applied to devel/for-linus-3.17.

I dropped the stable Cc for #2 pending a final decision on whether it
really is a stable candidate.

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding.
  2014-07-28 13:06   ` David Vrabel
@ 2014-08-04 18:42     ` Konrad Rzeszutek Wilk
  2014-08-05  9:27       ` [Xen-devel] " David Vrabel
  2014-08-05  9:27       ` David Vrabel
  2014-08-04 18:42     ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-04 18:42 UTC (permalink / raw)
  To: David Vrabel; +Cc: gregkh, xen-devel, linux-kernel, boris.ostrovsky, stable

On Mon, Jul 28, 2014 at 02:06:59PM +0100, David Vrabel wrote:
> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> > As commit 0a9fd0152929db372ff61b0d6c280fdd34ae8bdb
> > 'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
> > explained there are four entry points in this function.
> > Two of them are when the user fiddles in the SysFS to
> > unbind a device which might be in use by a guest or not.
> > 
> > Both 'unbind' states will cause a deadlock as the the PCI lock has
> > already been taken, which then pci_device_reset tries to take.
> > 
> > We can simplify this by requiring that all callers of
> > pcistub_put_pci_dev MUST hold the device lock. And then
> > we can just call the lockless version of pci_device_reset.
> > 
> > To make it even simpler we will modify xen_pcibk_release_pci_dev
> > to quality whether it should take a lock or not - as it ends
> > up calling xen_pcibk_release_pci_dev and needs to hold the lock.
> > 
> > CC: stable@vger.kernel.org
> 
> This deadlock is for a rather specific and uncommon use case (manually
> unbinding a PCI while it is passed-through). Is this critical enough to
> warrant a stable backport?

We seem to trip over it frequently when rebooting a server.

That is the VF's end up being unbinded while the guests
are being shutdown. And depending on the timing we end up in a deadlock.

> 
> > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> Reviewed-by: David Vrabel <david.vrabel@citrix.com>
> 
> David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding.
  2014-07-28 13:06   ` David Vrabel
  2014-08-04 18:42     ` Konrad Rzeszutek Wilk
@ 2014-08-04 18:42     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-04 18:42 UTC (permalink / raw)
  To: David Vrabel; +Cc: gregkh, boris.ostrovsky, linux-kernel, stable, xen-devel

On Mon, Jul 28, 2014 at 02:06:59PM +0100, David Vrabel wrote:
> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> > As commit 0a9fd0152929db372ff61b0d6c280fdd34ae8bdb
> > 'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
> > explained there are four entry points in this function.
> > Two of them are when the user fiddles in the SysFS to
> > unbind a device which might be in use by a guest or not.
> > 
> > Both 'unbind' states will cause a deadlock as the the PCI lock has
> > already been taken, which then pci_device_reset tries to take.
> > 
> > We can simplify this by requiring that all callers of
> > pcistub_put_pci_dev MUST hold the device lock. And then
> > we can just call the lockless version of pci_device_reset.
> > 
> > To make it even simpler we will modify xen_pcibk_release_pci_dev
> > to quality whether it should take a lock or not - as it ends
> > up calling xen_pcibk_release_pci_dev and needs to hold the lock.
> > 
> > CC: stable@vger.kernel.org
> 
> This deadlock is for a rather specific and uncommon use case (manually
> unbinding a PCI while it is passed-through). Is this critical enough to
> warrant a stable backport?

We seem to trip over it frequently when rebooting a server.

That is the VF's end up being unbinded while the guests
are being shutdown. And depending on the timing we end up in a deadlock.

> 
> > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> Reviewed-by: David Vrabel <david.vrabel@citrix.com>
> 
> David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-01 15:30 ` David Vrabel
@ 2014-08-04 18:43   ` Konrad Rzeszutek Wilk
  2014-08-05  8:44     ` Sander Eikelenboom
  2014-08-05  8:44     ` [Xen-devel] " Sander Eikelenboom
  2014-08-04 18:43   ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-04 18:43 UTC (permalink / raw)
  To: David Vrabel; +Cc: gregkh, xen-devel, linux-kernel, boris.ostrovsky

On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> > Greg: goto GHK
> > 
> > This is v5 version of patches to fix some issues in Xen PCIback.
> 
> Applied to devel/for-linus-3.17.

Thank you.
> 
> I dropped the stable Cc for #2 pending a final decision on whether it
> really is a stable candidate.

OK.
> 
> David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-01 15:30 ` David Vrabel
  2014-08-04 18:43   ` Konrad Rzeszutek Wilk
@ 2014-08-04 18:43   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-04 18:43 UTC (permalink / raw)
  To: David Vrabel; +Cc: gregkh, boris.ostrovsky, linux-kernel, xen-devel

On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> > Greg: goto GHK
> > 
> > This is v5 version of patches to fix some issues in Xen PCIback.
> 
> Applied to devel/for-linus-3.17.

Thank you.
> 
> I dropped the stable Cc for #2 pending a final decision on whether it
> really is a stable candidate.

OK.
> 
> David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-04 18:43   ` Konrad Rzeszutek Wilk
  2014-08-05  8:44     ` Sander Eikelenboom
@ 2014-08-05  8:44     ` Sander Eikelenboom
  2014-08-05  9:31       ` David Vrabel
  2014-08-05  9:31       ` [Xen-devel] " David Vrabel
  1 sibling, 2 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-05  8:44 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: David Vrabel, gregkh, boris.ostrovsky, linux-kernel, xen-devel


Monday, August 4, 2014, 8:43:18 PM, you wrote:

> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> > Greg: goto GHK
>> > 
>> > This is v5 version of patches to fix some issues in Xen PCIback.
>> 
>> Applied to devel/for-linus-3.17.

> Thank you.
>> 
>> I dropped the stable Cc for #2 pending a final decision on whether it
>> really is a stable candidate.

> OK.
>> 
>> David

Hi Konrad / David,

This series still lacks a resolution on the sysfs /do_flr /reset,
as a result the pci devices are not reset after shutdown of a guest.
(no more pciback 0000:xx:xx.x: restoring config space at offset xxx)

So this series now introduces a regression to 3.16, which causes devices to malfunction 
after a guest reboot or after assigning the devices to another guest.

Apart from that .. i can't resist to remind the other issue with removing pci
devices passed through to HVM guests related to the signaling via xenstore,
described in:

http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html 


--
Sander



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-04 18:43   ` Konrad Rzeszutek Wilk
@ 2014-08-05  8:44     ` Sander Eikelenboom
  2014-08-05  8:44     ` [Xen-devel] " Sander Eikelenboom
  1 sibling, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-05  8:44 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel


Monday, August 4, 2014, 8:43:18 PM, you wrote:

> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> > Greg: goto GHK
>> > 
>> > This is v5 version of patches to fix some issues in Xen PCIback.
>> 
>> Applied to devel/for-linus-3.17.

> Thank you.
>> 
>> I dropped the stable Cc for #2 pending a final decision on whether it
>> really is a stable candidate.

> OK.
>> 
>> David

Hi Konrad / David,

This series still lacks a resolution on the sysfs /do_flr /reset,
as a result the pci devices are not reset after shutdown of a guest.
(no more pciback 0000:xx:xx.x: restoring config space at offset xxx)

So this series now introduces a regression to 3.16, which causes devices to malfunction 
after a guest reboot or after assigning the devices to another guest.

Apart from that .. i can't resist to remind the other issue with removing pci
devices passed through to HVM guests related to the signaling via xenstore,
described in:

http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html 


--
Sander

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding.
  2014-08-04 18:42     ` Konrad Rzeszutek Wilk
@ 2014-08-05  9:27       ` David Vrabel
  2014-08-05  9:27       ` David Vrabel
  1 sibling, 0 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-05  9:27 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, David Vrabel
  Cc: gregkh, boris.ostrovsky, linux-kernel, stable, xen-devel

On 04/08/14 19:42, Konrad Rzeszutek Wilk wrote:
> On Mon, Jul 28, 2014 at 02:06:59PM +0100, David Vrabel wrote:
>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>> As commit 0a9fd0152929db372ff61b0d6c280fdd34ae8bdb
>>> 'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
>>> explained there are four entry points in this function.
>>> Two of them are when the user fiddles in the SysFS to
>>> unbind a device which might be in use by a guest or not.
>>>
>>> Both 'unbind' states will cause a deadlock as the the PCI lock has
>>> already been taken, which then pci_device_reset tries to take.
>>>
>>> We can simplify this by requiring that all callers of
>>> pcistub_put_pci_dev MUST hold the device lock. And then
>>> we can just call the lockless version of pci_device_reset.
>>>
>>> To make it even simpler we will modify xen_pcibk_release_pci_dev
>>> to quality whether it should take a lock or not - as it ends
>>> up calling xen_pcibk_release_pci_dev and needs to hold the lock.
>>>
>>> CC: stable@vger.kernel.org
>>
>> This deadlock is for a rather specific and uncommon use case (manually
>> unbinding a PCI while it is passed-through). Is this critical enough to
>> warrant a stable backport?
> 
> We seem to trip over it frequently when rebooting a server.
> 
> That is the VF's end up being unbinded while the guests
> are being shutdown. And depending on the timing we end up in a deadlock.

Ok.  I'll add the stable tag.

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding.
  2014-08-04 18:42     ` Konrad Rzeszutek Wilk
  2014-08-05  9:27       ` [Xen-devel] " David Vrabel
@ 2014-08-05  9:27       ` David Vrabel
  1 sibling, 0 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-05  9:27 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, David Vrabel
  Cc: gregkh, boris.ostrovsky, linux-kernel, stable, xen-devel

On 04/08/14 19:42, Konrad Rzeszutek Wilk wrote:
> On Mon, Jul 28, 2014 at 02:06:59PM +0100, David Vrabel wrote:
>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>> As commit 0a9fd0152929db372ff61b0d6c280fdd34ae8bdb
>>> 'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
>>> explained there are four entry points in this function.
>>> Two of them are when the user fiddles in the SysFS to
>>> unbind a device which might be in use by a guest or not.
>>>
>>> Both 'unbind' states will cause a deadlock as the the PCI lock has
>>> already been taken, which then pci_device_reset tries to take.
>>>
>>> We can simplify this by requiring that all callers of
>>> pcistub_put_pci_dev MUST hold the device lock. And then
>>> we can just call the lockless version of pci_device_reset.
>>>
>>> To make it even simpler we will modify xen_pcibk_release_pci_dev
>>> to quality whether it should take a lock or not - as it ends
>>> up calling xen_pcibk_release_pci_dev and needs to hold the lock.
>>>
>>> CC: stable@vger.kernel.org
>>
>> This deadlock is for a rather specific and uncommon use case (manually
>> unbinding a PCI while it is passed-through). Is this critical enough to
>> warrant a stable backport?
> 
> We seem to trip over it frequently when rebooting a server.
> 
> That is the VF's end up being unbinded while the guests
> are being shutdown. And depending on the timing we end up in a deadlock.

Ok.  I'll add the stable tag.

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05  8:44     ` [Xen-devel] " Sander Eikelenboom
  2014-08-05  9:31       ` David Vrabel
@ 2014-08-05  9:31       ` David Vrabel
  2014-08-05  9:44         ` Sander Eikelenboom
  2014-08-05  9:44         ` [Xen-devel] " Sander Eikelenboom
  1 sibling, 2 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-05  9:31 UTC (permalink / raw)
  To: Sander Eikelenboom, Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel

On 05/08/14 09:44, Sander Eikelenboom wrote:
> 
> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> 
>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>>> Greg: goto GHK
>>>>
>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>>>
>>> Applied to devel/for-linus-3.17.
> 
>> Thank you.
>>>
>>> I dropped the stable Cc for #2 pending a final decision on whether it
>>> really is a stable candidate.
> 
>> OK.
>>>
>>> David
> 
> Hi Konrad / David,
> 
> This series still lacks a resolution on the sysfs /do_flr /reset,
> as a result the pci devices are not reset after shutdown of a guest.
> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> 
> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> after a guest reboot or after assigning the devices to another guest.

I don't follow what you're saying.  The lack of a device reset for PCI
devices with no FLR method isn't a regression as this has never worked.
 Can you explain in more detail what the regression is and which patch
caused it?

> Apart from that .. i can't resist to remind the other issue with removing pci
> devices passed through to HVM guests related to the signaling via xenstore,
> described in:
> 
> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html

I don't remember seeing you posting a patch...?

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05  8:44     ` [Xen-devel] " Sander Eikelenboom
@ 2014-08-05  9:31       ` David Vrabel
  2014-08-05  9:31       ` [Xen-devel] " David Vrabel
  1 sibling, 0 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-05  9:31 UTC (permalink / raw)
  To: Sander Eikelenboom, Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel

On 05/08/14 09:44, Sander Eikelenboom wrote:
> 
> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> 
>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>>> Greg: goto GHK
>>>>
>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>>>
>>> Applied to devel/for-linus-3.17.
> 
>> Thank you.
>>>
>>> I dropped the stable Cc for #2 pending a final decision on whether it
>>> really is a stable candidate.
> 
>> OK.
>>>
>>> David
> 
> Hi Konrad / David,
> 
> This series still lacks a resolution on the sysfs /do_flr /reset,
> as a result the pci devices are not reset after shutdown of a guest.
> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> 
> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> after a guest reboot or after assigning the devices to another guest.

I don't follow what you're saying.  The lack of a device reset for PCI
devices with no FLR method isn't a regression as this has never worked.
 Can you explain in more detail what the regression is and which patch
caused it?

> Apart from that .. i can't resist to remind the other issue with removing pci
> devices passed through to HVM guests related to the signaling via xenstore,
> described in:
> 
> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html

I don't remember seeing you posting a patch...?

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05  9:31       ` [Xen-devel] " David Vrabel
  2014-08-05  9:44         ` Sander Eikelenboom
@ 2014-08-05  9:44         ` Sander Eikelenboom
  2014-08-05 13:49           ` Konrad Rzeszutek Wilk
  2014-08-05 13:49           ` [Xen-devel] " Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-05  9:44 UTC (permalink / raw)
  To: David Vrabel
  Cc: Konrad Rzeszutek Wilk, gregkh, boris.ostrovsky, xen-devel, linux-kernel


Tuesday, August 5, 2014, 11:31:08 AM, you wrote:

> On 05/08/14 09:44, Sander Eikelenboom wrote:
>> 
>> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> 
>>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>>>> Greg: goto GHK
>>>>>
>>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>>>>
>>>> Applied to devel/for-linus-3.17.
>> 
>>> Thank you.
>>>>
>>>> I dropped the stable Cc for #2 pending a final decision on whether it
>>>> really is a stable candidate.
>> 
>>> OK.
>>>>
>>>> David
>> 
>> Hi Konrad / David,
>> 
>> This series still lacks a resolution on the sysfs /do_flr /reset,
>> as a result the pci devices are not reset after shutdown of a guest.
>> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> 
>> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> after a guest reboot or after assigning the devices to another guest.

> I don't follow what you're saying.  The lack of a device reset for PCI
> devices with no FLR method isn't a regression as this has never worked.
>  Can you explain in more detail what the regression is and which patch
> caused it?

I haven't bisected it to a specific patch in this series,
but this patch series (when pulled on top of 3.16) cause the following:

- Do a system start and HVM guest start
- HVM guest with pci passthrough, devices work fine
- shutdown the HVM guest
- "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
  appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
- Starting the HVM guest again with the same devices passed through.
- Devices malfunction (for example a USB host controller will fail a simple 
  "lsusb"
- And this all works fine on vanilla 3.16.  

>> Apart from that .. i can't resist to remind the other issue with removing pci
>> devices passed through to HVM guests related to the signaling via xenstore,
>> described in:
>> 
>> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html

> I don't remember seeing you posting a patch...?

> David



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05  9:31       ` [Xen-devel] " David Vrabel
@ 2014-08-05  9:44         ` Sander Eikelenboom
  2014-08-05  9:44         ` [Xen-devel] " Sander Eikelenboom
  1 sibling, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-05  9:44 UTC (permalink / raw)
  To: David Vrabel; +Cc: gregkh, boris.ostrovsky, xen-devel, linux-kernel


Tuesday, August 5, 2014, 11:31:08 AM, you wrote:

> On 05/08/14 09:44, Sander Eikelenboom wrote:
>> 
>> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> 
>>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>>>> Greg: goto GHK
>>>>>
>>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>>>>
>>>> Applied to devel/for-linus-3.17.
>> 
>>> Thank you.
>>>>
>>>> I dropped the stable Cc for #2 pending a final decision on whether it
>>>> really is a stable candidate.
>> 
>>> OK.
>>>>
>>>> David
>> 
>> Hi Konrad / David,
>> 
>> This series still lacks a resolution on the sysfs /do_flr /reset,
>> as a result the pci devices are not reset after shutdown of a guest.
>> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> 
>> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> after a guest reboot or after assigning the devices to another guest.

> I don't follow what you're saying.  The lack of a device reset for PCI
> devices with no FLR method isn't a regression as this has never worked.
>  Can you explain in more detail what the regression is and which patch
> caused it?

I haven't bisected it to a specific patch in this series,
but this patch series (when pulled on top of 3.16) cause the following:

- Do a system start and HVM guest start
- HVM guest with pci passthrough, devices work fine
- shutdown the HVM guest
- "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
  appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
- Starting the HVM guest again with the same devices passed through.
- Devices malfunction (for example a USB host controller will fail a simple 
  "lsusb"
- And this all works fine on vanilla 3.16.  

>> Apart from that .. i can't resist to remind the other issue with removing pci
>> devices passed through to HVM guests related to the signaling via xenstore,
>> described in:
>> 
>> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html

> I don't remember seeing you posting a patch...?

> David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05  9:44         ` [Xen-devel] " Sander Eikelenboom
  2014-08-05 13:49           ` Konrad Rzeszutek Wilk
@ 2014-08-05 13:49           ` Konrad Rzeszutek Wilk
  2014-08-05 14:04             ` Sander Eikelenboom
                               ` (3 more replies)
  1 sibling, 4 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-05 13:49 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: David Vrabel, gregkh, boris.ostrovsky, xen-devel, linux-kernel

On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
> 
> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
> 
> > On 05/08/14 09:44, Sander Eikelenboom wrote:
> >> 
> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> >> 
> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> >>>>> Greg: goto GHK
> >>>>>
> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
> >>>>
> >>>> Applied to devel/for-linus-3.17.
> >> 
> >>> Thank you.
> >>>>
> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
> >>>> really is a stable candidate.
> >> 
> >>> OK.
> >>>>
> >>>> David
> >> 
> >> Hi Konrad / David,
> >> 
> >> This series still lacks a resolution on the sysfs /do_flr /reset,
> >> as a result the pci devices are not reset after shutdown of a guest.
> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> >> 
> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> >> after a guest reboot or after assigning the devices to another guest.
> 
> > I don't follow what you're saying.  The lack of a device reset for PCI
> > devices with no FLR method isn't a regression as this has never worked.
> >  Can you explain in more detail what the regression is and which patch
> > caused it?
> 
> I haven't bisected it to a specific patch in this series,
> but this patch series (when pulled on top of 3.16) cause the following:
> 
> - Do a system start and HVM guest start
> - HVM guest with pci passthrough, devices work fine
> - shutdown the HVM guest
> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
> - Starting the HVM guest again with the same devices passed through.
> - Devices malfunction (for example a USB host controller will fail a simple 
>   "lsusb"
> - And this all works fine on vanilla 3.16.  

Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
"xen/pciback: Don't deadlock when unbinding."
but it does not change any of that code path. Only figures out whether
to take a lock or not.

I will try it out on my box and see if I can reproduce it.

And just to be 100% sure - you are using vanilla Xen? No changes on top
of it?

Thanks!
> 
> >> Apart from that .. i can't resist to remind the other issue with removing pci
> >> devices passed through to HVM guests related to the signaling via xenstore,
> >> described in:
> >> 
> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
> 
> > I don't remember seeing you posting a patch...?

I was going to, but I think we need to figure out the 'do_flr' mechanism
first.
> 
> > David
> 
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05  9:44         ` [Xen-devel] " Sander Eikelenboom
@ 2014-08-05 13:49           ` Konrad Rzeszutek Wilk
  2014-08-05 13:49           ` [Xen-devel] " Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-05 13:49 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel

On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
> 
> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
> 
> > On 05/08/14 09:44, Sander Eikelenboom wrote:
> >> 
> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> >> 
> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> >>>>> Greg: goto GHK
> >>>>>
> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
> >>>>
> >>>> Applied to devel/for-linus-3.17.
> >> 
> >>> Thank you.
> >>>>
> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
> >>>> really is a stable candidate.
> >> 
> >>> OK.
> >>>>
> >>>> David
> >> 
> >> Hi Konrad / David,
> >> 
> >> This series still lacks a resolution on the sysfs /do_flr /reset,
> >> as a result the pci devices are not reset after shutdown of a guest.
> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> >> 
> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> >> after a guest reboot or after assigning the devices to another guest.
> 
> > I don't follow what you're saying.  The lack of a device reset for PCI
> > devices with no FLR method isn't a regression as this has never worked.
> >  Can you explain in more detail what the regression is and which patch
> > caused it?
> 
> I haven't bisected it to a specific patch in this series,
> but this patch series (when pulled on top of 3.16) cause the following:
> 
> - Do a system start and HVM guest start
> - HVM guest with pci passthrough, devices work fine
> - shutdown the HVM guest
> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
> - Starting the HVM guest again with the same devices passed through.
> - Devices malfunction (for example a USB host controller will fail a simple 
>   "lsusb"
> - And this all works fine on vanilla 3.16.  

Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
"xen/pciback: Don't deadlock when unbinding."
but it does not change any of that code path. Only figures out whether
to take a lock or not.

I will try it out on my box and see if I can reproduce it.

And just to be 100% sure - you are using vanilla Xen? No changes on top
of it?

Thanks!
> 
> >> Apart from that .. i can't resist to remind the other issue with removing pci
> >> devices passed through to HVM guests related to the signaling via xenstore,
> >> described in:
> >> 
> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
> 
> > I don't remember seeing you posting a patch...?

I was going to, but I think we need to figure out the 'do_flr' mechanism
first.
> 
> > David
> 
> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05 13:49           ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2014-08-05 14:04             ` Sander Eikelenboom
  2014-08-06 18:59               ` Sander Eikelenboom
  2014-08-06 18:59               ` Sander Eikelenboom
  2014-08-05 14:04             ` Sander Eikelenboom
                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-05 14:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: David Vrabel, gregkh, boris.ostrovsky, xen-devel, linux-kernel


Tuesday, August 5, 2014, 3:49:30 PM, you wrote:

> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> 
>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> 
>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >> 
>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >> 
>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >>>>> Greg: goto GHK
>> >>>>>
>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >>>>
>> >>>> Applied to devel/for-linus-3.17.
>> >> 
>> >>> Thank you.
>> >>>>
>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >>>> really is a stable candidate.
>> >> 
>> >>> OK.
>> >>>>
>> >>>> David
>> >> 
>> >> Hi Konrad / David,
>> >> 
>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >> 
>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >> after a guest reboot or after assigning the devices to another guest.
>> 
>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> > devices with no FLR method isn't a regression as this has never worked.
>> >  Can you explain in more detail what the regression is and which patch
>> > caused it?
>> 
>> I haven't bisected it to a specific patch in this series,
>> but this patch series (when pulled on top of 3.16) cause the following:
>> 
>> - Do a system start and HVM guest start
>> - HVM guest with pci passthrough, devices work fine
>> - shutdown the HVM guest
>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> - Starting the HVM guest again with the same devices passed through.
>> - Devices malfunction (for example a USB host controller will fail a simple 
>>   "lsusb"
>> - And this all works fine on vanilla 3.16.  

> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> "xen/pciback: Don't deadlock when unbinding."
> but it does not change any of that code path. Only figures out whether
> to take a lock or not.

Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
assumed there could be a connection)

> I will try it out on my box and see if I can reproduce it.

> And just to be 100% sure - you are using vanilla Xen? No changes on top
> of it?

Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
If you can't reproduce i will see if i can dive deeper into it tonight !

> Thanks!

>> 
>> >> Apart from that .. i can't resist to remind the other issue with removing pci
>> >> devices passed through to HVM guests related to the signaling via xenstore,
>> >> described in:
>> >> 
>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
>> 
>> > I don't remember seeing you posting a patch...?

> I was going to, but I think we need to figure out the 'do_flr' mechanism
> first.

>> 
>> > David
>> 
>> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05 13:49           ` [Xen-devel] " Konrad Rzeszutek Wilk
  2014-08-05 14:04             ` Sander Eikelenboom
@ 2014-08-05 14:04             ` Sander Eikelenboom
  2014-08-05 14:33             ` Sander Eikelenboom
  2014-08-05 14:33             ` [Xen-devel] " Sander Eikelenboom
  3 siblings, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-05 14:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel


Tuesday, August 5, 2014, 3:49:30 PM, you wrote:

> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> 
>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> 
>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >> 
>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >> 
>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >>>>> Greg: goto GHK
>> >>>>>
>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >>>>
>> >>>> Applied to devel/for-linus-3.17.
>> >> 
>> >>> Thank you.
>> >>>>
>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >>>> really is a stable candidate.
>> >> 
>> >>> OK.
>> >>>>
>> >>>> David
>> >> 
>> >> Hi Konrad / David,
>> >> 
>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >> 
>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >> after a guest reboot or after assigning the devices to another guest.
>> 
>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> > devices with no FLR method isn't a regression as this has never worked.
>> >  Can you explain in more detail what the regression is and which patch
>> > caused it?
>> 
>> I haven't bisected it to a specific patch in this series,
>> but this patch series (when pulled on top of 3.16) cause the following:
>> 
>> - Do a system start and HVM guest start
>> - HVM guest with pci passthrough, devices work fine
>> - shutdown the HVM guest
>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> - Starting the HVM guest again with the same devices passed through.
>> - Devices malfunction (for example a USB host controller will fail a simple 
>>   "lsusb"
>> - And this all works fine on vanilla 3.16.  

> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> "xen/pciback: Don't deadlock when unbinding."
> but it does not change any of that code path. Only figures out whether
> to take a lock or not.

Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
assumed there could be a connection)

> I will try it out on my box and see if I can reproduce it.

> And just to be 100% sure - you are using vanilla Xen? No changes on top
> of it?

Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
If you can't reproduce i will see if i can dive deeper into it tonight !

> Thanks!

>> 
>> >> Apart from that .. i can't resist to remind the other issue with removing pci
>> >> devices passed through to HVM guests related to the signaling via xenstore,
>> >> described in:
>> >> 
>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
>> 
>> > I don't remember seeing you posting a patch...?

> I was going to, but I think we need to figure out the 'do_flr' mechanism
> first.

>> 
>> > David
>> 
>> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05 13:49           ` [Xen-devel] " Konrad Rzeszutek Wilk
                               ` (2 preceding siblings ...)
  2014-08-05 14:33             ` Sander Eikelenboom
@ 2014-08-05 14:33             ` Sander Eikelenboom
  3 siblings, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-05 14:33 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: David Vrabel, gregkh, boris.ostrovsky, xen-devel, linux-kernel


Tuesday, August 5, 2014, 3:49:30 PM, you wrote:

> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> 
>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> 
>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >> 
>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >> 
>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >>>>> Greg: goto GHK
>> >>>>>
>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >>>>
>> >>>> Applied to devel/for-linus-3.17.
>> >> 
>> >>> Thank you.
>> >>>>
>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >>>> really is a stable candidate.
>> >> 
>> >>> OK.
>> >>>>
>> >>>> David
>> >> 
>> >> Hi Konrad / David,
>> >> 
>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >> 
>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >> after a guest reboot or after assigning the devices to another guest.
>> 
>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> > devices with no FLR method isn't a regression as this has never worked.
>> >  Can you explain in more detail what the regression is and which patch
>> > caused it?
>> 
>> I haven't bisected it to a specific patch in this series,
>> but this patch series (when pulled on top of 3.16) cause the following:
>> 
>> - Do a system start and HVM guest start
>> - HVM guest with pci passthrough, devices work fine
>> - shutdown the HVM guest
>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> - Starting the HVM guest again with the same devices passed through.
>> - Devices malfunction (for example a USB host controller will fail a simple 
>>   "lsusb"
>> - And this all works fine on vanilla 3.16.  

> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> "xen/pciback: Don't deadlock when unbinding."
> but it does not change any of that code path. Only figures out whether
> to take a lock or not.

> I will try it out on my box and see if I can reproduce it.

> And just to be 100% sure - you are using vanilla Xen? No changes on top
> of it?

BTW could it have anything to do with the do_flr patch that went into Xen:
(http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=ab78724fc5628318b172b4344f7280621a151e1b)

And isn't into linux yet .. and somehow the old code not having a problem with 
that but the new (after the patch series with the linux do_flr patch) has ?


> Thanks!
>> 
>> >> Apart from that .. i can't resist to remind the other issue with removing pci
>> >> devices passed through to HVM guests related to the signaling via xenstore,
>> >> described in:
>> >> 
>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
>> 
>> > I don't remember seeing you posting a patch...?

> I was going to, but I think we need to figure out the 'do_flr' mechanism
> first.
>> 
>> > David
>> 
>> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05 13:49           ` [Xen-devel] " Konrad Rzeszutek Wilk
  2014-08-05 14:04             ` Sander Eikelenboom
  2014-08-05 14:04             ` Sander Eikelenboom
@ 2014-08-05 14:33             ` Sander Eikelenboom
  2014-08-05 14:33             ` [Xen-devel] " Sander Eikelenboom
  3 siblings, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-05 14:33 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel


Tuesday, August 5, 2014, 3:49:30 PM, you wrote:

> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> 
>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> 
>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >> 
>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >> 
>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >>>>> Greg: goto GHK
>> >>>>>
>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >>>>
>> >>>> Applied to devel/for-linus-3.17.
>> >> 
>> >>> Thank you.
>> >>>>
>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >>>> really is a stable candidate.
>> >> 
>> >>> OK.
>> >>>>
>> >>>> David
>> >> 
>> >> Hi Konrad / David,
>> >> 
>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >> 
>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >> after a guest reboot or after assigning the devices to another guest.
>> 
>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> > devices with no FLR method isn't a regression as this has never worked.
>> >  Can you explain in more detail what the regression is and which patch
>> > caused it?
>> 
>> I haven't bisected it to a specific patch in this series,
>> but this patch series (when pulled on top of 3.16) cause the following:
>> 
>> - Do a system start and HVM guest start
>> - HVM guest with pci passthrough, devices work fine
>> - shutdown the HVM guest
>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> - Starting the HVM guest again with the same devices passed through.
>> - Devices malfunction (for example a USB host controller will fail a simple 
>>   "lsusb"
>> - And this all works fine on vanilla 3.16.  

> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> "xen/pciback: Don't deadlock when unbinding."
> but it does not change any of that code path. Only figures out whether
> to take a lock or not.

> I will try it out on my box and see if I can reproduce it.

> And just to be 100% sure - you are using vanilla Xen? No changes on top
> of it?

BTW could it have anything to do with the do_flr patch that went into Xen:
(http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=ab78724fc5628318b172b4344f7280621a151e1b)

And isn't into linux yet .. and somehow the old code not having a problem with 
that but the new (after the patch series with the linux do_flr patch) has ?


> Thanks!
>> 
>> >> Apart from that .. i can't resist to remind the other issue with removing pci
>> >> devices passed through to HVM guests related to the signaling via xenstore,
>> >> described in:
>> >> 
>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
>> 
>> > I don't remember seeing you posting a patch...?

> I was going to, but I think we need to figure out the 'do_flr' mechanism
> first.
>> 
>> > David
>> 
>> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05 14:04             ` Sander Eikelenboom
@ 2014-08-06 18:59               ` Sander Eikelenboom
  2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
  2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
  2014-08-06 18:59               ` Sander Eikelenboom
  1 sibling, 2 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 18:59 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel


Tuesday, August 5, 2014, 4:04:43 PM, you wrote:


> Tuesday, August 5, 2014, 3:49:30 PM, you wrote:

>> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>>> 
>>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>>> 
>>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>>> >> 
>>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>>> >> 
>>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>> >>>>> Greg: goto GHK
>>> >>>>>
>>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>>> >>>>
>>> >>>> Applied to devel/for-linus-3.17.
>>> >> 
>>> >>> Thank you.
>>> >>>>
>>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>>> >>>> really is a stable candidate.
>>> >> 
>>> >>> OK.
>>> >>>>
>>> >>>> David
>>> >> 
>>> >> Hi Konrad / David,
>>> >> 
>>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>>> >> as a result the pci devices are not reset after shutdown of a guest.
>>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>>> >> 
>>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>>> >> after a guest reboot or after assigning the devices to another guest.
>>> 
>>> > I don't follow what you're saying.  The lack of a device reset for PCI
>>> > devices with no FLR method isn't a regression as this has never worked.
>>> >  Can you explain in more detail what the regression is and which patch
>>> > caused it?
>>> 
>>> I haven't bisected it to a specific patch in this series,
>>> but this patch series (when pulled on top of 3.16) cause the following:
>>> 
>>> - Do a system start and HVM guest start
>>> - HVM guest with pci passthrough, devices work fine
>>> - shutdown the HVM guest
>>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>>> - Starting the HVM guest again with the same devices passed through.
>>> - Devices malfunction (for example a USB host controller will fail a simple 
>>>   "lsusb"
>>> - And this all works fine on vanilla 3.16.  

>> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>> "xen/pciback: Don't deadlock when unbinding."
>> but it does not change any of that code path. Only figures out whether
>> to take a lock or not.

> Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
> assumed there could be a connection)

>> I will try it out on my box and see if I can reproduce it.

>> And just to be 100% sure - you are using vanilla Xen? No changes on top
>> of it?

> Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
> If you can't reproduce i will see if i can dive deeper into it tonight !

Hi Konrad,

It looks like the issues is this part of the change:

    --- a/drivers/xen/xen-pciback/pci_stub.c
    +++ b/drivers/xen/xen-pciback/pci_stub.c
    @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
    * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
    *
    * As such we have to be careful.
    + *
    + * To make this easier, the caller has to hold the device lock.
    */
    void pcistub_put_pci_dev(struct pci_dev *dev)
    {
    @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
    /* Cleanup our device
    * (so it's ready for the next domain)
    */
    -
    - /* This is OK - we are running from workqueue context
    - * and want to inhibit the user from fiddling with 'reset'
    - */
    - pci_reset_function(dev);
    + lockdep_assert_held(&dev->dev.mutex);
    + __pci_reset_function_locked(dev);
    pci_restore_state(dev);
   /* This disables the device. */

More specifically:
The old "pci_reset_function(dev)" potentially seems to do much more than 
__pci_reset_function_locked(dev).


"__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
while "pci_reset_function" not only calls pci_dev_reset, but on succes
it also calls: "pci_dev_save_and_disable" which does a save state etc.


So i added a little more debug:

device_lock_assert(&dev->dev);
ret = __pci_reset_function_locked(dev);
dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
pci_restore_state(dev);

And this returns:
[  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0

So that confirms there is no saved_state to get restored by 
pci_restore_state(dev) in the next line.

However there seems to be no "locked" variant of the function 
"pci_reset_function" in pci.c that has all the same logic ...

--
Sander 

>> Thanks!

>>> 
>>> >> Apart from that .. i can't resist to remind the other issue with removing pci
>>> >> devices passed through to HVM guests related to the signaling via xenstore,
>>> >> described in:
>>> >> 
>>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
>>> 
>>> > I don't remember seeing you posting a patch...?

>> I was going to, but I think we need to figure out the 'do_flr' mechanism
>> first.

>>> 
>>> > David
>>> 
>>> 






^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-05 14:04             ` Sander Eikelenboom
  2014-08-06 18:59               ` Sander Eikelenboom
@ 2014-08-06 18:59               ` Sander Eikelenboom
  1 sibling, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 18:59 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel


Tuesday, August 5, 2014, 4:04:43 PM, you wrote:


> Tuesday, August 5, 2014, 3:49:30 PM, you wrote:

>> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>>> 
>>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>>> 
>>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>>> >> 
>>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>>> >> 
>>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>> >>>>> Greg: goto GHK
>>> >>>>>
>>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>>> >>>>
>>> >>>> Applied to devel/for-linus-3.17.
>>> >> 
>>> >>> Thank you.
>>> >>>>
>>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>>> >>>> really is a stable candidate.
>>> >> 
>>> >>> OK.
>>> >>>>
>>> >>>> David
>>> >> 
>>> >> Hi Konrad / David,
>>> >> 
>>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>>> >> as a result the pci devices are not reset after shutdown of a guest.
>>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>>> >> 
>>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>>> >> after a guest reboot or after assigning the devices to another guest.
>>> 
>>> > I don't follow what you're saying.  The lack of a device reset for PCI
>>> > devices with no FLR method isn't a regression as this has never worked.
>>> >  Can you explain in more detail what the regression is and which patch
>>> > caused it?
>>> 
>>> I haven't bisected it to a specific patch in this series,
>>> but this patch series (when pulled on top of 3.16) cause the following:
>>> 
>>> - Do a system start and HVM guest start
>>> - HVM guest with pci passthrough, devices work fine
>>> - shutdown the HVM guest
>>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>>> - Starting the HVM guest again with the same devices passed through.
>>> - Devices malfunction (for example a USB host controller will fail a simple 
>>>   "lsusb"
>>> - And this all works fine on vanilla 3.16.  

>> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>> "xen/pciback: Don't deadlock when unbinding."
>> but it does not change any of that code path. Only figures out whether
>> to take a lock or not.

> Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
> assumed there could be a connection)

>> I will try it out on my box and see if I can reproduce it.

>> And just to be 100% sure - you are using vanilla Xen? No changes on top
>> of it?

> Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
> If you can't reproduce i will see if i can dive deeper into it tonight !

Hi Konrad,

It looks like the issues is this part of the change:

    --- a/drivers/xen/xen-pciback/pci_stub.c
    +++ b/drivers/xen/xen-pciback/pci_stub.c
    @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
    * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
    *
    * As such we have to be careful.
    + *
    + * To make this easier, the caller has to hold the device lock.
    */
    void pcistub_put_pci_dev(struct pci_dev *dev)
    {
    @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
    /* Cleanup our device
    * (so it's ready for the next domain)
    */
    -
    - /* This is OK - we are running from workqueue context
    - * and want to inhibit the user from fiddling with 'reset'
    - */
    - pci_reset_function(dev);
    + lockdep_assert_held(&dev->dev.mutex);
    + __pci_reset_function_locked(dev);
    pci_restore_state(dev);
   /* This disables the device. */

More specifically:
The old "pci_reset_function(dev)" potentially seems to do much more than 
__pci_reset_function_locked(dev).


"__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
while "pci_reset_function" not only calls pci_dev_reset, but on succes
it also calls: "pci_dev_save_and_disable" which does a save state etc.


So i added a little more debug:

device_lock_assert(&dev->dev);
ret = __pci_reset_function_locked(dev);
dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
pci_restore_state(dev);

And this returns:
[  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0

So that confirms there is no saved_state to get restored by 
pci_restore_state(dev) in the next line.

However there seems to be no "locked" variant of the function 
"pci_reset_function" in pci.c that has all the same logic ...

--
Sander 

>> Thanks!

>>> 
>>> >> Apart from that .. i can't resist to remind the other issue with removing pci
>>> >> devices passed through to HVM guests related to the signaling via xenstore,
>>> >> described in:
>>> >> 
>>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
>>> 
>>> > I don't remember seeing you posting a patch...?

>> I was going to, but I think we need to figure out the 'do_flr' mechanism
>> first.

>>> 
>>> > David
>>> 
>>> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 18:59               ` Sander Eikelenboom
@ 2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
  2014-08-06 19:25                   ` Sander Eikelenboom
  2014-08-06 19:25                   ` [Xen-devel] " Sander Eikelenboom
  2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 19:18 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel

On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
> 
> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
> 
> 
> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
> 
> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
> >>> 
> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
> >>> 
> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
> >>> >> 
> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> >>> >> 
> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> >>> >>>>> Greg: goto GHK
> >>> >>>>>
> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
> >>> >>>>
> >>> >>>> Applied to devel/for-linus-3.17.
> >>> >> 
> >>> >>> Thank you.
> >>> >>>>
> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
> >>> >>>> really is a stable candidate.
> >>> >> 
> >>> >>> OK.
> >>> >>>>
> >>> >>>> David
> >>> >> 
> >>> >> Hi Konrad / David,
> >>> >> 
> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
> >>> >> as a result the pci devices are not reset after shutdown of a guest.
> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> >>> >> 
> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> >>> >> after a guest reboot or after assigning the devices to another guest.
> >>> 
> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
> >>> > devices with no FLR method isn't a regression as this has never worked.
> >>> >  Can you explain in more detail what the regression is and which patch
> >>> > caused it?
> >>> 
> >>> I haven't bisected it to a specific patch in this series,
> >>> but this patch series (when pulled on top of 3.16) cause the following:
> >>> 
> >>> - Do a system start and HVM guest start
> >>> - HVM guest with pci passthrough, devices work fine
> >>> - shutdown the HVM guest
> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
> >>> - Starting the HVM guest again with the same devices passed through.
> >>> - Devices malfunction (for example a USB host controller will fail a simple 
> >>>   "lsusb"
> >>> - And this all works fine on vanilla 3.16.  
> 
> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> >> "xen/pciback: Don't deadlock when unbinding."
> >> but it does not change any of that code path. Only figures out whether
> >> to take a lock or not.
> 
> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
> > assumed there could be a connection)
> 
> >> I will try it out on my box and see if I can reproduce it.
> 
> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
> >> of it?
> 
> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
> > If you can't reproduce i will see if i can dive deeper into it tonight !
> 
> Hi Konrad,
> 
> It looks like the issues is this part of the change:
> 
>     --- a/drivers/xen/xen-pciback/pci_stub.c
>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>     *
>     * As such we have to be careful.
>     + *
>     + * To make this easier, the caller has to hold the device lock.
>     */
>     void pcistub_put_pci_dev(struct pci_dev *dev)
>     {
>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>     /* Cleanup our device
>     * (so it's ready for the next domain)
>     */
>     -
>     - /* This is OK - we are running from workqueue context
>     - * and want to inhibit the user from fiddling with 'reset'
>     - */
>     - pci_reset_function(dev);
>     + lockdep_assert_held(&dev->dev.mutex);
>     + __pci_reset_function_locked(dev);
>     pci_restore_state(dev);
>    /* This disables the device. */
> 
> More specifically:
> The old "pci_reset_function(dev)" potentially seems to do much more than 
> __pci_reset_function_locked(dev).
> 
> 
> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
> while "pci_reset_function" not only calls pci_dev_reset, but on succes
> it also calls: "pci_dev_save_and_disable" which does a save state etc.
> 
> 
> So i added a little more debug:
> 
> device_lock_assert(&dev->dev);
> ret = __pci_reset_function_locked(dev);
> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
> pci_restore_state(dev);
> 
> And this returns:
> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
> 
> So that confirms there is no saved_state to get restored by 
> pci_restore_state(dev) in the next line.
> 
> However there seems to be no "locked" variant of the function 
> "pci_reset_function" in pci.c that has all the same logic ...

Yup. I've a preliminary patch:


diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 1ddd22f..4cb7901 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
 	 */
 	__pci_reset_function_locked(dev);
 	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
-		dev_dbg(&dev->dev, "Could not reload PCI state\n");
+		dev_info(&dev->dev, "Could not reload PCI state\n");
 	else
 		pci_restore_state(dev);
 
@@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev, *found_psdev = NULL;
 	unsigned long flags;
+	struct xen_pcibk_dev_data *dev_data;
 
 	spin_lock_irqsave(&pcistub_devices_lock, flags);
 
@@ -278,10 +279,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
-	device_lock_assert(&dev->dev);
-	__pci_reset_function_locked(dev);
-	pci_restore_state(dev);
-
+	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
+		dev_info(&dev->dev, "Could not reload PCI state\n");
+	else {
+		device_lock_assert(&dev->dev);
+		__pci_reset_function_locked(dev);
+		/*
+		 * The usual sequence is pci_save_state & pci_restore_state
+		 * but the guest might have messed the config space up. Use
+		 * the initial configuration (when device was binded to us).
+		 */
+		pci_restore_state(dev);
+		/*
+		 * The next steps are to reload the configuration for the
+		 * next time we need to unbind/bind to a guest..
+		 */
+		dev_data = pci_get_drvdata(dev);
+		pci_save_state(dev);
+		dev_data->pci_saved_state = pci_store_saved_state(dev);
+	}
 	/* This disables the device. */
 	xen_pcibk_reset_device(dev);
 
> 
> --
> Sander 
> 
> >> Thanks!
> 
> >>> 
> >>> >> Apart from that .. i can't resist to remind the other issue with removing pci
> >>> >> devices passed through to HVM guests related to the signaling via xenstore,
> >>> >> described in:
> >>> >> 
> >>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
> >>> 
> >>> > I don't remember seeing you posting a patch...?
> 
> >> I was going to, but I think we need to figure out the 'do_flr' mechanism
> >> first.
> 
> >>> 
> >>> > David
> >>> 
> >>> 
> 
> 
> 
> 
> 

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 18:59               ` Sander Eikelenboom
  2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
@ 2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 19:18 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel

On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
> 
> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
> 
> 
> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
> 
> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
> >>> 
> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
> >>> 
> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
> >>> >> 
> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> >>> >> 
> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> >>> >>>>> Greg: goto GHK
> >>> >>>>>
> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
> >>> >>>>
> >>> >>>> Applied to devel/for-linus-3.17.
> >>> >> 
> >>> >>> Thank you.
> >>> >>>>
> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
> >>> >>>> really is a stable candidate.
> >>> >> 
> >>> >>> OK.
> >>> >>>>
> >>> >>>> David
> >>> >> 
> >>> >> Hi Konrad / David,
> >>> >> 
> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
> >>> >> as a result the pci devices are not reset after shutdown of a guest.
> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> >>> >> 
> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> >>> >> after a guest reboot or after assigning the devices to another guest.
> >>> 
> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
> >>> > devices with no FLR method isn't a regression as this has never worked.
> >>> >  Can you explain in more detail what the regression is and which patch
> >>> > caused it?
> >>> 
> >>> I haven't bisected it to a specific patch in this series,
> >>> but this patch series (when pulled on top of 3.16) cause the following:
> >>> 
> >>> - Do a system start and HVM guest start
> >>> - HVM guest with pci passthrough, devices work fine
> >>> - shutdown the HVM guest
> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
> >>> - Starting the HVM guest again with the same devices passed through.
> >>> - Devices malfunction (for example a USB host controller will fail a simple 
> >>>   "lsusb"
> >>> - And this all works fine on vanilla 3.16.  
> 
> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> >> "xen/pciback: Don't deadlock when unbinding."
> >> but it does not change any of that code path. Only figures out whether
> >> to take a lock or not.
> 
> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
> > assumed there could be a connection)
> 
> >> I will try it out on my box and see if I can reproduce it.
> 
> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
> >> of it?
> 
> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
> > If you can't reproduce i will see if i can dive deeper into it tonight !
> 
> Hi Konrad,
> 
> It looks like the issues is this part of the change:
> 
>     --- a/drivers/xen/xen-pciback/pci_stub.c
>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>     *
>     * As such we have to be careful.
>     + *
>     + * To make this easier, the caller has to hold the device lock.
>     */
>     void pcistub_put_pci_dev(struct pci_dev *dev)
>     {
>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>     /* Cleanup our device
>     * (so it's ready for the next domain)
>     */
>     -
>     - /* This is OK - we are running from workqueue context
>     - * and want to inhibit the user from fiddling with 'reset'
>     - */
>     - pci_reset_function(dev);
>     + lockdep_assert_held(&dev->dev.mutex);
>     + __pci_reset_function_locked(dev);
>     pci_restore_state(dev);
>    /* This disables the device. */
> 
> More specifically:
> The old "pci_reset_function(dev)" potentially seems to do much more than 
> __pci_reset_function_locked(dev).
> 
> 
> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
> while "pci_reset_function" not only calls pci_dev_reset, but on succes
> it also calls: "pci_dev_save_and_disable" which does a save state etc.
> 
> 
> So i added a little more debug:
> 
> device_lock_assert(&dev->dev);
> ret = __pci_reset_function_locked(dev);
> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
> pci_restore_state(dev);
> 
> And this returns:
> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
> 
> So that confirms there is no saved_state to get restored by 
> pci_restore_state(dev) in the next line.
> 
> However there seems to be no "locked" variant of the function 
> "pci_reset_function" in pci.c that has all the same logic ...

Yup. I've a preliminary patch:


diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 1ddd22f..4cb7901 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
 	 */
 	__pci_reset_function_locked(dev);
 	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
-		dev_dbg(&dev->dev, "Could not reload PCI state\n");
+		dev_info(&dev->dev, "Could not reload PCI state\n");
 	else
 		pci_restore_state(dev);
 
@@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev, *found_psdev = NULL;
 	unsigned long flags;
+	struct xen_pcibk_dev_data *dev_data;
 
 	spin_lock_irqsave(&pcistub_devices_lock, flags);
 
@@ -278,10 +279,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
-	device_lock_assert(&dev->dev);
-	__pci_reset_function_locked(dev);
-	pci_restore_state(dev);
-
+	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
+		dev_info(&dev->dev, "Could not reload PCI state\n");
+	else {
+		device_lock_assert(&dev->dev);
+		__pci_reset_function_locked(dev);
+		/*
+		 * The usual sequence is pci_save_state & pci_restore_state
+		 * but the guest might have messed the config space up. Use
+		 * the initial configuration (when device was binded to us).
+		 */
+		pci_restore_state(dev);
+		/*
+		 * The next steps are to reload the configuration for the
+		 * next time we need to unbind/bind to a guest..
+		 */
+		dev_data = pci_get_drvdata(dev);
+		pci_save_state(dev);
+		dev_data->pci_saved_state = pci_store_saved_state(dev);
+	}
 	/* This disables the device. */
 	xen_pcibk_reset_device(dev);
 
> 
> --
> Sander 
> 
> >> Thanks!
> 
> >>> 
> >>> >> Apart from that .. i can't resist to remind the other issue with removing pci
> >>> >> devices passed through to HVM guests related to the signaling via xenstore,
> >>> >> described in:
> >>> >> 
> >>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
> >>> 
> >>> > I don't remember seeing you posting a patch...?
> 
> >> I was going to, but I think we need to figure out the 'do_flr' mechanism
> >> first.
> 
> >>> 
> >>> > David
> >>> 
> >>> 
> 
> 
> 
> 
> 

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
  2014-08-06 19:25                   ` Sander Eikelenboom
@ 2014-08-06 19:25                   ` Sander Eikelenboom
  2014-08-06 19:39                     ` Konrad Rzeszutek Wilk
  2014-08-06 19:39                     ` [Xen-devel] " Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 19:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel


Wednesday, August 6, 2014, 9:18:31 PM, you wrote:

> On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
>> 
>> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
>> 
>> 
>> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
>> 
>> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> >>> 
>> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> >>> 
>> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >>> >> 
>> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >>> >> 
>> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >>> >>>>> Greg: goto GHK
>> >>> >>>>>
>> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >>> >>>>
>> >>> >>>> Applied to devel/for-linus-3.17.
>> >>> >> 
>> >>> >>> Thank you.
>> >>> >>>>
>> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >>> >>>> really is a stable candidate.
>> >>> >> 
>> >>> >>> OK.
>> >>> >>>>
>> >>> >>>> David
>> >>> >> 
>> >>> >> Hi Konrad / David,
>> >>> >> 
>> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >>> >> 
>> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >>> >> after a guest reboot or after assigning the devices to another guest.
>> >>> 
>> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> >>> > devices with no FLR method isn't a regression as this has never worked.
>> >>> >  Can you explain in more detail what the regression is and which patch
>> >>> > caused it?
>> >>> 
>> >>> I haven't bisected it to a specific patch in this series,
>> >>> but this patch series (when pulled on top of 3.16) cause the following:
>> >>> 
>> >>> - Do a system start and HVM guest start
>> >>> - HVM guest with pci passthrough, devices work fine
>> >>> - shutdown the HVM guest
>> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> >>> - Starting the HVM guest again with the same devices passed through.
>> >>> - Devices malfunction (for example a USB host controller will fail a simple 
>> >>>   "lsusb"
>> >>> - And this all works fine on vanilla 3.16.  
>> 
>> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>> >> "xen/pciback: Don't deadlock when unbinding."
>> >> but it does not change any of that code path. Only figures out whether
>> >> to take a lock or not.
>> 
>> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
>> > assumed there could be a connection)
>> 
>> >> I will try it out on my box and see if I can reproduce it.
>> 
>> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
>> >> of it?
>> 
>> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
>> > If you can't reproduce i will see if i can dive deeper into it tonight !
>> 
>> Hi Konrad,
>> 
>> It looks like the issues is this part of the change:
>> 
>>     --- a/drivers/xen/xen-pciback/pci_stub.c
>>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>>     *
>>     * As such we have to be careful.
>>     + *
>>     + * To make this easier, the caller has to hold the device lock.
>>     */
>>     void pcistub_put_pci_dev(struct pci_dev *dev)
>>     {
>>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>     /* Cleanup our device
>>     * (so it's ready for the next domain)
>>     */
>>     -
>>     - /* This is OK - we are running from workqueue context
>>     - * and want to inhibit the user from fiddling with 'reset'
>>     - */
>>     - pci_reset_function(dev);
>>     + lockdep_assert_held(&dev->dev.mutex);
>>     + __pci_reset_function_locked(dev);
>>     pci_restore_state(dev);
>>    /* This disables the device. */
>> 
>> More specifically:
>> The old "pci_reset_function(dev)" potentially seems to do much more than 
>> __pci_reset_function_locked(dev).
>> 
>> 
>> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
>> while "pci_reset_function" not only calls pci_dev_reset, but on succes
>> it also calls: "pci_dev_save_and_disable" which does a save state etc.
>> 
>> 
>> So i added a little more debug:
>> 
>> device_lock_assert(&dev->dev);
>> ret = __pci_reset_function_locked(dev);
>> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
>> pci_restore_state(dev);
>> 
>> And this returns:
>> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
>> 
>> So that confirms there is no saved_state to get restored by 
>> pci_restore_state(dev) in the next line.
>> 
>> However there seems to be no "locked" variant of the function 
>> "pci_reset_function" in pci.c that has all the same logic ...

> Yup. I've a preliminary patch:

Preliminary in the sense: "this should fix it .. needs more testing" ?

> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index 1ddd22f..4cb7901 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>          */
>         __pci_reset_function_locked(dev);
>         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
> +               dev_info(&dev->dev, "Could not reload PCI state\n");
>         else
>                 pci_restore_state(dev);
>  
> @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  {
>         struct pcistub_device *psdev, *found_psdev = NULL;
>         unsigned long flags;
> +       struct xen_pcibk_dev_data *dev_data;
>  
>         spin_lock_irqsave(&pcistub_devices_lock, flags);
>  
> @@ -278,10 +279,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>         /* Cleanup our device
>          * (so it's ready for the next domain)
>          */
> -       device_lock_assert(&dev->dev);
> -       __pci_reset_function_locked(dev);
> -       pci_restore_state(dev);
> -
> +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> +               dev_info(&dev->dev, "Could not reload PCI state\n");
> +       else {
> +               device_lock_assert(&dev->dev);
> +               __pci_reset_function_locked(dev);
> +               /*
> +                * The usual sequence is pci_save_state & pci_restore_state
> +                * but the guest might have messed the config space up. Use
> +                * the initial configuration (when device was binded to us).
> +                */
> +               pci_restore_state(dev);
> +               /*
> +                * The next steps are to reload the configuration for the
> +                * next time we need to unbind/bind to a guest..
> +                */
> +               dev_data = pci_get_drvdata(dev);
> +               pci_save_state(dev);
> +               dev_data->pci_saved_state = pci_store_saved_state(dev);
> +       }
>         /* This disables the device. */
>         xen_pcibk_reset_device(dev);
>  
>> 
>> --
>> Sander 
>> 
>> >> Thanks!
>> 
>> >>> 
>> >>> >> Apart from that .. i can't resist to remind the other issue with removing pci
>> >>> >> devices passed through to HVM guests related to the signaling via xenstore,
>> >>> >> described in:
>> >>> >> 
>> >>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
>> >>> 
>> >>> > I don't remember seeing you posting a patch...?
>> 
>> >> I was going to, but I think we need to figure out the 'do_flr' mechanism
>> >> first.
>> 
>> >>> 
>> >>> > David
>> >>> 
>> >>> 
>> 
>> 
>> 
>> 
>> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
@ 2014-08-06 19:25                   ` Sander Eikelenboom
  2014-08-06 19:25                   ` [Xen-devel] " Sander Eikelenboom
  1 sibling, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 19:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel


Wednesday, August 6, 2014, 9:18:31 PM, you wrote:

> On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
>> 
>> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
>> 
>> 
>> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
>> 
>> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> >>> 
>> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> >>> 
>> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >>> >> 
>> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >>> >> 
>> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >>> >>>>> Greg: goto GHK
>> >>> >>>>>
>> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >>> >>>>
>> >>> >>>> Applied to devel/for-linus-3.17.
>> >>> >> 
>> >>> >>> Thank you.
>> >>> >>>>
>> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >>> >>>> really is a stable candidate.
>> >>> >> 
>> >>> >>> OK.
>> >>> >>>>
>> >>> >>>> David
>> >>> >> 
>> >>> >> Hi Konrad / David,
>> >>> >> 
>> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >>> >> 
>> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >>> >> after a guest reboot or after assigning the devices to another guest.
>> >>> 
>> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> >>> > devices with no FLR method isn't a regression as this has never worked.
>> >>> >  Can you explain in more detail what the regression is and which patch
>> >>> > caused it?
>> >>> 
>> >>> I haven't bisected it to a specific patch in this series,
>> >>> but this patch series (when pulled on top of 3.16) cause the following:
>> >>> 
>> >>> - Do a system start and HVM guest start
>> >>> - HVM guest with pci passthrough, devices work fine
>> >>> - shutdown the HVM guest
>> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> >>> - Starting the HVM guest again with the same devices passed through.
>> >>> - Devices malfunction (for example a USB host controller will fail a simple 
>> >>>   "lsusb"
>> >>> - And this all works fine on vanilla 3.16.  
>> 
>> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>> >> "xen/pciback: Don't deadlock when unbinding."
>> >> but it does not change any of that code path. Only figures out whether
>> >> to take a lock or not.
>> 
>> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
>> > assumed there could be a connection)
>> 
>> >> I will try it out on my box and see if I can reproduce it.
>> 
>> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
>> >> of it?
>> 
>> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
>> > If you can't reproduce i will see if i can dive deeper into it tonight !
>> 
>> Hi Konrad,
>> 
>> It looks like the issues is this part of the change:
>> 
>>     --- a/drivers/xen/xen-pciback/pci_stub.c
>>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>>     *
>>     * As such we have to be careful.
>>     + *
>>     + * To make this easier, the caller has to hold the device lock.
>>     */
>>     void pcistub_put_pci_dev(struct pci_dev *dev)
>>     {
>>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>     /* Cleanup our device
>>     * (so it's ready for the next domain)
>>     */
>>     -
>>     - /* This is OK - we are running from workqueue context
>>     - * and want to inhibit the user from fiddling with 'reset'
>>     - */
>>     - pci_reset_function(dev);
>>     + lockdep_assert_held(&dev->dev.mutex);
>>     + __pci_reset_function_locked(dev);
>>     pci_restore_state(dev);
>>    /* This disables the device. */
>> 
>> More specifically:
>> The old "pci_reset_function(dev)" potentially seems to do much more than 
>> __pci_reset_function_locked(dev).
>> 
>> 
>> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
>> while "pci_reset_function" not only calls pci_dev_reset, but on succes
>> it also calls: "pci_dev_save_and_disable" which does a save state etc.
>> 
>> 
>> So i added a little more debug:
>> 
>> device_lock_assert(&dev->dev);
>> ret = __pci_reset_function_locked(dev);
>> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
>> pci_restore_state(dev);
>> 
>> And this returns:
>> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
>> 
>> So that confirms there is no saved_state to get restored by 
>> pci_restore_state(dev) in the next line.
>> 
>> However there seems to be no "locked" variant of the function 
>> "pci_reset_function" in pci.c that has all the same logic ...

> Yup. I've a preliminary patch:

Preliminary in the sense: "this should fix it .. needs more testing" ?

> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index 1ddd22f..4cb7901 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>          */
>         __pci_reset_function_locked(dev);
>         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
> +               dev_info(&dev->dev, "Could not reload PCI state\n");
>         else
>                 pci_restore_state(dev);
>  
> @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  {
>         struct pcistub_device *psdev, *found_psdev = NULL;
>         unsigned long flags;
> +       struct xen_pcibk_dev_data *dev_data;
>  
>         spin_lock_irqsave(&pcistub_devices_lock, flags);
>  
> @@ -278,10 +279,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>         /* Cleanup our device
>          * (so it's ready for the next domain)
>          */
> -       device_lock_assert(&dev->dev);
> -       __pci_reset_function_locked(dev);
> -       pci_restore_state(dev);
> -
> +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> +               dev_info(&dev->dev, "Could not reload PCI state\n");
> +       else {
> +               device_lock_assert(&dev->dev);
> +               __pci_reset_function_locked(dev);
> +               /*
> +                * The usual sequence is pci_save_state & pci_restore_state
> +                * but the guest might have messed the config space up. Use
> +                * the initial configuration (when device was binded to us).
> +                */
> +               pci_restore_state(dev);
> +               /*
> +                * The next steps are to reload the configuration for the
> +                * next time we need to unbind/bind to a guest..
> +                */
> +               dev_data = pci_get_drvdata(dev);
> +               pci_save_state(dev);
> +               dev_data->pci_saved_state = pci_store_saved_state(dev);
> +       }
>         /* This disables the device. */
>         xen_pcibk_reset_device(dev);
>  
>> 
>> --
>> Sander 
>> 
>> >> Thanks!
>> 
>> >>> 
>> >>> >> Apart from that .. i can't resist to remind the other issue with removing pci
>> >>> >> devices passed through to HVM guests related to the signaling via xenstore,
>> >>> >> described in:
>> >>> >> 
>> >>> >> http://lists.xen.org/archives/html/xen-devel/2014-07/msg01875.html
>> >>> 
>> >>> > I don't remember seeing you posting a patch...?
>> 
>> >> I was going to, but I think we need to figure out the 'do_flr' mechanism
>> >> first.
>> 
>> >>> 
>> >>> > David
>> >>> 
>> >>> 
>> 
>> 
>> 
>> 
>> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:25                   ` [Xen-devel] " Sander Eikelenboom
  2014-08-06 19:39                     ` Konrad Rzeszutek Wilk
@ 2014-08-06 19:39                     ` Konrad Rzeszutek Wilk
  2014-08-06 19:47                       ` Sander Eikelenboom
                                         ` (3 more replies)
  1 sibling, 4 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 19:39 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel

On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
> 
> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
> 
> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
> >> 
> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
> >> 
> >> 
> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
> >> 
> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
> >> >>> 
> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
> >> >>> 
> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
> >> >>> >> 
> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> >> >>> >> 
> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> >> >>> >>>>> Greg: goto GHK
> >> >>> >>>>>
> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
> >> >>> >>>>
> >> >>> >>>> Applied to devel/for-linus-3.17.
> >> >>> >> 
> >> >>> >>> Thank you.
> >> >>> >>>>
> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
> >> >>> >>>> really is a stable candidate.
> >> >>> >> 
> >> >>> >>> OK.
> >> >>> >>>>
> >> >>> >>>> David
> >> >>> >> 
> >> >>> >> Hi Konrad / David,
> >> >>> >> 
> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> >> >>> >> 
> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> >> >>> >> after a guest reboot or after assigning the devices to another guest.
> >> >>> 
> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
> >> >>> > devices with no FLR method isn't a regression as this has never worked.
> >> >>> >  Can you explain in more detail what the regression is and which patch
> >> >>> > caused it?
> >> >>> 
> >> >>> I haven't bisected it to a specific patch in this series,
> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
> >> >>> 
> >> >>> - Do a system start and HVM guest start
> >> >>> - HVM guest with pci passthrough, devices work fine
> >> >>> - shutdown the HVM guest
> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
> >> >>> - Starting the HVM guest again with the same devices passed through.
> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
> >> >>>   "lsusb"
> >> >>> - And this all works fine on vanilla 3.16.  
> >> 
> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> >> >> "xen/pciback: Don't deadlock when unbinding."
> >> >> but it does not change any of that code path. Only figures out whether
> >> >> to take a lock or not.
> >> 
> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
> >> > assumed there could be a connection)
> >> 
> >> >> I will try it out on my box and see if I can reproduce it.
> >> 
> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
> >> >> of it?
> >> 
> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
> >> 
> >> Hi Konrad,
> >> 
> >> It looks like the issues is this part of the change:
> >> 
> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
> >>     *
> >>     * As such we have to be careful.
> >>     + *
> >>     + * To make this easier, the caller has to hold the device lock.
> >>     */
> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
> >>     {
> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >>     /* Cleanup our device
> >>     * (so it's ready for the next domain)
> >>     */
> >>     -
> >>     - /* This is OK - we are running from workqueue context
> >>     - * and want to inhibit the user from fiddling with 'reset'
> >>     - */
> >>     - pci_reset_function(dev);
> >>     + lockdep_assert_held(&dev->dev.mutex);
> >>     + __pci_reset_function_locked(dev);
> >>     pci_restore_state(dev);
> >>    /* This disables the device. */
> >> 
> >> More specifically:
> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
> >> __pci_reset_function_locked(dev).
> >> 
> >> 
> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
> >> 
> >> 
> >> So i added a little more debug:
> >> 
> >> device_lock_assert(&dev->dev);
> >> ret = __pci_reset_function_locked(dev);
> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
> >> pci_restore_state(dev);
> >> 
> >> And this returns:
> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
> >> 
> >> So that confirms there is no saved_state to get restored by 
> >> pci_restore_state(dev) in the next line.
> >> 
> >> However there seems to be no "locked" variant of the function 
> >> "pci_reset_function" in pci.c that has all the same logic ...
> 
> > Yup. I've a preliminary patch:
> 
> Preliminary in the sense: "this should fix it .. needs more testing" ?

This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:


>From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Wed, 6 Aug 2014 16:21:32 -0400
Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
 a guest.

The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
"xen/pciback: Don't deadlock when unbinding." was using
the version of pci_reset_function which would lock the device lock.
That is no good as we can dead-lock. As such we swapped to using
the lock-less version and requiring that the callers
of 'pcistub_put_pci_dev' take the device lock. And as such
this bug got exposed.

Using the lock-less version is  OK, except that we tried to
use 'pci_restore_state' after the lock-less version of
__pci_reset_function_locked - which won't work as 'state_saved'
is set to false. Said 'state_saved' is a toggle boolean that
is to be used by the sequence of a) pci_save_state/pci_restore_state
or b) pci_load_and_free_saved_state/pci_restore_state. We don't
want to use a) as the guest might have messed up the PCI
configuration space and we want it to revert to the state
when the PCI device was binded to us. Therefore we pick
b) to restore the configuration space.

To still retain the PCI configuration space, we save it once
more and store it on our private copy to be restored when:
 - Device is unbinded from pciback
 - Device is detached from a guest.

Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
 1 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 1ddd22f..8cf7f2b 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
 	 */
 	__pci_reset_function_locked(dev);
 	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
-		dev_dbg(&dev->dev, "Could not reload PCI state\n");
+		dev_info(&dev->dev, "Could not reload PCI state\n");
 	else
 		pci_restore_state(dev);
 
@@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev, *found_psdev = NULL;
 	unsigned long flags;
+	struct xen_pcibk_dev_data *dev_data;
 
 	spin_lock_irqsave(&pcistub_devices_lock, flags);
 
@@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	 * (so it's ready for the next domain)
 	 */
 	device_lock_assert(&dev->dev);
-	__pci_reset_function_locked(dev);
-	pci_restore_state(dev);
-
+	dev_data = pci_get_drvdata(dev);
+	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
+		dev_info(&dev->dev, "Could not reload PCI state\n");
+	else {
+		__pci_reset_function_locked(dev);
+		/*
+		 * The usual sequence is pci_save_state & pci_restore_state
+		 * but the guest might have messed the configuration space up.
+		 * Use the initial version (when device was binded to us).
+		 */
+		pci_restore_state(dev);
+		/*
+		 * The next steps are to reload the configuration for the
+		 * next time we bind & unbind to a guest - or unload from
+		 * pciback.
+		 */
+		pci_save_state(dev);
+		dev_data->pci_saved_state = pci_store_saved_state(dev);
+	}
 	/* This disables the device. */
 	xen_pcibk_reset_device(dev);
 
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:25                   ` [Xen-devel] " Sander Eikelenboom
@ 2014-08-06 19:39                     ` Konrad Rzeszutek Wilk
  2014-08-06 19:39                     ` [Xen-devel] " Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 19:39 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel

On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
> 
> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
> 
> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
> >> 
> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
> >> 
> >> 
> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
> >> 
> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
> >> >>> 
> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
> >> >>> 
> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
> >> >>> >> 
> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> >> >>> >> 
> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> >> >>> >>>>> Greg: goto GHK
> >> >>> >>>>>
> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
> >> >>> >>>>
> >> >>> >>>> Applied to devel/for-linus-3.17.
> >> >>> >> 
> >> >>> >>> Thank you.
> >> >>> >>>>
> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
> >> >>> >>>> really is a stable candidate.
> >> >>> >> 
> >> >>> >>> OK.
> >> >>> >>>>
> >> >>> >>>> David
> >> >>> >> 
> >> >>> >> Hi Konrad / David,
> >> >>> >> 
> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> >> >>> >> 
> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> >> >>> >> after a guest reboot or after assigning the devices to another guest.
> >> >>> 
> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
> >> >>> > devices with no FLR method isn't a regression as this has never worked.
> >> >>> >  Can you explain in more detail what the regression is and which patch
> >> >>> > caused it?
> >> >>> 
> >> >>> I haven't bisected it to a specific patch in this series,
> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
> >> >>> 
> >> >>> - Do a system start and HVM guest start
> >> >>> - HVM guest with pci passthrough, devices work fine
> >> >>> - shutdown the HVM guest
> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
> >> >>> - Starting the HVM guest again with the same devices passed through.
> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
> >> >>>   "lsusb"
> >> >>> - And this all works fine on vanilla 3.16.  
> >> 
> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> >> >> "xen/pciback: Don't deadlock when unbinding."
> >> >> but it does not change any of that code path. Only figures out whether
> >> >> to take a lock or not.
> >> 
> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
> >> > assumed there could be a connection)
> >> 
> >> >> I will try it out on my box and see if I can reproduce it.
> >> 
> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
> >> >> of it?
> >> 
> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
> >> 
> >> Hi Konrad,
> >> 
> >> It looks like the issues is this part of the change:
> >> 
> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
> >>     *
> >>     * As such we have to be careful.
> >>     + *
> >>     + * To make this easier, the caller has to hold the device lock.
> >>     */
> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
> >>     {
> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >>     /* Cleanup our device
> >>     * (so it's ready for the next domain)
> >>     */
> >>     -
> >>     - /* This is OK - we are running from workqueue context
> >>     - * and want to inhibit the user from fiddling with 'reset'
> >>     - */
> >>     - pci_reset_function(dev);
> >>     + lockdep_assert_held(&dev->dev.mutex);
> >>     + __pci_reset_function_locked(dev);
> >>     pci_restore_state(dev);
> >>    /* This disables the device. */
> >> 
> >> More specifically:
> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
> >> __pci_reset_function_locked(dev).
> >> 
> >> 
> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
> >> 
> >> 
> >> So i added a little more debug:
> >> 
> >> device_lock_assert(&dev->dev);
> >> ret = __pci_reset_function_locked(dev);
> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
> >> pci_restore_state(dev);
> >> 
> >> And this returns:
> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
> >> 
> >> So that confirms there is no saved_state to get restored by 
> >> pci_restore_state(dev) in the next line.
> >> 
> >> However there seems to be no "locked" variant of the function 
> >> "pci_reset_function" in pci.c that has all the same logic ...
> 
> > Yup. I've a preliminary patch:
> 
> Preliminary in the sense: "this should fix it .. needs more testing" ?

This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:


>From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Wed, 6 Aug 2014 16:21:32 -0400
Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
 a guest.

The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
"xen/pciback: Don't deadlock when unbinding." was using
the version of pci_reset_function which would lock the device lock.
That is no good as we can dead-lock. As such we swapped to using
the lock-less version and requiring that the callers
of 'pcistub_put_pci_dev' take the device lock. And as such
this bug got exposed.

Using the lock-less version is  OK, except that we tried to
use 'pci_restore_state' after the lock-less version of
__pci_reset_function_locked - which won't work as 'state_saved'
is set to false. Said 'state_saved' is a toggle boolean that
is to be used by the sequence of a) pci_save_state/pci_restore_state
or b) pci_load_and_free_saved_state/pci_restore_state. We don't
want to use a) as the guest might have messed up the PCI
configuration space and we want it to revert to the state
when the PCI device was binded to us. Therefore we pick
b) to restore the configuration space.

To still retain the PCI configuration space, we save it once
more and store it on our private copy to be restored when:
 - Device is unbinded from pciback
 - Device is detached from a guest.

Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
 1 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 1ddd22f..8cf7f2b 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
 	 */
 	__pci_reset_function_locked(dev);
 	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
-		dev_dbg(&dev->dev, "Could not reload PCI state\n");
+		dev_info(&dev->dev, "Could not reload PCI state\n");
 	else
 		pci_restore_state(dev);
 
@@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev, *found_psdev = NULL;
 	unsigned long flags;
+	struct xen_pcibk_dev_data *dev_data;
 
 	spin_lock_irqsave(&pcistub_devices_lock, flags);
 
@@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	 * (so it's ready for the next domain)
 	 */
 	device_lock_assert(&dev->dev);
-	__pci_reset_function_locked(dev);
-	pci_restore_state(dev);
-
+	dev_data = pci_get_drvdata(dev);
+	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
+		dev_info(&dev->dev, "Could not reload PCI state\n");
+	else {
+		__pci_reset_function_locked(dev);
+		/*
+		 * The usual sequence is pci_save_state & pci_restore_state
+		 * but the guest might have messed the configuration space up.
+		 * Use the initial version (when device was binded to us).
+		 */
+		pci_restore_state(dev);
+		/*
+		 * The next steps are to reload the configuration for the
+		 * next time we bind & unbind to a guest - or unload from
+		 * pciback.
+		 */
+		pci_save_state(dev);
+		dev_data->pci_saved_state = pci_store_saved_state(dev);
+	}
 	/* This disables the device. */
 	xen_pcibk_reset_device(dev);
 
-- 
1.7.7.6

^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:39                     ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2014-08-06 19:47                       ` Sander Eikelenboom
  2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
  2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
  2014-08-06 19:47                       ` Sander Eikelenboom
                                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 19:47 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel


Wednesday, August 6, 2014, 9:39:16 PM, you wrote:

> On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
>> 
>> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
>> 
>> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
>> >> 
>> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
>> >> 
>> >> 
>> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
>> >> 
>> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> >> >>> 
>> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> >> >>> 
>> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >> >>> >> 
>> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >> >>> >> 
>> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >> >>> >>>>> Greg: goto GHK
>> >> >>> >>>>>
>> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >> >>> >>>>
>> >> >>> >>>> Applied to devel/for-linus-3.17.
>> >> >>> >> 
>> >> >>> >>> Thank you.
>> >> >>> >>>>
>> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >> >>> >>>> really is a stable candidate.
>> >> >>> >> 
>> >> >>> >>> OK.
>> >> >>> >>>>
>> >> >>> >>>> David
>> >> >>> >> 
>> >> >>> >> Hi Konrad / David,
>> >> >>> >> 
>> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >> >>> >> 
>> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >> >>> >> after a guest reboot or after assigning the devices to another guest.
>> >> >>> 
>> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> >> >>> > devices with no FLR method isn't a regression as this has never worked.
>> >> >>> >  Can you explain in more detail what the regression is and which patch
>> >> >>> > caused it?
>> >> >>> 
>> >> >>> I haven't bisected it to a specific patch in this series,
>> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
>> >> >>> 
>> >> >>> - Do a system start and HVM guest start
>> >> >>> - HVM guest with pci passthrough, devices work fine
>> >> >>> - shutdown the HVM guest
>> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> >> >>> - Starting the HVM guest again with the same devices passed through.
>> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
>> >> >>>   "lsusb"
>> >> >>> - And this all works fine on vanilla 3.16.  
>> >> 
>> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>> >> >> "xen/pciback: Don't deadlock when unbinding."
>> >> >> but it does not change any of that code path. Only figures out whether
>> >> >> to take a lock or not.
>> >> 
>> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
>> >> > assumed there could be a connection)
>> >> 
>> >> >> I will try it out on my box and see if I can reproduce it.
>> >> 
>> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
>> >> >> of it?
>> >> 
>> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
>> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
>> >> 
>> >> Hi Konrad,
>> >> 
>> >> It looks like the issues is this part of the change:
>> >> 
>> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
>> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>> >>     *
>> >>     * As such we have to be careful.
>> >>     + *
>> >>     + * To make this easier, the caller has to hold the device lock.
>> >>     */
>> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
>> >>     {
>> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >>     /* Cleanup our device
>> >>     * (so it's ready for the next domain)
>> >>     */
>> >>     -
>> >>     - /* This is OK - we are running from workqueue context
>> >>     - * and want to inhibit the user from fiddling with 'reset'
>> >>     - */
>> >>     - pci_reset_function(dev);
>> >>     + lockdep_assert_held(&dev->dev.mutex);
>> >>     + __pci_reset_function_locked(dev);
>> >>     pci_restore_state(dev);
>> >>    /* This disables the device. */
>> >> 
>> >> More specifically:
>> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
>> >> __pci_reset_function_locked(dev).
>> >> 
>> >> 
>> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
>> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
>> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
>> >> 
>> >> 
>> >> So i added a little more debug:
>> >> 
>> >> device_lock_assert(&dev->dev);
>> >> ret = __pci_reset_function_locked(dev);
>> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
>> >> pci_restore_state(dev);
>> >> 
>> >> And this returns:
>> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
>> >> 
>> >> So that confirms there is no saved_state to get restored by 
>> >> pci_restore_state(dev) in the next line.
>> >> 
>> >> However there seems to be no "locked" variant of the function 
>> >> "pci_reset_function" in pci.c that has all the same logic ...
>> 
>> > Yup. I've a preliminary patch:
>> 
>> Preliminary in the sense: "this should fix it .. needs more testing" ?

> This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:


> From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Wed, 6 Aug 2014 16:21:32 -0400
> Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>  a guest.

> The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
> "xen/pciback: Don't deadlock when unbinding." was using
> the version of pci_reset_function which would lock the device lock.
> That is no good as we can dead-lock. As such we swapped to using
> the lock-less version and requiring that the callers
> of 'pcistub_put_pci_dev' take the device lock. And as such
> this bug got exposed.

> Using the lock-less version is  OK, except that we tried to
> use 'pci_restore_state' after the lock-less version of
> __pci_reset_function_locked - which won't work as 'state_saved'
> is set to false. Said 'state_saved' is a toggle boolean that
> is to be used by the sequence of a) pci_save_state/pci_restore_state
> or b) pci_load_and_free_saved_state/pci_restore_state. We don't
> want to use a) as the guest might have messed up the PCI
> configuration space and we want it to revert to the state
> when the PCI device was binded to us. Therefore we pick
> b) to restore the configuration space.

> To still retain the PCI configuration space, we save it once
> more and store it on our private copy to be restored when:
>  - Device is unbinded from pciback
>  - Device is detached from a guest.

> Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
>  1 files changed, 21 insertions(+), 4 deletions(-)

> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index 1ddd22f..8cf7f2b 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>          */
>         __pci_reset_function_locked(dev);
>         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
> +               dev_info(&dev->dev, "Could not reload PCI state\n");
>         else
>                 pci_restore_state(dev);
>  
> @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  {
>         struct pcistub_device *psdev, *found_psdev = NULL;
>         unsigned long flags;
> +       struct xen_pcibk_dev_data *dev_data;
>  
>         spin_lock_irqsave(&pcistub_devices_lock, flags);
>  
> @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>          * (so it's ready for the next domain)
>          */
>         device_lock_assert(&dev->dev);
> -       __pci_reset_function_locked(dev);
> -       pci_restore_state(dev);
> -
> +       dev_data = pci_get_drvdata(dev);
> +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> +               dev_info(&dev->dev, "Could not reload PCI state\n");
> +       else {
> +               __pci_reset_function_locked(dev);
> +               /*
> +                * The usual sequence is pci_save_state & pci_restore_state
> +                * but the guest might have messed the configuration space up.
> +                * Use the initial version (when device was binded to us).
> +                */
> +               pci_restore_state(dev);
> +               /*
> +                * The next steps are to reload the configuration for the
> +                * next time we bind & unbind to a guest - or unload from
> +                * pciback.
> +                */
> +               pci_save_state(dev);
> +               dev_data->pci_saved_state = pci_store_saved_state(dev);
> +       }
>         /* This disables the device. */
>         xen_pcibk_reset_device(dev);
>  


Is it save to have "__pci_reset_function_locked(dev)" to be conditional on succes of 
"pci_load_and_free_saved_state" ?

Or is it safer because you don't reset the device although it's in an unknown 
state (and resetting it while it's back to dom0 could lead to more problems  ?)


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:39                     ` [Xen-devel] " Konrad Rzeszutek Wilk
  2014-08-06 19:47                       ` Sander Eikelenboom
@ 2014-08-06 19:47                       ` Sander Eikelenboom
  2014-08-07  9:04                       ` [Xen-devel] " David Vrabel
  2014-08-07  9:04                       ` David Vrabel
  3 siblings, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 19:47 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel


Wednesday, August 6, 2014, 9:39:16 PM, you wrote:

> On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
>> 
>> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
>> 
>> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
>> >> 
>> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
>> >> 
>> >> 
>> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
>> >> 
>> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> >> >>> 
>> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> >> >>> 
>> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >> >>> >> 
>> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >> >>> >> 
>> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >> >>> >>>>> Greg: goto GHK
>> >> >>> >>>>>
>> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >> >>> >>>>
>> >> >>> >>>> Applied to devel/for-linus-3.17.
>> >> >>> >> 
>> >> >>> >>> Thank you.
>> >> >>> >>>>
>> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >> >>> >>>> really is a stable candidate.
>> >> >>> >> 
>> >> >>> >>> OK.
>> >> >>> >>>>
>> >> >>> >>>> David
>> >> >>> >> 
>> >> >>> >> Hi Konrad / David,
>> >> >>> >> 
>> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >> >>> >> 
>> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >> >>> >> after a guest reboot or after assigning the devices to another guest.
>> >> >>> 
>> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> >> >>> > devices with no FLR method isn't a regression as this has never worked.
>> >> >>> >  Can you explain in more detail what the regression is and which patch
>> >> >>> > caused it?
>> >> >>> 
>> >> >>> I haven't bisected it to a specific patch in this series,
>> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
>> >> >>> 
>> >> >>> - Do a system start and HVM guest start
>> >> >>> - HVM guest with pci passthrough, devices work fine
>> >> >>> - shutdown the HVM guest
>> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> >> >>> - Starting the HVM guest again with the same devices passed through.
>> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
>> >> >>>   "lsusb"
>> >> >>> - And this all works fine on vanilla 3.16.  
>> >> 
>> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>> >> >> "xen/pciback: Don't deadlock when unbinding."
>> >> >> but it does not change any of that code path. Only figures out whether
>> >> >> to take a lock or not.
>> >> 
>> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
>> >> > assumed there could be a connection)
>> >> 
>> >> >> I will try it out on my box and see if I can reproduce it.
>> >> 
>> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
>> >> >> of it?
>> >> 
>> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
>> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
>> >> 
>> >> Hi Konrad,
>> >> 
>> >> It looks like the issues is this part of the change:
>> >> 
>> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
>> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>> >>     *
>> >>     * As such we have to be careful.
>> >>     + *
>> >>     + * To make this easier, the caller has to hold the device lock.
>> >>     */
>> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
>> >>     {
>> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >>     /* Cleanup our device
>> >>     * (so it's ready for the next domain)
>> >>     */
>> >>     -
>> >>     - /* This is OK - we are running from workqueue context
>> >>     - * and want to inhibit the user from fiddling with 'reset'
>> >>     - */
>> >>     - pci_reset_function(dev);
>> >>     + lockdep_assert_held(&dev->dev.mutex);
>> >>     + __pci_reset_function_locked(dev);
>> >>     pci_restore_state(dev);
>> >>    /* This disables the device. */
>> >> 
>> >> More specifically:
>> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
>> >> __pci_reset_function_locked(dev).
>> >> 
>> >> 
>> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
>> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
>> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
>> >> 
>> >> 
>> >> So i added a little more debug:
>> >> 
>> >> device_lock_assert(&dev->dev);
>> >> ret = __pci_reset_function_locked(dev);
>> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
>> >> pci_restore_state(dev);
>> >> 
>> >> And this returns:
>> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
>> >> 
>> >> So that confirms there is no saved_state to get restored by 
>> >> pci_restore_state(dev) in the next line.
>> >> 
>> >> However there seems to be no "locked" variant of the function 
>> >> "pci_reset_function" in pci.c that has all the same logic ...
>> 
>> > Yup. I've a preliminary patch:
>> 
>> Preliminary in the sense: "this should fix it .. needs more testing" ?

> This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:


> From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Wed, 6 Aug 2014 16:21:32 -0400
> Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>  a guest.

> The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
> "xen/pciback: Don't deadlock when unbinding." was using
> the version of pci_reset_function which would lock the device lock.
> That is no good as we can dead-lock. As such we swapped to using
> the lock-less version and requiring that the callers
> of 'pcistub_put_pci_dev' take the device lock. And as such
> this bug got exposed.

> Using the lock-less version is  OK, except that we tried to
> use 'pci_restore_state' after the lock-less version of
> __pci_reset_function_locked - which won't work as 'state_saved'
> is set to false. Said 'state_saved' is a toggle boolean that
> is to be used by the sequence of a) pci_save_state/pci_restore_state
> or b) pci_load_and_free_saved_state/pci_restore_state. We don't
> want to use a) as the guest might have messed up the PCI
> configuration space and we want it to revert to the state
> when the PCI device was binded to us. Therefore we pick
> b) to restore the configuration space.

> To still retain the PCI configuration space, we save it once
> more and store it on our private copy to be restored when:
>  - Device is unbinded from pciback
>  - Device is detached from a guest.

> Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
>  1 files changed, 21 insertions(+), 4 deletions(-)

> diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> index 1ddd22f..8cf7f2b 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>          */
>         __pci_reset_function_locked(dev);
>         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
> +               dev_info(&dev->dev, "Could not reload PCI state\n");
>         else
>                 pci_restore_state(dev);
>  
> @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  {
>         struct pcistub_device *psdev, *found_psdev = NULL;
>         unsigned long flags;
> +       struct xen_pcibk_dev_data *dev_data;
>  
>         spin_lock_irqsave(&pcistub_devices_lock, flags);
>  
> @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>          * (so it's ready for the next domain)
>          */
>         device_lock_assert(&dev->dev);
> -       __pci_reset_function_locked(dev);
> -       pci_restore_state(dev);
> -
> +       dev_data = pci_get_drvdata(dev);
> +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> +               dev_info(&dev->dev, "Could not reload PCI state\n");
> +       else {
> +               __pci_reset_function_locked(dev);
> +               /*
> +                * The usual sequence is pci_save_state & pci_restore_state
> +                * but the guest might have messed the configuration space up.
> +                * Use the initial version (when device was binded to us).
> +                */
> +               pci_restore_state(dev);
> +               /*
> +                * The next steps are to reload the configuration for the
> +                * next time we bind & unbind to a guest - or unload from
> +                * pciback.
> +                */
> +               pci_save_state(dev);
> +               dev_data->pci_saved_state = pci_store_saved_state(dev);
> +       }
>         /* This disables the device. */
>         xen_pcibk_reset_device(dev);
>  


Is it save to have "__pci_reset_function_locked(dev)" to be conditional on succes of 
"pci_load_and_free_saved_state" ?

Or is it safer because you don't reset the device although it's in an unknown 
state (and resetting it while it's back to dom0 could lead to more problems  ?)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:47                       ` Sander Eikelenboom
@ 2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
  2014-08-06 20:17                           ` Sander Eikelenboom
  2014-08-06 20:17                           ` [Xen-devel] " Sander Eikelenboom
  2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 20:09 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel

On Wed, Aug 06, 2014 at 09:47:43PM +0200, Sander Eikelenboom wrote:
> 
> Wednesday, August 6, 2014, 9:39:16 PM, you wrote:
> 
> > On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
> >> 
> >> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
> >> 
> >> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
> >> >> 
> >> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
> >> >> 
> >> >> 
> >> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
> >> >> 
> >> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
> >> >> >>> 
> >> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
> >> >> >>> 
> >> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
> >> >> >>> >> 
> >> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> >> >> >>> >> 
> >> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> >> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> >> >> >>> >>>>> Greg: goto GHK
> >> >> >>> >>>>>
> >> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
> >> >> >>> >>>>
> >> >> >>> >>>> Applied to devel/for-linus-3.17.
> >> >> >>> >> 
> >> >> >>> >>> Thank you.
> >> >> >>> >>>>
> >> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
> >> >> >>> >>>> really is a stable candidate.
> >> >> >>> >> 
> >> >> >>> >>> OK.
> >> >> >>> >>>>
> >> >> >>> >>>> David
> >> >> >>> >> 
> >> >> >>> >> Hi Konrad / David,
> >> >> >>> >> 
> >> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
> >> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
> >> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> >> >> >>> >> 
> >> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> >> >> >>> >> after a guest reboot or after assigning the devices to another guest.
> >> >> >>> 
> >> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
> >> >> >>> > devices with no FLR method isn't a regression as this has never worked.
> >> >> >>> >  Can you explain in more detail what the regression is and which patch
> >> >> >>> > caused it?
> >> >> >>> 
> >> >> >>> I haven't bisected it to a specific patch in this series,
> >> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
> >> >> >>> 
> >> >> >>> - Do a system start and HVM guest start
> >> >> >>> - HVM guest with pci passthrough, devices work fine
> >> >> >>> - shutdown the HVM guest
> >> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
> >> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
> >> >> >>> - Starting the HVM guest again with the same devices passed through.
> >> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
> >> >> >>>   "lsusb"
> >> >> >>> - And this all works fine on vanilla 3.16.  
> >> >> 
> >> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> >> >> >> "xen/pciback: Don't deadlock when unbinding."
> >> >> >> but it does not change any of that code path. Only figures out whether
> >> >> >> to take a lock or not.
> >> >> 
> >> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
> >> >> > assumed there could be a connection)
> >> >> 
> >> >> >> I will try it out on my box and see if I can reproduce it.
> >> >> 
> >> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
> >> >> >> of it?
> >> >> 
> >> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
> >> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
> >> >> 
> >> >> Hi Konrad,
> >> >> 
> >> >> It looks like the issues is this part of the change:
> >> >> 
> >> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
> >> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
> >> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
> >> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
> >> >>     *
> >> >>     * As such we have to be careful.
> >> >>     + *
> >> >>     + * To make this easier, the caller has to hold the device lock.
> >> >>     */
> >> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
> >> >>     {
> >> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >> >>     /* Cleanup our device
> >> >>     * (so it's ready for the next domain)
> >> >>     */
> >> >>     -
> >> >>     - /* This is OK - we are running from workqueue context
> >> >>     - * and want to inhibit the user from fiddling with 'reset'
> >> >>     - */
> >> >>     - pci_reset_function(dev);
> >> >>     + lockdep_assert_held(&dev->dev.mutex);
> >> >>     + __pci_reset_function_locked(dev);
> >> >>     pci_restore_state(dev);
> >> >>    /* This disables the device. */
> >> >> 
> >> >> More specifically:
> >> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
> >> >> __pci_reset_function_locked(dev).
> >> >> 
> >> >> 
> >> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
> >> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
> >> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
> >> >> 
> >> >> 
> >> >> So i added a little more debug:
> >> >> 
> >> >> device_lock_assert(&dev->dev);
> >> >> ret = __pci_reset_function_locked(dev);
> >> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
> >> >> pci_restore_state(dev);
> >> >> 
> >> >> And this returns:
> >> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
> >> >> 
> >> >> So that confirms there is no saved_state to get restored by 
> >> >> pci_restore_state(dev) in the next line.
> >> >> 
> >> >> However there seems to be no "locked" variant of the function 
> >> >> "pci_reset_function" in pci.c that has all the same logic ...
> >> 
> >> > Yup. I've a preliminary patch:
> >> 
> >> Preliminary in the sense: "this should fix it .. needs more testing" ?
> 
> > This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:
> 
> 
> > From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
> > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Date: Wed, 6 Aug 2014 16:21:32 -0400
> > Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
> >  a guest.
> 
> > The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
> > "xen/pciback: Don't deadlock when unbinding." was using
> > the version of pci_reset_function which would lock the device lock.
> > That is no good as we can dead-lock. As such we swapped to using
> > the lock-less version and requiring that the callers
> > of 'pcistub_put_pci_dev' take the device lock. And as such
> > this bug got exposed.
> 
> > Using the lock-less version is  OK, except that we tried to
> > use 'pci_restore_state' after the lock-less version of
> > __pci_reset_function_locked - which won't work as 'state_saved'
> > is set to false. Said 'state_saved' is a toggle boolean that
> > is to be used by the sequence of a) pci_save_state/pci_restore_state
> > or b) pci_load_and_free_saved_state/pci_restore_state. We don't
> > want to use a) as the guest might have messed up the PCI
> > configuration space and we want it to revert to the state
> > when the PCI device was binded to us. Therefore we pick
> > b) to restore the configuration space.
> 
> > To still retain the PCI configuration space, we save it once
> > more and store it on our private copy to be restored when:
> >  - Device is unbinded from pciback
> >  - Device is detached from a guest.
> 
> > Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > ---
> >  drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
> >  1 files changed, 21 insertions(+), 4 deletions(-)
> 
> > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> > index 1ddd22f..8cf7f2b 100644
> > --- a/drivers/xen/xen-pciback/pci_stub.c
> > +++ b/drivers/xen/xen-pciback/pci_stub.c
> > @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
> >          */
> >         __pci_reset_function_locked(dev);
> >         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> > -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
> >         else
> >                 pci_restore_state(dev);
> >  
> > @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >  {
> >         struct pcistub_device *psdev, *found_psdev = NULL;
> >         unsigned long flags;
> > +       struct xen_pcibk_dev_data *dev_data;
> >  
> >         spin_lock_irqsave(&pcistub_devices_lock, flags);
> >  
> > @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >          * (so it's ready for the next domain)
> >          */
> >         device_lock_assert(&dev->dev);
> > -       __pci_reset_function_locked(dev);
> > -       pci_restore_state(dev);
> > -
> > +       dev_data = pci_get_drvdata(dev);
> > +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
> > +       else {
> > +               __pci_reset_function_locked(dev);
> > +               /*
> > +                * The usual sequence is pci_save_state & pci_restore_state
> > +                * but the guest might have messed the configuration space up.
> > +                * Use the initial version (when device was binded to us).
> > +                */
> > +               pci_restore_state(dev);
> > +               /*
> > +                * The next steps are to reload the configuration for the
> > +                * next time we bind & unbind to a guest - or unload from
> > +                * pciback.
> > +                */
> > +               pci_save_state(dev);
> > +               dev_data->pci_saved_state = pci_store_saved_state(dev);
> > +       }
> >         /* This disables the device. */
> >         xen_pcibk_reset_device(dev);
> >  
> 
> 
> Is it save to have "__pci_reset_function_locked(dev)" to be conditional on succes of 
> "pci_load_and_free_saved_state" ?

It could be redone a bit differently - as in:

 rc = pci_load_and_free_saved_state(..);
 __pci_reset_function_locked(dev);
 if (!rc) {
	pci_restore_state(dev);
	...

In which case we will only do the restore state (and save state) when the device
is in expected state. And the reset happens at that point.

> 
> Or is it safer because you don't reset the device although it's in an unknown 
> state (and resetting it while it's back to dom0 could lead to more problems  ?)

It could very well lead to disaster. I am not exactly sure what the ramifications
are with a device for which we cannot save PCI configuration space - aka - extremely
borked.

> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:47                       ` Sander Eikelenboom
  2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
@ 2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-08-06 20:09 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel

On Wed, Aug 06, 2014 at 09:47:43PM +0200, Sander Eikelenboom wrote:
> 
> Wednesday, August 6, 2014, 9:39:16 PM, you wrote:
> 
> > On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
> >> 
> >> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
> >> 
> >> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
> >> >> 
> >> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
> >> >> 
> >> >> 
> >> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
> >> >> 
> >> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
> >> >> >>> 
> >> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
> >> >> >>> 
> >> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
> >> >> >>> >> 
> >> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
> >> >> >>> >> 
> >> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
> >> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
> >> >> >>> >>>>> Greg: goto GHK
> >> >> >>> >>>>>
> >> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
> >> >> >>> >>>>
> >> >> >>> >>>> Applied to devel/for-linus-3.17.
> >> >> >>> >> 
> >> >> >>> >>> Thank you.
> >> >> >>> >>>>
> >> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
> >> >> >>> >>>> really is a stable candidate.
> >> >> >>> >> 
> >> >> >>> >>> OK.
> >> >> >>> >>>>
> >> >> >>> >>>> David
> >> >> >>> >> 
> >> >> >>> >> Hi Konrad / David,
> >> >> >>> >> 
> >> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
> >> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
> >> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
> >> >> >>> >> 
> >> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
> >> >> >>> >> after a guest reboot or after assigning the devices to another guest.
> >> >> >>> 
> >> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
> >> >> >>> > devices with no FLR method isn't a regression as this has never worked.
> >> >> >>> >  Can you explain in more detail what the regression is and which patch
> >> >> >>> > caused it?
> >> >> >>> 
> >> >> >>> I haven't bisected it to a specific patch in this series,
> >> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
> >> >> >>> 
> >> >> >>> - Do a system start and HVM guest start
> >> >> >>> - HVM guest with pci passthrough, devices work fine
> >> >> >>> - shutdown the HVM guest
> >> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
> >> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
> >> >> >>> - Starting the HVM guest again with the same devices passed through.
> >> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
> >> >> >>>   "lsusb"
> >> >> >>> - And this all works fine on vanilla 3.16.  
> >> >> 
> >> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
> >> >> >> "xen/pciback: Don't deadlock when unbinding."
> >> >> >> but it does not change any of that code path. Only figures out whether
> >> >> >> to take a lock or not.
> >> >> 
> >> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
> >> >> > assumed there could be a connection)
> >> >> 
> >> >> >> I will try it out on my box and see if I can reproduce it.
> >> >> 
> >> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
> >> >> >> of it?
> >> >> 
> >> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
> >> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
> >> >> 
> >> >> Hi Konrad,
> >> >> 
> >> >> It looks like the issues is this part of the change:
> >> >> 
> >> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
> >> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
> >> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
> >> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
> >> >>     *
> >> >>     * As such we have to be careful.
> >> >>     + *
> >> >>     + * To make this easier, the caller has to hold the device lock.
> >> >>     */
> >> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
> >> >>     {
> >> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >> >>     /* Cleanup our device
> >> >>     * (so it's ready for the next domain)
> >> >>     */
> >> >>     -
> >> >>     - /* This is OK - we are running from workqueue context
> >> >>     - * and want to inhibit the user from fiddling with 'reset'
> >> >>     - */
> >> >>     - pci_reset_function(dev);
> >> >>     + lockdep_assert_held(&dev->dev.mutex);
> >> >>     + __pci_reset_function_locked(dev);
> >> >>     pci_restore_state(dev);
> >> >>    /* This disables the device. */
> >> >> 
> >> >> More specifically:
> >> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
> >> >> __pci_reset_function_locked(dev).
> >> >> 
> >> >> 
> >> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
> >> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
> >> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
> >> >> 
> >> >> 
> >> >> So i added a little more debug:
> >> >> 
> >> >> device_lock_assert(&dev->dev);
> >> >> ret = __pci_reset_function_locked(dev);
> >> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
> >> >> pci_restore_state(dev);
> >> >> 
> >> >> And this returns:
> >> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
> >> >> 
> >> >> So that confirms there is no saved_state to get restored by 
> >> >> pci_restore_state(dev) in the next line.
> >> >> 
> >> >> However there seems to be no "locked" variant of the function 
> >> >> "pci_reset_function" in pci.c that has all the same logic ...
> >> 
> >> > Yup. I've a preliminary patch:
> >> 
> >> Preliminary in the sense: "this should fix it .. needs more testing" ?
> 
> > This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:
> 
> 
> > From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
> > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Date: Wed, 6 Aug 2014 16:21:32 -0400
> > Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
> >  a guest.
> 
> > The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
> > "xen/pciback: Don't deadlock when unbinding." was using
> > the version of pci_reset_function which would lock the device lock.
> > That is no good as we can dead-lock. As such we swapped to using
> > the lock-less version and requiring that the callers
> > of 'pcistub_put_pci_dev' take the device lock. And as such
> > this bug got exposed.
> 
> > Using the lock-less version is  OK, except that we tried to
> > use 'pci_restore_state' after the lock-less version of
> > __pci_reset_function_locked - which won't work as 'state_saved'
> > is set to false. Said 'state_saved' is a toggle boolean that
> > is to be used by the sequence of a) pci_save_state/pci_restore_state
> > or b) pci_load_and_free_saved_state/pci_restore_state. We don't
> > want to use a) as the guest might have messed up the PCI
> > configuration space and we want it to revert to the state
> > when the PCI device was binded to us. Therefore we pick
> > b) to restore the configuration space.
> 
> > To still retain the PCI configuration space, we save it once
> > more and store it on our private copy to be restored when:
> >  - Device is unbinded from pciback
> >  - Device is detached from a guest.
> 
> > Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > ---
> >  drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
> >  1 files changed, 21 insertions(+), 4 deletions(-)
> 
> > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
> > index 1ddd22f..8cf7f2b 100644
> > --- a/drivers/xen/xen-pciback/pci_stub.c
> > +++ b/drivers/xen/xen-pciback/pci_stub.c
> > @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
> >          */
> >         __pci_reset_function_locked(dev);
> >         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> > -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
> >         else
> >                 pci_restore_state(dev);
> >  
> > @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >  {
> >         struct pcistub_device *psdev, *found_psdev = NULL;
> >         unsigned long flags;
> > +       struct xen_pcibk_dev_data *dev_data;
> >  
> >         spin_lock_irqsave(&pcistub_devices_lock, flags);
> >  
> > @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >          * (so it's ready for the next domain)
> >          */
> >         device_lock_assert(&dev->dev);
> > -       __pci_reset_function_locked(dev);
> > -       pci_restore_state(dev);
> > -
> > +       dev_data = pci_get_drvdata(dev);
> > +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
> > +       else {
> > +               __pci_reset_function_locked(dev);
> > +               /*
> > +                * The usual sequence is pci_save_state & pci_restore_state
> > +                * but the guest might have messed the configuration space up.
> > +                * Use the initial version (when device was binded to us).
> > +                */
> > +               pci_restore_state(dev);
> > +               /*
> > +                * The next steps are to reload the configuration for the
> > +                * next time we bind & unbind to a guest - or unload from
> > +                * pciback.
> > +                */
> > +               pci_save_state(dev);
> > +               dev_data->pci_saved_state = pci_store_saved_state(dev);
> > +       }
> >         /* This disables the device. */
> >         xen_pcibk_reset_device(dev);
> >  
> 
> 
> Is it save to have "__pci_reset_function_locked(dev)" to be conditional on succes of 
> "pci_load_and_free_saved_state" ?

It could be redone a bit differently - as in:

 rc = pci_load_and_free_saved_state(..);
 __pci_reset_function_locked(dev);
 if (!rc) {
	pci_restore_state(dev);
	...

In which case we will only do the restore state (and save state) when the device
is in expected state. And the reset happens at that point.

> 
> Or is it safer because you don't reset the device although it's in an unknown 
> state (and resetting it while it's back to dom0 could lead to more problems  ?)

It could very well lead to disaster. I am not exactly sure what the ramifications
are with a device for which we cannot save PCI configuration space - aka - extremely
borked.

> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
  2014-08-06 20:17                           ` Sander Eikelenboom
@ 2014-08-06 20:17                           ` Sander Eikelenboom
  2014-08-06 22:08                             ` Sander Eikelenboom
  2014-08-06 22:08                             ` [Xen-devel] " Sander Eikelenboom
  1 sibling, 2 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 20:17 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel


Wednesday, August 6, 2014, 10:09:59 PM, you wrote:

> On Wed, Aug 06, 2014 at 09:47:43PM +0200, Sander Eikelenboom wrote:
>> 
>> Wednesday, August 6, 2014, 9:39:16 PM, you wrote:
>> 
>> > On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
>> >> 
>> >> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
>> >> 
>> >> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
>> >> >> 
>> >> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
>> >> >> 
>> >> >> 
>> >> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
>> >> >> 
>> >> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> >> >> >>> 
>> >> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> >> >> >>> 
>> >> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >> >> >>> >> 
>> >> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >> >> >>> >> 
>> >> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >> >> >>> >>>>> Greg: goto GHK
>> >> >> >>> >>>>>
>> >> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >> >> >>> >>>>
>> >> >> >>> >>>> Applied to devel/for-linus-3.17.
>> >> >> >>> >> 
>> >> >> >>> >>> Thank you.
>> >> >> >>> >>>>
>> >> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >> >> >>> >>>> really is a stable candidate.
>> >> >> >>> >> 
>> >> >> >>> >>> OK.
>> >> >> >>> >>>>
>> >> >> >>> >>>> David
>> >> >> >>> >> 
>> >> >> >>> >> Hi Konrad / David,
>> >> >> >>> >> 
>> >> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >> >> >>> >> 
>> >> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >> >> >>> >> after a guest reboot or after assigning the devices to another guest.
>> >> >> >>> 
>> >> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> >> >> >>> > devices with no FLR method isn't a regression as this has never worked.
>> >> >> >>> >  Can you explain in more detail what the regression is and which patch
>> >> >> >>> > caused it?
>> >> >> >>> 
>> >> >> >>> I haven't bisected it to a specific patch in this series,
>> >> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
>> >> >> >>> 
>> >> >> >>> - Do a system start and HVM guest start
>> >> >> >>> - HVM guest with pci passthrough, devices work fine
>> >> >> >>> - shutdown the HVM guest
>> >> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>> >> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> >> >> >>> - Starting the HVM guest again with the same devices passed through.
>> >> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
>> >> >> >>>   "lsusb"
>> >> >> >>> - And this all works fine on vanilla 3.16.  
>> >> >> 
>> >> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>> >> >> >> "xen/pciback: Don't deadlock when unbinding."
>> >> >> >> but it does not change any of that code path. Only figures out whether
>> >> >> >> to take a lock or not.
>> >> >> 
>> >> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
>> >> >> > assumed there could be a connection)
>> >> >> 
>> >> >> >> I will try it out on my box and see if I can reproduce it.
>> >> >> 
>> >> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
>> >> >> >> of it?
>> >> >> 
>> >> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
>> >> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
>> >> >> 
>> >> >> Hi Konrad,
>> >> >> 
>> >> >> It looks like the issues is this part of the change:
>> >> >> 
>> >> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
>> >> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>> >> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>> >> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>> >> >>     *
>> >> >>     * As such we have to be careful.
>> >> >>     + *
>> >> >>     + * To make this easier, the caller has to hold the device lock.
>> >> >>     */
>> >> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
>> >> >>     {
>> >> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >> >>     /* Cleanup our device
>> >> >>     * (so it's ready for the next domain)
>> >> >>     */
>> >> >>     -
>> >> >>     - /* This is OK - we are running from workqueue context
>> >> >>     - * and want to inhibit the user from fiddling with 'reset'
>> >> >>     - */
>> >> >>     - pci_reset_function(dev);
>> >> >>     + lockdep_assert_held(&dev->dev.mutex);
>> >> >>     + __pci_reset_function_locked(dev);
>> >> >>     pci_restore_state(dev);
>> >> >>    /* This disables the device. */
>> >> >> 
>> >> >> More specifically:
>> >> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
>> >> >> __pci_reset_function_locked(dev).
>> >> >> 
>> >> >> 
>> >> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
>> >> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
>> >> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
>> >> >> 
>> >> >> 
>> >> >> So i added a little more debug:
>> >> >> 
>> >> >> device_lock_assert(&dev->dev);
>> >> >> ret = __pci_reset_function_locked(dev);
>> >> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
>> >> >> pci_restore_state(dev);
>> >> >> 
>> >> >> And this returns:
>> >> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
>> >> >> 
>> >> >> So that confirms there is no saved_state to get restored by 
>> >> >> pci_restore_state(dev) in the next line.
>> >> >> 
>> >> >> However there seems to be no "locked" variant of the function 
>> >> >> "pci_reset_function" in pci.c that has all the same logic ...
>> >> 
>> >> > Yup. I've a preliminary patch:
>> >> 
>> >> Preliminary in the sense: "this should fix it .. needs more testing" ?
>> 
>> > This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:
>> 
>> 
>> > From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
>> > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> > Date: Wed, 6 Aug 2014 16:21:32 -0400
>> > Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>> >  a guest.
>> 
>> > The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
>> > "xen/pciback: Don't deadlock when unbinding." was using
>> > the version of pci_reset_function which would lock the device lock.
>> > That is no good as we can dead-lock. As such we swapped to using
>> > the lock-less version and requiring that the callers
>> > of 'pcistub_put_pci_dev' take the device lock. And as such
>> > this bug got exposed.
>> 
>> > Using the lock-less version is  OK, except that we tried to
>> > use 'pci_restore_state' after the lock-less version of
>> > __pci_reset_function_locked - which won't work as 'state_saved'
>> > is set to false. Said 'state_saved' is a toggle boolean that
>> > is to be used by the sequence of a) pci_save_state/pci_restore_state
>> > or b) pci_load_and_free_saved_state/pci_restore_state. We don't
>> > want to use a) as the guest might have messed up the PCI
>> > configuration space and we want it to revert to the state
>> > when the PCI device was binded to us. Therefore we pick
>> > b) to restore the configuration space.
>> 
>> > To still retain the PCI configuration space, we save it once
>> > more and store it on our private copy to be restored when:
>> >  - Device is unbinded from pciback
>> >  - Device is detached from a guest.
>> 
>> > Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
>> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> > ---
>> >  drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
>> >  1 files changed, 21 insertions(+), 4 deletions(-)
>> 
>> > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>> > index 1ddd22f..8cf7f2b 100644
>> > --- a/drivers/xen/xen-pciback/pci_stub.c
>> > +++ b/drivers/xen/xen-pciback/pci_stub.c
>> > @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>> >          */
>> >         __pci_reset_function_locked(dev);
>> >         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>> > -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
>> >         else
>> >                 pci_restore_state(dev);
>> >  
>> > @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >  {
>> >         struct pcistub_device *psdev, *found_psdev = NULL;
>> >         unsigned long flags;
>> > +       struct xen_pcibk_dev_data *dev_data;
>> >  
>> >         spin_lock_irqsave(&pcistub_devices_lock, flags);
>> >  
>> > @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >          * (so it's ready for the next domain)
>> >          */
>> >         device_lock_assert(&dev->dev);
>> > -       __pci_reset_function_locked(dev);
>> > -       pci_restore_state(dev);
>> > -
>> > +       dev_data = pci_get_drvdata(dev);
>> > +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
>> > +       else {
>> > +               __pci_reset_function_locked(dev);
>> > +               /*
>> > +                * The usual sequence is pci_save_state & pci_restore_state
>> > +                * but the guest might have messed the configuration space up.
>> > +                * Use the initial version (when device was binded to us).
>> > +                */
>> > +               pci_restore_state(dev);
>> > +               /*
>> > +                * The next steps are to reload the configuration for the
>> > +                * next time we bind & unbind to a guest - or unload from
>> > +                * pciback.
>> > +                */
>> > +               pci_save_state(dev);
>> > +               dev_data->pci_saved_state = pci_store_saved_state(dev);
>> > +       }
>> >         /* This disables the device. */
>> >         xen_pcibk_reset_device(dev);
>> >  
>> 
>> 
>> Is it save to have "__pci_reset_function_locked(dev)" to be conditional on succes of 
>> "pci_load_and_free_saved_state" ?

> It could be redone a bit differently - as in:

>  rc = pci_load_and_free_saved_state(..);
>  __pci_reset_function_locked(dev);
>  if (!rc) {
>         pci_restore_state(dev);
>         ...

> In which case we will only do the restore state (and save state) when the device
> is in expected state. And the reset happens at that point.

>> 
>> Or is it safer because you don't reset the device although it's in an unknown 
>> state (and resetting it while it's back to dom0 could lead to more problems  ?)

> It could very well lead to disaster. I am not exactly sure what the ramifications
> are with a device for which we cannot save PCI configuration space - aka - extremely
> borked.

If it would .. perhaps you even shouldn't pass it through / seize it, when you can't save it.
And make it unassignable to other guests / rebindable to dom0 if restore fails.

Compile is done .. lets test :-)

>> 



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
@ 2014-08-06 20:17                           ` Sander Eikelenboom
  2014-08-06 20:17                           ` [Xen-devel] " Sander Eikelenboom
  1 sibling, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 20:17 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel


Wednesday, August 6, 2014, 10:09:59 PM, you wrote:

> On Wed, Aug 06, 2014 at 09:47:43PM +0200, Sander Eikelenboom wrote:
>> 
>> Wednesday, August 6, 2014, 9:39:16 PM, you wrote:
>> 
>> > On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
>> >> 
>> >> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
>> >> 
>> >> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
>> >> >> 
>> >> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
>> >> >> 
>> >> >> 
>> >> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
>> >> >> 
>> >> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>> >> >> >>> 
>> >> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>> >> >> >>> 
>> >> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>> >> >> >>> >> 
>> >> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>> >> >> >>> >> 
>> >> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>> >> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>> >> >> >>> >>>>> Greg: goto GHK
>> >> >> >>> >>>>>
>> >> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>> >> >> >>> >>>>
>> >> >> >>> >>>> Applied to devel/for-linus-3.17.
>> >> >> >>> >> 
>> >> >> >>> >>> Thank you.
>> >> >> >>> >>>>
>> >> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>> >> >> >>> >>>> really is a stable candidate.
>> >> >> >>> >> 
>> >> >> >>> >>> OK.
>> >> >> >>> >>>>
>> >> >> >>> >>>> David
>> >> >> >>> >> 
>> >> >> >>> >> Hi Konrad / David,
>> >> >> >>> >> 
>> >> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>> >> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
>> >> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>> >> >> >>> >> 
>> >> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>> >> >> >>> >> after a guest reboot or after assigning the devices to another guest.
>> >> >> >>> 
>> >> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
>> >> >> >>> > devices with no FLR method isn't a regression as this has never worked.
>> >> >> >>> >  Can you explain in more detail what the regression is and which patch
>> >> >> >>> > caused it?
>> >> >> >>> 
>> >> >> >>> I haven't bisected it to a specific patch in this series,
>> >> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
>> >> >> >>> 
>> >> >> >>> - Do a system start and HVM guest start
>> >> >> >>> - HVM guest with pci passthrough, devices work fine
>> >> >> >>> - shutdown the HVM guest
>> >> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>> >> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>> >> >> >>> - Starting the HVM guest again with the same devices passed through.
>> >> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
>> >> >> >>>   "lsusb"
>> >> >> >>> - And this all works fine on vanilla 3.16.  
>> >> >> 
>> >> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>> >> >> >> "xen/pciback: Don't deadlock when unbinding."
>> >> >> >> but it does not change any of that code path. Only figures out whether
>> >> >> >> to take a lock or not.
>> >> >> 
>> >> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
>> >> >> > assumed there could be a connection)
>> >> >> 
>> >> >> >> I will try it out on my box and see if I can reproduce it.
>> >> >> 
>> >> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
>> >> >> >> of it?
>> >> >> 
>> >> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
>> >> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
>> >> >> 
>> >> >> Hi Konrad,
>> >> >> 
>> >> >> It looks like the issues is this part of the change:
>> >> >> 
>> >> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
>> >> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>> >> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>> >> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>> >> >>     *
>> >> >>     * As such we have to be careful.
>> >> >>     + *
>> >> >>     + * To make this easier, the caller has to hold the device lock.
>> >> >>     */
>> >> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
>> >> >>     {
>> >> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >> >>     /* Cleanup our device
>> >> >>     * (so it's ready for the next domain)
>> >> >>     */
>> >> >>     -
>> >> >>     - /* This is OK - we are running from workqueue context
>> >> >>     - * and want to inhibit the user from fiddling with 'reset'
>> >> >>     - */
>> >> >>     - pci_reset_function(dev);
>> >> >>     + lockdep_assert_held(&dev->dev.mutex);
>> >> >>     + __pci_reset_function_locked(dev);
>> >> >>     pci_restore_state(dev);
>> >> >>    /* This disables the device. */
>> >> >> 
>> >> >> More specifically:
>> >> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
>> >> >> __pci_reset_function_locked(dev).
>> >> >> 
>> >> >> 
>> >> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
>> >> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
>> >> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
>> >> >> 
>> >> >> 
>> >> >> So i added a little more debug:
>> >> >> 
>> >> >> device_lock_assert(&dev->dev);
>> >> >> ret = __pci_reset_function_locked(dev);
>> >> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
>> >> >> pci_restore_state(dev);
>> >> >> 
>> >> >> And this returns:
>> >> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
>> >> >> 
>> >> >> So that confirms there is no saved_state to get restored by 
>> >> >> pci_restore_state(dev) in the next line.
>> >> >> 
>> >> >> However there seems to be no "locked" variant of the function 
>> >> >> "pci_reset_function" in pci.c that has all the same logic ...
>> >> 
>> >> > Yup. I've a preliminary patch:
>> >> 
>> >> Preliminary in the sense: "this should fix it .. needs more testing" ?
>> 
>> > This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:
>> 
>> 
>> > From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
>> > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> > Date: Wed, 6 Aug 2014 16:21:32 -0400
>> > Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>> >  a guest.
>> 
>> > The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
>> > "xen/pciback: Don't deadlock when unbinding." was using
>> > the version of pci_reset_function which would lock the device lock.
>> > That is no good as we can dead-lock. As such we swapped to using
>> > the lock-less version and requiring that the callers
>> > of 'pcistub_put_pci_dev' take the device lock. And as such
>> > this bug got exposed.
>> 
>> > Using the lock-less version is  OK, except that we tried to
>> > use 'pci_restore_state' after the lock-less version of
>> > __pci_reset_function_locked - which won't work as 'state_saved'
>> > is set to false. Said 'state_saved' is a toggle boolean that
>> > is to be used by the sequence of a) pci_save_state/pci_restore_state
>> > or b) pci_load_and_free_saved_state/pci_restore_state. We don't
>> > want to use a) as the guest might have messed up the PCI
>> > configuration space and we want it to revert to the state
>> > when the PCI device was binded to us. Therefore we pick
>> > b) to restore the configuration space.
>> 
>> > To still retain the PCI configuration space, we save it once
>> > more and store it on our private copy to be restored when:
>> >  - Device is unbinded from pciback
>> >  - Device is detached from a guest.
>> 
>> > Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
>> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> > ---
>> >  drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
>> >  1 files changed, 21 insertions(+), 4 deletions(-)
>> 
>> > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>> > index 1ddd22f..8cf7f2b 100644
>> > --- a/drivers/xen/xen-pciback/pci_stub.c
>> > +++ b/drivers/xen/xen-pciback/pci_stub.c
>> > @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>> >          */
>> >         __pci_reset_function_locked(dev);
>> >         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>> > -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
>> >         else
>> >                 pci_restore_state(dev);
>> >  
>> > @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >  {
>> >         struct pcistub_device *psdev, *found_psdev = NULL;
>> >         unsigned long flags;
>> > +       struct xen_pcibk_dev_data *dev_data;
>> >  
>> >         spin_lock_irqsave(&pcistub_devices_lock, flags);
>> >  
>> > @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >          * (so it's ready for the next domain)
>> >          */
>> >         device_lock_assert(&dev->dev);
>> > -       __pci_reset_function_locked(dev);
>> > -       pci_restore_state(dev);
>> > -
>> > +       dev_data = pci_get_drvdata(dev);
>> > +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
>> > +       else {
>> > +               __pci_reset_function_locked(dev);
>> > +               /*
>> > +                * The usual sequence is pci_save_state & pci_restore_state
>> > +                * but the guest might have messed the configuration space up.
>> > +                * Use the initial version (when device was binded to us).
>> > +                */
>> > +               pci_restore_state(dev);
>> > +               /*
>> > +                * The next steps are to reload the configuration for the
>> > +                * next time we bind & unbind to a guest - or unload from
>> > +                * pciback.
>> > +                */
>> > +               pci_save_state(dev);
>> > +               dev_data->pci_saved_state = pci_store_saved_state(dev);
>> > +       }
>> >         /* This disables the device. */
>> >         xen_pcibk_reset_device(dev);
>> >  
>> 
>> 
>> Is it save to have "__pci_reset_function_locked(dev)" to be conditional on succes of 
>> "pci_load_and_free_saved_state" ?

> It could be redone a bit differently - as in:

>  rc = pci_load_and_free_saved_state(..);
>  __pci_reset_function_locked(dev);
>  if (!rc) {
>         pci_restore_state(dev);
>         ...

> In which case we will only do the restore state (and save state) when the device
> is in expected state. And the reset happens at that point.

>> 
>> Or is it safer because you don't reset the device although it's in an unknown 
>> state (and resetting it while it's back to dom0 could lead to more problems  ?)

> It could very well lead to disaster. I am not exactly sure what the ramifications
> are with a device for which we cannot save PCI configuration space - aka - extremely
> borked.

If it would .. perhaps you even shouldn't pass it through / seize it, when you can't save it.
And make it unassignable to other guests / rebindable to dom0 if restore fails.

Compile is done .. lets test :-)

>> 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 20:17                           ` [Xen-devel] " Sander Eikelenboom
  2014-08-06 22:08                             ` Sander Eikelenboom
@ 2014-08-06 22:08                             ` Sander Eikelenboom
  1 sibling, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 22:08 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel


Wednesday, August 6, 2014, 10:17:19 PM, you wrote:


> Wednesday, August 6, 2014, 10:09:59 PM, you wrote:

>> On Wed, Aug 06, 2014 at 09:47:43PM +0200, Sander Eikelenboom wrote:
>>> 
>>> Wednesday, August 6, 2014, 9:39:16 PM, you wrote:
>>> 
>>> > On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
>>> >> 
>>> >> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
>>> >> 
>>> >> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
>>> >> >> 
>>> >> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
>>> >> >> 
>>> >> >> 
>>> >> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
>>> >> >> 
>>> >> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>>> >> >> >>> 
>>> >> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>>> >> >> >>> 
>>> >> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>>> >> >> >>> >> 
>>> >> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>>> >> >> >>> >> 
>>> >> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>>> >> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>> >> >> >>> >>>>> Greg: goto GHK
>>> >> >> >>> >>>>>
>>> >> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>>> >> >> >>> >>>>
>>> >> >> >>> >>>> Applied to devel/for-linus-3.17.
>>> >> >> >>> >> 
>>> >> >> >>> >>> Thank you.
>>> >> >> >>> >>>>
>>> >> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>>> >> >> >>> >>>> really is a stable candidate.
>>> >> >> >>> >> 
>>> >> >> >>> >>> OK.
>>> >> >> >>> >>>>
>>> >> >> >>> >>>> David
>>> >> >> >>> >> 
>>> >> >> >>> >> Hi Konrad / David,
>>> >> >> >>> >> 
>>> >> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>>> >> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
>>> >> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>>> >> >> >>> >> 
>>> >> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>>> >> >> >>> >> after a guest reboot or after assigning the devices to another guest.
>>> >> >> >>> 
>>> >> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
>>> >> >> >>> > devices with no FLR method isn't a regression as this has never worked.
>>> >> >> >>> >  Can you explain in more detail what the regression is and which patch
>>> >> >> >>> > caused it?
>>> >> >> >>> 
>>> >> >> >>> I haven't bisected it to a specific patch in this series,
>>> >> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
>>> >> >> >>> 
>>> >> >> >>> - Do a system start and HVM guest start
>>> >> >> >>> - HVM guest with pci passthrough, devices work fine
>>> >> >> >>> - shutdown the HVM guest
>>> >> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>>> >> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>>> >> >> >>> - Starting the HVM guest again with the same devices passed through.
>>> >> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
>>> >> >> >>>   "lsusb"
>>> >> >> >>> - And this all works fine on vanilla 3.16.  
>>> >> >> 
>>> >> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>>> >> >> >> "xen/pciback: Don't deadlock when unbinding."
>>> >> >> >> but it does not change any of that code path. Only figures out whether
>>> >> >> >> to take a lock or not.
>>> >> >> 
>>> >> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
>>> >> >> > assumed there could be a connection)
>>> >> >> 
>>> >> >> >> I will try it out on my box and see if I can reproduce it.
>>> >> >> 
>>> >> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
>>> >> >> >> of it?
>>> >> >> 
>>> >> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
>>> >> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
>>> >> >> 
>>> >> >> Hi Konrad,
>>> >> >> 
>>> >> >> It looks like the issues is this part of the change:
>>> >> >> 
>>> >> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
>>> >> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>>> >> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>>> >> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>>> >> >>     *
>>> >> >>     * As such we have to be careful.
>>> >> >>     + *
>>> >> >>     + * To make this easier, the caller has to hold the device lock.
>>> >> >>     */
>>> >> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
>>> >> >>     {
>>> >> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>> >> >>     /* Cleanup our device
>>> >> >>     * (so it's ready for the next domain)
>>> >> >>     */
>>> >> >>     -
>>> >> >>     - /* This is OK - we are running from workqueue context
>>> >> >>     - * and want to inhibit the user from fiddling with 'reset'
>>> >> >>     - */
>>> >> >>     - pci_reset_function(dev);
>>> >> >>     + lockdep_assert_held(&dev->dev.mutex);
>>> >> >>     + __pci_reset_function_locked(dev);
>>> >> >>     pci_restore_state(dev);
>>> >> >>    /* This disables the device. */
>>> >> >> 
>>> >> >> More specifically:
>>> >> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
>>> >> >> __pci_reset_function_locked(dev).
>>> >> >> 
>>> >> >> 
>>> >> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
>>> >> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
>>> >> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
>>> >> >> 
>>> >> >> 
>>> >> >> So i added a little more debug:
>>> >> >> 
>>> >> >> device_lock_assert(&dev->dev);
>>> >> >> ret = __pci_reset_function_locked(dev);
>>> >> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
>>> >> >> pci_restore_state(dev);
>>> >> >> 
>>> >> >> And this returns:
>>> >> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
>>> >> >> 
>>> >> >> So that confirms there is no saved_state to get restored by 
>>> >> >> pci_restore_state(dev) in the next line.
>>> >> >> 
>>> >> >> However there seems to be no "locked" variant of the function 
>>> >> >> "pci_reset_function" in pci.c that has all the same logic ...
>>> >> 
>>> >> > Yup. I've a preliminary patch:
>>> >> 
>>> >> Preliminary in the sense: "this should fix it .. needs more testing" ?
>>> 
>>> > This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:
>>> 
>>> 
>>> > From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
>>> > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> > Date: Wed, 6 Aug 2014 16:21:32 -0400
>>> > Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>>> >  a guest.
>>> 
>>> > The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
>>> > "xen/pciback: Don't deadlock when unbinding." was using
>>> > the version of pci_reset_function which would lock the device lock.
>>> > That is no good as we can dead-lock. As such we swapped to using
>>> > the lock-less version and requiring that the callers
>>> > of 'pcistub_put_pci_dev' take the device lock. And as such
>>> > this bug got exposed.
>>> 
>>> > Using the lock-less version is  OK, except that we tried to
>>> > use 'pci_restore_state' after the lock-less version of
>>> > __pci_reset_function_locked - which won't work as 'state_saved'
>>> > is set to false. Said 'state_saved' is a toggle boolean that
>>> > is to be used by the sequence of a) pci_save_state/pci_restore_state
>>> > or b) pci_load_and_free_saved_state/pci_restore_state. We don't
>>> > want to use a) as the guest might have messed up the PCI
>>> > configuration space and we want it to revert to the state
>>> > when the PCI device was binded to us. Therefore we pick
>>> > b) to restore the configuration space.
>>> 
>>> > To still retain the PCI configuration space, we save it once
>>> > more and store it on our private copy to be restored when:
>>> >  - Device is unbinded from pciback
>>> >  - Device is detached from a guest.
>>> 
>>> > Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
>>> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> > ---
>>> >  drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
>>> >  1 files changed, 21 insertions(+), 4 deletions(-)
>>> 
>>> > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>>> > index 1ddd22f..8cf7f2b 100644
>>> > --- a/drivers/xen/xen-pciback/pci_stub.c
>>> > +++ b/drivers/xen/xen-pciback/pci_stub.c
>>> > @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>>> >          */
>>> >         __pci_reset_function_locked(dev);
>>> >         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>>> > -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>>> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
>>> >         else
>>> >                 pci_restore_state(dev);
>>> >  
>>> > @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>> >  {
>>> >         struct pcistub_device *psdev, *found_psdev = NULL;
>>> >         unsigned long flags;
>>> > +       struct xen_pcibk_dev_data *dev_data;
>>> >  
>>> >         spin_lock_irqsave(&pcistub_devices_lock, flags);
>>> >  
>>> > @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>> >          * (so it's ready for the next domain)
>>> >          */
>>> >         device_lock_assert(&dev->dev);
>>> > -       __pci_reset_function_locked(dev);
>>> > -       pci_restore_state(dev);
>>> > -
>>> > +       dev_data = pci_get_drvdata(dev);
>>> > +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>>> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
>>> > +       else {
>>> > +               __pci_reset_function_locked(dev);
>>> > +               /*
>>> > +                * The usual sequence is pci_save_state & pci_restore_state
>>> > +                * but the guest might have messed the configuration space up.
>>> > +                * Use the initial version (when device was binded to us).
>>> > +                */
>>> > +               pci_restore_state(dev);
>>> > +               /*
>>> > +                * The next steps are to reload the configuration for the
>>> > +                * next time we bind & unbind to a guest - or unload from
>>> > +                * pciback.
>>> > +                */
>>> > +               pci_save_state(dev);
>>> > +               dev_data->pci_saved_state = pci_store_saved_state(dev);
>>> > +       }
>>> >         /* This disables the device. */
>>> >         xen_pcibk_reset_device(dev);
>>> >  
>>> 
>>> 
>>> Is it save to have "__pci_reset_function_locked(dev)" to be conditional on succes of 
>>> "pci_load_and_free_saved_state" ?

>> It could be redone a bit differently - as in:

>>  rc = pci_load_and_free_saved_state(..);
>>  __pci_reset_function_locked(dev);
>>  if (!rc) {
>>         pci_restore_state(dev);
>>         ...

>> In which case we will only do the restore state (and save state) when the device
>> is in expected state. And the reset happens at that point.

>>> 
>>> Or is it safer because you don't reset the device although it's in an unknown 
>>> state (and resetting it while it's back to dom0 could lead to more problems  ?)

>> It could very well lead to disaster. I am not exactly sure what the ramifications
>> are with a device for which we cannot save PCI configuration space - aka - extremely
>> borked.

> If it would .. perhaps you even shouldn't pass it through / seize it, when you can't save it.
> And make it unassignable to other guests / rebindable to dom0 if restore fails.

> Compile is done .. lets test :-)

If you like, you may stick on a:

Tested-By: Sander Eikelenboom <linux@eikelenboom.it>

Thanks for fixing Konrad !

--
Sander 






^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 20:17                           ` [Xen-devel] " Sander Eikelenboom
@ 2014-08-06 22:08                             ` Sander Eikelenboom
  2014-08-06 22:08                             ` [Xen-devel] " Sander Eikelenboom
  1 sibling, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-06 22:08 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, David Vrabel, linux-kernel, xen-devel


Wednesday, August 6, 2014, 10:17:19 PM, you wrote:


> Wednesday, August 6, 2014, 10:09:59 PM, you wrote:

>> On Wed, Aug 06, 2014 at 09:47:43PM +0200, Sander Eikelenboom wrote:
>>> 
>>> Wednesday, August 6, 2014, 9:39:16 PM, you wrote:
>>> 
>>> > On Wed, Aug 06, 2014 at 09:25:59PM +0200, Sander Eikelenboom wrote:
>>> >> 
>>> >> Wednesday, August 6, 2014, 9:18:31 PM, you wrote:
>>> >> 
>>> >> > On Wed, Aug 06, 2014 at 08:59:59PM +0200, Sander Eikelenboom wrote:
>>> >> >> 
>>> >> >> Tuesday, August 5, 2014, 4:04:43 PM, you wrote:
>>> >> >> 
>>> >> >> 
>>> >> >> > Tuesday, August 5, 2014, 3:49:30 PM, you wrote:
>>> >> >> 
>>> >> >> >> On Tue, Aug 05, 2014 at 11:44:33AM +0200, Sander Eikelenboom wrote:
>>> >> >> >>> 
>>> >> >> >>> Tuesday, August 5, 2014, 11:31:08 AM, you wrote:
>>> >> >> >>> 
>>> >> >> >>> > On 05/08/14 09:44, Sander Eikelenboom wrote:
>>> >> >> >>> >> 
>>> >> >> >>> >> Monday, August 4, 2014, 8:43:18 PM, you wrote:
>>> >> >> >>> >> 
>>> >> >> >>> >>> On Fri, Aug 01, 2014 at 04:30:05PM +0100, David Vrabel wrote:
>>> >> >> >>> >>>> On 14/07/14 17:18, Konrad Rzeszutek Wilk wrote:
>>> >> >> >>> >>>>> Greg: goto GHK
>>> >> >> >>> >>>>>
>>> >> >> >>> >>>>> This is v5 version of patches to fix some issues in Xen PCIback.
>>> >> >> >>> >>>>
>>> >> >> >>> >>>> Applied to devel/for-linus-3.17.
>>> >> >> >>> >> 
>>> >> >> >>> >>> Thank you.
>>> >> >> >>> >>>>
>>> >> >> >>> >>>> I dropped the stable Cc for #2 pending a final decision on whether it
>>> >> >> >>> >>>> really is a stable candidate.
>>> >> >> >>> >> 
>>> >> >> >>> >>> OK.
>>> >> >> >>> >>>>
>>> >> >> >>> >>>> David
>>> >> >> >>> >> 
>>> >> >> >>> >> Hi Konrad / David,
>>> >> >> >>> >> 
>>> >> >> >>> >> This series still lacks a resolution on the sysfs /do_flr /reset,
>>> >> >> >>> >> as a result the pci devices are not reset after shutdown of a guest.
>>> >> >> >>> >> (no more pciback 0000:xx:xx.x: restoring config space at offset xxx)
>>> >> >> >>> >> 
>>> >> >> >>> >> So this series now introduces a regression to 3.16, which causes devices to malfunction 
>>> >> >> >>> >> after a guest reboot or after assigning the devices to another guest.
>>> >> >> >>> 
>>> >> >> >>> > I don't follow what you're saying.  The lack of a device reset for PCI
>>> >> >> >>> > devices with no FLR method isn't a regression as this has never worked.
>>> >> >> >>> >  Can you explain in more detail what the regression is and which patch
>>> >> >> >>> > caused it?
>>> >> >> >>> 
>>> >> >> >>> I haven't bisected it to a specific patch in this series,
>>> >> >> >>> but this patch series (when pulled on top of 3.16) cause the following:
>>> >> >> >>> 
>>> >> >> >>> - Do a system start and HVM guest start
>>> >> >> >>> - HVM guest with pci passthrough, devices work fine
>>> >> >> >>> - shutdown the HVM guest
>>> >> >> >>> - "pciback 0000:xx:xx.x: restoring config space at offset xxx" messages do not
>>> >> >> >>>   appear anymore when shutting down the HVM guest (as they do with vanilla 3.16)
>>> >> >> >>> - Starting the HVM guest again with the same devices passed through.
>>> >> >> >>> - Devices malfunction (for example a USB host controller will fail a simple 
>>> >> >> >>>   "lsusb"
>>> >> >> >>> - And this all works fine on vanilla 3.16.  
>>> >> >> 
>>> >> >> >> Hm, the only patch that makes code changes is 63fc5ec97cc54257d1c4ee49ed2131f754a5ff9b
>>> >> >> >> "xen/pciback: Don't deadlock when unbinding."
>>> >> >> >> but it does not change any of that code path. Only figures out whether
>>> >> >> >> to take a lock or not.
>>> >> >> 
>>> >> >> > Ok and the do_flr nack by david is unrelated to this part (i didn't check just 
>>> >> >> > assumed there could be a connection)
>>> >> >> 
>>> >> >> >> I will try it out on my box and see if I can reproduce it.
>>> >> >> 
>>> >> >> >> And just to be 100% sure - you are using vanilla Xen? No changes on top
>>> >> >> >> of it?
>>> >> >> 
>>> >> >> > Except the fix from jan for the pirq/msi stuff (and an unrelated hpet one), other than that no.
>>> >> >> > If you can't reproduce i will see if i can dive deeper into it tonight !
>>> >> >> 
>>> >> >> Hi Konrad,
>>> >> >> 
>>> >> >> It looks like the issues is this part of the change:
>>> >> >> 
>>> >> >>     --- a/drivers/xen/xen-pciback/pci_stub.c
>>> >> >>     +++ b/drivers/xen/xen-pciback/pci_stub.c
>>> >> >>     @@ -250,6 +250,8 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
>>> >> >>     * - 'echo BDF > unbind' with a guest still using it. See pcistub_remove
>>> >> >>     *
>>> >> >>     * As such we have to be careful.
>>> >> >>     + *
>>> >> >>     + * To make this easier, the caller has to hold the device lock.
>>> >> >>     */
>>> >> >>     void pcistub_put_pci_dev(struct pci_dev *dev)
>>> >> >>     {
>>> >> >>     @@ -276,11 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>> >> >>     /* Cleanup our device
>>> >> >>     * (so it's ready for the next domain)
>>> >> >>     */
>>> >> >>     -
>>> >> >>     - /* This is OK - we are running from workqueue context
>>> >> >>     - * and want to inhibit the user from fiddling with 'reset'
>>> >> >>     - */
>>> >> >>     - pci_reset_function(dev);
>>> >> >>     + lockdep_assert_held(&dev->dev.mutex);
>>> >> >>     + __pci_reset_function_locked(dev);
>>> >> >>     pci_restore_state(dev);
>>> >> >>    /* This disables the device. */
>>> >> >> 
>>> >> >> More specifically:
>>> >> >> The old "pci_reset_function(dev)" potentially seems to do much more than 
>>> >> >> __pci_reset_function_locked(dev).
>>> >> >> 
>>> >> >> 
>>> >> >> "__pci_reset_function_locked(dev)" only calls  "__pci_dev_reset"
>>> >> >> while "pci_reset_function" not only calls pci_dev_reset, but on succes
>>> >> >> it also calls: "pci_dev_save_and_disable" which does a save state etc.
>>> >> >> 
>>> >> >> 
>>> >> >> So i added a little more debug:
>>> >> >> 
>>> >> >> device_lock_assert(&dev->dev);
>>> >> >> ret = __pci_reset_function_locked(dev);
>>> >> >> dev_dbg(&dev->dev, "%s __pci_reset_function_locked:%d  dev->state_saved:%d\n", __func__, ret, (!dev->state_saved) ? 0 : 1 );
>>> >> >> pci_restore_state(dev);
>>> >> >> 
>>> >> >> And this returns:
>>> >> >> [  494.570579] pciback 0000:04:00.0: pcistub_put_pci_dev __pci_reset_function_locked:0  dev->state_saved:0
>>> >> >> 
>>> >> >> So that confirms there is no saved_state to get restored by 
>>> >> >> pci_restore_state(dev) in the next line.
>>> >> >> 
>>> >> >> However there seems to be no "locked" variant of the function 
>>> >> >> "pci_reset_function" in pci.c that has all the same logic ...
>>> >> 
>>> >> > Yup. I've a preliminary patch:
>>> >> 
>>> >> Preliminary in the sense: "this should fix it .. needs more testing" ?
>>> 
>>> > This should fix it, albeit the fix has a disastrous flaw. Here is the proper version:
>>> 
>>> 
>>> > From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
>>> > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> > Date: Wed, 6 Aug 2014 16:21:32 -0400
>>> > Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>>> >  a guest.
>>> 
>>> > The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
>>> > "xen/pciback: Don't deadlock when unbinding." was using
>>> > the version of pci_reset_function which would lock the device lock.
>>> > That is no good as we can dead-lock. As such we swapped to using
>>> > the lock-less version and requiring that the callers
>>> > of 'pcistub_put_pci_dev' take the device lock. And as such
>>> > this bug got exposed.
>>> 
>>> > Using the lock-less version is  OK, except that we tried to
>>> > use 'pci_restore_state' after the lock-less version of
>>> > __pci_reset_function_locked - which won't work as 'state_saved'
>>> > is set to false. Said 'state_saved' is a toggle boolean that
>>> > is to be used by the sequence of a) pci_save_state/pci_restore_state
>>> > or b) pci_load_and_free_saved_state/pci_restore_state. We don't
>>> > want to use a) as the guest might have messed up the PCI
>>> > configuration space and we want it to revert to the state
>>> > when the PCI device was binded to us. Therefore we pick
>>> > b) to restore the configuration space.
>>> 
>>> > To still retain the PCI configuration space, we save it once
>>> > more and store it on our private copy to be restored when:
>>> >  - Device is unbinded from pciback
>>> >  - Device is detached from a guest.
>>> 
>>> > Reported-by:  Sander Eikelenboom <linux@eikelenboom.it>
>>> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>>> > ---
>>> >  drivers/xen/xen-pciback/pci_stub.c |   25 +++++++++++++++++++++----
>>> >  1 files changed, 21 insertions(+), 4 deletions(-)
>>> 
>>> > diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
>>> > index 1ddd22f..8cf7f2b 100644
>>> > --- a/drivers/xen/xen-pciback/pci_stub.c
>>> > +++ b/drivers/xen/xen-pciback/pci_stub.c
>>> > @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>>> >          */
>>> >         __pci_reset_function_locked(dev);
>>> >         if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>>> > -               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>>> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
>>> >         else
>>> >                 pci_restore_state(dev);
>>> >  
>>> > @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>> >  {
>>> >         struct pcistub_device *psdev, *found_psdev = NULL;
>>> >         unsigned long flags;
>>> > +       struct xen_pcibk_dev_data *dev_data;
>>> >  
>>> >         spin_lock_irqsave(&pcistub_devices_lock, flags);
>>> >  
>>> > @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>> >          * (so it's ready for the next domain)
>>> >          */
>>> >         device_lock_assert(&dev->dev);
>>> > -       __pci_reset_function_locked(dev);
>>> > -       pci_restore_state(dev);
>>> > -
>>> > +       dev_data = pci_get_drvdata(dev);
>>> > +       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>>> > +               dev_info(&dev->dev, "Could not reload PCI state\n");
>>> > +       else {
>>> > +               __pci_reset_function_locked(dev);
>>> > +               /*
>>> > +                * The usual sequence is pci_save_state & pci_restore_state
>>> > +                * but the guest might have messed the configuration space up.
>>> > +                * Use the initial version (when device was binded to us).
>>> > +                */
>>> > +               pci_restore_state(dev);
>>> > +               /*
>>> > +                * The next steps are to reload the configuration for the
>>> > +                * next time we bind & unbind to a guest - or unload from
>>> > +                * pciback.
>>> > +                */
>>> > +               pci_save_state(dev);
>>> > +               dev_data->pci_saved_state = pci_store_saved_state(dev);
>>> > +       }
>>> >         /* This disables the device. */
>>> >         xen_pcibk_reset_device(dev);
>>> >  
>>> 
>>> 
>>> Is it save to have "__pci_reset_function_locked(dev)" to be conditional on succes of 
>>> "pci_load_and_free_saved_state" ?

>> It could be redone a bit differently - as in:

>>  rc = pci_load_and_free_saved_state(..);
>>  __pci_reset_function_locked(dev);
>>  if (!rc) {
>>         pci_restore_state(dev);
>>         ...

>> In which case we will only do the restore state (and save state) when the device
>> is in expected state. And the reset happens at that point.

>>> 
>>> Or is it safer because you don't reset the device although it's in an unknown 
>>> state (and resetting it while it's back to dom0 could lead to more problems  ?)

>> It could very well lead to disaster. I am not exactly sure what the ramifications
>> are with a device for which we cannot save PCI configuration space - aka - extremely
>> borked.

> If it would .. perhaps you even shouldn't pass it through / seize it, when you can't save it.
> And make it unassignable to other guests / rebindable to dom0 if restore fails.

> Compile is done .. lets test :-)

If you like, you may stick on a:

Tested-By: Sander Eikelenboom <linux@eikelenboom.it>

Thanks for fixing Konrad !

--
Sander 

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:39                     ` [Xen-devel] " Konrad Rzeszutek Wilk
  2014-08-06 19:47                       ` Sander Eikelenboom
  2014-08-06 19:47                       ` Sander Eikelenboom
@ 2014-08-07  9:04                       ` David Vrabel
  2014-08-25 17:18                         ` Sander Eikelenboom
  2014-08-25 17:18                         ` Sander Eikelenboom
  2014-08-07  9:04                       ` David Vrabel
  3 siblings, 2 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-07  9:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Sander Eikelenboom
  Cc: gregkh, boris.ostrovsky, linux-kernel, xen-devel

On 06/08/14 20:39, Konrad Rzeszutek Wilk wrote:
> 
> From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Wed, 6 Aug 2014 16:21:32 -0400
> Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>  a guest.
> 
> The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
> "xen/pciback: Don't deadlock when unbinding." was using
> the version of pci_reset_function which would lock the device lock.
> That is no good as we can dead-lock. As such we swapped to using
> the lock-less version and requiring that the callers
> of 'pcistub_put_pci_dev' take the device lock. And as such
> this bug got exposed.
> 
> Using the lock-less version is  OK, except that we tried to
> use 'pci_restore_state' after the lock-less version of
> __pci_reset_function_locked - which won't work as 'state_saved'
> is set to false. Said 'state_saved' is a toggle boolean that
> is to be used by the sequence of a) pci_save_state/pci_restore_state
> or b) pci_load_and_free_saved_state/pci_restore_state. We don't
> want to use a) as the guest might have messed up the PCI
> configuration space and we want it to revert to the state
> when the PCI device was binded to us. Therefore we pick
> b) to restore the configuration space.
> 
> To still retain the PCI configuration space, we save it once
> more and store it on our private copy to be restored when:
>  - Device is unbinded from pciback
>  - Device is detached from a guest.

This should be folded into the original patch.

[...]
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>  	 */
>  	__pci_reset_function_locked(dev);
>  	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))

I dislike testing for errors like this as it looks like it's testing for
a boolean success.  Use

   ret = pci_load_and_free_saved_state(...);
   if (ret < 0)
      ...

And similarly, below.

> -		dev_dbg(&dev->dev, "Could not reload PCI state\n");
> +		dev_info(&dev->dev, "Could not reload PCI state\n");

This should be dev_warn().

pci_load_and_free_saved_state() won't fail because we know the state is
valid (since we saved it from the device originally).  Warning and
skipping the restore is fine here (and below).

>  	else
>  		pci_restore_state(dev);
>  
> @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  {
>  	struct pcistub_device *psdev, *found_psdev = NULL;
>  	unsigned long flags;
> +	struct xen_pcibk_dev_data *dev_data;
>  
>  	spin_lock_irqsave(&pcistub_devices_lock, flags);
>  
> @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	 * (so it's ready for the next domain)
>  	 */
>  	device_lock_assert(&dev->dev);
> -	__pci_reset_function_locked(dev);
> -	pci_restore_state(dev);
> -
> +	dev_data = pci_get_drvdata(dev);
> +	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))

This should be pci_load_saved_state() and then you can avoid the
pci_save_state() below.

> +		dev_info(&dev->dev, "Could not reload PCI state\n");

dev_warn() also.

> +	else {
> +		__pci_reset_function_locked(dev);
> +		/*
> +		 * The usual sequence is pci_save_state & pci_restore_state
> +		 * but the guest might have messed the configuration space up.
> +		 * Use the initial version (when device was binded to us).

s/binded/bound/

> +		pci_restore_state(dev);
> +		/*
> +		 * The next steps are to reload the configuration for the
> +		 * next time we bind & unbind to a guest - or unload from
> +		 * pciback.
> +		 */
> +		pci_save_state(dev);
> +		dev_data->pci_saved_state = pci_store_saved_state(dev);

You don't need this if you don't free the original state above.

> +	}
>  	/* This disables the device. */
>  	xen_pcibk_reset_device(dev);

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-06 19:39                     ` [Xen-devel] " Konrad Rzeszutek Wilk
                                         ` (2 preceding siblings ...)
  2014-08-07  9:04                       ` [Xen-devel] " David Vrabel
@ 2014-08-07  9:04                       ` David Vrabel
  3 siblings, 0 replies; 68+ messages in thread
From: David Vrabel @ 2014-08-07  9:04 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, Sander Eikelenboom
  Cc: gregkh, boris.ostrovsky, linux-kernel, xen-devel

On 06/08/14 20:39, Konrad Rzeszutek Wilk wrote:
> 
> From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Wed, 6 Aug 2014 16:21:32 -0400
> Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>  a guest.
> 
> The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
> "xen/pciback: Don't deadlock when unbinding." was using
> the version of pci_reset_function which would lock the device lock.
> That is no good as we can dead-lock. As such we swapped to using
> the lock-less version and requiring that the callers
> of 'pcistub_put_pci_dev' take the device lock. And as such
> this bug got exposed.
> 
> Using the lock-less version is  OK, except that we tried to
> use 'pci_restore_state' after the lock-less version of
> __pci_reset_function_locked - which won't work as 'state_saved'
> is set to false. Said 'state_saved' is a toggle boolean that
> is to be used by the sequence of a) pci_save_state/pci_restore_state
> or b) pci_load_and_free_saved_state/pci_restore_state. We don't
> want to use a) as the guest might have messed up the PCI
> configuration space and we want it to revert to the state
> when the PCI device was binded to us. Therefore we pick
> b) to restore the configuration space.
> 
> To still retain the PCI configuration space, we save it once
> more and store it on our private copy to be restored when:
>  - Device is unbinded from pciback
>  - Device is detached from a guest.

This should be folded into the original patch.

[...]
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>  	 */
>  	__pci_reset_function_locked(dev);
>  	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))

I dislike testing for errors like this as it looks like it's testing for
a boolean success.  Use

   ret = pci_load_and_free_saved_state(...);
   if (ret < 0)
      ...

And similarly, below.

> -		dev_dbg(&dev->dev, "Could not reload PCI state\n");
> +		dev_info(&dev->dev, "Could not reload PCI state\n");

This should be dev_warn().

pci_load_and_free_saved_state() won't fail because we know the state is
valid (since we saved it from the device originally).  Warning and
skipping the restore is fine here (and below).

>  	else
>  		pci_restore_state(dev);
>  
> @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  {
>  	struct pcistub_device *psdev, *found_psdev = NULL;
>  	unsigned long flags;
> +	struct xen_pcibk_dev_data *dev_data;
>  
>  	spin_lock_irqsave(&pcistub_devices_lock, flags);
>  
> @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	 * (so it's ready for the next domain)
>  	 */
>  	device_lock_assert(&dev->dev);
> -	__pci_reset_function_locked(dev);
> -	pci_restore_state(dev);
> -
> +	dev_data = pci_get_drvdata(dev);
> +	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))

This should be pci_load_saved_state() and then you can avoid the
pci_save_state() below.

> +		dev_info(&dev->dev, "Could not reload PCI state\n");

dev_warn() also.

> +	else {
> +		__pci_reset_function_locked(dev);
> +		/*
> +		 * The usual sequence is pci_save_state & pci_restore_state
> +		 * but the guest might have messed the configuration space up.
> +		 * Use the initial version (when device was binded to us).

s/binded/bound/

> +		pci_restore_state(dev);
> +		/*
> +		 * The next steps are to reload the configuration for the
> +		 * next time we bind & unbind to a guest - or unload from
> +		 * pciback.
> +		 */
> +		pci_save_state(dev);
> +		dev_data->pci_saved_state = pci_store_saved_state(dev);

You don't need this if you don't free the original state above.

> +	}
>  	/* This disables the device. */
>  	xen_pcibk_reset_device(dev);

David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [Xen-devel] [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-07  9:04                       ` [Xen-devel] " David Vrabel
@ 2014-08-25 17:18                         ` Sander Eikelenboom
  2014-08-25 17:18                         ` Sander Eikelenboom
  1 sibling, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-25 17:18 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: David Vrabel, gregkh, boris.ostrovsky, linux-kernel, xen-devel


Hi Konrad,

Just in case you forgot this one .. a subtle ping (we are just in the low RC's) :-)

--
Sander

Thursday, August 7, 2014, 11:04:02 AM, you wrote:

> On 06/08/14 20:39, Konrad Rzeszutek Wilk wrote:
>> 
>> From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
>> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Date: Wed, 6 Aug 2014 16:21:32 -0400
>> Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>>  a guest.
>> 
>> The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
>> "xen/pciback: Don't deadlock when unbinding." was using
>> the version of pci_reset_function which would lock the device lock.
>> That is no good as we can dead-lock. As such we swapped to using
>> the lock-less version and requiring that the callers
>> of 'pcistub_put_pci_dev' take the device lock. And as such
>> this bug got exposed.
>> 
>> Using the lock-less version is  OK, except that we tried to
>> use 'pci_restore_state' after the lock-less version of
>> __pci_reset_function_locked - which won't work as 'state_saved'
>> is set to false. Said 'state_saved' is a toggle boolean that
>> is to be used by the sequence of a) pci_save_state/pci_restore_state
>> or b) pci_load_and_free_saved_state/pci_restore_state. We don't
>> want to use a) as the guest might have messed up the PCI
>> configuration space and we want it to revert to the state
>> when the PCI device was binded to us. Therefore we pick
>> b) to restore the configuration space.
>> 
>> To still retain the PCI configuration space, we save it once
>> more and store it on our private copy to be restored when:
>>  - Device is unbinded from pciback
>>  - Device is detached from a guest.

> This should be folded into the original patch.

> [...]
>> --- a/drivers/xen/xen-pciback/pci_stub.c
>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>>        */
>>       __pci_reset_function_locked(dev);
>>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))

> I dislike testing for errors like this as it looks like it's testing for
> a boolean success.  Use

>    ret = pci_load_and_free_saved_state(...);
>    if (ret < 0)
>       ...

> And similarly, below.

>> -             dev_dbg(&dev->dev, "Could not reload PCI state\n");
>> +             dev_info(&dev->dev, "Could not reload PCI state\n");

> This should be dev_warn().

> pci_load_and_free_saved_state() won't fail because we know the state is
> valid (since we saved it from the device originally).  Warning and
> skipping the restore is fine here (and below).

>>       else
>>               pci_restore_state(dev);
>>  
>> @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>  {
>>       struct pcistub_device *psdev, *found_psdev = NULL;
>>       unsigned long flags;
>> +     struct xen_pcibk_dev_data *dev_data;
>>  
>>       spin_lock_irqsave(&pcistub_devices_lock, flags);
>>  
>> @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>        * (so it's ready for the next domain)
>>        */
>>       device_lock_assert(&dev->dev);
>> -     __pci_reset_function_locked(dev);
>> -     pci_restore_state(dev);
>> -
>> +     dev_data = pci_get_drvdata(dev);
>> +     if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))

> This should be pci_load_saved_state() and then you can avoid the
> pci_save_state() below.

>> +             dev_info(&dev->dev, "Could not reload PCI state\n");

> dev_warn() also.

>> +     else {
>> +             __pci_reset_function_locked(dev);
>> +             /*
>> +              * The usual sequence is pci_save_state & pci_restore_state
>> +              * but the guest might have messed the configuration space up.
>> +              * Use the initial version (when device was binded to us).

> s/binded/bound/

>> +             pci_restore_state(dev);
>> +             /*
>> +              * The next steps are to reload the configuration for the
>> +              * next time we bind & unbind to a guest - or unload from
>> +              * pciback.
>> +              */
>> +             pci_save_state(dev);
>> +             dev_data->pci_saved_state = pci_store_saved_state(dev);

> You don't need this if you don't free the original state above.

>> +     }
>>       /* This disables the device. */
>>       xen_pcibk_reset_device(dev);

> David



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v5] Fixes to Xen pciback for 3.17.
  2014-08-07  9:04                       ` [Xen-devel] " David Vrabel
  2014-08-25 17:18                         ` Sander Eikelenboom
@ 2014-08-25 17:18                         ` Sander Eikelenboom
  1 sibling, 0 replies; 68+ messages in thread
From: Sander Eikelenboom @ 2014-08-25 17:18 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: gregkh, boris.ostrovsky, xen-devel, David Vrabel, linux-kernel


Hi Konrad,

Just in case you forgot this one .. a subtle ping (we are just in the low RC's) :-)

--
Sander

Thursday, August 7, 2014, 11:04:02 AM, you wrote:

> On 06/08/14 20:39, Konrad Rzeszutek Wilk wrote:
>> 
>> From 00a5b6e3c9ee2c2d605879bdaebc627fa640b024 Mon Sep 17 00:00:00 2001
>> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Date: Wed, 6 Aug 2014 16:21:32 -0400
>> Subject: [PATCH] xen/pciback: Restore configuration space when detaching from
>>  a guest.
>> 
>> The commit 9eea3f7695226f9af9992cebf8e98ac0ad78b277
>> "xen/pciback: Don't deadlock when unbinding." was using
>> the version of pci_reset_function which would lock the device lock.
>> That is no good as we can dead-lock. As such we swapped to using
>> the lock-less version and requiring that the callers
>> of 'pcistub_put_pci_dev' take the device lock. And as such
>> this bug got exposed.
>> 
>> Using the lock-less version is  OK, except that we tried to
>> use 'pci_restore_state' after the lock-less version of
>> __pci_reset_function_locked - which won't work as 'state_saved'
>> is set to false. Said 'state_saved' is a toggle boolean that
>> is to be used by the sequence of a) pci_save_state/pci_restore_state
>> or b) pci_load_and_free_saved_state/pci_restore_state. We don't
>> want to use a) as the guest might have messed up the PCI
>> configuration space and we want it to revert to the state
>> when the PCI device was binded to us. Therefore we pick
>> b) to restore the configuration space.
>> 
>> To still retain the PCI configuration space, we save it once
>> more and store it on our private copy to be restored when:
>>  - Device is unbinded from pciback
>>  - Device is detached from a guest.

> This should be folded into the original patch.

> [...]
>> --- a/drivers/xen/xen-pciback/pci_stub.c
>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> @@ -105,7 +105,7 @@ static void pcistub_device_release(struct kref *kref)
>>        */
>>       __pci_reset_function_locked(dev);
>>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))

> I dislike testing for errors like this as it looks like it's testing for
> a boolean success.  Use

>    ret = pci_load_and_free_saved_state(...);
>    if (ret < 0)
>       ...

> And similarly, below.

>> -             dev_dbg(&dev->dev, "Could not reload PCI state\n");
>> +             dev_info(&dev->dev, "Could not reload PCI state\n");

> This should be dev_warn().

> pci_load_and_free_saved_state() won't fail because we know the state is
> valid (since we saved it from the device originally).  Warning and
> skipping the restore is fine here (and below).

>>       else
>>               pci_restore_state(dev);
>>  
>> @@ -257,6 +257,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>  {
>>       struct pcistub_device *psdev, *found_psdev = NULL;
>>       unsigned long flags;
>> +     struct xen_pcibk_dev_data *dev_data;
>>  
>>       spin_lock_irqsave(&pcistub_devices_lock, flags);
>>  
>> @@ -279,9 +280,25 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>        * (so it's ready for the next domain)
>>        */
>>       device_lock_assert(&dev->dev);
>> -     __pci_reset_function_locked(dev);
>> -     pci_restore_state(dev);
>> -
>> +     dev_data = pci_get_drvdata(dev);
>> +     if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))

> This should be pci_load_saved_state() and then you can avoid the
> pci_save_state() below.

>> +             dev_info(&dev->dev, "Could not reload PCI state\n");

> dev_warn() also.

>> +     else {
>> +             __pci_reset_function_locked(dev);
>> +             /*
>> +              * The usual sequence is pci_save_state & pci_restore_state
>> +              * but the guest might have messed the configuration space up.
>> +              * Use the initial version (when device was binded to us).

> s/binded/bound/

>> +             pci_restore_state(dev);
>> +             /*
>> +              * The next steps are to reload the configuration for the
>> +              * next time we bind & unbind to a guest - or unload from
>> +              * pciback.
>> +              */
>> +             pci_save_state(dev);
>> +             dev_data->pci_saved_state = pci_store_saved_state(dev);

> You don't need this if you don't free the original state above.

>> +     }
>>       /* This disables the device. */
>>       xen_pcibk_reset_device(dev);

> David

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v5] Fixes to Xen pciback for 3.17.
@ 2014-07-14 16:18 Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 68+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-07-14 16:18 UTC (permalink / raw)
  To: gregkh, xen-devel, linux-kernel, boris.ostrovsky, david.vrabel

Greg: goto GHK

This is v5 version of patches to fix some issues in Xen PCIback.

One of the issues Xen PCI back has that patch:

is fixing is that a deadlock can happen if the PCI device is
assigned to a guest and we try to 'unbind' it from Xen 'pciback' driver.
The issue is rather simple - the SysFS mechanism for the 'unbind' path
takes a device lock and the code in Xen PCI uses the pci_reset_function
which also takes the same lock. Solution is to use the lock-less version
and mandate that callers of said function in Xen pciback take the lock.
Easy enough.

GHK:
To guard against this happening in the future we also add an assert in the
form of lockdep assertion. That is OK except that it looks ugly as we take
it straight from the 'struct device' instead of using an appropriate macro.
See:

+       lockdep_assert_held(&dev->dev.mutex);

(in [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding).

The patch: [PATCH v5 3/6] driver core: Provide an wrapper around the mutex
to do.

introduces a nice wrapper so it is bit cleaner. Greg, if you are OK with
it could you kindly Ack it as I would prefer to put this patchset
via the Xen tree. It would look now as:

-       lockdep_assert_held(&dev->dev.mutex);
+       device_lock_assert(&dev->dev);

I can also squash it in "[PATCH v5 2/6] xen/pciback: Don't deadlock when
 unbinding." but since that one is going through the stable tree I wasn't
sure whether you (Greg KH) would be OK with that.

END GHK:

Thank you!

Patches are also available on my git tree

 git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git devel/pciback-3.17.v5

 Documentation/ABI/testing/sysfs-driver-pciback | 25 +++++++++++++++
 drivers/xen/xen-pciback/passthrough.c          | 14 +++++++--
 drivers/xen/xen-pciback/pci_stub.c             | 42 ++++++++++++++------------
 drivers/xen/xen-pciback/pciback.h              |  7 +++--
 drivers/xen/xen-pciback/vpci.c                 | 14 +++++++--
 drivers/xen/xen-pciback/xenbus.c               |  4 +--
 include/linux/device.h                         |  5 +++
 7 files changed, 81 insertions(+), 30 deletions(-)

Konrad Rzeszutek Wilk (6):
      xen-pciback: Document the various parameters and attributes in SysFS
      xen/pciback: Don't deadlock when unbinding.
      driver core: Provide an wrapper around the mutex to do lockdep warnings
      xen/pciback: Include the domain id if removing the device whilst still in use
      xen/pciback: Print out the domain owning the device.
      xen/pciback: Remove tons of dereferences

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2014-08-25 17:18 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-14 16:18 [PATCH v5] Fixes to Xen pciback for 3.17 Konrad Rzeszutek Wilk
2014-07-14 16:18 ` [PATCH v5 1/6] xen-pciback: Document the various parameters and attributes in SysFS Konrad Rzeszutek Wilk
2014-07-14 16:18 ` Konrad Rzeszutek Wilk
2014-07-28 13:04   ` David Vrabel
2014-07-28 14:56     ` Greg KH
2014-07-28 14:56     ` Greg KH
2014-08-01 14:59       ` David Vrabel
2014-08-01 14:59       ` [Xen-devel] " David Vrabel
2014-07-28 13:04   ` David Vrabel
2014-07-14 16:18 ` [PATCH v5 2/6] xen/pciback: Don't deadlock when unbinding Konrad Rzeszutek Wilk
2014-07-28 13:06   ` David Vrabel
2014-07-28 13:06   ` David Vrabel
2014-08-04 18:42     ` Konrad Rzeszutek Wilk
2014-08-05  9:27       ` [Xen-devel] " David Vrabel
2014-08-05  9:27       ` David Vrabel
2014-08-04 18:42     ` Konrad Rzeszutek Wilk
2014-07-14 16:18 ` Konrad Rzeszutek Wilk
2014-07-14 16:18 ` [PATCH v5 3/6] driver core: Provide an wrapper around the mutex to do lockdep warnings Konrad Rzeszutek Wilk
2014-07-14 17:39   ` Greg KH
2014-07-14 17:39   ` Greg KH
2014-07-14 16:18 ` Konrad Rzeszutek Wilk
2014-07-14 16:18 ` [PATCH v5 4/6] xen/pciback: Include the domain id if removing the device whilst still in use Konrad Rzeszutek Wilk
2014-07-14 16:18 ` Konrad Rzeszutek Wilk
2014-07-14 16:18 ` [PATCH v5 5/6] xen/pciback: Print out the domain owning the device Konrad Rzeszutek Wilk
2014-07-14 16:18 ` Konrad Rzeszutek Wilk
2014-07-14 16:18 ` [PATCH v5 6/6] xen/pciback: Remove tons of dereferences Konrad Rzeszutek Wilk
2014-07-14 16:18 ` Konrad Rzeszutek Wilk
2014-07-14 17:40 ` [PATCH v5] Fixes to Xen pciback for 3.17 Greg KH
2014-07-14 17:39   ` Konrad Rzeszutek Wilk
2014-07-14 17:39   ` Konrad Rzeszutek Wilk
2014-07-14 17:40 ` Greg KH
2014-08-01 15:30 ` David Vrabel
2014-08-04 18:43   ` Konrad Rzeszutek Wilk
2014-08-05  8:44     ` Sander Eikelenboom
2014-08-05  8:44     ` [Xen-devel] " Sander Eikelenboom
2014-08-05  9:31       ` David Vrabel
2014-08-05  9:31       ` [Xen-devel] " David Vrabel
2014-08-05  9:44         ` Sander Eikelenboom
2014-08-05  9:44         ` [Xen-devel] " Sander Eikelenboom
2014-08-05 13:49           ` Konrad Rzeszutek Wilk
2014-08-05 13:49           ` [Xen-devel] " Konrad Rzeszutek Wilk
2014-08-05 14:04             ` Sander Eikelenboom
2014-08-06 18:59               ` Sander Eikelenboom
2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
2014-08-06 19:25                   ` Sander Eikelenboom
2014-08-06 19:25                   ` [Xen-devel] " Sander Eikelenboom
2014-08-06 19:39                     ` Konrad Rzeszutek Wilk
2014-08-06 19:39                     ` [Xen-devel] " Konrad Rzeszutek Wilk
2014-08-06 19:47                       ` Sander Eikelenboom
2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
2014-08-06 20:17                           ` Sander Eikelenboom
2014-08-06 20:17                           ` [Xen-devel] " Sander Eikelenboom
2014-08-06 22:08                             ` Sander Eikelenboom
2014-08-06 22:08                             ` [Xen-devel] " Sander Eikelenboom
2014-08-06 20:09                         ` Konrad Rzeszutek Wilk
2014-08-06 19:47                       ` Sander Eikelenboom
2014-08-07  9:04                       ` [Xen-devel] " David Vrabel
2014-08-25 17:18                         ` Sander Eikelenboom
2014-08-25 17:18                         ` Sander Eikelenboom
2014-08-07  9:04                       ` David Vrabel
2014-08-06 19:18                 ` Konrad Rzeszutek Wilk
2014-08-06 18:59               ` Sander Eikelenboom
2014-08-05 14:04             ` Sander Eikelenboom
2014-08-05 14:33             ` Sander Eikelenboom
2014-08-05 14:33             ` [Xen-devel] " Sander Eikelenboom
2014-08-04 18:43   ` Konrad Rzeszutek Wilk
2014-08-01 15:30 ` David Vrabel
2014-07-14 16:18 Konrad Rzeszutek Wilk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.