All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
@ 2013-12-13 16:09 Konrad Rzeszutek Wilk
  2013-12-13 16:09 ` [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev Konrad Rzeszutek Wilk
                   ` (12 more replies)
  0 siblings, 13 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan

Hey,

While I was trying to narrow down the state of GPU passthrough
(still not finished) and figuring what needs to be done I realized
that Xen PCIback did not reset my GPU properly (when I crashed the
Windows guest by mistake). It does an FLR reset or Power one - if
the device supports it. But it seems that some of these GPUs
are liars and actually don't do the power part properly.

One way to fix this is by doing a slot (aka device) and bus reset.
Of course to do that - you need to make sure that all of the
functions of a device are under the ownership of xen-pciback.
Otherwise you might accidently reset your sound card while it is
being used.

These RFC patches cleanup pci back a bit and also make it possible
for Xen pciback to do the whole gamma of 'reset' for PCI devices:
FLR, power management, AER, slot and bus reset if neccessary.

Thanks go to Gordan Bobic for educating me on how to "reprogram"
and GFX460 in a Quardro 4000M and also reporting oddities when
a PCI device was reset but it looked like it was not reset.


 drivers/xen/xen-pciback/pci_stub.c | 142 +++++++++++++++++++++++++++++++------
 drivers/xen/xen-pciback/xenbus.c   |   5 +-
 2 files changed, 124 insertions(+), 23 deletions(-)


Konrad Rzeszutek Wilk (5):
      xen-pciback: Cleanup up pcistub_put_pci_dev
      xen-pciback: First reset, then free.
      xen-pciback: Document when we FLR an PCI device.
      xen/pciback: Move the FLR code to a function.
      xen/pciback: PCI reset slot or bus


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-16  9:19   ` [Xen-devel] " Jan Beulich
  2013-12-16  9:19   ` Jan Beulich
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

We are using both psdev->dev and dev and both are the same.
To keep it straight lets just use one - dev.

This will also make it easier in the patch:
"xen/pciback: When reconfiguring an PCIe device, FLR it"
and "xen/pciback: PCI reset slot or bus" to use it.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 62fcd48..5300a21 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -272,16 +272,16 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	 * and want to inhibit the user from fiddling with 'reset'
 	 */
 	pci_reset_function(dev);
-	pci_restore_state(psdev->dev);
+	pci_restore_state(dev);
 
 	/* This disables the device. */
-	xen_pcibk_reset_device(found_psdev->dev);
+	xen_pcibk_reset_device(dev);
 
 	/* And cleanup up our emulated fields. */
-	xen_pcibk_config_free_dyn_fields(found_psdev->dev);
-	xen_pcibk_config_reset_dev(found_psdev->dev);
+	xen_pcibk_config_free_dyn_fields(dev);
+	xen_pcibk_config_reset_dev(dev);
 
-	xen_unregister_device_domain_owner(found_psdev->dev);
+	xen_unregister_device_domain_owner(dev);
 
 	spin_lock_irqsave(&found_psdev->lock, flags);
 	found_psdev->pdev = NULL;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
  2013-12-13 16:09 ` [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-13 16:09 ` [RFC PATCH 2/5] xen-pciback: First reset, then free Konrad Rzeszutek Wilk
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

We are using both psdev->dev and dev and both are the same.
To keep it straight lets just use one - dev.

This will also make it easier in the patch:
"xen/pciback: When reconfiguring an PCIe device, FLR it"
and "xen/pciback: PCI reset slot or bus" to use it.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 62fcd48..5300a21 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -272,16 +272,16 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	 * and want to inhibit the user from fiddling with 'reset'
 	 */
 	pci_reset_function(dev);
-	pci_restore_state(psdev->dev);
+	pci_restore_state(dev);
 
 	/* This disables the device. */
-	xen_pcibk_reset_device(found_psdev->dev);
+	xen_pcibk_reset_device(dev);
 
 	/* And cleanup up our emulated fields. */
-	xen_pcibk_config_free_dyn_fields(found_psdev->dev);
-	xen_pcibk_config_reset_dev(found_psdev->dev);
+	xen_pcibk_config_free_dyn_fields(dev);
+	xen_pcibk_config_reset_dev(dev);
 
-	xen_unregister_device_domain_owner(found_psdev->dev);
+	xen_unregister_device_domain_owner(dev);
 
 	spin_lock_irqsave(&found_psdev->lock, flags);
 	found_psdev->pdev = NULL;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 2/5] xen-pciback: First reset, then free.
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (2 preceding siblings ...)
  2013-12-13 16:09 ` [RFC PATCH 2/5] xen-pciback: First reset, then free Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-16  9:23   ` [Xen-devel] " Jan Beulich
  2013-12-16  9:23   ` Jan Beulich
  2013-12-13 16:09 ` [RFC PATCH 3/5] xen-pciback: Document when we FLR an PCI device Konrad Rzeszutek Wilk
                   ` (8 subsequent siblings)
  12 siblings, 2 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

We were doing the operations of freeing and reset in the wrong
order. Granted nothing broke b/c the reset functions just
set bar->which = 0.

But nonethless this was incorrect.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 5300a21..36dd4f3 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -278,8 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	xen_pcibk_reset_device(dev);
 
 	/* And cleanup up our emulated fields. */
-	xen_pcibk_config_free_dyn_fields(dev);
 	xen_pcibk_config_reset_dev(dev);
+	xen_pcibk_config_free_dyn_fields(dev);
 
 	xen_unregister_device_domain_owner(dev);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 2/5] xen-pciback: First reset, then free.
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
  2013-12-13 16:09 ` [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev Konrad Rzeszutek Wilk
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

We were doing the operations of freeing and reset in the wrong
order. Granted nothing broke b/c the reset functions just
set bar->which = 0.

But nonethless this was incorrect.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 5300a21..36dd4f3 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -278,8 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	xen_pcibk_reset_device(dev);
 
 	/* And cleanup up our emulated fields. */
-	xen_pcibk_config_free_dyn_fields(dev);
 	xen_pcibk_config_reset_dev(dev);
+	xen_pcibk_config_free_dyn_fields(dev);
 
 	xen_unregister_device_domain_owner(dev);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 3/5] xen-pciback: Document when we FLR an PCI device.
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (4 preceding siblings ...)
  2013-12-13 16:09 ` [RFC PATCH 3/5] xen-pciback: Document when we FLR an PCI device Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-16  9:27   ` Jan Beulich
  2013-12-16  9:27   ` [Xen-devel] " Jan Beulich
  2013-12-13 16:09 ` [RFC PATCH 4/5] xen/pciback: Move the FLR code to a function Konrad Rzeszutek Wilk
                   ` (6 subsequent siblings)
  12 siblings, 2 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

When the toolstack wants us to drop or add an PCI device it
changes the XenBus state to Configuring - and as result of that
we find out which devices we should still be exporting out and
which ones not. For the ones we don't need anymore we need to
do an PCI reset so that it is ready for the next guest.

We are already doing it - but it was not clear _how_
it was done. This should make it more obvious.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 12 +-----------
 drivers/xen/xen-pciback/xenbus.c   |  5 ++++-
 2 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 36dd4f3..b6a856f 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -267,18 +267,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
+	pcistub_reset_pci_dev(dev);
 
-	/* This is OK - we are running from workqueue context
-	 * and want to inhibit the user from fiddling with 'reset'
-	 */
-	pci_reset_function(dev);
-	pci_restore_state(dev);
-
-	/* This disables the device. */
-	xen_pcibk_reset_device(dev);
-
-	/* And cleanup up our emulated fields. */
-	xen_pcibk_config_reset_dev(dev);
 	xen_pcibk_config_free_dyn_fields(dev);
 
 	xen_unregister_device_domain_owner(dev);
diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index a9ed867..301d1bc 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -93,6 +93,8 @@ static void free_pdev(struct xen_pcibk_device *pdev)
 
 	xen_pcibk_disconnect(pdev);
 
+	/* N.B. This calls pcistub_put_pci_dev which does the FLR on all
+	 * of the PCIe devices. */
 	xen_pcibk_release_devices(pdev);
 
 	dev_set_drvdata(&pdev->xdev->dev, NULL);
@@ -252,7 +254,6 @@ static int xen_pcibk_export_device(struct xen_pcibk_device *pdev,
 		xen_unregister_device_domain_owner(dev);
 		xen_register_device_domain_owner(dev, pdev->xdev->otherend_id);
 	}
-
 	/* TODO: It'd be nice to export a bridge and have all of its children
 	 * get exported with it. This may be best done in xend (which will
 	 * have to calculate resource usage anyway) but we probably want to
@@ -286,6 +287,8 @@ static int xen_pcibk_remove_device(struct xen_pcibk_device *pdev,
 	dev_dbg(&dev->dev, "unregistering for %d\n", pdev->xdev->otherend_id);
 	xen_unregister_device_domain_owner(dev);
 
+	/* N.B. This ends up calling pcistub_put_pci_dev which ends up
+	 * doing the FLR. */
 	xen_pcibk_release_pci_dev(pdev, dev);
 
 out:
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 3/5] xen-pciback: Document when we FLR an PCI device.
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (3 preceding siblings ...)
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

When the toolstack wants us to drop or add an PCI device it
changes the XenBus state to Configuring - and as result of that
we find out which devices we should still be exporting out and
which ones not. For the ones we don't need anymore we need to
do an PCI reset so that it is ready for the next guest.

We are already doing it - but it was not clear _how_
it was done. This should make it more obvious.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 12 +-----------
 drivers/xen/xen-pciback/xenbus.c   |  5 ++++-
 2 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 36dd4f3..b6a856f 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -267,18 +267,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
+	pcistub_reset_pci_dev(dev);
 
-	/* This is OK - we are running from workqueue context
-	 * and want to inhibit the user from fiddling with 'reset'
-	 */
-	pci_reset_function(dev);
-	pci_restore_state(dev);
-
-	/* This disables the device. */
-	xen_pcibk_reset_device(dev);
-
-	/* And cleanup up our emulated fields. */
-	xen_pcibk_config_reset_dev(dev);
 	xen_pcibk_config_free_dyn_fields(dev);
 
 	xen_unregister_device_domain_owner(dev);
diff --git a/drivers/xen/xen-pciback/xenbus.c b/drivers/xen/xen-pciback/xenbus.c
index a9ed867..301d1bc 100644
--- a/drivers/xen/xen-pciback/xenbus.c
+++ b/drivers/xen/xen-pciback/xenbus.c
@@ -93,6 +93,8 @@ static void free_pdev(struct xen_pcibk_device *pdev)
 
 	xen_pcibk_disconnect(pdev);
 
+	/* N.B. This calls pcistub_put_pci_dev which does the FLR on all
+	 * of the PCIe devices. */
 	xen_pcibk_release_devices(pdev);
 
 	dev_set_drvdata(&pdev->xdev->dev, NULL);
@@ -252,7 +254,6 @@ static int xen_pcibk_export_device(struct xen_pcibk_device *pdev,
 		xen_unregister_device_domain_owner(dev);
 		xen_register_device_domain_owner(dev, pdev->xdev->otherend_id);
 	}
-
 	/* TODO: It'd be nice to export a bridge and have all of its children
 	 * get exported with it. This may be best done in xend (which will
 	 * have to calculate resource usage anyway) but we probably want to
@@ -286,6 +287,8 @@ static int xen_pcibk_remove_device(struct xen_pcibk_device *pdev,
 	dev_dbg(&dev->dev, "unregistering for %d\n", pdev->xdev->otherend_id);
 	xen_unregister_device_domain_owner(dev);
 
+	/* N.B. This ends up calling pcistub_put_pci_dev which ends up
+	 * doing the FLR. */
 	xen_pcibk_release_pci_dev(pdev, dev);
 
 out:
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 4/5] xen/pciback: Move the FLR code to a function.
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (6 preceding siblings ...)
  2013-12-13 16:09 ` [RFC PATCH 4/5] xen/pciback: Move the FLR code to a function Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-16  9:28   ` Jan Beulich
  2013-12-16  9:28   ` [Xen-devel] " Jan Beulich
  2013-12-13 16:09 ` [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus Konrad Rzeszutek Wilk
                   ` (4 subsequent siblings)
  12 siblings, 2 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

Moving the bulk of the code its own function to aid
in making the 'xen/pciback: PCI reset slot or bus'
easier.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index b6a856f..4b450c5 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -242,6 +242,24 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
 	return found_dev;
 }
 
+void pcistub_reset_pci_dev(struct pci_dev *dev)
+{
+	/* This is OK - we are running from workqueue context
+	 * and want to inhibit the user from fiddling with 'reset'
+	 */
+
+	dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
+
+	pci_reset_function(dev);
+	pci_restore_state(dev);
+
+	/* This disables the device. */
+	xen_pcibk_reset_device(dev);
+
+	/* And cleanup up our emulated fields. */
+	xen_pcibk_config_reset_dev(dev);
+}
+
 void pcistub_put_pci_dev(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev, *found_psdev = NULL;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 4/5] xen/pciback: Move the FLR code to a function.
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (5 preceding siblings ...)
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

Moving the bulk of the code its own function to aid
in making the 'xen/pciback: PCI reset slot or bus'
easier.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index b6a856f..4b450c5 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -242,6 +242,24 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
 	return found_dev;
 }
 
+void pcistub_reset_pci_dev(struct pci_dev *dev)
+{
+	/* This is OK - we are running from workqueue context
+	 * and want to inhibit the user from fiddling with 'reset'
+	 */
+
+	dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
+
+	pci_reset_function(dev);
+	pci_restore_state(dev);
+
+	/* This disables the device. */
+	xen_pcibk_reset_device(dev);
+
+	/* And cleanup up our emulated fields. */
+	xen_pcibk_config_reset_dev(dev);
+}
+
 void pcistub_put_pci_dev(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev, *found_psdev = NULL;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (8 preceding siblings ...)
  2013-12-13 16:09 ` [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-13 16:18   ` Konrad Rzeszutek Wilk
                     ` (3 more replies)
  2013-12-13 16:52 ` [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Gordan Bobic
                   ` (2 subsequent siblings)
  12 siblings, 4 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

The life-cycle of a PCI device in Xen pciback is a bit complex.

It starts with the device being binded to us - for which
we do a device function reset.

If the device is unbinded from us - we also do a function
reset.

If the device is un-assigned from a guest - we do a function
reset.

All on the individual PCI function level.

Unfortunatly a function reset is not adequate for certain
PCIe devices. The reset for an individual PCI function "means
device must support FLR (PCIe or AF), PM reset on D3hot->D0
device specific reset, or be a singleton device on a bus
a secondary bus reset.  FLR does not have widespread support,
reset is not very reliable, and bus topology is dictated by the
and device design.  We need to provide a means for a user to
a bus reset in cases where the existing mechanisms are not
 or not reliable. " (Adam Williamson, 'vfio-pci: PCI hot reset
interface' commit 8b27ee60bfd6bbb84d2df28fa706c5c5081066ca).

As such to do a slot (so all of the functions on a device)
or a bus reset is complex - and is not exported to SysFS
(function reset is is via 'reset' parameter). This is due
to the complexity - we MUST know that the different functions
on a PCIe device are not in use, or if they are in use
(say only one of them) - if it is still OK to reset the slot.

This patch does that by doing an slot or bus reset (if
slot reset not supported) if all of the functions of a PCIe
device belong to Xen PCIback and are not in usage.

The reset is done when all of the functions of a device
are binded to Xen pciback. Or when a device is un-assigned
from a guest. We do this by having a 'completion' workqueue
on which the users of the PCI device wait. This workqueue
is started once the device has been 'binded' or de-assigned
from a guest.

In short - once an PCI device or its functions are under
the ownership of Xen PCIback they are reset. If they
are detached from a guest - they are reset. If they are
unbound from Xen pciback - they are reset.

Reported-by: Gordan Bobic <gordan@bobich.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 120 ++++++++++++++++++++++++++++++++-----
 1 file changed, 105 insertions(+), 15 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 4b450c5..bcc8733 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -47,6 +47,9 @@ struct pcistub_device {
 	struct list_head dev_list;
 	spinlock_t lock;
 
+	struct work_struct reset_work;
+	struct completion reset_done;
+
 	struct pci_dev *dev;
 	struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
 };
@@ -63,6 +66,7 @@ static LIST_HEAD(pcistub_devices);
 static int initialize_devices;
 static LIST_HEAD(seized_devices);
 
+static void pcistub_device_reset(struct work_struct *work);
 static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev;
@@ -81,6 +85,8 @@ static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 
 	kref_init(&psdev->kref);
 	spin_lock_init(&psdev->lock);
+	init_completion(&psdev->reset_done);
+	INIT_WORK(&psdev->reset_work, pcistub_device_reset);
 
 	return psdev;
 }
@@ -100,6 +106,9 @@ static void pcistub_device_release(struct kref *kref)
 
 	xen_unregister_device_domain_owner(dev);
 
+	/* If there was an FLR in progress, let it finish and join
+	 * it here. */
+	cancel_work_sync(&psdev->reset_work);
 	/* Call the reset function which does not take lock as this
 	 * is called from "unbind" which takes a device_lock mutex.
 	 */
@@ -219,6 +228,9 @@ struct pci_dev *pcistub_get_pci_dev_by_slot(struct xen_pcibk_device *pdev,
 	}
 
 	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+	if (found_dev)
+		wait_for_completion(&psdev->reset_done);
+
 	return found_dev;
 }
 
@@ -239,25 +251,93 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
 	}
 
 	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+	if (found_dev)
+		wait_for_completion(&psdev->reset_done);
+
 	return found_dev;
 }
 
 void pcistub_reset_pci_dev(struct pci_dev *dev)
 {
+	int slots = 0, inuse = 0;
+	unsigned long flags;
+	struct pci_dev *pci_dev;
+	struct pcistub_device *psdev;
+
 	/* This is OK - we are running from workqueue context
 	 * and want to inhibit the user from fiddling with 'reset'
 	 */
 
-	dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
+	dev_dbg(&dev->dev, "resetting (FLR, D3, bus, slot, etc) the device\n");
 
 	pci_reset_function(dev);
+
+	/* Don't do this on a bridge level. */
+	if (pci_is_root_bus(dev->bus))
+		return;
+
+	/* We expect X amount of slots (this would also find out
+	 * if we do not have all of the slots assigned to us).
+	 */
+	list_for_each_entry(pci_dev, &dev->bus->devices, bus_list)
+		slots++;
+
+	spin_lock_irqsave(&pcistub_devices_lock, flags);
+	/* Iterate over the existing devices to ascertain whether
+	 * all of them are under the bridge and not in use. */
+	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
+		if (!psdev->dev)
+			continue;
+
+		if (pci_domain_nr(psdev->dev->bus) == pci_domain_nr(dev->bus) &&
+		    psdev->dev->bus->number == dev->bus->number &&
+		    PCI_SLOT(psdev->dev->devfn) == PCI_SLOT(dev->devfn)) {
+			slots--;
+			/* Ignore ourselves in case hadn't cleaned up yet */
+			if (psdev->pdev && psdev->dev != dev)
+				inuse++;
+		}
+	}
+	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+	/* Slots should be zero (all slots under the bridge are
+	 * accounted for), and inuse should be zero (not assigned
+	 * to anybody). */
+	if (!slots && !inuse) {
+		int rc = 0, bus = 0;
+		list_for_each_entry(pci_dev, &dev->bus->devices, bus_list) {
+			dev_dbg(&pci_dev->dev, "resetting the slot device\n");
+			if (!pci_probe_reset_slot(pci_dev->slot))
+				rc = pci_reset_slot(pci_dev->slot);
+			else
+				bus = 1;
+			if (rc)
+				dev_info(&pci_dev->dev, "resetting slot failed with %d\n", rc);
+		}
+		if (bus && !pci_probe_reset_bus(dev->bus)) {
+			dev_dbg(&dev->bus->dev, "resetting the bus device\n");
+			rc = pci_reset_bus(dev->bus);
+			if (rc)
+				dev_info(&dev->bus->dev, "resetting bus failed with %d\n", rc);
+		}
+	}
+
 	pci_restore_state(dev);
 
 	/* This disables the device. */
 	xen_pcibk_reset_device(dev);
 
-	/* And cleanup up our emulated fields. */
-	xen_pcibk_config_reset_dev(dev);
+	xen_pcibk_config_free_dyn_fields(dev);
+}
+
+static void pcistub_device_reset(struct work_struct *work)
+{
+	struct pcistub_device *psdev = container_of(work, typeof(*psdev), reset_work);
+
+	pcistub_reset_pci_dev(psdev->dev);
+
+	/* Wakes up, if paying attention: pcistub_get_pci_dev_by_slot,
+	 * pcistub_get_pci_dev, and pcistub_put_pci_dev */
+	complete(&psdev->reset_done);
 }
 
 void pcistub_put_pci_dev(struct pci_dev *dev)
@@ -285,9 +365,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
-	pcistub_reset_pci_dev(dev);
-
-	xen_pcibk_config_free_dyn_fields(dev);
+	/* And cleanup up our emulated fields. */
+	xen_pcibk_config_reset_dev(dev);
 
 	xen_unregister_device_domain_owner(dev);
 
@@ -295,6 +374,11 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	found_psdev->pdev = NULL;
 	spin_unlock_irqrestore(&found_psdev->lock, flags);
 
+	schedule_work(&found_psdev->reset_work);
+
+	/* Wait .. wait */
+	wait_for_completion(&found_psdev->reset_done);
+
 	pcistub_device_put(found_psdev);
 	up_write(&pcistub_sem);
 }
@@ -402,16 +486,13 @@ static int pcistub_init_device(struct pci_dev *dev)
 	if (!dev_data->pci_saved_state)
 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
 	else {
-		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
+		dev_dbg(&dev->dev, "resetting (FLR, D4, etc) the device\n");
 		__pci_reset_function_locked(dev);
+
+		/* WE should figure out whether we can reset the bus. But
+		 * it is locked! (dev_bus)*/
 		pci_restore_state(dev);
 	}
-	/* Now disable the device (this also ensures some private device
-	 * data is setup before we export)
-	 */
-	dev_dbg(&dev->dev, "reset device\n");
-	xen_pcibk_reset_device(dev);
-
 	dev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
 	return 0;
 
@@ -455,8 +536,13 @@ static int __init pcistub_init_devices_late(void)
 
 		spin_lock_irqsave(&pcistub_devices_lock, flags);
 
-		if (psdev)
+		if (psdev) {
 			list_add_tail(&psdev->dev_list, &pcistub_devices);
+			spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+
+			schedule_work(&psdev->reset_work);
+			spin_lock_irqsave(&pcistub_devices_lock, flags);
+		}
 	}
 
 	initialize_devices = 1;
@@ -497,10 +583,13 @@ static int pcistub_seize(struct pci_dev *dev)
 
 	if (err)
 		pcistub_device_put(psdev);
+	else
+		schedule_work(&psdev->reset_work);
 
 	return err;
 }
 
+/* Called when 'bind' */
 static int pcistub_probe(struct pci_dev *dev, const struct pci_device_id *id)
 {
 	int err = 0;
@@ -528,6 +617,7 @@ out:
 	return err;
 }
 
+/* Called when 'unbind' */
 static void pcistub_remove(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev, *found_psdev = NULL;
@@ -1183,7 +1273,7 @@ static ssize_t pcistub_irq_handler_show(struct device_driver *drv, char *buf)
 			continue;
 		count +=
 		    scnprintf(buf + count, PAGE_SIZE - count,
-			      "%s:%s:%sing:%ld\n",
+			      "%s:%s:%sing:%ld:%s\n",
 			      pci_name(psdev->dev),
 			      dev_data->isr_on ? "on" : "off",
 			      dev_data->ack_intr ? "ack" : "not ack",
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (7 preceding siblings ...)
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
@ 2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan; +Cc: Konrad Rzeszutek Wilk

The life-cycle of a PCI device in Xen pciback is a bit complex.

It starts with the device being binded to us - for which
we do a device function reset.

If the device is unbinded from us - we also do a function
reset.

If the device is un-assigned from a guest - we do a function
reset.

All on the individual PCI function level.

Unfortunatly a function reset is not adequate for certain
PCIe devices. The reset for an individual PCI function "means
device must support FLR (PCIe or AF), PM reset on D3hot->D0
device specific reset, or be a singleton device on a bus
a secondary bus reset.  FLR does not have widespread support,
reset is not very reliable, and bus topology is dictated by the
and device design.  We need to provide a means for a user to
a bus reset in cases where the existing mechanisms are not
 or not reliable. " (Adam Williamson, 'vfio-pci: PCI hot reset
interface' commit 8b27ee60bfd6bbb84d2df28fa706c5c5081066ca).

As such to do a slot (so all of the functions on a device)
or a bus reset is complex - and is not exported to SysFS
(function reset is is via 'reset' parameter). This is due
to the complexity - we MUST know that the different functions
on a PCIe device are not in use, or if they are in use
(say only one of them) - if it is still OK to reset the slot.

This patch does that by doing an slot or bus reset (if
slot reset not supported) if all of the functions of a PCIe
device belong to Xen PCIback and are not in usage.

The reset is done when all of the functions of a device
are binded to Xen pciback. Or when a device is un-assigned
from a guest. We do this by having a 'completion' workqueue
on which the users of the PCI device wait. This workqueue
is started once the device has been 'binded' or de-assigned
from a guest.

In short - once an PCI device or its functions are under
the ownership of Xen PCIback they are reset. If they
are detached from a guest - they are reset. If they are
unbound from Xen pciback - they are reset.

Reported-by: Gordan Bobic <gordan@bobich.net>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-pciback/pci_stub.c | 120 ++++++++++++++++++++++++++++++++-----
 1 file changed, 105 insertions(+), 15 deletions(-)

diff --git a/drivers/xen/xen-pciback/pci_stub.c b/drivers/xen/xen-pciback/pci_stub.c
index 4b450c5..bcc8733 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -47,6 +47,9 @@ struct pcistub_device {
 	struct list_head dev_list;
 	spinlock_t lock;
 
+	struct work_struct reset_work;
+	struct completion reset_done;
+
 	struct pci_dev *dev;
 	struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
 };
@@ -63,6 +66,7 @@ static LIST_HEAD(pcistub_devices);
 static int initialize_devices;
 static LIST_HEAD(seized_devices);
 
+static void pcistub_device_reset(struct work_struct *work);
 static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev;
@@ -81,6 +85,8 @@ static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 
 	kref_init(&psdev->kref);
 	spin_lock_init(&psdev->lock);
+	init_completion(&psdev->reset_done);
+	INIT_WORK(&psdev->reset_work, pcistub_device_reset);
 
 	return psdev;
 }
@@ -100,6 +106,9 @@ static void pcistub_device_release(struct kref *kref)
 
 	xen_unregister_device_domain_owner(dev);
 
+	/* If there was an FLR in progress, let it finish and join
+	 * it here. */
+	cancel_work_sync(&psdev->reset_work);
 	/* Call the reset function which does not take lock as this
 	 * is called from "unbind" which takes a device_lock mutex.
 	 */
@@ -219,6 +228,9 @@ struct pci_dev *pcistub_get_pci_dev_by_slot(struct xen_pcibk_device *pdev,
 	}
 
 	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+	if (found_dev)
+		wait_for_completion(&psdev->reset_done);
+
 	return found_dev;
 }
 
@@ -239,25 +251,93 @@ struct pci_dev *pcistub_get_pci_dev(struct xen_pcibk_device *pdev,
 	}
 
 	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+	if (found_dev)
+		wait_for_completion(&psdev->reset_done);
+
 	return found_dev;
 }
 
 void pcistub_reset_pci_dev(struct pci_dev *dev)
 {
+	int slots = 0, inuse = 0;
+	unsigned long flags;
+	struct pci_dev *pci_dev;
+	struct pcistub_device *psdev;
+
 	/* This is OK - we are running from workqueue context
 	 * and want to inhibit the user from fiddling with 'reset'
 	 */
 
-	dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
+	dev_dbg(&dev->dev, "resetting (FLR, D3, bus, slot, etc) the device\n");
 
 	pci_reset_function(dev);
+
+	/* Don't do this on a bridge level. */
+	if (pci_is_root_bus(dev->bus))
+		return;
+
+	/* We expect X amount of slots (this would also find out
+	 * if we do not have all of the slots assigned to us).
+	 */
+	list_for_each_entry(pci_dev, &dev->bus->devices, bus_list)
+		slots++;
+
+	spin_lock_irqsave(&pcistub_devices_lock, flags);
+	/* Iterate over the existing devices to ascertain whether
+	 * all of them are under the bridge and not in use. */
+	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
+		if (!psdev->dev)
+			continue;
+
+		if (pci_domain_nr(psdev->dev->bus) == pci_domain_nr(dev->bus) &&
+		    psdev->dev->bus->number == dev->bus->number &&
+		    PCI_SLOT(psdev->dev->devfn) == PCI_SLOT(dev->devfn)) {
+			slots--;
+			/* Ignore ourselves in case hadn't cleaned up yet */
+			if (psdev->pdev && psdev->dev != dev)
+				inuse++;
+		}
+	}
+	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+	/* Slots should be zero (all slots under the bridge are
+	 * accounted for), and inuse should be zero (not assigned
+	 * to anybody). */
+	if (!slots && !inuse) {
+		int rc = 0, bus = 0;
+		list_for_each_entry(pci_dev, &dev->bus->devices, bus_list) {
+			dev_dbg(&pci_dev->dev, "resetting the slot device\n");
+			if (!pci_probe_reset_slot(pci_dev->slot))
+				rc = pci_reset_slot(pci_dev->slot);
+			else
+				bus = 1;
+			if (rc)
+				dev_info(&pci_dev->dev, "resetting slot failed with %d\n", rc);
+		}
+		if (bus && !pci_probe_reset_bus(dev->bus)) {
+			dev_dbg(&dev->bus->dev, "resetting the bus device\n");
+			rc = pci_reset_bus(dev->bus);
+			if (rc)
+				dev_info(&dev->bus->dev, "resetting bus failed with %d\n", rc);
+		}
+	}
+
 	pci_restore_state(dev);
 
 	/* This disables the device. */
 	xen_pcibk_reset_device(dev);
 
-	/* And cleanup up our emulated fields. */
-	xen_pcibk_config_reset_dev(dev);
+	xen_pcibk_config_free_dyn_fields(dev);
+}
+
+static void pcistub_device_reset(struct work_struct *work)
+{
+	struct pcistub_device *psdev = container_of(work, typeof(*psdev), reset_work);
+
+	pcistub_reset_pci_dev(psdev->dev);
+
+	/* Wakes up, if paying attention: pcistub_get_pci_dev_by_slot,
+	 * pcistub_get_pci_dev, and pcistub_put_pci_dev */
+	complete(&psdev->reset_done);
 }
 
 void pcistub_put_pci_dev(struct pci_dev *dev)
@@ -285,9 +365,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* Cleanup our device
 	 * (so it's ready for the next domain)
 	 */
-	pcistub_reset_pci_dev(dev);
-
-	xen_pcibk_config_free_dyn_fields(dev);
+	/* And cleanup up our emulated fields. */
+	xen_pcibk_config_reset_dev(dev);
 
 	xen_unregister_device_domain_owner(dev);
 
@@ -295,6 +374,11 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	found_psdev->pdev = NULL;
 	spin_unlock_irqrestore(&found_psdev->lock, flags);
 
+	schedule_work(&found_psdev->reset_work);
+
+	/* Wait .. wait */
+	wait_for_completion(&found_psdev->reset_done);
+
 	pcistub_device_put(found_psdev);
 	up_write(&pcistub_sem);
 }
@@ -402,16 +486,13 @@ static int pcistub_init_device(struct pci_dev *dev)
 	if (!dev_data->pci_saved_state)
 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
 	else {
-		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
+		dev_dbg(&dev->dev, "resetting (FLR, D4, etc) the device\n");
 		__pci_reset_function_locked(dev);
+
+		/* WE should figure out whether we can reset the bus. But
+		 * it is locked! (dev_bus)*/
 		pci_restore_state(dev);
 	}
-	/* Now disable the device (this also ensures some private device
-	 * data is setup before we export)
-	 */
-	dev_dbg(&dev->dev, "reset device\n");
-	xen_pcibk_reset_device(dev);
-
 	dev->dev_flags |= PCI_DEV_FLAGS_ASSIGNED;
 	return 0;
 
@@ -455,8 +536,13 @@ static int __init pcistub_init_devices_late(void)
 
 		spin_lock_irqsave(&pcistub_devices_lock, flags);
 
-		if (psdev)
+		if (psdev) {
 			list_add_tail(&psdev->dev_list, &pcistub_devices);
+			spin_unlock_irqrestore(&pcistub_devices_lock, flags);
+
+			schedule_work(&psdev->reset_work);
+			spin_lock_irqsave(&pcistub_devices_lock, flags);
+		}
 	}
 
 	initialize_devices = 1;
@@ -497,10 +583,13 @@ static int pcistub_seize(struct pci_dev *dev)
 
 	if (err)
 		pcistub_device_put(psdev);
+	else
+		schedule_work(&psdev->reset_work);
 
 	return err;
 }
 
+/* Called when 'bind' */
 static int pcistub_probe(struct pci_dev *dev, const struct pci_device_id *id)
 {
 	int err = 0;
@@ -528,6 +617,7 @@ out:
 	return err;
 }
 
+/* Called when 'unbind' */
 static void pcistub_remove(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev, *found_psdev = NULL;
@@ -1183,7 +1273,7 @@ static ssize_t pcistub_irq_handler_show(struct device_driver *drv, char *buf)
 			continue;
 		count +=
 		    scnprintf(buf + count, PAGE_SIZE - count,
-			      "%s:%s:%sing:%ld\n",
+			      "%s:%s:%sing:%ld:%s\n",
 			      pci_name(psdev->dev),
 			      dev_data->isr_on ? "on" : "off",
 			      dev_data->ack_intr ? "ack" : "not ack",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
@ 2013-12-13 16:18   ` Konrad Rzeszutek Wilk
  2013-12-13 16:18   ` Konrad Rzeszutek Wilk
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:18 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan

> +/* Called when 'bind' */
>  static int pcistub_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  {
>  	int err = 0;
> @@ -528,6 +617,7 @@ out:
>  	return err;
>  }
>  
> +/* Called when 'unbind' */
>  static void pcistub_remove(struct pci_dev *dev)
>  {
>  	struct pcistub_device *psdev, *found_psdev = NULL;
> @@ -1183,7 +1273,7 @@ static ssize_t pcistub_irq_handler_show(struct device_driver *drv, char *buf)
>  			continue;
>  		count +=
>  		    scnprintf(buf + count, PAGE_SIZE - count,
> -			      "%s:%s:%sing:%ld\n",
> +			      "%s:%s:%sing:%ld:%s\n",

Um.. clearly it is RFC :-)

>  			      pci_name(psdev->dev),
>  			      dev_data->isr_on ? "on" : "off",
>  			      dev_data->ack_intr ? "ack" : "not ack",
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-13 16:18   ` Konrad Rzeszutek Wilk
@ 2013-12-13 16:18   ` Konrad Rzeszutek Wilk
  2013-12-16 11:34   ` [Xen-devel] " David Vrabel
  2013-12-16 11:34   ` David Vrabel
  3 siblings, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:18 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan

> +/* Called when 'bind' */
>  static int pcistub_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  {
>  	int err = 0;
> @@ -528,6 +617,7 @@ out:
>  	return err;
>  }
>  
> +/* Called when 'unbind' */
>  static void pcistub_remove(struct pci_dev *dev)
>  {
>  	struct pcistub_device *psdev, *found_psdev = NULL;
> @@ -1183,7 +1273,7 @@ static ssize_t pcistub_irq_handler_show(struct device_driver *drv, char *buf)
>  			continue;
>  		count +=
>  		    scnprintf(buf + count, PAGE_SIZE - count,
> -			      "%s:%s:%sing:%ld\n",
> +			      "%s:%s:%sing:%ld:%s\n",

Um.. clearly it is RFC :-)

>  			      pci_name(psdev->dev),
>  			      dev_data->isr_on ? "on" : "off",
>  			      dev_data->ack_intr ? "ack" : "not ack",
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (9 preceding siblings ...)
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
@ 2013-12-13 16:52 ` Gordan Bobic
  2013-12-16 10:59 ` David Vrabel
  2013-12-16 10:59 ` [Xen-devel] " David Vrabel
  12 siblings, 0 replies; 40+ messages in thread
From: Gordan Bobic @ 2013-12-13 16:52 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, linux-kernel

On 12/13/2013 04:09 PM, Konrad Rzeszutek Wilk wrote:
> Hey,
>
> While I was trying to narrow down the state of GPU passthrough
> (still not finished) and figuring what needs to be done I realized
> that Xen PCIback did not reset my GPU properly (when I crashed the
> Windows guest by mistake). It does an FLR reset or Power one - if
> the device supports it. But it seems that some of these GPUs
> are liars and actually don't do the power part properly.

Some cards (IIRC Fermi based Nvidia cards) support neither the required 
power states nor FLR, although you can use setpci to force a powered off 
state on the card without an error being triggered. Unfortunately, it 
seems to be a null operation, and the card doesn't end up getting reset.

> One way to fix this is by doing a slot (aka device) and bus reset.
> Of course to do that - you need to make sure that all of the
> functions of a device are under the ownership of xen-pciback.
> Otherwise you might accidently reset your sound card while it is
> being used.
 >
> These RFC patches cleanup pci back a bit and also make it possible
> for Xen pciback to do the whole gamma of 'reset' for PCI devices:
> FLR, power management, AER, slot and bus reset if neccessary.
>
> Thanks go to Gordan Bobic for educating me on how to "reprogram"
> and GFX460 in a Quardro 4000M and also reporting oddities when
> a PCI device was reset but it looked like it was not reset.

You're welcome. :)

Gordan


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev
  2013-12-13 16:09 ` [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev Konrad Rzeszutek Wilk
@ 2013-12-16  9:19   ` Jan Beulich
  2013-12-16  9:19   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2013-12-16  9:19 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: gordan, xen-devel, linux-kernel

>>> On 13.12.13 at 17:09, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> We are using both psdev->dev and dev and both are the same.
> To keep it straight lets just use one - dev.

This should properly describe the patch, which otherwise at the
first glance looks wrong: You're also replacing found_psdev->dev.

> This will also make it easier in the patch:
> "xen/pciback: When reconfiguring an PCIe device, FLR it"
> and "xen/pciback: PCI reset slot or bus" to use it.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Apart from the above,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

> ---
>  drivers/xen/xen-pciback/pci_stub.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/xen/xen-pciback/pci_stub.c 
> b/drivers/xen/xen-pciback/pci_stub.c
> index 62fcd48..5300a21 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -272,16 +272,16 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	 * and want to inhibit the user from fiddling with 'reset'
>  	 */
>  	pci_reset_function(dev);
> -	pci_restore_state(psdev->dev);
> +	pci_restore_state(dev);
>  
>  	/* This disables the device. */
> -	xen_pcibk_reset_device(found_psdev->dev);
> +	xen_pcibk_reset_device(dev);
>  
>  	/* And cleanup up our emulated fields. */
> -	xen_pcibk_config_free_dyn_fields(found_psdev->dev);
> -	xen_pcibk_config_reset_dev(found_psdev->dev);
> +	xen_pcibk_config_free_dyn_fields(dev);
> +	xen_pcibk_config_reset_dev(dev);
>  
> -	xen_unregister_device_domain_owner(found_psdev->dev);
> +	xen_unregister_device_domain_owner(dev);
>  
>  	spin_lock_irqsave(&found_psdev->lock, flags);
>  	found_psdev->pdev = NULL;
> -- 
> 1.8.3.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev
  2013-12-13 16:09 ` [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev Konrad Rzeszutek Wilk
  2013-12-16  9:19   ` [Xen-devel] " Jan Beulich
@ 2013-12-16  9:19   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2013-12-16  9:19 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, gordan, linux-kernel

>>> On 13.12.13 at 17:09, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> We are using both psdev->dev and dev and both are the same.
> To keep it straight lets just use one - dev.

This should properly describe the patch, which otherwise at the
first glance looks wrong: You're also replacing found_psdev->dev.

> This will also make it easier in the patch:
> "xen/pciback: When reconfiguring an PCIe device, FLR it"
> and "xen/pciback: PCI reset slot or bus" to use it.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Apart from the above,
Reviewed-by: Jan Beulich <jbeulich@suse.com>

> ---
>  drivers/xen/xen-pciback/pci_stub.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/xen/xen-pciback/pci_stub.c 
> b/drivers/xen/xen-pciback/pci_stub.c
> index 62fcd48..5300a21 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -272,16 +272,16 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	 * and want to inhibit the user from fiddling with 'reset'
>  	 */
>  	pci_reset_function(dev);
> -	pci_restore_state(psdev->dev);
> +	pci_restore_state(dev);
>  
>  	/* This disables the device. */
> -	xen_pcibk_reset_device(found_psdev->dev);
> +	xen_pcibk_reset_device(dev);
>  
>  	/* And cleanup up our emulated fields. */
> -	xen_pcibk_config_free_dyn_fields(found_psdev->dev);
> -	xen_pcibk_config_reset_dev(found_psdev->dev);
> +	xen_pcibk_config_free_dyn_fields(dev);
> +	xen_pcibk_config_reset_dev(dev);
>  
> -	xen_unregister_device_domain_owner(found_psdev->dev);
> +	xen_unregister_device_domain_owner(dev);
>  
>  	spin_lock_irqsave(&found_psdev->lock, flags);
>  	found_psdev->pdev = NULL;
> -- 
> 1.8.3.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH 2/5] xen-pciback: First reset, then free.
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
@ 2013-12-16  9:23   ` Jan Beulich
  2013-12-16  9:23   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2013-12-16  9:23 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: gordan, xen-devel, linux-kernel

>>> On 13.12.13 at 17:09, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> We were doing the operations of freeing and reset in the wrong
> order. Granted nothing broke b/c the reset functions just
> set bar->which = 0.
> 
> But nonethless this was incorrect.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> ---
>  drivers/xen/xen-pciback/pci_stub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/xen/xen-pciback/pci_stub.c 
> b/drivers/xen/xen-pciback/pci_stub.c
> index 5300a21..36dd4f3 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -278,8 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	xen_pcibk_reset_device(dev);
>  
>  	/* And cleanup up our emulated fields. */
> -	xen_pcibk_config_free_dyn_fields(dev);
>  	xen_pcibk_config_reset_dev(dev);
> +	xen_pcibk_config_free_dyn_fields(dev);
>  
>  	xen_unregister_device_domain_owner(dev);
>  
> -- 
> 1.8.3.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 2/5] xen-pciback: First reset, then free.
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-16  9:23   ` [Xen-devel] " Jan Beulich
@ 2013-12-16  9:23   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2013-12-16  9:23 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, gordan, linux-kernel

>>> On 13.12.13 at 17:09, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> We were doing the operations of freeing and reset in the wrong
> order. Granted nothing broke b/c the reset functions just
> set bar->which = 0.
> 
> But nonethless this was incorrect.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> ---
>  drivers/xen/xen-pciback/pci_stub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/xen/xen-pciback/pci_stub.c 
> b/drivers/xen/xen-pciback/pci_stub.c
> index 5300a21..36dd4f3 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -278,8 +278,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	xen_pcibk_reset_device(dev);
>  
>  	/* And cleanup up our emulated fields. */
> -	xen_pcibk_config_free_dyn_fields(dev);
>  	xen_pcibk_config_reset_dev(dev);
> +	xen_pcibk_config_free_dyn_fields(dev);
>  
>  	xen_unregister_device_domain_owner(dev);
>  
> -- 
> 1.8.3.1
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH 3/5] xen-pciback: Document when we FLR an PCI device.
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-16  9:27   ` Jan Beulich
@ 2013-12-16  9:27   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2013-12-16  9:27 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: gordan, xen-devel, linux-kernel

>>> On 13.12.13 at 17:09, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> When the toolstack wants us to drop or add an PCI device it
> changes the XenBus state to Configuring - and as result of that
> we find out which devices we should still be exporting out and
> which ones not. For the ones we don't need anymore we need to
> do an PCI reset so that it is ready for the next guest.
> 
> We are already doing it - but it was not clear _how_
> it was done. This should make it more obvious.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>...
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -267,18 +267,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	/* Cleanup our device
>  	 * (so it's ready for the next domain)
>  	 */
> +	pcistub_reset_pci_dev(dev);
>  
> -	/* This is OK - we are running from workqueue context
> -	 * and want to inhibit the user from fiddling with 'reset'
> -	 */
> -	pci_reset_function(dev);
> -	pci_restore_state(dev);
> -
> -	/* This disables the device. */
> -	xen_pcibk_reset_device(dev);
> -
> -	/* And cleanup up our emulated fields. */
> -	xen_pcibk_config_reset_dev(dev);
>  	xen_pcibk_config_free_dyn_fields(dev);
>  
>  	xen_unregister_device_domain_owner(dev);

Does this really belong here? It's neither covered by the description,
nor can I see where pcistub_reset_pci_dev() would be defined (i.e.
afaict this ought to yield a build or load time error).

Jan


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 3/5] xen-pciback: Document when we FLR an PCI device.
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
@ 2013-12-16  9:27   ` Jan Beulich
  2013-12-16  9:27   ` [Xen-devel] " Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2013-12-16  9:27 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, gordan, linux-kernel

>>> On 13.12.13 at 17:09, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> When the toolstack wants us to drop or add an PCI device it
> changes the XenBus state to Configuring - and as result of that
> we find out which devices we should still be exporting out and
> which ones not. For the ones we don't need anymore we need to
> do an PCI reset so that it is ready for the next guest.
> 
> We are already doing it - but it was not clear _how_
> it was done. This should make it more obvious.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>...
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -267,18 +267,8 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	/* Cleanup our device
>  	 * (so it's ready for the next domain)
>  	 */
> +	pcistub_reset_pci_dev(dev);
>  
> -	/* This is OK - we are running from workqueue context
> -	 * and want to inhibit the user from fiddling with 'reset'
> -	 */
> -	pci_reset_function(dev);
> -	pci_restore_state(dev);
> -
> -	/* This disables the device. */
> -	xen_pcibk_reset_device(dev);
> -
> -	/* And cleanup up our emulated fields. */
> -	xen_pcibk_config_reset_dev(dev);
>  	xen_pcibk_config_free_dyn_fields(dev);
>  
>  	xen_unregister_device_domain_owner(dev);

Does this really belong here? It's neither covered by the description,
nor can I see where pcistub_reset_pci_dev() would be defined (i.e.
afaict this ought to yield a build or load time error).

Jan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH 4/5] xen/pciback: Move the FLR code to a function.
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-16  9:28   ` Jan Beulich
@ 2013-12-16  9:28   ` Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2013-12-16  9:28 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: gordan, xen-devel, linux-kernel

>>> On 13.12.13 at 17:09, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> +void pcistub_reset_pci_dev(struct pci_dev *dev)
> +{
> +	/* This is OK - we are running from workqueue context
> +	 * and want to inhibit the user from fiddling with 'reset'
> +	 */
> +
> +	dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
> +
> +	pci_reset_function(dev);
> +	pci_restore_state(dev);
> +
> +	/* This disables the device. */
> +	xen_pcibk_reset_device(dev);
> +
> +	/* And cleanup up our emulated fields. */
> +	xen_pcibk_config_reset_dev(dev);
> +}

Ah, here it comes. Improperly split series then...

Jan


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 4/5] xen/pciback: Move the FLR code to a function.
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
@ 2013-12-16  9:28   ` Jan Beulich
  2013-12-16  9:28   ` [Xen-devel] " Jan Beulich
  1 sibling, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2013-12-16  9:28 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, gordan, linux-kernel

>>> On 13.12.13 at 17:09, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> +void pcistub_reset_pci_dev(struct pci_dev *dev)
> +{
> +	/* This is OK - we are running from workqueue context
> +	 * and want to inhibit the user from fiddling with 'reset'
> +	 */
> +
> +	dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
> +
> +	pci_reset_function(dev);
> +	pci_restore_state(dev);
> +
> +	/* This disables the device. */
> +	xen_pcibk_reset_device(dev);
> +
> +	/* And cleanup up our emulated fields. */
> +	xen_pcibk_config_reset_dev(dev);
> +}

Ah, here it comes. Improperly split series then...

Jan

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (11 preceding siblings ...)
  2013-12-16 10:59 ` David Vrabel
@ 2013-12-16 10:59 ` David Vrabel
  2013-12-16 14:35   ` Konrad Rzeszutek Wilk
  2013-12-16 14:35   ` Konrad Rzeszutek Wilk
  12 siblings, 2 replies; 40+ messages in thread
From: David Vrabel @ 2013-12-16 10:59 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, linux-kernel, gordan

On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> Hey,
> 
> While I was trying to narrow down the state of GPU passthrough
> (still not finished) and figuring what needs to be done I realized
> that Xen PCIback did not reset my GPU properly (when I crashed the
> Windows guest by mistake). It does an FLR reset or Power one - if
> the device supports it. But it seems that some of these GPUs
> are liars and actually don't do the power part properly.

In my experience the devices do not lie.  They correctly report that
they do not perform a reset in D3hot.

Here's the patch I'm using to solve this.  It does something similar.
i.e., a SBR if all devices on that bus are safe to be reset.

I prefer it because it provides the standard 'reset' sysfs file that the
toolstack/userspace can use.

It does have some limitations:  a) It does not check whether a device is
in use (only if it is bound to pciback); and b) it hand rolls
pci_slot_reset() (because it didn't exist at the time).

diff --git a/drivers/xen/xen-pciback/pci_stub.c
b/drivers/xen/xen-pciback/pci_stub.c
index 4e8ba38..5a03e63 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -14,6 +14,7 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 #include <linux/atomic.h>
+#include <linux/delay.h>
 #include <xen/events.h>
 #include <asm/xen/pci.h>
 #include <asm/xen/hypervisor.h>
@@ -43,6 +44,7 @@ struct pcistub_device {
 	struct kref kref;
 	struct list_head dev_list;
 	spinlock_t lock;
+	bool created_reset_file;

 	struct pci_dev *dev;
 	struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
@@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
 static int initialize_devices;
 static LIST_HEAD(seized_devices);

+/*
+ * pci_reset_function() will only work if there is a mechanism to
+ * reset that single function (e.g., FLR or a D-state transition).
+ * For PCI hardware that has two or more functions but no per-function
+ * reset, we can do a bus reset iff all the functions are co-assigned
+ * to the same domain.
+ *
+ * If a function has no per-function reset mechanism the 'reset' sysfs
+ * file that the toolstack uses to reset a function prior to assigning
+ * the device will be missing.  In this case, pciback adds its own
+ * which will try a bus reset.
+ *
+ * Note: pciback does not check for co-assigment before doing a bus
+ * reset, only that the devices are bound to pciback.  The toolstack
+ * is assumed to have done the right thing.
+ */
+static int __pcistub_reset_function(struct pci_dev *dev)
+{
+	struct pci_dev *pdev;
+	u16 ctrl;
+	int ret;
+
+	ret = __pci_reset_function_locked(dev);
+	if (ret == 0)
+		return 0;
+
+	if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
+		return -ENOTTY;
+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
+		if (pdev != dev && (!pdev->driver
+				    || strcmp(pdev->driver->name, "pciback")))
+			return -ENOTTY;
+		pci_save_state(pdev);
+	}
+
+	pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
+	ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
+	msleep(200);
+
+	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
+	msleep(200);
+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
+		pci_restore_state(pdev);
+
+	return 0;
+}
+
+static int pcistub_reset_function(struct pci_dev *dev)
+{
+	int ret;
+
+	device_lock(&dev->dev);
+	ret = __pcistub_reset_function(dev);
+	device_unlock(&dev->dev);
+
+	return ret;
+}
+
+static ssize_t pcistub_reset_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	unsigned long val;
+	ssize_t result = strict_strtoul(buf, 0, &val);
+
+	if (result < 0)
+		return result;
+
+	if (val != 1)
+		return -EINVAL;
+
+	result = pcistub_reset_function(pdev);
+	if (result < 0)
+		return result;
+	return count;
+}
+static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
+
+static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
+{
+	struct device *dev = &psdev->dev->dev;
+	struct sysfs_dirent *reset_dirent;
+	int ret;
+
+	reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
+	if (reset_dirent) {
+		sysfs_put(reset_dirent);
+		return 0;
+	}
+
+	ret = device_create_file(dev, &dev_attr_reset);
+	if (ret < 0)
+		return ret;
+	psdev->created_reset_file = true;
+	return 0;
+}
+
+static void pcistub_remove_reset_file(struct pcistub_device *psdev)
+{
+	if (psdev && psdev->created_reset_file)
+		device_remove_file(&psdev->dev->dev, &dev_attr_reset);
+}
+
 static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev;
@@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)

 	dev_dbg(&dev->dev, "pcistub_device_release\n");

+	pcistub_remove_reset_file(psdev);
+
 	xen_unregister_device_domain_owner(dev);

 	/* Call the reset function which does not take lock as this
 	 * is called from "unbind" which takes a device_lock mutex.
 	 */
-	__pci_reset_function_locked(dev);
+	__pcistub_reset_function(psdev->dev);
+
 	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
 		dev_dbg(&dev->dev, "Could not reload PCI state\n");
 	else
@@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* This is OK - we are running from workqueue context
 	 * and want to inhibit the user from fiddling with 'reset'
 	 */
-	pci_reset_function(dev);
+	pcistub_reset_function(psdev->dev);
 	pci_restore_state(psdev->dev);

 	/* This disables the device. */
@@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
 	else {
 		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
-		__pci_reset_function_locked(dev);
+		__pcistub_reset_function(dev);
 		pci_restore_state(dev);
 	}
 	/* Now disable the device (this also ensures some private device
@@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
 	if (!psdev)
 		return -ENOMEM;

+	err = pcistub_try_create_reset_file(psdev);
+	if (err < 0)
+		goto out;
+
 	spin_lock_irqsave(&pcistub_devices_lock, flags);

 	if (initialize_devices) {
@@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
 	}

 	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
-
+out:
 	if (err)
 		pcistub_device_put(psdev);
-
 	return err;
 }


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
                   ` (10 preceding siblings ...)
  2013-12-13 16:52 ` [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Gordan Bobic
@ 2013-12-16 10:59 ` David Vrabel
  2013-12-16 10:59 ` [Xen-devel] " David Vrabel
  12 siblings, 0 replies; 40+ messages in thread
From: David Vrabel @ 2013-12-16 10:59 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, gordan, linux-kernel

On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> Hey,
> 
> While I was trying to narrow down the state of GPU passthrough
> (still not finished) and figuring what needs to be done I realized
> that Xen PCIback did not reset my GPU properly (when I crashed the
> Windows guest by mistake). It does an FLR reset or Power one - if
> the device supports it. But it seems that some of these GPUs
> are liars and actually don't do the power part properly.

In my experience the devices do not lie.  They correctly report that
they do not perform a reset in D3hot.

Here's the patch I'm using to solve this.  It does something similar.
i.e., a SBR if all devices on that bus are safe to be reset.

I prefer it because it provides the standard 'reset' sysfs file that the
toolstack/userspace can use.

It does have some limitations:  a) It does not check whether a device is
in use (only if it is bound to pciback); and b) it hand rolls
pci_slot_reset() (because it didn't exist at the time).

diff --git a/drivers/xen/xen-pciback/pci_stub.c
b/drivers/xen/xen-pciback/pci_stub.c
index 4e8ba38..5a03e63 100644
--- a/drivers/xen/xen-pciback/pci_stub.c
+++ b/drivers/xen/xen-pciback/pci_stub.c
@@ -14,6 +14,7 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 #include <linux/atomic.h>
+#include <linux/delay.h>
 #include <xen/events.h>
 #include <asm/xen/pci.h>
 #include <asm/xen/hypervisor.h>
@@ -43,6 +44,7 @@ struct pcistub_device {
 	struct kref kref;
 	struct list_head dev_list;
 	spinlock_t lock;
+	bool created_reset_file;

 	struct pci_dev *dev;
 	struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
@@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
 static int initialize_devices;
 static LIST_HEAD(seized_devices);

+/*
+ * pci_reset_function() will only work if there is a mechanism to
+ * reset that single function (e.g., FLR or a D-state transition).
+ * For PCI hardware that has two or more functions but no per-function
+ * reset, we can do a bus reset iff all the functions are co-assigned
+ * to the same domain.
+ *
+ * If a function has no per-function reset mechanism the 'reset' sysfs
+ * file that the toolstack uses to reset a function prior to assigning
+ * the device will be missing.  In this case, pciback adds its own
+ * which will try a bus reset.
+ *
+ * Note: pciback does not check for co-assigment before doing a bus
+ * reset, only that the devices are bound to pciback.  The toolstack
+ * is assumed to have done the right thing.
+ */
+static int __pcistub_reset_function(struct pci_dev *dev)
+{
+	struct pci_dev *pdev;
+	u16 ctrl;
+	int ret;
+
+	ret = __pci_reset_function_locked(dev);
+	if (ret == 0)
+		return 0;
+
+	if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
+		return -ENOTTY;
+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
+		if (pdev != dev && (!pdev->driver
+				    || strcmp(pdev->driver->name, "pciback")))
+			return -ENOTTY;
+		pci_save_state(pdev);
+	}
+
+	pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
+	ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
+	msleep(200);
+
+	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
+	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
+	msleep(200);
+
+	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
+		pci_restore_state(pdev);
+
+	return 0;
+}
+
+static int pcistub_reset_function(struct pci_dev *dev)
+{
+	int ret;
+
+	device_lock(&dev->dev);
+	ret = __pcistub_reset_function(dev);
+	device_unlock(&dev->dev);
+
+	return ret;
+}
+
+static ssize_t pcistub_reset_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	unsigned long val;
+	ssize_t result = strict_strtoul(buf, 0, &val);
+
+	if (result < 0)
+		return result;
+
+	if (val != 1)
+		return -EINVAL;
+
+	result = pcistub_reset_function(pdev);
+	if (result < 0)
+		return result;
+	return count;
+}
+static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
+
+static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
+{
+	struct device *dev = &psdev->dev->dev;
+	struct sysfs_dirent *reset_dirent;
+	int ret;
+
+	reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
+	if (reset_dirent) {
+		sysfs_put(reset_dirent);
+		return 0;
+	}
+
+	ret = device_create_file(dev, &dev_attr_reset);
+	if (ret < 0)
+		return ret;
+	psdev->created_reset_file = true;
+	return 0;
+}
+
+static void pcistub_remove_reset_file(struct pcistub_device *psdev)
+{
+	if (psdev && psdev->created_reset_file)
+		device_remove_file(&psdev->dev->dev, &dev_attr_reset);
+}
+
 static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
 {
 	struct pcistub_device *psdev;
@@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)

 	dev_dbg(&dev->dev, "pcistub_device_release\n");

+	pcistub_remove_reset_file(psdev);
+
 	xen_unregister_device_domain_owner(dev);

 	/* Call the reset function which does not take lock as this
 	 * is called from "unbind" which takes a device_lock mutex.
 	 */
-	__pci_reset_function_locked(dev);
+	__pcistub_reset_function(psdev->dev);
+
 	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
 		dev_dbg(&dev->dev, "Could not reload PCI state\n");
 	else
@@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
 	/* This is OK - we are running from workqueue context
 	 * and want to inhibit the user from fiddling with 'reset'
 	 */
-	pci_reset_function(dev);
+	pcistub_reset_function(psdev->dev);
 	pci_restore_state(psdev->dev);

 	/* This disables the device. */
@@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
 		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
 	else {
 		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
-		__pci_reset_function_locked(dev);
+		__pcistub_reset_function(dev);
 		pci_restore_state(dev);
 	}
 	/* Now disable the device (this also ensures some private device
@@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
 	if (!psdev)
 		return -ENOMEM;

+	err = pcistub_try_create_reset_file(psdev);
+	if (err < 0)
+		goto out;
+
 	spin_lock_irqsave(&pcistub_devices_lock, flags);

 	if (initialize_devices) {
@@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
 	}

 	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
-
+out:
 	if (err)
 		pcistub_device_put(psdev);
-
 	return err;
 }

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
  2013-12-13 16:18   ` Konrad Rzeszutek Wilk
  2013-12-13 16:18   ` Konrad Rzeszutek Wilk
@ 2013-12-16 11:34   ` David Vrabel
  2013-12-16 14:39     ` Konrad Rzeszutek Wilk
  2013-12-16 14:39     ` [Xen-devel] " Konrad Rzeszutek Wilk
  2013-12-16 11:34   ` David Vrabel
  3 siblings, 2 replies; 40+ messages in thread
From: David Vrabel @ 2013-12-16 11:34 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, linux-kernel, gordan

On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> The life-cycle of a PCI device in Xen pciback is a bit complex.
> 
> It starts with the device being binded to us - for which
> we do a device function reset.
> 
> If the device is unbinded from us - we also do a function
> reset.

Spelling: bound and unbound.

> The reset is done when all of the functions of a device
> are binded to Xen pciback. Or when a device is un-assigned
> from a guest. We do this by having a 'completion' workqueue
> on which the users of the PCI device wait. This workqueue
> is started once the device has been 'binded' or de-assigned
> from a guest.

The use of a work item and a completion baffles me.  What problem does
this solve?

> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
[...]
> +	/* We expect X amount of slots (this would also find out
> +	 * if we do not have all of the slots assigned to us).
> +	 */
> +	list_for_each_entry(pci_dev, &dev->bus->devices, bus_list)
> +		slots++;
> +
> +	spin_lock_irqsave(&pcistub_devices_lock, flags);
> +	/* Iterate over the existing devices to ascertain whether
> +	 * all of them are under the bridge and not in use. */
> +	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
> +		if (!psdev->dev)
> +			continue;
> +
> +		if (pci_domain_nr(psdev->dev->bus) == pci_domain_nr(dev->bus) &&
> +		    psdev->dev->bus->number == dev->bus->number &&
> +		    PCI_SLOT(psdev->dev->devfn) == PCI_SLOT(dev->devfn)) {
> +			slots--;
> +			/* Ignore ourselves in case hadn't cleaned up yet */
> +			if (psdev->pdev && psdev->dev != dev)
> +				inuse++;
> +		}
> +	}

This check looks broken.  A device added to pciback but still bound to
another driver will be considered as safe to reset.

I think you want something like:

list_for_each_entry(pdev, &dev->bus->devices, bus_list)
{
    if (pdev != dev && pdev->driver
        && pdev->driver != xen_pcibk_pci_driver))
        return -ENOTTY;
}

It is safe to reset unbound devices (even if they're not (or intended)
to be available to pciback).

It is also possible in the above loop if slot reset is supported to
ignore sibling devices that are on different slots.

> +	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> +	/* Slots should be zero (all slots under the bridge are
> +	 * accounted for), and inuse should be zero (not assigned
> +	 * to anybody). */
> +	if (!slots && !inuse) {
> +		int rc = 0, bus = 0;
> +		list_for_each_entry(pci_dev, &dev->bus->devices, bus_list) {
> +			dev_dbg(&pci_dev->dev, "resetting the slot device\n");
> +			if (!pci_probe_reset_slot(pci_dev->slot))
> +				rc = pci_reset_slot(pci_dev->slot);
> +			else
> +				bus = 1;
> +			if (rc)
> +				dev_info(&pci_dev->dev, "resetting slot failed with %d\n", rc);
> +		}

Why are you resetting every slot on the bus?  You only need to reset the
slot that the device is on.

Take a look at the vfio-pci driver.  It does this bus/slot reset choice
in a much more straightforward way.

David

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus
  2013-12-13 16:09 ` Konrad Rzeszutek Wilk
                     ` (2 preceding siblings ...)
  2013-12-16 11:34   ` [Xen-devel] " David Vrabel
@ 2013-12-16 11:34   ` David Vrabel
  3 siblings, 0 replies; 40+ messages in thread
From: David Vrabel @ 2013-12-16 11:34 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, gordan, linux-kernel

On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> The life-cycle of a PCI device in Xen pciback is a bit complex.
> 
> It starts with the device being binded to us - for which
> we do a device function reset.
> 
> If the device is unbinded from us - we also do a function
> reset.

Spelling: bound and unbound.

> The reset is done when all of the functions of a device
> are binded to Xen pciback. Or when a device is un-assigned
> from a guest. We do this by having a 'completion' workqueue
> on which the users of the PCI device wait. This workqueue
> is started once the device has been 'binded' or de-assigned
> from a guest.

The use of a work item and a completion baffles me.  What problem does
this solve?

> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
[...]
> +	/* We expect X amount of slots (this would also find out
> +	 * if we do not have all of the slots assigned to us).
> +	 */
> +	list_for_each_entry(pci_dev, &dev->bus->devices, bus_list)
> +		slots++;
> +
> +	spin_lock_irqsave(&pcistub_devices_lock, flags);
> +	/* Iterate over the existing devices to ascertain whether
> +	 * all of them are under the bridge and not in use. */
> +	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
> +		if (!psdev->dev)
> +			continue;
> +
> +		if (pci_domain_nr(psdev->dev->bus) == pci_domain_nr(dev->bus) &&
> +		    psdev->dev->bus->number == dev->bus->number &&
> +		    PCI_SLOT(psdev->dev->devfn) == PCI_SLOT(dev->devfn)) {
> +			slots--;
> +			/* Ignore ourselves in case hadn't cleaned up yet */
> +			if (psdev->pdev && psdev->dev != dev)
> +				inuse++;
> +		}
> +	}

This check looks broken.  A device added to pciback but still bound to
another driver will be considered as safe to reset.

I think you want something like:

list_for_each_entry(pdev, &dev->bus->devices, bus_list)
{
    if (pdev != dev && pdev->driver
        && pdev->driver != xen_pcibk_pci_driver))
        return -ENOTTY;
}

It is safe to reset unbound devices (even if they're not (or intended)
to be available to pciback).

It is also possible in the above loop if slot reset is supported to
ignore sibling devices that are on different slots.

> +	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> +	/* Slots should be zero (all slots under the bridge are
> +	 * accounted for), and inuse should be zero (not assigned
> +	 * to anybody). */
> +	if (!slots && !inuse) {
> +		int rc = 0, bus = 0;
> +		list_for_each_entry(pci_dev, &dev->bus->devices, bus_list) {
> +			dev_dbg(&pci_dev->dev, "resetting the slot device\n");
> +			if (!pci_probe_reset_slot(pci_dev->slot))
> +				rc = pci_reset_slot(pci_dev->slot);
> +			else
> +				bus = 1;
> +			if (rc)
> +				dev_info(&pci_dev->dev, "resetting slot failed with %d\n", rc);
> +		}

Why are you resetting every slot on the bus?  You only need to reset the
slot that the device is on.

Take a look at the vfio-pci driver.  It does this bus/slot reset choice
in a much more straightforward way.

David

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 10:59 ` [Xen-devel] " David Vrabel
@ 2013-12-16 14:35   ` Konrad Rzeszutek Wilk
  2013-12-16 15:23     ` Sander Eikelenboom
  2013-12-16 15:23     ` Sander Eikelenboom
  2013-12-16 14:35   ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-16 14:35 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, linux-kernel, gordan

On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> > Hey,
> > 
> > While I was trying to narrow down the state of GPU passthrough
> > (still not finished) and figuring what needs to be done I realized
> > that Xen PCIback did not reset my GPU properly (when I crashed the
> > Windows guest by mistake). It does an FLR reset or Power one - if
> > the device supports it. But it seems that some of these GPUs
> > are liars and actually don't do the power part properly.
> 
> In my experience the devices do not lie.  They correctly report that
> they do not perform a reset in D3hot.
> 
> Here's the patch I'm using to solve this.  It does something similar.
> i.e., a SBR if all devices on that bus are safe to be reset.
> 
> I prefer it because it provides the standard 'reset' sysfs file that the
> toolstack/userspace can use.

We can still add the 'reset' to SysFS
> 
> It does have some limitations:  a) It does not check whether a device is
> in use (only if it is bound to pciback); and b) it hand rolls
> pci_slot_reset() (because it didn't exist at the time).

.. which can have those limiations removed and be based on this patchset.
Meaning it won't do a bus-reset or device reset if the rest of the devices
are _not_ assigned to pciback.

> 
> diff --git a/drivers/xen/xen-pciback/pci_stub.c
> b/drivers/xen/xen-pciback/pci_stub.c
> index 4e8ba38..5a03e63 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -14,6 +14,7 @@
>  #include <linux/wait.h>
>  #include <linux/sched.h>
>  #include <linux/atomic.h>
> +#include <linux/delay.h>
>  #include <xen/events.h>
>  #include <asm/xen/pci.h>
>  #include <asm/xen/hypervisor.h>
> @@ -43,6 +44,7 @@ struct pcistub_device {
>  	struct kref kref;
>  	struct list_head dev_list;
>  	spinlock_t lock;
> +	bool created_reset_file;
> 
>  	struct pci_dev *dev;
>  	struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
>  static int initialize_devices;
>  static LIST_HEAD(seized_devices);
> 
> +/*
> + * pci_reset_function() will only work if there is a mechanism to
> + * reset that single function (e.g., FLR or a D-state transition).
> + * For PCI hardware that has two or more functions but no per-function
> + * reset, we can do a bus reset iff all the functions are co-assigned
> + * to the same domain.
> + *
> + * If a function has no per-function reset mechanism the 'reset' sysfs
> + * file that the toolstack uses to reset a function prior to assigning
> + * the device will be missing.  In this case, pciback adds its own
> + * which will try a bus reset.
> + *
> + * Note: pciback does not check for co-assigment before doing a bus
> + * reset, only that the devices are bound to pciback.  The toolstack
> + * is assumed to have done the right thing.
> + */
> +static int __pcistub_reset_function(struct pci_dev *dev)
> +{
> +	struct pci_dev *pdev;
> +	u16 ctrl;
> +	int ret;
> +
> +	ret = __pci_reset_function_locked(dev);
> +	if (ret == 0)
> +		return 0;
> +
> +	if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
> +		return -ENOTTY;
> +
> +	list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
> +		if (pdev != dev && (!pdev->driver
> +				    || strcmp(pdev->driver->name, "pciback")))
> +			return -ENOTTY;
> +		pci_save_state(pdev);
> +	}
> +
> +	pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
> +	ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
> +	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
> +	msleep(200);
> +
> +	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> +	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
> +	msleep(200);
> +
> +	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> +		pci_restore_state(pdev);
> +
> +	return 0;
> +}
> +
> +static int pcistub_reset_function(struct pci_dev *dev)
> +{
> +	int ret;
> +
> +	device_lock(&dev->dev);
> +	ret = __pcistub_reset_function(dev);
> +	device_unlock(&dev->dev);
> +
> +	return ret;
> +}
> +
> +static ssize_t pcistub_reset_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	unsigned long val;
> +	ssize_t result = strict_strtoul(buf, 0, &val);
> +
> +	if (result < 0)
> +		return result;
> +
> +	if (val != 1)
> +		return -EINVAL;
> +
> +	result = pcistub_reset_function(pdev);
> +	if (result < 0)
> +		return result;
> +	return count;
> +}
> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
> +
> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
> +{
> +	struct device *dev = &psdev->dev->dev;
> +	struct sysfs_dirent *reset_dirent;
> +	int ret;
> +
> +	reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
> +	if (reset_dirent) {
> +		sysfs_put(reset_dirent);
> +		return 0;
> +	}
> +
> +	ret = device_create_file(dev, &dev_attr_reset);
> +	if (ret < 0)
> +		return ret;
> +	psdev->created_reset_file = true;
> +	return 0;
> +}
> +
> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
> +{
> +	if (psdev && psdev->created_reset_file)
> +		device_remove_file(&psdev->dev->dev, &dev_attr_reset);
> +}
> +
>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
>  {
>  	struct pcistub_device *psdev;
> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
> 
>  	dev_dbg(&dev->dev, "pcistub_device_release\n");
> 
> +	pcistub_remove_reset_file(psdev);
> +
>  	xen_unregister_device_domain_owner(dev);
> 
>  	/* Call the reset function which does not take lock as this
>  	 * is called from "unbind" which takes a device_lock mutex.
>  	 */
> -	__pci_reset_function_locked(dev);
> +	__pcistub_reset_function(psdev->dev);
> +
>  	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>  		dev_dbg(&dev->dev, "Could not reload PCI state\n");
>  	else
> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	/* This is OK - we are running from workqueue context
>  	 * and want to inhibit the user from fiddling with 'reset'
>  	 */
> -	pci_reset_function(dev);
> +	pcistub_reset_function(psdev->dev);
>  	pci_restore_state(psdev->dev);
> 
>  	/* This disables the device. */
> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
>  		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
>  	else {
>  		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
> -		__pci_reset_function_locked(dev);
> +		__pcistub_reset_function(dev);
>  		pci_restore_state(dev);
>  	}
>  	/* Now disable the device (this also ensures some private device
> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
>  	if (!psdev)
>  		return -ENOMEM;
> 
> +	err = pcistub_try_create_reset_file(psdev);
> +	if (err < 0)
> +		goto out;
> +
>  	spin_lock_irqsave(&pcistub_devices_lock, flags);
> 
>  	if (initialize_devices) {
> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
>  	}
> 
>  	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> -
> +out:
>  	if (err)
>  		pcistub_device_put(psdev);
> -
>  	return err;
>  }
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 10:59 ` [Xen-devel] " David Vrabel
  2013-12-16 14:35   ` Konrad Rzeszutek Wilk
@ 2013-12-16 14:35   ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-16 14:35 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, gordan, linux-kernel

On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> > Hey,
> > 
> > While I was trying to narrow down the state of GPU passthrough
> > (still not finished) and figuring what needs to be done I realized
> > that Xen PCIback did not reset my GPU properly (when I crashed the
> > Windows guest by mistake). It does an FLR reset or Power one - if
> > the device supports it. But it seems that some of these GPUs
> > are liars and actually don't do the power part properly.
> 
> In my experience the devices do not lie.  They correctly report that
> they do not perform a reset in D3hot.
> 
> Here's the patch I'm using to solve this.  It does something similar.
> i.e., a SBR if all devices on that bus are safe to be reset.
> 
> I prefer it because it provides the standard 'reset' sysfs file that the
> toolstack/userspace can use.

We can still add the 'reset' to SysFS
> 
> It does have some limitations:  a) It does not check whether a device is
> in use (only if it is bound to pciback); and b) it hand rolls
> pci_slot_reset() (because it didn't exist at the time).

.. which can have those limiations removed and be based on this patchset.
Meaning it won't do a bus-reset or device reset if the rest of the devices
are _not_ assigned to pciback.

> 
> diff --git a/drivers/xen/xen-pciback/pci_stub.c
> b/drivers/xen/xen-pciback/pci_stub.c
> index 4e8ba38..5a03e63 100644
> --- a/drivers/xen/xen-pciback/pci_stub.c
> +++ b/drivers/xen/xen-pciback/pci_stub.c
> @@ -14,6 +14,7 @@
>  #include <linux/wait.h>
>  #include <linux/sched.h>
>  #include <linux/atomic.h>
> +#include <linux/delay.h>
>  #include <xen/events.h>
>  #include <asm/xen/pci.h>
>  #include <asm/xen/hypervisor.h>
> @@ -43,6 +44,7 @@ struct pcistub_device {
>  	struct kref kref;
>  	struct list_head dev_list;
>  	spinlock_t lock;
> +	bool created_reset_file;
> 
>  	struct pci_dev *dev;
>  	struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
>  static int initialize_devices;
>  static LIST_HEAD(seized_devices);
> 
> +/*
> + * pci_reset_function() will only work if there is a mechanism to
> + * reset that single function (e.g., FLR or a D-state transition).
> + * For PCI hardware that has two or more functions but no per-function
> + * reset, we can do a bus reset iff all the functions are co-assigned
> + * to the same domain.
> + *
> + * If a function has no per-function reset mechanism the 'reset' sysfs
> + * file that the toolstack uses to reset a function prior to assigning
> + * the device will be missing.  In this case, pciback adds its own
> + * which will try a bus reset.
> + *
> + * Note: pciback does not check for co-assigment before doing a bus
> + * reset, only that the devices are bound to pciback.  The toolstack
> + * is assumed to have done the right thing.
> + */
> +static int __pcistub_reset_function(struct pci_dev *dev)
> +{
> +	struct pci_dev *pdev;
> +	u16 ctrl;
> +	int ret;
> +
> +	ret = __pci_reset_function_locked(dev);
> +	if (ret == 0)
> +		return 0;
> +
> +	if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
> +		return -ENOTTY;
> +
> +	list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
> +		if (pdev != dev && (!pdev->driver
> +				    || strcmp(pdev->driver->name, "pciback")))
> +			return -ENOTTY;
> +		pci_save_state(pdev);
> +	}
> +
> +	pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
> +	ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
> +	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
> +	msleep(200);
> +
> +	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> +	pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
> +	msleep(200);
> +
> +	list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> +		pci_restore_state(pdev);
> +
> +	return 0;
> +}
> +
> +static int pcistub_reset_function(struct pci_dev *dev)
> +{
> +	int ret;
> +
> +	device_lock(&dev->dev);
> +	ret = __pcistub_reset_function(dev);
> +	device_unlock(&dev->dev);
> +
> +	return ret;
> +}
> +
> +static ssize_t pcistub_reset_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	unsigned long val;
> +	ssize_t result = strict_strtoul(buf, 0, &val);
> +
> +	if (result < 0)
> +		return result;
> +
> +	if (val != 1)
> +		return -EINVAL;
> +
> +	result = pcistub_reset_function(pdev);
> +	if (result < 0)
> +		return result;
> +	return count;
> +}
> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
> +
> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
> +{
> +	struct device *dev = &psdev->dev->dev;
> +	struct sysfs_dirent *reset_dirent;
> +	int ret;
> +
> +	reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
> +	if (reset_dirent) {
> +		sysfs_put(reset_dirent);
> +		return 0;
> +	}
> +
> +	ret = device_create_file(dev, &dev_attr_reset);
> +	if (ret < 0)
> +		return ret;
> +	psdev->created_reset_file = true;
> +	return 0;
> +}
> +
> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
> +{
> +	if (psdev && psdev->created_reset_file)
> +		device_remove_file(&psdev->dev->dev, &dev_attr_reset);
> +}
> +
>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
>  {
>  	struct pcistub_device *psdev;
> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
> 
>  	dev_dbg(&dev->dev, "pcistub_device_release\n");
> 
> +	pcistub_remove_reset_file(psdev);
> +
>  	xen_unregister_device_domain_owner(dev);
> 
>  	/* Call the reset function which does not take lock as this
>  	 * is called from "unbind" which takes a device_lock mutex.
>  	 */
> -	__pci_reset_function_locked(dev);
> +	__pcistub_reset_function(psdev->dev);
> +
>  	if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>  		dev_dbg(&dev->dev, "Could not reload PCI state\n");
>  	else
> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>  	/* This is OK - we are running from workqueue context
>  	 * and want to inhibit the user from fiddling with 'reset'
>  	 */
> -	pci_reset_function(dev);
> +	pcistub_reset_function(psdev->dev);
>  	pci_restore_state(psdev->dev);
> 
>  	/* This disables the device. */
> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
>  		dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
>  	else {
>  		dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
> -		__pci_reset_function_locked(dev);
> +		__pcistub_reset_function(dev);
>  		pci_restore_state(dev);
>  	}
>  	/* Now disable the device (this also ensures some private device
> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
>  	if (!psdev)
>  		return -ENOMEM;
> 
> +	err = pcistub_try_create_reset_file(psdev);
> +	if (err < 0)
> +		goto out;
> +
>  	spin_lock_irqsave(&pcistub_devices_lock, flags);
> 
>  	if (initialize_devices) {
> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
>  	}
> 
>  	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> -
> +out:
>  	if (err)
>  		pcistub_device_put(psdev);
> -
>  	return err;
>  }
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus
  2013-12-16 11:34   ` [Xen-devel] " David Vrabel
  2013-12-16 14:39     ` Konrad Rzeszutek Wilk
@ 2013-12-16 14:39     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-16 14:39 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, linux-kernel, gordan

On Mon, Dec 16, 2013 at 11:34:32AM +0000, David Vrabel wrote:
> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> > The life-cycle of a PCI device in Xen pciback is a bit complex.
> > 
> > It starts with the device being binded to us - for which
> > we do a device function reset.
> > 
> > If the device is unbinded from us - we also do a function
> > reset.
> 
> Spelling: bound and unbound.
> 
> > The reset is done when all of the functions of a device
> > are binded to Xen pciback. Or when a device is un-assigned
> > from a guest. We do this by having a 'completion' workqueue
> > on which the users of the PCI device wait. This workqueue
> > is started once the device has been 'binded' or de-assigned
> > from a guest.
> 
> The use of a work item and a completion baffles me.  What problem does
> this solve?

Avoiding of deadlock. Whenever you do any SysFS operations on on PCI
devices it ends up locking pci_dev and then we try to use it .. and
end up dead-locking. I should have clarified that more.

I could add code to pci (Generic) to have an non-locking variant - but
there is soo much of it I think doing it in a work item and completion
would be much simpler.

> 
> > --- a/drivers/xen/xen-pciback/pci_stub.c
> > +++ b/drivers/xen/xen-pciback/pci_stub.c
> [...]
> > +	/* We expect X amount of slots (this would also find out
> > +	 * if we do not have all of the slots assigned to us).
> > +	 */
> > +	list_for_each_entry(pci_dev, &dev->bus->devices, bus_list)
> > +		slots++;
> > +
> > +	spin_lock_irqsave(&pcistub_devices_lock, flags);
> > +	/* Iterate over the existing devices to ascertain whether
> > +	 * all of them are under the bridge and not in use. */
> > +	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
> > +		if (!psdev->dev)
> > +			continue;
> > +
> > +		if (pci_domain_nr(psdev->dev->bus) == pci_domain_nr(dev->bus) &&
> > +		    psdev->dev->bus->number == dev->bus->number &&
> > +		    PCI_SLOT(psdev->dev->devfn) == PCI_SLOT(dev->devfn)) {
> > +			slots--;
> > +			/* Ignore ourselves in case hadn't cleaned up yet */
> > +			if (psdev->pdev && psdev->dev != dev)
> > +				inuse++;
> > +		}
> > +	}
> 
> This check looks broken.  A device added to pciback but still bound to
> another driver will be considered as safe to reset.
> 
> I think you want something like:
> 
> list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> {
>     if (pdev != dev && pdev->driver
>         && pdev->driver != xen_pcibk_pci_driver))
>         return -ENOTTY;
> }

Good catch!
> 
> It is safe to reset unbound devices (even if they're not (or intended)
> to be available to pciback).

OK, we can check for that.
> 
> It is also possible in the above loop if slot reset is supported to
> ignore sibling devices that are on different slots.

Not sure what you mean?
> 
> > +	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> > +	/* Slots should be zero (all slots under the bridge are
> > +	 * accounted for), and inuse should be zero (not assigned
> > +	 * to anybody). */
> > +	if (!slots && !inuse) {
> > +		int rc = 0, bus = 0;
> > +		list_for_each_entry(pci_dev, &dev->bus->devices, bus_list) {
> > +			dev_dbg(&pci_dev->dev, "resetting the slot device\n");
> > +			if (!pci_probe_reset_slot(pci_dev->slot))
> > +				rc = pci_reset_slot(pci_dev->slot);
> > +			else
> > +				bus = 1;
> > +			if (rc)
> > +				dev_info(&pci_dev->dev, "resetting slot failed with %d\n", rc);
> > +		}
> 
> Why are you resetting every slot on the bus?  You only need to reset the
> slot that the device is on.

Bug.
> 
> Take a look at the vfio-pci driver.  It does this bus/slot reset choice
> in a much more straightforward way.
> 
> David

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus
  2013-12-16 11:34   ` [Xen-devel] " David Vrabel
@ 2013-12-16 14:39     ` Konrad Rzeszutek Wilk
  2013-12-16 14:39     ` [Xen-devel] " Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-16 14:39 UTC (permalink / raw)
  To: David Vrabel; +Cc: xen-devel, gordan, linux-kernel

On Mon, Dec 16, 2013 at 11:34:32AM +0000, David Vrabel wrote:
> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> > The life-cycle of a PCI device in Xen pciback is a bit complex.
> > 
> > It starts with the device being binded to us - for which
> > we do a device function reset.
> > 
> > If the device is unbinded from us - we also do a function
> > reset.
> 
> Spelling: bound and unbound.
> 
> > The reset is done when all of the functions of a device
> > are binded to Xen pciback. Or when a device is un-assigned
> > from a guest. We do this by having a 'completion' workqueue
> > on which the users of the PCI device wait. This workqueue
> > is started once the device has been 'binded' or de-assigned
> > from a guest.
> 
> The use of a work item and a completion baffles me.  What problem does
> this solve?

Avoiding of deadlock. Whenever you do any SysFS operations on on PCI
devices it ends up locking pci_dev and then we try to use it .. and
end up dead-locking. I should have clarified that more.

I could add code to pci (Generic) to have an non-locking variant - but
there is soo much of it I think doing it in a work item and completion
would be much simpler.

> 
> > --- a/drivers/xen/xen-pciback/pci_stub.c
> > +++ b/drivers/xen/xen-pciback/pci_stub.c
> [...]
> > +	/* We expect X amount of slots (this would also find out
> > +	 * if we do not have all of the slots assigned to us).
> > +	 */
> > +	list_for_each_entry(pci_dev, &dev->bus->devices, bus_list)
> > +		slots++;
> > +
> > +	spin_lock_irqsave(&pcistub_devices_lock, flags);
> > +	/* Iterate over the existing devices to ascertain whether
> > +	 * all of them are under the bridge and not in use. */
> > +	list_for_each_entry(psdev, &pcistub_devices, dev_list) {
> > +		if (!psdev->dev)
> > +			continue;
> > +
> > +		if (pci_domain_nr(psdev->dev->bus) == pci_domain_nr(dev->bus) &&
> > +		    psdev->dev->bus->number == dev->bus->number &&
> > +		    PCI_SLOT(psdev->dev->devfn) == PCI_SLOT(dev->devfn)) {
> > +			slots--;
> > +			/* Ignore ourselves in case hadn't cleaned up yet */
> > +			if (psdev->pdev && psdev->dev != dev)
> > +				inuse++;
> > +		}
> > +	}
> 
> This check looks broken.  A device added to pciback but still bound to
> another driver will be considered as safe to reset.
> 
> I think you want something like:
> 
> list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> {
>     if (pdev != dev && pdev->driver
>         && pdev->driver != xen_pcibk_pci_driver))
>         return -ENOTTY;
> }

Good catch!
> 
> It is safe to reset unbound devices (even if they're not (or intended)
> to be available to pciback).

OK, we can check for that.
> 
> It is also possible in the above loop if slot reset is supported to
> ignore sibling devices that are on different slots.

Not sure what you mean?
> 
> > +	spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> > +	/* Slots should be zero (all slots under the bridge are
> > +	 * accounted for), and inuse should be zero (not assigned
> > +	 * to anybody). */
> > +	if (!slots && !inuse) {
> > +		int rc = 0, bus = 0;
> > +		list_for_each_entry(pci_dev, &dev->bus->devices, bus_list) {
> > +			dev_dbg(&pci_dev->dev, "resetting the slot device\n");
> > +			if (!pci_probe_reset_slot(pci_dev->slot))
> > +				rc = pci_reset_slot(pci_dev->slot);
> > +			else
> > +				bus = 1;
> > +			if (rc)
> > +				dev_info(&pci_dev->dev, "resetting slot failed with %d\n", rc);
> > +		}
> 
> Why are you resetting every slot on the bus?  You only need to reset the
> slot that the device is on.

Bug.
> 
> Take a look at the vfio-pci driver.  It does this bus/slot reset choice
> in a much more straightforward way.
> 
> David

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 14:35   ` Konrad Rzeszutek Wilk
@ 2013-12-16 15:23     ` Sander Eikelenboom
  2013-12-16 15:36       ` Konrad Rzeszutek Wilk
  2013-12-16 15:36       ` [Xen-devel] " Konrad Rzeszutek Wilk
  2013-12-16 15:23     ` Sander Eikelenboom
  1 sibling, 2 replies; 40+ messages in thread
From: Sander Eikelenboom @ 2013-12-16 15:23 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: David Vrabel, xen-devel, gordan, linux-kernel


Monday, December 16, 2013, 3:35:15 PM, you wrote:

> On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
>> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
>> > Hey,
>> > 
>> > While I was trying to narrow down the state of GPU passthrough
>> > (still not finished) and figuring what needs to be done I realized
>> > that Xen PCIback did not reset my GPU properly (when I crashed the
>> > Windows guest by mistake). It does an FLR reset or Power one - if
>> > the device supports it. But it seems that some of these GPUs
>> > are liars and actually don't do the power part properly.
>> 
>> In my experience the devices do not lie.  They correctly report that
>> they do not perform a reset in D3hot.
>> 
>> Here's the patch I'm using to solve this.  It does something similar.
>> i.e., a SBR if all devices on that bus are safe to be reset.
>> 
>> I prefer it because it provides the standard 'reset' sysfs file that the
>> toolstack/userspace can use.

> We can still add the 'reset' to SysFS
>> 
>> It does have some limitations:  a) It does not check whether a device is
>> in use (only if it is bound to pciback); and b) it hand rolls
>> pci_slot_reset() (because it didn't exist at the time).

> .. which can have those limiations removed and be based on this patchset.
> Meaning it won't do a bus-reset or device reset if the rest of the devices
> are _not_ assigned to pciback.

Perhaps there is something to learn from the steps vfio-pci takes to do this ?
(they sorted out quite some stuff around pci/vga passtrough)

>> 
>> diff --git a/drivers/xen/xen-pciback/pci_stub.c
>> b/drivers/xen/xen-pciback/pci_stub.c
>> index 4e8ba38..5a03e63 100644
>> --- a/drivers/xen/xen-pciback/pci_stub.c
>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> @@ -14,6 +14,7 @@
>>  #include <linux/wait.h>
>>  #include <linux/sched.h>
>>  #include <linux/atomic.h>
>> +#include <linux/delay.h>
>>  #include <xen/events.h>
>>  #include <asm/xen/pci.h>
>>  #include <asm/xen/hypervisor.h>
>> @@ -43,6 +44,7 @@ struct pcistub_device {
>>       struct kref kref;
>>       struct list_head dev_list;
>>       spinlock_t lock;
>> +     bool created_reset_file;
>> 
>>       struct pci_dev *dev;
>>       struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
>> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
>>  static int initialize_devices;
>>  static LIST_HEAD(seized_devices);
>> 
>> +/*
>> + * pci_reset_function() will only work if there is a mechanism to
>> + * reset that single function (e.g., FLR or a D-state transition).
>> + * For PCI hardware that has two or more functions but no per-function
>> + * reset, we can do a bus reset iff all the functions are co-assigned
>> + * to the same domain.
>> + *
>> + * If a function has no per-function reset mechanism the 'reset' sysfs
>> + * file that the toolstack uses to reset a function prior to assigning
>> + * the device will be missing.  In this case, pciback adds its own
>> + * which will try a bus reset.
>> + *
>> + * Note: pciback does not check for co-assigment before doing a bus
>> + * reset, only that the devices are bound to pciback.  The toolstack
>> + * is assumed to have done the right thing.
>> + */
>> +static int __pcistub_reset_function(struct pci_dev *dev)
>> +{
>> +     struct pci_dev *pdev;
>> +     u16 ctrl;
>> +     int ret;
>> +
>> +     ret = __pci_reset_function_locked(dev);
>> +     if (ret == 0)
>> +             return 0;
>> +
>> +     if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
>> +             return -ENOTTY;
>> +
>> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
>> +             if (pdev != dev && (!pdev->driver
>> +                                 || strcmp(pdev->driver->name, "pciback")))
>> +                     return -ENOTTY;
>> +             pci_save_state(pdev);
>> +     }
>> +
>> +     pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
>> +     ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
>> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> +     msleep(200);
>> +
>> +     ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
>> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> +     msleep(200);
>> +
>> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list)
>> +             pci_restore_state(pdev);
>> +
>> +     return 0;
>> +}
>> +
>> +static int pcistub_reset_function(struct pci_dev *dev)
>> +{
>> +     int ret;
>> +
>> +     device_lock(&dev->dev);
>> +     ret = __pcistub_reset_function(dev);
>> +     device_unlock(&dev->dev);
>> +
>> +     return ret;
>> +}
>> +
>> +static ssize_t pcistub_reset_store(struct device *dev,
>> +                                struct device_attribute *attr,
>> +                                const char *buf, size_t count)
>> +{
>> +     struct pci_dev *pdev = to_pci_dev(dev);
>> +     unsigned long val;
>> +     ssize_t result = strict_strtoul(buf, 0, &val);
>> +
>> +     if (result < 0)
>> +             return result;
>> +
>> +     if (val != 1)
>> +             return -EINVAL;
>> +
>> +     result = pcistub_reset_function(pdev);
>> +     if (result < 0)
>> +             return result;
>> +     return count;
>> +}
>> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
>> +
>> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
>> +{
>> +     struct device *dev = &psdev->dev->dev;
>> +     struct sysfs_dirent *reset_dirent;
>> +     int ret;
>> +
>> +     reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
>> +     if (reset_dirent) {
>> +             sysfs_put(reset_dirent);
>> +             return 0;
>> +     }
>> +
>> +     ret = device_create_file(dev, &dev_attr_reset);
>> +     if (ret < 0)
>> +             return ret;
>> +     psdev->created_reset_file = true;
>> +     return 0;
>> +}
>> +
>> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
>> +{
>> +     if (psdev && psdev->created_reset_file)
>> +             device_remove_file(&psdev->dev->dev, &dev_attr_reset);
>> +}
>> +
>>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
>>  {
>>       struct pcistub_device *psdev;
>> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
>> 
>>       dev_dbg(&dev->dev, "pcistub_device_release\n");
>> 
>> +     pcistub_remove_reset_file(psdev);
>> +
>>       xen_unregister_device_domain_owner(dev);
>> 
>>       /* Call the reset function which does not take lock as this
>>        * is called from "unbind" which takes a device_lock mutex.
>>        */
>> -     __pci_reset_function_locked(dev);
>> +     __pcistub_reset_function(psdev->dev);
>> +
>>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>>               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>>       else
>> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>       /* This is OK - we are running from workqueue context
>>        * and want to inhibit the user from fiddling with 'reset'
>>        */
>> -     pci_reset_function(dev);
>> +     pcistub_reset_function(psdev->dev);
>>       pci_restore_state(psdev->dev);
>> 
>>       /* This disables the device. */
>> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
>>               dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
>>       else {
>>               dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>> -             __pci_reset_function_locked(dev);
>> +             __pcistub_reset_function(dev);
>>               pci_restore_state(dev);
>>       }
>>       /* Now disable the device (this also ensures some private device
>> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
>>       if (!psdev)
>>               return -ENOMEM;
>> 
>> +     err = pcistub_try_create_reset_file(psdev);
>> +     if (err < 0)
>> +             goto out;
>> +
>>       spin_lock_irqsave(&pcistub_devices_lock, flags);
>> 
>>       if (initialize_devices) {
>> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
>>       }
>> 
>>       spin_unlock_irqrestore(&pcistub_devices_lock, flags);
>> -
>> +out:
>>       if (err)
>>               pcistub_device_put(psdev);
>> -
>>       return err;
>>  }
>> 




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 14:35   ` Konrad Rzeszutek Wilk
  2013-12-16 15:23     ` Sander Eikelenboom
@ 2013-12-16 15:23     ` Sander Eikelenboom
  1 sibling, 0 replies; 40+ messages in thread
From: Sander Eikelenboom @ 2013-12-16 15:23 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, gordan, David Vrabel, linux-kernel


Monday, December 16, 2013, 3:35:15 PM, you wrote:

> On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
>> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
>> > Hey,
>> > 
>> > While I was trying to narrow down the state of GPU passthrough
>> > (still not finished) and figuring what needs to be done I realized
>> > that Xen PCIback did not reset my GPU properly (when I crashed the
>> > Windows guest by mistake). It does an FLR reset or Power one - if
>> > the device supports it. But it seems that some of these GPUs
>> > are liars and actually don't do the power part properly.
>> 
>> In my experience the devices do not lie.  They correctly report that
>> they do not perform a reset in D3hot.
>> 
>> Here's the patch I'm using to solve this.  It does something similar.
>> i.e., a SBR if all devices on that bus are safe to be reset.
>> 
>> I prefer it because it provides the standard 'reset' sysfs file that the
>> toolstack/userspace can use.

> We can still add the 'reset' to SysFS
>> 
>> It does have some limitations:  a) It does not check whether a device is
>> in use (only if it is bound to pciback); and b) it hand rolls
>> pci_slot_reset() (because it didn't exist at the time).

> .. which can have those limiations removed and be based on this patchset.
> Meaning it won't do a bus-reset or device reset if the rest of the devices
> are _not_ assigned to pciback.

Perhaps there is something to learn from the steps vfio-pci takes to do this ?
(they sorted out quite some stuff around pci/vga passtrough)

>> 
>> diff --git a/drivers/xen/xen-pciback/pci_stub.c
>> b/drivers/xen/xen-pciback/pci_stub.c
>> index 4e8ba38..5a03e63 100644
>> --- a/drivers/xen/xen-pciback/pci_stub.c
>> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> @@ -14,6 +14,7 @@
>>  #include <linux/wait.h>
>>  #include <linux/sched.h>
>>  #include <linux/atomic.h>
>> +#include <linux/delay.h>
>>  #include <xen/events.h>
>>  #include <asm/xen/pci.h>
>>  #include <asm/xen/hypervisor.h>
>> @@ -43,6 +44,7 @@ struct pcistub_device {
>>       struct kref kref;
>>       struct list_head dev_list;
>>       spinlock_t lock;
>> +     bool created_reset_file;
>> 
>>       struct pci_dev *dev;
>>       struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
>> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
>>  static int initialize_devices;
>>  static LIST_HEAD(seized_devices);
>> 
>> +/*
>> + * pci_reset_function() will only work if there is a mechanism to
>> + * reset that single function (e.g., FLR or a D-state transition).
>> + * For PCI hardware that has two or more functions but no per-function
>> + * reset, we can do a bus reset iff all the functions are co-assigned
>> + * to the same domain.
>> + *
>> + * If a function has no per-function reset mechanism the 'reset' sysfs
>> + * file that the toolstack uses to reset a function prior to assigning
>> + * the device will be missing.  In this case, pciback adds its own
>> + * which will try a bus reset.
>> + *
>> + * Note: pciback does not check for co-assigment before doing a bus
>> + * reset, only that the devices are bound to pciback.  The toolstack
>> + * is assumed to have done the right thing.
>> + */
>> +static int __pcistub_reset_function(struct pci_dev *dev)
>> +{
>> +     struct pci_dev *pdev;
>> +     u16 ctrl;
>> +     int ret;
>> +
>> +     ret = __pci_reset_function_locked(dev);
>> +     if (ret == 0)
>> +             return 0;
>> +
>> +     if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
>> +             return -ENOTTY;
>> +
>> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
>> +             if (pdev != dev && (!pdev->driver
>> +                                 || strcmp(pdev->driver->name, "pciback")))
>> +                     return -ENOTTY;
>> +             pci_save_state(pdev);
>> +     }
>> +
>> +     pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
>> +     ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
>> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> +     msleep(200);
>> +
>> +     ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
>> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> +     msleep(200);
>> +
>> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list)
>> +             pci_restore_state(pdev);
>> +
>> +     return 0;
>> +}
>> +
>> +static int pcistub_reset_function(struct pci_dev *dev)
>> +{
>> +     int ret;
>> +
>> +     device_lock(&dev->dev);
>> +     ret = __pcistub_reset_function(dev);
>> +     device_unlock(&dev->dev);
>> +
>> +     return ret;
>> +}
>> +
>> +static ssize_t pcistub_reset_store(struct device *dev,
>> +                                struct device_attribute *attr,
>> +                                const char *buf, size_t count)
>> +{
>> +     struct pci_dev *pdev = to_pci_dev(dev);
>> +     unsigned long val;
>> +     ssize_t result = strict_strtoul(buf, 0, &val);
>> +
>> +     if (result < 0)
>> +             return result;
>> +
>> +     if (val != 1)
>> +             return -EINVAL;
>> +
>> +     result = pcistub_reset_function(pdev);
>> +     if (result < 0)
>> +             return result;
>> +     return count;
>> +}
>> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
>> +
>> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
>> +{
>> +     struct device *dev = &psdev->dev->dev;
>> +     struct sysfs_dirent *reset_dirent;
>> +     int ret;
>> +
>> +     reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
>> +     if (reset_dirent) {
>> +             sysfs_put(reset_dirent);
>> +             return 0;
>> +     }
>> +
>> +     ret = device_create_file(dev, &dev_attr_reset);
>> +     if (ret < 0)
>> +             return ret;
>> +     psdev->created_reset_file = true;
>> +     return 0;
>> +}
>> +
>> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
>> +{
>> +     if (psdev && psdev->created_reset_file)
>> +             device_remove_file(&psdev->dev->dev, &dev_attr_reset);
>> +}
>> +
>>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
>>  {
>>       struct pcistub_device *psdev;
>> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
>> 
>>       dev_dbg(&dev->dev, "pcistub_device_release\n");
>> 
>> +     pcistub_remove_reset_file(psdev);
>> +
>>       xen_unregister_device_domain_owner(dev);
>> 
>>       /* Call the reset function which does not take lock as this
>>        * is called from "unbind" which takes a device_lock mutex.
>>        */
>> -     __pci_reset_function_locked(dev);
>> +     __pcistub_reset_function(psdev->dev);
>> +
>>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>>               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>>       else
>> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>>       /* This is OK - we are running from workqueue context
>>        * and want to inhibit the user from fiddling with 'reset'
>>        */
>> -     pci_reset_function(dev);
>> +     pcistub_reset_function(psdev->dev);
>>       pci_restore_state(psdev->dev);
>> 
>>       /* This disables the device. */
>> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
>>               dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
>>       else {
>>               dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>> -             __pci_reset_function_locked(dev);
>> +             __pcistub_reset_function(dev);
>>               pci_restore_state(dev);
>>       }
>>       /* Now disable the device (this also ensures some private device
>> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
>>       if (!psdev)
>>               return -ENOMEM;
>> 
>> +     err = pcistub_try_create_reset_file(psdev);
>> +     if (err < 0)
>> +             goto out;
>> +
>>       spin_lock_irqsave(&pcistub_devices_lock, flags);
>> 
>>       if (initialize_devices) {
>> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
>>       }
>> 
>>       spin_unlock_irqrestore(&pcistub_devices_lock, flags);
>> -
>> +out:
>>       if (err)
>>               pcistub_device_put(psdev);
>> -
>>       return err;
>>  }
>> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 15:23     ` Sander Eikelenboom
  2013-12-16 15:36       ` Konrad Rzeszutek Wilk
@ 2013-12-16 15:36       ` Konrad Rzeszutek Wilk
  2013-12-16 15:45         ` Sander Eikelenboom
                           ` (3 more replies)
  1 sibling, 4 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-16 15:36 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: David Vrabel, xen-devel, gordan, linux-kernel

On Mon, Dec 16, 2013 at 04:23:53PM +0100, Sander Eikelenboom wrote:
> 
> Monday, December 16, 2013, 3:35:15 PM, you wrote:
> 
> > On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
> >> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> >> > Hey,
> >> > 
> >> > While I was trying to narrow down the state of GPU passthrough
> >> > (still not finished) and figuring what needs to be done I realized
> >> > that Xen PCIback did not reset my GPU properly (when I crashed the
> >> > Windows guest by mistake). It does an FLR reset or Power one - if
> >> > the device supports it. But it seems that some of these GPUs
> >> > are liars and actually don't do the power part properly.
> >> 
> >> In my experience the devices do not lie.  They correctly report that
> >> they do not perform a reset in D3hot.
> >> 
> >> Here's the patch I'm using to solve this.  It does something similar.
> >> i.e., a SBR if all devices on that bus are safe to be reset.
> >> 
> >> I prefer it because it provides the standard 'reset' sysfs file that the
> >> toolstack/userspace can use.
> 
> > We can still add the 'reset' to SysFS
> >> 
> >> It does have some limitations:  a) It does not check whether a device is
> >> in use (only if it is bound to pciback); and b) it hand rolls
> >> pci_slot_reset() (because it didn't exist at the time).
> 
> > .. which can have those limiations removed and be based on this patchset.
> > Meaning it won't do a bus-reset or device reset if the rest of the devices
> > are _not_ assigned to pciback.
> 
> Perhaps there is something to learn from the steps vfio-pci takes to do this ?
> (they sorted out quite some stuff around pci/vga passtrough)

That is actually what I based it on :-)

> 
> >> 
> >> diff --git a/drivers/xen/xen-pciback/pci_stub.c
> >> b/drivers/xen/xen-pciback/pci_stub.c
> >> index 4e8ba38..5a03e63 100644
> >> --- a/drivers/xen/xen-pciback/pci_stub.c
> >> +++ b/drivers/xen/xen-pciback/pci_stub.c
> >> @@ -14,6 +14,7 @@
> >>  #include <linux/wait.h>
> >>  #include <linux/sched.h>
> >>  #include <linux/atomic.h>
> >> +#include <linux/delay.h>
> >>  #include <xen/events.h>
> >>  #include <asm/xen/pci.h>
> >>  #include <asm/xen/hypervisor.h>
> >> @@ -43,6 +44,7 @@ struct pcistub_device {
> >>       struct kref kref;
> >>       struct list_head dev_list;
> >>       spinlock_t lock;
> >> +     bool created_reset_file;
> >> 
> >>       struct pci_dev *dev;
> >>       struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
> >> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
> >>  static int initialize_devices;
> >>  static LIST_HEAD(seized_devices);
> >> 
> >> +/*
> >> + * pci_reset_function() will only work if there is a mechanism to
> >> + * reset that single function (e.g., FLR or a D-state transition).
> >> + * For PCI hardware that has two or more functions but no per-function
> >> + * reset, we can do a bus reset iff all the functions are co-assigned
> >> + * to the same domain.
> >> + *
> >> + * If a function has no per-function reset mechanism the 'reset' sysfs
> >> + * file that the toolstack uses to reset a function prior to assigning
> >> + * the device will be missing.  In this case, pciback adds its own
> >> + * which will try a bus reset.
> >> + *
> >> + * Note: pciback does not check for co-assigment before doing a bus
> >> + * reset, only that the devices are bound to pciback.  The toolstack
> >> + * is assumed to have done the right thing.
> >> + */
> >> +static int __pcistub_reset_function(struct pci_dev *dev)
> >> +{
> >> +     struct pci_dev *pdev;
> >> +     u16 ctrl;
> >> +     int ret;
> >> +
> >> +     ret = __pci_reset_function_locked(dev);
> >> +     if (ret == 0)
> >> +             return 0;
> >> +
> >> +     if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
> >> +             return -ENOTTY;
> >> +
> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
> >> +             if (pdev != dev && (!pdev->driver
> >> +                                 || strcmp(pdev->driver->name, "pciback")))
> >> +                     return -ENOTTY;
> >> +             pci_save_state(pdev);
> >> +     }
> >> +
> >> +     pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
> >> +     ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
> >> +     msleep(200);
> >> +
> >> +     ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
> >> +     msleep(200);
> >> +
> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> >> +             pci_restore_state(pdev);
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int pcistub_reset_function(struct pci_dev *dev)
> >> +{
> >> +     int ret;
> >> +
> >> +     device_lock(&dev->dev);
> >> +     ret = __pcistub_reset_function(dev);
> >> +     device_unlock(&dev->dev);
> >> +
> >> +     return ret;
> >> +}
> >> +
> >> +static ssize_t pcistub_reset_store(struct device *dev,
> >> +                                struct device_attribute *attr,
> >> +                                const char *buf, size_t count)
> >> +{
> >> +     struct pci_dev *pdev = to_pci_dev(dev);
> >> +     unsigned long val;
> >> +     ssize_t result = strict_strtoul(buf, 0, &val);
> >> +
> >> +     if (result < 0)
> >> +             return result;
> >> +
> >> +     if (val != 1)
> >> +             return -EINVAL;
> >> +
> >> +     result = pcistub_reset_function(pdev);
> >> +     if (result < 0)
> >> +             return result;
> >> +     return count;
> >> +}
> >> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
> >> +
> >> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
> >> +{
> >> +     struct device *dev = &psdev->dev->dev;
> >> +     struct sysfs_dirent *reset_dirent;
> >> +     int ret;
> >> +
> >> +     reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
> >> +     if (reset_dirent) {
> >> +             sysfs_put(reset_dirent);
> >> +             return 0;
> >> +     }
> >> +
> >> +     ret = device_create_file(dev, &dev_attr_reset);
> >> +     if (ret < 0)
> >> +             return ret;
> >> +     psdev->created_reset_file = true;
> >> +     return 0;
> >> +}
> >> +
> >> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
> >> +{
> >> +     if (psdev && psdev->created_reset_file)
> >> +             device_remove_file(&psdev->dev->dev, &dev_attr_reset);
> >> +}
> >> +
> >>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
> >>  {
> >>       struct pcistub_device *psdev;
> >> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
> >> 
> >>       dev_dbg(&dev->dev, "pcistub_device_release\n");
> >> 
> >> +     pcistub_remove_reset_file(psdev);
> >> +
> >>       xen_unregister_device_domain_owner(dev);
> >> 
> >>       /* Call the reset function which does not take lock as this
> >>        * is called from "unbind" which takes a device_lock mutex.
> >>        */
> >> -     __pci_reset_function_locked(dev);
> >> +     __pcistub_reset_function(psdev->dev);
> >> +
> >>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> >>               dev_dbg(&dev->dev, "Could not reload PCI state\n");
> >>       else
> >> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >>       /* This is OK - we are running from workqueue context
> >>        * and want to inhibit the user from fiddling with 'reset'
> >>        */
> >> -     pci_reset_function(dev);
> >> +     pcistub_reset_function(psdev->dev);
> >>       pci_restore_state(psdev->dev);
> >> 
> >>       /* This disables the device. */
> >> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
> >>               dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
> >>       else {
> >>               dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
> >> -             __pci_reset_function_locked(dev);
> >> +             __pcistub_reset_function(dev);
> >>               pci_restore_state(dev);
> >>       }
> >>       /* Now disable the device (this also ensures some private device
> >> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
> >>       if (!psdev)
> >>               return -ENOMEM;
> >> 
> >> +     err = pcistub_try_create_reset_file(psdev);
> >> +     if (err < 0)
> >> +             goto out;
> >> +
> >>       spin_lock_irqsave(&pcistub_devices_lock, flags);
> >> 
> >>       if (initialize_devices) {
> >> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
> >>       }
> >> 
> >>       spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> >> -
> >> +out:
> >>       if (err)
> >>               pcistub_device_put(psdev);
> >> -
> >>       return err;
> >>  }
> >> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 15:23     ` Sander Eikelenboom
@ 2013-12-16 15:36       ` Konrad Rzeszutek Wilk
  2013-12-16 15:36       ` [Xen-devel] " Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-16 15:36 UTC (permalink / raw)
  To: Sander Eikelenboom; +Cc: xen-devel, gordan, David Vrabel, linux-kernel

On Mon, Dec 16, 2013 at 04:23:53PM +0100, Sander Eikelenboom wrote:
> 
> Monday, December 16, 2013, 3:35:15 PM, you wrote:
> 
> > On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
> >> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
> >> > Hey,
> >> > 
> >> > While I was trying to narrow down the state of GPU passthrough
> >> > (still not finished) and figuring what needs to be done I realized
> >> > that Xen PCIback did not reset my GPU properly (when I crashed the
> >> > Windows guest by mistake). It does an FLR reset or Power one - if
> >> > the device supports it. But it seems that some of these GPUs
> >> > are liars and actually don't do the power part properly.
> >> 
> >> In my experience the devices do not lie.  They correctly report that
> >> they do not perform a reset in D3hot.
> >> 
> >> Here's the patch I'm using to solve this.  It does something similar.
> >> i.e., a SBR if all devices on that bus are safe to be reset.
> >> 
> >> I prefer it because it provides the standard 'reset' sysfs file that the
> >> toolstack/userspace can use.
> 
> > We can still add the 'reset' to SysFS
> >> 
> >> It does have some limitations:  a) It does not check whether a device is
> >> in use (only if it is bound to pciback); and b) it hand rolls
> >> pci_slot_reset() (because it didn't exist at the time).
> 
> > .. which can have those limiations removed and be based on this patchset.
> > Meaning it won't do a bus-reset or device reset if the rest of the devices
> > are _not_ assigned to pciback.
> 
> Perhaps there is something to learn from the steps vfio-pci takes to do this ?
> (they sorted out quite some stuff around pci/vga passtrough)

That is actually what I based it on :-)

> 
> >> 
> >> diff --git a/drivers/xen/xen-pciback/pci_stub.c
> >> b/drivers/xen/xen-pciback/pci_stub.c
> >> index 4e8ba38..5a03e63 100644
> >> --- a/drivers/xen/xen-pciback/pci_stub.c
> >> +++ b/drivers/xen/xen-pciback/pci_stub.c
> >> @@ -14,6 +14,7 @@
> >>  #include <linux/wait.h>
> >>  #include <linux/sched.h>
> >>  #include <linux/atomic.h>
> >> +#include <linux/delay.h>
> >>  #include <xen/events.h>
> >>  #include <asm/xen/pci.h>
> >>  #include <asm/xen/hypervisor.h>
> >> @@ -43,6 +44,7 @@ struct pcistub_device {
> >>       struct kref kref;
> >>       struct list_head dev_list;
> >>       spinlock_t lock;
> >> +     bool created_reset_file;
> >> 
> >>       struct pci_dev *dev;
> >>       struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
> >> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
> >>  static int initialize_devices;
> >>  static LIST_HEAD(seized_devices);
> >> 
> >> +/*
> >> + * pci_reset_function() will only work if there is a mechanism to
> >> + * reset that single function (e.g., FLR or a D-state transition).
> >> + * For PCI hardware that has two or more functions but no per-function
> >> + * reset, we can do a bus reset iff all the functions are co-assigned
> >> + * to the same domain.
> >> + *
> >> + * If a function has no per-function reset mechanism the 'reset' sysfs
> >> + * file that the toolstack uses to reset a function prior to assigning
> >> + * the device will be missing.  In this case, pciback adds its own
> >> + * which will try a bus reset.
> >> + *
> >> + * Note: pciback does not check for co-assigment before doing a bus
> >> + * reset, only that the devices are bound to pciback.  The toolstack
> >> + * is assumed to have done the right thing.
> >> + */
> >> +static int __pcistub_reset_function(struct pci_dev *dev)
> >> +{
> >> +     struct pci_dev *pdev;
> >> +     u16 ctrl;
> >> +     int ret;
> >> +
> >> +     ret = __pci_reset_function_locked(dev);
> >> +     if (ret == 0)
> >> +             return 0;
> >> +
> >> +     if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
> >> +             return -ENOTTY;
> >> +
> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
> >> +             if (pdev != dev && (!pdev->driver
> >> +                                 || strcmp(pdev->driver->name, "pciback")))
> >> +                     return -ENOTTY;
> >> +             pci_save_state(pdev);
> >> +     }
> >> +
> >> +     pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
> >> +     ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
> >> +     msleep(200);
> >> +
> >> +     ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
> >> +     msleep(200);
> >> +
> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list)
> >> +             pci_restore_state(pdev);
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static int pcistub_reset_function(struct pci_dev *dev)
> >> +{
> >> +     int ret;
> >> +
> >> +     device_lock(&dev->dev);
> >> +     ret = __pcistub_reset_function(dev);
> >> +     device_unlock(&dev->dev);
> >> +
> >> +     return ret;
> >> +}
> >> +
> >> +static ssize_t pcistub_reset_store(struct device *dev,
> >> +                                struct device_attribute *attr,
> >> +                                const char *buf, size_t count)
> >> +{
> >> +     struct pci_dev *pdev = to_pci_dev(dev);
> >> +     unsigned long val;
> >> +     ssize_t result = strict_strtoul(buf, 0, &val);
> >> +
> >> +     if (result < 0)
> >> +             return result;
> >> +
> >> +     if (val != 1)
> >> +             return -EINVAL;
> >> +
> >> +     result = pcistub_reset_function(pdev);
> >> +     if (result < 0)
> >> +             return result;
> >> +     return count;
> >> +}
> >> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
> >> +
> >> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
> >> +{
> >> +     struct device *dev = &psdev->dev->dev;
> >> +     struct sysfs_dirent *reset_dirent;
> >> +     int ret;
> >> +
> >> +     reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
> >> +     if (reset_dirent) {
> >> +             sysfs_put(reset_dirent);
> >> +             return 0;
> >> +     }
> >> +
> >> +     ret = device_create_file(dev, &dev_attr_reset);
> >> +     if (ret < 0)
> >> +             return ret;
> >> +     psdev->created_reset_file = true;
> >> +     return 0;
> >> +}
> >> +
> >> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
> >> +{
> >> +     if (psdev && psdev->created_reset_file)
> >> +             device_remove_file(&psdev->dev->dev, &dev_attr_reset);
> >> +}
> >> +
> >>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
> >>  {
> >>       struct pcistub_device *psdev;
> >> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
> >> 
> >>       dev_dbg(&dev->dev, "pcistub_device_release\n");
> >> 
> >> +     pcistub_remove_reset_file(psdev);
> >> +
> >>       xen_unregister_device_domain_owner(dev);
> >> 
> >>       /* Call the reset function which does not take lock as this
> >>        * is called from "unbind" which takes a device_lock mutex.
> >>        */
> >> -     __pci_reset_function_locked(dev);
> >> +     __pcistub_reset_function(psdev->dev);
> >> +
> >>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
> >>               dev_dbg(&dev->dev, "Could not reload PCI state\n");
> >>       else
> >> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
> >>       /* This is OK - we are running from workqueue context
> >>        * and want to inhibit the user from fiddling with 'reset'
> >>        */
> >> -     pci_reset_function(dev);
> >> +     pcistub_reset_function(psdev->dev);
> >>       pci_restore_state(psdev->dev);
> >> 
> >>       /* This disables the device. */
> >> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
> >>               dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
> >>       else {
> >>               dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
> >> -             __pci_reset_function_locked(dev);
> >> +             __pcistub_reset_function(dev);
> >>               pci_restore_state(dev);
> >>       }
> >>       /* Now disable the device (this also ensures some private device
> >> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
> >>       if (!psdev)
> >>               return -ENOMEM;
> >> 
> >> +     err = pcistub_try_create_reset_file(psdev);
> >> +     if (err < 0)
> >> +             goto out;
> >> +
> >>       spin_lock_irqsave(&pcistub_devices_lock, flags);
> >> 
> >>       if (initialize_devices) {
> >> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
> >>       }
> >> 
> >>       spin_unlock_irqrestore(&pcistub_devices_lock, flags);
> >> -
> >> +out:
> >>       if (err)
> >>               pcistub_device_put(psdev);
> >> -
> >>       return err;
> >>  }
> >> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 15:36       ` [Xen-devel] " Konrad Rzeszutek Wilk
  2013-12-16 15:45         ` Sander Eikelenboom
@ 2013-12-16 15:45         ` Sander Eikelenboom
  2013-12-16 22:51         ` Sander Eikelenboom
  2013-12-16 22:51         ` Sander Eikelenboom
  3 siblings, 0 replies; 40+ messages in thread
From: Sander Eikelenboom @ 2013-12-16 15:45 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: David Vrabel, xen-devel, gordan, linux-kernel


Monday, December 16, 2013, 4:36:12 PM, you wrote:

> On Mon, Dec 16, 2013 at 04:23:53PM +0100, Sander Eikelenboom wrote:
>> 
>> Monday, December 16, 2013, 3:35:15 PM, you wrote:
>> 
>> > On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
>> >> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
>> >> > Hey,
>> >> > 
>> >> > While I was trying to narrow down the state of GPU passthrough
>> >> > (still not finished) and figuring what needs to be done I realized
>> >> > that Xen PCIback did not reset my GPU properly (when I crashed the
>> >> > Windows guest by mistake). It does an FLR reset or Power one - if
>> >> > the device supports it. But it seems that some of these GPUs
>> >> > are liars and actually don't do the power part properly.
>> >> 
>> >> In my experience the devices do not lie.  They correctly report that
>> >> they do not perform a reset in D3hot.
>> >> 
>> >> Here's the patch I'm using to solve this.  It does something similar.
>> >> i.e., a SBR if all devices on that bus are safe to be reset.
>> >> 
>> >> I prefer it because it provides the standard 'reset' sysfs file that the
>> >> toolstack/userspace can use.
>> 
>> > We can still add the 'reset' to SysFS
>> >> 
>> >> It does have some limitations:  a) It does not check whether a device is
>> >> in use (only if it is bound to pciback); and b) it hand rolls
>> >> pci_slot_reset() (because it didn't exist at the time).
>> 
>> > .. which can have those limiations removed and be based on this patchset.
>> > Meaning it won't do a bus-reset or device reset if the rest of the devices
>> > are _not_ assigned to pciback.
>> 
>> Perhaps there is something to learn from the steps vfio-pci takes to do this ?
>> (they sorted out quite some stuff around pci/vga passtrough)

> That is actually what I based it on :-)

OK was already suspecting it somehow :-)

Reminds me to see if the Radeon maintainer knows a way to hookup a sysfs reset entry for
Radeon cards, that completly cycles the card (including potential bios voodoo). Since AMD is supporting the development
of the opensource driver now, there should be chance and it would be welcome for both Xen and KVM
since those cards don't support FLR.

>> 
>> >> 
>> >> diff --git a/drivers/xen/xen-pciback/pci_stub.c
>> >> b/drivers/xen/xen-pciback/pci_stub.c
>> >> index 4e8ba38..5a03e63 100644
>> >> --- a/drivers/xen/xen-pciback/pci_stub.c
>> >> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> >> @@ -14,6 +14,7 @@
>> >>  #include <linux/wait.h>
>> >>  #include <linux/sched.h>
>> >>  #include <linux/atomic.h>
>> >> +#include <linux/delay.h>
>> >>  #include <xen/events.h>
>> >>  #include <asm/xen/pci.h>
>> >>  #include <asm/xen/hypervisor.h>
>> >> @@ -43,6 +44,7 @@ struct pcistub_device {
>> >>       struct kref kref;
>> >>       struct list_head dev_list;
>> >>       spinlock_t lock;
>> >> +     bool created_reset_file;
>> >> 
>> >>       struct pci_dev *dev;
>> >>       struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
>> >> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
>> >>  static int initialize_devices;
>> >>  static LIST_HEAD(seized_devices);
>> >> 
>> >> +/*
>> >> + * pci_reset_function() will only work if there is a mechanism to
>> >> + * reset that single function (e.g., FLR or a D-state transition).
>> >> + * For PCI hardware that has two or more functions but no per-function
>> >> + * reset, we can do a bus reset iff all the functions are co-assigned
>> >> + * to the same domain.
>> >> + *
>> >> + * If a function has no per-function reset mechanism the 'reset' sysfs
>> >> + * file that the toolstack uses to reset a function prior to assigning
>> >> + * the device will be missing.  In this case, pciback adds its own
>> >> + * which will try a bus reset.
>> >> + *
>> >> + * Note: pciback does not check for co-assigment before doing a bus
>> >> + * reset, only that the devices are bound to pciback.  The toolstack
>> >> + * is assumed to have done the right thing.
>> >> + */
>> >> +static int __pcistub_reset_function(struct pci_dev *dev)
>> >> +{
>> >> +     struct pci_dev *pdev;
>> >> +     u16 ctrl;
>> >> +     int ret;
>> >> +
>> >> +     ret = __pci_reset_function_locked(dev);
>> >> +     if (ret == 0)
>> >> +             return 0;
>> >> +
>> >> +     if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
>> >> +             return -ENOTTY;
>> >> +
>> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
>> >> +             if (pdev != dev && (!pdev->driver
>> >> +                                 || strcmp(pdev->driver->name, "pciback")))
>> >> +                     return -ENOTTY;
>> >> +             pci_save_state(pdev);
>> >> +     }
>> >> +
>> >> +     pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
>> >> +     ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
>> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> >> +     msleep(200);
>> >> +
>> >> +     ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
>> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> >> +     msleep(200);
>> >> +
>> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list)
>> >> +             pci_restore_state(pdev);
>> >> +
>> >> +     return 0;
>> >> +}
>> >> +
>> >> +static int pcistub_reset_function(struct pci_dev *dev)
>> >> +{
>> >> +     int ret;
>> >> +
>> >> +     device_lock(&dev->dev);
>> >> +     ret = __pcistub_reset_function(dev);
>> >> +     device_unlock(&dev->dev);
>> >> +
>> >> +     return ret;
>> >> +}
>> >> +
>> >> +static ssize_t pcistub_reset_store(struct device *dev,
>> >> +                                struct device_attribute *attr,
>> >> +                                const char *buf, size_t count)
>> >> +{
>> >> +     struct pci_dev *pdev = to_pci_dev(dev);
>> >> +     unsigned long val;
>> >> +     ssize_t result = strict_strtoul(buf, 0, &val);
>> >> +
>> >> +     if (result < 0)
>> >> +             return result;
>> >> +
>> >> +     if (val != 1)
>> >> +             return -EINVAL;
>> >> +
>> >> +     result = pcistub_reset_function(pdev);
>> >> +     if (result < 0)
>> >> +             return result;
>> >> +     return count;
>> >> +}
>> >> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
>> >> +
>> >> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
>> >> +{
>> >> +     struct device *dev = &psdev->dev->dev;
>> >> +     struct sysfs_dirent *reset_dirent;
>> >> +     int ret;
>> >> +
>> >> +     reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
>> >> +     if (reset_dirent) {
>> >> +             sysfs_put(reset_dirent);
>> >> +             return 0;
>> >> +     }
>> >> +
>> >> +     ret = device_create_file(dev, &dev_attr_reset);
>> >> +     if (ret < 0)
>> >> +             return ret;
>> >> +     psdev->created_reset_file = true;
>> >> +     return 0;
>> >> +}
>> >> +
>> >> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
>> >> +{
>> >> +     if (psdev && psdev->created_reset_file)
>> >> +             device_remove_file(&psdev->dev->dev, &dev_attr_reset);
>> >> +}
>> >> +
>> >>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
>> >>  {
>> >>       struct pcistub_device *psdev;
>> >> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
>> >> 
>> >>       dev_dbg(&dev->dev, "pcistub_device_release\n");
>> >> 
>> >> +     pcistub_remove_reset_file(psdev);
>> >> +
>> >>       xen_unregister_device_domain_owner(dev);
>> >> 
>> >>       /* Call the reset function which does not take lock as this
>> >>        * is called from "unbind" which takes a device_lock mutex.
>> >>        */
>> >> -     __pci_reset_function_locked(dev);
>> >> +     __pcistub_reset_function(psdev->dev);
>> >> +
>> >>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>> >>               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>> >>       else
>> >> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >>       /* This is OK - we are running from workqueue context
>> >>        * and want to inhibit the user from fiddling with 'reset'
>> >>        */
>> >> -     pci_reset_function(dev);
>> >> +     pcistub_reset_function(psdev->dev);
>> >>       pci_restore_state(psdev->dev);
>> >> 
>> >>       /* This disables the device. */
>> >> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
>> >>               dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
>> >>       else {
>> >>               dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>> >> -             __pci_reset_function_locked(dev);
>> >> +             __pcistub_reset_function(dev);
>> >>               pci_restore_state(dev);
>> >>       }
>> >>       /* Now disable the device (this also ensures some private device
>> >> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
>> >>       if (!psdev)
>> >>               return -ENOMEM;
>> >> 
>> >> +     err = pcistub_try_create_reset_file(psdev);
>> >> +     if (err < 0)
>> >> +             goto out;
>> >> +
>> >>       spin_lock_irqsave(&pcistub_devices_lock, flags);
>> >> 
>> >>       if (initialize_devices) {
>> >> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
>> >>       }
>> >> 
>> >>       spin_unlock_irqrestore(&pcistub_devices_lock, flags);
>> >> -
>> >> +out:
>> >>       if (err)
>> >>               pcistub_device_put(psdev);
>> >> -
>> >>       return err;
>> >>  }
>> >> 
>> 
>> 
>> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 15:36       ` [Xen-devel] " Konrad Rzeszutek Wilk
@ 2013-12-16 15:45         ` Sander Eikelenboom
  2013-12-16 15:45         ` [Xen-devel] " Sander Eikelenboom
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 40+ messages in thread
From: Sander Eikelenboom @ 2013-12-16 15:45 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, gordan, David Vrabel, linux-kernel


Monday, December 16, 2013, 4:36:12 PM, you wrote:

> On Mon, Dec 16, 2013 at 04:23:53PM +0100, Sander Eikelenboom wrote:
>> 
>> Monday, December 16, 2013, 3:35:15 PM, you wrote:
>> 
>> > On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
>> >> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
>> >> > Hey,
>> >> > 
>> >> > While I was trying to narrow down the state of GPU passthrough
>> >> > (still not finished) and figuring what needs to be done I realized
>> >> > that Xen PCIback did not reset my GPU properly (when I crashed the
>> >> > Windows guest by mistake). It does an FLR reset or Power one - if
>> >> > the device supports it. But it seems that some of these GPUs
>> >> > are liars and actually don't do the power part properly.
>> >> 
>> >> In my experience the devices do not lie.  They correctly report that
>> >> they do not perform a reset in D3hot.
>> >> 
>> >> Here's the patch I'm using to solve this.  It does something similar.
>> >> i.e., a SBR if all devices on that bus are safe to be reset.
>> >> 
>> >> I prefer it because it provides the standard 'reset' sysfs file that the
>> >> toolstack/userspace can use.
>> 
>> > We can still add the 'reset' to SysFS
>> >> 
>> >> It does have some limitations:  a) It does not check whether a device is
>> >> in use (only if it is bound to pciback); and b) it hand rolls
>> >> pci_slot_reset() (because it didn't exist at the time).
>> 
>> > .. which can have those limiations removed and be based on this patchset.
>> > Meaning it won't do a bus-reset or device reset if the rest of the devices
>> > are _not_ assigned to pciback.
>> 
>> Perhaps there is something to learn from the steps vfio-pci takes to do this ?
>> (they sorted out quite some stuff around pci/vga passtrough)

> That is actually what I based it on :-)

OK was already suspecting it somehow :-)

Reminds me to see if the Radeon maintainer knows a way to hookup a sysfs reset entry for
Radeon cards, that completly cycles the card (including potential bios voodoo). Since AMD is supporting the development
of the opensource driver now, there should be chance and it would be welcome for both Xen and KVM
since those cards don't support FLR.

>> 
>> >> 
>> >> diff --git a/drivers/xen/xen-pciback/pci_stub.c
>> >> b/drivers/xen/xen-pciback/pci_stub.c
>> >> index 4e8ba38..5a03e63 100644
>> >> --- a/drivers/xen/xen-pciback/pci_stub.c
>> >> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> >> @@ -14,6 +14,7 @@
>> >>  #include <linux/wait.h>
>> >>  #include <linux/sched.h>
>> >>  #include <linux/atomic.h>
>> >> +#include <linux/delay.h>
>> >>  #include <xen/events.h>
>> >>  #include <asm/xen/pci.h>
>> >>  #include <asm/xen/hypervisor.h>
>> >> @@ -43,6 +44,7 @@ struct pcistub_device {
>> >>       struct kref kref;
>> >>       struct list_head dev_list;
>> >>       spinlock_t lock;
>> >> +     bool created_reset_file;
>> >> 
>> >>       struct pci_dev *dev;
>> >>       struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
>> >> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
>> >>  static int initialize_devices;
>> >>  static LIST_HEAD(seized_devices);
>> >> 
>> >> +/*
>> >> + * pci_reset_function() will only work if there is a mechanism to
>> >> + * reset that single function (e.g., FLR or a D-state transition).
>> >> + * For PCI hardware that has two or more functions but no per-function
>> >> + * reset, we can do a bus reset iff all the functions are co-assigned
>> >> + * to the same domain.
>> >> + *
>> >> + * If a function has no per-function reset mechanism the 'reset' sysfs
>> >> + * file that the toolstack uses to reset a function prior to assigning
>> >> + * the device will be missing.  In this case, pciback adds its own
>> >> + * which will try a bus reset.
>> >> + *
>> >> + * Note: pciback does not check for co-assigment before doing a bus
>> >> + * reset, only that the devices are bound to pciback.  The toolstack
>> >> + * is assumed to have done the right thing.
>> >> + */
>> >> +static int __pcistub_reset_function(struct pci_dev *dev)
>> >> +{
>> >> +     struct pci_dev *pdev;
>> >> +     u16 ctrl;
>> >> +     int ret;
>> >> +
>> >> +     ret = __pci_reset_function_locked(dev);
>> >> +     if (ret == 0)
>> >> +             return 0;
>> >> +
>> >> +     if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
>> >> +             return -ENOTTY;
>> >> +
>> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
>> >> +             if (pdev != dev && (!pdev->driver
>> >> +                                 || strcmp(pdev->driver->name, "pciback")))
>> >> +                     return -ENOTTY;
>> >> +             pci_save_state(pdev);
>> >> +     }
>> >> +
>> >> +     pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
>> >> +     ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
>> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> >> +     msleep(200);
>> >> +
>> >> +     ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
>> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> >> +     msleep(200);
>> >> +
>> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list)
>> >> +             pci_restore_state(pdev);
>> >> +
>> >> +     return 0;
>> >> +}
>> >> +
>> >> +static int pcistub_reset_function(struct pci_dev *dev)
>> >> +{
>> >> +     int ret;
>> >> +
>> >> +     device_lock(&dev->dev);
>> >> +     ret = __pcistub_reset_function(dev);
>> >> +     device_unlock(&dev->dev);
>> >> +
>> >> +     return ret;
>> >> +}
>> >> +
>> >> +static ssize_t pcistub_reset_store(struct device *dev,
>> >> +                                struct device_attribute *attr,
>> >> +                                const char *buf, size_t count)
>> >> +{
>> >> +     struct pci_dev *pdev = to_pci_dev(dev);
>> >> +     unsigned long val;
>> >> +     ssize_t result = strict_strtoul(buf, 0, &val);
>> >> +
>> >> +     if (result < 0)
>> >> +             return result;
>> >> +
>> >> +     if (val != 1)
>> >> +             return -EINVAL;
>> >> +
>> >> +     result = pcistub_reset_function(pdev);
>> >> +     if (result < 0)
>> >> +             return result;
>> >> +     return count;
>> >> +}
>> >> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
>> >> +
>> >> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
>> >> +{
>> >> +     struct device *dev = &psdev->dev->dev;
>> >> +     struct sysfs_dirent *reset_dirent;
>> >> +     int ret;
>> >> +
>> >> +     reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
>> >> +     if (reset_dirent) {
>> >> +             sysfs_put(reset_dirent);
>> >> +             return 0;
>> >> +     }
>> >> +
>> >> +     ret = device_create_file(dev, &dev_attr_reset);
>> >> +     if (ret < 0)
>> >> +             return ret;
>> >> +     psdev->created_reset_file = true;
>> >> +     return 0;
>> >> +}
>> >> +
>> >> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
>> >> +{
>> >> +     if (psdev && psdev->created_reset_file)
>> >> +             device_remove_file(&psdev->dev->dev, &dev_attr_reset);
>> >> +}
>> >> +
>> >>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
>> >>  {
>> >>       struct pcistub_device *psdev;
>> >> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
>> >> 
>> >>       dev_dbg(&dev->dev, "pcistub_device_release\n");
>> >> 
>> >> +     pcistub_remove_reset_file(psdev);
>> >> +
>> >>       xen_unregister_device_domain_owner(dev);
>> >> 
>> >>       /* Call the reset function which does not take lock as this
>> >>        * is called from "unbind" which takes a device_lock mutex.
>> >>        */
>> >> -     __pci_reset_function_locked(dev);
>> >> +     __pcistub_reset_function(psdev->dev);
>> >> +
>> >>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>> >>               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>> >>       else
>> >> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >>       /* This is OK - we are running from workqueue context
>> >>        * and want to inhibit the user from fiddling with 'reset'
>> >>        */
>> >> -     pci_reset_function(dev);
>> >> +     pcistub_reset_function(psdev->dev);
>> >>       pci_restore_state(psdev->dev);
>> >> 
>> >>       /* This disables the device. */
>> >> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
>> >>               dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
>> >>       else {
>> >>               dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>> >> -             __pci_reset_function_locked(dev);
>> >> +             __pcistub_reset_function(dev);
>> >>               pci_restore_state(dev);
>> >>       }
>> >>       /* Now disable the device (this also ensures some private device
>> >> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
>> >>       if (!psdev)
>> >>               return -ENOMEM;
>> >> 
>> >> +     err = pcistub_try_create_reset_file(psdev);
>> >> +     if (err < 0)
>> >> +             goto out;
>> >> +
>> >>       spin_lock_irqsave(&pcistub_devices_lock, flags);
>> >> 
>> >>       if (initialize_devices) {
>> >> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
>> >>       }
>> >> 
>> >>       spin_unlock_irqrestore(&pcistub_devices_lock, flags);
>> >> -
>> >> +out:
>> >>       if (err)
>> >>               pcistub_device_put(psdev);
>> >> -
>> >>       return err;
>> >>  }
>> >> 
>> 
>> 
>> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [Xen-devel] [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 15:36       ` [Xen-devel] " Konrad Rzeszutek Wilk
  2013-12-16 15:45         ` Sander Eikelenboom
  2013-12-16 15:45         ` [Xen-devel] " Sander Eikelenboom
@ 2013-12-16 22:51         ` Sander Eikelenboom
  2013-12-16 22:51         ` Sander Eikelenboom
  3 siblings, 0 replies; 40+ messages in thread
From: Sander Eikelenboom @ 2013-12-16 22:51 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: David Vrabel, xen-devel, gordan, linux-kernel


Monday, December 16, 2013, 4:36:12 PM, you wrote:

> On Mon, Dec 16, 2013 at 04:23:53PM +0100, Sander Eikelenboom wrote:
>> 
>> Monday, December 16, 2013, 3:35:15 PM, you wrote:
>> 
>> > On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
>> >> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
>> >> > Hey,
>> >> > 
>> >> > While I was trying to narrow down the state of GPU passthrough
>> >> > (still not finished) and figuring what needs to be done I realized
>> >> > that Xen PCIback did not reset my GPU properly (when I crashed the
>> >> > Windows guest by mistake). It does an FLR reset or Power one - if
>> >> > the device supports it. But it seems that some of these GPUs
>> >> > are liars and actually don't do the power part properly.
>> >> 
>> >> In my experience the devices do not lie.  They correctly report that
>> >> they do not perform a reset in D3hot.
>> >> 
>> >> Here's the patch I'm using to solve this.  It does something similar.
>> >> i.e., a SBR if all devices on that bus are safe to be reset.
>> >> 
>> >> I prefer it because it provides the standard 'reset' sysfs file that the
>> >> toolstack/userspace can use.
>> 
>> > We can still add the 'reset' to SysFS
>> >> 
>> >> It does have some limitations:  a) It does not check whether a device is
>> >> in use (only if it is bound to pciback); and b) it hand rolls
>> >> pci_slot_reset() (because it didn't exist at the time).
>> 
>> > .. which can have those limiations removed and be based on this patchset.
>> > Meaning it won't do a bus-reset or device reset if the rest of the devices
>> > are _not_ assigned to pciback.
>> 
>> Perhaps there is something to learn from the steps vfio-pci takes to do this ?
>> (they sorted out quite some stuff around pci/vga passtrough)

> That is actually what I based it on :-)

Perhaps noteworthy then: [PATCH] pci: Add "try" reset interfaces
http://lkml.indiana.edu/hypermail/linux/kernel/1312.2/00577.html

>> 
>> >> 
>> >> diff --git a/drivers/xen/xen-pciback/pci_stub.c
>> >> b/drivers/xen/xen-pciback/pci_stub.c
>> >> index 4e8ba38..5a03e63 100644
>> >> --- a/drivers/xen/xen-pciback/pci_stub.c
>> >> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> >> @@ -14,6 +14,7 @@
>> >>  #include <linux/wait.h>
>> >>  #include <linux/sched.h>
>> >>  #include <linux/atomic.h>
>> >> +#include <linux/delay.h>
>> >>  #include <xen/events.h>
>> >>  #include <asm/xen/pci.h>
>> >>  #include <asm/xen/hypervisor.h>
>> >> @@ -43,6 +44,7 @@ struct pcistub_device {
>> >>       struct kref kref;
>> >>       struct list_head dev_list;
>> >>       spinlock_t lock;
>> >> +     bool created_reset_file;
>> >> 
>> >>       struct pci_dev *dev;
>> >>       struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
>> >> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
>> >>  static int initialize_devices;
>> >>  static LIST_HEAD(seized_devices);
>> >> 
>> >> +/*
>> >> + * pci_reset_function() will only work if there is a mechanism to
>> >> + * reset that single function (e.g., FLR or a D-state transition).
>> >> + * For PCI hardware that has two or more functions but no per-function
>> >> + * reset, we can do a bus reset iff all the functions are co-assigned
>> >> + * to the same domain.
>> >> + *
>> >> + * If a function has no per-function reset mechanism the 'reset' sysfs
>> >> + * file that the toolstack uses to reset a function prior to assigning
>> >> + * the device will be missing.  In this case, pciback adds its own
>> >> + * which will try a bus reset.
>> >> + *
>> >> + * Note: pciback does not check for co-assigment before doing a bus
>> >> + * reset, only that the devices are bound to pciback.  The toolstack
>> >> + * is assumed to have done the right thing.
>> >> + */
>> >> +static int __pcistub_reset_function(struct pci_dev *dev)
>> >> +{
>> >> +     struct pci_dev *pdev;
>> >> +     u16 ctrl;
>> >> +     int ret;
>> >> +
>> >> +     ret = __pci_reset_function_locked(dev);
>> >> +     if (ret == 0)
>> >> +             return 0;
>> >> +
>> >> +     if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
>> >> +             return -ENOTTY;
>> >> +
>> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
>> >> +             if (pdev != dev && (!pdev->driver
>> >> +                                 || strcmp(pdev->driver->name, "pciback")))
>> >> +                     return -ENOTTY;
>> >> +             pci_save_state(pdev);
>> >> +     }
>> >> +
>> >> +     pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
>> >> +     ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
>> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> >> +     msleep(200);
>> >> +
>> >> +     ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
>> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> >> +     msleep(200);
>> >> +
>> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list)
>> >> +             pci_restore_state(pdev);
>> >> +
>> >> +     return 0;
>> >> +}
>> >> +
>> >> +static int pcistub_reset_function(struct pci_dev *dev)
>> >> +{
>> >> +     int ret;
>> >> +
>> >> +     device_lock(&dev->dev);
>> >> +     ret = __pcistub_reset_function(dev);
>> >> +     device_unlock(&dev->dev);
>> >> +
>> >> +     return ret;
>> >> +}
>> >> +
>> >> +static ssize_t pcistub_reset_store(struct device *dev,
>> >> +                                struct device_attribute *attr,
>> >> +                                const char *buf, size_t count)
>> >> +{
>> >> +     struct pci_dev *pdev = to_pci_dev(dev);
>> >> +     unsigned long val;
>> >> +     ssize_t result = strict_strtoul(buf, 0, &val);
>> >> +
>> >> +     if (result < 0)
>> >> +             return result;
>> >> +
>> >> +     if (val != 1)
>> >> +             return -EINVAL;
>> >> +
>> >> +     result = pcistub_reset_function(pdev);
>> >> +     if (result < 0)
>> >> +             return result;
>> >> +     return count;
>> >> +}
>> >> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
>> >> +
>> >> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
>> >> +{
>> >> +     struct device *dev = &psdev->dev->dev;
>> >> +     struct sysfs_dirent *reset_dirent;
>> >> +     int ret;
>> >> +
>> >> +     reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
>> >> +     if (reset_dirent) {
>> >> +             sysfs_put(reset_dirent);
>> >> +             return 0;
>> >> +     }
>> >> +
>> >> +     ret = device_create_file(dev, &dev_attr_reset);
>> >> +     if (ret < 0)
>> >> +             return ret;
>> >> +     psdev->created_reset_file = true;
>> >> +     return 0;
>> >> +}
>> >> +
>> >> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
>> >> +{
>> >> +     if (psdev && psdev->created_reset_file)
>> >> +             device_remove_file(&psdev->dev->dev, &dev_attr_reset);
>> >> +}
>> >> +
>> >>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
>> >>  {
>> >>       struct pcistub_device *psdev;
>> >> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
>> >> 
>> >>       dev_dbg(&dev->dev, "pcistub_device_release\n");
>> >> 
>> >> +     pcistub_remove_reset_file(psdev);
>> >> +
>> >>       xen_unregister_device_domain_owner(dev);
>> >> 
>> >>       /* Call the reset function which does not take lock as this
>> >>        * is called from "unbind" which takes a device_lock mutex.
>> >>        */
>> >> -     __pci_reset_function_locked(dev);
>> >> +     __pcistub_reset_function(psdev->dev);
>> >> +
>> >>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>> >>               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>> >>       else
>> >> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >>       /* This is OK - we are running from workqueue context
>> >>        * and want to inhibit the user from fiddling with 'reset'
>> >>        */
>> >> -     pci_reset_function(dev);
>> >> +     pcistub_reset_function(psdev->dev);
>> >>       pci_restore_state(psdev->dev);
>> >> 
>> >>       /* This disables the device. */
>> >> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
>> >>               dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
>> >>       else {
>> >>               dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>> >> -             __pci_reset_function_locked(dev);
>> >> +             __pcistub_reset_function(dev);
>> >>               pci_restore_state(dev);
>> >>       }
>> >>       /* Now disable the device (this also ensures some private device
>> >> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
>> >>       if (!psdev)
>> >>               return -ENOMEM;
>> >> 
>> >> +     err = pcistub_try_create_reset_file(psdev);
>> >> +     if (err < 0)
>> >> +             goto out;
>> >> +
>> >>       spin_lock_irqsave(&pcistub_devices_lock, flags);
>> >> 
>> >>       if (initialize_devices) {
>> >> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
>> >>       }
>> >> 
>> >>       spin_unlock_irqrestore(&pcistub_devices_lock, flags);
>> >> -
>> >> +out:
>> >>       if (err)
>> >>               pcistub_device_put(psdev);
>> >> -
>> >>       return err;
>> >>  }
>> >> 
>> 
>> 
>> 



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
  2013-12-16 15:36       ` [Xen-devel] " Konrad Rzeszutek Wilk
                           ` (2 preceding siblings ...)
  2013-12-16 22:51         ` Sander Eikelenboom
@ 2013-12-16 22:51         ` Sander Eikelenboom
  3 siblings, 0 replies; 40+ messages in thread
From: Sander Eikelenboom @ 2013-12-16 22:51 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, gordan, David Vrabel, linux-kernel


Monday, December 16, 2013, 4:36:12 PM, you wrote:

> On Mon, Dec 16, 2013 at 04:23:53PM +0100, Sander Eikelenboom wrote:
>> 
>> Monday, December 16, 2013, 3:35:15 PM, you wrote:
>> 
>> > On Mon, Dec 16, 2013 at 10:59:01AM +0000, David Vrabel wrote:
>> >> On 13/12/13 16:09, Konrad Rzeszutek Wilk wrote:
>> >> > Hey,
>> >> > 
>> >> > While I was trying to narrow down the state of GPU passthrough
>> >> > (still not finished) and figuring what needs to be done I realized
>> >> > that Xen PCIback did not reset my GPU properly (when I crashed the
>> >> > Windows guest by mistake). It does an FLR reset or Power one - if
>> >> > the device supports it. But it seems that some of these GPUs
>> >> > are liars and actually don't do the power part properly.
>> >> 
>> >> In my experience the devices do not lie.  They correctly report that
>> >> they do not perform a reset in D3hot.
>> >> 
>> >> Here's the patch I'm using to solve this.  It does something similar.
>> >> i.e., a SBR if all devices on that bus are safe to be reset.
>> >> 
>> >> I prefer it because it provides the standard 'reset' sysfs file that the
>> >> toolstack/userspace can use.
>> 
>> > We can still add the 'reset' to SysFS
>> >> 
>> >> It does have some limitations:  a) It does not check whether a device is
>> >> in use (only if it is bound to pciback); and b) it hand rolls
>> >> pci_slot_reset() (because it didn't exist at the time).
>> 
>> > .. which can have those limiations removed and be based on this patchset.
>> > Meaning it won't do a bus-reset or device reset if the rest of the devices
>> > are _not_ assigned to pciback.
>> 
>> Perhaps there is something to learn from the steps vfio-pci takes to do this ?
>> (they sorted out quite some stuff around pci/vga passtrough)

> That is actually what I based it on :-)

Perhaps noteworthy then: [PATCH] pci: Add "try" reset interfaces
http://lkml.indiana.edu/hypermail/linux/kernel/1312.2/00577.html

>> 
>> >> 
>> >> diff --git a/drivers/xen/xen-pciback/pci_stub.c
>> >> b/drivers/xen/xen-pciback/pci_stub.c
>> >> index 4e8ba38..5a03e63 100644
>> >> --- a/drivers/xen/xen-pciback/pci_stub.c
>> >> +++ b/drivers/xen/xen-pciback/pci_stub.c
>> >> @@ -14,6 +14,7 @@
>> >>  #include <linux/wait.h>
>> >>  #include <linux/sched.h>
>> >>  #include <linux/atomic.h>
>> >> +#include <linux/delay.h>
>> >>  #include <xen/events.h>
>> >>  #include <asm/xen/pci.h>
>> >>  #include <asm/xen/hypervisor.h>
>> >> @@ -43,6 +44,7 @@ struct pcistub_device {
>> >>       struct kref kref;
>> >>       struct list_head dev_list;
>> >>       spinlock_t lock;
>> >> +     bool created_reset_file;
>> >> 
>> >>       struct pci_dev *dev;
>> >>       struct xen_pcibk_device *pdev;/* non-NULL if struct pci_dev is in use */
>> >> @@ -60,6 +62,114 @@ static LIST_HEAD(pcistub_devices);
>> >>  static int initialize_devices;
>> >>  static LIST_HEAD(seized_devices);
>> >> 
>> >> +/*
>> >> + * pci_reset_function() will only work if there is a mechanism to
>> >> + * reset that single function (e.g., FLR or a D-state transition).
>> >> + * For PCI hardware that has two or more functions but no per-function
>> >> + * reset, we can do a bus reset iff all the functions are co-assigned
>> >> + * to the same domain.
>> >> + *
>> >> + * If a function has no per-function reset mechanism the 'reset' sysfs
>> >> + * file that the toolstack uses to reset a function prior to assigning
>> >> + * the device will be missing.  In this case, pciback adds its own
>> >> + * which will try a bus reset.
>> >> + *
>> >> + * Note: pciback does not check for co-assigment before doing a bus
>> >> + * reset, only that the devices are bound to pciback.  The toolstack
>> >> + * is assumed to have done the right thing.
>> >> + */
>> >> +static int __pcistub_reset_function(struct pci_dev *dev)
>> >> +{
>> >> +     struct pci_dev *pdev;
>> >> +     u16 ctrl;
>> >> +     int ret;
>> >> +
>> >> +     ret = __pci_reset_function_locked(dev);
>> >> +     if (ret == 0)
>> >> +             return 0;
>> >> +
>> >> +     if (pci_is_root_bus(dev->bus) || dev->subordinate || !dev->bus->self)
>> >> +             return -ENOTTY;
>> >> +
>> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list) {
>> >> +             if (pdev != dev && (!pdev->driver
>> >> +                                 || strcmp(pdev->driver->name, "pciback")))
>> >> +                     return -ENOTTY;
>> >> +             pci_save_state(pdev);
>> >> +     }
>> >> +
>> >> +     pci_read_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, &ctrl);
>> >> +     ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
>> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> >> +     msleep(200);
>> >> +
>> >> +     ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
>> >> +     pci_write_config_word(dev->bus->self, PCI_BRIDGE_CONTROL, ctrl);
>> >> +     msleep(200);
>> >> +
>> >> +     list_for_each_entry(pdev, &dev->bus->devices, bus_list)
>> >> +             pci_restore_state(pdev);
>> >> +
>> >> +     return 0;
>> >> +}
>> >> +
>> >> +static int pcistub_reset_function(struct pci_dev *dev)
>> >> +{
>> >> +     int ret;
>> >> +
>> >> +     device_lock(&dev->dev);
>> >> +     ret = __pcistub_reset_function(dev);
>> >> +     device_unlock(&dev->dev);
>> >> +
>> >> +     return ret;
>> >> +}
>> >> +
>> >> +static ssize_t pcistub_reset_store(struct device *dev,
>> >> +                                struct device_attribute *attr,
>> >> +                                const char *buf, size_t count)
>> >> +{
>> >> +     struct pci_dev *pdev = to_pci_dev(dev);
>> >> +     unsigned long val;
>> >> +     ssize_t result = strict_strtoul(buf, 0, &val);
>> >> +
>> >> +     if (result < 0)
>> >> +             return result;
>> >> +
>> >> +     if (val != 1)
>> >> +             return -EINVAL;
>> >> +
>> >> +     result = pcistub_reset_function(pdev);
>> >> +     if (result < 0)
>> >> +             return result;
>> >> +     return count;
>> >> +}
>> >> +static DEVICE_ATTR(reset, 0200, NULL, pcistub_reset_store);
>> >> +
>> >> +static int pcistub_try_create_reset_file(struct pcistub_device *psdev)
>> >> +{
>> >> +     struct device *dev = &psdev->dev->dev;
>> >> +     struct sysfs_dirent *reset_dirent;
>> >> +     int ret;
>> >> +
>> >> +     reset_dirent = sysfs_get_dirent(dev->kobj.sd, NULL, "reset");
>> >> +     if (reset_dirent) {
>> >> +             sysfs_put(reset_dirent);
>> >> +             return 0;
>> >> +     }
>> >> +
>> >> +     ret = device_create_file(dev, &dev_attr_reset);
>> >> +     if (ret < 0)
>> >> +             return ret;
>> >> +     psdev->created_reset_file = true;
>> >> +     return 0;
>> >> +}
>> >> +
>> >> +static void pcistub_remove_reset_file(struct pcistub_device *psdev)
>> >> +{
>> >> +     if (psdev && psdev->created_reset_file)
>> >> +             device_remove_file(&psdev->dev->dev, &dev_attr_reset);
>> >> +}
>> >> +
>> >>  static struct pcistub_device *pcistub_device_alloc(struct pci_dev *dev)
>> >>  {
>> >>       struct pcistub_device *psdev;
>> >> @@ -95,12 +205,15 @@ static void pcistub_device_release(struct kref *kref)
>> >> 
>> >>       dev_dbg(&dev->dev, "pcistub_device_release\n");
>> >> 
>> >> +     pcistub_remove_reset_file(psdev);
>> >> +
>> >>       xen_unregister_device_domain_owner(dev);
>> >> 
>> >>       /* Call the reset function which does not take lock as this
>> >>        * is called from "unbind" which takes a device_lock mutex.
>> >>        */
>> >> -     __pci_reset_function_locked(dev);
>> >> +     __pcistub_reset_function(psdev->dev);
>> >> +
>> >>       if (pci_load_and_free_saved_state(dev, &dev_data->pci_saved_state))
>> >>               dev_dbg(&dev->dev, "Could not reload PCI state\n");
>> >>       else
>> >> @@ -268,7 +381,7 @@ void pcistub_put_pci_dev(struct pci_dev *dev)
>> >>       /* This is OK - we are running from workqueue context
>> >>        * and want to inhibit the user from fiddling with 'reset'
>> >>        */
>> >> -     pci_reset_function(dev);
>> >> +     pcistub_reset_function(psdev->dev);
>> >>       pci_restore_state(psdev->dev);
>> >> 
>> >>       /* This disables the device. */
>> >> @@ -392,7 +505,7 @@ static int pcistub_init_device(struct pci_dev *dev)
>> >>               dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
>> >>       else {
>> >>               dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
>> >> -             __pci_reset_function_locked(dev);
>> >> +             __pcistub_reset_function(dev);
>> >>               pci_restore_state(dev);
>> >>       }
>> >>       /* Now disable the device (this also ensures some private device
>> >> @@ -467,6 +580,10 @@ static int pcistub_seize(struct pci_dev *dev)
>> >>       if (!psdev)
>> >>               return -ENOMEM;
>> >> 
>> >> +     err = pcistub_try_create_reset_file(psdev);
>> >> +     if (err < 0)
>> >> +             goto out;
>> >> +
>> >>       spin_lock_irqsave(&pcistub_devices_lock, flags);
>> >> 
>> >>       if (initialize_devices) {
>> >> @@ -485,10 +602,9 @@ static int pcistub_seize(struct pci_dev *dev)
>> >>       }
>> >> 
>> >>       spin_unlock_irqrestore(&pcistub_devices_lock, flags);
>> >> -
>> >> +out:
>> >>       if (err)
>> >>               pcistub_device_put(psdev);
>> >> -
>> >>       return err;
>> >>  }
>> >> 
>> 
>> 
>> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
@ 2014-06-04 22:15 Ruediger Otte
  0 siblings, 0 replies; 40+ messages in thread
From: Ruediger Otte @ 2014-06-04 22:15 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel

Hi,

this patch actually solves the problem I experienced when trying to
passthrough a Radeon 7750 to a windows guest with a custom build of
Xen 4.4.0 on Debian Wheezy.

While I was always getting dom0 lockups at the first reboot of the
guest, I'm now able to do several reboots without sending the host to
standby in between. I haven't done extensive testing yet, but gpu
passthrough seems to just work now.

However I'm now seeing a different issue at dom0 boot when the
devices are assigned to xen-pciback (in kernel, no module). After the
message "xen_pciback: backend is passthrough" the host hangs for 2-3
seconds, then reboots. Strangely sometimes the host just boots ok,
but then again I get three or more failed boots in a row before it
finally works.

Before I applied your patch I have already followed every hint and
tried every available workaround with no success so far, so I would
be glad if this code could be further improved. If there's the need I
can of course provide debug output and configuration details from my
setup.

Kind regards,
Ruediger Otte

> Hey,
> 
> While I was trying to narrow down the state of GPU passthrough
> (still not finished) and figuring what needs to be done I realized
> that Xen PCIback did not reset my GPU properly (when I crashed the
> Windows guest by mistake). It does an FLR reset or Power one - if
> the device supports it. But it seems that some of these GPUs
> are liars and actually don't do the power part properly.
> 
> One way to fix this is by doing a slot (aka device) and bus reset.
> Of course to do that - you need to make sure that all of the
> functions of a device are under the ownership of xen-pciback.
> Otherwise you might accidently reset your sound card while it is
> being used.
> 
> These RFC patches cleanup pci back a bit and also make it possible
> for Xen pciback to do the whole gamma of 'reset' for PCI devices:
> FLR, power management, AER, slot and bus reset if neccessary.
> 
> Thanks go to Gordan Bobic for educating me on how to "reprogram"
> and GFX460 in a Quardro 4000M and also reporting oddities when
> a PCI device was reset but it looked like it was not reset.
> 
> 
>  drivers/xen/xen-pciback/pci_stub.c | 142
> +++++++++++++++++++++++++++++++------
> drivers/xen/xen-pciback/xenbus.c   |   5 +- 2 files changed, 124
> insertions(+), 23 deletions(-)
> 
> 
> Konrad Rzeszutek Wilk (5):
>       xen-pciback: Cleanup up pcistub_put_pci_dev
>       xen-pciback: First reset, then free.
>       xen-pciback: Document when we FLR an PCI device.
>       xen/pciback: Move the FLR code to a function.
>       xen/pciback: PCI reset slot or bus

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [RFC PATCH] Xen PCI back - do slot and bus reset (v0).
@ 2013-12-13 16:09 Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-12-13 16:09 UTC (permalink / raw)
  To: xen-devel, linux-kernel, gordan

Hey,

While I was trying to narrow down the state of GPU passthrough
(still not finished) and figuring what needs to be done I realized
that Xen PCIback did not reset my GPU properly (when I crashed the
Windows guest by mistake). It does an FLR reset or Power one - if
the device supports it. But it seems that some of these GPUs
are liars and actually don't do the power part properly.

One way to fix this is by doing a slot (aka device) and bus reset.
Of course to do that - you need to make sure that all of the
functions of a device are under the ownership of xen-pciback.
Otherwise you might accidently reset your sound card while it is
being used.

These RFC patches cleanup pci back a bit and also make it possible
for Xen pciback to do the whole gamma of 'reset' for PCI devices:
FLR, power management, AER, slot and bus reset if neccessary.

Thanks go to Gordan Bobic for educating me on how to "reprogram"
and GFX460 in a Quardro 4000M and also reporting oddities when
a PCI device was reset but it looked like it was not reset.


 drivers/xen/xen-pciback/pci_stub.c | 142 +++++++++++++++++++++++++++++++------
 drivers/xen/xen-pciback/xenbus.c   |   5 +-
 2 files changed, 124 insertions(+), 23 deletions(-)


Konrad Rzeszutek Wilk (5):
      xen-pciback: Cleanup up pcistub_put_pci_dev
      xen-pciback: First reset, then free.
      xen-pciback: Document when we FLR an PCI device.
      xen/pciback: Move the FLR code to a function.
      xen/pciback: PCI reset slot or bus

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2014-06-04 22:15 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-13 16:09 [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Konrad Rzeszutek Wilk
2013-12-13 16:09 ` [RFC PATCH 1/5] xen-pciback: Cleanup up pcistub_put_pci_dev Konrad Rzeszutek Wilk
2013-12-16  9:19   ` [Xen-devel] " Jan Beulich
2013-12-16  9:19   ` Jan Beulich
2013-12-13 16:09 ` Konrad Rzeszutek Wilk
2013-12-13 16:09 ` [RFC PATCH 2/5] xen-pciback: First reset, then free Konrad Rzeszutek Wilk
2013-12-13 16:09 ` Konrad Rzeszutek Wilk
2013-12-16  9:23   ` [Xen-devel] " Jan Beulich
2013-12-16  9:23   ` Jan Beulich
2013-12-13 16:09 ` [RFC PATCH 3/5] xen-pciback: Document when we FLR an PCI device Konrad Rzeszutek Wilk
2013-12-13 16:09 ` Konrad Rzeszutek Wilk
2013-12-16  9:27   ` Jan Beulich
2013-12-16  9:27   ` [Xen-devel] " Jan Beulich
2013-12-13 16:09 ` [RFC PATCH 4/5] xen/pciback: Move the FLR code to a function Konrad Rzeszutek Wilk
2013-12-13 16:09 ` Konrad Rzeszutek Wilk
2013-12-16  9:28   ` Jan Beulich
2013-12-16  9:28   ` [Xen-devel] " Jan Beulich
2013-12-13 16:09 ` [RFC PATCH 5/5] xen/pciback: PCI reset slot or bus Konrad Rzeszutek Wilk
2013-12-13 16:09 ` Konrad Rzeszutek Wilk
2013-12-13 16:18   ` Konrad Rzeszutek Wilk
2013-12-13 16:18   ` Konrad Rzeszutek Wilk
2013-12-16 11:34   ` [Xen-devel] " David Vrabel
2013-12-16 14:39     ` Konrad Rzeszutek Wilk
2013-12-16 14:39     ` [Xen-devel] " Konrad Rzeszutek Wilk
2013-12-16 11:34   ` David Vrabel
2013-12-13 16:52 ` [RFC PATCH] Xen PCI back - do slot and bus reset (v0) Gordan Bobic
2013-12-16 10:59 ` David Vrabel
2013-12-16 10:59 ` [Xen-devel] " David Vrabel
2013-12-16 14:35   ` Konrad Rzeszutek Wilk
2013-12-16 15:23     ` Sander Eikelenboom
2013-12-16 15:36       ` Konrad Rzeszutek Wilk
2013-12-16 15:36       ` [Xen-devel] " Konrad Rzeszutek Wilk
2013-12-16 15:45         ` Sander Eikelenboom
2013-12-16 15:45         ` [Xen-devel] " Sander Eikelenboom
2013-12-16 22:51         ` Sander Eikelenboom
2013-12-16 22:51         ` Sander Eikelenboom
2013-12-16 15:23     ` Sander Eikelenboom
2013-12-16 14:35   ` Konrad Rzeszutek Wilk
2013-12-13 16:09 Konrad Rzeszutek Wilk
2014-06-04 22:15 Ruediger Otte

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.