All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V4 0/9] rework on the IRQ hardening of virtio
@ 2022-05-07  7:19 ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

Hi All:

This is a rework on the IRQ hardening for virtio which is done
previously by the following commits are reverted:

9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
080cd7c3ac87 ("virtio-pci: harden INTX interrupts")

The reason is that it depends on the IRQF_NO_AUTOEN which may conflict
with the assumption of the affinity managed IRQ that is used by some
virtio drivers. And what's more, it is only done for virtio-pci but
not other transports.

In this rework, I try to implement a general virtio solution which
borrows the idea of the INTX hardening by re-using per virtqueue
boolean vq->broken and toggle it in virtio_device_ready() and
virtio_reset_device(). Then we can simply reuse the existing checks in
the vring_interrupt() and return early if the driver is not ready.

Note that, I only did compile test on ccw and MMIO transport.

Please review.

Changes since v1:

- Use transport specific irq synchronization method when possible
- Drop the module parameter and enable the hardening unconditonally
- Tweak the barrier/ordering facilities used in the code
- Reanme irq_soft_enabled to driver_ready
- Avoid unnecssary IRQ synchornization (e.g during boot)

Changes since V2:

- add ccw and MMIO support
- rename synchronize_vqs() to synchronize_cbs()
- switch to re-use vq->broken instead of introducing new device
  attributes for the future virtqueue reset support
- remove unnecssary READ_ONCE()/WRITE_ONCE()
- a new patch to remove device triggerable BUG_ON()
- more tweaks on the comments

Changes since V3:

- Rename synchornize_vqs() to synchronize_cbs()
- tweak the comment for synchronize_cbs()
- switch to use a dedicated helper __virtio_unbreak_device() and
  document it should be only used for probing
- switch to use rwlick to synchornize the non airq for ccw

Jason Wang (8):
  virtio: use virtio_reset_device() when possible
  virtio: introduce config op to synchronize vring callbacks
  virtio-pci: implement synchronize_cbs()
  virtio-mmio: implement synchronize_cbs()
  virtio-ccw: implement synchronize_cbs()
  virtio: allow to unbreak virtqueue
  virtio: harden vring IRQ
  virtio: use WARN_ON() to warning illegal status value

Stefano Garzarella (1):
  virtio: use virtio_device_ready() in virtio_device_restore()

 drivers/s390/virtio/virtio_ccw.c   | 27 +++++++++++++++++++++
 drivers/virtio/virtio.c            | 24 ++++++++++++------
 drivers/virtio/virtio_mmio.c       |  9 +++++++
 drivers/virtio/virtio_pci_legacy.c |  1 +
 drivers/virtio/virtio_pci_modern.c |  2 ++
 drivers/virtio/virtio_ring.c       | 32 +++++++++++++++++++++---
 include/linux/virtio.h             |  1 +
 include/linux/virtio_config.h      | 39 +++++++++++++++++++++++++++++-
 8 files changed, 123 insertions(+), 12 deletions(-)

-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH V4 0/9] rework on the IRQ hardening of virtio
@ 2022-05-07  7:19 ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

Hi All:

This is a rework on the IRQ hardening for virtio which is done
previously by the following commits are reverted:

9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
080cd7c3ac87 ("virtio-pci: harden INTX interrupts")

The reason is that it depends on the IRQF_NO_AUTOEN which may conflict
with the assumption of the affinity managed IRQ that is used by some
virtio drivers. And what's more, it is only done for virtio-pci but
not other transports.

In this rework, I try to implement a general virtio solution which
borrows the idea of the INTX hardening by re-using per virtqueue
boolean vq->broken and toggle it in virtio_device_ready() and
virtio_reset_device(). Then we can simply reuse the existing checks in
the vring_interrupt() and return early if the driver is not ready.

Note that, I only did compile test on ccw and MMIO transport.

Please review.

Changes since v1:

- Use transport specific irq synchronization method when possible
- Drop the module parameter and enable the hardening unconditonally
- Tweak the barrier/ordering facilities used in the code
- Reanme irq_soft_enabled to driver_ready
- Avoid unnecssary IRQ synchornization (e.g during boot)

Changes since V2:

- add ccw and MMIO support
- rename synchronize_vqs() to synchronize_cbs()
- switch to re-use vq->broken instead of introducing new device
  attributes for the future virtqueue reset support
- remove unnecssary READ_ONCE()/WRITE_ONCE()
- a new patch to remove device triggerable BUG_ON()
- more tweaks on the comments

Changes since V3:

- Rename synchornize_vqs() to synchronize_cbs()
- tweak the comment for synchronize_cbs()
- switch to use a dedicated helper __virtio_unbreak_device() and
  document it should be only used for probing
- switch to use rwlick to synchornize the non airq for ccw

Jason Wang (8):
  virtio: use virtio_reset_device() when possible
  virtio: introduce config op to synchronize vring callbacks
  virtio-pci: implement synchronize_cbs()
  virtio-mmio: implement synchronize_cbs()
  virtio-ccw: implement synchronize_cbs()
  virtio: allow to unbreak virtqueue
  virtio: harden vring IRQ
  virtio: use WARN_ON() to warning illegal status value

Stefano Garzarella (1):
  virtio: use virtio_device_ready() in virtio_device_restore()

 drivers/s390/virtio/virtio_ccw.c   | 27 +++++++++++++++++++++
 drivers/virtio/virtio.c            | 24 ++++++++++++------
 drivers/virtio/virtio_mmio.c       |  9 +++++++
 drivers/virtio/virtio_pci_legacy.c |  1 +
 drivers/virtio/virtio_pci_modern.c |  2 ++
 drivers/virtio/virtio_ring.c       | 32 +++++++++++++++++++++---
 include/linux/virtio.h             |  1 +
 include/linux/virtio_config.h      | 39 +++++++++++++++++++++++++++++-
 8 files changed, 123 insertions(+), 12 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 71+ messages in thread

* [PATCH V4 1/9] virtio: use virtio_device_ready() in virtio_device_restore()
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-07  7:19   ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

From: Stefano Garzarella <sgarzare@redhat.com>

It will allow us to do extension on virtio_device_ready() without
duplicating code.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 22f15f444f75..75c8d560bbd3 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -526,8 +526,9 @@ int virtio_device_restore(struct virtio_device *dev)
 			goto err;
 	}
 
-	/* Finally, tell the device we're all set */
-	virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
+	/* If restore didn't do it, mark device DRIVER_OK ourselves. */
+	if (!(dev->config->get_status(dev) & VIRTIO_CONFIG_S_DRIVER_OK))
+		virtio_device_ready(dev);
 
 	virtio_config_enable(dev);
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 1/9] virtio: use virtio_device_ready() in virtio_device_restore()
@ 2022-05-07  7:19   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

From: Stefano Garzarella <sgarzare@redhat.com>

It will allow us to do extension on virtio_device_ready() without
duplicating code.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 22f15f444f75..75c8d560bbd3 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -526,8 +526,9 @@ int virtio_device_restore(struct virtio_device *dev)
 			goto err;
 	}
 
-	/* Finally, tell the device we're all set */
-	virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
+	/* If restore didn't do it, mark device DRIVER_OK ourselves. */
+	if (!(dev->config->get_status(dev) & VIRTIO_CONFIG_S_DRIVER_OK))
+		virtio_device_ready(dev);
 
 	virtio_config_enable(dev);
 
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 2/9] virtio: use virtio_reset_device() when possible
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-07  7:19   ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

This allows us to do common extension without duplicating code.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 75c8d560bbd3..8dde44ea044a 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -430,7 +430,7 @@ int register_virtio_device(struct virtio_device *dev)
 
 	/* We always start by resetting the device, in case a previous
 	 * driver messed it up.  This also tests that code path a little. */
-	dev->config->reset(dev);
+	virtio_reset_device(dev);
 
 	/* Acknowledge that we've seen the device. */
 	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
@@ -496,7 +496,7 @@ int virtio_device_restore(struct virtio_device *dev)
 
 	/* We always start by resetting the device, in case a previous
 	 * driver messed it up. */
-	dev->config->reset(dev);
+	virtio_reset_device(dev);
 
 	/* Acknowledge that we've seen the device. */
 	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 2/9] virtio: use virtio_reset_device() when possible
@ 2022-05-07  7:19   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

This allows us to do common extension without duplicating code.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 75c8d560bbd3..8dde44ea044a 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -430,7 +430,7 @@ int register_virtio_device(struct virtio_device *dev)
 
 	/* We always start by resetting the device, in case a previous
 	 * driver messed it up.  This also tests that code path a little. */
-	dev->config->reset(dev);
+	virtio_reset_device(dev);
 
 	/* Acknowledge that we've seen the device. */
 	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
@@ -496,7 +496,7 @@ int virtio_device_restore(struct virtio_device *dev)
 
 	/* We always start by resetting the device, in case a previous
 	 * driver messed it up. */
-	dev->config->reset(dev);
+	virtio_reset_device(dev);
 
 	/* Acknowledge that we've seen the device. */
 	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 3/9] virtio: introduce config op to synchronize vring callbacks
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-07  7:19   ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

This patch introduces new virtio config op to vring
callbacks. Transport specific method is required to make sure the
write before this function is visible to the vring_interrupt() that is
called after the return of this function. For the transport that
doesn't provide synchronize_vqs(), use synchornize_rcu() which
synchronize with IRQ implicitly as a fallback.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index b341dd62aa4d..d8a2340f928e 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -57,6 +57,11 @@ struct virtio_shm_region {
  *		include a NULL entry for vqs unused by driver
  *	Returns 0 on success or error status
  * @del_vqs: free virtqueues found by find_vqs().
+ * @synchronize_cbs: synchronize with the virtqueue callbacks (optional)
+ *      The function guarantees that all memory operations on the
+ *      queue before it are visible to the vring_interrupt() that is
+ *      called after it.
+ *      vdev: the virtio_device
  * @get_features: get the array of feature bits for this device.
  *	vdev: the virtio_device
  *	Returns the first 64 feature bits (all we currently need).
@@ -89,6 +94,7 @@ struct virtio_config_ops {
 			const char * const names[], const bool *ctx,
 			struct irq_affinity *desc);
 	void (*del_vqs)(struct virtio_device *);
+	void (*synchronize_cbs)(struct virtio_device *);
 	u64 (*get_features)(struct virtio_device *vdev);
 	int (*finalize_features)(struct virtio_device *vdev);
 	const char *(*bus_name)(struct virtio_device *vdev);
@@ -217,6 +223,25 @@ int virtio_find_vqs_ctx(struct virtio_device *vdev, unsigned nvqs,
 				      desc);
 }
 
+/**
+ * virtio_synchronize_cbs - synchronize with virtqueue callbacks
+ * @vdev: the device
+ */
+static inline
+void virtio_synchronize_cbs(struct virtio_device *dev)
+{
+	if (dev->config->synchronize_cbs) {
+		dev->config->synchronize_cbs(dev);
+	} else {
+		/*
+		 * A best effort fallback to synchronize with
+		 * interrupts, preemption and softirq. See comment
+		 * above synchronize_rcu().
+		 */
+		synchronize_rcu();
+	}
+}
+
 /**
  * virtio_device_ready - enable vq use in probe function
  * @vdev: the device
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 3/9] virtio: introduce config op to synchronize vring callbacks
@ 2022-05-07  7:19   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

This patch introduces new virtio config op to vring
callbacks. Transport specific method is required to make sure the
write before this function is visible to the vring_interrupt() that is
called after the return of this function. For the transport that
doesn't provide synchronize_vqs(), use synchornize_rcu() which
synchronize with IRQ implicitly as a fallback.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index b341dd62aa4d..d8a2340f928e 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -57,6 +57,11 @@ struct virtio_shm_region {
  *		include a NULL entry for vqs unused by driver
  *	Returns 0 on success or error status
  * @del_vqs: free virtqueues found by find_vqs().
+ * @synchronize_cbs: synchronize with the virtqueue callbacks (optional)
+ *      The function guarantees that all memory operations on the
+ *      queue before it are visible to the vring_interrupt() that is
+ *      called after it.
+ *      vdev: the virtio_device
  * @get_features: get the array of feature bits for this device.
  *	vdev: the virtio_device
  *	Returns the first 64 feature bits (all we currently need).
@@ -89,6 +94,7 @@ struct virtio_config_ops {
 			const char * const names[], const bool *ctx,
 			struct irq_affinity *desc);
 	void (*del_vqs)(struct virtio_device *);
+	void (*synchronize_cbs)(struct virtio_device *);
 	u64 (*get_features)(struct virtio_device *vdev);
 	int (*finalize_features)(struct virtio_device *vdev);
 	const char *(*bus_name)(struct virtio_device *vdev);
@@ -217,6 +223,25 @@ int virtio_find_vqs_ctx(struct virtio_device *vdev, unsigned nvqs,
 				      desc);
 }
 
+/**
+ * virtio_synchronize_cbs - synchronize with virtqueue callbacks
+ * @vdev: the device
+ */
+static inline
+void virtio_synchronize_cbs(struct virtio_device *dev)
+{
+	if (dev->config->synchronize_cbs) {
+		dev->config->synchronize_cbs(dev);
+	} else {
+		/*
+		 * A best effort fallback to synchronize with
+		 * interrupts, preemption and softirq. See comment
+		 * above synchronize_rcu().
+		 */
+		synchronize_rcu();
+	}
+}
+
 /**
  * virtio_device_ready - enable vq use in probe function
  * @vdev: the device
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 4/9] virtio-pci: implement synchronize_cbs()
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-07  7:19   ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

We can simply reuse vp_synchronize_vectors() for .synchronize_cbs().

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio_pci_legacy.c | 1 +
 drivers/virtio/virtio_pci_modern.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index 6f4e34ce96b8..207985107150 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -192,6 +192,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
 	.reset		= vp_reset,
 	.find_vqs	= vp_find_vqs,
 	.del_vqs	= vp_del_vqs,
+	.synchronize_cbs = vp_synchronize_vectors,
 	.get_features	= vp_get_features,
 	.finalize_features = vp_finalize_features,
 	.bus_name	= vp_bus_name,
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index a2671a20ef77..18c2190e3059 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -394,6 +394,7 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = {
 	.reset		= vp_reset,
 	.find_vqs	= vp_modern_find_vqs,
 	.del_vqs	= vp_del_vqs,
+	.synchronize_cbs = vp_synchronize_vectors,
 	.get_features	= vp_get_features,
 	.finalize_features = vp_finalize_features,
 	.bus_name	= vp_bus_name,
@@ -411,6 +412,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
 	.reset		= vp_reset,
 	.find_vqs	= vp_modern_find_vqs,
 	.del_vqs	= vp_del_vqs,
+	.synchronize_cbs = vp_synchronize_vectors,
 	.get_features	= vp_get_features,
 	.finalize_features = vp_finalize_features,
 	.bus_name	= vp_bus_name,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 4/9] virtio-pci: implement synchronize_cbs()
@ 2022-05-07  7:19   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

We can simply reuse vp_synchronize_vectors() for .synchronize_cbs().

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio_pci_legacy.c | 1 +
 drivers/virtio/virtio_pci_modern.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index 6f4e34ce96b8..207985107150 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -192,6 +192,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
 	.reset		= vp_reset,
 	.find_vqs	= vp_find_vqs,
 	.del_vqs	= vp_del_vqs,
+	.synchronize_cbs = vp_synchronize_vectors,
 	.get_features	= vp_get_features,
 	.finalize_features = vp_finalize_features,
 	.bus_name	= vp_bus_name,
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index a2671a20ef77..18c2190e3059 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -394,6 +394,7 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = {
 	.reset		= vp_reset,
 	.find_vqs	= vp_modern_find_vqs,
 	.del_vqs	= vp_del_vqs,
+	.synchronize_cbs = vp_synchronize_vectors,
 	.get_features	= vp_get_features,
 	.finalize_features = vp_finalize_features,
 	.bus_name	= vp_bus_name,
@@ -411,6 +412,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
 	.reset		= vp_reset,
 	.find_vqs	= vp_modern_find_vqs,
 	.del_vqs	= vp_del_vqs,
+	.synchronize_cbs = vp_synchronize_vectors,
 	.get_features	= vp_get_features,
 	.finalize_features = vp_finalize_features,
 	.bus_name	= vp_bus_name,
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 5/9] virtio-mmio: implement synchronize_cbs()
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-07  7:19   ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

Simply synchronize the platform irq that is used by us.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio_mmio.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 56128b9c46eb..4a3b66e4e198 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -345,6 +345,14 @@ static void vm_del_vqs(struct virtio_device *vdev)
 	free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
 }
 
+
+static void vm_synchronize_cbs(struct virtio_device *vdev)
+{
+	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+
+	synchronize_irq(platform_get_irq(vm_dev->pdev, 0));
+}
+
 static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
 				  void (*callback)(struct virtqueue *vq),
 				  const char *name, bool ctx)
@@ -541,6 +549,7 @@ static const struct virtio_config_ops virtio_mmio_config_ops = {
 	.finalize_features = vm_finalize_features,
 	.bus_name	= vm_bus_name,
 	.get_shm_region = vm_get_shm_region,
+	.synchronize_cbs = vm_synchronize_cbs,
 };
 
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 5/9] virtio-mmio: implement synchronize_cbs()
@ 2022-05-07  7:19   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

Simply synchronize the platform irq that is used by us.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio_mmio.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 56128b9c46eb..4a3b66e4e198 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -345,6 +345,14 @@ static void vm_del_vqs(struct virtio_device *vdev)
 	free_irq(platform_get_irq(vm_dev->pdev, 0), vm_dev);
 }
 
+
+static void vm_synchronize_cbs(struct virtio_device *vdev)
+{
+	struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
+
+	synchronize_irq(platform_get_irq(vm_dev->pdev, 0));
+}
+
 static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
 				  void (*callback)(struct virtqueue *vq),
 				  const char *name, bool ctx)
@@ -541,6 +549,7 @@ static const struct virtio_config_ops virtio_mmio_config_ops = {
 	.finalize_features = vm_finalize_features,
 	.bus_name	= vm_bus_name,
 	.get_shm_region = vm_get_shm_region,
+	.synchronize_cbs = vm_synchronize_cbs,
 };
 
 
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-07  7:19   ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

This patch tries to implement the synchronize_cbs() for ccw. For the
vring_interrupt() that is called via virtio_airq_handler(), the
synchronization is simply done via the airq_info's lock. For the
vring_interrupt() that is called via virtio_ccw_int_handler(), a per
device spinlock for irq is introduced ans used in the synchronization
method.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/s390/virtio/virtio_ccw.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index d35e7a3f7067..001e1f0e6037 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -62,6 +62,7 @@ struct virtio_ccw_device {
 	unsigned int revision; /* Transport revision */
 	wait_queue_head_t wait_q;
 	spinlock_t lock;
+	rwlock_t irq_lock;
 	struct mutex io_lock; /* Serializes I/O requests */
 	struct list_head virtqueues;
 	bool is_thinint;
@@ -984,6 +985,27 @@ static const char *virtio_ccw_bus_name(struct virtio_device *vdev)
 	return dev_name(&vcdev->cdev->dev);
 }
 
+static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
+{
+	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
+	struct airq_info *info = vcdev->airq_info;
+
+	if (info) {
+		/*
+		 * Synchronize with the vring_interrupt() with airq indicator
+		 */
+		write_lock(&info->lock);
+		write_unlock(&info->lock);
+	} else {
+		/*
+		 * Synchronize with the vring_interrupt() called by
+		 * virtio_ccw_int_handler().
+		 */
+		write_lock(&vcdev->irq_lock);
+		write_unlock(&vcdev->irq_lock);
+	}
+}
+
 static const struct virtio_config_ops virtio_ccw_config_ops = {
 	.get_features = virtio_ccw_get_features,
 	.finalize_features = virtio_ccw_finalize_features,
@@ -995,6 +1017,7 @@ static const struct virtio_config_ops virtio_ccw_config_ops = {
 	.find_vqs = virtio_ccw_find_vqs,
 	.del_vqs = virtio_ccw_del_vqs,
 	.bus_name = virtio_ccw_bus_name,
+	.synchronize_cbs = virtio_ccw_synchronize_cbs,
 };
 
 
@@ -1079,6 +1102,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
 {
 	__u32 activity = intparm & VIRTIO_CCW_INTPARM_MASK;
 	struct virtio_ccw_device *vcdev = dev_get_drvdata(&cdev->dev);
+	unsigned long flags;
 	int i;
 	struct virtqueue *vq;
 
@@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
 			vcdev->err = -EIO;
 	}
 	virtio_ccw_check_activity(vcdev, activity);
+	read_lock_irqsave(&vcdev->irq_lock, flags);
 	for_each_set_bit(i, indicators(vcdev),
 			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
 		/* The bit clear must happen before the vring kick. */
@@ -1114,6 +1139,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
 		vq = virtio_ccw_vq_by_ind(vcdev, i);
 		vring_interrupt(0, vq);
 	}
+	read_unlock_irqrestore(&vcdev->irq_lock, flags);
 	if (test_bit(0, indicators2(vcdev))) {
 		virtio_config_changed(&vcdev->vdev);
 		clear_bit(0, indicators2(vcdev));
@@ -1284,6 +1310,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 	init_waitqueue_head(&vcdev->wait_q);
 	INIT_LIST_HEAD(&vcdev->virtqueues);
 	spin_lock_init(&vcdev->lock);
+	rwlock_init(&vcdev->irq_lock);
 	mutex_init(&vcdev->io_lock);
 
 	spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
@ 2022-05-07  7:19   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

This patch tries to implement the synchronize_cbs() for ccw. For the
vring_interrupt() that is called via virtio_airq_handler(), the
synchronization is simply done via the airq_info's lock. For the
vring_interrupt() that is called via virtio_ccw_int_handler(), a per
device spinlock for irq is introduced ans used in the synchronization
method.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/s390/virtio/virtio_ccw.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index d35e7a3f7067..001e1f0e6037 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -62,6 +62,7 @@ struct virtio_ccw_device {
 	unsigned int revision; /* Transport revision */
 	wait_queue_head_t wait_q;
 	spinlock_t lock;
+	rwlock_t irq_lock;
 	struct mutex io_lock; /* Serializes I/O requests */
 	struct list_head virtqueues;
 	bool is_thinint;
@@ -984,6 +985,27 @@ static const char *virtio_ccw_bus_name(struct virtio_device *vdev)
 	return dev_name(&vcdev->cdev->dev);
 }
 
+static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
+{
+	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
+	struct airq_info *info = vcdev->airq_info;
+
+	if (info) {
+		/*
+		 * Synchronize with the vring_interrupt() with airq indicator
+		 */
+		write_lock(&info->lock);
+		write_unlock(&info->lock);
+	} else {
+		/*
+		 * Synchronize with the vring_interrupt() called by
+		 * virtio_ccw_int_handler().
+		 */
+		write_lock(&vcdev->irq_lock);
+		write_unlock(&vcdev->irq_lock);
+	}
+}
+
 static const struct virtio_config_ops virtio_ccw_config_ops = {
 	.get_features = virtio_ccw_get_features,
 	.finalize_features = virtio_ccw_finalize_features,
@@ -995,6 +1017,7 @@ static const struct virtio_config_ops virtio_ccw_config_ops = {
 	.find_vqs = virtio_ccw_find_vqs,
 	.del_vqs = virtio_ccw_del_vqs,
 	.bus_name = virtio_ccw_bus_name,
+	.synchronize_cbs = virtio_ccw_synchronize_cbs,
 };
 
 
@@ -1079,6 +1102,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
 {
 	__u32 activity = intparm & VIRTIO_CCW_INTPARM_MASK;
 	struct virtio_ccw_device *vcdev = dev_get_drvdata(&cdev->dev);
+	unsigned long flags;
 	int i;
 	struct virtqueue *vq;
 
@@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
 			vcdev->err = -EIO;
 	}
 	virtio_ccw_check_activity(vcdev, activity);
+	read_lock_irqsave(&vcdev->irq_lock, flags);
 	for_each_set_bit(i, indicators(vcdev),
 			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
 		/* The bit clear must happen before the vring kick. */
@@ -1114,6 +1139,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
 		vq = virtio_ccw_vq_by_ind(vcdev, i);
 		vring_interrupt(0, vq);
 	}
+	read_unlock_irqrestore(&vcdev->irq_lock, flags);
 	if (test_bit(0, indicators2(vcdev))) {
 		virtio_config_changed(&vcdev->vdev);
 		clear_bit(0, indicators2(vcdev));
@@ -1284,6 +1310,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
 	init_waitqueue_head(&vcdev->wait_q);
 	INIT_LIST_HEAD(&vcdev->virtqueues);
 	spin_lock_init(&vcdev->lock);
+	rwlock_init(&vcdev->irq_lock);
 	mutex_init(&vcdev->io_lock);
 
 	spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 7/9] virtio: allow to unbreak virtqueue
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-07  7:19   ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

This patch allows the new introduced __virtio_break_device() to
unbreak the virtqueue.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio_ring.c | 21 +++++++++++++++++++++
 include/linux/virtio.h       |  1 +
 2 files changed, 22 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cfb028ca238e..5b7df7c455f0 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2397,6 +2397,27 @@ void virtio_break_device(struct virtio_device *dev)
 }
 EXPORT_SYMBOL_GPL(virtio_break_device);
 
+/*
+ * This should allow the device to be used by the driver. You may
+ * need to grab appropriate locks to flush. This should only be used
+ * in some specific case e.g (probing and restoring). Driver should
+ * not call this directly.
+ */
+void __virtio_unbreak_device(struct virtio_device *dev)
+{
+	struct virtqueue *_vq;
+
+	spin_lock(&dev->vqs_list_lock);
+	list_for_each_entry(_vq, &dev->vqs, list) {
+		struct vring_virtqueue *vq = to_vvq(_vq);
+
+		/* Pairs with READ_ONCE() in virtqueue_is_broken(). */
+		WRITE_ONCE(vq->broken, false);
+	}
+	spin_unlock(&dev->vqs_list_lock);
+}
+EXPORT_SYMBOL_GPL(__virtio_unbreak_device);
+
 dma_addr_t virtqueue_get_desc_addr(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 5464f398912a..d8fdf170637c 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -131,6 +131,7 @@ void unregister_virtio_device(struct virtio_device *dev);
 bool is_virtio_device(struct device *dev);
 
 void virtio_break_device(struct virtio_device *dev);
+void __virtio_unbreak_device(struct virtio_device *dev);
 
 void virtio_config_changed(struct virtio_device *dev);
 #ifdef CONFIG_PM_SLEEP
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 7/9] virtio: allow to unbreak virtqueue
@ 2022-05-07  7:19   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

This patch allows the new introduced __virtio_break_device() to
unbreak the virtqueue.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio_ring.c | 21 +++++++++++++++++++++
 include/linux/virtio.h       |  1 +
 2 files changed, 22 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cfb028ca238e..5b7df7c455f0 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2397,6 +2397,27 @@ void virtio_break_device(struct virtio_device *dev)
 }
 EXPORT_SYMBOL_GPL(virtio_break_device);
 
+/*
+ * This should allow the device to be used by the driver. You may
+ * need to grab appropriate locks to flush. This should only be used
+ * in some specific case e.g (probing and restoring). Driver should
+ * not call this directly.
+ */
+void __virtio_unbreak_device(struct virtio_device *dev)
+{
+	struct virtqueue *_vq;
+
+	spin_lock(&dev->vqs_list_lock);
+	list_for_each_entry(_vq, &dev->vqs, list) {
+		struct vring_virtqueue *vq = to_vvq(_vq);
+
+		/* Pairs with READ_ONCE() in virtqueue_is_broken(). */
+		WRITE_ONCE(vq->broken, false);
+	}
+	spin_unlock(&dev->vqs_list_lock);
+}
+EXPORT_SYMBOL_GPL(__virtio_unbreak_device);
+
 dma_addr_t virtqueue_get_desc_addr(struct virtqueue *_vq)
 {
 	struct vring_virtqueue *vq = to_vvq(_vq);
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 5464f398912a..d8fdf170637c 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -131,6 +131,7 @@ void unregister_virtio_device(struct virtio_device *dev);
 bool is_virtio_device(struct device *dev);
 
 void virtio_break_device(struct virtio_device *dev);
+void __virtio_unbreak_device(struct virtio_device *dev);
 
 void virtio_config_changed(struct virtio_device *dev);
 #ifdef CONFIG_PM_SLEEP
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 8/9] virtio: harden vring IRQ
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-07  7:19   ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

This is a rework on the previous IRQ hardening that is done for
virtio-pci where several drawbacks were found and were reverted:

1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
   that is used by some device such as virtio-blk
2) done only for PCI transport

The vq->broken is re-used in this patch for implementing the IRQ
hardening. The vq->broken is set to true during both initialization
and reset. And the vq->broken is set to false in
virtio_device_ready(). Then vring_interrupt can check and return when
vq->broken is true. And in this case, switch to return IRQ_NONE to let
the interrupt core aware of such invalid interrupt to prevent IRQ
storm.

The reason of using a per queue variable instead of a per device one
is that we may need it for per queue reset hardening in the future.

Note that the hardening is only done for vring interrupt since the
config interrupt hardening is already done in commit 22b7050a024d7
("virtio: defer config changed notifications"). But the method that is
used by config interrupt can't be reused by the vring interrupt
handler because it uses spinlock to do the synchronization which is
expensive.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio.c       | 15 ++++++++++++---
 drivers/virtio/virtio_ring.c  | 11 +++++++----
 include/linux/virtio_config.h | 12 ++++++++++++
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 8dde44ea044a..696f5ba4f38e 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -220,6 +220,15 @@ static int virtio_features_ok(struct virtio_device *dev)
  * */
 void virtio_reset_device(struct virtio_device *dev)
 {
+	/*
+	 * The below virtio_synchronize_cbs() guarantees that any
+	 * interrupt for this line arriving after
+	 * virtio_synchronize_vqs() has completed is guaranteed to see
+	 * driver_ready == false.
+	 */
+	virtio_break_device(dev);
+	virtio_synchronize_cbs(dev);
+
 	dev->config->reset(dev);
 }
 EXPORT_SYMBOL_GPL(virtio_reset_device);
@@ -428,6 +437,9 @@ int register_virtio_device(struct virtio_device *dev)
 	dev->config_enabled = false;
 	dev->config_change_pending = false;
 
+	INIT_LIST_HEAD(&dev->vqs);
+	spin_lock_init(&dev->vqs_list_lock);
+
 	/* We always start by resetting the device, in case a previous
 	 * driver messed it up.  This also tests that code path a little. */
 	virtio_reset_device(dev);
@@ -435,9 +447,6 @@ int register_virtio_device(struct virtio_device *dev)
 	/* Acknowledge that we've seen the device. */
 	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
 
-	INIT_LIST_HEAD(&dev->vqs);
-	spin_lock_init(&dev->vqs_list_lock);
-
 	/*
 	 * device_add() causes the bus infrastructure to look for a matching
 	 * driver.
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5b7df7c455f0..9dfad2890d7a 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1690,7 +1690,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	vq->we_own_ring = true;
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
-	vq->broken = false;
+	vq->broken = true;
 	vq->last_used_idx = 0;
 	vq->event_triggered = false;
 	vq->num_added = 0;
@@ -2136,8 +2136,11 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
 		return IRQ_NONE;
 	}
 
-	if (unlikely(vq->broken))
-		return IRQ_HANDLED;
+	if (unlikely(vq->broken)) {
+		dev_warn_once(&vq->vq.vdev->dev,
+			      "virtio vring IRQ raised before DRIVER_OK");
+		return IRQ_NONE;
+	}
 
 	/* Just a hint for performance: so it's ok that this can be racy! */
 	if (vq->event)
@@ -2179,7 +2182,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
 	vq->we_own_ring = false;
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
-	vq->broken = false;
+	vq->broken = true;
 	vq->last_used_idx = 0;
 	vq->event_triggered = false;
 	vq->num_added = 0;
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index d8a2340f928e..23f1694cdbd5 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
 	unsigned status = dev->config->get_status(dev);
 
 	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
+
+	/*
+	 * The virtio_synchronize_cbs() makes sure vring_interrupt()
+	 * will see the driver specific setup if it sees vq->broken
+	 * as false.
+	 */
+	virtio_synchronize_cbs(dev);
+	__virtio_unbreak_device(dev);
+	/*
+	 * The transport is expected ensure the visibility of
+	 * vq->broken before setting VIRTIO_CONFIG_S_DRIVER_OK.
+	 */
 	dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 8/9] virtio: harden vring IRQ
@ 2022-05-07  7:19   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

This is a rework on the previous IRQ hardening that is done for
virtio-pci where several drawbacks were found and were reverted:

1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
   that is used by some device such as virtio-blk
2) done only for PCI transport

The vq->broken is re-used in this patch for implementing the IRQ
hardening. The vq->broken is set to true during both initialization
and reset. And the vq->broken is set to false in
virtio_device_ready(). Then vring_interrupt can check and return when
vq->broken is true. And in this case, switch to return IRQ_NONE to let
the interrupt core aware of such invalid interrupt to prevent IRQ
storm.

The reason of using a per queue variable instead of a per device one
is that we may need it for per queue reset hardening in the future.

Note that the hardening is only done for vring interrupt since the
config interrupt hardening is already done in commit 22b7050a024d7
("virtio: defer config changed notifications"). But the method that is
used by config interrupt can't be reused by the vring interrupt
handler because it uses spinlock to do the synchronization which is
expensive.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/virtio/virtio.c       | 15 ++++++++++++---
 drivers/virtio/virtio_ring.c  | 11 +++++++----
 include/linux/virtio_config.h | 12 ++++++++++++
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 8dde44ea044a..696f5ba4f38e 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -220,6 +220,15 @@ static int virtio_features_ok(struct virtio_device *dev)
  * */
 void virtio_reset_device(struct virtio_device *dev)
 {
+	/*
+	 * The below virtio_synchronize_cbs() guarantees that any
+	 * interrupt for this line arriving after
+	 * virtio_synchronize_vqs() has completed is guaranteed to see
+	 * driver_ready == false.
+	 */
+	virtio_break_device(dev);
+	virtio_synchronize_cbs(dev);
+
 	dev->config->reset(dev);
 }
 EXPORT_SYMBOL_GPL(virtio_reset_device);
@@ -428,6 +437,9 @@ int register_virtio_device(struct virtio_device *dev)
 	dev->config_enabled = false;
 	dev->config_change_pending = false;
 
+	INIT_LIST_HEAD(&dev->vqs);
+	spin_lock_init(&dev->vqs_list_lock);
+
 	/* We always start by resetting the device, in case a previous
 	 * driver messed it up.  This also tests that code path a little. */
 	virtio_reset_device(dev);
@@ -435,9 +447,6 @@ int register_virtio_device(struct virtio_device *dev)
 	/* Acknowledge that we've seen the device. */
 	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
 
-	INIT_LIST_HEAD(&dev->vqs);
-	spin_lock_init(&dev->vqs_list_lock);
-
 	/*
 	 * device_add() causes the bus infrastructure to look for a matching
 	 * driver.
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5b7df7c455f0..9dfad2890d7a 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1690,7 +1690,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
 	vq->we_own_ring = true;
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
-	vq->broken = false;
+	vq->broken = true;
 	vq->last_used_idx = 0;
 	vq->event_triggered = false;
 	vq->num_added = 0;
@@ -2136,8 +2136,11 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
 		return IRQ_NONE;
 	}
 
-	if (unlikely(vq->broken))
-		return IRQ_HANDLED;
+	if (unlikely(vq->broken)) {
+		dev_warn_once(&vq->vq.vdev->dev,
+			      "virtio vring IRQ raised before DRIVER_OK");
+		return IRQ_NONE;
+	}
 
 	/* Just a hint for performance: so it's ok that this can be racy! */
 	if (vq->event)
@@ -2179,7 +2182,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
 	vq->we_own_ring = false;
 	vq->notify = notify;
 	vq->weak_barriers = weak_barriers;
-	vq->broken = false;
+	vq->broken = true;
 	vq->last_used_idx = 0;
 	vq->event_triggered = false;
 	vq->num_added = 0;
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index d8a2340f928e..23f1694cdbd5 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
 	unsigned status = dev->config->get_status(dev);
 
 	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
+
+	/*
+	 * The virtio_synchronize_cbs() makes sure vring_interrupt()
+	 * will see the driver specific setup if it sees vq->broken
+	 * as false.
+	 */
+	virtio_synchronize_cbs(dev);
+	__virtio_unbreak_device(dev);
+	/*
+	 * The transport is expected ensure the visibility of
+	 * vq->broken before setting VIRTIO_CONFIG_S_DRIVER_OK.
+	 */
 	dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
 }
 
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 9/9] virtio: use WARN_ON() to warning illegal status value
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-07  7:19   ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, cohuck, eperezma, lulu,
	sgarzare, xuanzhuo

We used to use BUG_ON() in virtio_device_ready() to detect illegal
status value, this seems sub-optimal since the value is under the
control of the device. Switch to use WARN_ON() instead.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/virtio_config.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 23f1694cdbd5..5d539f39f7ee 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -255,7 +255,7 @@ void virtio_device_ready(struct virtio_device *dev)
 {
 	unsigned status = dev->config->get_status(dev);
 
-	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
+	WARN_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
 
 	/*
 	 * The virtio_synchronize_cbs() makes sure vring_interrupt()
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 71+ messages in thread

* [PATCH V4 9/9] virtio: use WARN_ON() to warning illegal status value
@ 2022-05-07  7:19   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-07  7:19 UTC (permalink / raw)
  To: jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, cohuck, pasic, eperezma, tglx

We used to use BUG_ON() in virtio_device_ready() to detect illegal
status value, this seems sub-optimal since the value is under the
control of the device. Switch to use WARN_ON() instead.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 include/linux/virtio_config.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 23f1694cdbd5..5d539f39f7ee 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -255,7 +255,7 @@ void virtio_device_ready(struct virtio_device *dev)
 {
 	unsigned status = dev->config->get_status(dev);
 
-	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
+	WARN_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
 
 	/*
 	 * The virtio_synchronize_cbs() makes sure vring_interrupt()
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply related	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 1/9] virtio: use virtio_device_ready() in virtio_device_restore()
  2022-05-07  7:19   ` Jason Wang
@ 2022-05-09 15:22     ` Cornelia Huck
  -1 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-09 15:22 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, pasic, eperezma, tglx

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> From: Stefano Garzarella <sgarzare@redhat.com>
>
> It will allow us to do extension on virtio_device_ready() without
> duplicating code.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/virtio/virtio.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

I think you forgot my R-b on this and the following patch...

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 1/9] virtio: use virtio_device_ready() in virtio_device_restore()
@ 2022-05-09 15:22     ` Cornelia Huck
  0 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-09 15:22 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, eperezma, lulu, sgarzare, xuanzhuo

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> From: Stefano Garzarella <sgarzare@redhat.com>
>
> It will allow us to do extension on virtio_device_ready() without
> duplicating code.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/virtio/virtio.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

I think you forgot my R-b on this and the following patch...


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 3/9] virtio: introduce config op to synchronize vring callbacks
  2022-05-07  7:19   ` Jason Wang
@ 2022-05-09 15:24     ` Cornelia Huck
  -1 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-09 15:24 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, eperezma, lulu, sgarzare, xuanzhuo

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> This patch introduces new virtio config op to vring
> callbacks. Transport specific method is required to make sure the
> write before this function is visible to the vring_interrupt() that is
> called after the return of this function. For the transport that
> doesn't provide synchronize_vqs(), use synchornize_rcu() which
> synchronize with IRQ implicitly as a fallback.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 3/9] virtio: introduce config op to synchronize vring callbacks
@ 2022-05-09 15:24     ` Cornelia Huck
  0 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-09 15:24 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, pasic, eperezma, tglx

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> This patch introduces new virtio config op to vring
> callbacks. Transport specific method is required to make sure the
> write before this function is visible to the vring_interrupt() that is
> called after the return of this function. For the transport that
> doesn't provide synchronize_vqs(), use synchornize_rcu() which
> synchronize with IRQ implicitly as a fallback.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  include/linux/virtio_config.h | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 4/9] virtio-pci: implement synchronize_cbs()
  2022-05-07  7:19   ` Jason Wang
@ 2022-05-09 15:26     ` Cornelia Huck
  -1 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-09 15:26 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, eperezma, lulu, sgarzare, xuanzhuo

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> We can simply reuse vp_synchronize_vectors() for .synchronize_cbs().
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/virtio/virtio_pci_legacy.c | 1 +
>  drivers/virtio/virtio_pci_modern.c | 2 ++
>  2 files changed, 3 insertions(+)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 4/9] virtio-pci: implement synchronize_cbs()
@ 2022-05-09 15:26     ` Cornelia Huck
  0 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-09 15:26 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, pasic, eperezma, tglx

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> We can simply reuse vp_synchronize_vectors() for .synchronize_cbs().
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/virtio/virtio_pci_legacy.c | 1 +
>  drivers/virtio/virtio_pci_modern.c | 2 ++
>  2 files changed, 3 insertions(+)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 5/9] virtio-mmio: implement synchronize_cbs()
  2022-05-07  7:19   ` Jason Wang
@ 2022-05-09 15:34     ` Cornelia Huck
  -1 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-09 15:34 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, eperezma, lulu, sgarzare, xuanzhuo

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> Simply synchronize the platform irq that is used by us.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/virtio/virtio_mmio.c | 9 +++++++++
>  1 file changed, 9 insertions(+)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 5/9] virtio-mmio: implement synchronize_cbs()
@ 2022-05-09 15:34     ` Cornelia Huck
  0 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-09 15:34 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, pasic, eperezma, tglx

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> Simply synchronize the platform irq that is used by us.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/virtio/virtio_mmio.c | 9 +++++++++
>  1 file changed, 9 insertions(+)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 1/9] virtio: use virtio_device_ready() in virtio_device_restore()
  2022-05-09 15:22     ` Cornelia Huck
@ 2022-05-10  1:50       ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-10  1:50 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: mst, virtualization, linux-kernel, Thomas Gleixner,
	Peter Zijlstra, Paul E. McKenney, Marc Zyngier, Halil Pasic,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Mon, May 9, 2022 at 11:22 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > From: Stefano Garzarella <sgarzare@redhat.com>
> >
> > It will allow us to do extension on virtio_device_ready() without
> > duplicating code.
> >
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Halil Pasic <pasic@linux.ibm.com>
> > Cc: Cornelia Huck <cohuck@redhat.com>
> > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/virtio/virtio.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
>
> I think you forgot my R-b on this and the following patch...

Sorry, I will add them in the next version (or I will repost if
everyone thinks this version is fine).

Thanks

>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 1/9] virtio: use virtio_device_ready() in virtio_device_restore()
@ 2022-05-10  1:50       ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-10  1:50 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Cindy Lu, Paul E. McKenney, mst, Peter Zijlstra, Marc Zyngier,
	linux-kernel, virtualization, Halil Pasic, eperezma,
	Thomas Gleixner

On Mon, May 9, 2022 at 11:22 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > From: Stefano Garzarella <sgarzare@redhat.com>
> >
> > It will allow us to do extension on virtio_device_ready() without
> > duplicating code.
> >
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Halil Pasic <pasic@linux.ibm.com>
> > Cc: Cornelia Huck <cohuck@redhat.com>
> > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/virtio/virtio.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
>
> I think you forgot my R-b on this and the following patch...

Sorry, I will add them in the next version (or I will repost if
everyone thinks this version is fine).

Thanks

>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
  2022-05-07  7:19 ` Jason Wang
@ 2022-05-10  9:29   ` Cornelia Huck
  -1 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-10  9:29 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: tglx, peterz, paulmck, maz, pasic, eperezma, lulu, sgarzare, xuanzhuo

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> Hi All:
>
> This is a rework on the IRQ hardening for virtio which is done
> previously by the following commits are reverted:
>
> 9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
> 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
>
> The reason is that it depends on the IRQF_NO_AUTOEN which may conflict
> with the assumption of the affinity managed IRQ that is used by some
> virtio drivers. And what's more, it is only done for virtio-pci but
> not other transports.
>
> In this rework, I try to implement a general virtio solution which
> borrows the idea of the INTX hardening by re-using per virtqueue
> boolean vq->broken and toggle it in virtio_device_ready() and
> virtio_reset_device(). Then we can simply reuse the existing checks in
> the vring_interrupt() and return early if the driver is not ready.
>
> Note that, I only did compile test on ccw and MMIO transport.

Lockdep is unhappy with the ccw parts:

================================
WARNING: inconsistent lock state
5.18.0-rc6+ #191 Not tainted
--------------------------------
inconsistent {IN-HARDIRQ-R} -> {HARDIRQ-ON-W} usage.
kworker/u4:0/9 [HC0[0]:SC0[0]:HE1:SE1] takes:
00000000058e9618 (&vcdev->irq_lock){+-..}-{2:2}, at: virtio_ccw_synchronize_cbs+0x4e/0x60
{IN-HARDIRQ-R} state was registered at:
  __lock_acquire+0x442/0xc20
  lock_acquire.part.0+0xdc/0x228
  lock_acquire+0xa6/0x1b0
  _raw_read_lock_irqsave+0x72/0x100
  virtio_ccw_int_handler+0x84/0x238
  ccw_device_call_handler+0x72/0xd0
  ccw_device_irq+0x7a/0x198
  do_cio_interrupt+0x11c/0x1d0
  __handle_irq_event_percpu+0xc2/0x318
  handle_irq_event_percpu+0x26/0x68
  handle_percpu_irq+0x64/0x88
  generic_handle_irq+0x40/0x58
  do_irq_async+0x56/0xb0
  do_io_irq+0x82/0x160
  io_int_handler+0xe6/0x120
  rcu_read_lock_sched_held+0x3e/0xb0
  lock_acquired+0x12e/0x208
  new_inode+0x3e/0xd0
  debugfs_get_inode+0x22/0x68
  __debugfs_create_file+0x78/0x1c0
  debugfs_create_file_unsafe+0x36/0x58
  debugfs_create_u32+0x38/0x68
  sched_init_debug+0xb0/0x1c0
  do_one_initcall+0x108/0x280
  do_initcalls+0x124/0x148
  kernel_init_freeable+0x242/0x280
  kernel_init+0x2e/0x158
  __ret_from_fork+0x3c/0x50
  ret_from_fork+0xa/0x40
irq event stamp: 539789
hardirqs last  enabled at (539789): [<0000000000d9c632>] _raw_spin_unlock_irqrestore+0x72/0x88
hardirqs last disabled at (539788): [<0000000000d9c2b6>] _raw_spin_lock_irqsave+0x96/0xd0
softirqs last  enabled at (539568): [<0000000000d9e0d4>] __do_softirq+0x434/0x588
softirqs last disabled at (539503): [<000000000018cd66>] __irq_exit_rcu+0x146/0x170

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&vcdev->irq_lock);
  <Interrupt>
    lock(&vcdev->irq_lock);

 *** DEADLOCK ***

2 locks held by kworker/u4:0/9:
 #0: 000000000288d948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658
 #1: 000003800004bdc8 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658

stack backtrace:
CPU: 1 PID: 9 Comm: kworker/u4:0 Not tainted 5.18.0-rc6+ #191
Hardware name: QEMU 8561 QEMU (KVM/Linux)
Workqueue: events_unbound async_run_entry_fn
Call Trace:
 [<0000000000d8af22>] dump_stack_lvl+0x92/0xd0 
 [<00000000002032ac>] mark_lock_irq+0x864/0x968 
 [<0000000000203670>] mark_lock.part.0+0x2c0/0x790 
 [<0000000000203cea>] mark_usage+0x10a/0x178 
 [<000000000020692a>] __lock_acquire+0x442/0xc20 
 [<0000000000207cc4>] lock_acquire.part.0+0xdc/0x228 
 [<0000000000207eb6>] lock_acquire+0xa6/0x1b0 
 [<0000000000d9c774>] _raw_write_lock+0x54/0xa8 
 [<0000000000d5a1f6>] virtio_ccw_synchronize_cbs+0x4e/0x60 
 [<00000000008eec04>] register_virtio_device+0xdc/0x1b0 
 [<0000000000d5aabe>] virtio_ccw_online+0x246/0x2e8 
 [<0000000000c9fecc>] ccw_device_set_online+0x1c4/0x540 
 [<0000000000d5a05e>] virtio_ccw_auto_online+0x26/0x50 
 [<00000000001ba2b0>] async_run_entry_fn+0x40/0x108 
 [<00000000001ab9b4>] process_one_work+0x2a4/0x658 
 [<00000000001abdd0>] worker_thread+0x68/0x440 
 [<00000000001b4668>] kthread+0x128/0x130 
 [<0000000000102fac>] __ret_from_fork+0x3c/0x50 
 [<0000000000d9d3aa>] ret_from_fork+0xa/0x40 
INFO: lockdep is turned off.


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
@ 2022-05-10  9:29   ` Cornelia Huck
  0 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-10  9:29 UTC (permalink / raw)
  To: Jason Wang, jasowang, mst, virtualization, linux-kernel
  Cc: lulu, paulmck, peterz, maz, pasic, eperezma, tglx

On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:

> Hi All:
>
> This is a rework on the IRQ hardening for virtio which is done
> previously by the following commits are reverted:
>
> 9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
> 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
>
> The reason is that it depends on the IRQF_NO_AUTOEN which may conflict
> with the assumption of the affinity managed IRQ that is used by some
> virtio drivers. And what's more, it is only done for virtio-pci but
> not other transports.
>
> In this rework, I try to implement a general virtio solution which
> borrows the idea of the INTX hardening by re-using per virtqueue
> boolean vq->broken and toggle it in virtio_device_ready() and
> virtio_reset_device(). Then we can simply reuse the existing checks in
> the vring_interrupt() and return early if the driver is not ready.
>
> Note that, I only did compile test on ccw and MMIO transport.

Lockdep is unhappy with the ccw parts:

================================
WARNING: inconsistent lock state
5.18.0-rc6+ #191 Not tainted
--------------------------------
inconsistent {IN-HARDIRQ-R} -> {HARDIRQ-ON-W} usage.
kworker/u4:0/9 [HC0[0]:SC0[0]:HE1:SE1] takes:
00000000058e9618 (&vcdev->irq_lock){+-..}-{2:2}, at: virtio_ccw_synchronize_cbs+0x4e/0x60
{IN-HARDIRQ-R} state was registered at:
  __lock_acquire+0x442/0xc20
  lock_acquire.part.0+0xdc/0x228
  lock_acquire+0xa6/0x1b0
  _raw_read_lock_irqsave+0x72/0x100
  virtio_ccw_int_handler+0x84/0x238
  ccw_device_call_handler+0x72/0xd0
  ccw_device_irq+0x7a/0x198
  do_cio_interrupt+0x11c/0x1d0
  __handle_irq_event_percpu+0xc2/0x318
  handle_irq_event_percpu+0x26/0x68
  handle_percpu_irq+0x64/0x88
  generic_handle_irq+0x40/0x58
  do_irq_async+0x56/0xb0
  do_io_irq+0x82/0x160
  io_int_handler+0xe6/0x120
  rcu_read_lock_sched_held+0x3e/0xb0
  lock_acquired+0x12e/0x208
  new_inode+0x3e/0xd0
  debugfs_get_inode+0x22/0x68
  __debugfs_create_file+0x78/0x1c0
  debugfs_create_file_unsafe+0x36/0x58
  debugfs_create_u32+0x38/0x68
  sched_init_debug+0xb0/0x1c0
  do_one_initcall+0x108/0x280
  do_initcalls+0x124/0x148
  kernel_init_freeable+0x242/0x280
  kernel_init+0x2e/0x158
  __ret_from_fork+0x3c/0x50
  ret_from_fork+0xa/0x40
irq event stamp: 539789
hardirqs last  enabled at (539789): [<0000000000d9c632>] _raw_spin_unlock_irqrestore+0x72/0x88
hardirqs last disabled at (539788): [<0000000000d9c2b6>] _raw_spin_lock_irqsave+0x96/0xd0
softirqs last  enabled at (539568): [<0000000000d9e0d4>] __do_softirq+0x434/0x588
softirqs last disabled at (539503): [<000000000018cd66>] __irq_exit_rcu+0x146/0x170

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&vcdev->irq_lock);
  <Interrupt>
    lock(&vcdev->irq_lock);

 *** DEADLOCK ***

2 locks held by kworker/u4:0/9:
 #0: 000000000288d948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658
 #1: 000003800004bdc8 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658

stack backtrace:
CPU: 1 PID: 9 Comm: kworker/u4:0 Not tainted 5.18.0-rc6+ #191
Hardware name: QEMU 8561 QEMU (KVM/Linux)
Workqueue: events_unbound async_run_entry_fn
Call Trace:
 [<0000000000d8af22>] dump_stack_lvl+0x92/0xd0 
 [<00000000002032ac>] mark_lock_irq+0x864/0x968 
 [<0000000000203670>] mark_lock.part.0+0x2c0/0x790 
 [<0000000000203cea>] mark_usage+0x10a/0x178 
 [<000000000020692a>] __lock_acquire+0x442/0xc20 
 [<0000000000207cc4>] lock_acquire.part.0+0xdc/0x228 
 [<0000000000207eb6>] lock_acquire+0xa6/0x1b0 
 [<0000000000d9c774>] _raw_write_lock+0x54/0xa8 
 [<0000000000d5a1f6>] virtio_ccw_synchronize_cbs+0x4e/0x60 
 [<00000000008eec04>] register_virtio_device+0xdc/0x1b0 
 [<0000000000d5aabe>] virtio_ccw_online+0x246/0x2e8 
 [<0000000000c9fecc>] ccw_device_set_online+0x1c4/0x540 
 [<0000000000d5a05e>] virtio_ccw_auto_online+0x26/0x50 
 [<00000000001ba2b0>] async_run_entry_fn+0x40/0x108 
 [<00000000001ab9b4>] process_one_work+0x2a4/0x658 
 [<00000000001abdd0>] worker_thread+0x68/0x440 
 [<00000000001b4668>] kthread+0x128/0x130 
 [<0000000000102fac>] __ret_from_fork+0x3c/0x50 
 [<0000000000d9d3aa>] ret_from_fork+0xa/0x40 
INFO: lockdep is turned off.

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
  2022-05-07  7:19   ` Jason Wang
@ 2022-05-10 11:27     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 71+ messages in thread
From: Michael S. Tsirkin @ 2022-05-10 11:27 UTC (permalink / raw)
  To: Jason Wang
  Cc: virtualization, linux-kernel, tglx, peterz, paulmck, maz, pasic,
	cohuck, eperezma, lulu, sgarzare, xuanzhuo

On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> This patch tries to implement the synchronize_cbs() for ccw. For the
> vring_interrupt() that is called via virtio_airq_handler(), the
> synchronization is simply done via the airq_info's lock. For the
> vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> device spinlock for irq is introduced ans used in the synchronization
> method.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index d35e7a3f7067..001e1f0e6037 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -62,6 +62,7 @@ struct virtio_ccw_device {
>  	unsigned int revision; /* Transport revision */
>  	wait_queue_head_t wait_q;
>  	spinlock_t lock;
> +	rwlock_t irq_lock;
>  	struct mutex io_lock; /* Serializes I/O requests */
>  	struct list_head virtqueues;
>  	bool is_thinint;
> @@ -984,6 +985,27 @@ static const char *virtio_ccw_bus_name(struct virtio_device *vdev)
>  	return dev_name(&vcdev->cdev->dev);
>  }
>  
> +static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
> +{
> +	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
> +	struct airq_info *info = vcdev->airq_info;
> +
> +	if (info) {
> +		/*
> +		 * Synchronize with the vring_interrupt() with airq indicator
> +		 */
> +		write_lock(&info->lock);
> +		write_unlock(&info->lock);
> +	} else {
> +		/*
> +		 * Synchronize with the vring_interrupt() called by
> +		 * virtio_ccw_int_handler().
> +		 */
> +		write_lock(&vcdev->irq_lock);
> +		write_unlock(&vcdev->irq_lock);
> +	}
> +}
> +
>  static const struct virtio_config_ops virtio_ccw_config_ops = {
>  	.get_features = virtio_ccw_get_features,
>  	.finalize_features = virtio_ccw_finalize_features,
> @@ -995,6 +1017,7 @@ static const struct virtio_config_ops virtio_ccw_config_ops = {
>  	.find_vqs = virtio_ccw_find_vqs,
>  	.del_vqs = virtio_ccw_del_vqs,
>  	.bus_name = virtio_ccw_bus_name,
> +	.synchronize_cbs = virtio_ccw_synchronize_cbs,
>  };
>  
>  
> @@ -1079,6 +1102,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>  {
>  	__u32 activity = intparm & VIRTIO_CCW_INTPARM_MASK;
>  	struct virtio_ccw_device *vcdev = dev_get_drvdata(&cdev->dev);
> +	unsigned long flags;
>  	int i;
>  	struct virtqueue *vq;
>  
> @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>  			vcdev->err = -EIO;
>  	}
>  	virtio_ccw_check_activity(vcdev, activity);
> +	read_lock_irqsave(&vcdev->irq_lock, flags);
>  	for_each_set_bit(i, indicators(vcdev),
>  			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
>  		/* The bit clear must happen before the vring kick. */

Cornelia sent a lockdep trace on this.

Basically I think this gets the irqsave/restore logic wrong.
It attempts to disable irqs in the handler (which is an interrupt
anyway).
And it does not disable irqs in the synchronize_cbs.

As a result in interrupt might try to take a read lock while
.synchronize_cbs has the writer lock, resulting in a deadlock.

I think you want regular read_lock + write_lock_irq.


> @@ -1114,6 +1139,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>  		vq = virtio_ccw_vq_by_ind(vcdev, i);
>  		vring_interrupt(0, vq);
>  	}
> +	read_unlock_irqrestore(&vcdev->irq_lock, flags);
>  	if (test_bit(0, indicators2(vcdev))) {
>  		virtio_config_changed(&vcdev->vdev);
>  		clear_bit(0, indicators2(vcdev));
> @@ -1284,6 +1310,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  	init_waitqueue_head(&vcdev->wait_q);
>  	INIT_LIST_HEAD(&vcdev->virtqueues);
>  	spin_lock_init(&vcdev->lock);
> +	rwlock_init(&vcdev->irq_lock);
>  	mutex_init(&vcdev->io_lock);
>  
>  	spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
@ 2022-05-10 11:27     ` Michael S. Tsirkin
  0 siblings, 0 replies; 71+ messages in thread
From: Michael S. Tsirkin @ 2022-05-10 11:27 UTC (permalink / raw)
  To: Jason Wang
  Cc: lulu, paulmck, peterz, maz, cohuck, linux-kernel, virtualization,
	pasic, eperezma, tglx

On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> This patch tries to implement the synchronize_cbs() for ccw. For the
> vring_interrupt() that is called via virtio_airq_handler(), the
> synchronization is simply done via the airq_info's lock. For the
> vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> device spinlock for irq is introduced ans used in the synchronization
> method.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/s390/virtio/virtio_ccw.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index d35e7a3f7067..001e1f0e6037 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -62,6 +62,7 @@ struct virtio_ccw_device {
>  	unsigned int revision; /* Transport revision */
>  	wait_queue_head_t wait_q;
>  	spinlock_t lock;
> +	rwlock_t irq_lock;
>  	struct mutex io_lock; /* Serializes I/O requests */
>  	struct list_head virtqueues;
>  	bool is_thinint;
> @@ -984,6 +985,27 @@ static const char *virtio_ccw_bus_name(struct virtio_device *vdev)
>  	return dev_name(&vcdev->cdev->dev);
>  }
>  
> +static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
> +{
> +	struct virtio_ccw_device *vcdev = to_vc_device(vdev);
> +	struct airq_info *info = vcdev->airq_info;
> +
> +	if (info) {
> +		/*
> +		 * Synchronize with the vring_interrupt() with airq indicator
> +		 */
> +		write_lock(&info->lock);
> +		write_unlock(&info->lock);
> +	} else {
> +		/*
> +		 * Synchronize with the vring_interrupt() called by
> +		 * virtio_ccw_int_handler().
> +		 */
> +		write_lock(&vcdev->irq_lock);
> +		write_unlock(&vcdev->irq_lock);
> +	}
> +}
> +
>  static const struct virtio_config_ops virtio_ccw_config_ops = {
>  	.get_features = virtio_ccw_get_features,
>  	.finalize_features = virtio_ccw_finalize_features,
> @@ -995,6 +1017,7 @@ static const struct virtio_config_ops virtio_ccw_config_ops = {
>  	.find_vqs = virtio_ccw_find_vqs,
>  	.del_vqs = virtio_ccw_del_vqs,
>  	.bus_name = virtio_ccw_bus_name,
> +	.synchronize_cbs = virtio_ccw_synchronize_cbs,
>  };
>  
>  
> @@ -1079,6 +1102,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>  {
>  	__u32 activity = intparm & VIRTIO_CCW_INTPARM_MASK;
>  	struct virtio_ccw_device *vcdev = dev_get_drvdata(&cdev->dev);
> +	unsigned long flags;
>  	int i;
>  	struct virtqueue *vq;
>  
> @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>  			vcdev->err = -EIO;
>  	}
>  	virtio_ccw_check_activity(vcdev, activity);
> +	read_lock_irqsave(&vcdev->irq_lock, flags);
>  	for_each_set_bit(i, indicators(vcdev),
>  			 sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
>  		/* The bit clear must happen before the vring kick. */

Cornelia sent a lockdep trace on this.

Basically I think this gets the irqsave/restore logic wrong.
It attempts to disable irqs in the handler (which is an interrupt
anyway).
And it does not disable irqs in the synchronize_cbs.

As a result in interrupt might try to take a read lock while
.synchronize_cbs has the writer lock, resulting in a deadlock.

I think you want regular read_lock + write_lock_irq.


> @@ -1114,6 +1139,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>  		vq = virtio_ccw_vq_by_ind(vcdev, i);
>  		vring_interrupt(0, vq);
>  	}
> +	read_unlock_irqrestore(&vcdev->irq_lock, flags);
>  	if (test_bit(0, indicators2(vcdev))) {
>  		virtio_config_changed(&vcdev->vdev);
>  		clear_bit(0, indicators2(vcdev));
> @@ -1284,6 +1310,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
>  	init_waitqueue_head(&vcdev->wait_q);
>  	INIT_LIST_HEAD(&vcdev->virtqueues);
>  	spin_lock_init(&vcdev->lock);
> +	rwlock_init(&vcdev->irq_lock);
>  	mutex_init(&vcdev->io_lock);
>  
>  	spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
> -- 
> 2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
  2022-05-07  7:19   ` Jason Wang
@ 2022-05-10 11:32     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 71+ messages in thread
From: Michael S. Tsirkin @ 2022-05-10 11:32 UTC (permalink / raw)
  To: Jason Wang
  Cc: virtualization, linux-kernel, tglx, peterz, paulmck, maz, pasic,
	cohuck, eperezma, lulu, sgarzare, xuanzhuo

On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> This is a rework on the previous IRQ hardening that is done for
> virtio-pci where several drawbacks were found and were reverted:
> 
> 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
>    that is used by some device such as virtio-blk
> 2) done only for PCI transport
> 
> The vq->broken is re-used in this patch for implementing the IRQ
> hardening. The vq->broken is set to true during both initialization
> and reset. And the vq->broken is set to false in
> virtio_device_ready(). Then vring_interrupt can check and return when
> vq->broken is true. And in this case, switch to return IRQ_NONE to let
> the interrupt core aware of such invalid interrupt to prevent IRQ
> storm.
> 
> The reason of using a per queue variable instead of a per device one
> is that we may need it for per queue reset hardening in the future.
> 
> Note that the hardening is only done for vring interrupt since the
> config interrupt hardening is already done in commit 22b7050a024d7
> ("virtio: defer config changed notifications"). But the method that is
> used by config interrupt can't be reused by the vring interrupt
> handler because it uses spinlock to do the synchronization which is
> expensive.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/virtio/virtio.c       | 15 ++++++++++++---
>  drivers/virtio/virtio_ring.c  | 11 +++++++----
>  include/linux/virtio_config.h | 12 ++++++++++++
>  3 files changed, 31 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 8dde44ea044a..696f5ba4f38e 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -220,6 +220,15 @@ static int virtio_features_ok(struct virtio_device *dev)
>   * */
>  void virtio_reset_device(struct virtio_device *dev)
>  {
> +	/*
> +	 * The below virtio_synchronize_cbs() guarantees that any
> +	 * interrupt for this line arriving after
> +	 * virtio_synchronize_vqs() has completed is guaranteed to see
> +	 * driver_ready == false.
> +	 */
> +	virtio_break_device(dev);
> +	virtio_synchronize_cbs(dev);
> +
>  	dev->config->reset(dev);
>  }
>  EXPORT_SYMBOL_GPL(virtio_reset_device);
> @@ -428,6 +437,9 @@ int register_virtio_device(struct virtio_device *dev)
>  	dev->config_enabled = false;
>  	dev->config_change_pending = false;
>  
> +	INIT_LIST_HEAD(&dev->vqs);
> +	spin_lock_init(&dev->vqs_list_lock);
> +
>  	/* We always start by resetting the device, in case a previous
>  	 * driver messed it up.  This also tests that code path a little. */
>  	virtio_reset_device(dev);
> @@ -435,9 +447,6 @@ int register_virtio_device(struct virtio_device *dev)
>  	/* Acknowledge that we've seen the device. */
>  	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
>  
> -	INIT_LIST_HEAD(&dev->vqs);
> -	spin_lock_init(&dev->vqs_list_lock);
> -
>  	/*
>  	 * device_add() causes the bus infrastructure to look for a matching
>  	 * driver.
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 5b7df7c455f0..9dfad2890d7a 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -1690,7 +1690,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
>  	vq->we_own_ring = true;
>  	vq->notify = notify;
>  	vq->weak_barriers = weak_barriers;
> -	vq->broken = false;
> +	vq->broken = true;
>  	vq->last_used_idx = 0;
>  	vq->event_triggered = false;
>  	vq->num_added = 0;
> @@ -2136,8 +2136,11 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
>  		return IRQ_NONE;
>  	}
>  
> -	if (unlikely(vq->broken))
> -		return IRQ_HANDLED;
> +	if (unlikely(vq->broken)) {
> +		dev_warn_once(&vq->vq.vdev->dev,
> +			      "virtio vring IRQ raised before DRIVER_OK");
> +		return IRQ_NONE;
> +	}
>  
>  	/* Just a hint for performance: so it's ok that this can be racy! */
>  	if (vq->event)
> @@ -2179,7 +2182,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>  	vq->we_own_ring = false;
>  	vq->notify = notify;
>  	vq->weak_barriers = weak_barriers;
> -	vq->broken = false;
> +	vq->broken = true;
>  	vq->last_used_idx = 0;
>  	vq->event_triggered = false;
>  	vq->num_added = 0;
> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> index d8a2340f928e..23f1694cdbd5 100644
> --- a/include/linux/virtio_config.h
> +++ b/include/linux/virtio_config.h
> @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
>  	unsigned status = dev->config->get_status(dev);
>  
>  	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> +
> +	/*
> +	 * The virtio_synchronize_cbs() makes sure vring_interrupt()
> +	 * will see the driver specific setup if it sees vq->broken
> +	 * as false.
> +	 */
> +	virtio_synchronize_cbs(dev);

since you mention vq->broken above, maybe add
	"set vq->broken to false"

> +	__virtio_unbreak_device(dev);
> +	/*
> +	 * The transport is expected ensure the visibility of

to ensure

> +	 * vq->broken

let's add: "visibility by vq callbacks"

> before setting VIRTIO_CONFIG_S_DRIVER_OK.
> +	 */


Can I see some analysis of existing transports showing
this is actually the case for them?
And maybe add a comment near set_status to document the
requirement.

>  	dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
>  }
>  
> -- 
> 2.25.1


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
@ 2022-05-10 11:32     ` Michael S. Tsirkin
  0 siblings, 0 replies; 71+ messages in thread
From: Michael S. Tsirkin @ 2022-05-10 11:32 UTC (permalink / raw)
  To: Jason Wang
  Cc: lulu, paulmck, peterz, maz, cohuck, linux-kernel, virtualization,
	pasic, eperezma, tglx

On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> This is a rework on the previous IRQ hardening that is done for
> virtio-pci where several drawbacks were found and were reverted:
> 
> 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
>    that is used by some device such as virtio-blk
> 2) done only for PCI transport
> 
> The vq->broken is re-used in this patch for implementing the IRQ
> hardening. The vq->broken is set to true during both initialization
> and reset. And the vq->broken is set to false in
> virtio_device_ready(). Then vring_interrupt can check and return when
> vq->broken is true. And in this case, switch to return IRQ_NONE to let
> the interrupt core aware of such invalid interrupt to prevent IRQ
> storm.
> 
> The reason of using a per queue variable instead of a per device one
> is that we may need it for per queue reset hardening in the future.
> 
> Note that the hardening is only done for vring interrupt since the
> config interrupt hardening is already done in commit 22b7050a024d7
> ("virtio: defer config changed notifications"). But the method that is
> used by config interrupt can't be reused by the vring interrupt
> handler because it uses spinlock to do the synchronization which is
> expensive.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Halil Pasic <pasic@linux.ibm.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/virtio/virtio.c       | 15 ++++++++++++---
>  drivers/virtio/virtio_ring.c  | 11 +++++++----
>  include/linux/virtio_config.h | 12 ++++++++++++
>  3 files changed, 31 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> index 8dde44ea044a..696f5ba4f38e 100644
> --- a/drivers/virtio/virtio.c
> +++ b/drivers/virtio/virtio.c
> @@ -220,6 +220,15 @@ static int virtio_features_ok(struct virtio_device *dev)
>   * */
>  void virtio_reset_device(struct virtio_device *dev)
>  {
> +	/*
> +	 * The below virtio_synchronize_cbs() guarantees that any
> +	 * interrupt for this line arriving after
> +	 * virtio_synchronize_vqs() has completed is guaranteed to see
> +	 * driver_ready == false.
> +	 */
> +	virtio_break_device(dev);
> +	virtio_synchronize_cbs(dev);
> +
>  	dev->config->reset(dev);
>  }
>  EXPORT_SYMBOL_GPL(virtio_reset_device);
> @@ -428,6 +437,9 @@ int register_virtio_device(struct virtio_device *dev)
>  	dev->config_enabled = false;
>  	dev->config_change_pending = false;
>  
> +	INIT_LIST_HEAD(&dev->vqs);
> +	spin_lock_init(&dev->vqs_list_lock);
> +
>  	/* We always start by resetting the device, in case a previous
>  	 * driver messed it up.  This also tests that code path a little. */
>  	virtio_reset_device(dev);
> @@ -435,9 +447,6 @@ int register_virtio_device(struct virtio_device *dev)
>  	/* Acknowledge that we've seen the device. */
>  	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
>  
> -	INIT_LIST_HEAD(&dev->vqs);
> -	spin_lock_init(&dev->vqs_list_lock);
> -
>  	/*
>  	 * device_add() causes the bus infrastructure to look for a matching
>  	 * driver.
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 5b7df7c455f0..9dfad2890d7a 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -1690,7 +1690,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
>  	vq->we_own_ring = true;
>  	vq->notify = notify;
>  	vq->weak_barriers = weak_barriers;
> -	vq->broken = false;
> +	vq->broken = true;
>  	vq->last_used_idx = 0;
>  	vq->event_triggered = false;
>  	vq->num_added = 0;
> @@ -2136,8 +2136,11 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
>  		return IRQ_NONE;
>  	}
>  
> -	if (unlikely(vq->broken))
> -		return IRQ_HANDLED;
> +	if (unlikely(vq->broken)) {
> +		dev_warn_once(&vq->vq.vdev->dev,
> +			      "virtio vring IRQ raised before DRIVER_OK");
> +		return IRQ_NONE;
> +	}
>  
>  	/* Just a hint for performance: so it's ok that this can be racy! */
>  	if (vq->event)
> @@ -2179,7 +2182,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>  	vq->we_own_ring = false;
>  	vq->notify = notify;
>  	vq->weak_barriers = weak_barriers;
> -	vq->broken = false;
> +	vq->broken = true;
>  	vq->last_used_idx = 0;
>  	vq->event_triggered = false;
>  	vq->num_added = 0;
> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> index d8a2340f928e..23f1694cdbd5 100644
> --- a/include/linux/virtio_config.h
> +++ b/include/linux/virtio_config.h
> @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
>  	unsigned status = dev->config->get_status(dev);
>  
>  	BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> +
> +	/*
> +	 * The virtio_synchronize_cbs() makes sure vring_interrupt()
> +	 * will see the driver specific setup if it sees vq->broken
> +	 * as false.
> +	 */
> +	virtio_synchronize_cbs(dev);

since you mention vq->broken above, maybe add
	"set vq->broken to false"

> +	__virtio_unbreak_device(dev);
> +	/*
> +	 * The transport is expected ensure the visibility of

to ensure

> +	 * vq->broken

let's add: "visibility by vq callbacks"

> before setting VIRTIO_CONFIG_S_DRIVER_OK.
> +	 */


Can I see some analysis of existing transports showing
this is actually the case for them?
And maybe add a comment near set_status to document the
requirement.

>  	dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
>  }
>  
> -- 
> 2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
  2022-05-10  9:29   ` Cornelia Huck
@ 2022-05-11  2:22     ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  2:22 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: mst, virtualization, linux-kernel, Thomas Gleixner,
	Peter Zijlstra, Paul E. McKenney, Marc Zyngier, Halil Pasic,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Tue, May 10, 2022 at 5:29 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > Hi All:
> >
> > This is a rework on the IRQ hardening for virtio which is done
> > previously by the following commits are reverted:
> >
> > 9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
> > 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> >
> > The reason is that it depends on the IRQF_NO_AUTOEN which may conflict
> > with the assumption of the affinity managed IRQ that is used by some
> > virtio drivers. And what's more, it is only done for virtio-pci but
> > not other transports.
> >
> > In this rework, I try to implement a general virtio solution which
> > borrows the idea of the INTX hardening by re-using per virtqueue
> > boolean vq->broken and toggle it in virtio_device_ready() and
> > virtio_reset_device(). Then we can simply reuse the existing checks in
> > the vring_interrupt() and return early if the driver is not ready.
> >
> > Note that, I only did compile test on ccw and MMIO transport.
>
> Lockdep is unhappy with the ccw parts:
>
> ================================
> WARNING: inconsistent lock state
> 5.18.0-rc6+ #191 Not tainted
> --------------------------------
> inconsistent {IN-HARDIRQ-R} -> {HARDIRQ-ON-W} usage.
> kworker/u4:0/9 [HC0[0]:SC0[0]:HE1:SE1] takes:
> 00000000058e9618 (&vcdev->irq_lock){+-..}-{2:2}, at: virtio_ccw_synchronize_cbs+0x4e/0x60
> {IN-HARDIRQ-R} state was registered at:
>   __lock_acquire+0x442/0xc20
>   lock_acquire.part.0+0xdc/0x228
>   lock_acquire+0xa6/0x1b0
>   _raw_read_lock_irqsave+0x72/0x100
>   virtio_ccw_int_handler+0x84/0x238
>   ccw_device_call_handler+0x72/0xd0
>   ccw_device_irq+0x7a/0x198
>   do_cio_interrupt+0x11c/0x1d0
>   __handle_irq_event_percpu+0xc2/0x318
>   handle_irq_event_percpu+0x26/0x68
>   handle_percpu_irq+0x64/0x88
>   generic_handle_irq+0x40/0x58
>   do_irq_async+0x56/0xb0
>   do_io_irq+0x82/0x160
>   io_int_handler+0xe6/0x120
>   rcu_read_lock_sched_held+0x3e/0xb0
>   lock_acquired+0x12e/0x208
>   new_inode+0x3e/0xd0
>   debugfs_get_inode+0x22/0x68
>   __debugfs_create_file+0x78/0x1c0
>   debugfs_create_file_unsafe+0x36/0x58
>   debugfs_create_u32+0x38/0x68
>   sched_init_debug+0xb0/0x1c0
>   do_one_initcall+0x108/0x280
>   do_initcalls+0x124/0x148
>   kernel_init_freeable+0x242/0x280
>   kernel_init+0x2e/0x158
>   __ret_from_fork+0x3c/0x50
>   ret_from_fork+0xa/0x40
> irq event stamp: 539789
> hardirqs last  enabled at (539789): [<0000000000d9c632>] _raw_spin_unlock_irqrestore+0x72/0x88
> hardirqs last disabled at (539788): [<0000000000d9c2b6>] _raw_spin_lock_irqsave+0x96/0xd0
> softirqs last  enabled at (539568): [<0000000000d9e0d4>] __do_softirq+0x434/0x588
> softirqs last disabled at (539503): [<000000000018cd66>] __irq_exit_rcu+0x146/0x170
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&vcdev->irq_lock);
>   <Interrupt>
>     lock(&vcdev->irq_lock);
>
>  *** DEADLOCK ***

It looks to me we need to use write_lock_irq()/write_unlock_irq() to
do the synchronization.

And we probably need to keep the
read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
virtio_ccw_int_handler() to be called from process context (e.g from
the io_subchannel_quiesce()).

Thanks

>
> 2 locks held by kworker/u4:0/9:
>  #0: 000000000288d948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658
>  #1: 000003800004bdc8 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658
>
> stack backtrace:
> CPU: 1 PID: 9 Comm: kworker/u4:0 Not tainted 5.18.0-rc6+ #191
> Hardware name: QEMU 8561 QEMU (KVM/Linux)
> Workqueue: events_unbound async_run_entry_fn
> Call Trace:
>  [<0000000000d8af22>] dump_stack_lvl+0x92/0xd0
>  [<00000000002032ac>] mark_lock_irq+0x864/0x968
>  [<0000000000203670>] mark_lock.part.0+0x2c0/0x790
>  [<0000000000203cea>] mark_usage+0x10a/0x178
>  [<000000000020692a>] __lock_acquire+0x442/0xc20
>  [<0000000000207cc4>] lock_acquire.part.0+0xdc/0x228
>  [<0000000000207eb6>] lock_acquire+0xa6/0x1b0
>  [<0000000000d9c774>] _raw_write_lock+0x54/0xa8
>  [<0000000000d5a1f6>] virtio_ccw_synchronize_cbs+0x4e/0x60
>  [<00000000008eec04>] register_virtio_device+0xdc/0x1b0
>  [<0000000000d5aabe>] virtio_ccw_online+0x246/0x2e8
>  [<0000000000c9fecc>] ccw_device_set_online+0x1c4/0x540
>  [<0000000000d5a05e>] virtio_ccw_auto_online+0x26/0x50
>  [<00000000001ba2b0>] async_run_entry_fn+0x40/0x108
>  [<00000000001ab9b4>] process_one_work+0x2a4/0x658
>  [<00000000001abdd0>] worker_thread+0x68/0x440
>  [<00000000001b4668>] kthread+0x128/0x130
>  [<0000000000102fac>] __ret_from_fork+0x3c/0x50
>  [<0000000000d9d3aa>] ret_from_fork+0xa/0x40
> INFO: lockdep is turned off.
>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
@ 2022-05-11  2:22     ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  2:22 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Cindy Lu, Paul E. McKenney, mst, Peter Zijlstra, Marc Zyngier,
	linux-kernel, virtualization, Halil Pasic, eperezma,
	Thomas Gleixner

On Tue, May 10, 2022 at 5:29 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Sat, May 07 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > Hi All:
> >
> > This is a rework on the IRQ hardening for virtio which is done
> > previously by the following commits are reverted:
> >
> > 9e35276a5344 ("virtio_pci: harden MSI-X interrupts")
> > 080cd7c3ac87 ("virtio-pci: harden INTX interrupts")
> >
> > The reason is that it depends on the IRQF_NO_AUTOEN which may conflict
> > with the assumption of the affinity managed IRQ that is used by some
> > virtio drivers. And what's more, it is only done for virtio-pci but
> > not other transports.
> >
> > In this rework, I try to implement a general virtio solution which
> > borrows the idea of the INTX hardening by re-using per virtqueue
> > boolean vq->broken and toggle it in virtio_device_ready() and
> > virtio_reset_device(). Then we can simply reuse the existing checks in
> > the vring_interrupt() and return early if the driver is not ready.
> >
> > Note that, I only did compile test on ccw and MMIO transport.
>
> Lockdep is unhappy with the ccw parts:
>
> ================================
> WARNING: inconsistent lock state
> 5.18.0-rc6+ #191 Not tainted
> --------------------------------
> inconsistent {IN-HARDIRQ-R} -> {HARDIRQ-ON-W} usage.
> kworker/u4:0/9 [HC0[0]:SC0[0]:HE1:SE1] takes:
> 00000000058e9618 (&vcdev->irq_lock){+-..}-{2:2}, at: virtio_ccw_synchronize_cbs+0x4e/0x60
> {IN-HARDIRQ-R} state was registered at:
>   __lock_acquire+0x442/0xc20
>   lock_acquire.part.0+0xdc/0x228
>   lock_acquire+0xa6/0x1b0
>   _raw_read_lock_irqsave+0x72/0x100
>   virtio_ccw_int_handler+0x84/0x238
>   ccw_device_call_handler+0x72/0xd0
>   ccw_device_irq+0x7a/0x198
>   do_cio_interrupt+0x11c/0x1d0
>   __handle_irq_event_percpu+0xc2/0x318
>   handle_irq_event_percpu+0x26/0x68
>   handle_percpu_irq+0x64/0x88
>   generic_handle_irq+0x40/0x58
>   do_irq_async+0x56/0xb0
>   do_io_irq+0x82/0x160
>   io_int_handler+0xe6/0x120
>   rcu_read_lock_sched_held+0x3e/0xb0
>   lock_acquired+0x12e/0x208
>   new_inode+0x3e/0xd0
>   debugfs_get_inode+0x22/0x68
>   __debugfs_create_file+0x78/0x1c0
>   debugfs_create_file_unsafe+0x36/0x58
>   debugfs_create_u32+0x38/0x68
>   sched_init_debug+0xb0/0x1c0
>   do_one_initcall+0x108/0x280
>   do_initcalls+0x124/0x148
>   kernel_init_freeable+0x242/0x280
>   kernel_init+0x2e/0x158
>   __ret_from_fork+0x3c/0x50
>   ret_from_fork+0xa/0x40
> irq event stamp: 539789
> hardirqs last  enabled at (539789): [<0000000000d9c632>] _raw_spin_unlock_irqrestore+0x72/0x88
> hardirqs last disabled at (539788): [<0000000000d9c2b6>] _raw_spin_lock_irqsave+0x96/0xd0
> softirqs last  enabled at (539568): [<0000000000d9e0d4>] __do_softirq+0x434/0x588
> softirqs last disabled at (539503): [<000000000018cd66>] __irq_exit_rcu+0x146/0x170
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&vcdev->irq_lock);
>   <Interrupt>
>     lock(&vcdev->irq_lock);
>
>  *** DEADLOCK ***

It looks to me we need to use write_lock_irq()/write_unlock_irq() to
do the synchronization.

And we probably need to keep the
read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
virtio_ccw_int_handler() to be called from process context (e.g from
the io_subchannel_quiesce()).

Thanks

>
> 2 locks held by kworker/u4:0/9:
>  #0: 000000000288d948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658
>  #1: 000003800004bdc8 ((work_completion)(&entry->work)){+.+.}-{0:0}, at: process_one_work+0x1ea/0x658
>
> stack backtrace:
> CPU: 1 PID: 9 Comm: kworker/u4:0 Not tainted 5.18.0-rc6+ #191
> Hardware name: QEMU 8561 QEMU (KVM/Linux)
> Workqueue: events_unbound async_run_entry_fn
> Call Trace:
>  [<0000000000d8af22>] dump_stack_lvl+0x92/0xd0
>  [<00000000002032ac>] mark_lock_irq+0x864/0x968
>  [<0000000000203670>] mark_lock.part.0+0x2c0/0x790
>  [<0000000000203cea>] mark_usage+0x10a/0x178
>  [<000000000020692a>] __lock_acquire+0x442/0xc20
>  [<0000000000207cc4>] lock_acquire.part.0+0xdc/0x228
>  [<0000000000207eb6>] lock_acquire+0xa6/0x1b0
>  [<0000000000d9c774>] _raw_write_lock+0x54/0xa8
>  [<0000000000d5a1f6>] virtio_ccw_synchronize_cbs+0x4e/0x60
>  [<00000000008eec04>] register_virtio_device+0xdc/0x1b0
>  [<0000000000d5aabe>] virtio_ccw_online+0x246/0x2e8
>  [<0000000000c9fecc>] ccw_device_set_online+0x1c4/0x540
>  [<0000000000d5a05e>] virtio_ccw_auto_online+0x26/0x50
>  [<00000000001ba2b0>] async_run_entry_fn+0x40/0x108
>  [<00000000001ab9b4>] process_one_work+0x2a4/0x658
>  [<00000000001abdd0>] worker_thread+0x68/0x440
>  [<00000000001b4668>] kthread+0x128/0x130
>  [<0000000000102fac>] __ret_from_fork+0x3c/0x50
>  [<0000000000d9d3aa>] ret_from_fork+0xa/0x40
> INFO: lockdep is turned off.
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
  2022-05-10 11:32     ` Michael S. Tsirkin
@ 2022-05-11  2:40       ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  2:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cindy Lu, Paul E. McKenney, Peter Zijlstra, Marc Zyngier,
	Cornelia Huck, linux-kernel, virtualization, Halil Pasic,
	eperezma, Thomas Gleixner

On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> > This is a rework on the previous IRQ hardening that is done for
> > virtio-pci where several drawbacks were found and were reverted:
> >
> > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> >    that is used by some device such as virtio-blk
> > 2) done only for PCI transport
> >
> > The vq->broken is re-used in this patch for implementing the IRQ
> > hardening. The vq->broken is set to true during both initialization
> > and reset. And the vq->broken is set to false in
> > virtio_device_ready(). Then vring_interrupt can check and return when
> > vq->broken is true. And in this case, switch to return IRQ_NONE to let
> > the interrupt core aware of such invalid interrupt to prevent IRQ
> > storm.
> >
> > The reason of using a per queue variable instead of a per device one
> > is that we may need it for per queue reset hardening in the future.
> >
> > Note that the hardening is only done for vring interrupt since the
> > config interrupt hardening is already done in commit 22b7050a024d7
> > ("virtio: defer config changed notifications"). But the method that is
> > used by config interrupt can't be reused by the vring interrupt
> > handler because it uses spinlock to do the synchronization which is
> > expensive.
> >
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Halil Pasic <pasic@linux.ibm.com>
> > Cc: Cornelia Huck <cohuck@redhat.com>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/virtio/virtio.c       | 15 ++++++++++++---
> >  drivers/virtio/virtio_ring.c  | 11 +++++++----
> >  include/linux/virtio_config.h | 12 ++++++++++++
> >  3 files changed, 31 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > index 8dde44ea044a..696f5ba4f38e 100644
> > --- a/drivers/virtio/virtio.c
> > +++ b/drivers/virtio/virtio.c
> > @@ -220,6 +220,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> >   * */
> >  void virtio_reset_device(struct virtio_device *dev)
> >  {
> > +     /*
> > +      * The below virtio_synchronize_cbs() guarantees that any
> > +      * interrupt for this line arriving after
> > +      * virtio_synchronize_vqs() has completed is guaranteed to see
> > +      * driver_ready == false.
> > +      */
> > +     virtio_break_device(dev);
> > +     virtio_synchronize_cbs(dev);
> > +
> >       dev->config->reset(dev);
> >  }
> >  EXPORT_SYMBOL_GPL(virtio_reset_device);
> > @@ -428,6 +437,9 @@ int register_virtio_device(struct virtio_device *dev)
> >       dev->config_enabled = false;
> >       dev->config_change_pending = false;
> >
> > +     INIT_LIST_HEAD(&dev->vqs);
> > +     spin_lock_init(&dev->vqs_list_lock);
> > +
> >       /* We always start by resetting the device, in case a previous
> >        * driver messed it up.  This also tests that code path a little. */
> >       virtio_reset_device(dev);
> > @@ -435,9 +447,6 @@ int register_virtio_device(struct virtio_device *dev)
> >       /* Acknowledge that we've seen the device. */
> >       virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
> >
> > -     INIT_LIST_HEAD(&dev->vqs);
> > -     spin_lock_init(&dev->vqs_list_lock);
> > -
> >       /*
> >        * device_add() causes the bus infrastructure to look for a matching
> >        * driver.
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 5b7df7c455f0..9dfad2890d7a 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -1690,7 +1690,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
> >       vq->we_own_ring = true;
> >       vq->notify = notify;
> >       vq->weak_barriers = weak_barriers;
> > -     vq->broken = false;
> > +     vq->broken = true;
> >       vq->last_used_idx = 0;
> >       vq->event_triggered = false;
> >       vq->num_added = 0;
> > @@ -2136,8 +2136,11 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
> >               return IRQ_NONE;
> >       }
> >
> > -     if (unlikely(vq->broken))
> > -             return IRQ_HANDLED;
> > +     if (unlikely(vq->broken)) {
> > +             dev_warn_once(&vq->vq.vdev->dev,
> > +                           "virtio vring IRQ raised before DRIVER_OK");
> > +             return IRQ_NONE;
> > +     }
> >
> >       /* Just a hint for performance: so it's ok that this can be racy! */
> >       if (vq->event)
> > @@ -2179,7 +2182,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> >       vq->we_own_ring = false;
> >       vq->notify = notify;
> >       vq->weak_barriers = weak_barriers;
> > -     vq->broken = false;
> > +     vq->broken = true;
> >       vq->last_used_idx = 0;
> >       vq->event_triggered = false;
> >       vq->num_added = 0;
> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > index d8a2340f928e..23f1694cdbd5 100644
> > --- a/include/linux/virtio_config.h
> > +++ b/include/linux/virtio_config.h
> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
> >       unsigned status = dev->config->get_status(dev);
> >
> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > +
> > +     /*
> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
> > +      * will see the driver specific setup if it sees vq->broken
> > +      * as false.
> > +      */
> > +     virtio_synchronize_cbs(dev);
>
> since you mention vq->broken above, maybe add
>         "set vq->broken to false"

Ok.

>
> > +     __virtio_unbreak_device(dev);
> > +     /*
> > +      * The transport is expected ensure the visibility of
>
> to ensure

Will fix.

>
> > +      * vq->broken
>
> let's add: "visibility by vq callbacks"

Sure.

>
> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
> > +      */
>
>
> Can I see some analysis of existing transports showing
> this is actually the case for them?

Yes.

> And maybe add a comment near set_status to document the
> requirement.

For PCI and MMIO, we can quote the memory-barriers.txt or explain that
wmb() is not needed before the MMIO writel().
For CCW, it looks not obvious, it looks to me the IO was submitted via
__ssch() which has an inline assembly.  Cornelia and Hali, could you
help me to understand if and how did virtio_ccw_set_status() can
ensure the visibility of the previous driver setup and vq->broken
here?

Thanks

>
> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> >  }
> >
> > --
> > 2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
@ 2022-05-11  2:40       ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  2:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, linux-kernel, Thomas Gleixner, Peter Zijlstra,
	Paul E. McKenney, Marc Zyngier, Halil Pasic, Cornelia Huck,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> > This is a rework on the previous IRQ hardening that is done for
> > virtio-pci where several drawbacks were found and were reverted:
> >
> > 1) try to use IRQF_NO_AUTOEN which is not friendly to affinity managed IRQ
> >    that is used by some device such as virtio-blk
> > 2) done only for PCI transport
> >
> > The vq->broken is re-used in this patch for implementing the IRQ
> > hardening. The vq->broken is set to true during both initialization
> > and reset. And the vq->broken is set to false in
> > virtio_device_ready(). Then vring_interrupt can check and return when
> > vq->broken is true. And in this case, switch to return IRQ_NONE to let
> > the interrupt core aware of such invalid interrupt to prevent IRQ
> > storm.
> >
> > The reason of using a per queue variable instead of a per device one
> > is that we may need it for per queue reset hardening in the future.
> >
> > Note that the hardening is only done for vring interrupt since the
> > config interrupt hardening is already done in commit 22b7050a024d7
> > ("virtio: defer config changed notifications"). But the method that is
> > used by config interrupt can't be reused by the vring interrupt
> > handler because it uses spinlock to do the synchronization which is
> > expensive.
> >
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Halil Pasic <pasic@linux.ibm.com>
> > Cc: Cornelia Huck <cohuck@redhat.com>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/virtio/virtio.c       | 15 ++++++++++++---
> >  drivers/virtio/virtio_ring.c  | 11 +++++++----
> >  include/linux/virtio_config.h | 12 ++++++++++++
> >  3 files changed, 31 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
> > index 8dde44ea044a..696f5ba4f38e 100644
> > --- a/drivers/virtio/virtio.c
> > +++ b/drivers/virtio/virtio.c
> > @@ -220,6 +220,15 @@ static int virtio_features_ok(struct virtio_device *dev)
> >   * */
> >  void virtio_reset_device(struct virtio_device *dev)
> >  {
> > +     /*
> > +      * The below virtio_synchronize_cbs() guarantees that any
> > +      * interrupt for this line arriving after
> > +      * virtio_synchronize_vqs() has completed is guaranteed to see
> > +      * driver_ready == false.
> > +      */
> > +     virtio_break_device(dev);
> > +     virtio_synchronize_cbs(dev);
> > +
> >       dev->config->reset(dev);
> >  }
> >  EXPORT_SYMBOL_GPL(virtio_reset_device);
> > @@ -428,6 +437,9 @@ int register_virtio_device(struct virtio_device *dev)
> >       dev->config_enabled = false;
> >       dev->config_change_pending = false;
> >
> > +     INIT_LIST_HEAD(&dev->vqs);
> > +     spin_lock_init(&dev->vqs_list_lock);
> > +
> >       /* We always start by resetting the device, in case a previous
> >        * driver messed it up.  This also tests that code path a little. */
> >       virtio_reset_device(dev);
> > @@ -435,9 +447,6 @@ int register_virtio_device(struct virtio_device *dev)
> >       /* Acknowledge that we've seen the device. */
> >       virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
> >
> > -     INIT_LIST_HEAD(&dev->vqs);
> > -     spin_lock_init(&dev->vqs_list_lock);
> > -
> >       /*
> >        * device_add() causes the bus infrastructure to look for a matching
> >        * driver.
> > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > index 5b7df7c455f0..9dfad2890d7a 100644
> > --- a/drivers/virtio/virtio_ring.c
> > +++ b/drivers/virtio/virtio_ring.c
> > @@ -1690,7 +1690,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
> >       vq->we_own_ring = true;
> >       vq->notify = notify;
> >       vq->weak_barriers = weak_barriers;
> > -     vq->broken = false;
> > +     vq->broken = true;
> >       vq->last_used_idx = 0;
> >       vq->event_triggered = false;
> >       vq->num_added = 0;
> > @@ -2136,8 +2136,11 @@ irqreturn_t vring_interrupt(int irq, void *_vq)
> >               return IRQ_NONE;
> >       }
> >
> > -     if (unlikely(vq->broken))
> > -             return IRQ_HANDLED;
> > +     if (unlikely(vq->broken)) {
> > +             dev_warn_once(&vq->vq.vdev->dev,
> > +                           "virtio vring IRQ raised before DRIVER_OK");
> > +             return IRQ_NONE;
> > +     }
> >
> >       /* Just a hint for performance: so it's ok that this can be racy! */
> >       if (vq->event)
> > @@ -2179,7 +2182,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
> >       vq->we_own_ring = false;
> >       vq->notify = notify;
> >       vq->weak_barriers = weak_barriers;
> > -     vq->broken = false;
> > +     vq->broken = true;
> >       vq->last_used_idx = 0;
> >       vq->event_triggered = false;
> >       vq->num_added = 0;
> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > index d8a2340f928e..23f1694cdbd5 100644
> > --- a/include/linux/virtio_config.h
> > +++ b/include/linux/virtio_config.h
> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
> >       unsigned status = dev->config->get_status(dev);
> >
> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > +
> > +     /*
> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
> > +      * will see the driver specific setup if it sees vq->broken
> > +      * as false.
> > +      */
> > +     virtio_synchronize_cbs(dev);
>
> since you mention vq->broken above, maybe add
>         "set vq->broken to false"

Ok.

>
> > +     __virtio_unbreak_device(dev);
> > +     /*
> > +      * The transport is expected ensure the visibility of
>
> to ensure

Will fix.

>
> > +      * vq->broken
>
> let's add: "visibility by vq callbacks"

Sure.

>
> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
> > +      */
>
>
> Can I see some analysis of existing transports showing
> this is actually the case for them?

Yes.

> And maybe add a comment near set_status to document the
> requirement.

For PCI and MMIO, we can quote the memory-barriers.txt or explain that
wmb() is not needed before the MMIO writel().
For CCW, it looks not obvious, it looks to me the IO was submitted via
__ssch() which has an inline assembly.  Cornelia and Hali, could you
help me to understand if and how did virtio_ccw_set_status() can
ensure the visibility of the previous driver setup and vq->broken
here?

Thanks

>
> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> >  }
> >
> > --
> > 2.25.1
>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
  2022-05-10 11:27     ` Michael S. Tsirkin
@ 2022-05-11  2:41       ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  2:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cindy Lu, Paul E. McKenney, Peter Zijlstra, Marc Zyngier,
	Cornelia Huck, linux-kernel, virtualization, Halil Pasic,
	eperezma, Thomas Gleixner

On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> > This patch tries to implement the synchronize_cbs() for ccw. For the
> > vring_interrupt() that is called via virtio_airq_handler(), the
> > synchronization is simply done via the airq_info's lock. For the
> > vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> > device spinlock for irq is introduced ans used in the synchronization
> > method.
> >
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Halil Pasic <pasic@linux.ibm.com>
> > Cc: Cornelia Huck <cohuck@redhat.com>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> >
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index d35e7a3f7067..001e1f0e6037 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> > @@ -62,6 +62,7 @@ struct virtio_ccw_device {
> >       unsigned int revision; /* Transport revision */
> >       wait_queue_head_t wait_q;
> >       spinlock_t lock;
> > +     rwlock_t irq_lock;
> >       struct mutex io_lock; /* Serializes I/O requests */
> >       struct list_head virtqueues;
> >       bool is_thinint;
> > @@ -984,6 +985,27 @@ static const char *virtio_ccw_bus_name(struct virtio_device *vdev)
> >       return dev_name(&vcdev->cdev->dev);
> >  }
> >
> > +static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
> > +{
> > +     struct virtio_ccw_device *vcdev = to_vc_device(vdev);
> > +     struct airq_info *info = vcdev->airq_info;
> > +
> > +     if (info) {
> > +             /*
> > +              * Synchronize with the vring_interrupt() with airq indicator
> > +              */
> > +             write_lock(&info->lock);
> > +             write_unlock(&info->lock);
> > +     } else {
> > +             /*
> > +              * Synchronize with the vring_interrupt() called by
> > +              * virtio_ccw_int_handler().
> > +              */
> > +             write_lock(&vcdev->irq_lock);
> > +             write_unlock(&vcdev->irq_lock);
> > +     }
> > +}
> > +
> >  static const struct virtio_config_ops virtio_ccw_config_ops = {
> >       .get_features = virtio_ccw_get_features,
> >       .finalize_features = virtio_ccw_finalize_features,
> > @@ -995,6 +1017,7 @@ static const struct virtio_config_ops virtio_ccw_config_ops = {
> >       .find_vqs = virtio_ccw_find_vqs,
> >       .del_vqs = virtio_ccw_del_vqs,
> >       .bus_name = virtio_ccw_bus_name,
> > +     .synchronize_cbs = virtio_ccw_synchronize_cbs,
> >  };
> >
> >
> > @@ -1079,6 +1102,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >  {
> >       __u32 activity = intparm & VIRTIO_CCW_INTPARM_MASK;
> >       struct virtio_ccw_device *vcdev = dev_get_drvdata(&cdev->dev);
> > +     unsigned long flags;
> >       int i;
> >       struct virtqueue *vq;
> >
> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >                       vcdev->err = -EIO;
> >       }
> >       virtio_ccw_check_activity(vcdev, activity);
> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
> >       for_each_set_bit(i, indicators(vcdev),
> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> >               /* The bit clear must happen before the vring kick. */
>
> Cornelia sent a lockdep trace on this.
>
> Basically I think this gets the irqsave/restore logic wrong.
> It attempts to disable irqs in the handler (which is an interrupt
> anyway).

The reason I use irqsave/restore is that it can be called from process
context (if I was not wrong), e.g from io_subchannel_quiesce().

> And it does not disable irqs in the synchronize_cbs.
>
> As a result in interrupt might try to take a read lock while
> .synchronize_cbs has the writer lock, resulting in a deadlock.
>
> I think you want regular read_lock + write_lock_irq.

Yes.

Thanks

>
>
> > @@ -1114,6 +1139,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >               vq = virtio_ccw_vq_by_ind(vcdev, i);
> >               vring_interrupt(0, vq);
> >       }
> > +     read_unlock_irqrestore(&vcdev->irq_lock, flags);
> >       if (test_bit(0, indicators2(vcdev))) {
> >               virtio_config_changed(&vcdev->vdev);
> >               clear_bit(0, indicators2(vcdev));
> > @@ -1284,6 +1310,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> >       init_waitqueue_head(&vcdev->wait_q);
> >       INIT_LIST_HEAD(&vcdev->virtqueues);
> >       spin_lock_init(&vcdev->lock);
> > +     rwlock_init(&vcdev->irq_lock);
> >       mutex_init(&vcdev->io_lock);
> >
> >       spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
> > --
> > 2.25.1
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
@ 2022-05-11  2:41       ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  2:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtualization, linux-kernel, Thomas Gleixner, Peter Zijlstra,
	Paul E. McKenney, Marc Zyngier, Halil Pasic, Cornelia Huck,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> > This patch tries to implement the synchronize_cbs() for ccw. For the
> > vring_interrupt() that is called via virtio_airq_handler(), the
> > synchronization is simply done via the airq_info's lock. For the
> > vring_interrupt() that is called via virtio_ccw_int_handler(), a per
> > device spinlock for irq is introduced ans used in the synchronization
> > method.
> >
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Halil Pasic <pasic@linux.ibm.com>
> > Cc: Cornelia Huck <cohuck@redhat.com>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/s390/virtio/virtio_ccw.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> >
> > diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> > index d35e7a3f7067..001e1f0e6037 100644
> > --- a/drivers/s390/virtio/virtio_ccw.c
> > +++ b/drivers/s390/virtio/virtio_ccw.c
> > @@ -62,6 +62,7 @@ struct virtio_ccw_device {
> >       unsigned int revision; /* Transport revision */
> >       wait_queue_head_t wait_q;
> >       spinlock_t lock;
> > +     rwlock_t irq_lock;
> >       struct mutex io_lock; /* Serializes I/O requests */
> >       struct list_head virtqueues;
> >       bool is_thinint;
> > @@ -984,6 +985,27 @@ static const char *virtio_ccw_bus_name(struct virtio_device *vdev)
> >       return dev_name(&vcdev->cdev->dev);
> >  }
> >
> > +static void virtio_ccw_synchronize_cbs(struct virtio_device *vdev)
> > +{
> > +     struct virtio_ccw_device *vcdev = to_vc_device(vdev);
> > +     struct airq_info *info = vcdev->airq_info;
> > +
> > +     if (info) {
> > +             /*
> > +              * Synchronize with the vring_interrupt() with airq indicator
> > +              */
> > +             write_lock(&info->lock);
> > +             write_unlock(&info->lock);
> > +     } else {
> > +             /*
> > +              * Synchronize with the vring_interrupt() called by
> > +              * virtio_ccw_int_handler().
> > +              */
> > +             write_lock(&vcdev->irq_lock);
> > +             write_unlock(&vcdev->irq_lock);
> > +     }
> > +}
> > +
> >  static const struct virtio_config_ops virtio_ccw_config_ops = {
> >       .get_features = virtio_ccw_get_features,
> >       .finalize_features = virtio_ccw_finalize_features,
> > @@ -995,6 +1017,7 @@ static const struct virtio_config_ops virtio_ccw_config_ops = {
> >       .find_vqs = virtio_ccw_find_vqs,
> >       .del_vqs = virtio_ccw_del_vqs,
> >       .bus_name = virtio_ccw_bus_name,
> > +     .synchronize_cbs = virtio_ccw_synchronize_cbs,
> >  };
> >
> >
> > @@ -1079,6 +1102,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >  {
> >       __u32 activity = intparm & VIRTIO_CCW_INTPARM_MASK;
> >       struct virtio_ccw_device *vcdev = dev_get_drvdata(&cdev->dev);
> > +     unsigned long flags;
> >       int i;
> >       struct virtqueue *vq;
> >
> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >                       vcdev->err = -EIO;
> >       }
> >       virtio_ccw_check_activity(vcdev, activity);
> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
> >       for_each_set_bit(i, indicators(vcdev),
> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> >               /* The bit clear must happen before the vring kick. */
>
> Cornelia sent a lockdep trace on this.
>
> Basically I think this gets the irqsave/restore logic wrong.
> It attempts to disable irqs in the handler (which is an interrupt
> anyway).

The reason I use irqsave/restore is that it can be called from process
context (if I was not wrong), e.g from io_subchannel_quiesce().

> And it does not disable irqs in the synchronize_cbs.
>
> As a result in interrupt might try to take a read lock while
> .synchronize_cbs has the writer lock, resulting in a deadlock.
>
> I think you want regular read_lock + write_lock_irq.

Yes.

Thanks

>
>
> > @@ -1114,6 +1139,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >               vq = virtio_ccw_vq_by_ind(vcdev, i);
> >               vring_interrupt(0, vq);
> >       }
> > +     read_unlock_irqrestore(&vcdev->irq_lock, flags);
> >       if (test_bit(0, indicators2(vcdev))) {
> >               virtio_config_changed(&vcdev->vdev);
> >               clear_bit(0, indicators2(vcdev));
> > @@ -1284,6 +1310,7 @@ static int virtio_ccw_online(struct ccw_device *cdev)
> >       init_waitqueue_head(&vcdev->wait_q);
> >       INIT_LIST_HEAD(&vcdev->virtqueues);
> >       spin_lock_init(&vcdev->lock);
> > +     rwlock_init(&vcdev->irq_lock);
> >       mutex_init(&vcdev->io_lock);
> >
> >       spin_lock_irqsave(get_ccwdev_lock(cdev), flags);
> > --
> > 2.25.1
>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
  2022-05-11  2:41       ` Jason Wang
@ 2022-05-11  8:17         ` Cornelia Huck
  -1 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-11  8:17 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: virtualization, linux-kernel, Thomas Gleixner, Peter Zijlstra,
	Paul E. McKenney, Marc Zyngier, Halil Pasic, eperezma, Cindy Lu,
	Stefano Garzarella, Xuan Zhuo

On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:

> On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>
>> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
>> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>> >                       vcdev->err = -EIO;
>> >       }
>> >       virtio_ccw_check_activity(vcdev, activity);
>> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
>> >       for_each_set_bit(i, indicators(vcdev),
>> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
>> >               /* The bit clear must happen before the vring kick. */
>>
>> Cornelia sent a lockdep trace on this.
>>
>> Basically I think this gets the irqsave/restore logic wrong.
>> It attempts to disable irqs in the handler (which is an interrupt
>> anyway).
>
> The reason I use irqsave/restore is that it can be called from process
> context (if I was not wrong), e.g from io_subchannel_quiesce().

io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
would be a bug.


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
@ 2022-05-11  8:17         ` Cornelia Huck
  0 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-11  8:17 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: Cindy Lu, Paul E. McKenney, Peter Zijlstra, Marc Zyngier,
	linux-kernel, virtualization, Halil Pasic, eperezma,
	Thomas Gleixner

On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:

> On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>
>> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
>> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>> >                       vcdev->err = -EIO;
>> >       }
>> >       virtio_ccw_check_activity(vcdev, activity);
>> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
>> >       for_each_set_bit(i, indicators(vcdev),
>> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
>> >               /* The bit clear must happen before the vring kick. */
>>
>> Cornelia sent a lockdep trace on this.
>>
>> Basically I think this gets the irqsave/restore logic wrong.
>> It attempts to disable irqs in the handler (which is an interrupt
>> anyway).
>
> The reason I use irqsave/restore is that it can be called from process
> context (if I was not wrong), e.g from io_subchannel_quiesce().

io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
would be a bug.

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
  2022-05-11  2:40       ` Jason Wang
@ 2022-05-11  8:44         ` Cornelia Huck
  -1 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-11  8:44 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: virtualization, linux-kernel, Thomas Gleixner, Peter Zijlstra,
	Paul E. McKenney, Marc Zyngier, Halil Pasic, eperezma, Cindy Lu,
	Stefano Garzarella, Xuan Zhuo

On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:

> On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>
>> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
>> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
>> > index d8a2340f928e..23f1694cdbd5 100644
>> > --- a/include/linux/virtio_config.h
>> > +++ b/include/linux/virtio_config.h
>> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
>> >       unsigned status = dev->config->get_status(dev);
>> >
>> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
>> > +
>> > +     /*
>> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
>> > +      * will see the driver specific setup if it sees vq->broken
>> > +      * as false.
>> > +      */
>> > +     virtio_synchronize_cbs(dev);
>>
>> since you mention vq->broken above, maybe add
>>         "set vq->broken to false"
>
> Ok.
>
>>
>> > +     __virtio_unbreak_device(dev);
>> > +     /*
>> > +      * The transport is expected ensure the visibility of
>>
>> to ensure
>
> Will fix.
>
>>
>> > +      * vq->broken
>>
>> let's add: "visibility by vq callbacks"
>
> Sure.
>
>>
>> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
>> > +      */
>>
>>
>> Can I see some analysis of existing transports showing
>> this is actually the case for them?
>
> Yes.
>
>> And maybe add a comment near set_status to document the
>> requirement.
>
> For PCI and MMIO, we can quote the memory-barriers.txt or explain that
> wmb() is not needed before the MMIO writel().
> For CCW, it looks not obvious, it looks to me the IO was submitted via
> __ssch() which has an inline assembly.  Cornelia and Hali, could you
> help me to understand if and how did virtio_ccw_set_status() can
> ensure the visibility of the previous driver setup and vq->broken
> here?

I'm not sure I completely understand the question here, but let me try:

virtio_ccw_set_status() uses a channel command to set the status, with
the interesting stuff done inside ccw_io_helper(). That function
- takes the subchannel lock, disabling interrupts
- does the ssch; this instruction will fail if there's already another
  I/O in progress, or an interrupt is pending for the subchannel; on
  success, it is guaranteed that we'll get an interrupt eventually
- unlock the subchannel, and wait for the interupt handler to eventually
  process the interrupt, so I guess it should see the vq->broken value?

If the I/O fails, virtio_ccw_set_status() will revert its internal
status to the old value.


>
> Thanks
>
>>
>> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
>> >  }


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
@ 2022-05-11  8:44         ` Cornelia Huck
  0 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-11  8:44 UTC (permalink / raw)
  To: Jason Wang, Michael S. Tsirkin
  Cc: Cindy Lu, Paul E. McKenney, Peter Zijlstra, Marc Zyngier,
	linux-kernel, virtualization, Halil Pasic, eperezma,
	Thomas Gleixner

On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:

> On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>
>> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
>> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
>> > index d8a2340f928e..23f1694cdbd5 100644
>> > --- a/include/linux/virtio_config.h
>> > +++ b/include/linux/virtio_config.h
>> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
>> >       unsigned status = dev->config->get_status(dev);
>> >
>> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
>> > +
>> > +     /*
>> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
>> > +      * will see the driver specific setup if it sees vq->broken
>> > +      * as false.
>> > +      */
>> > +     virtio_synchronize_cbs(dev);
>>
>> since you mention vq->broken above, maybe add
>>         "set vq->broken to false"
>
> Ok.
>
>>
>> > +     __virtio_unbreak_device(dev);
>> > +     /*
>> > +      * The transport is expected ensure the visibility of
>>
>> to ensure
>
> Will fix.
>
>>
>> > +      * vq->broken
>>
>> let's add: "visibility by vq callbacks"
>
> Sure.
>
>>
>> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
>> > +      */
>>
>>
>> Can I see some analysis of existing transports showing
>> this is actually the case for them?
>
> Yes.
>
>> And maybe add a comment near set_status to document the
>> requirement.
>
> For PCI and MMIO, we can quote the memory-barriers.txt or explain that
> wmb() is not needed before the MMIO writel().
> For CCW, it looks not obvious, it looks to me the IO was submitted via
> __ssch() which has an inline assembly.  Cornelia and Hali, could you
> help me to understand if and how did virtio_ccw_set_status() can
> ensure the visibility of the previous driver setup and vq->broken
> here?

I'm not sure I completely understand the question here, but let me try:

virtio_ccw_set_status() uses a channel command to set the status, with
the interesting stuff done inside ccw_io_helper(). That function
- takes the subchannel lock, disabling interrupts
- does the ssch; this instruction will fail if there's already another
  I/O in progress, or an interrupt is pending for the subchannel; on
  success, it is guaranteed that we'll get an interrupt eventually
- unlock the subchannel, and wait for the interupt handler to eventually
  process the interrupt, so I guess it should see the vq->broken value?

If the I/O fails, virtio_ccw_set_status() will revert its internal
status to the old value.


>
> Thanks
>
>>
>> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
>> >  }

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
  2022-05-11  8:17         ` Cornelia Huck
@ 2022-05-11  8:58           ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  8:58 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Michael S. Tsirkin, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	Halil Pasic, eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Wed, May 11, 2022 at 4:17 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >>
> >> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> >> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >> >                       vcdev->err = -EIO;
> >> >       }
> >> >       virtio_ccw_check_activity(vcdev, activity);
> >> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
> >> >       for_each_set_bit(i, indicators(vcdev),
> >> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> >> >               /* The bit clear must happen before the vring kick. */
> >>
> >> Cornelia sent a lockdep trace on this.
> >>
> >> Basically I think this gets the irqsave/restore logic wrong.
> >> It attempts to disable irqs in the handler (which is an interrupt
> >> anyway).
> >
> > The reason I use irqsave/restore is that it can be called from process
> > context (if I was not wrong), e.g from io_subchannel_quiesce().
>
> io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
> would be a bug.

Right, it was protected by a spin_lock_irq(), but I can see other
cdev->handler() in e.g device_fsm.c, the irq status is not obvious, do
they have the same assumption which IRQ is disabled?

Thanks

>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
@ 2022-05-11  8:58           ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  8:58 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Cindy Lu, Paul E. McKenney, Michael S. Tsirkin, Peter Zijlstra,
	Marc Zyngier, linux-kernel, virtualization, Halil Pasic,
	eperezma, Thomas Gleixner

On Wed, May 11, 2022 at 4:17 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >>
> >> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> >> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >> >                       vcdev->err = -EIO;
> >> >       }
> >> >       virtio_ccw_check_activity(vcdev, activity);
> >> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
> >> >       for_each_set_bit(i, indicators(vcdev),
> >> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> >> >               /* The bit clear must happen before the vring kick. */
> >>
> >> Cornelia sent a lockdep trace on this.
> >>
> >> Basically I think this gets the irqsave/restore logic wrong.
> >> It attempts to disable irqs in the handler (which is an interrupt
> >> anyway).
> >
> > The reason I use irqsave/restore is that it can be called from process
> > context (if I was not wrong), e.g from io_subchannel_quiesce().
>
> io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
> would be a bug.

Right, it was protected by a spin_lock_irq(), but I can see other
cdev->handler() in e.g device_fsm.c, the irq status is not obvious, do
they have the same assumption which IRQ is disabled?

Thanks

>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
  2022-05-11  8:58           ` Jason Wang
@ 2022-05-11  9:13             ` Cornelia Huck
  -1 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-11  9:13 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	Halil Pasic, eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo,
	Vineeth Vijayan, Peter Oberparleiter

On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:

> On Wed, May 11, 2022 at 4:17 PM Cornelia Huck <cohuck@redhat.com> wrote:
>>
>> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
>>
>> > On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>> >>
>> >> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
>> >> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>> >> >                       vcdev->err = -EIO;
>> >> >       }
>> >> >       virtio_ccw_check_activity(vcdev, activity);
>> >> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
>> >> >       for_each_set_bit(i, indicators(vcdev),
>> >> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
>> >> >               /* The bit clear must happen before the vring kick. */
>> >>
>> >> Cornelia sent a lockdep trace on this.
>> >>
>> >> Basically I think this gets the irqsave/restore logic wrong.
>> >> It attempts to disable irqs in the handler (which is an interrupt
>> >> anyway).
>> >
>> > The reason I use irqsave/restore is that it can be called from process
>> > context (if I was not wrong), e.g from io_subchannel_quiesce().
>>
>> io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
>> would be a bug.
>
> Right, it was protected by a spin_lock_irq(), but I can see other
> cdev->handler() in e.g device_fsm.c, the irq status is not obvious, do
> they have the same assumption which IRQ is disabled?

Yes, that should be the case for any invocations via the fsm as well.

It's been some time since I've worked on that part of the code, though,
so let's cc: the s390 cio maintainers so that they can speak up if I'm
wrong.


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
@ 2022-05-11  9:13             ` Cornelia Huck
  0 siblings, 0 replies; 71+ messages in thread
From: Cornelia Huck @ 2022-05-11  9:13 UTC (permalink / raw)
  To: Jason Wang
  Cc: Peter Oberparleiter, Cindy Lu, Paul E. McKenney,
	Michael S. Tsirkin, Peter Zijlstra, Marc Zyngier, linux-kernel,
	virtualization, Halil Pasic, eperezma, Vineeth Vijayan,
	Thomas Gleixner

On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:

> On Wed, May 11, 2022 at 4:17 PM Cornelia Huck <cohuck@redhat.com> wrote:
>>
>> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
>>
>> > On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>> >>
>> >> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
>> >> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
>> >> >                       vcdev->err = -EIO;
>> >> >       }
>> >> >       virtio_ccw_check_activity(vcdev, activity);
>> >> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
>> >> >       for_each_set_bit(i, indicators(vcdev),
>> >> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
>> >> >               /* The bit clear must happen before the vring kick. */
>> >>
>> >> Cornelia sent a lockdep trace on this.
>> >>
>> >> Basically I think this gets the irqsave/restore logic wrong.
>> >> It attempts to disable irqs in the handler (which is an interrupt
>> >> anyway).
>> >
>> > The reason I use irqsave/restore is that it can be called from process
>> > context (if I was not wrong), e.g from io_subchannel_quiesce().
>>
>> io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
>> would be a bug.
>
> Right, it was protected by a spin_lock_irq(), but I can see other
> cdev->handler() in e.g device_fsm.c, the irq status is not obvious, do
> they have the same assumption which IRQ is disabled?

Yes, that should be the case for any invocations via the fsm as well.

It's been some time since I've worked on that part of the code, though,
so let's cc: the s390 cio maintainers so that they can speak up if I'm
wrong.

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
  2022-05-11  8:44         ` Cornelia Huck
@ 2022-05-11  9:27           ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  9:27 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Michael S. Tsirkin, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	Halil Pasic, eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Wed, May 11, 2022 at 4:44 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >>
> >> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> >> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> >> > index d8a2340f928e..23f1694cdbd5 100644
> >> > --- a/include/linux/virtio_config.h
> >> > +++ b/include/linux/virtio_config.h
> >> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
> >> >       unsigned status = dev->config->get_status(dev);
> >> >
> >> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> >> > +
> >> > +     /*
> >> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
> >> > +      * will see the driver specific setup if it sees vq->broken
> >> > +      * as false.
> >> > +      */
> >> > +     virtio_synchronize_cbs(dev);
> >>
> >> since you mention vq->broken above, maybe add
> >>         "set vq->broken to false"
> >
> > Ok.
> >
> >>
> >> > +     __virtio_unbreak_device(dev);
> >> > +     /*
> >> > +      * The transport is expected ensure the visibility of
> >>
> >> to ensure
> >
> > Will fix.
> >
> >>
> >> > +      * vq->broken
> >>
> >> let's add: "visibility by vq callbacks"
> >
> > Sure.
> >
> >>
> >> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
> >> > +      */
> >>
> >>
> >> Can I see some analysis of existing transports showing
> >> this is actually the case for them?
> >
> > Yes.
> >
> >> And maybe add a comment near set_status to document the
> >> requirement.
> >
> > For PCI and MMIO, we can quote the memory-barriers.txt or explain that
> > wmb() is not needed before the MMIO writel().
> > For CCW, it looks not obvious, it looks to me the IO was submitted via
> > __ssch() which has an inline assembly.  Cornelia and Hali, could you
> > help me to understand if and how did virtio_ccw_set_status() can
> > ensure the visibility of the previous driver setup and vq->broken
> > here?
>
> I'm not sure I completely understand the question here, but let me try:

It's something like the following case:

CPU 0: vq->broken = false
CPU 0: set_status(DRIVER_OK)
CPU 1: vring_interrupt() { if (vq->broken) return IRQ_NONE; }

We need to make sure the CPU 1 sees the vq->broken if the interrupt is
raised after DRVER_OK.

For PCI, we use MMIO of writel() for set_status(), a wmb() is not
needed in this case according to memory-barriers.txt.

"
Note that, when using writel(), a prior
wmb() is not needed to guarantee that the cache coherent memory writes
have completed before writing to the MMIO region.
"

So CPU 1 will see the broken as false.

>
> virtio_ccw_set_status() uses a channel command to set the status, with
> the interesting stuff done inside ccw_io_helper(). That function
> - takes the subchannel lock, disabling interrupts

Then it is, for x86 the operation to disable interrupt is a full
barrier. I guess this should apply to other architecture like s390. I
see a stnsm is used in this case but a quick google doesn't tell me if
it's a barrier.
If this is true. The vring_interrupt will see broken as false.

> - does the ssch; this instruction will fail if there's already another
>   I/O in progress, or an interrupt is pending for the subchannel; on
>   success, it is guaranteed that we'll get an interrupt eventually

I guess ssch might imply a barrier as well, otherwise we may need a
lot of barriers before this.

Thanks

> - unlock the subchannel, and wait for the interupt handler to eventually
>   process the interrupt, so I guess it should see the vq->broken value?
>
> If the I/O fails, virtio_ccw_set_status() will revert its internal
> status to the old value.
>
>
> >
> > Thanks
> >
> >>
> >> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> >> >  }
>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
@ 2022-05-11  9:27           ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  9:27 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Cindy Lu, Paul E. McKenney, Michael S. Tsirkin, Peter Zijlstra,
	Marc Zyngier, linux-kernel, virtualization, Halil Pasic,
	eperezma, Thomas Gleixner

On Wed, May 11, 2022 at 4:44 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >>
> >> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> >> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> >> > index d8a2340f928e..23f1694cdbd5 100644
> >> > --- a/include/linux/virtio_config.h
> >> > +++ b/include/linux/virtio_config.h
> >> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
> >> >       unsigned status = dev->config->get_status(dev);
> >> >
> >> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> >> > +
> >> > +     /*
> >> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
> >> > +      * will see the driver specific setup if it sees vq->broken
> >> > +      * as false.
> >> > +      */
> >> > +     virtio_synchronize_cbs(dev);
> >>
> >> since you mention vq->broken above, maybe add
> >>         "set vq->broken to false"
> >
> > Ok.
> >
> >>
> >> > +     __virtio_unbreak_device(dev);
> >> > +     /*
> >> > +      * The transport is expected ensure the visibility of
> >>
> >> to ensure
> >
> > Will fix.
> >
> >>
> >> > +      * vq->broken
> >>
> >> let's add: "visibility by vq callbacks"
> >
> > Sure.
> >
> >>
> >> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
> >> > +      */
> >>
> >>
> >> Can I see some analysis of existing transports showing
> >> this is actually the case for them?
> >
> > Yes.
> >
> >> And maybe add a comment near set_status to document the
> >> requirement.
> >
> > For PCI and MMIO, we can quote the memory-barriers.txt or explain that
> > wmb() is not needed before the MMIO writel().
> > For CCW, it looks not obvious, it looks to me the IO was submitted via
> > __ssch() which has an inline assembly.  Cornelia and Hali, could you
> > help me to understand if and how did virtio_ccw_set_status() can
> > ensure the visibility of the previous driver setup and vq->broken
> > here?
>
> I'm not sure I completely understand the question here, but let me try:

It's something like the following case:

CPU 0: vq->broken = false
CPU 0: set_status(DRIVER_OK)
CPU 1: vring_interrupt() { if (vq->broken) return IRQ_NONE; }

We need to make sure the CPU 1 sees the vq->broken if the interrupt is
raised after DRVER_OK.

For PCI, we use MMIO of writel() for set_status(), a wmb() is not
needed in this case according to memory-barriers.txt.

"
Note that, when using writel(), a prior
wmb() is not needed to guarantee that the cache coherent memory writes
have completed before writing to the MMIO region.
"

So CPU 1 will see the broken as false.

>
> virtio_ccw_set_status() uses a channel command to set the status, with
> the interesting stuff done inside ccw_io_helper(). That function
> - takes the subchannel lock, disabling interrupts

Then it is, for x86 the operation to disable interrupt is a full
barrier. I guess this should apply to other architecture like s390. I
see a stnsm is used in this case but a quick google doesn't tell me if
it's a barrier.
If this is true. The vring_interrupt will see broken as false.

> - does the ssch; this instruction will fail if there's already another
>   I/O in progress, or an interrupt is pending for the subchannel; on
>   success, it is guaranteed that we'll get an interrupt eventually

I guess ssch might imply a barrier as well, otherwise we may need a
lot of barriers before this.

Thanks

> - unlock the subchannel, and wait for the interupt handler to eventually
>   process the interrupt, so I guess it should see the vq->broken value?
>
> If the I/O fails, virtio_ccw_set_status() will revert its internal
> status to the old value.
>
>
> >
> > Thanks
> >
> >>
> >> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> >> >  }
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
  2022-05-11  9:13             ` Cornelia Huck
@ 2022-05-11  9:28               ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  9:28 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Michael S. Tsirkin, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	Halil Pasic, eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo,
	Vineeth Vijayan, Peter Oberparleiter

On Wed, May 11, 2022 at 5:13 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > On Wed, May 11, 2022 at 4:17 PM Cornelia Huck <cohuck@redhat.com> wrote:
> >>
> >> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> > On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >>
> >> >> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> >> >> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >> >> >                       vcdev->err = -EIO;
> >> >> >       }
> >> >> >       virtio_ccw_check_activity(vcdev, activity);
> >> >> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
> >> >> >       for_each_set_bit(i, indicators(vcdev),
> >> >> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> >> >> >               /* The bit clear must happen before the vring kick. */
> >> >>
> >> >> Cornelia sent a lockdep trace on this.
> >> >>
> >> >> Basically I think this gets the irqsave/restore logic wrong.
> >> >> It attempts to disable irqs in the handler (which is an interrupt
> >> >> anyway).
> >> >
> >> > The reason I use irqsave/restore is that it can be called from process
> >> > context (if I was not wrong), e.g from io_subchannel_quiesce().
> >>
> >> io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
> >> would be a bug.
> >
> > Right, it was protected by a spin_lock_irq(), but I can see other
> > cdev->handler() in e.g device_fsm.c, the irq status is not obvious, do
> > they have the same assumption which IRQ is disabled?
>
> Yes, that should be the case for any invocations via the fsm as well.
>

Ok.

> It's been some time since I've worked on that part of the code, though,
> so let's cc: the s390 cio maintainers so that they can speak up if I'm
> wrong.

Ok, I will do that.

Thanks

>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
@ 2022-05-11  9:28               ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-11  9:28 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Peter Oberparleiter, Cindy Lu, Paul E. McKenney,
	Michael S. Tsirkin, Peter Zijlstra, Marc Zyngier, linux-kernel,
	virtualization, Halil Pasic, eperezma, Vineeth Vijayan,
	Thomas Gleixner

On Wed, May 11, 2022 at 5:13 PM Cornelia Huck <cohuck@redhat.com> wrote:
>
> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
>
> > On Wed, May 11, 2022 at 4:17 PM Cornelia Huck <cohuck@redhat.com> wrote:
> >>
> >> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> >>
> >> > On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >> >>
> >> >> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> >> >> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> >> >> >                       vcdev->err = -EIO;
> >> >> >       }
> >> >> >       virtio_ccw_check_activity(vcdev, activity);
> >> >> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
> >> >> >       for_each_set_bit(i, indicators(vcdev),
> >> >> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> >> >> >               /* The bit clear must happen before the vring kick. */
> >> >>
> >> >> Cornelia sent a lockdep trace on this.
> >> >>
> >> >> Basically I think this gets the irqsave/restore logic wrong.
> >> >> It attempts to disable irqs in the handler (which is an interrupt
> >> >> anyway).
> >> >
> >> > The reason I use irqsave/restore is that it can be called from process
> >> > context (if I was not wrong), e.g from io_subchannel_quiesce().
> >>
> >> io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
> >> would be a bug.
> >
> > Right, it was protected by a spin_lock_irq(), but I can see other
> > cdev->handler() in e.g device_fsm.c, the irq status is not obvious, do
> > they have the same assumption which IRQ is disabled?
>
> Yes, that should be the case for any invocations via the fsm as well.
>

Ok.

> It's been some time since I've worked on that part of the code, though,
> so let's cc: the s390 cio maintainers so that they can speak up if I'm
> wrong.

Ok, I will do that.

Thanks

>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
  2022-05-11  9:27           ` Jason Wang
@ 2022-05-11 12:49             ` Halil Pasic
  -1 siblings, 0 replies; 71+ messages in thread
From: Halil Pasic @ 2022-05-11 12:49 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cornelia Huck, Michael S. Tsirkin, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo, Halil Pasic

On Wed, 11 May 2022 17:27:44 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On Wed, May 11, 2022 at 4:44 PM Cornelia Huck <cohuck@redhat.com> wrote:
> >
> > On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> >  
> > > On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:  
> > >>
> > >> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:  
> > >> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > >> > index d8a2340f928e..23f1694cdbd5 100644
> > >> > --- a/include/linux/virtio_config.h
> > >> > +++ b/include/linux/virtio_config.h
> > >> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
> > >> >       unsigned status = dev->config->get_status(dev);
> > >> >
> > >> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > >> > +
> > >> > +     /*
> > >> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
> > >> > +      * will see the driver specific setup if it sees vq->broken
> > >> > +      * as false.
> > >> > +      */
> > >> > +     virtio_synchronize_cbs(dev);  
> > >>
> > >> since you mention vq->broken above, maybe add
> > >>         "set vq->broken to false"  
> > >
> > > Ok.
> > >  
> > >>  
> > >> > +     __virtio_unbreak_device(dev);
> > >> > +     /*
> > >> > +      * The transport is expected ensure the visibility of  
> > >>
> > >> to ensure  
> > >
> > > Will fix.
> > >  
> > >>  
> > >> > +      * vq->broken  
> > >>
> > >> let's add: "visibility by vq callbacks"  
> > >
> > > Sure.
> > >  
> > >>  
> > >> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
> > >> > +      */  
> > >>
> > >>
> > >> Can I see some analysis of existing transports showing
> > >> this is actually the case for them?  
> > >
> > > Yes.
> > >  
> > >> And maybe add a comment near set_status to document the
> > >> requirement.  
> > >
> > > For PCI and MMIO, we can quote the memory-barriers.txt or explain that
> > > wmb() is not needed before the MMIO writel().
> > > For CCW, it looks not obvious, it looks to me the IO was submitted via
> > > __ssch() which has an inline assembly.  Cornelia and Hali, could you
> > > help me to understand if and how did virtio_ccw_set_status() can
> > > ensure the visibility of the previous driver setup and vq->broken
> > > here?  
> >
> > I'm not sure I completely understand the question here, but let me try:  
> 
> It's something like the following case:
> 
> CPU 0: vq->broken = false
> CPU 0: set_status(DRIVER_OK)
> CPU 1: vring_interrupt() { if (vq->broken) return IRQ_NONE; }
> 
> We need to make sure the CPU 1 sees the vq->broken if the interrupt is
> raised after DRVER_OK.
> 
> For PCI, we use MMIO of writel() for set_status(), a wmb() is not
> needed in this case according to memory-barriers.txt.
> 
> "
> Note that, when using writel(), a prior
> wmb() is not needed to guarantee that the cache coherent memory writes
> have completed before writing to the MMIO region.
> "


IMHO the key facts here are the following:
* ssch and all other I/O instructions are serializing instructions
* all interruptions are serializing operations 

For reference see
https://www.ibm.com/resources/publications/OutputPubsDetails?PubID=SA22783213
page 5-138.


Maybe we should add that to the linux documentation somewhere if
not already mentioned.

So IMHO we don't need CPU0 to do a wmb() because of the ssch.

> 
> So CPU 1 will see the broken as false.

But barriers need to be paired. And in my understanding the ssch
doesn't really ensure that CPU1 is about to see the change, unless
there is a suitable barrier that pairs with the barrier implied
the ssch instruction.

Assumed vring_interrupt() is always done in hard-irq context, AFAIU,
we should be fine. Is that assumption correct?

Why are we fine:
* Either the ssch was performed before the interrupt for
  vring_interrupt() got delivered on CPU1, and then we are guaranteed to
  see the updated value for vq->broken,
* or the interrupt that triggered vring_interrupt() was delivered before
  the ssch instruction got executed. But in this case it is fine to
  ignore the notification, because this is actually the bad case
  we want to guard against: we got a notification when
  notifications are not allowed.

We may end up with !vq->broken and !DEVICE_OK as well, but that should
be fine because, although that notification would be a should not happen
one, I understand it would not catch us with our pants down.

Regards,
Halil


> 
> >
> > virtio_ccw_set_status() uses a channel command to set the status, with
> > the interesting stuff done inside ccw_io_helper(). That function
> > - takes the subchannel lock, disabling interrupts  
> 
> Then it is, for x86 the operation to disable interrupt is a full
> barrier. I guess this should apply to other architecture like s390. I
> see a stnsm is used in this case but a quick google doesn't tell me if
> it's a barrier.
> If this is true. The vring_interrupt will see broken as false.
> 
> > - does the ssch; this instruction will fail if there's already another
> >   I/O in progress, or an interrupt is pending for the subchannel; on
> >   success, it is guaranteed that we'll get an interrupt eventually  
> 
> I guess ssch might imply a barrier as well, otherwise we may need a
> lot of barriers before this.
> 
> Thanks
> 
> > - unlock the subchannel, and wait for the interupt handler to eventually
> >   process the interrupt, so I guess it should see the vq->broken value?
> >
> > If the I/O fails, virtio_ccw_set_status() will revert its internal
> > status to the old value.
> >
> >  
> > >
> > > Thanks
> > >  
> > >>  
> > >> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > >> >  }  
> >  
> 


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
@ 2022-05-11 12:49             ` Halil Pasic
  0 siblings, 0 replies; 71+ messages in thread
From: Halil Pasic @ 2022-05-11 12:49 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cindy Lu, Paul E. McKenney, Michael S. Tsirkin, Peter Zijlstra,
	Marc Zyngier, Cornelia Huck, linux-kernel, virtualization,
	Halil Pasic, eperezma, Thomas Gleixner

On Wed, 11 May 2022 17:27:44 +0800
Jason Wang <jasowang@redhat.com> wrote:

> On Wed, May 11, 2022 at 4:44 PM Cornelia Huck <cohuck@redhat.com> wrote:
> >
> > On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> >  
> > > On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:  
> > >>
> > >> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:  
> > >> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > >> > index d8a2340f928e..23f1694cdbd5 100644
> > >> > --- a/include/linux/virtio_config.h
> > >> > +++ b/include/linux/virtio_config.h
> > >> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
> > >> >       unsigned status = dev->config->get_status(dev);
> > >> >
> > >> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > >> > +
> > >> > +     /*
> > >> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
> > >> > +      * will see the driver specific setup if it sees vq->broken
> > >> > +      * as false.
> > >> > +      */
> > >> > +     virtio_synchronize_cbs(dev);  
> > >>
> > >> since you mention vq->broken above, maybe add
> > >>         "set vq->broken to false"  
> > >
> > > Ok.
> > >  
> > >>  
> > >> > +     __virtio_unbreak_device(dev);
> > >> > +     /*
> > >> > +      * The transport is expected ensure the visibility of  
> > >>
> > >> to ensure  
> > >
> > > Will fix.
> > >  
> > >>  
> > >> > +      * vq->broken  
> > >>
> > >> let's add: "visibility by vq callbacks"  
> > >
> > > Sure.
> > >  
> > >>  
> > >> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
> > >> > +      */  
> > >>
> > >>
> > >> Can I see some analysis of existing transports showing
> > >> this is actually the case for them?  
> > >
> > > Yes.
> > >  
> > >> And maybe add a comment near set_status to document the
> > >> requirement.  
> > >
> > > For PCI and MMIO, we can quote the memory-barriers.txt or explain that
> > > wmb() is not needed before the MMIO writel().
> > > For CCW, it looks not obvious, it looks to me the IO was submitted via
> > > __ssch() which has an inline assembly.  Cornelia and Hali, could you
> > > help me to understand if and how did virtio_ccw_set_status() can
> > > ensure the visibility of the previous driver setup and vq->broken
> > > here?  
> >
> > I'm not sure I completely understand the question here, but let me try:  
> 
> It's something like the following case:
> 
> CPU 0: vq->broken = false
> CPU 0: set_status(DRIVER_OK)
> CPU 1: vring_interrupt() { if (vq->broken) return IRQ_NONE; }
> 
> We need to make sure the CPU 1 sees the vq->broken if the interrupt is
> raised after DRVER_OK.
> 
> For PCI, we use MMIO of writel() for set_status(), a wmb() is not
> needed in this case according to memory-barriers.txt.
> 
> "
> Note that, when using writel(), a prior
> wmb() is not needed to guarantee that the cache coherent memory writes
> have completed before writing to the MMIO region.
> "


IMHO the key facts here are the following:
* ssch and all other I/O instructions are serializing instructions
* all interruptions are serializing operations 

For reference see
https://www.ibm.com/resources/publications/OutputPubsDetails?PubID=SA22783213
page 5-138.


Maybe we should add that to the linux documentation somewhere if
not already mentioned.

So IMHO we don't need CPU0 to do a wmb() because of the ssch.

> 
> So CPU 1 will see the broken as false.

But barriers need to be paired. And in my understanding the ssch
doesn't really ensure that CPU1 is about to see the change, unless
there is a suitable barrier that pairs with the barrier implied
the ssch instruction.

Assumed vring_interrupt() is always done in hard-irq context, AFAIU,
we should be fine. Is that assumption correct?

Why are we fine:
* Either the ssch was performed before the interrupt for
  vring_interrupt() got delivered on CPU1, and then we are guaranteed to
  see the updated value for vq->broken,
* or the interrupt that triggered vring_interrupt() was delivered before
  the ssch instruction got executed. But in this case it is fine to
  ignore the notification, because this is actually the bad case
  we want to guard against: we got a notification when
  notifications are not allowed.

We may end up with !vq->broken and !DEVICE_OK as well, but that should
be fine because, although that notification would be a should not happen
one, I understand it would not catch us with our pants down.

Regards,
Halil


> 
> >
> > virtio_ccw_set_status() uses a channel command to set the status, with
> > the interesting stuff done inside ccw_io_helper(). That function
> > - takes the subchannel lock, disabling interrupts  
> 
> Then it is, for x86 the operation to disable interrupt is a full
> barrier. I guess this should apply to other architecture like s390. I
> see a stnsm is used in this case but a quick google doesn't tell me if
> it's a barrier.
> If this is true. The vring_interrupt will see broken as false.
> 
> > - does the ssch; this instruction will fail if there's already another
> >   I/O in progress, or an interrupt is pending for the subchannel; on
> >   success, it is guaranteed that we'll get an interrupt eventually  
> 
> I guess ssch might imply a barrier as well, otherwise we may need a
> lot of barriers before this.
> 
> Thanks
> 
> > - unlock the subchannel, and wait for the interupt handler to eventually
> >   process the interrupt, so I guess it should see the vq->broken value?
> >
> > If the I/O fails, virtio_ccw_set_status() will revert its internal
> > status to the old value.
> >
> >  
> > >
> > > Thanks
> > >  
> > >>  
> > >> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > >> >  }  
> >  
> 

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
  2022-05-11  2:22     ` Jason Wang
@ 2022-05-11 14:01       ` Halil Pasic
  -1 siblings, 0 replies; 71+ messages in thread
From: Halil Pasic @ 2022-05-11 14:01 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cindy Lu, Paul E. McKenney, mst, Peter Zijlstra, Marc Zyngier,
	Cornelia Huck, linux-kernel, virtualization, Halil Pasic,
	eperezma, Thomas Gleixner

On Wed, 11 May 2022 10:22:59 +0800
Jason Wang <jasowang@redhat.com> wrote:

> >        CPU0
> >        ----
> >   lock(&vcdev->irq_lock);
> >   <Interrupt>
> >     lock(&vcdev->irq_lock);
> >
> >  *** DEADLOCK ***  
> 
> It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> do the synchronization.
> 
> And we probably need to keep the
> read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> virtio_ccw_int_handler() to be called from process context (e.g from
> the io_subchannel_quiesce()).
> 

Sounds correct.

Regards,
Halil
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
@ 2022-05-11 14:01       ` Halil Pasic
  0 siblings, 0 replies; 71+ messages in thread
From: Halil Pasic @ 2022-05-11 14:01 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cornelia Huck, mst, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo, Halil Pasic

On Wed, 11 May 2022 10:22:59 +0800
Jason Wang <jasowang@redhat.com> wrote:

> >        CPU0
> >        ----
> >   lock(&vcdev->irq_lock);
> >   <Interrupt>
> >     lock(&vcdev->irq_lock);
> >
> >  *** DEADLOCK ***  
> 
> It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> do the synchronization.
> 
> And we probably need to keep the
> read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> virtio_ccw_int_handler() to be called from process context (e.g from
> the io_subchannel_quiesce()).
> 

Sounds correct.

Regards,
Halil

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
  2022-05-11  9:28               ` Jason Wang
  (?)
@ 2022-05-11 14:52               ` Vineeth Vijayan
  2022-05-12  3:29                   ` Jason Wang
  -1 siblings, 1 reply; 71+ messages in thread
From: Vineeth Vijayan @ 2022-05-11 14:52 UTC (permalink / raw)
  To: Jason Wang
  Cc: Michael S. Tsirkin, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	Halil Pasic, eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo,
	Peter Oberparleiter

On Wed, May 11, 2022 at 05:28:11PM +0800, Jason Wang wrote:
> On Wed, May 11, 2022 at 5:13 PM Cornelia Huck <cohuck@redhat.com> wrote:
> >
> > On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> >
> > > On Wed, May 11, 2022 at 4:17 PM Cornelia Huck <cohuck@redhat.com> wrote:
> > >>
> > >> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> > >>
> > >> > On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >> >>
> > >> >> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> > >> >> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> > >> >> >                       vcdev->err = -EIO;
> > >> >> >       }
> > >> >> >       virtio_ccw_check_activity(vcdev, activity);
> > >> >> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
> > >> >> >       for_each_set_bit(i, indicators(vcdev),
> > >> >> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> > >> >> >               /* The bit clear must happen before the vring kick. */
> > >> >>
> > >> >> Cornelia sent a lockdep trace on this.
> > >> >>
> > >> >> Basically I think this gets the irqsave/restore logic wrong.
> > >> >> It attempts to disable irqs in the handler (which is an interrupt
> > >> >> anyway).
> > >> >
> > >> > The reason I use irqsave/restore is that it can be called from process
> > >> > context (if I was not wrong), e.g from io_subchannel_quiesce().
> > >>
> > >> io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
> > >> would be a bug.
> > >
> > > Right, it was protected by a spin_lock_irq(), but I can see other
> > > cdev->handler() in e.g device_fsm.c, the irq status is not obvious, do
> > > they have the same assumption which IRQ is disabled?
> >
> > Yes, that should be the case for any invocations via the fsm as well.
> >
> 
> Ok.
> 
> > It's been some time since I've worked on that part of the code, though,
> > so let's cc: the s390 cio maintainers so that they can speak up if I'm
> > wrong.
> 
> Ok, I will do that.
> 
> Thanks
> 
> >
Thank you Corny to looking in to this. I agree, the cdev->handler is
called with lock held. And as you mentioned, in the fsm these handler
invocations are done with IRQ disabled, which will otherwise end up in a
deadlock.
thanks.

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
  2022-05-11 12:49             ` Halil Pasic
@ 2022-05-12  3:27               ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-12  3:27 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Cornelia Huck, Michael S. Tsirkin, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Wed, May 11, 2022 at 8:49 PM Halil Pasic <pasic@linux.ibm.com> wrote:
>
> On Wed, 11 May 2022 17:27:44 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
> > On Wed, May 11, 2022 at 4:44 PM Cornelia Huck <cohuck@redhat.com> wrote:
> > >
> > > On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > > On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >>
> > > >> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> > > >> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > >> > index d8a2340f928e..23f1694cdbd5 100644
> > > >> > --- a/include/linux/virtio_config.h
> > > >> > +++ b/include/linux/virtio_config.h
> > > >> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
> > > >> >       unsigned status = dev->config->get_status(dev);
> > > >> >
> > > >> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > >> > +
> > > >> > +     /*
> > > >> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
> > > >> > +      * will see the driver specific setup if it sees vq->broken
> > > >> > +      * as false.
> > > >> > +      */
> > > >> > +     virtio_synchronize_cbs(dev);
> > > >>
> > > >> since you mention vq->broken above, maybe add
> > > >>         "set vq->broken to false"
> > > >
> > > > Ok.
> > > >
> > > >>
> > > >> > +     __virtio_unbreak_device(dev);
> > > >> > +     /*
> > > >> > +      * The transport is expected ensure the visibility of
> > > >>
> > > >> to ensure
> > > >
> > > > Will fix.
> > > >
> > > >>
> > > >> > +      * vq->broken
> > > >>
> > > >> let's add: "visibility by vq callbacks"
> > > >
> > > > Sure.
> > > >
> > > >>
> > > >> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
> > > >> > +      */
> > > >>
> > > >>
> > > >> Can I see some analysis of existing transports showing
> > > >> this is actually the case for them?
> > > >
> > > > Yes.
> > > >
> > > >> And maybe add a comment near set_status to document the
> > > >> requirement.
> > > >
> > > > For PCI and MMIO, we can quote the memory-barriers.txt or explain that
> > > > wmb() is not needed before the MMIO writel().
> > > > For CCW, it looks not obvious, it looks to me the IO was submitted via
> > > > __ssch() which has an inline assembly.  Cornelia and Hali, could you
> > > > help me to understand if and how did virtio_ccw_set_status() can
> > > > ensure the visibility of the previous driver setup and vq->broken
> > > > here?
> > >
> > > I'm not sure I completely understand the question here, but let me try:
> >
> > It's something like the following case:
> >
> > CPU 0: vq->broken = false
> > CPU 0: set_status(DRIVER_OK)
> > CPU 1: vring_interrupt() { if (vq->broken) return IRQ_NONE; }
> >
> > We need to make sure the CPU 1 sees the vq->broken if the interrupt is
> > raised after DRVER_OK.
> >
> > For PCI, we use MMIO of writel() for set_status(), a wmb() is not
> > needed in this case according to memory-barriers.txt.
> >
> > "
> > Note that, when using writel(), a prior
> > wmb() is not needed to guarantee that the cache coherent memory writes
> > have completed before writing to the MMIO region.
> > "
>
>
> IMHO the key facts here are the following:
> * ssch and all other I/O instructions are serializing instructions
> * all interruptions are serializing operations
>
> For reference see
> https://www.ibm.com/resources/publications/OutputPubsDetails?PubID=SA22783213
> page 5-138.

I see thanks for the pointer.

>
>
> Maybe we should add that to the linux documentation somewhere if
> not already mentioned.

Maybe somewhere in memory-barriers.txt.

>
> So IMHO we don't need CPU0 to do a wmb() because of the ssch.
>

Right.

> >
> > So CPU 1 will see the broken as false.
>
> But barriers need to be paired.

Yes, actually the pairing is done by the device where it need something like:

if (get_status(DRIVER_OK)) {
    rmb();
    start_device_logic();
    raise_interrupt();
}

> And in my understanding the ssch
> doesn't really ensure that CPU1 is about to see the change, unless
> there is a suitable barrier that pairs with the barrier implied
> the ssch instruction.
>
> Assumed vring_interrupt() is always done in hard-irq context, AFAIU,
> we should be fine. Is that assumption correct?
>
> Why are we fine:
> * Either the ssch was performed before the interrupt for
>   vring_interrupt() got delivered on CPU1, and then we are guaranteed to
>   see the updated value for vq->broken,

Yes, for a well behaved device, the device will raise the interrupt
after it sees DRIVER_OK and the ssch guarantees that when the device
sees DRIVER_OK vq->broken is false.

> * or the interrupt that triggered vring_interrupt() was delivered before
>   the ssch instruction got executed. But in this case it is fine to
>   ignore the notification, because this is actually the bad case
>   we want to guard against: we got a notification when
>   notifications are not allowed.

Exactly.

>
> We may end up with !vq->broken and !DEVICE_OK as well, but that should
> be fine because, although that notification would be a should not happen
> one, I understand it would not catch us with our pants down.

Right.

>
> Regards,
> Halil
>
>
> >
> > >
> > > virtio_ccw_set_status() uses a channel command to set the status, with
> > > the interesting stuff done inside ccw_io_helper(). That function
> > > - takes the subchannel lock, disabling interrupts
> >
> > Then it is, for x86 the operation to disable interrupt is a full
> > barrier. I guess this should apply to other architecture like s390. I
> > see a stnsm is used in this case but a quick google doesn't tell me if
> > it's a barrier.

Looks like it's not a serialization instruction and this
memory-barriers.rst told me irq-disabling is only a compiler barrier:

"""
Functions that disable interrupts (ACQUIRE equivalent) and enable interrupts
(RELEASE equivalent) will act as compiler barriers only.  So if memory or I/O
barriers are required in such a situation, they must be provided from some
other means.
"""

Thanks

> > If this is true. The vring_interrupt will see broken as false.
> >
> > > - does the ssch; this instruction will fail if there's already another
> > >   I/O in progress, or an interrupt is pending for the subchannel; on
> > >   success, it is guaranteed that we'll get an interrupt eventually
> >
> > I guess ssch might imply a barrier as well, otherwise we may need a
> > lot of barriers before this.
> >
> > Thanks
> >
> > > - unlock the subchannel, and wait for the interupt handler to eventually
> > >   process the interrupt, so I guess it should see the vq->broken value?
> > >
> > > If the I/O fails, virtio_ccw_set_status() will revert its internal
> > > status to the old value.
> > >
> > >
> > > >
> > > > Thanks
> > > >
> > > >>
> > > >> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > >> >  }
> > >
> >
>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 8/9] virtio: harden vring IRQ
@ 2022-05-12  3:27               ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-12  3:27 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Cindy Lu, Paul E. McKenney, Michael S. Tsirkin, Peter Zijlstra,
	Marc Zyngier, Cornelia Huck, linux-kernel, virtualization,
	eperezma, Thomas Gleixner

On Wed, May 11, 2022 at 8:49 PM Halil Pasic <pasic@linux.ibm.com> wrote:
>
> On Wed, 11 May 2022 17:27:44 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
> > On Wed, May 11, 2022 at 4:44 PM Cornelia Huck <cohuck@redhat.com> wrote:
> > >
> > > On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > > On Tue, May 10, 2022 at 7:32 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >>
> > > >> On Sat, May 07, 2022 at 03:19:53PM +0800, Jason Wang wrote:
> > > >> > diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> > > >> > index d8a2340f928e..23f1694cdbd5 100644
> > > >> > --- a/include/linux/virtio_config.h
> > > >> > +++ b/include/linux/virtio_config.h
> > > >> > @@ -256,6 +256,18 @@ void virtio_device_ready(struct virtio_device *dev)
> > > >> >       unsigned status = dev->config->get_status(dev);
> > > >> >
> > > >> >       BUG_ON(status & VIRTIO_CONFIG_S_DRIVER_OK);
> > > >> > +
> > > >> > +     /*
> > > >> > +      * The virtio_synchronize_cbs() makes sure vring_interrupt()
> > > >> > +      * will see the driver specific setup if it sees vq->broken
> > > >> > +      * as false.
> > > >> > +      */
> > > >> > +     virtio_synchronize_cbs(dev);
> > > >>
> > > >> since you mention vq->broken above, maybe add
> > > >>         "set vq->broken to false"
> > > >
> > > > Ok.
> > > >
> > > >>
> > > >> > +     __virtio_unbreak_device(dev);
> > > >> > +     /*
> > > >> > +      * The transport is expected ensure the visibility of
> > > >>
> > > >> to ensure
> > > >
> > > > Will fix.
> > > >
> > > >>
> > > >> > +      * vq->broken
> > > >>
> > > >> let's add: "visibility by vq callbacks"
> > > >
> > > > Sure.
> > > >
> > > >>
> > > >> > before setting VIRTIO_CONFIG_S_DRIVER_OK.
> > > >> > +      */
> > > >>
> > > >>
> > > >> Can I see some analysis of existing transports showing
> > > >> this is actually the case for them?
> > > >
> > > > Yes.
> > > >
> > > >> And maybe add a comment near set_status to document the
> > > >> requirement.
> > > >
> > > > For PCI and MMIO, we can quote the memory-barriers.txt or explain that
> > > > wmb() is not needed before the MMIO writel().
> > > > For CCW, it looks not obvious, it looks to me the IO was submitted via
> > > > __ssch() which has an inline assembly.  Cornelia and Hali, could you
> > > > help me to understand if and how did virtio_ccw_set_status() can
> > > > ensure the visibility of the previous driver setup and vq->broken
> > > > here?
> > >
> > > I'm not sure I completely understand the question here, but let me try:
> >
> > It's something like the following case:
> >
> > CPU 0: vq->broken = false
> > CPU 0: set_status(DRIVER_OK)
> > CPU 1: vring_interrupt() { if (vq->broken) return IRQ_NONE; }
> >
> > We need to make sure the CPU 1 sees the vq->broken if the interrupt is
> > raised after DRVER_OK.
> >
> > For PCI, we use MMIO of writel() for set_status(), a wmb() is not
> > needed in this case according to memory-barriers.txt.
> >
> > "
> > Note that, when using writel(), a prior
> > wmb() is not needed to guarantee that the cache coherent memory writes
> > have completed before writing to the MMIO region.
> > "
>
>
> IMHO the key facts here are the following:
> * ssch and all other I/O instructions are serializing instructions
> * all interruptions are serializing operations
>
> For reference see
> https://www.ibm.com/resources/publications/OutputPubsDetails?PubID=SA22783213
> page 5-138.

I see thanks for the pointer.

>
>
> Maybe we should add that to the linux documentation somewhere if
> not already mentioned.

Maybe somewhere in memory-barriers.txt.

>
> So IMHO we don't need CPU0 to do a wmb() because of the ssch.
>

Right.

> >
> > So CPU 1 will see the broken as false.
>
> But barriers need to be paired.

Yes, actually the pairing is done by the device where it need something like:

if (get_status(DRIVER_OK)) {
    rmb();
    start_device_logic();
    raise_interrupt();
}

> And in my understanding the ssch
> doesn't really ensure that CPU1 is about to see the change, unless
> there is a suitable barrier that pairs with the barrier implied
> the ssch instruction.
>
> Assumed vring_interrupt() is always done in hard-irq context, AFAIU,
> we should be fine. Is that assumption correct?
>
> Why are we fine:
> * Either the ssch was performed before the interrupt for
>   vring_interrupt() got delivered on CPU1, and then we are guaranteed to
>   see the updated value for vq->broken,

Yes, for a well behaved device, the device will raise the interrupt
after it sees DRIVER_OK and the ssch guarantees that when the device
sees DRIVER_OK vq->broken is false.

> * or the interrupt that triggered vring_interrupt() was delivered before
>   the ssch instruction got executed. But in this case it is fine to
>   ignore the notification, because this is actually the bad case
>   we want to guard against: we got a notification when
>   notifications are not allowed.

Exactly.

>
> We may end up with !vq->broken and !DEVICE_OK as well, but that should
> be fine because, although that notification would be a should not happen
> one, I understand it would not catch us with our pants down.

Right.

>
> Regards,
> Halil
>
>
> >
> > >
> > > virtio_ccw_set_status() uses a channel command to set the status, with
> > > the interesting stuff done inside ccw_io_helper(). That function
> > > - takes the subchannel lock, disabling interrupts
> >
> > Then it is, for x86 the operation to disable interrupt is a full
> > barrier. I guess this should apply to other architecture like s390. I
> > see a stnsm is used in this case but a quick google doesn't tell me if
> > it's a barrier.

Looks like it's not a serialization instruction and this
memory-barriers.rst told me irq-disabling is only a compiler barrier:

"""
Functions that disable interrupts (ACQUIRE equivalent) and enable interrupts
(RELEASE equivalent) will act as compiler barriers only.  So if memory or I/O
barriers are required in such a situation, they must be provided from some
other means.
"""

Thanks

> > If this is true. The vring_interrupt will see broken as false.
> >
> > > - does the ssch; this instruction will fail if there's already another
> > >   I/O in progress, or an interrupt is pending for the subchannel; on
> > >   success, it is guaranteed that we'll get an interrupt eventually
> >
> > I guess ssch might imply a barrier as well, otherwise we may need a
> > lot of barriers before this.
> >
> > Thanks
> >
> > > - unlock the subchannel, and wait for the interupt handler to eventually
> > >   process the interrupt, so I guess it should see the vq->broken value?
> > >
> > > If the I/O fails, virtio_ccw_set_status() will revert its internal
> > > status to the old value.
> > >
> > >
> > > >
> > > > Thanks
> > > >
> > > >>
> > > >> >       dev->config->set_status(dev, status | VIRTIO_CONFIG_S_DRIVER_OK);
> > > >> >  }
> > >
> >
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
  2022-05-11 14:52               ` Vineeth Vijayan
@ 2022-05-12  3:29                   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-12  3:29 UTC (permalink / raw)
  To: Vineeth Vijayan
  Cc: Michael S. Tsirkin, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	Halil Pasic, eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo,
	Peter Oberparleiter

On Wed, May 11, 2022 at 10:52 PM Vineeth Vijayan <vneethv@linux.ibm.com> wrote:
>
> On Wed, May 11, 2022 at 05:28:11PM +0800, Jason Wang wrote:
> > On Wed, May 11, 2022 at 5:13 PM Cornelia Huck <cohuck@redhat.com> wrote:
> > >
> > > On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > > On Wed, May 11, 2022 at 4:17 PM Cornelia Huck <cohuck@redhat.com> wrote:
> > > >>
> > > >> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> > > >>
> > > >> > On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >> >>
> > > >> >> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> > > >> >> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> > > >> >> >                       vcdev->err = -EIO;
> > > >> >> >       }
> > > >> >> >       virtio_ccw_check_activity(vcdev, activity);
> > > >> >> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
> > > >> >> >       for_each_set_bit(i, indicators(vcdev),
> > > >> >> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> > > >> >> >               /* The bit clear must happen before the vring kick. */
> > > >> >>
> > > >> >> Cornelia sent a lockdep trace on this.
> > > >> >>
> > > >> >> Basically I think this gets the irqsave/restore logic wrong.
> > > >> >> It attempts to disable irqs in the handler (which is an interrupt
> > > >> >> anyway).
> > > >> >
> > > >> > The reason I use irqsave/restore is that it can be called from process
> > > >> > context (if I was not wrong), e.g from io_subchannel_quiesce().
> > > >>
> > > >> io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
> > > >> would be a bug.
> > > >
> > > > Right, it was protected by a spin_lock_irq(), but I can see other
> > > > cdev->handler() in e.g device_fsm.c, the irq status is not obvious, do
> > > > they have the same assumption which IRQ is disabled?
> > >
> > > Yes, that should be the case for any invocations via the fsm as well.
> > >
> >
> > Ok.
> >
> > > It's been some time since I've worked on that part of the code, though,
> > > so let's cc: the s390 cio maintainers so that they can speak up if I'm
> > > wrong.
> >
> > Ok, I will do that.
> >
> > Thanks
> >
> > >
> Thank you Corny to looking in to this. I agree, the cdev->handler is
> called with lock held. And as you mentioned, in the fsm these handler
> invocations are done with IRQ disabled, which will otherwise end up in a
> deadlock.
> thanks.
>

Thanks a lot for the confirmation, I will use
spin_lock()/spin_unlock() in the next version.


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 6/9] virtio-ccw: implement synchronize_cbs()
@ 2022-05-12  3:29                   ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-12  3:29 UTC (permalink / raw)
  To: Vineeth Vijayan
  Cc: Peter Oberparleiter, Cindy Lu, Paul E. McKenney,
	Michael S. Tsirkin, Peter Zijlstra, Marc Zyngier, linux-kernel,
	virtualization, Halil Pasic, eperezma, Thomas Gleixner

On Wed, May 11, 2022 at 10:52 PM Vineeth Vijayan <vneethv@linux.ibm.com> wrote:
>
> On Wed, May 11, 2022 at 05:28:11PM +0800, Jason Wang wrote:
> > On Wed, May 11, 2022 at 5:13 PM Cornelia Huck <cohuck@redhat.com> wrote:
> > >
> > > On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> > >
> > > > On Wed, May 11, 2022 at 4:17 PM Cornelia Huck <cohuck@redhat.com> wrote:
> > > >>
> > > >> On Wed, May 11 2022, Jason Wang <jasowang@redhat.com> wrote:
> > > >>
> > > >> > On Tue, May 10, 2022 at 7:28 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >> >>
> > > >> >> On Sat, May 07, 2022 at 03:19:51PM +0800, Jason Wang wrote:
> > > >> >> > @@ -1106,6 +1130,7 @@ static void virtio_ccw_int_handler(struct ccw_device *cdev,
> > > >> >> >                       vcdev->err = -EIO;
> > > >> >> >       }
> > > >> >> >       virtio_ccw_check_activity(vcdev, activity);
> > > >> >> > +     read_lock_irqsave(&vcdev->irq_lock, flags);
> > > >> >> >       for_each_set_bit(i, indicators(vcdev),
> > > >> >> >                        sizeof(*indicators(vcdev)) * BITS_PER_BYTE) {
> > > >> >> >               /* The bit clear must happen before the vring kick. */
> > > >> >>
> > > >> >> Cornelia sent a lockdep trace on this.
> > > >> >>
> > > >> >> Basically I think this gets the irqsave/restore logic wrong.
> > > >> >> It attempts to disable irqs in the handler (which is an interrupt
> > > >> >> anyway).
> > > >> >
> > > >> > The reason I use irqsave/restore is that it can be called from process
> > > >> > context (if I was not wrong), e.g from io_subchannel_quiesce().
> > > >>
> > > >> io_subchannel_quiesce() should disable interrupts, though? Otherwise, it
> > > >> would be a bug.
> > > >
> > > > Right, it was protected by a spin_lock_irq(), but I can see other
> > > > cdev->handler() in e.g device_fsm.c, the irq status is not obvious, do
> > > > they have the same assumption which IRQ is disabled?
> > >
> > > Yes, that should be the case for any invocations via the fsm as well.
> > >
> >
> > Ok.
> >
> > > It's been some time since I've worked on that part of the code, though,
> > > so let's cc: the s390 cio maintainers so that they can speak up if I'm
> > > wrong.
> >
> > Ok, I will do that.
> >
> > Thanks
> >
> > >
> Thank you Corny to looking in to this. I agree, the cdev->handler is
> called with lock held. And as you mentioned, in the fsm these handler
> invocations are done with IRQ disabled, which will otherwise end up in a
> deadlock.
> thanks.
>

Thanks a lot for the confirmation, I will use
spin_lock()/spin_unlock() in the next version.

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
  2022-05-11 14:01       ` Halil Pasic
@ 2022-05-12  3:31         ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-12  3:31 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Cornelia Huck, mst, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Wed, May 11, 2022 at 10:02 PM Halil Pasic <pasic@linux.ibm.com> wrote:
>
> On Wed, 11 May 2022 10:22:59 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
> > >        CPU0
> > >        ----
> > >   lock(&vcdev->irq_lock);
> > >   <Interrupt>
> > >     lock(&vcdev->irq_lock);
> > >
> > >  *** DEADLOCK ***
> >
> > It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> > do the synchronization.
> >
> > And we probably need to keep the
> > read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> > virtio_ccw_int_handler() to be called from process context (e.g from
> > the io_subchannel_quiesce()).
> >
>
> Sounds correct.

As Cornelia and Vineeth pointed out, all the paths the vring_interrupt
is called with irq disabled.

So I will use spin_lock()/spin_unlock() in the next version.

Thanks

>
> Regards,
> Halil
>


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
@ 2022-05-12  3:31         ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-12  3:31 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Cindy Lu, Paul E. McKenney, mst, Peter Zijlstra, Marc Zyngier,
	Cornelia Huck, linux-kernel, virtualization, eperezma,
	Thomas Gleixner

On Wed, May 11, 2022 at 10:02 PM Halil Pasic <pasic@linux.ibm.com> wrote:
>
> On Wed, 11 May 2022 10:22:59 +0800
> Jason Wang <jasowang@redhat.com> wrote:
>
> > >        CPU0
> > >        ----
> > >   lock(&vcdev->irq_lock);
> > >   <Interrupt>
> > >     lock(&vcdev->irq_lock);
> > >
> > >  *** DEADLOCK ***
> >
> > It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> > do the synchronization.
> >
> > And we probably need to keep the
> > read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> > virtio_ccw_int_handler() to be called from process context (e.g from
> > the io_subchannel_quiesce()).
> >
>
> Sounds correct.

As Cornelia and Vineeth pointed out, all the paths the vring_interrupt
is called with irq disabled.

So I will use spin_lock()/spin_unlock() in the next version.

Thanks

>
> Regards,
> Halil
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
  2022-05-12  3:31         ` Jason Wang
@ 2022-05-16 11:20           ` Halil Pasic
  -1 siblings, 0 replies; 71+ messages in thread
From: Halil Pasic @ 2022-05-16 11:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cindy Lu, Paul E. McKenney, mst, Peter Zijlstra, Marc Zyngier,
	Cornelia Huck, linux-kernel, virtualization, Halil Pasic,
	eperezma, Thomas Gleixner

On Thu, 12 May 2022 11:31:08 +0800
Jason Wang <jasowang@redhat.com> wrote:

> > > It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> > > do the synchronization.
> > >
> > > And we probably need to keep the
> > > read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> > > virtio_ccw_int_handler() to be called from process context (e.g from
> > > the io_subchannel_quiesce()).
> > >  
> >
> > Sounds correct.  
> 
> As Cornelia and Vineeth pointed out, all the paths the vring_interrupt
> is called with irq disabled.
> 
> So I will use spin_lock()/spin_unlock() in the next version.

Can we do some sort of an assertion that if the kernel is built with
the corresponding debug features will make sure this assumption holds
(and warn if it does not)? That assertion would also document the fact.

If an assertion is not possible, I think we should at least place a
strategic comment that documents our assumption.

Regards,
Halil

> 
> Thanks
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
@ 2022-05-16 11:20           ` Halil Pasic
  0 siblings, 0 replies; 71+ messages in thread
From: Halil Pasic @ 2022-05-16 11:20 UTC (permalink / raw)
  To: Jason Wang
  Cc: Cornelia Huck, mst, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo, Halil Pasic

On Thu, 12 May 2022 11:31:08 +0800
Jason Wang <jasowang@redhat.com> wrote:

> > > It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> > > do the synchronization.
> > >
> > > And we probably need to keep the
> > > read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> > > virtio_ccw_int_handler() to be called from process context (e.g from
> > > the io_subchannel_quiesce()).
> > >  
> >
> > Sounds correct.  
> 
> As Cornelia and Vineeth pointed out, all the paths the vring_interrupt
> is called with irq disabled.
> 
> So I will use spin_lock()/spin_unlock() in the next version.

Can we do some sort of an assertion that if the kernel is built with
the corresponding debug features will make sure this assumption holds
(and warn if it does not)? That assertion would also document the fact.

If an assertion is not possible, I think we should at least place a
strategic comment that documents our assumption.

Regards,
Halil

> 
> Thanks

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
  2022-05-16 11:20           ` Halil Pasic
@ 2022-05-16 14:25             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 71+ messages in thread
From: Michael S. Tsirkin @ 2022-05-16 14:25 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Jason Wang, Cornelia Huck, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Mon, May 16, 2022 at 01:20:06PM +0200, Halil Pasic wrote:
> On Thu, 12 May 2022 11:31:08 +0800
> Jason Wang <jasowang@redhat.com> wrote:
> 
> > > > It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> > > > do the synchronization.
> > > >
> > > > And we probably need to keep the
> > > > read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> > > > virtio_ccw_int_handler() to be called from process context (e.g from
> > > > the io_subchannel_quiesce()).
> > > >  
> > >
> > > Sounds correct.  
> > 
> > As Cornelia and Vineeth pointed out, all the paths the vring_interrupt
> > is called with irq disabled.
> > 
> > So I will use spin_lock()/spin_unlock() in the next version.
> 
> Can we do some sort of an assertion that if the kernel is built with
> the corresponding debug features will make sure this assumption holds
> (and warn if it does not)? That assertion would also document the fact.

Lockdep will do this automatically if you get it wrong, just like it
did here.

> If an assertion is not possible, I think we should at least place a
> strategic comment that documents our assumption.

That can't hurt.

> Regards,
> Halil
> 
> > 
> > Thanks


^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
@ 2022-05-16 14:25             ` Michael S. Tsirkin
  0 siblings, 0 replies; 71+ messages in thread
From: Michael S. Tsirkin @ 2022-05-16 14:25 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Cindy Lu, Paul E. McKenney, Peter Zijlstra, Marc Zyngier,
	Cornelia Huck, linux-kernel, virtualization, eperezma,
	Thomas Gleixner

On Mon, May 16, 2022 at 01:20:06PM +0200, Halil Pasic wrote:
> On Thu, 12 May 2022 11:31:08 +0800
> Jason Wang <jasowang@redhat.com> wrote:
> 
> > > > It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> > > > do the synchronization.
> > > >
> > > > And we probably need to keep the
> > > > read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> > > > virtio_ccw_int_handler() to be called from process context (e.g from
> > > > the io_subchannel_quiesce()).
> > > >  
> > >
> > > Sounds correct.  
> > 
> > As Cornelia and Vineeth pointed out, all the paths the vring_interrupt
> > is called with irq disabled.
> > 
> > So I will use spin_lock()/spin_unlock() in the next version.
> 
> Can we do some sort of an assertion that if the kernel is built with
> the corresponding debug features will make sure this assumption holds
> (and warn if it does not)? That assertion would also document the fact.

Lockdep will do this automatically if you get it wrong, just like it
did here.

> If an assertion is not possible, I think we should at least place a
> strategic comment that documents our assumption.

That can't hurt.

> Regards,
> Halil
> 
> > 
> > Thanks

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
  2022-05-16 14:25             ` Michael S. Tsirkin
@ 2022-05-17  1:00               ` Jason Wang
  -1 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-17  1:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cindy Lu, Paul E. McKenney, Peter Zijlstra, Marc Zyngier,
	Cornelia Huck, linux-kernel, virtualization, Halil Pasic,
	eperezma, Thomas Gleixner

On Mon, May 16, 2022 at 10:25 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, May 16, 2022 at 01:20:06PM +0200, Halil Pasic wrote:
> > On Thu, 12 May 2022 11:31:08 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >
> > > > > It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> > > > > do the synchronization.
> > > > >
> > > > > And we probably need to keep the
> > > > > read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> > > > > virtio_ccw_int_handler() to be called from process context (e.g from
> > > > > the io_subchannel_quiesce()).
> > > > >
> > > >
> > > > Sounds correct.
> > >
> > > As Cornelia and Vineeth pointed out, all the paths the vring_interrupt
> > > is called with irq disabled.
> > >
> > > So I will use spin_lock()/spin_unlock() in the next version.
> >
> > Can we do some sort of an assertion that if the kernel is built with
> > the corresponding debug features will make sure this assumption holds
> > (and warn if it does not)? That assertion would also document the fact.
>
> Lockdep will do this automatically if you get it wrong, just like it
> did here.
>
> > If an assertion is not possible, I think we should at least place a
> > strategic comment that documents our assumption.
>
> That can't hurt.

I will add some comments here.

Thanks

>
> > Regards,
> > Halil
> >
> > >
> > > Thanks
>

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: [PATCH V4 0/9] rework on the IRQ hardening of virtio
@ 2022-05-17  1:00               ` Jason Wang
  0 siblings, 0 replies; 71+ messages in thread
From: Jason Wang @ 2022-05-17  1:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Halil Pasic, Cornelia Huck, virtualization, linux-kernel,
	Thomas Gleixner, Peter Zijlstra, Paul E. McKenney, Marc Zyngier,
	eperezma, Cindy Lu, Stefano Garzarella, Xuan Zhuo

On Mon, May 16, 2022 at 10:25 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, May 16, 2022 at 01:20:06PM +0200, Halil Pasic wrote:
> > On Thu, 12 May 2022 11:31:08 +0800
> > Jason Wang <jasowang@redhat.com> wrote:
> >
> > > > > It looks to me we need to use write_lock_irq()/write_unlock_irq() to
> > > > > do the synchronization.
> > > > >
> > > > > And we probably need to keep the
> > > > > read_lock_irqsave()/read_lock_irqrestore() logic since I can see the
> > > > > virtio_ccw_int_handler() to be called from process context (e.g from
> > > > > the io_subchannel_quiesce()).
> > > > >
> > > >
> > > > Sounds correct.
> > >
> > > As Cornelia and Vineeth pointed out, all the paths the vring_interrupt
> > > is called with irq disabled.
> > >
> > > So I will use spin_lock()/spin_unlock() in the next version.
> >
> > Can we do some sort of an assertion that if the kernel is built with
> > the corresponding debug features will make sure this assumption holds
> > (and warn if it does not)? That assertion would also document the fact.
>
> Lockdep will do this automatically if you get it wrong, just like it
> did here.
>
> > If an assertion is not possible, I think we should at least place a
> > strategic comment that documents our assumption.
>
> That can't hurt.

I will add some comments here.

Thanks

>
> > Regards,
> > Halil
> >
> > >
> > > Thanks
>


^ permalink raw reply	[flat|nested] 71+ messages in thread

end of thread, other threads:[~2022-05-17  1:00 UTC | newest]

Thread overview: 71+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-07  7:19 [PATCH V4 0/9] rework on the IRQ hardening of virtio Jason Wang
2022-05-07  7:19 ` Jason Wang
2022-05-07  7:19 ` [PATCH V4 1/9] virtio: use virtio_device_ready() in virtio_device_restore() Jason Wang
2022-05-07  7:19   ` Jason Wang
2022-05-09 15:22   ` Cornelia Huck
2022-05-09 15:22     ` Cornelia Huck
2022-05-10  1:50     ` Jason Wang
2022-05-10  1:50       ` Jason Wang
2022-05-07  7:19 ` [PATCH V4 2/9] virtio: use virtio_reset_device() when possible Jason Wang
2022-05-07  7:19   ` Jason Wang
2022-05-07  7:19 ` [PATCH V4 3/9] virtio: introduce config op to synchronize vring callbacks Jason Wang
2022-05-07  7:19   ` Jason Wang
2022-05-09 15:24   ` Cornelia Huck
2022-05-09 15:24     ` Cornelia Huck
2022-05-07  7:19 ` [PATCH V4 4/9] virtio-pci: implement synchronize_cbs() Jason Wang
2022-05-07  7:19   ` Jason Wang
2022-05-09 15:26   ` Cornelia Huck
2022-05-09 15:26     ` Cornelia Huck
2022-05-07  7:19 ` [PATCH V4 5/9] virtio-mmio: " Jason Wang
2022-05-07  7:19   ` Jason Wang
2022-05-09 15:34   ` Cornelia Huck
2022-05-09 15:34     ` Cornelia Huck
2022-05-07  7:19 ` [PATCH V4 6/9] virtio-ccw: " Jason Wang
2022-05-07  7:19   ` Jason Wang
2022-05-10 11:27   ` Michael S. Tsirkin
2022-05-10 11:27     ` Michael S. Tsirkin
2022-05-11  2:41     ` Jason Wang
2022-05-11  2:41       ` Jason Wang
2022-05-11  8:17       ` Cornelia Huck
2022-05-11  8:17         ` Cornelia Huck
2022-05-11  8:58         ` Jason Wang
2022-05-11  8:58           ` Jason Wang
2022-05-11  9:13           ` Cornelia Huck
2022-05-11  9:13             ` Cornelia Huck
2022-05-11  9:28             ` Jason Wang
2022-05-11  9:28               ` Jason Wang
2022-05-11 14:52               ` Vineeth Vijayan
2022-05-12  3:29                 ` Jason Wang
2022-05-12  3:29                   ` Jason Wang
2022-05-07  7:19 ` [PATCH V4 7/9] virtio: allow to unbreak virtqueue Jason Wang
2022-05-07  7:19   ` Jason Wang
2022-05-07  7:19 ` [PATCH V4 8/9] virtio: harden vring IRQ Jason Wang
2022-05-07  7:19   ` Jason Wang
2022-05-10 11:32   ` Michael S. Tsirkin
2022-05-10 11:32     ` Michael S. Tsirkin
2022-05-11  2:40     ` Jason Wang
2022-05-11  2:40       ` Jason Wang
2022-05-11  8:44       ` Cornelia Huck
2022-05-11  8:44         ` Cornelia Huck
2022-05-11  9:27         ` Jason Wang
2022-05-11  9:27           ` Jason Wang
2022-05-11 12:49           ` Halil Pasic
2022-05-11 12:49             ` Halil Pasic
2022-05-12  3:27             ` Jason Wang
2022-05-12  3:27               ` Jason Wang
2022-05-07  7:19 ` [PATCH V4 9/9] virtio: use WARN_ON() to warning illegal status value Jason Wang
2022-05-07  7:19   ` Jason Wang
2022-05-10  9:29 ` [PATCH V4 0/9] rework on the IRQ hardening of virtio Cornelia Huck
2022-05-10  9:29   ` Cornelia Huck
2022-05-11  2:22   ` Jason Wang
2022-05-11  2:22     ` Jason Wang
2022-05-11 14:01     ` Halil Pasic
2022-05-11 14:01       ` Halil Pasic
2022-05-12  3:31       ` Jason Wang
2022-05-12  3:31         ` Jason Wang
2022-05-16 11:20         ` Halil Pasic
2022-05-16 11:20           ` Halil Pasic
2022-05-16 14:25           ` Michael S. Tsirkin
2022-05-16 14:25             ` Michael S. Tsirkin
2022-05-17  1:00             ` Jason Wang
2022-05-17  1:00               ` Jason Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.