kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw
@ 2020-11-20 18:07 Eric Farman
  2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Eric Farman @ 2020-11-20 18:07 UTC (permalink / raw)
  To: Cornelia Huck, Kirti Wankhede, Alex Williamson
  Cc: Halil Pasic, Matthew Rosato, linux-s390, kvm, Eric Farman

There is a situation where removing all the paths from a device
connected via mdev and vfio-ccw can cause some difficulty.
Using the "chchp -c 0 xx" command to all paths will cause the
device to be removed from the configuration, and any guest
filesystem that is relying on that device will encounter errors.
Interestingly, the last chchp command will actually fail to
return to a shell prompt, and subsequent commands (another
chchp to bring the paths back online, chzdev, etc.) will also
hang because of the outstanding chchp.

The last chchp command drives to vfio_ccw_sch_remove() for every
affected mediated device, and ultimately enters an infinite loop
in vfio_del_group_dev(). This loop is broken when the guest goes
away, which in this case doesn't occur until the guest is shutdown.
This drives vfio_ccw_mdev_release() and thus vfio_device_release()
to wake up the vfio_del_group_dev() thread.

There is also a callback mechanism called "request" to ask a
driver (and perhaps user) to release the device, but this is not
implemented for mdev. So this adds one to that point, and then
wire it to vfio-ccw to pass it along to userspace. This will
gracefully drive the unplug code, and everything behaves nicely.

Despite the testing that was occurring, this doesn't appear related
to the vfio-ccw channel path handling code. I can reproduce this with
an older kernel/QEMU, which makes sense because the above behavior is
driven from the subchannel event codepaths and not the chpid events.
Because of that, I didn't flag anything with a Fixes tag, since it's
seemingly been this way forever.

RFC->V2:
 - Patch 1
   - Added a message when registering a device without a request callback
   - Changed the "if(!callback) return" to "if(callback) do" layout
   - Removed "unlikely" from "if(callback)" logic
   - Clarified some wording in the device ops struct commentary
 - Patch 2
   - Added Conny's r-b

Eric Farman (2):
  vfio-mdev: Wire in a request handler for mdev parent
  vfio-ccw: Wire in the request callback

 drivers/s390/cio/vfio_ccw_ops.c     | 26 ++++++++++++++++++++++++++
 drivers/s390/cio/vfio_ccw_private.h |  4 ++++
 drivers/vfio/mdev/mdev_core.c       |  4 ++++
 drivers/vfio/mdev/vfio_mdev.c       | 10 ++++++++++
 include/linux/mdev.h                |  4 ++++
 include/uapi/linux/vfio.h           |  1 +
 6 files changed, 49 insertions(+)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent
  2020-11-20 18:07 [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Eric Farman
@ 2020-11-20 18:07 ` Eric Farman
  2020-11-24  9:57   ` Cornelia Huck
  2020-12-02 20:28   ` Alex Williamson
  2020-11-20 18:07 ` [PATCH v2 2/2] vfio-ccw: Wire in the request callback Eric Farman
  2020-11-24 10:00 ` [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Cornelia Huck
  2 siblings, 2 replies; 7+ messages in thread
From: Eric Farman @ 2020-11-20 18:07 UTC (permalink / raw)
  To: Cornelia Huck, Kirti Wankhede, Alex Williamson
  Cc: Halil Pasic, Matthew Rosato, linux-s390, kvm, Eric Farman

While performing some destructive tests with vfio-ccw, where the
paths to a device are forcible removed and thus the device itself
is unreachable, it is rather easy to end up in an endless loop in
vfio_del_group_dev() due to the lack of a request callback for the
associated device.

In this example, one MDEV (77c) is used by a guest, while another
(77b) is not. The symptom is that the iommu is detached from the
mdev for 77b, but not 77c, until that guest is shutdown:

    [  238.794867] vfio_ccw 0.0.077b: MDEV: Unregistering
    [  238.794996] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: Removing from iommu group 2
    [  238.795001] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: MDEV: detaching iommu
    [  238.795036] vfio_ccw 0.0.077c: MDEV: Unregistering
    ...silence...

Let's wire in the request call back to the mdev device, so that a
device being physically removed from the host can be (gracefully?)
handled by the parent device at the time the device is removed.

Add a message when registering the device if a driver doesn't
provide this callback, so a clue is given that this same loop
may be encountered in a similar situation.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 drivers/vfio/mdev/mdev_core.c |  4 ++++
 drivers/vfio/mdev/vfio_mdev.c | 10 ++++++++++
 include/linux/mdev.h          |  4 ++++
 3 files changed, 18 insertions(+)

diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index b558d4cfd082..6de97d25a3f8 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -154,6 +154,10 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
 	if (!dev)
 		return -EINVAL;
 
+	/* Not mandatory, but its absence could be a problem */
+	if (!ops->request)
+		dev_info(dev, "Driver cannot be asked to release device\n");
+
 	mutex_lock(&parent_list_lock);
 
 	/* Check for duplicate */
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index 30964a4e0a28..06d8fc4a6d72 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -98,6 +98,15 @@ static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
 	return parent->ops->mmap(mdev, vma);
 }
 
+static void vfio_mdev_request(void *device_data, unsigned int count)
+{
+	struct mdev_device *mdev = device_data;
+	struct mdev_parent *parent = mdev->parent;
+
+	if (parent->ops->request)
+		parent->ops->request(mdev, count);
+}
+
 static const struct vfio_device_ops vfio_mdev_dev_ops = {
 	.name		= "vfio-mdev",
 	.open		= vfio_mdev_open,
@@ -106,6 +115,7 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
 	.read		= vfio_mdev_read,
 	.write		= vfio_mdev_write,
 	.mmap		= vfio_mdev_mmap,
+	.request	= vfio_mdev_request,
 };
 
 static int vfio_mdev_probe(struct device *dev)
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 0ce30ca78db0..9004375c462e 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -72,6 +72,9 @@ struct device *mdev_get_iommu_device(struct device *dev);
  * @mmap:		mmap callback
  *			@mdev: mediated device structure
  *			@vma: vma structure
+ * @request:		request callback to release device
+ *			@mdev: mediated device structure
+ *			@count: request sequence number
  * Parent device that support mediated device should be registered with mdev
  * module with mdev_parent_ops structure.
  **/
@@ -92,6 +95,7 @@ struct mdev_parent_ops {
 	long	(*ioctl)(struct mdev_device *mdev, unsigned int cmd,
 			 unsigned long arg);
 	int	(*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
+	void	(*request)(struct mdev_device *mdev, unsigned int count);
 };
 
 /* interface for exporting mdev supported type attributes */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/2] vfio-ccw: Wire in the request callback
  2020-11-20 18:07 [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Eric Farman
  2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
@ 2020-11-20 18:07 ` Eric Farman
  2020-11-24 10:00 ` [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Cornelia Huck
  2 siblings, 0 replies; 7+ messages in thread
From: Eric Farman @ 2020-11-20 18:07 UTC (permalink / raw)
  To: Cornelia Huck, Kirti Wankhede, Alex Williamson
  Cc: Halil Pasic, Matthew Rosato, linux-s390, kvm, Eric Farman

The device is being unplugged, so pass the request to userspace to
ask for a graceful cleanup. This should free up the thread that
would otherwise loop waiting for the device to be fully released.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
---
 drivers/s390/cio/vfio_ccw_ops.c     | 26 ++++++++++++++++++++++++++
 drivers/s390/cio/vfio_ccw_private.h |  4 ++++
 include/uapi/linux/vfio.h           |  1 +
 3 files changed, 31 insertions(+)

diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 8b3ed5b45277..68106be4ba7a 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -394,6 +394,7 @@ static int vfio_ccw_mdev_get_irq_info(struct vfio_irq_info *info)
 	switch (info->index) {
 	case VFIO_CCW_IO_IRQ_INDEX:
 	case VFIO_CCW_CRW_IRQ_INDEX:
+	case VFIO_CCW_REQ_IRQ_INDEX:
 		info->count = 1;
 		info->flags = VFIO_IRQ_INFO_EVENTFD;
 		break;
@@ -424,6 +425,9 @@ static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
 	case VFIO_CCW_CRW_IRQ_INDEX:
 		ctx = &private->crw_trigger;
 		break;
+	case VFIO_CCW_REQ_IRQ_INDEX:
+		ctx = &private->req_trigger;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -607,6 +611,27 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 	}
 }
 
+/* Request removal of the device*/
+static void vfio_ccw_mdev_request(struct mdev_device *mdev, unsigned int count)
+{
+	struct vfio_ccw_private *private = dev_get_drvdata(mdev_parent_dev(mdev));
+
+	if (!private)
+		return;
+
+	if (private->req_trigger) {
+		if (!(count % 10))
+			dev_notice_ratelimited(mdev_dev(private->mdev),
+					       "Relaying device request to user (#%u)\n",
+					       count);
+
+		eventfd_signal(private->req_trigger, 1);
+	} else if (count == 0) {
+		dev_notice(mdev_dev(private->mdev),
+			   "No device request channel registered, blocked until released by user\n");
+	}
+}
+
 static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
 	.owner			= THIS_MODULE,
 	.supported_type_groups  = mdev_type_groups,
@@ -617,6 +642,7 @@ static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
 	.read			= vfio_ccw_mdev_read,
 	.write			= vfio_ccw_mdev_write,
 	.ioctl			= vfio_ccw_mdev_ioctl,
+	.request		= vfio_ccw_mdev_request,
 };
 
 int vfio_ccw_mdev_reg(struct subchannel *sch)
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 8723156b29ea..b2c762eb42b9 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -84,7 +84,10 @@ struct vfio_ccw_crw {
  * @irb: irb info received from interrupt
  * @scsw: scsw info
  * @io_trigger: eventfd ctx for signaling userspace I/O results
+ * @crw_trigger: eventfd ctx for signaling userspace CRW information
+ * @req_trigger: eventfd ctx for signaling userspace to return device
  * @io_work: work for deferral process of I/O handling
+ * @crw_work: work for deferral process of CRW handling
  */
 struct vfio_ccw_private {
 	struct subchannel	*sch;
@@ -108,6 +111,7 @@ struct vfio_ccw_private {
 
 	struct eventfd_ctx	*io_trigger;
 	struct eventfd_ctx	*crw_trigger;
+	struct eventfd_ctx	*req_trigger;
 	struct work_struct	io_work;
 	struct work_struct	crw_work;
 } __aligned(8);
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 2f313a238a8f..d1812777139f 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -820,6 +820,7 @@ enum {
 enum {
 	VFIO_CCW_IO_IRQ_INDEX,
 	VFIO_CCW_CRW_IRQ_INDEX,
+	VFIO_CCW_REQ_IRQ_INDEX,
 	VFIO_CCW_NUM_IRQS
 };
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent
  2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
@ 2020-11-24  9:57   ` Cornelia Huck
  2020-12-02 20:28   ` Alex Williamson
  1 sibling, 0 replies; 7+ messages in thread
From: Cornelia Huck @ 2020-11-24  9:57 UTC (permalink / raw)
  To: Eric Farman
  Cc: Kirti Wankhede, Alex Williamson, Halil Pasic, Matthew Rosato,
	linux-s390, kvm

On Fri, 20 Nov 2020 19:07:39 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> While performing some destructive tests with vfio-ccw, where the
> paths to a device are forcible removed and thus the device itself
> is unreachable, it is rather easy to end up in an endless loop in
> vfio_del_group_dev() due to the lack of a request callback for the
> associated device.
> 
> In this example, one MDEV (77c) is used by a guest, while another
> (77b) is not. The symptom is that the iommu is detached from the
> mdev for 77b, but not 77c, until that guest is shutdown:
> 
>     [  238.794867] vfio_ccw 0.0.077b: MDEV: Unregistering
>     [  238.794996] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: Removing from iommu group 2
>     [  238.795001] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: MDEV: detaching iommu
>     [  238.795036] vfio_ccw 0.0.077c: MDEV: Unregistering
>     ...silence...
> 
> Let's wire in the request call back to the mdev device, so that a
> device being physically removed from the host can be (gracefully?)
> handled by the parent device at the time the device is removed.
> 
> Add a message when registering the device if a driver doesn't
> provide this callback, so a clue is given that this same loop
> may be encountered in a similar situation.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/vfio/mdev/mdev_core.c |  4 ++++
>  drivers/vfio/mdev/vfio_mdev.c | 10 ++++++++++
>  include/linux/mdev.h          |  4 ++++
>  3 files changed, 18 insertions(+)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw
  2020-11-20 18:07 [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Eric Farman
  2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
  2020-11-20 18:07 ` [PATCH v2 2/2] vfio-ccw: Wire in the request callback Eric Farman
@ 2020-11-24 10:00 ` Cornelia Huck
  2 siblings, 0 replies; 7+ messages in thread
From: Cornelia Huck @ 2020-11-24 10:00 UTC (permalink / raw)
  To: Eric Farman
  Cc: Kirti Wankhede, Alex Williamson, Halil Pasic, Matthew Rosato,
	linux-s390, kvm

On Fri, 20 Nov 2020 19:07:38 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> There is a situation where removing all the paths from a device
> connected via mdev and vfio-ccw can cause some difficulty.
> Using the "chchp -c 0 xx" command to all paths will cause the
> device to be removed from the configuration, and any guest
> filesystem that is relying on that device will encounter errors.
> Interestingly, the last chchp command will actually fail to
> return to a shell prompt, and subsequent commands (another
> chchp to bring the paths back online, chzdev, etc.) will also
> hang because of the outstanding chchp.
> 
> The last chchp command drives to vfio_ccw_sch_remove() for every
> affected mediated device, and ultimately enters an infinite loop
> in vfio_del_group_dev(). This loop is broken when the guest goes
> away, which in this case doesn't occur until the guest is shutdown.
> This drives vfio_ccw_mdev_release() and thus vfio_device_release()
> to wake up the vfio_del_group_dev() thread.
> 
> There is also a callback mechanism called "request" to ask a
> driver (and perhaps user) to release the device, but this is not
> implemented for mdev. So this adds one to that point, and then
> wire it to vfio-ccw to pass it along to userspace. This will
> gracefully drive the unplug code, and everything behaves nicely.
> 
> Despite the testing that was occurring, this doesn't appear related
> to the vfio-ccw channel path handling code. I can reproduce this with
> an older kernel/QEMU, which makes sense because the above behavior is
> driven from the subchannel event codepaths and not the chpid events.
> Because of that, I didn't flag anything with a Fixes tag, since it's
> seemingly been this way forever.

Both patches look good to me.

Which would be the best way to merge this? Via vfio or via vfio-ccw?

> 
> RFC->V2:
>  - Patch 1
>    - Added a message when registering a device without a request callback
>    - Changed the "if(!callback) return" to "if(callback) do" layout
>    - Removed "unlikely" from "if(callback)" logic
>    - Clarified some wording in the device ops struct commentary
>  - Patch 2
>    - Added Conny's r-b
> 
> Eric Farman (2):
>   vfio-mdev: Wire in a request handler for mdev parent
>   vfio-ccw: Wire in the request callback
> 
>  drivers/s390/cio/vfio_ccw_ops.c     | 26 ++++++++++++++++++++++++++
>  drivers/s390/cio/vfio_ccw_private.h |  4 ++++
>  drivers/vfio/mdev/mdev_core.c       |  4 ++++
>  drivers/vfio/mdev/vfio_mdev.c       | 10 ++++++++++
>  include/linux/mdev.h                |  4 ++++
>  include/uapi/linux/vfio.h           |  1 +
>  6 files changed, 49 insertions(+)
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent
  2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
  2020-11-24  9:57   ` Cornelia Huck
@ 2020-12-02 20:28   ` Alex Williamson
  2020-12-03  3:02     ` Eric Farman
  1 sibling, 1 reply; 7+ messages in thread
From: Alex Williamson @ 2020-12-02 20:28 UTC (permalink / raw)
  To: Eric Farman
  Cc: Cornelia Huck, Kirti Wankhede, Halil Pasic, Matthew Rosato,
	linux-s390, kvm

On Fri, 20 Nov 2020 19:07:39 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> While performing some destructive tests with vfio-ccw, where the
> paths to a device are forcible removed and thus the device itself
> is unreachable, it is rather easy to end up in an endless loop in
> vfio_del_group_dev() due to the lack of a request callback for the
> associated device.
> 
> In this example, one MDEV (77c) is used by a guest, while another
> (77b) is not. The symptom is that the iommu is detached from the
> mdev for 77b, but not 77c, until that guest is shutdown:
> 
>     [  238.794867] vfio_ccw 0.0.077b: MDEV: Unregistering
>     [  238.794996] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: Removing from iommu group 2
>     [  238.795001] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: MDEV: detaching iommu
>     [  238.795036] vfio_ccw 0.0.077c: MDEV: Unregistering
>     ...silence...
> 
> Let's wire in the request call back to the mdev device, so that a
> device being physically removed from the host can be (gracefully?)
> handled by the parent device at the time the device is removed.
> 
> Add a message when registering the device if a driver doesn't
> provide this callback, so a clue is given that this same loop
> may be encountered in a similar situation.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  drivers/vfio/mdev/mdev_core.c |  4 ++++
>  drivers/vfio/mdev/vfio_mdev.c | 10 ++++++++++
>  include/linux/mdev.h          |  4 ++++
>  3 files changed, 18 insertions(+)
> 
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> index b558d4cfd082..6de97d25a3f8 100644
> --- a/drivers/vfio/mdev/mdev_core.c
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -154,6 +154,10 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
>  	if (!dev)
>  		return -EINVAL;
>  
> +	/* Not mandatory, but its absence could be a problem */
> +	if (!ops->request)
> +		dev_info(dev, "Driver cannot be asked to release device\n");
> +
>  	mutex_lock(&parent_list_lock);
>  
>  	/* Check for duplicate */
> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
> index 30964a4e0a28..06d8fc4a6d72 100644
> --- a/drivers/vfio/mdev/vfio_mdev.c
> +++ b/drivers/vfio/mdev/vfio_mdev.c
> @@ -98,6 +98,15 @@ static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
>  	return parent->ops->mmap(mdev, vma);
>  }
>  
> +static void vfio_mdev_request(void *device_data, unsigned int count)
> +{
> +	struct mdev_device *mdev = device_data;
> +	struct mdev_parent *parent = mdev->parent;
> +
> +	if (parent->ops->request)
> +		parent->ops->request(mdev, count);

What do you think about duplicating the count==0 notice in the else
case here?  ie.

	else if (count == 0)
		dev_notice(mdev_dev(mdev), "No mdev vendor driver	request callback support, blocked until released by user\n");

This at least puts something in the log a bit closer to the timeframe
of a possible issue versus the registration nag.  vfio-core could do
this too, but vfio-mdev registers a request callback on behalf of all
mdev devices, so vfio-core would no longer have visibility for this
case.

Otherwise this series looks fine to me and I can take it through the
vfio tree.  Thanks,

Alex

> +}
> +
>  static const struct vfio_device_ops vfio_mdev_dev_ops = {
>  	.name		= "vfio-mdev",
>  	.open		= vfio_mdev_open,
> @@ -106,6 +115,7 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
>  	.read		= vfio_mdev_read,
>  	.write		= vfio_mdev_write,
>  	.mmap		= vfio_mdev_mmap,
> +	.request	= vfio_mdev_request,
>  };
>  
>  static int vfio_mdev_probe(struct device *dev)
> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
> index 0ce30ca78db0..9004375c462e 100644
> --- a/include/linux/mdev.h
> +++ b/include/linux/mdev.h
> @@ -72,6 +72,9 @@ struct device *mdev_get_iommu_device(struct device *dev);
>   * @mmap:		mmap callback
>   *			@mdev: mediated device structure
>   *			@vma: vma structure
> + * @request:		request callback to release device
> + *			@mdev: mediated device structure
> + *			@count: request sequence number
>   * Parent device that support mediated device should be registered with mdev
>   * module with mdev_parent_ops structure.
>   **/
> @@ -92,6 +95,7 @@ struct mdev_parent_ops {
>  	long	(*ioctl)(struct mdev_device *mdev, unsigned int cmd,
>  			 unsigned long arg);
>  	int	(*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
> +	void	(*request)(struct mdev_device *mdev, unsigned int count);
>  };
>  
>  /* interface for exporting mdev supported type attributes */


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent
  2020-12-02 20:28   ` Alex Williamson
@ 2020-12-03  3:02     ` Eric Farman
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Farman @ 2020-12-03  3:02 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Cornelia Huck, Kirti Wankhede, Halil Pasic, Matthew Rosato,
	linux-s390, kvm



On 12/2/20 3:28 PM, Alex Williamson wrote:
> On Fri, 20 Nov 2020 19:07:39 +0100
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> While performing some destructive tests with vfio-ccw, where the
>> paths to a device are forcible removed and thus the device itself
>> is unreachable, it is rather easy to end up in an endless loop in
>> vfio_del_group_dev() due to the lack of a request callback for the
>> associated device.
>>
>> In this example, one MDEV (77c) is used by a guest, while another
>> (77b) is not. The symptom is that the iommu is detached from the
>> mdev for 77b, but not 77c, until that guest is shutdown:
>>
>>      [  238.794867] vfio_ccw 0.0.077b: MDEV: Unregistering
>>      [  238.794996] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: Removing from iommu group 2
>>      [  238.795001] vfio_mdev 11f2d2bc-4083-431d-a023-eff72715c4f0: MDEV: detaching iommu
>>      [  238.795036] vfio_ccw 0.0.077c: MDEV: Unregistering
>>      ...silence...
>>
>> Let's wire in the request call back to the mdev device, so that a
>> device being physically removed from the host can be (gracefully?)
>> handled by the parent device at the time the device is removed.
>>
>> Add a message when registering the device if a driver doesn't
>> provide this callback, so a clue is given that this same loop
>> may be encountered in a similar situation.
>>
>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>> ---
>>   drivers/vfio/mdev/mdev_core.c |  4 ++++
>>   drivers/vfio/mdev/vfio_mdev.c | 10 ++++++++++
>>   include/linux/mdev.h          |  4 ++++
>>   3 files changed, 18 insertions(+)
>>
>> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
>> index b558d4cfd082..6de97d25a3f8 100644
>> --- a/drivers/vfio/mdev/mdev_core.c
>> +++ b/drivers/vfio/mdev/mdev_core.c
>> @@ -154,6 +154,10 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
>>   	if (!dev)
>>   		return -EINVAL;
>>   
>> +	/* Not mandatory, but its absence could be a problem */
>> +	if (!ops->request)
>> +		dev_info(dev, "Driver cannot be asked to release device\n");
>> +
>>   	mutex_lock(&parent_list_lock);
>>   
>>   	/* Check for duplicate */
>> diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
>> index 30964a4e0a28..06d8fc4a6d72 100644
>> --- a/drivers/vfio/mdev/vfio_mdev.c
>> +++ b/drivers/vfio/mdev/vfio_mdev.c
>> @@ -98,6 +98,15 @@ static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
>>   	return parent->ops->mmap(mdev, vma);
>>   }
>>   
>> +static void vfio_mdev_request(void *device_data, unsigned int count)
>> +{
>> +	struct mdev_device *mdev = device_data;
>> +	struct mdev_parent *parent = mdev->parent;
>> +
>> +	if (parent->ops->request)
>> +		parent->ops->request(mdev, count);
> 
> What do you think about duplicating the count==0 notice in the else
> case here?  ie.
> 
> 	else if (count == 0)
> 		dev_notice(mdev_dev(mdev), "No mdev vendor driver	request callback support, blocked until released by user\n");
> 

I'm fine with that. If there are no objections, I should be able to spin 
a v3 with such a change tomorrow.

Thank you!

Eric

> This at least puts something in the log a bit closer to the timeframe
> of a possible issue versus the registration nag.  vfio-core could do
> this too, but vfio-mdev registers a request callback on behalf of all
> mdev devices, so vfio-core would no longer have visibility for this
> case.
> 
> Otherwise this series looks fine to me and I can take it through the
> vfio tree.  Thanks,
> 
> Alex
> 
>> +}
>> +
>>   static const struct vfio_device_ops vfio_mdev_dev_ops = {
>>   	.name		= "vfio-mdev",
>>   	.open		= vfio_mdev_open,
>> @@ -106,6 +115,7 @@ static const struct vfio_device_ops vfio_mdev_dev_ops = {
>>   	.read		= vfio_mdev_read,
>>   	.write		= vfio_mdev_write,
>>   	.mmap		= vfio_mdev_mmap,
>> +	.request	= vfio_mdev_request,
>>   };
>>   
>>   static int vfio_mdev_probe(struct device *dev)
>> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
>> index 0ce30ca78db0..9004375c462e 100644
>> --- a/include/linux/mdev.h
>> +++ b/include/linux/mdev.h
>> @@ -72,6 +72,9 @@ struct device *mdev_get_iommu_device(struct device *dev);
>>    * @mmap:		mmap callback
>>    *			@mdev: mediated device structure
>>    *			@vma: vma structure
>> + * @request:		request callback to release device
>> + *			@mdev: mediated device structure
>> + *			@count: request sequence number
>>    * Parent device that support mediated device should be registered with mdev
>>    * module with mdev_parent_ops structure.
>>    **/
>> @@ -92,6 +95,7 @@ struct mdev_parent_ops {
>>   	long	(*ioctl)(struct mdev_device *mdev, unsigned int cmd,
>>   			 unsigned long arg);
>>   	int	(*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
>> +	void	(*request)(struct mdev_device *mdev, unsigned int count);
>>   };
>>   
>>   /* interface for exporting mdev supported type attributes */
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-12-03  3:04 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-20 18:07 [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Eric Farman
2020-11-20 18:07 ` [PATCH v2 1/2] vfio-mdev: Wire in a request handler for mdev parent Eric Farman
2020-11-24  9:57   ` Cornelia Huck
2020-12-02 20:28   ` Alex Williamson
2020-12-03  3:02     ` Eric Farman
2020-11-20 18:07 ` [PATCH v2 2/2] vfio-ccw: Wire in the request callback Eric Farman
2020-11-24 10:00 ` [PATCH v2 0/2] Connect request callback to mdev and vfio-ccw Cornelia Huck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).