linux-s390.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/9] Move vfio_ccw to the new mdev API
@ 2021-09-09 19:38 Jason Gunthorpe
  2021-09-09 19:38 ` [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private Jason Gunthorpe
                   ` (9 more replies)
  0 siblings, 10 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

This addresses Cornelia's remark on the earlier patch that ccw has a
confusing lifecycle. While it doesn't seem like the original attempt was
functionally wrong, the result can be made better with a lot of further
work.

Reorganize the driver so that the mdev owns the private memory and
controls the lifecycle, not the css_driver. The memory associated with the
css_driver lifecycle is only the mdev_parent/mdev_type registration.

Along the way we change when the sch is quiescent or not to be linked to
the open/close_device lifetime of the vfio_device, which is sort of what
it was tring to do already, just not completely.

The troublesome racey lifecycle of the css_driver callbacks is made clear
with simple vfio_device refcounting so a callback is only delivered into a
registered vfio_device and has obvious correctness.

Move the only per-css_driver state, the "available instance" counter, into
the core code and share that logic with many of the other drivers. The
value is kept in the mdev_type memory.

v2:
 - Clean up the lifecycle in ccw with 7 new patches
 - Rebase
v1: https://lore.kernel.org/all/7-v2-7667f42c9bad+935-vfio3_jgg@nvidia.com

Jason Gunthorpe (9):
  vfio/ccw: Use functions for alloc/free of the vfio_ccw_private
  vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions
  vfio/ccw: Convert to use vfio_register_group_dev()
  vfio/ccw: Make the FSM complete and synchronize it to the mdev
  vfio/mdev: Consolidate all the device_api sysfs into the core code
  vfio/mdev: Add mdev available instance checking to the core
  vfio/ccw: Remove private->mdev
  vfio: Export vfio_device_try_get()
  vfio/ccw: Move the lifecycle of the struct vfio_ccw_private to the
    mdev

 drivers/gpu/drm/i915/gvt/kvmgt.c      |   9 +-
 drivers/s390/cio/vfio_ccw_drv.c       | 282 +++++++++++---------------
 drivers/s390/cio/vfio_ccw_fsm.c       | 152 ++++++++++----
 drivers/s390/cio/vfio_ccw_ops.c       | 240 ++++++++++------------
 drivers/s390/cio/vfio_ccw_private.h   |  42 +++-
 drivers/s390/crypto/vfio_ap_ops.c     |  41 +---
 drivers/s390/crypto/vfio_ap_private.h |   2 -
 drivers/vfio/mdev/mdev_core.c         |  13 +-
 drivers/vfio/mdev/mdev_private.h      |   2 +
 drivers/vfio/mdev/mdev_sysfs.c        |  64 +++++-
 drivers/vfio/vfio.c                   |   3 +-
 include/linux/mdev.h                  |  13 +-
 include/linux/vfio.h                  |   1 +
 samples/vfio-mdev/mbochs.c            |   9 +-
 samples/vfio-mdev/mdpy.c              |  31 +--
 samples/vfio-mdev/mtty.c              |  10 +-
 16 files changed, 470 insertions(+), 444 deletions(-)

-- 
2.33.0


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
@ 2021-09-09 19:38 ` Jason Gunthorpe
  2021-09-10 11:27   ` Christoph Hellwig
  2021-09-24  2:53   ` Eric Farman
  2021-09-09 19:38 ` [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions Jason Gunthorpe
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

Makes the code easier to understand what is memory lifecycle and what is
other stuff.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/s390/cio/vfio_ccw_drv.c | 137 ++++++++++++++++++--------------
 1 file changed, 78 insertions(+), 59 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 9b61e9b131ade0..1e8d3151e5480e 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -137,16 +137,80 @@ static void vfio_ccw_sch_irq(struct subchannel *sch)
 	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
 }
 
-static void vfio_ccw_free_regions(struct vfio_ccw_private *private)
+static struct vfio_ccw_private *vfio_ccw_alloc_private(struct subchannel *sch)
 {
-	if (private->crw_region)
-		kmem_cache_free(vfio_ccw_crw_region, private->crw_region);
-	if (private->schib_region)
-		kmem_cache_free(vfio_ccw_schib_region, private->schib_region);
-	if (private->cmd_region)
-		kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
-	if (private->io_region)
-		kmem_cache_free(vfio_ccw_io_region, private->io_region);
+	struct vfio_ccw_private *private;
+
+	private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
+	if (!private)
+		return ERR_PTR(-ENOMEM);
+
+	private->sch = sch;
+	mutex_init(&private->io_mutex);
+	private->state = VFIO_CCW_STATE_NOT_OPER;
+	INIT_LIST_HEAD(&private->crw);
+	INIT_WORK(&private->io_work, vfio_ccw_sch_io_todo);
+	INIT_WORK(&private->crw_work, vfio_ccw_crw_todo);
+	atomic_set(&private->avail, 1);
+
+	private->cp.guest_cp = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1),
+				       GFP_KERNEL);
+	if (!private->cp.guest_cp)
+		goto out_free_private;
+
+	private->io_region = kmem_cache_zalloc(vfio_ccw_io_region,
+					       GFP_KERNEL | GFP_DMA);
+	if (!private->io_region)
+		goto out_free_cp;
+
+	private->cmd_region = kmem_cache_zalloc(vfio_ccw_cmd_region,
+						GFP_KERNEL | GFP_DMA);
+	if (!private->cmd_region)
+		goto out_free_io;
+
+	private->schib_region = kmem_cache_zalloc(vfio_ccw_schib_region,
+						  GFP_KERNEL | GFP_DMA);
+
+	if (!private->schib_region)
+		goto out_free_cmd;
+
+	private->crw_region = kmem_cache_zalloc(vfio_ccw_crw_region,
+						GFP_KERNEL | GFP_DMA);
+
+	if (!private->crw_region)
+		goto out_free_schib;
+	return private;
+
+out_free_schib:
+	kmem_cache_free(vfio_ccw_schib_region, private->schib_region);
+out_free_cmd:
+	kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
+out_free_io:
+	kmem_cache_free(vfio_ccw_io_region, private->io_region);
+out_free_cp:
+	kfree(private->cp.guest_cp);
+out_free_private:
+	mutex_destroy(&private->io_mutex);
+	kfree(private);
+	return ERR_PTR(-ENOMEM);
+}
+
+static void vfio_ccw_free_private(struct vfio_ccw_private *private)
+{
+	struct vfio_ccw_crw *crw, *temp;
+
+	list_for_each_entry_safe(crw, temp, &private->crw, next) {
+		list_del(&crw->next);
+		kfree(crw);
+	}
+
+	kmem_cache_free(vfio_ccw_crw_region, private->crw_region);
+	kmem_cache_free(vfio_ccw_schib_region, private->schib_region);
+	kmem_cache_free(vfio_ccw_cmd_region, private->cmd_region);
+	kmem_cache_free(vfio_ccw_io_region, private->io_region);
+	kfree(private->cp.guest_cp);
+	mutex_destroy(&private->io_mutex);
+	kfree(private);
 }
 
 static int vfio_ccw_sch_probe(struct subchannel *sch)
@@ -161,53 +225,19 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 		return -ENODEV;
 	}
 
-	private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
-	if (!private)
-		return -ENOMEM;
+	private = vfio_ccw_alloc_private(sch);
+	if (IS_ERR(private))
+		return PTR_ERR(private);
 
-	private->cp.guest_cp = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1),
-				       GFP_KERNEL);
-	if (!private->cp.guest_cp)
-		goto out_free;
-
-	private->io_region = kmem_cache_zalloc(vfio_ccw_io_region,
-					       GFP_KERNEL | GFP_DMA);
-	if (!private->io_region)
-		goto out_free;
-
-	private->cmd_region = kmem_cache_zalloc(vfio_ccw_cmd_region,
-						GFP_KERNEL | GFP_DMA);
-	if (!private->cmd_region)
-		goto out_free;
-
-	private->schib_region = kmem_cache_zalloc(vfio_ccw_schib_region,
-						  GFP_KERNEL | GFP_DMA);
-
-	if (!private->schib_region)
-		goto out_free;
-
-	private->crw_region = kmem_cache_zalloc(vfio_ccw_crw_region,
-						GFP_KERNEL | GFP_DMA);
-
-	if (!private->crw_region)
-		goto out_free;
-
-	private->sch = sch;
 	dev_set_drvdata(&sch->dev, private);
-	mutex_init(&private->io_mutex);
 
 	spin_lock_irq(sch->lock);
-	private->state = VFIO_CCW_STATE_NOT_OPER;
 	sch->isc = VFIO_CCW_ISC;
 	ret = cio_enable_subchannel(sch, (u32)(unsigned long)sch);
 	spin_unlock_irq(sch->lock);
 	if (ret)
 		goto out_free;
 
-	INIT_LIST_HEAD(&private->crw);
-	INIT_WORK(&private->io_work, vfio_ccw_sch_io_todo);
-	INIT_WORK(&private->crw_work, vfio_ccw_crw_todo);
-	atomic_set(&private->avail, 1);
 	private->state = VFIO_CCW_STATE_STANDBY;
 
 	ret = vfio_ccw_mdev_reg(sch);
@@ -228,31 +258,20 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 	cio_disable_subchannel(sch);
 out_free:
 	dev_set_drvdata(&sch->dev, NULL);
-	vfio_ccw_free_regions(private);
-	kfree(private->cp.guest_cp);
-	kfree(private);
+	vfio_ccw_free_private(private);
 	return ret;
 }
 
 static int vfio_ccw_sch_remove(struct subchannel *sch)
 {
 	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
-	struct vfio_ccw_crw *crw, *temp;
 
 	vfio_ccw_sch_quiesce(sch);
-
-	list_for_each_entry_safe(crw, temp, &private->crw, next) {
-		list_del(&crw->next);
-		kfree(crw);
-	}
-
 	vfio_ccw_mdev_unreg(sch);
 
 	dev_set_drvdata(&sch->dev, NULL);
 
-	vfio_ccw_free_regions(private);
-	kfree(private->cp.guest_cp);
-	kfree(private);
+	vfio_ccw_free_private(private);
 
 	VFIO_CCW_MSG_EVENT(4, "unbound from subchannel %x.%x.%04x\n",
 			   sch->schid.cssid, sch->schid.ssid,
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
  2021-09-09 19:38 ` [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private Jason Gunthorpe
@ 2021-09-09 19:38 ` Jason Gunthorpe
  2021-09-10 11:32   ` Christoph Hellwig
                     ` (2 more replies)
  2021-09-09 19:38 ` [PATCH v2 3/9] vfio/ccw: Convert to use vfio_register_group_dev() Jason Gunthorpe
                   ` (7 subsequent siblings)
  9 siblings, 3 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

mdev_device should only be used in functions assigned to ops callbacks,
interior functions should use the struct vfio_ccw_private instead of
repeatedly trying to get it from the mdev.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/s390/cio/vfio_ccw_ops.c | 37 +++++++++++++--------------------
 1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 7f540ad0b568bc..1edbea9de0ec42 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -17,13 +17,11 @@
 
 #include "vfio_ccw_private.h"
 
-static int vfio_ccw_mdev_reset(struct mdev_device *mdev)
+static int vfio_ccw_mdev_reset(struct vfio_ccw_private *private)
 {
-	struct vfio_ccw_private *private;
 	struct subchannel *sch;
 	int ret;
 
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	sch = private->sch;
 	/*
 	 * TODO:
@@ -61,7 +59,7 @@ static int vfio_ccw_mdev_notifier(struct notifier_block *nb,
 		if (!cp_iova_pinned(&private->cp, unmap->iova))
 			return NOTIFY_OK;
 
-		if (vfio_ccw_mdev_reset(private->mdev))
+		if (vfio_ccw_mdev_reset(private))
 			return NOTIFY_BAD;
 
 		cp_free(&private->cp);
@@ -201,7 +199,7 @@ static void vfio_ccw_mdev_close_device(struct mdev_device *mdev)
 
 	if ((private->state != VFIO_CCW_STATE_NOT_OPER) &&
 	    (private->state != VFIO_CCW_STATE_STANDBY)) {
-		if (!vfio_ccw_mdev_reset(mdev))
+		if (!vfio_ccw_mdev_reset(private))
 			private->state = VFIO_CCW_STATE_STANDBY;
 		/* The state will be NOT_OPER on error. */
 	}
@@ -311,12 +309,9 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
 	return -EINVAL;
 }
 
-static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info,
-					 struct mdev_device *mdev)
+static int vfio_ccw_mdev_get_device_info(struct vfio_ccw_private *private,
+					 struct vfio_device_info *info)
 {
-	struct vfio_ccw_private *private;
-
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	info->flags = VFIO_DEVICE_FLAGS_CCW | VFIO_DEVICE_FLAGS_RESET;
 	info->num_regions = VFIO_CCW_NUM_REGIONS + private->num_regions;
 	info->num_irqs = VFIO_CCW_NUM_IRQS;
@@ -324,14 +319,12 @@ static int vfio_ccw_mdev_get_device_info(struct vfio_device_info *info,
 	return 0;
 }
 
-static int vfio_ccw_mdev_get_region_info(struct vfio_region_info *info,
-					 struct mdev_device *mdev,
+static int vfio_ccw_mdev_get_region_info(struct vfio_ccw_private *private,
+					 struct vfio_region_info *info,
 					 unsigned long arg)
 {
-	struct vfio_ccw_private *private;
 	int i;
 
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 	switch (info->index) {
 	case VFIO_CCW_CONFIG_REGION_INDEX:
 		info->offset = 0;
@@ -406,19 +399,16 @@ static int vfio_ccw_mdev_get_irq_info(struct vfio_irq_info *info)
 	return 0;
 }
 
-static int vfio_ccw_mdev_set_irqs(struct mdev_device *mdev,
+static int vfio_ccw_mdev_set_irqs(struct vfio_ccw_private *private,
 				  uint32_t flags,
 				  uint32_t index,
 				  void __user *data)
 {
-	struct vfio_ccw_private *private;
 	struct eventfd_ctx **ctx;
 
 	if (!(flags & VFIO_IRQ_SET_ACTION_TRIGGER))
 		return -EINVAL;
 
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
-
 	switch (index) {
 	case VFIO_CCW_IO_IRQ_INDEX:
 		ctx = &private->io_trigger;
@@ -524,6 +514,8 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 				   unsigned int cmd,
 				   unsigned long arg)
 {
+	struct vfio_ccw_private *private =
+		dev_get_drvdata(mdev_parent_dev(mdev));
 	int ret = 0;
 	unsigned long minsz;
 
@@ -540,7 +532,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = vfio_ccw_mdev_get_device_info(&info, mdev);
+		ret = vfio_ccw_mdev_get_device_info(private, &info);
 		if (ret)
 			return ret;
 
@@ -558,7 +550,7 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 		if (info.argsz < minsz)
 			return -EINVAL;
 
-		ret = vfio_ccw_mdev_get_region_info(&info, mdev, arg);
+		ret = vfio_ccw_mdev_get_region_info(private, &info, arg);
 		if (ret)
 			return ret;
 
@@ -603,10 +595,11 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 			return ret;
 
 		data = (void __user *)(arg + minsz);
-		return vfio_ccw_mdev_set_irqs(mdev, hdr.flags, hdr.index, data);
+		return vfio_ccw_mdev_set_irqs(private, hdr.flags, hdr.index,
+					      data);
 	}
 	case VFIO_DEVICE_RESET:
-		return vfio_ccw_mdev_reset(mdev);
+		return vfio_ccw_mdev_reset(private);
 	default:
 		return -ENOTTY;
 	}
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 3/9] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
  2021-09-09 19:38 ` [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private Jason Gunthorpe
  2021-09-09 19:38 ` [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions Jason Gunthorpe
@ 2021-09-09 19:38 ` Jason Gunthorpe
  2021-09-24 20:37   ` Eric Farman
  2021-09-09 19:38 ` [PATCH v2 4/9] vfio/ccw: Make the FSM complete and synchronize it to the mdev Jason Gunthorpe
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

This is a more complicated conversion because vfio_ccw is sharing the
vfio_device between both the mdev_device, its vfio_device and the
css_driver.

The mdev is a singleton, and the reason for this sharing is so the extra
css_driver function callbacks to be delivered to the vfio_device
implementation.

This keeps things as they are, with the css_driver allocating the
singleton, not the mdev_driver. Following patches work to clean this
further.

Embed the vfio_device in the vfio_ccw_private and instantiate it as a
vfio_device when the mdev probes. The drvdata of both the css_device and
the mdev_device point at the private, and container_of is used to get it
back from the vfio_device.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/s390/cio/vfio_ccw_drv.c     |  21 ++++--
 drivers/s390/cio/vfio_ccw_ops.c     | 107 +++++++++++++++++-----------
 drivers/s390/cio/vfio_ccw_private.h |   5 ++
 3 files changed, 85 insertions(+), 48 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 1e8d3151e5480e..396e815f81f8a4 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -469,7 +469,7 @@ static int __init vfio_ccw_sch_init(void)
 	vfio_ccw_work_q = create_singlethread_workqueue("vfio-ccw");
 	if (!vfio_ccw_work_q) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
 	vfio_ccw_io_region = kmem_cache_create_usercopy("vfio_ccw_io_region",
@@ -478,7 +478,7 @@ static int __init vfio_ccw_sch_init(void)
 					sizeof(struct ccw_io_region), NULL);
 	if (!vfio_ccw_io_region) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
 	vfio_ccw_cmd_region = kmem_cache_create_usercopy("vfio_ccw_cmd_region",
@@ -487,7 +487,7 @@ static int __init vfio_ccw_sch_init(void)
 					sizeof(struct ccw_cmd_region), NULL);
 	if (!vfio_ccw_cmd_region) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
 	vfio_ccw_schib_region = kmem_cache_create_usercopy("vfio_ccw_schib_region",
@@ -497,7 +497,7 @@ static int __init vfio_ccw_sch_init(void)
 
 	if (!vfio_ccw_schib_region) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
 	vfio_ccw_crw_region = kmem_cache_create_usercopy("vfio_ccw_crw_region",
@@ -507,19 +507,25 @@ static int __init vfio_ccw_sch_init(void)
 
 	if (!vfio_ccw_crw_region) {
 		ret = -ENOMEM;
-		goto out_err;
+		goto out_regions;
 	}
 
+	ret = mdev_register_driver(&vfio_ccw_mdev_driver);
+	if (ret)
+		goto out_regions;
+
 	isc_register(VFIO_CCW_ISC);
 	ret = css_driver_register(&vfio_ccw_sch_driver);
 	if (ret) {
 		isc_unregister(VFIO_CCW_ISC);
-		goto out_err;
+		goto out_driver;
 	}
 
 	return ret;
 
-out_err:
+out_driver:
+	mdev_unregister_driver(&vfio_ccw_mdev_driver);
+out_regions:
 	vfio_ccw_destroy_regions();
 	destroy_workqueue(vfio_ccw_work_q);
 	vfio_ccw_debug_exit();
@@ -528,6 +534,7 @@ static int __init vfio_ccw_sch_init(void)
 
 static void __exit vfio_ccw_sch_exit(void)
 {
+	mdev_unregister_driver(&vfio_ccw_mdev_driver);
 	css_driver_unregister(&vfio_ccw_sch_driver);
 	isc_unregister(VFIO_CCW_ISC);
 	vfio_ccw_destroy_regions();
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 1edbea9de0ec42..3a66e4fb18244c 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -17,6 +17,8 @@
 
 #include "vfio_ccw_private.h"
 
+static const struct vfio_device_ops vfio_ccw_dev_ops;
+
 static int vfio_ccw_mdev_reset(struct vfio_ccw_private *private)
 {
 	struct subchannel *sch;
@@ -111,10 +113,10 @@ static struct attribute_group *mdev_type_groups[] = {
 	NULL,
 };
 
-static int vfio_ccw_mdev_create(struct mdev_device *mdev)
+static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 {
-	struct vfio_ccw_private *private =
-		dev_get_drvdata(mdev_parent_dev(mdev));
+	struct vfio_ccw_private *private = dev_get_drvdata(mdev->dev.parent);
+	int ret;
 
 	if (private->state == VFIO_CCW_STATE_NOT_OPER)
 		return -ENODEV;
@@ -122,6 +124,10 @@ static int vfio_ccw_mdev_create(struct mdev_device *mdev)
 	if (atomic_dec_if_positive(&private->avail) < 0)
 		return -EPERM;
 
+	memset(&private->vdev, 0, sizeof(private->vdev));
+	vfio_init_group_dev(&private->vdev, &mdev->dev,
+			    &vfio_ccw_dev_ops);
+
 	private->mdev = mdev;
 	private->state = VFIO_CCW_STATE_IDLE;
 
@@ -130,19 +136,31 @@ static int vfio_ccw_mdev_create(struct mdev_device *mdev)
 			   private->sch->schid.ssid,
 			   private->sch->schid.sch_no);
 
+	ret = vfio_register_group_dev(&private->vdev);
+	if (ret)
+		goto err_atomic;
+	dev_set_drvdata(&mdev->dev, private);
 	return 0;
+
+err_atomic:
+	vfio_uninit_group_dev(&private->vdev);
+	atomic_inc(&private->avail);
+	private->mdev = NULL;
+	private->state = VFIO_CCW_STATE_IDLE;
+	return ret;
 }
 
-static int vfio_ccw_mdev_remove(struct mdev_device *mdev)
+static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
 {
-	struct vfio_ccw_private *private =
-		dev_get_drvdata(mdev_parent_dev(mdev));
+	struct vfio_ccw_private *private = dev_get_drvdata(mdev->dev.parent);
 
 	VFIO_CCW_MSG_EVENT(2, "mdev %pUl, sch %x.%x.%04x: remove\n",
 			   mdev_uuid(mdev), private->sch->schid.cssid,
 			   private->sch->schid.ssid,
 			   private->sch->schid.sch_no);
 
+	vfio_unregister_group_dev(&private->vdev);
+
 	if ((private->state != VFIO_CCW_STATE_NOT_OPER) &&
 	    (private->state != VFIO_CCW_STATE_STANDBY)) {
 		if (!vfio_ccw_sch_quiesce(private->sch))
@@ -150,23 +168,22 @@ static int vfio_ccw_mdev_remove(struct mdev_device *mdev)
 		/* The state will be NOT_OPER on error. */
 	}
 
+	vfio_uninit_group_dev(&private->vdev);
 	cp_free(&private->cp);
 	private->mdev = NULL;
 	atomic_inc(&private->avail);
-
-	return 0;
 }
 
-static int vfio_ccw_mdev_open_device(struct mdev_device *mdev)
+static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
 {
 	struct vfio_ccw_private *private =
-		dev_get_drvdata(mdev_parent_dev(mdev));
+		container_of(vdev, struct vfio_ccw_private, vdev);
 	unsigned long events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
 	int ret;
 
 	private->nb.notifier_call = vfio_ccw_mdev_notifier;
 
-	ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
+	ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
 				     &events, &private->nb);
 	if (ret)
 		return ret;
@@ -187,15 +204,15 @@ static int vfio_ccw_mdev_open_device(struct mdev_device *mdev)
 
 out_unregister:
 	vfio_ccw_unregister_dev_regions(private);
-	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
+	vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
 				 &private->nb);
 	return ret;
 }
 
-static void vfio_ccw_mdev_close_device(struct mdev_device *mdev)
+static void vfio_ccw_mdev_close_device(struct vfio_device *vdev)
 {
 	struct vfio_ccw_private *private =
-		dev_get_drvdata(mdev_parent_dev(mdev));
+		container_of(vdev, struct vfio_ccw_private, vdev);
 
 	if ((private->state != VFIO_CCW_STATE_NOT_OPER) &&
 	    (private->state != VFIO_CCW_STATE_STANDBY)) {
@@ -206,8 +223,7 @@ static void vfio_ccw_mdev_close_device(struct mdev_device *mdev)
 
 	cp_free(&private->cp);
 	vfio_ccw_unregister_dev_regions(private);
-	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
-				 &private->nb);
+	vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, &private->nb);
 }
 
 static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
@@ -231,15 +247,14 @@ static ssize_t vfio_ccw_mdev_read_io_region(struct vfio_ccw_private *private,
 	return ret;
 }
 
-static ssize_t vfio_ccw_mdev_read(struct mdev_device *mdev,
+static ssize_t vfio_ccw_mdev_read(struct vfio_device *vdev,
 				  char __user *buf,
 				  size_t count,
 				  loff_t *ppos)
 {
+	struct vfio_ccw_private *private =
+		container_of(vdev, struct vfio_ccw_private, vdev);
 	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
-	struct vfio_ccw_private *private;
-
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 
 	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
 		return -EINVAL;
@@ -284,15 +299,14 @@ static ssize_t vfio_ccw_mdev_write_io_region(struct vfio_ccw_private *private,
 	return ret;
 }
 
-static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
+static ssize_t vfio_ccw_mdev_write(struct vfio_device *vdev,
 				   const char __user *buf,
 				   size_t count,
 				   loff_t *ppos)
 {
+	struct vfio_ccw_private *private =
+		container_of(vdev, struct vfio_ccw_private, vdev);
 	unsigned int index = VFIO_CCW_OFFSET_TO_INDEX(*ppos);
-	struct vfio_ccw_private *private;
-
-	private = dev_get_drvdata(mdev_parent_dev(mdev));
 
 	if (index >= VFIO_CCW_NUM_REGIONS + private->num_regions)
 		return -EINVAL;
@@ -510,12 +524,12 @@ void vfio_ccw_unregister_dev_regions(struct vfio_ccw_private *private)
 	private->region = NULL;
 }
 
-static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
+static ssize_t vfio_ccw_mdev_ioctl(struct vfio_device *vdev,
 				   unsigned int cmd,
 				   unsigned long arg)
 {
 	struct vfio_ccw_private *private =
-		dev_get_drvdata(mdev_parent_dev(mdev));
+		container_of(vdev, struct vfio_ccw_private, vdev);
 	int ret = 0;
 	unsigned long minsz;
 
@@ -606,37 +620,48 @@ static ssize_t vfio_ccw_mdev_ioctl(struct mdev_device *mdev,
 }
 
 /* Request removal of the device*/
-static void vfio_ccw_mdev_request(struct mdev_device *mdev, unsigned int count)
+static void vfio_ccw_mdev_request(struct vfio_device *vdev, unsigned int count)
 {
-	struct vfio_ccw_private *private = dev_get_drvdata(mdev_parent_dev(mdev));
-
-	if (!private)
-		return;
+	struct vfio_ccw_private *private =
+		container_of(vdev, struct vfio_ccw_private, vdev);
+	struct device *dev = vdev->dev;
 
 	if (private->req_trigger) {
 		if (!(count % 10))
-			dev_notice_ratelimited(mdev_dev(private->mdev),
+			dev_notice_ratelimited(dev,
 					       "Relaying device request to user (#%u)\n",
 					       count);
 
 		eventfd_signal(private->req_trigger, 1);
 	} else if (count == 0) {
-		dev_notice(mdev_dev(private->mdev),
+		dev_notice(dev,
 			   "No device request channel registered, blocked until released by user\n");
 	}
 }
 
+static const struct vfio_device_ops vfio_ccw_dev_ops = {
+	.open_device = vfio_ccw_mdev_open_device,
+	.close_device = vfio_ccw_mdev_close_device,
+	.read = vfio_ccw_mdev_read,
+	.write = vfio_ccw_mdev_write,
+	.ioctl = vfio_ccw_mdev_ioctl,
+	.request = vfio_ccw_mdev_request,
+};
+
+struct mdev_driver vfio_ccw_mdev_driver = {
+	.driver = {
+		.name = "vfio_ccw_mdev",
+		.owner = THIS_MODULE,
+		.mod_name = KBUILD_MODNAME,
+	},
+	.probe = vfio_ccw_mdev_probe,
+	.remove = vfio_ccw_mdev_remove,
+};
+
 static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
 	.owner			= THIS_MODULE,
+	.device_driver		= &vfio_ccw_mdev_driver,
 	.supported_type_groups  = mdev_type_groups,
-	.create			= vfio_ccw_mdev_create,
-	.remove			= vfio_ccw_mdev_remove,
-	.open_device		= vfio_ccw_mdev_open_device,
-	.close_device		= vfio_ccw_mdev_close_device,
-	.read			= vfio_ccw_mdev_read,
-	.write			= vfio_ccw_mdev_write,
-	.ioctl			= vfio_ccw_mdev_ioctl,
-	.request		= vfio_ccw_mdev_request,
 };
 
 int vfio_ccw_mdev_reg(struct subchannel *sch)
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index b2c762eb42b9bb..7272eb78861244 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -17,6 +17,7 @@
 #include <linux/eventfd.h>
 #include <linux/workqueue.h>
 #include <linux/vfio_ccw.h>
+#include <linux/vfio.h>
 #include <asm/crw.h>
 #include <asm/debug.h>
 
@@ -67,6 +68,7 @@ struct vfio_ccw_crw {
 
 /**
  * struct vfio_ccw_private
+ * @vdev: Embedded VFIO device
  * @sch: pointer to the subchannel
  * @state: internal state of the device
  * @completion: synchronization helper of the I/O completion
@@ -90,6 +92,7 @@ struct vfio_ccw_crw {
  * @crw_work: work for deferral process of CRW handling
  */
 struct vfio_ccw_private {
+	struct vfio_device vdev;
 	struct subchannel	*sch;
 	int			state;
 	struct completion	*completion;
@@ -121,6 +124,8 @@ extern void vfio_ccw_mdev_unreg(struct subchannel *sch);
 
 extern int vfio_ccw_sch_quiesce(struct subchannel *sch);
 
+extern struct mdev_driver vfio_ccw_mdev_driver;
+
 /*
  * States of the device statemachine.
  */
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 4/9] vfio/ccw: Make the FSM complete and synchronize it to the mdev
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2021-09-09 19:38 ` [PATCH v2 3/9] vfio/ccw: Convert to use vfio_register_group_dev() Jason Gunthorpe
@ 2021-09-09 19:38 ` Jason Gunthorpe
  2021-09-20 12:19   ` Cornelia Huck
  2021-09-09 19:38 ` [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code Jason Gunthorpe
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

The subchannel should be left in a quiescent state unless the VFIO device
FD is opened. When the FD is opened bring the chanel to active and allow
the VFIO device to operate. When the device FD is closed then quiesce the
channel.

To make this work the FSM needs to handle the transitions to/from open and
closed so everything is sequenced. Rename state NOT_OPER to BROKEN and use
it wheneven the driver has malfunctioned. STANDBY becomes CLOSED. The
normal case FSM looks like:
    CLOSED -> IDLE -> PROCESS/PENDING* -> IDLE -> CLOSED

With a possible branch off to BROKEN from any state. Once the device is in
BROKEN it cannot be recovered other than be reloading the driver.

Delete the triply redundant calls to
vfio_ccw_sch_quiesce(). vfio_ccw_mdev_close_device() always leaves the
subchannel quiescent. vfio_ccw_mdev_remove() cannot return until
vfio_ccw_mdev_close_device() completes and vfio_ccw_sch_remove() cannot
return until vfio_ccw_mdev_remove() completes. Have the FSM code take care
of calling cp_free() when appropriate.

Device reset becomes a CLOSE/OPEN sequence which now properly handles the
situation if the device becomes BROKEN.

Machine shutdown via vfio_ccw_sch_shutdown() now simply tries to close and
leaves the device BROKEN (though arguably the bus should take care to
quiet down the subchannel HW during shutdown, not the drivers)

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/s390/cio/vfio_ccw_drv.c     |  74 ++------------------
 drivers/s390/cio/vfio_ccw_fsm.c     | 104 ++++++++++++++++++++++++----
 drivers/s390/cio/vfio_ccw_ops.c     |  49 ++++---------
 drivers/s390/cio/vfio_ccw_private.h |  12 ++--
 4 files changed, 119 insertions(+), 120 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 396e815f81f8a4..99f2823361718f 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -36,51 +36,6 @@ debug_info_t *vfio_ccw_debug_trace_id;
 /*
  * Helpers
  */
-int vfio_ccw_sch_quiesce(struct subchannel *sch)
-{
-	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
-	DECLARE_COMPLETION_ONSTACK(completion);
-	int iretry, ret = 0;
-
-	spin_lock_irq(sch->lock);
-	if (!sch->schib.pmcw.ena)
-		goto out_unlock;
-	ret = cio_disable_subchannel(sch);
-	if (ret != -EBUSY)
-		goto out_unlock;
-
-	iretry = 255;
-	do {
-
-		ret = cio_cancel_halt_clear(sch, &iretry);
-
-		if (ret == -EIO) {
-			pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n",
-			       sch->schid.ssid, sch->schid.sch_no);
-			break;
-		}
-
-		/*
-		 * Flush all I/O and wait for
-		 * cancel/halt/clear completion.
-		 */
-		private->completion = &completion;
-		spin_unlock_irq(sch->lock);
-
-		if (ret == -EBUSY)
-			wait_for_completion_timeout(&completion, 3*HZ);
-
-		private->completion = NULL;
-		flush_workqueue(vfio_ccw_work_q);
-		spin_lock_irq(sch->lock);
-		ret = cio_disable_subchannel(sch);
-	} while (ret == -EBUSY);
-out_unlock:
-	private->state = VFIO_CCW_STATE_NOT_OPER;
-	spin_unlock_irq(sch->lock);
-	return ret;
-}
-
 static void vfio_ccw_sch_io_todo(struct work_struct *work)
 {
 	struct vfio_ccw_private *private;
@@ -147,7 +102,7 @@ static struct vfio_ccw_private *vfio_ccw_alloc_private(struct subchannel *sch)
 
 	private->sch = sch;
 	mutex_init(&private->io_mutex);
-	private->state = VFIO_CCW_STATE_NOT_OPER;
+	private->state = VFIO_CCW_STATE_CLOSED;
 	INIT_LIST_HEAD(&private->crw);
 	INIT_WORK(&private->io_work, vfio_ccw_sch_io_todo);
 	INIT_WORK(&private->crw_work, vfio_ccw_crw_todo);
@@ -231,18 +186,9 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 
 	dev_set_drvdata(&sch->dev, private);
 
-	spin_lock_irq(sch->lock);
-	sch->isc = VFIO_CCW_ISC;
-	ret = cio_enable_subchannel(sch, (u32)(unsigned long)sch);
-	spin_unlock_irq(sch->lock);
-	if (ret)
-		goto out_free;
-
-	private->state = VFIO_CCW_STATE_STANDBY;
-
 	ret = vfio_ccw_mdev_reg(sch);
 	if (ret)
-		goto out_disable;
+		goto out_free;
 
 	if (dev_get_uevent_suppress(&sch->dev)) {
 		dev_set_uevent_suppress(&sch->dev, 0);
@@ -254,8 +200,6 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 			   sch->schid.sch_no);
 	return 0;
 
-out_disable:
-	cio_disable_subchannel(sch);
 out_free:
 	dev_set_drvdata(&sch->dev, NULL);
 	vfio_ccw_free_private(private);
@@ -266,7 +210,6 @@ static int vfio_ccw_sch_remove(struct subchannel *sch)
 {
 	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
 
-	vfio_ccw_sch_quiesce(sch);
 	vfio_ccw_mdev_unreg(sch);
 
 	dev_set_drvdata(&sch->dev, NULL);
@@ -281,7 +224,10 @@ static int vfio_ccw_sch_remove(struct subchannel *sch)
 
 static void vfio_ccw_sch_shutdown(struct subchannel *sch)
 {
-	vfio_ccw_sch_quiesce(sch);
+	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
+
+	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_CLOSE);
+	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_BROKEN);
 }
 
 /**
@@ -308,16 +254,10 @@ static int vfio_ccw_sch_event(struct subchannel *sch, int process)
 		goto out_unlock;
 
 	if (cio_update_schib(sch)) {
-		vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_NOT_OPER);
+		vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_BROKEN);
 		rc = 0;
 		goto out_unlock;
 	}
-
-	private = dev_get_drvdata(&sch->dev);
-	if (private->state == VFIO_CCW_STATE_NOT_OPER) {
-		private->state = private->mdev ? VFIO_CCW_STATE_IDLE :
-				 VFIO_CCW_STATE_STANDBY;
-	}
 	rc = 0;
 
 out_unlock:
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index e435a9cd92dacf..302215090b9ac7 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -12,6 +12,8 @@
 #include <linux/vfio.h>
 #include <linux/mdev.h>
 
+#include <asm/isc.h>
+
 #include "ioasm.h"
 #include "vfio_ccw_private.h"
 
@@ -156,12 +158,12 @@ static int fsm_do_clear(struct vfio_ccw_private *private)
 	return ret;
 }
 
-static void fsm_notoper(struct vfio_ccw_private *private,
-			enum vfio_ccw_event event)
+static void fsm_broken(struct vfio_ccw_private *private,
+		       enum vfio_ccw_event event)
 {
 	struct subchannel *sch = private->sch;
 
-	VFIO_CCW_TRACE_EVENT(2, "notoper");
+	VFIO_CCW_TRACE_EVENT(2, "broken");
 	VFIO_CCW_TRACE_EVENT(2, dev_name(&sch->dev));
 
 	/*
@@ -169,7 +171,8 @@ static void fsm_notoper(struct vfio_ccw_private *private,
 	 * Probably we should send the machine check to the guest.
 	 */
 	css_sched_sch_todo(sch, SCH_TODO_UNREG);
-	private->state = VFIO_CCW_STATE_NOT_OPER;
+	private->state = VFIO_CCW_STATE_BROKEN;
+	cp_free(&private->cp);
 }
 
 /*
@@ -367,38 +370,115 @@ static void fsm_irq(struct vfio_ccw_private *private,
 		complete(private->completion);
 }
 
+static void fsm_open(struct vfio_ccw_private *private,
+		     enum vfio_ccw_event event)
+{
+	struct subchannel *sch = private->sch;
+	int ret;
+
+	spin_lock_irq(sch->lock);
+	sch->isc = VFIO_CCW_ISC;
+	ret = cio_enable_subchannel(sch, (u32)(unsigned long)sch);
+	if (ret)
+		private->state = VFIO_CCW_STATE_BROKEN;
+	else
+		private->state = VFIO_CCW_STATE_IDLE;
+	spin_unlock_irq(sch->lock);
+}
+
+static void fsm_close(struct vfio_ccw_private *private,
+		      enum vfio_ccw_event event)
+{
+	struct subchannel *sch = private->sch;
+	DECLARE_COMPLETION_ONSTACK(completion);
+	int iretry, ret = 0;
+
+	spin_lock_irq(sch->lock);
+	if (!sch->schib.pmcw.ena)
+		goto err_unlock;
+	ret = cio_disable_subchannel(sch);
+	if (ret != -EBUSY)
+		goto err_unlock;
+
+	iretry = 255;
+	do {
+
+		ret = cio_cancel_halt_clear(sch, &iretry);
+
+		if (ret == -EIO) {
+			pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n",
+			       sch->schid.ssid, sch->schid.sch_no);
+			break;
+		}
+
+		/*
+		 * Flush all I/O and wait for
+		 * cancel/halt/clear completion.
+		 */
+		private->completion = &completion;
+		spin_unlock_irq(sch->lock);
+
+		if (ret == -EBUSY)
+			wait_for_completion_timeout(&completion, 3*HZ);
+
+		private->completion = NULL;
+		flush_workqueue(vfio_ccw_work_q);
+		spin_lock_irq(sch->lock);
+		ret = cio_disable_subchannel(sch);
+	} while (ret == -EBUSY);
+	if (ret)
+		goto err_unlock;
+	private->state = VFIO_CCW_STATE_CLOSED;
+	spin_unlock_irq(sch->lock);
+	cp_free(&private->cp);
+	return;
+
+err_unlock:
+	spin_unlock_irq(sch->lock);
+	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_BROKEN);
+}
+
 /*
  * Device statemachine
  */
 fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
-	[VFIO_CCW_STATE_NOT_OPER] = {
-		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_nop,
+	[VFIO_CCW_STATE_BROKEN] = {
+		[VFIO_CCW_EVENT_BROKEN]		= fsm_nop,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
 		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_disabled_irq,
+		[VFIO_CCW_EVENT_CLOSE]		= fsm_nop,
 	},
-	[VFIO_CCW_STATE_STANDBY] = {
-		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
+	[VFIO_CCW_STATE_CLOSED] = {
+		[VFIO_CCW_EVENT_BROKEN]		= fsm_broken,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
 		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
-		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
+		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_disabled_irq,
+		[VFIO_CCW_EVENT_OPEN]		= fsm_open,
+		[VFIO_CCW_EVENT_CLOSE]		= fsm_broken,
 	},
 	[VFIO_CCW_STATE_IDLE] = {
-		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
+		[VFIO_CCW_EVENT_BROKEN]		= fsm_broken,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_request,
 		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_request,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
+		[VFIO_CCW_EVENT_OPEN]		= fsm_broken,
+		[VFIO_CCW_EVENT_CLOSE]		= fsm_close,
 	},
 	[VFIO_CCW_STATE_CP_PROCESSING] = {
-		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
+		[VFIO_CCW_EVENT_BROKEN]		= fsm_broken,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_retry,
 		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_retry,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
+		[VFIO_CCW_EVENT_OPEN]		= fsm_broken,
+		[VFIO_CCW_EVENT_CLOSE]		= fsm_close,
 	},
 	[VFIO_CCW_STATE_CP_PENDING] = {
-		[VFIO_CCW_EVENT_NOT_OPER]	= fsm_notoper,
+		[VFIO_CCW_EVENT_BROKEN]		= fsm_broken,
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_busy,
 		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_request,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_irq,
+		[VFIO_CCW_EVENT_OPEN]		= fsm_broken,
+		[VFIO_CCW_EVENT_CLOSE]		= fsm_close,
 	},
 };
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 3a66e4fb18244c..6e70620d5dfbc8 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -21,10 +21,6 @@ static const struct vfio_device_ops vfio_ccw_dev_ops;
 
 static int vfio_ccw_mdev_reset(struct vfio_ccw_private *private)
 {
-	struct subchannel *sch;
-	int ret;
-
-	sch = private->sch;
 	/*
 	 * TODO:
 	 * In the cureent stage, some things like "no I/O running" and "no
@@ -33,15 +29,11 @@ static int vfio_ccw_mdev_reset(struct vfio_ccw_private *private)
 	 * There are still a lot more instructions need to be handled. We
 	 * should come back here later.
 	 */
-	ret = vfio_ccw_sch_quiesce(sch);
-	if (ret)
-		return ret;
-
-	ret = cio_enable_subchannel(sch, (u32)(unsigned long)sch);
-	if (!ret)
-		private->state = VFIO_CCW_STATE_IDLE;
-
-	return ret;
+	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_CLOSE);
+	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_OPEN);
+	if (private->state == VFIO_CCW_STATE_BROKEN)
+		return -EINVAL;
+	return 0;
 }
 
 static int vfio_ccw_mdev_notifier(struct notifier_block *nb,
@@ -118,9 +110,6 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 	struct vfio_ccw_private *private = dev_get_drvdata(mdev->dev.parent);
 	int ret;
 
-	if (private->state == VFIO_CCW_STATE_NOT_OPER)
-		return -ENODEV;
-
 	if (atomic_dec_if_positive(&private->avail) < 0)
 		return -EPERM;
 
@@ -129,7 +118,6 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 			    &vfio_ccw_dev_ops);
 
 	private->mdev = mdev;
-	private->state = VFIO_CCW_STATE_IDLE;
 
 	VFIO_CCW_MSG_EVENT(2, "mdev %pUl, sch %x.%x.%04x: create\n",
 			   mdev_uuid(mdev), private->sch->schid.cssid,
@@ -146,7 +134,6 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 	vfio_uninit_group_dev(&private->vdev);
 	atomic_inc(&private->avail);
 	private->mdev = NULL;
-	private->state = VFIO_CCW_STATE_IDLE;
 	return ret;
 }
 
@@ -160,16 +147,7 @@ static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
 			   private->sch->schid.sch_no);
 
 	vfio_unregister_group_dev(&private->vdev);
-
-	if ((private->state != VFIO_CCW_STATE_NOT_OPER) &&
-	    (private->state != VFIO_CCW_STATE_STANDBY)) {
-		if (!vfio_ccw_sch_quiesce(private->sch))
-			private->state = VFIO_CCW_STATE_STANDBY;
-		/* The state will be NOT_OPER on error. */
-	}
-
 	vfio_uninit_group_dev(&private->vdev);
-	cp_free(&private->cp);
 	private->mdev = NULL;
 	atomic_inc(&private->avail);
 }
@@ -181,6 +159,9 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
 	unsigned long events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
 	int ret;
 
+	if (private->state == VFIO_CCW_STATE_BROKEN)
+		return -EINVAL;
+
 	private->nb.notifier_call = vfio_ccw_mdev_notifier;
 
 	ret = vfio_register_notifier(vdev->dev, VFIO_IOMMU_NOTIFY,
@@ -200,6 +181,11 @@ static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
 	if (ret)
 		goto out_unregister;
 
+	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_OPEN);
+	if (private->state == VFIO_CCW_STATE_BROKEN) {
+		ret = -EINVAL;
+		goto out_unregister;
+	}
 	return ret;
 
 out_unregister:
@@ -214,14 +200,7 @@ static void vfio_ccw_mdev_close_device(struct vfio_device *vdev)
 	struct vfio_ccw_private *private =
 		container_of(vdev, struct vfio_ccw_private, vdev);
 
-	if ((private->state != VFIO_CCW_STATE_NOT_OPER) &&
-	    (private->state != VFIO_CCW_STATE_STANDBY)) {
-		if (!vfio_ccw_mdev_reset(private))
-			private->state = VFIO_CCW_STATE_STANDBY;
-		/* The state will be NOT_OPER on error. */
-	}
-
-	cp_free(&private->cp);
+	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_CLOSE);
 	vfio_ccw_unregister_dev_regions(private);
 	vfio_unregister_notifier(vdev->dev, VFIO_IOMMU_NOTIFY, &private->nb);
 }
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 7272eb78861244..5e98eacdf31074 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -122,16 +122,14 @@ struct vfio_ccw_private {
 extern int vfio_ccw_mdev_reg(struct subchannel *sch);
 extern void vfio_ccw_mdev_unreg(struct subchannel *sch);
 
-extern int vfio_ccw_sch_quiesce(struct subchannel *sch);
-
 extern struct mdev_driver vfio_ccw_mdev_driver;
 
 /*
  * States of the device statemachine.
  */
 enum vfio_ccw_state {
-	VFIO_CCW_STATE_NOT_OPER,
-	VFIO_CCW_STATE_STANDBY,
+	VFIO_CCW_STATE_BROKEN,
+	VFIO_CCW_STATE_CLOSED,
 	VFIO_CCW_STATE_IDLE,
 	VFIO_CCW_STATE_CP_PROCESSING,
 	VFIO_CCW_STATE_CP_PENDING,
@@ -143,10 +141,12 @@ enum vfio_ccw_state {
  * Asynchronous events of the device statemachine.
  */
 enum vfio_ccw_event {
-	VFIO_CCW_EVENT_NOT_OPER,
+	VFIO_CCW_EVENT_BROKEN,
 	VFIO_CCW_EVENT_IO_REQ,
 	VFIO_CCW_EVENT_INTERRUPT,
 	VFIO_CCW_EVENT_ASYNC_REQ,
+	VFIO_CCW_EVENT_OPEN,
+	VFIO_CCW_EVENT_CLOSE,
 	/* last element! */
 	NR_VFIO_CCW_EVENTS
 };
@@ -158,7 +158,7 @@ typedef void (fsm_func_t)(struct vfio_ccw_private *, enum vfio_ccw_event);
 extern fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS];
 
 static inline void vfio_ccw_fsm_event(struct vfio_ccw_private *private,
-				     int event)
+				      enum vfio_ccw_event event)
 {
 	trace_vfio_ccw_fsm_event(private->sch->schid, private->state, event);
 	vfio_ccw_jumptable[private->state][event](private, event);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2021-09-09 19:38 ` [PATCH v2 4/9] vfio/ccw: Make the FSM complete and synchronize it to the mdev Jason Gunthorpe
@ 2021-09-09 19:38 ` Jason Gunthorpe
  2021-09-10 12:10   ` Christoph Hellwig
  2021-09-09 19:38 ` [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core Jason Gunthorpe
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

Every driver just emits a static string, simply feed it through the ops
and provide a standard sysfs show function.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/gpu/drm/i915/gvt/kvmgt.c  |  9 +--------
 drivers/s390/cio/vfio_ccw_ops.c   |  9 +--------
 drivers/s390/crypto/vfio_ap_ops.c |  9 +--------
 drivers/vfio/mdev/mdev_core.c     |  2 +-
 drivers/vfio/mdev/mdev_sysfs.c    | 27 ++++++++++++++++++++++++---
 include/linux/mdev.h              |  7 ++-----
 samples/vfio-mdev/mbochs.c        |  9 +--------
 samples/vfio-mdev/mdpy.c          |  9 +--------
 samples/vfio-mdev/mtty.c          | 10 +---------
 9 files changed, 33 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index 7efa386449d104..d198cc3d132277 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -161,12 +161,6 @@ static ssize_t available_instances_show(struct mdev_type *mtype,
 	return sprintf(buf, "%u\n", num);
 }
 
-static ssize_t device_api_show(struct mdev_type *mtype,
-			       struct mdev_type_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%s\n", VFIO_DEVICE_API_PCI_STRING);
-}
-
 static ssize_t description_show(struct mdev_type *mtype,
 				struct mdev_type_attribute *attr, char *buf)
 {
@@ -187,12 +181,10 @@ static ssize_t description_show(struct mdev_type *mtype,
 }
 
 static MDEV_TYPE_ATTR_RO(available_instances);
-static MDEV_TYPE_ATTR_RO(device_api);
 static MDEV_TYPE_ATTR_RO(description);
 
 static struct attribute *gvt_type_attrs[] = {
 	&mdev_type_attr_available_instances.attr,
-	&mdev_type_attr_device_api.attr,
 	&mdev_type_attr_description.attr,
 	NULL,
 };
@@ -1750,6 +1742,7 @@ static const struct attribute_group *intel_vgpu_groups[] = {
 
 static struct mdev_parent_ops intel_vgpu_ops = {
 	.mdev_attr_groups       = intel_vgpu_groups,
+	.device_api		= VFIO_DEVICE_API_PCI_STRING,
 	.create			= intel_vgpu_create,
 	.remove			= intel_vgpu_remove,
 
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 6e70620d5dfbc8..18a48d1f1e8fff 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -70,13 +70,6 @@ static ssize_t name_show(struct mdev_type *mtype,
 }
 static MDEV_TYPE_ATTR_RO(name);
 
-static ssize_t device_api_show(struct mdev_type *mtype,
-			       struct mdev_type_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%s\n", VFIO_DEVICE_API_CCW_STRING);
-}
-static MDEV_TYPE_ATTR_RO(device_api);
-
 static ssize_t available_instances_show(struct mdev_type *mtype,
 					struct mdev_type_attribute *attr,
 					char *buf)
@@ -90,7 +83,6 @@ static MDEV_TYPE_ATTR_RO(available_instances);
 
 static struct attribute *mdev_types_attrs[] = {
 	&mdev_type_attr_name.attr,
-	&mdev_type_attr_device_api.attr,
 	&mdev_type_attr_available_instances.attr,
 	NULL,
 };
@@ -640,6 +632,7 @@ struct mdev_driver vfio_ccw_mdev_driver = {
 static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
 	.owner			= THIS_MODULE,
 	.device_driver		= &vfio_ccw_mdev_driver,
+	.device_api		= VFIO_DEVICE_API_CCW_STRING,
 	.supported_type_groups  = mdev_type_groups,
 };
 
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 54bb0c22e8020e..005c2a2b14af3f 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -400,17 +400,9 @@ static ssize_t available_instances_show(struct mdev_type *mtype,
 
 static MDEV_TYPE_ATTR_RO(available_instances);
 
-static ssize_t device_api_show(struct mdev_type *mtype,
-			       struct mdev_type_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%s\n", VFIO_DEVICE_API_AP_STRING);
-}
-
-static MDEV_TYPE_ATTR_RO(device_api);
 
 static struct attribute *vfio_ap_mdev_type_attrs[] = {
 	&mdev_type_attr_name.attr,
-	&mdev_type_attr_device_api.attr,
 	&mdev_type_attr_available_instances.attr,
 	NULL,
 };
@@ -1399,6 +1391,7 @@ static struct mdev_driver vfio_ap_matrix_driver = {
 static const struct mdev_parent_ops vfio_ap_matrix_ops = {
 	.owner			= THIS_MODULE,
 	.device_driver		= &vfio_ap_matrix_driver,
+	.device_api		= VFIO_DEVICE_API_AP_STRING,
 	.supported_type_groups	= vfio_ap_mdev_type_groups,
 };
 
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index b314101237fe22..c3018e8e6d3258 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -129,7 +129,7 @@ int mdev_register_device(struct device *dev, const struct mdev_parent_ops *ops)
 	char *envp[] = { env_string, NULL };
 
 	/* check for mandatory ops */
-	if (!ops || !ops->supported_type_groups)
+	if (!ops || !ops->supported_type_groups || !ops->device_api)
 		return -EINVAL;
 	if (!ops->device_driver && (!ops->create || !ops->remove))
 		return -EINVAL;
diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
index f5cf1931c54e48..d4b99440d19e9a 100644
--- a/drivers/vfio/mdev/mdev_sysfs.c
+++ b/drivers/vfio/mdev/mdev_sysfs.c
@@ -74,9 +74,30 @@ static ssize_t create_store(struct mdev_type *mtype,
 
 	return count;
 }
-
 static MDEV_TYPE_ATTR_WO(create);
 
+static ssize_t device_api_show(struct mdev_type *mtype,
+			       struct mdev_type_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%s\n", mtype->parent->ops->device_api);
+}
+static MDEV_TYPE_ATTR_RO(device_api);
+
+static struct attribute *mdev_types_std_attrs[] = {
+	&mdev_type_attr_create.attr,
+	&mdev_type_attr_device_api.attr,
+	NULL,
+};
+
+static struct attribute_group mdev_type_std_group = {
+	.attrs = mdev_types_std_attrs,
+};
+
+static const struct attribute_group *mdev_type_groups[] = {
+	&mdev_type_std_group,
+	NULL,
+};
+
 static void mdev_type_release(struct kobject *kobj)
 {
 	struct mdev_type *type = to_mdev_type(kobj);
@@ -123,7 +144,7 @@ static struct mdev_type *add_mdev_supported_type(struct mdev_parent *parent,
 		return ERR_PTR(ret);
 	}
 
-	ret = sysfs_create_file(&type->kobj, &mdev_type_attr_create.attr);
+	ret = sysfs_create_groups(&type->kobj, mdev_type_groups);
 	if (ret)
 		goto attr_create_failed;
 
@@ -144,7 +165,7 @@ static struct mdev_type *add_mdev_supported_type(struct mdev_parent *parent,
 attrs_failed:
 	kobject_put(type->devices_kobj);
 attr_devices_failed:
-	sysfs_remove_file(&type->kobj, &mdev_type_attr_create.attr);
+	sysfs_remove_groups(&type->kobj, mdev_type_groups);
 attr_create_failed:
 	kobject_del(&type->kobj);
 	kobject_put(&type->kobj);
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 68427e8fadebd6..7f1db354f45f14 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -56,6 +56,7 @@ struct device *mtype_get_parent_dev(struct mdev_type *mtype);
  *
  * @owner:		The module owner.
  * @device_driver:	Which device driver to probe() on newly created devices
+ * @device_api:		String to return for the device_api sysfs
  * @dev_attr_groups:	Attributes of the parent device.
  * @mdev_attr_groups:	Attributes of the mediated device.
  * @supported_type_groups: Attributes to define supported types. It is mandatory
@@ -100,6 +101,7 @@ struct device *mtype_get_parent_dev(struct mdev_type *mtype);
 struct mdev_parent_ops {
 	struct module   *owner;
 	struct mdev_driver *device_driver;
+	const char *device_api;
 	const struct attribute_group **dev_attr_groups;
 	const struct attribute_group **mdev_attr_groups;
 	struct attribute_group **supported_type_groups;
@@ -128,11 +130,6 @@ struct mdev_type_attribute {
 			 size_t count);
 };
 
-#define MDEV_TYPE_ATTR(_name, _mode, _show, _store)		\
-struct mdev_type_attribute mdev_type_attr_##_name =		\
-	__ATTR(_name, _mode, _show, _store)
-#define MDEV_TYPE_ATTR_RW(_name) \
-	struct mdev_type_attribute mdev_type_attr_##_name = __ATTR_RW(_name)
 #define MDEV_TYPE_ATTR_RO(_name) \
 	struct mdev_type_attribute mdev_type_attr_##_name = __ATTR_RO(_name)
 #define MDEV_TYPE_ATTR_WO(_name) \
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index c313ab4d1f4e4e..abd889bc1f9dcf 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -1358,17 +1358,9 @@ static ssize_t available_instances_show(struct mdev_type *mtype,
 }
 static MDEV_TYPE_ATTR_RO(available_instances);
 
-static ssize_t device_api_show(struct mdev_type *mtype,
-			       struct mdev_type_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%s\n", VFIO_DEVICE_API_PCI_STRING);
-}
-static MDEV_TYPE_ATTR_RO(device_api);
-
 static struct attribute *mdev_types_attrs[] = {
 	&mdev_type_attr_name.attr,
 	&mdev_type_attr_description.attr,
-	&mdev_type_attr_device_api.attr,
 	&mdev_type_attr_available_instances.attr,
 	NULL,
 };
@@ -1417,6 +1409,7 @@ static struct mdev_driver mbochs_driver = {
 static const struct mdev_parent_ops mdev_fops = {
 	.owner			= THIS_MODULE,
 	.device_driver		= &mbochs_driver,
+	.device_api		= VFIO_DEVICE_API_PCI_STRING,
 	.supported_type_groups	= mdev_type_groups,
 };
 
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index 8d1a80a0722aa9..b81d7848619cae 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -670,17 +670,9 @@ static ssize_t available_instances_show(struct mdev_type *mtype,
 }
 static MDEV_TYPE_ATTR_RO(available_instances);
 
-static ssize_t device_api_show(struct mdev_type *mtype,
-			       struct mdev_type_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%s\n", VFIO_DEVICE_API_PCI_STRING);
-}
-static MDEV_TYPE_ATTR_RO(device_api);
-
 static struct attribute *mdev_types_attrs[] = {
 	&mdev_type_attr_name.attr,
 	&mdev_type_attr_description.attr,
-	&mdev_type_attr_device_api.attr,
 	&mdev_type_attr_available_instances.attr,
 	NULL,
 };
@@ -728,6 +720,7 @@ static struct mdev_driver mdpy_driver = {
 static const struct mdev_parent_ops mdev_fops = {
 	.owner			= THIS_MODULE,
 	.device_driver          = &mdpy_driver,
+	.device_api		= VFIO_DEVICE_API_PCI_STRING,
 	.supported_type_groups	= mdev_type_groups,
 };
 
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 5983cdb16e3d1d..b473ecc7733f7d 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -1281,17 +1281,8 @@ static ssize_t available_instances_show(struct mdev_type *mtype,
 
 static MDEV_TYPE_ATTR_RO(available_instances);
 
-static ssize_t device_api_show(struct mdev_type *mtype,
-			       struct mdev_type_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%s\n", VFIO_DEVICE_API_PCI_STRING);
-}
-
-static MDEV_TYPE_ATTR_RO(device_api);
-
 static struct attribute *mdev_types_attrs[] = {
 	&mdev_type_attr_name.attr,
-	&mdev_type_attr_device_api.attr,
 	&mdev_type_attr_available_instances.attr,
 	NULL,
 };
@@ -1333,6 +1324,7 @@ static struct mdev_driver mtty_driver = {
 static const struct mdev_parent_ops mdev_fops = {
 	.owner                  = THIS_MODULE,
 	.device_driver		= &mtty_driver,
+	.device_api		= VFIO_DEVICE_API_PCI_STRING,
 	.dev_attr_groups        = mtty_dev_groups,
 	.supported_type_groups  = mdev_type_groups,
 };
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2021-09-09 19:38 ` [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code Jason Gunthorpe
@ 2021-09-09 19:38 ` Jason Gunthorpe
  2021-09-10 12:25   ` Christoph Hellwig
  2021-09-20 18:02   ` Cornelia Huck
  2021-09-09 19:38 ` [PATCH v2 7/9] vfio/ccw: Remove private->mdev Jason Gunthorpe
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

Many of the mdev drivers use a simple counter for keeping track of the
available instances. Move this code to the core code and store the counter
in the mdev_type. Implement it using correct locking, fixing mdpy.

Drivers provide a get_available() callback to set the number of available
instances for their mtypes which is fixed at registration time. The core
provides a standard sysfs attribute to return the available_instances.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/s390/cio/vfio_ccw_drv.c       |  1 -
 drivers/s390/cio/vfio_ccw_ops.c       | 26 ++++++-------------
 drivers/s390/cio/vfio_ccw_private.h   |  2 --
 drivers/s390/crypto/vfio_ap_ops.c     | 32 ++++++-----------------
 drivers/s390/crypto/vfio_ap_private.h |  2 --
 drivers/vfio/mdev/mdev_core.c         | 11 +++++++-
 drivers/vfio/mdev/mdev_private.h      |  2 ++
 drivers/vfio/mdev/mdev_sysfs.c        | 37 +++++++++++++++++++++++++++
 include/linux/mdev.h                  |  2 ++
 samples/vfio-mdev/mdpy.c              | 22 +++++-----------
 10 files changed, 73 insertions(+), 64 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 99f2823361718f..de782e967a5474 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -106,7 +106,6 @@ static struct vfio_ccw_private *vfio_ccw_alloc_private(struct subchannel *sch)
 	INIT_LIST_HEAD(&private->crw);
 	INIT_WORK(&private->io_work, vfio_ccw_sch_io_todo);
 	INIT_WORK(&private->crw_work, vfio_ccw_crw_todo);
-	atomic_set(&private->avail, 1);
 
 	private->cp.guest_cp = kcalloc(CCWCHAIN_LEN_MAX, sizeof(struct ccw1),
 				       GFP_KERNEL);
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 18a48d1f1e8fff..38ab5c1f25ec09 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -70,20 +70,9 @@ static ssize_t name_show(struct mdev_type *mtype,
 }
 static MDEV_TYPE_ATTR_RO(name);
 
-static ssize_t available_instances_show(struct mdev_type *mtype,
-					struct mdev_type_attribute *attr,
-					char *buf)
-{
-	struct vfio_ccw_private *private =
-		dev_get_drvdata(mtype_get_parent_dev(mtype));
-
-	return sprintf(buf, "%d\n", atomic_read(&private->avail));
-}
-static MDEV_TYPE_ATTR_RO(available_instances);
 
 static struct attribute *mdev_types_attrs[] = {
 	&mdev_type_attr_name.attr,
-	&mdev_type_attr_available_instances.attr,
 	NULL,
 };
 
@@ -102,9 +91,6 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 	struct vfio_ccw_private *private = dev_get_drvdata(mdev->dev.parent);
 	int ret;
 
-	if (atomic_dec_if_positive(&private->avail) < 0)
-		return -EPERM;
-
 	memset(&private->vdev, 0, sizeof(private->vdev));
 	vfio_init_group_dev(&private->vdev, &mdev->dev,
 			    &vfio_ccw_dev_ops);
@@ -118,13 +104,12 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 
 	ret = vfio_register_group_dev(&private->vdev);
 	if (ret)
-		goto err_atomic;
+		goto err_init;
 	dev_set_drvdata(&mdev->dev, private);
 	return 0;
 
-err_atomic:
+err_init:
 	vfio_uninit_group_dev(&private->vdev);
-	atomic_inc(&private->avail);
 	private->mdev = NULL;
 	return ret;
 }
@@ -141,7 +126,6 @@ static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
 	vfio_unregister_group_dev(&private->vdev);
 	vfio_uninit_group_dev(&private->vdev);
 	private->mdev = NULL;
-	atomic_inc(&private->avail);
 }
 
 static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
@@ -610,6 +594,11 @@ static void vfio_ccw_mdev_request(struct vfio_device *vdev, unsigned int count)
 	}
 }
 
+static unsigned int vfio_ccw_get_available(struct mdev_type *mtype)
+{
+	return 1;
+}
+
 static const struct vfio_device_ops vfio_ccw_dev_ops = {
 	.open_device = vfio_ccw_mdev_open_device,
 	.close_device = vfio_ccw_mdev_close_device,
@@ -627,6 +616,7 @@ struct mdev_driver vfio_ccw_mdev_driver = {
 	},
 	.probe = vfio_ccw_mdev_probe,
 	.remove = vfio_ccw_mdev_remove,
+	.get_available = vfio_ccw_get_available,
 };
 
 static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 5e98eacdf31074..bbc97eb9d9c6fc 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -72,7 +72,6 @@ struct vfio_ccw_crw {
  * @sch: pointer to the subchannel
  * @state: internal state of the device
  * @completion: synchronization helper of the I/O completion
- * @avail: available for creating a mediated device
  * @mdev: pointer to the mediated device
  * @nb: notifier for vfio events
  * @io_region: MMIO region to input/output I/O arguments/results
@@ -96,7 +95,6 @@ struct vfio_ccw_private {
 	struct subchannel	*sch;
 	int			state;
 	struct completion	*completion;
-	atomic_t		avail;
 	struct mdev_device	*mdev;
 	struct notifier_block	nb;
 	struct ccw_io_region	*io_region;
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 005c2a2b14af3f..737e8d137bd43b 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -332,14 +332,9 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev)
 	struct ap_matrix_mdev *matrix_mdev;
 	int ret;
 
-	if ((atomic_dec_if_positive(&matrix_dev->available_instances) < 0))
-		return -EPERM;
-
 	matrix_mdev = kzalloc(sizeof(*matrix_mdev), GFP_KERNEL);
-	if (!matrix_mdev) {
-		ret = -ENOMEM;
-		goto err_dec_available;
-	}
+	if (!matrix_mdev)
+		return -ENOMEM;
 	vfio_init_group_dev(&matrix_mdev->vdev, &mdev->dev,
 			    &vfio_ap_matrix_dev_ops);
 
@@ -362,8 +357,6 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev)
 	mutex_unlock(&matrix_dev->lock);
 	vfio_uninit_group_dev(&matrix_mdev->vdev);
 	kfree(matrix_mdev);
-err_dec_available:
-	atomic_inc(&matrix_dev->available_instances);
 	return ret;
 }
 
@@ -378,7 +371,6 @@ static void vfio_ap_mdev_remove(struct mdev_device *mdev)
 	list_del(&matrix_mdev->node);
 	vfio_uninit_group_dev(&matrix_mdev->vdev);
 	kfree(matrix_mdev);
-	atomic_inc(&matrix_dev->available_instances);
 	mutex_unlock(&matrix_dev->lock);
 }
 
@@ -390,20 +382,8 @@ static ssize_t name_show(struct mdev_type *mtype,
 
 static MDEV_TYPE_ATTR_RO(name);
 
-static ssize_t available_instances_show(struct mdev_type *mtype,
-					struct mdev_type_attribute *attr,
-					char *buf)
-{
-	return sprintf(buf, "%d\n",
-		       atomic_read(&matrix_dev->available_instances));
-}
-
-static MDEV_TYPE_ATTR_RO(available_instances);
-
-
 static struct attribute *vfio_ap_mdev_type_attrs[] = {
 	&mdev_type_attr_name.attr,
-	&mdev_type_attr_available_instances.attr,
 	NULL,
 };
 
@@ -1371,6 +1351,11 @@ static ssize_t vfio_ap_mdev_ioctl(struct vfio_device *vdev,
 	return ret;
 }
 
+static unsigned int vfio_ap_mdev_get_available(struct mdev_type *mtype)
+{
+	return MAX_ZDEV_ENTRIES_EXT;
+}
+
 static const struct vfio_device_ops vfio_ap_matrix_dev_ops = {
 	.open_device = vfio_ap_mdev_open_device,
 	.close_device = vfio_ap_mdev_close_device,
@@ -1386,6 +1371,7 @@ static struct mdev_driver vfio_ap_matrix_driver = {
 	},
 	.probe = vfio_ap_mdev_probe,
 	.remove = vfio_ap_mdev_remove,
+	.get_available = vfio_ap_mdev_get_available,
 };
 
 static const struct mdev_parent_ops vfio_ap_matrix_ops = {
@@ -1399,8 +1385,6 @@ int vfio_ap_mdev_register(void)
 {
 	int ret;
 
-	atomic_set(&matrix_dev->available_instances, MAX_ZDEV_ENTRIES_EXT);
-
 	ret = mdev_register_driver(&vfio_ap_matrix_driver);
 	if (ret)
 		return ret;
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 77760e2b546fe6..dd87c84605bcb6 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -28,7 +28,6 @@
 /**
  * ap_matrix_dev - the AP matrix device structure
  * @device:	generic device structure associated with the AP matrix device
- * @available_instances: number of mediated matrix devices that can be created
  * @info:	the struct containing the output from the PQAP(QCI) instruction
  * mdev_list:	the list of mediated matrix devices created
  * lock:	mutex for locking the AP matrix device. This lock will be
@@ -39,7 +38,6 @@
  */
 struct ap_matrix_dev {
 	struct device device;
-	atomic_t available_instances;
 	struct ap_config_info info;
 	struct list_head mdev_list;
 	struct mutex lock;
diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
index c3018e8e6d3258..33a5e738867488 100644
--- a/drivers/vfio/mdev/mdev_core.c
+++ b/drivers/vfio/mdev/mdev_core.c
@@ -25,7 +25,7 @@ static DEFINE_MUTEX(parent_list_lock);
 static struct class_compat *mdev_bus_compat_class;
 
 static LIST_HEAD(mdev_list);
-static DEFINE_MUTEX(mdev_list_lock);
+DEFINE_MUTEX(mdev_list_lock);
 
 struct device *mdev_parent_dev(struct mdev_device *mdev)
 {
@@ -245,6 +245,7 @@ static void mdev_device_release(struct device *dev)
 
 	mutex_lock(&mdev_list_lock);
 	list_del(&mdev->next);
+	mdev->type->available++;
 	mutex_unlock(&mdev_list_lock);
 
 	dev_dbg(&mdev->dev, "MDEV: destroying\n");
@@ -268,6 +269,14 @@ int mdev_device_create(struct mdev_type *type, const guid_t *uuid)
 		}
 	}
 
+	if (drv->get_available) {
+		if (!type->available) {
+			mutex_unlock(&mdev_list_lock);
+			return -EUSERS;
+		}
+		type->available--;
+	}
+
 	mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
 	if (!mdev) {
 		mutex_unlock(&mdev_list_lock);
diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
index afbad7b0a14a17..83586b07023334 100644
--- a/drivers/vfio/mdev/mdev_private.h
+++ b/drivers/vfio/mdev/mdev_private.h
@@ -29,6 +29,7 @@ struct mdev_type {
 	struct kobject *devices_kobj;
 	struct mdev_parent *parent;
 	struct list_head next;
+	unsigned int available;
 	unsigned int type_group_id;
 };
 
@@ -38,6 +39,7 @@ struct mdev_type {
 	container_of(_kobj, struct mdev_type, kobj)
 
 extern struct mdev_driver vfio_mdev_driver;
+extern struct mutex mdev_list_lock;
 
 int  parent_create_sysfs_files(struct mdev_parent *parent);
 void parent_remove_sysfs_files(struct mdev_parent *parent);
diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
index d4b99440d19e9a..b3129dfc27ef7d 100644
--- a/drivers/vfio/mdev/mdev_sysfs.c
+++ b/drivers/vfio/mdev/mdev_sysfs.c
@@ -93,8 +93,41 @@ static struct attribute_group mdev_type_std_group = {
 	.attrs = mdev_types_std_attrs,
 };
 
+/* mdev_type attribute used by drivers that have an get_available() op */
+static ssize_t available_instances_show(struct mdev_type *mtype,
+					struct mdev_type_attribute *attr,
+					char *buf)
+{
+	unsigned int available;
+
+	mutex_lock(&mdev_list_lock);
+	available = mtype->available;
+	mutex_unlock(&mdev_list_lock);
+
+	return sysfs_emit(buf, "%u\n", available);
+}
+static MDEV_TYPE_ATTR_RO(available_instances);
+static umode_t available_instances_is_visible(struct kobject *kobj,
+					      struct attribute *attr, int n)
+{
+	struct mdev_type *type = to_mdev_type(kobj);
+
+	if (!type->parent->ops->device_driver->get_available)
+		return 0;
+	return attr->mode;
+}
+static struct attribute *mdev_types_name_attrs[] = {
+	&mdev_type_attr_available_instances.attr,
+	NULL,
+};
+static struct attribute_group mdev_type_available_instances_group = {
+	.attrs = mdev_types_name_attrs,
+	.is_visible = available_instances_is_visible,
+};
+
 static const struct attribute_group *mdev_type_groups[] = {
 	&mdev_type_std_group,
+	&mdev_type_available_instances_group,
 	NULL,
 };
 
@@ -136,6 +169,10 @@ static struct mdev_type *add_mdev_supported_type(struct mdev_parent *parent,
 	mdev_get_parent(parent);
 	type->type_group_id = type_group_id;
 
+	if (parent->ops->device_driver->get_available)
+		type->available =
+			parent->ops->device_driver->get_available(type);
+
 	ret = kobject_init_and_add(&type->kobj, &mdev_type_ktype, NULL,
 				   "%s-%s", dev_driver_string(parent->dev),
 				   group->name);
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 7f1db354f45f14..819eaf98ffac73 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -140,12 +140,14 @@ struct mdev_type_attribute {
  * @probe: called when new device created
  * @remove: called when device removed
  * @driver: device driver structure
+ * @get_available: Return the max number of instances that can be created
  *
  **/
 struct mdev_driver {
 	int (*probe)(struct mdev_device *dev);
 	void (*remove)(struct mdev_device *dev);
 	struct device_driver driver;
+	unsigned int (*get_available)(struct mdev_type *mtype);
 };
 
 static inline void *mdev_get_drvdata(struct mdev_device *mdev)
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index b81d7848619cae..2ea8694a0ddb19 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -84,7 +84,6 @@ static dev_t		mdpy_devt;
 static struct class	*mdpy_class;
 static struct cdev	mdpy_cdev;
 static struct device	mdpy_dev;
-static u32		mdpy_count;
 static const struct vfio_device_ops mdpy_dev_ops;
 
 /* State of each mdev device */
@@ -225,9 +224,6 @@ static int mdpy_probe(struct mdev_device *mdev)
 	u32 fbsize;
 	int ret;
 
-	if (mdpy_count >= max_devices)
-		return -ENOMEM;
-
 	mdev_state = kzalloc(sizeof(struct mdev_state), GFP_KERNEL);
 	if (mdev_state == NULL)
 		return -ENOMEM;
@@ -256,8 +252,6 @@ static int mdpy_probe(struct mdev_device *mdev)
 	mdpy_create_config_space(mdev_state);
 	mdpy_reset(mdev_state);
 
-	mdpy_count++;
-
 	ret = vfio_register_group_dev(&mdev_state->vdev);
 	if (ret)
 		goto err_mem;
@@ -284,8 +278,6 @@ static void mdpy_remove(struct mdev_device *mdev)
 	kfree(mdev_state->vconfig);
 	vfio_uninit_group_dev(&mdev_state->vdev);
 	kfree(mdev_state);
-
-	mdpy_count--;
 }
 
 static ssize_t mdpy_read(struct vfio_device *vdev, char __user *buf,
@@ -662,18 +654,10 @@ static ssize_t description_show(struct mdev_type *mtype,
 }
 static MDEV_TYPE_ATTR_RO(description);
 
-static ssize_t available_instances_show(struct mdev_type *mtype,
-					struct mdev_type_attribute *attr,
-					char *buf)
-{
-	return sprintf(buf, "%d\n", max_devices - mdpy_count);
-}
-static MDEV_TYPE_ATTR_RO(available_instances);
 
 static struct attribute *mdev_types_attrs[] = {
 	&mdev_type_attr_name.attr,
 	&mdev_type_attr_description.attr,
-	&mdev_type_attr_available_instances.attr,
 	NULL,
 };
 
@@ -706,6 +690,11 @@ static const struct vfio_device_ops mdpy_dev_ops = {
 	.mmap = mdpy_mmap,
 };
 
+static unsigned int mdpy_get_available(struct mdev_type *mtype)
+{
+	return max_devices;
+}
+
 static struct mdev_driver mdpy_driver = {
 	.driver = {
 		.name = "mdpy",
@@ -715,6 +704,7 @@ static struct mdev_driver mdpy_driver = {
 	},
 	.probe = mdpy_probe,
 	.remove	= mdpy_remove,
+	.get_available = mdpy_get_available,
 };
 
 static const struct mdev_parent_ops mdev_fops = {
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 7/9] vfio/ccw: Remove private->mdev
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2021-09-09 19:38 ` [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core Jason Gunthorpe
@ 2021-09-09 19:38 ` Jason Gunthorpe
  2021-09-24 20:45   ` Eric Farman
  2021-09-09 19:38 ` [PATCH v2 8/9] vfio: Export vfio_device_try_get() Jason Gunthorpe
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

Having a mdev pointer floating about in addition to a struct vfio_device
is confusing. It is only used for three things:

- Getting the mdev 'struct device *' - this is the same as
     private->vdev.dev

- Printing the uuid of the mdev in logging. The uuid is also the dev_name
  of the mdev so this is the same string as
     dev_name(private->vdev.dev)

- A weird attempt to fence the vfio_ccw_sch_io_todo() work. This work is
  only queued during states IDLE/PROCESSING/PENDING and flushed when
  entering CLOSED. Thus the work already cannot run when the mdev is NULL.
  Remove it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/s390/cio/vfio_ccw_drv.c     |  6 ++--
 drivers/s390/cio/vfio_ccw_fsm.c     | 48 +++++++++++++----------------
 drivers/s390/cio/vfio_ccw_ops.c     | 16 ++++------
 drivers/s390/cio/vfio_ccw_private.h |  2 --
 include/linux/mdev.h                |  4 ---
 5 files changed, 30 insertions(+), 46 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index de782e967a5474..0e2edd96567a09 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -64,7 +64,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
 	 * has finished. Do not overwrite a possible processing
 	 * state if the final interrupt was for HSCH or CSCH.
 	 */
-	if (private->mdev && cp_is_finished)
+	if (cp_is_finished)
 		private->state = VFIO_CCW_STATE_IDLE;
 
 	if (private->io_trigger)
@@ -303,8 +303,8 @@ static int vfio_ccw_chp_event(struct subchannel *sch,
 		return 0;
 
 	trace_vfio_ccw_chp_event(private->sch->schid, mask, event);
-	VFIO_CCW_MSG_EVENT(2, "%pUl (%x.%x.%04x): mask=0x%x event=%d\n",
-			   mdev_uuid(private->mdev), sch->schid.cssid,
+	VFIO_CCW_MSG_EVENT(2, "%s (%x.%x.%04x): mask=0x%x event=%d\n",
+			   dev_name(private->vdev.dev), sch->schid.cssid,
 			   sch->schid.ssid, sch->schid.sch_no,
 			   mask, event);
 
diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index 302215090b9ac7..df1490943b20ec 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -245,7 +245,6 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 	union orb *orb;
 	union scsw *scsw = &private->scsw;
 	struct ccw_io_region *io_region = private->io_region;
-	struct mdev_device *mdev = private->mdev;
 	char *errstr = "request";
 	struct subchannel_id schid = get_schid(private);
 
@@ -258,32 +257,30 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 		/* Don't try to build a cp if transport mode is specified. */
 		if (orb->tm.b) {
 			io_region->ret_code = -EOPNOTSUPP;
-			VFIO_CCW_MSG_EVENT(2,
-					   "%pUl (%x.%x.%04x): transport mode\n",
-					   mdev_uuid(mdev), schid.cssid,
-					   schid.ssid, schid.sch_no);
+			VFIO_CCW_MSG_EVENT(
+				2, "%s (%x.%x.%04x): transport mode\n",
+				dev_name(private->vdev.dev), schid.cssid,
+				schid.ssid, schid.sch_no);
 			errstr = "transport mode";
 			goto err_out;
 		}
-		io_region->ret_code = cp_init(&private->cp, mdev_dev(mdev),
+		io_region->ret_code = cp_init(&private->cp, private->vdev.dev,
 					      orb);
 		if (io_region->ret_code) {
-			VFIO_CCW_MSG_EVENT(2,
-					   "%pUl (%x.%x.%04x): cp_init=%d\n",
-					   mdev_uuid(mdev), schid.cssid,
-					   schid.ssid, schid.sch_no,
-					   io_region->ret_code);
+			VFIO_CCW_MSG_EVENT(2, "%s (%x.%x.%04x): cp_init=%d\n",
+					   dev_name(private->vdev.dev),
+					   schid.cssid, schid.ssid,
+					   schid.sch_no, io_region->ret_code);
 			errstr = "cp init";
 			goto err_out;
 		}
 
 		io_region->ret_code = cp_prefetch(&private->cp);
 		if (io_region->ret_code) {
-			VFIO_CCW_MSG_EVENT(2,
-					   "%pUl (%x.%x.%04x): cp_prefetch=%d\n",
-					   mdev_uuid(mdev), schid.cssid,
-					   schid.ssid, schid.sch_no,
-					   io_region->ret_code);
+			VFIO_CCW_MSG_EVENT(
+				2, "%s (%x.%x.%04x): cp_prefetch=%d\n",
+				dev_name(private->vdev.dev), schid.cssid,
+				schid.ssid, schid.sch_no, io_region->ret_code);
 			errstr = "cp prefetch";
 			cp_free(&private->cp);
 			goto err_out;
@@ -292,28 +289,25 @@ static void fsm_io_request(struct vfio_ccw_private *private,
 		/* Start channel program and wait for I/O interrupt. */
 		io_region->ret_code = fsm_io_helper(private);
 		if (io_region->ret_code) {
-			VFIO_CCW_MSG_EVENT(2,
-					   "%pUl (%x.%x.%04x): fsm_io_helper=%d\n",
-					   mdev_uuid(mdev), schid.cssid,
-					   schid.ssid, schid.sch_no,
-					   io_region->ret_code);
+			VFIO_CCW_MSG_EVENT(
+				2, "%s (%x.%x.%04x): fsm_io_helper=%d\n",
+				dev_name(private->vdev.dev), schid.cssid,
+				schid.ssid, schid.sch_no, io_region->ret_code);
 			errstr = "cp fsm_io_helper";
 			cp_free(&private->cp);
 			goto err_out;
 		}
 		return;
 	} else if (scsw->cmd.fctl & SCSW_FCTL_HALT_FUNC) {
-		VFIO_CCW_MSG_EVENT(2,
-				   "%pUl (%x.%x.%04x): halt on io_region\n",
-				   mdev_uuid(mdev), schid.cssid,
+		VFIO_CCW_MSG_EVENT(2, "%s (%x.%x.%04x): halt on io_region\n",
+				   dev_name(private->vdev.dev), schid.cssid,
 				   schid.ssid, schid.sch_no);
 		/* halt is handled via the async cmd region */
 		io_region->ret_code = -EOPNOTSUPP;
 		goto err_out;
 	} else if (scsw->cmd.fctl & SCSW_FCTL_CLEAR_FUNC) {
-		VFIO_CCW_MSG_EVENT(2,
-				   "%pUl (%x.%x.%04x): clear on io_region\n",
-				   mdev_uuid(mdev), schid.cssid,
+		VFIO_CCW_MSG_EVENT(2, "%s (%x.%x.%04x): clear on io_region\n",
+				   dev_name(private->vdev.dev), schid.cssid,
 				   schid.ssid, schid.sch_no);
 		/* clear is handled via the async cmd region */
 		io_region->ret_code = -EOPNOTSUPP;
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 38ab5c1f25ec09..23004e67c492f6 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -95,11 +95,9 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 	vfio_init_group_dev(&private->vdev, &mdev->dev,
 			    &vfio_ccw_dev_ops);
 
-	private->mdev = mdev;
-
-	VFIO_CCW_MSG_EVENT(2, "mdev %pUl, sch %x.%x.%04x: create\n",
-			   mdev_uuid(mdev), private->sch->schid.cssid,
-			   private->sch->schid.ssid,
+	VFIO_CCW_MSG_EVENT(2, "mdev %s, sch %x.%x.%04x: create\n",
+			   dev_name(private->vdev.dev),
+			   private->sch->schid.cssid, private->sch->schid.ssid,
 			   private->sch->schid.sch_no);
 
 	ret = vfio_register_group_dev(&private->vdev);
@@ -110,7 +108,6 @@ static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 
 err_init:
 	vfio_uninit_group_dev(&private->vdev);
-	private->mdev = NULL;
 	return ret;
 }
 
@@ -118,14 +115,13 @@ static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
 {
 	struct vfio_ccw_private *private = dev_get_drvdata(mdev->dev.parent);
 
-	VFIO_CCW_MSG_EVENT(2, "mdev %pUl, sch %x.%x.%04x: remove\n",
-			   mdev_uuid(mdev), private->sch->schid.cssid,
-			   private->sch->schid.ssid,
+	VFIO_CCW_MSG_EVENT(2, "mdev %s, sch %x.%x.%04x: remove\n",
+			   dev_name(private->vdev.dev),
+			   private->sch->schid.cssid, private->sch->schid.ssid,
 			   private->sch->schid.sch_no);
 
 	vfio_unregister_group_dev(&private->vdev);
 	vfio_uninit_group_dev(&private->vdev);
-	private->mdev = NULL;
 }
 
 static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index bbc97eb9d9c6fc..67ee9c624393b0 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -72,7 +72,6 @@ struct vfio_ccw_crw {
  * @sch: pointer to the subchannel
  * @state: internal state of the device
  * @completion: synchronization helper of the I/O completion
- * @mdev: pointer to the mediated device
  * @nb: notifier for vfio events
  * @io_region: MMIO region to input/output I/O arguments/results
  * @io_mutex: protect against concurrent update of I/O regions
@@ -95,7 +94,6 @@ struct vfio_ccw_private {
 	struct subchannel	*sch;
 	int			state;
 	struct completion	*completion;
-	struct mdev_device	*mdev;
 	struct notifier_block	nb;
 	struct ccw_io_region	*io_region;
 	struct mutex		io_mutex;
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 819eaf98ffac73..ea59610d45fd89 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -158,10 +158,6 @@ static inline void mdev_set_drvdata(struct mdev_device *mdev, void *data)
 {
 	mdev->driver_data = data;
 }
-static inline const guid_t *mdev_uuid(struct mdev_device *mdev)
-{
-	return &mdev->uuid;
-}
 
 extern struct bus_type mdev_bus_type;
 
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 8/9] vfio: Export vfio_device_try_get()
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2021-09-09 19:38 ` [PATCH v2 7/9] vfio/ccw: Remove private->mdev Jason Gunthorpe
@ 2021-09-09 19:38 ` Jason Gunthorpe
  2021-09-09 19:38 ` [PATCH v2 9/9] vfio/ccw: Move the lifecycle of the struct vfio_ccw_private to the mdev Jason Gunthorpe
  2021-09-13 17:40 ` [PATCH v2 0/9] Move vfio_ccw to the new mdev API Eric Farman
  9 siblings, 0 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

vfio_ccw will need it.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/vfio/vfio.c  | 3 ++-
 include/linux/vfio.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index c3ca33e513c8e9..e78278a0b98a96 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -603,10 +603,11 @@ void vfio_device_put(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_device_put);
 
-static bool vfio_device_try_get(struct vfio_device *device)
+bool vfio_device_try_get(struct vfio_device *device)
 {
 	return refcount_inc_not_zero(&device->refcount);
 }
+EXPORT_SYMBOL_GPL(vfio_device_try_get);
 
 static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
 						 struct device *dev)
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index e65137a708f185..69df8bcc49aaf4 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -80,6 +80,7 @@ void vfio_uninit_group_dev(struct vfio_device *device);
 int vfio_register_group_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);
 extern struct vfio_device *vfio_device_get_from_dev(struct device *dev);
+bool vfio_device_try_get(struct vfio_device *device);
 extern void vfio_device_put(struct vfio_device *device);
 
 int vfio_assign_device_set(struct vfio_device *device, void *set_id);
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 9/9] vfio/ccw: Move the lifecycle of the struct vfio_ccw_private to the mdev
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2021-09-09 19:38 ` [PATCH v2 8/9] vfio: Export vfio_device_try_get() Jason Gunthorpe
@ 2021-09-09 19:38 ` Jason Gunthorpe
  2021-09-13 17:40 ` [PATCH v2 0/9] Move vfio_ccw to the new mdev API Eric Farman
  9 siblings, 0 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-09 19:38 UTC (permalink / raw)
  To: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang
  Cc: Christoph Hellwig

The css_driver's main purpose is to create/destroy the mdev and relay the
shutdown, irq, sch_event, and chp_event css_driver ops to the single
created vfio_device, if it exists.

Reframe the boundary where the css_driver domain switches to the vfio
domain by using rcu to read and refcount the vfio_device out of the sch's
drvdata. The mdev probe/remove will manage the drvdata of the parent.

The vfio core code refcounting thus guarantees that when a css_driver
callback is running the vfio_device is registered, simplifying the
understanding of the whole lifecycle.

Finally the vfio_ccw_private is allocated/freed during probe/remove of the
mdev like any other vfio_device struct.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/s390/cio/vfio_ccw_drv.c     | 67 ++++++++++++++---------------
 drivers/s390/cio/vfio_ccw_ops.c     | 40 +++++++----------
 drivers/s390/cio/vfio_ccw_private.h | 23 +++++++++-
 3 files changed, 69 insertions(+), 61 deletions(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 0e2edd96567a09..b86da53443bfd7 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -86,13 +86,19 @@ static void vfio_ccw_crw_todo(struct work_struct *work)
  */
 static void vfio_ccw_sch_irq(struct subchannel *sch)
 {
-	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
+	struct vfio_ccw_private *private = vfio_ccw_get_priv(sch);
+
+	/* IRQ should not be delivered after the mdev is destroyed */
+	if (WARN_ON(!private))
+		return;
 
 	inc_irq_stat(IRQIO_CIO);
 	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_INTERRUPT);
+	vfio_device_put(&private->vdev);
 }
 
-static struct vfio_ccw_private *vfio_ccw_alloc_private(struct subchannel *sch)
+struct vfio_ccw_private *vfio_ccw_alloc_private(struct mdev_device *mdev,
+						struct subchannel *sch)
 {
 	struct vfio_ccw_private *private;
 
@@ -100,6 +106,8 @@ static struct vfio_ccw_private *vfio_ccw_alloc_private(struct subchannel *sch)
 	if (!private)
 		return ERR_PTR(-ENOMEM);
 
+	vfio_init_group_dev(&private->vdev, &mdev->dev,
+			    &vfio_ccw_dev_ops);
 	private->sch = sch;
 	mutex_init(&private->io_mutex);
 	private->state = VFIO_CCW_STATE_CLOSED;
@@ -145,11 +153,12 @@ static struct vfio_ccw_private *vfio_ccw_alloc_private(struct subchannel *sch)
 	kfree(private->cp.guest_cp);
 out_free_private:
 	mutex_destroy(&private->io_mutex);
+	vfio_uninit_group_dev(&private->vdev);
 	kfree(private);
 	return ERR_PTR(-ENOMEM);
 }
 
-static void vfio_ccw_free_private(struct vfio_ccw_private *private)
+void vfio_ccw_free_private(struct vfio_ccw_private *private)
 {
 	struct vfio_ccw_crw *crw, *temp;
 
@@ -164,14 +173,14 @@ static void vfio_ccw_free_private(struct vfio_ccw_private *private)
 	kmem_cache_free(vfio_ccw_io_region, private->io_region);
 	kfree(private->cp.guest_cp);
 	mutex_destroy(&private->io_mutex);
-	kfree(private);
+	vfio_uninit_group_dev(&private->vdev);
+	kfree_rcu(private, rcu);
 }
 
 static int vfio_ccw_sch_probe(struct subchannel *sch)
 {
 	struct pmcw *pmcw = &sch->schib.pmcw;
-	struct vfio_ccw_private *private;
-	int ret = -ENOMEM;
+	int ret;
 
 	if (pmcw->qf) {
 		dev_warn(&sch->dev, "vfio: ccw: does not support QDIO: %s\n",
@@ -179,15 +188,9 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 		return -ENODEV;
 	}
 
-	private = vfio_ccw_alloc_private(sch);
-	if (IS_ERR(private))
-		return PTR_ERR(private);
-
-	dev_set_drvdata(&sch->dev, private);
-
-	ret = vfio_ccw_mdev_reg(sch);
+	ret = mdev_register_device(&sch->dev, &vfio_ccw_mdev_ops);
 	if (ret)
-		goto out_free;
+		return ret;
 
 	if (dev_get_uevent_suppress(&sch->dev)) {
 		dev_set_uevent_suppress(&sch->dev, 0);
@@ -198,22 +201,11 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 			   sch->schid.cssid, sch->schid.ssid,
 			   sch->schid.sch_no);
 	return 0;
-
-out_free:
-	dev_set_drvdata(&sch->dev, NULL);
-	vfio_ccw_free_private(private);
-	return ret;
 }
 
 static int vfio_ccw_sch_remove(struct subchannel *sch)
 {
-	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
-
-	vfio_ccw_mdev_unreg(sch);
-
-	dev_set_drvdata(&sch->dev, NULL);
-
-	vfio_ccw_free_private(private);
+	mdev_unregister_device(&sch->dev);
 
 	VFIO_CCW_MSG_EVENT(4, "unbound from subchannel %x.%x.%04x\n",
 			   sch->schid.cssid, sch->schid.ssid,
@@ -223,10 +215,14 @@ static int vfio_ccw_sch_remove(struct subchannel *sch)
 
 static void vfio_ccw_sch_shutdown(struct subchannel *sch)
 {
-	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
+	struct vfio_ccw_private *private = vfio_ccw_get_priv(sch);
+
+	if (!private)
+		return;
 
 	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_CLOSE);
 	vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_BROKEN);
+	vfio_device_put(&private->vdev);
 }
 
 /**
@@ -241,14 +237,14 @@ static void vfio_ccw_sch_shutdown(struct subchannel *sch)
  */
 static int vfio_ccw_sch_event(struct subchannel *sch, int process)
 {
-	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
+	struct vfio_ccw_private *private = vfio_ccw_get_priv(sch);
 	unsigned long flags;
 	int rc = -EAGAIN;
 
-	spin_lock_irqsave(sch->lock, flags);
-	if (!device_is_registered(&sch->dev))
-		goto out_unlock;
+	if (!private)
+		return -EAGAIN;
 
+	spin_lock_irqsave(sch->lock, flags);
 	if (work_pending(&sch->todo_work))
 		goto out_unlock;
 
@@ -261,7 +257,7 @@ static int vfio_ccw_sch_event(struct subchannel *sch, int process)
 
 out_unlock:
 	spin_unlock_irqrestore(sch->lock, flags);
-
+	vfio_device_put(&private->vdev);
 	return rc;
 }
 
@@ -295,7 +291,7 @@ static void vfio_ccw_queue_crw(struct vfio_ccw_private *private,
 static int vfio_ccw_chp_event(struct subchannel *sch,
 			      struct chp_link *link, int event)
 {
-	struct vfio_ccw_private *private = dev_get_drvdata(&sch->dev);
+	struct vfio_ccw_private *private = vfio_ccw_get_priv(sch);
 	int mask = chp_ssd_get_mask(&sch->ssd_info, link);
 	int retry = 255;
 
@@ -308,8 +304,10 @@ static int vfio_ccw_chp_event(struct subchannel *sch,
 			   sch->schid.ssid, sch->schid.sch_no,
 			   mask, event);
 
-	if (cio_update_schib(sch))
+	if (cio_update_schib(sch)) {
+		vfio_device_put(&private->vdev);
 		return -ENODEV;
+	}
 
 	switch (event) {
 	case CHP_VARY_OFF:
@@ -339,6 +337,7 @@ static int vfio_ccw_chp_event(struct subchannel *sch,
 		break;
 	}
 
+	vfio_device_put(&private->vdev);
 	return 0;
 }
 
diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
index 23004e67c492f6..04a10f37d64225 100644
--- a/drivers/s390/cio/vfio_ccw_ops.c
+++ b/drivers/s390/cio/vfio_ccw_ops.c
@@ -17,8 +17,6 @@
 
 #include "vfio_ccw_private.h"
 
-static const struct vfio_device_ops vfio_ccw_dev_ops;
-
 static int vfio_ccw_mdev_reset(struct vfio_ccw_private *private)
 {
 	/*
@@ -88,26 +86,27 @@ static struct attribute_group *mdev_type_groups[] = {
 
 static int vfio_ccw_mdev_probe(struct mdev_device *mdev)
 {
-	struct vfio_ccw_private *private = dev_get_drvdata(mdev->dev.parent);
+	struct subchannel *sch = to_subchannel(mdev->dev.parent);
+	struct vfio_ccw_private *private;
 	int ret;
 
-	memset(&private->vdev, 0, sizeof(private->vdev));
-	vfio_init_group_dev(&private->vdev, &mdev->dev,
-			    &vfio_ccw_dev_ops);
+	private = vfio_ccw_alloc_private(mdev, sch);
+	if (IS_ERR(private))
+		return PTR_ERR(private);
 
 	VFIO_CCW_MSG_EVENT(2, "mdev %s, sch %x.%x.%04x: create\n",
-			   dev_name(private->vdev.dev),
-			   private->sch->schid.cssid, private->sch->schid.ssid,
-			   private->sch->schid.sch_no);
+			   dev_name(private->vdev.dev), sch->schid.cssid,
+			   sch->schid.ssid, sch->schid.sch_no);
 
 	ret = vfio_register_group_dev(&private->vdev);
 	if (ret)
-		goto err_init;
+		goto err_alloc;
 	dev_set_drvdata(&mdev->dev, private);
+	dev_set_drvdata(&sch->dev, private);
 	return 0;
 
-err_init:
-	vfio_uninit_group_dev(&private->vdev);
+err_alloc:
+	vfio_ccw_free_private(private);
 	return ret;
 }
 
@@ -120,8 +119,9 @@ static void vfio_ccw_mdev_remove(struct mdev_device *mdev)
 			   private->sch->schid.cssid, private->sch->schid.ssid,
 			   private->sch->schid.sch_no);
 
+	dev_set_drvdata(&private->sch->dev, NULL);
 	vfio_unregister_group_dev(&private->vdev);
-	vfio_uninit_group_dev(&private->vdev);
+	vfio_ccw_free_private(private);
 }
 
 static int vfio_ccw_mdev_open_device(struct vfio_device *vdev)
@@ -595,7 +595,7 @@ static unsigned int vfio_ccw_get_available(struct mdev_type *mtype)
 	return 1;
 }
 
-static const struct vfio_device_ops vfio_ccw_dev_ops = {
+const struct vfio_device_ops vfio_ccw_dev_ops = {
 	.open_device = vfio_ccw_mdev_open_device,
 	.close_device = vfio_ccw_mdev_close_device,
 	.read = vfio_ccw_mdev_read,
@@ -615,19 +615,9 @@ struct mdev_driver vfio_ccw_mdev_driver = {
 	.get_available = vfio_ccw_get_available,
 };
 
-static const struct mdev_parent_ops vfio_ccw_mdev_ops = {
+const struct mdev_parent_ops vfio_ccw_mdev_ops = {
 	.owner			= THIS_MODULE,
 	.device_driver		= &vfio_ccw_mdev_driver,
 	.device_api		= VFIO_DEVICE_API_CCW_STRING,
 	.supported_type_groups  = mdev_type_groups,
 };
-
-int vfio_ccw_mdev_reg(struct subchannel *sch)
-{
-	return mdev_register_device(&sch->dev, &vfio_ccw_mdev_ops);
-}
-
-void vfio_ccw_mdev_unreg(struct subchannel *sch)
-{
-	mdev_unregister_device(&sch->dev);
-}
diff --git a/drivers/s390/cio/vfio_ccw_private.h b/drivers/s390/cio/vfio_ccw_private.h
index 67ee9c624393b0..852ff94fc107d6 100644
--- a/drivers/s390/cio/vfio_ccw_private.h
+++ b/drivers/s390/cio/vfio_ccw_private.h
@@ -24,6 +24,8 @@
 #include "css.h"
 #include "vfio_ccw_cp.h"
 
+struct mdev_device;
+
 #define VFIO_CCW_OFFSET_SHIFT   10
 #define VFIO_CCW_OFFSET_TO_INDEX(off)	(off >> VFIO_CCW_OFFSET_SHIFT)
 #define VFIO_CCW_INDEX_TO_OFFSET(index)	((u64)(index) << VFIO_CCW_OFFSET_SHIFT)
@@ -69,6 +71,7 @@ struct vfio_ccw_crw {
 /**
  * struct vfio_ccw_private
  * @vdev: Embedded VFIO device
+ * @rcu: head for kfree_rcu()
  * @sch: pointer to the subchannel
  * @state: internal state of the device
  * @completion: synchronization helper of the I/O completion
@@ -91,6 +94,7 @@ struct vfio_ccw_crw {
  */
 struct vfio_ccw_private {
 	struct vfio_device vdev;
+	struct rcu_head rcu;
 	struct subchannel	*sch;
 	int			state;
 	struct completion	*completion;
@@ -115,10 +119,25 @@ struct vfio_ccw_private {
 	struct work_struct	crw_work;
 } __aligned(8);
 
-extern int vfio_ccw_mdev_reg(struct subchannel *sch);
-extern void vfio_ccw_mdev_unreg(struct subchannel *sch);
+struct vfio_ccw_private *vfio_ccw_alloc_private(struct mdev_device *mdev,
+						struct subchannel *sch);
+void vfio_ccw_free_private(struct vfio_ccw_private *private);
 
 extern struct mdev_driver vfio_ccw_mdev_driver;
+extern const struct mdev_parent_ops vfio_ccw_mdev_ops;
+extern const struct vfio_device_ops vfio_ccw_dev_ops;
+
+static inline struct vfio_ccw_private *vfio_ccw_get_priv(struct subchannel *sch)
+{
+	struct vfio_ccw_private *private;
+
+	rcu_read_lock();
+	private = dev_get_drvdata(&sch->dev);
+	if (private && !vfio_device_try_get(&private->vdev))
+		private = NULL;
+	rcu_read_unlock();
+	return private;
+}
 
 /*
  * States of the device statemachine.
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private
  2021-09-09 19:38 ` [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private Jason Gunthorpe
@ 2021-09-10 11:27   ` Christoph Hellwig
  2021-09-14 15:50     ` Cornelia Huck
  2021-09-24  2:53   ` Eric Farman
  1 sibling, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2021-09-10 11:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang, Christoph Hellwig

On Thu, Sep 09, 2021 at 04:38:41PM -0300, Jason Gunthorpe wrote:
> +
> +	private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
> +	if (!private)
> +		return ERR_PTR(-ENOMEM);

Nit: there is no need to add GFP_KERNEL when using GFP_DMA.

Also a question to the s390 maintainers: why do we need 31-bit
addressability for the main private data structure?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions
  2021-09-09 19:38 ` [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions Jason Gunthorpe
@ 2021-09-10 11:32   ` Christoph Hellwig
  2021-09-20 11:12   ` Cornelia Huck
  2021-09-24  2:53   ` Eric Farman
  2 siblings, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2021-09-10 11:32 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang, Christoph Hellwig

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code
  2021-09-09 19:38 ` [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code Jason Gunthorpe
@ 2021-09-10 12:10   ` Christoph Hellwig
  2021-09-10 13:38     ` Jason Gunthorpe
  0 siblings, 1 reply; 38+ messages in thread
From: Christoph Hellwig @ 2021-09-10 12:10 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang, Christoph Hellwig

On Thu, Sep 09, 2021 at 04:38:45PM -0300, Jason Gunthorpe wrote:
> Every driver just emits a static string, simply feed it through the ops
> and provide a standard sysfs show function.

Looks sensible.  But can you make the attribute optional and add a
comment marking it deprecated?  Because it really is completely useless.
We don't version userspace APIs, userspae has to discover new features
individually by e.g. finding new sysfs files or just trying new ioctls.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core
  2021-09-09 19:38 ` [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core Jason Gunthorpe
@ 2021-09-10 12:25   ` Christoph Hellwig
  2021-09-20 18:02   ` Cornelia Huck
  1 sibling, 0 replies; 38+ messages in thread
From: Christoph Hellwig @ 2021-09-10 12:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang, Christoph Hellwig

On Thu, Sep 09, 2021 at 04:38:46PM -0300, Jason Gunthorpe wrote:
> Many of the mdev drivers use a simple counter for keeping track of the
> available instances. Move this code to the core code and store the counter
> in the mdev_type. Implement it using correct locking, fixing mdpy.
> 
> Drivers provide a get_available() callback to set the number of available
> instances for their mtypes which is fixed at registration time. The core
> provides a standard sysfs attribute to return the available_instances.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code
  2021-09-10 12:10   ` Christoph Hellwig
@ 2021-09-10 13:38     ` Jason Gunthorpe
  2021-09-10 16:09       ` Alex Williamson
  0 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-10 13:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang, Christoph Hellwig

On Fri, Sep 10, 2021 at 01:10:46PM +0100, Christoph Hellwig wrote:
> On Thu, Sep 09, 2021 at 04:38:45PM -0300, Jason Gunthorpe wrote:
> > Every driver just emits a static string, simply feed it through the ops
> > and provide a standard sysfs show function.
> 
> Looks sensible.  But can you make the attribute optional and add a
> comment marking it deprecated?  Because it really is completely useless.
> We don't version userspace APIs, userspae has to discover new features
> individually by e.g. finding new sysfs files or just trying new ioctls.

To be honest I have no idea what side effects that would have..

device code search tells me libvirt reads it and stuffs it into some
XML

Something called mdevctl touches it, feeds it into some JSON and
other stuff..

qemu has some VFIO_DEVICE_API_* constants but it is all dead code

I agree it shouldn't have been there in the first place

Cornelia? Alex? Any thoughts?

Jason

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code
  2021-09-10 13:38     ` Jason Gunthorpe
@ 2021-09-10 16:09       ` Alex Williamson
  0 siblings, 0 replies; 38+ messages in thread
From: Alex Williamson @ 2021-09-10 16:09 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Christoph Hellwig, David Airlie, Tony Krowiak,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Eric Farman, Harald Freudenberger, Vasily Gorbik, Heiko Carstens,
	intel-gfx, intel-gvt-dev, Jani Nikula, Jason Herne,
	Joonas Lahtinen, kvm, Kirti Wankhede, linux-s390, Matthew Rosato,
	Peter Oberparleiter, Halil Pasic, Rodrigo Vivi, Vineeth Vijayan,
	Zhenyu Wang, Zhi Wang, Christoph Hellwig

On Fri, 10 Sep 2021 10:38:50 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Fri, Sep 10, 2021 at 01:10:46PM +0100, Christoph Hellwig wrote:
> > On Thu, Sep 09, 2021 at 04:38:45PM -0300, Jason Gunthorpe wrote:  
> > > Every driver just emits a static string, simply feed it through the ops
> > > and provide a standard sysfs show function.  
> > 
> > Looks sensible.  But can you make the attribute optional and add a
> > comment marking it deprecated?  Because it really is completely useless.
> > We don't version userspace APIs, userspae has to discover new features
> > individually by e.g. finding new sysfs files or just trying new ioctls.  
> 
> To be honest I have no idea what side effects that would have..
> 
> device code search tells me libvirt reads it and stuffs it into some
> XML
> 
> Something called mdevctl touches it, feeds it into some JSON and
> other stuff..
> 
> qemu has some VFIO_DEVICE_API_* constants but it is all dead code
> 
> I agree it shouldn't have been there in the first place
> 
> Cornelia? Alex? Any thoughts?

It's not a version, it's a means for userspace to determine the basic
API for an mdev device without needing to go through the process of
creating a container, adding the group, setting an IOMMU type, opening
the device before being able to call VFIO_DEVICE_GET_INFO to determine
the API.  For example, it wouldn't make sense for libvirt to attach a
vfio-ccw device to a PCIe root port in a VM.  It's a means to say this
mdev device is a vfio-pci or that mdev device is a vfio-ccw.  If it were
optional, then management tools would have no basic idea how to attach
the device to a VM without gaining access to the device themselves.
Thanks,

Alex


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 0/9] Move vfio_ccw to the new mdev API
  2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
                   ` (8 preceding siblings ...)
  2021-09-09 19:38 ` [PATCH v2 9/9] vfio/ccw: Move the lifecycle of the struct vfio_ccw_private to the mdev Jason Gunthorpe
@ 2021-09-13 17:40 ` Eric Farman
  2021-09-13 19:24   ` Jason Gunthorpe
  9 siblings, 1 reply; 38+ messages in thread
From: Eric Farman @ 2021-09-13 17:40 UTC (permalink / raw)
  To: Jason Gunthorpe, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang
  Cc: Christoph Hellwig

On Thu, 2021-09-09 at 16:38 -0300, Jason Gunthorpe wrote:
> This addresses Cornelia's remark on the earlier patch that ccw has a
> confusing lifecycle. While it doesn't seem like the original attempt
> was
> functionally wrong, the result can be made better with a lot of
> further
> work.

I thought I'd take a stab at seeing how this works with the hardware
before looking at the code much. git couldn't apply patches 1, 6, or 9
to 5.15-rc1, but I was able to hand-fit them into place. Shutting down
the guest and de-configuring a device ends up bringing my whole system
down. I haven't looked at this any further; hopefully something jumps
to mind for you:

[   64.585347] vfio_ccw 0.0.08fe: MDEV: Unregistering
[   64.585357] illegal operation: 0001 ilc:1 [#1] SMP 
[   64.585362] Modules linked in: vhost_vsock
vmw_vsock_virtio_transport_common vsock vhost
[   64.585364] vfio_ccw_mdev b50bbd4b-eab8-4f8c-9f0c-3cf636f936b9:
Relaying device request to user (#0)
[   64.585364]  vhost_iotlb lcs ctcm fsm kvm xt_MASQUERADE xt_conntrack
ipt_REJECT nf_reject_ipv4 xt_tcpudp iptable_mangle iptable_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bridge stp
llc dm_multipath dm_mod s390_trng eadm_sch zcrypt_cex4 qeth_l2 vfio_ccw
mdev vfio_iommu_type1 vfio configfs zram zsmalloc ip_tables x_tables
mlx5_core ghash_s390 prng aes_s390 des_s390 libdes sha3_512_s390
sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common pkey zcrypt
rng_core autofs4
[   64.585392] CPU: 14 PID: 4487 Comm: qemu-system-s39 Kdump: loaded
Not tainted 5.15.0-rc1 #1
[   64.585395] Hardware name: IBM 3906 M05 780 (LPAR)
[   64.585396] Krnl PSW : 0704c00180000000 0000000000000002 (0x2)
[   64.585404]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0
PM:0 RI:0 EA:3
[   64.585407] Krnl GPRS: 0000000000000001 0000000000000000
00000000005f4800 0000000000000004
[   64.585410]            0000000000000000 0000000000000002
0000000000000000 000002aa3e65085e
[   64.585412]            000000008de09100 0000000000003b6f
000003ff8017fa08 00000000005f4800
[   64.585413]            0000000081450000 000003ff7c032310
000003ff80179db0 000003800bf53da0
[   64.585418] Krnl Code:#0000000000000000:
0000                illegal 
          >0000000000000002: 0000               illegal 
           0000000000000004: 0000               illegal 
           0000000000000006: 0000               illegal 
           0000000000000008: 0000               illegal 
           000000000000000a: 0000               illegal 
           000000000000000c: 0000               illegal 
           000000000000000e: 0000               illegal 
[   64.585462] Call Trace:
[   64.585464]  [<0000000000000002>] 0x2 
[   64.585467] ([<000003ff80179d74>] vfio_ccw_mdev_ioctl+0x84/0x318
[vfio_ccw])
[   64.585476]  [<00000000bb7adda6>] __s390x_sys_ioctl+0xbe/0x100 
[   64.585481]  [<00000000bbfbf5e4>] __do_syscall+0x1bc/0x1e8 
[   64.585488]  [<00000000bbfcc8d8>] system_call+0x78/0xa0 

Eric

> 
> Reorganize the driver so that the mdev owns the private memory and
> controls the lifecycle, not the css_driver. The memory associated
> with the
> css_driver lifecycle is only the mdev_parent/mdev_type registration.
> 
> Along the way we change when the sch is quiescent or not to be linked
> to
> the open/close_device lifetime of the vfio_device, which is sort of
> what
> it was tring to do already, just not completely.
> 
> The troublesome racey lifecycle of the css_driver callbacks is made
> clear
> with simple vfio_device refcounting so a callback is only delivered
> into a
> registered vfio_device and has obvious correctness.
> 
> Move the only per-css_driver state, the "available instance" counter,
> into
> the core code and share that logic with many of the other drivers.
> The
> value is kept in the mdev_type memory.
> 
> v2:
>  - Clean up the lifecycle in ccw with 7 new patches
>  - Rebase
> v1: 
> https://lore.kernel.org/all/7-v2-7667f42c9bad+935-vfio3_jgg@nvidia.com
> 
> Jason Gunthorpe (9):
>   vfio/ccw: Use functions for alloc/free of the vfio_ccw_private
>   vfio/ccw: Pass vfio_ccw_private not mdev_device to various
> functions
>   vfio/ccw: Convert to use vfio_register_group_dev()
>   vfio/ccw: Make the FSM complete and synchronize it to the mdev
>   vfio/mdev: Consolidate all the device_api sysfs into the core code
>   vfio/mdev: Add mdev available instance checking to the core
>   vfio/ccw: Remove private->mdev
>   vfio: Export vfio_device_try_get()
>   vfio/ccw: Move the lifecycle of the struct vfio_ccw_private to the
>     mdev
> 
>  drivers/gpu/drm/i915/gvt/kvmgt.c      |   9 +-
>  drivers/s390/cio/vfio_ccw_drv.c       | 282 +++++++++++-------------
> --
>  drivers/s390/cio/vfio_ccw_fsm.c       | 152 ++++++++++----
>  drivers/s390/cio/vfio_ccw_ops.c       | 240 ++++++++++------------
>  drivers/s390/cio/vfio_ccw_private.h   |  42 +++-
>  drivers/s390/crypto/vfio_ap_ops.c     |  41 +---
>  drivers/s390/crypto/vfio_ap_private.h |   2 -
>  drivers/vfio/mdev/mdev_core.c         |  13 +-
>  drivers/vfio/mdev/mdev_private.h      |   2 +
>  drivers/vfio/mdev/mdev_sysfs.c        |  64 +++++-
>  drivers/vfio/vfio.c                   |   3 +-
>  include/linux/mdev.h                  |  13 +-
>  include/linux/vfio.h                  |   1 +
>  samples/vfio-mdev/mbochs.c            |   9 +-
>  samples/vfio-mdev/mdpy.c              |  31 +--
>  samples/vfio-mdev/mtty.c              |  10 +-
>  16 files changed, 470 insertions(+), 444 deletions(-)
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 0/9] Move vfio_ccw to the new mdev API
  2021-09-13 17:40 ` [PATCH v2 0/9] Move vfio_ccw to the new mdev API Eric Farman
@ 2021-09-13 19:24   ` Jason Gunthorpe
  2021-09-13 20:31     ` Eric Farman
  0 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-13 19:24 UTC (permalink / raw)
  To: Eric Farman
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Mon, Sep 13, 2021 at 01:40:34PM -0400, Eric Farman wrote:
> On Thu, 2021-09-09 at 16:38 -0300, Jason Gunthorpe wrote:
> > This addresses Cornelia's remark on the earlier patch that ccw has a
> > confusing lifecycle. While it doesn't seem like the original attempt
> > was
> > functionally wrong, the result can be made better with a lot of
> > further
> > work.
> 
> I thought I'd take a stab at seeing how this works with the hardware
> before looking at the code much. git couldn't apply patches 1, 6, or 9
> to 5.15-rc1, but I was able to hand-fit them into place. 

Oh? Thats odd, I had no remarks from git when rebasing onto
v5.15-rc1..

Maybe this is a situation where you need "b4 am --prep-3way" ...

> [   64.585462] Call Trace:
> [   64.585464]  [<0000000000000002>] 0x2 
> [   64.585467] ([<000003ff80179d74>] vfio_ccw_mdev_ioctl+0x84/0x318
> [vfio_ccw])
> [   64.585476]  [<00000000bb7adda6>] __s390x_sys_ioctl+0xbe/0x100 
> [   64.585481]  [<00000000bbfbf5e4>] __do_syscall+0x1bc/0x1e8 
> [   64.585488]  [<00000000bbfcc8d8>] system_call+0x78/0xa0 

I think it is this:

diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index df1490943b20ec..5ea392959c0711 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -441,6 +441,7 @@ fsm_func_t *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
 		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
 		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
 		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_disabled_irq,
+		[VFIO_CCW_EVENT_OPEN]		= fsm_nop,
 		[VFIO_CCW_EVENT_CLOSE]		= fsm_nop,
 	},
 	[VFIO_CCW_STATE_CLOSED] = {

I rebased it and fixed it up here:

https://github.com/jgunthorpe/linux/tree/vfio_ccw

Can you try again?

Thanks,
Jason

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 0/9] Move vfio_ccw to the new mdev API
  2021-09-13 19:24   ` Jason Gunthorpe
@ 2021-09-13 20:31     ` Eric Farman
  2021-09-14 13:36       ` Jason Gunthorpe
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Farman @ 2021-09-13 20:31 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Mon, 2021-09-13 at 16:24 -0300, Jason Gunthorpe wrote:
> On Mon, Sep 13, 2021 at 01:40:34PM -0400, Eric Farman wrote:
> > On Thu, 2021-09-09 at 16:38 -0300, Jason Gunthorpe wrote:
> > > This addresses Cornelia's remark on the earlier patch that ccw
> > > has a
> > > confusing lifecycle. While it doesn't seem like the original
> > > attempt
> > > was
> > > functionally wrong, the result can be made better with a lot of
> > > further
> > > work.
> > 
> > I thought I'd take a stab at seeing how this works with the
> > hardware
> > before looking at the code much. git couldn't apply patches 1, 6,
> > or 9
> > to 5.15-rc1, but I was able to hand-fit them into place. 
> 
> Oh? Thats odd, I had no remarks from git when rebasing onto
> v5.15-rc1..
> 
> Maybe this is a situation where you need "b4 am --prep-3way" ...

Ah, that does indeed help, thanks. Still missing the vfio-ap patch
that's elsewhere on the list, but I'm not caring about that at the
moment.

> 
> > [   64.585462] Call Trace:
> > [   64.585464]  [<0000000000000002>] 0x2 
> > [   64.585467] ([<000003ff80179d74>] vfio_ccw_mdev_ioctl+0x84/0x318
> > [vfio_ccw])
> > [   64.585476]  [<00000000bb7adda6>] __s390x_sys_ioctl+0xbe/0x100 
> > [   64.585481]  [<00000000bbfbf5e4>] __do_syscall+0x1bc/0x1e8 
> > [   64.585488]  [<00000000bbfcc8d8>] system_call+0x78/0xa0 
> 
> I think it is this:
> 
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c
> b/drivers/s390/cio/vfio_ccw_fsm.c
> index df1490943b20ec..5ea392959c0711 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -441,6 +441,7 @@ fsm_func_t
> *vfio_ccw_jumptable[NR_VFIO_CCW_STATES][NR_VFIO_CCW_EVENTS] = {
>  		[VFIO_CCW_EVENT_IO_REQ]		= fsm_io_error,
>  		[VFIO_CCW_EVENT_ASYNC_REQ]	= fsm_async_error,
>  		[VFIO_CCW_EVENT_INTERRUPT]	= fsm_disabled_irq,
> +		[VFIO_CCW_EVENT_OPEN]		= fsm_nop,
>  		[VFIO_CCW_EVENT_CLOSE]		= fsm_nop,
>  	},
>  	[VFIO_CCW_STATE_CLOSED] = {
> 
> I rebased it and fixed it up here:
> 
> https://github.com/jgunthorpe/linux/tree/vfio_ccw
> 
> Can you try again?

That does address the crash, but then why is it processing a BROKEN
event? Seems problematic. All the configuration works fine, but the
devices get ripped away once a guest is started that wants to open/use
them.

So, there's more problems to figure out.

Eric

> 
> Thanks,
> Jason


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 0/9] Move vfio_ccw to the new mdev API
  2021-09-13 20:31     ` Eric Farman
@ 2021-09-14 13:36       ` Jason Gunthorpe
  2021-09-17 11:59         ` Cornelia Huck
  0 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-14 13:36 UTC (permalink / raw)
  To: Eric Farman
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Mon, Sep 13, 2021 at 04:31:54PM -0400, Eric Farman wrote:
> > I rebased it and fixed it up here:
> > 
> > https://github.com/jgunthorpe/linux/tree/vfio_ccw
> > 
> > Can you try again?
> 
> That does address the crash, but then why is it processing a BROKEN
> event? Seems problematic. 

The stuff related to the NOT_OPER looked really wonky to me. I'm
guessing this is the issue - not sure about the pmcw.ena either..

diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
index 5ea392959c0711..0d4d4f425befac 100644
--- a/drivers/s390/cio/vfio_ccw_fsm.c
+++ b/drivers/s390/cio/vfio_ccw_fsm.c
@@ -380,29 +380,19 @@ static void fsm_open(struct vfio_ccw_private *private,
 	spin_unlock_irq(sch->lock);
 }
 
-static void fsm_close(struct vfio_ccw_private *private,
-		      enum vfio_ccw_event event)
+static int flush_sch(struct vfio_ccw_private *private)
 {
 	struct subchannel *sch = private->sch;
 	DECLARE_COMPLETION_ONSTACK(completion);
 	int iretry, ret = 0;
 
-	spin_lock_irq(sch->lock);
-	if (!sch->schib.pmcw.ena)
-		goto err_unlock;
-	ret = cio_disable_subchannel(sch);
-	if (ret != -EBUSY)
-		goto err_unlock;
-
 	iretry = 255;
 	do {
-
 		ret = cio_cancel_halt_clear(sch, &iretry);
-
 		if (ret == -EIO) {
 			pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n",
 			       sch->schid.ssid, sch->schid.sch_no);
-			break;
+			return ret;
 		}
 
 		/*
@@ -413,13 +403,28 @@ static void fsm_close(struct vfio_ccw_private *private,
 		spin_unlock_irq(sch->lock);
 
 		if (ret == -EBUSY)
-			wait_for_completion_timeout(&completion, 3*HZ);
+			wait_for_completion_timeout(&completion, 3 * HZ);
 
 		private->completion = NULL;
 		flush_workqueue(vfio_ccw_work_q);
 		spin_lock_irq(sch->lock);
 		ret = cio_disable_subchannel(sch);
 	} while (ret == -EBUSY);
+	return ret;
+}
+
+static void fsm_close(struct vfio_ccw_private *private,
+		      enum vfio_ccw_event event)
+{
+	struct subchannel *sch = private->sch;
+	int ret;
+
+	spin_lock_irq(sch->lock);
+	if (!sch->schib.pmcw.ena)
+		goto err_unlock;
+	ret = cio_disable_subchannel(sch);
+	if (ret == -EBUSY)
+		ret = flush_sch(private);
 	if (ret)
 		goto err_unlock;
 	private->state = VFIO_CCW_STATE_CLOSED;

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private
  2021-09-10 11:27   ` Christoph Hellwig
@ 2021-09-14 15:50     ` Cornelia Huck
  2021-09-14 18:03       ` Jason Gunthorpe
  0 siblings, 1 reply; 38+ messages in thread
From: Cornelia Huck @ 2021-09-14 15:50 UTC (permalink / raw)
  To: Christoph Hellwig, Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Fri, Sep 10 2021, Christoph Hellwig <hch@infradead.org> wrote:

> On Thu, Sep 09, 2021 at 04:38:41PM -0300, Jason Gunthorpe wrote:
>> +
>> +	private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
>> +	if (!private)
>> +		return ERR_PTR(-ENOMEM);
>
> Nit: there is no need to add GFP_KERNEL when using GFP_DMA.
>
> Also a question to the s390 maintainers: why do we need 31-bit
> addressability for the main private data structure?

I don't think we need it anymore since c98e16b2fa12 ("s390/cio: Convert
ccw_io_region to pointer") and probably should just drop the GFP_DMA.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private
  2021-09-14 15:50     ` Cornelia Huck
@ 2021-09-14 18:03       ` Jason Gunthorpe
  0 siblings, 0 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-14 18:03 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Christoph Hellwig, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Tue, Sep 14, 2021 at 05:50:25PM +0200, Cornelia Huck wrote:
> On Fri, Sep 10 2021, Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Thu, Sep 09, 2021 at 04:38:41PM -0300, Jason Gunthorpe wrote:
> >> +
> >> +	private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
> >> +	if (!private)
> >> +		return ERR_PTR(-ENOMEM);
> >
> > Nit: there is no need to add GFP_KERNEL when using GFP_DMA.
> >
> > Also a question to the s390 maintainers: why do we need 31-bit
> > addressability for the main private data structure?
> 
> I don't think we need it anymore since c98e16b2fa12 ("s390/cio: Convert
> ccw_io_region to pointer") and probably should just drop the GFP_DMA.

I added this to the series:

From 0d40f9c57430400a81aa60920b70761535967048 Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg@nvidia.com>
Date: Tue, 14 Sep 2021 14:21:49 -0300
Subject: [PATCH] vfio/ccw: Remove unneeded GFP_DMA

Since the ccw_io_region was split out of the private the allocation no
longer needs the GFP_DMA. Remove it.

Reported-by: Christoph Hellwig <hch@infradead.org>
Fixes: c98e16b2fa12 ("s390/cio: Convert ccw_io_region to pointer")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/s390/cio/vfio_ccw_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
index 76099bcb765b45..371558ec92045d 100644
--- a/drivers/s390/cio/vfio_ccw_drv.c
+++ b/drivers/s390/cio/vfio_ccw_drv.c
@@ -161,7 +161,7 @@ static int vfio_ccw_sch_probe(struct subchannel *sch)
 		return -ENODEV;
 	}
 
-	private = kzalloc(sizeof(*private), GFP_KERNEL | GFP_DMA);
+	private = kzalloc(sizeof(*private), GFP_KERNEL);
 	if (!private)
 		return -ENOMEM;
 
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 0/9] Move vfio_ccw to the new mdev API
  2021-09-14 13:36       ` Jason Gunthorpe
@ 2021-09-17 11:59         ` Cornelia Huck
  2021-09-17 12:51           ` Jason Gunthorpe
  0 siblings, 1 reply; 38+ messages in thread
From: Cornelia Huck @ 2021-09-17 11:59 UTC (permalink / raw)
  To: Jason Gunthorpe, Eric Farman
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Tue, Sep 14 2021, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Mon, Sep 13, 2021 at 04:31:54PM -0400, Eric Farman wrote:
>> > I rebased it and fixed it up here:
>> > 
>> > https://github.com/jgunthorpe/linux/tree/vfio_ccw
>> > 
>> > Can you try again?
>> 
>> That does address the crash, but then why is it processing a BROKEN
>> event? Seems problematic. 
>
> The stuff related to the NOT_OPER looked really wonky to me. I'm
> guessing this is the issue - not sure about the pmcw.ena either..

[I have still not been able to digest the whole series, sorry.]

>
> diff --git a/drivers/s390/cio/vfio_ccw_fsm.c b/drivers/s390/cio/vfio_ccw_fsm.c
> index 5ea392959c0711..0d4d4f425befac 100644
> --- a/drivers/s390/cio/vfio_ccw_fsm.c
> +++ b/drivers/s390/cio/vfio_ccw_fsm.c
> @@ -380,29 +380,19 @@ static void fsm_open(struct vfio_ccw_private *private,
>  	spin_unlock_irq(sch->lock);
>  }
>  
> -static void fsm_close(struct vfio_ccw_private *private,
> -		      enum vfio_ccw_event event)
> +static int flush_sch(struct vfio_ccw_private *private)
>  {
>  	struct subchannel *sch = private->sch;
>  	DECLARE_COMPLETION_ONSTACK(completion);
>  	int iretry, ret = 0;
>  
> -	spin_lock_irq(sch->lock);
> -	if (!sch->schib.pmcw.ena)
> -		goto err_unlock;
> -	ret = cio_disable_subchannel(sch);
> -	if (ret != -EBUSY)
> -		goto err_unlock;
> -
>  	iretry = 255;
>  	do {
> -
>  		ret = cio_cancel_halt_clear(sch, &iretry);
> -
>  		if (ret == -EIO) {
>  			pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n",
>  			       sch->schid.ssid, sch->schid.sch_no);
> -			break;
> +			return ret;

Looking at this, I wonder why we had special-cased -EIO -- for -ENODEV
we should be done as well, as then the device is dead and we do not need
to disable it.

>  		}
>  
>  		/*
> @@ -413,13 +403,28 @@ static void fsm_close(struct vfio_ccw_private *private,
>  		spin_unlock_irq(sch->lock);
>  
>  		if (ret == -EBUSY)
> -			wait_for_completion_timeout(&completion, 3*HZ);
> +			wait_for_completion_timeout(&completion, 3 * HZ);
>  
>  		private->completion = NULL;
>  		flush_workqueue(vfio_ccw_work_q);
>  		spin_lock_irq(sch->lock);
>  		ret = cio_disable_subchannel(sch);
>  	} while (ret == -EBUSY);
> +	return ret;
> +}
> +
> +static void fsm_close(struct vfio_ccw_private *private,
> +		      enum vfio_ccw_event event)
> +{
> +	struct subchannel *sch = private->sch;
> +	int ret;
> +
> +	spin_lock_irq(sch->lock);
> +	if (!sch->schib.pmcw.ena)
> +		goto err_unlock;
> +	ret = cio_disable_subchannel(sch);

cio_disable_subchannel() should be happy to disable an already disabled
subchannel, so I guess we can just walk through this and end up in
CLOSED state... unless entering with !ena actually indicates that we
messed up somewhere else in the state machine. I still need to find time
to read the patches.

> +	if (ret == -EBUSY)
> +		ret = flush_sch(private);
>  	if (ret)
>  		goto err_unlock;
>  	private->state = VFIO_CCW_STATE_CLOSED;


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 0/9] Move vfio_ccw to the new mdev API
  2021-09-17 11:59         ` Cornelia Huck
@ 2021-09-17 12:51           ` Jason Gunthorpe
  2021-09-17 14:37             ` Cornelia Huck
  0 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-17 12:51 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Eric Farman, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Fri, Sep 17, 2021 at 01:59:16PM +0200, Cornelia Huck wrote:
> >  		ret = cio_cancel_halt_clear(sch, &iretry);
> > -
> >  		if (ret == -EIO) {
> >  			pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n",
> >  			       sch->schid.ssid, sch->schid.sch_no);
> > -			break;
> > +			return ret;
> 
> Looking at this, I wonder why we had special-cased -EIO -- for -ENODEV
> we should be done as well, as then the device is dead and we do not need
> to disable it.

cio_cancel_halt_clear() should probably succeed in that case.

> > @@ -413,13 +403,28 @@ static void fsm_close(struct vfio_ccw_private *private,
> >  		spin_unlock_irq(sch->lock);
> >  
> >  		if (ret == -EBUSY)
> > -			wait_for_completion_timeout(&completion, 3*HZ);
> > +			wait_for_completion_timeout(&completion, 3 * HZ);
> >  
> >  		private->completion = NULL;
> >  		flush_workqueue(vfio_ccw_work_q);
> >  		spin_lock_irq(sch->lock);
> >  		ret = cio_disable_subchannel(sch);
> >  	} while (ret == -EBUSY);
> > +	return ret;
> > +}
> > +
> > +static void fsm_close(struct vfio_ccw_private *private,
> > +		      enum vfio_ccw_event event)
> > +{
> > +	struct subchannel *sch = private->sch;
> > +	int ret;
> > +
> > +	spin_lock_irq(sch->lock);
> > +	if (!sch->schib.pmcw.ena)
> > +		goto err_unlock;
> > +	ret = cio_disable_subchannel(sch);
> 
> cio_disable_subchannel() should be happy to disable an already disabled
> subchannel, so I guess we can just walk through this and end up in
> CLOSED state... unless entering with !ena actually indicates that we
> messed up somewhere else in the state machine. I still need to find time
> to read the patches.

I don't know, I looked at that ena stuff for a bit and couldn't guess
what it is trying to do.

Arguably the channel should not be ripped away from vfio while the FSM
is in the open states, so I'm not sure what a lot of this is for.

Jason

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 0/9] Move vfio_ccw to the new mdev API
  2021-09-17 12:51           ` Jason Gunthorpe
@ 2021-09-17 14:37             ` Cornelia Huck
  0 siblings, 0 replies; 38+ messages in thread
From: Cornelia Huck @ 2021-09-17 14:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Eric Farman, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Fri, Sep 17 2021, Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Fri, Sep 17, 2021 at 01:59:16PM +0200, Cornelia Huck wrote:
>> >  		ret = cio_cancel_halt_clear(sch, &iretry);
>> > -
>> >  		if (ret == -EIO) {
>> >  			pr_err("vfio_ccw: could not quiesce subchannel 0.%x.%04x!\n",
>> >  			       sch->schid.ssid, sch->schid.sch_no);
>> > -			break;
>> > +			return ret;
>> 
>> Looking at this, I wonder why we had special-cased -EIO -- for -ENODEV
>> we should be done as well, as then the device is dead and we do not need
>> to disable it.
>
> cio_cancel_halt_clear() should probably succeed in that case.

It will actually give us -ENODEV, as the very first call in that
function will already fail.

>
>> > @@ -413,13 +403,28 @@ static void fsm_close(struct vfio_ccw_private *private,
>> >  		spin_unlock_irq(sch->lock);
>> >  
>> >  		if (ret == -EBUSY)
>> > -			wait_for_completion_timeout(&completion, 3*HZ);
>> > +			wait_for_completion_timeout(&completion, 3 * HZ);
>> >  
>> >  		private->completion = NULL;
>> >  		flush_workqueue(vfio_ccw_work_q);
>> >  		spin_lock_irq(sch->lock);
>> >  		ret = cio_disable_subchannel(sch);
>> >  	} while (ret == -EBUSY);
>> > +	return ret;
>> > +}
>> > +
>> > +static void fsm_close(struct vfio_ccw_private *private,
>> > +		      enum vfio_ccw_event event)
>> > +{
>> > +	struct subchannel *sch = private->sch;
>> > +	int ret;
>> > +
>> > +	spin_lock_irq(sch->lock);
>> > +	if (!sch->schib.pmcw.ena)
>> > +		goto err_unlock;
>> > +	ret = cio_disable_subchannel(sch);
>> 
>> cio_disable_subchannel() should be happy to disable an already disabled
>> subchannel, so I guess we can just walk through this and end up in
>> CLOSED state... unless entering with !ena actually indicates that we
>> messed up somewhere else in the state machine. I still need to find time
>> to read the patches.
>
> I don't know, I looked at that ena stuff for a bit and couldn't guess
> what it is trying to do.

It is one of the bits in the pmcw control block that can be modified; if
it is 1, the subchannel is enabled and can be used for I/O, if it is 0,
the subchannel is disabled and all instructions that initiate or stop
I/O will fail. Basically, you enable the subchannel if you actually want
to access the device associated with it. Online/offline for (normal
usage) ccw devices maps (among other things) to associated subchannel
enabled/disabled; for a subchannel that is supposed to be passed via
vfio-ccw, we want to have it enabled so that it is actually usable.

I think the ena checking had been inspired from what the ccw bus
does. We could probably just forge ahead in any case and the called
functions in the css bus would be able to handle it just fine, but I
have not double checked.

> Arguably the channel should not be ripped away from vfio while the FSM
> is in the open states, so I'm not sure what a lot of this is for.

We could have surprise removal (i.e. a subchannel in active use being
ripped out), as that's what happens on real hardware as well. E.g. doing
a device_del in QEMU.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions
  2021-09-09 19:38 ` [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions Jason Gunthorpe
  2021-09-10 11:32   ` Christoph Hellwig
@ 2021-09-20 11:12   ` Cornelia Huck
  2021-09-24  2:53   ` Eric Farman
  2 siblings, 0 replies; 38+ messages in thread
From: Cornelia Huck @ 2021-09-20 11:12 UTC (permalink / raw)
  To: Jason Gunthorpe, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang
  Cc: Christoph Hellwig

On Thu, Sep 09 2021, Jason Gunthorpe <jgg@nvidia.com> wrote:

> mdev_device should only be used in functions assigned to ops callbacks,
> interior functions should use the struct vfio_ccw_private instead of
> repeatedly trying to get it from the mdev.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/s390/cio/vfio_ccw_ops.c | 37 +++++++++++++--------------------
>  1 file changed, 15 insertions(+), 22 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 4/9] vfio/ccw: Make the FSM complete and synchronize it to the mdev
  2021-09-09 19:38 ` [PATCH v2 4/9] vfio/ccw: Make the FSM complete and synchronize it to the mdev Jason Gunthorpe
@ 2021-09-20 12:19   ` Cornelia Huck
  2021-09-20 12:30     ` Jason Gunthorpe
  0 siblings, 1 reply; 38+ messages in thread
From: Cornelia Huck @ 2021-09-20 12:19 UTC (permalink / raw)
  To: Jason Gunthorpe, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang
  Cc: Christoph Hellwig

On Thu, Sep 09 2021, Jason Gunthorpe <jgg@nvidia.com> wrote:

> The subchannel should be left in a quiescent state unless the VFIO device
> FD is opened. When the FD is opened bring the chanel to active and allow
> the VFIO device to operate. When the device FD is closed then quiesce the
> channel.
>
> To make this work the FSM needs to handle the transitions to/from open and
> closed so everything is sequenced. Rename state NOT_OPER to BROKEN and use
> it wheneven the driver has malfunctioned. STANDBY becomes CLOSED. The
> normal case FSM looks like:
>     CLOSED -> IDLE -> PROCESS/PENDING* -> IDLE -> CLOSED
>
> With a possible branch off to BROKEN from any state. Once the device is in
> BROKEN it cannot be recovered other than be reloading the driver.

Hm, not sure whether it is a good idea to conflate "something went
wrong" and "device is not operational". In the latter case, we will
eventually get a removal of the css device when the common I/O layer has
processed the channel report for the subchannel; while the former case
could mean all kind of things, but the subchannel will likely stay
around. I think NOT_OPER was always meant to be a transitional state.

>
> Delete the triply redundant calls to
> vfio_ccw_sch_quiesce(). vfio_ccw_mdev_close_device() always leaves the
> subchannel quiescent. vfio_ccw_mdev_remove() cannot return until
> vfio_ccw_mdev_close_device() completes and vfio_ccw_sch_remove() cannot
> return until vfio_ccw_mdev_remove() completes. Have the FSM code take care
> of calling cp_free() when appropriate.

I remember some serialization issues wrt cp_free() etc. coming up every
now and than; that might need extra care (I'm taking a look.)

>
> Device reset becomes a CLOSE/OPEN sequence which now properly handles the
> situation if the device becomes BROKEN.
>
> Machine shutdown via vfio_ccw_sch_shutdown() now simply tries to close and
> leaves the device BROKEN (though arguably the bus should take care to
> quiet down the subchannel HW during shutdown, not the drivers)

The problem is that there is not really a uniform thing that can be done
for shutdown; e.g. we can quiesce and then disable I/O and EADM
subchannels, but CHSC subchannels cannot be quiesced.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 4/9] vfio/ccw: Make the FSM complete and synchronize it to the mdev
  2021-09-20 12:19   ` Cornelia Huck
@ 2021-09-20 12:30     ` Jason Gunthorpe
  0 siblings, 0 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-20 12:30 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Mon, Sep 20, 2021 at 02:19:18PM +0200, Cornelia Huck wrote:
> On Thu, Sep 09 2021, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > The subchannel should be left in a quiescent state unless the VFIO device
> > FD is opened. When the FD is opened bring the chanel to active and allow
> > the VFIO device to operate. When the device FD is closed then quiesce the
> > channel.
> >
> > To make this work the FSM needs to handle the transitions to/from open and
> > closed so everything is sequenced. Rename state NOT_OPER to BROKEN and use
> > it wheneven the driver has malfunctioned. STANDBY becomes CLOSED. The
> > normal case FSM looks like:
> >     CLOSED -> IDLE -> PROCESS/PENDING* -> IDLE -> CLOSED
> >
> > With a possible branch off to BROKEN from any state. Once the device is in
> > BROKEN it cannot be recovered other than be reloading the driver.
> 
> Hm, not sure whether it is a good idea to conflate "something went
> wrong" and "device is not operational". 

Yes, but that is exactly what this FSM is currently doing, NO_OPER is
a dumping ground for all kinds of wonky stuff, and what exactly it is
supposed to mean or do is unclear.

> while the former case could mean all kind of
> things, but the subchannel will likely stay around. I think NOT_OPER
> was always meant to be a transitional state.

Then these sorts of failures should recover the device and FSM back to
an appropriate operational state and keep going - but I'm not going to
attempt to guess when each of the conditions are recoverable or not.

> > Delete the triply redundant calls to
> > vfio_ccw_sch_quiesce(). vfio_ccw_mdev_close_device() always leaves the
> > subchannel quiescent. vfio_ccw_mdev_remove() cannot return until
> > vfio_ccw_mdev_close_device() completes and vfio_ccw_sch_remove() cannot
> > return until vfio_ccw_mdev_remove() completes. Have the FSM code take care
> > of calling cp_free() when appropriate.
> 
> I remember some serialization issues wrt cp_free() etc. coming up every
> now and than; that might need extra care (I'm taking a look.)

I'm not too surprised, things like NOT_OPER just exiting the usual FSM
logic mean stuff couldn't be properly sequenced.

The new logic puts a cp_free in each of arcs entering the terminal
states broken/closed and all the flows that would get us to
vfio_ccw_mdev_remove() will enter one of those states.

It is quite possible this patch needs someone who actually understand
this HW to polish it up - the point was to show how ccw should be
cleanly structured. I'd like to go ahead with the other patches and
leave this for the ccw maintainers if it is needs significant
work. The other patches are what are blocking the core code cleanups.

Jason

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core
  2021-09-09 19:38 ` [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core Jason Gunthorpe
  2021-09-10 12:25   ` Christoph Hellwig
@ 2021-09-20 18:02   ` Cornelia Huck
  2021-09-21 13:19     ` Jason Gunthorpe
  1 sibling, 1 reply; 38+ messages in thread
From: Cornelia Huck @ 2021-09-20 18:02 UTC (permalink / raw)
  To: Jason Gunthorpe, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang
  Cc: Christoph Hellwig

On Thu, Sep 09 2021, Jason Gunthorpe <jgg@nvidia.com> wrote:

> Many of the mdev drivers use a simple counter for keeping track of the
> available instances. Move this code to the core code and store the counter
> in the mdev_type. Implement it using correct locking, fixing mdpy.
>
> Drivers provide a get_available() callback to set the number of available
> instances for their mtypes which is fixed at registration time. The core
> provides a standard sysfs attribute to return the available_instances.

So, according to the documentation, available_instances is
mandatory. This means that drivers either need to provide get_available
or implement their own version of the attribute. I think we want to
update vfio-mediated-device.rst as well?

>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/s390/cio/vfio_ccw_drv.c       |  1 -
>  drivers/s390/cio/vfio_ccw_ops.c       | 26 ++++++-------------
>  drivers/s390/cio/vfio_ccw_private.h   |  2 --
>  drivers/s390/crypto/vfio_ap_ops.c     | 32 ++++++-----------------
>  drivers/s390/crypto/vfio_ap_private.h |  2 --
>  drivers/vfio/mdev/mdev_core.c         | 11 +++++++-
>  drivers/vfio/mdev/mdev_private.h      |  2 ++
>  drivers/vfio/mdev/mdev_sysfs.c        | 37 +++++++++++++++++++++++++++
>  include/linux/mdev.h                  |  2 ++
>  samples/vfio-mdev/mdpy.c              | 22 +++++-----------
>  10 files changed, 73 insertions(+), 64 deletions(-)

Otherwise, looks good.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core
  2021-09-20 18:02   ` Cornelia Huck
@ 2021-09-21 13:19     ` Jason Gunthorpe
  2021-09-24  2:54       ` Eric Farman
  0 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-21 13:19 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel, Eric Farman,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Mon, Sep 20, 2021 at 08:02:29PM +0200, Cornelia Huck wrote:
> On Thu, Sep 09 2021, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> > Many of the mdev drivers use a simple counter for keeping track of the
> > available instances. Move this code to the core code and store the counter
> > in the mdev_type. Implement it using correct locking, fixing mdpy.
> >
> > Drivers provide a get_available() callback to set the number of available
> > instances for their mtypes which is fixed at registration time. The core
> > provides a standard sysfs attribute to return the available_instances.
> 
> So, according to the documentation, available_instances is
> mandatory. This means that drivers either need to provide get_available
> or implement their own version of the attribute. I think we want to
> update vfio-mediated-device.rst as well?

I added this, and something similar for the device_api patch too, thanks

diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst
index 9f26079cacae35..0a130d76b33a48 100644
--- a/Documentation/driver-api/vfio-mediated-device.rst
+++ b/Documentation/driver-api/vfio-mediated-device.rst
@@ -106,6 +106,7 @@ structure to represent a mediated device's driver::
 	     int  (*probe)  (struct mdev_device *dev);
 	     void (*remove) (struct mdev_device *dev);
 	     struct device_driver    driver;
+	     unsigned int (*get_available)(struct mdev_type *mtype);
      };
 
 A mediated bus driver for mdev should use this structure in the function calls
@@ -230,7 +231,8 @@ Directories and files under the sysfs for Each Physical Device
 * available_instances
 
   This attribute should show the number of devices of type <type-id> that can be
-  created.
+  created. Drivers can supply a get_availble() function pointer to have the core
+  code create and maintain this sysfs automatically.
 
 * [device]
 

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private
  2021-09-09 19:38 ` [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private Jason Gunthorpe
  2021-09-10 11:27   ` Christoph Hellwig
@ 2021-09-24  2:53   ` Eric Farman
  1 sibling, 0 replies; 38+ messages in thread
From: Eric Farman @ 2021-09-24  2:53 UTC (permalink / raw)
  To: Jason Gunthorpe, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang
  Cc: Christoph Hellwig

On Thu, 2021-09-09 at 16:38 -0300, Jason Gunthorpe wrote:
> Makes the code easier to understand what is memory lifecycle and what
> is
> other stuff.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/s390/cio/vfio_ccw_drv.c | 137 ++++++++++++++++++----------
> ----
>  1 file changed, 78 insertions(+), 59 deletions(-)

Reviewed-by: Eric Farman <farman@linux.ibm.com>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions
  2021-09-09 19:38 ` [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions Jason Gunthorpe
  2021-09-10 11:32   ` Christoph Hellwig
  2021-09-20 11:12   ` Cornelia Huck
@ 2021-09-24  2:53   ` Eric Farman
  2 siblings, 0 replies; 38+ messages in thread
From: Eric Farman @ 2021-09-24  2:53 UTC (permalink / raw)
  To: Jason Gunthorpe, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang
  Cc: Christoph Hellwig

On Thu, 2021-09-09 at 16:38 -0300, Jason Gunthorpe wrote:
> mdev_device should only be used in functions assigned to ops
> callbacks,
> interior functions should use the struct vfio_ccw_private instead of
> repeatedly trying to get it from the mdev.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/s390/cio/vfio_ccw_ops.c | 37 +++++++++++++----------------
> ----
>  1 file changed, 15 insertions(+), 22 deletions(-)

Reviewed-by: Eric Farman <farman@linux.ibm.com>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core
  2021-09-21 13:19     ` Jason Gunthorpe
@ 2021-09-24  2:54       ` Eric Farman
  0 siblings, 0 replies; 38+ messages in thread
From: Eric Farman @ 2021-09-24  2:54 UTC (permalink / raw)
  To: Jason Gunthorpe, Cornelia Huck
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Tue, 2021-09-21 at 10:19 -0300, Jason Gunthorpe wrote:
> On Mon, Sep 20, 2021 at 08:02:29PM +0200, Cornelia Huck wrote:
> > On Thu, Sep 09 2021, Jason Gunthorpe <jgg@nvidia.com> wrote:
> > 
> > > Many of the mdev drivers use a simple counter for keeping track
> > > of the
> > > available instances. Move this code to the core code and store
> > > the counter
> > > in the mdev_type. Implement it using correct locking, fixing
> > > mdpy.
> > > 
> > > Drivers provide a get_available() callback to set the number of
> > > available
> > > instances for their mtypes which is fixed at registration time.
> > > The core
> > > provides a standard sysfs attribute to return the
> > > available_instances.
> > 
> > So, according to the documentation, available_instances is
> > mandatory. This means that drivers either need to provide
> > get_available
> > or implement their own version of the attribute. I think we want to
> > update vfio-mediated-device.rst as well?
> 
> I added this, and something similar for the device_api patch too,
> thanks
> 
> diff --git a/Documentation/driver-api/vfio-mediated-device.rst
> b/Documentation/driver-api/vfio-mediated-device.rst
> index 9f26079cacae35..0a130d76b33a48 100644
> --- a/Documentation/driver-api/vfio-mediated-device.rst
> +++ b/Documentation/driver-api/vfio-mediated-device.rst
> @@ -106,6 +106,7 @@ structure to represent a mediated device's
> driver::
>  	     int  (*probe)  (struct mdev_device *dev);
>  	     void (*remove) (struct mdev_device *dev);
>  	     struct device_driver    driver;
> +	     unsigned int (*get_available)(struct mdev_type *mtype);
>       };
>  
>  A mediated bus driver for mdev should use this structure in the
> function calls
> @@ -230,7 +231,8 @@ Directories and files under the sysfs for Each
> Physical Device
>  * available_instances
>  
>    This attribute should show the number of devices of type <type-id> 
> that can be
> -  created.
> +  created. Drivers can supply a get_availble() function pointer to 

s/availble/available/

> have the core
> +  code create and maintain this sysfs automatically.
>  
>  * [device]
>  


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 3/9] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-09-09 19:38 ` [PATCH v2 3/9] vfio/ccw: Convert to use vfio_register_group_dev() Jason Gunthorpe
@ 2021-09-24 20:37   ` Eric Farman
  2021-09-27 12:17     ` Jason Gunthorpe
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Farman @ 2021-09-24 20:37 UTC (permalink / raw)
  To: Jason Gunthorpe, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang
  Cc: Christoph Hellwig

On Thu, 2021-09-09 at 16:38 -0300, Jason Gunthorpe wrote:
> This is a more complicated conversion because vfio_ccw is sharing the
> vfio_device between both the mdev_device, its vfio_device and the
> css_driver.
> 
> The mdev is a singleton, and the reason for this sharing is so the
> extra
> css_driver function callbacks to be delivered to the vfio_device
> implementation.
> 
> This keeps things as they are, with the css_driver allocating the
> singleton, not the mdev_driver. Following patches work to clean this
> further.
> 
> Embed the vfio_device in the vfio_ccw_private and instantiate it as a
> vfio_device when the mdev probes. The drvdata of both the css_device
> and
> the mdev_device point at the private, and container_of is used to get
> it
> back from the vfio_device.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/s390/cio/vfio_ccw_drv.c     |  21 ++++--
>  drivers/s390/cio/vfio_ccw_ops.c     | 107 +++++++++++++++++---------
> --
>  drivers/s390/cio/vfio_ccw_private.h |   5 ++
>  3 files changed, 85 insertions(+), 48 deletions(-)
> 
> diff --git a/drivers/s390/cio/vfio_ccw_drv.c
> b/drivers/s390/cio/vfio_ccw_drv.c
> index 1e8d3151e5480e..396e815f81f8a4 100644
> --- a/drivers/s390/cio/vfio_ccw_drv.c
> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> @@ -469,7 +469,7 @@ static int __init vfio_ccw_sch_init(void)
>  	vfio_ccw_work_q = create_singlethread_workqueue("vfio-ccw");
>  	if (!vfio_ccw_work_q) {
>  		ret = -ENOMEM;
> -		goto out_err;
> +		goto out_regions;
>  	}
>  
>  	vfio_ccw_io_region =
> kmem_cache_create_usercopy("vfio_ccw_io_region",
> @@ -478,7 +478,7 @@ static int __init vfio_ccw_sch_init(void)
>  					sizeof(struct ccw_io_region),
> NULL);
>  	if (!vfio_ccw_io_region) {
>  		ret = -ENOMEM;
> -		goto out_err;
> +		goto out_regions;
>  	}
>  
>  	vfio_ccw_cmd_region =
> kmem_cache_create_usercopy("vfio_ccw_cmd_region",
> @@ -487,7 +487,7 @@ static int __init vfio_ccw_sch_init(void)
>  					sizeof(struct ccw_cmd_region),
> NULL);
>  	if (!vfio_ccw_cmd_region) {
>  		ret = -ENOMEM;
> -		goto out_err;
> +		goto out_regions;
>  	}
>  
>  	vfio_ccw_schib_region =
> kmem_cache_create_usercopy("vfio_ccw_schib_region",
> @@ -497,7 +497,7 @@ static int __init vfio_ccw_sch_init(void)
>  
>  	if (!vfio_ccw_schib_region) {
>  		ret = -ENOMEM;
> -		goto out_err;
> +		goto out_regions;
>  	}
>  
>  	vfio_ccw_crw_region =
> kmem_cache_create_usercopy("vfio_ccw_crw_region",
> @@ -507,19 +507,25 @@ static int __init vfio_ccw_sch_init(void)
>  
>  	if (!vfio_ccw_crw_region) {
>  		ret = -ENOMEM;
> -		goto out_err;
> +		goto out_regions;
>  	}
>  
> +	ret = mdev_register_driver(&vfio_ccw_mdev_driver);
> +	if (ret)
> +		goto out_regions;
> +
>  	isc_register(VFIO_CCW_ISC);
>  	ret = css_driver_register(&vfio_ccw_sch_driver);
>  	if (ret) {
>  		isc_unregister(VFIO_CCW_ISC);
> -		goto out_err;
> +		goto out_driver;
>  	}
>  
>  	return ret;
>  
> -out_err:
> +out_driver:
> +	mdev_unregister_driver(&vfio_ccw_mdev_driver);
> +out_regions:
>  	vfio_ccw_destroy_regions();
>  	destroy_workqueue(vfio_ccw_work_q);
>  	vfio_ccw_debug_exit();
> @@ -528,6 +534,7 @@ static int __init vfio_ccw_sch_init(void)
>  
>  static void __exit vfio_ccw_sch_exit(void)
>  {
> +	mdev_unregister_driver(&vfio_ccw_mdev_driver);

Wouldn't it be better to mirror the unwind-init case, such that the
above goes...

>  	css_driver_unregister(&vfio_ccw_sch_driver);
>  	isc_unregister(VFIO_CCW_ISC);

...here?

>  	vfio_ccw_destroy_regions();
> diff --git a/drivers/s390/cio/vfio_ccw_ops.c 

...snip...

Besides that, looks fine to me.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 7/9] vfio/ccw: Remove private->mdev
  2021-09-09 19:38 ` [PATCH v2 7/9] vfio/ccw: Remove private->mdev Jason Gunthorpe
@ 2021-09-24 20:45   ` Eric Farman
  2021-09-27 12:32     ` Jason Gunthorpe
  0 siblings, 1 reply; 38+ messages in thread
From: Eric Farman @ 2021-09-24 20:45 UTC (permalink / raw)
  To: Jason Gunthorpe, David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang
  Cc: Christoph Hellwig

On Thu, 2021-09-09 at 16:38 -0300, Jason Gunthorpe wrote:
> Having a mdev pointer floating about in addition to a struct
> vfio_device
> is confusing. It is only used for three things:
> 
> - Getting the mdev 'struct device *' - this is the same as
>      private->vdev.dev
> 
> - Printing the uuid of the mdev in logging. The uuid is also the
> dev_name
>   of the mdev so this is the same string as
>      dev_name(private->vdev.dev)
> 
> - A weird attempt to fence the vfio_ccw_sch_io_todo() work. This work
> is
>   only queued during states IDLE/PROCESSING/PENDING and flushed when
>   entering CLOSED. Thus the work already cannot run when the mdev is
> NULL.
>   Remove it.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/s390/cio/vfio_ccw_drv.c     |  6 ++--
>  drivers/s390/cio/vfio_ccw_fsm.c     | 48 +++++++++++++------------
> ----
>  drivers/s390/cio/vfio_ccw_ops.c     | 16 ++++------
>  drivers/s390/cio/vfio_ccw_private.h |  2 --
>  include/linux/mdev.h                |  4 ---
>  5 files changed, 30 insertions(+), 46 deletions(-)

I like this patch. Unfortunately it depends on the removal of a hunk in
patch 4, which sets the FSM state to different values based on whether
private->mdev is NULL or not, so can't go on its own. Need to spend
more time thinking about that patch.

Eric


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 3/9] vfio/ccw: Convert to use vfio_register_group_dev()
  2021-09-24 20:37   ` Eric Farman
@ 2021-09-27 12:17     ` Jason Gunthorpe
  0 siblings, 0 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-27 12:17 UTC (permalink / raw)
  To: Eric Farman
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Fri, Sep 24, 2021 at 04:37:43PM -0400, Eric Farman wrote:
> > @@ -528,6 +534,7 @@ static int __init vfio_ccw_sch_init(void)
> >  
> >  static void __exit vfio_ccw_sch_exit(void)
> >  {
> > +	mdev_unregister_driver(&vfio_ccw_mdev_driver);
> 
> Wouldn't it be better to mirror the unwind-init case, such that the
> above goes...
> 
> >  	css_driver_unregister(&vfio_ccw_sch_driver);
> >  	isc_unregister(VFIO_CCW_ISC);
> 
> ...here?

Yes, I switched it

Thanks,
Jason

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 7/9] vfio/ccw: Remove private->mdev
  2021-09-24 20:45   ` Eric Farman
@ 2021-09-27 12:32     ` Jason Gunthorpe
  2021-09-27 20:45       ` Eric Farman
  0 siblings, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2021-09-27 12:32 UTC (permalink / raw)
  To: Eric Farman
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Fri, Sep 24, 2021 at 04:45:02PM -0400, Eric Farman wrote:
> On Thu, 2021-09-09 at 16:38 -0300, Jason Gunthorpe wrote:
> > Having a mdev pointer floating about in addition to a struct
> > vfio_device
> > is confusing. It is only used for three things:
> > 
> > - Getting the mdev 'struct device *' - this is the same as
> >      private->vdev.dev
> > 
> > - Printing the uuid of the mdev in logging. The uuid is also the
> > dev_name
> >   of the mdev so this is the same string as
> >      dev_name(private->vdev.dev)
> > 
> > - A weird attempt to fence the vfio_ccw_sch_io_todo() work. This work
> > is
> >   only queued during states IDLE/PROCESSING/PENDING and flushed when
> >   entering CLOSED. Thus the work already cannot run when the mdev is
> > NULL.
> >   Remove it.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >  drivers/s390/cio/vfio_ccw_drv.c     |  6 ++--
> >  drivers/s390/cio/vfio_ccw_fsm.c     | 48 +++++++++++++------------
> >  drivers/s390/cio/vfio_ccw_ops.c     | 16 ++++------
> >  drivers/s390/cio/vfio_ccw_private.h |  2 --
> >  include/linux/mdev.h                |  4 ---
> >  5 files changed, 30 insertions(+), 46 deletions(-)
> 
> I like this patch. Unfortunately it depends on the removal of a hunk in
> patch 4, which sets the FSM state to different values based on whether
> private->mdev is NULL or not, so can't go on its own. Need to spend
> more time thinking about that patch.

The FSM patch is important, really what is happening is the FSM logic
takes on the roles that was being split all over the place with other
logic, like this mdev stuff. To make that work we need a FSM that
makes sense..

Jason

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 7/9] vfio/ccw: Remove private->mdev
  2021-09-27 12:32     ` Jason Gunthorpe
@ 2021-09-27 20:45       ` Eric Farman
  0 siblings, 0 replies; 38+ messages in thread
From: Eric Farman @ 2021-09-27 20:45 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: David Airlie, Tony Krowiak, Alex Williamson,
	Christian Borntraeger, Cornelia Huck, Daniel Vetter, dri-devel,
	Harald Freudenberger, Vasily Gorbik, Heiko Carstens, intel-gfx,
	intel-gvt-dev, Jani Nikula, Jason Herne, Joonas Lahtinen, kvm,
	Kirti Wankhede, linux-s390, Matthew Rosato, Peter Oberparleiter,
	Halil Pasic, Rodrigo Vivi, Vineeth Vijayan, Zhenyu Wang,
	Zhi Wang, Christoph Hellwig

On Mon, 2021-09-27 at 09:32 -0300, Jason Gunthorpe wrote:
> On Fri, Sep 24, 2021 at 04:45:02PM -0400, Eric Farman wrote:
> > On Thu, 2021-09-09 at 16:38 -0300, Jason Gunthorpe wrote:
> > > Having a mdev pointer floating about in addition to a struct
> > > vfio_device
> > > is confusing. It is only used for three things:
> > > 
> > > - Getting the mdev 'struct device *' - this is the same as
> > >      private->vdev.dev
> > > 
> > > - Printing the uuid of the mdev in logging. The uuid is also the
> > > dev_name
> > >   of the mdev so this is the same string as
> > >      dev_name(private->vdev.dev)
> > > 
> > > - A weird attempt to fence the vfio_ccw_sch_io_todo() work. This
> > > work
> > > is
> > >   only queued during states IDLE/PROCESSING/PENDING and flushed
> > > when
> > >   entering CLOSED. Thus the work already cannot run when the mdev
> > > is
> > > NULL.
> > >   Remove it.
> > > 
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > >  drivers/s390/cio/vfio_ccw_drv.c     |  6 ++--
> > >  drivers/s390/cio/vfio_ccw_fsm.c     | 48 +++++++++++++--------
> > > ----
> > >  drivers/s390/cio/vfio_ccw_ops.c     | 16 ++++------
> > >  drivers/s390/cio/vfio_ccw_private.h |  2 --
> > >  include/linux/mdev.h                |  4 ---
> > >  5 files changed, 30 insertions(+), 46 deletions(-)
> > 
> > I like this patch. Unfortunately it depends on the removal of a
> > hunk in
> > patch 4, which sets the FSM state to different values based on
> > whether
> > private->mdev is NULL or not, so can't go on its own. Need to spend
> > more time thinking about that patch.
> 
> The FSM patch is important, really what is happening is the FSM logic
> takes on the roles that was being split all over the place with other
> logic, like this mdev stuff. To make that work we need a FSM that
> makes sense..

No argument from me about that. My point is that I could consume this
patch easier than the FSM patch, and need to get back to that one.

Eric

> 
> Jason


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2021-09-27 20:46 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-09 19:38 [PATCH v2 0/9] Move vfio_ccw to the new mdev API Jason Gunthorpe
2021-09-09 19:38 ` [PATCH v2 1/9] vfio/ccw: Use functions for alloc/free of the vfio_ccw_private Jason Gunthorpe
2021-09-10 11:27   ` Christoph Hellwig
2021-09-14 15:50     ` Cornelia Huck
2021-09-14 18:03       ` Jason Gunthorpe
2021-09-24  2:53   ` Eric Farman
2021-09-09 19:38 ` [PATCH v2 2/9] vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions Jason Gunthorpe
2021-09-10 11:32   ` Christoph Hellwig
2021-09-20 11:12   ` Cornelia Huck
2021-09-24  2:53   ` Eric Farman
2021-09-09 19:38 ` [PATCH v2 3/9] vfio/ccw: Convert to use vfio_register_group_dev() Jason Gunthorpe
2021-09-24 20:37   ` Eric Farman
2021-09-27 12:17     ` Jason Gunthorpe
2021-09-09 19:38 ` [PATCH v2 4/9] vfio/ccw: Make the FSM complete and synchronize it to the mdev Jason Gunthorpe
2021-09-20 12:19   ` Cornelia Huck
2021-09-20 12:30     ` Jason Gunthorpe
2021-09-09 19:38 ` [PATCH v2 5/9] vfio/mdev: Consolidate all the device_api sysfs into the core code Jason Gunthorpe
2021-09-10 12:10   ` Christoph Hellwig
2021-09-10 13:38     ` Jason Gunthorpe
2021-09-10 16:09       ` Alex Williamson
2021-09-09 19:38 ` [PATCH v2 6/9] vfio/mdev: Add mdev available instance checking to the core Jason Gunthorpe
2021-09-10 12:25   ` Christoph Hellwig
2021-09-20 18:02   ` Cornelia Huck
2021-09-21 13:19     ` Jason Gunthorpe
2021-09-24  2:54       ` Eric Farman
2021-09-09 19:38 ` [PATCH v2 7/9] vfio/ccw: Remove private->mdev Jason Gunthorpe
2021-09-24 20:45   ` Eric Farman
2021-09-27 12:32     ` Jason Gunthorpe
2021-09-27 20:45       ` Eric Farman
2021-09-09 19:38 ` [PATCH v2 8/9] vfio: Export vfio_device_try_get() Jason Gunthorpe
2021-09-09 19:38 ` [PATCH v2 9/9] vfio/ccw: Move the lifecycle of the struct vfio_ccw_private to the mdev Jason Gunthorpe
2021-09-13 17:40 ` [PATCH v2 0/9] Move vfio_ccw to the new mdev API Eric Farman
2021-09-13 19:24   ` Jason Gunthorpe
2021-09-13 20:31     ` Eric Farman
2021-09-14 13:36       ` Jason Gunthorpe
2021-09-17 11:59         ` Cornelia Huck
2021-09-17 12:51           ` Jason Gunthorpe
2021-09-17 14:37             ` Cornelia Huck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).